Resilient Distributed Datasets

The RDD (Resilient Distributed Dataset) is Spark's core data abstraction API. It has been part of Spark since its earliest releases, including Spark 1.0.



RDDs are parallel data structures that let users persist intermediate data in memory. The RDD is a fundamental data structure of Spark.
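To make the idea of persisting intermediate data concrete, here is a minimal single-process sketch in plain Python. It is a toy model, not Spark's API: the class `ToyRDD`, its `evict` method, and the `compute_count` counter are hypothetical names invented for illustration. The method names `parallelize`, `map`, `persist`, and `collect` only mirror the vocabulary Spark uses. The sketch assumes that an RDD remembers how it was derived, so a persisted result is reused while it is available and rebuilt from that recipe if it is lost.

```python
import math
from typing import Any, Callable, Iterable, List, Optional


class ToyRDD:
    """Toy stand-in for an RDD (hypothetical; not Spark's real API)."""

    def __init__(self, compute: Callable[[], List[List[Any]]]) -> None:
        self._compute = compute          # recipe for rebuilding the partitions
        self._persist = False            # whether computed partitions are kept
        self._cached: Optional[List[List[Any]]] = None
        self.compute_count = 0           # times the recipe was actually run

    @classmethod
    def parallelize(cls, data: Iterable[Any], num_partitions: int = 2) -> "ToyRDD":
        items = list(data)
        # Split the input into contiguous chunks, one per partition.
        size = max(1, math.ceil(len(items) / num_partitions))
        parts = [items[i:i + size] for i in range(0, len(items), size)]
        return cls(lambda: [list(p) for p in parts])

    def map(self, fn: Callable[[Any], Any]) -> "ToyRDD":
        # Lazy: nothing runs until an action such as collect() is called.
        return ToyRDD(lambda: [[fn(x) for x in part] for part in self._partitions()])

    def persist(self) -> "ToyRDD":
        # Ask for the partitions to be kept in memory once computed.
        self._persist = True
        return self

    def evict(self) -> None:
        # Simulate losing the in-memory copy (for example, a failed node).
        self._cached = None

    def _partitions(self) -> List[List[Any]]:
        if self._cached is not None:
            return self._cached          # reuse the persisted result
        self.compute_count += 1
        parts = self._compute()          # rebuild from the recipe
        if self._persist:
            self._cached = parts
        return parts

    def collect(self) -> List[Any]:
        # Action: flatten all partitions into one list.
        return [x for part in self._partitions() for x in part]


squares = ToyRDD.parallelize(range(6)).map(lambda x: x * x).persist()
first = squares.collect()    # computes the squares and keeps them in memory
second = squares.collect()   # served from the persisted partitions
squares.evict()              # the in-memory copy is lost
third = squares.collect()    # recomputed from the recipe
```

In this sketch the first two `collect()` calls run the computation once, and the call after `evict()` runs it again, which is the behavior the word "resilient" refers to. A real Spark job distributes partitions across a cluster; this model keeps everything in one process purely to show the control flow.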
