Packages

class CassandraTableScanRDD[R] extends CassandraRDD[R] with CassandraTableRowReaderProvider[R]

RDD representing a Table Scan of A Cassandra table.

This class is the main entry point for analyzing data in Cassandra database with Spark. Obtain objects of this class by calling com.datastax.spark.connector.SparkContextFunctions.cassandraTable.

Configuration properties should be passed in the SparkConf configuration of SparkContext. CassandraRDD needs to open connection to Cassandra, therefore it requires appropriate connection property values to be present in SparkConf. For the list of required and available properties, see CassandraConnector.

CassandraRDD divides the data set into smaller partitions, processed locally on every cluster node. A data partition consists of one or more contiguous token ranges. To reduce the number of roundtrips to Cassandra, every partition is fetched in batches.

The following properties control the number of partitions and the fetch size: - spark.cassandra.input.split.sizeInMB: approx amount of data to be fetched into a single Spark partition, default 512 MB - spark.cassandra.input.fetch.sizeInRows: number of CQL rows fetched per roundtrip, default 1000

A CassandraRDD object gets serialized and sent to every Spark Executor, which then calls the compute method to fetch the data on every node. The getPreferredLocations method tells Spark the preferred nodes to fetch a partition from, so that the data for the partition are at the same node the task was sent to. If Cassandra nodes are collocated with Spark nodes, the queries are always sent to the Cassandra process running on the same node as the Spark Executor process, hence data are not transferred between nodes. If a Cassandra node fails or gets overloaded during read, the queries are retried to a different node.

By default, reads are performed at ConsistencyLevel.LOCAL_ONE in order to leverage data-locality and minimize network traffic. This read consistency level is controlled by the spark.cassandra.input.consistency.level property.

Linear Supertypes
CassandraTableRowReaderProvider[R], CassandraRDD[R], RDD[R], Logging, Serializable, Serializable, AnyRef, Any
Known Subclasses
Ordering
  1. Alphabetic
  2. By Inheritance
Inherited
  1. CassandraTableScanRDD
  2. CassandraTableRowReaderProvider
  3. CassandraRDD
  4. RDD
  5. Logging
  6. Serializable
  7. Serializable
  8. AnyRef
  9. Any
  1. Hide All
  2. Show All
Visibility
  1. Public
  2. All

Type Members

  1. type Self = CassandraTableScanRDD[R]

    This is slightly different than Scala this.type.

    This is slightly different than Scala this.type. this.type is the unique singleton type of an object which is not compatible with other instances of the same type, so returning anything other than this is not really possible without lying to the compiler by explicit casts. Here SelfType is used to return a copy of the object - a different instance of the same type

    Definition Classes
    CassandraTableScanRDDCassandraRDD

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. def ++(other: RDD[R]): RDD[R]
    Definition Classes
    RDD
  4. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  5. def aggregate[U](zeroValue: U)(seqOp: (U, R) ⇒ U, combOp: (U, U) ⇒ U)(implicit arg0: ClassTag[U]): U
    Definition Classes
    RDD
  6. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10, A11) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9], arg11: TypeConverter[A10], arg12: TypeConverter[A11]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  7. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9, A10) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9], arg11: TypeConverter[A10]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  8. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8, A9](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8, A9) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8], arg10: TypeConverter[A9]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  9. def as[B, A0, A1, A2, A3, A4, A5, A6, A7, A8](f: (A0, A1, A2, A3, A4, A5, A6, A7, A8) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7], arg9: TypeConverter[A8]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  10. def as[B, A0, A1, A2, A3, A4, A5, A6, A7](f: (A0, A1, A2, A3, A4, A5, A6, A7) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6], arg8: TypeConverter[A7]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  11. def as[B, A0, A1, A2, A3, A4, A5, A6](f: (A0, A1, A2, A3, A4, A5, A6) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5], arg7: TypeConverter[A6]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  12. def as[B, A0, A1, A2, A3, A4, A5](f: (A0, A1, A2, A3, A4, A5) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4], arg6: TypeConverter[A5]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  13. def as[B, A0, A1, A2, A3, A4](f: (A0, A1, A2, A3, A4) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3], arg5: TypeConverter[A4]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  14. def as[B, A0, A1, A2, A3](f: (A0, A1, A2, A3) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2], arg4: TypeConverter[A3]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  15. def as[B, A0, A1, A2](f: (A0, A1, A2) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1], arg3: TypeConverter[A2]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  16. def as[B, A0, A1](f: (A0, A1) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0], arg2: TypeConverter[A1]): CassandraRDD[B]
    Definition Classes
    CassandraRDD
  17. def as[B, A0](f: (A0) ⇒ B)(implicit arg0: ClassTag[B], arg1: TypeConverter[A0]): CassandraRDD[B]

    Maps each row into object of a different type using provided function taking column value(s) as argument(s).

    Maps each row into object of a different type using provided function taking column value(s) as argument(s). Can be used to convert each row to a tuple or a case class object:

    sc.cassandraTable("ks", "table")
      .select("column1")
      .as((s: String) => s)                 // yields CassandraRDD[String]
    
    sc.cassandraTable("ks", "table")
      .select("column1", "column2")
      .as((_: String, _: Long))             // yields CassandraRDD[(String, Long)]
    
    case class MyRow(key: String, value: Long)
    sc.cassandraTable("ks", "table")
      .select("column1", "column2")
      .as(MyRow)                            // yields CassandraRDD[MyRow]
    Definition Classes
    CassandraRDD
  18. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  19. def barrier(): RDDBarrier[R]
    Definition Classes
    RDD
    Annotations
    @Experimental() @Since( "2.4.0" )
  20. def cache(): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  21. def cartesian[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[(R, U)]
    Definition Classes
    RDD
  22. def cassandraCount(): Long

    Counts the number of items in this RDD by selecting count(*) on Cassandra table

    Counts the number of items in this RDD by selecting count(*) on Cassandra table

    Definition Classes
    CassandraTableScanRDDCassandraRDD
  23. lazy val cassandraPartitionerClassName: String
    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  24. def checkpoint(): Unit
    Definition Classes
    RDD
  25. implicit val classTag: ClassTag[R]
  26. def cleanShuffleDependencies(blocking: Boolean): Unit
    Definition Classes
    RDD
    Annotations
    @Experimental() @DeveloperApi() @Since( "3.1.0" )
  27. def clearDependencies(): Unit
    Attributes
    protected
    Definition Classes
    RDD
  28. def clone(): AnyRef
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native() @HotSpotIntrinsicCandidate()
  29. def clusteringOrder(order: ClusteringOrder): Self

    Adds a CQL ORDER BY clause to the query.

    Adds a CQL ORDER BY clause to the query. It can be applied only in case there are clustering columns and primary key predicate is pushed down in where. It is useful when the default direction of ordering rows within a single Cassandra partition needs to be changed.

    Definition Classes
    CassandraRDD
  30. val clusteringOrder: Option[ClusteringOrder]
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  31. def coalesce(numPartitions: Int, shuffle: Boolean = false, partitionCoalescer: Option[PartitionCoalescer])(implicit ord: Ordering[R] = null): RDD[R]

    This method overrides the default spark behavior and will not create a CoalesceRDD.

    This method overrides the default spark behavior and will not create a CoalesceRDD. Instead it will reduce the number of partitions by adjusting the partitioning of C* data on read. Using this method will override spark.cassandra.input.split.size. The method is useful with where() method call, when actual size of data is smaller then the table size. It has no effect if a partition key is used in where clause.

    numPartitions

    number of partitions

    shuffle

    whether to call shuffle after

    partitionCoalescer

    is ignored if no shuffle, or just passed to shuffled CoalesceRDD

    returns

    new CassandraTableScanRDD with predefined number of partitions

    Definition Classes
    CassandraTableScanRDD → RDD
  32. def collect[U](f: PartialFunction[R, U])(implicit arg0: ClassTag[U]): RDD[U]
    Definition Classes
    RDD
  33. def collect(): Array[R]
    Definition Classes
    RDD
  34. val columnNames: ColumnSelector
  35. def compute(split: Partition, context: TaskContext): Iterator[R]
    Definition Classes
    CassandraTableScanRDD → RDD
  36. val connector: CassandraConnector
  37. def consistencyLevel: ConsistencyLevel
    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  38. def context: SparkContext
    Definition Classes
    RDD
  39. def convertTo[B](implicit arg0: ClassTag[B], arg1: RowReaderFactory[B]): CassandraTableScanRDD[B]
    Attributes
    protected
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  40. def copy(columnNames: ColumnSelector = columnNames, where: CqlWhereClause = where, limit: Option[CassandraLimit] = limit, clusteringOrder: Option[ClusteringOrder] = None, readConf: ReadConf = readConf, connector: CassandraConnector = connector): Self

    Allows to copy this RDD with changing some of the properties

    Allows to copy this RDD with changing some of the properties

    Attributes
    protected
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  41. def count(): Long
    Definition Classes
    RDD
  42. def countApprox(timeout: Long, confidence: Double): PartialResult[BoundedDouble]
    Definition Classes
    RDD
  43. def countApproxDistinct(relativeSD: Double): Long
    Definition Classes
    RDD
  44. def countApproxDistinct(p: Int, sp: Int): Long
    Definition Classes
    RDD
  45. def countByValue()(implicit ord: Ordering[R]): Map[R, Long]
    Definition Classes
    RDD
  46. def countByValueApprox(timeout: Long, confidence: Double)(implicit ord: Ordering[R]): PartialResult[Map[R, BoundedDouble]]
    Definition Classes
    RDD
  47. final def dependencies: Seq[Dependency[_]]
    Definition Classes
    RDD
  48. def distinct(): RDD[R]
    Definition Classes
    RDD
  49. def distinct(numPartitions: Int)(implicit ord: Ordering[R]): RDD[R]
    Definition Classes
    RDD
  50. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  51. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  52. def fetchSize: Int
    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  53. def filter(f: (R) ⇒ Boolean): RDD[R]
    Definition Classes
    RDD
  54. def first(): R
    Definition Classes
    RDD
  55. def firstParent[U](implicit arg0: ClassTag[U]): RDD[U]
    Attributes
    protected[org.apache.spark]
    Definition Classes
    RDD
  56. def flatMap[U](f: (R) ⇒ TraversableOnce[U])(implicit arg0: ClassTag[U]): RDD[U]
    Definition Classes
    RDD
  57. def fold(zeroValue: R)(op: (R, R) ⇒ R): R
    Definition Classes
    RDD
  58. def foreach(f: (R) ⇒ Unit): Unit
    Definition Classes
    RDD
  59. def foreachPartition(f: (Iterator[R]) ⇒ Unit): Unit
    Definition Classes
    RDD
  60. def getCheckpointFile: Option[String]
    Definition Classes
    RDD
  61. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  62. def getDependencies: Seq[Dependency[_]]
    Attributes
    protected
    Definition Classes
    RDD
  63. final def getNumPartitions: Int
    Definition Classes
    RDD
    Annotations
    @Since( "1.6.0" )
  64. def getOutputDeterministicLevel: org.apache.spark.rdd.DeterministicLevel.Value
    Attributes
    protected
    Definition Classes
    RDD
    Annotations
    @DeveloperApi()
  65. def getPartitions: Array[Partition]
    Definition Classes
    CassandraTableScanRDD → RDD
  66. def getPreferredLocations(split: Partition): Seq[String]
    Definition Classes
    CassandraTableScanRDD → RDD
  67. def getResourceProfile(): ResourceProfile
    Definition Classes
    RDD
    Annotations
    @Experimental() @Since( "3.1.0" )
  68. def getStorageLevel: StorageLevel
    Definition Classes
    RDD
  69. def glom(): RDD[Array[R]]
    Definition Classes
    RDD
  70. def groupBy[K](f: (R) ⇒ K, p: Partitioner)(implicit kt: ClassTag[K], ord: Ordering[K]): RDD[(K, Iterable[R])]
    Definition Classes
    RDD
  71. def groupBy[K](f: (R) ⇒ K, numPartitions: Int)(implicit kt: ClassTag[K]): RDD[(K, Iterable[R])]
    Definition Classes
    RDD
  72. def groupBy[K](f: (R) ⇒ K)(implicit kt: ClassTag[K]): RDD[(K, Iterable[R])]
    Definition Classes
    RDD
  73. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  74. val id: Int
    Definition Classes
    RDD
  75. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  76. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  77. def intersection(other: RDD[R], numPartitions: Int): RDD[R]
    Definition Classes
    RDD
  78. def intersection(other: RDD[R], partitioner: Partitioner)(implicit ord: Ordering[R]): RDD[R]
    Definition Classes
    RDD
  79. def intersection(other: RDD[R]): RDD[R]
    Definition Classes
    RDD
  80. lazy val isBarrier_: Boolean
    Attributes
    protected
    Definition Classes
    RDD
    Annotations
    @transient()
  81. def isCheckpointed: Boolean
    Definition Classes
    RDD
  82. def isEmpty(): Boolean
    Definition Classes
    RDD
  83. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  84. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  85. final def iterator(split: Partition, context: TaskContext): Iterator[R]
    Definition Classes
    RDD
  86. def keyBy[K]()(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Extracts a key of the given class from all the available columns.

    Extracts a key of the given class from all the available columns.

    See also

    keyBy(ColumnSelector)

  87. def keyBy[K](columns: ColumnRef*)(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Extracts a key of the given class from the given columns.

    Extracts a key of the given class from the given columns.

    See also

    keyBy(ColumnSelector)

  88. def keyBy[K](columns: ColumnSelector)(implicit classtag: ClassTag[K], rrf: RowReaderFactory[K], rwf: RowWriterFactory[K]): CassandraTableScanRDD[(K, R)]

    Selects a subset of columns mapped to the key and returns an RDD of pairs.

    Selects a subset of columns mapped to the key and returns an RDD of pairs. Similar to the builtin Spark keyBy method, but this one uses implicit RowReaderFactory to construct the key objects. The selected columns must be available in the CassandraRDD.

    If the selected columns contain the complete partition key a CassandraPartitioner will also be created.

    columns

    column selector passed to the rrf to create the row reader, useful when the key is mapped to a tuple or a single value

  89. def keyBy[K](f: (R) ⇒ K): RDD[(K, R)]
    Definition Classes
    RDD
  90. val keyspaceName: String
  91. def limit(rowLimit: Long): Self

    Adds the limit clause to CQL select statement.

    Adds the limit clause to CQL select statement. The limit will be applied for each created Spark partition. In other words, unless the data are fetched from a single Cassandra partition the number of results is unpredictable.

    The main purpose of passing limit clause is to fetch top n rows from a single Cassandra partition when the table is designed so that it uses clustering keys and a partition key predicate is passed to the where clause.

    Definition Classes
    CassandraRDD
  92. val limit: Option[CassandraLimit]
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  93. def localCheckpoint(): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  94. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  95. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  96. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  97. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  98. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  99. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  100. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  101. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  102. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  103. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  104. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  105. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  106. def map[U](f: (R) ⇒ U)(implicit arg0: ClassTag[U]): RDD[U]
    Definition Classes
    RDD
  107. def mapPartitions[U](f: (Iterator[R]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]
    Definition Classes
    RDD
  108. def mapPartitionsWithIndex[U](f: (Int, Iterator[R]) ⇒ Iterator[U], preservesPartitioning: Boolean)(implicit arg0: ClassTag[U]): RDD[U]
    Definition Classes
    RDD
  109. def max()(implicit ord: Ordering[R]): R
    Definition Classes
    RDD
  110. def min()(implicit ord: Ordering[R]): R
    Definition Classes
    RDD
  111. def minimalSplitCount: Int
  112. var name: String
    Definition Classes
    RDD
  113. def narrowColumnSelection(columns: Seq[ColumnRef]): Seq[ColumnRef]

    Filters currently selected set of columns with a new set of columns

    Filters currently selected set of columns with a new set of columns

    Definition Classes
    CassandraTableRowReaderProvider
  114. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  115. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  116. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @HotSpotIntrinsicCandidate()
  117. def parent[U](j: Int)(implicit arg0: ClassTag[U]): RDD[U]
    Attributes
    protected[org.apache.spark]
    Definition Classes
    RDD
  118. lazy val partitionGenerator: CassandraPartitionGenerator[V, T]
    Annotations
    @transient()
  119. val partitioner: Option[Partitioner]
    Definition Classes
    CassandraTableScanRDD → RDD
  120. final def partitions: Array[Partition]
    Definition Classes
    RDD
  121. def perPartitionLimit(rowLimit: Long): Self

    Adds the PER PARTITION LIMIT clause to CQL select statement.

    Adds the PER PARTITION LIMIT clause to CQL select statement. The limit will be applied for every Cassandra Partition. Only Valid For Cassandra 3.6+

    Definition Classes
    CassandraRDD
  122. def persist(): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  123. def persist(newLevel: StorageLevel): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  124. def pipe(command: Seq[String], env: Map[String, String], printPipeContext: ((String) ⇒ Unit) ⇒ Unit, printRDDElement: (R, (String) ⇒ Unit) ⇒ Unit, separateWorkingDir: Boolean, bufferSize: Int, encoding: String): RDD[String]
    Definition Classes
    RDD
  125. def pipe(command: String, env: Map[String, String]): RDD[String]
    Definition Classes
    RDD
  126. def pipe(command: String): RDD[String]
    Definition Classes
    RDD
  127. final def preferredLocations(split: Partition): Seq[String]
    Definition Classes
    RDD
  128. def randomSplit(weights: Array[Double], seed: Long): Array[RDD[R]]
    Definition Classes
    RDD
  129. val readConf: ReadConf
  130. def reduce(f: (R, R) ⇒ R): R
    Definition Classes
    RDD
  131. def repartition(numPartitions: Int)(implicit ord: Ordering[R]): RDD[R]
    Definition Classes
    RDD
  132. lazy val rowReader: RowReader[R]
  133. implicit val rowReaderFactory: RowReaderFactory[R]

    RowReaderFactory and ClassTag should be provided from implicit parameters in the constructor of the class implementing this trait

    RowReaderFactory and ClassTag should be provided from implicit parameters in the constructor of the class implementing this trait

    Definition Classes
    CassandraTableScanRDDCassandraTableRowReaderProvider
    See also

    CassandraTableScanRDD

  134. def sample(withReplacement: Boolean, fraction: Double, seed: Long): RDD[R]
    Definition Classes
    RDD
  135. def saveAsObjectFile(path: String): Unit
    Definition Classes
    RDD
  136. def saveAsTextFile(path: String, codec: Class[_ <: CompressionCodec]): Unit
    Definition Classes
    RDD
  137. def saveAsTextFile(path: String): Unit
    Definition Classes
    RDD
  138. val sc: SparkContext
  139. def select(columns: ColumnRef*): Self

    Narrows down the selected set of columns.

    Narrows down the selected set of columns. Use this for better performance, when you don't need all the columns in the result RDD. When called multiple times, it selects the subset of the already selected columns, so after a column was removed by the previous select call, it is not possible to add it back.

    The selected columns are ColumnRef instances. This type allows to specify columns for straightforward retrieval and to read TTL or write time of regular columns as well. Implicit conversions included in com.datastax.spark.connector package make it possible to provide just column names (which is also backward compatible) and optional add .ttl or .writeTime suffix in order to create an appropriate ColumnRef instance.

    Definition Classes
    CassandraRDD
  140. def selectedColumnNames: Seq[String]
    Definition Classes
    CassandraRDD
  141. lazy val selectedColumnRefs: IndexedSeq[ColumnRef]

    Returns the columns to be selected from the table.

    Returns the columns to be selected from the table.

    Definition Classes
    CassandraTableRowReaderProvider
  142. def setName(_name: String): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  143. def sortBy[K](f: (R) ⇒ K, ascending: Boolean, numPartitions: Int)(implicit ord: Ordering[K], ctag: ClassTag[K]): RDD[R]
    Definition Classes
    RDD
  144. def sparkContext: SparkContext
    Definition Classes
    RDD
  145. def splitCount: Option[Int]
    Attributes
    protected
    Definition Classes
    CassandraTableRowReaderProvider
  146. def splitSize: Long
    Attributes
    protected[spark.connector]
    Definition Classes
    CassandraTableRowReaderProvider
  147. def subtract(other: RDD[R], p: Partitioner)(implicit ord: Ordering[R]): RDD[R]
    Definition Classes
    RDD
  148. def subtract(other: RDD[R], numPartitions: Int): RDD[R]
    Definition Classes
    RDD
  149. def subtract(other: RDD[R]): RDD[R]
    Definition Classes
    RDD
  150. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  151. lazy val tableDef: TableDef
  152. val tableName: String
  153. def take(num: Int): Array[R]
    Definition Classes
    CassandraRDD → RDD
  154. def takeOrdered(num: Int)(implicit ord: Ordering[R]): Array[R]
    Definition Classes
    RDD
  155. def takeSample(withReplacement: Boolean, num: Int, seed: Long): Array[R]
    Definition Classes
    RDD
  156. def toDebugString: String
    Definition Classes
    RDD
  157. def toEmptyCassandraRDD: EmptyCassandraRDD[R]
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  158. def toJavaRDD(): JavaRDD[R]
    Definition Classes
    RDD
  159. def toLocalIterator: Iterator[R]
    Definition Classes
    RDD
  160. def toString(): String
    Definition Classes
    RDD → AnyRef → Any
  161. def top(num: Int)(implicit ord: Ordering[R]): Array[R]
    Definition Classes
    RDD
  162. def treeAggregate[U](zeroValue: U)(seqOp: (U, R) ⇒ U, combOp: (U, U) ⇒ U, depth: Int)(implicit arg0: ClassTag[U]): U
    Definition Classes
    RDD
  163. def treeReduce(f: (R, R) ⇒ R, depth: Int): R
    Definition Classes
    RDD
  164. def union(other: RDD[R]): RDD[R]
    Definition Classes
    RDD
  165. def unpersist(blocking: Boolean): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
  166. def verify(): RowReader[R]

    Checks for existence of keyspace and table.

    Checks for existence of keyspace and table.

    Definition Classes
    CassandraTableRowReaderProvider
  167. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  168. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... ) @native()
  169. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  170. def where(cql: String, values: Any*): Self

    Adds a CQL WHERE predicate(s) to the query.

    Adds a CQL WHERE predicate(s) to the query. Useful for leveraging secondary indexes in Cassandra. Implicitly adds an ALLOW FILTERING clause to the WHERE clause, however beware that some predicates might be rejected by Cassandra, particularly in cases when they filter on an unindexed, non-clustering column.

    Definition Classes
    CassandraRDD
  171. val where: CqlWhereClause
    Definition Classes
    CassandraTableScanRDDCassandraRDD
  172. def withAscOrder: Self
    Definition Classes
    CassandraRDD
  173. def withConnector(connector: CassandraConnector): Self

    Returns a copy of this Cassandra RDD with specified connector

    Returns a copy of this Cassandra RDD with specified connector

    Definition Classes
    CassandraRDD
  174. def withDescOrder: Self
    Definition Classes
    CassandraRDD
  175. def withReadConf(readConf: ReadConf): Self

    Allows to set custom read configuration, e.g.

    Allows to set custom read configuration, e.g. consistency level or fetch size.

    Definition Classes
    CassandraRDD
  176. def withResources(rp: ResourceProfile): CassandraTableScanRDD.this.type
    Definition Classes
    RDD
    Annotations
    @Experimental() @Since( "3.1.0" )
  177. def zip[U](other: RDD[U])(implicit arg0: ClassTag[U]): RDD[(R, U)]
    Definition Classes
    RDD
  178. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D])(f: (Iterator[R], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  179. def zipPartitions[B, C, D, V](rdd2: RDD[B], rdd3: RDD[C], rdd4: RDD[D], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B], Iterator[C], Iterator[D]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[D], arg3: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  180. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C])(f: (Iterator[R], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  181. def zipPartitions[B, C, V](rdd2: RDD[B], rdd3: RDD[C], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B], Iterator[C]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[C], arg2: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  182. def zipPartitions[B, V](rdd2: RDD[B])(f: (Iterator[R], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  183. def zipPartitions[B, V](rdd2: RDD[B], preservesPartitioning: Boolean)(f: (Iterator[R], Iterator[B]) ⇒ Iterator[V])(implicit arg0: ClassTag[B], arg1: ClassTag[V]): RDD[V]
    Definition Classes
    RDD
  184. def zipWithIndex(): RDD[(R, Long)]
    Definition Classes
    RDD
  185. def zipWithUniqueId(): RDD[(R, Long)]
    Definition Classes
    RDD

Deprecated Value Members

  1. def finalize(): Unit
    Attributes
    protected[lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] ) @Deprecated @deprecated
    Deprecated

    (Since version ) see corresponding Javadoc for more information.

Inherited from CassandraRDD[R]

Inherited from RDD[R]

Inherited from Logging

Inherited from Serializable

Inherited from Serializable

Inherited from AnyRef

Inherited from Any

Ungrouped