package com.datastax.spark.connector

The root package of the Cassandra connector for Apache Spark. Offers implicit conversions that add Cassandra-specific methods to SparkContext and RDD.

Call the cassandraTable method on the SparkContext object to create a CassandraRDD that exposes a Cassandra table as a Spark RDD.

Call the saveToCassandra method of RDDFunctions on any RDD to save a distributed collection to a Cassandra table.

Example:

CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1 };
CREATE TABLE test.words (word text PRIMARY KEY, count int);
INSERT INTO test.words(word, count) VALUES ('and', 50);

import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

val sparkMasterHost = "127.0.0.1"
val cassandraHost = "127.0.0.1"
val keyspace = "test"
val table = "words"

// Tell Spark the address of one Cassandra node:
val conf = new SparkConf(true).set("spark.cassandra.connection.host", cassandraHost)

// Connect to the Spark cluster:
val sc = new SparkContext("spark://" + sparkMasterHost + ":7077", "example", conf)

// Read the table and print its contents:
val rdd = sc.cassandraTable(keyspace, table)
rdd.collect().foreach(println)

// Write two rows to the table:
val col = sc.parallelize(Seq(("of", 1200), ("the", 863)))
col.saveToCassandra(keyspace, table)

sc.stop()
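
As a complement to the example above, the following is a minimal sketch of reading selected columns through CassandraRow and naming the target columns explicitly with SomeColumns when saving. The object name WordsColumns and the helper readAndWrite are hypothetical, and the sketch assumes a SparkContext configured as in the example and the test.words table created above.

```scala
import com.datastax.spark.connector._
import org.apache.spark.SparkContext

object WordsColumns {
  // Hypothetical helper; `sc` is assumed to be configured as in the example above.
  def readAndWrite(sc: SparkContext, keyspace: String, table: String): Unit = {
    // Select only the named columns; each String is converted to a ColumnName implicitly.
    val rows = sc.cassandraTable(keyspace, table).select("word", "count")

    // CassandraRow offers typed getters keyed by column name.
    rows.collect().foreach { row =>
      println(row.getString("word") + " -> " + row.getInt("count"))
    }

    // Name the target columns explicitly instead of relying on the default selector.
    sc.parallelize(Seq(("a", 10), ("to", 20)))
      .saveToCassandra(keyspace, table, SomeColumns("word", "count"))
  }
}
```

Listing the columns explicitly keeps the write independent of the column order in the table definition, which may be preferable once the table has more columns than the tuple has fields.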
Linear Supertypes
AnyRef, Any

Type Members

  1. trait AbstractGettableData extends AnyRef

  2. sealed trait BatchSize extends AnyRef

  3. case class BytesInBatch(batchSize: Int) extends BatchSize with Product with Serializable

  4. final class CassandraRow extends ScalaGettableData with Serializable

    Represents a single row fetched from Cassandra.

  5. case class ColumnIndex(columnIndex: Int) extends ColumnRef with Product with Serializable

    References a column by its index in the row.

  6. case class ColumnName(columnName: String, alias: Option[String] = scala.None) extends NamedColumnRef with Product with Serializable

    References a column by name.

  7. implicit final class ColumnNameFunctions extends AnyVal

  8. class ColumnNotFoundException extends Exception

    Thrown when the requested column does not exist in the result set.

  9. sealed trait ColumnRef extends AnyRef

    Unambiguous reference to a column in the query result set row.

  10. sealed trait ColumnSelector extends AnyRef

  11. sealed trait NamedColumnRef extends SelectableColumnRef

    A selectable column based on a real, non-virtual column with a name in the table.

  12. class PairRDDFunctions[K, V] extends Serializable

  13. class RDDFunctions[T] extends WritableToCassandra[T] with Serializable

    Provides Cassandra-specific methods on RDD.

  14. case class RowsInBatch(batchSize: Int) extends BatchSize with Product with Serializable

  15. trait ScalaGettableData extends AbstractGettableData

  16. sealed trait SelectableColumnRef extends ColumnRef

    A column that can be selected from a CQL result set by name.

  17. case class SomeColumns(columns: SelectableColumnRef*) extends ColumnSelector with Product with Serializable

  18. class SparkContextFunctions extends Serializable

    Provides Cassandra-specific methods on SparkContext.

  19. case class TTL(columnName: String, alias: Option[String] = scala.None) extends NamedColumnRef with Product with Serializable

  20. final class UDTValue extends ScalaGettableData with Serializable

  21. case class WriteTime(columnName: String, alias: Option[String] = scala.None) extends NamedColumnRef with Product with Serializable
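
The BatchSize variants listed above carry no description here, so the following is only a sketch of how they might be used. It assumes that WriteConf in the writer package accepts a batchSize parameter of type BatchSize and that saveToCassandra accepts a writeConf argument; check both against the version of the connector in use. The object name BatchSizeSketch and the helper saveInSmallBatches are hypothetical, and the numeric limits are arbitrary.

```scala
import com.datastax.spark.connector._
import com.datastax.spark.connector.writer.WriteConf
import org.apache.spark.SparkContext

object BatchSizeSketch {
  // Hypothetical helper; assumes WriteConf exposes a `batchSize` parameter of type BatchSize.
  def saveInSmallBatches(sc: SparkContext): Unit = {
    // Limit each batch by number of rows ...
    val byRows = WriteConf(batchSize = RowsInBatch(100))
    // ... or, alternatively, by size in bytes (not used below).
    val byBytes = WriteConf(batchSize = BytesInBatch(16 * 1024))

    val words = sc.parallelize(Seq(("of", 1200), ("the", 863)))
    words.saveToCassandra("test", "words", SomeColumns("word", "count"), writeConf = byRows)
  }
}
```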

Value Members

  1. object AbstractGettableData

  2. object AllColumns extends ColumnSelector with Product with Serializable

  3. object BatchSize

  4. object CassandraRow extends Serializable

  5. object NamedColumnRef

  6. object PartitionKeyColumns extends ColumnSelector with Product with Serializable

  7. object RowCountRef extends SelectableColumnRef with Product with Serializable

  8. object SelectableColumnRef

  9. object SomeColumns extends Serializable

  10. object UDTValue extends Serializable

  11. package cql

    Contains a cql.CassandraConnector object which is used to connect to a Cassandra cluster and to send CQL statements to it.

  12. package mapper

    Provides machinery for mapping Cassandra tables to user defined Scala classes or tuples.

  13. package metrics

  14. package rdd

    Contains com.datastax.spark.connector.rdd.CassandraTableScanRDD class that is the main entry point for analyzing Cassandra data from Spark.

  15. package streaming

  16. implicit def toNamedColumnRef(columnName: String): ColumnName

  17. implicit def toPairRDDFunctions[K, V](rdd: RDD[(K, V)]): PairRDDFunctions[K, V]

  18. implicit def toRDDFunctions[T](rdd: RDD[T]): RDDFunctions[T]

  19. implicit def toSparkContextFunctions(sc: SparkContext): SparkContextFunctions

  20. package types

    Offers type conversions, so you can receive Cassandra column values in the form you prefer.

  21. package util

    Utilities that did not fit elsewhere.

  22. package writer

    Contains components for writing RDDs to Cassandra.
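
The implicit definitions listed above (toSparkContextFunctions, toRDDFunctions, toPairRDDFunctions) are what let Cassandra-specific methods appear on SparkContext and RDD after importing com.datastax.spark.connector._. The self-contained sketch below illustrates the general Scala pattern only. FakeContext, FakeContextFunctions, FakeConnector and describeTable are hypothetical stand-ins and not part of Spark or the connector.

```scala
import scala.language.implicitConversions

// Hypothetical stand-in for SparkContext; not part of Spark or the connector.
class FakeContext(val appName: String)

// Wrapper holding the extra methods, analogous in role to SparkContextFunctions.
class FakeContextFunctions(ctx: FakeContext) {
  // Hypothetical method; the real cassandraTable returns an RDD, not a String.
  def describeTable(keyspace: String, table: String): String =
    s"${ctx.appName} reads $keyspace.$table"
}

object FakeConnector {
  // Plays the role of toSparkContextFunctions: converts the context to its wrapper.
  implicit def toFakeContextFunctions(ctx: FakeContext): FakeContextFunctions =
    new FakeContextFunctions(ctx)
}

object ImplicitSketch {
  import FakeConnector._

  def demo(): String = {
    val ctx = new FakeContext("example")
    // FakeContext has no describeTable method; the compiler inserts the conversion.
    ctx.describeTable("test", "words")
  }
}
```

Because the conversion only applies where it is imported, code that does not import the enclosing object sees the plain class without the added methods.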
