Datacenter-aware Round Robin Policy
A specialized round-robin load balancing policy that queries remote datacenters only when all local nodes are down. It round-robins requests across hosts in the local datacenter and falls back to remote datacenters if necessary. The name of the local datacenter is normally supplied by the user; when it is omitted, the first datacenter seen is treated as local.
By default, this policy will not actually fall back to nodes of a remote datacenter. To enable fallback, pass the maximum number of remote hosts to use when constructing a policy instance. A nil value means that an unlimited number of remote hosts can be used.
By default, this policy will not attempt to use remote hosts for local consistencies (:local_one or :local_quorum). This behavior can be changed via the constructor.
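The settings described above (the remote host limit and the local consistency switch) can be pictured with a toy model. The sketch below is plain Ruby and is not the driver's implementation; `candidate_hosts` and its keyword arguments are hypothetical names chosen for illustration, and host lists are assumed to contain only nodes that are currently up.

```ruby
# Toy model of the fallback rules described above; NOT the driver's code.
LOCAL_CONSISTENCIES = [:local_one, :local_quorum].freeze

# Returns the ordered list of hosts a single request may try.
#   local, remote        - arrays of addresses of hosts that are up
#   max_remote           - 0 disables fallback, nil means no limit
#   use_remote_for_local - allow remote hosts for local consistencies
def candidate_hosts(local, remote, consistency:, max_remote: 0, use_remote_for_local: false)
  # Local consistencies stay in the local datacenter unless explicitly allowed out.
  return local.dup if LOCAL_CONSISTENCIES.include?(consistency) && !use_remote_for_local

  allowed = max_remote.nil? ? remote : remote.first(max_remote)
  local + allowed
end

remote = ['127.0.0.1', '127.0.0.2']

# All local nodes are down in every call below.
p candidate_hosts([], remote, consistency: :one)                         # => []
p candidate_hosts([], remote, consistency: :one, max_remote: 1)          # => ["127.0.0.1"]
p candidate_hosts([], remote, consistency: :one, max_remote: nil)        # => ["127.0.0.1", "127.0.0.2"]
p candidate_hosts([], remote, consistency: :local_one, max_remote: nil)  # => []
p candidate_hosts([], remote, consistency: :local_one, max_remote: nil,
                  use_remote_for_local: true)                            # => ["127.0.0.1", "127.0.0.2"]
```

In this model an empty list stands for the situation in which the real driver raises Cassandra::Errors::NoHostsAvailable.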
Background
- Given a running cassandra cluster in 2 datacenters with 2 nodes in each
- And the following schema:
    CREATE KEYSPACE simplex WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 2, 'dc2': 2};
    CREATE TABLE simplex.songs (
      id uuid PRIMARY KEY,
      title text,
      album text,
      artist text,
      tags set<text>,
      data blob
    );
    INSERT INTO simplex.songs (id, title, album, artist, tags) VALUES (
      756716f7-2e54-4715-9f00-91dcbea6cf50,
      'La Petite Tonkinoise', 'Bye Bye Blackbird', 'Joséphine Baker', {'jazz', '2013'}
    );
    INSERT INTO simplex.songs (id, title, album, artist, tags) VALUES (
      f6071e72-48ec-4fcb-bf3e-379c8a696488,
      'Die Mösch', 'In Gold', 'Willi Ostermann', {'kölsch', '1996', 'birds'}
    );
    INSERT INTO simplex.songs (id, title, album, artist, tags) VALUES (
      fbdf82ed-0063-4796-9c7c-a3d4f47b4b25,
      'Memo From Turner', 'Performance', 'Mick Jager', {'soundtrack', '1991'}
    );
First seen datacenter is considered local when not explicitly given
- Given the following example:
    require 'cassandra'

    # No datacenter name is given, so the first datacenter seen becomes local.
    policy  = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new
    hosts   = ['127.0.0.3', '127.0.0.4']
    cluster = Cassandra.cluster(hosts: hosts, load_balancing_policy: policy)
    session = cluster.connect('simplex')

    # Run the query four times and collect the distinct hosts that served it.
    hosts_used = 4.times.map do
      info = session.execute("SELECT * FROM songs").execution_info
      info.hosts.last.ip
    end.sort.uniq

    puts hosts_used
- When it is executed
- Then its output should contain:
    127.0.0.3
    127.0.0.4
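One way to picture the "first seen datacenter" rule is a small tracker that adopts the datacenter of the first host it is told about. This is a hedged sketch in plain Ruby: `LocalDatacenterGuess` and its methods are hypothetical, the discovery order is an assumption, and the driver's actual bookkeeping may differ.

```ruby
# Hypothetical model: when no local datacenter name was supplied, remember
# the datacenter of the first host reported as up.
class LocalDatacenterGuess
  attr_reader :datacenter, :local_hosts

  def initialize(datacenter = nil)
    @datacenter  = datacenter
    @local_hosts = []
  end

  # Called once per discovered host, in discovery order.
  def host_up(address, host_datacenter)
    @datacenter ||= host_datacenter
    @local_hosts << address if host_datacenter == @datacenter
  end
end

guess = LocalDatacenterGuess.new
# Assumed order: the contact points (127.0.0.3 and 127.0.0.4) are seen first.
guess.host_up('127.0.0.3', 'dc2')
guess.host_up('127.0.0.4', 'dc2')
guess.host_up('127.0.0.1', 'dc1')
guess.host_up('127.0.0.2', 'dc1')

puts guess.datacenter   # prints dc2
puts guess.local_hosts  # prints 127.0.0.3 and 127.0.0.4, one per line
```

Under that assumption, choosing contact points in the intended datacenter is what makes it local, which is consistent with the output of the scenario above.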
Requests are automatically routed to local datacenter
- Given the following example:
    require 'cassandra'

    # Name the local datacenter explicitly.
    datacenter = "dc2"
    policy     = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter)
    cluster    = Cassandra.cluster(load_balancing_policy: policy, consistency: :one)
    session    = cluster.connect('simplex')

    # Run the query four times and collect the distinct hosts that served it.
    hosts_used = 4.times.map do
      info = session.execute("SELECT * FROM songs").execution_info
      info.hosts.last.ip
    end.sort.uniq

    puts hosts_used
- When it is executed
- Then its output should contain:
    127.0.0.3
    127.0.0.4
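The examples issue four requests and then apply `sort.uniq` because round-robin rotation spreads consecutive requests over all local hosts. The following is a minimal sketch of such a rotation in plain Ruby; the `Rotation` class is an illustrative assumption and not part of the driver.

```ruby
# Toy round-robin rotation; names are illustrative, not the driver's API.
class Rotation
  def initialize(hosts)
    @hosts    = hosts
    @position = 0
  end

  # Returns the hosts in the order one request would try them, then
  # advances the starting point for the next request.
  def next_plan
    return [] if @hosts.empty?

    plan = @hosts.rotate(@position)
    @position = (@position + 1) % @hosts.size
    plan
  end
end

rotation      = Rotation.new(['127.0.0.3', '127.0.0.4'])
first_choices = 4.times.map { rotation.next_plan.first }

p first_choices
# => ["127.0.0.3", "127.0.0.4", "127.0.0.3", "127.0.0.4"]
puts first_choices.sort.uniq  # prints 127.0.0.3 and 127.0.0.4, one per line
```

With two local hosts, four requests are enough for each host to be the first choice twice, so both addresses appear after deduplication.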
Requests are routed to remote datacenters if local datacenter is down
- Given the following example:
    require 'cassandra'

    datacenter     = "dc2"
    # nil places no limit on the number of remote hosts that can be used.
    remotes_to_try = nil
    policy         = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, remotes_to_try)
    cluster        = Cassandra.cluster(load_balancing_policy: policy, consistency: :one)
    session        = cluster.connect('simplex')

    # Run the query four times and collect the distinct hosts that served it.
    hosts_used = 4.times.map do
      info = session.execute("SELECT * FROM songs").execution_info
      info.hosts.last.ip
    end.sort.uniq

    puts hosts_used
- And node 3 is stopped
- And node 4 is stopped
- When it is executed
- Then its output should contain:
    127.0.0.1
    127.0.0.2
Requests are routed up to a maximum number of hosts in remote datacenters
- Given the following example:
    require 'cassandra'

    datacenter     = "dc2"
    # Use at most one remote host.
    remotes_to_try = 1
    policy         = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, remotes_to_try)
    cluster        = Cassandra.cluster(load_balancing_policy: policy, consistency: :one)
    session        = cluster.connect('simplex')

    # Run the query four times and collect the distinct hosts that served it.
    hosts_used = 4.times.map do
      info = session.execute("SELECT * FROM songs").execution_info
      info.hosts.last.ip
    end.sort.uniq

    puts "Used #{hosts_used.size} host, with ip #{hosts_used.first}"
- And node 3 is stopped
- And node 4 is stopped
- When it is executed
- Then its output should match:
    Used 1 host, with ip 127\.0\.0\.(1|2)
Requests with local consistencies are not routed to remote datacenters by default
- Given the following example:
    require 'cassandra'

    datacenter     = "dc2"
    remotes_to_try = nil
    policy         = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, remotes_to_try)
    cluster        = Cassandra.cluster(load_balancing_policy: policy, consistency: :one)
    session        = cluster.connect('simplex')

    # A local consistency is not routed to remote hosts by default, so the
    # request is expected to fail once all local nodes are down.
    begin
      session.execute("SELECT * FROM songs", consistency: :local_one)
      puts "failure"
    rescue Cassandra::Errors::NoHostsAvailable
      puts "success"
    end
- And node 3 is stopped
- And node 4 is stopped
- When it is executed
- Then its output should contain:
    success
Routing requests with local consistencies to remote datacenters
- Given the following example:
    require 'cassandra'

    datacenter     = "dc2"
    remotes_to_try = nil
    # Allow remote hosts to be used for local consistencies.
    use_remote     = true
    policy         = Cassandra::LoadBalancing::Policies::DCAwareRoundRobin.new(datacenter, remotes_to_try, use_remote)
    cluster        = Cassandra.cluster(load_balancing_policy: policy)
    session        = cluster.connect('simplex')

    # Run the query four times and collect the distinct hosts that served it.
    hosts_used = 4.times.map do
      info = session.execute("SELECT * FROM songs", consistency: :local_one).execution_info
      info.hosts.last.ip
    end.sort.uniq

    puts hosts_used
- And node 3 is stopped
- And node 4 is stopped
- When it is executed
- Then its output should contain:
    127.0.0.1
    127.0.0.2