There are several parts of cstar_perf that can be used as a whole or as individual components (see Architecture). This guide walks through the installation of cstar_perf.tool, which is the core of what you need to start benchmarking Cassandra. The next chapter focuses on setting up cstar_perf.frontend, which provides a full web-based interface for scheduling tests, archiving results, and monitoring multiple clusters.

Setup cstar_perf.tool

cstar_perf.tool is the core module of cstar_perf. It is what bootstraps Cassandra and runs performance tests. You should install it on a machine within the same network as your Cassandra cluster. It’s best to dedicate a machine to it, since it is the machine that runs cassandra-stress and ideally should be free of resource contention. If you don’t have an extra machine, you can install it on the same machine as one of your Cassandra nodes; just be aware of the performance penalty you introduce by doing so.

In this example, we have four computers:

             +------> cnode1
             |      10.0.0.101
             |
 stress1 +----------> cnode2
10.0.0.100   |      10.0.0.102
             |
             +------> cnode3
                    10.0.0.103
  • stress1 is the node hosting cstar_perf.tool.
  • cnode1, cnode2, and cnode3 are Cassandra nodes. These nodes have 4 SSDs for data storage, mounted at /mnt/d1, /mnt/d2, /mnt/d3, and /mnt/d4.

Setting up your cluster

Key-based SSH access

The machine hosting cstar_perf.tool should have key-based SSH access to the Cassandra cluster, for both your regular user account and root.

In our example, from your user account on stress1 you should be able to run ssh your_username@cnode1 as well as ssh root@cnode1 without any password prompts.

When generating SSH keys, it works best if you don’t specify a passphrase. You can use an SSH agent if you are uncomfortable doing this, but be aware that things will stop working whenever that agent isn’t running (after system reboots, when you’re not logged in, etc.).
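
If you need to set this up from scratch, here is a rough sketch using OpenSSH (it assumes root logins are permitted on the Cassandra nodes and that ssh-copy-id is available; adjust to your environment):

# On stress1: generate a passphrase-less key pair (skip if you already have one)
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa

# Copy the public key to your user account and to root on each Cassandra node
for node in cnode1 cnode2 cnode3; do
    ssh-copy-id your_username@$node
    ssh-copy-id root@$node
done

# Verify that no password prompts appear
ssh your_username@cnode1 hostname
ssh root@cnode1 hostname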

Software requirements

The machine running cstar_perf.tool needs to have the following packages installed:

  • Python 2.7
  • Python 2.7 development packages (python-dev on Debian)
  • pip (python-pip on Debian)
  • git

The Cassandra nodes also need to have the following:

  • Python 2.7
  • git
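
On a Debian or Ubuntu system (an assumption; package names will differ on other distributions), these could be installed roughly as follows:

# On stress1
sudo apt-get install python2.7 python-dev python-pip git

# On each Cassandra node
sudo apt-get install python2.7 git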

In addition, you need to prepare a ~/fab directory to install on each of your nodes. This will contain the JDK as well as a copy of Ant. Prepare this directory on the controller node (stress1 in our example) and then rsync it to the others. Here’s an example of setting this up on 64-bit Linux with Java 7u67 and Ant 1.9.4 (the download links may change, so modify accordingly):

mkdir ~/fab
cd ~/fab
wget --no-cookies --header "Cookie: oraclelicense=accept-securebackup-cookie;" http://download.oracle.com/otn-pub/java/jdk/7u67-b01/jdk-7u67-linux-x64.tar.gz
tar xfv jdk-7u67-linux-x64.tar.gz
rm jdk-7u67-linux-x64.tar.gz
ln -s jdk1.7.0_67 java
wget http://archive.apache.org/dist/ant/binaries/apache-ant-1.9.4-bin.tar.bz2
tar xfv apache-ant-1.9.4-bin.tar.bz2
rm apache-ant-1.9.4-bin.tar.bz2
ln -s apache-ant-1.9.4 ant

The end result is that we can invoke java from ~/fab/java/bin/java and ant from ~/fab/ant/bin/ant.
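
As a quick sanity check, both tools should run from those paths and report their versions:

~/fab/java/bin/java -version
~/fab/ant/bin/ant -version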

Copy this directory to each of your Cassandra nodes:

rsync -av ~/fab cnode1:
rsync -av ~/fab cnode2:
rsync -av ~/fab cnode3:

You’ll know your SSH keys are set up correctly if copying those files didn’t prompt you for any passwords.

Cassandra Stress

Additionally, you need to download and build cassandra-stress on the node hosting cstar_perf.tool (stress1 in our example). This only needs to be done on the controller node:

mkdir ~/fab/stress
cd ~/fab/stress
git clone http://git-wip-us.apache.org/repos/asf/cassandra.git
cd cassandra
git checkout cassandra-2.1
JAVA_HOME=~/fab/java ~/fab/ant/bin/ant clean jar
cd ..
mv cassandra cassandra-2.1
ln -s cassandra-2.1 default

The end result is that cassandra-stress is available at ~/fab/stress/default/tools/bin/cassandra-stress. If this build succeeded, you know Java and Ant are installed correctly.
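
To confirm the build, you can invoke the tool directly; for example (the exact usage output will vary by version):

~/fab/stress/default/tools/bin/cassandra-stress help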

Install cstar_perf.tool

Finally, you should install cstar_perf.tool onto your designated machine (stress1 in our example):

pip install cstar_perf.tool

Depending on your environment, this may need to be run as root.
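
If you prefer not to install into the system Python, a sketch of an isolated install using virtualenv (an assumption; any Python 2.7 virtualenv setup will do) looks like this:

# May also need to be run as root, depending on your environment
pip install virtualenv
virtualenv ~/cstar_perf_env
~/cstar_perf_env/bin/pip install cstar_perf.tool
# cstar_perf_bootstrap and the other tools will then live in ~/cstar_perf_env/bin/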

Configuration

cstar_perf.tool needs to know about your cluster. For this you need to create a JSON file located in ~/.cstar_perf/cluster_config.json. Here’s the config for our example cluster:

{
    "commitlog_directory": "/mnt/d1/commitlog",
    "data_file_directories": [
        "/mnt/d2/data",
        "/mnt/d3/data",
        "/mnt/d4/data"
    ],
    "block_devices": [
        "/dev/sdb",
        "/dev/sdc",
        "/dev/sdd",
        "/dev/sde"
    ],
    "blockdev_readahead": "256",
    "hosts": {
        "cnode1": {
            "internal_ip": "10.0.0.101",
            "hostname": "cnode1",
            "seed": true
        },
        "cnode2": {
            "internal_ip": "10.0.0.102",
            "hostname": "cnode2",
            "seed": true
        },
        "cnode3": {
            "internal_ip": "10.0.0.103",
            "hostname": "cnode3",
            "seed": true
        }
    },
    "user": "your_username",
    "name": "example1",
    "saved_caches_directory": "/mnt/d2/saved_caches"
}

If you want to use DSE and install it from a tarball, you can add the following keys:

"dse_url": "http://my-dse-repo/tar/",
"dse_username": "XXXX",
"dse_password": "YYYY"

If you want to use DSE and install it from a source branch, you can add the following keys:

"dse_source_build_artifactory_url": "https://dse-artifactory-url.com",
"dse_source_build_artifactory_username": "dse-artifactory-username",
"dse_source_build_artifactory_password": "dse-artifactory-password",
"dse_source_build_oauth_token": "dse-oauth-token-for-github-access"

The required settings:

  • hosts - all of your Cassandra nodes need to be listed here, including hostname and IP address.
  • name - the name you want to give to this cluster.
  • block_devices - The physical block devices that Cassandra is using to store data and commitlogs.
  • blockdev_readahead - The default block device readahead setting for your drives (get it by running blockdev --getra /dev/DEVICE; see the example after this list)
  • user - The user account that you use on the Cassandra nodes.
  • dse_** - Only if you want DSE support.
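
For example, to look up the readahead value for one of the block devices listed in the example config (run on a Cassandra node, or over SSH from stress1):

ssh root@cnode1 blockdev --getra /dev/sdb
# prints the current readahead in 512-byte sectors, e.g. 256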

If you’re familiar with Cassandra’s cassandra.yaml, you’ll recognize the rest of these settings because they are from there. You can actually put more cassandra.yaml settings here if you know you’ll always need them, but it’s usually better to rely on the defaults and introduce different settings in your test scenarios, which you’ll define later.
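
For instance, a purely illustrative addition to ~/.cstar_perf/cluster_config.json (these are ordinary cassandra.yaml settings, not required by cstar_perf) could look like:

"concurrent_writes": 128,
"compaction_throughput_mb_per_sec": 0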

Test cstar_perf_bootstrap

Now that cstar_perf.tool is installed and configured, you can bring up a test cluster to verify that everything is working:

cstar_perf_bootstrap -v apache/cassandra-2.1

If you want to install DSE instead of open-source Cassandra, use the following command instead; it brings up the cluster and installs the specified version from the dse_url configured in your ~/.cstar_perf/cluster_config.json:

cstar_perf_bootstrap -v 4.8.1

The first command (apache/cassandra-2.1) tells all of the Cassandra nodes to download the latest development version of Cassandra 2.1 from git, build it, and create a cluster. You’ll see a lot of text output showing you what the script is doing, but at the end of it all, you should see something like:

[10.0.0.101] All nodes available!
INFO:benchmark:Started cassandra on 3 nodes with git SHA: bd396ec8acb74436fd84a9cf48542c49e08a17a6

Assuming that worked, your cluster is now fully automated via cstar_perf. Next steps include creating some test definitions or setting up the web frontend.

Flamegraph

It is possible to generate flamegraphs when running tests. Follow these instructions to enable the feature:

Install system dependencies on all workers of the cluster:

sudo apt-get install cmake dtach linux-tools-`uname -r`
sudo pip install sh

Ensure your kernel has performance profiling support:

$ sudo perf record -F 99 -g -p <a_running_process_pid> -- sleep 10
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.098 MB perf.data (~4278 samples) ]

Add NOPASSWD sudo configuration for the cstar/automaton user:

echo "cstar ALL = (root) NOPASSWD: ALL" | sudo tee /etc/sudoers.d/perf

Enable the flamegraph feature in your cluster configuration:

"flamegraph": true,
"flamegraph_directory": "/mnt/data/cstar_perf/flamegraph"

The flamegraph working directory defaults to /tmp/flamegraph if not specified.

If you update your kernel, you might also need to install the matching version of linux-tools, as described above.

YourKit Profiler

It is possible to enable the YourKit profiler when running tests. The snapshot will be available as an artifact at the end of the test. Some details:

  • The YourKit agent has to be uploaded to the nodes manually, due to licensing
  • The telemetry window is 1 hour
  • The YourKit profiler options used are: "onexit=memory,onexit=snapshot"
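
Since the agent has to be distributed by hand, a minimal sketch (assuming the agent is already unpacked locally at the same path used in the configuration below) would be:

for node in cnode1 cnode2 cnode3; do
    rsync -av /path/to/yjp-2014-build-14112/ $node:/path/to/yjp-2014-build-14112/
done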

Enable the YourKit feature in your cluster configuration:

"yourkit_profiler": true,
"yourkit_agentpath": "/path/to/yjp-2014-build-14112/bin/linux-x86-64/libyjpagent.so",
"yourkit_directory": "/path/to/Snapshots/",

Ctool Command

It is possible to run a ctool command on the cstar_perf cluster when running tests. This was implemented mainly to use ctool metrics with cstar_perf. Follow these instructions to enable the feature:

Install automaton:

git clone https://github.com/riptano/automaton.git

Configure the cluster using ctool setup_existing. Create a JSON config file (ctool_cluster.json in this example):

{
  "cluster_name": "cstar_perf",
  "private_key_path": "/home/cstar/.ssh/id_rsa",
  "ssh_user": "cstar",
  "hosts": [
      {
          "host_name": "172.17.0.2",
          "ip_address": "172.17.0.2",
          "private_host_name": "172.17.0.2",
          "private_ip_address": "172.17.0.2"
      }
  ]
}

Then set up the existing cluster:

cd automaton
PYTHONPATH=. ./bin/ctool setup_existing ctool_cluster.json

Add the following configuration in your ~/.automaton.conf file:

[ssh]
user = cstar
force_user = true

Enable the ctool feature by adding the automaton path to your cluster configuration:

"automaton_path": "/home/cstar/automaton/"

Test the ctool feature from the frontend by selecting the 'ctool' operation and using "info cstar_perf" as the command.
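
If you want to check the same thing from the command line first (assuming the automaton checkout above), the equivalent invocation would be something like:

cd /home/cstar/automaton
PYTHONPATH=. ./bin/ctool info cstar_perf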