Performance of GeminiDB Cassandra and On-Premises Open Source Cassandra Clusters
This section compares the performance of an on-premises open-source Cassandra cluster with that of a GeminiDB Cassandra cluster. It describes the test environment, test models, test procedure, and test results.
Test Environment
- Open-source Cassandra test environment
Table 1 Test environment description
Name | Open-source Cassandra Cluster |
---|---|
Version | 3.11.5 |
Nodes | 3 |
OS | CentOS 7.4 |
ECS Specifications:
- General computing-plus 4 vCPUs | 16 GB
- General computing-plus 8 vCPUs | 32 GB
- General computing-plus 16 vCPUs | 64 GB
- General computing-plus 32 vCPUs | 128 GB
- GeminiDB Cassandra test environment
Table 2 Test environment description
Name | GeminiDB Cassandra Cluster |
---|---|
Region | CN-Hong Kong |
Nodes | 3 |
AZ | AZ 3 |
Version | 3.11 |
Instance Specifications:
- 4 vCPUs | 16 GB
- 8 vCPUs | 32 GB
- 16 vCPUs | 64 GB
- 32 vCPUs | 128 GB
Load Test Tool Environment
- Load test tool specifications
Table 3 Specifications description
Name | Test client ECS |
---|---|
vCPUs | 16 |
Memory | 64 GB |
OS | CentOS 7.4 |
- Load test tool information
Table 4 Load test tool information
Test Tool | YCSB |
---|---|
Version | 0.12.0 |
Download Address | https://github.com/brianfrankcooper/YCSB |
curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.12.0/ycsb-0.12.0.tar.gz
Testing Models
Service Model | Description |
---|---|
_read95_update5 | 95% read and 5% update |
_update50_read50 | 50% update and 50% read |
_read65_update25_insert10 | 65% read, 25% update, and 10% insert |
_insert90_read10 | 90% insert and 10% read |
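The four service models correspond to operation proportions in a YCSB core workload. The following Python sketch renders the body of a workload properties file for one model. The property names (readproportion, updateproportion, insertproportion, scanproportion, recordcount, operationcount) are standard YCSB CoreWorkload settings. The helper name render_workload and the default recordcount and operationcount values are illustrative assumptions, not values taken from this test.

```python
# Operation mix of each service model, as fractions that sum to 1.0.
SERVICE_MODELS = {
    "_read95_update5": {"read": 0.95, "update": 0.05, "insert": 0.0},
    "_update50_read50": {"read": 0.50, "update": 0.50, "insert": 0.0},
    "_read65_update25_insert10": {"read": 0.65, "update": 0.25, "insert": 0.10},
    "_insert90_read10": {"read": 0.10, "update": 0.0, "insert": 0.90},
}


def render_workload(model, recordcount=1000, operationcount=1000):
    """Return the text of a YCSB workload properties file for one model.

    recordcount and operationcount are placeholders (assumptions); size
    them to the data volume and run length you actually want to test.
    """
    mix = SERVICE_MODELS[model]
    lines = [
        # CoreWorkload class name as packaged in YCSB 0.12.0.
        "workload=com.yahoo.ycsb.workloads.CoreWorkload",
        f"recordcount={recordcount}",
        f"operationcount={operationcount}",
        f"readproportion={mix['read']}",
        f"updateproportion={mix['update']}",
        f"insertproportion={mix['insert']}",
        "scanproportion=0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_workload("_read65_update25_insert10"))
```

Keeping the mixes in one dictionary makes it easy to check that each model's proportions sum to 1 before any load is generated.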
Test Procedure
Testing open-source Cassandra
- Purchase an ECS.
- Log in to the management console.
- Choose Computing > Elastic Cloud Server.
- Click Buy ECS in the upper right corner of the page and configure related parameters as follows:
- Region: CN-Hong Kong
- AZ: AZ3
- Specifications: General computing-plus | c6.xlarge.4
- Image: Public image, CentOS 7.6 64bit (40 GB)
- Data Disk: Ultra-high I/O, 200 GB
- Network: Select a VPC and subnet.
- Other parameters: Set other parameters as needed. You can ignore optional parameters.
- Repeat the preceding steps to create five ECSs named Cassandra-1 (192.168.0.15), Cassandra-2 (192.168.0.240), Cassandra-3 (192.168.0.153), Cassandra-4 (192.168.0.175) and ycsb-Cassandra (192.168.0.60).
ECSs Cassandra-1, Cassandra-2, and Cassandra-3 are for initializing Cassandra clusters. ECS Cassandra-4 is for capacity expansion. ECS ycsb-Cassandra serves as the load test server.
Figure 1 ECS details
- After those ECSs are created, log in to them using the remote login option provided on the management console.
Figure 2 Logging in to an ECS
- Install Java Runtime Environment:
- Install the Cassandra service and create a data directory.
- Download the Cassandra installation package:
wget https://archive.apache.org/dist/cassandra/3.11.5/apache-cassandra-3.11.5-bin.tar.gz
- Decompress the installation package:
- Change the installation directory:
- Configure environment variables:
echo 'export PATH=/usr/local/Cassandra/bin:$PATH' >> /etc/profile
- Apply the variables:
- Create a data directory:
- Confirm that the installation was successful.
Figure 3 Successful installation
- Configure an open-source Cassandra cluster.
- Log in to ECSs Cassandra-1, Cassandra-2, and Cassandra-3.
- Go to the /usr/local/Cassandra/conf directory and modify the cassandra-topology.properties file as follows:
- Comment out the content in the area marked by No.1 in Figure 4.
- Add the content in the area marked by No.2 in Figure 4.
The cassandra-topology.properties configuration files of Cassandra-1, Cassandra-2, and Cassandra-3 must be the same.
- Modify the cassandra.yaml file as follows:
data_file_directories:
    - /data
commitlog_directory: /usr/local/Cassandra/commitlog
saved_caches_directory: /usr/local/Cassandra/saved_caches
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "192.168.0.153,192.168.0.240,192.168.0.15"  # Enter the IP addresses of the three nodes in the cluster.
listen_address: 192.168.0.153  # IP address of the local node (differs on each node)
rpc_address: 192.168.0.153  # IP address of the local node (differs on each node)
- Run the following command on Cassandra-1, Cassandra-2, and Cassandra-3 to start the Cassandra cluster:
cassandra -R &
- Add nodes to the open-source Cassandra cluster.
- Log in to Cassandra-4.
- Go to the /usr/local/cassandra/conf directory and edit the cassandra-topology.properties file as follows:
- Modify the cassandra.yaml file as follows:
data_file_directories:
    - /data
commitlog_directory: /usr/local/Cassandra/commitlog
saved_caches_directory: /usr/local/Cassandra/saved_caches
seed_provider:
    # Addresses of hosts that are deemed contact points.
    # Cassandra nodes use this list of hosts to find each other and learn
    # the topology of the ring. You must change this if you are running
    # multiple nodes!
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          # seeds is actually a comma-delimited list of addresses.
          # Ex: "<ip1>,<ip2>,<ip3>"
          - seeds: "192.168.0.153,192.168.0.240,192.168.0.15"  # Enter the IP addresses of the three seed nodes in the cluster. They must be the same as the values entered in step 1.
listen_address: 192.168.0.175  # IP address of the local node
rpc_address: 192.168.0.175  # IP address of the local node
- Log in to Cassandra-1.
- Stop compaction on all nodes:
- Stop the ongoing compaction task:
- Limit migration traffic of the node:
nodetool setstreamthroughput 32
In the preceding command, the streaming throughput limit is set to 32 Mbit/s to reduce the impact of migration on services.
- Log in to Cassandra-4.
- Start the Cassandra service:
- Log in to Cassandra-1.
- During the scaling, run the following command every 30 seconds:
If the status of Cassandra-4 is UJ, data is being migrated. The migration is complete when the status changes to UN.
Figure 6 Node statuses
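The 30-second status check can be automated. The following Python sketch assumes that the polling command is nodetool status, which prints one line per node starting with a two-letter state code such as UJ or UN. The function names parse_node_states and wait_until_joined are hypothetical helpers introduced for illustration.

```python
import re
import subprocess
import time

# A node line in `nodetool status` output starts with a two-letter code:
# U/D (up/down) followed by N/L/J/M (normal/leaving/joining/moving).
NODE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")


def parse_node_states(status_output):
    """Map each node address to its two-letter state code."""
    states = {}
    for line in status_output.splitlines():
        match = NODE_LINE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def wait_until_joined(address, interval_seconds=30):
    """Poll `nodetool status` until the node at `address` reports UN."""
    while True:
        output = subprocess.run(
            ["nodetool", "status"], capture_output=True, text=True, check=True
        ).stdout
        state = parse_node_states(output).get(address)
        print(f"{address}: {state}")
        if state == "UN":
            return
        time.sleep(interval_seconds)
```

For the cluster in this test, calling wait_until_joined("192.168.0.175") on Cassandra-1 would return once Cassandra-4 has finished joining.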
Testing GeminiDB Cassandra
- Purchase a GeminiDB Cassandra cluster.
- Log in to the management console.
- Choose Databases > GeminiDB.
- Click Buy DB Instance in the upper right corner of the page and set required parameters as follows:
- Region: CN-Hong Kong
- Compatible API: Cassandra
- Specifications: 4 vCPUs | 16 GB
- Storage Space: 200 GB
- Nodes: Enter 3.
- VPC: The same as that of the purchased ECS.
- Security Group: The same as that of the purchased ECS.
- Add nodes to the GeminiDB Cassandra cluster.
- Log in to the management console.
- Choose Databases > GeminiDB.
- Select an existing GeminiDB Cassandra instance.
- Click the instance name to enter the Basic Information page.
- In the Node Information area on the Basic Information page, click Add Node.
Figure 7 Node information
- On the displayed page, click + to the right of the Add Nodes field.
Figure 8 Adding nodes
- Wait until the nodes are added.
- View the change of QPS during the scale-out process.
Figure 9 QPS changes
During the scale-out process, the QPS of the GeminiDB Cassandra instance decreases slightly for about 10 seconds, which has almost no effect on services. The whole scaling process takes about 10 minutes.
After the scale-out is complete, you can analyze test data.
Test Results
- Performance results
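Table 6 Performance data (qps_avg statistics)
Open-source Cassandra cluster
Node Class | Concurrent Threads of the Client | Data Volume to Be Prepared (GB) | _read95_update5 | _update50_read50 | _read65_update25_insert10 | _insert90_read10 |
---|---|---|---|---|---|---|
4 vCPUs \| 16 GB | 32 | 50 | 2884 | 5068 | 8484 | 10694 |
8 vCPUs \| 32 GB | 64 | 100 | 2796 | 2904 | 5180 | 7854 |
16 vCPUs \| 64 GB | 128 | 200 | 5896 | 14776 | 14304 | 15707 |
32 vCPUs \| 128 GB | 256 | 400 | 8964 | 22284 | 19592 | 22344 |
GeminiDB Cassandra cluster
Node Class | Concurrent Threads of the Client | Data Volume to Be Prepared (GB) | _read95_update5 | _update50_read50 | _read65_update25_insert10 | _insert90_read10 |
---|---|---|---|---|---|---|
4 vCPUs \| 16 GB | 32 | 50 | 8439 | 10565 | 9468 | 23830 |
8 vCPUs \| 32 GB | 64 | 100 | 24090 | 24970 | 21716 | 44548 |
16 vCPUs \| 64 GB | 128 | 200 | 48985 | 51335 | 43557 | 67290 |
32 vCPUs \| 128 GB | 256 | 400 | 91280 | 85748 | 74313 | 111540 |
Performance comparison between GeminiDB Cassandra and open-source Cassandra (GeminiDB Cassandra qps_avg divided by open-source Cassandra qps_avg)
Node Class | Concurrent Threads of the Client | Data Volume to Be Prepared (GB) | _read95_update5 | _update50_read50 | _read65_update25_insert10 | _insert90_read10 |
---|---|---|---|---|---|---|
4 vCPUs \| 16 GB | 32 | 50 | 2.93 | 2.08 | 1.12 | 2.23 |
8 vCPUs \| 32 GB | 64 | 100 | 8.62 | 8.60 | 4.19 | 5.67 |
16 vCPUs \| 64 GB | 128 | 200 | 8.31 | 3.47 | 3.05 | 4.28 |
32 vCPUs \| 128 GB | 256 | 400 | 10.18 | 3.85 | 3.79 | 4.99 |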
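The comparison figures in Table 6 are the GeminiDB Cassandra qps_avg divided by the open-source Cassandra qps_avg for the same node class and service model. The following Python sketch recomputes them from the table values. The dictionaries are keyed by vCPU count for brevity, and the function name qps_ratios is a hypothetical helper.

```python
MODELS = [
    "_read95_update5",
    "_update50_read50",
    "_read65_update25_insert10",
    "_insert90_read10",
]

# qps_avg values copied from Table 6, keyed by vCPU count of the node class.
OPEN_SOURCE_QPS = {
    4: [2884, 5068, 8484, 10694],
    8: [2796, 2904, 5180, 7854],
    16: [5896, 14776, 14304, 15707],
    32: [8964, 22284, 19592, 22344],
}
GEMINIDB_QPS = {
    4: [8439, 10565, 9468, 23830],
    8: [24090, 24970, 21716, 44548],
    16: [48985, 51335, 43557, 67290],
    32: [91280, 85748, 74313, 111540],
}


def qps_ratios(vcpus):
    """Return GeminiDB QPS divided by open-source QPS, one value per model."""
    return [
        round(gemini / open_source, 2)
        for gemini, open_source in zip(GEMINIDB_QPS[vcpus], OPEN_SOURCE_QPS[vcpus])
    ]


if __name__ == "__main__":
    for vcpus in OPEN_SOURCE_QPS:
        print(vcpus, dict(zip(MODELS, qps_ratios(vcpus))))
```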
- Test Conclusion
- In the read-heavy model (_read95_update5), the GeminiDB Cassandra cluster delivers up to about ten times the QPS of the open-source Cassandra cluster (10.18x at 32 vCPUs | 128 GB).
- The GeminiDB Cassandra cluster provides basically the same write performance as the open-source Cassandra cluster.
- Adding nodes slightly affects both the GeminiDB Cassandra and open-source clusters.
- The scale-out for GeminiDB Cassandra is fast and affects services only briefly (about 10 seconds). You do not need to change any parameters, and the whole scale-out process takes about 10 minutes.
- For an open-source Cassandra cluster, the time needed for adding nodes depends on the data volume and parameter settings, and the impact on performance varies. In this test, the scale-out took more than 30 minutes when the preset data size was 50 GB.
- Calculation formula: Highest migration speed = Stream throughput limit (the value set with nodetool setstreamthroughput; 200 Mbit/s by default) x Number of original nodes
In this test, the highest migration speed = 32 Mbit/s x 3 = 96 Mbit/s = 12 MB/s = 720 MB/min = 0.703 GB/min. At that speed, migrating 50 GB of data takes about 71.1 minutes (50/0.703).
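The calculation above can be reproduced for other data volumes and throughput limits. The following Python sketch follows the same unit conversions (8 Mbit = 1 MB, 1024 MB = 1 GB). The function name estimate_migration_minutes is a hypothetical helper, and the result is an estimate at the highest migration speed, not a measured duration.

```python
def estimate_migration_minutes(data_gb, stream_throughput_mbit_s, original_nodes):
    """Estimate the time to stream data_gb at the highest migration speed."""
    # Each original node streams at most stream_throughput_mbit_s.
    total_mbit_s = stream_throughput_mbit_s * original_nodes
    mb_per_s = total_mbit_s / 8        # megabits/s -> megabytes/s
    gb_per_min = mb_per_s * 60 / 1024  # MB/s -> GB/min
    return data_gb / gb_per_min


if __name__ == "__main__":
    # Values from this test: 50 GB, 32 Mbit/s limit, 3 original nodes.
    print(round(estimate_migration_minutes(50, 32, 3), 1))  # 71.1
```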