Performance of GeminiDB Cassandra and On-Premises Open Source Cassandra Clusters

Updated on 2024-08-03 GMT+08:00

This section compares the performance of a GeminiDB Cassandra cluster with that of an on-premises open-source Cassandra cluster. It describes the test environment, testing models, test procedure, and test results.

Test Environment

  • Open-source Cassandra test environment
    Table 1 Test environment description

    Name                  Open-source Cassandra Cluster
    Version               3.11.5
    Nodes                 3
    OS                    CentOS 7.4
    ECS Specifications    • General computing-plus 4 vCPUs | 16 GB
                          • General computing-plus 8 vCPUs | 32 GB
                          • General computing-plus 16 vCPUs | 64 GB
                          • General computing-plus 32 vCPUs | 128 GB
  • GeminiDB Cassandra test environment
    Table 2 Test environment description

    Name                       GeminiDB Cassandra Cluster
    Region                     CN-Hong Kong
    Nodes                      3
    AZ                         AZ 3
    Version                    3.11
    Instance Specifications    • 4 vCPUs | 16 GB
                               • 8 vCPUs | 32 GB
                               • 16 vCPUs | 64 GB
                               • 32 vCPUs | 128 GB

Load Test Tool Environment

  • Load test tool specifications
    Table 3 Specifications description

    Name      Test client ECS
    vCPUs     16
    Memory    64 GB
    OS        CentOS 7.4

  • Load test tool information
    Table 4 Load test tool information

    Test Tool           YCSB
    Version             0.12.0
    Download Address    https://github.com/brianfrankcooper/YCSB

    curl -O --location https://github.com/brianfrankcooper/YCSB/releases/download/0.12.0/ycsb-0.12.0.tar.gz

Testing Models

Table 5 Testing models

Service Model                Description
_read95_update5              95% read and 5% update
_update50_read50             50% update and 50% read
_read65_update25_insert10    65% read, 25% update, and 10% insert
_insert90_read10             90% insert and 10% read
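
The four service models map onto the operation-mix properties of the YCSB CoreWorkload. The following Python sketch renders a workload properties file for each model. The property names (readproportion, updateproportion, insertproportion, scanproportion, recordcount, operationcount) are standard CoreWorkload properties, but the helper name render_workload and the default record and operation counts are placeholders for illustration, not values used in this test.

```python
# Operation mix for each service model in Table 5 (fractions sum to 1.0).
SERVICE_MODELS = {
    "_read95_update5": {"read": 0.95, "update": 0.05, "insert": 0.0},
    "_update50_read50": {"read": 0.50, "update": 0.50, "insert": 0.0},
    "_read65_update25_insert10": {"read": 0.65, "update": 0.25, "insert": 0.10},
    "_insert90_read10": {"read": 0.10, "update": 0.0, "insert": 0.90},
}


def render_workload(model, record_count=1000, operation_count=1000):
    """Return YCSB CoreWorkload properties text for one service model.

    record_count and operation_count are placeholder defaults, not the
    values used in the documented test.
    """
    mix = SERVICE_MODELS[model]
    lines = [
        # Package name used by YCSB 0.12.0.
        "workload=com.yahoo.ycsb.workloads.CoreWorkload",
        f"recordcount={record_count}",
        f"operationcount={operation_count}",
        f"readproportion={mix['read']}",
        f"updateproportion={mix['update']}",
        f"insertproportion={mix['insert']}",
        "scanproportion=0",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    for name in SERVICE_MODELS:
        print(f"# workload{name}")
        print(render_workload(name))
```

Each rendered text can be saved as a properties file and passed to YCSB with its -P option.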

Test Procedure

Testing open-source Cassandra

  1. Purchase an ECS.

    1. Log in to the management console.
    2. Choose Computing > Elastic Cloud Server.
    3. Click Buy ECS in the upper right corner of the page and configure related parameters as follows:
      • Region: CN-Hong Kong
      • AZ: AZ3
      • Specifications: General computing-plus | c6.xlarge.4
      • Image: Public image, CentOS 7.6 64bit (40 GB)
      • Data Disk: Ultra-high I/O, 200 GB
      • Network: Select a VPC and subnet.
      • Other parameters: Set other parameters as needed. You can ignore optional parameters.
    4. Repeat the preceding steps to create five ECSs named Cassandra-1 (192.168.0.15), Cassandra-2 (192.168.0.240), Cassandra-3 (192.168.0.153), Cassandra-4 (192.168.0.175) and ycsb-Cassandra (192.168.0.60).

      Cassandra-1, Cassandra-2, and Cassandra-3 form the initial Cassandra cluster. Cassandra-4 is used for capacity expansion. ycsb-Cassandra serves as the load test client.

      Figure 1 ECS details
    5. After those ECSs are created, log in to them using the remote login option provided on the management console.
      Figure 2 Logging in to an ECS
    6. Install Java Runtime Environment:

      yum install jre

    7. Install the Cassandra service and create a data directory.
      1. Download the Cassandra installation package:

        wget https://archive.apache.org/dist/cassandra/3.11.5/apache-cassandra-3.11.5-bin.tar.gz

      2. Decompress the installation package:

        tar -zxvf apache-cassandra-3.11.5-bin.tar.gz -C /root/

      3. Move the extracted files to the installation directory:

        mv /root/apache-cassandra-3.11.5 /usr/local/cassandra

      4. Configure environment variables:

        echo 'export PATH=/usr/local/cassandra/bin:$PATH' >> /etc/profile

      5. Apply the variables:

        source /etc/profile

      6. Create a data directory:

        mkdir /data

      7. Confirm that the installation was successful.

        cqlsh

        Figure 3 Successful installation

  2. Configure an open-source Cassandra cluster.

    1. Log in to ECSs Cassandra-1, Cassandra-2, and Cassandra-3.
    2. Go to the /usr/local/cassandra/conf directory and modify the cassandra-topology.properties file as follows:
      • Comment out the content in the area marked by No.1 in Figure 4.
      • Add the content in the area marked by No.2 in Figure 4.
      Figure 4 Modifying the configuration file
      NOTE:

      The cassandra-topology.properties configuration files of Cassandra-1, Cassandra-2, and Cassandra-3 must be the same.

    3. Modify the cassandra.yaml file as follows:
      data_file_directories:
          - /data
      commitlog_directory: /usr/local/cassandra/commitlog
      saved_caches_directory: /usr/local/cassandra/saved_caches
      seed_provider:
          # Addresses of hosts that are deemed contact points.
          # Cassandra nodes use this list of hosts to find each other and learn
          # the topology of the ring.  You must change this if you are running
          # multiple nodes!
          - class_name: org.apache.cassandra.locator.SimpleSeedProvider
            parameters:
                # seeds is actually a comma-delimited list of addresses.
                # Ex: "<ip1>,<ip2>,<ip3>"
                - seeds: "192.168.0.153,192.168.0.240,192.168.0.15"  # Enter the IP addresses of the three nodes in the cluster.
      listen_address: 192.168.0.153  # Set to the IP address of the node being configured.
      rpc_address: 192.168.0.153     # Set to the IP address of the node being configured.
    4. Run the following command on Cassandra-1, Cassandra-2, and Cassandra-3 to start the Cassandra cluster:

      cassandra -R &

  3. Add nodes to the open-source Cassandra cluster.

    1. Log in to Cassandra-4.
    2. Go to the /usr/local/cassandra/conf directory and edit the cassandra-topology.properties file as follows:
      • Comment out the content in the area marked by No.1 in Figure 5.
      • Add the content in the area marked by No.2 in Figure 5.
        Figure 5 Editing the configuration file
    3. Modify the cassandra.yaml file as follows:
      data_file_directories:
          - /data
      commitlog_directory: /usr/local/cassandra/commitlog
      saved_caches_directory: /usr/local/cassandra/saved_caches
      seed_provider:
          # Addresses of hosts that are deemed contact points.
          # Cassandra nodes use this list of hosts to find each other and learn
          # the topology of the ring.  You must change this if you are running
          # multiple nodes!
          - class_name: org.apache.cassandra.locator.SimpleSeedProvider
            parameters:
                # seeds is actually a comma-delimited list of addresses.
                # Ex: "<ip1>,<ip2>,<ip3>"
                - seeds: "192.168.0.153,192.168.0.240,192.168.0.15"  # Enter the IP addresses of the three seed nodes, which must match the values entered in step 2.
      listen_address: 192.168.0.175  # IP address of Cassandra-4, the node being added.
      rpc_address: 192.168.0.175     # IP address of Cassandra-4, the node being added.
    4. Log in to Cassandra-1.
    5. Stop compaction on all nodes:

      nodetool disableautocompaction

    6. Stop the ongoing compaction task:

      nodetool stop COMPACTION

    7. Limit the streaming throughput used for data migration:

      nodetool setstreamthroughput 32

      NOTE:

      This command caps the streaming throughput at 32 Mbit/s to reduce the impact of migration on services.

    8. Log in to Cassandra-4.
    9. Start the Cassandra service:

      cassandra -R &

    10. Log in to Cassandra-1.
    11. During the scaling, run the following command every 30 seconds:

      nodetool status

      If the status of Cassandra-4 is UJ (Up/Joining), data is being migrated. The migration is complete when the status changes to UN (Up/Normal).

      Figure 6 Node statuses
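
The status check in the last step can also be scripted. The following Python sketch parses the text printed by nodetool status and reports whether the new node has reached the UN state. The sample output is illustrative only (the load values, host IDs, and rack names are placeholders), and the helper names parse_node_states and scale_out_finished are hypothetical.

```python
import re

# A node line starts with a two-letter code: Up/Down followed by
# Normal/Leaving/Joining/Moving, then the node address.
STATE_LINE = re.compile(r"^([UD][NLJM])\s+(\S+)")

# Illustrative sample only; real output contains actual load values and host IDs.
SAMPLE_STATUS = """\
Datacenter: dc1
===============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load      Tokens  Owns  Host ID  Rack
UN  192.168.0.15   1.1 GiB   256     ?     id-1     rack1
UN  192.168.0.240  1.0 GiB   256     ?     id-2     rack1
UN  192.168.0.153  1.2 GiB   256     ?     id-3     rack1
UJ  192.168.0.175  310 MiB   256     ?     id-4     rack1
"""


def parse_node_states(status_output):
    """Map each node address to its two-letter status code."""
    states = {}
    for line in status_output.splitlines():
        match = STATE_LINE.match(line.strip())
        if match:
            states[match.group(2)] = match.group(1)
    return states


def scale_out_finished(status_output, new_node_ip):
    """Return True once the new node reports UN (Up/Normal)."""
    return parse_node_states(status_output).get(new_node_ip) == "UN"


if __name__ == "__main__":
    print(parse_node_states(SAMPLE_STATUS))
    print(scale_out_finished(SAMPLE_STATUS, "192.168.0.175"))
```

On a cluster node, the text could be obtained with subprocess.run(["nodetool", "status"], capture_output=True, text=True).stdout and checked every 30 seconds until scale_out_finished returns True.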

Testing GeminiDB Cassandra

  1. Purchase a GeminiDB Cassandra cluster.

    1. Log in to the management console.
    2. Choose Databases > GeminiDB.
    3. Click Buy DB Instance in the upper right corner of the page and set required parameters as follows:
      • Region: CN-Hong Kong
      • Compatible API: Cassandra
      • Specifications: 4 vCPUs | 16 GB
      • Storage Space: 200 GB
      • Nodes: Enter 3.
      • VPC: The same as that of the purchased ECS.
      • Security Group: The same as that of the purchased ECS.

  2. Add nodes to the GeminiDB Cassandra cluster.

    1. Log in to the management console.
    2. Choose Databases > GeminiDB.
    3. Select an existing GeminiDB Cassandra instance.
    4. Click the instance name to enter the Basic Information page.
    5. In the Node Information area on the Basic Information page, click Add Node.
      Figure 7 Node information

    6. On the displayed page, click + to the right of the Add Nodes field.
      Figure 8 Adding nodes

    7. Wait until the nodes are added.
    8. Observe how the QPS changes during the scale-out process.
      Figure 9 QPS changes

      During the scale-out process, the QPS of the GeminiDB Cassandra instance decreases slightly for about 10 seconds, which has almost no effect on services. The whole scaling process takes about 10 minutes.

      After the scale-out is complete, you can analyze test data.

Test Results

  • Performance results
    Table 6 Performance data (qps_avg statistics)

    Node Class           Concurrent Client Threads  Data Volume to Be Prepared (GB)  _read95_update5  _update50_read50  _read65_update25_insert10  _insert90_read10

    Open-source Cassandra cluster
    4 vCPUs | 16 GB      32                         50                               2884             5068              8484                       10694
    8 vCPUs | 32 GB      64                         100                              2796             2904              5180                       7854
    16 vCPUs | 64 GB     128                        200                              5896             14776             14304                      15707
    32 vCPUs | 128 GB    256                        400                              8964             22284             19592                      22344

    GeminiDB Cassandra cluster
    4 vCPUs | 16 GB      32                         50                               8439             10565             9468                       23830
    8 vCPUs | 32 GB      64                         100                              24090            24970             21716                      44548
    16 vCPUs | 64 GB     128                        200                              48985            51335             43557                      67290
    32 vCPUs | 128 GB    256                        400                              91280            85748             74313                      111540

    Performance comparison (GeminiDB Cassandra QPS divided by open-source Cassandra QPS)
    4 vCPUs | 16 GB      32                         50                               2.93             2.08              1.12                       2.23
    8 vCPUs | 32 GB      64                         100                              8.62             8.60              4.19                       5.67
    16 vCPUs | 64 GB     128                        200                              8.31             3.47              3.05                       4.28
    32 vCPUs | 128 GB    256                        400                              10.18            3.85              3.79                       4.99
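
The comparison values in Table 6 are the GeminiDB Cassandra QPS divided by the open-source Cassandra QPS for the same node class and service model. The following Python sketch recomputes them from the QPS figures in the table. It uses the node classes listed in Table 1 and Table 2, and the helper name qps_ratios is hypothetical.

```python
# Service models, in the column order used by Table 6.
MODELS = (
    "_read95_update5",
    "_update50_read50",
    "_read65_update25_insert10",
    "_insert90_read10",
)

# qps_avg per node class, copied from Table 6.
OPEN_SOURCE_QPS = {
    "4 vCPUs | 16 GB": (2884, 5068, 8484, 10694),
    "8 vCPUs | 32 GB": (2796, 2904, 5180, 7854),
    "16 vCPUs | 64 GB": (5896, 14776, 14304, 15707),
    "32 vCPUs | 128 GB": (8964, 22284, 19592, 22344),
}
GEMINIDB_QPS = {
    "4 vCPUs | 16 GB": (8439, 10565, 9468, 23830),
    "8 vCPUs | 32 GB": (24090, 24970, 21716, 44548),
    "16 vCPUs | 64 GB": (48985, 51335, 43557, 67290),
    "32 vCPUs | 128 GB": (91280, 85748, 74313, 111540),
}


def qps_ratios(node_class):
    """Return GeminiDB QPS / open-source QPS per model, rounded to 2 places."""
    pairs = zip(GEMINIDB_QPS[node_class], OPEN_SOURCE_QPS[node_class])
    return tuple(round(gemini / open_source, 2) for gemini, open_source in pairs)


if __name__ == "__main__":
    for node_class in OPEN_SOURCE_QPS:
        ratios = dict(zip(MODELS, qps_ratios(node_class)))
        print(node_class, ratios)
```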

  • Test Conclusion
    1. In the read-heavy _read95_update5 model, the GeminiDB Cassandra cluster delivers about 3 to 10 times the average QPS of the open-source Cassandra cluster (2.93 to 10.18 times across the tested node classes).
    2. The advantage is smaller for write-heavy workloads: in the _insert90_read10 model, the GeminiDB Cassandra cluster delivers 2.23 to 5.67 times the average QPS of the open-source cluster (see Table 6).
    3. Adding nodes slightly affects both the GeminiDB Cassandra and open-source clusters.
      • The scale-out for GeminiDB Cassandra is fast and affects services only briefly (about 10 seconds). You do not need to change any parameters, and the whole scale-out process takes about 10 minutes.
      • For an open-source Cassandra cluster, the time needed for adding nodes depends on the data volume and parameter settings, and the impact on performance varies. In this test, the scale-out took more than 30 minutes when the preset data size was 50 GB.
      • Calculation formula: Highest migration speed = Streaming throughput cap (the value set with nodetool setstreamthroughput, 200 Mbit/s by default) x Number of original nodes

        In this test, the highest migration speed = 32 Mbit/s x 3 = 96 Mbit/s = 12 MB/s = 720 MB/min = 0.703 GB/min. So, the estimated time needed for migrating 50 GB of data in this scenario is 71.1 minutes (50/0.703).
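
The estimate above can be reproduced with a few lines of Python. This is a minimal sketch of the same arithmetic, assuming 8 bits per byte and 1024 MB per GB as in the calculation above. The function names are hypothetical.

```python
def peak_migration_mb_per_s(stream_cap_mbit_per_s, original_nodes):
    """Highest migration speed in MB/s: per-node cap x original nodes / 8."""
    return stream_cap_mbit_per_s * original_nodes / 8


def estimate_migration_minutes(data_gb, stream_cap_mbit_per_s, original_nodes):
    """Estimated minutes to migrate data_gb at the highest migration speed."""
    mb_per_min = peak_migration_mb_per_s(stream_cap_mbit_per_s, original_nodes) * 60
    gb_per_min = mb_per_min / 1024
    return data_gb / gb_per_min


if __name__ == "__main__":
    # Values from this test: 50 GB, 32 Mbit/s cap, 3 original nodes.
    print(peak_migration_mb_per_s(32, 3))
    print(round(estimate_migration_minutes(50, 32, 3), 1))
```

Passing 200 as the cap shows how the estimate changes with the default throughput setting.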
