Updated on 2025-12-29 GMT+08:00

CloudTable Service Selection

Capability Comparison Between CloudTable Components

Table 1 Comparison of component capabilities

Comparison Item

HBase

ClickHouse

Doris

Data Storage

  • HBase
    • Column-oriented storage suited to large-scale datasets and fast random access
    • Supports structured, semi-structured, and unstructured data
  • ClickHouse
    • Column-oriented storage suited to real-time analysis and querying of massive data
    • Supports structured data only
  • Doris
    • Column-oriented storage suited to real-time analysis and querying of massive data
    • Supports structured data only

Data Processing

  • HBase: Support for real-time data update and insertion
  • ClickHouse: A high-performance distributed query engine that is strong at complex aggregations but weak at transaction processing
  • Doris: A high-performance distributed query engine that is strong at complex aggregations but weak at transaction processing

Cross-table Query

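  • HBase: Not good at cross-table queries
  • ClickHouse: Not good at cross-table queries
  • Doris: Good at join queries between foreign and internal tables and among internal tables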

SQL Support

  • HBase: No support for SQL statements
  • ClickHouse: Support for complex SQL statement operations
  • Doris: Support for complex SQL statement operations

Large Wide Table Support

  • HBase: Support for large tables with millions of columns
  • ClickHouse: Support for querying large wide tables with thousands of columns
  • Doris: Suitable for querying small tables with hundreds of columns

OLTP Capability

  • HBase: No support for transactions; no atomicity, consistency, isolation, and durability (ACID) capability
  • ClickHouse: No support for transactions; no ACID capability
  • Doris: No support for transactions; no ACID capability

Indexing

  • HBase: Global index, covering index, local index, row key index, and column family index
  • ClickHouse: Sparse index and secondary index
  • Doris
    • Point query index: prefix index and sort keys
    • Skip index: ZoneMap index and BloomFilter index
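
The ZoneMap skip index listed in the Indexing row can be pictured with a minimal Python sketch. It assumes the common zone-map design, in which each data block records the minimum and maximum value of a column so that a scan can skip any block whose range cannot contain the target value. The Block class and the scan_equal function are hypothetical names used only for illustration; they are not Doris APIs and do not reflect how Doris lays out data internally.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    """A hypothetical data block holding the values of one column."""
    values: List[int]
    zone_min: int = field(init=False)
    zone_max: int = field(init=False)

    def __post_init__(self) -> None:
        # The "zone map": min and max are computed once, when the block is built.
        self.zone_min = min(self.values)
        self.zone_max = max(self.values)


def scan_equal(blocks: List[Block], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of blocks actually read."""
    matches: List[int] = []
    blocks_read = 0
    for block in blocks:
        # Skip the block when the target lies outside its [min, max] range.
        if target < block.zone_min or target > block.zone_max:
            continue
        blocks_read += 1
        matches.extend(v for v in block.values if v == target)
    return matches, blocks_read


blocks = [Block([1, 3, 5]), Block([10, 12, 15]), Block([20, 25, 30])]
matches, blocks_read = scan_equal(blocks, 12)
print(matches, blocks_read)  # [12] 1
```

In this sketch only one of the three blocks is read for the lookup. Skipping works best when data is sorted or clustered on the filtered column, because block ranges then overlap less.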

Application Scenarios of CloudTable Components

Table 2 Application scenarios of components

Comparison Item

HBase

ClickHouse

Doris

Application Scenario

  • HBase
    • Real-time data storage and processing: HBase is a distributed, column-oriented, open-source NoSQL database. It is designed to handle massive amounts of real-time and non-real-time data and is well suited to scenarios that require high-concurrency reads and writes.
    • Big data query and analysis: HBase integrates with big data processing frameworks such as Hadoop and Spark to deliver fast data access, making it suitable for big data analysis scenarios.
    • Data warehouse: HBase can serve as a component of a data warehousing solution. It stores structured or semi-structured data and supports real-time data query and analysis.
  • ClickHouse
    • Real-time data analysis

      ClickHouse suits data analysis scenarios that require low latency and quick response.

      • Log and monitoring analysis: ClickHouse can back log aggregation and monitoring data analysis systems.
      • Real-time user behavior analytics: User behavior data (such as clicks, browsing, and purchases) can be analyzed in real time for product recommendation, user preference analysis, and ad placement optimization.
    • Massive data processing

      Data warehouse: TB-level and even PB-level data is supported. ClickHouse is typically used to build an enterprise data platform that aggregates data sources and analyzes historical data to support decision-making and data insight.

    • OLAP

      ClickHouse handles complex analytical queries such as aggregation, filtering, and sorting with high performance. Column-based storage, data compression, and query optimization improve the execution efficiency of complex queries.

  • Doris
    • Real-time data analysis
      • Data stream processing: Doris can import and analyze data streams in real time, supporting scenarios such as real-time monitoring and real-time user behavior analysis.
      • Real-time dashboard: Doris is suitable for building real-time visualized dashboards that provide real-time data for operations and business decision-making.
    • Data warehouse
      • OLAP: Doris can handle complex OLAP queries on large-scale datasets, delivering quick multidimensional analysis and report generation.
      • ETL process: Doris can quickly import data from various data sources (such as Kafka, Hadoop, and MySQL) and perform cleansing, aggregation, and analysis.
    • Multi-table federated query and analysis

      Join queries can be run between foreign tables in other databases and your internal tables, as well as among internal tables in the same database, with excellent query performance.

Supported Capability

  • HBase
    • Mass data storage
      • HBase can store massive structured, semi-structured, and unstructured data. A single HBase table can hold tens of billions of rows and millions of columns, and data can be inserted both horizontally and vertically, which gives HBase high elasticity and capacity.
      • HBase achieves large-scale data storage by running on highly scalable clusters of inexpensive servers.
    • Fast random query

      HBase suits large-scale datasets, supports fast random access, and supports real-time data update and insertion.

    • NoSQL query

      HBase is suitable for querying and analyzing data using non-SQL statements.

  • ClickHouse
    • Data query and analysis of ultra-large wide tables

      ClickHouse supports query and analysis of large wide tables with thousands of columns, delivering subsecond-level performance.

    • Reads > Writes

      Workloads read far more than they write. Queries typically read only some of the columns of a large wide table, often a single column at a time, and data is ingested in append or update batches.

    • Distributed scalability

      ClickHouse adopts a distributed architecture and supports flexible horizontal scaling.

  • Doris
    • Real-time data import, query, and analysis

      Data can be imported in real time from various data sources such as Kafka, Hadoop, and MySQL, and can be queried and analyzed in real time.

    • High-performance online multi-table join query and analysis

      Doris supports high-concurrency multi-table join query and analysis on PB-level data. With subsecond-level performance, it is suited to real-time analysis and service reports.

    • Distributed scalability

      Doris adopts a distributed architecture and supports flexible horizontal scaling.

    • Flexible data models

      Multiple data models and data types are supported to meet the requirements of different service scenarios.

Not Recommended Capability

  • HBase
    • No OLTP capability

      ACID is not supported. HBase cannot match a relational database service (RDS) in atomicity, consistency, or the real-time performance of point-query transactions.

    • No SQL support

      SQL statements cannot be used to import or query data.

  • ClickHouse
    • Weak OLTP capability

      ACID is not supported. ClickHouse cannot match a relational database service (RDS) in atomicity, consistency, or the real-time performance of point-query transactions.

    • Not applicable to scenarios where small amounts of data are imported and updated frequently

      Full update and deletion operations are not supported. Existing data cannot be modified or deleted at high frequency and low latency; only batch deletion or modification is supported.

    • Not good at processing unstructured data

      ClickHouse is effective at querying and analyzing structured data with SQL statements, but not good at processing semi-structured or unstructured data.

    • Weak multi-table join capability
  • Doris
    • Weak OLTP capability

      ACID is not supported. Doris cannot match a relational database service (RDS) in atomicity, consistency, or the real-time performance of point-query transactions.

    • Not applicable to scenarios where small amounts of data are imported and updated frequently

      Full update and deletion operations are not supported. Existing data cannot be modified or deleted at high frequency and low latency; only batch deletion or modification is supported.

    • Weak query and analysis capabilities for large wide tables

      Doris is suited to query and analysis in small- and medium-sized data warehouses. Query and analysis performance on ultra-wide tables with more than 1,000 columns is weak.

    • Not good at processing unstructured data

      Doris is effective at querying and analyzing structured data with SQL statements, but not good at processing semi-structured or unstructured data.
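
The guidance in Table 1 and Table 2 can be condensed into a rough selection helper. The Python sketch below is illustrative only: Width, Profile, PROFILES, and candidates are hypothetical names, and the width tiers simply mirror the "hundreds", "thousands", and "millions" wording of Table 1 rather than any hard product limit. It is a first-pass filter, not a substitute for evaluating a real workload.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Width(IntEnum):
    """Column-count tiers, mirroring the wording of Table 1."""
    HUNDREDS = 1
    THOUSANDS = 2
    MILLIONS = 3


@dataclass(frozen=True)
class Profile:
    sql: bool           # SQL statements are supported
    joins: bool         # good at cross-table (join) queries
    unstructured: bool  # handles semi-structured and unstructured data
    width: Width        # widest table tier the component is suited to


# Capabilities transcribed from Table 1 (simplified to yes/no flags).
PROFILES = {
    "HBase": Profile(sql=False, joins=False, unstructured=True, width=Width.MILLIONS),
    "ClickHouse": Profile(sql=True, joins=False, unstructured=False, width=Width.THOUSANDS),
    "Doris": Profile(sql=True, joins=True, unstructured=False, width=Width.HUNDREDS),
}


def candidates(need_sql: bool = False,
               need_joins: bool = False,
               need_unstructured: bool = False,
               width: Width = Width.HUNDREDS) -> List[str]:
    """Return the components whose profile satisfies every stated requirement."""
    return [
        name for name, p in PROFILES.items()
        if (p.sql or not need_sql)
        and (p.joins or not need_joins)
        and (p.unstructured or not need_unstructured)
        and p.width >= width
    ]


print(candidates(need_sql=True, need_joins=True))                # ['Doris']
print(candidates(need_sql=True, width=Width.THOUSANDS))          # ['ClickHouse']
print(candidates(need_unstructured=True, width=Width.MILLIONS))  # ['HBase']
```

An empty result, for example SQL joins over a table with thousands of columns, signals that no single component fits all stated requirements. The helper has no transaction flag because none of the three components provides ACID transactions.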