Overview
What Is Near Data Processing?
Near Data Processing (NDP) is a compute pushdown solution that improves data query efficiency. For data-intensive queries, operations such as column extraction, aggregation, and condition filtering are pushed down to multiple nodes in the distributed storage layer and executed there in parallel. This reduces the query processing load on compute nodes, increases parallelism, and reduces network traffic.
How It Works
GaussDB(for MySQL) uses an architecture with decoupled storage and compute to reduce network traffic. Based on this architecture, NDP is used to accelerate data queries. Without NDP, all raw data must be transmitted from storage nodes to compute nodes for query processing. NDP pushes the most I/O-intensive and CPU-intensive query tasks down to storage nodes. Only the required columns and the filtered rows or aggregated results are sent back to compute nodes, which greatly reduces network traffic. In addition, parallel processing across storage nodes reduces the CPU usage of compute nodes and improves query efficiency.
NDP is integrated with parallel query. Pages are prefetched in batches so that the entire query process runs in parallel, which greatly improves query execution efficiency.
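The difference between shipping raw data and pushing work down can be illustrated with a small Python simulation. This is only a sketch under stated assumptions, not GaussDB(for MySQL) internals: the `PARTITIONS` data, the function names, and the use of a thread pool to stand in for parallel storage nodes are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy data: each "storage node" holds one partition of a table.
# Columns: (order_id, customer, amount, note)
PARTITIONS = [
    [(1, "ann", 120, "x" * 50), (2, "bob", 15, "y" * 50)],
    [(3, "cat", 300, "z" * 50), (4, "dan", 40, "w" * 50)],
    [(5, "eve", 75, "v" * 50), (6, "fay", 500, "u" * 50)],
]


def scan_without_ndp(partition):
    """Storage node ships every row and every column to the compute node."""
    return list(partition)


def scan_with_ndp(partition, predicate, columns):
    """Storage node filters rows and prunes columns before shipping."""
    return [tuple(row[i] for i in columns) for row in partition if predicate(row)]


def run_query(use_ndp):
    predicate = lambda row: row[2] >= 100  # WHERE amount >= 100
    columns = (0, 2)                       # SELECT order_id, amount

    # The thread pool stands in for storage nodes scanning in parallel.
    with ThreadPoolExecutor() as pool:
        if use_ndp:
            shipped = list(pool.map(
                lambda p: scan_with_ndp(p, predicate, columns), PARTITIONS))
        else:
            shipped = list(pool.map(scan_without_ndp, PARTITIONS))

    rows_shipped = sum(len(part) for part in shipped)
    values_shipped = sum(len(row) for part in shipped for row in part)

    # Without NDP, the compute node still has to filter and project locally.
    flat = [row for part in shipped for row in part]
    if not use_ndp:
        flat = [tuple(row[i] for i in columns) for row in flat if predicate(row)]
    return sorted(flat), rows_shipped, values_shipped


if __name__ == "__main__":
    for mode in (False, True):
        result, rows, values = run_query(use_ndp=mode)
        print(f"use_ndp={mode}: rows shipped={rows}, values shipped={values}")
```

Both modes return the same query result, but in this toy data set the pushdown path ships 3 rows instead of 6 and only the two requested columns, which is the traffic saving described above.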
Scenarios
NDP is suitable for the following scenarios:
- Projection
Column pruning: Only the fields required by a query statement are sent to the compute node.
- Aggregate
Typical aggregation operations include COUNT, SUM, AVG, MAX, MIN, and GROUP BY. Only the aggregated results (not all tuples) are sent to the query engine. COUNT(*) is the most common aggregation.
- Filtering with a WHERE clause in SELECT statements
Common condition expressions are comparison operators (>=, <=, <, >, =), BETWEEN, IN, AND/OR, and LIKE.
A filter expression is executed on the storage nodes. Only the rows that meet the conditions are sent to the compute node.
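For the aggregate scenario, one common way to push aggregation down is to have each storage node return a small partial summary that the compute node then merges. The following Python sketch assumes that approach for illustration only; the source does not state how GaussDB(for MySQL) merges results, and the function names are hypothetical. Note that AVG cannot be merged directly from per-node averages, so the sketch carries SUM and COUNT instead.

```python
def partial_aggregate(amounts):
    """Runs on a storage node: reduce one partition to a small summary."""
    if not amounts:
        return {"count": 0, "sum": 0, "min": None, "max": None}
    return {
        "count": len(amounts),
        "sum": sum(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }


def merge_aggregates(partials):
    """Runs on the compute node: combine the per-node summaries."""
    non_empty = [p for p in partials if p["count"] > 0]
    count = sum(p["count"] for p in partials)
    total = sum(p["sum"] for p in partials)
    return {
        "count": count,
        "sum": total,
        # AVG is derived from the merged SUM and COUNT.
        "avg": total / count if count else None,
        "min": min((p["min"] for p in non_empty), default=None),
        "max": max((p["max"] for p in non_empty), default=None),
    }


if __name__ == "__main__":
    # Hypothetical partitions of an "amount" column, one list per storage node.
    partitions = [[120, 15], [300, 40], [], [75, 500]]
    print(merge_aggregates([partial_aggregate(p) for p in partitions]))
```

Each node sends back four numbers regardless of how many rows it scanned, which is why only the aggregated results, not all tuples, need to cross the network.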
Application Constraints
NDP can be used only with the following:
- InnoDB tables.
- Tables with rows in the COMPACT or DYNAMIC format.
- Primary keys or B-tree indexes. Hash and full-text indexes are not supported.
- Among DML statements, only SELECT statements. INSERT INTO SELECT statements and SELECT statements that lock rows (such as SELECT FOR SHARE/UPDATE) are not supported.
- Expressions with numeric, log, time, or partial string types (CHAR and VARCHAR). The utf8mb4 and utf8 character sets are supported.
- Expression predicates with comparison operators (<, >, =, <=, >=, !=), IN, NOT IN, LIKE, NOT LIKE, BETWEEN AND, and AND/OR.
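The constraints above can be read as a checklist. The Python sketch below encodes a subset of them as a pre-check that a review script might run over query metadata. It is a hypothetical helper, not a GaussDB(for MySQL) API: the `QueryInfo` fields and the `ndp_blockers` function are invented for illustration, and the database itself decides whether a query is actually pushed down.

```python
from dataclasses import dataclass


@dataclass
class QueryInfo:
    """Hypothetical description of a query and the table it reads."""
    engine: str       # for example "InnoDB"
    row_format: str   # for example "DYNAMIC"
    index_type: str   # "PRIMARY", "BTREE", "HASH", or "FULLTEXT"
    statement: str    # "SELECT", "INSERT INTO SELECT", ...
    locks_rows: bool  # True for SELECT ... FOR SHARE / FOR UPDATE


def ndp_blockers(q):
    """Return the listed constraints that the query violates (empty if none)."""
    reasons = []
    if q.engine.upper() != "INNODB":
        reasons.append("table is not an InnoDB table")
    if q.row_format.upper() not in ("COMPACT", "DYNAMIC"):
        reasons.append("row format is not COMPACT or DYNAMIC")
    if q.index_type.upper() not in ("PRIMARY", "BTREE"):
        reasons.append("index is not a primary key or B-tree index")
    if q.statement.upper() != "SELECT":
        reasons.append("statement is not a plain SELECT")
    if q.locks_rows:
        reasons.append("SELECT locks rows (FOR SHARE/UPDATE)")
    return reasons


if __name__ == "__main__":
    ok = QueryInfo("InnoDB", "DYNAMIC", "BTREE", "SELECT", False)
    bad = QueryInfo("InnoDB", "DYNAMIC", "HASH", "SELECT", True)
    print(ndp_blockers(ok))
    print(ndp_blockers(bad))
```

The sketch deliberately leaves out the data type, character set, and predicate constraints, which would require inspecting the expressions in the query itself.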
Parameters
Parameter | Level | Description |
---|---|---|
ndp_mode | Global | Enables or disables NDP. Value: off or on. Default value: off |