EXPLAIN
Syntax
EXPLAIN [ ( option [, ...] ) ] statement
The option can be the following:
FORMAT { TEXT | GRAPHVIZ | JSON }
TYPE { LOGICAL | DISTRIBUTED | VALIDATE | IO | COST}
Description
This statement is used to display the logical or distributed execution plan of a statement. It can also be used to verify an SQL statement or analyze I/Os.
The TYPE DISTRIBUTED parameter is used to display the fragmented plan. Each fragment is executed by one or more nodes. Fragments separation indicates that data is exchanged between two nodes. Fragment type indicates how a fragment is executed and how data is distributed among different fragments.
- HASH
Fragments are executed on a fixed number of nodes, and the input data is distributed using the hash function.
- ROUND_ROBIN
Fragments are executed on a fixed number of nodes, and input data is distributed in round-robin mode.
Example
- LOGICAL:
CREATE TABLE testTable (regionkey int, name varchar); EXPLAIN SELECT regionkey, count(*) FROM testTable GROUP BY 1; Query Plan ------------------------------------------------------------------------------------------------------------------------------------- Output[regionkey, _col1] │ Layout: [regionkey:integer, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} │ _col1 := count └─ RemoteExchange[GATHER] │ Layout: [regionkey:integer, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ Project[] │ Layout: [regionkey:integer, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ Aggregate(FINAL)[regionkey][$hashvalue] │ Layout: [regionkey:integer, $hashvalue:bigint, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} │ count := count("count_8") └─ LocalExchange[HASH][$hashvalue] ("regionkey") │ Layout: [regionkey:integer, count_8:bigint, $hashvalue:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ RemoteExchange[REPARTITION][$hashvalue_9] │ Layout: [regionkey:integer, count_8:bigint, $hashvalue_9:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ Aggregate(PARTIAL)[regionkey][$hashvalue_10] │ Layout: [regionkey:integer, $hashvalue_10:bigint, count_8:bigint] │ count_8 := count(*) └─ ScanProject[table = hive:default:testtable] Layout: [regionkey:integer, $hashvalue_10:bigint] Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}/{rows: 0 (0B), cpu: 0, memory: 0B, network: 0B} $hashvalue_10 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey"), 0)) regionkey := regionkey:int:0:REGULAR
- DISTRIBUTED:
EXPLAIN (type DISTRIBUTED) SELECT regionkey, count(*) FROM testTable GROUP BY 1; Query Plan ----------------------------------------------------------------------------------------------------------------------- Fragment 0 [SINGLE] Output layout: [regionkey, count] Output partitioning: SINGLE [] Stage Execution Strategy: UNGROUPED_EXECUTION Output[regionkey, _col1] │ Layout: [regionkey:integer, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} │ _col1 := count └─ RemoteSource[1] Layout: [regionkey:integer, count:bigint] Fragment 1 [HASH] Output layout: [regionkey, count] Output partitioning: SINGLE [] Stage Execution Strategy: UNGROUPED_EXECUTION Project[] │ Layout: [regionkey:integer, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ Aggregate(FINAL)[regionkey][$hashvalue] │ Layout: [regionkey:integer, $hashvalue:bigint, count:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} │ count := count("count_8") └─ LocalExchange[HASH][$hashvalue] ("regionkey") │ Layout: [regionkey:integer, count_8:bigint, $hashvalue:bigint] │ Estimates: {rows: ? (?), cpu: ?, memory: ?, network: ?} └─ RemoteSource[2] Layout: [regionkey:integer, count_8:bigint, $hashvalue_9:bigint] Fragment 2 [SOURCE] Output layout: [regionkey, count_8, $hashvalue_10] Output partitioning: HASH [regionkey][$hashvalue_10] Stage Execution Strategy: UNGROUPED_EXECUTION Aggregate(PARTIAL)[regionkey][$hashvalue_10] │ Layout: [regionkey:integer, $hashvalue_10:bigint, count_8:bigint] │ count_8 := count(*) └─ ScanProject[table = hive:default:testtable, grouped = false] Layout: [regionkey:integer, $hashvalue_10:bigint] Estimates: {rows: 0 (0B), cpu: 0, memory: 0B, network: 0B}/{rows: 0 (0B), cpu: 0, memory: 0B, network: 0B} $hashvalue_10 := "combine_hash"(bigint '0', COALESCE("$operator$hash_code"("regionkey"), 0)) regionkey := regionkey:int:0:REGULAR
- VALIDATE:
EXPLAIN (TYPE VALIDATE) SELECT id, count(*) FROM testTable GROUP BY 1; Valid ------- true
- I/O:
EXPLAIN (TYPE IO, FORMAT JSON) SELECT regionkey , count(*) FROM testTable GROUP BY 1; Query Plan --------------------------------- { "inputTableColumnInfos" : [ { "table" : { "catalog" : "hive", "schemaTable" : { "schema" : "default", "table" : "testtable" } }, "columnConstraints" : [ ] } ] }
- COST:
EXPLAIN (TYPE COST, FORMAT JSON) SELECT regionkey , count(*) FROM testTable GROUP BY 1; Query Plan ----------------------------- "Aggregated": { "CPU Time": 0.0 "MaxMemory": 0.0 "outputRows": 0.0 "outputSize(bytes)": 0.0 } (1 row)
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.