Updated on 2024-05-21 GMT+08:00

UNION, CASE, and Related Constructs

SQL UNION constructs must match up possibly dissimilar types to become a single result set. The resolution algorithm is applied separately to each output column of a union query. The INTERSECT and EXCEPT constructs resolve dissimilar types in the same way as UNION. The CASE, ARRAY, VALUES, GREATEST, and LEAST constructs use the identical algorithm to match up their component expressions and select a result data type.

Type Resolution for UNION, CASE, and Related Constructs

  • If all inputs are of the same type and that type is not unknown, resolve them as that type.
  • If all inputs are of the unknown type, resolve them as the text type (the preferred type of the string category). Otherwise, unknown inputs are ignored. (Exception: The UNION operation resolves a group of two unknown types into the text type, and then continues to match the type with other groups.)
  • If the inputs are not all of the same type category, the resolution fails. (Inputs of the unknown type are not counted in this check.)
  • If the inputs are all of the same type category, choose the top preferred type in that category. (Exception: The UNION operation regards the type of the first branch as the selected type.)

    typcategory in the pg_type system catalog indicates the data type category. typispreferred indicates whether a type is preferred in typcategory.

  • Convert all inputs to the selected type. (The original lengths of strings are retained.) Fail if there is no implicit conversion from a given input to the selected type.
  • If the input contains the json, txid_snapshot, sys_refcursor, or geometry type, UNION cannot be performed.
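The catalog columns and rules above can be explored directly. The following is a minimal sketch, assuming the listed type names exist in pg_type on your instance (the names in the IN list are illustrative picks, not an exhaustive set). The pg_typeof calls report whichever type the resolution rules select, so no output is shown here.

```sql
-- List the category and preferred flag for a few common types.
-- The type names below are illustrative; adjust them to the types you care about.
SELECT typname, typcategory, typispreferred
FROM pg_type
WHERE typname IN ('int4', 'int8', 'float8', 'numeric', 'varchar', 'text')
ORDER BY typcategory, typname;

-- Inspect the type selected for a CASE whose branches differ within one category.
SELECT pg_typeof(CASE WHEN true THEN 1 ELSE 2.5 END);

-- Inspect the type selected when every input is an unknown-type literal.
SELECT pg_typeof(CASE WHEN true THEN 'a' ELSE 'b' END);
```

Comparing the pg_typeof results with the typcategory and typispreferred values of the branch types is a quick way to check how a given expression was resolved.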

Type Resolution for CASE and COALESCE in TD Compatibility Mode

  • If all inputs are of the same type and that type is not unknown, resolve them as that type.
  • If all inputs are of the unknown type, resolve them as the text type.
  • If the inputs are a mix of string types (including unknown, which is resolved as text) and numeric types, resolve them as the string type. If the inputs mix any other type categories, an error is reported.
  • If the inputs are all of the same type category, choose the top preferred type in that category.
  • Convert all inputs to the selected type. Fail if there is not an implicit conversion from a given input to the selected type.
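As a rough illustration of the string-plus-numeric rule, the sketch below assumes a session connected to a database created with dbcompatibility = 'TD'. The literals are made up for the example, and pg_typeof is used only to report the type that the rules select.

```sql
-- Assumes a TD-compatible database.
-- The branches mix a numeric type (int) with a string type (varchar).
SELECT pg_typeof(CASE WHEN true THEN 1 ELSE 'a'::varchar END);

-- The same mix of argument types passed to COALESCE.
SELECT pg_typeof(COALESCE(1, 'a'::varchar));
```

According to the rules above, both expressions should resolve to a string type in TD mode, whereas the same statements are expected to fail in a database that does not apply the TD rule.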

Type Resolution for CASE in ORA Compatibility Mode

decode(expr, search1, result1, search2, result2, ..., defresult): When sql_beta_feature is set to a_style_coerce, the return type of the expression is the data type of result1, or a higher-precision data type in the same category as result1, as in ORA-compatible mode. (For example, numeric and int are both numeric data types, but numeric has higher precision and priority than int.) For CASE WHEN, the behavior is the same as the default behavior in ORA-compatible mode.

  • If all inputs are of the same type and that type is not unknown, resolve them as that type. Otherwise, proceed to the next step.
  • Set the data type of result1 to the final return value type preferType, which belongs to preferCategory.
  • Consider the data types of result2, result3, ..., and defresult in sequence. If a type belongs to preferCategory (the category of result1), check whether its precision (priority) is higher than that of preferType. If it is, update preferType to that higher-precision type. If a type does not belong to preferCategory, check whether it can be implicitly converted to preferType. If it cannot, an error is reported.
  • Use the data type recorded in preferType as the return type of the expression. The expression result is implicitly converted to this data type.
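The steps above can be traced with a decode call that has more than two results. This is a sketch only: it assumes an ORA-compatible database, and the literals and lengths are arbitrary. No output is shown because the result depends on the instance configuration.

```sql
-- Assumes an ORA-compatible database; enable the decode coercion behavior.
SET sql_beta_feature = 'a_style_coerce';

-- result1 is char, result2 is varchar, and defresult is text,
-- so every result belongs to the character category.
-- preferType starts as char and is raised when a later result has higher priority.
SELECT pg_typeof(decode(1, 2, 'a'::char(1), 3, 'b'::varchar(10), 'c'::text));

-- Restore the setting.
SET sql_beta_feature = 'none';
```

Reordering the results changes which type seeds preferType, which is a simple way to see whether the order of results or the priority of types drives the outcome.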

Note 1:

There is a special case in which a very large number in character form is converted to the numeric type, for example, select decode(1, 2, 2, '53465465676465454657567678676'), where the number exceeds the range of the bigint and double types. To be compatible with this case, if result1 is of a numeric type and the inputs are not all of the same type, the type of the return value is set to numeric.
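To check this special case on a given instance, the statement from the note can be wrapped in pg_typeof. This is a minimal sketch that assumes an ORA-compatible database.

```sql
-- Assumes an ORA-compatible database with the decode coercion behavior enabled.
SET sql_beta_feature = 'a_style_coerce';

-- result1 is the integer 2; defresult is a character literal holding a very large number.
SELECT pg_typeof(decode(1, 2, 2, '53465465676465454657567678676'));

-- Restore the setting.
SET sql_beta_feature = 'none';
```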

Note 2:

Priority of the numeric types: numeric > float8 > float4 > int8 > int4 > int2 > int1

Priority of the character types: text > varchar (nvarchar2) > bpchar > char

Priority of date types: timestamptz > timestamp > smalldatetime > date > abstime > timetz > time

Priority of date span types: interval > tinterval > reltime

Note 3:

The following figure shows the implicit type conversions that are supported when sql_beta_feature is set to 'a_style_coerce' in ORA-compatible mode. \ indicates that conversion is not required, yes indicates that conversion is supported, and a blank value indicates that conversion is not supported.

Examples

Example 1: Type resolution with underspecified types in a union. Here, the unknown-type literal 'b' is resolved to the text type.

gaussdb=# SELECT text 'a' AS "text" UNION SELECT 'b';
 text
------
 a
 b
(2 rows)

Example 2: Type resolution in a simple union. The literal 1.2 is of the numeric type, and the integer value 1 can be implicitly cast to numeric, so numeric is used.

gaussdb=# SELECT 1.2 AS "numeric" UNION SELECT 1;
 numeric
---------
       1
     1.2
(2 rows)

Example 3: Type resolution in a transposed union. Because the real type cannot be implicitly cast to integer, but integer can be implicitly cast to real, the union result type is resolved as real.

gaussdb=# SELECT 1 AS "real" UNION SELECT CAST('2.2' AS REAL);
 real
------
    1
  2.2
(2 rows)

Example 4: In TD mode, if input parameters for COALESCE are of int and varchar types, resolve them as the varchar type. In ORA mode, an error is reported.

-- In Oracle mode, create the oracle_1 database compatible with Oracle.
gaussdb=# CREATE DATABASE oracle_1 dbcompatibility = 'ORA';

-- Switch to the oracle_1 database.
gaussdb=# \c oracle_1

-- Create the t1 table.
oracle_1=# CREATE TABLE t1(a int, b varchar(10));
-- View the execution plan of a query whose COALESCE arguments are of the int and varchar types.
oracle_1=# EXPLAIN SELECT coalesce(a, b) FROM t1;
ERROR:  COALESCE types integer and character varying cannot be matched
LINE 1: EXPLAIN SELECT coalesce(a, b) FROM t1;
                                   ^
CONTEXT:  referenced column: coalesce

-- Delete the table.
oracle_1=# DROP TABLE t1;

-- Switch to the testdb database.
oracle_1=# \c testdb

-- In TD mode, create the td_1 database compatible with Teradata.
gaussdb=# CREATE DATABASE td_1 dbcompatibility = 'TD';

-- Switch to the td_1 database.
gaussdb=# \c td_1

-- Create the t2 table.
td_1=# CREATE TABLE t2(a int, b varchar(10));

-- View the execution plan of a query whose COALESCE arguments are of the int and varchar types.
td_1=# EXPLAIN VERBOSE select coalesce(a, b) from t2;
                                      QUERY PLAN
---------------------------------------------------------------------------------------
 Data Node Scan  (cost=0.00..0.00 rows=0 width=0)
   Output: (COALESCE((t2.a)::character varying, t2.b))
   Node/s: All DNs
   Remote query: SELECT COALESCE(a::character varying, b) AS "coalesce" FROM public.t2
(4 rows)

-- Delete the table.
td_1=# DROP TABLE t2;

-- Switch to the testdb database.
td_1=# \c testdb

-- Delete Oracle- and TD-compatible databases.
gaussdb=# DROP DATABASE oracle_1;
gaussdb=# DROP DATABASE td_1;

Example 5: In ORA mode, set the final return value type of the expression to the data type of result1 or a higher-precision data type whose category is the same as that of the data type of result1.

-- In ORA mode, create the ora_1 database compatible with ORA.
gaussdb=# CREATE DATABASE ora_1 dbcompatibility = 'A';

-- Switch to the ora_1 database.
gaussdb=# \c ora_1

-- Enable the decode compatibility parameter.
ora_1=# SET sql_beta_feature = 'a_style_coerce';

-- Create the t1 table.
ora_1=# CREATE TABLE t1(c_int int, c_float8 float8, c_char char(10), c_text text, c_date date);

-- Insert data.
ora_1=# INSERT INTO t1 VALUES(1, 2, '3', '4', date '12-10-2010');

-- The data type of result1 is char and that of defresult is text. The precision of text is higher, and the type of the return value is changed to text from char.
ora_1=# SELECT decode(1, 2, c_char, c_text) AS result, pg_typeof(result) FROM t1;
 result | pg_typeof 
--------+-----------
 4      | text
(1 row)

-- The data type of result1 is int, which is a numeric type. The type of the return value is set to numeric.
ora_1=# SELECT decode(1, 2, c_int, c_float8) AS result, pg_typeof(result) FROM t1;
 result | pg_typeof 
--------+-----------
      2 | numeric
(1 row)

-- No implicit conversion exists from the data type of defresult to that of result1, so an error is reported.
ora_1=# SELECT decode(1, 2, c_int, c_date) FROM t1;
ERROR:  CASE types integer and timestamp without time zone cannot be matched
LINE 1: SELECT decode(1, 2, c_int, c_date) FROM t1;
                                   ^
CONTEXT:  referenced column: c_date

-- Disable the decode compatibility parameter.
ora_1=# SET sql_beta_feature = 'none';

-- Delete the table.
ora_1=# DROP TABLE t1;
DROP TABLE

-- Switch to the testdb database.
ora_1=# \c testdb

-- Delete the ORA-compatible database.
gaussdb=# DROP DATABASE ora_1;
DROP DATABASE

Example 6: The UNION operation resolves a group of two unknown types into the text type, and then continues to match the type with other groups.

-- Resolve the first two NULL values of the unknown type as the text type. Then, match the text type with the third element of the varchar2 type, and select the text type.
gaussdb=# SELECT "text", pg_typeof("text") as type from (SELECT NULL AS "text" UNION ALL SELECT NULL AS "text" UNION ALL SELECT 'a'::varchar2 as "text");
 text | type
------+------
      | text
      | text
 a    | text
(3 rows)