DML(Teradata)
本节主要介绍Teradata DML的迁移语法。迁移语法决定了关键字/特性的迁移方式。
在Teradata中,如果某文件中包含SELECT、INSERT、UPDATE、DELETE和MERGE语句,则该文件中的SQL查询可迁移到GaussDB(DWS)。
详见以下节点内容:
INSERT
Teradata的INSERT(缩写关键字为INS)语句用于向表中插入记录。DSC支持INSERT语句。
Teradata SQL中存在INSERT INTO TABLE table_name语法,但GaussDB(DWS)不支持。GaussDB(DWS)仅支持INSERT INTO table_name。DSC工具需要去除关键词TABLE。
输入
INSERT TABLE tab1 SELECT col1, col2 FROM tab2 WHERE col3 > 0;
输出
INSERT INTO tab1 SELECT col1, col2 FROM tab2 WHERE col3 > 0;
SELECT
- ANALYZE
Teradata的SELECT命令(缩写关键字为SEL)用于指定从哪一列中检索数据。
在GaussDB(DWS)中使用ANALYZE来收集优化器统计信息,这些统计信息将用于查询性能。
输入:ANALYZE,使用INSERT
1 2
INSERT INTO employee(empno,ename) Values (1,'John'); COLLECT STAT on employee;
输出
1 2 3
INSERT INTO employee( empno, ename) SELECT 1 ,'John'; ANALYZE employee;
输入:ANALYZE,使用UPDATE
1 2 3
UPD employee SET ename = 'Jane' WHERE ename = 'John'; COLLECT STAT on employee;
输出
1 2 3
UPDATE employee SET ename = 'Jane' WHERE ename = 'John'; ANALYZE employee;
输入:ANALYZE,使用DELETE
1 2
DEL FROM employee WHERE ID > 10; COLLECT STAT on employee;
输出
1 2
DELETE FROM employee WHERE ID > 10; ANALYZE employee;
- 子句顺序
从Teradata迁移SELECT语句时,各子句(FROM、WHERE、HAVING和GROUP BY)可按任意顺序排列。如果语句的FROM子句之前包含作为ALIAS的QUALIFY子句,则DSC不会迁移该语句。
可以使用tdMigrateALIAS参数来配置ALIAS的迁移。
输入:子句顺序
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT expr1 AS alias1 , expr2 AS alias2 , expr3 AS alias3 , MAX( expr4 ), ... FROM tab1 T1 INNER JOIN tab2 T2 ON T1.c1 = T2.c2 ... AND T3.c5 = '010' AND ... WHERE T1.c7 = '000' AND ... HAVING alias1 <> 'IC' AND alias2 <> 'IC' AND alias3 <> '' GROUP BY 1, 2, 3 ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT expr1 AS alias1 , expr2 AS alias2 , expr3 AS alias3 , MAX( expr4 ), ... FROM tab1 T1 INNER JOIN tab2 T2 ON T1.c1 = T2.c2 ... AND T3.c5 = '010' AND ... WHERE T1.c7 = '000' AND ... GROUP BY 1 ,2 ,3 HAVING expr1 <> 'IC' AND expr2 <> 'IC' AND expr3 <> '';
输入:子句顺序
1 2 3 4 5 6 7 8
SELECT TOP 10 * GROUP BY DeptNo WHERE empID < 100 FROM tbl_employee;
输出
1 2 3 4 5 6 7 8 9
SELECT * FROM tbl_employee WHERE empID < 100 GROUP BY DeptNo LIMIT 10 ;
如果输入脚本的FROM子句之前包含作为ALIAS的QUALIFY子句,DSC将不会迁移该语句,也不会逐字复制输入的语句。
输入:子句顺序,在FROM子句之前使用QUALIFY作为ALIAS
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT * FROM table1 WHERE abc = ( SELECT col1 AS qualify FROM TABLE WHERE col1 = 5 ) ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT * FROM table1 WHERE abc = ( SELECT col1 AS qualify FROM TABLE WHERE col1 = 5 ) ;
- 扩展Group By子句
如果用户希望数据库根据expr(s)的值对选定的行进行分组,则可指定GROUP BY子句。 如果此子句包含CUBE,ROLLUP或GROUPING SETS扩展,则除了常规分组之外,数据库还会生成超级聚合分组。这些特性在GaussDB(DWS)中不可用,使用UNION ALL操作符可以实现类似的功能。
可以使用extendedGroupByClause参数来配置扩展GROUP BY子句的迁移。
输入:扩展Group By子句,使用CUBE
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT expr1 AS alias1 , expr2 AS alias2 , expr3 AS alias3 , MAX( expr4 ), ... FROM tab1 T1 INNER JOIN tab2 T2 ON T1.c1 = T2.c2 ... AND T3.c5 = '010' AND ... WHERE T1.c7 = '000' AND ... HAVING alias1 <> 'IC' AND alias2 <> 'IC' AND alias3 <> '' GROUP BY 1, 2, 3 ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14
SELECT expr1 AS alias1 , expr2 AS alias2 , expr3 AS alias3 , MAX( expr4 ), ... FROM tab1 T1 INNER JOIN tab2 T2 ON T1.c1 = T2.c2 ... AND T3.c5 = '010' AND ... WHERE T1.c7 = '000' AND ... GROUP BY 1 ,2 ,3 HAVING expr1 <> 'IC' AND expr2 <> 'IC' AND expr3 <> '';
输入:扩展Group By子句,使用ROLLUP
1 2 3 4 5
SELECT d.dname, e.job, MAX(e.sal) FROM emp e RIGHT OUTER JOIN dept d ON e.deptno=d.deptno WHERE e.job IS NOT NULL GROUP BY ROLLUP (d.dname, e.job);
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
SELECT dname, job, ColumnAlias1 FROM ( SELECT MAX(e.sal) AS ColumnAlias1, d.dname, e.job FROM emp e RIGHT OUTER JOIN dept d ON e.deptno = d.deptno WHERE e.job IS NOT NULL GROUP BY d.dname ,e.job UNION ALL SELECT MAX(e.sal) AS ColumnAlias1, d.dname, NULL AS job FROM emp e RIGHT OUTER JOIN dept d ON e.deptno = d.deptno WHERE e.job IS NOT NULL GROUP BY d.dname UNION ALL SELECT MAX( e.sal ) AS ColumnAlias1, NULL AS dname, NULL AS job FROM emp e RIGHT OUTER JOIN dept d ON e.deptno = d.deptno WHERE e.job IS NOT NULL );
输入:扩展Group By子句,使用GROUPING SETS
1 2 3 4 5
SELECT d.dname, e.job, MAX(e.sal) FROM emp e RIGHT OUTER JOIN dept d ON e.deptno=d.deptno WHERE e.job IS NOT NULL GROUP BY GROUPING SETS(d.dname, e.job);
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
SELECT dname, job, ColumnAlias1 FROM ( SELECT MAX(e.sal) AS ColumnAlias1 , d.dname, NULL AS job FROM emp e RIGHT OUTER JOIN dept d ON e.deptno = d.deptno WHERE e.job IS NOT NULL GROUP BY d.dname UNION ALL SELECT MAX(e.sal) AS ColumnAlias1 , NULL AS dname, e.job FROM emp e RIGHT OUTER JOIN dept d ON e.deptno = d.deptno WHERE e.job IS NOT NULL GROUP BY e.job );
- TOP和SAMPLE子句
Teradata的TOP和SAMPLE子句在GaussDB(DWS)中迁移为LIMIT。
- TOP
- 对于包含WITH TIES的TOP语句,需要指定ORDER BY子句,否则工具不会迁移该语句,只会原样复制。
- 使用TOP和动态参数时:
- 按照以下形式输入动态参数:
1
TOP :<parameter_name>
可使用的字符包括:小写英文字母(a-z)、大写英文字母(A-Z)、数字(0-9)、下划线(_)
- 按照以下形式输入动态参数:
输入:SELECT...TOP
1 2 3 4
SELECT TOP 1 c1, COUNT (*) cnt FROM tab1 GROUP BY c1 ORDER BY cnt;
输出
1 2 3 4 5
SELECT c1, COUNT( * ) cnt FROM tab1 GROUP BY c1 ORDER BY cnt LIMIT 1;
输入:SELECT...TOP PERCENT
1 2 3 4
SELECT TOP 10 PERCENT c1, c2 FROM employee WHERE ... ORDER BY c2 DESC;
输出
1 2 3 4 5 6 7 8 9 10
WITH top_percent AS ( SELECT c1, c2 FROM employee WHERE ... ORDER BY c2 DESC ) SELECT * FROM top_percent LIMIT (SELECT CEIL(COUNT( * ) * 10 / 100) FROM top_percent);
输入:SELECT...TOP,使用动态参数
1 2 3 4 5 6 7 8 9 10 11 12
SELECT TOP :Limit WITH TIES c1 ,SUM (c2) sc2 FROM tab1 WHERE c3 > 10 GROUP BY c1 ORDER BY c1 ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
WITH top_ties AS ( SELECT c1 ,SUM (c2) sc2 ,rank ( ) OVER( ORDER BY c1 ) AS TOP_RNK FROM tab1 WHERE c3 > 10 GROUP BY c1 ) SELECT c1 ,sc2 FROM top_ties WHERE TOP_RNK <= :Limit ORDER BY TOP_RNK ;
输入:SELECT...TOP,使用动态参数和TIES
1 2 3 4 5 6 7
SELECT TOP :Limit WITH TIES Customer_ID FROM Customer_t ORDER BY Customer_ID ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
WITH top_ties AS ( SELECT Customer_ID ,rank ( ) OVER( order by Customer_id) AS TOP_RNK FROM Customer_t ) SELECT Customer_ID FROM top_ties WHERE TOP_RNK <= :Limit ORDER BY TOP_RNK ;
输入:SELECT...TOP PERCENT,使用动态参数
1 2 3 4 5 6 7 8 9 10
SELECT TOP :Input_Limit PERCENT WITH TIES c1 ,SUM (c2) sc2 FROM tab1 GROUP BY c1 ORDER BY c1 ;
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
WITH top_percent_ties AS ( SELECT c1 ,SUM (c2) sc2 ,rank ( ) OVER( ORDER BY c1 ) AS TOP_RNK FROM tab1 GROUP BY c1 ) SELECT c1 ,sc2 FROM top_percent_ties WHERE TOP_RNK <= ( SELECT CEIL(COUNT( * ) * :Input_Limit / 100) FROM top_percent_ties ) ORDER BY TOP_RNK ;
- SAMPLE
工具仅支持在SAMPLE子句中使用单个正整数。
输入:SELECT...SAMPLE
1 2 3 4
SELECT c1, c2, c3 FROM tab1 WHERE c1 > 1000 SAMPLE 1;
输出
1 2 3 4
SELECT c1, c2, c3 FROM tab1 WHERE c1 > 1000 LIMIT 1;
- TOP
UPDATE
该工具支持和迁移UPDATE语句(缩写关键字为UPD)。
输入:UPDATE,使用TABLE ALIAS
1 2 3 4 5 |
UPDATE T1 FROM tab1 T1, tab2 T2 SET c1 = T2.c1 , c2 = T2.c2 WHERE T1.c3 = T2.c3; |
输出
1 2 3 4 5 |
UPDATE tab1 T1 SET c1 = T2.c1 , c2 = T2.c2 FROM tab2 T2 WHERE T1.c3 = T2.c3; |
输入:UPDATE,使用TABLE ALIAS和子查询
1 2 3 4 5 |
UPDATE t1 FROM tab1 t1, ( SELECT c1, c2 FROM tab2 WHERE c2 > 100 ) t2 SET c1 = t2.c1 WHERE t1.c2 = t2.c2; |
输出
1 2 3 4 5 |
UPDATE tab1 t1 SET c1 = t2.c1 FROM ( SELECT c1, c2 FROM tab2 WHERE c2 > 100 ) t2 WHERE t1.c2 = t2.c2; |
输入:UPDATE,使用ANALYZE
1 2 3 |
UPD employee SET ename = 'Jane' WHERE ename = 'John'; COLLECT STAT on employee; |
输出
1 2 3 |
UPDATE employee SET ename = 'Jane' WHERE ename = 'John'; ANALYZE employee; |
DELETE
DELETE(缩写关键字为DEL)是ANSI标准的SQL语法操作符,用于从表中删除记录。DSC支持Teradata的DELETE语句及其缩写关键字DEL。不包含WHERE子句的DELETE语句在GaussDB(DWS)中被迁移为TRUNCATE。通过deleteToTruncate参数可以配置是否启用/禁用此行为。
输入:DELETE
1 2 |
DEL FROM tab1 WHERE a =10; |
输出
1 2 |
DELETE FROM tab1 WHERE a =10; |
输入:DELETE,不使用WHERE(如果deletetoTruncate=TRUE,则迁移为TRUNCATE)
1
|
DELETE FROM ${schemaname} . "tablename" ALL; |
输出
1 2 3 |
TRUNCATE TABLE ${schemaname} . "tablename"; |
以下输入示例中,DELETE和FROM子句引用相同表,按是否使用WHERE子句区分:
输入
DELETE DP_TMP.M_P_TX_SCV_REMAINING_PARTY FROM DP_TMP.M_P_TX_SCV_REMAINING_PARTY ALL ; --- DELETE DP_VMCTLFW.CTLFW_Process_Id FROM DP_VMCTLFW.CTLFW_Process_Id WHERE (Process_Name = :_spVV2 ) AND (Process_Id NOT IN (SELECT MAX(Process_Id )(NAMED Process_Id ) FROM DP_VMCTLFW.CTLFW_Process_Id WHERE Process_Name = :_spVV2 ) ); --- DELETE CPID FROM DP_VMCTLFW.CTLFW_Process_Id AS CPID WHERE (Process_Name = :_spVV2 ) AND (Process_Id NOT IN (SELECT MAX(Process_Id )(NAMED Process_Id ) FROM DP_VMCTLFW.CTLFW_Process_Id WHERE Process_Name = :_spVV2 ) );
输出
DELETE FROM DP_TMP.M_P_TX_SCV_REMAINING_PARTY; --- DELETE FROM DP_VMCTLFW.CTLFW_Process_Id WHERE (Process_Name = :_spVV2 ) AND (Process_Id NOT IN (SELECT MAX(Process_Id )(NAMED Process_Id ) FROM DP_VMCTLFW.CTLFW_Process_Id WHERE Process_Name = :_spVV2 ) ); --- DELETE FROM DP_VMCTLFW.CTLFW_Process_Id AS CPID WHERE (Process_Name = :_spVV2 ) AND (Process_Id NOT IN (SELECT MAX(Process_Id )(NAMED Process_Id ) FROM DP_VMCTLFW.CTLFW_Process_Id WHERE Process_Name = :_spVV2 ) );
DELETE table_alias FROM table
输入
SQL_Detail10124.sql delete a from ${BRTL_DCOR}.BRTL_CS_POT_CUST_UMPAY_INF_S as a where a.DW_Snsh_Dt = cast('${v_Trx_Dt}' as date format 'yyyy-mm-dd') and a.DW_Job_Seq = 1 ; was migrated as below: DELETE FROM BRTL_DCOR.BRTL_CS_POT_CUST_UMPAY_INF_S AS a USING WHERE a.DW_Snsh_Dt = CAST( lv_mig_v_Trx_Dt AS DATE ) AND a.DW_Job_Seq = 1 ; SQL_Detail10449.sql delete a from ${BRTL_DCOR}.BRTL_EM_YISHITONG_USR_INF as a where a.DW_Job_Seq = 1 ; was migrated as below: DELETE FROM BRTL_DCOR.BRTL_EM_YISHITONG_USR_INF AS a USING WHERE a.DW_Job_Seq = 1 ; SQL_Detail5742.sql delete a from ${BRTL_DCOR}.BRTL_PD_FP_NAV_ADT_INF as a; was migrated as DELETE a FROM BRTL_DCOR.BRTL_PD_FP_NAV_ADT_INF AS a ;
输出
SQL_Detail10124.sql delete from ${BRTL_DCOR}.BRTL_CS_POT_CUST_UMPAY_INF_S as a where a.DW_Snsh_Dt = cast('${v_Trx_Dt}' as date format 'yyyy-mm-dd') and a.DW_Job_Seq = 1 ; SQL_Detail10449.sql delete from ${BRTL_DCOR}.BRTL_EM_YISHITONG_USR_INF as a where a.DW_Job_Seq = 1 ; SQL_Detail5742.sql delete from ${BRTL_DCOR}.BRTL_PD_FP_NAV_ADT_INF as a;
MERGE
6.5.0及之后版本的GaussDB支持MERGE功能。
MERGE是ANSI标准的SQL语法操作符,用于从一个或多个来源中选择行来更新或插入到表或视图中,可以指定更新或插入到目标表或视图的条件。
输入:MERGE
1 2 3 4 5 6 7 8 |
MERGE INTO tab1 A using ( SELECT c1, c2, ... FROM tab2 WHERE ...) AS B ON A.c1 = B.c1 WHEN MATCHED THEN UPDATE SET c2 = c2 , c3 = c3 WHEN NOT MATCHED THEN INSERT VALUES (B.c1, B.c2, B.c3); |
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
WITH B AS ( SELECT c1 ,c2 ,... FROM tab2 WHERE ... ) ,UPD_REC AS ( UPDATE tab1 A SET c2 = c2 ,c3 = c3 FROM B WHERE A.c1 = B.c1 returning A. * ) INSERT INTO tab1 SELECT B.c1 ,B.c2 ,B.c3 FROM B WHERE NOT EXISTS ( SELECT 1 FROM UPD_REC A WHERE A.c1 = B.c1 ) ; |
NAMED
Teradata中的NAMED用于为表达式或列分配临时名称。用于表达式的NAMED语句在GaussDB(DWS)中被迁移为AS。用于列名的NAMED语句保留在相同的语法中。
输入:NAMED表达式,迁移为AS
1 2 3 |
SELECT Name, ((Salary + (YrsExp * 200))/12) (NAMED Projection) FROM Employee WHERE DeptNo = 600 AND Projection < 2500; |
输出
1 2 3 |
SELECT Name, ((Salary + (YrsExp * 200))/12) AS Projection FROM Employee WHERE DeptNo = 600 AND ((Salary + (YrsExp * 200))/12) < 2500; |
输入:NAMED AS,定义列名
1 2 |
SELECT product_id AS id FROM emp where pid=2 or id=2; |
输出
1 2 |
SELECT product_id (NAMED "pid") AS id FROM emp where product_id=2 or product_id=2; |
输入:NAMED( ),定义列名
1
|
INSERT INTO Neg100 (NAMED,ID,Dept) VALUES ('TEST',1,'IT'); |
输出
1
|
INSERT INTO Neg100 (NAMED,ID,Dept) SELECT 'TEST',1, 'IT'; |
输入:NAMED别名,使用TITLE别名,不使用AS
1 2 3 |
SELECT dept_name (NAMED alias1) (TITLE alias2 ) FROM employee WHERE dept_name like 'Quality'; |
输出
1 2 3 4 |
SELECT dept_name AS alias1 FROM employee WHERE dept_name like 'Quality'; |
输入:NAMED别名,使用TITLE别名和AS
DSC将跳过NAMED别名和TITLE别名,仅使用AS别名。
1 2 3 4 |
SELECT sale_name (Named alias1 ) (Title alias2) AS alias3 FROM employee WHERE sname = 'Stock' OR sname ='Sales'; |
输出
1 2 3 4 |
SELECT sale_name AS alias3 FROM employee WHERE sname = 'Stock' OR sname ='Sales'; |
输入:NAMED,使用TITLE
NAMED和TITLE一起使用,通过逗号隔开。
1 2 3 |
SELECT customer_id (NAMED cust_id, TITLE 'Customer Id') FROM Customer_T WHERE cust_id > 10; |
输出
1 2 3 4 |
SELECT cust_id AS "Customer Id" FROM (SELECT customer_id AS cust_id FROM customer_t WHERE cust_id > 10); |
ACTIVITYCOUNT
输入
状态变量,返回嵌入式SQL中受DML语句影响的行数。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 |
SEL tablename FROM dbc.tables WHERE databasename ='tera_db' AND tablename='tab1'; .IF ACTIVITYCOUNT > 0 THEN .GOTO NXTREPORT; CREATE MULTISET TABLE tera_db.tab1 , NO FALLBACK , NO BEFORE JOURNAL , NO AFTER JOURNAL , CHECKSUM = DEFAULT ( Tx_Zone_Num CHAR( 4 ) , Tx_Org_Num VARCHAR( 30 ) ) PRIMARY INDEX ( Tx_Org_Num ) INDEX ( Tx_Teller_Id ) ; .LABEL NXTREPORT DEL FROM tera_db.tab1; |
输出
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 |
DECLARE v_verify TEXT ; v_no_data_found NUMBER ( 1 ) ; BEGIN BEGIN v_no_data_found := 0 ; SELECT mig_td_ext.vw_td_dbc_tables.tablename INTO v_verify FROM mig_td_ext.vw_td_dbc_tables WHERE mig_td_ext.vw_td_dbc_tables.schemaname = 'tera_db' AND mig_td_ext.vw_td_dbc_tables.tablename = 'tab1' ; EXCEPTION WHEN NO_DATA_FOUND THEN v_no_data_found := 1 ; END ; IF v_no_data_found = 1 THEN CREATE TABLE tera_db.tab1 ( Tx_Zone_Num CHAR( 4 ) ,Tx_Org_Num VARCHAR( 30 ) ) DISTRIBUTE BY HASH ( Tx_Org_Num ) ; CREATE INDEX ON tera_db.tab1 ( Tx_Teller_Id ) ; END IF ; DELETE FROM tera_db.tab1 ; END ; / |
TIMESTAMP
输入:TIMESTAMP,使用FORMAT
FORMAT短语设置特定TIME或TIMESTAMP列或值的格式。FORMAT短语会覆盖系统格式。
1 2 |
SELECT 'StartDTTM' as a ,CURRENT_TIMESTAMP (FORMAT 'HH:MI:SSBMMMBDD,BYYYY'); |
输出
1 2 |
SELECT 'StartDTTM' AS a ,TO_CHAR( CURRENT_TIMESTAMP ,'HH:MI:SS MON DD, YYYY' ) ; |
TIMESTAMP类型装换:
输入
COALESCE( a.Snd_Tm ,TIMESTAMP '0001-01-01 00:00:00' ) should be migrated as below: COALESCE( a.Snd_Tm , CAST('0001-01-01 00:00:00' AS TIMESTAMP) )
输出
COALESCE( a.Snd_Tm , CAST('0001-01-01 00:00:00' AS TIMESTAMP) )