AnalyticDB for PostgreSQL is largely compatible with Teradata syntax, but some statements and functions require conversion. This guide covers the syntax differences, data type mappings, and function equivalents you need to plan a migration from Teradata to AnalyticDB for PostgreSQL.
Migration requirements
A successful migration from Teradata to AnalyticDB for PostgreSQL meets the following goals:
- Completely migrate the underlying data platform of the data warehouse.
- Smoothly migrate the applications deployed on the data warehouse system.
- Migrate business platforms and data transparently, so that business users are unaware of the change.
- Ensure good performance of the data warehouse after migration.
- Set a reasonable migration schedule and plan.
- Fully reuse the original system architecture, ETL processes, data structures, and management tools.
Migration scope
A complete Teradata migration involves four areas:
- Historical data: Export data as text files using the specified delimiter and character encoding, store the files on Elastic Compute Service (ECS) instance local disks or in Object Storage Service (OSS) buckets on the same network as the AnalyticDB for PostgreSQL instance, and load them through OSS external tables or over the gpfdist protocol. Export DDL scripts from Teradata and batch-convert them to AnalyticDB for PostgreSQL syntax.
- Extract-Transform-Load (ETL) processes: Convert ETL statements to AnalyticDB for PostgreSQL Data Manipulation Language (DML) syntax using script-based conversion tools. Replace Teradata functions with their AnalyticDB for PostgreSQL equivalents based on the function mapping table in this document. Reconfigure and restart ETL jobs after the historical data migration completes.
- API connections: AnalyticDB for PostgreSQL supports the JDBC and ODBC protocols, so Business Intelligence (BI) frontend tools connect the same way as before. Update the connection address to the new instance to complete the API migration.
- Management tools: Deploy backup and recovery tools for the AnalyticDB for PostgreSQL instance and schedule regular recovery drills.
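As a sketch of the historical-data loading step, the following readable external table pulls pipe-delimited text files served by gpfdist; the host, port, file name, column list, and target table are placeholders, not part of the conversion tools:

```sql
-- Hypothetical example: load pipe-delimited text exported from Teradata.
-- Start gpfdist on the ECS instance that holds the files, for example:
--   gpfdist -d /data/exports -p 8081
CREATE READABLE EXTERNAL TABLE ext_test_table (
    first_column  DATE,
    second_column INTEGER
)
LOCATION ('gpfdist://192.168.0.10:8081/test_table.txt')  -- placeholder host/port/file
FORMAT 'TEXT' (DELIMITER '|' NULL '');

-- Load into the target table created from the converted DDL.
INSERT INTO test_table_stage SELECT * FROM ext_test_table;
```

The external table is only a pointer to the files; the `INSERT ... SELECT` performs the actual parallel load.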
Data type mapping
Most core data types map directly between Teradata and AnalyticDB for PostgreSQL. The script-based Data Definition Language (DDL) conversion tools handle batch conversion automatically.
| Teradata | AnalyticDB for PostgreSQL | Notes |
|---|---|---|
| CHAR | CHAR | Direct mapping |
| VARCHAR | VARCHAR | Direct mapping |
| LONG VARCHAR | VARCHAR(64000) | Fixed maximum length |
| VARBYTE(size) | BYTEA | Binary data type |
| BYTEINT | SMALLINT | BYTEINT is a 1-byte integer; PostgreSQL has no 1-byte integer type, so use the smallest integer type |
| SMALLINT | SMALLINT | Direct mapping |
| INTEGER | INTEGER | Direct mapping |
| DECIMAL(size,dec) | DECIMAL(size,dec) | Direct mapping |
| NUMERIC(precision,dec) | NUMERIC(precision,dec) | Direct mapping |
| FLOAT | FLOAT | Direct mapping |
| REAL | REAL | Direct mapping |
| DOUBLE PRECISION | DOUBLE PRECISION | Direct mapping |
| DATE | DATE | Direct mapping |
| TIME | TIME | Direct mapping |
| TIMESTAMP | TIMESTAMP | Direct mapping |
Table creation statements
The following examples show how a Teradata DDL statement maps to AnalyticDB for PostgreSQL. Key differences are annotated in the converted statement.
Teradata:
CREATE MULTISET TABLE test_table,NO FALLBACK ,
NO BEFORE JOURNAL,
NO AFTER JOURNAL,
CHECKSUM = DEFAULT,
DEFAULT MERGEBLOCKRATIO
(
first_column DATE FORMAT 'YYYYMMDD' TITLE 'COLUMN 1' NOT NULL,
second_column INTEGER TITLE 'COLUMN 2' NOT NULL ,
third_column CHAR(6) CHARACTER SET LATIN CASESPECIFIC TITLE 'COLUMN 3' NOT NULL ,
fourth_column CHAR(20) CHARACTER SET LATIN CASESPECIFIC TITLE 'COLUMN 4' NOT NULL,
fifth_column CHAR(1) CHARACTER SET LATIN CASESPECIFIC TITLE 'COLUMN 5' NOT NULL,
sixth_column CHAR(24) CHARACTER SET LATIN CASESPECIFIC TITLE 'COLUMN 6' NOT NULL,
seventh_column VARCHAR(18) CHARACTER SET LATIN CASESPECIFIC TITLE 'COLUMN 7' NOT NULL,
eighth_column DECIMAL(18,0) TITLE 'COLUMN 8' NOT NULL ,
nineth_column DECIMAL(18,6) TITLE 'COLUMN 9' NOT NULL )
PRIMARY INDEX ( first_column ,fourth_column )
PARTITION BY RANGE_N(first_column BETWEEN DATE '1999-01-01' AND DATE '2050-12-31' EACH INTERVAL '1' DAY );
CREATE INDEX test_index (first_column, fourth_column) ON test_table;
AnalyticDB for PostgreSQL:
-- MULTISET TABLE, NO FALLBACK, NO BEFORE JOURNAL, NO AFTER JOURNAL,
-- CHECKSUM, DEFAULT MERGEBLOCKRATIO: Teradata-specific clauses removed
-- FORMAT 'YYYYMMDD', TITLE, CHARACTER SET, CASESPECIFIC: column attributes removed
CREATE TABLE test_table
(
first_column DATE NOT NULL, -- FORMAT 'YYYYMMDD' removed; use TO_CHAR(first_column, 'YYYYMMDD') for display formatting
second_column INTEGER NOT NULL ,
third_column CHAR(6) NOT NULL ,
fourth_column CHAR(20) NOT NULL,
fifth_column CHAR(1) NOT NULL,
sixth_column CHAR(24) NOT NULL,
seventh_column VARCHAR(18) NOT NULL,
eighth_column DECIMAL(18,0) NOT NULL ,
nineth_column DECIMAL(18,6) NOT NULL )
DISTRIBUTED BY ( first_column ,fourth_column ) -- PRIMARY INDEX -> DISTRIBUTED BY
PARTITION BY RANGE(first_column) -- RANGE_N syntax converted to standard RANGE
(START (DATE '1999-01-01') INCLUSIVE
END (DATE '2050-12-31') INCLUSIVE
EVERY (INTERVAL '1 DAY' ) );
CREATE INDEX test_index ON test_table(first_column, fourth_column); -- column list moves after the table name
DDL differences summary
| Area | Teradata | AnalyticDB for PostgreSQL |
|---|---|---|
| Distribution key | PRIMARY INDEX (col1, col2) | DISTRIBUTED BY (col1, col2) |
| Partitioning | PARTITION BY RANGE_N(...) | PARTITION BY RANGE(...) with START/END/EVERY |
| Index creation | CREATE INDEX idx (cols) ON table | CREATE INDEX idx ON table(cols) |
| Column title | TITLE 'label' | COMMENT ON COLUMN table.col IS 'label'; |
| Column encoding | CHARACTER SET LATIN CASESPECIFIC | SET client_encoding = latin1; (session-level) |
| Table attributes | MULTISET, NO FALLBACK, CHECKSUM, etc. | Not supported; remove these clauses |
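For example, the TITLE labels stripped from the converted DDL above can be preserved as column comments, using the column names from the earlier example:

```sql
-- Preserve Teradata TITLE labels as column comments.
COMMENT ON COLUMN test_table.first_column IS 'COLUMN 1';
COMMENT ON COLUMN test_table.fourth_column IS 'COLUMN 4';
```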
Data import and export formats
Both Teradata and AnalyticDB for PostgreSQL support TXT and CSV file formats. The key difference is the separator:
- Teradata: two-character separators
- AnalyticDB for PostgreSQL: single-character separators

Convert the separators in your data files before loading.
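A minimal sketch of the separator conversion, assuming a hypothetical two-character Teradata separator `|~` that is rewritten to a single `|` before loading:

```python
def convert_separator(line: str, old_sep: str = "|~", new_sep: str = "|") -> str:
    """Replace a two-character Teradata field separator with a single character."""
    return line.replace(old_sep, new_sep)

# Example: one exported row using the hypothetical two-character separator.
row = "20240101|~42|~ABCDEF"
print(convert_separator(row))  # -> 20240101|42|ABCDEF
```

In practice you would stream each exported file through this conversion (or an equivalent `sed` pass) and pick a single-character separator that cannot occur inside the data.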
SQL statement differences
AnalyticDB for PostgreSQL supports most Teradata SQL syntax. The following statements require conversion.
CAST
Teradata CAST statements support format specifiers; AnalyticDB for PostgreSQL does not.
| Statement | Teradata | AnalyticDB for PostgreSQL |
|---|---|---|
| Integer cast | CAST(x AS INT FORMAT '999999') | Write a custom function for replacement |
| Date cast | CAST(x AS DATE FORMAT 'YYYYMMDD') | CAST(x AS DATE); the FORMAT clause is not needed because the YYYY-MM-DD format is supported natively |
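One way to write the custom replacement for the integer FORMAT cast is a small SQL function built on TO_CHAR; the function name and default picture here are assumptions, not part of the conversion tools:

```sql
-- Hypothetical replacement for CAST(x AS INT FORMAT '999999'):
-- TO_CHAR with a numeric picture produces the formatted string.
CREATE OR REPLACE FUNCTION format_int(x INTEGER, fmt TEXT DEFAULT '999999')
RETURNS TEXT AS
$$
  SELECT TRIM(TO_CHAR(x, fmt));
$$
LANGUAGE SQL;

-- SELECT format_int(123);  -- returns '123'
```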
QUALIFY
Teradata's QUALIFY keyword filters window function results inline. AnalyticDB for PostgreSQL does not support QUALIFY; rewrite as a subquery.
Teradata:
SELECT itemid, sumprice, RANK() OVER (ORDER BY sumprice DESC)
FROM (SELECT a1.item_id, SUM(a1.sale)
FROM sales AS a1
GROUP BY a1.item_id) AS t1 (itemid, sumprice)
QUALIFY RANK() OVER (ORDER BY sumprice DESC) <= 100;
AnalyticDB for PostgreSQL:
-- Wrap the window function in a subquery, then filter in the outer WHERE clause
SELECT itemid, sumprice, rank FROM
(SELECT itemid, sumprice, RANK() OVER (ORDER BY sumprice DESC) AS rank
FROM (SELECT a1.item_id, SUM(a1.sale)
FROM sales AS a1
GROUP BY a1.item_id) AS t1 (itemid, sumprice)
) AS a
WHERE rank <= 100;
MACRO
Teradata macros execute a group of SQL statements as a named unit. AnalyticDB for PostgreSQL does not support macros; use a FUNCTION instead.
Teradata:
CREATE MACRO Get_Emp_Salary(EmployeeNo INTEGER) AS (
SELECT
EmployeeNo,
NetPay
FROM
Salary
WHERE EmployeeNo = :EmployeeNo;
);
AnalyticDB for PostgreSQL:
CREATE OR REPLACE FUNCTION Get_Emp_Salary(
p_employee_no INTEGER, -- renamed: PostgreSQL rejects duplicate parameter names
OUT EmployeeNo INTEGER,
OUT NetPay FLOAT
) RETURNS SETOF RECORD AS
$$
SELECT EmployeeNo, NetPay
FROM Salary
WHERE EmployeeNo = $1
$$
LANGUAGE SQL;
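Calling the converted function then looks like an ordinary table-function query (the Salary table and its data are assumed):

```sql
-- Returns the EmployeeNo and NetPay rows for the given employee,
-- just as executing the Teradata macro did.
SELECT * FROM Get_Emp_Salary(101);
```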
Function mapping
The following table maps Teradata functions to their AnalyticDB for PostgreSQL equivalents. For functions with no direct equivalent, the table provides the SQL expression to use.
| Teradata | AnalyticDB for PostgreSQL | Description |
|---|---|---|
| ZEROIFNULL(x) | COALESCE(x, 0) | Converts NULL to 0 for cumulative calculations |
| NULLIFZERO(x) | NULLIF(x, 0) | Replaces 0 values with NULL for cumulative data |
| INDEX(str, substr) | POSITION(substr IN str) | Returns the position of a substring in a string |
| ADD_MONTHS(date, n) | date + n * INTERVAL '1 month' | Adds or subtracts a specified number of months to or from the input date |
| FORMAT | TO_CHAR / TO_DATE | Formats data as a string or date |
| CSUM(col, order_col) | Window function | Returns the cumulative sum of a value expression for each row in a partition |
| MAVG(col, n, order_col) | Window function | Computes the moving average of a specified column over a defined number of rows (query width) |
| MSUM(col, n, order_col) | Window function | Computes the moving sum of a specified column over a defined query width |
| MDIFF(col, n, order_col) | Window function | Computes the moving difference of a specified column over a defined query width |
| QUALIFY | Subquery with WHERE on the window function result | Filters window function results; see the QUALIFY section above |
| CHAR(str) / CHARACTERS(str) | LENGTH(str) | Returns the number of characters in a string |
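As a sketch, the ordered analytical functions in the table map onto standard window functions; the table and column names below are hypothetical:

```sql
-- Hypothetical table: sales(item_id, sale_date, amount).
SELECT
    item_id,
    sale_date,
    -- CSUM(amount, sale_date): running total
    SUM(amount) OVER (ORDER BY sale_date) AS csum,
    -- MAVG(amount, 3, sale_date): moving average over a 3-row width
    AVG(amount) OVER (ORDER BY sale_date ROWS 2 PRECEDING) AS mavg,
    -- MSUM(amount, 3, sale_date): moving sum over a 3-row width
    SUM(amount) OVER (ORDER BY sale_date ROWS 2 PRECEDING) AS msum,
    -- MDIFF(amount, 3, sale_date): difference from the value 3 rows earlier
    amount - LAG(amount, 3) OVER (ORDER BY sale_date) AS mdiff
FROM sales;
```

`ROWS 2 PRECEDING` is shorthand for `ROWS BETWEEN 2 PRECEDING AND CURRENT ROW`, which matches a query width of 3 rows.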