AnalyticDB for PostgreSQL allows you to read data from and write data to a Hadoop cluster. This topic describes how to read and write Hadoop Distributed File System (HDFS) data by using the gphdfs protocol in AnalyticDB for PostgreSQL.

Note To use this feature, submit a ticket.

Create an HDFS test file

Log on to an HDFS cluster and run the following commands to create a test file in a test directory:

root@namenode:/# hadoop fs -mkdir /test #Create a test directory in the HDFS root directory.
root@namenode:/# echo "1 abc" > data.txt #Create a local test file.
root@namenode:/# hadoop fs -put data.txt /test #Upload the test file to HDFS.
root@namenode:/# hadoop fs -cat /test/data.txt #View the test file content.
1 abc

Create a readable HDFS external table and read data

In the CREATE EXTERNAL TABLE statement, specify the HDFS cluster endpoint that uses the gphdfs external table protocol, test file path, file format, and delimiter. For more information about how to create an external table, see CREATE EXTERNAL TABLE.

CREATE READABLE EXTERNAL TABLE test (id int, name text)
LOCATION ('gphdfs://namenode_IP:port/test/data.txt')
FORMAT 'text' (delimiter ' ');
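The FORMAT 'text' (delimiter ' ') clause tells the server to split each line of the file on the space character and map the fields to the table columns in order. The following minimal Python sketch illustrates that mapping for the (id int, name text) table; it is for illustration only and is not gphdfs code:

```python
# Illustrative sketch (not gphdfs itself): how one line of a text-format
# file with a space delimiter maps to the (id int, name text) columns.
def parse_row(line, delimiter=" "):
    """Split one line of the data file into typed column values."""
    id_text, name = line.rstrip("\n").split(delimiter, 1)
    return int(id_text), name

print(parse_row("1 abc"))  # (1, 'abc')
```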

Execute the following statement to read data from the external table:

SELECT * FROM test;
The following result is returned:

 id | name
----+------
  1 | abc
(1 row)

Create a writable HDFS external table and write data

In the CREATE EXTERNAL TABLE statement, add the WRITABLE keyword to specify that the external table is writable, and specify the HDFS path to write to.

CREATE WRITABLE EXTERNAL TABLE test_write (id int, name text)
LOCATION ('gphdfs://namenode_IP:port/test/data.txt')
FORMAT 'text' (delimiter ' ');

Execute the following INSERT statement to write data:

INSERT INTO test_write VALUES(2, 'def');
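Conceptually, the write path is the inverse of the read path: each inserted row is serialized into one delimiter-separated text line on the HDFS side. The following Python sketch illustrates that serialization for this table; it is for illustration only and is not the actual gphdfs serializer:

```python
# Illustrative sketch (not gphdfs itself): serialize an inserted
# (id, name) row into the space-delimited text line that the
# writable external table emits.
def format_row(id_value, name, delimiter=" "):
    """Render one row in the external table's text format."""
    return f"{id_value}{delimiter}{name}\n"

print(format_row(2, "def"), end="")  # prints "2 def"
```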

Run the following command on the HDFS cluster to view the test file content and confirm that the data was written:

root@namenode:/# hadoop fs -cat /test/data.txt #View the test file content.
1 abc
2 def