This topic describes how to use clickhouse-client to import data from a local file. clickhouse-client is a command line client provided by ClickHouse.

Procedure

  1. Obtain the endpoint of the ApsaraDB for ClickHouse cluster to which you want to import data. For more information, see Connect to a cluster.
  2. Download the clickhouse-client program that applies to the ClickHouse version and the OS that you use.
    The following table provides the download link of the clickhouse-client program that applies to the listed ClickHouse version and OS.
    OS ClickHouse version Download link
    Linux OS 19.15.2.2 Download

    For information about how to download and install other versions of the client, see Install and deploy ClickHouse.

  3. Run the following command to import data:
    cat <data_file> | ./clickhouse-client --host=<host> --port=<port> --user=<username> --password=<password> --query="INSERT INTO <table_name> FORMAT <format>";
    Note Data import over the Internet is slow. If you want to import a large amount of data or you need to test the import performance, we recommend that you use one of the following methods:
    • Import data over an internal network: Purchase an Elastic Compute Service (ECS) instance. Connect the ECS instance to the virtual private cloud (VPC) that is connected to the destination ApsaraDB for ClickHouse cluster. Connect to the ApsaraDB for ClickHouse cluster from the ECS instance by using the private endpoint of the cluster and import data to the cluster.
    • Import data in multiple concurrent processes: Divide the original file into multiple parts, and then start multiple clients to concurrently import these parts.
    Table 1. Formats of the imported files
    Common file format Description
    TabSeparated Data is written into such a text file by row. Columns are separated with tabs. Escape characters are required to write tabs, line feeds, and backslashes. The escape character for a tab is \t. The escape character for a line feed is \n. The escape character for a backslash is \\. \N represents NULL.
    TabSeparatedWithNames This file format is similar to the TabSeparated format, except that the values in the first row are the column names. The first row is ignored during parsing. Therefore, the column names in the first row cannot be used to determine the positions of columns.
    TabSeparatedWithNamesAndTypes This file format is similar to the TabSeparated format, except that the values in the first row are the column names and the values in the second row are the data types. The first and second rows are ignored during parsing.
    CSV Data is written into such a file by row. By default, columns are separated with commas (,). Strings are enclosed within double quotation marks ("). Two double quotation marks are added as escape characters before the double quotation marks in a string. A value of the numeric type is not enclosed within double quotation marks. You can use the -- format_csv_delimiter parameter to specify other column delimiters. For example, to separate columns with vertical bars (|), run the following command:
    cat data.csv | clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FORMAT CSV"
    CSVWithNames This format is similar to the CSV format. The first row lists the column names. The first row is ignored during parsing.