E-MapReduce: Overview

Last Updated: Mar 03, 2025

This topic describes the data import methods that Doris provides, the data formats that each method supports, and the features that are common to all import methods.

Data import methods

Doris provides various data import methods. You can select a data import method based on the data source that you use.

Supported data formats

The supported data formats vary by data import method.

| Data import method | Supported data format |
| --- | --- |
| Broker Load | Parquet, ORC, CSV, and GZIP |
| Stream Load | CSV, GZIP, and JSON |
| Routine Load | CSV and JSON |
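
The table above can also be read in the other direction, from a data format to the import methods that accept it. The following is a minimal sketch of that lookup. The names SUPPORTED_FORMATS and methods_for are hypothetical helpers introduced for illustration and are not part of Doris.

```python
from __future__ import annotations

# Format support per import method, copied from the table above.
SUPPORTED_FORMATS = {
    "Broker Load": {"Parquet", "ORC", "CSV", "GZIP"},
    "Stream Load": {"CSV", "GZIP", "JSON"},
    "Routine Load": {"CSV", "JSON"},
}


def methods_for(data_format: str) -> list[str]:
    """Return the import methods that list the given format, sorted by name."""
    wanted = data_format.strip().upper()
    return sorted(
        method
        for method, formats in SUPPORTED_FORMATS.items()
        if wanted in {name.upper() for name in formats}
    )


json_methods = methods_for("json")
orc_methods = methods_for("ORC")
```

In this sketch the comparison is case-insensitive, and a format that no method lists yields an empty list.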

Features

This section describes the features that are common to data imports in Doris.

Atomicity

Each import job in Doris is a complete transaction, regardless of whether you use Broker Load to import multiple data records at the same time or use the INSERT statement to import a single data record. An import transaction ensures that the data in a batch takes effect atomically: either the entire batch is imported or none of it is, so no partial data is written.

Each import job is identified by a label that is unique within a database. You can specify the label yourself or use the label that Doris generates for the job.

A label ensures that the data in an import job is successfully imported at most once. After an import job succeeds, its label cannot be reused for another import job. If you reuse the label, the request is denied and the error message Label already used is returned. This gives Doris at-most-once semantics for data import. If the upstream system provides at-least-once semantics, the two combine into exactly-once semantics: the upstream system resends a batch with the same label until the import is acknowledged, and Doris rejects the duplicates.
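
The way labels turn at-least-once delivery into exactly-once import can be illustrated with a small in-memory simulation. The sketch below does not call Doris. FakeDatabase, LabelAlreadyUsed, and send_with_retry are hypothetical names, and the all-or-nothing validation step is a simplified stand-in for an import transaction.

```python
class LabelAlreadyUsed(Exception):
    """Raised when the label of a successful import job is reused."""


class FakeDatabase:
    """In-memory stand-in for one database. This is NOT a Doris client."""

    def __init__(self):
        self.rows = []
        self.used_labels = set()

    def load(self, label, batch):
        # At-most-once: reject a label that already belongs to a successful job.
        if label in self.used_labels:
            raise LabelAlreadyUsed("Label already used: " + label)
        # Atomicity: validate the whole batch before any row becomes visible.
        staged = list(batch)
        if any(row is None for row in staged):
            raise ValueError("invalid row, nothing was imported")
        self.rows.extend(staged)
        # Only a successful job consumes its label.
        self.used_labels.add(label)


def send_with_retry(db, label, batch):
    """Upstream side: a rejected duplicate label means the batch already landed."""
    try:
        db.load(label, batch)
        return "imported"
    except LabelAlreadyUsed:
        return "skipped duplicate"


db = FakeDatabase()
first = send_with_retry(db, "orders_batch_0001", [1, 2, 3])
# Assume the acknowledgement was lost, so the upstream sends the batch again.
second = send_with_retry(db, "orders_batch_0001", [1, 2, 3])
```

The design relies on the upstream system deriving the label deterministically from the batch, so that a resend carries the same label as the original attempt.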

Synchronous and asynchronous modes

You can import data in synchronous or asynchronous mode. In synchronous mode, Doris returns the import result after the import job is complete, and you can determine from the result whether the data was imported. In asynchronous mode, Doris returns Successful as soon as the import job is submitted. This result indicates only that the job was accepted, not that the data has been imported. To check the status of an asynchronous import job, run the corresponding query command, such as SHOW LOAD for Broker Load jobs.
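
For asynchronous imports, a client typically polls the job status until the job finishes. The following is a minimal sketch of such a polling loop. wait_for_job and fetch_state are hypothetical names: fetch_state stands for whatever code you use to query the job state. The state names PENDING, LOADING, FINISHED, and CANCELLED are assumptions of this sketch, so verify them against the output of your Doris version.

```python
import time

# Assumed terminal states. Verify them against your Doris version.
TERMINAL_STATES = {"FINISHED", "CANCELLED"}


def wait_for_job(fetch_state, poll_interval=1.0, timeout=60.0, sleep=time.sleep):
    """Poll fetch_state() until a terminal state is seen or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still in state {state!r} after {timeout} s")
        sleep(poll_interval)


# Simulated status source. A real fetch_state would query Doris instead.
states = iter(["PENDING", "LOADING", "LOADING", "FINISHED"])
result = wait_for_job(lambda: next(states), poll_interval=0.0)
```

A result of FINISHED, not the initial Successful response, is what indicates in this sketch that the data was imported. A CANCELLED result or a timeout should be handled as a failed or unknown import.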