Create and run a task to migrate data from a remote file system - Data Transport

This topic describes how to create and run a task to migrate data from a remote file system to the on-premises network-attached storage (NAS) of a Data Transport device.

Warning

If the device experiences network interruptions or power failures during data migration, some data may not be migrated. Before running a migration job, make sure that your network and power supply are reliable.

Create a migration task

Check whether migration processes exist

Run the ps -ef | grep jar command.
Check whether the output lists processes for master.jar, worker.jar, and tracker.jar. If it does, run the kill -9 <PID> command for each one to stop the master, worker, and tracker processes.
Run the cd /mnt/cube1/software/ossimport command to go to the specified directory.
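The process check above can be narrowed to the three jar names, which keeps the grep process itself out of the listing. The following is a minimal sketch, not part of the Data Transport tooling: list_migration_pids is a hypothetical helper name, and it assumes the jar file name appears in the process command line.

```shell
# Filter `ps -ef` output (read from stdin) down to the ossimport processes.
# Prints one "PID jar-name" pair per matching line. The PID is column 2 of `ps -ef`.
# list_migration_pids is a hypothetical helper, not a Data Transport command.
list_migration_pids() {
    awk 'match($0, /(master|worker|tracker)\.jar/) {
        print $2, substr($0, RSTART, RLENGTH)
    }'
}
```

Run ps -ef | list_migration_pids, then stop each listed PID with the kill -9 <PID> command as described above. The helper only lists processes and does not stop them.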

Configure the job.cfg files

Run the cd /mnt/cube1/software/ossimport/conf command to find the job.cfg files.
Confirm the number of migration tasks that you need to create and copy the corresponding number of job.cfg files. One task corresponds to one job.cfg file.
Note
- Distinguish the job.cfg files. For example, you can specify different file names, such as job1.cfg.
- If you want to migrate data from all directories, create one migration task.
- If you want to migrate data from some directories, create one task for each directory.
Open a configuration file, for example by running the vi job1.cfg command, and configure the parameters described in the following table.

Parameter	Description
jobName	The task name. Example: example_job.
srcType	The type of the data source. In this example, the value is set to local.
srcPrefix	The source path from which the data is migrated to the Data Transport device. Note that a forward slash (/) must be added at the end of the path. Example: /mnt/nas/example_dir/.
destType	The destination to which data is migrated. In this example, data is migrated from a remote file system to the on-premises NAS of a Data Transport device. Set the value to local.
destPrefix	The destination path in which the data is stored in the Data Transport device. The path of storage pool 1 for the Data Transport device is /mnt/cube1/data/. The path of storage pool 2 for the Data Transport device is /mnt/cube2/data/.
auditMode	The verification mode. Set the value to simple.
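Before you submit a task, it can help to confirm that every parameter in the table is set and that srcPrefix ends with a forward slash. The following is a minimal sketch under one assumption that this topic does not state: that job.cfg stores each parameter on its own line as key=value. The cfg_get and check_job_cfg names are hypothetical helpers.

```shell
# ASSUMPTION: job.cfg uses one key=value pair per line, for example:
#   jobName=example_job
#   srcPrefix=/mnt/nas/example_dir/
# cfg_get and check_job_cfg are hypothetical helpers, not Data Transport commands.

# Print the value of key $2 in file $1 (last occurrence wins).
cfg_get() {
    sed -n "s/^$2=//p" "$1" | tail -n 1
}

# Return 0 if all parameters from the table are set and srcPrefix ends with "/".
check_job_cfg() {
    cfg=$1
    status=0
    for key in jobName srcType srcPrefix destType destPrefix auditMode; do
        if [ -z "$(cfg_get "$cfg" "$key")" ]; then
            echo "missing: $key"
            status=1
        fi
    done
    src=$(cfg_get "$cfg" srcPrefix)
    case "$src" in
        ''|*/) ;;                       # empty (already reported) or ends with "/"
        *) echo "srcPrefix must end with /: $src"; status=1 ;;
    esac
    return $status
}
```

For example, run check_job_cfg job1.cfg in the conf directory. The helper prints nothing and returns 0 when the file passes both checks.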

Deploy and submit the task

Important

Perform the following operations in the /mnt/cube1/software/ossimport directory.

1. Deploy the built-in service of Data Transport.

Run the bash console.sh deploy command.

2. Start the service.

Run the bash console.sh start command.

3. Check whether the processes are started.

Run the ps -ef | grep jar command and check whether the output lists processes for master.jar, worker.jar, and tracker.jar. If it does, the processes are started.

4. Submit the migration task.

Run the bash console.sh submit conf/<Name of the configuration file> command. Example: bash console.sh submit conf/job1.cfg.
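The deploy, start, and submit commands above can be chained in a small wrapper. The following is a minimal sketch: run, deploy_and_submit, DRY_RUN, and OSSIMPORT_HOME are hypothetical names introduced for illustration, and the sketch assumes that console.sh returns a nonzero exit code on failure, which this topic does not confirm. The process check in step 3 is left as a manual step.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
# run, DRY_RUN, and OSSIMPORT_HOME are hypothetical names for this sketch.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = name of a configuration file under conf/, for example job1.cfg.
# The body runs in a subshell so that the cd does not affect the calling shell.
deploy_and_submit() (
    cd "${OSSIMPORT_HOME:-/mnt/cube1/software/ossimport}" || exit 1
    # ASSUMPTION: console.sh exits with a nonzero code when a step fails.
    run bash console.sh deploy || exit 1
    run bash console.sh start || exit 1
    run bash console.sh submit "conf/$1"
)
```

Set DRY_RUN=1 and run deploy_and_submit job1.cfg to print the three commands without executing them. Because the wrapper does not wait for the master, worker, and tracker processes, running the steps one at a time remains the safer choice on a first deployment.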

(Optional) Configure incremental migration

If you want to configure incremental migration, you must modify the following parameters in the configuration file:

isIncremental: specifies the incremental migration mode. Set the value to true.
incrementalModeInterval: specifies the interval at which incremental migration is performed. Unit: seconds. The value cannot be less than 900.
repeatCount: specifies the number of incremental migration runs. We recommend a value of 30 or less. Set the value to the number of incremental migrations that you want to perform plus 1. For example, to perform incremental migration twice, set repeatCount to 3.
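The rules for the three parameters can be expressed as a short calculation. The following is a minimal sketch: incremental_settings is a hypothetical helper, and the key=value output format is an assumption about how job.cfg stores parameters.

```shell
# $1 = number of incremental migrations wanted, $2 = interval in seconds.
# incremental_settings is a hypothetical helper. The key=value output format
# is an ASSUMPTION about job.cfg, not something this topic specifies.
incremental_settings() {
    if [ "$2" -lt 900 ]; then
        echo "interval must be at least 900 seconds" >&2
        return 1
    fi
    count=$(( $1 + 1 ))    # repeatCount = wanted incremental migrations + 1
    if [ "$count" -gt 30 ]; then
        echo "warning: repeatCount $count exceeds the recommended maximum of 30" >&2
    fi
    echo "isIncremental=true"
    echo "incrementalModeInterval=$2"
    echo "repeatCount=$count"
}
```

For example, incremental_settings 2 900 prints isIncremental=true, incrementalModeInterval=900, and repeatCount=3, which matches the example in the repeatCount description.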

(Optional) Retry the migration task

If your task fails, you can run the bash console.sh retry [job_name] command to retry the task.

What to do next

Note

The following operations use the /mnt/cube1/data directory.

If the state of the migration task is failed after the task is complete, contact Data Transport technical support for troubleshooting.
If the state of the migration task is succeed, verify that the total number of files and the data volume reported for the migration task match those in the /mnt/cube1/data directory.
View the number and size of files that are successfully migrated in the jobsForSls_v1.log file.
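To get the file count and data volume on the destination side, you can total the regular files under the data directory and compare the result with the values in the jobsForSls_v1.log file. The following is a minimal sketch that uses standard find, ls, and awk. The count_files and total_bytes names are hypothetical helpers, and the totals cover everything under the directory, including files written by other tasks.

```shell
# count_files and total_bytes are hypothetical helpers for this sketch.
# Both assume that file names do not contain newline characters.

# Number of regular files under directory $1.
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Total size in bytes of regular files under directory $1.
# The size is column 5 of `ls -ln` output.
total_bytes() {
    find "$1" -type f -exec ls -ln {} + | awk '{ s += $5 } END { printf "%.0f\n", s }'
}
```

For example, run count_files /mnt/cube1/data and total_bytes /mnt/cube1/data after the task reaches the succeed state.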

View logs

The ossimport/workdir/logs/ directory stores logs that record the upload status of files and the status of tasks.

Task progress logs

The job_status.log file is updated every 30 seconds and records the status of the tasks. The file contains the following fields:
- jobname indicates the task name.
- jobState indicates the task status. The running state indicates that the task is in progress. The succeed state indicates that the task succeeds. The failed state indicates that the task fails and the data of some files is not successfully migrated.
- pending task count indicates the number of tasks waiting to be distributed.
- dispatched task count indicates the number of distributed tasks.
- succeed task count indicates the number of successful tasks.
- failed task count indicates the number of failed tasks.
- is scan Finished indicates whether the data scan is complete. The value true indicates that the scan is complete. The value false indicates that the scan is not complete. Note that the value is always false if you set isIncremental to true.

File uploading logs

The fileStatusForSls_v1.log file records file uploads. A log entry is generated each time a file is uploaded. Fields are separated with commas (,). The relevant fields are in the following columns:
1. Column 4: the file path name.
2. Column 6: the task name.
3. Column 12: the file migration state. Valid values: succeed and failed.
4. Column 13: the cause of failure.
5. Column 16: the object size.
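Because the fields are comma-separated and the columns are fixed, awk can pull failed entries out of the file. The following is a minimal sketch: failed_files and succeeded_bytes are hypothetical helpers, and the sketch assumes that no field value contains a comma and that the state in column 12 has no surrounding whitespace.

```shell
# failed_files and succeeded_bytes are hypothetical helpers for this sketch.
# ASSUMPTION: no field value contains a comma, so a plain split on "," is safe.

# Print "path<TAB>cause" (columns 4 and 13) for every failed entry on stdin.
failed_files() {
    awk -F',' '$12 == "failed" { print $4 "\t" $13 }'
}

# Sum the object size (column 16) of every succeeded entry on stdin.
succeeded_bytes() {
    awk -F',' '$12 == "succeed" { s += $16 } END { printf "%.0f\n", s }'
}
```

For example, from the ossimport directory, run failed_files < workdir/logs/fileStatusForSls_v1.log to list the files that were not migrated together with the recorded cause.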

Task status logs

The jobsForSls_v1.log file records the status of the tasks and is updated every minute until all tasks are complete. The relevant fields are in the following columns:
- Column 5: the task name.
- Column 9: the task status.
- Column 14: the total number of files.
- Column 15: the total size of files.
- Column 16: the number of files whose data is successfully migrated.
- Column 17: the size of files whose data is successfully migrated.
- Column 21: whether the data scan is complete. The value 1 indicates that the scan is complete. The value 0 indicates that the scan is not complete.
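The columns above can be condensed into a one-line summary for each log entry. The following is a minimal sketch: job_summary is a hypothetical helper, and the sketch assumes that jobsForSls_v1.log is comma-separated like fileStatusForSls_v1.log, which this topic does not state explicitly.

```shell
# job_summary is a hypothetical helper for this sketch.
# ASSUMPTION: fields in jobsForSls_v1.log are separated with commas.
# Prints migrated/total file counts and sizes for each entry read from stdin.
job_summary() {
    awk -F',' 'NF >= 21 {
        printf "job=%s state=%s files=%s/%s bytes=%s/%s scan_done=%s\n",
            $5, $9, $16, $14, $17, $15, $21
    }'
}
```

For example, from the ossimport directory, run tail -n 1 workdir/logs/jobsForSls_v1.log | job_summary to summarize the most recent entry. If the migrated and total values differ after the task completes, check the fileStatusForSls_v1.log file for failed entries.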