Object Storage Service: Distributed deployment

Last Updated: Oct 11, 2023

This topic describes how to deploy ossimport in distributed mode. Distributed deployment of ossimport is supported only for Linux.

Prerequisites

  • A cluster of at least two machines is deployed, with one being a master and the others being workers.

  • A connection over SSH is established between the master and the workers.

  • All workers use the same username and password.

    Note

    An SSH connection is established between the master and the workers, or the logon credentials of the workers are configured in sys.properties. A sketch of a key-based SSH setup follows this list.
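
The following commands show one way to set up key-based SSH logon from the master to a worker. This is a minimal sketch: the worker IP address 192.168.1.6 and the username root are placeholders for your own environment.

  ssh-keygen -t rsa                 # Generate a key pair on the master. Press Enter to accept the defaults.
  ssh-copy-id root@192.168.1.6      # Copy the public key to a worker. Repeat this step for every worker.
  ssh root@192.168.1.6 "echo ok"    # Verify that the master can log on to the worker without a password.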

Download and install ossimport

  1. Download ossimport.

    Download ossimport-2.3.6.tar.gz to your local computer.

  2. Install ossimport.

    Note

    All subsequent operations are performed on the master.

    1. Log on to the server and run the following command to create the ossimport directory:

      mkdir -p $HOME/ossimport
    2. Go to the directory of the package and run the following command to decompress the package to the specified directory:

      tar -zxvf ossimport-2.3.6.tar.gz -C $HOME/ossimport

      The structure of the decompressed file is as follows:

      ossimport
      ├── bin
      │   ├── console.jar     # The JAR package for the Console module.
      │   ├── master.jar      # The JAR package for the Master module.
      │   ├── tracker.jar     # The JAR package for the Tracker module.
      │   └── worker.jar      # The JAR package for the Worker module.
      ├── conf
      │   ├── job.cfg         # The Job configuration file template.
      │   ├── sys.properties  # The configuration file that contains system parameters.
      │   └── workers         # The list of workers.
      ├── console.sh          # The command-line tool. Only Linux is supported.
      ├── logs                # The directory that contains logs.
      └── README.md           # The file that introduces and explains ossimport. We recommend that you read this file before you use ossimport.

      • OSS_IMPORT_HOME: the root directory of ossimport. With the preceding decompression command, the default root directory is $HOME/ossimport. You can specify a different root directory by running export OSS_IMPORT_HOME=<dir> or by adding that export statement to the $HOME/.bashrc configuration file, as shown in the example below. We recommend that you use the default root directory.

      • OSS_IMPORT_WORK_DIR: the working directory of ossimport. You can specify a working directory by configuring workingDir in conf/sys.properties. We recommend that you use $HOME/ossimport/workdir as the working directory.

      • Specify absolute paths for OSS_IMPORT_HOME and OSS_IMPORT_WORK_DIR, such as /home/<user>/ossimport or /home/<user>/ossimport/workdir.
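
      For example, the following commands show one way to persist a custom root directory and set the working directory. The paths are placeholders; workingDir is the conf/sys.properties item described above.

        echo 'export OSS_IMPORT_HOME=/home/<user>/ossimport' >> $HOME/.bashrc   # Persist the root directory.
        source $HOME/.bashrc                                                    # Apply the change to the current shell.
        # In conf/sys.properties, set the working directory, for example:
        # workingDir=/home/<user>/ossimport/workdir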

Configurations

The distributed deployment of ossimport has three configuration files: conf/sys.properties, conf/job.cfg, and conf/workers.

  • conf/job.cfg: the configuration file template used to configure jobs in distributed mode. Configure the parameters based on your actual migration job.

  • conf/sys.properties: the configuration file that contains system operating parameters, such as the working directory and worker-related parameters.

  • conf/workers: the worker list. A sample is shown after the following note.

Important
  • Before you start a migration job, check the parameters in sys.properties and job.cfg. After a migration job is submitted, you cannot modify parameter settings in the files.

  • Configure and check workers before you start the service. You cannot add an item to or remove an item from the file after the service is started.
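
For reference, the following is a minimal sketch of the worker-related configuration. The IP addresses and credentials are placeholders, and the parameter names shown for sys.properties are illustrative assumptions; verify them against the comments in the shipped sys.properties file.

  # conf/workers: one machine per line. The master dispatches tasks to these machines.
  192.168.1.6
  192.168.1.7

  # conf/sys.properties: worker logon credentials, needed only if you do not use key-based SSH logon.
  # The parameter names below are assumptions; check the shipped file for the exact names.
  workerUser=root
  workerPassword=******
  sshPort=22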

Running

  • Migration jobs

    If you use ossimport in distributed mode, a migration job typically involves the following steps (a complete command sequence is shown after the notes below):

    • Deploy the service. To do so, run the bash console.sh deploy command in the Linux terminal. This command deploys ossimport to all machines specified in the conf/workers configuration file.

      Note

      Ensure that the configuration files conf/job.cfg and conf/workers are properly configured before you deploy the service.

    • Clear jobs with the same name. If you have run a job with the same name and want to run the job again, clear the job with the same name first. If you have not run the job or you want to retry the tasks of a failed job, do not run the clean command. To clear a job with the same name, run bash console.sh clean job_name in the Linux terminal.

    • Submit the data migration job. You cannot submit jobs with the same name. If you have jobs with the same name, run the clean command to clear the jobs. A configuration file is required to submit a job. You can create a job configuration file based on the conf/job.cfg template file. To submit a job, run bash console.sh submit [job_cfg_file] in the Linux terminal. The job_cfg_file parameter in the command is optional and is set to $OSS_IMPORT_HOME/conf/job.cfg by default, where $OSS_IMPORT_HOME is the directory that contains console.sh.

    • Start the service. To do so, run bash console.sh start in the Linux terminal.

    • View the job status. To do so, run bash console.sh stat in the Linux terminal.

    • Retry failed tasks. Tasks may fail due to network issues or other reasons. When you run the retry command, only failed tasks are retried. To retry failed tasks, run bash console.sh retry [job_name] in the Linux terminal. In the command, the optional job_name parameter specifies the job whose failed tasks you want to retry. If you do not configure this parameter, failed tasks of all jobs are retried.

    • Stop the service. To do so, run bash console.sh stop in the Linux terminal.

    Note:

    • If an error occurs because of incorrect parameters in a bash console.sh command, the correct command format is displayed.

    • We recommend that you specify absolute paths in configuration files and submitted jobs.

    • The job.cfg file contains job configuration items.

      Important

      You cannot modify the configuration items in the file after the job is submitted.
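
    Putting the preceding steps together, a typical run from the Linux terminal on the master might look like the following. The job name my_job is a placeholder for your own job name.

      cd $OSS_IMPORT_HOME
      bash console.sh deploy                  # Deploy ossimport to all machines listed in conf/workers.
      bash console.sh clean my_job            # Only if a job with the same name has been run before.
      bash console.sh submit conf/job.cfg     # Submit the migration job described in conf/job.cfg.
      bash console.sh start                   # Start the service.
      bash console.sh stat                    # Check the job status; repeat as needed.
      bash console.sh retry my_job            # Retry only the failed tasks of the job, if any.
      bash console.sh stop                    # Stop the service after the job is complete.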

  • Common causes of job failures

    • A file in the source directory is modified during the upload. In this case, a SIZE_NOT_MATCH error is recorded in log/audit.log: the original file is uploaded, but the modifications are not uploaded to OSS.

    • The source file is deleted during upload. This causes download failures.

    • The name of the file to upload does not conform to the naming rules of OSS. For example, the upload fails if the file name starts with a forward slash (/) or is empty.

    • The source file fails to be downloaded.

    • The program exits unexpectedly and the job state is Abort. If this happens, contact our technical support team.

  • Job status and logs

    After a job is submitted, the master splits the job into tasks, the workers run the tasks, and the tracker collects the task status. After a job is completed, the structure of the workdir directory is as follows:

    workdir
    ├── bin
    │   ├── console.jar     # The JAR package for the Console module.
    │   ├── master.jar      # The JAR package for the Master module.
    │   ├── tracker.jar     # The JAR package for the Tracker module.
    │   └── worker.jar      # The JAR package for the Worker module.
    ├── conf
    │   ├── job.cfg         # The Job configuration file template.
    │   ├── sys.properties  # The configuration file that contains system parameters.
    │   └── workers         # The list of workers.
    ├── logs
    │   ├── import.log      # Migration logs.
    │   ├── master.log      # Master logs.
    │   ├── tracker.log     # Tracker logs.
    │   └── workers         # Worker logs.
    ├── master
    │   ├── jobqueue                 # Jobs that are not split.
    │   └── jobs                     # Job status.
    │       └── xxtooss              # Job names.
    │           ├── checkpoints      # Checkpoints generated when the master splits jobs into tasks.
    │           │   └── 0
    │           │       └── ED09636A6EA24A292460866AFDD7A89A.cpt
    │           ├── dispatched       # Tasks that are dispatched to workers but not complete.
    │           │   └── 192.168.1.6
    │           ├── failed_tasks     # Failed tasks.
    │           │   └── A41506C07BF1DF2A3EDB4CE31756B93F_1499348973217@192.168.1.6
    │           │       ├── audit.log     # The logs of tasks. You can view the logs to identify error causes.
    │           │       ├── DONE          # The mark file of successful tasks. If the task fails, the content is empty.
    │           │       ├── error.list    # The list of task errors. You can view the errors in the file.
    │           │       ├── STATUS        # The mark file that indicates task status. The content of this file is Failed or Completed, indicating that the task failed or succeeded.
    │           │       └── TASK          # Description of the tasks.
    │           ├── pending_tasks    # Tasks that are not dispatched.
    │           └── succeed_tasks    # Tasks that run successfully.
    │               └── A41506C07BF1DF2A3EDB4CE31756B93F_1499668462358@192.168.1.6
    │                   ├── audit.log    # The logs of tasks. You can view the logs to identify error causes.
    │                   ├── DONE         # The mark file of successful tasks.
    │                   ├── error.list   # The list of task errors. If the tasks are successful, the error list is empty.
    │                   ├── STATUS       # The mark file that indicates task status. The content of this file is Failed or Completed, indicating that the task failed or succeeded.
    │                   └── TASK         # Description of the tasks.
    └── worker  # Stores the status of the tasks being run by the worker. After tasks are run, they are managed by the master.
        └── jobs
            ├── local_test2
            │   └── tasks
            └── local_test_4
                └── tasks

    Important
    • To view information about how a job is running, check logs/import.log.

    • To troubleshoot failed tasks, check master/jobs/${JobName}/failed_tasks/${TaskName}/audit.log. Helper commands are shown after this list.

    • To view errors, check master/jobs/${JobName}/failed_tasks/${TaskName}/error.list.

    • The preceding log formats and paths are for reference only. Do not build your services or applications on top of them.
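
    For example, the following commands show one way to manually inspect a job from the working directory. The job name my_job is a placeholder, and the paths follow the directory structure shown above; they are intended for manual inspection only.

      cd $HOME/ossimport/workdir
      tail -f logs/import.log                              # Follow the overall migration progress.
      ls master/jobs/my_job/failed_tasks/                  # List the failed tasks of the job.
      cat master/jobs/my_job/failed_tasks/*/audit.log      # Inspect the task logs to identify error causes.
      cat master/jobs/my_job/failed_tasks/*/error.list     # View the recorded errors.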

Common errors and troubleshooting

For more information about common errors and troubleshooting, see FAQ.