Embodied Intelligence - AnalyticDB - Alibaba Cloud Documentation Center

For scenarios that require large-scale synthetic video data, such as robotics, autonomous driving, and embodied intelligence model training, this module provides an end-to-end data factory that covers source material ingestion, batch production, quality evaluation, and dataset export.

Console layout

After entering the Embodied Intelligence Training Data Factory, the left-side navigation contains 6 entries:

Entry	Feature
Quick generation	Generate training videos one at a time.
Production projects	Dataset management: source materials, production configuration, execution status, output, and export.
Quality evaluation	Scoring dashboard, scoring scheme management, and quick trial evaluation.
Storage audit	Cross-dataset storage and file auditing.
API	Integrate into automation pipelines through the REST API.
Settings	Training data factory settings.

Production projects

Create a project

Go to the Production Projects page and click New Project.
Enter the name and description, and optionally fill in: robot model, task type, environment, and source type.
After creation, you are automatically taken to the project workbench.

Project workbench

Step	Feature
Source materials	Upload source videos and reference images, view by category.
Production configuration	Batch-configure production tasks using form or JSON advanced mode.
Execution status	View the progress of all production tasks in this project.
Output	Browse training data entries produced by this project.
Export	Export the entire dataset with one click (including metadata and files).

Source material upload

Supported formats are video (MP4 / MOV / WebM) and image (JPG / PNG / WebP).
The maximum size is 500 MB per video and 100 MB per image.
A batch progress bar is displayed during upload, and failed uploads can be retried.
Uploaded materials are archived to the current dataset and can be reused in production configuration.
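
The format and size limits above can be checked locally before an upload is attempted. The following is a minimal sketch, not part of the product: the helper name `check_upload` is hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and only the file extensions named above are accepted.

```python
from pathlib import Path

# Assumption: 1 MB = 1024 * 1024 bytes. The documentation does not say which unit is used.
MB = 1024 * 1024

# Formats and size limits taken from the list above.
LIMITS = {
    "video": ({".mp4", ".mov", ".webm"}, 500 * MB),
    "image": ({".jpg", ".png", ".webp"}, 100 * MB),
}

def check_upload(filename: str, size_bytes: int) -> str:
    """Return "video" or "image" if the file is acceptable, else raise ValueError."""
    ext = Path(filename).suffix.lower()
    for kind, (extensions, max_bytes) in LIMITS.items():
        if ext in extensions:
            if size_bytes > max_bytes:
                raise ValueError(f"{filename}: {kind} exceeds {max_bytes // MB} MB")
            return kind
    raise ValueError(f"{filename}: unsupported format {ext!r}")
```

Running this check on a local folder before uploading avoids waiting on a large transfer that would be rejected anyway.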

Production configuration

Two modes are provided:

Form configuration: Select source videos, then configure the prompts and reference images for each video independently. This mode suits interactive, step-by-step setup.
JSON advanced: Submit a whole batch of tasks in a single JSON document. This mode suits advanced users who are comfortable with scripting.
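
For the JSON advanced mode, a batch document can be generated by a script rather than written by hand. The sketch below only illustrates the idea: the field names `tasks`, `source_video`, `prompt`, and `reference_images` are hypothetical, because the actual schema is not described here. Check the JSON editor in the console for the real structure.

```python
import json

def build_batch(items):
    """Build a JSON document with one task per (video, prompt, reference images) item.

    All field names are hypothetical placeholders, not the documented schema.
    """
    tasks = []
    for video, prompt, reference_images in items:
        if not prompt.strip():
            raise ValueError(f"empty prompt for {video}")
        tasks.append({
            "source_video": video,
            "prompt": prompt,
            "reference_images": list(reference_images),
        })
    return json.dumps({"tasks": tasks}, indent=2)

batch_json = build_batch([
    ("pick_cup.mp4", "Robot arm picks up a red cup", ["cup_ref.png"]),
    ("open_door.mp4", "Robot opens a door in low light", []),
])
```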

Execution status

The execution status page groups tasks by batch and shows the following for each batch:

Overall progress, success / failure / in-progress counts.
Real-time progress and stage labels for each task.
Failed tasks can be retried with one click.
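
The batch counters described above amount to a simple aggregation over task states. The following sketch assumes each task is a dictionary with hypothetical `id` and `status` fields, and that `status` takes the hypothetical values `success`, `failed`, and `in_progress`.

```python
from collections import Counter

def summarize_batch(tasks):
    """Aggregate per-task states into batch counts, overall progress, and a retry list."""
    counts = Counter(task["status"] for task in tasks)
    finished = counts["success"] + counts["failed"]
    return {
        "success": counts["success"],
        "failed": counts["failed"],
        "in_progress": counts["in_progress"],
        # Fraction of tasks that have reached a final state.
        "progress": finished / len(tasks) if tasks else 0.0,
        # Candidates for the one-click retry.
        "retry_ids": [task["id"] for task in tasks if task["status"] == "failed"],
    }
```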

Output and entry details

The output list supports:

Filter by quality status: pending evaluation, auto-passed, auto-failed, confirmed, overturned, skipped.
Filter by source type: collection, slicing, generation, assembly.
Search by text keywords.
Click an entry to view details: labels, generation prompts, video preview, metadata (duration, resolution, frame rate, source task), and lineage chain (trace parent entries).
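
The filters and the lineage chain can be pictured as operations on a list of entry records. In the sketch below, the status and source type values mirror the lists above, but their spelling as identifiers and the record fields `quality_status`, `source_type`, `prompt`, and `parent_id` are assumptions made for illustration.

```python
QUALITY_STATUSES = {"pending_evaluation", "auto_passed", "auto_failed",
                    "confirmed", "overturned", "skipped"}
SOURCE_TYPES = {"collection", "slicing", "generation", "assembly"}

def filter_entries(entries, quality=None, source=None, keyword=None):
    """Return entries that match every filter that is set."""
    if quality is not None and quality not in QUALITY_STATUSES:
        raise ValueError(f"unknown quality status: {quality}")
    if source is not None and source not in SOURCE_TYPES:
        raise ValueError(f"unknown source type: {source}")
    matched = []
    for entry in entries:
        if quality is not None and entry["quality_status"] != quality:
            continue
        if source is not None and entry["source_type"] != source:
            continue
        if keyword and keyword.lower() not in entry["prompt"].lower():
            continue
        matched.append(entry)
    return matched

def lineage(entries_by_id, entry_id):
    """Walk parent links from an entry up to its root, guarding against cycles."""
    chain, seen = [], set()
    while entry_id is not None and entry_id not in seen:
        seen.add(entry_id)
        chain.append(entry_id)
        entry_id = entries_by_id[entry_id].get("parent_id")
    return chain
```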

Dataset export

Export the entire dataset as a packaged file (including videos and metadata) with one click.
Exports can be filtered by quality status (for example, export only "confirmed" entries).
Export tasks are executed asynchronously. You can view the progress and download results in the Export tab.
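
Because export runs asynchronously, a script that drives it has to poll until the task finishes. The following is a generic polling sketch under stated assumptions: `fetch_status` is a caller-supplied function, and the `state` values and the `download_url` field are hypothetical, not documented response fields.

```python
import time

def wait_for_export(fetch_status, poll_seconds=5.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll an export task until it finishes and return its download URL.

    fetch_status is any callable that returns a dict with a "state" key.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = fetch_status()
        if status["state"] == "succeeded":
            return status["download_url"]
        if status["state"] == "failed":
            raise RuntimeError(f"export failed: {status.get('error', 'unknown error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError("export did not finish before the timeout")
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.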

Quality evaluation

Evaluation overview

Cumulative evaluation count, pass rate, and compliance rate for each dimension.
Visual bar charts for intuitive comparison of dimension performance.

Scoring schemes

Built-in system scoring schemes evaluate multiple dimensions (such as semantic consistency, motion consistency, visual stability, appearance preservation, trajectory plausibility, and physical plausibility) on a 0-2 scale.
You can view dimension details, evaluation criteria, and pass thresholds.
You can switch between multiple schemes. After you set one as the default, all new tasks use that scheme.
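
To make the relationship between dimension scores and pass thresholds concrete, here is a minimal sketch. It assumes that an entry passes only when every dimension meets its threshold; the actual aggregation rule of the built-in schemes is not stated here, and the threshold values in the test data are invented for illustration.

```python
def evaluate(scores, thresholds):
    """Compare 0-2 dimension scores against per-dimension pass thresholds.

    Assumption: the overall result passes only if every dimension passes.
    """
    for name, value in scores.items():
        if not 0 <= value <= 2:
            raise ValueError(f"{name}: score {value} is outside the 0-2 scale")
    per_dimension = {
        name: scores[name] >= minimum for name, minimum in thresholds.items()
    }
    return {"per_dimension": per_dimension, "passed": all(per_dimension.values())}
```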

Quick trial

In the quick trial area, enter the URL of the video to evaluate (and an optional source video URL).
Click Start Evaluation to get scoring results within seconds.
Evaluation results include scores for each dimension and an overall pass conclusion, which can serve as a reference for scheme comparison.

API

Integrate training data generation capabilities into your automation pipelines through the REST API.

The API page lists the core interfaces, including file upload, task submission, task status query, and task file cleanup. You can call the APIs directly from the page. Quick-start Python examples are also provided to help you integrate the APIs into your own code.
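
The submit-then-query flow can be sketched with the Python standard library. Everything specific in this example is a placeholder: the base URL, the `/tasks` paths, the bearer-token header, and the `task_id` field are hypothetical and do not describe the real interface. Use the interface list and the quick-start examples on the API page for the actual endpoints and authentication. File upload and file cleanup are omitted for brevity.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid/api"  # placeholder, not a real endpoint

def build_request(method, path, token, payload=None):
    """Build an authenticated JSON request (hypothetical auth scheme and paths)."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request

def send_request(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))

def submit_and_check(token, task, send=send_request):
    """Submit one task, then query its status once."""
    task_id = send(build_request("POST", "/tasks", token, task))["task_id"]
    status = send(build_request("GET", f"/tasks/{task_id}", token))
    return task_id, status
```

The `send` parameter lets the flow run against a stub during development, so no network call is made until the real endpoints are filled in.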

Storage audit

The storage audit module provides cross-dataset statistics:

Total OSS file usage, orphaned files, and cleanable objects.
Safe cleanup operations are supported.
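
One way to think about orphaned files is as stored objects that no dataset entry references any longer. The sketch below applies that definition, which is an assumption rather than the documented one, and only reports what could be cleaned without deleting anything.

```python
def audit_storage(stored, referenced):
    """Report total usage and objects that no dataset entry references.

    stored maps object keys to sizes in bytes; referenced is a set of keys
    that are still in use. Nothing is deleted: this is a dry-run report.
    """
    orphaned = {key: size for key, size in stored.items() if key not in referenced}
    return {
        "total_bytes": sum(stored.values()),
        "orphaned_keys": sorted(orphaned),
        "reclaimable_bytes": sum(orphaned.values()),
    }
```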