This module supports scenarios that require large-scale synthetic video data, such as robotics, autonomous driving, and embodied intelligence model training. It provides an integrated factory workflow that covers source material ingestion, batch production, quality evaluation, and dataset export.
Console layout
After entering the Embodied Intelligence Training Data Factory, the left-side navigation contains 6 entries:
| Entry | Feature |
| --- | --- |
| Quick generation | Generate training videos individually. |
| Production projects | Dataset management: source materials, production configuration, execution status, output, and export. |
| Quality evaluation | Scoring dashboard, scoring scheme management, and quick trial evaluation. |
| Storage audit | Cross-dataset storage and file auditing. |
| API | Integrate into automation pipelines through REST API. |
| Settings | Training data factory settings. |
Production projects
Create a project
- Go to the Production Projects page and click New Project.
- Enter the name and description, and optionally fill in: robot model, task type, environment, and source type.
- After creation, you are automatically taken to the project workbench.
Project workbench
| Step | Feature |
| --- | --- |
| Source materials | Upload source videos and reference images, view by category. |
| Production configuration | Batch-configure production tasks using form or JSON advanced mode. |
| Execution status | View the progress of all production tasks in this project. |
| Output | Browse training data entries produced by this project. |
| Export | Export the entire dataset with one click (including metadata and files). |
Source material upload
- Supports video (MP4 / MOV / WebM) and image (JPG / PNG / WebP) formats.
- Maximum 500 MB per video and 100 MB per image.
- A batch progress bar is displayed during upload, with retry on failure.
- Uploaded materials are archived to the current dataset and can be reused in production configuration.
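The format and size limits above can be checked locally before starting an upload. The following is a minimal sketch, not part of the product: the function name `check_source_file` is hypothetical, and it assumes that the limits are counted in binary megabytes and that only the listed extensions are accepted.

```python
from pathlib import Path
from typing import Optional

MB = 1024 * 1024  # assumption: limits are binary megabytes

# Formats and size limits as listed in this section.
VIDEO_EXTS = {".mp4", ".mov", ".webm"}
IMAGE_EXTS = {".jpg", ".png", ".webp"}  # whether ".jpeg" is accepted is not stated
MAX_VIDEO_BYTES = 500 * MB
MAX_IMAGE_BYTES = 100 * MB


def check_source_file(name: str, size_bytes: int) -> Optional[str]:
    """Return None if the file looks uploadable, otherwise a reason string."""
    ext = Path(name).suffix.lower()
    if ext in VIDEO_EXTS:
        limit = MAX_VIDEO_BYTES
    elif ext in IMAGE_EXTS:
        limit = MAX_IMAGE_BYTES
    else:
        return f"unsupported format: {ext or '(none)'}"
    if size_bytes > limit:
        return f"too large: {size_bytes} bytes exceeds {limit} bytes"
    return None
```

Running this check before a batch upload avoids spending bandwidth on files that would be rejected anyway.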
Production configuration
Two modes are provided:
- Form configuration: Select source videos and independently configure prompts and reference images for each video, suitable for intuitive operation.
- JSON advanced: Submit batch tasks at once through JSON, suitable for advanced users familiar with scripting.
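For JSON advanced mode, a script can generate the batch instead of writing it by hand. The sketch below is illustrative only: the actual JSON schema is not documented here, so the field names `tasks`, `source_video`, `prompt`, and `reference_images` are assumptions, as is the helper name `build_batch`. Check the schema shown in the console before using it.

```python
import json


def build_batch(tasks):
    """Build a batch payload from (source_video, prompt, reference_images) tuples.

    All field names below are hypothetical placeholders, not the real schema.
    """
    payload = {"tasks": []}
    for source_video, prompt, reference_images in tasks:
        if not prompt.strip():
            raise ValueError(f"empty prompt for {source_video}")
        payload["tasks"].append({
            "source_video": source_video,
            "prompt": prompt,
            "reference_images": list(reference_images),
        })
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    print(build_batch([
        ("pick_cup.mp4", "Robot arm picks up a red cup", ["cup.png"]),
        ("open_door.mp4", "Robot opens a door in a kitchen", []),
    ]))
```

One task per source video mirrors the form mode, where each video gets its own prompt and reference images.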
Execution status
The execution status page is organized by batch. For each batch, it shows:
- Overall progress, success / failure / in-progress counts.
- Real-time progress and stage labels for each task.
- Failed tasks can be retried with one click.
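The batch-level numbers described above can be reproduced from a list of task records, for example when monitoring a batch from a script. This is a minimal sketch under assumptions: the record shape (`id`, `status`) and the status strings `succeeded`, `failed`, and `in_progress` are hypothetical, not the product's actual values.

```python
from collections import Counter


def summarize_batch(tasks):
    """Summarize a batch: counts per status, overall progress, and retry candidates."""
    counts = Counter(task["status"] for task in tasks)
    total = len(tasks)
    # A task counts as finished once it has either succeeded or failed.
    finished = counts["succeeded"] + counts["failed"]
    return {
        "total": total,
        "succeeded": counts["succeeded"],
        "failed": counts["failed"],
        "in_progress": counts["in_progress"],
        "progress": finished / total if total else 0.0,
        "retry_ids": [task["id"] for task in tasks if task["status"] == "failed"],
    }
```

The `retry_ids` list corresponds to the tasks that the one-click retry would resubmit.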
Output and entry details
The output list supports:
- Filter by quality status: pending evaluation, auto-passed, auto-failed, confirmed, overturned, skipped.
- Filter by source type: collection, slicing, generation, assembly.
- Search by text keywords.
- Click an entry to view details: labels, generation prompts, video preview, metadata (duration, resolution, frame rate, source task), and lineage chain (trace parent entries).
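The filters and the lineage chain can be pictured as simple operations over entry records. The sketch below assumes a hypothetical record shape (`id`, `parent_id`, `quality_status`, `source_type`, `prompt`); the real entry metadata may use different field names, and keyword search may cover more than the prompt.

```python
def filter_entries(entries, quality_status=None, source_type=None, keyword=None):
    """Return the entries that match every filter that is not None."""
    result = []
    for entry in entries:
        if quality_status is not None and entry["quality_status"] != quality_status:
            continue
        if source_type is not None and entry["source_type"] != source_type:
            continue
        # Assumption: keyword search is a case-insensitive match on the prompt.
        if keyword is not None and keyword.lower() not in entry["prompt"].lower():
            continue
        result.append(entry)
    return result


def lineage_chain(entries, entry_id):
    """Follow parent_id links from entry_id up to the root entry."""
    by_id = {entry["id"]: entry for entry in entries}
    chain, seen = [], set()
    current = entry_id
    # The seen set guards against accidental cycles in the parent links.
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        chain.append(current)
        current = by_id[current].get("parent_id")
    return chain
```

For a generated clip whose parent is a slice of a collected video, `lineage_chain` returns the clip first, then the slice, then the original collection entry.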
Dataset export
- Export the entire dataset as a packaged file (including videos and metadata) with one click.
- Supports filtering exports by quality status (for example, export only "confirmed" entries).
- Export tasks are executed asynchronously. You can view the progress and download results in the Export tab.
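Because exports run asynchronously, a script that triggers one has to poll until the task reaches a terminal state. The following is a generic polling sketch, not the product's client: `fetch_status` is a caller-supplied function (hypothetical) that returns a dict with a `state` key, and the state names `succeeded` and `failed` are assumptions.

```python
import time


def wait_for_export(fetch_status, interval_s=5.0, timeout_s=600.0, sleep=time.sleep):
    """Poll fetch_status() until it reports a terminal state or the timeout passes."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status["state"] in ("succeeded", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("export did not finish in time")
        sleep(interval_s)
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.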
Quality evaluation
Evaluation overview
- Cumulative evaluation count, pass rate, and compliance rate for each dimension.
- Visual bar charts for intuitive comparison of dimension performance.
Scoring schemes
- Built-in system scoring schemes evaluate multiple dimensions (such as semantic consistency, motion consistency, visual stability, appearance preservation, trajectory plausibility, and physical plausibility) on a 0-2 scale.
- You can view dimension details, evaluation criteria, and pass thresholds.
- You can switch between multiple schemes. After you set one as the default, all new tasks use that scheme.
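To illustrate how per-dimension scores on the 0-2 scale might combine with pass thresholds, here is a minimal sketch. The aggregation rule is an assumption: this example passes a video only if every dimension meets its threshold, which may differ from how the built-in schemes decide. The dimension keys and the function name `evaluate` are also hypothetical.

```python
def evaluate(scores, thresholds):
    """Compare 0-2 dimension scores against per-dimension pass thresholds."""
    missing = set(thresholds) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    for dim, score in scores.items():
        if not 0 <= score <= 2:
            raise ValueError(f"{dim}: score {score} is outside the 0-2 scale")
    per_dimension = {dim: scores[dim] >= thresholds[dim] for dim in thresholds}
    # Assumed rule: the overall result passes only if every dimension passes.
    return {"per_dimension": per_dimension, "passed": all(per_dimension.values())}
```

Under this rule, a single weak dimension (for example, motion consistency scoring 0) fails the whole video even when the other dimensions score 2.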
Quick trial
- In the quick trial area, enter the URL of the video to evaluate (and an optional source video URL).
- Click Start Evaluation to get scoring results within seconds.
- Evaluation results include scores for each dimension and an overall pass conclusion, which can serve as a reference for scheme comparison.
API
Integrate training data generation capabilities into your automation pipelines through REST API.
The API page provides a list of core interfaces, including file upload, task submission, task status query, and task file cleanup. You can make API calls directly from the page. Quick-start Python examples are also provided for easy development integration.
Storage audit
The storage audit module provides cross-dataset statistics:
- Total OSS file usage, orphaned files, and cleanable objects.
- Supports safe cleanup operations.
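The idea behind an orphaned-file audit can be sketched as a set difference between what is stored and what datasets still reference. This is an illustration under assumptions: it treats a file as orphaned when no dataset entry references its storage key, which may not match the module's exact definition, and `audit_storage` is a hypothetical helper that does not call OSS.

```python
def audit_storage(stored, referenced):
    """Compare stored objects against referenced keys.

    stored: dict mapping object key -> size in bytes
    referenced: iterable of object keys still used by dataset entries
    """
    referenced = set(referenced)
    # Assumption: an object is orphaned if no dataset entry references its key.
    orphans = {key: size for key, size in stored.items() if key not in referenced}
    return {
        "total_bytes": sum(stored.values()),
        "orphan_keys": sorted(orphans),
        "orphan_bytes": sum(orphans.values()),
    }
```

Reporting orphans before deleting anything mirrors a safe cleanup flow: review the list first, then remove only what is confirmed unused.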