Intelligent Media Services: Create videos from images and text

Last Updated: Feb 27, 2026

Configure Timeline parameters for the SubmitMediaProducingJob API operation to produce videos from images, text overlays, and text-to-speech (TTS) audio. For details about all available Timeline parameters, see Timeline configurations.

Usage notes

  • Intelligent Media Services (IMS) intelligent production supports editing, compositing, effect rendering, and templates for live streams, VOD files, and material files from Object Storage Service (OSS). For more information, see Intelligent production overview.

  • To produce a video from any combination of video, audio, image, and subtitle materials, configure Timeline parameters and call the SubmitMediaProducingJob operation.

  • A timeline defines the structure of a video. It consists of tracks, materials, and effects. For more information, see Timeline configurations.

  • For information about how to use the IMS SDK for audio and video editing, see Preparations.
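
As a minimal sketch of the track-and-clip structure described above, the following Python snippet assembles a one-clip timeline as a plain dictionary and serializes it to JSON text. It assumes the Timeline value is supplied to SubmitMediaProducingJob as a JSON string; the helper name timeline_to_json and the placeholder URL are illustrative and do not come from the IMS SDK.

```python
import json

# Minimal timeline: one video track that holds a single image clip.
# The URL is a placeholder for illustration, not a real material.
timeline = {
    "VideoTracks": [{
        "VideoTrackClips": [{
            "Type": "Image",
            "MediaUrl": "https://example.com/placeholder.jpg",
            "Duration": 3,
        }]
    }]
}


def timeline_to_json(timeline: dict) -> str:
    """Serialize a timeline dict to compact JSON text, keeping non-ASCII characters."""
    return json.dumps(timeline, ensure_ascii=False, separators=(",", ":"))


timeline_json = timeline_to_json(timeline)
```

How the resulting string is attached to the request depends on the SDK and language you use.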

Voice-over aligned slideshow

This example produces a narrated slideshow in which each image is displayed for the duration of its corresponding TTS audio clip. It is suitable for webpage-to-video or document-to-video workflows.

The timeline links image clips to TTS audio clips through ReferenceClipId. Each image clip sets ReferenceClipId to the ClipId of a TTS audio clip. IMS automatically adjusts the image duration to match the TTS audio duration.
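
Because the link depends on matching identifiers, a mistyped ReferenceClipId leaves an image clip without a TTS clip to follow. The sketch below is a hypothetical local check, not part of the IMS API, that lists every ReferenceClipId with no matching ClipId on any audio clip. It assumes the timeline is held as a Python dictionary with the same key names as the Timeline JSON.

```python
def unresolved_references(timeline: dict) -> list:
    """Return ReferenceClipId values that match no ClipId on any audio clip."""
    # Collect every ClipId declared on the audio tracks.
    clip_ids = {
        clip["ClipId"]
        for track in timeline.get("AudioTracks", [])
        for clip in track.get("AudioTrackClips", [])
        if "ClipId" in clip
    }
    # Keep only the references that point at an undeclared ClipId.
    return [
        clip["ReferenceClipId"]
        for track in timeline.get("VideoTracks", [])
        for clip in track.get("VideoTrackClips", [])
        if "ReferenceClipId" in clip and clip["ReferenceClipId"] not in clip_ids
    ]


example = {
    "VideoTracks": [{"VideoTrackClips": [
        {"Type": "Image", "ReferenceClipId": "speech1"},
        {"Type": "Image", "ReferenceClipId": "speech9"},  # deliberate mismatch
    ]}],
    "AudioTracks": [{"AudioTrackClips": [
        {"ClipId": "speech1", "Type": "AI_TTS"},
    ]}],
}

print(unresolved_references(example))  # ['speech9']
```

An empty list means every image clip references a declared audio clip.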

Output video: 1.mp4

Timeline structure

| Track type | Count | Purpose |
| --- | --- | --- |
| VideoTracks | 1 track, 4 image clips | Displays images with text overlays. Each clip references a TTS audio clip through ReferenceClipId. |
| AudioTracks | 2 tracks | Track 1 (MainTrack: true): 4 AI_TTS clips that generate voice-over narration. Track 2: background music with LoopMode: true. |
| SubtitleTracks | 1 track, 1 clip | Displays a persistent title at the top of the video. |

Key parameters

| Parameter | Value | Description |
| --- | --- | --- |
| ReferenceClipId | "speech1", "speech2", etc. | Set on each image clip. Links the image to a TTS audio clip so their durations match. |
| ClipId | "speech1", "speech2", etc. | Set on each TTS audio clip. Serves as the reference target for image clips. |
| Type | "AI_TTS" | Specifies that this audio clip is generated by text-to-speech. |
| Voice | "zhichu" | TTS voice model. |
| SpeechRate | -200 | TTS speaking speed. Negative values produce slower speech. |
| MainTrack | true | Marks the TTS audio track as the primary track for timeline duration. |
| LoopMode | true | Loops background music to fill the entire video duration. |
| AdaptMode | "AutoWrap" | Automatically wraps text within the specified TextWidth. |
| EffectColorStyle | "CS0004-000005" | Preset color style applied to subtitle text. |

Timeline JSON

{
  "VideoTracks": [{
    "VideoTrackClips": [{
      "ReferenceClipId": "speech1",
      "MediaURL": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/a1.png",
      "Type": "Image",
      "Effects": [{
        "Type": "Text",
        "Content": "Confucius said: \"Is it not a pleasure to learn something and constantly put it into practice? Is it not a delight to have friends from afar? Is it not a gentleman to remain unsoured even though one's merits are unrecognized by others?\"",
        "Alignment": "CenterCenter",
        "FontSize": 55,
        "AdaptMode": "AutoWrap",
        "TextWidth": 0.7,
        "Font": "FZKai-Z03S",
        "FontColor": "#ffffe0",
        "Outline": 2,
        "OutlineColour": "#000000",
        "FontFace": {
          "Bold": true,
          "Italic": false,
          "Underline": false
        }
      }]
    },{
      "ReferenceClipId": "speech2",
      "MediaURL": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/a2.png",
      "Type": "Image",
      "Effects": [{
        "Type": "Text",
        "Content": "Confucius said: \"Isn't it a pleasure to review and practice what you have learned on time? Isn't it happy to have friends from far away? Isn't it a cultured gentleman's merit to remain unsoured even though one is unrecognized by others?\"",
        "Alignment": "CenterCenter",
        "FontSize": 55,
        "AdaptMode": "AutoWrap",
        "TextWidth": 0.7,
        "Font": "FZKai-Z03S",
        "FontColor": "#ffffe0",
        "Outline": 2,
        "OutlineColour": "#000000",
        "FontFace": {
          "Bold": true,
          "Italic": false,
          "Underline": false
        }
      }]
    },{
      "ReferenceClipId": "speech3",
      "MediaURL": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/a3.png",
      "Type": "Image",
      "Effects": [{
        "Type": "Text",
        "Content": "Confucius said: \"If a man keeps cherishing his old knowledge, so as continually to be acquiring new, he may be a teacher of others.\"",
        "Alignment": "CenterCenter",
        "FontSize": 55,
        "AdaptMode": "AutoWrap",
        "TextWidth": 0.7,
        "Font": "FZKai-Z03S",
        "FontColor": "#ffffe0",
        "Outline": 2,
        "OutlineColour": "#000000",
        "FontFace": {
          "Bold": true,
          "Italic": false,
          "Underline": false
        }
      }]
    },{
      "ReferenceClipId": "speech4",
      "MediaURL": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/a4.png",
      "Type": "Image",
      "Effects": [{
        "Type": "Text",
        "Content": "Confucius said: \"If you gain something new when you review your old knowledge, you can become a teacher.\"",
        "Alignment": "CenterCenter",
        "FontSize": 55,
        "AdaptMode": "AutoWrap",
        "TextWidth": 0.7,
        "Font": "FZKai-Z03S",
        "FontColor": "#ffffe0",
        "Outline": 2,
        "OutlineColour": "#000000",
        "FontFace": {
          "Bold": true,
          "Italic": false,
          "Underline": false
        }
      }]
    }]
  }],
  "AudioTracks": [{
    "MainTrack": true,
    "AudioTrackClips": [{
      "ClipId": "speech1",
      "Type": "AI_TTS",
      "Voice": "zhichu",
      "Content": "The Analects, Chapter 12. Confucius said: \"Is it not a pleasure to learn something and constantly put it into practice? Is it not a delight to have friends from afar? Is it not a gentleman to remain unsoured even though one's merits are unrecognized by others? \"",
      "SpeechRate": -200
    },{
      "ClipId": "speech2",
      "Type": "AI_TTS",
      "Voice": "zhichu",
      "Content": "Translation. Confucius said: \"Isn't it a pleasure to review and practice what you have learned on time? Isn't it happy to have friends from far away? Isn't it a cultured gentleman's merit to remain unsoured even though one is unrecognized by others?\"",
      "SpeechRate": -200
    },{
      "ClipId": "speech3",
      "Type": "AI_TTS",
      "Voice": "zhichu",
      "Content": "Confucius said: \"If a man keeps cherishing his old knowledge, so as continually to be acquiring new, he may be a teacher of others.\"",
      "SpeechRate": -200
    },{
      "ClipId": "speech4",
      "Type": "AI_TTS",
      "Voice": "zhichu",
      "Content": "Translation. Confucius said: \"If you gain something new when you review your old knowledge, you can become a teacher.\"",
      "SpeechRate": -200
    }]
  },{
    "AudioTrackClips": [{
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/music/m1.wav",
      "LoopMode": true
    }]
  }],
  "SubtitleTracks":[
    {
      "SubtitleTrackClips": [
        {
          "Type": "Text",
          "Y": 150,
          "Content": "The Analects, Chapter 12",
          "Font": "HappyZcool-2016",
          "Alignment": "TopCenter",
          "EffectColorStyle": "CS0004-000005",
          "AdaptMode": "AutoWrap",
          "FontSize": 70,
          "TextWidth": 900,
          "FontFace": {
            "Bold": false,
            "Italic": false,
            "Underline": false
          }
        }
      ]
    }
  ]
}
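
The four image and TTS clip pairs above follow one pattern, so the linked tracks can be generated from a list of slides instead of being written by hand. The following is a minimal sketch under that assumption. The function name build_narrated_timeline and its arguments are hypothetical, and it covers only the image-to-TTS linkage: text overlays, background music, and the subtitle track are omitted.

```python
def build_narrated_timeline(slides, voice="zhichu", speech_rate=-200):
    """Build a timeline in which image i follows the duration of TTS clip i.

    slides: list of (image_url, narration_text) pairs.
    """
    image_clips, tts_clips = [], []
    for index, (image_url, narration) in enumerate(slides, start=1):
        clip_id = f"speech{index}"  # shared identifier that links the pair
        image_clips.append({
            "Type": "Image",
            "MediaURL": image_url,
            "ReferenceClipId": clip_id,
        })
        tts_clips.append({
            "ClipId": clip_id,
            "Type": "AI_TTS",
            "Voice": voice,
            "Content": narration,
            "SpeechRate": speech_rate,
        })
    return {
        "VideoTracks": [{"VideoTrackClips": image_clips}],
        # MainTrack marks the narration track as the one that sets the duration.
        "AudioTracks": [{"MainTrack": True, "AudioTrackClips": tts_clips}],
    }


# Placeholder URLs and narration text, for illustration only.
narrated = build_narrated_timeline([
    ("https://example.com/a1.png", "First narration."),
    ("https://example.com/a2.png", "Second narration."),
])
```

Generating both clips in the same loop iteration keeps each ReferenceClipId and ClipId pair identical by construction.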

Music album from images

This example produces a slideshow video with background music, transitions between images, and blurred background fills. Each image is displayed for a fixed duration. No TTS or text overlays are used.

Output video: 2.mp4

Timeline structure

| Track type | Count | Purpose |
| --- | --- | --- |
| VideoTracks | 1 track, 7 image clips | Displays images sequentially. Each clip has a blurred background fill and a random transition. |
| AudioTracks | 1 track | Background music with LoopMode: true to cover the full video duration. |

Key parameters

| Parameter | Value | Description |
| --- | --- | --- |
| Duration | 3 | Each image displays for 3 seconds. |
| Type (effect) | "Background" | Adds a blurred background behind images that do not fill the video frame. |
| SubType (background) | "Blur" | Applies a blur effect for the background fill. |
| Radius | 0.1 | Blur intensity for the background effect. |
| Type (effect) | "Transition" | Adds a transition effect between consecutive clips. |
| SubType (transition) | "random" | Applies a randomly selected transition style. |
| Duration (transition) | 0.5 | Transition duration in seconds. |
| LoopMode | true | Loops background music to fill the entire video duration. |

Timeline JSON

{
  "VideoTracks": [{
    "VideoTrackClips": [{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/01.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/02.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/03.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/04.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/05.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/06.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    },{
      "Type": "Image",
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/image/07.jpg",
      "Duration": 3,
      "Effects": [{
        "Type": "Background",
        "SubType": "Blur",
        "Radius": 0.1
      }, {
        "Type": "Transition",
        "SubType": "random",
        "Duration": 0.5
      }]
    }]
  }],
  "AudioTracks": [{
    "AudioTrackClips": [{
      "MediaUrl": "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/music/m1.wav",
      "LoopMode": true
    }]
  }]
}
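
The seven clips above differ only in their image URL, so the same timeline can be generated from a list of URLs. The sketch below assumes the timeline is built as a Python dictionary. The function name build_album_timeline and its keyword arguments are hypothetical; the keys and default values are copied from the Timeline JSON above.

```python
def build_album_timeline(image_urls, music_url, seconds_per_image=3,
                         blur_radius=0.1, transition_seconds=0.5):
    """Build a slideshow timeline with blurred fills, random transitions, and looped music."""
    clips = []
    for url in image_urls:
        clips.append({
            "Type": "Image",
            "MediaUrl": url,
            "Duration": seconds_per_image,
            "Effects": [
                {"Type": "Background", "SubType": "Blur", "Radius": blur_radius},
                {"Type": "Transition", "SubType": "random",
                 "Duration": transition_seconds},
            ],
        })
    return {
        "VideoTracks": [{"VideoTrackClips": clips}],
        "AudioTracks": [{"AudioTrackClips": [
            {"MediaUrl": music_url, "LoopMode": True},
        ]}],
    }


# Rebuild the sample timeline from the material URLs used above.
base = "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media"
album = build_album_timeline(
    [f"{base}/image/{n:02d}.jpg" for n in range(1, 8)],
    f"{base}/music/m1.wav",
)
```

Changing seconds_per_image or transition_seconds in one place then applies to every clip.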
