Alibaba Cloud Model Studio: Speech Synthesis Markup Language

Last Updated: Dec 12, 2025

Speech Synthesis Markup Language (SSML) is an XML-based markup language for speech synthesis. It allows large speech synthesis models to process richer text content and provides fine-grained control over speech features, such as speech rate, pitch, pause, and volume. You can also add background music to create more expressive speech effects. This topic describes the SSML features of CosyVoice and how to use them.

Important

To use a model in the China (Beijing) region, go to the API key page for the China (Beijing) region.

Availability

  • This feature supports only cloned voices from cosyvoice-v3-flash, cosyvoice-v3-plus, and cosyvoice-v2, and system voices that are marked as supported in Voice list.

  • This feature is available only for some APIs:

  • SSML does not support streaming text input but supports both streaming and non-streaming speech output.

Note

The speech synthesis service implements SSML with reference to the W3C SSML 1.0 specification. It does not support every standard tag. Instead, it supports a practical subset of tags chosen for common usage scenarios.

Usage

You can pass the text that contains SSML tags as the value of the text parameter to the speech synthesis service.

The specific usage depends on the call method:

The following code provides an example for the Java SDK:

String text = "<speak>Please close your eyes and take a rest.<break time=\"500ms\"/>Okay, please open your eyes.</speak>";
synthesizer.call(text);
Note
  • All text content that uses SSML features must be enclosed within <speak></speak> tags.

  • You can use multiple <speak> tags consecutively, such as <speak></speak><speak></speak>. You cannot nest them, such as <speak><speak></speak></speak>.

  • If the text within the tags contains XML special characters, you must escape them. The following list shows common special characters and their escaped forms.

    • " (double quotation mark) → &quot;

    • ' (single quotation mark/apostrophe) → &apos;

    • & (ampersand) → &amp;

    • < (less than sign) → &lt;

    • > (greater than sign) → &gt;
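The three rules above can be sketched as a small helper. This is a minimal illustration, not part of the SDK: the class name SsmlText and its methods are hypothetical, and the helper assumes that the input is plain text that contains no SSML tags of its own.

```java
// Hypothetical helper (not an SDK class): escapes plain text and wraps it
// in a single, non-nested <speak> root element.
final class SsmlText {
    private SsmlText() {}

    /** Replaces the five XML special characters with their escaped forms. */
    static String escape(String raw) {
        StringBuilder out = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    /** Escapes plain text, then encloses it in <speak></speak>. */
    static String speak(String plainText) {
        return "<speak>" + escape(plainText) + "</speak>";
    }
}
```

For example, SsmlText.speak("1 < 2") returns <speak>1 &lt; 2</speak>. Escape only the text content: if you run escape over a string that already contains SSML tags, the tags themselves are escaped and no longer take effect.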

Tags

<speak>: Root node

  • Description

    The <speak> tag is the root node for all SSML documents. All text that uses SSML features must be enclosed within <speak></speak> tags.

  • Syntax

     <speak>Text that requires SSML features</speak>
  • Properties

    Property

    Type

    Required

    Description

    voice

    String

    No

    Specifies the voice.

    This property has a higher priority than the voice parameter in the API request.

    • Valid values: For more information about specific voices, see cosyvoice-v2 voices.

    • Example:

      <speak voice="longcheng_v2">
        I am a male voice.
      </speak>

    rate

    String

    No

    Specifies the speech rate. This property has a higher priority than the speech_rate parameter in the API request.

    • Valid values: a decimal number from 0.5 to 2.

    • Default value: 1

      • A value greater than 1 indicates a faster speech rate.

      • A value less than 1 indicates a slower speech rate.

    • Example:

      <speak rate="2">
        My speech rate is faster than normal.
      </speak>

    pitch

    String

    No

    Specifies the pitch. This property has a higher priority than the pitch_rate parameter in the API request.

    • Valid values: a decimal number from 0.5 to 2.

    • Default value: 1

      • A value greater than 1 indicates a higher pitch.

      • A value less than 1 indicates a lower pitch.

    • Example:

      <speak pitch="0.5">
        However, my pitch is lower than others.
      </speak>

    volume

    String

    No

    Specifies the volume. This property has a higher priority than the volume parameter in the API request.

    • Valid values: an integer from 0 to 100.

    • Default value: 50

      • A value greater than 50 indicates a higher volume.

      • A value less than 50 indicates a lower volume.

    • Example:

      <speak volume="80">
        My volume is also very high.
      </speak>

    effect

    String

    No

    Specifies the sound effect.

    • Valid values:

      • robot: robot sound effect

      • lolita: lively female voice effect

      • lowpass: low-pass sound effect

      • echo: echo sound effect

      • eq: equalizer (advanced)

      • lpfilter: low-pass filter (advanced)

      • hpfilter: high-pass filter (advanced)

      Note
      • The eq, lpfilter, and hpfilter effects are advanced sound effect types. Use the effectValue property to customize them.

      • Each SSML tag supports only one sound effect. Multiple effect attributes cannot coexist.

      • Using sound effects increases system latency.

    • Example:

      <speak effect="robot">
        Do you like the robot WALL-E?
      </speak>

    effectValue

    String

    No

    Specifies the settings for the sound effect selected by the effect property.

    • Valid values:

      • eq (equalizer): The system supports eight frequency bands by default:

        ["40 Hz", "100 Hz", "200 Hz", "400 Hz", "800 Hz", "1600 Hz", "4000 Hz", "12000 Hz"].

        Each frequency band has a bandwidth (Q) of 1.0.

        When you use this effect, you must use the effectValue parameter to specify the gain value for each frequency band. This parameter is a string of eight integers separated by spaces. The value of each integer ranges from -20 to 20. A value of 0 indicates that the gain of the corresponding frequency is not adjusted.

        For example: effectValue="1 1 1 1 1 1 1 1"

      • lpfilter (low-pass filter): Enter the frequency value of the low-pass filter. The value is an integer in the range of (0, target sample rate/2]. For example, effectValue="800".

      • hpfilter (high-pass filter): Enter the frequency value of the high-pass filter. The value is an integer in the range of (0, target sample rate/2]. For example, effectValue="1200".

    • Example:

      <speak effect="eq" effectValue="1 -20 1 1 1 1 20 1">
        Do you like the robot WALL-E?
      </speak>
      
      <speak effect="lpfilter" effectValue="1200">
        Do you like the robot WALL-E?
      </speak>
      
      <speak effect="hpfilter" effectValue="1200">
        Do you like the robot WALL-E?
      </speak>

    bgm

    String

    No

    Adds the specified background music to the synthesized speech. The background music file must be stored in Alibaba Cloud OSS (see Upload files), and its bucket must have at least public-read permissions.

    If the background music URL contains XML special characters, such as &, <, and >, you must escape them.

    • Audio requirements:

      There is no upper limit on the audio file size, but larger files may increase download time. If the duration of the synthesized content exceeds the duration of the background music, the background music is automatically looped to match the length of the synthesized audio.

      • Sample rate: 16 kHz

      • Number of sound channels: mono

      • File format: WAV

        If the original audio is not in WAV format, use the ffmpeg tool to convert it:

        ffmpeg -i input_audio -acodec pcm_s16le -ac 1 -ar 16000 output.wav
      • Bit depth: 16-bit

    • Example:

      <speak bgm="http://nls.alicdn.com/bgm/2.wav" backgroundMusicVolume="30" rate="0.5" volume="40">
        <break time="2s"/>
        The old trees on the shady cliff are shrouded in mist
        <break time="700ms"/>
        The sound of rain is still in the bamboo forest
        <break time="700ms"/>
        I know that cotton contributes to the country's plan
        <break time="700ms"/>
        The scenery of Mianzhou is always pitiable
        <break time="2s"/>
      </speak>
    Important

    You are legally responsible for the copyright of the uploaded audio.

    backgroundMusicVolume

    String

    No

    Sets the volume of the background music that is specified by the bgm property.

  • Tag relationships

    The <speak> tag can contain text and the following tags:

  • More examples

    • No attributes

      <speak>
        Text that requires SSML tags
      </speak>
    • Attribute combination (separated by spaces)

      <speak rate="1.5" pitch="0.8" volume="80">
        So when put together, my voice sounds like this.
      </speak>
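The documented value ranges for rate, pitch, volume, and the eq effect can be checked before a request is sent. The following is a minimal sketch under stated assumptions: SpeakTag is a hypothetical class, not an SDK type, and it covers only the properties whose ranges are given above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an SDK class): builds a <speak> element and
// rejects attribute values outside the ranges documented above.
final class SpeakTag {
    private final Map<String, String> attrs = new LinkedHashMap<>();

    /** Speech rate: a decimal number from 0.5 to 2. */
    SpeakTag rate(double value) {
        requireRange("rate", value, 0.5, 2.0);
        attrs.put("rate", String.valueOf(value));
        return this;
    }

    /** Pitch: a decimal number from 0.5 to 2. */
    SpeakTag pitch(double value) {
        requireRange("pitch", value, 0.5, 2.0);
        attrs.put("pitch", String.valueOf(value));
        return this;
    }

    /** Volume: an integer from 0 to 100. */
    SpeakTag volume(int value) {
        requireRange("volume", value, 0, 100);
        attrs.put("volume", String.valueOf(value));
        return this;
    }

    /** Sets effect="eq" with exactly eight gains, each an integer in [-20, 20]. */
    SpeakTag eq(int... gains) {
        if (gains.length != 8) {
            throw new IllegalArgumentException("eq needs exactly 8 gains, got " + gains.length);
        }
        StringBuilder value = new StringBuilder();
        for (int gain : gains) {
            requireRange("eq gain", gain, -20, 20);
            if (value.length() > 0) {
                value.append(' ');
            }
            value.append(gain);
        }
        attrs.put("effect", "eq");
        attrs.put("effectValue", value.toString());
        return this;
    }

    /** Wraps a body that the caller has already XML-escaped. */
    String wrap(String escapedBody) {
        StringBuilder sb = new StringBuilder("<speak");
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            sb.append(' ').append(e.getKey()).append("=\"").append(e.getValue()).append('"');
        }
        return sb.append('>').append(escapedBody).append("</speak>").toString();
    }

    private static void requireRange(String name, double value, double min, double max) {
        if (Double.isNaN(value) || value < min || value > max) {
            throw new IllegalArgumentException(
                name + " must be in [" + min + ", " + max + "], got " + value);
        }
    }
}
```

For example, new SpeakTag().rate(1.5).volume(80).wrap("Hi") produces <speak rate="1.5" volume="80">Hi</speak>, and rate(3.0) throws an IllegalArgumentException. Because each effect call overwrites the previous effect entry, the sketch also keeps to one sound effect per tag.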

<break>: Controls pause duration

  • Description

    Adds a period of silence during speech synthesis to simulate a natural pause. You can set the duration in seconds (s) or milliseconds (ms). This tag is optional.

  • Syntax

    # Without attributes
    <break/>
    # With the time attribute
    <break time="string"/>
  • Properties

    Note

    If you use the <break> tag without attributes, the default pause duration is 1 s.

    Property

    Type

    Required

    Description

    time

    String

    No

    Sets the pause duration in seconds or milliseconds, such as "2s" or "50ms".

    • Valid values:

      • In seconds (s): an integer from 1 to 10.

      • In milliseconds (ms): an integer from 50 to 10000.

    • Example:

      <speak>
        Please close your eyes and take a rest.<break time="500ms"/>Okay, please open your eyes.
      </speak>
    Important

    If you use multiple <break> tags consecutively, the total pause duration is the sum of the time specified in each tag. If the total duration exceeds 10 seconds, only the first 10 seconds take effect.

    For example, in the following SSML segment, the cumulative duration of the <break> tags is 15 seconds, which exceeds the 10-second limit. The final pause duration will be truncated to 10 seconds:

    <speak>
      Please close your eyes and take a rest.<break time="5s"/><break time="5s"/><break time="5s"/>Okay, please open your eyes.
    </speak>
  • Tag relationships

    <break> is an empty tag and cannot contain any other tags.
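The time ranges and the 10-second cap on consecutive pauses can be sketched as follows. BreakTime is a hypothetical helper, not an SDK class. It only mirrors the rules stated above so that you can estimate the effective pause on the client side. The service remains the authority on the actual behavior.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an SDK class) that mirrors the documented
// <break> rules: 1-10 s or 50-10000 ms per tag, 10 s total when consecutive.
final class BreakTime {
    private static final Pattern TIME = Pattern.compile("(\\d+)(ms|s)");
    static final int MAX_TOTAL_MS = 10_000;

    private BreakTime() {}

    /** Parses a value such as "2s" or "500ms" into milliseconds. */
    static int parseMillis(String time) {
        Matcher m = TIME.matcher(time);
        if (!m.matches()) {
            throw new IllegalArgumentException("Expected a value such as 2s or 500ms: " + time);
        }
        int n = Integer.parseInt(m.group(1));
        boolean seconds = m.group(2).equals("s");
        boolean valid = seconds ? (n >= 1 && n <= 10) : (n >= 50 && n <= 10_000);
        if (!valid) {
            throw new IllegalArgumentException("Pause duration out of range: " + time);
        }
        return seconds ? n * 1000 : n;
    }

    /** Effective pause of consecutive <break> tags: the sum, capped at 10 s. */
    static int effectiveMillis(List<String> consecutiveTimes) {
        long total = 0;
        for (String t : consecutiveTimes) {
            total += parseMillis(t);
        }
        return (int) Math.min(total, MAX_TOTAL_MS);
    }
}
```

With the three 5-second breaks from the example above, effectiveMillis returns 10000, which matches the documented truncation to 10 seconds.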

<sub>: Replaces text

  • Description

    Replaces a string of text with a specified alternative that is read aloud instead. For example, the text "W3C" can be read as "network protocol". This tag is optional.

  • Syntax

    <sub alias="string"></sub>
  • Properties

    Property

    Type

    Required

    Description

    alias

    String

    Yes

    Replaces a piece of text with text that is more suitable for reading.

    Example:

     <speak>
       <sub alias="network protocol">W3C</sub>
     </speak>
  • Tag relationships

    The <sub> tag can only contain text.

<phoneme>: Specifies pronunciation (Pinyin/phonetic alphabet)

  • Description

    Controls the pronunciation of a specific string of text. You can use Pinyin for Chinese and a phonetic alphabet, such as CMU, for English. This tag is optional and is suitable for scenarios that require precise pronunciation.

  • Syntax

    <phoneme alphabet="string" ph="string">text</phoneme>
  • Properties

    Property

    Type

    Required

    Description

    alphabet

    String

    Yes

    Specifies the pronunciation type: Pinyin (for Chinese) or phonetic alphabet (for English).

    Valid values:

    ph

    String

    Yes

    Specifies the pronunciation as Pinyin or phonetic symbols. For Pinyin:

    • The Pinyin for each character is separated by a space, and the number of Pinyin syllables must match the number of characters.

    • Each Pinyin syllable consists of a pronunciation part and a tone. The tone is an integer from 1 to 5, where 5 indicates a neutral tone.

    • Example:

      <speak>
        去<phoneme alphabet="py" ph="dian3 dang4 hang2">典当行</phoneme>把这个玩意<phoneme alphabet="py" ph="dang4 diao4">当掉</phoneme>
      </speak>
      
      <speak>
        How to spell <phoneme alphabet="cmu" ph="S AY N">sin</phoneme>?
      </speak>
  • Tag relationships

    The <phoneme> tag can only contain text.
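The Pinyin rules for the ph property (one syllable per character, each ending in a tone from 1 to 5) can be checked with a short routine. This is a sketch only: PinyinPh is a hypothetical class, and the assumption that a syllable consists of lowercase ASCII letters followed by a single tone digit is ours, not a rule stated by the service.

```java
import java.util.regex.Pattern;

// Hypothetical helper (not an SDK class) for checking a Pinyin ph value.
final class PinyinPh {
    // Assumption: lowercase ASCII letters followed by one tone digit (1-5).
    private static final Pattern SYLLABLE = Pattern.compile("[a-z]+[1-5]");

    private PinyinPh() {}

    /** Returns true if ph has one well-formed syllable per character of text. */
    static boolean matchesText(String text, String ph) {
        String[] syllables = ph.trim().split("\\s+");
        // Count code points so that characters outside the BMP count once.
        int characters = text.codePointCount(0, text.length());
        if (syllables.length != characters) {
            return false;
        }
        for (String syllable : syllables) {
            if (!SYLLABLE.matcher(syllable).matches()) {
                return false;
            }
        }
        return true;
    }
}
```

For the first example above, matchesText("典当行", "dian3 dang4 hang2") returns true, while a ph value with only two syllables for the same three characters returns false.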

<soundEvent>: Inserts an external sound (such as a ringtone or a cat's meow)

  • Description

    Allows you to insert sound effect files, such as prompt tones or ambient sounds, into the synthesized speech to enrich the audio output. This tag is optional.

  • Syntax

     <soundEvent src="URL"/>
  • Properties

    Property

    Type

    Required

    Description

    src

    String

    Yes

    Sets the external audio URL.

    The audio file must be stored in OSS (see Upload files), and its bucket must have at least public-read permissions. If the URL contains XML special characters, such as &, <, and >, you must escape them.

    • Audio requirements:

      • Sample rate: 16 kHz

      • Number of sound channels: mono

      • File format: WAV

        If the original audio is not in WAV format, use the ffmpeg tool to convert it:

        ffmpeg -i input_audio -acodec pcm_s16le -ac 1 -ar 16000 output.wav
      • File size: no more than 2 MB

      • Bit depth: 16-bit

    • Example:

      <speak>
        A horse was frightened<soundEvent src="http://nls.alicdn.com/sound-event/horse-neigh.wav"/>and people scattered to avoid it.
      </speak>
    Important

    You are legally responsible for the copyright of the uploaded audio.

  • Tag relationships

    <soundEvent> is an empty tag and cannot contain any other tags.
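Before you upload a sound file, you can check it against the audio requirements above with the standard javax.sound.sampled API. The following is a minimal sketch: SoundEventAudio is a hypothetical class, 2 MB is interpreted as 2 × 1024 × 1024 bytes, and signed PCM is assumed because the ffmpeg command above produces pcm_s16le. Neither assumption is confirmed by the service.

```java
import java.io.File;
import java.io.IOException;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical local pre-check (not an SDK class) for <soundEvent> audio.
final class SoundEventAudio {
    // Assumption: "2 MB" means 2 * 1024 * 1024 bytes.
    static final long MAX_BYTES = 2L * 1024 * 1024;

    private SoundEventAudio() {}

    /** Checks 16 kHz, mono, 16-bit signed PCM, and the file size limit. */
    static boolean meetsRequirements(AudioFormat format, long fileSizeBytes) {
        return AudioFormat.Encoding.PCM_SIGNED.equals(format.getEncoding())
            && format.getSampleRate() == 16000f
            && format.getChannels() == 1
            && format.getSampleSizeInBits() == 16
            && fileSizeBytes <= MAX_BYTES;
    }

    /** Reads the header of a local file and applies the same checks. */
    static boolean meetsRequirements(File wav)
            throws IOException, UnsupportedAudioFileException {
        AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(wav);
        return AudioFileFormat.Type.WAVE.equals(fileFormat.getType())
            && meetsRequirements(fileFormat.getFormat(), wav.length());
    }
}
```

The same format checks apply to background music for the bgm property, except that bgm files have no upper size limit.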

<say-as>: Sets how text is read (such as numbers, dates, and phone numbers)

  • Description

    Indicates the content type of a text string, which allows the model to read the text in the appropriate format. This tag is optional.

  • Syntax

     <say-as interpret-as="string">text</say-as>
  • Properties

    Property

    Type

    Required

    Description

    interpret-as

    String

    Yes

    Indicates the information type of the text within the tag.

    Valid values:

    • cardinal: Read as a cardinal number (integer or decimal).

    • digits: Read as individual digits. For example, 123 is read as one two three.

    • telephone: Read as a telephone number.

    • name: Read as a name.

    • address: Read as an address.

    • id: Suitable for account names and nicknames. Read in the conventional way.

    • characters: Read the text within the tag character by character.

    • punctuation: Read the text within the tag as punctuation marks.

    • date: Read as a date.

    • time: Read as a time.

    • currency: Read as a currency amount.

    • measure: Read as a unit of measure.

  • Supported formats for each <say-as> type

    • cardinal

      Format

      Example

      English output

      Description

      Number string

      145

      one hundred forty five

      Integer input range: positive or negative integers of up to 12 digits, [-999999999999, 999999999999].

      Decimal input range: There is no hard limit on the number of decimal places, but we recommend no more than 10.

      Number string starting with zero

      0145

      one hundred forty five

      Negative sign + number string

      -145

      minus one hundred forty five

      Three-digit number string separated by commas

      60,000

      sixty thousand

      Negative sign + three-digit number string separated by commas

      -208,000

      minus two hundred eight thousand

      Number string + decimal point + zero

      12.00

      twelve

      Number string + decimal point + number string

      12.34

      twelve point three four

      Three-digit number string separated by commas + decimal point + number string

      1,000.1

      one thousand point one

      Negative sign + number string + decimal point + number string

      -12.34

      minus twelve point three four

      Negative sign + three-digit number string separated by commas + decimal point + number string

      -1,000.1

      minus one thousand point one

      (Three-digit comma-separated) number string + hyphen + (three-digit comma-separated) number

      1-1,000

      one to one thousand

      Other default readings

      012.34

      twelve point three four

      None

      1/2

      one half

      -3/4

      minus three quarters

      5.1/6

      five point one over six

      -3 1/2

      minus three and a half

      1,000.3^3

      one thousand point three to the power of three

      3e9.1

      three times ten to the power of nine point one

      23.10%

      twenty three point one percent

    • digits

      Format

      Example

      English output

      Description

      Number string

      12034

      one two zero three four

      There is no special limit on the length of the number string, but it is recommended not to exceed 20 digits.

      When the number string is grouped by spaces or hyphens, a comma is inserted between the groups to create an appropriate pause. Up to 5 groups are supported.

      Number string + space or hyphen + number string + space or hyphen + number string + space or hyphen + number string

      1-23-456 7890

      one, two three, four five six, seven eight nine zero

    • telephone

      Format

      Example

      English output

      Description

      Number string

      12034

      one two oh three four

      There is no special limit on the length of the number string, but it is recommended not to exceed 20 digits. When the number string is grouped by spaces or hyphens, a comma is inserted between the groups to create an appropriate pause. Up to 5 groups are supported.

      Number string + space or hyphen + number string + space or hyphen + number string

      1-23-456 7890

      one, two three, four five six, seven eight nine oh

      Plus sign + number string + space or hyphen + number string

      +43-211-0567

      plus four three, two one one, oh five six seven

      Left parenthesis + number string + right parenthesis + space + number string + space or hyphen + number string

      (21) 654-3210

      (two one) six five four, three two one oh

    • address

      This tag is not supported for English text.

    • id

      For English text, this tag functions the same as the characters tag.

    • characters

      Format

      Example

      English output

      Description

      string

      *b+3$.c-0'=α

      asterisk B plus three dollar dot C dash zero apostrophe equals alpha

      Supports Chinese characters, uppercase and lowercase English characters, Arabic numerals 0-9, and some full-width and half-width characters.

      The spaces in the output indicate that a pause is inserted between each character, meaning the characters are read one by one.

      If the text within the tag contains XML special characters, you must escape them.

    • punctuation

      For English text, this tag functions the same as the characters tag.

    • date

      Format

      Example

      English output

      Description

      Four digits/two digits or four digits-two digits

      2000/01

      two thousand, oh one

      Spans across years.

      1900-01

      nineteen hundred, oh one

      2001-02

      twenty oh one, oh two

      2019-20

      twenty nineteen, twenty

      1998-99

      nineteen ninety eight, ninety nine

      1999-00

      nineteen ninety nine, oh oh

      Four-digit number starting with 1 or 2

      2000

      two thousand

      Four-digit year.

      1900

      nineteen hundred

      1905

      nineteen oh five

      2021

      twenty twenty one

      Day of the week-Day of the week

      or

      Day of the week~Day of the week

      or

      Day of the week&Day of the week

      mon-wed

      monday to wednesday

      If the text in the day-of-the-week range tag contains special XML characters, escape the characters.

      tue~fri

      tuesday to friday

      sat&sun

      saturday and sunday

      DD-DD MMM, YYYY

      or

      DD~DD MMM, YYYY

      or

      DD&DD MMM, YYYY

      19-20 Jan, 2000

      the nineteenth to the twentieth of january two thousand

      DD indicates a two-digit day. MMM indicates the three-letter abbreviation or full name of a month. YYYY indicates a four-digit year starting with 1 or 2.

      01 ~ 10 Jul, 2020

      the first to the tenth of july twenty twenty

      05&06 Apr, 2009

      the fifth and the sixth of april two thousand nine

      MMM DD-DD

      or

      MMM DD~DD

      or

      MMM DD&DD

      Feb 01 - 03

      february the first to the third

      MMM indicates the three-letter abbreviation or full name of a month. DD indicates a two-digit day.

      Aug 10–20

      august the tenth to the twentieth

      Dec 11&12

      december the eleventh and the twelfth

      MMM-MMM

      or

      MMM~MMM

      or

      MMM&MMM

      Jan-Jun

      january to june

      MMM indicates the three-letter abbreviation or full name of a month.

      Jul - Dec

      july to december

      sep&oct

      september and october

      YYYY-YYYY

      or

      YYYY~YYYY

      1990 - 2000

      nineteen ninety to two thousand

      YYYY indicates a four-digit year that starts with 1 or 2.

      2001–2021

      two thousand one to twenty twenty one

      WWW DD MMM YYYY

      Sun 20 Nov 2011

      sunday the twentieth of november twenty eleven

      WWW is the three-letter abbreviation or full name for a day of the week. DD is a two-digit day. MMM is the three-letter abbreviation or full name for a month. MM is a two-digit month (or the three-letter abbreviation or full name for a month). YYYY is a four-digit year starting with 1 or 2.

      WWW DD MMM

      Sun 20 Nov

      sunday the twentieth of november

      WWW MMM DD YYYY

      Sun Nov 20 2011

      sunday november the twentieth twenty eleven

      WWW MMM DD

      Sun Nov 20

      sunday november the twentieth

      WWW YYYY-MM-DD

      Sat 2010-10-01

      saturday october the first twenty ten

      WWW YYYY/MM/DD

      Sat 2010/10/01

      saturday october the first twenty ten

      WWW MM/DD/YYYY

      Sun 11/20/2011

      sunday november the twentieth twenty eleven

      MM/DD/YYYY

      11/20/2011

      november the twentieth twenty eleven

      YYYY

      1998

      nineteen ninety eight

      Other default readings

      10 Mar, 2001

      the tenth of march two thousand one

      None

      10 Mar

      the tenth of march

      Mar 2001

      march two thousand one

      Fri. 10/Mar/2001

      friday the tenth of march two thousand one

      Mar 10th, 2001

      march the tenth two thousand one

      Mar 10

      march the tenth

      2001/03/10

      march the tenth two thousand one

      2001-03-10

      march the tenth two thousand one

      2000s

      two thousands

      2010's

      twenty tens

      1900's

      nineteen hundreds

      1990s

      nineteen nineties

    • time

      Format

      Example

      English output

      Description

      HH:MM AM or PM

      09:00 AM

      nine A M

      HH represents a one- or two-digit hour. MM represents a two-digit minute. AM/PM represents morning or afternoon.

      09:03 PM

      nine oh three P M

      09:13 p.m.

      nine thirteen p m

      HH:MM

      21:00

      twenty one hundred

      HHMM

      100

      one oclock

      Time point-Time point

      8:00 am - 05:30 pm

      eight a m to five thirty p m

      Supports common time and time range formats.

      7:05~10:15 AM

      seven oh five to ten fifteen A M

      09:00-13:00

      nine oclock to thirteen hundred

    • currency

      Format

      Example

      English output

      Description

      Number + Currency identifier

      1.00 RMB

      one yuan

      Supported number formats: integers, decimals, and the international format that uses commas as thousands separators.

      Supported currency identifiers:

      CN¥ (yuan)

      CNY (yuan)

      RMB (yuan)

      AUD (australian dollar)

      CAD (canadian dollar)

      CHF (swiss franc)

      DKK (danish krone)

      EUR (euro)

      GBP (british pound)

      HKD (Hong Kong (China) dollar)

      JPY (japanese yen)

      NOK (norwegian krone)

      SEK (swedish krona)

      SGD (singapore dollar)

      USD (united states dollar)

      2.02 CNY

      two point zero two yuan

      1,000.23 CN¥

      one thousand point two three yuan

      1.01 SGD

      one singapore dollar and one cent

      2.01 CAD

      two canadian dollars and one cent

      3.1 HKD

      three hong kong dollars and ten cents

      1,000.00 EUR

      one thousand euros

      Currency identifier + Number

      US$ 1.00

      one US dollar

      Supported number formats: integers, decimals, and the international format that uses commas as thousands separators.

      Supported currency identifiers:

      US$ (US dollar)

      CA$ (Canadian dollar)

      AU$ (Australian dollar)

      SG$ (Singapore dollar)

      HK$ (Hong Kong (China) dollar)

      C$ (Canadian dollar)

      A$ (Australian dollar)

      $ (dollar)

      £ (pound)

      € (euro)

      CN¥ (yuan)

      CNY (yuan)

      RMB (yuan)

      AUD (australian dollar)

      CAD (canadian dollar)

      CHF (swiss franc)

      DKK (danish krone)

      EUR (euro)

      GBP (british pound)

      HKD (Hong Kong (China) dollar)

      JPY (japanese yen)

      NOK (norwegian krone)

      SEK (swedish krona)

      SGD (singapore dollar)

      USD (united states dollar)

      $0.01

      one cent

      JPY 1.01

      one japanese yen and one sen

      £1.1

      one pound and ten pence

      €2.01

      two euros and one cent

      USD 1,000

      one thousand united states dollars

      Number + Quantifier + Currency identifier

      or

      Currency identifier + Number + Quantifier

      1.23 Tn RMB

      one point two three trillion yuan

      Supported quantifier formats include the following:

      thousand

      million

      billion

      trillion

      Mil (million)

      mil (million)

      Bil (billion)

      bil (billion)

      MM (million)

      Bn (billion)

      bn (billion)

      Tn (trillion)

      tn (trillion)

      K (thousand)

      k (thousand)

      M (million)

      m (million)

      $1.2 K

      one point two thousand dollars

    • measure

      Format

      Example

      English output

      Description

      Number + Unit of measurement

      1.0 kg

      one kilogram

      Supports integers, decimals, and international notation with comma separators.

      Supports common unit abbreviations.

      1,234.01 km

      one thousand two hundred thirty-four point zero one kilometers

      Unit of measurement

      mm2

      square millimeter

    • The following table lists the pronunciations of common symbols for <say-as>.

      Symbol

      English pronunciation

      !

      exclamation mark

      "

      double quote

      #

      pound

      $

      dollar

      %

      percent

      &

      and

      '

      left quote

      (

      left parenthesis

      )

      right parenthesis

      *

      asterisk

      +

      plus

      ,

      comma

      -

      dash

      .

      dot

      /

      slash

      :

      colon

      ;

      semicolon

      <

      less than

      =

      equals

      >

      greater than

      ?

      question mark

      @

      at

      [

      left bracket

      \

      backslash

      ]

      right bracket

      ^

      caret

      _

      underscore

      `

      backtick

      {

      left brace

      |

      vertical bar

      }

      right brace

      ~

      tilde

      ！

      exclamation mark

      “

      left double quote

      ”

      right double quote

      ‘

      left quote

      ’

      right quote

      （

      left parenthesis

      ）

      right parenthesis

      ，

      comma

      。

      full stop

      —

      em dash

      ：

      colon

      ；

      semicolon

      ？

      question mark

      、

      enumeration comma

      …

      ellipsis

      ……

      ellipsis

      《

      left guillemet

      》

      right guillemet

      ￥

      yuan

      ≥

      greater than or equal to

      ≤

      less than or equal to

      ≠

      not equal

      ≈

      approximately equal

      ±

      plus or minus

      ×

      times

      π

      pi

      Α

      alpha

      Β

      beta

      Γ

      gamma

      Δ

      delta

      Ε

      epsilon

      Ζ

      zeta

      Θ

      theta

      Ι

      iota

      Κ

      kappa

      Λ

      lambda

      Μ

      mu

      Ν

      nu

      Ξ

      ksi

      Ο

      omicron

      Π

      pi

      Ρ

      rho

      Σ

      sigma

      Τ

      tau

      Υ

      upsilon

      Φ

      phi

      Χ

      chi

      Ψ

      psi

      Ω

      omega

      α

      alpha

      β

      beta

      γ

      gamma

      δ

      delta

      ε

      epsilon

      ζ

      zeta

      η

      eta

      θ

      theta

      ι

      iota

      κ

      kappa

      λ

      lambda

      μ

      mu

      ν

      nu

      ξ

      ksi

      ο

      omicron

      π

      pi

      ρ

      rho

      σ

      sigma

      τ

      tau

      υ

      upsilon

      φ

      phi

      χ

      chi

      ψ

      psi

      ω

      omega

    • The following table lists common units of measurement for <say-as>.

      Category

      Abbreviation (English reading)

      Length

      nm (nanometer), μm (micrometer), mm (millimeter), cm (centimeter), m (meter), km (kilometer), ft (foot), in (inch)

      Area

      cm² (square centimeter), m² (square meter), km² (square kilometer), SqFt (square foot)

      Volume

      cm³ (cubic centimeter), m³ (cubic meter), km³ (cubic kilometer), mL (milliliter), L (liter), gal (gallon)

      Weight

      μg (microgram), mg (milligram), g (gram), kg (kilogram)

      Time

      min (minute), sec (second), ms (millisecond)

      Electromagnetism

      μA (microamp), mA (milliamp), Hz (hertz), kHz (kilohertz), MHz (megahertz), GHz (gigahertz), V (volt), kV (kilovolt), kWh (kilowatt hour)

      Sound

      dB (decibel)

      Atmospheric pressure

      Pa (pascal), kPa (kilopascal), MPa (megapascal)

      Other common units

      Units outside the preceding categories are also supported, such as tsp (teaspoon), rpm (revolutions per minute), KB (kilobyte), and mmHg (millimeter of mercury).

  • Tag relationships

    The <say-as> tag can contain text and the <vhml/> tag.

  • Examples

    • cardinal

      <speak>
        <say-as interpret-as="cardinal">12345</say-as>
      </speak>
      <speak>
        <say-as interpret-as="cardinal">10234</say-as>
      </speak>
    • digits

      <speak>
        <say-as interpret-as="digits">12345</say-as>
      </speak>
      <speak>
        <say-as interpret-as="digits">10234</say-as>
      </speak>
    • telephone

      <speak>
        <say-as interpret-as="telephone">12345</say-as>
      </speak>
      <speak>
        <say-as interpret-as="telephone">10234</say-as>
      </speak>
    • name

      <speak>
        Her former name is <say-as interpret-as="name">Zeng Xiaofan</say-as>
      </speak>
    • address

      <speak>
        <say-as interpret-as="address">Fulu International, Building 1, Unit 3, Room 304</say-as>
      </speak>
    • id

      <speak>
        <say-as interpret-as="id">myid_1998</say-as>
      </speak>
    • characters

      <speak>
        <say-as interpret-as="characters">Greek letters αβ</say-as>
      </speak>
      <speak>
        <say-as interpret-as="characters">*b+3.c$=α</say-as>
      </speak>
    • punctuation

      <speak>
        <say-as interpret-as="punctuation"> -./:;</say-as>
      </speak>
    • date

      <speak>
        <say-as interpret-as="date">1000-10-10</say-as>
      </speak>
      <speak>
        <say-as interpret-as="date">10-01-2020</say-as>
      </speak>
    • time

      <speak>
        <say-as interpret-as="time">5:00am</say-as>
      </speak>
      <speak>
        <say-as interpret-as="time">0500</say-as>
      </speak>
    • currency

      <speak>
        <say-as interpret-as="currency">13,000,000.00RMB</say-as>
      </speak>
      <speak>
        <say-as interpret-as="currency">$1,000.01</say-as>
      </speak>
    • measure

      <speak>
        <say-as interpret-as="measure">100m12cm6mm</say-as>
      </speak>
      <speak>
        <say-as interpret-as="measure">1,000.01kg</say-as>
      </speak>
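When you generate <say-as> tags from program data, you can validate the interpret-as value and escape the content in one place. The following is a minimal sketch: SayAs is a hypothetical class, not an SDK type, and its set of types is copied from the valid values listed for interpret-as.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not an SDK class) that builds a <say-as> element.
final class SayAs {
    // The interpret-as values listed in this topic.
    private static final Set<String> TYPES = new HashSet<>(Arrays.asList(
        "cardinal", "digits", "telephone", "name", "address", "id",
        "characters", "punctuation", "date", "time", "currency", "measure"));

    private SayAs() {}

    /** Returns a <say-as> element with the plain text XML-escaped. */
    static String tag(String interpretAs, String plainText) {
        if (!TYPES.contains(interpretAs)) {
            throw new IllegalArgumentException("Unsupported interpret-as value: " + interpretAs);
        }
        return "<say-as interpret-as=\"" + interpretAs + "\">"
            + escape(plainText) + "</say-as>";
    }

    // The ampersand is replaced first so that later entities are not re-escaped.
    private static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }
}
```

For example, SayAs.tag("date", "sat&sun") returns <say-as interpret-as="date">sat&amp;sun</say-as>, which covers the day-of-the-week range format that requires escaping. The result still has to be placed inside a <speak> element.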