FAQ about encoding

Last Updated: Mar 08, 2024

This topic provides answers to some commonly asked questions about encoding in EdgeRoutine (ER).

What encoding methods does ER support?

ER supports only UTF-8 encoding.

Does ER affect data transmission in pass-through mode?

No. In pass-through mode, ER does not fetch the request body. ER modifies only the request headers and transmits the request body unmodified as a network stream. The stream does not pass through the JavaScript virtual machine.

Note

By default, the Fetch API decompresses streams, so ER also decompresses them. To pass data through without modification, set the decompress parameter to manual.
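As a minimal sketch of the note above, the following helper issues a fetch with decompression turned off. The decompress option and its manual value come from the note; the helper name fetchRaw and the injectable fetchImpl parameter are assumptions added for illustration, so that the function can also be tried outside ER.

```javascript
// Hypothetical helper: request a URL without letting the runtime decompress the body.
// fetchImpl defaults to the global fetch; it is injectable only for illustration and testing.
function fetchRaw(url, fetchImpl = fetch) {
  // decompress: "manual" is the ER-specific option described in the note above.
  return fetchImpl(url, { decompress: "manual" });
}
```

Assuming a Service Worker style fetch event handler, the helper could be called as event.respondWith(fetchRaw(event.request.url)).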

Are JavaScript strings encoded in UTF-16?

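JavaScript strings are sequences of UTF-16 code units, but ER supports only UTF-8 for transmitted data. UTF-16 is not compatible with ASCII and uses surrogate pairs to represent code points above U+FFFF. If a web page contains characters that are encoded as surrogate pairs, character errors may occur when strings are split or converted.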

String.substring extracts a substring by UTF-16 code unit index, not by character. A surrogate pair consists of two UTF-16 code units: a high surrogate followed by a low surrogate. If a substring boundary falls inside a surrogate pair, the substring contains an unpaired surrogate. When the substring is encoded in UTF-8, the unpaired surrogate is replaced with U+FFFD REPLACEMENT CHARACTER (65533), so the original character is not displayed in the browser.
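The effect can be reproduced with a short sketch. It uses only the standard String.substring and TextEncoder APIs; the helper name encodeSubstring and the sample string are illustrative.

```javascript
// Encode a substring as UTF-8 and return the bytes as a plain array.
function encodeSubstring(text, start, end) {
  const part = text.substring(start, end); // indexes count UTF-16 code units
  return Array.from(new TextEncoder().encode(part));
}

const sample = "a\u{1F600}b"; // U+1F600 needs a surrogate pair, so sample.length is 4
const broken = encodeSubstring(sample, 0, 2); // cuts between the two surrogates
const intact = encodeSubstring(sample, 0, 3); // keeps the pair together

console.log(broken); // [ 97, 239, 191, 189 ] : "a" followed by U+FFFD (EF BF BD)
console.log(intact); // [ 97, 240, 159, 152, 128 ] : "a" followed by U+1F600
```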

How can I modify content?

You can use the following code to buffer data:

text/arrayBuffer/JSON ...
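
As a hedged illustration of buffering, the sketch below reads the whole body with text() and then applies a modification. The names rewriteBody and modify are hypothetical; response can be any object that exposes a Fetch-style text() method.

```javascript
// Hypothetical helper: buffer the full body as a string, then modify it.
async function rewriteBody(response, modify) {
  const body = await response.text(); // buffers the entire body in memory
  return modify(body);
}
```

The returned string can then be wrapped in a new Response. Buffering holds the complete body in memory, so this approach suits small payloads better than large ones.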
Important
  • When you perform stream processing, check for surrogate code points and make sure that no surrogate pair is split. If a surrogate pair is split, ER cannot determine the range of the substring that you want to extract. If your web pages contain almost no characters that use surrogate pairs, apart from occasional emojis, you can ignore this consideration.

  • Alibaba Cloud will soon launch an HTML parser to help you modify HTML code in a more efficient manner. For more information, see the announcements on the Alibaba Cloud International site.
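
The surrogate check described in the first item can be sketched as follows. The function name safeSplitIndex is hypothetical; the sketch only assumes that the text is held in a JavaScript string and that you choose where to cut it.

```javascript
// Hypothetical helper: move a cut position back by one code unit
// if cutting at `index` would separate a high surrogate from its low surrogate.
function safeSplitIndex(text, index) {
  const code = text.charCodeAt(index - 1); // NaN when index is 0, so the checks below fail safely
  const endsOnHighSurrogate = code >= 0xd800 && code <= 0xdbff;
  return endsOnHighSurrogate ? index - 1 : index;
}

const chunk = "a\u{1F600}b";
const cut = safeSplitIndex(chunk, 2); // 1, because index 2 falls inside the pair
const head = chunk.substring(0, cut); // "a"
const tail = chunk.substring(cut);    // the emoji followed by "b", pair intact
```

The tail can be prepended to the next chunk so that the pair is processed as a whole.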

How do I encode a string to a UTF-8 ArrayBuffer or decode a UTF-8-encoded ArrayBuffer to a string?

Use TextEncoder to encode a string into UTF-8 bytes, and use TextDecoder to decode UTF-8 bytes back into a string.
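
A minimal round-trip sketch with the standard TextEncoder and TextDecoder APIs is shown below. The helper names encodeUtf8 and decodeUtf8 are illustrative.

```javascript
// Encode a string into a standalone ArrayBuffer that holds its UTF-8 bytes.
function encodeUtf8(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array
  // Copy exactly the bytes of the view, in case it does not span its whole buffer.
  return bytes.buffer.slice(bytes.byteOffset, bytes.byteOffset + bytes.byteLength);
}

// Decode an ArrayBuffer (or typed array) of UTF-8 bytes into a string.
function decodeUtf8(buffer) {
  return new TextDecoder("utf-8").decode(buffer);
}

const buffer = encodeUtf8("h\u00e9llo"); // 6 bytes: U+00E9 takes two bytes in UTF-8
const text = decodeUtf8(buffer);         // "h\u00e9llo"
```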