This topic answers frequently asked questions about encoding in EdgeRoutine (ER).
What encoding methods does ER support?
ER supports only UTF-8 encoding.
Does ER affect data transmission in pass-through mode?
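No. However, JavaScript strings are sequences of UTF-16 code units. UTF-16 is not compatible with ASCII and represents some characters as surrogate pairs. If a web page contains characters that are encoded as surrogate pairs and your code handles the content as strings, character errors may occur.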
String.substring extracts a range of UTF-16 code units, not whole characters. A surrogate pair consists of two UTF-16 code units: a high surrogate followed by a low surrogate. If a substring boundary falls inside a surrogate pair, the substring contains an unpaired surrogate. When the substring is encoded in UTF-8, the unpaired surrogate is replaced with U+FFFD REPLACEMENT CHARACTER (65533), so a browser cannot display the original character.
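The following minimal sketch reproduces the problem with standard JavaScript APIs only. The sample string and variable names are illustrative, and the snippet is not specific to ER.

```javascript
// U+1F600 is stored in a JavaScript string as the surrogate pair D83D DE00.
const text = "a\u{1F600}b"; // 4 UTF-16 code units: a, D83D, DE00, b

// substring counts UTF-16 code units, so this cut falls between the two surrogates.
const broken = text.substring(0, 2); // "a" followed by the lone high surrogate D83D

// A lone surrogate has no UTF-8 form, so TextEncoder emits U+FFFD in its place.
const bytes = new TextEncoder().encode(broken);
console.log(Array.from(bytes)); // [97, 239, 191, 189]

// Decoding the bytes yields the replacement character, not the original emoji.
const roundTrip = new TextDecoder().decode(bytes);
console.log(roundTrip.codePointAt(1)); // 65533
```

The bytes 239, 191, 189 (EF BF BD) are the UTF-8 encoding of U+FFFD.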
How can I modify content?
- When you perform stream processing, pay attention to surrogate code points and make sure that no surrogate pair is split. If a surrogate pair is split, ER cannot determine the range of the substring that you want to extract. Most web pages rarely contain characters that require surrogate pairs, apart from emojis. If your web pages contain no such characters, you can ignore this consideration.
- Alibaba Cloud will soon launch an HTML parser that helps you modify HTML code more efficiently. Pay attention to the announcements on the Alibaba Cloud International site.
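One way to keep surrogate pairs intact when you cut a string is to check the code units on both sides of the cut. The following is a minimal sketch under that assumption. safeCutIndex is a hypothetical helper name, not an ER API.

```javascript
// Hypothetical helper: returns a cut index that does not split a surrogate pair.
function safeCutIndex(str, index) {
  // Cuts at or beyond either end of the string cannot split a pair.
  if (index <= 0 || index >= str.length) {
    return Math.max(0, Math.min(index, str.length));
  }
  const prev = str.charCodeAt(index - 1);
  const next = str.charCodeAt(index);
  const prevIsHigh = prev >= 0xd800 && prev <= 0xdbff; // high surrogate range
  const nextIsLow = next >= 0xdc00 && next <= 0xdfff;  // low surrogate range
  // Step back one code unit when the cut would land inside a pair.
  return prevIsHigh && nextIsLow ? index - 1 : index;
}

const page = "ok\u{1F600}!"; // code units: o, k, D83D, DE00, !
const cut = safeCutIndex(page, 3); // index 3 falls between the two surrogates
console.log(cut); // 2
console.log(page.substring(0, cut) === "ok"); // true
console.log(page.substring(cut) === "\u{1F600}!"); // true
```

Moving the cut backward keeps the whole pair in the next piece, so both pieces encode to valid UTF-8.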
How do I encode a string as UTF-8 bytes in an ArrayBuffer, or decode UTF-8 bytes in an ArrayBuffer to a string?
Use TextEncoder to encode a string as UTF-8 bytes, and use TextDecoder to decode UTF-8 bytes to a string.
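The following minimal sketch shows a round trip with the standard TextEncoder and TextDecoder interfaces. The sample string and variable names are illustrative.

```javascript
const encoder = new TextEncoder();        // always produces UTF-8
const decoder = new TextDecoder("utf-8"); // decodes UTF-8 bytes

// encode() returns a Uint8Array that holds the UTF-8 bytes.
const encoded = encoder.encode("h\u00e9llo"); // U+00E9 takes 2 bytes in UTF-8
console.log(encoded.length); // 6

// Copy exactly the encoded bytes into a standalone ArrayBuffer.
const buffer = encoded.buffer.slice(
  encoded.byteOffset,
  encoded.byteOffset + encoded.byteLength
);

// decode() accepts an ArrayBuffer or a typed array and returns a string.
const decoded = decoder.decode(buffer);
console.log(decoded === "h\u00e9llo"); // true
```

TextEncoder returns a Uint8Array rather than a bare ArrayBuffer. The slice call copies only the bytes that belong to the encoded result.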