Surrogate Pair Characters Are Treated as Two Characters

Publish Date: Sep 10, 2025
Description

Issue

Although surrogate pair characters (e.g., "𠮟", "𠀋") display as 1 character, formula string functions (LEFT, LEN, MID, RIGHT, etc.) treat them as 2 characters internally.

  • LEN("𠮟") returns 2, not 1.

  • LEFT("𠮟", 1) extracts only the high surrogate, which cannot be interpreted as a character on its own, so the result displays as "?".

Cause

Surrogate pair characters consist of a combination of two values: a "high surrogate" and a "low surrogate." Consequently, the Salesforce formula engine counts these as two separate characters.
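The split into two values can be seen outside Salesforce as well. The following Python sketch is only an illustration, assuming the formula engine counts UTF-16 code units; `utf16_units` and `utf16_len` are hypothetical helper names, not Salesforce functions.

```python
def utf16_units(text: str) -> list:
    """Return the UTF-16 code units of text as integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def utf16_len(text: str) -> int:
    """Count UTF-16 code units (assumed to be what LEN() counts)."""
    return len(utf16_units(text))


kanji = "\U00020B9F"  # the character "𠮟" (U+20B9F)

# One displayed character, but two code units: high surrogate, then low surrogate.
print([hex(unit) for unit in utf16_units(kanji)])  # ['0xd842', '0xdf9f']
print(utf16_len(kanji))  # 2
```

Characters in the Basic Multilingual Plane, such as "a" or "叱", occupy a single code unit and are counted as one character.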

Scenario:

A Salesforce admin builds a formula to count the characters in a text field. When the field contains extended Japanese kanji that are encoded as surrogate pairs, LEN() returns a higher count than the number of characters displayed.

Resolution

To correctly manipulate surrogate pair characters in formulas, you must treat them as two characters.

For example, to retrieve the surrogate pair character "𠮟", use the following syntax: LEFT("𠮟", 2)
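The same rule can be sketched in Python to show why the length argument must be 2. This is a rough emulation under the assumption that LEFT slices UTF-16 code units; `formula_left` and `units_for_chars` are hypothetical helpers written for this illustration and are not part of any Salesforce API.

```python
def formula_left(text: str, num_units: int) -> str:
    """Emulate LEFT(), assuming it slices by UTF-16 code units."""
    data = text.encode("utf-16-be")[: max(num_units, 0) * 2]
    # errors="replace" substitutes U+FFFD for a surrogate left without its pair.
    return data.decode("utf-16-be", errors="replace")


def units_for_chars(text: str, num_chars: int) -> int:
    """Code units needed to cover the first num_chars displayed characters."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text[:num_chars])


kanji = "\U00020B9F"  # the character "𠮟" (U+20B9F)

print(formula_left(kanji, 1) == kanji)  # False: only the high surrogate was taken
print(formula_left(kanji, 2) == kanji)  # True: both halves of the pair are kept

# Taking the first two displayed characters of "𠮟abc" needs three code units.
sample = kanji + "abc"
print(units_for_chars(sample, 2))  # 3
```

In a formula, the equivalent adjustment is to add one to the length argument for each surrogate pair character expected in the extracted portion.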

Additional Resources

This behavior is also observed in the String class. Please refer to the following article for more details.

See also:

Knowledge Article Number

005166849

 