
Surrogate Pair Characters Are Treated as Two Characters

Publish Date: Sep 10, 2025
Description

Issue

Although surrogate pair characters (e.g., "𠮟", "𠀋") display as a single character, formula string functions (LEFT, LEN, MID, RIGHT, etc.) treat each one as two characters internally.

  • LEN("𠮟") returns 2, not 1.

  • LEFT("𠮟", 1) extracts only the high surrogate and returns "?", because half of a surrogate pair cannot be interpreted as a character on its own.
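The same counting can be reproduced outside Salesforce. The sketch below uses Java, whose strings are sequences of UTF-16 code units, purely as an analogy. It is not the formula engine itself, and the class and method names are hypothetical.

```java
public class SurrogateLengthDemo {
    // "𠮟" is U+20B9F; it is built from its code point to avoid source-encoding issues.
    public static final String KANJI = new String(Character.toChars(0x20B9F));

    // Number of UTF-16 code units, analogous to what LEN() reports in the article.
    public static int utf16Units(String s) {
        return s.length();
    }

    // Number of characters as a reader sees them (Unicode code points).
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    // True when taking only the first code unit leaves an unpaired high surrogate.
    public static boolean firstUnitIsLoneHighSurrogate(String s) {
        String firstUnit = s.substring(0, 1); // analogous to LEFT(text, 1)
        return Character.isHighSurrogate(firstUnit.charAt(0));
    }

    public static void main(String[] args) {
        System.out.println(utf16Units(KANJI));                   // code units
        System.out.println(codePoints(KANJI));                   // visible characters
        System.out.println(firstUnitIsLoneHighSurrogate(KANJI)); // half a pair
    }
}
```

An unpaired high surrogate has no glyph of its own, which is consistent with the "?" described above.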

Cause

In UTF-16, a character outside the Basic Multilingual Plane (above U+FFFF) is encoded as a surrogate pair: two 16-bit code units, a "high surrogate" followed by a "low surrogate." Consequently, the Salesforce formula engine counts such a character as two separate characters.
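To make the two values concrete, the following Java sketch applies the standard UTF-16 arithmetic to split a code point into its high and low surrogates. The class and method names are hypothetical, and the code illustrates the encoding only, not Salesforce internals.

```java
public class SurrogateSplitDemo {
    // Splits a supplementary code point (U+10000..U+10FFFF) into {high, low} surrogates.
    public static char[] toSurrogates(int codePoint) {
        if (!Character.isSupplementaryCodePoint(codePoint)) {
            throw new IllegalArgumentException("Not a supplementary code point");
        }
        int offset = codePoint - 0x10000;                // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));    // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));   // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = toSurrogates(0x20B9F); // "𠮟"
        System.out.printf("high=%04X low=%04X%n", (int) pair[0], (int) pair[1]);
    }
}
```

For "𠮟" (U+20B9F) this yields the high surrogate D842 and the low surrogate DF9F, the two units that are counted separately.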

Scenario:

For example, a Salesforce admin builds a formula to count the characters in a text field that contains extended Japanese kanji and notices that LEN() returns a higher count than the number of characters displayed.

Resolution

To correctly manipulate surrogate pair characters in formulas, you must treat them as two characters.

For example, to retrieve the surrogate pair character "𠮟", use the following syntax: LEFT("𠮟", 2)
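When the text is known in advance, the length argument can be worked out by counting two units for every surrogate pair. The Java sketch below shows that calculation as an analogy; the helper names are hypothetical, and it assumes the formula engine counts the same way Java counts UTF-16 code units.

```java
public class LeftLengthDemo {
    // Number of UTF-16 code units needed to cover the first n visible characters,
    // i.e. the length you would pass to LEFT() under the assumption stated above.
    public static int unitsForLeading(String s, int n) {
        int available = s.codePointCount(0, s.length());
        int wanted = Math.max(0, Math.min(n, available)); // clamp to the valid range
        return s.offsetByCodePoints(0, wanted);
    }

    // Returns the first n visible characters without splitting a surrogate pair.
    public static String leftByCodePoints(String s, int n) {
        return s.substring(0, unitsForLeading(s, n));
    }

    public static void main(String[] args) {
        String kanji = new String(Character.toChars(0x20B9F)); // "𠮟"
        String text = kanji + "ab";
        System.out.println(unitsForLeading(text, 1)); // units covering the kanji
        System.out.println(unitsForLeading(text, 2)); // units covering the kanji plus "a"
    }
}
```

Here the first visible character needs a length of 2, matching LEFT("𠮟", 2) above, and each following ASCII character adds 1.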

Additional Resources

This behavior is also observed in the String class. Please refer to the following article for more details.

See also:

Knowledge Article Number

005166849

 