
Surrogate Pair Characters Are Treated as Two Characters

Publish Date: Sep 10, 2025
Description

Issue

Surrogate pair characters (e.g., "𠮟", "𠀋") display as 1 character, but formula string functions (LEFT, LEN, MID, RIGHT, etc.) treat each of them as 2 characters internally.

  • LEN("𠮟") returns 2, not 1.

  • LEFT("𠮟", 1) extracts only the high surrogate, which is not a valid character on its own, so the result displays as "?".
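The counts above are what any system that measures strings in UTF-16 code units would report. As a rough illustration outside Salesforce, the following Python sketch compares the two ways of counting. The helper name utf16_length is hypothetical, and the idea that the formula engine counts UTF-16 code units is an assumption inferred from the behavior described in this article.

```python
def utf16_length(text: str) -> int:
    """Return the number of UTF-16 code units needed to encode text."""
    # Each UTF-16 code unit is 2 bytes; "utf-16-le" adds no byte order mark.
    return len(text.encode("utf-16-le")) // 2


# "\U00020B9F" is the character "𠮟" (U+20B9F), written as an escape
# so the example does not depend on the source file encoding.
kanji = "\U00020B9F"

code_points = len(kanji)          # Python counts code points
code_units = utf16_length(kanji)  # a UTF-16 based engine counts code units
print(code_points, code_units)
```

Characters in the Basic Multilingual Plane, such as "a" or "語", occupy one code unit each, so the two counts differ only when surrogate pair characters are present.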

Cause

A surrogate pair character is encoded as two 16-bit values: a "high surrogate" followed by a "low surrogate." The Salesforce formula engine counts each value separately, so one displayed character is counted as two.
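To make the two values concrete, here is a minimal Python sketch of the standard UTF-16 rule for splitting a code point above U+FFFF into its high and low surrogates. The function name to_surrogate_pair is hypothetical and is not part of any Salesforce API.

```python
def to_surrogate_pair(code_point: int) -> tuple[int, int]:
    """Split a supplementary code point into (high, low) UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    offset = code_point - 0x10000        # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


# "𠮟" is U+20B9F and "𠀋" is U+2000B.
for cp in (0x20B9F, 0x2000B):
    high, low = to_surrogate_pair(cp)
    print(f"U+{cp:05X} -> high {high:04X}, low {low:04X}")
```

Neither half is a valid character by itself, which is why taking only one of them yields an unreadable result.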

Scenario:

A Salesforce admin builds a formula to count the characters in a text field that contains extended Japanese kanji and notices that LEN() returns a higher count than the number of characters displayed.

Resolution

To manipulate surrogate pair characters correctly in formulas, count each one as two characters when you specify lengths and positions.

For example, to retrieve the surrogate pair character "𠮟", use the following syntax: LEFT("𠮟", 2)
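When a string mixes ordinary and surrogate pair characters, the length to pass to LEFT depends on how many surrogate pairs fall within the portion you want. The Python sketch below is a hypothetical helper for working this out outside Salesforce, assuming the formula engine counts UTF-16 code units as described above. It counts code points, not grapheme clusters, so combining sequences are not handled.

```python
def units_for_first_n_chars(text: str, n: int) -> int:
    """Return the length value needed to cover the first n characters of text
    when each surrogate pair character counts as two."""
    # Code points above U+FFFF need a surrogate pair (2 units); others need 1.
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text[:n])


sample = "a\U00020B9Fb"  # "a", then "𠮟" (U+20B9F), then "b"
print(units_for_first_n_chars(sample, 2))  # length that covers "a" and "𠮟"
```

For the sample string, covering the first two displayed characters takes a length of 3: one for "a" and two for "𠮟".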

Additional Resources

This behavior is also observed in the String class. Please refer to the following article for more details.

See also:

Knowledge Article Number

005166849

 