Although surrogate pair characters (e.g., "𠮟", "𠀋") display as a single character, formula string functions (LEFT, LEN, MID, RIGHT, etc.) treat each of them as 2 characters internally.
LEN("𠮟") returns 2, not 1.
LEFT("𠮟", 1) extracts only the high surrogate, which displays as "?" because half of a surrogate pair cannot be interpreted as a character on its own.
Surrogate pair characters consist of a combination of two values: a "high surrogate" and a "low surrogate." Consequently, the Salesforce formula engine counts these as two separate characters.
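The two values can be inspected outside of Salesforce. The following Python sketch is only an illustration of UTF-16 encoding, not of how the formula engine is implemented; the helper names utf16_units, is_high_surrogate, and is_low_surrogate are hypothetical and chosen for this example.

```python
def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    # Each UTF-16 code unit occupies exactly two bytes.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def is_high_surrogate(unit):
    """True if the code unit is in the high surrogate range."""
    return 0xD800 <= unit <= 0xDBFF


def is_low_surrogate(unit):
    """True if the code unit is in the low surrogate range."""
    return 0xDC00 <= unit <= 0xDFFF


units = utf16_units("𠮟")
print(len(units))                # 2, the count a UTF-16 based LEN reports
print([hex(u) for u in units])   # ['0xd842', '0xdf9f']
print(is_high_surrogate(units[0]), is_low_surrogate(units[1]))  # True True
```

Python itself counts code points, so len("𠮟") is 1 there; the sketch encodes to UTF-16 to reproduce the count of 2 described above.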
Scenario:
For example, a Salesforce admin building a formula to count characters in a text field containing Japanese kanji extended characters notices that LEN() returns an unexpected count.
To correctly manipulate surrogate pair characters in formulas, you must count each one as two characters when specifying lengths and positions.
For example, to retrieve the surrogate pair character "𠮟", use the following syntax: LEFT("𠮟", 2)
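When a value mixes ordinary and surrogate pair characters, the length to pass to LEFT depends on how many surrogate pairs fall inside the part you want. As a rough sketch for checking such values outside of Salesforce, the Python helper below (formula_length is a hypothetical name, not a Salesforce function) returns the number of UTF-16 units covered by the first N displayed characters. It assumes each displayed character is a single code point, so it does not account for combining sequences.

```python
def formula_length(text, visible_chars):
    """Number of UTF-16 code units spanned by the first visible_chars code points."""
    prefix = text[:visible_chars]                # Python slices by code point
    return len(prefix.encode("utf-16-le")) // 2  # two bytes per UTF-16 code unit


print(formula_length("𠮟", 1))    # 2, matching LEFT("𠮟", 2)
print(formula_length("𠮟る", 2))  # 3: one surrogate pair plus one ordinary character
print(formula_length("abc", 2))   # 2: no surrogate pairs involved
```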
This behavior is also observed in the String class. Please refer to the following article for more details.
See also:
005166849
