Unicode-printable characters are rejected

If you’ve applied the unicode-printable data constraint to a user profile attribute, you might run into the following issue when using the /entity.create operation to create a user account (assuming, of course, that your API call sets the affected attribute). For example, this curl command tries to set the givenName attribute to 🤎️:

curl -L -X POST \
  'https://se-demos-gstemp.us-dev.janraincapture.com/entity.create' \
  -H 'Authorization: Basic eTR4Zmc2ZjQ0bXNhYzN2ZXBqanZ4Z2d6dnQzZTNzazk6OTVjY3hrN2N6YnZ1eng2ZHB0ZTVrOXA2ZGo1Ynpla3U=' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'type_name=user' \
  --data-urlencode 'attributes={"givenName":"🤎️","familyName":"Nafir","displayName":"Karim Nafir","email":"karim.nafir@mail.com"}'

Although 🤎️ is a valid, printable Unicode character (Unicode character U+1F90E), the preceding command fails with the following error:

{
    "attribute_name": "/givenName",
    "constraint_name": "unicode-printable",
    "error": "constraint_violation",
    "request_id": " 17665",
    "error_description": "the value provided for /givenName violates the unicode-printable constraint",
    "stat": "error",
    "code": 360
}
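If you call /entity.create from a script, you can recognize this failure from the fields in the error response above. The following is a minimal Python sketch rather than an official client: the helper name describe_constraint_violation is hypothetical, and it assumes the response body has the JSON shape shown above.

```python
import json


# Hypothetical helper (not part of any SDK): summarizes a constraint_violation
# response, assuming the JSON shape shown in the error above.
def describe_constraint_violation(response_text):
    body = json.loads(response_text)
    if body.get("stat") == "error" and body.get("error") == "constraint_violation":
        return "{} violates the {} constraint".format(
            body.get("attribute_name"), body.get("constraint_name")
        )
    return None  # Some other response; nothing to report here


# Sample input copied from the fields of the error response above.
sample = """
{
    "attribute_name": "/givenName",
    "constraint_name": "unicode-printable",
    "error": "constraint_violation",
    "stat": "error",
    "code": 360
}
"""
print(describe_constraint_violation(sample))
```
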

The constraint_violation error typically occurs when you try to assign a value containing a character that can’t be printed to the screen via HTML to an attribute where the unicode-printable constraint has been applied. We say “typically” because, in the preceding example, we tried to set the givenName attribute to the brown heart emoji, a Unicode character that can be printed onscreen (as you can see for yourself). And yet the API call still returned a constraint_violation error. Why?

The answer is simple, albeit a little frustrating: as it turns out, the /entity.create operation doesn’t accept every Unicode character that can be printed onscreen by using HTML. Instead, it accepts only HTML-compatible Unicode characters available in Unicode release 7.0 or earlier. For example, take the man lifting weights emoji (Unicode character U+1F3CB):

[Image: the man lifting weights emoji]

This emoji was released as part of the Unicode 7.0 package and it can be used as a given name even when the unicode-printable constraint is set on the givenName attribute. If we run our API call using this emoji, that API call succeeds and we end up with a new user profile that, in Console, looks like this:

[Image: the new user profile as displayed in Console]

But the brown heart emoji? Because that character debuted in the Unicode 12.0 release (i.e., after Unicode 7.0), it can’t be used with any user profile attribute where the unicode-printable constraint has been applied: it’s simply too recent.


Note: Admittedly, there might be an occasional exception. But, as a general rule, Unicode characters released after Unicode 7.0 can’t be used if the unicode-printable constraint has been applied.
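When a value is rejected, it can help to list exactly which code points it contains, so that you can look up the Unicode release that introduced each one (for example, in the Unicode Character Database). A minimal Python sketch follows. The code_points helper is hypothetical and, as far as we know, Python’s standard unicodedata module doesn’t report the release in which a character was introduced, so that lookup remains a manual step.

```python
import unicodedata


# Hypothetical helper: returns (code point, name) pairs for each character.
# Names come from the Unicode database bundled with your Python version, so
# very recent characters may have no name available.
def code_points(value):
    return [
        ("U+{:04X}".format(ord(ch)), unicodedata.name(ch, "<name unavailable>"))
        for ch in value
    ]


# U+1F90E is the brown heart emoji from the example above.
for code_point, name in code_points("\U0001F90E"):
    print(code_point, name)
```

Listing code points this way also exposes invisible characters, such as variation selectors, that can accompany an emoji when it is pasted from another application.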


There’s no doubt that this can be a problem if your site allows the use of emojis or other Unicode characters (such as the characters used in many non-English languages). Keep in mind, however, that it’s only a problem if you’ve applied the unicode-printable constraint to an attribute. If you haven’t applied that constraint, then you can use an emoji such as brown heart as an attribute value:

[Image: a user profile with the brown heart emoji as an attribute value]

And this limitation doesn’t mean that you can’t use the latest Unicode characters when calling the entity operations: you can. To do that, however, you must remove the unicode-printable constraint from any user profile attribute that needs to accept those characters. As a quick example, this curl command removes all the data constraints from the givenName attribute (including the unicode-printable constraint), which it does by setting the constraints parameter to an empty array:

curl -L -X POST \
  'https://documentation.us-dev.janraincapture.com/entityType.setAttributeConstraints' \
  -H 'Authorization: Basic oY6aZmc2ZjQ0brt4YzN2ZXBqanZ4Z2d6dnQzZTNzazk6OTVjY3hrN2N6Ylp91eng2ZHB0ZTVrOXA2ZGo1YnVFraU=' \
  -H 'Content-Type: application/x-www-form-urlencoded' \
  --data-urlencode 'type_name=user' \
  --data-urlencode 'attribute_name=givenName' \
  --data-urlencode 'constraints=[]'
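If the attribute carries other constraints that you want to keep, an empty array removes more than you need. The Python sketch below filters unicode-printable out of a list of constraint names and builds the form body for the same call. Treat it as an illustration under stated assumptions: the helper names are hypothetical, we assume the constraints parameter accepts a JSON array of constraint names, and the starting list ["required", "unicode-printable"] is made up for the example (you would substitute the attribute’s actual constraints).

```python
import json
from urllib.parse import urlencode


# Hypothetical helper: drop one constraint name and keep the rest.
def constraints_without(current, to_remove="unicode-printable"):
    return [name for name in current if name != to_remove]


# Hypothetical helper: builds the form-encoded body for the call shown above.
# Assumption: "constraints" is sent as a JSON array of constraint names.
def build_form_body(type_name, attribute_name, constraints):
    return urlencode(
        {
            "type_name": type_name,
            "attribute_name": attribute_name,
            "constraints": json.dumps(constraints),
        }
    )


# Illustrative starting list; not taken from a real schema.
remaining = constraints_without(["required", "unicode-printable"])
print(build_form_body("user", "givenName", remaining))
```
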


After we’ve removed the constraint, we can set the attribute value to a Unicode character released after Unicode 7.0. And if we restore the constraint? Well ...

We don’t claim that this is a perfect solution: theoretically, a user could now set their given name to a character that can’t be printed in HTML (e.g., a tab or carriage return). But if you need to allow the latest Unicode characters, it’s the only solution we have at the moment. And, of course, you can remove the constraint and then use a regular expression to check for non-printable characters yourself.
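As a rough illustration of that kind of check, here is a minimal Python sketch. It is not the validation the unicode-printable constraint performs. It assumes that “non-printable” means the C0 and C1 control characters (which include tab and carriage return), and the has_non_printable helper is a hypothetical name.

```python
import re

# Assumption: "non-printable" means the C0 and C1 control characters
# (U+0000 to U+001F and U+007F to U+009F), which include tab and
# carriage return. Adjust the character class to suit your own rules.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f-\x9f]")


# Hypothetical helper: True if the value contains any control character.
def has_non_printable(value):
    return CONTROL_CHARS.search(value) is not None


print(has_non_printable("Karim"))         # plain text
print(has_non_printable("Karim\tNafir"))  # contains a tab
print(has_non_printable("\U0001F90E"))    # brown heart emoji
```

Running a check like this before calling the entity operations lets you reject tabs and carriage returns while still accepting recent emoji.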