Transform relational data

If a user has only one instance of something – one display name, one birthday, one primary phone number – then databases are easy to construct and to maintain. For example:

displayNameBirthdayprimaryAddress.phoneNumber
Maria Fuentes12/24/1963(503)-555-0712

We can continue to add more and more attributes to this simple “flat file” database (a database that stores all its information in a single table) without any problem: the database would grow in size, but it wouldn’t grow much in terms of management or complexity:

displayNameBirthdayprimaryAddress.phoneNumbergender
Maria Fuentes12/24/1963(503)-555-0712female

But that’s true only if we add attributes that store a single value; it’s a different story if we need to add attributes that can store multiple values. For example, suppose we’re a media company, and we publish several different magazines. If each person can only subscribe to one magazine then data storage is easy:

displayNamemagazinesubscriptionType
Maria Fuentes​Akamai​ SportsPremium
Bob Jones​Akamai​ HealthPremium

But what happens if people are allowed to subscribe to multiple magazines? We won’t bother going into all the details here, but suffice to say that allowing people to have more than one thing (more than one magazine subscription, more than one phone number, more than one email address) wreaks havoc with the traditional flat file database. That’s why, back in 1970, Edgar F. Codd came up with the notion of a “relational database,” a database that uses related tables to store multiple pieces of information. For example, our publishing company database might have two tables, one for subscribers and one for subscriptions:

iddisplayName
000001Maria Fuentes
000002Bob Jones
idmagazinesubscriptionType
000001​Akamai​ SportsPremium
000001​Akamai​ HealthPremium
000002​Akamai​ HealthIntroductory
000001​Akamai​ BusinessBusiness

Although highly-simplified, the preceding examples gives you a rough idea of how relational databases work. Each user is assigned a unique ID; for example, Maria Fuentes’ unique ID is 000001. In turn, each magazine subscription is associated with the ID of the subscriber. If you look in the second table, you’ll see the ID 000001 listed three times. That’s because Maria subscribes to three different magazines.

So why are we telling you the history of flat file and relational databases? Well, for one thing, you might have relational data (e.g., magazine subscriptions) that you need to migrate to the ​Akamai​ user profile store. That’s fine, except for one problem: ​Akamai​'s user profile store (and ​Akamai​'s entity types) aren’t set up as relational databases. How are you supposed to copy relational database data into a non-relational database?

The answer lies with plurals, a special type of attribute that acts as a sort of relational table within a user profile. A plural is distinguished from a “regular” user profile attribute in two ways: 1) a plural attribute can contain any number of “child” attributes; and, 2) a plural attribute can contain any number of values. For example, in the Console, the schema for the statuses attribute looks like this:

As you can see, the statuses attribute is composed of several child attributes, including statuses.id; statuses.status; and statuses.statusCreated. And before you ask, yes, the full name of a child attribute does include the parent attribute name (statuses), a dot (.), and the child attribute name (id).

If you look at an actual user profile stored within the ​Akamai​ Identity Cloud you’ll see that any one user can have more than one status:

So how do you handle a situation where, say, a user has subscriptions to more than one magazine? Well, perhaps the best way to handle that is to create a plural attribute (such as subscriptions) and then populate that plural with the appropriate child attributes (e.g., magazineName and subscriptionType). The schema for that plural might look like this:



📘

The id attribute is generated by the system itself, and provides a way for you to keep track of individual subscriptions. In the preceding example, we only added the magazineName and subscriptionType attributes.


If you’d like more information on adding a plural attribute to a schema, see the entityType.addAttribute API endpoint documentation. For now, we’ll simply point out that code for creating the subscriptions plural described above looked similar to this:

curl -X POST \
    -H "Authorization: Basic aW1fYV...NfbXk="\
    --data-urlencode type_name=user \
    --data-urlencode attr_def='{"name":"subscriptions","type":"plural",
     "attr\_defs":[{"name":"magazineName","type":"string","length":256},
     {"name":"subscriptionType","type":"string","length": 256}]}'\
     https://educationcenter.janraincapture.com/entityType.addAttribute

To migrate data to your new plural attribute you’ll need to do several things. First, your data must be written to the data migration datafile as a JSON (JavaScript Object Notation) object. For example, the following JSON object specifies two magazine subscriptions (Akamai Business and Akamai Sports) for a user:

"[{"magazineName": "Akamai Business", "subscriptionType": "Premium"},{"magazineName": "Akamai Sports", "subscriptionType": "Premium"}]"

Admittedly, the preceding might not look like a JSON object; that’s because JSON objects are more typically displayed like this:

[    
  {    
     "magazineName":"Akamai Business",  
    "subscriptionType":"Premium"  
  },  
  {    
     "magazineName":"Akamai Sports",  
     "subscriptionType":"Premium"  
  }  
]

That’s easier to read, but you can’t use data like that in a datafile. To properly format the JSON object for use in the datafile, you must collapse all the data into a single line, something you do by removing the line breaks and any extraneous blank spaces:

[{"magazineName": "Akamai Business", "subscriptionType": "Premium"},{"magazineName": "Akamai Sports", "subscriptionType": "Premium"}]

When that’s done, you must then “double up” all the double quote marks in the value:

[{""magazineName"": ""Akamai Business"",""subscriptionType"": ""Premium""},{""magazineName"": ""Akamai Sports"",""subscriptionType"": ""Premium""}]

Finally, you have to enclose the entire JSON object in double quotes (but not double double quotes):

"[{""magazineName"": ""AkamaiBusiness"", ""subscriptionType"": ""Premium""},{""magazineName"": ""AkamaiSports"", ""subscriptionType": "Premium""}]"

That takes care of the data itself. Meanwhile, in the datafile header line, make sure you specify only the plural attribute name; in this case, use subscriptions as opposed to, say, subscriptions.magazineName:

email,password,givenName,familyName,birthday,subscriptions

In the end, a very simple datafile (one containing only a single user record) will look similar to this:

email,password,givenName,familyName,birthday,subscriptions  
karim.nafir@mail.com,$4$Bi34T/abBIA/kaq954jKtNlIShgm011/,Karim,Nafir,07/12/1981,"[{""magazineName"": ""Akamai Business"", ""subscriptionType"": ""Premium""},{""magazineName"": ""Akamai Sports"", ""subscriptionType": "Premium""}]"

And one other thing: after the datafile has been created you must update dataload.py to ensure that the script applies the transform_plural data transformation to the attribute. For example:

reader.add_transformation("subscriptions", transform_plural)

This latter step is explained in more detail in the Create data transformations section of this documentation.