Query or import large data sets

User data can be managed using the /entity API:

Operation            Description
/entity              Retrieves a single user record.
/entity.find         Retrieves a set of user records, determined by the filter applied.
/entity.create       Creates a single user record.
/entity.bulkCreate   Creates multiple user records in a single API call.
/entity.update       Updates only the specified attributes for an existing user record.
/entity.replace      Replaces all attributes for an existing user record; any attributes not specified are replaced with null values.

Querying large data sets

When you use the entity.find operation to iterate over large sets of data (more than 100,000 records), optimize your queries by sorting on the id attribute, which follows the natural sort order of the database. This has two benefits:

  • Records created between the time iteration begins and the time iteration ends are included in the results.

  • Querying and loading perform efficiently and consistently for each page of results.

The following tips will help you optimize your queries:

  • Use the attributes member to limit the number of attributes returned for each record to minimize the size of the HTTP payload.

  • Experiment with the max_results member to optimize for responses under 10 seconds.

  • Include the timeout member (up to 60 seconds) if, and only if, you are unable to keep responses under 10 seconds using the max_results member.
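The tips above can be combined when building the request body. The following is a minimal sketch rather than an official client: the helper name build_find_payload and its default values are hypothetical, and it assumes the timeout member is sent as a number of seconds alongside the other form fields.

```python
import json


def build_find_payload(last_id, attributes, max_results=100, timeout=None):
    """Build the form fields for one entity.find request (hypothetical helper)."""
    payload = {
        'type_name': 'user',
        # Tune this value so that responses stay under 10 seconds.
        'max_results': str(max_results),
        # Return only the attributes you need to keep the HTTP payload small.
        'attributes': json.dumps(attributes),
        'sort_on': '["id"]',
        'filter': 'id > {}'.format(last_id),
    }
    # Add timeout only when tuning max_results alone is not enough.
    if timeout is not None:
        if not 0 < timeout <= 60:
            raise ValueError('timeout must be greater than 0 and at most 60 seconds')
        payload['timeout'] = str(timeout)
    return payload
```

Leaving timeout unset by default mirrors the guidance above: start by adjusting max_results, and pass timeout only if responses still exceed 10 seconds.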

The sample code below (written in Python) shows how to iterate over every record updated since January 1, 2016. Only the id, uuid, and email attributes are returned in the result set, and up to 100 records are returned with each request.

import requests
import json

last_id = 0
while True:
    response = requests.post(
        'https://YOUR_APP.janraincapture.com/entity.find',
        headers={
            'Authorization': 'Basic aW1fYV...NfbXk='
        },
        data={
            'type_name': 'user',
            'max_results': '100',
            'attributes': '["id", "uuid", "email"]',
            'sort_on': '["id"]',
            'filter': "id > {} and lastUpdated >= '2016-01-01'".format(last_id),
        }
    )
    json_resp = json.loads(response.text)
    if json_resp['stat'] == 'ok' and json_resp.get('result_count', 0) > 0:
        for record in json_resp['results']:
            # do something with record
            print(record)
            # update last_id variable with last record in the results
            last_id = record['id']
    else:
        # stop iterating when there are no more results
        break

Bulk data imports

If you need to import user records from an existing data store into the Identity Cloud platform, the /entity.bulkCreate API operation can be used for bulk loading data.

Akamai Professional Services can provide an example script that uses this API to perform data migrations. Always alert Akamai of the date and time you plan to run any bulk data event by submitting a Traffic Event request through the Support Portal.

Note that the entity.bulkCreate operation limits you to a request body no larger than 5 MB. If you encounter a "client intended to send too large body" error, you'll need to reduce the size of the request body (for example, by dividing the list of accounts to be created in half, and then making two API calls).
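The halving approach described above can be sketched as a small helper that splits a list of records before any request is sent. This is an illustrative sketch under stated assumptions, not the Professional Services script: the names body_size and split_batches are hypothetical, the records are assumed to be sent as a JSON array in a form field named all_attributes, and "5 MB" is assumed to mean 5 * 1024 * 1024 bytes.

```python
import json
from urllib.parse import urlencode

# Assumption: the 5 MB limit is 5 * 1024 * 1024 bytes.
MAX_BODY_BYTES = 5 * 1024 * 1024


def body_size(records, type_name='user'):
    """Return the size in bytes of the form-encoded body for one call."""
    # Assumption: records travel as a JSON array in 'all_attributes'.
    body = urlencode({
        'type_name': type_name,
        'all_attributes': json.dumps(records),
    })
    return len(body.encode('utf-8'))


def split_batches(records, limit=MAX_BODY_BYTES):
    """Halve the record list recursively until every batch fits the limit."""
    if not records:
        return []
    if body_size(records) <= limit:
        return [records]
    if len(records) == 1:
        raise ValueError('a single record exceeds the request size limit')
    middle = len(records) // 2
    return (split_batches(records[:middle], limit)
            + split_batches(records[middle:], limit))
```

Each batch returned by split_batches can then be sent in its own entity.bulkCreate call. Measuring the form-encoded size, rather than the raw JSON size, accounts for the extra bytes that URL encoding adds.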