Downloading a failed events bulk export file

📘

Webhooks v3 bulk export of failed events is a new Identity Cloud feature currently in limited availability. Because the feature isn’t in general release, both the product and this documentation are subject to change at any time.

Starting a job doesn’t actually deliver failed events to you or to your listener endpoint. Instead, each time you start a new job, Webhooks v3 copies the events in your failed events table to a file and then writes that file to an Amazon S3 bucket. To actually get your hands on these events, you’ll need to download that file by using the /download endpoint.

📘

Yes, you need to use the Webhooks v3 APIs to download an export file. For security reasons, and to ensure that you, and only you, have access to your webhook events, Akamai doesn’t allow direct access to the S3 bucket.

As you’ll see in a moment, calling the /download endpoint is remarkably easy. Still, there are a few things to keep in mind before you start using this endpoint:

  • A download file remains available for 7 days. After those 7 days are up, the file is automatically deleted from the S3 bucket and can’t be restored.

  • The same file can be downloaded multiple times, but Identity Cloud doesn’t keep track of which files have, and which haven’t, been downloaded. That can potentially cause problems. For example, suppose you have a file containing events A, B, and C. If you download that file multiple times, and upload the contents to your events database each time, your database will contain multiple instances of events A, B, and C.

  • There’s no API endpoint for listing the files available for downloading. If you aren’t sure which files are in the S3 bucket, use the /exports endpoint to determine the job ID of each job run in the past 7 days. Those job IDs correspond to the download file IDs.

  • Sometimes you might try to download a file and, instead of a download, you get back a 404 Not Found error. This is typically because you specified an invalid job ID. However, this error can also occur if the file in question has been deleted (remember, download files are automatically deleted after 7 days) or if the file hasn’t been written to the S3 bucket yet. To find out which of these applies, use the /exports/{jobId} endpoint and check the status of the job. If the status is active, the job is still processing and the file isn’t ready for downloading. If the status is canceled, the job was canceled and there won’t ever be a file you can download.
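The 404 troubleshooting steps in the last bullet can be sketched as a small decision helper. This is only an illustration in Python, not part of the Webhooks v3 API: the function name next_step_for_status and its return labels are hypothetical, and only the active and canceled status values come from the text above. Treating any other status as "the file was written at some point" is an assumption.

```python
def next_step_for_status(status: str) -> str:
    """Interpret a 404 from /download, given the job status from /exports/{jobId}.

    Only "active" and "canceled" are statuses described in this article;
    the fallback branch for any other value is an assumption.
    """
    if status == "active":
        # The job is still processing, so the file isn't in the S3 bucket yet.
        return "not ready"
    if status == "canceled":
        # The job was canceled, so no file will ever be written.
        return "never available"
    # Assumption: for any other status, a 404 points to a wrong job ID
    # or to a file that is past its 7-day retention period.
    return "check job ID or expiry"


print(next_step_for_status("active"))
```

How you obtain the status string depends on the response returned by /exports/{jobId}, which isn’t shown here, so the helper takes the status as a plain argument.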

To download a file, use the /download operation, being sure to reference the job ID of the export:

curl -L -X GET \
  'https://v1.api.us.janrain.com/e0a70b4f-1eef-4856-bcdb-f050fee66aae/webhooks/subscriptions/e26925c7-ca17-4b82-8530-e43158c2d63a/events/exports/a22c9604-7b27-464f-bff5-83ba229323af/download' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer Yh1cEVzUOy0lZNaBxcGbQRQ6qFaRvW0UBqTD81p76ymSNN73P6eulMZFX2-MkN1h'

When you start a download, you’ll immediately get back the ETag value associated with the download file. For example:

"\"33a64df551425fcc55e4d42a148795d9f25f89d4\""

That’s the only feedback you’ll get. However, the download process is quick: in most cases, it should take less than a minute to download the file.

We should also mention that download files are plain-text files in which each line is a JSON object containing the payload for a single webhook event (see The NDJSON file format for more information). For example, a single line in the event download file might look like this:

{"iss":"https://v1.api.us.janrain.com/00000000-0000-0000-0000-000000000000/webhooks","iat":1568312175,"jti":"519a08c4-3016-4cad-9b10-feb49fcd133a","aud":["https://example.com/path/to/endpoint"],"txn":"00000000-0000-0000-0000-000000000000","toe":1559372000,"events":{"entityCreated":{"captureApplicationId":"aaaaaaaaaaaaaaaaaaaaaaaaaa","captureClientId":"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa","entityType":"user","globalSub":"capture-v1://us.janraincapture.com/aaaaaaaaaaaaaaaaaaaaaaaaaa/user/00000000-0000-0000-0000-000000000000","sub":"00000000-0000-0000-0000-000000000000","id":"00000000-0000-0000-0000-000000000000"}}}

That single line looks like this when reformatted using standard JSON:

{
	"iss": "https://v1.api.us.janrain.com/00000000-0000-0000-0000-000000000000/webhooks",
	"iat": 1568312175,
	"jti": "519a08c4-3016-4cad-9b10-feb49fcd133a",
	"aud": ["https://example.com/path/to/endpoint"],
	"txn": "00000000-0000-0000-0000-000000000000",
	"toe": 1559372000,
	"events": {
		"entityCreated": {
			"captureApplicationId": "aaaaaaaaaaaaaaaaaaaaaaaaaa",
			"captureClientId": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
			"entityType": "user",
			"globalSub": "capture-v1://us.janraincapture.com/aaaaaaaaaaaaaaaaaaaaaaaaaa/user/00000000-0000-0000-0000-000000000000",
			"sub": "00000000-0000-0000-0000-000000000000",
			"id": "00000000-0000-0000-0000-000000000000"
		}
	}
}
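Because each line of the download file is a complete JSON object, the file can be read one line at a time. The following is a minimal sketch in Python that uses only the standard json module; the function name iter_events and the shortened sample line are illustrative and aren’t taken from the Webhooks v3 documentation.

```python
import json
from typing import Iterable, Iterator


def iter_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed event per non-empty line of an NDJSON export."""
    for line_number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines, such as a trailing newline
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number} is not valid JSON: {exc}") from exc


# A shortened version of the sample line shown above.
sample = (
    '{"jti":"519a08c4-3016-4cad-9b10-feb49fcd133a","iat":1568312175,'
    '"events":{"entityCreated":{"entityType":"user"}}}\n'
)

for event in iter_events([sample]):
    # Pretty-print each event the same way the reformatted example does.
    print(json.dumps(event, indent=2))
```

To process a downloaded file, pass an open file object instead of a list, for example iter_events(open("failed-events.ndjson", encoding="utf-8")), where the file name is whatever you chose when saving the download.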

As noted, only the event payload is exported. Additional information, such as the security event token headers and signature, isn’t exported.
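Because the same file can be downloaded more than once, an import routine might want to skip events it has already stored. The following Python sketch shows one way to do that, assuming the jti value in each payload uniquely identifies an event; that assumption, the function name new_events, and the in-memory set standing in for your events database are all illustrative rather than documented Identity Cloud behavior.

```python
import json
from typing import Iterable, List, Set


def new_events(lines: Iterable[str], seen_ids: Set[str]) -> List[dict]:
    """Return only the events whose jti hasn't been seen, and record them."""
    fresh = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        event_id = event["jti"]  # assumption: jti is unique per event
        if event_id in seen_ids:
            continue  # already imported from an earlier download
        seen_ids.add(event_id)
        fresh.append(event)
    return fresh


# Simulate importing the same two-event file twice.
export_lines = ['{"jti":"event-a"}\n', '{"jti":"event-b"}\n']
seen: Set[str] = set()
first_pass = new_events(export_lines, seen)
second_pass = new_events(export_lines, seen)
print(len(first_pass), len(second_pass))
```

In a real import, the set would typically be replaced by a uniqueness check in the events database itself, so that the record of what has been imported survives between runs.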

See also