Schedule redelivery for a failed webhook event

Redeliver a failed event


Webhooks v3 was designed to guard against single-point-of-failure issues that could arise when delivering webhook notifications.

If a delivery fails in Webhooks v3 then:

  • Webhooks v3 waits 3 seconds and then tries again.
  • If the second delivery attempt fails, Webhooks v3 waits 30 seconds and then tries again.
  • If the third delivery attempt fails, Webhooks v3 waits 5 minutes and then tries again.
  • If the fourth delivery attempt fails, Webhooks v3 waits 1 hour and then tries again.
  • If the fifth delivery attempt fails, Webhooks v3 waits 24 hours and then tries again.

If delivery still fails 24 hours after the first attempt, then Webhooks v3 stops retrying and assigns the notification the failure state. Failed events are kept in the Webhooks event store for the next 7 days. After those 7 days have elapsed, failed events are automatically and permanently deleted from the event store.
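As a rough illustration, the retry waits listed above can be written down as a lookup table. This is only a sketch of the documented schedule, not the service's actual implementation; the names RETRY_DELAYS and delay_after_failure are made up for this example.

```python
from datetime import timedelta

# Wait applied after each failed attempt, as listed above.
# Index 0 is the wait after the first failed attempt.
RETRY_DELAYS = [
    timedelta(seconds=3),
    timedelta(seconds=30),
    timedelta(minutes=5),
    timedelta(hours=1),
    timedelta(hours=24),
]


def delay_after_failure(attempt):
    """Return the wait before the next try, or None once the list is exhausted.

    `attempt` is the 1-based number of the delivery attempt that just failed.
    (Hypothetical helper; not part of the Webhooks API.)
    """
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    if attempt > len(RETRY_DELAYS):
        return None
    return RETRY_DELAYS[attempt - 1]
```

For instance, delay_after_failure(3) gives the 5-minute wait that follows a failed third attempt.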

The /redeliver endpoint enables you to restart the delivery process for an event that has been marked as failed and is still in the event store. In other words, you have the ability to recover failed events within 7 days of failure.


Cautions about, and limitations of, the event redelivery service

  • The /redeliver endpoint reschedules event deliveries on an event-by-event basis. For example, suppose events A, B, C, and D have all failed. To retrieve those events, you’ll need to make an API call that redelivers event A; make a second API call that redelivers event B; make a third API call that redelivers event C; and so on.

  • The /redeliver endpoint is rate limited to 1 request per minute per Webhooks subscription. In theory, you could write a script that simply calls the /redeliver endpoint over and over again, in rapid-fire succession. However, if you exceed the rate limit your script will fail.
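To make the two limitations concrete, here is a minimal Python sketch of a loop that redelivers events one at a time and pauses between calls so that it stays within 1 request per minute. The function name redeliver_in_sequence and the redeliver_one callback are assumptions made for this illustration; the callback stands in for whatever code actually calls the /redeliver endpoint for one event.

```python
import time

# Documented limit: 1 request per minute per Webhooks subscription.
MIN_INTERVAL_SECONDS = 60


def redeliver_in_sequence(event_ids, redeliver_one, sleep=time.sleep,
                          interval=MIN_INTERVAL_SECONDS):
    """Call redeliver_one(event_id) for each ID, pausing between calls.

    `redeliver_one` is a caller-supplied function (hypothetical here) that
    issues the /redeliver request for a single event and returns its result.
    `sleep` is injectable so the pacing can be checked without really waiting.
    """
    results = {}
    for index, event_id in enumerate(event_ids):
        if index > 0:
            # Wait a full interval before every call except the first.
            sleep(interval)
        results[event_id] = redeliver_one(event_id)
    return results
```

With events A, B, C, and D, a loop like this makes four separate calls and takes at least three minutes, which is why the approach doesn't scale to hundreds or thousands of events.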

The event redelivery service is designed as a way to help you retrieve the occasional event that fails to be delivered. It is not the method by which you should routinely receive your event notifications, and it does not work well in the event of a disaster that results in hundreds or thousands of failed events.

For disaster recovery scenarios, please use the bulk export feature instead.


How the Webhooks v3 redelivery service works

The Webhooks v3 redelivery service is remarkably simple. As noted previously, it consists of a single API endpoint that can be called to reschedule delivery of a single event notification. Or at least it can do that as long as:

  • The event is marked as failed. Your API call will fail if the event is in any other state (e.g., awaiting-executing). This includes events marked as success (i.e., events that were successfully delivered). Suppose Event A is delivered and, as a result, is marked as success. Suppose you then inadvertently delete Event A from your listener endpoint. In a case like that, you can’t use the /redeliver endpoint to get a “replacement” copy of Event A: there’s no way to redeliver events that have already been delivered.

  • The event is in the event store. Remember, events are automatically deleted from the event store after 7 days: once deleted, those events cannot be retrieved. Let’s assume that, for whatever reason, you were unable to receive Webhook notifications for the past 8 days. Events for the last 7 days will still be in the event store; however, events generated on day 1 (8 days ago) will have already been deleted from the store. Those events can’t be retrieved and they can’t be scheduled for redelivery.
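The two conditions above can be expressed as a simple pre-check. The sketch below is only illustrative: can_redeliver is a made-up helper, the state string "failure" is assumed from the state query parameter used with the /events endpoint, and which timestamp marks the start of the 7-day window is left to the caller.

```python
from datetime import datetime, timedelta, timezone

# Failed events are kept in the event store for 7 days.
RETENTION = timedelta(days=7)

# Assumption: the API reports failed events with this state string.
FAILED_STATE = "failure"


def can_redeliver(state, failed_at, now):
    """Return True if an event looks eligible for the /redeliver endpoint.

    `failed_at` is the time the event was marked as failed and `now` is the
    current time; both should be timezone-aware datetimes.
    """
    if state != FAILED_STATE:
        # Only failed events can be rescheduled.
        return False
    # The event must still be inside the 7-day retention window.
    return now - failed_at < RETENTION
```

A check like this only avoids calls that are bound to fail; the service itself remains the final authority on whether an event can be redelivered.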


📘

But keep reading: if necessary, there is a way to keep events in the event store indefinitely.


The first step in scheduling event redelivery is to determine the unique IDs of all the failed events currently in the event store; that’s something that can be done using the /events endpoint and the state parameter. For example, this call returns a collection of all the failed events for the Webhooks subscription a6de662c-e93b-4041-96f0-283214de75b6:

curl -X GET \
  'https://v1.api.us.janrain.com/e0a70b4f-1eef-4856-bcdb-f050fee66aae/webhooks/subscriptions/a6de662c-e93b-4041-96f0-283214de75b6/events?state=failure' \
  -H 'Authorization: Bearer Xk7EzdpGq5GPQcsxCWM2SxdlwU_iTsA4i2Px4TEzBrfLIvddjnDVBJxjPDuCARHH'
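If you'd rather script this call than run curl by hand, the same request can be assembled with Python's standard library. This is a sketch only: build_failed_events_request is a made-up helper, and the parameter name customer_id is just a label chosen here for the first identifier in the URL path.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host taken from the curl example above.
BASE_URL = "https://v1.api.us.janrain.com"


def build_failed_events_request(customer_id, subscription_id, token):
    """Build (but do not send) the GET request that lists failed events.

    `customer_id` is an assumed name for the first path segment in the
    example URL; `token` is the bearer token for the Authorization header.
    """
    query = urlencode({"state": "failure"})
    url = (
        f"{BASE_URL}/{customer_id}/webhooks/subscriptions/"
        f"{subscription_id}/events?{query}"
    )
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")
```

Passing the returned object to urllib.request.urlopen sends the request; the helper is kept separate so the URL and headers can be inspected first.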

📘

Yes, you must retrieve failed events on a subscription-by-subscription basis. If you have 4 Webhook subscriptions you’ll need to make the preceding API call against each subscription in order to return all your failed events.


For each failed event (or at least for each failed event that you’d like redelivered), your next step is to retrieve the event ID. For example:

"_embedded": [{
    "id": "1b8773c6-5f6a-4ba5-8f3b-210732476cd6",
    "createdAt": "2020-01-28T18:16:04.034726Z",
    "updatedAt": "2020-01-28T18:16:04.616963Z",
    "state": "success",
    "attempts": 1,
    "request": {
      "endpoint": "https://webhook.site/46ff3c5e-ae95-43df-b32d-d07bb84746b4",
      "headers": {
        "Accept": "*/*",
        "Content-Length": "1252",
        "Content-Type": "application/secevent+jwt",
        "Host": "webhook.site",
        "User-Agent": "Akamai Identity Cloud Webhooks/v3.0.0"
        },
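Pulling the IDs out of a response like this is easy to script. The sketch below assumes the complete response body is a JSON object whose _embedded key holds a list of events, as the fragment above suggests; event_ids is a made-up helper name.

```python
import json


def event_ids(response_text):
    """Return the "id" of every event under "_embedded" in a JSON response.

    Assumes the body is a JSON object with an "_embedded" list, matching the
    fragment shown above; returns an empty list if the key is missing.
    """
    payload = json.loads(response_text)
    events = payload.get("_embedded", [])
    # Keep the order in which the API returned the events.
    return [event["id"] for event in events]
```

Each ID in the returned list then becomes the target of its own /redeliver call.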

That event ID is then included in your call to the /redeliver endpoint:

curl -L -X POST \
  'https://v1.api.us.janrain.com/e0a70b4f-1eef-4856-bcdb-f050fee66aae/webhooks/subscriptions/a6de662c-e93b-4041-96f0-283214de75b6/events/1b8773c6-5f6a-4ba5-8f3b-210732476cd6/redeliver' \
  -H 'Authorization: Bearer ELfZB8fwZIKewDiv7iiXdef4CFMtjI5An9N1BI-BzQixRPtRmm9U6lzyPzHHmbdv'

After making your API call, repeat the process with the next event in your list.

Here are a couple of quick notes regarding the /redeliver endpoint. For one, you must use the POST method. For another, you can’t include anything in the body of the API call. As shown in the preceding example, your API call can only include the Authorization header and the endpoint URL.
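Those rules (POST method, no body, only the Authorization header) translate into a very small request. Here is a hedged Python sketch using the standard library; build_redeliver_request is a made-up helper and customer_id is an assumed name for the first identifier in the URL path.

```python
from urllib.request import Request

# Host taken from the curl example above.
BASE_URL = "https://v1.api.us.janrain.com"


def build_redeliver_request(customer_id, subscription_id, event_id, token):
    """Build (but do not send) the POST request that reschedules one event.

    `customer_id` is an assumed name for the first path segment in the
    example URL. No body is attached: data stays None.
    """
    url = (
        f"{BASE_URL}/{customer_id}/webhooks/subscriptions/"
        f"{subscription_id}/events/{event_id}/redeliver"
    )
    return Request(
        url,
        data=None,  # the endpoint takes no request body
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen and checking the response status for 202 would complete the call for a single event.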

If your API call succeeds, two things will happen. First, you’ll get back a 202 Accepted HTTP response. Second, the event state for the specified event will be changed from failed to awaiting-executing, the same state assigned to brand-new events. In fact, from that point on the formerly-failed event is treated exactly the same as a brand-new event (with the caveat that the rescheduled event is placed in a lower-priority delivery queue). That means that the regular delivery cycle will be in force: Webhooks will attempt to deliver the event and, if delivery fails, will wait 3 seconds and then make a second attempt. If that second attempt fails, Webhooks will wait 30 seconds and then make a third attempt, and so on. If, 24 hours later, the event still can’t be delivered, then the event is marked as failed and is kept in the event store for 7 days. After 7 days, the event is automatically deleted.

And what if you use the /redeliver endpoint to make a second stab at redelivering the event? In that case, the cycle repeats itself: the event is marked as awaiting-execution and the process starts all over again. There’s no limit on how many times you can run the /redeliver endpoint against a specific event. In theory, you could keep an event in the event store forever, as long as you schedule it for redelivery before the 7-day time period expires.


📘

That assumes, of course, that the event hasn’t been delivered. Once an event has been delivered and marked as a success, the event can’t be redelivered and is deleted from the event store after 7 days.


For visual learners, a somewhat-simplified version of the process looks like this: