The Fast Purge API uses rate controls to protect itself from inadvertent or malicious overuse.
The rate limits below apply at the account level. If multiple API clients within the same account call the API, they all share the same limits at any given time. The limits cap both the number of requests per second and the number of objects (URLs, CP codes, or cache tags) that can be submitted.
For the purpose of rate limiting, ARLs are counted as URLs.

This table shows the enforced rate limits for API requests in general and for each type of object. An individual API request can include many URLs, CP codes, or cache tags.

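| Limit type  | Sustained rate            | Burst            |
|-------------|---------------------------|------------------|
| API request | 50 requests per second    | 100 requests     |
| URL/ARL     | 200 URLs/ARLs per second  | 10,000 URLs/ARLs |
| CP code     | 30 CP codes per minute    | 300 CP codes     |
| Cache tag   | 500 cache tags per minute | 5,000 cache tags |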

The rate limiting algorithm used for sustained rate and burst follows a token bucket model.

There's one bucket for requests and a separate bucket for each object type (URLs/ARLs, CP codes, and cache tags). The burst value in the table represents how many tokens the bucket can hold. The buckets start full of tokens. Tokens are constantly added to the bucket based on the sustained rate. Once the bucket is full, no more tokens can be added.

When an API request is made, it contains objects of a single type: URLs/ARLs, CP codes, or cache tags. First, the API request bucket is checked for tokens. If a token is available, one token is removed from the bucket. If no token is available, the request is denied. Next, the object type bucket is checked for tokens. The object type bucket needs to contain enough tokens to cover every object in the request. If there are enough tokens, they're removed from the bucket and the request is accepted. If there aren't enough tokens in the object bucket, no tokens are removed, the API request token is returned to the API request bucket, and the entire request is denied.
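The two-step check described above can be sketched in a few lines of Python. This is only an illustrative model, not the service's actual implementation: the `TokenBucket` class, the `admit` function, and the explicit `now` timestamp are hypothetical names chosen for this example, and continuous (fractional) refill is an assumption based on the statement that tokens are constantly added.

```python
class TokenBucket:
    """Hypothetical model of one rate-limit bucket (not the real service code)."""

    def __init__(self, capacity, refill_amount, refill_period_s):
        self.capacity = capacity                # burst value from the table
        self.refill_amount = refill_amount      # tokens added per refill period
        self.refill_period_s = refill_period_s  # refill period in seconds
        self.tokens = capacity                  # buckets start full
        self.last_refill = 0.0                  # timestamp of the last refill

    def refill(self, now):
        # Add tokens for the elapsed time, never exceeding the capacity.
        elapsed = now - self.last_refill
        added = elapsed * self.refill_amount / self.refill_period_s
        self.tokens = min(self.capacity, self.tokens + added)
        self.last_refill = now

    def try_take(self, count, now):
        # Remove `count` tokens only if all of them are available.
        self.refill(now)
        if self.tokens >= count:
            self.tokens -= count
            return True
        return False

    def give_back(self, count):
        self.tokens = min(self.capacity, self.tokens + count)


def admit(request_bucket, object_bucket, object_count, now):
    """Return True if a request carrying `object_count` objects is accepted."""
    if not request_bucket.try_take(1, now):
        return False                 # no request token available
    if not object_bucket.try_take(object_count, now):
        request_bucket.give_back(1)  # return the request token, deny everything
        return False
    return True


# Buckets sized from the table: API requests and cache tags.
requests = TokenBucket(capacity=100, refill_amount=50, refill_period_s=1)
cache_tags = TokenBucket(capacity=5000, refill_amount=500, refill_period_s=60)

first = admit(requests, cache_tags, 5000, now=0.0)   # full burst is accepted
second = admit(requests, cache_tags, 500, now=0.0)   # denied, bucket is empty
third = admit(requests, cache_tags, 500, now=60.0)   # accepted after one minute
```

Passing the timestamp in explicitly, rather than reading a clock inside the bucket, keeps the sketch deterministic and easy to step through by hand.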

For example, the token bucket for cache tags holds 5,000 tokens and refills at a rate of 500 tokens per minute. You can submit a burst of 5,000 cache tags if the token bucket is full, but this completely empties the token bucket. Cache tag tokens are refilled at 500 tokens per minute, so after one minute, the bucket has 500 tokens in it and another request with up to 500 objects can be processed.

You can submit a burst of 5,000 cache tags every 10 minutes, 1,000 cache tags every 2 minutes, or 500 cache tags every minute. In all cases, the average sustained rate is still 500 cache tags per minute. If you submit a burst of 5,000 cache tags followed immediately by another request for 500 cache tags, the second request is denied because there aren't enough tokens left.
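The arithmetic behind these intervals is simply the token deficit divided by the refill rate. The helper below is a hypothetical sketch (the function name and parameters are not part of the API) that estimates how long a client would need to wait before a request of a given size could be accepted, assuming no other client in the account consumes tokens in the meantime.

```python
def seconds_until_available(available, needed, capacity,
                            refill_amount, refill_period_s):
    """Estimate the wait, in seconds, until `needed` tokens are in the bucket."""
    if needed > capacity:
        # A request larger than the burst value can never be satisfied.
        raise ValueError("request exceeds the bucket's burst capacity")
    deficit = max(0, needed - available)
    return deficit * refill_period_s / refill_amount


# Cache tag bucket right after a full burst of 5,000 tags (0 tokens left).
wait_500 = seconds_until_available(0, 500, 5000, 500, 60)    # 60.0 seconds
wait_1000 = seconds_until_available(0, 1000, 5000, 500, 60)  # 120.0 seconds
wait_5000 = seconds_until_available(0, 5000, 5000, 500, 60)  # 600.0 seconds
```

Because the limits are shared across all API clients in the account, a real client should treat this value as a lower bound rather than a guarantee.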

To purge more objects while remaining within defined limits, combine many objects into a single API request and reduce the rate of requests. A single API request may contain many objects of the same type, as long as the request body is smaller than 50,000 bytes.
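One way to follow this advice is to pack objects into as few request bodies as possible while keeping each body under the size limit. The sketch below assumes a JSON body of the form `{"objects": [...]}`; that body shape, along with the `encode_body` and `batch_objects` helper names, is an assumption made for illustration and should be checked against the API reference before use.

```python
import json

MAX_BODY_BYTES = 50_000  # request body must be smaller than this


def encode_body(objects):
    # Assumed body shape; compact separators keep the payload small.
    return json.dumps({"objects": objects}, separators=(",", ":")).encode("utf-8")


def batch_objects(objects, max_bytes=MAX_BODY_BYTES):
    """Split `objects` into lists whose encoded bodies stay under `max_bytes`."""
    overhead = len(encode_body([]))  # bytes used by the empty wrapper
    batches, current, size = [], [], overhead
    for obj in objects:
        item = len(json.dumps(obj).encode("utf-8"))
        extra = item + (1 if current else 0)  # +1 for the separating comma
        if current and size + extra >= max_bytes:
            # Current batch is full: close it and start a new one.
            batches.append(current)
            current, size = [], overhead
            extra = item
        if size + extra >= max_bytes:
            raise ValueError("a single object does not fit in one request body")
        current.append(obj)
        size += extra
    if current:
        batches.append(current)
    return batches


# Hypothetical URLs used only to exercise the helper.
urls = [f"https://www.example.com/path/{i:06d}" for i in range(3000)]
batches = batch_objects(urls)
```

Each resulting batch still consumes one request token plus one object token per entry, so the number of objects in a batch should also stay within the burst value for that object type.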

See Best practices for multiple requests to learn about the match constraints and most efficient ways to build complex request files.