Create data streams
Data streams are monitoring and analytics components that you set up independently of your properties and assign to them later.
This component-rule relationship lets you configure multiple data streams, each focused on gathering a specific type of data. When combined, data streams deliver customized metrics and health insights about your traffic across multiple properties.
What you'll do
Configure a data stream, set up the push of raw data logs to a given destination, and connect it to your properties.
Get your properties
The properties you use in DataStream integrations must have room to pick up a new data stream and be active on a network.
- You can assign up to three data streams to a single property. To see how many data streams are assigned to a property, review its `Datastream` rule.
- A single data stream can support up to 100 properties. Use the Data streams resource to get a list of its assigned properties.
Note: If needed, create a new property and activate it on a network.
- Get a list of your properties and determine which of them you'll use with your data stream.

  data "akamai_properties" "my_properties" {
    group_id    = "12345"
    contract_id = "C-0N7RAC7"
  }

  output "my_properties" {
    value = data.akamai_properties.my_properties
  }

  + my_properties = {
      + contract_id = "C-0N7RAC7"
      + group_id    = "12345"
      + id          = "grp_12345ctr_C-0N7RAC7"
      + properties  = [
          + {
              + contract_id        = "ctr_C-0N7RAC7"
              + group_id           = "grp_12345"
              + latest_version     = 8
              + note               = "Added hostname."
              + product_id         = "prd_Adaptive_Media_Delivery"
              + production_version = 8
              + property_id        = "prp_12345"
              + property_name      = "my_property1"
              + rule_format        = ""
              + staging_version    = 3
            },
          + {
              + contract_id        = "ctr_C-0N7RAC7"
              + group_id           = "grp_12345"
              + latest_version     = 3
              + note               = "File type update."
              + product_id         = "prd_Object_Delivery"
              + production_version = 3
              + property_id        = "prp_98765"
              + property_name      = "my_property2"
              + rule_format        = ""
              + staging_version    = 2
            },
        ]
    }

- Export all chosen properties using the Terraform CLI.

  The CLI export command places property configuration files in your current directory. To export them to a different location, add the `--tfworkpath <path>` flag to the command.

  When exporting more than one property, loop through the CLI command with the `--tfworkpath <path>` flag, changing the property name and export location every iteration.

  akamai terraform --edgerc <edgerc-file-location> --section <edgerc-section> export-property <property-name>

- Run the included import script to populate your Terraform state. This prevents Terraform from attempting to recreate your assets.
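Because properties must be active on a network before a data stream can log their traffic, it can help to narrow the list before you export anything. The following is a minimal sketch, not part of the provider: the `active_properties` local is a hypothetical helper, and it assumes that a property never activated on a network reports its `staging_version` and `production_version` as either null or 0.

```hcl
locals {
  # Hypothetical helper: keep only properties that report a staging or
  # production version. coalesce() turns a null version into 0.
  active_properties = [
    for p in data.akamai_properties.my_properties.properties : {
      id   = p.property_id
      name = p.property_name
    }
    if coalesce(p.staging_version, 0) > 0 || coalesce(p.production_version, 0) > 0
  ]
}

output "active_properties" {
  value = local.active_properties
}
```

The output only lists candidates. You still need to confirm that each one has room for another data stream.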
After you create a data stream, you'll update your properties' rule trees.
Create a data stream
To create a data stream, configure a log destination and choose data sets that shape what's monitored and collected about your traffic.
Provide a destination
You can use a custom HTTPS endpoint or a third-party object storage location. Supported storage locations:
- Amazon S3
- Azure Storage
- Datadog
- Elasticsearch
- Google Cloud Storage
- Loggly
- New Relic
- Oracle Cloud
- Splunk
- Sumo Logic
Create a connector block for your data destination. To name the block, take the destination's heading from the argument column as is and append `_connector` to it, for example, `gcs_connector`.
Include all required arguments for your destination.
gcs_connector {
  bucket               = "my_bucket"
  display_name         = "my_connector_name"
  path                 = "akamai/logs"
  private_key          = "-----BEGIN PRIVATE KEY-----\nprivate_key\n-----END PRIVATE KEY-----\n"
  project_id           = "my_project_id"
  service_account_name = "my_service_account_name"
}
Argument | Required | Description |
---|---|---|
azure | | |
access_key | ✔ | The account access key for authentication. |
account_name | ✔ | The Azure Storage account. |
display_name | ✔ | The connector's name. |
container_name | ✔ | The Azure Storage container name. |
path | ✔ | The path to the log storage folder. |
compress_logs | | Boolean that sets the compression of logs. |
datadog | | |
auth_token | ✔ | Your account's API key. |
display_name | ✔ | The connector's name. |
endpoint | ✔ | The storage endpoint for the logs. |
tags | | The Datadog connector tags. |
compress_logs | | Boolean that sets the compression of logs. |
service | | The Datadog service connector. |
source | | The Datadog source connector. |
elasticsearch | | |
display_name | ✔ | The connector's name. |
endpoint | ✔ | The storage endpoint for the logs. |
user_name | ✔ | The BASIC user name for authentication. |
password | ✔ | The BASIC password for authentication. |
index_name | ✔ | The index name for where to store log files. |
tls_hostname | | The hostname that verifies the server's certificate and matches the Subject Alternative Names (SANs) in the certificate. If not provided, DataStream fetches the hostname from the endpoint URL. |
ca_cert | | The certification authority (CA) certificate used to verify the origin server's certificate. If the certificate is not signed by a well-known certification authority, enter the CA certificate in PEM format for verification. |
client_cert | | The digital certificate in the PEM format you want to use to authenticate requests to your destination. If you want to use mutual authentication, you need to provide both the client certificate and the client key in PEM format. |
client_key | | The private key for back-end authentication in non-encrypted PKCS8 format. If you want to use mutual authentication, you need to provide both the client certificate and the client key. |
m_tls | | Boolean that sets mTLS enablement. |
content_type | | The content type to pass in the log file header. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
gcs | | |
bucket | ✔ | The bucket name. |
display_name | ✔ | The connector's name. |
private_key | ✔ | A JSON private key for a Google Cloud Storage account. |
project_id | ✔ | A Google Cloud project ID. |
service_account_name | ✔ | The name of the service account with the storage object create permission or storage object creator role. |
compress_logs | | Boolean that sets the compression of logs. |
path | | The path to the log storage folder. |
https | | |
authentication_type | ✔ | Either NONE for no authentication or BASIC for username and password authentication. |
display_name | ✔ | The connector's name. |
content_type | ✔ | The content type to pass in the log file header. |
endpoint | ✔ | The storage endpoint for the logs. |
m_tls | | Boolean that sets mTLS enablement. |
compress_logs | | Boolean that sets the compression of logs. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
password | | The BASIC password for authentication. |
user_name | | The BASIC user name for authentication. |
tls_hostname | | The hostname that verifies the server's certificate and matches the Subject Alternative Names (SANs) in the certificate. If not provided, DataStream fetches the hostname from the endpoint URL. |
ca_cert | | The certification authority (CA) certificate used to verify the origin server's certificate. If the certificate is not signed by a well-known certification authority, enter the CA certificate in PEM format for verification. |
client_cert | | The digital certificate in the PEM format you want to use to authenticate requests to your destination. If you want to use mutual authentication, you need to provide both the client certificate and the client key in PEM format. |
client_key | | The private key for back-end authentication in non-encrypted PKCS8 format. If you want to use mutual authentication, you need to provide both the client certificate and the client key. |
loggly | | |
display_name | ✔ | The connector's name. |
endpoint | ✔ | The storage endpoint for the logs. |
auth_token | ✔ | The HTTP code for your Loggly bulk endpoint. |
content_type | | The content type to pass in the log file header. |
tags | | Tags to segment and filter log events in Loggly. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
new_relic | | |
display_name | ✔ | The connector's name. |
endpoint | ✔ | The storage endpoint for the logs. |
auth_token | ✔ | Your account's API key. |
content_type | | The content type to pass in the log file header. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
oracle | | |
access_key | ✔ | The account access key for authentication. |
bucket | ✔ | The bucket name. |
compress_logs | ✔ | Boolean that sets the compression of logs. |
display_name | ✔ | The connector's name. |
namespace | ✔ | The Oracle Cloud storage account's namespace. |
path | ✔ | The path to the log storage folder. |
region | ✔ | The region where the bucket resides. |
secret_access_key | ✔ | The account access key for authentication. |
s3 | | |
access_key | ✔ | The account access key for authentication. |
bucket | ✔ | The bucket name. |
display_name | ✔ | The connector's name. |
path | ✔ | The path to the log storage folder. |
region | ✔ | The region where the bucket resides. |
secret_access_key | ✔ | The secret access key used to authenticate requests to the Amazon S3 account. |
compress_logs | | Boolean that sets the compression of logs. |
splunk | | |
display_name | ✔ | The connector's name. |
event_collector_token | ✔ | The Splunk account's event collector token. |
endpoint | ✔ | The storage endpoint for the logs. |
client_key | | The private key for back-end authentication in non-encrypted PKCS8 format. If you want to use mutual authentication, you need to provide both the client certificate and the client key. |
ca_cert | | The certification authority (CA) certificate used to verify the origin server's certificate. If the certificate is not signed by a well-known certification authority, enter the CA certificate in PEM format for verification. |
client_cert | | The digital certificate in the PEM format you want to use to authenticate requests to your destination. If you want to use mutual authentication, you need to provide both the client certificate and the client key in PEM format. |
m_tls | | Boolean that sets mTLS enablement. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
tls_hostname | | The hostname that verifies the server's certificate and matches the Subject Alternative Names (SANs) in the certificate. If not provided, DataStream fetches the hostname from the endpoint URL. |
compress_logs | | Boolean that sets the compression of logs. |
sumologic | | |
collector_code | ✔ | The Sumo Logic endpoint's HTTP collector code. |
content_type | ✔ | The content type to pass in the log file header. |
display_name | ✔ | The connector's name. |
endpoint | ✔ | The storage endpoint for the logs. |
compress_logs | | Boolean that sets the compression of logs. |
custom_header_name | | A custom header name passed with the request to the destination. |
custom_header_value | | The custom header's value passed with the request to the destination. |
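As a further illustration of the naming pattern, here is a sketch of an Amazon S3 connector block built only from the `s3` arguments in the table above. The variable names, the bucket, and the `us-east-1` region are placeholders chosen for this example. Passing the keys through variables marked `sensitive` is a general Terraform practice for keeping credentials out of plan output and is an assumption here, not a requirement stated for this resource.

```hcl
# Placeholder variables that hold the S3 credentials.
variable "s3_access_key" {
  type      = string
  sensitive = true
}

variable "s3_secret_access_key" {
  type      = string
  sensitive = true
}

# Fragment: place this block inside your akamai_datastream resource.
s3_connector {
  access_key        = var.s3_access_key
  bucket            = "my_bucket"
  display_name      = "my_s3_connector"
  path              = "akamai/logs"
  region            = "us-east-1"
  secret_access_key = var.s3_secret_access_key
  compress_logs     = true # optional
}
```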
Choose data sets
Data set fields represent the types of data collected and returned in your log files.
Choose data set fields and add their IDs as a comma-separated list of integers in the `dataset_fields` argument. The order in which you place the IDs determines the order in which the fields appear in log files.
For fields that require additional behaviors, wait to adjust your property configuration until after data stream creation.
ID | Field name | Description |
---|---|---|
Log information | ||
999 | Stream ID | The ID for the stream that logged the request data. You can log this field to troubleshoot and group logs between different streams. |
1000 | CP code | The CP code associated with the request. |
1002 | Request ID | The request ID. |
1100 | Request time | The time when the edge server accepted the request from the client. |
2024 | Edge attempts | The number of attempts to download the content from the edge in a specific time interval. Value based on the number of total manifest requests received. |
Message exchange data | ||
1005 | Bytes | The content bytes served in the response body. For HTTP/2, this includes overhead bytes. |
1006 | Client IP | The requesting client's IPv4 or IPv6 address. |
1008 | HTTP status code | The returned HTTP response code. |
1009 | Protocol type | The request-response scheme, either HTTP or HTTPS. |
1011 | Request host | The value of the host in the request header. |
1012 | Request method | A request's HTTP method. |
1013 | Request path | The path to a resource in the request, excluding query parameters. |
1014 | Request port | The client TCP port number of the requested service. |
1015 | Response Content-Length | The size of the entity-body in bytes returned to the client. |
1016 | Response Content-Type | The type of the content returned to the client. |
1017 | User-Agent | The URI-encoded user agent making the request. |
2001 | TLS overhead time | The time in milliseconds between when the edge server accepts the connection and the completion of the SSL handshake. |
2002 | TLS version | The protocol of the TLS handshake, either TLSv1.2 or TLSv1.3. |
2003 | Object size | The size of the object, excluding HTTP response headers. |
2004 | Uncompressed size | The size of the uncompressed object if zipped before sending to the client. |
2006 | Overhead bytes | TCP overhead in bytes for the request and response. |
2008 | Total bytes | The total bytes served in the response, including content and HTTP overhead. |
2009 | Query string | The query string in the incoming URL from the client. To monitor this parameter in your logs, you need to update your property configuration to set the cache key query parameters behavior to include all parameters. |
2023 | File size bucket | Groups of response content sorted into different buckets by size in kilobytes, megabytes, and gigabytes. |
2060 | Brotli status | This field reports the status when serving a Brotli-compressed object. This field is available only for Ion Standard, Ion Premier and Ion Media Advanced products. For details, see Brotli status. |
2061 | Origin Content-Length | The compressible content-length object value, in bytes, in the response header from the origin. This field is only available for Ion Standard, Ion Premier, and Ion Media Advanced products. |
2062 | Download initiated | The number of successful download initiations in a specific time interval. |
2063 | Download completed | The number of successful downloads completed. |
Request header data | ||
1019 | Accept-Language | The list of languages acceptable in the response. |
1023 | Cookie | A list of HTTP cookies previously sent by the server with the Set-Cookie header. |
1031 | Range | The requested entity part returned. |
1032 | Referer | The address of the resource that forwarded the request URL. |
1037 | X-Forwarded-For | The originating IP address of a client connecting to a web server through an HTTP proxy or load balancer. |
2005 | Max-Age | The time in seconds a response object is valid for positive cache responses. |
Network performance data | ||
1033 | Request end time | The time in milliseconds it takes the edge server to fully read the request. |
1068 | Error code | A description detailing the issue with serving a request. |
1102 | Turn around time | The time in milliseconds from when the edge server receives the last byte of the request to when it sends the first bytes of the response. |
1103 | Transfer time | The time in milliseconds from when the edge server is ready to send the first byte of the response to when the last byte reaches the kernel. |
2007 | DNS lookup time | The time in seconds between the start of the request and the completion of the DNS lookup, if one was required. For cached IP addresses, this value is zero. |
2021 | Last byte | The last byte of the object that was served in a response. 0 indicates a part of a byte-range response. This field is available for all products supported by DataStream. |
2022 | Asnum | The Autonomous System Number (ASN) of the request's internet service provider. |
2025 | Time to first byte | The time taken to download the first byte of the received content in milliseconds. |
2026 | Startup errors | The number of download initiation failures in a specific time interval. |
2027 | Download time | The time taken to download the object in milliseconds. |
2028 | Throughput | The byte transfer rate for the selected time interval in kilobits per second. |
Cache data | ||
2010 | Cache status | Returns 0 if there was no object in the cache, and 1 if the object was present in the cache. In the event of negatively cached errors or stale content, the object is served from upstream even if cached. |
2011 | Cache refresh source | ? |
2019 | Cacheable | Returns 1 if the object is cacheable based on response headers and metadata, and 0 if the object is not cacheable. |
2020 | Breadcrumbs | Returns additional breadcrumbs data about the HTTP request-response cycle for improved visibility into the Akamai platform, such as the IP of the node or host, component, request end, turnaround, and DNS lookup time. This field is available only for Adaptive Media Delivery, Download Delivery, Object Delivery, Dynamic Site Accelerator, Ion Standard, Ion Premier, and API Acceleration products. To log this parameter for Dynamic Site Accelerator, Ion Standard, and API Acceleration, you need to enable the breadcrumbs behavior in your stream's property configuration. For details, see Breadcrumbs. |
Geo data | ||
1066 | Edge IP | The IP address of the edge server that served the response to the client. This is useful when resolving issues with your account representative. |
2012 | Country/Region | The ISO code of the country or region where the request originated. |
2013 | State | The state or province where the request originated. |
2014 | City | The city where the request originated. |
2052 | Server country/region | The ISO code of the country or region from where the request was served. |
2053 | Billing region | The Akamai geographical price zone for where the request was served. |
Web security | ||
2050 | Security rules | Returns data on security policy ID, non-deny, and deny rules when the request triggers any configured WAF or Bot Manager rules. Requires configuring the Web Application Firewall (WAF) behavior in your property or adding hostnames in your security configurations. |
EdgeWorkers | ||
3000 | EdgeWorkers usage | Returns EdgeWorkers data for client requests and responses if EdgeWorkers is enabled. The field format is: //[EdgeWorkers-Id]/[Version]/[Event Handler]/[Off Reason]/[Logic Executed]/[Status]/#[Metrics] . |
3001 | EdgeWorkers execution | Returns EdgeWorkers execution information if enabled, including the stage of execution, the EdgeWorker ID, process, total, and total stage time in milliseconds, used memory (in kilobytes), ghost flow, error code, HTTP status change when the response is generated using the API, CPU flits consumed during processing, tier ID for the request, indirect CPU time (in milliseconds) and ghost error code. |
Media | ||
2080 | CMCD | Returns a Common Media Client Data (CMCD) payload with detailed data on media traffic. This field is available only for the Adaptive Media Delivery product. For details, see Common media client data. |
2081 | Delivery type | Limits logged data to a specific media delivery type, such as live or video on demand. |
2082 | Delivery format | Returns 1 if media encryption is enabled for the content delivered from the edge to the client. |
2083 | Media encryption | Returns 1 if an edge server prefetched the content delivered from the edge to the client. |
Content protection | ||
3011 | Content protection information | Returns Enhanced Proxy Detection (EPD) data, including the GeoGuard category and the action EPD performed on the request. |
Midgress traffic | ||
2084 | Prefetch midgress hits | The midgress traffic within the Akamai network, such as between two edge servers. To use this, enable the collect_midgress_traffic option in the [DataStream behavior](ga-datastream) for your property in Property Manager. As a result, the second slot in the log line returns processing information about a request. |
Custom fields | ||
1082 | Custom field | The data specified in the custom log field of the log requests details that you want to receive in the stream. For details, see Custom log field. |
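One way to keep a field selection readable is to hold the IDs in a local value and annotate each with its field name from the table. This is only a sketch: the local's name and the particular fields chosen are illustrative.

```hcl
locals {
  # The order of the IDs is the order of the fields in each log line.
  dataset_field_ids = [
    1000, # CP code
    1002, # Request ID
    1100, # Request time
    1008, # HTTP status code
    1013, # Request path
    2010, # Cache status
  ]
}
```

You could then set `dataset_fields = local.dataset_field_ids` in the data stream resource.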
Construct the resource
To create a data stream, combine your connector block, data set fields, and property list with the remaining required arguments.
Argument | Description |
---|---|
Required | |
active | Whether your data stream is activated along with creation. Important: Because the data stream creation process can take a bit, set the value to |
delivery_configuration | A set that provides configuration information for the logs. |
contract_id | Your contract's ID. |
dataset_fields | A set of IDs for the data set fields within the product for which you want to receive logs. The order of the IDs defines their order in the log lines. For values, use the dataset_fields data source to get the available fields for your product. |
group_id | Your group's ID. |
properties | A list of properties the data stream monitors. Data can only be logged on active properties. |
stream_name | The name for your stream. |
<connector>_connector | Destination details for the data stream. Replace <connector> with the respective type listed in the connector table. |
Optional | |
notification_emails | A list of email addresses to which the data stream's activation and deactivation status are sent. |
collect_midgress | Boolean that sets the collection of midgress data. |
resource "akamai_datastream" "my_datastream" {
  active = true
  delivery_configuration {
    field_delimiter = "SPACE"
    format          = "STRUCTURED"
    frequency {
      interval_in_secs = 30
    }
    upload_file_prefix = "prefix"
    upload_file_suffix = "suffix"
  }
  contract_id = "C-0N7RAC7"
  dataset_fields = [
    1000, 1002, 1102
  ]
  group_id = 12345
  properties = [
    12345, 98765
  ]
  stream_name = "Datastream_Example1"
  gcs_connector {
    bucket               = "my_bucket"
    display_name         = "my_connector_name"
    path                 = "akamai/logs"
    private_key          = "-----BEGIN PRIVATE KEY-----\nprivate_key\n-----END PRIVATE KEY-----\n"
    project_id           = "my_project_id"
    service_account_name = "my_service_account_name"
  }
  notification_emails = [
    "example1@example.com",
    "example2@example.com",
  ]
  collect_midgress = true
}
There is no default standard output as the attribute values are sensitive, but you can get your data stream's ID from the last line of the process log or by using the data stream data source.
akamai_datastream.my_datastream: Creation complete after 1h20m16s [id=12345]
Use the ID to connect your data stream to your property in the data stream rule.
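If you'd rather not read the ID from the process log, a small output block is one option. This sketch assumes the resource's standard `id` attribute is not itself marked sensitive. The output name is arbitrary.

```hcl
# Expose the data stream's ID after apply.
output "my_datastream_id" {
  description = "ID of the data stream, for use in the Datastream rule."
  value       = akamai_datastream.my_datastream.id
}
```

If Terraform reports that the value is sensitive, add `sensitive = true` to the output and read it with `terraform output my_datastream_id`.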
Add rules and behaviors
Add the DataStream rule and behavior and any additional behaviors required by a data set to your properties' default rule. For options configuration, see the datastream behavior.
- For more than one property, loop through each property to add the rule.
- If you use includes in your rule tree, activate them before you activate your property.
{
  "name": "Datastream",
  "children": [],
  "behaviors": [
    {
      "name": "datastream",
      "options": {}
    }
  ],
  "criteria": [],
  "criteriaMustSatisfy": ""
}
Activate properties
Activate your properties on a network to start collecting data with your data stream.
The required arguments for a property activation are `property_id`, `contact`, and `version`. If you don't specify a network, the default action targets staging.
// Change the network value to production for the production network
resource "akamai_property_activation" "my_activation" {
  property_id                    = "prp_12345"
  network                        = "staging"
  contact                        = ["jsmith@example.com"]
  note                           = "Sample activation"
  version                        = "1"
  auto_acknowledge_rule_warnings = true
  timeouts {
    default = "1h"
  }
}
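When a data stream covers several properties, a single activation resource with `for_each` can stand in for one block per property. The sketch below reuses only the arguments shown above. The `property_versions` variable and its values are hypothetical and map each property ID to the version you want to activate.

```hcl
# Hypothetical input: property ID => version to activate.
variable "property_versions" {
  type = map(number)
  default = {
    prp_12345 = 1
    prp_98765 = 1
  }
}

resource "akamai_property_activation" "per_property" {
  for_each = var.property_versions

  property_id                    = each.key
  version                        = each.value
  network                        = "staging"
  contact                        = ["jsmith@example.com"]
  note                           = "Activation after adding the Datastream rule"
  auto_acknowledge_rule_warnings = true
}
```

Each activation is then addressable by its key, for example `akamai_property_activation.per_property["prp_12345"]`.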