GPU Compute Instances

Scientists, artists, and engineers need access to significant parallel computational power. Akamai offers GPU-optimized virtual machines accelerated by NVIDIA RTX 4000 Ada or NVIDIA Quadro RTX 6000 GPUs. These GPU Compute Instances harness the power of CUDA, Tensor, and RT cores to execute complex processing, transcoding, and ray tracing workloads.

GPU plans using NVIDIA Quadro RTX 6000 were first introduced in 2019 and have limited deployment availability. NVIDIA RTX 4000 Ada GPU plans were introduced in 2024.

GPU plans are ideal for highly specialized workloads that would benefit from dedicated NVIDIA GPUs, including machine learning, AI inferencing, graphics processing, and big data analysis.

On-demand

Once the costs of purchasing, installing, and maintaining GPUs are taken into account, the total cost of ownership is often high. GPU Compute Instances let you use the power of GPUs while benefiting from the main value proposition of the cloud: turning a capital expense (CapEx) into an operating expense (OpEx).

Market leading hardware

The GPU plans use industry-leading NVIDIA GPUs with CUDA, Tensor, and RT cores in each unit. These GPUs support use cases associated with parallel processing, transcoding, and ray tracing. See GPU specifications for more details.

If one GPU card isn’t enough for your projected workloads, Akamai Cloud Computing offers GPU plans with up to four cards per Compute Instance.

Dedicated competition-free resources

A GPU Compute Instance's vCPU cores are dedicated, not shared, and accessible only to you. Your Compute Instance never has to wait for another process, enabling your software to run at peak speed and efficiency. This lets you run workloads that require full-duty work (100% CPU all day, every day) at peak performance.

Recommended workloads

GPU Compute Instances are suitable for specialized workloads that are optimized for GPUs:

  • Video encoding
  • Graphics processing
  • AI inferencing
  • Big data analysis

See GPU use cases to learn more.

Availability

| GPU Plan | Regions |
| --- | --- |
| NVIDIA RTX 4000 Ada | Chicago, US; Frankfurt 2; Osaka, JP; Paris, FR; Seattle, WA, US; Singapore 2 |
| NVIDIA Quadro RTX 6000 | Atlanta, GA, US; Newark, NJ, US; Frankfurt, DE; Mumbai, IN; Singapore |

Plans and pricing

| Resource | NVIDIA RTX 4000 Ada | NVIDIA Quadro RTX 6000 |
| --- | --- | --- |
| GPU cards | 1 - 4 | 1 - 4 |
| GPU Memory (VRAM) | 20 GB - 80 GB | 24 GB - 96 GB |
| vCPU cores (dedicated) | 4 - 48 cores | 8 - 24 cores |
| Memory (RAM) | 16 GB - 126 GB | 32 GB - 128 GB |
| Storage | 0.5 TB - 2 TB | 640 GB - 2560 GB |
| Outbound Network Transfer | 0 TB | 16 TB - 20 TB |
| Outbound Network Bandwidth | 10 Gbps | 10 Gbps |

Pricing starts at $350/mo ($0.52/hr) for an NVIDIA RTX 4000 Ada GPU x1 Small Compute Instance with 1 GPU card, 4 vCPU cores, 16 GB of memory, and 0.5 TB of SSD storage. Pricing starts at $1,000/mo ($1.50/hr) for an NVIDIA Quadro RTX 6000 GPU Compute Instance with 1 GPU card, 8 vCPU cores, 32 GB of memory, and 640 GB of SSD storage.
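
As a rough worked example of the figures above, the following sketch compares the hourly and monthly prices of the two entry-level plans. It assumes that hourly charges are capped at the monthly price, which this page does not state; `estimate_cost`, `break_even_hours`, and the plan labels are illustrative names only, so confirm actual billing behavior on the pricing page.

```python
# Entry-level prices quoted above: (hourly USD, monthly USD).
PLANS = {
    "RTX 4000 Ada x1 Small": (0.52, 350.00),
    "Quadro RTX 6000, 1 card": (1.50, 1000.00),
}


def estimate_cost(hours, hourly, monthly):
    """Estimate one month's cost.

    ASSUMPTION: hourly billing is capped at the monthly price.
    """
    return min(hours * hourly, monthly)


def break_even_hours(hourly, monthly):
    """Hours of use at which hourly charges reach the monthly price."""
    return monthly / hourly


for name, (hourly, monthly) in PLANS.items():
    hours_to_cap = round(break_even_hours(hourly, monthly), 1)
    cost_100h = estimate_cost(100, hourly, monthly)
    print(f"{name}: reaches monthly price after {hours_to_cap} h; "
          f"100 h costs about ${cost_100h:.2f}")
```

Under that assumption, an instance that runs for only part of the month is billed for the hours used, which is where on-demand pricing tends to pay off.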

Review the pricing page for additional plans and their associated costs. Review the Plans page to learn more about other Compute Instance types.

Note: In some cases, a $100 deposit may be required to deploy GPU Compute Instances. This may include new accounts that have been active for less than 90 days and accounts that have spent less than $100 on services. If you are unable to deploy GPU Compute Instances, contact Support for assistance.

GPU specifications

Each NVIDIA RTX 4000 Ada GPU has the following specifications:

| Specification | Value |
| --- | --- |
| GPU Memory (VRAM) | 20 GB GDDR6 |
| CUDA Cores (Parallel-Processing) | 6144 |
| Tensor Cores (Transcoding) | 192 |
| RT Cores (Ray Tracing) | 48 |
| FP32 Performance | 26.7 TFLOPS |

Each NVIDIA Quadro RTX 6000 GPU has the following specifications:

| Specification | Value |
| --- | --- |
| GPU Memory (VRAM) | 24 GB GDDR6 |
| CUDA Cores (Parallel-Processing) | 4608 |
| Tensor Cores (Transcoding) | 576 |
| RT Cores (Ray Tracing) | 72 |
| FP32 Performance | 16.3 TFLOPS |
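
To relate the per-card specifications to multi-card plans, the sketch below multiplies the figures listed above by the number of cards. The `GpuSpec` class and `plan_totals` function are hypothetical helpers, and the summed FP32 figure is a theoretical peak; real workloads rarely scale linearly across cards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    """Per-card figures copied from the specification tables above."""
    name: str
    vram_gb: int
    cuda_cores: int
    tensor_cores: int
    rt_cores: int
    fp32_tflops: float


RTX_4000_ADA = GpuSpec("NVIDIA RTX 4000 Ada", 20, 6144, 192, 48, 26.7)
QUADRO_RTX_6000 = GpuSpec("NVIDIA Quadro RTX 6000", 24, 4608, 576, 72, 16.3)


def plan_totals(spec, cards):
    """Sum per-card figures for a plan with `cards` GPUs (1 to 4)."""
    if not 1 <= cards <= 4:
        raise ValueError("GPU plans offer 1 to 4 cards")
    return {
        "vram_gb": spec.vram_gb * cards,
        "cuda_cores": spec.cuda_cores * cards,
        # Theoretical peak only; not a measured benchmark.
        "fp32_tflops_peak": round(spec.fp32_tflops * cards, 1),
    }


for cards in (1, 2, 4):
    print(cards, plan_totals(RTX_4000_ADA, cards))
```

The VRAM totals for one and four cards match the ranges shown under Plans and pricing (20 GB to 80 GB and 24 GB to 96 GB).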

What are GPUs?

GPUs (Graphics Processing Units) are specialized hardware originally created to manipulate computer graphics and process images. GPUs are designed to process large blocks of data in parallel, making them well suited to compute-intensive tasks that require thousands of simultaneous threads. Because a GPU has significantly more logical cores than a standard CPU, it can process large amounts of data in parallel more efficiently. This means GPUs accelerate the large calculations required by big data, video encoding, AI, and machine learning.

GPU Compute Instances include NVIDIA RTX 4000 Ada or NVIDIA Quadro RTX 6000 GPU cards with Tensor, RT, and CUDA cores. NVIDIA RTX 4000 Ada GPU plans are the newest plans. Read more about NVIDIA RTX 4000 Ada.
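
One way to confirm which cards an instance can see is to query `nvidia-smi`, the command-line tool that ships with the NVIDIA driver. The sketch below assumes the driver is already installed and `nvidia-smi` is on the `PATH` (this page does not cover driver installation); `list_gpus` is a hypothetical helper that returns an empty list when the tool is unavailable.

```python
import shutil
import subprocess


def list_gpus():
    """Return (name, total memory) tuples reported by nvidia-smi, or []."""
    if shutil.which("nvidia-smi") is None:
        return []  # driver tools not installed or not on PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []  # tool present but no usable GPU or driver
    gpus = []
    for line in result.stdout.splitlines():
        if line.strip():
            name, _, memory = line.partition(",")
            gpus.append((name.strip(), memory.strip()))
    return gpus


for name, memory in list_gpus():
    print(f"{name}: {memory}")
```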

GPU use cases

Machine learning and AI

Machine learning is a powerful approach to data science that uses large sets of data to build prediction algorithms. These prediction algorithms are commonly used in “recommendation” features on many popular music and video applications, online shops, and search engines. When you receive intelligent recommendations tailored to your own tastes, machine learning is often responsible. Other areas where you might find machine learning being used include self-driving cars, process automation, security, marketing analytics, and health care.

AI (Artificial Intelligence) is a broad concept that describes technology designed to behave intelligently and mimic the cognitive functions of humans, like learning, decision making, and speech recognition. AI uses large sets of data to learn and adapt in order to achieve a specific goal. GPUs provide the processing power needed for common AI and machine learning tasks like input data preprocessing and model building.

Below is a list of common tools used for machine learning and AI that can be installed on a GPU Compute Instance:

  • TensorFlow - a free, open-source machine learning framework and deep learning library. TensorFlow was originally developed by Google for internal use and later released to the public under the Apache License.

  • PyTorch - a machine learning library for Python based on the popular GPU-optimized Torch framework.

  • Apache Mahout - a scalable library of machine learning algorithms, and a distributed linear algebra framework designed to let mathematicians, statisticians, and data scientists quickly implement their own algorithms.
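
As a minimal sketch of how such a framework is typically pointed at the GPU, the following checks whether PyTorch can see a CUDA device and falls back to the CPU otherwise. It assumes PyTorch has been installed separately; `pick_device` is a hypothetical helper, not part of any library.

```python
def pick_device():
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # third-party; assumed to be installed separately
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running on: {device}")
```

In PyTorch code, the returned string is usually passed to `.to(device)` on models and tensors so the same script runs with or without a GPU.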

Big data

Big data is a discipline that analyzes and extracts meaningful insights from large and complex data sets. These sets are so large and complex that they require specialized software and hardware to appropriately capture, manage, and process the data. When thinking of big data and whether or not the term applies to you, it often helps to visualize the “three Vs”:

  • Volume: Generally, if you are working with terabytes, petabytes, exabytes, or more of information, you are in the realm of big data.

  • Velocity: With big data, you’re using data that is being created, requested, moved, and interacted with at a high velocity. One example is the real-time data generated by users of social media platforms.

  • Variety: Variety refers to the many different types of data formats with which you may need to interact. Photos, video, audio, and documents can all be written and saved in a number of different formats. It is important to consider the variety of data that you will collect in order to appropriately categorize it.

GPUs can give big data systems the additional computational capacity they need to perform well. Below are a few examples of tools that you can use for your own big data solutions:

  • Hadoop - an Apache project that allows the creation of parallel processing applications on large data sets, distributed across networked nodes.

  • Apache Spark - a unified analytics engine for large-scale data processing designed with speed and ease of use in mind.

  • Apache Storm - a distributed computation system that processes streaming data in real time.
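
The tools above share a common idea: split a large data set into partitions, process each partition independently, then merge the partial results. The sketch below illustrates that map and reduce pattern with a word count in plain Python. It does not use the Hadoop, Spark, or Storm APIs, and the partitions are made-up sample data standing in for data spread across nodes.

```python
from collections import Counter
from functools import reduce


def map_partition(lines):
    """Map step: count words within one partition of the data set."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results."""
    return left + right


# Hypothetical partitions, standing in for data spread across nodes.
partitions = [
    ["big data needs parallel processing"],
    ["parallel processing needs many nodes", "big clusters"],
]

# Each partition is processed independently, so this step could run in parallel.
partials = [map_partition(p) for p in partitions]
totals = reduce(reduce_counts, partials, Counter())
print(totals.most_common(3))
```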

Video encoding

Video encoding is the process of converting a video file from its original source format to another format that can be viewed on a different device or with a different tool. This resource-intensive task can be greatly accelerated using the power of GPUs.

  • FFmpeg - a popular open-source multimedia manipulation framework that supports a large number of video formats.
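
As an illustration, the sketch below builds an FFmpeg command line that uses NVIDIA hardware decoding and the `h264_nvenc` encoder. It assumes an FFmpeg build compiled with NVENC support and an installed NVIDIA driver; `build_nvenc_command`, the file names, and the bitrate are placeholders. The command is only printed, not executed.

```python
import shlex


def build_nvenc_command(src, dst, bitrate="5M"):
    """Build an FFmpeg command that transcodes `src` to H.264 on an NVIDIA GPU."""
    return [
        "ffmpeg",
        "-hwaccel", "cuda",      # GPU-accelerated decoding
        "-i", src,
        "-c:v", "h264_nvenc",    # NVIDIA hardware H.264 encoder
        "-b:v", bitrate,         # target video bitrate
        "-c:a", "copy",          # pass audio through unchanged
        dst,
    ]


cmd = build_nvenc_command("input.mov", "output.mp4")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would run the transcode on a machine where FFmpeg and the driver are available.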

General purpose computing using CUDA

CUDA (Compute Unified Device Architecture) is a parallel computing platform and API that lets you interact more directly with the GPU for general purpose computing. In practice, this means that a developer can write code in C, C++, or many other supported languages that runs on the GPU, and use it to build their own tools and programs.
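
To give a feel for the programming model, the sketch below emulates a CUDA-style vector addition in plain Python. It is not CUDA code and does not run on a GPU: the `launch` function simply loops over every block and thread in turn, where a GPU would execute them concurrently. The index calculation mirrors the usual `blockIdx.x * blockDim.x + threadIdx.x` pattern; all function names here are illustrative.

```python
def vector_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Body of one emulated thread: compute a single output element."""
    i = block_idx * block_dim + thread_idx   # global thread index
    if i < len(out):                         # guard threads past the end of the data
        out[i] = a[i] + b[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Run every (block, thread) pair in turn; a GPU would run them concurrently."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


n = 10
a = list(range(n))
b = [10 * x for x in a]
out = [0] * n

block_dim = 4
grid_dim = (n + block_dim - 1) // block_dim   # enough blocks to cover n elements
launch(vector_add_kernel, grid_dim, block_dim, a, b, out)
print(out)
```

Because each thread writes only its own element, no thread depends on another, which is what makes this kind of work a good fit for thousands of GPU cores.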

If you're interested in using CUDA on your GPU Compute Instance, see the following resources:

Graphics processing

One of the most traditional use cases for a GPU is graphics processing. Transforming a large set of pixels or vertices with a shader, or simulating realistic lighting via ray tracing, is a massively parallel processing task. Ray tracing is a computationally intensive process that simulates the lights in a scene and renders reflections, refractions, shadows, and indirect lighting. Doing this in real time is impractical without hardware-based ray tracing acceleration. GPU Compute Instances offer real-time ray tracing capabilities using a single GPU.
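
To show why ray tracing parallelizes so well, the sketch below implements the basic building block, a ray and sphere intersection test, in plain Python. Each ray is tested independently of every other ray, which is the kind of work RT cores accelerate in hardware. This is a CPU-only illustration with hypothetical names, not how a GPU renderer is written.

```python
import math


def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance to the nearest hit in front of the ray, or None.

    `direction` is assumed to be a unit vector. Rays that start inside the
    sphere are treated as misses to keep the sketch short.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c          # quadratic coefficient a == 1 for a unit direction
    if disc < 0:
        return None                 # the ray misses the sphere
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t > 0 else None


camera = (0.0, 0.0, 0.0)
sphere_center, sphere_radius = (0.0, 0.0, -5.0), 1.0

# Two independent rays: one aimed at the sphere, one aimed away from it.
print(ray_sphere_hit(camera, (0.0, 0.0, -1.0), sphere_center, sphere_radius))
print(ray_sphere_hit(camera, (0.0, 1.0, 0.0), sphere_center, sphere_radius))
```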

The GPU plans support advanced shading capabilities such as:

  • Mesh shading models for vertex, tessellation, and geometry stages in the graphics pipeline
  • Variable Rate Shading to dynamically control shading rate
  • Texture-Space Shading, which uses a texture space held in private memory
  • Multi-View Rendering, which allows multiple views to be rendered in a single pass