GPU Compute Instances
Scientists, artists, and engineers need access to significant parallel computational power. Akamai offers GPU-optimized virtual machines accelerated by NVIDIA RTX 4000 Ada or NVIDIA Quadro RTX 6000. These GPU Compute Instances harness the power of CUDA, Tensor, and RT cores to execute complex processing, transcoding, and ray tracing workloads.
GPU plans using NVIDIA Quadro RTX 6000 were first introduced in 2019 and have limited deployment availability. NVIDIA RTX 4000 Ada GPU plans were introduced in 2024.
GPU plans are ideal for highly specialized workloads that would benefit from dedicated NVIDIA GPUs, including machine learning, AI inferencing, graphics processing, and big data analysis.
On-demand
When the costs associated with purchasing, installing, and maintaining GPUs are taken into account, the overall cost of ownership is often high. GPU Compute Instances allow you to leverage the power of GPUs while benefiting from the main value proposition of cloud: turning a CapEx into an OpEx.
Market leading hardware
The GPU plans use industry-leading NVIDIA GPUs with CUDA, Tensor, and RT cores in each unit. These GPUs support use cases associated with parallel processing, transcoding, and ray tracing. See GPU specifications for more details.
If one GPU card isn’t enough for your projected workloads, Akamai Cloud Computing offers GPU plans with up to four cards per Compute Instance.
Dedicated competition-free resources
A GPU Compute Instance's vCPU cores are dedicated, not shared, and accessible only to you. Your Compute Instance never has to wait for another process, enabling your software to run at peak speed and efficiency. This lets you run workloads that require full-duty work (100% CPU all day, every day) at peak performance.
Recommended workloads
GPU Compute Instances are suitable for specialized workloads that are optimized for GPUs:
- Video encoding
- Graphics processing
- AI inferencing
- Big data analysis
See GPU use cases to learn more.
Availability
GPU Plan | Regions |
---|---|
NVIDIA RTX 4000 Ada | Chicago, US; Frankfurt 2; Osaka, JP; Paris, FR; Seattle, WA, US; Singapore 2 |
NVIDIA Quadro RTX 6000 | Atlanta, GA, US; Newark, NJ, US; Frankfurt, DE; Mumbai, IN; Singapore |
Plans and pricing
Resource | NVIDIA RTX 4000 Ada | NVIDIA Quadro RTX 6000 |
---|---|---|
GPU cards | 1 - 4 | 1 - 4 |
GPU Memory (VRAM) | 20 GB - 80 GB | 24 GB - 96 GB |
vCPU cores (dedicated) | 4 - 48 cores | 8 - 24 cores |
Memory (RAM) | 16 GB - 126 GB | 32 GB - 128 GB |
Storage | 0.5 TB - 2 TB | 640 GB - 2560 GB |
Outbound Network Transfer | 0 TB | 16 TB - 20 TB |
Outbound Network Bandwidth | 10 Gbps | 10 Gbps |
Pricing starts at $350/mo ($0.52/hr) for an NVIDIA RTX 4000 Ada GPU x1 Small Compute Instance with 1 GPU card, 4 vCPU cores, 16 GB of memory, and 0.5 TB of SSD storage. Pricing starts at $1,000/mo ($1.50/hr) for an NVIDIA Quadro RTX 6000 GPU Compute Instance with 1 GPU card, 8 vCPU cores, 32 GB of memory, and 640 GB of SSD storage.
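
As a rough illustration of how the hourly and monthly starting prices above relate, the sketch below estimates a charge for a given number of hours. It assumes that hourly billing stops accruing once the monthly price is reached; that cap is an assumption of this example and is not stated on this page. `estimate_cost` is a hypothetical helper, not part of any Akamai tool.

```python
def estimate_cost(hours, hourly_rate, monthly_price):
    """Estimate the charge for running one instance for `hours` hours.

    Assumption (not stated in this guide): hourly charges are capped
    at the plan's monthly price.
    """
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return round(min(hours * hourly_rate, monthly_price), 2)


# Starting prices quoted above: (hourly rate, monthly price) in USD.
RTX_4000_ADA_X1_SMALL = (0.52, 350.00)
QUADRO_RTX_6000_X1 = (1.50, 1000.00)

# A 100-hour job on the smallest RTX 4000 Ada plan.
short_job = estimate_cost(100, *RTX_4000_ADA_X1_SMALL)

# A full 720-hour month on the smallest Quadro RTX 6000 plan.
full_month = estimate_cost(720, *QUADRO_RTX_6000_X1)
```

Under that assumption, short-lived jobs are billed close to the hourly rate, while an instance left running all month costs no more than the monthly price.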
Review the pricing page for additional plans and their associated costs. Review the Plans page to learn more about other Compute Instance types.
In some cases, a $100 deposit may be required to deploy GPU Compute Instances. This may include new accounts that have been active for less than 90 days and accounts that have spent less than $100 on services. If you are unable to deploy GPU Compute Instances, contact Support for assistance.
GPU specifications
Each NVIDIA RTX 4000 Ada GPU is equipped with the following specifications:
Specification | Value |
---|---|
GPU Memory (VRAM) | 20 GB GDDR6 |
CUDA Cores (Parallel-Processing) | 6144 |
Tensor Cores (Transcoding) | 192 |
RT Cores (Ray Tracing) | 48 |
FP32 Performance | 26.7 TFLOPS |
Each NVIDIA Quadro RTX 6000 GPU is equipped with the following specifications:
Specification | Value |
---|---|
GPU Memory (VRAM) | 24 GB GDDR6 |
CUDA Cores (Parallel-Processing) | 4608 |
Tensor Cores (Transcoding) | 576 |
RT Cores (Ray Tracing) | 72 |
FP32 Performance | 16.3 TFLOPS |
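
To confirm that a deployed instance reports the expected number of cards and amount of VRAM, one option is to query `nvidia-smi`. The sketch below is a minimal example that assumes the NVIDIA driver is already installed and `nvidia-smi` is on the `PATH`. The helper names `parse_gpu_report` and `query_gpus` are hypothetical.

```python
import csv
import io
import subprocess

# nvidia-smi query that prints one "name, memory in MiB" line per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=name,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_report(text):
    """Turn 'name, memory_mib' CSV lines into a list of (name, mib) tuples."""
    gpus = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or unexpected lines
        name, mib = row
        gpus.append((name.strip(), int(mib)))
    return gpus


def query_gpus():
    """Run nvidia-smi on this machine and parse its output."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_report(result.stdout)
```

Calling `query_gpus()` on a GPU Compute Instance should return one entry per card. The reported name and memory depend on the card and driver version, so treat the values as something to compare against the tables above, not as fixed strings.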
What are GPUs?
GPUs (Graphics Processing Units) are specialized hardware originally created to manipulate computer graphics and process images. GPUs are designed to process large blocks of data in parallel, which makes them excellent for compute-intensive tasks that require thousands of simultaneous threads. Because a GPU has significantly more logical cores than a standard CPU, it can process large amounts of data in parallel more efficiently. This means GPUs accelerate the large calculations that are required by big data, video encoding, AI, and machine learning.
GPU Compute Instances include NVIDIA RTX 4000 Ada or NVIDIA Quadro RTX 6000 GPU cards with Tensor, RT, and CUDA cores. NVIDIA RTX 4000 Ada GPU plans are the newest plans. Read more about NVIDIA RTX 4000 Ada.
GPU use cases
Machine learning and AI
Machine learning is a powerful approach to data science that uses large sets of data to build prediction algorithms. These prediction algorithms are commonly used in “recommendation” features on many popular music and video applications, online shops, and search engines. When you receive intelligent recommendations tailored to your own tastes, machine learning is often responsible. Other areas where you might find machine learning being used include self-driving cars, process automation, security, marketing analytics, and health care.
AI (Artificial Intelligence) is a broad concept that describes technology designed to behave intelligently and mimic the cognitive functions of humans, like learning, decision making, and speech recognition. AI uses large sets of data to learn and adapt in order to achieve a specific goal. GPUs provide the processing power needed for common AI and machine learning tasks like input data preprocessing and model building.
Below is a list of common tools used for machine learning and AI that can be installed on a GPU Compute Instance:
- TensorFlow - a free, open-source machine learning framework and deep learning library. TensorFlow was originally developed by Google for internal use and later fully released to the public under the Apache License.
- PyTorch - a machine learning library for Python that uses the popular GPU-optimized Torch framework.
- Apache Mahout - a scalable library of machine learning algorithms and a distributed linear algebra framework designed to let mathematicians, statisticians, and data scientists quickly implement their own algorithms.
Big data
Big data is a discipline that analyzes and extracts meaningful insights from large and complex data sets. These sets are so large and complex that they require specialized software and hardware to appropriately capture, manage, and process the data. When thinking of big data and whether or not the term applies to you, it often helps to visualize the “three Vs”:
- Volume: Generally, if you are working with terabytes, petabytes, exabytes, or more of information, you are in the realm of big data.
- Velocity: With big data, you're using data that is being created, called, moved, and interacted with at a high velocity. One example is the real-time data generated on social media platforms by their users.
- Variety: Variety refers to the many different types of data formats with which you may need to interact. Photos, video, audio, and documents can all be written and saved in a number of different formats. It is important to consider the variety of data that you will collect in order to appropriately categorize it.
GPUs can give big data systems the additional computational capabilities they need to perform well. Below are a few examples of tools that you can use for your own big data solutions:
- Hadoop - an Apache project that allows the creation of parallel processing applications on large data sets, distributed across networked nodes.
- Apache Spark - a unified analytics engine for large-scale data processing designed with speed and ease of use in mind.
- Apache Storm - a distributed computation system that processes streaming data in real time.
Video encoding
Video encoding is the process of converting a video file from its original source format to another format that is viewable on a different device or with a different tool. This resource-intensive task can be greatly accelerated using the power of GPUs.
- FFmpeg - a popular open-source multimedia manipulation framework that supports a large number of video formats.
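
As a minimal sketch of GPU-accelerated encoding, the example below builds an FFmpeg command line that decodes with CUDA and encodes with NVIDIA's NVENC H.264 encoder. It assumes an FFmpeg build compiled with NVENC support and a working NVIDIA driver. The file names, the bitrate, and the `nvenc_command` helper are illustrative assumptions.

```python
import shlex


def nvenc_command(src, dst, codec="h264_nvenc", bitrate="5M"):
    """Build an ffmpeg argument list for GPU-accelerated transcoding.

    Assumes ffmpeg was built with NVENC support; nothing is executed here.
    """
    return [
        "ffmpeg",
        "-hwaccel", "cuda",  # decode on the GPU where possible
        "-i", src,           # input file
        "-c:v", codec,       # NVENC video encoder
        "-b:v", bitrate,     # target video bitrate
        "-c:a", "copy",      # pass the audio stream through unchanged
        dst,
    ]


# Hypothetical file names, shown as a shell-ready string.
command = nvenc_command("input.mp4", "output.mp4")
printable = shlex.join(command)
```

The list form can be passed directly to `subprocess.run(command, check=True)`. Building the arguments as a list instead of a single string avoids shell quoting problems with file names that contain spaces.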
General purpose computing using CUDA
CUDA (Compute Unified Device Architecture) is a parallel computing platform and API that lets you interact more directly with the GPU for general purpose computing. In practice, this means that a developer can write code in C, C++, or many other supported languages utilizing their GPU to create their own tools and programs.
If you're interested in using CUDA on your GPU Compute Instance, see the following resources:
Graphics processing
One of the most traditional use cases for a GPU is graphics processing. Transforming a large set of pixels or vertices with a shader or simulating realistic lighting via ray tracing are massively parallel processing tasks. Ray tracing is a computationally intensive process that simulates lights in a scene and renders the reflections, refractions, shadows, and indirect lighting. Doing this in real time is impractical without hardware-based ray tracing acceleration. GPU Compute Instances offer real-time ray tracing capabilities using a single GPU.
The GPU plans support advanced shading capabilities such as:
- Mesh shading models for vertex, tessellation, and geometry stages in the graphics pipeline
- Variable Rate Shading to dynamically control shading rate
- Texture-Space Shading, which uses a privately held texture space in memory
- Multi-View Rendering, which allows for rendering multiple views in a single pass