GPU Glossary

Technical terminology and definitions for the modern GPU landscape.

C

Cloud GPU

A GPU resource provided as a service in cloud computing environments to support scalable and flexible computing tasks.

CUDA Cores

NVIDIA's name for the arithmetic units inside each streaming multiprocessor, which execute floating-point and integer instructions for many threads in parallel.

D

Deep Learning Super Sampling

An NVIDIA technology that uses AI to upscale lower-resolution rendered frames to a higher resolution in real time.

DirectX Raytracing

A feature of Microsoft's DirectX API that enables real-time ray tracing on supported GPUs, enhancing visual realism.

F

FP16 Performance

Half-precision (16-bit) floating-point performance, offering faster computation than FP32 at the cost of reduced precision and range.
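
The precision loss can be seen by rounding ordinary values to half precision. The sketch below uses the `e` (IEEE 754 half-precision) format of Python's standard `struct` module; the helper name `to_fp16` is made up for this illustration.

```python
import struct

def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

print(to_fp16(0.1))     # 0.0999755859375 (only about 3 decimal digits survive)
print(to_fp16(2049.0))  # 2048.0 (integers above 2048 are no longer all representable)
```

Values beyond the largest finite FP16 number (65504) cannot be packed at all, which is one reason FP16 training often relies on loss scaling.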

FP32 Performance

Single-precision floating-point performance, measuring how many 32-bit floating-point operations the GPU can perform per second.
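
A common back-of-the-envelope estimate of peak FP32 throughput multiplies the number of cores by the clock speed and by two, since a fused multiply-add counts as two floating-point operations. The sketch below assumes that convention; the function name and the core count and clock used in the example are hypothetical and do not describe a specific product.

```python
def peak_tflops(cores: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical peak in TFLOPS, assuming one FMA (2 FLOPs) per core per cycle."""
    return cores * clock_ghz * flops_per_cycle / 1000

# Hypothetical GPU: 10,000 FP32 cores at 2.0 GHz
print(peak_tflops(10_000, 2.0))  # 40.0
```

Real workloads rarely reach this figure, because memory bandwidth and occupancy usually limit sustained throughput.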

FP8 Precision

An 8-bit floating-point format used to improve performance and efficiency in AI and machine learning tasks.
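
As an illustration of how few bits are available, the sketch below decodes the E5M2 variant of FP8 (1 sign bit, 5 exponent bits, 2 mantissa bits). The choice of E5M2 is an assumption made for this example, since the definition above does not name a variant. Because E5M2 matches the top byte of an FP16 value, it can be decoded with the standard `struct` module; `decode_e5m2` is a hypothetical helper name.

```python
import struct

def decode_e5m2(byte: int) -> float:
    """Decode one FP8 E5M2 byte by treating it as the high byte of an FP16 value."""
    return struct.unpack("<e", bytes([0, byte & 0xFF]))[0]

print(decode_e5m2(0x3C))  # 1.0
print(decode_e5m2(0x3D))  # 1.25 (the next representable value after 1.0)
print(decode_e5m2(0x7B))  # 57344.0 (largest finite E5M2 value)
```

With only two mantissa bits, neighbouring values near 1.0 are 0.25 apart, which is why FP8 is normally paired with per-tensor scaling factors.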

G

GPU Architecture

The underlying design and organization of a GPU, defining its capabilities and performance characteristics.

I

Infinity Fabric

An interconnect technology developed by AMD to link components such as CPU dies, GPUs, and memory controllers within a chip or system.

INT8 Precision

An 8-bit integer format used to accelerate AI inference by reducing compute and memory requirements.
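
One simple way to map floating-point values onto INT8 is symmetric quantization with a single scale factor. The sketch below is a minimal illustration of that idea, not the scheme of any particular framework; the function names and sample weights are invented for the example.

```python
def quantize_int8(values):
    """Symmetric quantization: map floats onto integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale

def dequantize_int8(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]

weights = [0.82, -1.5, 0.03, 0.4]  # hypothetical model weights
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q)
print(restored)
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which many inference workloads tolerate.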

M

Memory Bandwidth

The rate at which data can be transferred between GPU memory and processing cores, measured in GB/s.

Memory Bus Width

The number of bits that can be transferred simultaneously between the GPU and its memory; together with the memory data rate, it determines memory bandwidth.
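
Memory bandwidth and bus width are linked by a simple relation: bandwidth in GB/s equals the bus width in bits times the per-pin data rate in Gbit/s, divided by 8. The sketch below applies that formula; the function name and the 384-bit, 21 Gbit/s figures are illustrative assumptions.

```python
def memory_bandwidth_gbs(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s from bus width and per-pin data rate."""
    return bus_width_bits * data_rate_gbps / 8

# Hypothetical card: 384-bit bus, 21 Gbit/s per pin
print(memory_bandwidth_gbs(384, 21.0))  # 1008.0
```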

Multi-Instance GPU

An NVIDIA technology that partitions a single GPU into multiple isolated instances, each with its own compute and memory resources, so that different workloads can share it.

N

NVLink

NVIDIA's high-speed interconnect technology for direct GPU-to-GPU communication, bypassing the CPU and PCIe bus.

NVSwitch

An NVIDIA switch chip that connects multiple GPUs over NVLink, so that every GPU in a server can communicate with every other at high bandwidth.

O

OAM Form Factor

OCP Accelerator Module (OAM) is an Open Compute Project form factor standard for high-density accelerator modules in data centers.

P

PCIe Interface

Peripheral Component Interconnect Express, the connection standard used to attach GPUs to the motherboard; its generation and lane count determine data transfer speeds between CPU and GPU.
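
The usable bandwidth of a PCIe link can be estimated from its per-lane transfer rate and lane count. The sketch below assumes the 128b/130b line encoding used by PCIe 3.0 through 5.0 and ignores protocol overhead, so it gives an upper bound; the function name is hypothetical.

```python
def pcie_bandwidth_gbs(transfer_rate_gts: float, lanes: int) -> float:
    """Approximate one-direction bandwidth in GB/s, assuming 128b/130b encoding."""
    encoding_efficiency = 128 / 130
    return transfer_rate_gts * lanes * encoding_efficiency / 8

# PCIe 4.0 runs at 16 GT/s per lane; a typical GPU slot has 16 lanes
print(round(pcie_bandwidth_gbs(16, 16), 1))  # 31.5
```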

R

Ray Tracing Cores

Specialized hardware units designed to accelerate ray tracing calculations for realistic lighting and reflections.

RDMA

Remote Direct Memory Access is a technology that lets one computer read or write another computer's memory directly over the network, without involving either machine's CPU in the data transfer.

S

SR-IOV

Single Root Input/Output Virtualization is a PCIe capability that lets a single physical device, such as a GPU, present multiple virtual functions so that several virtual machines can share it.

SXM Form Factor

NVIDIA's socketed module form factor for data center GPUs, which mounts directly on the server board rather than in a PCIe slot to allow higher power limits and NVLink connectivity.

T

TDP

Thermal Design Power is the maximum amount of heat, measured in watts, that a GPU's cooling system is designed to dissipate under sustained typical workloads.

Tensor Cores

Specialized processing units optimized for matrix multiplication operations used in AI and deep learning.
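
The core operation is a fused matrix multiply-accumulate, D = A x B + C, applied to small tiles. The pure-Python sketch below only shows the arithmetic on a 2x2 tile; it is not how the hardware is programmed, and the function name `mma` is chosen for this illustration.

```python
def mma(a, b, c):
    """Return D = A x B + C for small matrices given as lists of rows."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) + c[i][j] for j in range(cols)]
        for i in range(rows)
    ]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c = [[1, 1], [1, 1]]
print(mma(a, b, c))  # [[20, 23], [44, 51]]
```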

TF32 Precision

A 19-bit NVIDIA Tensor Core format that combines FP32's 8-bit exponent with FP16's 10-bit mantissa, keeping FP32's range while trading precision for speed in AI workloads.
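
TF32 keeps FP32's 8 exponent bits but only 10 of its 23 mantissa bits. The sketch below approximates that by clearing the low 13 mantissa bits of an FP32 value. This is simple truncation chosen for illustration, so the rounding performed by real hardware may differ; `to_tf32` is a hypothetical helper name.

```python
import struct

def to_tf32(x: float) -> float:
    """Approximate TF32 by truncating an FP32 value to a 10-bit mantissa."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= 0xFFFFE000  # clear the low 13 of FP32's 23 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]

print(to_tf32(1 + 2**-10))  # 1.0009765625 (still representable)
print(to_tf32(1 + 2**-11))  # 1.0 (below TF32's resolution)
```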

Transformer Engine

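An NVIDIA technology, introduced with the Hopper architecture, that combines Tensor Core hardware and software to accelerate Transformer model training and inference using reduced-precision formats such as FP8.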

U

Ultra Ethernet

An Ethernet-based networking standard, developed by the Ultra Ethernet Consortium, designed to support the intensive data transfer requirements of AI and HPC workloads in modern data centers.