Components

Technologies that change how systems behave and, in the right situation, could be game changers

This section focuses on the key hardware and software components that can be purchased, installed, and operated directly by your organization — with a clear emphasis on on-prem deployments.

Cloud-only or service-delivered solutions are intentionally excluded here. While they can be useful, they operate under a fundamentally different model.

As AI capabilities spread into more applications and workflows, the demands on infrastructure are rising sharply. The right mix of accelerators, emerging composable fabrics, and supporting technologies can deliver major gains in performance and efficiency — but only if chosen and deployed with a clear understanding of real-world workloads and constraints.

We break these components into two categories to start: Accelerators and Composable & Disaggregated Infrastructure, with more to be added over time.

Accelerators

The accelerator market has become remarkably diverse. Organizations no longer need to pick a single accelerator and live with it for years. It is now entirely practical — and often advantageous — to use multiple accelerator types in the same environment, matching the right tool to the specific workload, efficiency target, and cost constraints.
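
To make "matching the right tool to the specific workload" concrete, here is a minimal placement sketch. Every device name, capacity, and efficiency score below is an illustrative assumption rather than a benchmark, and real selection would also weigh software maturity and operational fit:

```python
from dataclasses import dataclass

# Hypothetical capability profiles; names and numbers are illustrative
# placeholders, not measured results for any vendor's product.
@dataclass
class Accelerator:
    name: str
    memory_gb: int          # on-device memory capacity
    perf_per_watt: float    # relative efficiency score
    supports_training: bool

FLEET = [
    Accelerator("general-purpose-gpu", memory_gb=80,  perf_per_watt=1.0, supports_training=True),
    Accelerator("high-memory-gpu",     memory_gb=192, perf_per_watt=1.1, supports_training=True),
    Accelerator("inference-asic",      memory_gb=64,  perf_per_watt=2.5, supports_training=False),
]

def pick_accelerator(model_size_gb: float, training: bool) -> Accelerator:
    """Pick the most power-efficient device that fits the workload."""
    candidates = [
        a for a in FLEET
        if a.memory_gb >= model_size_gb and (a.supports_training or not training)
    ]
    if not candidates:
        raise ValueError("no accelerator in the fleet fits this workload")
    return max(candidates, key=lambda a: a.perf_per_watt)

# A 150 GB training job lands on the high-memory GPU; a small
# inference job lands on the efficiency-focused ASIC.
print(pick_accelerator(150, training=True).name)   # high-memory-gpu
print(pick_accelerator(20, training=False).name)   # inference-asic
```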

Building a competitive AI accelerator is extremely difficult. It requires deep technical talent, years of development, and significant capital. Reaching a shipping product means the team has overcome major technical and commercial hurdles — and that investors and customers have placed real bets on its viability.

In the table below you’ll find both established players and newer entrants. Many of the emerging accelerators are highly specialized, targeting particular aspects of AI training, inference, or AI-enhanced workloads. Several offer compelling value propositions in performance, power efficiency, or total cost of ownership that are worth serious evaluation.

The right choice depends heavily on your actual workloads, software ecosystem maturity, and operational priorities. There is no universal “best” accelerator — only the best fit for your situation.

| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
|---|---|---|---|
| NVIDIA | Full-stack GPUs (H100/H200/Blackwell) | Mature ecosystem, broad software support | General-purpose AI training & inference |
| AMD | High-memory GPUs | Competitive price/performance, very large HBM capacity, high memory bandwidth, strong FP64 performance, improving ROCm ecosystem | Large-model training and inference, especially workloads that benefit from high memory capacity and strong FP64 performance |
|  | Purpose-built AI ASIC | Strong efficiency and TCO on many LLM tasks | Training & inference where cost/power matter |
| d-Matrix | Digital In-Memory Compute (DIMC) | Excellent perf/watt and low latency for inference | Power-constrained or high-volume inference |
| Cerebras | Wafer-scale engine (WSE-3) | Massive on-chip memory and bandwidth | Extremely large models in dedicated clusters |
| NextSilicon | Maverick-2 Intelligent Compute Accelerator (adaptive dataflow) | Dynamically reconfigures at runtime to optimize for the workload; significantly better perf/watt on complex tasks; runs unmodified code | Traditional HPC and HPC-AI hybrid workloads where power efficiency and adaptability matter |
| InspireSemi | Thunderbird (RISC-V "supercomputer-cluster-on-a-chip") | High core count, strong energy efficiency, general-purpose programmability | Scientific HPC, graph analytics, physics-based simulations, and AI surrogate modeling |
| SambaNova | SN50 Reconfigurable Dataflow Unit (RDU) | Flexible dataflow architecture, strong for agentic inference, good scaling | Enterprise-scale inference and mixed training/inference workloads |
| Tenstorrent | Wormhole / Blackhole (RISC-V based) | Open architecture, flexible scaling from edge to data center | Users prioritizing open ecosystems, programmability, and end-to-end flexibility |

Composable & Disaggregated Infrastructure

One of the biggest practical problems in on-prem data centers today is low accelerator utilization. Expensive GPUs and accelerators often sit idle or underused because they are locked into fixed server configurations or bottlenecked by traditional CPU and networking overhead.

Composable and disaggregated architectures address this by allowing accelerators, storage, and other resources to be dynamically pooled and shared across servers and workloads.
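
A rough calculation shows why fixed configurations strand capacity. The demand numbers below are invented for the example; the point is simply that sizing each server for its own peak buys more accelerators than sizing a shared pool for the combined peak:

```python
# Illustrative demand only: hour-by-hour GPU needs for three workloads.
demand = {
    "training":  [8, 8, 8, 0, 0, 0],
    "inference": [1, 2, 2, 6, 6, 4],
    "analytics": [0, 0, 2, 2, 0, 0],
}

# Fixed configurations: each server must be sized for its workload's peak.
fixed = sum(max(series) for series in demand.values())      # 8 + 6 + 2 = 16

# A shared pool only needs to cover the combined peak across all hours.
pooled = max(sum(hour) for hour in zip(*demand.values()))   # busiest hour = 12

print(fixed, pooled)  # 16 12 -> 25% fewer GPUs for the same demand
```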

Two related but distinct approaches are gaining traction:

  1. True Composable Fabrics — Dynamically pool accelerators (especially GPUs) across servers and racks so they can be allocated on demand (sketched in the example after this list).
  2. AI Inference Orchestration Engines — Maximize utilization within each server by intelligently offloading non-compute work and orchestrating inference workloads more efficiently.
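
As a mental model for the first approach, the sketch below captures the compose/release lifecycle that fabric software automates. The GpuPool class and its methods are hypothetical stand-ins; real fabrics such as GigaIO's or Liqid's expose their own vendor-specific APIs and tooling:

```python
class GpuPool:
    """A rack-level pool of accelerators shared across servers (hypothetical API)."""

    def __init__(self, total_gpus):
        self.free = total_gpus
        self.leases = {}  # server name -> number of attached GPUs

    def compose(self, server, count):
        """Dynamically attach GPUs from the pool to a server."""
        if count > self.free:
            raise RuntimeError(f"pool exhausted: only {self.free} GPUs free")
        self.free -= count
        self.leases[server] = self.leases.get(server, 0) + count

    def release(self, server):
        """Return a server's GPUs to the pool for other workloads."""
        self.free += self.leases.pop(server, 0)


pool = GpuPool(total_gpus=16)
pool.compose("train-node-1", 8)   # burst a training job onto 8 GPUs
pool.compose("infer-node-2", 2)   # a small inference job elsewhere
pool.release("train-node-1")      # GPUs return to the pool instead of idling
print(pool.free)                  # 14
```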

Here are some of the leading on-prem options:

| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
|---|---|---|---|
| GigaIO | GPU Fabric & Composability | Mature low-latency GPU pooling across servers | AI/HPC/Enterprise clusters needing high utilization and resource sharing |
| Liqid | Element composable infrastructure | Highly flexible pooling of GPUs, storage, and accelerators | Environments with varied or bursty workloads |
| NeuReality | NR1 AI-CPU + AI-NIC (inference orchestrator) | Replaces traditional CPU + NIC; pushes GPU utilization near 100%; works with any accelerator | Inference-heavy servers where per-node efficiency and TCO matter most |

Key Takeaway: Use composable fabrics (like GigaIO or Liqid) when you need to share expensive accelerators flexibly across many servers and workloads. Use AI inference orchestration engines (like NeuReality) when you want to dramatically improve utilization of accelerators that stay inside individual servers. Many efficient deployments combine both approaches, often taking advantage of CXL for better memory sharing and coherence.

Vendors included on this site are selected based on technical relevance and real-world deployments.