Components
Technologies that change how systems behave and, in the right circumstances, can be game changers
This section focuses on the key hardware and software components that can be purchased, installed, and operated directly by your organization — with a clear emphasis on on-prem deployments.
Cloud-only or service-delivered solutions are intentionally excluded here. While they can be useful, they operate under a fundamentally different model.
As AI capabilities spread into more applications and workflows, the demands on infrastructure are rising sharply. The right mix of accelerators, emerging composable fabrics, and supporting technologies can deliver major gains in performance and efficiency — but only if chosen and deployed with a clear understanding of real-world workloads and constraints.
To start, we break these components into two primary categories: Accelerators and Composable & Disaggregated Infrastructure. More categories will be added over time.
Accelerators
The accelerator market has become remarkably diverse. Organizations no longer need to pick a single accelerator and live with it for years. It is now entirely practical — and often advantageous — to use multiple accelerator types in the same environment, matching the right tool to the specific workload, efficiency target, and cost constraints.
Building a competitive AI accelerator is extremely difficult. It requires deep technical talent, years of development, and significant capital. Reaching a shipping product means the team has overcome major technical and commercial hurdles — and that investors and customers have placed real bets on its viability.
In the table below you’ll find both established players and newer entrants. Many of the emerging accelerators are highly specialized, targeting particular aspects of AI training, inference, or AI-enhanced workloads. Several offer compelling value propositions in performance, power efficiency, or total cost of ownership that are worth serious evaluation.
The right choice depends heavily on your actual workloads, software ecosystem maturity, and operational priorities. There is no universal “best” accelerator — only the best fit for your situation.
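One way to make the "best fit" framing concrete is a simple weighted-scoring exercise across your own priorities. The sketch below is purely illustrative: the criteria, weights, candidate names, and ratings are hypothetical placeholders, not benchmark data, and a real evaluation would rest on measured workload results.

```python
# Illustrative accelerator-selection sketch: score candidates against
# organization-specific priorities. All names and numbers are
# hypothetical placeholders, not measurements or vendor claims.

def score(candidate: dict, weights: dict) -> float:
    """Weighted sum of per-criterion ratings (0-10 scale)."""
    return sum(weights[c] * candidate["ratings"][c] for c in weights)

# Hypothetical priorities: this org weights software-ecosystem maturity
# most heavily, then raw performance, efficiency, and cost.
weights = {"perf": 0.3, "perf_per_watt": 0.2, "ecosystem": 0.4, "tco": 0.1}

candidates = [
    {"name": "general_purpose_gpu",
     "ratings": {"perf": 9, "perf_per_watt": 6, "ecosystem": 10, "tco": 5}},
    {"name": "inference_asic",
     "ratings": {"perf": 7, "perf_per_watt": 9, "ecosystem": 5, "tco": 8}},
]

best = max(candidates, key=lambda c: score(c, weights))
print(best["name"], round(score(best, weights), 2))
```

With different weights (say, a power-constrained inference farm that weights perf/watt and TCO most heavily), the same ratings would favor the other candidate, which is the point: the ranking falls out of your priorities, not the hardware alone.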
| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
| --- | --- | --- | --- |
| NVIDIA | Full-stack GPUs (H100/H200/Blackwell) | Mature ecosystem, broad software support | General-purpose AI training & inference |
| AMD | High-memory GPUs | Competitive price/performance, very large HBM capacity, high memory bandwidth, strong FP64 performance, improving ROCm ecosystem | Large-model training and inference, especially workloads that benefit from high memory capacity and strong FP64 performance |
| | Purpose-built AI ASIC | Strong efficiency and TCO on many LLM tasks | Training & inference where cost/power matter |
| d-Matrix | Digital In-Memory Compute (DIMC) | Excellent perf/watt and low latency for inference | Power-constrained or high-volume inference |
| Cerebras | Wafer-scale engine (WSE-3) | Massive on-chip memory and bandwidth | Extremely large models in dedicated clusters |
| NextSilicon | Maverick-2 Intelligent Compute Accelerator (adaptive dataflow) | Dynamically reconfigures at runtime to optimize for the workload; significantly better perf/watt on complex tasks; runs unmodified code | Traditional HPC and HPC-AI hybrid workloads where power efficiency and adaptability matter |
| InspireSemi | Thunderbird (RISC-V "supercomputer-cluster-on-a-chip") | High core count, strong energy efficiency, general-purpose programmability | Scientific HPC, graph analytics, physics-based simulations, and AI surrogate modeling |
| SambaNova | SN50 Reconfigurable Dataflow Unit (RDU) | Flexible dataflow architecture, strong for agentic inference, good scaling | Enterprise-scale inference and mixed training/inference workloads |
| Tenstorrent | Wormhole / Blackhole (RISC-V based) | Open architecture, flexible scaling from edge to data center | Users prioritizing open ecosystems, programmability, and end-to-end flexibility |
Composable & Disaggregated Infrastructure
One of the biggest practical problems in on-prem data centers today is low accelerator utilization. Expensive GPUs and accelerators often sit idle or underused because they are locked into fixed server configurations or bottlenecked by traditional CPU and networking overhead.
Composable and disaggregated architectures address this by allowing accelerators, storage, and other resources to be dynamically pooled and shared across servers and workloads.
Two related but distinct approaches are gaining traction:
- True Composable Fabrics — Dynamically pool accelerators (especially GPUs) across servers and racks so they can be allocated on demand.
- AI Inference Orchestration Engines — Maximize utilization within each server by intelligently offloading non-compute work and orchestrating inference workloads more efficiently.
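The utilization problem that composable fabrics solve can be illustrated with a toy allocation model. In the sketch below (all numbers are hypothetical, chosen only to make the effect visible), a fixed topology strands one spare GPU on each server, while a pooled fabric lets the same hardware absorb jobs that no single server could host.

```python
# Toy model: 4 servers x 4 GPUs, jobs arriving with varying GPU demands.
# Fixed topology: a job must fit entirely on one server's local GPUs.
# Composable pool: the same 16 GPUs are allocated from a shared fabric.
# Job sizes and cluster shape are illustrative, not measurements.

jobs = [3, 3, 3, 3, 2, 2]  # GPUs requested per job

def fixed_allocation(jobs, servers=4, gpus_per_server=4):
    """First-fit placement onto individual servers; jobs that fit on no
    single server are turned away, stranding leftover GPUs."""
    free = [gpus_per_server] * servers
    used = 0
    for need in jobs:
        for i, f in enumerate(free):
            if f >= need:
                free[i] -= need
                used += need
                break
    return used

def pooled_allocation(jobs, total_gpus=16):
    """Fabric-attached pool: a job can draw GPUs regardless of which
    server chassis they physically sit in."""
    used = 0
    for need in jobs:
        if used + need <= total_gpus:
            used += need
    return used

print(fixed_allocation(jobs) / 16)   # 0.75 -- four GPUs stranded
print(pooled_allocation(jobs) / 16)  # 1.0  -- pool absorbs the 2-GPU jobs
```

After the four 3-GPU jobs land, each server holds one free GPU, so the fixed topology rejects both 2-GPU jobs even though four GPUs sit idle; the pooled model places them. Inference orchestration engines attack the complementary problem: raising how busy each allocated GPU actually is inside its server.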
Here are some of the leading on-prem options:
| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
| --- | --- | --- | --- |
| GigaIO | GPU fabric & composability | Mature low-latency GPU pooling across servers | AI/HPC/enterprise clusters needing high utilization and resource sharing |
| Liqid | Element composable infrastructure | Highly flexible pooling of GPUs, storage, and accelerators | Environments with varied or bursty workloads |
| NeuReality | NR1 AI-CPU + AI-NIC (inference orchestrator) | Replaces the traditional CPU + NIC; pushes GPU utilization near 100%; works with any accelerator | Inference-heavy servers where per-node efficiency and TCO matter most |
Key Takeaway: Use composable fabrics (like GigaIO or Liqid) when you need to share expensive accelerators flexibly across many servers and workloads. Use AI inference orchestration engines (like NeuReality) when you want to dramatically improve utilization of accelerators that stay inside individual servers. Many efficient deployments combine both approaches, often taking advantage of CXL for better memory sharing and coherence.
Vendors included on this site are selected based on technical relevance and real-world deployments.