Components
Technologies that change how systems behave and, in the right situation, could be game changers
This section focuses on technologies that can be deployed in real data centers — systems, components, and software that can be purchased, installed, and operated directly by an organization.
Cloud-only and service-delivered technologies are intentionally excluded. While often important, they represent a different model — consuming compute rather than operating it. The focus here is on infrastructure and software you control, integrate, and tune within your own environment.
AI capabilities will increasingly be infused into any application or workflow where they make sense. While this will deliver real benefits, it also brings significantly higher computational demands. This section highlights technologies that help improve efficiency and performance, including those that don’t fit neatly into traditional categories like systems or cooling.
We have a few categories to start with but will be adding more over time.
Accelerators
Accelerators have become the primary drivers of performance in modern data centers, particularly as AI capabilities are incorporated into a growing range of applications and workflows. While CPUs remain essential, much of the computational heavy lifting has shifted to specialized processors designed to handle parallel, high-throughput workloads efficiently.
The accelerator market today is built on two fundamentally different approaches. NVIDIA and AMD offer GPU-based platforms designed for flexibility across a wide range of workloads, supported by large and mature software ecosystems. Most other vendors take a more specialized ASIC-based approach, building domain-specific architectures aimed at particular workloads, often with the goal of improving performance or efficiency.
That architectural difference has real implications. GPUs benefit from broad software support, established tools, and a large base of developers. More specialized accelerators can offer compelling advantages for specific use cases, but typically require greater alignment between hardware, software, and workloads, and may involve more effort to integrate and operate effectively.
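To make that ecosystem difference concrete, here is a minimal sketch, assuming a standard PyTorch environment, of why GPU platforms are often the path of least resistance: the same code runs unchanged on NVIDIA (CUDA) and AMD (ROCm) builds of PyTorch, since ROCm builds expose the `cuda` device alias, whereas specialized accelerators generally require a vendor-specific backend or compiler stack.

```python
# Minimal sketch: the same high-level code targets NVIDIA (CUDA) or AMD (ROCm)
# builds of PyTorch without changes, because ROCm builds expose the "cuda"
# device alias. Specialized accelerators typically need a vendor-specific
# plugin or compiler toolchain instead.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
name = torch.cuda.get_device_name(0) if device.type == "cuda" else "CPU"
print(f"Running on: {name}")

model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)

with torch.no_grad():
    y = model(x)
print(y.shape)  # torch.Size([8, 4096])
```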
The accelerator landscape is expanding rapidly, with a mix of established vendors and newer entrants offering a wide range of architectures and approaches. However, not all of these technologies are equally mature or broadly deployable. Some are widely used across enterprise, cloud, and HPC environments, while others are earlier in their lifecycle or targeted at more specialized use cases.
This section focuses on accelerators that can be deployed in on-premises data centers. Solutions that are only available through cloud or service-based models are not included, as they cannot be directly integrated into enterprise or HPC system architectures.
As with systems more broadly, there is no single “best” accelerator. The right choice depends heavily on workload characteristics, software ecosystem support, and how well the technology integrates into an organization’s overall infrastructure. In practice, most environments will include a mix of architectures optimized for different types of work.
| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
|---|---|---|---|
| NVIDIA | Full-stack GPUs (H100/H200/Blackwell) | Mature ecosystem, broad software support | General-purpose AI training & inference |
| AMD | High-memory GPUs | Competitive price/performance, very large HBM capacity, strong FP64 performance, high memory bandwidth, improving ROCm ecosystem | Large-model training and inference, especially workloads that benefit from high memory capacity and strong FP64 performance |
| Intel (Gaudi) | Purpose-built AI ASIC | Strong efficiency and TCO on many LLM tasks | Training & inference where cost/power matter |
| d-Matrix | Digital In-Memory Compute (DIMC) | Excellent perf/watt and low latency for inference | Power-constrained or high-volume inference |
| Cerebras | Wafer-scale engine (WSE-3) | Massive on-chip memory and bandwidth | Extremely large models in dedicated clusters |
| NextSilicon | Maverick-2 Intelligent Compute Accelerator (adaptive dataflow) | Dynamically reconfigures at runtime to optimize for the workload; significantly better perf/watt on complex tasks; runs unmodified code | Traditional HPC and HPC-AI hybrid workloads where power efficiency and adaptability matter |
| InspireSemi | Thunderbird (RISC-V "supercomputer-cluster-on-a-chip") | High core count, strong energy efficiency, general-purpose programmability | Scientific HPC, graph analytics, physics-based simulations, and AI surrogate modeling |
| SambaNova | SN50 Reconfigurable Dataflow Unit (RDU) | Flexible dataflow architecture, strong for agentic inference, good scaling | Enterprise-scale inference and mixed training/inference workloads |
| Tenstorrent | Wormhole / Blackhole (RISC-V based) | Open architecture, flexible scaling from edge to data center | Users prioritizing open ecosystems, programmability, and end-to-end flexibility |
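The memory-capacity claims in the table become clearer with some back-of-envelope arithmetic. The sketch below estimates the accelerator memory an LLM inference workload needs from parameter count, precision, and KV-cache size; all figures are illustrative assumptions, not vendor specifications.

```python
# Back-of-envelope accelerator memory sizing for LLM inference.
# All numbers below are illustrative assumptions, not vendor specs.

def inference_memory_gb(params_b: float, bytes_per_param: int,
                        kv_cache_gb: float, overhead_frac: float = 0.2) -> float:
    """Weights plus KV cache, padded by a fractional overhead for
    activations and framework buffers."""
    weights_gb = params_b * bytes_per_param  # 1B params at 1 byte ~= 1 GB
    return (weights_gb + kv_cache_gb) * (1 + overhead_frac)

# Example: a 70B-parameter model in FP16 (2 bytes/param) with ~20 GB of KV cache
need = inference_memory_gb(params_b=70, bytes_per_param=2, kv_cache_gb=20)
print(f"~{need:.0f} GB needed")  # ~192 GB: exceeds a single 80 GB GPU, so it
                                 # must be sharded or run on high-capacity parts
```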
Composable & Disaggregated Infrastructure
As accelerators become more expensive and power-hungry, one of the biggest practical problems in on-prem data centers is low utilization. Expensive GPUs and accelerators often sit idle or underused because they are locked into fixed server configurations or bottlenecked by traditional CPU and networking overhead.
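A rough cost model shows why this matters: the effective price of a useful accelerator-hour scales inversely with utilization. The purchase price, lifetime, and power figures in the sketch below are illustrative assumptions only.

```python
# Why utilization dominates accelerator economics: halving utilization
# doubles the cost of every useful hour. All inputs are illustrative.

def cost_per_useful_hour(capex: float, lifetime_hours: float,
                         power_kw: float, price_per_kwh: float,
                         utilization: float) -> float:
    hourly = capex / lifetime_hours + power_kw * price_per_kwh
    return hourly / utilization

# Assume a $30,000 accelerator, 5-year life (~43,800 h), 0.7 kW, $0.10/kWh
for u in (0.25, 0.50, 0.90):
    c = cost_per_useful_hour(30_000, 43_800, 0.7, 0.10, u)
    print(f"{u:.0%} utilized: ${c:.2f} per useful hour")
# 25%: ~$3.02 | 50%: ~$1.51 | 90%: ~$0.84
```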
Two related but distinct approaches help solve this:
1. True Composable Fabrics — dynamically pool accelerators across servers/racks (a conceptual sketch of pooling follows this list).
2. AI Inference Orchestration Engines — maximize utilization within each server by offloading non-compute work.
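The sketch below illustrates the semantics of approach 1 at a purely conceptual level: devices live in a rack-level pool, are attached to a host for the duration of a job, and return to the pool afterward. It is a generic illustration, not the actual management API of GigaIO, Liqid, or any other vendor.

```python
# Conceptual sketch of composable-infrastructure semantics. Hypothetical
# classes and method names for illustration only; real fabrics expose
# their own vendor-specific management planes.
from dataclasses import dataclass, field

@dataclass
class DevicePool:
    free: list = field(default_factory=list)
    attached: dict = field(default_factory=dict)

    def compose(self, host: str, count: int) -> list:
        """Attach `count` devices from the rack pool to a host."""
        if count > len(self.free):
            raise RuntimeError("pool exhausted; job must wait or shrink")
        devs, self.free = self.free[:count], self.free[count:]
        self.attached.setdefault(host, []).extend(devs)
        return devs

    def release(self, host: str) -> None:
        """Return a host's devices to the shared pool."""
        self.free.extend(self.attached.pop(host, []))

pool = DevicePool(free=[f"gpu{i}" for i in range(8)])
print(pool.compose("node-a", 6))   # training burst on node-a
pool.release("node-a")             # devices return to the rack pool
print(pool.compose("node-b", 4))   # same hardware reused for inference
```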
Here are the leading on-prem options:
| Vendor | Approach | Key Strengths | Best Suited For (On-Prem) |
|---|---|---|---|
| GigaIO | GPU fabric & composability | Mature low-latency GPU pooling across servers | AI/HPC/enterprise clusters needing high utilization and resource sharing |
| Liqid | Element composable infrastructure | Highly flexible pooling of GPUs, storage, and accelerators | Environments with varied or bursty workloads |
| NeuReality | NR1 AI-CPU + AI-NIC (inference orchestrator) | Replaces traditional CPU + NIC; pushes GPU utilization near 100%; works with any accelerator | Inference-heavy servers where per-node efficiency and TCO matter most |
Key takeaway: Use composable fabrics (like GigaIO or Liqid) when you need to share accelerators flexibly across nodes. Use NeuReality when you want to get maximum performance out of the accelerators inside each server. Many efficient deployments combine both, often taking advantage of CXL for better memory sharing and coherency.
Vendors included on this site are selected based on technical relevance and real-world deployments.