The Case for a New Data Processing Benchmark in the Age of Heterogeneous Computing

Industry-wide collaboration is essential to develop standards that guide smarter infrastructure decisions, ensuring efficiency and ROI in the evolving landscape of AI and analytics.

Rajan Goyal, CEO, DataPelago

November 27, 2025


Data infrastructure is undergoing its most significant transformation in decades. Generative AI and the shift toward heterogeneous accelerated computing environments, which combine different kinds of hardware, are reshaping the core requirements of a modern data stack. The ability to process complex datasets quickly and cost-effectively for AI and analytics has become a defining factor in operational efficiency and infrastructure ROI.

Historically, data processing performance has been determined by the sophistication of the query planner and the strength of the execution engine, under the assumption that the underlying hardware is uniform across systems. Existing data processing benchmarks, such as TPC-DS and TPC-H, reflect that assumption: they are designed to test the performance and efficiency of a system at the workload level, not to distinguish the hardware underneath it.

Today’s data centers feature a wide range of accelerated computing hardware, including GPUs, TPUs, and FPGAs, with data processing performance and efficiency increasingly shaped by these hardware components as well. What was once a standardized infrastructure layer has evolved into a heterogeneous computing environment with distinct strengths and limitations.

Nearly every hardware vendor claims its hardware is best suited for data processing, citing specifications such as peak FLOPS, memory bandwidth, and tensor throughput. But these specs may not translate directly into real-world data processing performance. For example, a GPU might advertise 28 petaflops, yet much of that compute resides in tensor cores that are irrelevant to ETL tasks. Even when specifications are relevant, actual results often depend on increasingly complex system-level interactions, such as CPU-to-GPU connectivity, GPU-to-GPU data movement, the ratio of CPUs to GPUs in a system, memory capacity, and memory bandwidth.
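A back-of-the-envelope roofline estimate illustrates the gap. The sketch below is a minimal illustration with assumed figures; the peak-FLOPS and bandwidth numbers are hypothetical, not measurements of any particular GPU:

```python
# Back-of-the-envelope roofline estimate for a scan + filter ETL operator.
# All figures below are illustrative assumptions, not vendor measurements.

peak_flops = 28e15        # advertised peak, FLOP/s (mostly tensor cores)
mem_bandwidth = 3e12      # assumed HBM bandwidth, bytes/s

# Filtering 64-bit values performs roughly one comparison per 8 bytes read,
# so the operator's arithmetic intensity is about 1/8 FLOP per byte.
arithmetic_intensity = 1 / 8

# The achievable rate is capped by how fast bytes arrive from memory:
achievable = min(peak_flops, mem_bandwidth * arithmetic_intensity)

print(f"achievable: {achievable:.2e} FLOP/s "
      f"= {achievable / peak_flops:.4%} of the advertised peak")
# ~3.75e11 FLOP/s, about 0.001% of peak: the operator is memory-bound,
# so spec-sheet FLOPS say almost nothing about ETL throughput.
```

Under these assumptions, the scan-and-filter operator saturates memory bandwidth while using a vanishingly small fraction of the advertised compute, which is exactly why a spec sheet alone cannot predict data processing throughput.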


For operators responsible for designing clusters and forecasting throughput, this gap between spec-sheet performance and real-world workload performance introduces significant risk: inefficient power usage, stranded accelerator capacity, and suboptimal node configurations that can persist for years. 

The result is a growing disconnect. Data center operators are forced to make critical infrastructure decisions based on incomplete and misleading indicators. Just as benchmarks like CoreMark helped normalize CPU performance comparisons across tasks, the industry now needs a standardized way to measure today's accelerated hardware and determine which processors perform best on core data processing tasks.


Properties of an Effective Modern Benchmark

For such a benchmark to be impactful, it must accurately reflect the realities of modern infrastructure rather than relying on legacy assumptions. In practice, that means meeting several key criteria:

  • System-level Measurement: Rather than evaluating individual components, a benchmark must assess the performance of the entire system within a node. That requires datasets large enough that they don't fit entirely in host memory, forcing the benchmark to exercise real data movement patterns and memory hierarchies. This approach prevents systems with larger caches from gaining unfair advantages and yields more realistic performance assessments.

  • Vendor Agnostic: To ensure fair comparisons across different technologies and architectures, a benchmark must not be tied to, or biased toward, any single vendor's technology or approach. That allows organizations to make informed decisions based on their specific requirements rather than on benchmark optimizations.

  • Reflect Modern Distributed Systems: To accurately reflect modern distributed computing environments, a benchmark should effectively evaluate performance in both single-node and scale-out multi-node configurations.

  • Coverage of Diverse Workloads: ETL, business intelligence (BI), and generative AI workloads each stress different aspects of the data processing pipeline. ETL workloads emphasize operations such as scanning, projection, filtering, aggregation, and joins, while BI workloads add complexity with JSON processing, shuffle operations, window functions, and top-K queries. GenAI introduces entirely new requirements around data extraction, filtering, tokenization, and embedding generation. A comprehensive benchmark must account for all of these workloads, and may even require separate evaluations for each category, recognizing that a system optimized for traditional BI queries might not perform well on AI data preparation tasks (see the sketch after this list).
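To make that last requirement concrete, here is a minimal sketch of how a benchmark might enumerate the three workload categories and score them independently. The suite names, operator names, and scoring scheme are illustrative assumptions, not a proposed standard:

```python
# Illustrative workload taxonomy for a heterogeneous data processing
# benchmark, drawn from the categories above. All names are hypothetical.
import math

WORKLOAD_SUITES = {
    "etl":        ["scan", "project", "filter", "aggregate", "join"],
    "bi":         ["json_parse", "shuffle", "window", "top_k"],
    "genai_prep": ["extract", "filter", "tokenize", "embed"],
}

def score_system(run_op) -> dict:
    """Score each suite independently, so a system that excels at BI
    queries is not assumed to excel at AI data preparation.

    `run_op` is a caller-supplied callable mapping an operator name to
    measured throughput (e.g., rows per second) on the system under test.
    """
    scores = {}
    for suite, ops in WORKLOAD_SUITES.items():
        throughputs = [run_op(op) for op in ops]
        # Geometric mean keeps a single outlier operator from dominating.
        scores[suite] = math.prod(throughputs) ** (1 / len(throughputs))
    return scores
```

Scoring each suite separately, rather than blending everything into one number, is what lets an operator see that a node tuned for BI shuffles may still lag on tokenization-heavy GenAI preparation.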


The Path Forward

Benchmarks are more than a technical exercise; they shape how businesses evaluate technology and which solutions they invest in. It has become clear that no existing benchmark captures the nuances of today's heterogeneous computing environments or the relative strengths of different accelerated hardware for data processing.

Developing such a benchmark, however, isn't a challenge any single company can solve alone. It requires industry-wide collaboration to define, validate, and adopt new standards that serve the entire ecosystem. Hardware vendors, software developers, data center operators, and end users will all need to work together to create benchmarks that accurately reflect the performance characteristics of modern data processing systems.

For data center operators, the stakes are clear. Billions of dollars are being invested in new data center development, and the effective planning, design, and operation of these facilities depend on an accurate understanding of how different accelerators perform under real data processing workloads, not on synthetic or training-oriented metrics. The industry has an opportunity to create a modern benchmark that provides the clarity needed to make smarter infrastructure decisions, avoid costly missteps, and ensure systems are optimized for the workloads defining the future of AI and analytics.
