CSC Standards Framework v1.0
Version 1.0 of the Compute Standard has been published to encourage conversation around future drafts and to serve as a baseline for the community. It provides a suite of benchmarks for performance measurement, along with a tool and schema for capturing the core elements of an AI compute environment. With these tools, you will be able to run simple commands on AI infrastructure and capture benchmarks and environment information within a consistent framework, one that measures workload performance rather than simple comparators such as FLOPs or tensor counts, and that records a full picture of the environment. Armed with this data, you will be in a position to compare and evaluate multiple solutions more accurately.
For this first release we are leveraging the tools provided by NVIDIA DGX™ Cloud Benchmarking, which gauge training and inference performance across AI workloads and platforms. We are also exposing the schema SiliconMark has assembled to gather environment details for the system running the benchmarks. Usage instructions for these toolsets can be found below. We invite your organization to join our community as a creator or reviewer of future standards releases. For some of our thoughts on what will follow, please see the Future section below.
Version Goals
- Establish a baseline of measurements for initial use and consideration
- Expose a schema for capturing information about a system, using a taxonomy exposed by well-known system tools
- Position initial tooling for capturing data


Benchmarks
For this first release we are utilizing the benchmark tools provided by NVIDIA in the DGX Cloud Benchmarking Performance Recipes. These benchmarks have been developed to evaluate the performance of specific AI use cases across hardware and software combinations. These containerized recipes allow users to quickly set up and run standardized benchmarking methodology in their own environment, ensuring consistent and comparable results across platforms.
These Performance Recipes support performance characterization:
- across a variety of defined AI workloads, including pre-training, fine-tuning, and inference.
- across GPU-based infrastructure, whether running on-premises or with cloud service providers (CSPs).

Schema and Sample Capture
SiliconMark uses a variety of tools to capture and consolidate system information into a single JSON schema. Today, the taxonomy used in the output comes directly from those system tools, and examples of that output are provided in the sample below.
{
  "uuid": "REDACTED_UUID",                  // Unique identifier for the machine generated from serials/fixed attributes
  "gpu_model": "GENERIC_GPU_MODEL",         // Name of the installed GPU
  "timestamp": "2025-07-01T00:00:00Z",      // Time the benchmark was run (ISO 8601)
  "system_info": {
    "cpu_info": {
      "os": "GENERIC_OS_NAME",              // Operating system full name and version
      "arch": "x86-64",                     // CPU architecture
      "chassis": "vm",                      // System form factor (e.g., virtual machine, laptop, server)
      "cpu_count": "4",                     // Logical CPU cores available
      "hardware_model": "GENERIC_INSTANCE_TYPE", // Host or VM instance type
      "virtualization": "GENERIC_VENDOR",   // Virtualization environment (e.g., AWS, VMware)
      "hardware_vendor": "GENERIC_VENDOR",  // Hardware manufacturer or cloud provider
      "processor_brand": "GENERIC_CPU_MODEL" // Full processor model string
    },
    "dmi_info": {
      "system": {},
      "chassis": {}
    },                                      // Usually populated with BIOS and hardware identifiers if available
    "gpu_info": [
      {
        "name": "GENERIC_GPU_MODEL",        // GPU name
        "pci_info": {
          "generation_max": "4",            // PCIe maximum generation supported
          "link_width_max": "16",           // Max number of PCIe lanes
          "generation_current": "1",        // PCIe generation currently negotiated
          "link_width_current": "8"         // Active PCIe lanes currently negotiated
        },
        "free_memory": "22 GB",             // Free GPU memory
        "temperature": "39",                // GPU temperature in Celsius
        "cuda_version": "12.8",             // CUDA driver version available
        "total_memory": "22 GB",            // Total memory of the GPU
        "driver_version": "GENERIC_DRIVER_VERSION" // GPU driver version
      }
    ],
    "location": "REGION_PLACEHOLDER",       // Region or country (used for location-based analysis)
    "ram_info": {
      "free_memory": "4 GB",                // RAM free at time of test
      "total_memory": "14 GB",              // Total memory available to OS
      "physical_memory": "16 GB"            // Installed physical memory
    },
    "config_id": "REDACTED_CONFIG_ID",      // Deterministic hash/fingerprint of system config (not user-specific)
    "disk_info": [
      {
        "name": "disk1",                    // OS-assigned name for disk
        "rota": false,                      // Whether it spins (false = SSD, true = HDD)
        "size": "233 GB",
        "type": "disk",
        "model": "GENERIC_DISK_MODEL",      // Disk make/model (or cloud provider equivalent)
        "read_speed": "131.69 MB/s",        // Disk read benchmark
        "mountpoints": ["/mount"],
        "write_speed": "91.69 MB/s"         // Disk write benchmark
      },
      {
        "name": "disk0",
        "rota": false,
        "size": "1000 GB",
        "type": "disk",
        "model": "GENERIC_DISK_MODEL",
        "children": [
          {
            "name": "disk0p1",              // Partition name
            "rota": false,
            "size": "1000 GB",
            "type": "part",
            "mountpoints": [
              "/var/lib/containers/storage/overlay",
              "/"
            ]
          }
        ],
        "mountpoints": [
          ""
        ]
      }
    ],
    "gpu_count": 1,                         // Number of GPUs in the system
    "machine_uuid": "REDACTED_UUID",        // Same as top-level uuid; included for easy access
    "network_info": {
      "open_ports": "8",                    // Number of open network ports
      "upload_speed_mbps": "4942",          // Upload speed measured during benchmark
      "download_speed_mbps": "8251"         // Download speed measured during benchmark
    }
  },
  "test_results": [
    {
      "gpu_id": "REDACTED_GPU_ID",          // Per-device unique identifier
      "bf16_tflops": 41.18,                 // BF16 tensor performance in TFLOPS
      "fp16_tflops": 38.07,                 // FP16 tensor performance in TFLOPS
      "fp32_tflops": 10.89,                 // FP32 compute performance in TFLOPS
      "memory_bandwidth_gbs": 115.73,       // Memory bandwidth measured (GB/s)
      "temperature_centigrade": 39,         // GPU temperature at test time
      "power_consumption_watts": 47.55      // Power used during benchmark (W)
    }
  ],
  "network_results": [
    {
      "dest_ip": "REDACTED_IP",             // IP of the target system tested for network performance
      "host_ip": "REDACTED_IP",             // IP of the local system
      "latency_ms": 0.337,                  // Round-trip latency (ms)
      "throughput_gbps": 9.54,              // Network throughput in Gbps
      "throughput_mbps": 9537               // Throughput in Mbps (duplicate of above, in smaller unit)
    }
  ],
  "aggregate_results": {                    // Aggregated or average values from `test_results`
    "gpu_id": "REDACTED_GPU_ID",
    "bf16_tflops": 41.18,
    "fp16_tflops": 38.07,
    "fp32_tflops": 10.89,
    "memory_bandwidth_gbs": 115.73,
    "temperature_centigrade": 39,
    "power_consumption_watts": 47.55
  }
}
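Because the capture is plain JSON, it is straightforward to post-process. As one illustration (a minimal sketch, not part of the CSC tooling), the sample above reports a negotiated PCIe link (gen 1, x8) below the hardware maximum (gen 4, x16), a condition worth flagging before comparing benchmark numbers across systems. The field names below are taken from the sample schema.

```python
# Sketch: flag GPUs whose negotiated PCIe link is below the hardware maximum,
# using the field names from the SiliconMark sample capture above.
def pcie_link_issues(capture: dict) -> list[str]:
    issues = []
    for gpu in capture["system_info"]["gpu_info"]:
        pci = gpu["pci_info"]
        if pci["generation_current"] != pci["generation_max"]:
            issues.append(
                f'{gpu["name"]}: PCIe gen {pci["generation_current"]} '
                f'negotiated, gen {pci["generation_max"]} supported'
            )
        if pci["link_width_current"] != pci["link_width_max"]:
            issues.append(
                f'{gpu["name"]}: x{pci["link_width_current"]} lanes active, '
                f'x{pci["link_width_max"]} available'
            )
    return issues

# Trimmed-down version of the sample capture above.
sample = {
    "system_info": {
        "gpu_info": [
            {
                "name": "GENERIC_GPU_MODEL",
                "pci_info": {
                    "generation_max": "4",
                    "link_width_max": "16",
                    "generation_current": "1",
                    "link_width_current": "8",
                },
            }
        ]
    }
}

for issue in pcie_link_issues(sample):
    print(issue)
```

A degraded link like this is common on virtualized instances and can explain lower-than-expected memory-bandwidth or multi-GPU results, which is exactly the kind of environment context the schema is meant to preserve alongside the benchmark numbers.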
Getting Started
In version 1, the benchmarking tools and the data capture tool from SiliconData must be run manually and separately.
For instructions on using NVIDIA DGX™ Cloud Benchmarking, please refer to their GitHub Project.
For instructions on using SiliconMark, please refer to the User Guide.
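Since the two tools run separately in version 1, combining their outputs is left to the user. A minimal sketch of that glue step is shown below; the shape of the benchmark output (`workload`, `tokens_per_second`) is a placeholder assumption, not part of the standard, while `uuid` and `system_info` come from the SiliconMark schema.

```python
# Sketch: merge one SiliconMark environment capture with one benchmark
# result into a single record keyed by machine UUID. The benchmark
# output shape here is a hypothetical placeholder.
def merge_run(capture: dict, benchmark: dict) -> dict:
    return {
        "machine_uuid": capture["uuid"],        # from the SiliconMark schema
        "environment": capture["system_info"],  # full environment picture
        "benchmark_results": benchmark,         # recipe output, shape assumed
    }

merged = merge_run(
    {"uuid": "REDACTED_UUID", "system_info": {"gpu_count": 1}},
    {"workload": "pretraining", "tokens_per_second": 1234.5},
)
```

Keeping environment and results in one record is what makes cross-platform comparisons meaningful: two benchmark numbers are only comparable once the captures show the systems were actually equivalent.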
Future
The next iteration of the framework will incorporate input from the inaugural members representing each Subcommittee and the full AI stack. It will include data collection standards to ensure that the information collected for a defined environment is trustworthy. We also plan to include policy recommendations.
CSC is committed to expanding and refining its benchmark methods and tooling and welcomes your participation. If you are an AI developer, infrastructure supplier, or user, this is your opportunity to help shape the standardized testing of industry solutions! Contact CSC with your ideas.
CSC is collaborating with SiliconData to provide a tool that will execute a discovery routine and selected benchmarks, and package the results for publication. SiliconData will also securely and immutably host the details from the data collection.