PARAM Siddhi-AI

About PARAM Siddhi-AI

The NPSF at C-DAC, Pune has commissioned PARAM Siddhi-AI, the fastest HPC/AI system in India, as part of the NSM initiative. The system is a dense GPU compute resource for running popular AI and HPC workloads. It comprises 42 NVIDIA DGX-A100 compute nodes and 1 login node; each compute node has 2 AMD EPYC CPUs, 8 A100 GPUs and 1 TB of RAM, for a total of 336 NVIDIA A100 GPUs. The total peak computing capacity is 6.745 PFLOPS (double precision) and 210 PFLOPS (AI).

PARAM Siddhi-AI at C-DAC is intended to serve as AI/HPC-specific cloud computing infrastructure for India, covering academia, R&D institutes and start-ups. As a centralized facility, it is meant to increase accessibility and utilization, to support larger and more diverse R&D projects in the AI and HPC domains, and to address India-specific real-life problems. The facility will also allow India's massive data sets, from areas such as healthcare and agriculture, to be stored locally on high-throughput, efficient storage.

Use cases for PARAM Siddhi-AI range from big data analytics to specialized AI/HPC solutions across multiple domains: healthcare (precision diagnostics, non-invasive diagnostics, etc.), agriculture (precision agriculture, crop infestations, advanced agronomic advisory, etc.), weather forecasting, security and surveillance, financial inclusion and other services (fraud detection), and infrastructural tools such as NLP.
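The figures quoted in the overview can be cross-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and dividing the system peak evenly across the GPUs is a simplifying assumption that ignores any CPU contribution to the quoted peaks.

```python
# Figures quoted in the overview text.
NODES = 42              # DGX-A100 compute nodes
GPUS_PER_NODE = 8       # A100 GPUs per node
PEAK_DP_PFLOPS = 6.745  # quoted double-precision peak
PEAK_AI_PFLOPS = 210.0  # quoted AI peak


def total_gpus(nodes: int, gpus_per_node: int) -> int:
    """Total GPU count for a homogeneous cluster."""
    return nodes * gpus_per_node


def per_gpu_tflops(system_pflops: float, gpu_count: int) -> float:
    """Average peak per GPU in TFLOPS (assumes the peak is spread evenly over GPUs)."""
    return system_pflops * 1000.0 / gpu_count


gpus = total_gpus(NODES, GPUS_PER_NODE)
print(f"Total GPUs: {gpus}")
print(f"Implied DP peak per GPU: {per_gpu_tflops(PEAK_DP_PFLOPS, gpus):.2f} TFLOPS")
print(f"Implied AI peak per GPU: {per_gpu_tflops(PEAK_AI_PFLOPS, gpus):.1f} TFLOPS")
```

The GPU count of 336 follows directly from 42 nodes with 8 GPUs each; the per-GPU values are only averages implied by the quoted system totals.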

PARAM Siddhi-AI Details

    System Specifications
    NVIDIA DGX-A100 compute nodes                                   82 (20,992 logical CPU cores: 82 nodes * 256 per node)
    Total host (compute node) memory                                82 TB (82 nodes * 1 TB per node)
    NVIDIA A100-40GB Tensor Core GPUs                               656 (82 nodes * 8 GPUs per node)
    Total GPU memory                                                26.24 TB (82 nodes * 8 GPUs per node * 40 GB per GPU)
    Mellanox 200G HDR InfiniBand switches (compute communication)   800 ports (20 leaf switches * 40 ports per leaf), 320 Tb/s aggregate switch throughput
    Mellanox 200G HDR InfiniBand switches (storage delivery)        400 ports (10 switches * 40 ports per switch)
    PFS-based storage (network attached), 250 GB/s, 4M IOPS         10.5 PiB (2-tier storage)
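The totals in the system specification table follow from the per-node and per-switch figures. The sketch below recomputes them as a consistency check. The constant and function names are hypothetical, and the aggregate switch throughput assumes that every 200 Gb/s port is counted in both directions, which is an assumption made here to match the quoted 320 Tb/s.

```python
# Per-unit figures taken from the specification tables.
NODES = 82
GPUS_PER_NODE = 8
GPU_MEMORY_GB = 40            # per GPU
HOST_MEMORY_TB_PER_NODE = 1
LOGICAL_CORES_PER_NODE = 256  # 2 sockets * 64 cores * 2 threads
COMPUTE_LEAF_SWITCHES = 20
STORAGE_SWITCHES = 10
PORTS_PER_SWITCH = 40
PORT_SPEED_GBPS = 200         # HDR InfiniBand


def system_totals() -> dict:
    """Derive system-wide totals from the per-unit figures above."""
    gpus = NODES * GPUS_PER_NODE
    compute_ports = COMPUTE_LEAF_SWITCHES * PORTS_PER_SWITCH
    return {
        "logical_cpu_cores": NODES * LOGICAL_CORES_PER_NODE,
        "host_memory_tb": NODES * HOST_MEMORY_TB_PER_NODE,
        "gpus": gpus,
        "gpu_memory_tb": gpus * GPU_MEMORY_GB / 1000,
        "compute_ports": compute_ports,
        "storage_ports": STORAGE_SWITCHES * PORTS_PER_SWITCH,
        # Assumption: both directions of each port are counted.
        "compute_switch_tbps": compute_ports * PORT_SPEED_GBPS * 2 / 1000,
    }


totals = system_totals()
for name, value in totals.items():
    print(f"{name}: {value}")
```

Each derived value matches the corresponding entry in the table, including 26.24 TB of GPU memory when 40 GB is taken per GPU.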
    AIRAWAT-PSAI Compute Node Specification
    Component             Specification
    CPU                   AMD EPYC 7742, 64 cores, 2.25 GHz
    CPU cores             128 cores (dual socket, 64 cores each); 256 logical cores with hyper-threading
    L3 cache              256 MB
    System memory (RAM)   1 TB
    GPU                   NVIDIA A100 - SXM4
    GPU memory            40 GB
    Local storage         14 TB
    GPUs per node         8
    Networking            Mellanox ConnectX-6 VPI (InfiniBand HDR), 1.6 Tb/s
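A few per-node quantities can be derived from the compute node specification. The sketch below is a minimal illustration: the `NodeSpec` class and its field names are hypothetical, and the number of HDR links is only what the quoted 1.6 Tb/s implies at 200 Gb/s per link, not a figure stated in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures from the compute node specification table."""
    sockets: int = 2
    cores_per_socket: int = 64
    threads_per_core: int = 2      # with hyper-threading enabled
    gpus: int = 8
    gpu_memory_gb: int = 40        # per GPU
    network_tbps: float = 1.6      # quoted node networking bandwidth
    hdr_link_gbps: int = 200       # HDR InfiniBand line rate

    @property
    def physical_cores(self) -> int:
        return self.sockets * self.cores_per_socket

    @property
    def logical_cores(self) -> int:
        return self.physical_cores * self.threads_per_core

    @property
    def total_gpu_memory_gb(self) -> int:
        return self.gpus * self.gpu_memory_gb

    @property
    def implied_hdr_links(self) -> int:
        # Assumption: the quoted bandwidth is the sum of equal HDR links.
        return round(self.network_tbps * 1000 / self.hdr_link_gbps)


node = NodeSpec()
print(node.physical_cores, node.logical_cores)
print(node.total_gpu_memory_gb, node.implied_hdr_links)
```

Under these assumptions a node has 128 physical cores, 256 logical cores and 320 GB of GPU memory, and its quoted networking bandwidth corresponds to eight 200 Gb/s links.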
  • Architecture Diagram:
  • Software Stack:

Support

For any support, contact: airawat-outreach@cdac.in

PARAM Siddhi-AI Usage Report