I will set up a linux hpc cluster with openhpc, slurm, and infiniband
About this Gig
Setting up an HPC cluster correctly from the start saves months of debugging and prevents the configuration debt that causes 80% of cluster performance problems later.
I've commissioned HPC clusters from 4 nodes to 600 nodes under India's National Supercomputing Mission from bare metal to HPL acceptance.
Full stack I work with:
Provisioning: Warewulf 4, xCAT, PXE
OS: Rocky Linux 8/9, AlmaLinux, CentOS Stream
Scheduler: Slurm with full accounting and cgroup
MPI: OpenMPI, IntelMPI, MVAPICH2
Fabric: InfiniBand HDR/NDR/EDR, Ethernet RDMA
Storage: Lustre, BeeGFS, GPFS, NFS
Monitoring: Grafana, Prometheus, Ganglia
What you receive:
Fully provisioned compute nodes
Working Slurm queue with test jobs verified
InfiniBand fabric validated with ibdiagnet
MPI hello world and bandwidth test passing
Complete configuration documentation
Handover call to walk you through the system
Before ordering: message me your node count, hardware specs, and what workloads you plan to run. I'll confirm feasibility and timeline before you pay.
Server:
Other
Operating system:
Linux
