Looks Like This Service Is On Hold

I will provide AIOps and SRE consulting for DevOps and cloud reliability

United States

I speak English

GPU Infrastructure LLMOps Engineer NVIDIA Kubernetes Neo Cloud

I build scalable NVIDIA GPU infrastructure for AI training and inference. I specialize in Kubernetes GPU clusters, LLM training/inference, and GPU observability. Services: • GPU cluster setup • Kube...

About this Gig

Are you shipping LLM products but struggling with GPU infrastructure, scaling, and reliability? I help teams build production-grade GPU platforms end-to-end.

What you get:
• Neo cloud GPU setup and cluster hardening
• Kubernetes GPU scheduling and autoscaling for LLM training and inference (vLLM/Ollama/Triton)
• MLOps/LLMOps CI/CD for models and data pipelines
• GPU monitoring and alerts using NVIDIA DCGM + Prometheus + Grafana
• Cost optimization, capacity planning, and observability best practices

Depending on the package tier, deliverables can include an architecture review, a deployment plan, and hands-on implementation.

Tools: Docker • GitLab • Jenkins • GitHub • CircleCI

Frameworks: Terraform • Ansible

Cloud Provider: Amazon Web Services • Microsoft Azure • +1 more

Programming language: Bash • Python • Golang

Expertise: Installation • Migration • Configuration
