I will deploy a private self-hosted AI model on your server
About this Gig
Everyone wants to run AI. Most people get stuck on the infrastructure.
I'm a DevOps Engineer based in Germany. I'll deploy a fully working self-hosted LLM on your server: private, fast, and under your control. No API bills, no data leaving your infrastructure, and GDPR-compliant by design.
What I work with:
- Ollama: fast local model serving
- vLLM: high-performance GPU inference for production
- Open WebUI: clean, ChatGPT-like interface for your team
- LocalAI: OpenAI-compatible API for existing apps
- Docker + GPU passthrough on VPS or bare metal
What you get:
- A running LLM, accessible via browser or API (see the example after this list)
- NGINX reverse proxy + SSL certificate
- Authentication, so only authorized users can access it
- Your choice of model: Llama, Mistral, Gemma, Phi, and more
- Documentation so you can manage it yourself afterwards
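
To make "accessible via API" concrete, here is a minimal Python sketch of querying a deployed Ollama instance through its native API. The hostname ai.example.com, the bearer token, and the model name are placeholders, not part of any fixed setup; your actual values depend on what we deploy together.

```python
import requests

# Placeholder endpoint behind the NGINX reverse proxy + SSL from the list above.
OLLAMA_URL = "https://ai.example.com/api/generate"
# Placeholder auth header; only needed if token auth is configured.
HEADERS = {"Authorization": "Bearer YOUR_TOKEN"}

payload = {
    "model": "llama3",  # any model pulled onto your server
    "prompt": "Summarize our deployment options in one sentence.",
    "stream": False,    # ask for a single JSON response instead of a stream
}

resp = requests.post(OLLAMA_URL, json=payload, headers=HEADERS, timeout=120)
resp.raise_for_status()
print(resp.json()["response"])  # the model's completion text
```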
Whether you're a developer who wants a private coding assistant, a startup that needs an internal AI tool, or a company that can't send data to OpenAI for compliance reasons, this gig has you covered. Don't hesitate to reach out!
I work in English, German, French and Arabic.
FAQ
Do I need a GPU?
Not necessarily. Smaller models (3B–7B) run fine on CPU with enough RAM. I'll advise you on the right model for your hardware before we start.
Will my data stay private?
Yes — that's the whole point. Everything runs on your server, nothing is sent to OpenAI or any third party.
Can I connect it to my existing app?
Yes. I can expose an OpenAI-compatible API endpoint so your app can switch from OpenAI to your self-hosted model with minimal code changes.
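
To show what "minimal code changes" means in practice, here is a sketch using the official openai Python client: the only real change from a stock OpenAI integration is the base URL. The URL and model name below are illustrative placeholders.

```python
from openai import OpenAI

# Point the standard OpenAI client at your self-hosted, OpenAI-compatible
# endpoint (placeholder URL). The API key is a dummy value, since the
# self-hosted server handles its own authentication.
client = OpenAI(
    base_url="https://ai.example.com/v1",
    api_key="not-needed-locally",
)

reply = client.chat.completions.create(
    model="mistral",  # whichever model runs on your server
    messages=[{"role": "user", "content": "Hello from my own server!"}],
)
print(reply.choices[0].message.content)
```

Everything else in your code, including prompts, message formats, and response handling, stays the same.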
What server do I need?
Message me with your current setup and I'll tell you exactly what's needed. A basic €10/month VPS works for smaller models.
