<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[smartduck]]></title><description><![CDATA[smartduck]]></description><link>https://blogs.smartduck.cloud</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1593680282896/kNC7E8IR4.png</url><title>smartduck</title><link>https://blogs.smartduck.cloud</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 01 May 2026 10:18:26 GMT</lastBuildDate><atom:link href="https://blogs.smartduck.cloud/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[THE POST -  own private AI server.]]></title><description><![CDATA[I'm 19 and I just built my own private AI server.
Total cost: £3,000. Replaces £27,000/year of cloud AI. Here's the maths that shocked me into doing it 👇
I've been planning an experimental project th]]></description><link>https://blogs.smartduck.cloud/the-post-own-private-ai-server</link><guid isPermaLink="true">https://blogs.smartduck.cloud/the-post-own-private-ai-server</guid><dc:creator><![CDATA[x12HST]]></dc:creator><pubDate>Wed, 29 Apr 2026 10:04:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/69f1d12fad6e531c729c51cc/73b9ac52-5bc0-4696-b640-b276fc6b9950.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'm 19 and I just built my own private AI server.</p>
<p>Total cost: £3,000. Replaces £27,000/year of cloud AI. Here's the maths that shocked me into doing it 👇</p>
<p>I've been planning an experimental project that needs serious AI inference.</p>
<p>I priced out the cloud API costs. Then I priced them out again because I thought I'd made a mistake.</p>
<p>The numbers came back somewhere between £825 and £2,825 a month — depending on how heavily I'd be running it.</p>
<p>At full tilt, that's around £27,000 a year just to keep the lights on.</p>
<p>I don't have £27,000 a year for API calls.</p>
<p>So I built my own AI server instead.</p>
<p>Sitting in a 12U rack in my living room is a Minisforum MS-S1 Max — a mini PC the size of a hardback book. Inside: an AMD Ryzen AI Max+ 395, 128GB of unified memory, and an integrated Radeon GPU that can run a 70-billion-parameter language model entirely in RAM.</p>
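<p>The "70B model in 128GB of RAM" claim checks out with rough arithmetic. A minimal sketch, assuming 4-bit quantization (Ollama's usual default for large models — the post doesn't state the quantization level):</p>

```python
# Rough memory estimate for running a 70B-parameter model in 128 GB of
# unified RAM. The 4-bit quantization figure is an assumption, not
# stated in the post.
params = 70e9                       # 70 billion parameters
bytes_per_param_q4 = 0.5            # 4-bit weights ~= 0.5 bytes/parameter
weights_gb = params * bytes_per_param_q4 / 1e9
overhead_gb = 8                     # KV cache + runtime, a rough allowance

print(f"~{weights_gb:.0f} GB weights + ~{overhead_gb} GB overhead, "
      f"well under 128 GB")
```

At full 16-bit precision the same model would need ~140 GB for weights alone, which is why quantization is what makes this hardware class viable.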
<p>That's roughly the same model class as GPT-4-turbo. On my desk.</p>
<p>The full stack:<br />↳ Ubuntu Server 24.04 LTS<br />↳ Docker + Portainer<br />↳ Ollama (the AI inference engine)<br />↳ Open WebUI (a self-hosted ChatGPT clone)<br />↳ Cloudflare Tunnel (remote access without opening router ports)<br />↳ Llama 3.3 70B + Qwen3 30B + Qwen2.5 Coder 32B</p>
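<p>The stack above could be wired together with something like this Docker Compose sketch. The image tags, ports, and volume names are assumptions based on the two projects' published defaults, not the author's exact configuration:</p>

```yaml
# Hypothetical sketch of the Ollama + Open WebUI stack; not the
# author's actual compose file.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama     # persist downloaded models
    ports:
      - "11434:11434"            # Ollama's default API port
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"              # web UI on localhost:3000
    depends_on:
      - ollama
volumes:
  ollama:
```

Models are then pulled inside the Ollama container (e.g. `docker exec -it ollama ollama pull llama3.3:70b`), and the Cloudflare Tunnel sits in front of the web UI for remote access.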
<p>Running cost: about £40 a month, mostly electricity.</p>
<p>The maths comparison:</p>
<p>Cloud option, at the scale I had in mind:<br />↳ ~£900-£2,900/month<br />↳ ~£22,800/year at the midpoint</p>
<p>Self-hosted option, same workload:<br />↳ ~£40/month<br />↳ Hardware paid back in month 3</p>
<p>This isn't a tradeoff. It's mathematics.</p>
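<p>The payback claim can be sanity-checked in a few lines, using only the figures quoted above (the midpoint is computed from the post's £900-£2,900 range):</p>

```python
# Back-of-envelope payback check using the post's own figures (GBP).
hardware_cost = 3_000               # one-off build cost
self_hosted_monthly = 40            # running cost, mostly electricity
cloud_low, cloud_mid = 900, 1_900   # monthly cloud estimate: low end, midpoint

for label, cloud_monthly in [("low end", cloud_low), ("midpoint", cloud_mid)]:
    saving = cloud_monthly - self_hosted_monthly
    months = hardware_cost / saving
    print(f"{label}: saves £{saving:,}/month, pays back in {months:.1f} months")
```

Even at the low end of the cloud estimate the hardware pays for itself within four months; at the midpoint, within two.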
<p>After a month of daily use, here's what surprised me:</p>
<p>Llama 3.3 70B feels indistinguishable from GPT-4 for ~90% of tasks. Time to first token is 2-4 seconds — competitive with ChatGPT on a good day, faster on a bad one.</p>
<p>The models don't go offline because OpenAI is having an outage. They don't change behaviour because someone fine-tuned them differently this week. They don't send my data anywhere.</p>
<p>That permanence is something cloud AI can't sell you.</p>
<p>I'm writing a 10-part series on blogs.smartduck.cloud documenting the entire build — the driver nightmare that ate three days of my life, the security hardening, the Cloudflare setup, the load balancing across two servers, and the full cost breakdown.</p>
<p>Part 1 is live now. New article every week.</p>
<p>If you're building anything AI-heavy and the cloud bill is starting to scare you, this might be useful.</p>
<p>Link in the comments 👇</p>
<hr />
<p>What's the most you've ever paid in OpenAI API fees in a single month? Curious what others are seeing at scale.</p>
<p>#AI #SelfHosted #Ollama #HomeLab #LLM</p>
<hr />
]]></content:encoded></item></channel></rss>