<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[smartduck]]></title><description><![CDATA[smartduck]]></description><link>https://blogs.smartduck.cloud</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1593680282896/kNC7E8IR4.png</url><title>smartduck</title><link>https://blogs.smartduck.cloud</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 01 May 2026 10:18:26 GMT</lastBuildDate><atom:link href="https://blogs.smartduck.cloud/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[THE POST -  own private AI server.]]></title><description><![CDATA[I'm 19 and I just built my own private AI server.
Total cost: £3,000. Replaces £27,000/year of cloud AI. Here's the maths that shocked me into doing it 👇
I've been planning an experimental project th]]></description><link>https://blogs.smartduck.cloud/the-post-own-private-ai-server</link><guid isPermaLink="true">https://blogs.smartduck.cloud/the-post-own-private-ai-server</guid><dc:creator><![CDATA[x12HST]]></dc:creator><pubDate>Wed, 29 Apr 2026 10:04:21 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/69f1d12fad6e531c729c51cc/73b9ac52-5bc0-4696-b640-b276fc6b9950.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I'm 19 and I just built my own private AI server.</p>
<p>Total cost: £3,000. Replaces £27,000/year of cloud AI. Here's the maths that shocked me into doing it 👇</p>
<p>I've been planning an experimental project that needs serious AI inference.</p>
<p>I priced out the cloud API costs. Then I priced them out again because I thought I'd made a mistake.</p>
<p>The numbers came back somewhere between £825 and £2,825 a month — depending on how heavily I'd be running it.</p>
<p>At full tilt, that's around £27,000 a year just to keep the lights on.</p>
<p>I don't have £27,000 a year for API calls.</p>
<p>So I built my own AI server instead.</p>
<p>Sitting in a 12U rack in my living room is a Minisforum MS-S1 Max — a mini PC the size of a hardback book. Inside: an AMD Ryzen AI Max+ 395, 128GB of unified memory, and an integrated Radeon GPU that can run a 70-billion-parameter language model entirely in RAM.</p>
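<p>The "70B model in 128GB of RAM" claim checks out with rough arithmetic. A minimal sketch, assuming 4-bit quantization (Ollama's usual default for large models — the post doesn't state the quantization level):</p>

```python
# Rough memory estimate for running a 70B-parameter model in 128 GB of
# unified RAM. The 4-bit quantization figure is an assumption, not
# stated in the post.
params = 70e9                       # 70 billion parameters
bytes_per_param_q4 = 0.5            # 4-bit weights ~= 0.5 bytes/parameter
weights_gb = params * bytes_per_param_q4 / 1e9
overhead_gb = 8                     # KV cache + runtime, a rough allowance

print(f"~{weights_gb:.0f} GB weights + ~{overhead_gb} GB overhead, "
      f"well under 128 GB")
```

At full 16-bit precision the same model would need ~140 GB for weights alone, which is why quantization is what makes this hardware class viable.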
<p>That's roughly the same model class as GPT-4-turbo. On my desk.</p>
<p>The full stack:<br />↳ Ubuntu Server 24.04 LTS<br />↳ Docker + Portainer<br />↳ Ollama (the AI inference engine)<br />↳ Open WebUI (a self-hosted ChatGPT clone)<br />↳ Cloudflare Tunnel (remote access without opening router ports)<br />↳ Llama 3.3 70B + Qwen3 30B + Qwen2.5 Coder 32B</p>
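<p>The stack above could be wired together with something like this Docker Compose sketch. The image tags, ports, and volume names are assumptions based on the two projects' published defaults, not the author's exact configuration:</p>

```yaml
# Hypothetical sketch of the Ollama + Open WebUI stack; not the
# author's actual compose file.
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama     # persist downloaded models
    ports:
      - "11434:11434"            # Ollama's default API port
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    ports:
      - "3000:8080"              # web UI on localhost:3000
    depends_on:
      - ollama
volumes:
  ollama:
```

Models are then pulled inside the Ollama container (e.g. `docker exec -it ollama ollama pull llama3.3:70b`), and the Cloudflare Tunnel sits in front of the web UI for remote access.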
<p>Running cost: about £40 a month, mostly electricity.</p>
<p>The maths comparison:</p>
<p>Cloud option, at the scale I had in mind:<br />↳ ~£900-£2,900/month<br />↳ ~£22,800/year at the midpoint</p>
<p>Self-hosted option, same workload:<br />↳ ~£40/month<br />↳ Hardware paid back in month 3</p>
<p>This isn't a tradeoff. It's mathematics.</p>
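<p>The payback claim can be sanity-checked in a few lines, using only the figures quoted above (the midpoint is computed from the post's £900-£2,900 range):</p>

```python
# Back-of-envelope payback check using the post's own figures (GBP).
hardware_cost = 3_000               # one-off build cost
self_hosted_monthly = 40            # running cost, mostly electricity
cloud_low, cloud_mid = 900, 1_900   # monthly cloud estimate: low end, midpoint

for label, cloud_monthly in [("low end", cloud_low), ("midpoint", cloud_mid)]:
    saving = cloud_monthly - self_hosted_monthly
    months = hardware_cost / saving
    print(f"{label}: saves £{saving:,}/month, pays back in {months:.1f} months")
```

Even at the low end of the cloud estimate the hardware pays for itself within four months; at the midpoint, within two.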
<p>After a month of daily use, here's what surprised me:</p>
<p>Llama 3.3 70B feels indistinguishable from GPT-4 for ~90% of tasks. Time to first token is 2-4 seconds — competitive with ChatGPT on a good day, faster on a bad one.</p>
<p>The models don't go offline because OpenAI is having an outage. They don't change behaviour because someone fine-tuned them differently this week. They don't send my data anywhere.</p>
<p>That permanence is something cloud AI can't sell you.</p>
<p>I'm writing a 10-part series on blogs.smartduck.cloud documenting the entire build — the driver nightmare that ate three days of my life, the security hardening, the Cloudflare setup, the load balancing across two servers, and the full cost breakdown.</p>
<p>Part 1 is live now. New article every week.</p>
<p>If you're building anything AI-heavy and the cloud bill is starting to scare you, this might be useful.</p>
<p>Link in the comments 👇</p>
<hr />
<p>What's the most you've ever paid in OpenAI API fees in a single month? Curious what others are seeing at scale.</p>
<p>#AI #SelfHosted #Ollama #HomeLab #LLM</p>
<hr />
]]></content:encoded></item></channel></rss>