Podman pods (or quadlets) managed by ansible.
vegetaaaaaaa
True.
But by default the unattended-upgrades timer has a randomized trigger time (so that not all Debian machines in the world start hammering the mirrors at the same time). If you enable the auto reboot option in unattended-upgrades, your boxes will reboot at an unpredictable time. I prefer doing this at known times (middle of the night when I know nothing important is running/number of users is low).
This is a kernel bug, unattended-upgrades will take care of installing the new kernel once the fix is published, but you still have to reboot to load it. I've set up a cron job that runs needrestart nightly and reboots my servers if there is a pending kernel upgrade [1]
Damn their website has become a mess. Anyway
This is fine as long as upstream supports a convenient way to get the latest versions of software for which you actually need latest (APT repositories)
Stable base, only explicitly allow selected unstable/bleeding edge components.
This is what I do for ROCm and a few other things which need to be constantly updated (yt-dlp). Sometimes stable-backports repositories are enough, but not always.
I suggest using llama.cpp instead of ollama, you can easily squeeze +10% in inference speed and other memory optimizations from llama.cpp. With hardware prices nowadays I think every % saved on resources matters. Here is a simple ansible role to setup llama.cpp, it should give you a good idea of how to deploy it.
A dedicated inference rig is not gonna be cheap. What I did, since I need a gaming rig; is getting 32GB DDR5 (this was before the current RAMpocalypse, if I had known I would have bought 64) and an AMD 9070 (16GB VRAM - again if I had known how crazy prices would get I'd probably ahve bought a 24GB VRAM card). The home server runs the usual/non-AI stuff, and llamacpp runs on the gaming desktop (the home server just has a proxy to it). Yeah the gaming desktop has to be powered up when I want to run inference, this is my main desktop so it's powered on most of the time, no big deal
There is https://www.infomaniak.com/en/euria (Switzerland)
And https://mammouth.ai/ (France), though they're more a "middleman" for various providers (including providers serving open-weights models)
And of course you can still run models locally with LLM hosts like https://github.com/ggml-org/llama.cpp (there are hundreds of derivatives, but llama.cpp is the OG/underlying library for most of them). A decent gaming PC can now run local LLMs on par with SOTA proprietary models from 6-12 months ago (qwen3.6 is a beast). https://old.reddit.com/r/LocalLLaMA/ is a decent subreddit for news and discussions about this, I didn't find a real equivalent on lemmy.
Most applications/services offer mail as notification channel. Even old school unix utilities such as cron support sending mail (through the system MTA). I use msmtp. Then configure K-9 mail or any decent mail client on your phone, setup filters so that mail from your services ends up in a high priority folder in your mailbox with notifications enabled.
I want to be able to receive notifications both on mobile and desktop, this is the only reasonable option I found and have been running with it for > 10 years.
- use APT repositories when possible -> then
unattended-upgrades - For OCI images that do not provide tagged releases (looking at you searxng...), podman auto-update
- for everything else, subscribe to releases RSS feed, read release notes when they come out, check for breaking changes and possibly interesting stuff, update version in ansible playbook, deploy ansible playbook
It can protect APIs as much as any other URL. Or more simply you could disallow any unauthenticated API access in gitea or at the reverse proxy level?
cannot protect against bot traffic coming from many different residential proxies
It can block anything that doesn't pass the proof-of-work/JS challenge. Most bots don't interpret JS.
That's only if you use the default
Buildagent with the built-in prompt (https://github.com/anomalyco/opencode/blob/dev/packages/opencode/src/session/prompt/default.txt), and yes it is quite large.It's trivial to create custom agents in
opencode.jsonwith custom prompts, tools, whatever..For example I have created a
Personalagent which handles menial stuff such as searching/editing my notes, appointments, tasks... with a restricted set of tools and skills.The single most important change I made is only allowing the
localprovider in the config, which disables all cloud providers. IMHO this should be the default but I'm not complaining. It's the best open-source harness I've tried so far. I want to try pi.dev someday (quite minimal, needs a good amount of setup and tuning).I also argue that some local models actually behave much better with a semi-large system prompt (qwen 3.6 for example tends to lose itself in reasoning if you only use the default
You are a helpful assistantsystem prompt and a basicSay hiuser prompt - opencode-like large system prompts fixes this; even if you lose some time for initial prompt loading)