first_last_whatever

joined 9 months ago

Update: gemma4:31b changed the game, at least for the moment and for my use-cases. I'm just running it locally.

[–] first_last_whatever@feddit.org 10 points 2 weeks ago (1 children)

Sadly, Mistral's models are somewhat behind at the moment, judging from the benchmarks I've been looking at. But yeah, that would probably be the most European solution.

Microsoft hosting in EU datacenters doesn't help, since Microsoft is a US corporation. Therefore it is required by law (the CLOUD Act) to hand over data to the US government, even from EU servers. As far as I know, they don't even claim otherwise.

For now I still avoid setting it up myself, because self-hosting is just much more expensive unless you run very high loads. For occasional or low-volume use, the per-token price of API services is simply better.
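To make the "API is cheaper at low volume" point concrete, here is a back-of-the-envelope break-even calculation. Every number below (hardware price, lifetime, power draw, electricity price, API rate) is an illustrative assumption I made up for the sketch, not a real quote from any provider:

```python
# Break-even sketch: self-hosted GPU box vs. per-token API pricing.
# All numbers are illustrative assumptions, not real quotes.

HW_PRICE_EUR = 7_000          # assumed one-off cost of a local GPU machine
LIFETIME_MONTHS = 36          # assumed amortization period
POWER_KW = 0.5                # assumed average draw under load
KWH_PRICE_EUR = 0.30          # assumed EU electricity price
HOURS_PER_MONTH = 730

API_PRICE_PER_MTOK_EUR = 2.0  # assumed blended price per 1M tokens at an API

# Monthly cost of self-hosting: amortized hardware + electricity
hw_monthly = (HW_PRICE_EUR / LIFETIME_MONTHS
              + POWER_KW * HOURS_PER_MONTH * KWH_PRICE_EUR)

# Tokens per month at which self-hosting becomes cheaper than the API
breakeven_mtok = hw_monthly / API_PRICE_PER_MTOK_EUR

print(f"self-hosting: ~{hw_monthly:.0f} EUR/month")
print(f"break-even:  ~{breakeven_mtok:.0f}M tokens/month")
```

With these made-up numbers the box costs roughly 300 EUR/month, so you would need on the order of 150M tokens every month before self-hosting wins; below that, the API is cheaper.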

 

I really want to use tools like Claude Code or self-written agents (LangGraph) with top-notch models, without relying on US clouds or buying 5-10k€ of hardware.
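For what it's worth, most hosted inference providers (including the EU ones below) expose an OpenAI-compatible HTTP API, so agent frameworks usually just need a different base URL and API key. A minimal sketch of such a request, where the endpoint URL and model id are placeholders I invented, not real provider values:

```python
import json
import urllib.request

# Placeholders -- substitute your provider's real endpoint, key and model id.
BASE_URL = "https://example-eu-provider.invalid/v1"
API_KEY = "YOUR_KEY"
MODEL_ID = "some-model-id"

payload = {
    "model": MODEL_ID,
    "messages": [{"role": "user", "content": "Hello from the EU!"}],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
)

# urllib.request.urlopen(req) would actually send it; omitted here
# because the endpoint above is only a placeholder.
```

The point is that switching providers is mostly a configuration change, not a rewrite, so the real blocker is which models the EU endpoints actually serve.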

I had a look at Scaleway, IONOS and OVHcloud, but they all only offer the same outdated medium-sized models like gpt-oss-120b, llama-3.3:70b and Qwen3:32b. Scaleway at least recently added Qwen3.5:397b.

The best I could find so far seems to be Nebius. The problem: they do offer MiniMax-M2.5, GLM-5 and Qwen3.5, but again only on US servers. Only the older versions like MiniMax-M2.1 or GLM-4.7 are EU-hosted.

Is there just no infrastructure available in the EU to host such models at scale? I really hope that's not the reason.

What do you use? Please help!

Edit: I just now found that Nebius requires a Google, GitHub, or Microsoft account to sign up. Sad.