this post was submitted on 08 Sep 2025
65 points (97.1% liked)

Selfhosted


Ugh, apparently a bot visited my Forgejo instance yesterday and queried everything, which caused Forgejo to create repo archives for everything. The Git data on the instance is 2.1 GB, but the repo archive directory grew to 120 GB. I really didn't expect such a spike.

It filled up the whole hard drive, so the server and all the services and websites on it went down while I was sleeping.

Luckily, it seems that just deleting the repo archive directory fixes the problem temporarily. I also disabled downloading archives from the UI, but I'm not sure whether this will prevent bots from generating those archives again. I also can't just make the directory read-only, because Forgejo uses it for other things too, like mirroring.
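As a stopgap for the manual deletion described above, a scheduled job could prune old archive files before they fill the disk. This is only a minimal sketch: the function name `prune_old_files`, the one-hour threshold, and the example path are all hypothetical, and it assumes that generated archives are plain files that are safe to delete once they are older than the threshold. Check what else lives in the directory before pointing it at a real instance.

```python
import time
from pathlib import Path
from typing import Optional


def prune_old_files(root: Path, max_age_seconds: float,
                    now: Optional[float] = None) -> int:
    """Delete regular files under root that are older than max_age_seconds.

    Returns the number of bytes freed. Directories and symlinks are left alone.
    """
    now = time.time() if now is None else now
    freed = 0
    # Materialise the listing first so deletions do not disturb the walk.
    for path in list(root.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        stat = path.stat()
        if now - stat.st_mtime > max_age_seconds:
            path.unlink()
            freed += stat.st_size
    return freed


if __name__ == "__main__":
    # Hypothetical location; adjust to wherever your instance stores archives.
    archive_dir = Path("/var/lib/forgejo/data/repo-archive")
    if archive_dir.is_dir():
        print(prune_old_files(archive_dir, max_age_seconds=3600), "bytes freed")
```

Run from cron or a systemd timer, this keeps the directory bounded by how much a bot can generate within the threshold, which limits the damage but does not stop the archives from being created in the first place.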

For small instances like mine those archives are quite a headache.

jeena@piefed.jeena.net 4 points 9 months ago (1 children)

But then how do people who search for code like yours find your open source code, if not through a search engine that uses an indexing bot?

SteveTech@programming.dev 1 points 9 months ago* (last edited 9 months ago)

Cloudflare usually blocks 'unknown' bots, which are basically bots that aren't search crawlers. Also, I've got Cloudflare set up to challenge requests for .zip, .tar.gz, or .bundle files, so it doesn't affect anyone unless they download from their browser.
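To make the matching logic concrete, here is a rough sketch in Python of the decision such a rule makes. It is not Cloudflare's rule syntax, just an illustration; the name `should_challenge` and the example URLs are hypothetical.

```python
from urllib.parse import urlsplit

# File suffixes named in the comment above.
CHALLENGED_SUFFIXES = (".zip", ".tar.gz", ".bundle")


def should_challenge(url: str) -> bool:
    """Return True if the request path ends in one of the archive suffixes."""
    # Look only at the path, so query strings do not hide the suffix.
    path = urlsplit(url).path.lower()
    return path.endswith(CHALLENGED_SUFFIXES)


if __name__ == "__main__":
    for url in (
        "/someuser/somerepo/archive/main.tar.gz",
        "/someuser/somerepo/archive/main.zip?ref=abc",
        "/someuser/somerepo.git/info/refs?service=git-upload-pack",
        "/someuser/somerepo",
    ):
        print(url, should_challenge(url))
```

A `git clone` over HTTP requests paths such as `/info/refs` and `/git-upload-pack` rather than archive files, which is why a rule like this only gets in the way of archive downloads.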

There's probably also a way to configure something similar in Anubis, if you don't like a middleman snooping on your requests.