this post was submitted on 22 Jun 2026
25 points (96.3% liked)

Selfhosted


I am one of a network of academic researchers from around the world working on collecting media market data. One problem is that referenced sources often disappear, which makes later validation difficult or impossible. So, I thought I would recommend self-hosting something like archive.org that would allow affiliated researchers to submit their web references and have their sources efficiently archived in a central project repository. That would allow validation and continuity when web-hosted text and files disappear or researchers leave.

I have been looking at ArchiveBox. If you have experience of this or a similar solution, would that fit the bill? The important thing is efficiency for researchers submitting/retrieving pages and files, and openness in structure and formats so that the archive would remain useful if ArchiveBox or similar disappears. FOSS of course means you can't be locked out anyway.

[–] Stopwatch1986@lemmy.ml 2 points 4 days ago (1 children)

A wiki is a good idea. Putting a SingleFile or similar all-in-one file in a repository and providing index numbers organised as a look-up table would also work for easy retrieval by any research user. Both require some admin and more effort from the researchers.
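As a rough illustration of the look-up table idea, here is a minimal Python sketch using only the standard library. The CSV layout (`id`, `url`, `file`, `sha256`) and the function names `add_to_index` and `lookup` are assumptions made for this example, not part of SingleFile or any existing tool; the snapshot files themselves are assumed to have been saved into the repository already.

```python
import csv
import hashlib
from pathlib import Path

# Hypothetical column layout for the look-up table.
FIELDS = ["id", "url", "file", "sha256"]


def load_index(index_path):
    """Read the CSV look-up table; return an empty list if it does not exist yet."""
    path = Path(index_path)
    if not path.exists():
        return []
    with path.open(newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def add_to_index(index_path, url, snapshot_file):
    """Register a saved snapshot and return its index number.

    If the URL is already listed, the existing number is returned unchanged.
    """
    rows = load_index(index_path)
    for row in rows:
        if row["url"] == url:
            return int(row["id"])
    # A checksum lets later users verify that the file has not changed.
    digest = hashlib.sha256(Path(snapshot_file).read_bytes()).hexdigest()
    new_id = len(rows) + 1
    rows.append(
        {"id": str(new_id), "url": url, "file": str(snapshot_file), "sha256": digest}
    )
    with Path(index_path).open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return new_id


def lookup(index_path, index_number):
    """Return the table row for an index number, or None if it is not listed."""
    for row in load_index(index_path):
        if int(row["id"]) == index_number:
            return row
    return None
```

A researcher would cite the returned index number in their notes, and anyone with access to the repository could resolve it with `lookup`. Because the table is plain CSV and the snapshots are ordinary HTML files, the archive stays readable without any particular software.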

I wish there was a hostable version of archive.is for near-zero maintenance. You just submit a URL over the internet and the web page is cached once along with a screenshot. Then, anyone can access the archived version. This can be done already with archive.is but we have no control over its future, which is critical for long-term dependable archiving.
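To make the "submit a URL, cache it once" behaviour concrete, here is a small Python sketch of just the caching step. It is not how archive.is works internally; the function names `archive_once` and `fetch_with_urllib` and the hash-based file naming are assumptions for illustration. It stores only the raw page bytes, so there is no JavaScript rendering and no screenshot.

```python
import hashlib
import urllib.request
from pathlib import Path


def fetch_with_urllib(url):
    """Download the raw bytes of a page (no JavaScript rendering, no screenshot)."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def archive_once(url, store_dir, fetch=fetch_with_urllib):
    """Store a copy of the page the first time a URL is submitted.

    Later submissions of the same URL return the existing copy without
    fetching again, so the archived version stays fixed.
    """
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    # Name the file after a hash of the URL so any URL maps to a safe filename.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = store / f"{key}.html"
    if not target.exists():
        target.write_bytes(fetch(url))
    return target
```

The `fetch` parameter is there so the download step can be swapped for something heavier, such as a headless browser that also takes a screenshot, without changing the cache-once logic.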

[–] irmadlad@lemmy.world 2 points 3 days ago (1 children)

This can be done already with archive.is but we have no control

Did a little digging this morning. I honestly can't find a selfhosted, archive.is alternative. All the solutions I came up with are either paid for and online use only, or free, but still online use only.

[–] Stopwatch1986@lemmy.ml 2 points 3 days ago (1 children)

Thanks for doing the digging. An archivist may know something more. Or the archive.is people.

[–] irmadlad@lemmy.world 2 points 3 days ago

It might be worthwhile to run your scenario by the folks at https://lemmy.world/c/datahoarder