tl;dr: Use pre-built Docker images. Building from source requires BuildKit to be enabled in the Docker daemon, which needs sudo on Linux. The image-pull approach takes ~10 minutes instead of ~45.
Why This Guide?
The official Firecrawl self-hosting docs assume you're comfortable with BuildKit builds and a lengthy docker compose build. On a fresh Manjaro install with only Portainer running, that approach hits two walls:
- BuildKit must be enabled in /etc/docker/daemon.json, which requires sudo
- Building from source takes 30-45 minutes on first run
This guide uses pre-built images from GHCR and gets you scraping in ~10 minutes with zero sudo beyond initial Docker setup.
Prerequisites
- Manjaro Linux (Arch-based)
- Docker installed and running
- Docker Compose installed (sudo pacman -S docker-compose)
- Your user added to the docker group (no sudo for docker commands)
- BuildKit is not required for this guide; it only matters if you build from source (see the note under Troubleshooting)
- Portainer (optional, but nice to have)
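The checklist above can be verified before starting. This is a minimal sketch, not an official check: the function names (`have_cmd`, `in_group`, `preflight`) are hypothetical helpers, and it assumes the Compose v2 plugin, since the rest of this guide uses `docker compose`.

```shell
# Return 0 if the named command is on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Return 0 if the current user belongs to the named group.
in_group() {
  id -nG | tr ' ' '\n' | grep -qx "$1"
}

# Report each missing prerequisite; return 1 if any check fails.
preflight() {
  status=0
  have_cmd docker || { echo "missing: docker"; status=1; }
  docker compose version >/dev/null 2>&1 || { echo "missing: docker compose"; status=1; }
  in_group docker || { echo "user is not in the docker group"; status=1; }
  docker info >/dev/null 2>&1 || { echo "cannot reach the Docker daemon"; status=1; }
  return "$status"
}
```

Paste the functions into a shell and run `preflight`. No output means every check passed.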
Step 1 — Clone the Repo
cd ~
git clone https://github.com/mendableai/firecrawl.git
cd firecrawl
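You only need the repo for its docker-compose.yaml. After setup you can delete the rest if you want, but keep docker-compose.yaml and the .env file from Step 2 in ~/firecrawl, since later docker compose commands are run from that directory.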
Step 2 — Create the .env File
cat > .env << 'EOF'
PORT=3002
HOST=0.0.0.0
USE_DB_AUTHENTICATION=false
BULL_AUTH_KEY=CHANGEME
EOF
Set BULL_AUTH_KEY to something secure — it guards the Bull queue admin panel.
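One way to replace the CHANGEME placeholder is to generate a random key and write it into .env. This is a sketch under assumptions: `gen_key` and `set_env_value` are hypothetical helper names, and a 48-character hex string is an arbitrary choice, not a Firecrawl requirement.

```shell
# Print a 48-character hex string read from the kernel's random source.
gen_key() {
  head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Replace the value of KEY in an env file, leaving other lines untouched.
# $1 = env file, $2 = key name, $3 = new value
set_env_value() {
  file=$1 key=$2 value=$3
  sed "s|^${key}=.*|${key}=${value}|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, `set_env_value .env BULL_AUTH_KEY "$(gen_key)"` rewrites only the BULL_AUTH_KEY line. The generated key then takes the place of CHANGEME in the queue dashboard URL.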
Step 3 — Switch to Pre-Built Images
The default docker-compose.yaml tries to build everything from source. We need to flip it to use pre-built images instead. Three services need switching:
# API (main firecrawl service)
sed -i 's|# image: ghcr.io/firecrawl/firecrawl|image: ghcr.io/firecrawl/firecrawl|' docker-compose.yaml
sed -i 's| build: apps/api| # build: apps/api|' docker-compose.yaml
# Playwright (browser rendering)
sed -i 's|# image: ghcr.io/firecrawl/playwright-service:latest|image: ghcr.io/firecrawl/playwright-service:latest|' docker-compose.yaml
sed -i 's| build: apps/playwright-service-ts| # build: apps/playwright-service-ts|' docker-compose.yaml
# Postgres (database)
sed -i 's|# image: ghcr.io/firecrawl/nuq-postgres:latest|image: ghcr.io/firecrawl/nuq-postgres:latest|' docker-compose.yaml
sed -i 's| build: apps/nuq-postgres| # build: apps/nuq-postgres|' docker-compose.yaml
Redis and RabbitMQ already use standard public images (redis:alpine, rabbitmq:3-management) so no changes needed there.
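The sed pairs above assume each service has a commented `# image:` line and an active `build:` line. A minimal sketch of the same switch as one function: `switch_to_image` is a hypothetical helper, not part of Firecrawl. It keeps a one-time backup and anchors the build pattern to the start of the line, so a second run does not comment the line twice.

```shell
# Uncomment a pre-built image line and comment out the matching build line.
# $1 = compose file, $2 = image reference, $3 = build path
switch_to_image() {
  file=$1 image=$2 build=$3
  # Keep the first backup only, so re-running never overwrites the original.
  [ -e "$file.bak" ] || cp "$file" "$file.bak"
  sed -e "s|# image: ${image}|image: ${image}|" \
      -e "s|^\([[:space:]]*\)build: ${build}|\1# build: ${build}|" \
      "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}
```

For example, `switch_to_image docker-compose.yaml ghcr.io/firecrawl/firecrawl apps/api` covers the API service. Restoring the original is a matter of copying docker-compose.yaml.bak back.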
Verify with:
grep -nE '^[[:space:]]*(image|build):' docker-compose.yaml
You should see image: lines for: firecrawl, playwright-service, nuq-postgres, redis, rabbitmq. No build: lines should remain.
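The same check can be made to fail loudly when a build: entry is still active. This is a sketch: `check_prebuilt` is a hypothetical name, and it assumes commented-out lines start with `#` after optional indentation.

```shell
# Print active image:/build: lines, then return 1 if any build: line is uncommented.
check_prebuilt() {
  file=$1
  grep -nE '^[[:space:]]*(image|build):' "$file" || true
  if grep -qE '^[[:space:]]*build:' "$file"; then
    echo "active build: entries remain in $file" >&2
    return 1
  fi
  return 0
}
```

Run it as `check_prebuilt docker-compose.yaml && echo "ready to pull"`.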
Step 4 — Pull the Images
docker pull ghcr.io/firecrawl/firecrawl:latest
docker pull ghcr.io/firecrawl/playwright-service:latest
docker pull ghcr.io/firecrawl/nuq-postgres:latest
Each is ~200-400MB. This takes 5-10 minutes depending on your connection.
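On a flaky connection a pull may fail partway through. A minimal sketch that retries each pull a few times: `retry`, `pull_all`, and the `RETRY_DELAY` variable are hypothetical names introduced here, and three attempts is an arbitrary choice.

```shell
# Run a command up to N times, pausing RETRY_DELAY seconds (default 5) between attempts.
# $1 = maximum attempts, remaining arguments = command to run
retry() {
  attempts=$1
  shift
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      return 1
    fi
    n=$((n + 1))
    sleep "${RETRY_DELAY:-5}"
  done
}

# Pull the three pre-built images used in this guide, stopping at the first hard failure.
pull_all() {
  for image in \
    ghcr.io/firecrawl/firecrawl:latest \
    ghcr.io/firecrawl/playwright-service:latest \
    ghcr.io/firecrawl/nuq-postgres:latest
  do
    retry 3 docker pull "$image" || return 1
  done
}
```

Calling `pull_all` performs the same three pulls as above, with retries.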
Step 5 — Start Everything
docker compose up -d
Check status:
docker ps --format "table {{.Names}}\t{{.Status}}" | grep firecrawl
All 5 containers should show "Up" — RabbitMQ will show "(healthy)" once its health check passes (~30s).
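To spot a container that exited instead of scanning the table by eye, the status output can be filtered. This is a sketch: `not_up` and `firecrawl_not_up` are hypothetical helper names, and it assumes the container names contain "firecrawl", as the grep above does.

```shell
# Read "name|status" lines on stdin and print the names whose status does not start with "Up".
not_up() {
  awk -F '|' '$2 !~ /^Up/ { print $1 }'
}

# List Firecrawl containers, including stopped ones, that are not currently running.
firecrawl_not_up() {
  docker ps -a --format '{{.Names}}|{{.Status}}' | grep firecrawl | not_up
}
```

Empty output from `firecrawl_not_up` means every matching container is running.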
Step 6 — Verify It Works
curl -X POST http://localhost:3002/v0/scrape \
  -H 'Content-Type: application/json' \
  -d '{
    "url": "https://example.com/",
    "pageOptions": {"onlyMainContent": true}
  }'
You should get back JSON with success: true, page content, markdown, and metadata. Credits used will be 1 (from the Postgres DB, even without auth configured).
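For a scripted smoke test, the response can be checked for the success flag instead of being read by eye. This is a sketch: `scrape_ok`, `check_scrape`, and the `FIRECRAWL_URL` variable are hypothetical names, and the grep is a plain text match rather than real JSON parsing.

```shell
# Read a JSON response on stdin; succeed only if it contains "success": true.
scrape_ok() {
  grep -Eq '"success"[[:space:]]*:[[:space:]]*true'
}

# Send the same scrape request as above and test the response.
check_scrape() {
  curl -sS -X POST "${FIRECRAWL_URL:-http://localhost:3002}/v0/scrape" \
    -H 'Content-Type: application/json' \
    -d '{"url": "https://example.com/", "pageOptions": {"onlyMainContent": true}}' \
    | scrape_ok
}
```

Run it as `check_scrape && echo "scrape OK"`. A non-zero exit means the request failed or the response did not report success.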
Step 7 — Wire Hermes Agent to Local Firecrawl
Open ~/.hermes/config.yaml and find the web_extract section under auxiliary:
auxiliary:
  web_extract:
    provider: auto
    model: ''
    base_url: ''
    api_key: ''
    timeout: 360
Change it to:
auxiliary:
  web_extract:
    provider: custom
    model: ''
    base_url: http://localhost:3002/
    api_key: ''
    timeout: 360
Restart Hermes. Now every time Hermes needs to scrape or extract content from the web, it will use your local Firecrawl instance instead of calling an external API.
Useful URLs Once Running
| Service | URL |
| --- | --- |
| Firecrawl API | http://localhost:3002/ |
| Bull Queue Dashboard | http://localhost:3002/admin/CHANGEME/queues |
| Portainer (if installed) | https://localhost:9443/ |
Restarting After Reboot
cd ~/firecrawl
docker compose up -d
Everything persists in Docker volumes. No data is lost.
Troubleshooting
sed: can't read errors on macOS: BSD sed on macOS requires an explicit backup suffix after -i. Use sed -i '' 's/old/new/' file on macOS; on Linux (GNU sed), use -i with no argument, as in the commands above.
"mount option requires BuildKit" during any build: This only matters if you're building from source. If you're using pre-built images (this guide), you never hit this. If you do need BuildKit for something else, add this to /etc/docker/daemon.json and restart Docker:
{ "features": { "buildkit": true } }
"Image denied" when pulling: Use ghcr.io/firecrawl/firecrawl (lowercase, org path) instead of ghcr.io/mendableai/firecrawl. The latter requires GitHub auth.
RabbitMQ not healthy: Wait 30-60 seconds. First startup runs migrations. Check logs with docker compose logs rabbitmq.
API returns 500: Check API logs with docker compose logs api — common cause is RabbitMQ not being healthy yet.
What's Working vs. Cloud
| Feature | Local | Cloud |
| --- | --- | --- |
| /v0/scrape (fetch) | Yes | Yes |
| /v0/scrape (Playwright) | Yes | Yes |
| /v1/crawl | Yes | Yes |
| /v0/extract (AI) | Yes (needs OpenAI/Ollama key) | Yes |
| Fire-engine (anti-bot) | No | Yes |
| /search | Yes (needs SearXNG) | Yes |
AI features (/extract, structured output) work locally but require an OPENAI_API_KEY or OLLAMA_BASE_URL in your .env.
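A quick way to confirm that .env names an AI backend before trying /extract is sketched below. `has_ai_backend` is a hypothetical helper, and it only checks that one of the two variables is set to a non-empty value, not that the key or URL actually works.

```shell
# Succeed if the env file sets a non-empty OPENAI_API_KEY or OLLAMA_BASE_URL.
# $1 = path to the env file
has_ai_backend() {
  grep -Eq '^(OPENAI_API_KEY|OLLAMA_BASE_URL)=.+' "$1"
}
```

Run it as `has_ai_backend .env || echo "AI features are not configured"`.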