Troubleshooting

503 No eligible workers

Symptom: Dispatching a job returns 503 with No eligible workers, or the job sits in pending indefinitely.

Cause: No worker is online that matches the job’s requirements (provider, tags, capacity).

Fix:

Check that at least one worker is online and idle:

curl -H "Authorization: Bearer mr-..." \
  https://app.modelreins.com/presence

If the list is empty, no workers are connected. Install the Companion App or see Add a Worker.

Check tag overlap. If the job requires tags: ["code"] but your worker is tagged draft,triage, the router won’t match it. Use /dispatch/explain to see the score breakdown:

curl -H "Authorization: Bearer mr-..." \
  "https://app.modelreins.com/dispatch/explain?job_tags=code&urgency=normal"

Check the worker isn’t paused or at max concurrency. In the Saddle or dashboard, look at the worker’s status in the Fleet panel.
If using a self-hosted coordinator, make sure the worker’s MODELREINS_API_URL points to the correct address and the API key is valid.
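
The tag check above comes down to a set intersection. Here is a minimal Python sketch, assuming worker tags arrive as a comma-separated string such as draft,triage and job tags as a list. shared_tags is a hypothetical helper, not part of ModelReins, and the real router may score matches differently; /dispatch/explain remains the authoritative view.

```python
def shared_tags(job_tags, worker_tags):
    """Return the tags a job and a worker have in common.

    job_tags: list of strings, e.g. ["code"].
    worker_tags: comma-separated string, e.g. "draft,triage" (assumed format).
    """
    worker_set = {t.strip() for t in worker_tags.split(",") if t.strip()}
    return sorted(set(job_tags) & worker_set)


# An empty result means the worker shares no tags with the job.
print(shared_tags(["code"], "draft,triage"))        # []
print(shared_tags(["code"], "code, draft,triage"))  # ['code']
```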

401 Unauthorized after token rotation

Symptom: Workers were running fine, then suddenly all jobs fail with 401 Unauthorized or Authentication failed.

Cause: You rotated your API key (at the provider or at the ModelReins coordinator) but not all workers picked up the new key.

Fix:

Update the key in your environment or config file:

export MODELREINS_API_KEY=new-key-here

Restart all workers. If using the Companion App, open the tray menu, go to Settings, update the API key, and click Save. The worker restarts automatically.
If using the MCP channel, update the key in your .mcp.json or VS Code settings and restart the MCP client.

Prevention: Use a shared config file or secrets manager so all workers read from the same source. The MODELREINS_CONFIG_URL env var can point workers at a remote config endpoint.
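
After a rotation, it can help to confirm that a key is accepted before restarting a whole fleet. The sketch below reuses the presence endpoint and Bearer header from the curl examples in this guide. The helper names are hypothetical, and it assumes the coordinator answers a rejected key with HTTP 401, as described in the symptom above.

```python
import urllib.error
import urllib.request


def presence_request(api_key, base_url="https://app.modelreins.com"):
    """Build (but do not send) an authenticated GET for the presence endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/presence",
        headers={"Authorization": "Bearer " + api_key},
    )


def key_status(http_status):
    """Map an HTTP status code to a short verdict about the key."""
    if http_status == 401:
        return "rejected"   # stale or mistyped key
    if 200 <= http_status < 300:
        return "accepted"
    return "unknown"        # some other failure; not a verdict on the key


def check_key(api_key, base_url="https://app.modelreins.com"):
    """Send the request and classify the answer. Network errors propagate."""
    try:
        with urllib.request.urlopen(presence_request(api_key, base_url), timeout=10) as resp:
            return key_status(resp.status)
    except urllib.error.HTTPError as err:
        return key_status(err.code)
```

Calling check_key(os.environ["MODELREINS_API_KEY"]) on each worker host (after importing os) shows whether that host still holds the old key.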

Ollama not detected

Symptom: The Companion or worker fails to connect to Ollama at http://localhost:11434.

Fix:

Check if Ollama is running:

curl http://localhost:11434/api/tags

If this fails, start the Ollama service:

# Linux (systemd); starting a system service usually requires root
sudo systemctl start ollama

# macOS — open the Ollama app, or:
ollama serve

If Ollama is on a different host or port:

export MODELREINS_OLLAMA_HOST=http://192.168.1.50:11434

If Ollama is running inside a container or VM, make sure the ModelReins worker can reach it over the network. Set MODELREINS_OLLAMA_HOST to the correct address.
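
To test reachability from the worker's own environment, the same /api/tags probe can be run in Python. This is a sketch: it assumes the worker falls back to http://localhost:11434 when MODELREINS_OLLAMA_HOST is unset, and the function names are illustrative only.

```python
import urllib.request

DEFAULT_OLLAMA_HOST = "http://localhost:11434"


def ollama_tags_url(env):
    """Resolve the /api/tags URL from an environment mapping."""
    host = env.get("MODELREINS_OLLAMA_HOST") or DEFAULT_OLLAMA_HOST
    return host.rstrip("/") + "/api/tags"


def ollama_reachable(env, timeout=5):
    """Return True if Ollama answers 200 on /api/tags, False otherwise."""
    try:
        with urllib.request.urlopen(ollama_tags_url(env), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # covers refused connections, timeouts, and DNS failures
        return False
```

Run ollama_reachable(os.environ) inside the same container or VM as the worker; a False result there, alongside a working curl on the host, points to a network path problem rather than an Ollama problem.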

LM Studio model not found

Symptom: Jobs dispatched to LM Studio fail with Model not found or No model loaded.

Fix:

Open LM Studio and check the Local Server tab.
Make sure a model is selected and loaded in the model dropdown. The server can run without a model selected, but it won’t process requests.
Verify the server is serving the expected model:

curl http://localhost:1234/v1/models

If the model name in the response doesn’t match your ModelReins config, update the config:

{
  "lmstudio": {
    "model": "TheBloke/Llama-3.2-GGUF"
  }
}

Use the exact model name from the /v1/models response.
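
The name comparison can also be scripted. The sketch below assumes the /v1/models response follows the OpenAI-style shape, with a data list whose entries carry an id, and that the ModelReins config has the lmstudio.model key shown above. The function names are hypothetical.

```python
def served_model_ids(models_response):
    """Extract model ids from an OpenAI-style /v1/models response body."""
    return [m["id"] for m in models_response.get("data", [])]


def configured_model_is_served(config, models_response):
    """True if the configured LM Studio model appears in the server's list."""
    return config["lmstudio"]["model"] in served_model_ids(models_response)


# Sample data for illustration only; substitute the parsed JSON from curl.
config = {"lmstudio": {"model": "TheBloke/Llama-3.2-GGUF"}}
response = {"object": "list", "data": [{"id": "some-other-model", "object": "model"}]}
print(configured_model_is_served(config, response))  # False
```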

Job stuck in “running” state

Symptom: A job shows status: running indefinitely. The worker processing it may have crashed or disconnected.

Fix:

Check which worker has the job:

curl -H "Authorization: Bearer mr-..." \
  https://app.modelreins.com/api/jobs/<job-id>

Check if that worker is still alive via the presence endpoint:

curl -H "Authorization: Bearer mr-..." \
  https://app.modelreins.com/presence

If the worker is no longer listed (it crashed or disconnected), retry the job to release it back to the queue:

curl -X POST -H "Authorization: Bearer mr-..." \
  https://app.modelreins.com/api/jobs/<job-id>/retry

If this happens frequently, increase the job timeout and enable automatic reaping:

{
  "jobs": {
    "timeout_seconds": 300,
    "reap_stale_after_seconds": 600
  }
}

The coordinator automatically requeues any running job that has gone longer than reap_stale_after_seconds without a heartbeat.
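
To make the reaping rule concrete, here is a small Python illustration of the condition. It is not the coordinator's implementation, and the job fields id, status, and last_heartbeat (a Unix timestamp) are assumptions made for the example.

```python
def stale_job_ids(jobs, now, reap_stale_after_seconds):
    """Return ids of running jobs whose last heartbeat is older than the limit."""
    return [
        job["id"]
        for job in jobs
        if job["status"] == "running"
        and now - job["last_heartbeat"] > reap_stale_after_seconds
    ]


jobs = [
    {"id": "a", "status": "running", "last_heartbeat": 1000},  # silent for 700 s
    {"id": "b", "status": "running", "last_heartbeat": 1500},  # silent for 200 s
    {"id": "c", "status": "pending", "last_heartbeat": 0},     # not running
]
print(stale_job_ids(jobs, now=1700, reap_stale_after_seconds=600))  # ['a']
```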

Rate limit errors (429 Too Many Requests)

Symptom: Cloud provider jobs fail with 429 Too Many Requests or Rate limit exceeded.

Fix:

Immediate: Pause the affected worker from the dashboard or Saddle, then reduce its concurrency in the Companion settings.
Short-term: Enable built-in rate limit handling. ModelReins will automatically back off and retry:

{
  "providers": {
    "claude": {
      "rate_limit": {
        "max_concurrent": 3,
        "retry_after_seconds": 10,
        "max_retries": 5
      }
    }
  }
}

Long-term: Spread load across providers using routing rules. Add OpenRouter as a fallback — it handles rate limiting across multiple upstream providers:

{
  "routing": {
    "strategy": "fallback",
    "chain": ["claude", "openrouter"]
  }
}
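
The retry and fallback settings above can be read as the following control flow. This is a sketch under stated assumptions, not how ModelReins is implemented: it treats retry_after_seconds as a fixed wait, counts max_retries as retries after the first attempt, and uses a hypothetical RateLimited exception and send callable to stand in for the provider call.

```python
import time


class RateLimited(Exception):
    """Raised by send() when a provider answers 429 (hypothetical signal)."""


def dispatch_with_fallback(chain, send, max_retries=5, retry_after_seconds=10,
                           sleep=time.sleep):
    """Try each provider in order; retry rate-limited calls before moving on.

    send(provider) performs the request and either returns a result or
    raises RateLimited. Returns (provider, result) for the first success.
    """
    for provider in chain:
        for attempt in range(max_retries + 1):  # first try plus max_retries retries
            try:
                return provider, send(provider)
            except RateLimited:
                if attempt < max_retries:
                    sleep(retry_after_seconds)  # fixed wait; real backoff may grow
    raise RateLimited("all providers in the chain are rate limited")
```

With chain set to ["claude", "openrouter"], a job only reaches OpenRouter after Claude has been rate limited on the first attempt and on every retry.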

Worker connects but never picks up jobs

Symptom: The dashboard shows the worker as connected, but it never processes any jobs.

Fix:

Check that the worker’s provider and tags match the jobs in the queue. Use /dispatch/explain to see why the router isn’t selecting it:

curl -H "Authorization: Bearer mr-..." \
  "https://app.modelreins.com/dispatch/explain?job_tags=code&urgency=normal"

If jobs are queued for claude but the worker only supports ollama, it won’t pick them up.

Verify the worker isn’t paused or at max concurrency. Check the Fleet panel in the Saddle or dashboard — look for paused: true or concurrency: 0.
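
The checks in this section can be collected into one diagnostic. In the sketch below, the paused and concurrency fields come from the Fleet panel values mentioned above, while the providers field and the overall shape of the worker record are assumptions made for illustration.

```python
def why_idle(worker, queued_provider):
    """List the reasons a connected worker would not pick up a queued job."""
    reasons = []
    if queued_provider not in worker.get("providers", []):
        reasons.append("provider mismatch")
    if worker.get("paused"):
        reasons.append("paused")
    if worker.get("concurrency") == 0:
        reasons.append("concurrency is 0")
    return reasons


worker = {"providers": ["ollama"], "paused": False, "concurrency": 2}
print(why_idle(worker, "claude"))  # ['provider mismatch']
```

An empty list means none of these conditions apply, in which case tag matching is the next thing to check.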

Dashboard not loading

Symptom: The dashboard URL returns a blank page or connection refused.

Fix:

SaaS users: Check status.modelreins.com or try curl https://app.modelreins.com/health. If the health endpoint is down, it’s on our side.
Self-hosted: Check that the coordinator process is running and hit the health endpoint:

curl http://your-coordinator:7420/health

If accessing remotely, check firewall rules allow traffic on the coordinator port.
If using a reverse proxy, ensure WebSocket connections are proxied (the dashboard uses WebSockets for live updates):

location / {
    proxy_pass http://localhost:7420;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}
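
The health checks above can be scripted to separate "the coordinator is down" from "the network path is blocked". This is a minimal sketch that assumes the health endpoint returns HTTP 200 when healthy; check_health and its opener parameter are hypothetical names, with opener present only so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def check_health(url, opener=urllib.request.urlopen, timeout=5):
    """Classify a health endpoint as 'ok', 'unhealthy', or 'unreachable'."""
    try:
        with opener(url, timeout=timeout) as resp:
            return "ok" if resp.status == 200 else "unhealthy"
    except urllib.error.HTTPError:
        return "unhealthy"    # the server answered, but with an error status
    except OSError:
        return "unreachable"  # refused, timed out, or DNS failure
```

An unreachable result from a remote machine combined with ok on the coordinator host itself suggests a firewall or reverse proxy issue rather than a stopped process.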