Troubleshooting

Symptom: Dispatching a job returns 503 with No eligible workers or the job sits in pending indefinitely.

Cause: No worker is online that matches the job’s requirements (provider, tags, capacity).

Fix:

  1. Check that at least one worker is online and idle:
curl -H "Authorization: Bearer mr-..." \
https://app.modelreins.com/presence

If the list is empty, no workers are connected. Install the Companion app or see Add a Worker.

  2. Check tag overlap. If the job requires tags: ["code"] but your worker is tagged draft,triage, the router won’t match it. Use /dispatch/explain to see the score breakdown:
curl -H "Authorization: Bearer mr-..." \
"https://app.modelreins.com/dispatch/explain?job_tags=code&urgency=normal"

  3. Check the worker isn’t paused or at max concurrency. In the Saddle or dashboard, look at the worker’s status in the Fleet panel.

  4. If using a self-hosted coordinator, make sure the worker’s MODELREINS_API_URL points to the correct address and the API key is valid.
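
The tag check in step 2 can be illustrated with a small sketch. It assumes a worker is only eligible when it carries every tag the job requires; the real router scores candidates, so treat /dispatch/explain as the authority. The function name `missing_tags` is hypothetical.

```python
def missing_tags(job_tags, worker_tags):
    """Return the job tags that the worker does not carry, sorted.

    Assumption: a worker must carry every required tag to be eligible.
    """
    return sorted(set(job_tags) - set(worker_tags))


# The example from step 2: the job needs "code", the worker is tagged "draft,triage".
worker_tags = "draft,triage".split(",")
print(missing_tags(["code"], worker_tags))  # ['code']
```

An empty result means the tags overlap as required, so the cause is more likely a paused worker, exhausted concurrency, or a provider mismatch.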


Symptom: Workers were running fine, then suddenly all jobs fail with 401 Unauthorized or Authentication failed.

Cause: You rotated your API key (at the provider or at the ModelReins coordinator) but not all workers picked up the new key.

Fix:

  1. Update the key in your environment or config file:
export MODELREINS_API_KEY=new-key-here

  2. Restart all workers. If using the Companion app, open the tray menu → Settings → update the API key and click Save. The worker restarts automatically.

  3. If using the MCP channel, update the key in your .mcp.json or VS Code settings and restart the MCP client.

Prevention: Use a shared config file or secrets manager so all workers read from the same source. The MODELREINS_CONFIG_URL env var can point workers at a remote config endpoint.


Symptom: The Companion or worker fails to connect to Ollama at http://localhost:11434.

Fix:

  1. Check if Ollama is running:
curl http://localhost:11434/api/tags

If this fails, start the Ollama service:

# Linux (systemd service; starting it needs root privileges)
sudo systemctl start ollama
# macOS — open the Ollama app, or:
ollama serve

  2. If Ollama is on a different host or port:
export MODELREINS_OLLAMA_HOST=http://192.168.1.50:11434

  3. If Ollama is running inside a container or VM, make sure the ModelReins worker can reach it over the network. Set MODELREINS_OLLAMA_HOST to the correct address.
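
To run the same reachability check from the machine or container where the worker lives, a short script can help. This is a sketch: it assumes the worker falls back to http://localhost:11434 when MODELREINS_OLLAMA_HOST is unset, and the helper names are hypothetical.

```python
import os
import urllib.error
import urllib.request

DEFAULT_OLLAMA_HOST = "http://localhost:11434"  # assumed default, per the symptom above


def ollama_tags_url(env=None):
    """Build the /api/tags URL from MODELREINS_OLLAMA_HOST, or the default host."""
    env = os.environ if env is None else env
    host = env.get("MODELREINS_OLLAMA_HOST") or DEFAULT_OLLAMA_HOST
    return host.rstrip("/") + "/api/tags"


def ollama_reachable(url, timeout=3.0):
    """Return True if a GET request to the URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


if __name__ == "__main__":
    url = ollama_tags_url()
    print(url, "reachable" if ollama_reachable(url) else "unreachable")
```

Run it in the worker's own environment. A check that passes on the host can still fail inside a container, where localhost refers to the container itself.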

Symptom: Jobs dispatched to LM Studio fail with Model not found or No model loaded.

Fix:

  1. Open LM Studio and check the Local Server tab.
  2. Make sure a model is selected and loaded in the model dropdown. The server can run without a model selected, but it won’t process requests.
  3. Verify the server is serving the expected model:
curl http://localhost:1234/v1/models

  4. If the model name in the response doesn’t match your ModelReins config, update the config:

{
  "lmstudio": {
    "model": "TheBloke/Llama-3.2-GGUF"
  }
}

Use the exact model name from the /v1/models response.
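
The comparison in steps 3 and 4 can be scripted. This sketch assumes the /v1/models response follows the OpenAI-style shape, a `data` array of objects with an `id` field; the sample id below is made up for illustration.

```python
import json


def served_model_ids(models_response):
    """Extract the model ids from an OpenAI-style /v1/models response body."""
    return [entry["id"] for entry in models_response.get("data", [])]


def model_is_served(models_response, configured_model):
    """True only on an exact, case-sensitive match with a served model id."""
    return configured_model in served_model_ids(models_response)


# Hypothetical response body; substitute the real output of the curl command.
sample = json.loads(
    '{"object": "list", "data": [{"id": "llama-3.2-3b-instruct", "object": "model"}]}'
)
print(model_is_served(sample, "TheBloke/Llama-3.2-GGUF"))  # False
```

When the check returns False, copy one of the ids from `served_model_ids` into the config verbatim.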


Symptom: A job shows status: running indefinitely. The worker processing it may have crashed or disconnected.

Fix:

  1. Check which worker has the job:
curl -H "Authorization: Bearer mr-..." \
https://app.modelreins.com/api/jobs/<job-id>

  2. Check if that worker is still alive via the presence endpoint:
curl -H "Authorization: Bearer mr-..." \
https://app.modelreins.com/presence

  3. If the worker is dead, retry the job to release it back to the queue:
curl -X POST -H "Authorization: Bearer mr-..." \
https://app.modelreins.com/api/jobs/<job-id>/retry

  4. If this happens frequently, increase the job timeout and enable automatic reaping:

{
  "jobs": {
    "timeout_seconds": 300,
    "reap_stale_after_seconds": 600
  }
}

The coordinator will automatically requeue jobs that have been running longer than reap_stale_after_seconds without a heartbeat.
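
As a rough mental model of reaping, the sketch below flags running jobs whose last heartbeat is older than the threshold. It is an illustration, not the coordinator's code: it assumes staleness is measured from the most recent heartbeat, and the job fields `id`, `status` and `last_heartbeat` are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_heartbeat, now, reap_stale_after_seconds):
    """True when the last heartbeat is older than the reap threshold."""
    return (now - last_heartbeat) > timedelta(seconds=reap_stale_after_seconds)


def jobs_to_requeue(jobs, now, reap_stale_after_seconds=600):
    """Return the ids of running jobs that have gone quiet for too long."""
    return [
        job["id"]
        for job in jobs
        if job["status"] == "running"
        and is_stale(job["last_heartbeat"], now, reap_stale_after_seconds)
    ]


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
jobs = [
    {"id": "a", "status": "running", "last_heartbeat": now - timedelta(minutes=11)},
    {"id": "b", "status": "running", "last_heartbeat": now - timedelta(minutes=5)},
]
print(jobs_to_requeue(jobs, now))  # ['a']
```

With the values shown in the config, a job on a crashed worker would wait up to 600 seconds before being requeued, so lower reap_stale_after_seconds if that delay is too long for your workload.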


Symptom: Cloud provider jobs fail with 429 Too Many Requests or Rate limit exceeded.

Fix:

  1. Immediate: Pause the affected worker from the dashboard or Saddle, then reduce its concurrency in the Companion settings.

  2. Short-term: Enable built-in rate limit handling. ModelReins will automatically back off and retry:

{
  "providers": {
    "claude": {
      "rate_limit": {
        "max_concurrent": 3,
        "retry_after_seconds": 10,
        "max_retries": 5
      }
    }
  }
}

  3. Long-term: Spread load across providers using routing rules. Add OpenRouter as a fallback; it handles rate limiting across multiple upstream providers:

{
  "routing": {
    "strategy": "fallback",
    "chain": ["claude", "openrouter"]
  }
}
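
To see how the retry settings and the fallback chain interact, here is a minimal sketch. It assumes a fixed wait of retry_after_seconds between attempts and that the next provider in the chain is tried once retries are exhausted; the actual backoff schedule may differ. `RateLimited` and both function names are hypothetical.

```python
import time


class RateLimited(Exception):
    """Hypothetical error standing in for an HTTP 429 from a provider."""


def call_with_retries(call, max_retries=5, retry_after_seconds=10, sleep=time.sleep):
    """Call once, then retry up to max_retries times while rate limited."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RateLimited:
            if attempt == max_retries:
                raise  # retries exhausted, let the caller fall back
            sleep(retry_after_seconds)


def dispatch_with_fallback(chain, providers, **retry_opts):
    """Try providers in chain order; move on when one stays rate limited."""
    for name in chain:
        try:
            return name, call_with_retries(providers[name], **retry_opts)
        except RateLimited:
            continue
    raise RateLimited("all providers in the chain are rate limited")
```

With max_retries set to 5 and retry_after_seconds set to 10, a provider that keeps returning 429 holds a job for about 50 seconds before the fallback is tried, which is worth keeping in mind when choosing the job timeout.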

Symptom: The dashboard shows the worker as connected, but it never processes any jobs.

Fix:

  1. Check that the worker’s provider and tags match the jobs in the queue. Use /dispatch/explain to see why the router isn’t selecting it:
curl -H "Authorization: Bearer mr-..." \
"https://app.modelreins.com/dispatch/explain?job_tags=code&urgency=normal"

If jobs are queued for claude but the worker only supports ollama, it won’t pick them up.

  2. Verify the worker isn’t paused or at max concurrency. Check the Fleet panel in the Saddle or dashboard and look for paused: true or concurrency: 0.
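
The two checks above can be combined into one pass over a worker record. This sketch assumes a worker is described by `providers`, `paused` and `concurrency` fields, mirroring the values shown in the Fleet panel; the record shape and the function name are hypothetical.

```python
def idle_reasons(worker, queued_provider):
    """List the reasons a connected worker would not pick up queued jobs."""
    reasons = []
    if queued_provider not in worker.get("providers", []):
        reasons.append(f"worker does not support provider {queued_provider!r}")
    if worker.get("paused", False):
        reasons.append("worker is paused")
    if worker.get("concurrency", 1) == 0:
        reasons.append("worker concurrency is 0")
    return reasons


# Jobs are queued for claude, but the worker only supports ollama.
worker = {"providers": ["ollama"], "paused": False, "concurrency": 2}
print(idle_reasons(worker, "claude"))
```

An empty list means none of these causes apply, in which case the score breakdown from /dispatch/explain is the next place to look.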

Symptom: The dashboard URL returns a blank page or connection refused.

Fix:

  1. SaaS users: Check status.modelreins.com or try curl https://app.modelreins.com/health. If the health endpoint is down, it’s on our side.

  2. Self-hosted: Check that the coordinator process is running and hit the health endpoint:

curl http://your-coordinator:7420/health

  3. If accessing remotely, check that firewall rules allow traffic on the coordinator port.

  4. If using a reverse proxy, ensure WebSocket connections are proxied (the dashboard uses WebSockets for live updates):

location / {
    proxy_pass http://localhost:7420;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";
}