Bring Your Own Harness
You already have an AI coding agent. Maybe two. Maybe Claude Code is your daily driver, you’ve been eyeing OpenAI’s Codex CLI, and Aider is sitting in your pip list from that one weekend project.
ModelReins doesn’t replace any of them. It harnesses them. Each agent becomes a worker. The router picks the best one for each job. When one hits a cap, the others pick up.
This walkthrough wires Codex CLI as a worker on your laptop. The same pattern works for any harness with a non-interactive CLI mode.
Why this matters
Anthropic just told subscription users that third-party harnesses don’t get to run on their plan limits anymore. OpenAI has its own caps. Local models don’t have cloud caps but they have RAM. No single agent is bulletproof.
ModelReins is the layer that admits this. Your fleet has all of them. The router routes around whichever one is currently broken.
Step 1: Install the harness
For Codex CLI:
    npm install -g @openai/codex
    codex login
For Aider:
    pip install aider-chat
For OpenClaw or anything else: install however its docs say. If it has a non-interactive mode (something like tool exec "<prompt>" or tool --message "<prompt>"), it will work as a ModelReins worker.
Step 2: Create a worker directory
    mkdir -p worker/codex-mybox
    cd worker/codex-mybox
Step 3: Drop in a start.bat (Windows) or start.sh (Mac/Linux)
    @echo off
    cd /d "%~dp0"
    set MODELREINS_URL=https://app.modelreins.com
    set MODELREINS_TOKEN=mr-yourname-xxxxxxxxxxxxxxxxxx
    set MODELREINS_WORKER=codex-mybox
    set MODELREINS_WORKER_TYPE=codex
    set MODELREINS_WORKER_MODEL=gpt-5-codex
    set MODELREINS_WORKER_TAGS=code,architecture,review,refactor,general
    set MODELREINS_PROVIDER=codex
    set MODELREINS_SKIP_PREFLIGHT=1
    set MODELREINS_POLL_MS=5000
    node "%~dp0..\daemon.js" codex-mybox
The provider name (MODELREINS_PROVIDER=codex) tells ModelReins which preset to use. Built-in presets: claude, codex, aider, ollama-cli, ollama-http, lmstudio, 1minai. The preset knows the right command, prompt arg, and flags, so you don’t have to.
Step 4: Start it
    ./start.bat
You’ll see the rein banner and the message "Ready — waiting for jobs...". The worker is now polling the brain every 5 seconds for jobs assigned to its name.
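To make the polling behavior concrete, here is a minimal Python sketch of a worker loop. It is not the ModelReins daemon (the real worker is daemon.js, whose internals this walkthrough does not show). The callables fetch_job and run_job are hypothetical stand-ins for asking the brain for work and executing it; only the 5000 ms interval comes from MODELREINS_POLL_MS in start.bat.

```python
import time
from typing import Callable, Optional


def poll_for_jobs(
    worker_name: str,
    fetch_job: Callable[[str], Optional[dict]],
    run_job: Callable[[dict], str],
    poll_ms: int = 5000,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> list[str]:
    """Poll for jobs addressed to worker_name and run each one found.

    fetch_job and run_job are hypothetical stand-ins, not ModelReins APIs.
    max_polls bounds the loop so the sketch can terminate; a real worker
    would keep looping until it is stopped.
    """
    results = []
    polls = 0
    while max_polls is None or polls < max_polls:
        job = fetch_job(worker_name)
        if job is not None:
            results.append(run_job(job))
        else:
            # Nothing queued: wait one poll interval before asking again.
            sleep(poll_ms / 1000)
        polls += 1
    return results


# Demo with an in-memory queue and a no-op sleep so it finishes instantly.
pending = [{"prompt": "refactor this function for clarity"}]
outputs = poll_for_jobs(
    "codex-mybox",
    fetch_job=lambda name: pending.pop(0) if pending else None,
    run_job=lambda job: f"ran: {job['prompt']}",
    max_polls=2,
    sleep=lambda seconds: None,
)
print(outputs)
```

Injecting fetch_job, run_job, and sleep keeps the loop itself free of any network or timing assumptions, which is why the demo can run without a ModelReins account.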
Step 5: Send it a job
Open the Saddle in VS Code. The new worker shows up in the Target row at the bottom of the cockpit. Click it to pin all dispatches to that worker, then hit Dispatch Job with any prompt.
    DISPATCH standard / auto → codex-mybox
    Prompt: refactor this function for clarity
It runs. Output streams back into the saddle. You just used Codex CLI as a ModelReins worker.
What just happened
You took an AI agent that Anthropic doesn’t want running under your subscription, dropped it onto your own machine, and gave it a job through ModelReins. The job ran on your hardware, billed against your OpenAI account, and the result came back through the same saddle you use for Claude Code.
Now do it again with Claude Code as a second worker. And Ollama as a third. Now you have three independent providers in a single fleet, and the router picks between them. When Anthropic caps your subscription at 5pm, your fleet keeps running on Codex and Ollama. When OpenAI rate-limits you, the fleet keeps running on Claude and Ollama. When the internet dies, Ollama keeps running.
That’s the harness around your harnesses. That’s the rein.
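The failover idea in this section can be sketched in a few lines of Python. This is an illustration under assumptions, not ModelReins's router: Worker, WorkerUnavailable, and dispatch_with_failover are hypothetical names, the worker names claude-mybox and ollama-mybox are made up for the demo, and the real router's selection logic is not described in this walkthrough.

```python
from dataclasses import dataclass
from typing import Callable


class WorkerUnavailable(Exception):
    """Raised by a worker that is capped, rate-limited, or offline."""


@dataclass
class Worker:
    name: str
    run: Callable[[str], str]  # takes a prompt, returns the output


def dispatch_with_failover(prompt: str, workers: list[Worker]) -> tuple[str, str]:
    """Try workers in order; return (worker_name, output) from the first success."""
    failures = []
    for worker in workers:
        try:
            return worker.name, worker.run(prompt)
        except WorkerUnavailable as exc:
            # Remember why this worker was skipped, then try the next one.
            failures.append(f"{worker.name}: {exc}")
    raise RuntimeError("no worker available: " + "; ".join(failures))


def capped(prompt: str) -> str:
    """Simulates a provider that has hit its subscription cap."""
    raise WorkerUnavailable("subscription cap reached")


fleet = [
    Worker("claude-mybox", capped),
    Worker("codex-mybox", lambda p: f"codex handled: {p}"),
    Worker("ollama-mybox", lambda p: f"ollama handled: {p}"),
]
print(dispatch_with_failover("refactor this function for clarity", fleet))
```

In the demo the first worker is capped, so the job falls through to the second. A fleet where every worker is unavailable raises an error that lists each reason.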
Adding a custom harness
If your tool isn’t in the preset list, add a YAML file at providers/yourtool.yaml:
    name: yourtool
    display_name: "Your Tool"
    command: yourtool
    prompt_arg: "exec"
    extra_args: "--non-interactive --skip-confirm"
    output_format: lines
    capabilities:
      - code_gen
      - file_edit
    cost_model: per_task
prompt_arg is whatever flag (or subcommand) the tool uses to take a single prompt. extra_args is the bag of flags that put the tool in headless mode (skip git checks, bypass approvals, no interactive prompts). output_format: lines is the safe default; use stream-json only if the tool actually emits Anthropic-style stream events.
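To see how these fields might combine into a command line, here is a small Python sketch. The ordering (command, then prompt_arg, then the prompt, then extra_args) is an assumption chosen for illustration; the walkthrough does not specify how the daemon assembles the final command, and build_command is a hypothetical helper, not part of ModelReins.

```python
import shlex

# Field values copied from the example preset above.
preset = {
    "command": "yourtool",
    "prompt_arg": "exec",
    "extra_args": "--non-interactive --skip-confirm",
}


def build_command(preset: dict, prompt: str) -> list[str]:
    """Assemble an argv list from preset fields (the ordering is an assumption)."""
    argv = [preset["command"]]
    if preset.get("prompt_arg"):
        argv.append(preset["prompt_arg"])
    # The prompt directly follows prompt_arg, so the same layout works for
    # a subcommand ("exec") and for a flag that takes a value ("--message").
    argv.append(prompt)
    # extra_args is a single string in the YAML; split it the way a shell would.
    argv.extend(shlex.split(preset.get("extra_args", "")))
    return argv


argv = build_command(preset, "refactor this function for clarity")
print(shlex.join(argv))
```

Building an argv list rather than one long string means the prompt can be handed to subprocess.run without shell quoting problems, even when it contains spaces or quotes.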
Then point a worker at it:
    set MODELREINS_PROVIDER=yourtool
That’s the entire integration. Same pattern for everything.
- Add reserves so the router knows how much budget each provider has
- Pick the right effort tier so jobs go to the right worker without thinking about it