June 8, 2026 · 7 min read

Compiled Agents for mission-critical work

Introducing Compiled Agents

We're starting early access to a new type of harness for modern work: the compiled agent. These are highly reliable, fast, low-cost agents that can run repeatable tasks consistently. We achieve this by combining AI with code in a way that minimizes non-determinism, replaces parts of AI where code works better, and imposes a state machine to complete tasks. We protect the agent's reliability with tests and guardrails.

For businesses looking to scale operations with AI but running into high token costs and unreliable execution, compiled agents keep costs low, reduce errors, and ensure compliance.

What's wrong with agents today?

Most agents decide what to do dynamically at runtime and stuff everything into one enormous context that is reasoned over on every call. That makes them slow and expensive, and it is the source of "drift": the more the agent accumulates, the more it loses the thread.

Enterprise work requires repeatability and strong compliance. An agent that works "most of the time" is not good enough; it needs to work every time.

Enter: compiled agents

Squig agents are compiled: you give instructions in plain language, inside Squig or Slack. Your instructions are compiled into an internal workflow that is deterministic where it can be and model-driven where it needs to be. Tests and evals protect your specifications against regressions from future changes.
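To make the idea concrete, here is a minimal sketch of a workflow run as a state machine, with plain-code steps and one model-driven step. It is an illustration under our own assumptions, not Squig's internal format: the names Step, run_workflow, and fake_model, and the lead fields, are all hypothetical, and the model call is replaced by a stub so the example runs offline.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    """One state in the workflow (hypothetical structure)."""
    run: Callable[[dict], dict]                   # takes state, returns new state
    next_step: Callable[[dict], Optional[str]]    # name of next step, None to stop
    uses_model: bool = False                      # True only where code is not enough


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return "qualified" if "enterprise" in prompt.lower() else "unqualified"


def normalize(state: dict) -> dict:
    # Deterministic step: plain code, same output for the same input.
    return {**state, "email": state["email"].strip().lower()}


def classify(state: dict) -> dict:
    # Model-driven step: a narrow question with a narrow context.
    return {**state, "label": fake_model(f"Classify this lead: {state['notes']}")}


def route(state: dict) -> dict:
    # Deterministic step: the business rule lives in code, not in a prompt.
    queue = "sales" if state["label"] == "qualified" else "nurture"
    return {**state, "queue": queue}


WORKFLOW: Dict[str, Step] = {
    "normalize": Step(normalize, lambda s: "classify"),
    "classify": Step(classify, lambda s: "route", uses_model=True),
    "route": Step(route, lambda s: None),
}


def run_workflow(workflow: Dict[str, Step], start: str, state: dict,
                 max_steps: int = 20) -> Tuple[dict, List[str]]:
    """Walk the state machine and return the final state plus the step trace."""
    current: Optional[str] = start
    trace: List[str] = []
    while current is not None:
        if len(trace) >= max_steps:
            raise RuntimeError("workflow did not terminate")
        step = workflow[current]
        state = step.run(state)
        trace.append(current)
        current = step.next_step(state)
    return state, trace


result, trace = run_workflow(
    WORKFLOW, "normalize",
    {"email": "  Ada@Example.COM ", "notes": "Enterprise buyer, 500 seats"},
)
```

Only the classify step touches a model here; everything around it is ordinary code that can be unit-tested, which is the property the paragraph above describes.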

Low cost, high reliability and fast performance

Because each step is tightly scoped, Squig can match an appropriate model to the work. A lightweight model handles the simple steps, and full power is reserved for the few that need it.

Smaller models work well because each step is scoped small enough. The most capable, most expensive models are usually used once up front, to generate the workflow. After that, every run is cheap and fast. These agents run on the workflow and integration infrastructure we've built over the years.
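One way to picture per-step model matching is a lookup from step to model tier. The sketch below is hypothetical: the tier names, the step names, and the choice to fall back to the largest tier for unknown steps are our assumptions for illustration, not a description of how Squig assigns models.

```python
from enum import IntEnum
from typing import Dict, Iterable


class Tier(IntEnum):
    NONE = 0   # plain code, no model call at all
    SMALL = 1  # lightweight model for narrowly scoped steps
    LARGE = 2  # most capable model, reserved for the hard steps


# Hypothetical steps and the tier each was assigned at compile time.
STEP_TIERS: Dict[str, Tier] = {
    "parse_email_headers": Tier.NONE,
    "classify_intent": Tier.SMALL,
    "draft_reply": Tier.SMALL,
    "resolve_policy_conflict": Tier.LARGE,
}


def pick_tier(step_name: str) -> Tier:
    """Return the tier for a step; unknown steps get the safe, costly default."""
    return STEP_TIERS.get(step_name, Tier.LARGE)


def tier_counts(steps: Iterable[str]) -> Dict[str, int]:
    """Count how many steps in a run land on each tier."""
    counts = {tier.name: 0 for tier in Tier}
    for step in steps:
        counts[pick_tier(step).name] += 1
    return counts


counts = tier_counts(STEP_TIERS)
```

In this made-up run, only one of four steps needs the large tier, which is the shape of workload where per-step matching pays off.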

This structured approach adds significant determinism around AI and tightly binds the agents to a narrow scope and context to reduce drift.

Our memory is compiled, too: instructions become code. This also replaces skill files and, in our testing, is far more token-efficient, faster, and more reliably repeatable, and it sidesteps context rot. We still use RAG for episodic memory and for information retrieval that does not impact agent behavior.

Demonstrating better performance: Topping τ-bench

τ-bench benchmark results showing Squig scoring 100 at pass^4

τ-bench, from Sierra.ai, is a benchmark built to measure agent repeatability. Our preliminary testing shows Squig scores a perfect 100 at pass^4 (a task counts only if it succeeds on all 4 of 4 repeated trials). This score would be impossible with a purely agentic approach. Our hybrid approach is what lets us top the benchmark without even using the strongest model.
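For readers unfamiliar with the metric, here is a small sketch of how a pass^k score can be computed. As we understand the τ-bench definition, pass^k for one task is estimated as C(c, k) / C(n, k), where c of n trials succeeded, and the benchmark score is the mean over tasks. The function names and the sample numbers below are ours and purely illustrative; they are not Squig's results.

```python
from math import comb
from typing import Iterable, Tuple


def pass_hat_k(successes: int, trials: int, k: int) -> float:
    """Estimate the chance that k independent trials of one task all succeed."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    if not 1 <= k <= trials:
        raise ValueError("k must be between 1 and trials")
    # math.comb returns 0 when successes < k, so a single failure in
    # exactly k trials drives the estimate for that task to 0.
    return comb(successes, k) / comb(trials, k)


def benchmark_pass_hat_k(results: Iterable[Tuple[int, int]], k: int) -> float:
    """Average pass^k over tasks; each result is a (successes, trials) pair."""
    results = list(results)
    return sum(pass_hat_k(c, n, k) for c, n in results) / len(results)


# Made-up example: task A passes 4 of 4 trials, task B passes 3 of 4.
sample = [(4, 4), (3, 4)]
score_k1 = benchmark_pass_hat_k(sample, k=1)  # ordinary average pass rate
score_k4 = benchmark_pass_hat_k(sample, k=4)  # every trial must succeed
```

With these made-up numbers the average pass rate is 0.875, but pass^4 is 0.5, because task B's single failure zeroes out its contribution. That strictness is why the metric rewards repeatability rather than occasional success.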

Self-healing and better guarded tool calling

Compiled parts of an agent can be brittle. When new edge cases are discovered, they are fed back into the compilation process at the end of each run. All changes are versioned, verified, and diffable, rather than improvised on each call.

Our integration layer lets users specify hardcoded limits on tools. For example, tool fields can be pre-filled, after which they are removed from the AI's access: you can pre-fill the "to" field of an email tool so that the recipient is fixed. Even if the AI wanted to, it couldn't make the disallowed call.
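The pre-filling idea can be sketched as a thin wrapper around a tool. This is a hypothetical illustration, not Squig's integration layer: GuardedTool, send_email, and the example address are invented names, and the email function only returns the message instead of sending it.

```python
from typing import Any, Callable, Dict, List, Sequence


class GuardedTool:
    """Wraps a tool so pre-filled fields are hidden from, and locked against, the model."""

    def __init__(self, func: Callable[..., Any], fields: Sequence[str],
                 prefilled: Dict[str, Any]) -> None:
        unknown = set(prefilled) - set(fields)
        if unknown:
            raise ValueError(f"cannot pre-fill unknown fields: {sorted(unknown)}")
        self._func = func
        self._fields = tuple(fields)
        self._prefilled = dict(prefilled)

    def visible_fields(self) -> List[str]:
        """Fields exposed to the model; locked fields are omitted entirely."""
        return [f for f in self._fields if f not in self._prefilled]

    def call(self, **model_args: Any) -> Any:
        """Run the tool with model-supplied args merged under the locked values."""
        locked = set(model_args) & set(self._prefilled)
        if locked:
            # Enforced in code, so a prompt cannot talk its way around it.
            raise PermissionError(f"model may not set locked fields: {sorted(locked)}")
        unknown = set(model_args) - set(self._fields)
        if unknown:
            raise ValueError(f"unknown fields: {sorted(unknown)}")
        return self._func(**self._prefilled, **model_args)


def send_email(to: str, subject: str, body: str) -> Dict[str, str]:
    """Stand-in email tool: returns the message rather than sending it."""
    return {"to": to, "subject": subject, "body": body}


email_tool = GuardedTool(
    send_email,
    fields=["to", "subject", "body"],
    prefilled={"to": "support@example.com"},
)
message = email_tool.call(subject="Order update", body="Your package shipped.")
```

The model only ever sees the subject and body fields, and an attempt to pass its own "to" value is rejected before the tool runs.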

Example of tool field pre-filling to restrict AI access

Tackling the jagged frontier of AI

Diagram illustrating the jagged frontier of AI capability

Models can look impressively competent on some tasks, then fail unpredictably on simpler tasks that are slightly more structured, longer-horizon, or rule-bound. This "jagged" boundary is a significant source of routine agent failures. By compiling agents, we systematically migrate tasks from the frontier's "danger zone" into structured, code-hardened pipelines with explicit checkpoints, making outcomes more repeatable.

Automating work for common business operations

We built compiled agents out of necessity: while working with customers in sales, marketing, and business operations, we found that typical agents with complex instructions fail rapidly. In any business case, there are hundreds of micro-instructions, from something as simple as how you qualify a lead to how you create a task for your engineering team.

Most AI adoption is constrained by AI's ability to perform high-quality, reliable work. With reliability at the core of the compiled agents, teams should be able to unlock productivity across many common business functions.

CRM Cleanup: If you have thousands of leads in your CRM, applying AI to clean all of them gets expensive. With a compiled agent, you provide your cleaning criteria and the agent implements a fast, cheap solution for enriching, scoring, and triaging your leads.

Outbound Sales: Give an agent your ICP and playbook, and the agent will research, qualify, personalize and engage with prospects via email and LinkedIn. The agent can be trained further based on the responses you're getting.

Support Triage: Have an agent listen to and respond to incoming support chat messages. For any topic it does not know, it will ask on Slack and update its own knowledge base. It can perform safe actions on the user's behalf, such as issuing refunds, tracking packages, or making account changes.

Team Operations: Given a Slack channel, the agent listens to the chatter, creates tasks when asked, shows any existing overlapping tasks, and keeps the channel updated on task progress.

Make every employee an AI expert

Most agent deployments require specialized Forward Deployed Engineers (FDEs) to implement skills, workflows, tools, RAG pipelines, evaluations, and so on, a process that takes 4-6 weeks to ramp up.

For the vast majority of use cases, our compiled agents cut implementation time significantly and put the power to change the agent directly in the hands of the business user.

Open for Early Access today

A compiled agent has all the generality of an agent with the reliability of code you'd actually ship. That combination is what makes it safe for repeatable, mission-critical work. Apply for early access to compiled agents at squig.com.
