
What is an AI agent? The full breakdown

Updated May 21, 2026 · 7 min read

The short answer

An AI agent is a system that turns a goal into a sequence of tool calls. You give it an objective — "extract every line item from this PDF and post it to NetSuite" — and it plans the steps, picks the right tools (vision model, schema validator, API call), executes them, recovers from failures, and either finishes the task or escalates to a human with context.

That's it. Everything else — RAG, function calling, LangGraph, MCP — is implementation detail. The defining behavior is turning goals into completed work via tools.

Why "agent" became a marketing word

The word has been overloaded. Three years ago, "AI agent" meant a research project. Two years ago, it meant a clever LangChain script. Today, it gets stapled to anything from a basic chatbot to a fully autonomous coding system.

A useful working definition: a system qualifies as an agent only if all four of these are true:

  1. It receives a goal, not a single message. "Process this invoice and post it." vs "Tell me the weather."
  2. It plans the steps based on the goal and current state. Not a fixed workflow.
  3. It calls tools beyond the LLM itself — APIs, function calls, retrieval, side effects.
  4. It loops — it checks results, recovers from failures, asks for clarification, finishes the job.

Single-turn chatbots, generic LLM API wrappers, and basic prompt templates miss at least one of these. They might be useful — they're just not agents.
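The four criteria can be made concrete with a minimal loop. The sketch below is illustrative only: `fake_planner` stands in for the LLM, and the `Step` structure, the tool names, and the return shape are all hypothetical rather than any particular SDK's API.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    tool: Optional[str]  # name of the tool to call next, or None when the goal is met
    args: dict


def fake_planner(goal: str, history: list) -> Step:
    """Stand-in for the LLM: chooses the next step from the goal and current state."""
    done = {name for name, _ in history}
    if "extract" not in done:
        return Step("extract", {"doc": goal})
    if "post" not in done:
        return Step("post", {"data": history[-1][1]})
    return Step(None, {})


# Hypothetical tools; real ones would call a vision model and an ERP API.
TOOLS: dict[str, Callable[..., object]] = {
    "extract": lambda doc: {"total": 120.0, "source": doc},
    "post": lambda data: f"posted {data['total']:.2f}",
}


def run_agent(goal: str, planner=fake_planner, tools=TOOLS, max_steps: int = 10) -> dict:
    history = []  # (tool name, result) pairs: the state the planner sees
    for _ in range(max_steps):
        step = planner(goal, history)          # criterion 2: plan from goal + state
        if step.tool is None:
            return {"status": "done", "history": history}
        try:
            result = tools[step.tool](**step.args)  # criterion 3: call a tool
        except Exception as exc:
            # Simplest possible recovery: stop and hand over with context.
            return {"status": "escalated", "reason": str(exc), "history": history}
        history.append((step.tool, result))    # criterion 4: observe, then loop
    return {"status": "escalated", "reason": "step limit reached", "history": history}
```

Calling `run_agent("invoice.pdf")` runs the extract step, then the post step, then stops. A single-turn chatbot would correspond to one model call with no loop and no tools.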

A worked example

Imagine a supplier sends an invoice as a PDF attachment to your AP inbox. The agent's job: get it posted in NetSuite, with audit trail, or escalate cleanly.

A chatbot version of this product wouldn't work — you can't ask a chatbot to "post my invoice" and expect anything to happen. An agent version looks like this:

  1. Receive goal: process the invoice attached to email message ID 12345.
  2. Plan: extract structured data → validate schema → match to PO → decide confidence tier → post or escalate.
  3. Execute the first planned step (extraction): call the vision model with the PDF and a typed schema. Get back JSON.
  4. Validate — does the schema parse cleanly? If no, route to review queue. If yes, continue.
  5. Match PO — call NetSuite to find candidate POs from this vendor in the date window. Apply tolerance rules.
  6. Decide — high confidence + matched PO → auto-post. Medium → review queue. Low → reject with structured reason.
  7. Take the action — write to NetSuite, attach PDF, log audit entry, mark email processed.
  8. Done — outcome logged on the dashboard.

Steps 2 through 8 are all the agent. The "intelligence" is in step 6 (the decision) — but the system behavior is in the loop, the tools, and the guardrails around the decision.
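The decision in step 6 can be sketched as a small pure function. The thresholds (0.9 and 0.6), the 2% tolerance, and the field names are assumptions chosen for illustration. The article also does not say what happens to a high-confidence invoice with no matching PO, so this sketch sends it to review.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    AUTO_POST = "auto_post"
    REVIEW = "review_queue"
    REJECT = "reject"


@dataclass
class Extraction:
    vendor: str
    total: float
    confidence: float  # 0.0 to 1.0, assumed to come from the extraction step


def within_tolerance(invoice_total: float, po_total: float, pct: float = 0.02) -> bool:
    """Tolerance rule: the invoice may differ from the PO by at most pct of the PO total."""
    return abs(invoice_total - po_total) <= pct * po_total


def decide(extraction: Extraction, po_totals: list[float],
           high: float = 0.9, low: float = 0.6) -> tuple[Outcome, str]:
    """Map confidence and PO match to an outcome plus a structured reason."""
    matched = any(within_tolerance(extraction.total, po) for po in po_totals)
    if extraction.confidence < low:
        return Outcome.REJECT, f"confidence {extraction.confidence:.2f} below {low:.2f}"
    if extraction.confidence >= high and matched:
        return Outcome.AUTO_POST, "high confidence and matched PO"
    return Outcome.REVIEW, "medium confidence or no PO within tolerance"
```

Keeping the decision separate from the tool calls makes it easy to test the tiers without touching NetSuite, and the returned reason string can go straight into the audit log.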

The anatomy of a production agent

Every production agent has six layers. Skip any of them at your peril.

1. The goal interface

How the agent receives goals. Could be a webhook ("new email arrived"), a UI button ("process this document"), a scheduled trigger ("research these companies overnight"), or a conversational ask ("book me a slot Tuesday").
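One way to keep the rest of the system independent of the trigger is to normalize every entry point into the same goal record. The sketch below assumes a hypothetical `Goal` type and a hypothetical webhook payload with a `message_id` field; neither comes from a specific framework.

```python
from dataclasses import dataclass, field
import uuid


@dataclass(frozen=True)
class Goal:
    objective: str                                 # what "done" means, in plain language
    source: str                                    # "webhook", "ui", "schedule", or "chat"
    payload: dict = field(default_factory=dict)    # trigger-specific data
    goal_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def goal_from_email_webhook(event: dict) -> Goal:
    """Turn a 'new email arrived' event into a goal (payload shape is assumed)."""
    message_id = event["message_id"]
    return Goal(
        objective=f"process the invoice attached to email message ID {message_id}",
        source="webhook",
        payload={"message_id": message_id},
    )


def goal_from_chat(text: str) -> Goal:
    """Turn a conversational ask into a goal."""
    return Goal(objective=text.strip(), source="chat")
```

Whatever the trigger, the planner then receives a `Goal` with an ID that can be used to tie together logs and audit entries.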

2. The reasoning model

The LLM that does the planning. Claude Sonnet, GPT-4o, Gemini — pick per task. The reasoning model is the brain, but it's only one component.

3. Retrieval

What the agent knows. Vector search over your knowledge base, SQL queries against your databases, API calls to other systems. Without retrieval, the agent has only what's in the model's training data — which is fine for general reasoning but useless for your business.
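As a rough illustration of how retrieved text reaches the model, the sketch below ranks documents by shared words and formats the top hits as context. Word overlap is a deliberately crude stand-in for vector search, and the function names and document IDs are made up for the example.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase words with surrounding punctuation stripped."""
    return {w.strip(".,?!").lower() for w in text.split()}


def retrieve(query: str, documents: dict[str, str], k: int = 2) -> list[str]:
    """Return up to k document IDs ranked by word overlap with the query."""
    q = tokenize(query)
    scored = [(len(q & tokenize(body)), doc_id) for doc_id, body in documents.items()]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: (-s[0], s[1]))
    return [doc_id for _, doc_id in ranked[:k]]


def build_context(query: str, documents: dict[str, str], k: int = 2) -> str:
    """Format the retrieved documents with their IDs so answers can cite them."""
    return "\n".join(f"[{doc_id}] {documents[doc_id]}"
                     for doc_id in retrieve(query, documents, k))
```

The output of `build_context` would be placed in the prompt alongside the goal. Swapping `retrieve` for an embedding search or a SQL query leaves the rest unchanged.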

4. Tools

What the agent can do. Defined as function-calling tools in the model SDK — findAvailability(date), bookAppointment(slot, customer), searchInvoices(filter). Each tool has a typed schema, validation, retry, and idempotency.

5. Guardrails

What the agent must not do. Approval gates on irreversible actions. Spend caps. Refusal patterns for off-scope requests. Filters on inputs (prompt-injection detection) and outputs (PII redaction).
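Two of these guardrails, approval gates and spend caps, fit in a single check that runs before every tool call. The tool names in `IRREVERSIBLE`, the status strings, and the per-call cost figures below are hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical names of tools whose effects cannot be undone.
IRREVERSIBLE = {"post_to_erp", "send_payment"}


@dataclass
class Guardrails:
    spend_cap: float
    spent: float = 0.0
    approved: set = field(default_factory=set)  # action IDs a human has approved

    def check(self, tool: str, action_id: str, cost: float) -> str:
        """Decide whether a tool call may proceed; only allowed calls are charged."""
        if self.spent + cost > self.spend_cap:
            return "blocked: spend cap exceeded"
        if tool in IRREVERSIBLE and action_id not in self.approved:
            return "pending: human approval required"
        self.spent += cost
        return "allowed"
```

For example, with `Guardrails(spend_cap=1.0)`, a call to `send_payment` returns the pending status until its action ID has been added to `approved`. The check sits outside the model, so a prompt cannot talk its way past it.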

6. Observability

How you know it's working. Per-trace logging — every input, every tool call, every model decision, every output. Plus dashboards (success rate, cost, latency) and an eval suite that runs on every change.
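Per-trace logging can start as an append-only list of events sharing a trace ID. The sketch below is a minimal in-memory version; the event kinds and field names are assumptions, and a production system would write to a log store or tracing backend instead.

```python
import json
import time
import uuid


class Trace:
    """Collects every event for one goal under a single trace ID."""

    def __init__(self, goal: str):
        self.trace_id = uuid.uuid4().hex
        self.events: list[dict] = []
        self.record("goal", {"goal": goal})

    def record(self, kind: str, data: dict) -> None:
        self.events.append(
            {"trace_id": self.trace_id, "ts": time.time(), "kind": kind, "data": data}
        )

    def to_jsonl(self) -> str:
        """One JSON object per line, ready to ship to a log store."""
        return "\n".join(json.dumps(event) for event in self.events)


def success_rate(outcomes: list[str]) -> float:
    """Share of traces that finished with status 'done' (a dashboard metric)."""
    return sum(o == "done" for o in outcomes) / len(outcomes) if outcomes else 0.0
```

Recording the tool name, arguments, and result on every call is what makes a bad outcome debuggable later. The same stored traces can be sampled for human review or replayed in an eval suite.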

Miss the retrieval and the agent hallucinates. Miss the guardrails and it does something it shouldn't. Miss the observability and you can't debug it. Miss the evals and it silently drifts.

Agents vs. chatbots vs. workflows

|  | Chatbot | Workflow | Agent |
|---|---|---|---|
| Input | Question | Trigger event | Goal |
| Output | Answer | Fixed steps | Whatever it takes to finish the job |
| Decision-making | None (just generation) | Rule-based | LLM-driven judgment |
| Tools | Sometimes | Yes (typed) | Yes (model-invoked) |
| Loops / recovery | No | Maybe | Yes |
| Best for | Q&A, exploration | Predictable rules | Unstructured inputs needing judgment |

Most real systems are hybrid. A document agent might be 80% workflow (deterministic steps for extraction, validation, routing) and 20% agent (the steps that need judgment — vendor disambiguation, anomaly classification). That's fine. Use the right shape for each step.

See our AI agents vs automation deep dive for the longer comparison.

Six concrete shapes of agent

Different problems call for different agent shapes. The six we keep building:

| Shape | Job |
|---|---|
| Document processing | Unstructured documents → structured data |
| Conversational | Knowledge-grounded Q&A with citations |
| Voice / phone | Real phone conversations with booking and routing |
| Research & synthesis | Multi-source information gathering with citations |
| Workflow orchestrator | Cross-SaaS event triggers with judgment steps |
| Code & integration | API plumbing, schema mapping, internal tools |

Most production deployments combine multiple shapes — a workflow orchestrator that invokes a document agent on incoming attachments and a voice agent on incoming calls.

How agents fail (and how to design against it)

Predictable failure modes:

  • Hallucinated tool arguments. The agent invents an order ID, a vendor, a slot. Fix: typed tool schemas with validation; the call fails at the boundary, the model retries or escalates.
  • Infinite loops. The agent retries a failing tool 47 times. Fix: hard step limits, timeout per loop iteration, cost ceilings.
  • Off-scope behavior. A booking agent is suddenly asked for medical advice. Fix: explicit scope in the system prompt; refusal patterns; classification on user input.
  • Silent drift. The agent worked great in month one, then quietly got worse. Fix: eval suite that runs in CI on every prompt/model change; sampled human review of real production traces.
  • Prompt injection. A user message contains "ignore previous instructions and..." Fix: prompt-injection detection; user input is not treated as system-level instruction.

Designing against these isn't optional. They're the difference between a demo and a system.
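The infinite-loop defenses in particular are cheap to implement. Below is a sketch of a budget object the agent loop charges on every step. The default limits and the `BudgetExceeded` exception are assumptions, and a per-iteration timeout is left out for brevity.

```python
from collections import Counter


class BudgetExceeded(Exception):
    """Raised when the agent should stop and escalate instead of continuing."""


class LoopBudget:
    def __init__(self, max_steps: int = 20, max_cost: float = 1.0,
                 max_failures_per_tool: int = 3):
        self.max_steps = max_steps
        self.max_cost = max_cost
        self.max_failures_per_tool = max_failures_per_tool
        self.steps = 0
        self.cost = 0.0
        self.failures = Counter()  # failed calls per tool name

    def charge(self, tool: str, cost: float, failed: bool = False) -> None:
        """Record one loop iteration and raise if any limit is crossed."""
        self.steps += 1
        self.cost += cost
        if failed:
            self.failures[tool] += 1
        if self.steps > self.max_steps:
            raise BudgetExceeded("step limit reached")
        if self.cost > self.max_cost:
            raise BudgetExceeded("cost ceiling reached")
        if self.failures[tool] >= self.max_failures_per_tool:
            raise BudgetExceeded(f"{tool} failed {self.failures[tool]} times")
```

With these defaults, a tool that keeps failing stops the run on its third failure rather than its forty-seventh. The caller catches `BudgetExceeded` and escalates with the trace attached.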

When to build an agent (and when not to)

Build an agent when:

  • The work involves unstructured inputs (PDFs, emails, voice, free-form text).
  • Each instance requires judgment, not a fixed rule.
  • Volume is non-trivial (dozens to thousands per day).
  • Errors are visible (you can tell when it gets something wrong).

Don't build an agent when:

  • The work is deterministic. Build automation, not an agent.
  • Volume is low. The build cost won't pay back.
  • The "rules" are unclear. Map them first.

For a longer decision tree see our agents vs automation post.

The bottom line

An AI agent is the right shape when you need to complete jobs, not just answer questions. The frontier of useful AI in 2026 is moving rapidly toward agentic systems — they're how AI starts replacing real work instead of just helping with it.

If you have a job you'd like an agent to do, our AI Agents service page covers how we build them, what they cost, and what a typical engagement looks like. Or drop us a note describing the workflow and we'll come back within one business day.
