Before Your Team Builds an AI Agent, Read This.
Before you give an AI agent access to your team's data, answer these 5 questions. A practical checklist for evaluating AI agents before you build or buy.
Published April 21, 2026
Every productivity tool is shipping agents. Notion, Microsoft, Google, Salesforce, your CRM, your help desk, probably your email client by next quarter. They all want you to turn them on.
Most teams will turn them on. A few will get results. The difference is not the tool. It is whether anyone asked the right questions before building.
This is that checklist.
1. What Counts as an "AI Agent" (and What Doesn't)
The word "agent" has been stretched to mean anything AI-adjacent. Cut through it with one rule:
If it runs without a human prompting it, it is an agent.
That's the whole test. A chatbot waits for you to ask. Traditional automation follows fixed rules. An agent decides what to do next on its own, uses AI models to reason about the task, and acts on a schedule or trigger.
- Chatbot: You ask a question. It answers.
- Automation (Zapier, if-this-then-that): Rule fires. Same result every time.
- Agent: Trigger fires. The agent decides what to do. Results vary based on context.
If you're evaluating an "AI agent" that only responds when you type into it, you're evaluating a chatbot. That's fine. Just know what you're buying. For the broader picture, see WTF is an AI Agent?
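To make the distinction concrete, here is a minimal sketch in Python. Everything in it is illustrative: `llm` stands in for whatever model call your platform makes, and the action names are made up.

```python
# A minimal sketch of the three modes, assuming a hypothetical llm()
# callable that stands in for whatever model your platform calls.

def chatbot(question, llm):
    # Waits for a human. Does nothing until asked.
    return llm(f"Answer this question: {question}")

def automation(ticket):
    # Fixed rule: same input, same routing, every time.
    if "refund" in ticket["subject"].lower():
        return "route_to_billing"
    return "route_to_general_queue"

def agent(ticket, llm):
    # Fires on a trigger, then the model decides the next step.
    # Results vary with context; that's the point, and the risk.
    return llm(
        "Choose one action for this ticket: escalate, route_to_billing, "
        f"reply_with_kb_article, or ask_for_details.\n{ticket['subject']}"
    )
```

The first two are predictable. The third is useful precisely because it is not, which is why the rest of this checklist exists.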
2. The 5 Questions Before You Build
Ask these before anyone touches a builder or signs a contract. If you cannot answer all five, do not build yet.
1. What manual workflow does this replace? Name it. Time it. "Agent that helps with customer support" is not a workflow. "Agent that triages inbound support tickets by priority label, a task that eats 4 hours of a support lead's time every week" is a workflow. If you cannot name the task and estimate the hours per week it takes today, you're automating a feeling, not a process.
2. Who owns the agent when it breaks? Agents fail silently. They post wrong answers, skip steps, loop on edge cases, or quietly stop running. Assign one human to review output weekly. If nobody volunteers, the agent is not important enough to build.
3. What data can the agent access? Run a permissions audit before granting access. An agent that can read your entire workspace can also surface things it shouldn't: salary data in a shared doc, private customer notes, HR threads. Start with read-only on a single database or channel. Expand only when you've seen it work.
4. What does it cost after the trial or free tier? Every agent has a meter running somewhere. Notion Credits start May 4. Copilot is $30 per user per month. Custom builds burn API tokens per run. Model the real cost at scale: if this agent runs 200 times a week for 10 people, what's the monthly bill? (The back-of-envelope math is sketched after this list.) The answer often changes the business case.
5. How do you measure whether it worked? Pick one before-and-after metric. Hours saved. Tickets resolved. Errors reduced. Response time. Whatever it is, write down the baseline before the agent goes live. Without a baseline, you will never know if it worked, and you will keep paying for it anyway.
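The arithmetic from question 4 is worth doing explicitly before you sign anything. A back-of-envelope sketch, assuming a per-run cost; the $0.04 figure is a placeholder, not any platform's real rate:

```python
# Back-of-envelope cost model for question 4.
# All numbers are placeholders; swap in your platform's actual rates.

runs_per_week_per_person = 200
people = 10
cost_per_run = 0.04        # assumed: dollars in tokens/credits per run
weeks_per_month = 4.33

monthly_runs = runs_per_week_per_person * people * weeks_per_month
monthly_bill = monthly_runs * cost_per_run

print(f"{monthly_runs:,.0f} runs/month -> ${monthly_bill:,.2f}/month")
# 8,660 runs/month -> $346.40/month
```

Even at four cents a run, a modest team bill lands in the hundreds of dollars a month. Run the same math against the hours saved from question 1 and the business case answers itself.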
3. The Governance Checklist
Four questions your ops or IT team should have answered before the first agent ships:
- Who can create agents in your workspace? If every seat can spin one up, you have shadow agents inside a month. Decide now: admin-only, a named builder role, or free-for-all with audit logging.
- What is the review process for agent output? Human-in-the-loop for the first 30 days is not overkill. It is the minimum.
- How do you shut one down if it goes wrong? Every admin should know how to disable an agent in under 60 seconds. If you can't, find out before you need to.
- Does your AI policy cover autonomous agents or just chatbots? Most corporate AI policies were written for ChatGPT. Agents are a different risk profile: they act, they persist, they can access data over time. Update the policy before you build, not after.
If your company doesn't have an AI policy yet, start with what a company AI policy actually covers.
4. Red Flags in Any Agent Pitch
When a vendor or builder pitches you on an agent, watch for these four signs:
- "Set it and forget it." No. Agents fail silently. Anything worth running is worth reviewing weekly. Vendors who skip this step are setting you up to discover problems late.
- No usage dashboard or audit trail. You need to see what the agent did, when, and with what data. If the platform can't show you, you cannot govern it. (A sketch of the minimum per-run record follows this list.)
- Unlimited workspace access by default. Good agent platforms default to least-privilege and make you grant access explicitly. Bad ones open the whole workspace on day one.
- Pricing that hides per-run costs. "Free during the beta" or "$X per seat" often masks a meter that kicks in later. Ask the billing question directly: what does a heavily used agent cost per month?
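As a concrete bar for the audit-trail question, here is roughly the minimum record a platform should be able to produce for every run. This is a sketch, not any vendor's actual schema; the field names are illustrative:

```python
# A minimal shape for the audit trail you should demand from any
# agent platform. Field names are illustrative, not a real schema.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentAuditRecord:
    agent_id: str            # which agent acted
    trigger: str             # what fired it (schedule, webhook, event)
    action: str              # what it did
    data_touched: list[str]  # resources it read or wrote
    output_summary: str      # what it produced, for the weekly review
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

# If the vendor cannot produce something equivalent to this for
# every run, you cannot govern the agent.
```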
For the broader vendor conversation, see how to evaluate AI vendors.
5. The 14-Day Pilot Template
One agent. One workflow. One metric. Fourteen days. That's the pilot.
- Pick one workflow. Must be repetitive, well-defined, and low-stakes. Triaging support tickets, weekly status recaps, CRM digests, meeting prep. Not customer-facing outbound.
- Pick one metric. Hours per week spent on the task, or errors per 100 outputs, or time-to-response. One number you can measure before and after.
- Record the baseline. Measure the metric for a week before the agent runs. Write it down.
- Run the agent for 14 days. The owner reviews the output weekly. Log failures.
- Decide. Three options: scale (roll out to the team), iterate (fix the failures, run another 14 days), or kill (shut it down and free the budget).
The discipline: no "it seems to be working" scale decisions. If the number didn't move, the agent didn't work, no matter how cool the demo was.
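If it helps to make the decision mechanical, here is one way to encode it, assuming a lower-is-better metric like hours per week on the task. The 20% improvement bar and the failure cap are assumptions; set your own before day one, not after the demo:

```python
# The decide step, made mechanical. Thresholds are assumptions;
# pick yours before the pilot starts.

def pilot_decision(baseline, after, failures, max_failures=3):
    """Compare the one metric before and after the 14-day pilot.

    Assumes lower is better (e.g., hours per week on the task).
    """
    if failures > max_failures:
        return "iterate"  # fix the failure modes, run another 14 days
    if after < baseline * 0.8:  # assumed bar: metric improved >= 20%
        return "scale"    # roll out to the team
    if after < baseline:
        return "iterate"  # moved, but not enough to justify rollout
    return "kill"         # the number didn't move; free the budget

print(pilot_decision(baseline=4.0, after=2.5, failures=1))  # scale
```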
The Bottom Line
AI agents are useful. Most deployments are not. The difference is rarely the model and almost never the tool. It's whether anyone defined the workflow, owned the output, measured the result, and was willing to kill it if the numbers didn't move.
If your team is about to build an agent, run through this checklist first. If you can't answer the five questions, you're not ready to build. That's a good answer to land on. A 15-minute conversation now is cheaper than a 3-month deployment that nobody uses.
Subscribe to The AI Minute.
Weekly tools, prompts, and the 10% of AI news that actually matters. Under 60 seconds every Tuesday.