
Salesforce just published something unusual. Not a product announcement or a feature launch — a candid account of what it actually takes to build AI agents that work in production. Written by Joe Inzerillo (President, Enterprise and AI Technology) and Michael Andrew (Chief Data Officer), the piece is rooted in Salesforce’s experience as Customer Zero — deploying Agentforce internally before shipping it to customers.
The headline number: a single Engagement Agent generated more than $120 million in annualized pipeline within its first few months of operation. But the number isn’t the story. How they got there is.
Their first version of that agent could only outperform the bottom 10% of human SDRs. Through disciplined measurement, iteration, and what they describe as coaching the agent the way you’d coach an employee, they improved it until it beat 90% of them. Then they built those learnings directly into the product.
This is the most useful piece Salesforce has published this year. And it introduces a framework — the Agent Development Lifecycle (ADLC) — that every enterprise deploying AI agents should understand.
Here’s what it means, especially for organizations that don’t run Salesforce alone.
The Six Lessons and Why They Matter
The Applied AI paper distills six principles from Salesforce’s internal agent deployments. Each one deserves attention.
1. AI agents aren’t software. Software is deterministic: same input, same output. Agents reason, interpret, and vary. Salesforce’s conclusion: managing agents is closer to managing employees than managing code. You guide them, coach them, measure them against baselines, and gradually increase their autonomy. Teams that treat agents as software deployments consistently underperform teams that treat them as a managed workforce.
2. Think in “jobs to be done,” not “jobs.” The myth of the universal agent — one agent replacing an entire human role — is exactly that: a myth. Salesforce broke the SDR role into discrete tasks (follow-up, initial qualification, product questions, meeting scheduling) and built an agent that mastered each task individually. That’s what produced the $120M result: not trying to replace SDRs, but doing specific SDR tasks exceptionally well.
3. Measure competency like you measure employees. Entry-level agents work from tight scripts. As they prove competency, they earn more latitude. Salesforce found they often had more granular data on agent performance than on human performance — which became a mirror that improved how they measured their own teams.
4. The abundant enterprise. When the cost of intelligence drops by an order of magnitude, initiatives that were previously shelved become viable. Salesforce now has 100% coverage of expert-level Account POVs — a task that previously took 4–5 hours per account and couldn’t scale. The constraint was never strategy. It was cost structure. Agents removed it.
5. Trust through observability. Because agents are probabilistic, they drift. Hallucinations emerge. Quality degrades invisibly. Observability isn’t optional — it’s the foundation that makes everything else possible.
6. The Agent Development Lifecycle. Autonomy is granted incrementally. An agent starts with tight human oversight, demonstrates competency against human baselines, and earns independence through continuous measurement and calibration. This is the ADLC — and it’s the bridge between probabilistic AI and deterministic enterprise expectations.
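The graduated-autonomy idea at the heart of the ADLC can be sketched in a few lines of code. This is a rough illustration only; the tier names, thresholds, and evaluation window are hypothetical, not Salesforce's implementation:

```python
# Hypothetical ADLC-style autonomy gate: an agent earns a higher tier
# only after beating the human baseline over enough evaluations.
# Tier names and thresholds are illustrative assumptions.

MIN_EVALUATIONS = 100  # don't grant latitude on thin evidence

def autonomy_tier(percentile_vs_humans: float, evaluations: int) -> str:
    """Map measured competency (percentile vs. human baseline) to a tier."""
    if evaluations < MIN_EVALUATIONS:   # not enough evidence yet
        return "scripted"
    if percentile_vs_humans >= 90:      # beats 90% of humans
        return "autonomous"
    if percentile_vs_humans >= 50:      # above the human median
        return "supervised"
    return "scripted"                   # below median: tight scripts
```

The point of the sketch is the shape of the policy, not the numbers: autonomy is a function of measured competency and sample size, re-evaluated continuously, and it can move down as well as up when quality drifts.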
The Eight Design Principles — Architecture That Enables the ADLC
A week before the Applied AI paper, Salesforce published eight design principles for the Agentic Enterprise, written by Shibani Ahuja (SVP, Enterprise IT Strategy). These principles describe the architecture that makes the ADLC work at scale:
- Modularity — agents built from reusable components, not rebuilt from scratch for every use case.
- Metadata-driven understanding — agents need more than unified data; they need the metadata to understand what it means across systems.
- Unified observability — real-time visibility into what agents did, why they chose that path, and what business outcome resulted.
- Built-in trust — governance baked into the architecture from day one, not added later.
- Strategic human oversight — not every decision reviewed, not every decision autonomous; the right level of intervention at each point.
- Event-driven processing — agents that respond to real-time triggers across any channel.
- Scalable infrastructure — AI workloads are unpredictable; architecture must absorb spikes.
- Open ecosystems and standards — no vendor can deliver everything; interoperability through MCP, A2A, and open protocols.
These eight principles are excellent. They describe what a well-architected agentic enterprise looks like from inside a single platform.
The Ninth Principle: Boundary Governance
Here’s what I keep coming back to. Salesforce’s ADLC and eight design principles assume a coherent environment — one platform, one reasoning engine, one observability layer, one trust model.
But most enterprises don’t operate that way.
When an Agentforce agent qualifies a lead and triggers a Workato Genie to orchestrate the routing workflow, which calls a MuleSoft-governed API to pull ERP data, which feeds context to Claude for account summarization — who owns the ADLC for that chain?
Salesforce can tell you the Agentforce agent is performing at 90th percentile. Workato can tell you the Genie executed successfully. MuleSoft can tell you the API call was governed. Anthropic can tell you Claude’s response was grounded. But nobody is measuring the competency of the chain as a whole. Nobody is observing the full journey across all four platforms in a single view. Nobody is granting or revoking autonomy for the cross-platform workflow.
That’s the ninth design principle: design for boundary governance. The architecture must account for what happens when agents cross platform boundaries — where authority transfers, where data definitions diverge, where observability ends at one platform’s edge and another’s begins, and where the failure cascade crosses systems that weren’t built together.
This isn’t a criticism of Salesforce’s framework. It’s an extension of it. The eight principles are correct and necessary. The ninth acknowledges the reality that most enterprises live in.
AI Foundry and the System-Level Shift
This is also the context for understanding AI Foundry, which Salesforce AI Research announced on March 26. The initiative signals a fundamental shift from model-level to system-level AI, focused on three areas:
Simulation environments (eVerse) — virtual training grounds where agents face thousands of edge cases before reaching production. This directly supports the ADLC: you can measure agent competency against realistic business scenarios at scale.
Ambient intelligence — agents that disappear into enterprise workflows, always on but never overwhelming. Context-aware, proactive, surfacing insights just in time.
Agent-to-agent ecosystems — agents interacting across organizational boundaries with standardized protocols, guardrails, decision logging, and coordinated escalation. Salesforce is developing an enterprise multi-agent semantic layer and even defining legal frameworks for autonomous agent negotiation.
Silvio Savarese, Salesforce’s Chief Scientist, put it directly: “The problems that matter most for businesses don’t live at the model level anymore. They live at the system level, where components work together to deliver accuracy, consistency, and reliability at scale.”
For multi-platform enterprises, this is the right direction. Agent-to-agent ecosystems that work across organizational boundaries are exactly what’s needed. The question is how fast these capabilities extend beyond Salesforce’s own ecosystem to govern agents built on other platforms.
Salesforce Validates the FDE Model
One more connection worth noting. On March 18, Salesforce published a piece on Forward Deployed Engineers that reinforces something we wrote about earlier this month. Salesforce tripled its FDE team in just six months. FDE job postings saw an 800% spike between January and September 2025.
Sarah Khalid, an FDE Director at Salesforce, described the role precisely: “Implementation teams build solutions. The FDEs make sure those solutions drive value.” Ruth Hickin, VP of Agentic Workforce Strategy, framed it as “kind of like hacker meets customer value meets professional services person.”
This matters because the ADLC — treating agents like employees, coaching them, measuring competency, gradually increasing autonomy — requires someone on-site who understands the specific business context. That’s the FDE. And when the agent chain crosses platforms, the FDE is the person who sees the full picture that no single platform’s observability layer can show.
What This Means for Your Deployment
If you’re deploying AI agents in a multi-platform environment, three practical takeaways:
Adopt the ADLC for every agent, regardless of platform. The principle of graduated autonomy — start constrained, measure against human baselines, earn independence — applies whether the agent runs on Agentforce, Claude, Workato, or something custom. Define competency metrics for each agent before deployment, not after.
Measure the chain, not just the agents. Individual agent performance is necessary but insufficient. Build observability that traces the full workflow across platform boundaries. If you can’t see the complete journey in one view, you can’t manage it.
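A minimal sketch of what chain-level observability could look like, using only the Python standard library. The platform and action names mirror the Agentforce-to-Workato-to-MuleSoft-to-Claude example above and are purely illustrative, not any vendor's API:

```python
import time
import uuid

def new_chain_trace() -> dict:
    """Start one trace record for the entire cross-platform workflow."""
    return {"chain_id": str(uuid.uuid4()), "hops": []}

def record_hop(trace: dict, platform: str, action: str,
               ok: bool, detail: str = "") -> None:
    """Append each platform hop so the full journey lives in one record."""
    trace["hops"].append({
        "platform": platform,
        "action": action,
        "ok": ok,
        "detail": detail,
        "ts": time.time(),
    })

def chain_succeeded(trace: dict) -> bool:
    """The chain passes only if every hop passed. Per-platform success
    alone says nothing about the end-to-end outcome."""
    return all(hop["ok"] for hop in trace["hops"])

# Illustrative usage, following the article's example chain:
trace = new_chain_trace()
record_hop(trace, "agentforce", "qualify_lead", True)
record_hop(trace, "workato", "route_workflow", True)
record_hop(trace, "mulesoft", "fetch_erp_data", False, "timeout")
record_hop(trace, "claude", "summarize_account", True)
# chain_succeeded(trace) is now False, even though 3 of 4 hops "worked"
```

The design choice that matters is the single `chain_id` carried across every hop: each platform's own telemetry stays useful, but the end-to-end verdict lives in one record that no individual platform owns.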
Design your authority model before you need it. The Decision Authority Matrix — which agents can act autonomously, which need human flags, which are blocked — becomes exponentially more important when agents cross platforms. An agent that’s “Auto” in Salesforce might trigger actions in your ERP that should be “Blocked.” Define this at the boundary, not inside each platform.
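One way to make the boundary-first version of that concrete, as a hedged sketch rather than any platform's actual feature: represent the matrix as explicit (platform, action) grants and default everything else to blocked. All names below are hypothetical.

```python
# Hypothetical Decision Authority Matrix: (platform, action) -> authority.
# The key design choice is default-deny: an action not explicitly granted
# is "blocked", so autonomy never leaks across a platform edge by accident.

AUTHORITY = {
    ("salesforce", "update_lead_status"):   "auto",
    ("salesforce", "send_followup_email"):  "auto",
    ("erp",        "create_purchase_order"): "human_flag",
    ("erp",        "issue_refund"):          "blocked",
}

def authority_for(platform: str, action: str) -> str:
    """Look up authority; anything unlisted is blocked by default."""
    return AUTHORITY.get((platform, action), "blocked")
```

With this shape, an agent that is "auto" inside Salesforce gets no implicit authority in the ERP: the matrix is consulted at the boundary crossing, which is exactly where the article argues it must live.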
Looking Forward to TDX
TDX 2026 starts April 15. With the Applied AI paper, the eight design principles, and AI Foundry all published in the same two-week window, Salesforce is clearly setting the stage for deeper announcements about agent competency, system-level AI, and cross-boundary governance.
I’ll be watching for specifics on how the ADLC translates into product capabilities, how eVerse simulation environments work in practice, and whether the agent-to-agent ecosystem story extends concretely to non-Salesforce agents.
More to come. If you’re navigating multi-platform agent deployment and the ADLC framework resonates, I’d welcome the conversation.
This article is part of Incepta’s Release Intelligence series — tracking what matters across Salesforce, MuleSoft, Workato, Anthropic, and Shopify for multi-platform enterprises. For deeper context on the multi-platform challenge, read Agentforce 360 and the Multi-Platform Reality and The Space Between Platforms on LinkedIn.
Sources & References
- Salesforce — Applied AI: Lessons from Building Agents in the Enterprise
- Salesforce — 8 Design Principles for the Agentic Enterprise
- Salesforce — AI Foundry Announcement
- Salesforce — How Forward Deployed Engineers Are Proving AI Makes Tech Jobs More Human
- Salesforce — Agentforce Customer Zero
- CIO.com — Salesforce AI Research Identifies Trends Shaping Agentic AI
- Diginomica — Salesforce Pitches the Need for Enterprise Transition from Model to System Level AI
- Salesforce — Agentforce EU Cloud Code of Conduct Compliance
- TDX 2026 — Official Event Page

Parth leads Incepta's Center of Excellence across Salesforce, MuleSoft, Workato, Shopify, and enterprise AI — helping organizations build the governed integration architectures that power production-grade agentic systems. With deep expertise spanning CRM strategy, enterprise commerce, data architecture, and multi-platform integration, Parth works directly with technology leaders navigating the convergence of AI agents, cloud platforms, and digital transformation.