The Salesforce + AI Stack That Actually Works in 2026

There are a lot of ways to combine AI with Salesforce right now. Some of them are mature and battle-tested. Some of them are half-built demos. Most of the confusion in the market comes from not knowing which is which.

Here’s the clear-headed version, based on what we’ve shipped in production.

The short version

For AI inside Salesforce (inside the platform, using Salesforce-provided tools): Agentforce and Einstein Copilot are good and getting better, but they’re best for chat interfaces and sales-rep-facing assistants. They are not great for deep automation or custom integrations.

For AI around Salesforce (external automations that read from and write to Salesforce): Claude Code + MCP servers is the dominant stack. It’s flexible, programmable, and handles the long-tail cases Agentforce can’t touch.

Most real projects use both.

Tier 1: the tools that actually work

Claude Code (Anthropic)

The base layer. A CLI tool that reads your codebase, edits files, runs commands, and talks to MCP servers. It’s how we build. Everything downstream is a pattern applied through Claude Code.

Salesforce MCP Server

The Model Context Protocol server that lets Claude (or any MCP-compatible client) talk to a Salesforce org — query SOQL, describe objects, read/write records, execute anonymous Apex, run tools. This is the connective tissue. If you haven’t set it up, start there.

Playwright + Salesforce sandbox

For end-to-end QA automation and UI-level testing. Playwright drives a real browser against a sandbox org, logs in, clicks through flows, and validates behavior the way a real user would. Combined with Claude-generated test cases, this is how we ship QA Agent Team for clients.

Supabase (or any Postgres + pgvector)

For vector storage when you need RAG over Salesforce Knowledge articles, internal docs, or historical Case data. Supabase is the easiest to set up. Pinecone works too but is heavier than you need for most Salesforce use cases.

Resend (or SendGrid)

For transactional email triggered by Salesforce events. Lightweight, reliable, free tier handles most small-team volumes. We use Resend for the contact form on this site (see: Pages Function + Resend setup).

Tier 2: worth watching

Agentforce

Salesforce’s native AI agent platform. It’s genuinely useful for sales-rep-facing chat experiences, internal Q&A over Knowledge, and structured conversations with guardrails. Limitations: it’s slower to iterate on than Claude Code, the customization story is improving but still clunky, and it works best inside the Salesforce UI rather than as a headless API layer.

Use it when: you need a chatbot that lives inside Salesforce, has access to your data, and doesn’t require heavy customization.

Einstein Copilot + Prompt Builder

Useful for generating records with AI assistance, like drafting emails or summarizing accounts. The Prompt Builder UI has gotten significantly better in 2026. Still best for admin-friendly use cases rather than engineering-heavy ones.

Data Cloud + Einstein Trust Layer

Necessary if you’re processing regulated data (health, finance, legal) and need to keep AI calls inside Salesforce’s compliance boundary. For most small and mid-market use cases, regular API calls to Anthropic or OpenAI from a Cloudflare Worker are fine.

Tier 3: hype, avoid, or wait

Generic “build a Salesforce agent with our low-code tool” products

Every month there’s a new startup promising a drag-and-drop AI agent builder for Salesforce. They demo beautifully. In production, they fall over the moment your use case gets specific. Don’t build critical automation on these. Use them for prototypes only.

”RAG over your whole org” products

Sounds great. In practice, the signal-to-noise ratio on Salesforce metadata is terrible, and most of these products produce confident wrong answers. A focused RAG over your Knowledge Base? Yes. A universal RAG over everything in your org? Wait another year.

Fine-tuned models for Salesforce-specific tasks

Fine-tuning a model for Salesforce patterns is usually the wrong tool. Modern frontier models (Claude Opus, GPT-5, etc.) already know Salesforce well enough. Prompt engineering and retrieval beat fine-tuning for 95% of use cases, and they’re an order of magnitude cheaper.

The pattern that ships 80% of real projects

Here’s the stack we reach for on most client projects:

Claude Code as the build environment
Salesforce MCP for org-level read/write
Cloudflare Workers or Pages Functions as the runtime for scheduled jobs and webhooks
Supabase for storage and vector search
Resend for outbound email
Playwright for UI-level test coverage
Regular Apex, Flow, and LWC for anything that has to live inside the org

We use Agentforce and Einstein where they make sense (mostly rep-facing chat and Einstein Next Best Action recommendations). We skip Data Cloud unless compliance requires it. We never build on low-code agent platforms for anything we expect to run for more than 3 months.

What this means for you

If you’re an admin or architect picking an AI strategy for your Salesforce org right now:

Don’t wait for the “perfect” platform. The tools available today are good enough to build real automation.
Start with one high-leverage use case. Lead triage, document intake, or QA automation. Not a sprawling “AI transformation.”
Keep humans in the loop. Every AI automation should have a review step for at least the first 3 months.
Write it as code, not as clicks. Low-code AI builders are great for prototypes and terrible for production.

If you want us to build one of these for your org, get in touch. Every engagement starts with a 30-minute strategy call where we map your highest-leverage automation opportunities.