Why the Trust Layer Exists

When a Salesforce AI feature calls a large language model, the prompt contains your data. Names, emails, case descriptions, pipeline notes — real content. If that data leaked to the LLM provider, got cached by them, or ended up in training data, you would have a compliance problem.

The Einstein Trust Layer is the set of controls Salesforce built to prevent that. It sits between your org and the model, and every prompt flows through it in both directions.

The Seven Things It Does

The Trust Layer is often described as “one feature,” but it’s really a stack of seven distinct controls. You need to know what each one does, because the defaults are sensible but the right configuration depends on your industry.

1. Secure Data Retrieval

Prompts can include grounding data — records, files, knowledge articles — pulled from your org. The Trust Layer enforces the running user’s field-level security and sharing rules when fetching that data. A user who cannot see a record in the UI cannot include its contents in a prompt, either.
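
Conceptually, retrieval is a per-field filter keyed to the running user. Here is a minimal Python sketch, with a hypothetical fls_readable() callback standing in for the real FLS and sharing checks:

```python
# Illustrative sketch only, not the Salesforce API. fls_readable() is a
# hypothetical stand-in for the field-level security + sharing check.
def fetch_grounding_fields(record: dict, user, fls_readable) -> dict:
    """Keep only the fields the running user is allowed to read."""
    return {
        field: value
        for field, value in record.items()
        if fls_readable(user, field)  # drop anything the user can't see
    }
```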

2. Dynamic Grounding

Grounding is the process of inserting your data into the prompt. The Trust Layer does this after fetching but before sending, using templates defined in Prompt Builder. This keeps prompts consistent and auditable.
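
A rough sketch of what template resolution looks like, assuming a simplified {!Object.Field} merge syntax loosely modeled on Prompt Builder (the real syntax and resolution engine differ):

```python
import re

# The template text and resolve() helper are illustrative assumptions,
# not the product's actual merge-field implementation.
TEMPLATE = (
    "Summarize this case for the account team.\n"
    "Subject: {!Case.Subject}\n"
    "Description: {!Case.Description}"
)

def resolve(template: str, record: dict) -> str:
    """Replace each {!Object.Field} token with the fetched value."""
    return re.sub(
        r"\{!([\w.]+)\}",
        lambda m: str(record.get(m.group(1), "")),
        template,
    )

prompt = resolve(TEMPLATE, {"Case.Subject": "Login failure",
                            "Case.Description": "User cannot sign in."})
```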

3. Data Masking

Before a prompt leaves your org, the Trust Layer detects and replaces sensitive values. Emails become <EMAIL_1>, phone numbers become <PHONE_1>, and so on. The model sees placeholders; the original values are reinserted into the response before it’s rendered to the user.

This is the control you will spend the most time configuring. Default masking covers common PII patterns, but every org has custom sensitive fields — internal IDs, account codes, confidential notes — that need custom rules.
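
Mechanically, masking is a reversible substitution keyed by a placeholder map. A minimal sketch, assuming regex-based rules and the <EMAIL_1>-style placeholders described above; the real rule engine is considerably more capable:

```python
import re

# Illustrative patterns; a real deployment covers far more PII shapes.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def mask(text: str):
    """Replace sensitive values with placeholders; return text + map."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text), start=1):
            placeholder = f"<{label}_{i}>"
            mapping[placeholder] = match
            text = text.replace(match, placeholder, 1)
    return text, mapping

def demask(response: str, mapping: dict) -> str:
    """Reinsert original values into the model's response."""
    for placeholder, original in mapping.items():
        response = response.replace(placeholder, original)
    return response
```

The same mapping that masks the prompt demasks the response, which is why the user never sees a placeholder.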

4. Prompt Defense

The Trust Layer tries to detect and block prompt injection — users attempting to override the system prompt with instructions like “ignore previous directions and …”. Detection isn’t perfect, but it filters obvious attempts.
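
To see why detection is necessarily heuristic, consider a toy version. This sketch assumes a hypothetical phrase blocklist; anything not on the list walks straight past it:

```python
# A toy filter built on a hypothetical phrase blocklist. Real prompt
# defense goes beyond string matching, which is exactly why it's fuzzy.
INJECTION_PHRASES = [
    "ignore previous directions",
    "ignore all prior instructions",
    "disregard the system prompt",
    "you are now",
]

def looks_like_injection(user_input: str) -> bool:
    """Flag inputs containing known override phrasing."""
    lowered = user_input.lower()
    return any(phrase in lowered for phrase in INJECTION_PHRASES)
```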

5. Zero Data Retention

Salesforce has contractual agreements with its LLM providers (OpenAI, Anthropic, etc.) that the data sent for inference is not retained, not logged beyond transient use, and not used for training. This is a contractual guarantee, not just a technical one.

6. Toxicity Detection

Responses pass through a classifier that flags unsafe content — violence, hate, sexual content, self-harm references. Flagged responses are blocked or rewritten before reaching the user.
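
Mechanically, it is a scores-versus-threshold decision. A sketch, assuming a classify() stand-in that returns per-category scores between 0 and 1:

```python
# classify() is a placeholder for the real classifier; the category
# names and the 0.7 cutoff are illustrative assumptions.
THRESHOLD = 0.7

def review(response: str, classify) -> str:
    """Block a response if any toxicity category crosses the threshold."""
    scores = classify(response)  # e.g. {"hate": 0.02, "violence": 0.81}
    if any(score >= THRESHOLD for score in scores.values()):
        return "I can't help with that request."  # generic block message
    return response
```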

7. Audit Trail

Every prompt, response, and Trust Layer decision is logged. You can inspect them in Setup → Einstein Audit Trail. Logs are retained according to your org’s data retention settings.

What You Need to Configure

The defaults are good for generic use. These are the settings you should actively tune.

Custom Masking Rules

Navigate to Setup → Einstein → Data Masking and review the default rules. Then add rules for your org’s sensitive patterns — employee IDs, case numbers if they embed customer data, internal product codes.

Test every rule with sample inputs. An overly aggressive rule masks useful context and makes the model’s responses useless; an overly loose rule leaks data.
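
A cheap way to keep yourself honest is a regression test over known-sensitive samples. This sketch reuses the mask() function from the masking sketch earlier and assumes you have added a hypothetical EMP_ID rule to its PATTERNS table; the sample values are fake:

```python
# Each tuple pairs a fake sample input with the rule that should fire.
SAMPLES = [
    ("Employee E-10234 filed the escalation", "EMP_ID"),
    ("Reach me at jo@example.com", "EMAIL"),
    ("Call back on +1 415 555 0100", "PHONE"),
]

def test_masking_rules():
    for text, label in SAMPLES:
        masked, _ = mask(text)
        assert f"<{label}_" in masked, f"rule {label} missed: {text!r}"
```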

Object/Field-Level Exclusions

Some fields should never flow into a prompt, even unmasked. Healthcare diagnoses, financial account numbers, legal case notes. Mark these fields as “exclude from prompts” in the field metadata, and they’ll be stripped before masking even runs.
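
The effect is a hard filter that runs before any other control. A sketch, using hypothetical custom field names to stand in for the excluded fields:

```python
# Hypothetical field names; the real exclusion comes from field metadata.
EXCLUDED_FIELDS = {"Diagnosis__c", "Bank_Account_Number__c", "Legal_Notes__c"}

def strip_excluded(record: dict) -> dict:
    """Drop never-in-a-prompt fields before masking even runs."""
    return {f: v for f, v in record.items() if f not in EXCLUDED_FIELDS}
```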

Toxicity Thresholds

The toxicity classifier has adjustable thresholds. A customer-service agent handling complaints will see more blunt language than an internal sales assistant. Tune thresholds by use case so legitimate phrases aren’t flagged.
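
Extending the toxicity sketch above, tuning amounts to swapping in a threshold table per use case. The names and numbers here are illustrative:

```python
# Illustrative per-use-case thresholds; higher values tolerate more.
THRESHOLDS = {
    "service_complaints": {"profanity": 0.9, "hate": 0.3},
    "internal_sales": {"profanity": 0.5, "hate": 0.3},
}

def is_blocked(scores: dict, use_case: str) -> bool:
    """Apply the per-use-case limits to the classifier's scores."""
    limits = THRESHOLDS[use_case]
    return any(scores.get(cat, 0.0) >= limit for cat, limit in limits.items())
```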

Audit Retention

The default audit retention is 30 days. Regulated industries often need longer — check with compliance and extend via Data Retention Policy.

What the Trust Layer Does Not Do

Be clear-eyed about the limits.

It does not make the model accurate. The Trust Layer filters for safety and privacy; it does not verify that a response is factually correct. Hallucinations pass through just fine.

It does not replace access control design. If a user can already see a sensitive field in the UI, the Trust Layer will happily include it in that user’s prompts. Fix the permission model first.

It does not catch every prompt injection. Determined attackers find ways through. Don’t rely on Trust Layer as your only defense against user-crafted prompts.

It does not cover third-party AI. If you’re calling a non-Salesforce AI service through a named credential, none of this applies. You’re on your own.

Compliance Questions Auditors Ask

When your legal or compliance team reviews an Agentforce deployment, expect these questions. Be ready.

  1. “Where does our data go?” — Trust Layer sends prompts to the hosted model through Salesforce’s infrastructure. Models are specified per region; EU orgs use EU-hosted endpoints.
  2. “Is data used for training?” — No. Zero retention is contractual.
  3. “Can we delete an audit log?” — Not individual logs, but you can adjust retention. Selective deletion requires a support case.
  4. “Can we bring our own model?” — Limited options via BYOLLM for specific regulated scenarios; not a general offering.
  5. “Is it FedRAMP-compliant?” — Agentforce has specific FedRAMP-authorized SKUs; standard SKUs are not FedRAMP-authorized.

Frequently Asked Questions

Can I disable the Trust Layer?

No — and you shouldn’t want to. Even if you could, every Salesforce AI feature assumes it’s running.

Does masking work on files and attachments?

Masking runs on text extracted from files used as grounding. Binary content — images, audio — has separate handling and a limited scope today.

What happens when a prompt is blocked?

The user sees a generic “I can’t help with that request” message. The raw block reason is logged in the audit trail for you to review.

How do I know if masking is leaking?

Review a sample of prompts in the audit log and grep for patterns that should have been masked. Add rules for anything that slipped through. Treat this as a recurring task, not a one-time setup.
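
A sketch of that recurring check; how you export prompts from the audit log is up to you, and the patterns are examples:

```python
import re

# Patterns that should never appear unmasked in an outbound prompt.
LEAK_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scan_for_leaks(prompts):
    """Yield (pattern name, snippet) for every unmasked hit."""
    for prompt in prompts:
        for name, pattern in LEAK_PATTERNS.items():
            if pattern.search(prompt):
                yield name, prompt[:80]
```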
