Tags: LGPD · AI · Sensitive Data · Compliance · ANPD · PII

LGPD 2.0 and AI agents: What changes for those exposing data to language models

With higher fines, joint liability, and ANPD's regulatory sandbox, LGPD 2.0 demands new practices for AI-powered data access. Understand what's changing.

Diogo Felizardo · Founder, Surf Data
February 25, 2026 · 4 min read

Brazil's LGPD is getting a significant update — and anyone using AI agents with personal data needs to pay attention.

Bill PL 2338/2023, which regulates the use of artificial intelligence in Brazil, is advancing through Congress and brings direct changes for companies exposing internal data to language models. Higher fines, joint liability between vendors and operators, and an ANPD regulatory sandbox are among the key changes.

For data teams connecting AI agents to their databases via MCP (Model Context Protocol), the regulatory landscape just got more complex — and the consequences of non-compliance, more severe.

What's changing in practice

Expanded fines

The original LGPD already provided for fines of up to 2% of gross revenue (capped at R$50 million per violation, ~$9M USD). The AI regulation expands this scope, creating specific penalties for improper use of personal data by artificial intelligence systems.

According to Sys4B analysis, the trend is for AI-related infractions to receive harsher treatment, especially when automated decisions affect data subjects' rights.

Joint liability

One of the most impactful changes: liability becomes joint between those who develop, provide, and operate AI systems. In practice, if an AI agent accesses personal data without adequate protection, both the company using the agent and the infrastructure provider can be held liable.

For companies using MCP solutions, this means the choice of infrastructure provider is a compliance decision, not just a technical one.

ANPD regulatory sandbox

The ANPD is implementing a regulatory sandbox to allow companies to test AI innovations in a controlled environment. It's a positive signal, but also indicates that the authority is preparing to enforce more strictly outside the sandbox.

Algorithmic transparency

AI systems processing personal data will need to demonstrate transparency about how data is used. This includes: which data feeds the model, how it's processed, and what decisions are made from it.

For AI agents accessing databases via MCP, the transparency chain needs to be complete: from the data in the database to the agent's response.

The gaps in generative AI tools

Recent research reveals that major generative AI tools still have significant data protection gaps:

  • ChatGPT, Claude, and Gemini — according to TecFlow, none of the leading tools fully meet LGPD requirements for personal data processing
  • Lack of retention control — data sent to LLMs may be used for training without explicit consent
  • No native masking — personal data is sent in plain text to the models

Inforchannel highlights that the main risk lies in the "gray zone" between what LGPD requires and what AI tools actually implement.

What this means for data teams

If your team exposes internal data to AI agents — via MCP, APIs, or any other means — LGPD 2.0 requires:

1. PII masking before the agent

Personal data (national IDs, email, phone, full name, address) must be masked before reaching the language model. You can't rely on the LLM to "not use it" — protection needs to happen at the data layer.
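Masking at the data layer can be as simple as a set of pattern rules applied to any text before it leaves your infrastructure. The sketch below is illustrative, not a product implementation: the regexes cover common Brazilian formats (CPF, email, phone) but are deliberately minimal and would need hardening for production.

```python
import re

# Illustrative masking rules for common Brazilian PII patterns.
# These regexes are a sketch, not an exhaustive detector.
PII_PATTERNS = {
    "cpf": re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "phone": re.compile(r"\(\d{2}\)\s?\d{4,5}-\d{4}"),
}

def mask_pii(text: str) -> str:
    """Replace PII matches with labeled placeholders before the text
    is handed to the language model."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}_MASKED]", text)
    return text
```

The key design point: the LLM never sees the raw values, so no prompt, retention policy, or training pipeline downstream can leak them.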

2. Complete audit logs

Every access to personal data via an agent needs to be recorded: who requested it, which query was executed, what data was returned, when it happened. Without audit logs, demonstrating compliance during an audit is impossible.
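A minimal audit record captures exactly those four facts. The sketch below uses an append-only JSON Lines file for simplicity; the field names are assumptions for illustration, and a real deployment would write to tamper-evident storage.

```python
import json
import time

def log_agent_access(log_path: str, agent: str, query: str, row_count: int) -> None:
    """Append one JSONL audit record per agent data access:
    who requested it, which query ran, how much data came back, and when."""
    record = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "agent": agent,
        "query": query,
        "rows_returned": row_count,
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```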

3. Granular access control

Not every agent needs to see all data. LGPD requires minimization — expose only what's necessary for each use case. A sales agent doesn't need to see complete financial records.
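Minimization can be enforced with a per-agent column allowlist applied to every row before it reaches the model. The agent names and fields below are hypothetical, just to show the shape of the rule:

```python
# Hypothetical per-agent column allowlists enforcing data minimization:
# each agent sees only the fields its use case requires.
AGENT_SCOPES = {
    "sales_agent": {"customer_id", "company_name", "deal_stage"},
    "finance_agent": {"customer_id", "invoice_total", "payment_status"},
}

def filter_row(agent: str, row: dict) -> dict:
    """Drop any column the agent is not explicitly allowed to see.
    Unknown agents get an empty allowlist, i.e. no data at all."""
    allowed = AGENT_SCOPES.get(agent, set())
    return {k: v for k, v in row.items() if k in allowed}
```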

4. Documented legal basis

For each type of data exposed to the agent, there must be a clear legal basis (consent, legitimate interest, contract execution, etc.). This needs to be documented and auditable.

5. Data subject rights

Data subjects can request access, correction, deletion, and portability. Your system needs to be able to track which personal data was accessed by agents and respond to these requests.
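If the audit records tag which data subjects each access touched, answering an access request becomes a log scan. This is a sketch under that assumption; the `subjects` field name is hypothetical:

```python
import json

def accesses_for_subject(log_lines: list, subject_id: str) -> list:
    """Return every audit record whose returned data referenced the
    given data subject, to answer an LGPD access request."""
    hits = []
    for line in log_lines:
        record = json.loads(line)
        if subject_id in record.get("subjects", []):
            hits.append(record)
    return hits
```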

How Surf Data addresses these requirements

At Surf Data, LGPD compliance is an architectural pillar, not an add-on feature:

  • Native PII masking — automatic detection of Brazilian patterns (CPF, phone, email, name) with masking applied before data reaches the agent
  • Immutable audit logs — every tool invocation is recorded with timestamp, agent, executed query, and result
  • Granular per-tool control — each SQL query exposed is a separate tool with its own masking rules
  • Destructive SQL blocking — DROP, DELETE, INSERT, UPDATE, and ALTER are blocked at the protocol layer
  • SHA-256 hashed tokens — access credentials stored as hashes, not plain text
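To make the destructive-SQL idea concrete, here is a deliberately naive sketch of the principle (not Surf Data's implementation): reject a statement whose leading keyword is on a blocklist. A real guard must also handle multi-statement payloads, CTEs, and comments, which a first-keyword check alone does not.

```python
# Illustrative guard rejecting destructive statements before execution.
# A first-keyword check is only the principle; production enforcement
# belongs at the protocol layer with full statement parsing.
BLOCKED = {"DROP", "DELETE", "INSERT", "UPDATE", "ALTER"}

def assert_read_only(sql: str) -> None:
    """Raise if the statement's first keyword is destructive."""
    first_word = sql.lstrip().split(None, 1)[0].upper()
    if first_word in BLOCKED:
        raise PermissionError(f"Destructive statement blocked: {first_word}")
```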

Conclusion

LGPD 2.0 isn't a future event — it's happening now. For data teams working with AI agents, the window to comply is closing.

The good news: you don't need to build all this compliance infrastructure internally. Managed solutions built with data protection in their DNA let the team focus on what matters — defining which data to expose and how — while the platform handles protection.

The cost of non-compliance is clear: fines, joint liability, and reputational damage. The cost of compliance with the right tools is much lower than it seems.
