MemPrivacy: Privacy-Preserving Personalized Memory Management for Edge-Cloud Agents

Summary (Overview)

  • Proposes MemPrivacy, a framework that reconciles privacy and utility for edge-cloud agents by replacing sensitive spans with semantically structured, type-aware placeholders on the edge device, processing them in the cloud, and restoring the original values locally.
  • Introduces a Four-Level Privacy Taxonomy (PL1-PL4) for fine-grained, configurable protection, categorizing information by identifiability, expected harm, and exploitability.
  • Constructs MemPrivacy-Bench, a comprehensive benchmark dataset with 200 user profiles, over 52k privacy instances, and multi-turn dialogues for systematic evaluation.
  • Demonstrates strong performance: The trained MemPrivacy models (0.6B-4B parameters) substantially outperform general-purpose LLMs (e.g., GPT-5.2, Gemini-3.1-Pro) in privacy extraction and reduce inference latency, while limiting utility loss in memory systems to within 1.6%.

Introduction and Theoretical Foundation

The rapid deployment of LLM-powered agents in edge-cloud architectures creates a critical tension. While personalized memory (stored and managed in the cloud) is essential for long-term adaptation and user-centric interaction, it also exposes users' sensitive Personally Identifiable Information (PII) to persistent cloud-side storage and processing, creating a broad privacy attack surface. Existing countermeasures, such as full masking or redaction, protect privacy but destroy critical semantic cues, degrading memory utility and personalization quality. More principled techniques, such as differential privacy, are often difficult to integrate into interactive pipelines.

The paper formulates this as a constrained optimization problem. Let $X$ denote the user's raw input containing a set of privacy information $S = \{s_1, s_2, \ldots, s_k\}$. The cloud agent $C$ with memory store $M$ produces an ideal response $Y_{ideal} = C(X, M)$. To protect privacy, a local sanitization function $F_{san}$ transforms $X$ into a safe sequence $X_{safe} = F_{san}(X)$ for cloud processing, yielding an intermediate response $Y_{safe} = C(X_{safe}, M_{safe})$. A local restoration function $F_{res}$ then produces the final user-visible response $\hat{Y} = F_{res}(Y_{safe})$.

The core objectives are to minimize the Privacy Leakage Risk $R_{priv}$ and the Utility Loss $L_{util}$, defined as:

$$R_{priv}(F_{san}) = \Pr\big(\exists s \in S : s \in A(X_{safe}, Y_{safe}, M_{safe})\big)$$

where $A$ denotes an arbitrary privacy attack, and

$$L_{util}(F_{san}, F_{res}) = U(Y_{ideal}) - U(\hat{Y})$$

where $U$ is an overall utility function.

The goal of MemPrivacy is to find optimal functions $(F^*_{san}, F^*_{res})$ such that:

$$(F^*_{san}, F^*_{res}) = \arg\min_{F_{san}, F_{res}} R_{priv}(F_{san}) \quad \text{s.t.} \quad L_{util}(F_{san}, F_{res}) \leq \epsilon$$

This formalizes the challenge: minimizing privacy exposure while limiting utility degradation to a user-tolerable threshold $\epsilon$.
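To make the formalization concrete, here is a minimal Python sketch of how a candidate sanitizer could be scored against these two objectives. The `attack`, `utility`, and `cloud_agent` callables are hypothetical stand-ins for components the formulation leaves abstract, not part of the paper's implementation.

```python
# Hedged sketch: empirically scoring a candidate (sanitize, restore) pair
# against R_priv and L_util. All callables here are assumed stand-ins.

def estimate_privacy_risk(sanitize, cloud_agent, attack, inputs, secrets):
    """Empirical R_priv: fraction of samples where the attacker recovers
    at least one protected span from the sanitized traffic."""
    leaks = 0
    for x, s_set in zip(inputs, secrets):
        x_safe = sanitize(x)
        y_safe = cloud_agent(x_safe)
        recovered = attack(x_safe, y_safe)  # attacker's guesses
        leaks += any(s in recovered for s in s_set)
    return leaks / len(inputs)

def estimate_utility_loss(sanitize, restore, cloud_agent, utility, inputs):
    """Empirical L_util: average U(Y_ideal) - U(Y_hat) over the eval set."""
    deltas = [
        utility(cloud_agent(x)) - utility(restore(cloud_agent(sanitize(x))))
        for x in inputs
    ]
    return sum(deltas) / len(deltas)

def admissible(sanitize, restore, cloud_agent, utility, inputs, eps):
    """The constraint from the objective: L_util must stay within epsilon."""
    return estimate_utility_loss(sanitize, restore, cloud_agent,
                                 utility, inputs) <= eps
```

Among admissible candidates, the one with the lowest estimated privacy risk would be preferred, mirroring the constrained argmin above.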

Methodology

1. The MemPrivacy Framework Architecture

MemPrivacy operates as a three-stage, closed-loop framework:

  • Stage 1: Uplink Desensitization: A lightweight on-device MemPrivacy model detects privacy spans, classifies them according to the PL1-PL4 taxonomy, and replaces protected spans (e.g., PL3, PL4) with typed placeholders (e.g., <Health_Info_1>). The original-to-placeholder mapping is stored securely in a local database.
  • Stage 2: Cloud Processing: The desensitized input (with placeholders preserving semantic roles) is sent to the cloud for agent reasoning and memory operations. No raw private values are exposed.
  • Stage 3: Downlink Restoration: The cloud's response (which may contain placeholders) is received locally. The system performs a low-latency database lookup to replace each placeholder with its original value, delivering a fluent, personalized, and privacy-safe reply to the user.

The end-to-end execution is formalized in Algorithm 1.
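As a concrete illustration of this three-stage flow, the following is a minimal Python sketch. The `detect_spans` model interface, the SQLite-backed mapping store, and the `cloud_agent` call are illustrative assumptions rather than the paper's actual implementation.

```python
import sqlite3
from collections import defaultdict

class PlaceholderStore:
    """Local, on-device mapping between typed placeholders and originals."""
    def __init__(self, path=":memory:"):
        self.db = sqlite3.connect(path)
        self.db.execute("CREATE TABLE IF NOT EXISTS map "
                        "(placeholder TEXT PRIMARY KEY, value TEXT)")
        self.counters = defaultdict(int)

    def register(self, ptype, value):
        # Mint a typed placeholder such as <Health_Info_1> and persist it.
        self.counters[ptype] += 1
        placeholder = f"<{ptype}_{self.counters[ptype]}>"
        self.db.execute("INSERT INTO map VALUES (?, ?)", (placeholder, value))
        return placeholder

    def restore(self, text):
        # Low-latency local lookup: swap placeholders back to raw values.
        for placeholder, value in self.db.execute(
                "SELECT placeholder, value FROM map"):
            text = text.replace(placeholder, value)
        return text

def process_turn(user_input, store, detect_spans, cloud_agent,
                 protected_levels=("PL3", "PL4")):
    # Stage 1: uplink desensitization on the edge device.
    safe_input = user_input
    for span, ptype, level in detect_spans(user_input):
        if level in protected_levels:
            safe_input = safe_input.replace(span, store.register(ptype, span))
    # Stage 2: cloud processing sees only typed placeholders.
    safe_response = cloud_agent(safe_input)
    # Stage 3: downlink restoration via local database lookup.
    return store.restore(safe_response)
```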

2. Four-Level Privacy Taxonomy (PL1–PL4)

The taxonomy organizes privacy-relevant content for differential protection (a configuration sketch follows the list):

  • PL1 (Low Sensitivity/Preferences): Generic preferences, habits, and non-diagnostic self-descriptions that are not identifying or harmful. Excluded from extraction.
  • PL2 (Identifiable PII): Information that can identify or trace a natural person (e.g., names, contact details, account IDs, detailed addresses).
  • PL3 (Highly Sensitive PII): Information whose leakage is expected to cause significant harm (e.g., government IDs, financial/medical records, precise location, biometrics, sensitive attributes).
  • PL4 (Confidential/Credentials): Material that is immediately exploitable (e.g., passwords, API keys, session tokens, private keys, undisclosed business secrets). Highest priority.
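A minimal sketch of how the taxonomy might map to a configurable masking policy; the level names follow the paper, while the policy class and its defaults are illustrative assumptions.

```python
from enum import IntEnum

class PrivacyLevel(IntEnum):
    PL1 = 1  # low-sensitivity preferences: excluded from extraction
    PL2 = 2  # identifiable PII: names, contacts, account IDs, addresses
    PL3 = 3  # highly sensitive PII: government IDs, financial/medical,
             # precise location, biometrics
    PL4 = 4  # confidential credentials: passwords, keys, tokens, secrets

class MaskingPolicy:
    """Protect every span at or above a configurable threshold."""
    def __init__(self, threshold=PrivacyLevel.PL2):
        self.threshold = threshold

    def should_mask(self, level: PrivacyLevel) -> bool:
        return level >= self.threshold

# Example: a stricter policy keeps only immediately exploitable
# material (PL4) behind placeholders.
policy = MaskingPolicy(threshold=PrivacyLevel.PL4)
assert policy.should_mask(PrivacyLevel.PL4)
assert not policy.should_mask(PrivacyLevel.PL2)
```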

3. MemPrivacy-Bench Dataset Construction

To address the lack of benchmarks for privacy-utility trade-offs in memory systems, the authors construct MemPrivacy-Bench.

  • Scale: 200 synthetic user profiles from PersonaHub seeds, containing preferences and an average of 50 privacy types per user.
  • Content: Multi-turn dialogues generated across 7 high-level scenario categories (e.g., Drafting, Financial Analysis, Consultation), where privacy is revealed directly and indirectly.
  • Statistics: The training set has 26,016 turns (160 users, ~125k privacy instances); the test set has 6,337 turns (40 users, ~29.9k privacy instances). The data is split evenly between Chinese and English (50%/50%).
  • Annotation: A hybrid LLM-assisted (Gemini-3.1-Pro & GPT-5.2) and human verification pipeline achieves a final annotation accuracy of 98.08%.

4. MemPrivacy Model Training

The MemPrivacy extraction models are trained in two stages on MemPrivacy-Bench:

  1. Supervised Fine-Tuning (SFT): Optimized with the standard autoregressive cross-entropy objective:

$$\mathcal{L}_{SFT}(\theta) = -\frac{1}{\tau} \sum_{t=1}^{\tau} \log P_{\theta}(o_t \mid o_{<t}, s),$$

where $\theta$ denotes the model parameters, $\tau$ is the target length, $s$ is the input, and $o_t$ is the target token at step $t$.
  2. Reinforcement Learning with GRPO: Further optimizes the policy using Group Relative Policy Optimization (GRPO), which estimates advantages from the relative rewards of multiple sampled outputs. The objective is:

$$J_{RL}(\theta) = \mathbb{E}_{q \sim P(Q),\, \{o_i\}_{i=1}^{G} \sim \pi_{\theta_{old}}(O \mid q)} \Bigg[ \frac{1}{G} \sum_{i=1}^{G} \frac{1}{|o_i|} \sum_{t=1}^{|o_i|} \min\!\left( \frac{\pi_{\theta}(o_{i,t} \mid q, o_{i,<t})}{\pi_{\theta_{old}}(o_{i,t} \mid q, o_{i,<t})} \hat{A}_{i,t},\ \text{clip}\!\left( \frac{\pi_{\theta}(o_{i,t} \mid q, o_{i,<t})}{\pi_{\theta_{old}}(o_{i,t} \mid q, o_{i,<t})},\, 1-\epsilon,\, 1+\epsilon \right) \hat{A}_{i,t} \right) - \beta\, D_{KL}\big[\pi_{\theta} \,\|\, \pi_{ref}\big] \Bigg]$$

The reward $r_i$ for each sampled output is its F1 score against the ground truth, normalized within the group as $\tilde{r}_i = \frac{r_i - \text{mean}(r)}{\text{std}(r)}$ and used as the token-level advantage $\hat{A}_{i,t} = \tilde{r}_i$ (a minimal sketch of this normalization follows the list).
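As referenced above, a minimal Python sketch of the group-relative reward normalization; the `f1_score` function is an assumed stand-in, and the clipped policy ratio and KL penalty of the full GRPO objective are omitted for brevity.

```python
import statistics

def group_advantages(sampled_outputs, ground_truth, f1_score):
    """GRPO-style advantages for one prompt: score each of the G sampled
    outputs by F1 against the ground truth, then z-normalize within the
    group. Every token of output i shares the same advantage A_hat[i]."""
    rewards = [f1_score(out, ground_truth) for out in sampled_outputs]
    mean_r = statistics.mean(rewards)
    std_r = statistics.pstdev(rewards)  # population std; choice is assumed
    if std_r == 0:  # all outputs scored identically: no learning signal
        return [0.0] * len(rewards)
    return [(r - mean_r) / std_r for r in rewards]
```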

Empirical Validation / Results

1. Privacy Extraction Performance

Table 2 compares MemPrivacy models against general LLMs and a specialized baseline (OpenAI-Privacy-Filter) on MemPrivacy-Bench and PersonaMem-v2.

Table 2: Performance comparison of different LLMs and privacy models on MemPrivacy-Bench and PersonaMem-v2. (Excerpt: F1 and Precision, in %, on MemPrivacy-Bench; the PersonaMem-v2 columns of the full table are omitted here.)

| Model                 | F1    | Precision |
|-----------------------|-------|-----------|
| General Models        |       |           |
| GPT-5.2               | 68.99 | 65.40     |
| Gemini-3.1-Pro        | 78.41 | 78.66     |
| DeepSeek-V3.2-Think   | 75.04 | 76.46     |
| Privacy Models        |       |           |
| OpenAI-Privacy-Filter | 35.50 | 39.96     |
| MemPrivacy-0.6B-SFT   | 83.09 | 85.67     |
| MemPrivacy-4B-SFT     | 85.64 | 87.45     |
| MemPrivacy-4B-RL      | 85.97 | 86.86     |

Key Findings:

  • MemPrivacy models consistently outperform all general models, with the 4B-RL variant achieving F1 scores of 85.97% (MemPrivacy-Bench) and 94.48% (PersonaMem-v2).
  • Even the smallest MemPrivacy-0.6B-SFT (83.09% F1) surpasses the best general model, Gemini-3.1-Pro (78.41%).
  • The specialized OpenAI-Privacy-Filter is highly efficient (<0.5s) but has substantially lower accuracy (35.50% F1), highlighting the need for task-specific training.
  • MemPrivacy models are significantly more efficient than large reasoning models (e.g., ~2s vs. Gemini's ~33s), making them suitable for on-device deployment.
  • Reinforcement Learning (RL) provides consistent gains over SFT alone.

2. Memory System Performance Under Protection

Table 4 evaluates the impact of different privacy protection methods on three widely used memory systems (LangMem, Mem0, Memobase), using GPT-4.1 as the backend.

Table 4: Performance comparison under different privacy protection methods on three memory systems. (Excerpt: LangMem on MemPrivacy-Bench, Accuracy in %, with the difference from the no-protection baseline in parentheses.)

| Privacy Protection Method        | Masking Level | Accuracy       |
|----------------------------------|---------------|----------------|
| None (Baseline)                  | None          | 65.37 (+0.00)  |
| Irreversible Masking             | PL2, PL3, PL4 | 38.70 (-26.67) |
| Untyped Placeholder Masking      | PL2, PL3, PL4 | 58.70 (-6.67)  |
| MemPrivacy + DeepSeek-V3.2-Think | PL2, PL3, PL4 | 58.05 (-7.32)  |
| MemPrivacy + GPT-5.2             | PL2, PL3, PL4 | 54.03 (-11.34) |
| MemPrivacy + MemPrivacy Model    | PL2, PL3, PL4 | 64.07 (-1.30)  |
| MemPrivacy + MemPrivacy Model    | PL3, PL4      | 65.12 (-0.25)  |
| MemPrivacy + MemPrivacy Model    | PL4           | 65.28 (-0.09)  |

Key Findings:

  • MemPrivacy with its own model achieves the smallest utility loss. When protecting all PL2-PL4 content, the accuracy drop is only 1.30% for LangMem, 0.73% for Mem0, and 0.73% for Memobase.
  • As protection becomes more selective (e.g., only PL4), the loss decreases to <0.89%.
  • Irreversible masking causes severe degradation (-26.67% for LangMem), and untyped placeholder masking still lags significantly behind MemPrivacy (-6.67%).
  • The framework's effectiveness critically depends on accurate extraction. Using general LLMs (GPT-5.2, DeepSeek) as the extractor within MemPrivacy leads to much larger utility drops (-11.34%, -7.32%).
  • Figure 3 shows that MemPrivacy's advantage over baselines grows as the proportion of privacy-related questions increases, demonstrating robustness in privacy-intensive scenarios.

Theoretical and Practical Implications

  • Theoretical: The work provides a formal problem definition and a principled taxonomy (PL1-PL4) that bridges technical privacy protection with contextual and harm-centered legal/privacy theories (e.g., Nissenbaum's contextual integrity). It demonstrates that privacy protection need not be synonymous with semantic destruction.
  • Practical: MemPrivacy offers a practical path for secure deployment of memory-augmented agents. It enables:
    • User-Transparent Protection: Users interact naturally, with restoration happening seamlessly on-device.
    • Configurable Policies: Users or developers can set the masking threshold (e.g., protect only PL3 & PL4) based on sensitivity preferences.
    • Lightweight On-Device Deployment: The trained models (0.6B-4B) are efficient and accurate, avoiding cloud dependency for privacy processing.
    • Compatibility: The framework works with existing memory systems (LangMem, Mem0, Memobase) with minimal utility impact.

Conclusion

MemPrivacy addresses the critical tension between privacy protection and memory utility in edge-cloud agents. By introducing a reversible pseudonymization framework based on semantic typed placeholders, a four-level privacy taxonomy, and a dedicated benchmark and model family, it demonstrates a path to minimize sensitive data exposure while preserving the semantic structure necessary for effective personalization. Experimental results show that MemPrivacy models achieve state-of-the-art privacy extraction, substantially outperform general-purpose LLMs, and limit utility loss in memory systems to within 1.6%, offering a robust and practical privacy-utility trade-off for the future of personalized AI agents.