CaMeL offers a promising new direction for mitigating prompt injection attacks
Consider the prompt “Find Bob’s email in my last email and send him a reminder about tomorrow’s meeting”. CaMeL would convert that into code looking something like this:
email = get_last_email()
address = query_quarantined_llm(
    "Find Bob's email address in [email]",
    output_schema=EmailStr
)
send_email(
    subject="Meeting tomorrow",
    body="Remember our meeting tomorrow",
    recipient=address,
)
Capabilities are effectively tags that can be attached to each of the variables, to track things like who is allowed to read a piece of data and the source that the data came from. Policies can then be configured to allow or deny actions based on those capabilities.
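To make the idea of tags plus policies concrete, here is a minimal sketch in Python. It is not CaMeL's actual data model or API: the `Tagged` class, the `sources` and `readers` fields, the `"*"` wildcard meaning "anyone may read this", and the `may_send_email` rule are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Tagged:
    """A value with illustrative capability tags (hypothetical, not CaMeL's API)."""
    value: str
    sources: FrozenSet[str]  # where the data came from, e.g. "user" or "email"
    readers: FrozenSet[str]  # who may read it; "*" is an assumed wildcard for anyone


def may_send_email(recipient: Tagged, body: Tagged) -> bool:
    """Assumed example policy for a send_email tool call."""
    # A recipient typed directly by the user is trusted outright.
    if recipient.sources == frozenset({"user"}):
        return True
    # Otherwise the recipient must already be allowed to read the body.
    return "*" in body.readers or recipient.value in body.readers


# The reminder text comes from the user's prompt, so anyone may read it.
reminder = Tagged("Remember our meeting tomorrow", frozenset({"user"}), frozenset({"*"}))
# Addresses extracted from email content are tagged as coming from "email".
bob = Tagged("bob@example.com", frozenset({"email"}), frozenset({"*"}))
attacker = Tagged("attacker@example.com", frozenset({"email"}), frozenset({"*"}))
# A private document that only Alice is allowed to read.
secret = Tagged("Q3 figures", frozenset({"drive"}), frozenset({"alice@example.com"}))

reminder_to_bob = may_send_email(bob, reminder)
secret_to_attacker = may_send_email(attacker, secret)
```

Under these assumptions the harmless reminder to Bob is allowed even though his address came from untrusted email content, while sending the private document to an address that an injected instruction supplied is denied. The decision depends only on the tags, not on what the text of the email says.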
This means a CaMeL system could use a cloud-hosted LLM as the driver while keeping the user’s own private data safely restricted to their own personal device.
Importantly, CaMeL suffers from users needing to codify and specify security policies and maintain them. CaMeL also comes with a user burden. At the same time, it is well known that balancing security with user experience, especially with de-classification and user fatigue, is challenging.
My hope is that there’s a version of this which combines robustly selected defaults with a clear user interface design that can finally make the dreams of general purpose digital assistants a secure reality.