Lighthouz AI launches Chatbot Guardrails Arena to stress-test privacy guardrails across 12 LLMs
AI Impact Summary
Lighthouz AI, in collaboration with Hugging Face, is launching the Chatbot Guardrails Arena to stress-test LLM privacy guardrails: participants are invited to coax sensitive data out of two anonymous bank-style chatbots. The arena evaluates 12 guardrailed models, including gpt-3.5-turbo-1106, Gemini-Pro, Llama-2-70b-chat-hf, and Mixtral-8x7B-Instruct-v0.1, paired with NVIDIA NeMo Guardrails or Meta LlamaGuard. A public leaderboard and openly shared results are intended to inform how enterprises evaluate privacy protections and to guide the selection of models and guardrails for production deployments.
Affected Systems
- Date: not specified
- Change type: capability
- Severity: info