
Lighthouz AI launches Chatbot Guardrails Arena to stress-test privacy guardrails across 12 LLMs

AI Impact Summary

Lighthouz AI, in collaboration with Hugging Face, is launching the Chatbot Guardrails Arena to stress-test LLM privacy guardrails: participants try to coax sensitive data out of two anonymous bank-style chatbots. The program evaluates 12 guardrailed models, including gpt3.5-turbo-1106, Gemini-Pro, Llama-2-70b-chat-hf, and Mixtral-8x7B-Instruct-v0.1, paired with NVIDIA NeMo Guardrails or Meta LlamaGuard. A public leaderboard and openly shared results are intended to inform how enterprises evaluate privacy protections and to guide the selection of models and guardrails for production deployments.

Affected Systems

gpt3.5-turbo-1106
Gemini-Pro

Date: not specified
Change type: capability
Severity: info
