Safety
Schützen: Evaluating LLM Safety in Bulgarian and German Contexts
The article introduces Schützen, a new safety evaluation dataset for large language models (LLMs) focused on Bulgarian and German contexts, addressing the lack of safety resources for non-English languages. Experiments with multilingual and language-specific LLMs show significant cross-language differences in safety behavior, underscoring the need for region-specific evaluation tools to support responsible LLM deployment. The dataset and accompanying code are available on GitHub for practitioners working on model safety in diverse linguistic settings.
safety, llm, evaluation