Research · arXiv cs.AI · 12 d ago

Security and Privacy Prompts in the Wild: What Users Ask LLMs and How LLMs Respond

The study analyzes 14,727 security and privacy (S&P) prompts drawn from a dataset of 3.2 million user-LLM conversations and categorizes them into nine S&P topics. It evaluates the response quality of commercial LLMs such as GPT-5.5, which gave satisfactory answers to 98% of prompts, against open-weight models such as Llama 4, which did so for 47%. The findings show why it matters to understand what users ask in S&P contexts: even commercial models may still respond inconsistently, which poses a risk for users seeking reliable information.

llm · security · privacy · user-queries
