Research
Who Wrote the Book? Detecting and Attributing LLM Ghostwriters
The paper introduces GhostWriteBench, a dataset for evaluating authorship attribution on long-form texts generated by large language models (LLMs), specifically texts over 50,000 words. It also presents TRACE, an interpretable fingerprinting method that captures token-level transition patterns and achieves state-of-the-art performance in out-of-distribution scenarios, even with limited training data. For practitioners, the work offers new tools for identifying LLM-generated long-form content, supporting transparency and accountability in AI-generated literature.
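To make the idea of a token-level transition fingerprint more concrete, here is a minimal sketch. It is not TRACE itself, whose details are not given in this summary. It is a generic bigram-transition model with add-one smoothing, and the whitespace tokenizer, function names, and toy samples are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: whitespace tokenization stands in for a real tokenizer.
    return text.lower().split()


def transition_counts(tokens):
    # Count how often each token is followed by each other token.
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def avg_log_likelihood(tokens, counts, vocab_size, alpha=1.0):
    # Average smoothed log-probability of the observed transitions
    # under one author's transition counts.
    pairs = list(zip(tokens, tokens[1:]))
    if not pairs:
        return float("-inf")
    total = 0.0
    for prev, nxt in pairs:
        row = counts.get(prev, {})
        row_total = sum(row.values())
        total += math.log((row.get(nxt, 0) + alpha) / (row_total + alpha * vocab_size))
    return total / len(pairs)


def attribute(text, fingerprints, vocab_size):
    # Pick the candidate whose transition fingerprint best explains the text.
    tokens = tokenize(text)
    return max(
        fingerprints,
        key=lambda name: avg_log_likelihood(tokens, fingerprints[name], vocab_size),
    )


# Toy samples (hypothetical); real use would need far longer texts.
samples = {
    "model_a": "the cat sat on the mat and the cat slept on the mat",
    "model_b": "a dog ran in a park while a dog barked in a park",
}
fingerprints = {name: transition_counts(tokenize(t)) for name, t in samples.items()}
vocab_size = len({tok for t in samples.values() for tok in tokenize(t)})
predicted = attribute("the cat slept on the mat", fingerprints, vocab_size)
```

Because the fingerprint is just a table of transition counts, the transitions that drive a given attribution can be inspected directly, which is one sense in which such a method can be called interpretable.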
llm, authorship attribution, ghostwriting