Research · arXiv cs.CL · 2 d ago

Parallel Causal Associative Fields: Gated Sparse Memory for Long-Context Language Modeling

The paper introduces the Parallel Causal Associative Field (PCAF), an architecture for long-context language modeling built around a parallel content-addressed memory. At 303M parameters and a context length of 2048, PCAF-semantic reaches a perplexity of 36.31 on WikiText-103, outperforming a matched dense Transformer, with a reported throughput of 0.61-0.62M tokens/s. The memory gives sparse access to long context without the limitations of a fixed recurrent state, which makes the approach relevant to practitioners scaling language models to longer contexts.
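As a rough illustration of what "sparse, causal, content-addressed" memory access can mean, here is a minimal sketch in plain Python. It is not the PCAF formulation: the function name `sparse_causal_read`, the top-k selection, the softmax weighting, and the scalar sigmoid gate are all assumptions chosen for illustration. Each position reads only from slots written at earlier positions, and no state is carried from one position to the next, so the reads could in principle be computed in parallel.

```python
import math


def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_causal_read(queries, keys, values, gate_logits, k=2):
    """For each position t, read from the top-k best-matching earlier slots.

    Illustrative only: top-k selection, softmax weighting and the scalar
    sigmoid gate are assumptions, not the method described in the paper.
    """
    dim = len(values[0])
    outputs = []
    for t, q in enumerate(queries):
        # Causality: position t may only address slots written at s < t.
        scores = [(dot(q, keys[s]), s) for s in range(t)]
        if not scores:
            outputs.append([0.0] * dim)  # nothing has been written yet
            continue
        # Sparsity: keep only the k highest-scoring slots.
        top = sorted(scores, reverse=True)[:k]
        # Softmax over the selected scores (shifted by the max for stability).
        m = max(score for score, _ in top)
        weights = [math.exp(score - m) for score, _ in top]
        z = sum(weights)
        read = [
            sum(w / z * values[s][d] for w, (_, s) in zip(weights, top))
            for d in range(dim)
        ]
        # Gate: a sigmoid in (0, 1) scales how much of the read is passed on.
        g = 1.0 / (1.0 + math.exp(-gate_logits[t]))
        outputs.append([g * r for r in read])
    return outputs
```

The cost per position here depends on k for the weighted read, although this naive version still scores every earlier slot; a practical system would replace that scan with an index or hashing scheme, which this sketch does not attempt.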

language modeling · transformers · long-context
