Models · Reddit r/LocalLLaMA — 14 d ago

Kwai-Keye/Keye-VL-2.0-30B-A3B-GGUF · Hugging Face

Kwai-Keye has released Keye-VL-2.0-30B-A3B, a 30-billion-parameter model designed for long-video understanding and multimodal agent tasks. Its DSA-native long-context architecture uses sparse attention to process hour-long videos efficiently. The model reportedly leads video comprehension benchmarks, and its post-training is aimed at strengthening reasoning and reducing hallucinations. For practitioners, the notable addition is built-in agent functionality that supports complex tasks such as tool use and web search.

keye · hugging_face · video_understanding · relevance 0.00 · engagement 0.00
