Daily digest — 2026-06-15

AgentPLM: Agentic Protein Language Models with Reasoning-Augmented Decoding for Protein Sequence Design

AgentPLM is a protein language model that combines Reasoning-Augmented Decoding (RAD) with Contrastive Agent Policy Optimisation (CAPO), letting the model consult external biophysical feedback in real time during sequence design. It outperforms existing passive models on benchmark tasks, achieving state-of-the-art results in antibody optimization and other applications, with improved hit rates and online error correction. For practitioners, this enables more adaptive and efficient protein design workflows and may lead to better therapeutic candidates.

arXiv cs.AI · 54 d ago · Agents

Parthenon Law: A Self-Evolving Legal-Agent Framework

The paper introduces Parthenon, a self-evolving legal-agent framework designed to improve legal-domain large language models (LLMs) by addressing key challenges in deploying legal agents. It includes a large-scale empirical study of 12,510 agent trajectories, which shows that accuracy improves with stronger models but matter completion remains inadequate. The framework adds a learning loop in which agents refine their skills and knowledge based on past performance, enabling continuous improvement without changing model weights. This matters for practitioners aiming to build reliable legal AI systems.

arXiv cs.AI · 54 d ago · Agents

Adaptive Teacher Exposure for Self-Distillation in LLM Reasoning

The paper introduces Adaptive Teacher Exposure for Self-Distillation (ATESD), an approach that optimizes how much a teacher model is exposed to reference reasoning during on-policy self-distillation, with the goal of improving reasoning in large language models (LLMs). ATESD uses a learnable Beta-policy controller to adjust the teacher's exposure dynamically. With Qwen3 models (1.7B, 4B, and 8B parameters), it improves performance on the AIME 24, AIME 25, and HMMT 25 benchmarks, achieving significant gains over existing self-distillation and reinforcement learning methods. The work highlights the importance of adaptive exposure strategies in LLM training and gives practitioners a new mechanism for improving reasoning capabilities.

arXiv cs.AI · 54 d ago · Training
