🤗 Daily Paper (2025-11-12)

deep.di...@gmail.com

Nov 12, 2025, 3:07:32 PM
to hf-daily-pap...@googlegroups.com

🤗 Daily Paper Newsletter

Hope you found some gems!

This newsletter delivers a curated list of papers from 🤗 Daily Papers.

Project page
🤗 Daily Papers

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Published at 2025-11-07

#ML

The authors present a new method called KLASS that speeds up the generation process in masked diffusion models by using token-level KL divergence to identify stable predictions, without needing additional model training. This results in faster inference times and improved performance on reasoning benchmarks, as well as demonstrating effectiveness across various domains such as text, image, and molecular generation....
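As a rough illustration of the idea (not the paper's code), a KL-guided early-decoding step might look like the sketch below: a masked position is committed once its predictive distribution stops changing between denoising steps. The model interface, the threshold, and the exact KL aggregation are assumptions.

```python
# Hedged sketch of KL-guided early decoding for a masked diffusion model.
# `model` is a hypothetical denoiser returning per-position logits; the real
# KLASS criterion may aggregate KL across steps differently.
import torch
import torch.nn.functional as F

def klass_step(model, tokens, mask, prev_logits, kl_threshold=1e-3):
    """Unmask tokens whose predictions have stabilized between steps.

    tokens:      (seq_len,) partially masked token ids
    mask:        (seq_len,) bool, True where a position is still masked
    prev_logits: (seq_len, vocab) logits from the previous denoising step
    """
    logits = model(tokens)  # (seq_len, vocab), assumed interface
    # Token-level KL divergence between current and previous predictions.
    kl = F.kl_div(
        F.log_softmax(prev_logits, dim=-1),   # log-probs of previous step
        F.softmax(logits, dim=-1),            # probs of current step
        reduction="none",
    ).sum(-1)
    # Positions that are still masked but whose distribution barely moved
    # are treated as "stable" and decoded greedily ahead of schedule.
    stable = mask & (kl < kl_threshold)
    tokens = torch.where(stable, logits.argmax(-1), tokens)
    return tokens, mask & ~stable, logits
```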

Read More

Optimizing Diversity and Quality through Base-Aligned Model Collaboration

Published at 2025-11-07

#ML

The authors present a new method called BACo that improves diversity and quality of large language model outputs during inference by combining a base LLM with its aligned version. BACo uses routing strategies to determine which model to decode from at each token, based on prediction uncertainty and semantic role, and consistently outperforms other methods in various tasks and metrics, achieving a 21.3% joint improvement in diversity and quality....
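For a flavor of what token-level routing could look like, here is a small, hedged sketch (single sequence, Hugging Face-style model outputs assumed). The entropy rule stands in for the paper's uncertainty signal; the semantic-role component is omitted.

```python
# Hedged sketch of per-token routing between a base LLM and its aligned
# version, in the spirit of BACo. Threshold and interfaces are assumptions.
import torch
import torch.nn.functional as F

@torch.no_grad()
def routed_decode_step(base_model, aligned_model, input_ids, entropy_threshold=2.0):
    """Append one token, choosing which model to decode it from."""
    aligned_logits = aligned_model(input_ids).logits[:, -1, :]   # (1, vocab)
    probs = F.softmax(aligned_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1)    # (1,)

    if entropy.item() > entropy_threshold:
        # The aligned model is uncertain here; the base model tends to keep
        # more output diversity, so decode this token from it instead.
        logits = base_model(input_ids).logits[:, -1, :]
    else:
        logits = aligned_logits
    next_token = torch.multinomial(F.softmax(logits, dim=-1), num_samples=1)
    return torch.cat([input_ids, next_token], dim=-1)
```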

Read More

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Published at 2025-11-08

#ML

Researchers developed VibeThinker-1.5B, a small 1.5B-parameter model, using a new method called the Spectrum-to-Signal Principle. This model, trained at a total cost of only $7,800, demonstrates reasoning abilities on par with or better than larger, more expensive models, suggesting that small models can be just as capable as large ones....

Read More

VideoSSR: Video Self-Supervised Reinforcement Learning

Published at 2025-11-09

#ML

This research explores utilizing the inherent information in videos to create high-quality training data for multimodal large language models (MLLMs) without manual annotation. The authors propose three self-supervised pretext tasks and a new dataset, and introduce a reinforcement learning framework called VideoSSR, which significantly improves video understanding in MLLMs, outperforming existing methods by an average of over 5%....

Read More

Walking the Tightrope of LLMs for Software Development: A Practitioners' Perspective

Published at 2025-11-09

#ML

This study explores the impact of Large Language Models (LLMs) on software development from a practitioner's perspective. Through interviews and analysis, the researchers highlight both the benefits (like improved productivity and entrepreneurship) and drawbacks (such as potential negative effects on developers' personalities and reputations) of using LLMs in software development....

Read More

Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs

Published at 2025-11-10

#ML

The authors present LMT, a suite of large-scale multilingual translation models that cover 60 languages and 234 translation directions, focusing on both Chinese and English. They address the issue of directional degeneration in multilingual translation models and propose strategies to improve translation quality, achieving state-of-the-art performance among comparable models....

Read More

Grounding Computer Use Agents on Human Demonstrations

Published at 2025-11-10

#ML

The authors present a new large-scale dataset called GroundCUA for training computer-use agents, which includes over 56,000 screenshots and 3.56 million human-verified annotations from expert demonstrations. They then introduce the GroundNext family of models that use this dataset to accurately map natural language instructions to on-screen elements, outperforming previous models while requiring less data....

Read More

Wasm: A Pipeline for Constructing Structured Arabic Interleaved Multimodal Corpora

Published at 2025-11-10

#ML

The authors describe a new method to create a high-quality Arabic multimodal dataset that keeps the structure of web content, which is better for training language and multimodal models compared to existing Arabic datasets. They also share the dataset and the processing pipeline with the public to help future research....

Read More

Adaptive Multi-Agent Response Refinement in Conversational Systems

Published at 2025-11-11

#ML

The study presents a new method for improving conversational systems by using a multi-agent framework, where each agent focuses on a specific aspect of conversation quality such as factuality, personalization, and coherence. The agents communicate and coordinate dynamically to enhance the overall response, resulting in better performance compared to existing approaches, especially in tasks requiring knowledge or personalization....

Read More

BiCA: Effective Biomedical Dense Retrieval with Citation-Aware Hard Negatives

Published at 2025-11-11

#ML

The authors present BiCA, a method for improving biomedical dense retrieval by using citation-aware hard negatives, which are documents referenced in the source document. By fine-tuning GTE models with these negatives, they achieve better zero-shot dense retrieval results and outperform baselines on specific benchmarks, demonstrating an efficient way to adapt to new domains....
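A minimal sketch of how such citation-aware negatives might be mined is shown below; the corpus schema (the `citations` and `abstract` fields) is purely illustrative and not taken from the paper.

```python
# Hedged sketch: build (query, positive, hard negatives) triples where the
# negatives are documents cited by the positive's source article.
def build_citation_triples(corpus, query_to_positive, negatives_per_query=4):
    """corpus: dict doc_id -> {"abstract": str, "citations": [doc_id, ...]}
    query_to_positive: iterable of (query, positive_doc_id) pairs."""
    triples = []
    for query, pos_id in query_to_positive:
        positive = corpus[pos_id]
        # Cited documents are topically close to the positive but are not the
        # answer, which is what makes them useful hard negatives.
        cited = [corpus[c]["abstract"]
                 for c in positive.get("citations", []) if c in corpus]
        if cited:
            triples.append((query, positive["abstract"], cited[:negatives_per_query]))
    return triples
```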

Read More

DynaAct: Large Language Model Reasoning with Dynamic Action Spaces

Published at 2025-11-11

#ML

The authors propose a new framework called DynaAct to automatically create a manageable action space for improving sequential decision-making in complex problems. By using large language models to estimate action spaces and a submodular function to evaluate and select candidate actions, DynaAct significantly enhances performance on various benchmarks without adding much computational cost....
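To make the selection step concrete, here is a hedged sketch of greedy selection under a monotone submodular (facility-location) objective; DynaAct's actual utility function may differ.

```python
# Hedged sketch: pick k candidate actions (e.g. proposed by an LLM) that best
# "cover" the full pool, using a facility-location objective, a classic
# monotone submodular function amenable to greedy selection.
import numpy as np

def greedy_select_actions(candidates, embed, k=8):
    """candidates: list of action strings; embed: str -> unit-norm vector."""
    vecs = np.stack([embed(a) for a in candidates])   # (n, d)
    sims = vecs @ vecs.T                              # pairwise similarities
    selected = []
    for _ in range(min(k, len(candidates))):
        best_idx, best_score = None, -np.inf
        for i in range(len(candidates)):
            if i in selected:
                continue
            # Coverage of the whole pool by the trial set selected + [i].
            score = np.max(sims[:, selected + [i]], axis=1).mean()
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
    return [candidates[i] for i in selected]
```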

Read More

Intelligence per Watt: Measuring Intelligence Efficiency of Local AI

Published at 2025-11-11

#ML

This study proposes a new metric, intelligence per watt, to evaluate the efficiency and capability of local AI models on power-constrained devices. By analyzing 20+ state-of-the-art local AI models and 8 accelerators, the research demonstrates that local AI can accurately answer 88.7% of real-world queries and significantly reduce energy consumption compared to cloud-based models....
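As a back-of-the-envelope reading of the metric, intelligence per watt can be thought of as task accuracy divided by average power draw. The helper below is only an illustration of that ratio, not the paper's exact definition; the power figure is hypothetical, while the 88.7% accuracy is the number quoted above.

```python
# Illustrative intelligence-per-watt calculation: capability per unit of
# power consumed while answering queries (not the paper's exact definition).
def intelligence_per_watt(correct, total, energy_joules, wall_time_s):
    accuracy = correct / total                     # fraction answered correctly
    avg_power_watts = energy_joules / wall_time_s  # average draw during inference
    return accuracy / avg_power_watts

# Hypothetical local device: 887/1000 queries correct while averaging 45 W
# over 20 minutes of inference.
print(intelligence_per_watt(887, 1000, energy_joules=45 * 1200, wall_time_s=1200))
```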

Read More

The Path Not Taken: RLVR Provably Learns Off the Principals

Published at 2025-11-11

#ML

The study explores why Reinforcement Learning with Verifiable Rewards (RLVR) improves large language models with minimal parameter changes. It introduces the Three-Gate Theory to explain how RLVR focuses on specific parameter regions, resulting in consistent performance gains without significantly altering the model's spectrum, unlike Supervised Fine-Tuning (SFT). The research highlights RLVR's unique optimization regime and suggests a need for new, geometry-aware learning algorithms tailored for RLVR....

Read More

Tags are generated by Google's Gemini Pro API, and the summaries and translations are generated by Upstage's SOLAR mini chat model, which is derived from the SOLAR-10.7B open LLM.


(Experimental) The full paper is translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun.

Visit Developer's Social Media

Facebook | X | LinkedIn