🤗 Daily Paper Newsletter |
 |
Hope you found some gems! |
This newsletter brings you a curated list of papers from 🤗 Daily Papers. |
|
Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty |
Published at 2025-07-22 |
#ML
|
This study presents RLCR, a method for training language models that not only enhances their accuracy but also ensures their confidence estimates are reliable. RLCR improves calibration without sacrificing accuracy, outperforming traditional RL training and post-hoc confidence scoring, and can further enhance performance via confidence-weighted scaling methods.... |
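To make "calibration" and "confidence-weighted" concrete, here is a minimal sketch. It is not the RLCR implementation: the Brier score is one standard calibration measure, and the voting helper assumes that confidence-weighted scaling means aggregating several sampled answers by their stated confidence. The function names and sample data are hypothetical.

```python
from collections import defaultdict


def brier_score(confidences, correct):
    """Mean squared gap between stated confidence and actual correctness (lower is better)."""
    return sum((c - float(y)) ** 2 for c, y in zip(confidences, correct)) / len(confidences)


def confidence_weighted_vote(samples):
    """Pick the answer whose sampled confidences sum to the largest total."""
    totals = defaultdict(float)
    for answer, confidence in samples:
        totals[answer] += confidence
    return max(totals, key=totals.get)


# Hypothetical sampled (answer, confidence) pairs for one question.
samples = [("A", 0.9), ("B", 0.4), ("B", 0.3)]
winner = confidence_weighted_vote(samples)

# Hypothetical ground truth: only the first sample was correct.
score = brier_score([0.9, 0.4, 0.3], [True, False, False])
```

In this toy case a plain majority vote would pick "B" (two samples), while the confidence-weighted vote picks "A" because its single sample carries more total confidence. This only helps if the confidences are well calibrated.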
|
Met^2Net: A Decoupled Two-Stage Spatio-Temporal Forecasting Model for Complex Meteorological Systems |
Published at 2025-07-23 |
#ML
|
The paper presents a new method called Met^2Net that improves the accuracy of weather prediction for complex systems by using two separate stages of training for different variables, and a self-attention mechanism to better understand the relationships between variables, resulting in significant reductions in prediction errors.... |
|
ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment |
Published at 2025-07-25 |
#ML
|
The study presents a new framework called ScenePainter to create consistent and immersive 3D view sequences for long-term video synthesis and 3D scene reconstruction. This framework addresses the problem of semantic drift in generated scenes by aligning the outpainter's scene-specific prior with the current scene comprehension, using a hierarchical graph structure called SceneConceptGraph to construct relations among multi-level scene concepts.... |
|
UloRL: An Ultra-Long Output Reinforcement Learning Approach for Advancing Large Language Models' Reasoning Abilities |
Published at 2025-07-25 |
#ML
|
The authors propose a new method called UloRL to improve the reasoning abilities of large language models, particularly when generating ultra-long outputs. They do this by dividing the output into shorter segments and using dynamic masking to prevent issues during training, resulting in faster training and better performance on various tasks compared to a larger model.... |
|
Agentic Reinforced Policy Optimization |
Published at 2025-07-26 |
#ML
|
The authors present a new algorithm called Agentic Reinforced Policy Optimization (ARPO) that improves large language models' (LLMs) performance in multi-turn tool interactions by balancing their long-term reasoning abilities and tool usage proficiency. ARPO's entropy-based adaptive rollout mechanism promotes exploration in uncertain steps after tool usage, and it outperforms existing trajectory-level RL algorithms across various benchmarks, requiring only half the tool-use budget.... |
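The idea of exploring more at uncertain steps can be illustrated with a small sketch. This is not ARPO's algorithm: it only assumes that uncertainty is measured by the Shannon entropy of a next-token distribution and that a fixed threshold decides whether to branch into extra rollouts. The function names, the threshold, and the rollout counts are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def branch_count(probs, base=1, extra=2, threshold=1.0):
    """Return how many rollouts to continue with: more when the step is uncertain."""
    if shannon_entropy(probs) > threshold:
        return base + extra
    return base


# Hypothetical next-token distributions right after a tool call.
confident_step = [0.97, 0.01, 0.01, 0.01]
uncertain_step = [0.25, 0.25, 0.25, 0.25]

confident_branches = branch_count(confident_step)
uncertain_branches = branch_count(uncertain_step)
```

The confident distribution stays below the threshold and keeps a single rollout, while the uniform one (entropy ln 4) triggers the extra branches.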
|
ForCenNet: Foreground-Centric Network for Document Image Rectification |
Published at 2025-07-26 |
#ML
|
The study presents a new method called Foreground-Centric Network (ForCenNet) to correct geometric distortions in document images, which outperforms existing techniques by focusing on foreground elements and using a curvature consistency loss to better understand distorted geometric distributions.... |
|
Region-based Cluster Discrimination for Visual Representation Learning |
Published at 2025-07-26 |
#ML
|
The authors present a new method called RICE that improves region-based visual understanding and OCR capabilities, specifically addressing the limitations of current vision-language contrastive models in dense prediction tasks. RICE uses a large-scale candidate region dataset, a Region Transformer layer, and a unified region cluster discrimination loss to enhance object and OCR learning. It outperforms previous methods on various tasks and is publicly available.... |
|
Diversity-Enhanced Reasoning for Subjective Questions |
Published at 2025-07-27 |
#ML
|
This study presents a new method called MultiRole-R1 that improves the performance of large reasoning models on subjective questions by incorporating multiple perspectives. Using unsupervised data construction and reinforcement learning with diversity as a reward signal, MultiRole-R1 improves both accuracy and diversity on subjective reasoning tasks, as demonstrated on six benchmarks.... |
|
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence |
Published at 2025-07-28 |
#ML
|
The study presents a comprehensive review of self-evolving agents, focusing on their ability to adapt and evolve in real-time, contrasting them with the static nature of Large Language Models. The survey outlines a structured framework for understanding and designing self-evolving agents, discussing evolutionary mechanisms, adaptation methods, and evaluation metrics, while also highlighting applications and challenges in the pursuit of Artificial Super Intelligence.... |
|
ARC-Hunyuan-Video-7B: Structured Video Comprehension of Real-World Shorts |
Published at 2025-07-28 |
#ML
|
The research presents a new multimodal model, ARC-Hunyuan-Video, which can effectively understand real-world short videos by processing their visual, audio, and textual signals. This model excels in tasks like video captioning, summarization, and question answering, and has shown significant improvements in user engagement and satisfaction when deployed in real-world applications.... |
|
GPT-IMAGE-EDIT-1.5M: A Million-Scale, GPT-Generated Image Dataset |
Published at 2025-07-28 |
#ML
|
The researchers created a publicly accessible, large-scale image-editing dataset called GPT-IMAGE-EDIT-1.5M, which contains over 1.5 million high-quality image editing examples. They used GPT-4o to improve three popular image-editing datasets and fine-tuned open-source models on their new dataset, resulting in better image editing performance that outperforms previous open-source methods and comes close to proprietary models.... |
|
GenoMAS: A Multi-Agent Framework for Scientific Discovery via Code-Driven Gene Expression Analysis |
Published at 2025-07-28 |
#ML
|
GenoMAS is a new framework that uses a team of LLM-based agents to analyze gene expression data. This system integrates structured workflows with autonomous agents to handle complex and large datasets, outperforming previous methods and providing biologically plausible results.... |
|
Geometric-Mean Policy Optimization |
Published at 2025-07-28 |
#ML
|
This study presents Geometric-Mean Policy Optimization (GMPO), a more stable alternative to a recent method called Group Relative Policy Optimization (GRPO) that enhances the reasoning abilities of large language models. GMPO improves stability by optimizing the geometric mean of token-level rewards instead of the arithmetic mean, and it outperforms GRPO on various benchmarks, including mathematical and multimodal reasoning tasks.... |
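A toy comparison shows why a geometric mean can be the more stable aggregate. This is not the GMPO objective itself, only a sketch of how the two means react to a single outlier; the per-token values below are hypothetical and must be positive for the geometric mean to be defined.

```python
import math


def arithmetic_mean(values):
    """Ordinary average; a single large value can dominate it."""
    return sum(values) / len(values)


def geometric_mean(values):
    """Geometric mean computed in log space; requires strictly positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-token values with one extreme outlier.
token_values = [1.0, 1.0, 1.0, 1.0, 16.0]

am = arithmetic_mean(token_values)
gm = geometric_mean(token_values)
```

Here the outlier pulls the arithmetic mean up to 4.0, while the geometric mean stays near 1.74 (16 to the power 1/5), so one extreme token moves the aggregate far less.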
|
JAM: A Tiny Flow-based Song Generator with Fine-grained Controllability and Aesthetic Alignment |
Published at 2025-07-28 |
#ML
|
The JAM model is a flow-based song generator that provides fine-grained control over word timing and duration, and improves song quality through aesthetic alignment without manual data annotations. The model is evaluated using the public JAME dataset, and it outperforms existing lyrics-to-song models in music-specific attributes.... |
|
Music Arena: Live Evaluation for Text-to-Music |
Published at 2025-07-28 |
#ML
|
Music Arena is a new platform for live evaluation of text-to-music models, allowing users to input prompts and compare system outputs. It offers a standardized, transparent, and music-specific approach to gathering human preferences, filling a gap in the current TTM evaluation methods.... |
|
Reconstructing 4D Spatial Intelligence: A Survey |
Published at 2025-07-28 |
#ML
|
This survey organizes techniques for reconstructing 4D spatial intelligence (that is, understanding moving 3D scenes) into five levels, ranging from basic 3D attributes to the incorporation of physical laws. The authors also discuss challenges and potential advancements at each level, and provide a resource page for tracking developments in the field.... |
|
Rep-MTL: Unleashing the Power of Representation-level Task Saliency for Multi-Task Learning |
Published at 2025-07-28 |
#ML
|
The authors propose Rep-MTL, a Multi-Task Learning approach that uses representation-level task saliency to improve interactions between task-specific optimization and shared representation learning. By balancing task-specific learning and cross-task sharing, Rep-MTL mitigates negative transfer and promotes complementary information sharing, resulting in competitive performance gains on various benchmarks.... |
|
SAND-Math: Using LLMs to Generate Novel, Difficult and Useful Mathematics Questions and Answers |
Published at 2025-07-28 |
#ML
|
The authors have developed a pipeline called SAND-Math to generate high-quality, complex mathematical problems and solutions, which significantly improves the performance of mathematical reasoning LLMs. By increasing the difficulty of the problems, they achieved a boost of 17.85 absolute points on the AIME25 benchmark compared to the next-best synthetic dataset.... |
|
SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment |
Published at 2025-07-28 |
#ML
|
The study presents SmallThinker, a new family of large language models optimized for local device deployment, addressing issues like limited memory and slow storage. The authors introduce innovative techniques such as sparse structures and a pre-attention router to reduce computational demands and storage latency, enabling the models to outperform larger LLMs on consumer CPUs.... |
|
Tags are generated by Google's Gemini Pro API, and the summary and translation are generated by Upstage's SOLAR mini chat model, derived from the SOLAR-10.7B open LLM.
(Experimental) The full paper is translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun. |