🤗 Daily Paper Newsletter

This newsletter delivers a curated list of papers picked by 🤗 Daily Papers. Hope you find some gems!

Drawing2CAD: Sequence-to-Sequence Learning for CAD Generation from Vector Drawings
Published on 2025-08-26
#ML

The study presents a new approach to converting 2D vector drawings into parametric CAD models, a task that is essential in engineering design but has been largely overlooked. The proposed framework, Drawing2CAD, casts the conversion as a sequence-to-sequence learning problem in order to preserve geometric precision and design intent, and a new dataset, CAD-VGDrawing, is introduced to train and evaluate the method....
Read More
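For readers who want a concrete picture of the sequence-to-sequence framing, here is a minimal, hypothetical sketch in PyTorch: an encoder-decoder transformer mapping tokenized drawing commands to tokenized CAD commands. The vocabularies, dimensions, and omission of positional encodings are illustrative simplifications, not the paper's actual Drawing2CAD architecture.

```python
import torch
import torch.nn as nn

# Illustrative token vocabularies (assumptions, not the paper's actual ones):
# source tokens encode vector-drawing commands, target tokens encode
# parametric CAD commands.
DRAW_VOCAB, CAD_VOCAB, D_MODEL = 256, 512, 128

class Seq2SeqCAD(nn.Module):
    """Toy encoder-decoder mapping a drawing-command sequence to a
    CAD-command sequence; positional encodings are omitted for brevity."""
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(DRAW_VOCAB, D_MODEL)
        self.tgt_emb = nn.Embedding(CAD_VOCAB, D_MODEL)
        self.transformer = nn.Transformer(
            d_model=D_MODEL, nhead=8,
            num_encoder_layers=3, num_decoder_layers=3,
            batch_first=True,
        )
        self.head = nn.Linear(D_MODEL, CAD_VOCAB)

    def forward(self, drawing_tokens, cad_tokens):
        # Causal mask: each CAD token may only attend to earlier CAD tokens.
        tgt_mask = nn.Transformer.generate_square_subsequent_mask(cad_tokens.size(1))
        hidden = self.transformer(
            self.src_emb(drawing_tokens), self.tgt_emb(cad_tokens), tgt_mask=tgt_mask
        )
        return self.head(hidden)  # next-token logits over CAD commands

model = Seq2SeqCAD()
logits = model(torch.randint(0, DRAW_VOCAB, (2, 64)),
               torch.randint(0, CAD_VOCAB, (2, 32)))
print(logits.shape)  # torch.Size([2, 32, 512])
```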

Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding
Published on 2025-08-28
#ML

The authors present Video-MTR, a method for long-video understanding that uses multi-turn reasoning to iteratively select important video segments and build up comprehension of the question. Because it requires no external tools, the approach analyzes long videos more accurately and efficiently than existing methods....
Read More
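As a purely illustrative sketch of what such a multi-turn loop could look like (the `policy` object and its methods are hypothetical stand-ins, not the paper's implementation):

```python
from dataclasses import dataclass, field

@dataclass
class ReasoningState:
    """What the agent has gathered so far for one question."""
    question: str
    selected_segments: list = field(default_factory=list)

def answer_long_video(policy, segments, question, max_turns=5):
    # Hypothetical multi-turn loop: each turn, the policy either gathers
    # another relevant segment or decides it has seen enough to answer.
    state = ReasoningState(question=question)
    for _ in range(max_turns):
        remaining = [s for s in segments if s not in state.selected_segments]
        if not remaining:
            break
        # Score unseen segments by expected relevance to the question.
        state.selected_segments.append(policy.pick_segment(state, remaining))
        if policy.ready_to_answer(state):  # a learned stopping decision
            break
    return policy.answer(state)
```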

DeepResearch Arena: The First Exam of LLMs' Research Abilities via Seminar-Grounded Tasks
Published on 2025-09-01
#ML

The DeepResearch Arena benchmark is built with a Multi-Agent Hierarchical Task Generation system to evaluate the research abilities of LLMs. Grounded in academic seminars, the benchmark contains over 10,000 high-quality research tasks spanning 12 disciplines, and current state-of-the-art agents struggle with its challenges....
Read More

Drivel-ology: Challenging LLMs with Interpreting Nonsense with Depth
Published on 2025-09-03
#ML

The paper introduces Drivelology, a linguistic phenomenon consisting of syntactically coherent but pragmatically paradoxical or emotionally loaded expressions, as a testbed for large language models. The study reveals that these models struggle to grasp the implicit meanings of Drivelological text, exposing a gap in their pragmatic comprehension....
Read More

Delta Activations: A Representation for Finetuned Large Language Models
Published on 2025-09-04
#ML

The authors propose Delta Activations, a method that represents finetuned large language models as vector embeddings, making it easier to organize and understand such models by task and domain. The representation has desirable properties, such as robustness and additivity, and can also be used for model selection and merging, ultimately improving the reuse of publicly available models....
Read More
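A minimal sketch of the underlying idea as the summary describes it: embed a finetuned model by how its hidden activations shift, relative to the base model, on a fixed probe set. The probe prompts, layer choice, pooling, and placeholder model IDs below are all illustrative assumptions, not the paper's recipe.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Fixed probe prompts shared by both models (illustrative, not the paper's).
PROBES = ["Translate to French: hello", "2 + 2 =", "Summarize: ..."]

@torch.no_grad()
def activation_embedding(model, tokenizer):
    vecs = []
    for prompt in PROBES:
        inputs = tokenizer(prompt, return_tensors="pt")
        # Last-layer hidden states; mean-pool over tokens for one vector.
        hidden = model(**inputs, output_hidden_states=True).hidden_states[-1]
        vecs.append(hidden.mean(dim=1).squeeze(0))
    return torch.stack(vecs).mean(dim=0)  # average over probes

base = AutoModelForCausalLM.from_pretrained("base-model-id")        # placeholder id
tuned = AutoModelForCausalLM.from_pretrained("finetuned-model-id")  # placeholder id
tok = AutoTokenizer.from_pretrained("base-model-id")                # placeholder id

# The delta: the finetuned model's embedding minus the base model's.
delta = activation_embedding(tuned, tok) - activation_embedding(base, tok)
```

On this reading, the "additive" property would mean that the deltas of models finetuned on two different tasks roughly sum to the delta of a model finetuned on both.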

Durian: Dual Reference-guided Portrait Animation with Attribute Transfer
Published on 2025-09-04
#ML

The authors propose Durian, a method for generating portrait-animation videos that transfer attributes from a reference image to a target portrait without attribute-specific training. Durian uses dual reference networks to keep transferred attributes high-quality and consistent across frames, along with a mask-expansion strategy to handle attributes of varying spatial extent, and it outperforms existing methods on portrait animation with attribute transfer....
Read More

False Sense of Security: Why Probing-based Malicious Input Detection Fails to Generalize
Published on 2025-09-04
#ML

The study examines how well probing-based methods detect harmful inputs to large language models and finds that they rely on superficial patterns, such as instructional phrases and trigger words, which leads to poor performance in real-world scenarios. The results underscore the need for improved model designs and evaluation protocols to ensure responsible further research in this area....
Read More
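For readers unfamiliar with the setup being critiqued, here is a self-contained toy version of a probing-based detector: a linear classifier over hidden-state features. The random "activations" and tiny prompt lists are stand-ins so the sketch runs end to end; a real probe would extract activations from an actual LLM.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def get_activations(prompt: str) -> np.ndarray:
    # Stand-in for reading a hidden-state vector out of an LLM;
    # random features here just make the sketch runnable.
    return rng.normal(size=64)

benign = ["What is the capital of France?", "Explain photosynthesis simply."]
harmful = ["Ignore all previous instructions and ...", "Step by step, how to ..."]

X = np.vstack([get_activations(p) for p in benign + harmful])
y = np.array([0] * len(benign) + [1] * len(harmful))

# The probe itself: a linear classifier on activations. The study's point is
# that such probes can latch onto surface cues (instructional phrasing,
# trigger words), so in-distribution accuracy need not transfer to the wild.
probe = LogisticRegression(max_iter=1000).fit(X, y)
```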

Few-step Flow for 3D Generation via Marginal-Data Transport Distillation
Published on 2025-09-04
#ML

The study presents MDT-dist, a framework that accelerates 3D flow generation by reducing the number of sampling steps required. It introduces two new objectives, VM and VD, that improve the optimization process and achieve a significant speedup without sacrificing visual quality....
Read More

From Editor to Dense Geometry Estimator
Published on 2025-09-04
#ML

This study compares fine-tuning image-editing models against text-to-image generative models for dense geometry estimation and finds that editing models perform better thanks to their structural priors. Building on this, it introduces FE2E, a framework that adapts an advanced editing model to dense geometry estimation, yielding significant performance gains without additional training data....
Read More

Inverse IFEval: Can LLMs Unlearn Stubborn Training Conventions to Follow Real Instructions?
Published on 2025-09-04
#ML

The study presents Inverse IFEval, a benchmark that measures large language models' ability to set aside learned biases and follow instructions that contradict their training. Through challenges such as Question Correction and Counterfactual Answering, the benchmark highlights the need for models to remain adaptable and reliable in diverse, real-world situations....
Read More

NER Retriever: Zero-Shot Named Entity Retrieval with Type-Aware Embeddings
Published on 2025-09-04
#ML

The researchers develop NER Retriever, a method for retrieving documents that mention specific types of entities without relying on predefined entity types or fine-tuned models. Built on internal representations from large language models combined with a lightweight contrastive projection network, the method proves more effective than existing lexical and dense sentence-level retrieval approaches on entity retrieval tasks....
Read More
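A rough sketch of the two ingredients the summary names: span representations taken from an LLM's hidden states, passed through a small projection network trained with a contrastive (InfoNCE-style) loss so spans of the same entity type embed nearby. Dimensions and loss details are assumptions, not the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TypeProjection(nn.Module):
    """Lightweight projection from LLM hidden states (e.g. of an entity span)
    into a small, type-aware embedding space."""
    def __init__(self, llm_dim=4096, out_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(llm_dim, 512), nn.ReLU(), nn.Linear(512, out_dim)
        )

    def forward(self, span_states):
        return F.normalize(self.net(span_states), dim=-1)

def contrastive_loss(anchor, positive, negatives, temperature=0.07):
    """InfoNCE-style loss: pull same-type span embeddings together,
    push other-type embeddings apart. anchor/positive: [B, D]; negatives: [N, D]."""
    pos = (anchor * positive).sum(-1, keepdim=True)   # [B, 1]
    neg = anchor @ negatives.T                        # [B, N]
    logits = torch.cat([pos, neg], dim=-1) / temperature
    # The positive sits at index 0 of each row's logits.
    return F.cross_entropy(logits, torch.zeros(len(anchor), dtype=torch.long))
```

Retrieval would then reduce to nearest-neighbor search between the projected embedding of a type description and the projected span embeddings of the corpus.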

Towards a Unified View of Large Language Model Post-Training
Published on 2025-09-04
#ML

This study presents a unified framework that combines the two main approaches to improving language models after pre-training: reinforcement learning and supervised fine-tuning. The framework, called Hybrid Post-Training, dynamically chooses the better training signal for each case, leading to stronger performance across model sizes and families on various benchmarks....
Read More
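As a hypothetical illustration of "dynamically choosing the training signal" (the gating flag and the `sample_with_logprob` helper are invented for this sketch, not the paper's algorithm):

```python
import torch

def hybrid_step(model, batch, reward_fn, prefer_sft: bool):
    """One training step that switches between a supervised loss and a simple
    REINFORCE-style loss, depending on a per-batch gate."""
    if prefer_sft:
        # Supervised signal: cross-entropy against demonstration tokens
        # (Hugging Face-style causal LM loss via the `labels` argument;
        # assumes demo_ids aligns with input_ids).
        return model(input_ids=batch["input_ids"], labels=batch["demo_ids"]).loss
    # RL signal: score a sampled completion and weight its log-probability.
    sample_ids, logprob = model.sample_with_logprob(batch["input_ids"])  # hypothetical helper
    reward = reward_fn(batch["input_ids"], sample_ids)
    return -(reward * logprob).mean()
```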

Transition Models: Rethinking the Generative Learning Objective
Published on 2025-09-04
#ML

This study presents Transition Models (TiM), a generative model that overcomes the trade-off between computational cost and output quality in generative modeling. With only 865M parameters, TiM achieves state-of-the-art performance, outperforming much larger models such as SD3.5 and FLUX.1, and, unlike previous few-step generators, its quality keeps improving as the number of generation steps grows....
Read More

Tags are generated with Google's Gemini Pro API; summaries and translations are generated with Upstage's SOLAR mini chat model, which is derived from the SOLAR-10.7B open LLM.
(Experimental) The full papers are translated into Korean with the enko-t5-small-v0 model developed by Kim Kihyun.
Visit the developer's social media