Daily TMLR digest for Mar 21, 2025


TMLR

Mar 21, 2025, 12:06:06 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Building Blocks for Robust and Effective Semi-Supervised Real-World Object Detection

Authors: Moussa Kassem Sbeyti, Nadja Klein, Azarm Nowzad, Fikret Sivrikaya, Sahin Albayrak

Abstract: Semi-supervised object detection (SSOD) based on pseudo-labeling significantly reduces dependence on large labeled datasets by effectively leveraging both labeled and unlabeled data. However, real-world applications of SSOD often face critical challenges, including class imbalance, label noise, and labeling errors. We present an in-depth analysis of SSOD under real-world conditions, uncovering causes of suboptimal pseudo-labeling and key trade-offs between label quality and quantity. Based on our findings, we propose four building blocks that can be seamlessly integrated into an SSOD framework. Rare Class Collage (RCC): a data augmentation method that enhances the representation of rare classes by creating collages of rare objects. Rare Class Focus (RCF): a stratified batch sampling strategy that ensures a more balanced representation of all classes during training. Ground Truth Label Correction (GLC): a label refinement method that identifies and corrects false, missing, and noisy ground truth labels by leveraging the consistency of teacher model predictions. Pseudo-Label Selection (PLS): a selection method for removing low-quality pseudo-labeled images, guided by a novel metric estimating the missing detection rate while accounting for class rarity. We validate our methods through comprehensive experiments on autonomous driving datasets, resulting in up to a 6% increase in SSOD performance. Overall, our investigation and novel, data-centric, and broadly applicable building blocks enable robust and effective SSOD in complex, real-world scenarios. Code is available at https://mos-ks.github.io/publications.
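The RCC idea above lends itself to a compact illustration. The sketch below tiles randomly chosen rare-class crops into one training image; the function name, grid layout, and pixel representation are illustrative assumptions, not the authors' implementation:

```python
import random

def rare_class_collage(crops, grid=(2, 2), tile_size=4):
    """Assemble a collage of rare-class object crops (a hypothetical
    simplification of RCC). Each crop is a tile_size x tile_size grid of
    pixel values; the collage tiles grid[0] x grid[1] randomly chosen
    crops into a single augmented training image."""
    rows, cols = grid
    canvas = [[0] * (cols * tile_size) for _ in range(rows * tile_size)]
    for r in range(rows):
        for c in range(cols):
            crop = random.choice(crops)  # sample a rare-class crop
            for i in range(tile_size):
                for j in range(tile_size):
                    canvas[r * tile_size + i][c * tile_size + j] = crop[i][j]
    return canvas

# three dummy 4x4 crops, one per rare class
crops = [[[k] * 4 for _ in range(4)] for k in (1, 2, 3)]
collage = rare_class_collage(crops, grid=(2, 2), tile_size=4)
```

In a real pipeline the crops would carry bounding boxes that are re-offset into collage coordinates; that bookkeeping is omitted here.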

URL: https://openreview.net/forum?id=vRYt8QLKqK

---

Title: Compositionality in Time Series: A Proof of Concept using Symbolic Dynamics and Compositional Data Augmentation

Authors: Michael Hagmann, Michael Staniek, Stefan Riezler

Abstract: This work investigates whether time series of natural phenomena can be understood as being generated by sequences of latent states which are ordered in systematic and regular ways. We focus on clinical time series and ask whether clinical measurements can be interpreted as being generated by meaningful physiological states whose succession follows systematic principles. Uncovering the underlying compositional structure will allow us to create synthetic data to alleviate the notorious problem of sparse and low-resource data settings in clinical time series forecasting, and deepen our understanding of clinical data.
We start by conceptualizing compositionality for time series as a property of the data generation process, and then study data-driven procedures that can reconstruct the elementary states and composition rules of this process.
We evaluate the success of these methods using two empirical tests originating from a domain adaptation perspective.
Both tests infer the similarity of the original time series distribution and the synthetic time series distribution from the similarity of expected risk of time series forecasting models trained and tested on original and synthesized data in specific ways.
Our experimental results show that the test set performance achieved by training on compositionally synthesized data is comparable to training on original clinical time series data, and that evaluation of models on compositionally synthesized test data shows similar results to evaluating on original test data.
In both experiments, performance based on compositionally synthesized data by far surpasses that based on synthetic data that were created by randomization-based data augmentation.
An additional downstream evaluation of the prediction task of sequential organ failure assessment (SOFA) scores shows significant performance gains when model training is entirely based on compositionally synthesized data compared to training on original data, with improvements increasing with the size of the synthesized training set.
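As a toy illustration of the symbolic-dynamics viewpoint, the sketch below discretizes a real-valued series into a small symbol alphabet via quantile binning; this binning rule is an illustrative stand-in for the latent physiological states the paper actually infers:

```python
def symbolize(series, n_symbols=4):
    """Map a real-valued time series to discrete symbols via quantile
    binning -- a minimal stand-in for a latent-state alphabet. Bin edges
    are placed at the empirical quantiles of the series itself."""
    sorted_vals = sorted(series)
    edges = [sorted_vals[int(len(series) * k / n_symbols)]
             for k in range(1, n_symbols)]

    def symbol(x):
        # count how many quantile edges x meets or exceeds
        s = 0
        for e in edges:
            if x >= e:
                s += 1
        return s

    return [symbol(x) for x in series]

symbols = symbolize([0.1, 0.9, 0.2, 0.8, 0.4, 0.6, 0.3, 0.7], n_symbols=2)
```

Once a series is symbolized, compositional augmentation amounts to recombining symbol subsequences according to the learned composition rules.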

URL: https://openreview.net/forum?id=msI02LXVJX

---

Title: Understanding and Robustifying Sub-domain Alignment for Domain Adaptation

Authors: Yiling Liu, Juncheng Dong, Ziyang Jiang, Ahmed Aloui, Keyu Li, Michael Hunter Klein, Vahid Tarokh, David Carlson

Abstract: In unsupervised domain adaptation (UDA), aligning source and target domains improves the predictive performance of learned models on the target domain. A common methodological improvement in alignment methods is to divide the domains and align sub-domains instead. These sub-domain-based algorithms have demonstrated great empirical success but lack theoretical support. In this work, we establish a rigorous theoretical understanding of the advantages of these methods that have the potential to enhance their overall impact on the field. Our theory uncovers that sub-domain-based methods optimize an error bound that is at least as strong as non-sub-domain-based error bounds and is empirically verified to be much stronger. Furthermore, our analysis indicates that when the marginal weights of sub-domains shift between source and target tasks, the performance of these methods may be compromised. We therefore implement an algorithm to robustify sub-domain alignment for domain adaptation under sub-domain shift, offering a valuable adaptation strategy for future sub-domain-based methods. Empirical experiments across various benchmarks validate our theoretical insights, prove the necessity for the proposed adaptation strategy, and demonstrate the algorithm's competitiveness in handling label shift.
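A minimal sketch of the sub-domain (here: class-conditional) alignment idea, matching per-class feature means between source and pseudo-labeled target. This is a simplified illustration, not the paper's algorithm or bound:

```python
def subdomain_alignment_loss(src_feats, src_labels, tgt_feats, tgt_pseudo):
    """Toy class-conditional alignment loss: sum of squared distances
    between per-class feature means of the source and target
    sub-domains. Target labels are assumed to be pseudo-labels."""
    def class_means(feats, labels):
        sums, counts = {}, {}
        for f, y in zip(feats, labels):
            acc = sums.setdefault(y, [0.0] * len(f))
            for d, v in enumerate(f):
                acc[d] += v
            counts[y] = counts.get(y, 0) + 1
        return {y: [v / counts[y] for v in s] for y, s in sums.items()}

    ms = class_means(src_feats, src_labels)
    mt = class_means(tgt_feats, tgt_pseudo)
    loss = 0.0
    for y in set(ms) & set(mt):  # only sub-domains present on both sides
        loss += sum((a - b) ** 2 for a, b in zip(ms[y], mt[y]))
    return loss

src = [[1.0, 0.0], [0.0, 1.0]]
loss_same = subdomain_alignment_loss(src, [0, 1], src, [0, 1])
loss_shift = subdomain_alignment_loss(src, [0, 1], [[2.0, 0.0], [0.0, 1.0]], [0, 1])
```

Note how this per-class loss is where the sub-domain weight shift the paper analyzes would enter: if class proportions differ between domains, the per-class terms no longer reflect the target risk faithfully.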

URL: https://openreview.net/forum?id=oAzu0gzUUb

---

Title: SAFE-NID: Self-Attention with Normalizing-Flow Encodings for Network Intrusion Detection

Authors: Brian Matejek, Ashish Gehani, Nathaniel D. Bastian, Daniel J Clouse, Bradford J Kline, Susmit Jha

Abstract: Machine learning models are increasingly adopted to monitor network traffic and detect intrusions. In this work, we introduce SAFE-NID, a novel machine learning approach for real-time packet-level traffic monitoring and intrusion detection that includes a safeguard to detect zero day attacks as out-of-distribution inputs. Unlike traditional models, which falter against zero-day attacks and concept drift, SAFE-NID leverages a lightweight encoder-only transformer architecture combined with a novel normalizing flows-based safeguard. This safeguard not only quantifies uncertainty but also identifies out-of-distribution (OOD) inputs, enabling robust performance in dynamic threat landscapes. Our generative model learns class-conditional representations of the internal features of the deep neural network. We demonstrate the effectiveness of our approach by converting publicly available network flow-level intrusion datasets into packet-level ones. We release the labeled packet-level versions of these datasets with over 50 million packets each and describe the challenges in creating these datasets. We withhold from the training data certain attack categories to simulate zero-day attacks. Existing deep learning models, which achieve an accuracy of over 99% when detecting known attacks, only correctly classify 1% of the novel attacks. Our proposed transformer architecture with normalizing flows model safeguard achieves an area under the receiver operating characteristic curve of over 0.97 in detecting these novel inputs, outperforming existing combinations of neural architectures and model safeguards. The additional latency in processing each packet by the safeguard is a small fraction of the overall inference task. This dramatic improvement in detecting zero-day attacks and distribution shifts emphasizes SAFE-NID’s novelty and utility as a reliable and efficient safety monitoring tool for real-world network intrusion detection.
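The safeguard idea can be sketched with a much simpler density model: fit a diagonal Gaussian to internal features and flag low-likelihood inputs as OOD. The Gaussian here is an illustrative stand-in for the paper's normalizing flow, which is strictly more expressive:

```python
import math

def fit_gaussian(feats):
    """Fit a diagonal Gaussian to penultimate-layer features
    (illustrative stand-in for the normalizing-flow safeguard)."""
    n, d = len(feats), len(feats[0])
    mean = [sum(f[j] for f in feats) / n for j in range(d)]
    var = [max(sum((f[j] - mean[j]) ** 2 for f in feats) / n, 1e-6)
           for j in range(d)]
    return mean, var

def log_density(x, mean, var):
    """Diagonal-Gaussian log density of a feature vector."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def is_ood(x, mean, var, threshold):
    """Flag inputs whose feature density falls below a threshold,
    e.g. a quantile of in-distribution training densities."""
    return log_density(x, mean, var) < threshold

feats = [[0.0, 0.0], [0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]]
mean, var = fit_gaussian(feats)
```

In SAFE-NID the density model is class-conditional and operates on the transformer's internal features, so zero-day attacks surface as low-density inputs under every known class.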

URL: https://openreview.net/forum?id=hDywd5AbIM

---

Title: A Unified View of Double-Weighting for Marginal Distribution Shift

Authors: José I. Segovia-Martín, Santiago Mazuelas, Anqi Liu

Abstract: Supervised classification traditionally assumes that training and testing samples are drawn from the same underlying distribution. However, practical scenarios are often affected by distribution shifts, such as covariate and label shifts. Most existing techniques for correcting distribution shifts are based on a reweighted approach that weights training samples, assigning lower relevance to the samples that are unlikely at testing. However, these methods may achieve poor performance when the weights obtained take large values at certain training samples. In addition, in multi-source cases, existing methods do not exploit complementary information among sources, and equally combine sources for all instances. In this paper, we establish a unified learning framework for distribution shift adaptation. We present a double-weighting approach to deal with distribution shifts, considering weight functions associated with both training and testing samples. For the multi-source case, the presented methods assign source-dependent weights for training and testing samples, where weights are obtained jointly using information from all sources. We also present generalization bounds for the proposed methods that show a significant increase in the effective sample size compared with existing approaches. Empirically, the proposed methods achieve enhanced classification performance in both synthetic and empirical experiments.

URL: https://openreview.net/forum?id=aPyJilTiIb

---


New submissions
===============


Title: Labeling without Seeing? Blind Annotation for Privacy-Preserving Entity Resolution

Abstract: The entity resolution problem requires finding pairs across datasets that belong to different owners but refer to the same entity in the real world. To train and evaluate solutions (either rule-based or machine-learning-based) to the entity resolution problem, generating a ground truth dataset with entity pairs or clusters is needed. However, such a data annotation process involves humans as domain oracles to review the plaintext data for all candidate record pairs from different parties, which inevitably infringes the privacy of data owners, especially in privacy-sensitive cases like medical records. To the best of our knowledge, there is no prior work on privacy-preserving ground truth labeling in the context of entity resolution. We propose a novel blind annotation protocol based on homomorphic encryption that allows domain oracles to collaboratively label ground truth without sharing data in plaintext with other parties. In addition, we design a domain-specific, user-friendly language that conceals the complex underlying homomorphic encryption circuits, making it more accessible and easier for users to adopt this technique. The empirical experiments indicate the feasibility of our privacy-preserving protocol (the f-measure achieves more than 90% on average compared with the real ground truth).

URL: https://openreview.net/forum?id=bAM8y3Hm0p

---

Title: Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization

Abstract: In this paper, we design two compressed decentralized algorithms for solving nonconvex stochastic optimization under two different scenarios. Both algorithms adopt a momentum technique to achieve fast convergence and a message-compression technique to save communication costs. Though momentum acceleration and compressed communication have been used in literature, it is highly nontrivial to theoretically prove the effectiveness of their composition in a decentralized algorithm that can maintain the benefits of both sides, because of the need to simultaneously control the consensus error, the compression error, and the bias from the momentum gradient.

For the scenario where gradients are bounded, our proposal is a compressed decentralized adaptive method. To the best of our knowledge, this is the first decentralized adaptive stochastic gradient method with compressed communication. For the scenario of data heterogeneity without bounded gradients, our proposal is a compressed decentralized heavy-ball method, which applies a gradient tracking technique to address the challenge of data heterogeneity. Notably, both methods achieve an optimal convergence rate, and they can achieve linear speedup and adopt topology-independent algorithmic parameters within a certain regime of the user-specified error tolerance. Superior empirical performance is observed over state-of-the-art methods on training deep neural networks (DNNs) and Transformers.
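Message compression of the kind composed with momentum here is typically a sparsifying operator; the top-k sketch below is a standard example (an illustrative assumption, since the abstract does not name the specific operator used):

```python
def top_k_compress(vec, k):
    """Top-k sparsifier: keep the k largest-magnitude coordinates of a
    gradient message and zero out the rest, so each round transmits only
    k (index, value) pairs instead of the full vector."""
    idx = sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k]
    out = [0.0] * len(vec)
    for i in idx:
        out[i] = vec[i]
    return out

compressed = top_k_compress([0.1, -3.0, 2.0, 0.5], 2)
```

The compression error `vec - top_k_compress(vec, k)` is exactly the quantity the paper's analysis must control jointly with the consensus error and the momentum bias.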

URL: https://openreview.net/forum?id=RqhMQHHkB4

---

Title: On Efficient Bayesian Exploration in Model-Based Reinforcement Learning

Abstract: In this work, we address the challenge of data-efficient exploration in reinforcement learning by developing a principled, information-theoretic approach to intrinsic motivation. Specifically, we introduce a novel class of exploration bonuses that targets epistemic uncertainty rather than the aleatoric noise inherent in the environment. We prove that these bonuses naturally signal epistemic information gains and converge to zero once the agent becomes sufficiently certain about the environment’s dynamics and rewards, thereby aligning exploration with genuine knowledge gaps. To enable practical use, we also discuss tractable approximations via sparse variational Gaussian Processes, Deep Kernels and Deep Ensemble models. We then propose a Predictive Trajectory Sampling with Bayesian Exploration (PTS-BE) algorithm, which combines model-based planning with our proposed information-theoretic bonuses to achieve sample-efficient deep exploration. Empirically, we demonstrate that PTS-BE substantially outperforms other baselines across a variety of environments characterized by sparse rewards and/or purely exploratory tasks.
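For the deep-ensemble approximation mentioned above, a common tractable proxy for epistemic uncertainty is ensemble disagreement, which shrinks to zero as members converge to the same prediction; a minimal sketch (illustrative, not the paper's exact information-theoretic bonus):

```python
def epistemic_bonus(ensemble_preds):
    """Exploration bonus from ensemble disagreement: the variance across
    ensemble members' predictions for the same state-action pair. It
    targets epistemic uncertainty (model disagreement) rather than
    aleatoric noise, and vanishes once the members agree."""
    n = len(ensemble_preds)
    mean = sum(ensemble_preds) / n
    return sum((p - mean) ** 2 for p in ensemble_preds) / n
```

In a planner like PTS-BE this bonus would be added to the predicted reward of each sampled trajectory, steering rollouts toward regions where the dynamics models still disagree.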

URL: https://openreview.net/forum?id=Na02hDWqkF

---

Title: A Survey on Verifiable Cross-Silo Federated Learning

Abstract: Federated Learning (FL) is a widespread approach that allows training machine learning (ML) models with data distributed across multiple devices. In cross-silo FL, which often appears in domains like healthcare or finance, the number of participants is moderate, and each party typically represents a well-known organization. For instance, in medicine, data owners are often hospitals or data hubs, which are well-established entities. However, malicious parties may still attempt to disturb the training procedure in order to obtain certain benefits, for example, a biased result or a reduction in computational load. While one can easily detect a malicious agent when the data used for training is public, the problem becomes much more acute when it is necessary to maintain the privacy of the training dataset. To address this issue, there has recently been growing interest in developing verifiable protocols, where one can check that parties do not deviate from the training procedure and perform computations correctly. In this paper, we present a survey on verifiable cross-silo FL. We analyze various protocols, fit them into a taxonomy, and compare their efficiency and threat models. We also analyze Zero-Knowledge Proof (ZKP) schemes and discuss how their overall cost in an FL context can be minimized. Lastly, we identify research gaps and discuss potential directions for future scientific work.

URL: https://openreview.net/forum?id=uMir8UIHST

---

Title: Open Problems in Mechanistic Interpretability

Abstract: Mechanistic interpretability aims to understand the computational mechanisms underlying neural networks' capabilities in order to accomplish concrete scientific and engineering goals. Progress in this field thus promises to provide greater assurance over AI system behavior and shed light on exciting scientific questions about the nature of intelligence. Despite recent progress toward these goals, there are many open problems in the field that require solutions before many scientific and practical benefits can be realized: Our methods require both conceptual and practical improvements to reveal deeper insights; we must figure out how best to apply our methods in pursuit of specific goals; and the field must grapple with socio-technical challenges that influence and are influenced by our work. This forward-facing review discusses the current frontier of mechanistic interpretability and the open problems that the field may benefit from prioritizing.

URL: https://openreview.net/forum?id=91H76m9Z94

---

Title: A Survey of State Representation Learning for Deep Reinforcement Learning

Abstract: Representation learning methods are an important tool for addressing the challenges posed by complex observation spaces in sequential decision-making problems. Recently, many methods have used a wide variety of approaches for learning meaningful state representations in reinforcement learning, allowing better sample efficiency, generalization, and performance. This survey aims to provide a broad categorization of these methods within a model-free online setting, exploring how they tackle the learning of state representations differently. We categorize the methods into six main classes, detailing their mechanisms, benefits, and limitations. Through this taxonomy, our aim is to enhance the understanding of this field and provide a guide for new researchers. We also discuss techniques for assessing the quality of representations, and detail relevant future directions.

URL: https://openreview.net/forum?id=gOk34vUHtz

---

Title: Do Concept Bottleneck Models Respect Localities?

Abstract: Concept-based explainability methods use human-understandable intermediaries to produce explanations for machine learning models. These methods assume concept predictions can help understand a model's internal reasoning. In this work, we assess the degree to which such an assumption is true by analyzing whether concept predictors leverage "relevant" features to make predictions, a term we call locality. Concept-based models that fail to respect localities also fail to be explainable because concept predictions are based on spurious features, making the interpretation of the concept predictions vacuous. To assess whether concept-based models respect localities, we construct and use three metrics to characterize when models respect localities, complementing our analysis with theoretical results. Many concept-based models used in practice fail to respect localities because concept predictors cannot always clearly distinguish distinct concepts. Based on these findings, we propose suggestions for alleviating this issue.

URL: https://openreview.net/forum?id=4mCkRbUXOf

---

Title: FB-MOAC: A Reinforcement Learning Algorithm for Forward-Backward Markov Decision Processes

Abstract: Reinforcement learning (RL) algorithms are effective in solving problems that can be modeled as Markov decision processes (MDPs). They primarily target forward MDPs, whose dynamics evolve over time from an initial state. However, several important problems in stochastic control and network systems, among others, exhibit both forward and backward dynamics. As a consequence, they cannot be expressed as a standard MDP, calling for a novel theory of RL in this context. Accordingly, this work introduces the concept of Forward-Backward Markov Decision Processes (FB-MDPs) for multi-objective problems and develops a novel theoretical framework to characterize their optimal solutions. Moreover, it introduces the FB-MOAC algorithm, which employs a step-wise forward-backward mechanism to obtain optimal policies with guaranteed convergence and a rate competitive with standard RL approaches. FB-MOAC is evaluated on three use cases in mathematical finance, mobile resource management, and edge computing. The obtained results show that FB-MOAC outperforms the state of the art across different metrics, highlighting its ability to learn and maximize rewards.

URL: https://openreview.net/forum?id=li5DyC6rfS

---

Title: Associative memory inspires improvements for in-context learning using a novel attention residual stream architecture

Abstract: Large language models (LLMs) demonstrate an impressive ability to utilise information within the context of their input sequences to appropriately respond to data unseen by the LLM during its training procedure. This ability is known as in-context learning (ICL). Humans and non-human animals demonstrate similar abilities, however their neural architectures differ substantially from LLMs. Despite this, a critical component within LLMs, the attention mechanism, resembles modern associative memory models, widely used in and influenced by the computational neuroscience community to model biological memory systems. Using this connection, we introduce an associative memory model capable of performing ICL. We use this as inspiration for a novel residual stream architecture which allows information to directly flow between attention heads. We test this architecture during training within a two-layer Transformer and show its ICL abilities manifest more quickly than without this modification. We then apply our architecture in small language models with 8 million parameters, focusing on attention head values, with results also indicating improved ICL performance at this larger and more naturalistic scale.

URL: https://openreview.net/forum?id=lcTFm4LIRR

---

Title: ModernTCN Revisited: A Reproducibility Study with Extended Benchmarks

Abstract: This study presents a reproducibility analysis of ModernTCN, a recently proposed convolutional architecture for time series analysis. ModernTCN aims to address the limitations of traditional Temporal Convolutional Networks (TCNs) by enhancing the effective receptive field (ERF) and capturing long-range dependencies. We validate the experimental setup and performance claims of the original paper, and extend the evaluation to include additional datasets and tasks, such as short-term forecasting on ETT, classification on Speech Commands and PhysioNet, and ablation studies on the cross-variable component. Our results show that while ModernTCN achieves competitive performance, its state-of-the-art claims are tempered by sensitivity to experimental settings and data handling. Furthermore, ModernTCN's performance on Speech Commands lags behind convolutional methods with global receptive fields, and it exhibits less parameter efficiency. However, ablation studies on the PhysioNet dataset confirm the importance of the cross-variable component in handling missing data. This study provides a comprehensive evaluation of ModernTCN's contributions, reproducibility, and generalizability in time series analysis.

URL: https://openreview.net/forum?id=R20kKdWmVZ

---

Title: Continuous Tensor Relaxation for Finding Diverse Solutions in Combinatorial Optimization Problems

Abstract: Finding the optimal solution is often the primary goal in combinatorial optimization (CO). However, real-world applications frequently require diverse solutions rather than a single optimum, particularly in two key scenarios. First, when directly handling constraints is challenging, penalties are incorporated into the cost function, reformulating the problem as an unconstrained CO problem. Tuning these penalties to obtain a desirable solution is often time-consuming. Second, the optimal solution may lack practical relevance when the cost function or constraints only approximate a more complex real-world problem. To address these challenges, generating (i) penalty-diversified solutions by varying penalty intensities and (ii) variation-diversified solutions with distinct structural characteristics provides valuable insights, enabling practitioners to post-select the most suitable solution for their specific needs. However, efficiently discovering these diverse solutions is more challenging than finding a single optimal one. This study introduces Continual Tensor Relaxation Annealing (CTRA), a computationally efficient framework for unsupervised-learning (UL)-based CO solvers that generates diverse solutions within a single training run. CTRA leverages representation learning and parallelization to automatically discover shared representations, substantially accelerating the search for these diverse solutions. Numerical experiments demonstrate that CTRA outperforms existing UL-based solvers in generating these diverse solutions while significantly reducing computational costs.
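Penalty-diversified solutions can be illustrated on a one-dimensional relaxed problem: sweep the penalty intensity and collect the resulting minimizers. This toy uses plain gradient descent on a single scalar rather than the paper's tensor relaxation and parallelized training, so it only conveys the penalty-sweep idea:

```python
def penalty_diversified(lambdas, steps=200, lr=0.1):
    """Generate penalty-diversified solutions by sweeping penalty
    intensities. Toy relaxed problem: maximize x subject to x <= 1,
    written as minimizing f(x) = -x + lam * max(0, x - 1)**2.
    Small lam favors cost; large lam favors feasibility."""
    sols = []
    for lam in lambdas:
        x = 0.0
        for _ in range(steps):
            # gradient of -x plus gradient of the quadratic penalty
            grad = -1.0 + lam * 2.0 * max(0.0, x - 1.0)
            x -= lr * grad
        sols.append(x)
    return sols

sols = penalty_diversified([0.0, 5.0])
```

CTRA's contribution is doing this kind of sweep (and its variation-diversified analogue) inside one training run via a shared learned representation, instead of one independent run per penalty value as sketched here.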

URL: https://openreview.net/forum?id=ix33zd5zCw

---

Title: Node Duplication Improves Cold-start Link Prediction

Abstract: Graph Neural Networks (GNNs) are prominent in graph machine learning and have shown state-of-the-art performance in Link Prediction (LP) tasks. Nonetheless, recent studies show that GNNs struggle to produce good results on low-degree nodes despite their overall strong performance. In practical applications of LP, like recommendation systems, improving performance on low-degree nodes is critical, as it amounts to tackling the cold-start problem of improving the experiences of users with few observed interactions. In this paper, we investigate improving GNNs' LP performance on low-degree nodes while preserving their performance on high-degree nodes and propose a simple yet surprisingly effective augmentation technique called NodeDup. Specifically, NodeDup duplicates low-degree nodes and creates links between nodes and their own duplicates before following the standard supervised LP training scheme. By leveraging a "multi-view" perspective for low-degree nodes, NodeDup shows significant LP performance improvements on low-degree nodes without compromising any performance on high-degree nodes. Additionally, as a plug-and-play augmentation module, NodeDup can be easily applied to existing GNNs with very light computational cost. Extensive experiments show that NodeDup achieves 38.49%, 13.34%, and 6.76% relative improvements on isolated, low-degree, and warm nodes, respectively, on average across all datasets compared to GNNs and the existing cold-start methods.
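NodeDup as described is simple enough to sketch directly on an edge list: duplicate each low-degree node and link it to its duplicate; the degree threshold and id scheme here are illustrative choices, not the paper's exact settings:

```python
def node_dup(edges, num_nodes, degree_threshold=1):
    """NodeDup augmentation as described in the abstract: for every node
    whose degree is at or below the threshold, create a duplicate node
    and add an edge between the node and its duplicate. Returns the
    augmented edge list and the new total node count."""
    degree = [0] * num_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    new_edges = list(edges)
    next_id = num_nodes  # duplicates get fresh ids after the originals
    for u in range(num_nodes):
        if degree[u] <= degree_threshold:
            new_edges.append((u, next_id))  # link node to its duplicate
            next_id += 1
    return new_edges, next_id

aug_edges, total_nodes = node_dup([(0, 1), (1, 2)], num_nodes=3)
```

In a GNN pipeline each duplicate would also copy its original's feature vector, so the added self-links give low-degree nodes an extra aggregation view without touching high-degree nodes.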

URL: https://openreview.net/forum?id=hIOTzz87N9

---

Title: Text-to-Image Generation Via Energy-Based CLIP

Abstract: Joint Energy Models (JEMs), while drawing significant research attention, have not been successfully scaled to real-world, high-resolution datasets. We present CLIP-JEM, a novel approach extending JEMs to the multimodal vision-language domain using CLIP, integrating both generative and discriminative objectives. For the generative one, we introduce an image-text joint-energy function based on Cosine similarity in the CLIP space, training CLIP to assign low energy to real image-caption pairs and high energy otherwise. For the discriminative one, we employ contrastive adversarial loss, extending the adversarial training objective to the multimodal domain. CLIP-JEM not only generates realistic images from text but also achieves competitive results on the compositionality benchmark, outperforming leading methods with fewer parameters. Additionally, we demonstrate the superior guidance capability of CLIP-JEM by enhancing CLIP-based generative frameworks and converting unconditional diffusion models to text-based ones. Lastly, we show that our model can serve as a more robust evaluation metric for text-to-image generative tasks than CLIP.
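The image-text energy described above can be sketched as negative cosine similarity in a shared embedding space: real pairs get low energy, mismatched pairs high energy. The plain-list embeddings below stand in for CLIP encoder outputs:

```python
import math

def clip_energy(img_emb, txt_emb):
    """Joint energy of an image-text pair as negative cosine similarity,
    mirroring the abstract's description: well-matched pairs (cosine
    near 1) receive low energy, mismatched pairs high energy."""
    dot = sum(a * b for a, b in zip(img_emb, txt_emb))
    norm_img = math.sqrt(sum(a * a for a in img_emb))
    norm_txt = math.sqrt(sum(b * b for b in txt_emb))
    return -dot / (norm_img * norm_txt)
```

Generation then amounts to descending this energy with respect to the image (pixels or latents) for a fixed caption embedding, which is where the generative objective of CLIP-JEM comes in.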

URL: https://openreview.net/forum?id=FBmWiJXIGk

---

Title: Learning Federated Neural Graph Databases for Answering Complex Queries from Distributed Knowledge Graphs

Abstract: The increasing demand for deep learning-based foundation models has highlighted the importance of efficient data retrieval mechanisms. Neural graph databases (NGDBs) offer a compelling solution, leveraging neural spaces to store and query graph-structured data, thereby enabling LLMs to access precise, contextually relevant information. However, current NGDBs are constrained to single-graph operation, limiting their capacity to reason across multiple, distributed graphs. Furthermore, the lack of support for multi-source graph data in existing NGDBs hinders their ability to capture the complexity and diversity of real-world data. In many applications, data is distributed across multiple sources, and the ability to reason across these sources is crucial for making informed decisions. This limitation is particularly problematic when dealing with sensitive graph data, as directly sharing and aggregating such data poses significant privacy risks. As a result, many applications that rely on NGDBs are forced to choose between compromising data privacy or sacrificing the ability to reason across multiple graphs. To address these limitations, we propose to learn Federated Neural Graph DataBases (FedNGDBs), a pioneering systematic framework that empowers privacy-preserving reasoning over multi-source graph data. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities and improving the overall quality of the graph data. Unlike existing methods, FedNGDBs can handle complex graph structures and relationships, making it suitable for various downstream tasks. We evaluate FedNGDBs on three real-world datasets, demonstrating its effectiveness in retrieving relevant information from multi-source graph data while keeping sensitive information secure on local devices. 
Our results show that FedNGDBs can efficiently retrieve answers to cross-graph queries, making it a promising approach for LLMs and other applications that rely on efficient data retrieval mechanisms.

URL: https://openreview.net/forum?id=3K1LRetR6Y

---
