Daily TMLR digest for Apr 12, 2025

TMLR

Apr 12, 2025, 12:06:06 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Dynamic Pricing in the Linear Valuation Model using Shape Constraints

Authors: Daniele Bracale, Moulinath Banerjee, Yuekai Sun, Salam Turki, Kevin Stoll

Abstract: We propose a shape-constrained approach to dynamic pricing for censored data in the linear valuation model, eliminating the need for tuning parameters commonly required by existing methods. Previous works have addressed the challenge of unknown market noise distribution $F_0$ using strategies ranging from kernel methods to reinforcement learning algorithms, such as bandit techniques and upper confidence bounds (UCB), under the assumption that $F_0$ satisfies Lipschitz (or stronger) conditions. In contrast, our method relies on isotonic regression under the weaker assumption that $F_0$ is $\alpha$-H\"older continuous for some $\alpha \in (0,1]$, for which we derive a regret upper bound. Simulations and experiments with real-world data obtained by Welltower Inc (a major healthcare Real Estate Investment Trust) consistently demonstrate that our method attains lower empirical regret than several existing methods in the literature while offering the advantage of being tuning-parameter free.

URL: https://openreview.net/forum?id=uKZ0R4IQaO
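
Illustration (not from the paper): the abstract names isotonic regression as the tuning-free estimator of $F_0$. A minimal sketch of the standard pool-adjacent-violators algorithm is given below, assuming observations are already sorted by the point at which $F_0$ is evaluated and that each response is a 0/1 "no sale" indicator. The function name `pava` and the toy data are hypothetical; the paper's actual estimator, censoring scheme, and pricing policy may differ.

```python
def pava(y, weights=None):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators).

    y is assumed to be ordered by the covariate; weights default to 1.
    """
    if weights is None:
        weights = [1.0] * len(y)
    # Each block stores [weighted sum, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, weights):
        blocks.append([wi * yi, wi, 1])
        # Merge backwards while the last two block means violate monotonicity.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, w, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w
            blocks[-1][2] += c
    fitted = []
    for s, w, c in blocks:
        fitted.extend([s / w] * c)
    return fitted


# Hypothetical toy data: indicators sorted by increasing evaluation point,
# where 1 means the customer declined (valuation noise fell below the point).
no_sale = [0, 0, 1, 0, 1, 1]
F_hat = pava(no_sale)  # monotone step-function estimate of F_0 at those points
```

The fit needs no bandwidth or smoothing parameter, which is the sense in which a shape-constrained estimator is tuning-free.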

---

Title: Rank Suggestion in Non-negative Matrix Factorization: Residual Sensitivity to Initial Conditions (RSIC)

Authors: Marc A. Tunnell, Zachary DeBruine, Erin Carrier

Abstract: Determining the appropriate rank in Non-negative Matrix Factorization (NMF) is a critical challenge that often requires extensive parameter tuning and domain-specific knowledge. Traditional methods for rank determination focus on identifying a single optimal rank, which may not capture the complex structure inherent in real-world datasets. In this study, we introduce a novel approach called Residual Sensitivity to Initial Conditions (RSIC) that suggests potentially multiple ranks of interest by analyzing the sensitivity of the relative residuals (e.g., relative reconstruction error) to different initializations. By computing the Mean Coordinatewise Interquartile Range (MCI) of the residuals across multiple random initializations, our method identifies regions where the NMF solutions are less sensitive to initial conditions and potentially more meaningful. We evaluate RSIC on a diverse set of datasets, including single-cell gene expression data, image data, and text data, and compare it against current state-of-the-art rank determination methods. Our experiments demonstrate that RSIC effectively identifies relevant ranks consistent with the underlying structure of the data, outperforming traditional methods in scenarios where they are computationally infeasible or less accurate. This approach provides a more scalable and generalizable solution for rank determination in NMF that does not rely on domain-specific knowledge or assumptions.

URL: https://openreview.net/forum?id=9Xj5w4DX0t

---

Title: Downstream Task Guided Masking Learning in Masked Autoencoders Using Multi-Level Optimization

Authors: Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie

Abstract: Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning. It operates by randomly masking image patches and reconstructing these masked patches using the unmasked ones. A key limitation of MAE lies in its disregard for the varying informativeness of different patches, as it uniformly selects patches to mask. To overcome this, some approaches propose masking based on patch informativeness. However, these methods often do not consider the specific requirements of downstream tasks, potentially leading to suboptimal representations for these tasks. In response, we introduce the Multi-level Optimized Mask Autoencoder (MLO-MAE), a novel framework that leverages end-to-end feedback from downstream tasks to learn an optimal masking strategy during pretraining. Our experimental findings highlight MLO-MAE's significant advancements in visual representation learning. Compared to existing methods, it demonstrates remarkable improvements across diverse datasets and tasks, showcasing its adaptability and efficiency. Our code is available at https://github.com/Alexiland/MLO-MAE.

URL: https://openreview.net/forum?id=cFmmaxkD5A

---


New submissions
===============


Title: Rollout Total Correlation for Deep Reinforcement Learning

Abstract: Learning task-relevant representations is crucial for reinforcement learning. Recent approaches aim to learn such representations by improving the temporal consistency in the observed transitions. However, they only consider individual transitions and can fail to achieve long-term consistency. Instead, we argue that capturing aspects of the state that correlate with other states and actions of the trajectory---even more distant in the future---could further help in extracting task-relevant information. Hence, in this paper we investigate how to learn representations by maximizing the rollout total correlation, the correlation among all learned representations and actions within the trajectories produced by the agent. For improving rollout total correlation, we propose to combine two complementary lower bounds based on a generative and a discriminative model, combined with a simple and effective technique of chunk-wise mini-batching. Furthermore, we propose an intrinsic reward based on the learned representation for better exploration. Experimental evaluations on a set of challenging image-based simulated control tasks show that our method achieves better sample efficiency, and robustness to both white noise and natural video backgrounds compared to leading baselines.

URL: https://openreview.net/forum?id=qTdRJAL8Li

---

Title: Influential Bandits: Pulling an Arm May Change the Environment

Abstract: While classical formulations of multi-armed bandit problems assume that each arm's reward is independent and stationary, real-world applications often involve non-stationary environments and interdependencies between arms. In particular, selecting one arm may influence the future rewards of other arms, a scenario not adequately captured by existing models such as rotting bandits or restless bandits. To address this limitation, we propose the influential bandit problem, which models inter-arm interactions through an unknown, symmetric, positive semi-definite interaction matrix that governs the dynamics of arm losses. We formally define this problem and establish two regret lower bounds, including a superlinear $\Omega(T^2 / \log^2 T)$ bound for the standard UCB algorithm and an algorithm-independent $\Omega(T)$ bound, which highlight the inherent difficulty of the setting. We then introduce a new algorithm based on a lower confidence bound (LCB) estimator tailored to the structure of the loss dynamics. Under mild assumptions, our algorithm achieves a regret of $O(KT \log T)$, which is nearly optimal in terms of its dependence on the time horizon. The algorithm is simple to implement and computationally efficient. Empirical evaluations on both synthetic and real-world datasets demonstrate the presence of inter-arm influence and confirm the superior performance of our method compared to conventional bandit algorithms.

URL: https://openreview.net/forum?id=YNKaDfYbY3

---

Title: Collaboration with Dynamic Open Ad Hoc Team via Team State Modelling

Abstract: Open ad hoc teamwork presents the challenging problem of designing an autonomous agent that can rapidly adapt to collaborate with teammates without prior coordination in an open environment. Existing methods primarily rely on fixed, predefined teammate types, overlooking the fact that teammates may change dynamically. To address this limitation, we propose a novel reinforcement learning approach, the Open Online Teammate Adaptation Framework (Open-OTAF), which enables a controlled agent to collaborate with dynamic teammates in open ad hoc environments. To achieve this, the controlled agent employs a dual teamwork situation inference model to capture the current teamwork state, facilitating decision-making under partial observability. To handle the dynamic nature of teammate types, we first introduce a Chinese Restaurant Process-based model to categorize diverse teammate policies into distinct clusters, improving the efficiency of identifying teamwork situations. Next, to model heterogeneous agent relationships and accommodate a variable number of teammates, we represent the team as a heterogeneous graph and leverage heterogeneous graph attention neural networks to learn the representation of the teamwork situation. Extensive experiments across four challenging multi-agent benchmark tasks—Level-Based Foraging, Wolf-Pack, Cooperative Navigation, and FortAttack—demonstrate that our method successfully enables dynamic teamwork in open ad hoc settings. Open-OTAF outperforms state-of-the-art methods, achieving superior performance with faster convergence.

URL: https://openreview.net/forum?id=BukMU42P3G

---

Title: Counting Hours, Counting Losses: The Toll of Unpredictable Work Schedules on Financial Security

Abstract: Financial instability is a pressing concern in the United States, with drivers that include growing employment disparities and insufficient wages. While research typically focuses on financial aspects such as income inequality in precarious work environments, there is a tendency to overlook the time-related aspect of unstable work schedules. The inability to rely on a consistent work schedule not only leads to burnout and conflicts between work and family life but also results in financial shocks that directly impact workers' income and assets. Unforeseen fluctuations in earnings pose challenges in financial planning, affecting decisions regarding savings and spending, and ultimately undermining individuals' long-term financial stability and well-being. This issue is particularly evident in sectors where workers experience frequently changing schedules without sufficient notice. The lack of advance notice disproportionately affects vulnerable groups, including those in the food service and retail sectors, part-time and hourly workers, individuals with lower incomes and education levels, and specific racial groups. These groups are already more financially vulnerable, and the unpredictable nature of their work schedules exacerbates their financial fragility.

Our objective in this study is to understand how unforeseen fluctuations in earnings exacerbate financial fragility by investigating the extent to which individuals' financial management depends on their ability to anticipate and plan for future events. To address this question, we develop an online learning approach in which individuals adapt their consumption strategies over time in response to financial uncertainty and evolving information. This approach forms the basis of our simulation framework, which models how workers manage consumption in the face of variable work schedules and the imperative to avoid financial ruin.

With this framework, we demonstrate both theoretically and empirically how a worker's capacity to anticipate schedule changes enhances their long-term utility. Conversely, the inability to predict future events can worsen workers' financial instability. Moreover, our framework enables us to explore interventions aimed at mitigating the problem of schedule uncertainty and evaluate their effectiveness.

URL: https://openreview.net/forum?id=PEZz2i9kiP

---

Title: Synthesizing Minority Samples for Long-tailed Classification via Distribution Matching

Abstract: In many real-world applications, deep neural networks (DNNs) often perform poorly on datasets with long-tailed distributions. To address this issue, a promising approach is to propose an optimization objective to transform real majority samples into synthetic minority samples. However, this objective is designed only from the classification perspective. To this end, we propose a novel framework that synthesizes minority samples from the majority by considering both classification and distribution matching. Specifically, our method adjusts the distribution of synthetic minority samples to closely align with that of the true minority class, while enforcing the synthetic samples to learn more generalizable and discriminative features of the minority class. Experimental results on several standard benchmark datasets demonstrate the effectiveness of our method in both long-tailed classification and synthesizing high-quality synthetic minority samples.

URL: https://openreview.net/forum?id=VqLe8tPbZn

---

Title: Emergent Corpus Pre-training Benefits Vision Language Models

Abstract: Vision-Language Pre-trained Models (VL-PTMs) have achieved impressive performance across a wide range of tasks, but their success often hinges on access to large-scale multimodal datasets. While effective in high-resource settings, these models tend to struggle in data-scarce regimes. In this work, we investigate Emergent Communication (EC) as a mechanism to improve sample efficiency in VL-PTMs. We pre-train a Vision-Language Model (VLM) using EC tokens generated through a referential game between two artificial agents. Across three diverse cross-modal matching and reasoning benchmarks, EC pretraining yields substantial gains, improving Visual Referring Expression (VRE) accuracy by 108.6% and Visual Entailment (VE) by 69.6%. To further validate the effectiveness of EC pretraining, we introduce LLaVA-1.5-EC, a LLaVA variant trained entirely on EC tokens. LLaVA-1.5-EC outperforms strong LVLM baselines, including BLIP-2 (13B), achieving relative gains of 104.23% on VizWiz, 34.8% on GQA, and 10.8% on VQAv2, and top performance on MMBench, a challenging instruction-following benchmark. These results highlight the transferability and generalization capacity of EC pretraining and underscore the potential of leveraging grounded EC tokens to enhance vision-language reasoning in low-resource settings, especially in settings with limited natural language data. We discuss implications and propose avenues for future research to explore the connections between EC and VL for multimodal understanding and effective human-machine communication. Code and data are available at anonymized link.

URL: https://openreview.net/forum?id=bivKGSaXkD

---

Title: Stochastic Block Model-Aware Topological Neural Networks for Graph Link Prediction

Abstract: Link prediction is an important learning task for graph-structured data and is indispensable to understanding graphs' properties. Recent works focus on designing complicated graph neural network (GNN) architectures to explore and capture various pairwise interactions among graph nodes. Most GNNs are based on combining graph structural and node feature information by iterative message-passing schemes. However, despite GNNs revolutionizing the field of graph representation learning, some thorny questions are raised concerning whether GNNs can efficiently learn the edge probabilities based on topological structures (i.e., higher-order interactions) and node features, and provide statistically rigorous uncertainty estimates. In this paper, we tackle these challenges and propose a novel stochastic block model (SBM)-aware topological neural network, called SBM-TNN, that uses SBMs to infer the latent community structure of nodes from graph structures and uses persistent homology to encode higher-order information. Furthermore, we theoretically study the entrywise bound and asymptotic normality of the estimated edge probability matrix to quantify the uncertainty in statistical inference of the edge probabilities. Our extensive experiments for link prediction on both graphs and knowledge graphs show that SBM-TNN achieves state-of-the-art performance over a set of popular baseline methods.

URL: https://openreview.net/forum?id=FBjVSPAsgs

---

Title: A stochastic gradient descent algorithm with random search directions

Abstract: Stochastic coordinate descent algorithms are efficient methods in which each iterate is obtained by fixing most coordinates at their values from the current iteration, and approximately minimizing the objective with respect to the remaining coordinates. However, this approach is usually restricted to canonical basis vectors of $\mathbb{R}^d$. In this paper, we develop a new class of stochastic gradient descent algorithms with random search directions which uses the directional derivative of the gradient estimate following more general random vectors. We establish the almost sure convergence of these algorithms with decreasing step sizes. We further investigate their central limit theorem and pay particular attention to analyzing the impact of the search distributions on the asymptotic covariance matrix. We also provide non-asymptotic $\mathbb{L}^p$ rates of convergence.

URL: https://openreview.net/forum?id=npER8AaLSb
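
Illustration (not from the paper): one plausible reading of "directional derivative of the gradient estimate following more general random vectors" is the update $x_{n+1} = x_n - \gamma_n \langle g_n, u_n \rangle u_n$, where $g_n$ is a noisy gradient and $u_n$ is a random direction with $\mathbb{E}[u_n u_n^\top] = I$. The sketch below assumes standard Gaussian directions, a step size $\gamma_n = c/(n + n_0)$, and a noisy quadratic objective. All names and constants are hypothetical choices, and the authors' algorithm and assumptions may differ.

```python
import numpy as np


def sgd_random_directions(grad_estimate, x0, n_steps, rng, c=1.0, n0=10):
    """SGD that moves only along a freshly drawn random direction each step."""
    x = np.array(x0, dtype=float)
    for n in range(n_steps):
        g = grad_estimate(x, rng)              # noisy gradient at x
        u = rng.standard_normal(x.shape[0])    # Gaussian direction, E[u u^T] = I
        step = c / (n + n0)                    # decreasing step size
        x -= step * np.dot(u, g) * u           # directional derivative times direction
    return x


def make_noisy_quadratic_grad(b, noise=0.1):
    """Gradient of 0.5 * ||x - b||^2 plus Gaussian noise (toy objective)."""
    def grad(x, rng):
        return (x - b) + noise * rng.standard_normal(x.shape[0])
    return grad


b = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
rng = np.random.default_rng(0)
x_hat = sgd_random_directions(make_noisy_quadratic_grad(b), np.zeros(5), 20000, rng)
```

Because $\mathbb{E}[u u^\top] = I$, the update equals a plain SGD step in expectation. Drawing `u` instead as $\sqrt{d}$ times a uniformly chosen canonical basis vector recovers a randomized coordinate scheme, which is the restriction the abstract sets out to relax.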

---

Title: Regularization-based Framework for Quantization-, Fault- and Variability-Aware Training

Abstract: Efficient inference is critical for deploying deep learning models on edge AI devices. Low-bit quantization (e.g., 3- and 4-bit) with fixed-point arithmetic improves efficiency, while low-power memory technologies like analog nonvolatile memory enable further gains. However, these methods introduce non-ideal hardware behavior, including bit faults and device-to-device variability. We propose a regularization-based quantization-aware training (QAT) framework that supports fixed, learnable step-size, and learnable non-uniform quantization, achieving competitive results on CIFAR-10 and ImageNet. Our method also extends to Spiking Neural Networks (SNNs), demonstrating strong performance on 4-bit networks on CIFAR10-DVS and N-Caltech 101. Beyond quantization, our framework enables fault- and variability-aware fine-tuning, mitigating stuck-at faults (fixed weight bits) and device resistance variability. Compared to prior fault-aware training, our approach significantly improves performance recovery under up to 20% bit-fault rate and 40% device-to-device variability. Our results establish a generalizable framework for quantization and robustness-aware training, enhancing efficiency and reliability in low-power, non-ideal hardware.

URL: https://openreview.net/forum?id=6CRQbAH7by

---
