Daily TMLR digest for Jun 22, 2024

TMLR

Jun 22, 2024, 12:00:37 AM

to tmlr-anno...@googlegroups.com

Accepted papers
===============

Title: Learning Network Granger causality using Graph Prior Knowledge

Authors: Lucas Zoroddu, Pierre Humbert, Laurent Oudre

Abstract: Understanding the relationships among multiple entities through Granger causality graphs
within multivariate time series data is crucial across various domains, including economics,
finance, neurosciences, and genetics. Despite its broad utility, accurately estimating Granger
causality graphs in high-dimensional scenarios with few samples remains a persistent challenge.
In response, this study introduces a novel model that leverages prior knowledge in
the form of a noisy undirected graph to facilitate the learning of Granger causality graphs,
while assuming sparsity. We formulate an optimization problem, propose to solve it
with an alternating minimization approach, and prove the convergence of
our fitting algorithm, highlighting its effectiveness. Furthermore, we present experimental
results derived from both synthetic and real-world datasets. These results clearly illustrate
the advantages of our proposed method over existing alternatives, particularly in situations
where few samples are available. By incorporating prior knowledge and emphasizing sparsity,
our approach offers a promising solution to the complex problem of estimating Granger
causality graphs in high-dimensional, data-scarce environments.

URL: https://openreview.net/forum?id=DN6sut5fyR

---

Title: Best-of-Both-Worlds Linear Contextual Bandits

Authors: Masahiro Kato, Shinji Ito

Abstract: This study investigates the problem of $K$-armed linear contextual bandits, an instance of the multi-armed bandit problem, under adversarial corruption. At each round, a decision-maker observes an independent and identically distributed context and then selects an arm based on the context and past observations. After selecting an arm, the decision-maker incurs a loss corresponding to the selected arm. The decision-maker aims to minimize the cumulative loss over the trial. The goal of this study is to develop a strategy that is effective in both stochastic and adversarial environments, with theoretical guarantees. We first formulate the problem by introducing a novel setting of bandits with adversarial corruption, referred to as the contextual adversarial regime with a self-bounding constraint. We assume linear models for the relationship between the loss and the context. Then, we propose a strategy that extends the {\tt RealLinExp3} by \citet{Neu2020} and the Follow-The-Regularized-Leader (FTRL). The regret of our proposed algorithm is shown to be upper-bounded by $O\left(\min\left\{\frac{(\log(T))^3}{\Delta_{*}} + \sqrt{\frac{C(\log(T))^3}{\Delta_{*}}},\ \ \sqrt{T}(\log(T))^2\right\}\right)$, where $T \in\mathbb{N}$ is the number of rounds, $\Delta_{*} > 0$ is the constant minimum gap between the best and suboptimal arms for any context, and $C\in[0, T]$ is an adversarial corruption parameter. This regret upper bound implies a regret of $O\left(\frac{(\log(T))^3}{\Delta_{*}}\right)$ in a stochastic environment and $O\left( \sqrt{T}(\log(T))^2\right)$ in an adversarial environment. We refer to our strategy as the {\tt Best-of-Both-Worlds (BoBW) RealFTRL}, due to its theoretical guarantees in both stochastic and adversarial regimes.

URL: https://openreview.net/forum?id=aIG2RAtNuX
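
As a reading aid for the regret bound quoted in the abstract above, the two special cases can be read off the stated expression directly. This is only a sketch based on the formula as printed, not on the paper's proofs. The symbol $R_T$ for the regret is an assumed name, and identifying the stochastic environment with $C = 0$ is an assumption about how the corruption parameter is meant to be read.

```latex
\begin{align*}
R_T &= O\left(\min\left\{\frac{(\log T)^3}{\Delta_{*}}
        + \sqrt{\frac{C(\log T)^3}{\Delta_{*}}},\ \sqrt{T}(\log T)^2\right\}\right) \\[4pt]
% Assumed reading: no corruption (C = 0) corresponds to the stochastic environment.
C = 0:\quad
  & \frac{(\log T)^3}{\Delta_{*}} + \sqrt{\frac{0 \cdot (\log T)^3}{\Delta_{*}}}
    = \frac{(\log T)^3}{\Delta_{*}} \\[4pt]
% The minimum never exceeds its second argument, whatever the value of C.
\text{any } C \in [0, T]:\quad
  & \min\{a,\ b\} \le b = \sqrt{T}(\log T)^2
\end{align*}
```

The second line explains why the adversarial guarantee does not depend on $C$ or $\Delta_{*}$: it comes from the branch of the minimum that contains neither.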

---

Title: Spike Accumulation Forwarding for Effective Training of Spiking Neural Networks

Authors: Ryuji Saiin, Tomoya Shirakawa, Sota Yoshihara, Yoshihide Sawada, Hiroyuki Kusumoto

Abstract: In this article, we propose a new paradigm for training spiking neural networks (SNNs), spike accumulation forwarding (SAF). It is known that SNNs are energy-efficient but difficult to train. Consequently, many researchers have proposed various methods to solve this problem, among which online training through time (OTTT) is a method that allows inferring at each time step while suppressing the memory cost. However, to compute efficiently on GPUs, OTTT requires operations with spike trains and weighted summation of spike trains during forwarding. In addition, OTTT has shown a relationship with the Spike Representation, an alternative training method, though theoretical agreement with Spike Representation has yet to be proven. Our proposed method can solve these problems; namely, SAF can halve the number of operations during the forward process, and it can be theoretically proven that SAF is consistent with the Spike Representation and OTTT, respectively. Furthermore, we confirmed these properties experimentally and showed that memory use and training time can be reduced while maintaining accuracy.

URL: https://openreview.net/forum?id=RGQsUQDAd9

---

Title: Learning the essential in less than 2k additional weights - a simple approach to improve image classification stability under corruptions

Authors: Kai Bäuerle, Patrick Müller, Syed Muhammad Kazim, Ivo Ihrke, Margret Keuper

Abstract: The performance of image classification on well-known benchmarks such as ImageNet is remarkable, but in safety-critical situations, the accuracy often drops significantly under adverse conditions. To counteract these performance drops, we propose a very simple modification to the models: we prepend a single, dimension-preserving convolutional layer with a large linear kernel whose purpose is to extract the information that is essential for image classification. We show that our simple modification can increase the robustness against common corruptions significantly, especially for corruptions of high severity. We demonstrate the impact of our channel-specific layers on ImageNet-100 and ImageNette classification tasks and show an increase of up to 30% in top-1 accuracy on corrupted data. Further, we conduct a set of designed experiments to qualify the conditions for our findings. Our main result is that a data- and network-dependent linear subspace carries the most important classification information (the essential), which our proposed pre-processing layer approximately identifies for most corruptions, and at very low cost.

URL: https://openreview.net/forum?id=i2SuGWtIIm
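
To make the idea of a prepended, dimension-preserving, channel-specific linear layer concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names, the kernel size of 25, the three input channels, and the identity initialisation are all illustrative assumptions, chosen so that the weight count stays under 2,000. The abstract does not state the paper's actual values.

```python
import numpy as np


def make_identity_kernels(channels: int, k: int) -> np.ndarray:
    """One k x k kernel per channel, initialised so the layer starts as the identity.

    Hypothetical helper: the initialisation is an assumption, not taken from the paper.
    """
    kernels = np.zeros((channels, k, k))
    kernels[:, k // 2, k // 2] = 1.0  # a single 1 at the kernel centre
    return kernels


def channelwise_conv_same(image: np.ndarray, kernels: np.ndarray) -> np.ndarray:
    """Filter each channel with its own kernel, zero-padded so H and W are preserved.

    image:   array of shape (C, H, W)
    kernels: array of shape (C, k, k) with odd k
    Computes cross-correlation, the usual "convolution" in deep-learning layers.
    """
    _, h, w = image.shape
    k = kernels.shape[-1]
    pad = k // 2
    padded = np.pad(image, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros(image.shape, dtype=float)
    for i in range(k):
        for j in range(k):
            # Weight (i, j) of every channel's kernel times the matching shifted window.
            out += kernels[:, i, j][:, None, None] * padded[:, i:i + h, j:j + w]
    return out


# Illustrative sizes (assumptions): 3 channels x 25 x 25 = 1875 weights.
kernels = make_identity_kernels(channels=3, k=25)
image = np.random.default_rng(0).random((3, 32, 32))
filtered = channelwise_conv_same(image, kernels)
```

In this sketch the output has the same shape as the input, so the layer could sit in front of an unchanged classifier, and with the identity initialisation it leaves the image untouched until the kernels are trained.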

---

New submissions
===============

Title: PASS: Pruning Attention Heads with Almost-sure Sparsity Targets

Abstract: Transformer models have been widely used to obtain high accuracy values in multiple fields including natural language processing (NLP), computer vision, and more. This superior performance typically comes at the expense of substantial computational overhead. Multi-head attention, a key factor in the success of Transformer models, has been found to be computationally expensive. Significant research effort has been devoted to improving attention compute efficiency by pruning redundant attention heads. A widely adopted paradigm is to jointly learn a set of gate variables and apply thresholds on gate values to prune heads. Previous work is highly sensitive to threshold tuning, which can limit subnetwork performance and prevent wider adoption in practice. We propose the notion of almost-sure sparsity to overcome this limitation and develop a generic framework for Pruning with Almost-Sure Sparsity (PASS) targets over attention heads. To further boost efficiency, we design a novel technique, concentrator, based on which we develop PASSCONC (PASS with CONCentrator). We also present a simple-yet-effective strategy to further improve subnetwork performance by clipping and selectively reopening learned gates. We investigate PASS and PASSCONC on two widely studied architectures: encoder-decoder (ED) Transformer and encoder-only Transformer (e.g., BERT). Experiments on IWSLT14 German-to-English translation and GLUE benchmark tasks demonstrate that our approaches outperform the SOTA by achieving up to 1.33 higher BLEU scores, 1.44% higher accuracy, and 60% higher attention speedups.

URL: https://openreview.net/forum?id=S4duStTKGL
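
The threshold-based paradigm that the abstract describes as prior work (learn one gate per head, then prune heads whose gate falls below a threshold) can be sketched as follows. This illustrates that baseline only, not PASS or PASSCONC. The function name, the gate values, and the two thresholds are hypothetical and chosen purely to show why the result can be sensitive to threshold tuning.

```python
import numpy as np


def keep_mask_from_gates(gates: np.ndarray, threshold: float) -> np.ndarray:
    """Return a boolean mask over heads: True keeps the head, False prunes it.

    gates: array of shape (layers, heads) holding learned gate values (hypothetical here).
    """
    return gates >= threshold


# Hypothetical learned gates for 2 layers x 3 heads; several sit close to 0.5.
gates = np.array([
    [0.90, 0.52, 0.48],
    [0.10, 0.51, 0.49],
])

kept_at_050 = int(keep_mask_from_gates(gates, 0.50).sum())
kept_at_055 = int(keep_mask_from_gates(gates, 0.55).sum())
print(kept_at_050, kept_at_055)
```

With these made-up values, moving the threshold from 0.50 to 0.55 changes the number of retained heads from three to one, which is the kind of sensitivity the abstract attributes to threshold-based pruning.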

---

Title: RLHF Workflow: From Reward Modeling to Online RLHF

Abstract: In this technical report, we present the workflow of Online Iterative Reinforcement Learning from Human Feedback (RLHF), which is widely reported to outperform its offline counterpart by a large margin in the recent large language model (LLM) literature. However, existing open-source RLHF projects are still largely confined to the offline learning setting. We aim to fill this gap and provide a detailed recipe that is easy to reproduce for online iterative RLHF. In particular, since online human feedback is usually infeasible for open-source communities with limited resources, we start by constructing preference models using a diverse set of open-source datasets and use the constructed proxy preference model to approximate human feedback. Then, we discuss the theoretical insights and algorithmic principles behind online iterative RLHF, followed by a detailed practical implementation. Our trained LLM achieves impressive performance on LLM chatbot benchmarks, including AlpacaEval-2, Arena-Hard, and MT-Bench, as well as other academic benchmarks such as HumanEval and TruthfulQA. We have shown that supervised fine-tuning (SFT) and iterative RLHF can obtain state-of-the-art performance with fully open-source datasets. Further, we have made our models, curated datasets, and comprehensive step-by-step code guidebooks publicly available.

URL: https://openreview.net/forum?id=a13aYUU9eU

---
