Daily TMLR digest for Jun 27, 2024


TMLR

Jun 27, 2024, 12:00:07 AM
to tmlr-anno...@googlegroups.com

Accepted papers
===============


Title: Choosing the parameter of the Fermat distance: navigating geometry and noise

Authors: Frederic Chazal, Laure Ferraris, Pablo Groisman, Matthieu Jonckheere, Frederic Pascal, Facundo Fabián Sapienza

Abstract: The Fermat distance has recently been established as a valuable tool for machine learning tasks in which a natural distance is not directly available to the practitioner, or as a way to improve the results given by Euclidean distances by exploiting the geometrical and statistical properties of the dataset. This distance depends on a parameter $\alpha$ that significantly affects the performance of subsequent tasks. Ideally, the value of $\alpha$ should be large enough to navigate the geometric intricacies inherent to the problem. At the same time, it should remain small enough to avoid the deleterious effects of noise on the distance estimation process.
We study, both theoretically and through simulations, how to select this parameter.
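
For intuition, here is a minimal sketch of the sample Fermat distance, assuming points in Euclidean space: edges of the complete graph over the sample are weighted by the Euclidean distance raised to the power $\alpha$, and the Fermat distance is the resulting shortest-path cost. The function name and the value of $\alpha$ below are illustrative, not taken from the paper.

    import numpy as np
    from scipy.sparse.csgraph import shortest_path

    def fermat_distances(X, alpha=2.0):
        """All-pairs sample Fermat distances for points X of shape (n, d)."""
        diffs = X[:, None, :] - X[None, :, :]      # pairwise difference vectors
        euclid = np.linalg.norm(diffs, axis=-1)    # Euclidean edge lengths
        weights = euclid ** alpha                  # each edge weighted by length^alpha
        # Shortest paths through the sample yield the sample Fermat distance.
        return shortest_path(weights, method="D", directed=False)

    # Larger alpha forces paths to follow the data's geometry; alpha = 1 recovers
    # plain shortest paths under Euclidean edge weights.
    X = np.random.default_rng(0).normal(size=(200, 2))
    D = fermat_distances(X, alpha=3.0)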

URL: https://openreview.net/forum?id=jDRNEoxVc7

---


New submissions
===============


Title: On the Convergence Rates of Federated Q-Learning across Heterogeneous Environments

Abstract: Large-scale multi-agent systems are often deployed across wide geographic areas, where agents interact with heterogeneous environments. There is an emerging interest in understanding the role of heterogeneity in the performance of the federated versions of classic reinforcement learning algorithms. In this paper, we study synchronous federated Q-learning, which aims to learn an optimal Q-function by having $K$ agents average their local Q-estimates every $E$ iterations. We observe an interesting phenomenon in how the convergence speed depends on $K$ and $E$. As in the homogeneous environment setting, there is a linear speed-up in $K$ when reducing the errors that arise from sampling randomness. Yet, in sharp contrast to the homogeneous setting, $E>1$ leads to significant performance degradation. Specifically, we provide a fine-grained characterization of the error evolution in the presence of environmental heterogeneity, showing that the errors decay to zero as the number of iterations $T$ increases. The slow convergence when $E>1$ turns out to be fundamental rather than an artifact of our analysis. We prove that, for a wide range of stepsizes, the $\ell_{\infty}$ norm of the error cannot decay faster than $\Theta (E/T)$. In addition, our experiments demonstrate that the convergence exhibits an interesting two-phase phenomenon. For any given stepsize, there is a sharp phase transition in the convergence: the error decays rapidly in the beginning yet later bounces up and stabilizes. Provided that the phase-transition time can be estimated, choosing different stepsizes for the two phases leads to faster overall convergence.
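
To make the periodic-averaging scheme concrete, below is a minimal sketch of synchronous federated Q-learning, assuming each agent can draw one fresh sample per state-action pair from its own environment; `sample_next_state` and `reward` are hypothetical interfaces, and the constants are illustrative rather than the paper's setting.

    import numpy as np

    def federated_sync_q(envs, n_states, n_actions, T=1000, E=5, eta=0.1, gamma=0.9):
        """K agents run synchronous Q-learning locally; the server averages every E iterations."""
        K = len(envs)
        Q = [np.zeros((n_states, n_actions)) for _ in range(K)]
        for t in range(1, T + 1):
            for k, env in enumerate(envs):
                for s in range(n_states):
                    for a in range(n_actions):
                        s_next = env.sample_next_state(s, a)   # fresh sample from agent k's MDP
                        target = env.reward(s, a) + gamma * Q[k][s_next].max()
                        Q[k][s, a] = (1 - eta) * Q[k][s, a] + eta * target
            if t % E == 0:                                     # every E iterations: average the K local Q-estimates
                Q_avg = np.mean(Q, axis=0)
                Q = [Q_avg.copy() for _ in range(K)]
        return np.mean(Q, axis=0)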

URL: https://openreview.net/forum?id=jPMJYlJc4j

---

Title: A Survey on Large Language Models for Critical Societal Domains: Finance, Healthcare, and Law

Abstract: In the fast-evolving domain of artificial intelligence, large language models (LLMs) such as GPT-3 and GPT-4 are revolutionizing the landscapes of finance, healthcare, and law: domains characterized by their reliance on professional expertise, challenging data acquisition, high stakes, and stringent regulatory compliance. This survey offers a detailed exploration of the methodologies, applications, challenges, and forward-looking opportunities of LLMs within these high-stakes sectors. We highlight the instrumental role of LLMs in enhancing diagnostic and treatment methodologies in healthcare, innovating financial analytics, and refining legal interpretation and compliance strategies. Moreover, we critically examine the ethics of LLM applications in these fields, pointing out existing ethical concerns and the need for transparent, fair, and robust AI systems that respect regulatory norms. By presenting a thorough review of current literature and practical applications, we showcase the transformative impact of LLMs and outline the imperative for interdisciplinary cooperation, methodological advancements, and ethical vigilance. Through this lens, we aim to spark dialogue and inspire future research dedicated to maximizing the benefits of LLMs while mitigating their risks in these precision-dependent sectors. To facilitate future research on LLMs in these critical societal domains, we also initiate a reading list that tracks the latest advancements on this topic, which will be released and continually updated.

URL: https://openreview.net/forum?id=upAWnMgpnH

---

Title: Sparse Neural Architectures and Deterministic Ramanujan Graphs

Abstract: We present a sparsely connected neural network architecture, constructed using the theory of Ramanujan graphs, that provides performance comparable to a dense network. The deterministic Ramanujan graphs arise either as Cayley graphs of certain algebraic groups or as Ramanujan $r$-coverings of the full $(k,l)$ bi-regular bipartite graph on $k + l$ vertices. The bipartite graphs represent the convolutional and fully connected layers while retaining desirable structural properties such as path connectivity and symmetry. The method is novel as a zero-shot, data-independent, deterministic pruning-at-initialization technique. The approach helps in the early identification of winning lottery tickets, unlike previous techniques which typically determine them iteratively. We demonstrate experimentally that the proposed architecture achieves accuracy and sparsity ratios competitive with those of previous pre-training pruning algorithms.
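
As a rough illustration of the masking mechanism (not the paper's construction), the sketch below applies a fixed, deterministic bipartite mask to a linear layer at initialization; the circulant connection pattern is only a stand-in for an actual Ramanujan $(k,l)$ bi-regular bipartite graph.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MaskedLinear(nn.Module):
        """Linear layer whose weights are pruned at initialization by a fixed mask."""
        def __init__(self, in_features, out_features, fan_in):
            super().__init__()
            self.linear = nn.Linear(in_features, out_features)
            # Deterministic sparse pattern: each output unit connects to `fan_in`
            # inputs in a circulant layout. A Ramanujan bi-regular bipartite graph
            # would supply this adjacency instead.
            mask = torch.zeros(out_features, in_features)
            for i in range(out_features):
                cols = [(i + j) % in_features for j in range(fan_in)]
                mask[i, cols] = 1.0
            self.register_buffer("mask", mask)

        def forward(self, x):
            return F.linear(x, self.linear.weight * self.mask, self.linear.bias)

    # Example: each of 128 output units sees only 16 of the 512 inputs (~96.9% sparsity).
    layer = MaskedLinear(512, 128, fan_in=16)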

URL: https://openreview.net/forum?id=x8wscCAJ2m

---
