Weekly TMLR digest for Dec 31, 2023


TMLR

Dec 30, 2023, 7:00:09 PM
to tmlr-annou...@googlegroups.com


New certifications
==================

Survey Certification: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphael Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

https://openreview.net/forum?id=bx24KpJ4Eb

---


Survey Certification: A Survey on the Possibilities & Impossibilities of AI-generated Text Detection

Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Bedi

https://openreview.net/forum?id=AXtFeYjboj

---


Accepted papers
===============


Title: Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Authors: Stephen Casper, Xander Davies, Claudia Shi, Thomas Krendl Gilbert, Jérémy Scheurer, Javier Rando, Rachel Freedman, Tomasz Korbak, David Lindner, Pedro Freire, Tony Tong Wang, Samuel Marks, Charbel-Raphael Segerie, Micah Carroll, Andi Peng, Phillip Christoffersen, Mehul Damani, Stewart Slocum, Usman Anwar, Anand Siththaranjan, Max Nadeau, Eric J Michaud, Jacob Pfau, Dmitrii Krasheninnikov, Xin Chen, Lauro Langosco, Peter Hase, Erdem Biyik, Anca Dragan, David Krueger, Dorsa Sadigh, Dylan Hadfield-Menell

Abstract: Reinforcement learning from human feedback (RLHF) is a technique for training AI systems to align with human goals. RLHF has emerged as the central method used to finetune state-of-the-art large language models (LLMs). Despite this popularity, there has been relatively little public work systematizing its flaws. In this paper, we (1) survey open problems and fundamental limitations of RLHF and related methods; (2) overview techniques to understand, improve, and complement RLHF in practice; and (3) propose auditing and disclosure standards to improve societal oversight of RLHF systems. Our work emphasizes the limitations of RLHF and highlights the importance of a multi-layered approach to the development of safer AI systems.

URL: https://openreview.net/forum?id=bx24KpJ4Eb

---

Title: A Survey on the Possibilities & Impossibilities of AI-generated Text Detection

Authors: Soumya Suvra Ghosal, Souradip Chakraborty, Jonas Geiping, Furong Huang, Dinesh Manocha, Amrit Bedi

Abstract: Large Language Models (LLMs) have revolutionized the domain of natural language processing (NLP) with remarkable capabilities of generating human-like text responses. However, despite these advancements, several works in the existing literature have raised serious concerns about the potential misuse of LLMs, such as spreading misinformation, generating fake news, plagiarism in academia, and contaminating the web. To address these concerns, a consensus among the research community is to develop algorithmic solutions to detect AI-generated text. The basic idea is that whenever we can tell whether a given text was written by a human or an AI, we can use this information to address the above-mentioned concerns. To that end, a plethora of detection frameworks have been proposed, highlighting the possibilities of AI-generated text detection. In parallel to the development of detection frameworks, researchers have also concentrated on designing strategies to elude detection, i.e., focusing on the impossibilities of AI-generated text detection. This is a crucial step toward ensuring that detection frameworks are robust and not too easy to fool. Despite the huge interest and the flurry of research in this domain, the community currently lacks a comprehensive analysis of recent developments. In this survey, we aim to provide a concise categorization and overview of current work encompassing both the prospects and the limitations of AI-generated text detection. To enrich the collective knowledge, we engage in an exhaustive discussion of critical and challenging open questions related to ongoing research on AI-generated text detection.

URL: https://openreview.net/forum?id=AXtFeYjboj

---

Title: FREED++: Improving RL Agents for Fragment-Based Molecule Generation by Thorough Reproduction

Authors: Alexander Telepov, Artem Tsypin, Kuzma Khrabrov, Sergey Yakukhnov, Pavel Strashnov, Petr Zhilyaev, Egor Rumiantsev, Daniel Ezhov, Manvel Avetisian, Olga Popova, Artur Kadurin

Abstract: Rational design of new therapeutic drugs aims to find a molecular structure with desired biological functionality, e.g., the ability to activate or suppress a specific protein by binding to it. Molecular docking is a common technique for evaluating protein-molecule interactions. Recently, Reinforcement Learning (RL) has emerged as a promising approach to generating molecules with the docking score (DS) as a reward. In this work, we reproduce, scrutinize, and improve the recent RL model for molecule generation called FREED (Yang et al., 2021). Extensive evaluation of the proposed method reveals several limitations and challenges, despite the outstanding results reported for three target proteins. Our contributions include fixing numerous implementation bugs, simplifying the model while increasing its quality, significantly extending the experiments, and conducting an accurate comparison with current state-of-the-art methods for protein-conditioned molecule generation. We show that the resulting fixed model is capable of producing molecules with superior docking scores compared to alternative approaches.

URL: https://openreview.net/forum?id=YVPb6tyRJu

---

Title: In search of projectively equivariant networks

Authors: Georg Bökman, Axel Flinth, Fredrik Kahl

Abstract: Equivariance of linear neural network layers is well studied. In this work, we relax the equivariance condition to hold only in a projective sense, thereby introducing the topic of projective equivariance to the machine learning audience. We theoretically study the relation between projectively and linearly equivariant linear layers and find that, in some important cases, the two types of layers surprisingly coincide. We also propose a way to construct a projectively equivariant neural network, which boils down to building a standard equivariant network in which the linear group representations acting on each intermediate feature space are lifts of projective group representations. Projective equivariance is showcased in two simple experiments. Code for the experiments is provided in the supplementary material.

URL: https://openreview.net/forum?id=Ls1E16bTj8

---

Title: Improving Native CNN Robustness with Filter Frequency Regularization

Authors: Jovita Lukasik, Paul Gavrikov, Janis Keuper, Margret Keuper

Abstract: Neural networks tend to overfit the training distribution and perform poorly on out-of-distribution data. A conceptually simple solution lies in adversarial training, which introduces worst-case perturbations into the training data and thus improves model generalization to some extent. However, it is only one ingredient towards generally more robust models, and it requires knowledge of the potential attacks or inference-time data corruptions during model training. This paper focuses instead on the native robustness of models that learn robust behavior directly from conventional training data, without out-of-distribution examples. To this end, we study the frequencies in learned convolution filters. Clean-trained models often prioritize high-frequency information, whereas adversarial training forces models to shift their focus to low-frequency details. By mimicking this behavior through frequency regularization of the learned convolution weights, we achieve improved native robustness to adversarial attacks, common corruptions, and other out-of-distribution tests. Additionally, the method shifts decision-making towards low-frequency information, such as shapes, which aligns more closely with human vision.

URL: https://openreview.net/forum?id=2wecNCpZ7Y

---


New submissions
===============


Title: No Need for Ad-hoc Substitutes: The Expected Cost is a Principled All-purpose Classification Metric

Abstract: The expected cost (EC) is one of the main classification metrics introduced in statistical and machine learning textbooks. It is based on the assumption that, for a given application of interest, each decision made by the system has a corresponding cost that depends on the true class of the sample. An evaluation metric can then be defined by taking the expectation of the cost over the data. Two special cases of the EC are widely used in the machine learning literature: the error rate (one minus the accuracy) and the balanced error rate (one minus the balanced accuracy or unweighted average recall). Other instances of the EC are useful for applications in which some types of errors are more severe than others, or in which the prior probabilities of the classes differ between the evaluation data and the use-case scenario. Surprisingly, the more general form of the EC is rarely used in the machine learning literature. Instead, alternative ad-hoc metrics like the F-beta score and the Matthews correlation coefficient (MCC) are used for many applications. In this work, we argue that the EC is superior to these alternatives, being more general, interpretable, and adaptable to any application scenario. We provide both theoretically motivated discussions and examples to illustrate the behavior of the different metrics.

URL: https://openreview.net/forum?id=3mN9QNWArl

---
