Talks on AI Security and Privacy (Prof. Kobbi Nissim, Prof. Eric Wong)

Jun Sakuma

May 27, 2026, 2:40:41 AM
to ibi...@googlegroups.com
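Dear IBISML members,

This is Sakuma from Institute of Science Tokyo. I am writing to announce
two talks in the area of AI security and privacy.
Prof. Kobbi Nissim (Georgetown University), one of the originators of
differential privacy and a recipient of the Gödel Prize for his work in
cryptographic theory, and Prof. Eric Wong (University of Pennsylvania),
who has produced many notable results in AI security, including
JailbreakBench and provable defenses against adversarial examples, will
each give a talk at the Ookayama Campus of Institute of Science Tokyo.
We hope you will join us.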

[Talk 1]
Speaker: Prof. Kobbi Nissim (Georgetown University)
Date and time: Thursday, June 4, 14:00-15:00
Venue: Institute of Science Tokyo, Ookayama Campus, West Building 8E, 10F, Department Meeting Room (1004)
Title: Protecting the Undeleted in Machine Unlearning
Abstract: Legal data protection standards such as the EU General Data
Protection Regulation and the California Consumer Privacy Act give
individuals the right to request that their specific information be
deleted, also known as the Right to be Forgotten. This provision gave
rise to machine unlearning, a branch of machine learning focused on
removing elements from training data by efficiently producing a model
that would have been obtained had the deleted data never been
included, namely, “perfect retraining.”
In this talk, Prof. Nissim will discuss how data deletion affects
privacy. He will first present a task that can be computed with strong
privacy guarantees, yet any perfect retraining mechanism for the task
allows an adversary controlling only a small number of data points to
reconstruct almost the entire dataset simply by issuing deletion
requests.
He will then discuss ways forward, in particular a new
cryptographically motivated security definition that safeguards
undeleted data points against leakage caused by the deletion of other
points. The talk will also show that this definition permits several
essential functionalities, including bulletin boards, summations, and
statistical learning.
This is based on joint work with Aloni Cohen, Refael Kohen, and Uri Stemmer.
Zoom link info:
https://zoom.us/j/99735897828?pwd=DR1RbS4bRbHrPJjGl77KwEd3XblyjS.1


[Talk 2]
Speaker: Prof. Eric Wong (University of Pennsylvania)
Date and time: Tuesday, June 30, 16:00-17:00
Venue: Institute of Science Tokyo, Ookayama Campus, West Building 8E, 10F, Department Meeting Room (1004)
Title: Understanding Safety & Alignment with Mechanistic Theory
Abstract: Why are LLM guardrails fundamentally so easily broken, and
how can we enforce them? This talk formalizes a mechanistic theory for
studying safety problems. We begin with one-layer transformers,
identifying rule-breaking as an inherent architectural vulnerability
in the model's attention mechanism. This mechanistic theory framework
(LogicBreaks) taught us a critical lesson: if attention is the key to
breaking rules, it may also be the key to enforcing them. Building
upon this insight, we expand the mechanistic theory to analyze
attention-based interventions, arriving at InstaBoost: an incredibly
simple yet highly effective steering method that boosts the model's
attention on user-provided instructions during generation. This
technique, developed from analysis on one-layer transformers, provides
state-of-the-art control over large-scale LLMs with just five lines of
code.
Zoom link info:
https://zoom.us/j/99735897828?pwd=DR1RbS4bRbHrPJjGl77KwEd3XblyjS.1

Best regards,
--
Jun Sakuma
Institute of Science Tokyo / RIKEN AIP