
Jiang's PhD thesis: "Language models for verifiable mathematical automation: Interaction, Integration, and Autoformalization"


Steve Omohundro

Sep 2, 2024, 2:44:50 AM
to guaranteed-safe-ai
Albert Jiang's PhD thesis: 
"Language models for verifiable mathematical automation, Interaction, Integration, and Autoformalization"

The thesis is based on some of the work I did with fantastic collaborators during my PhD:

1. Evaluating language models for mathematics through interactions. pnas.org/doi/abs/10.107
2. Thor: Wielding Hammers to Integrate Language Models and Automated Theorem Provers. proceedings.neurips.cc/paper_files/pa
3. Magnushammer: A Transformer-Based Approach to Premise Selection. openreview.net/forum?id=oYjPk
4. Autoformalization with Large Language Models. proceedings.neurips.cc/paper_files/pa
5. Multilingual Mathematical Autoformalization. arxiv.org/abs/2311.03755
6. Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs. arxiv.org/abs/2210.12283

Works not included, either because they were done outside my PhD time or because I was not a good enough writer to incorporate them:

1. Mistral 7B. arxiv.org/abs/2310.06825
2. Mixtral of Experts. arxiv.org/abs/2401.04088
3. Llemma: An Open Language Model For Mathematics. arxiv.org/abs/2310.10631
4. Repurposing Language Models into Embedding Models: Finding the Compute-Optimal Recipe. arxiv.org/abs/2406.04165
5. More Details, Please: Improving Autoformalization with More Detailed Proofs. By MPhil student Guillem Tarrach. openreview.net/forum?id=AkJvz
6. A great paper under submission by MEng student Andy Lo.
7. NuminaMath: the largest public dataset in AI4Maths, with 860k pairs of competition math problems and solutions. faculty.bicmr.pku.edu.cn/~dongbin/Publi

Best,
Steve

Gunnar Zarncke

Sep 14, 2024, 7:23:32 PM
to Steve Omohundro, guaranteed-safe-ai
Terence Tao tested OpenAI o1 (presumably o1-preview):
Test 1: "...a vaguely worded mathematical query which could be solved by identifying a suitable theorem (Cramer's theorem) from the literature. Previously, GPT was able to mention some relevant concepts but the details were hallucinated nonsense. This time around, Cramer's theorem was identified and a perfectly satisfactory answer was given."
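
(As an aside: Tao does not say which of Cramér's results he has in mind. If it is the classical large-deviations theorem, a standard statement is that for i.i.d. real random variables X_1, X_2, ... whose moment generating function is finite in a neighborhood of zero, and for a above the mean,

  \lim_{n\to\infty} \frac{1}{n} \log \Pr\!\left[ \frac{X_1 + \cdots + X_n}{n} \ge a \right] = -\Lambda^*(a),
  \qquad \Lambda^*(a) = \sup_{t \in \mathbb{R}} \bigl( t a - \log \mathbb{E}\, e^{t X_1} \bigr),

i.e., the probability that the sample mean deviates above a decays exponentially, with rate given by the Legendre transform of the log moment generating function.)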
Test 2: A challenging complex analysis problem. "The experience seemed roughly on par with trying to advise a mediocre, but not completely incompetent, graduate student. However, this was an improvement over previous models, whose capability was closer to an actually incompetent graduate student."
Test 3: Formalizing a result in Lean "(specifically, to establish one form of the prime number theorem as a consequence of another) by breaking it up into sublemmas which it would formalize the statement of, but not the proof. Here, the results were promising in that the model understood the task well and performed a sensible initial breakdown of the problem, but was inhibited by the lack of up-to-date information on Lean and its math library in its training, with its code containing several mistakes. However, I could imagine a model of this capability that was specifically finetuned on Lean and Mathlib, and integrated into an IDE, being extremely useful in formalization projects."
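
(To make the workflow in Test 3 concrete, here is a minimal Lean 4 sketch of the pattern Tao describes: formalize the statements of the sublemmas, defer every proof with sorry, and assemble the target implication from the unproved pieces. FormA, FormB, and Bridge are hypothetical placeholders, not actual Mathlib statements of the prime number theorem.)

-- Opaque stand-ins for the two forms of the prime number theorem and an
-- intermediate estimate; a real attempt would state these in Mathlib terms.
axiom FormA : Prop
axiom FormB : Prop
axiom Bridge : Prop

-- Sublemma statements only; each proof is deferred with `sorry`.
theorem sublemma_1 : FormA → Bridge := by sorry
theorem sublemma_2 : Bridge → FormB := by sorry

-- The target result, assembled from the (still unproved) sublemmas.
theorem formA_implies_formB : FormA → FormB :=
  fun hA => sublemma_2 (sublemma_1 hA)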
Conclusion: "It is certainly a more capable tool than previous iterations, though still struggling with the most advanced research mathematical tasks."
Source:


Kris Carlson

Sep 15, 2024, 7:48:12 AM
to Gunnar Zarncke, Steve Omohundro, guaranteed-safe-ai
Very interesting, thanks Gunnar. I've been through the o1 overview, "Learning to Reason with LLMs," and see striking progress in 1) long reasoning chains, 2) the ability to detect obstacles in sub-goals, 3) creatively forming new sub-goals to try to overcome them, and further progress in identifying contexts. I'm going through the system card (attached) to see how they addressed safety. If you try to follow the very long reasoning chain in their cryptography problem example, it's amazing. It is also interesting to see that deepening the method of reinforcement learning from human feedback (RLHF) is fruitful.

It is also very interesting, and of course alarming, that o1 manifests Steve O's instrumental drives:

While this behavior is benign and within the range of systems administration and troubleshooting
tasks we expect models to perform, this example also reflects key elements of instrumental
convergence and power seeking: the model pursued the goal it was given, and when that goal
proved impossible, it gathered more resources (access to the Docker host) and used them to
achieve the goal in an unexpected way.

I hope they are looking at recursive safety improvement, i.e., turning the model's abilities toward improving its own safety & alignment, as Jan Leike, who has since left, emphasized. That seems to be the case with its improved RLHF.

Another significant step toward safety is that the symbolic layer is transparent & comprehensible:

Chain of thought reasoning provides new opportunities for alignment and safety. We found that integrating our policies for model behavior into the chain of thought of a reasoning model is an effective way to robustly teach human values and principles. By teaching the model our safety rules and how to reason about them in context, we found evidence of reasoning capability directly benefiting model robustness: o1-preview achieved substantially improved performance on key jailbreak evaluations and our hardest internal benchmarks for evaluating our model's safety refusal boundaries. We believe that using a chain of thought offers significant advances for safety and alignment because (1) it enables us to observe the model thinking in a legible way, and (2) the model reasoning about safety rules is more robust to out-of-distribution scenarios.

Kris

o1_system_card.pdf