Starkly Speaking: SynCoGen: Synthesizable 3D Molecule Generation via Joint Reaction and Coordinate Modeling
14 views
Skip to first unread message
Hannes Stärk
unread,
Mar 9, 2026, 9:37:52 AM (10 days ago) Mar 9
Reply to author
Sign in to reply to author
Forward
Sign in to forward
Delete
You do not have permission to delete messages in this group
Copy link
Report message
Show original message
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to stark...@googlegroups.com
Hi together,
Todays reading group will be 1h earlier for non-us folks due to the timezone change that happened in the US on Sunday.
Speaker: Andrei Rekesh and Miruna Cretu
Paper: SynCoGen: Synthesizable 3D Molecule Generation via Joint Reaction and Coordinate Modeling https://arxiv.org/abs/2507.11818 (Andrei Rekesh, Miruna Cretu, Dmytro Shevchuk, Vignesh Ram Somnath, Pietro Liò, Robert A. Batey, Mike Tyers, Michał Koziarski, Cheng-Hao Liu) Synthesizability remains a critical bottleneck in generative molecular design. While recent advances have addressed synthesizability in 2D graphs, extending these constraints to 3D for geometry-based conditional generation remains largely unexplored. In this work, we present SynCoGen (Synthesizable Co-Generation), a single framework that combines simultaneous masked graph diffusion and flow matching for synthesizable 3D molecule generation. SynCoGen samples from the joint distribution of molecular building blocks, chemical reactions, and atomic coordinates. To train the model, we curated SynSpace, a dataset family containing over 1.2M synthesis-aware building block graphs and 7.5M conformers. SynCoGen achieves state-of-the-art performance in unconditional small molecule graph and conformer co-generation. For protein ligand generation in drug discovery, the amortized model delivers superior performance in both molecular linker design and pharmacophore-conditioned generation across diverse targets without relying on any scoring functions. Overall, this multimodal non-autoregressive formulation represents a foundation for a range of molecular design applications, including analog expansion, lead optimization, and direct de novo design.