Machine Learning in Structural Biology

Workshop at the 37th Conference on Neural Information Processing Systems

December 2023

About

Structural biology, the study of the 3D structure or shape of proteins and other biomolecules, has been transformed by breakthroughs from machine learning algorithms. Experimentalists now routinely use machine learning models to predict structures that aid hypothesis generation and experimental design, and to accelerate the experimental process of structure determination (e.g. computer vision algorithms for cryo-electron microscopy). These models have also become a new industry standard for bioengineering new protein therapeutics (e.g. large language models for protein design). Despite this progress, the field still faces many open challenges, such as modeling protein dynamics, predicting the structure of other classes of biomolecules such as RNA, learning and generalizing the underlying physics driving protein folding, and relating the structure of isolated proteins to the in vivo and contextual nature of their underlying function. These challenges are diverse and interdisciplinary, motivating new kinds of machine learning methods and requiring the development and maturation of standard benchmarks and datasets.

Machine Learning in Structural Biology (MLSB) seeks to bring together field experts, practitioners, and students from across academia, industry research groups, and pharmaceutical companies to focus on these new challenges and opportunities. This year, MLSB aims to bridge the theoretical and the practical by addressing the outstanding computational and experimental problems at the forefront of our field. The intersection of artificial intelligence and structural biology promises to unlock new scientific discoveries and develop powerful design tools.

MLSB will be an in-person workshop on December 15th at NeurIPS.

Please contact the organizers at workshopmlsb@gmail.com with any questions.

Stay updated on changes and workshop news by joining our mailing list.

Presenter Information

Congratulations to all accepted presenters! Below is information on deadlines and expectations leading up to the MLSB Workshop.

Posters

We expect all authors to prepare a poster that can be presented as part of our workshop. Posters must be 24W x 36H inches and will be taped to the wall. Poster boards will not be provided at the workshop. Posters should be printed on lightweight paper and not laminated.

Additionally, a virtual copy of each poster must be uploaded to the NeurIPS poster upload portal by Thursday, December 14. Posters must be PNG files of no more than 5120 width x 2880 height and no more than 10 MB. Thumbnail images should be 320 width x 256 height PNG files of no more than 5 MB. Log in using the neurips.cc account associated with your CMT email address. If you did not already have a neurips.cc account, one should have been created automatically, and you can access it by resetting the password.
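The upload limits above can be checked locally before submitting. The following is a minimal sketch, not an official NeurIPS tool: the function names are hypothetical, and it assumes the dimensions are in pixels and that "MB" means 1024 x 1024 bytes. It reads the width and height directly from the PNG header using only the Python standard library.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MB = 1024 * 1024  # assumption: the portal counts 1 MB as 1024 * 1024 bytes


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at byte offsets 16 and 20.
    return struct.unpack(">II", data[16:24])


def check_poster(data: bytes) -> list[str]:
    """List the ways the poster PNG exceeds the stated limits (empty if none)."""
    problems = []
    width, height = png_size(data)
    if width > 5120 or height > 2880:
        problems.append(f"poster is {width}x{height}, limit is 5120x2880")
    if len(data) > 10 * MB:
        problems.append("poster file is larger than 10 MB")
    return problems


def check_thumbnail(data: bytes) -> list[str]:
    """List the ways the thumbnail PNG differs from the stated spec."""
    problems = []
    if png_size(data) != (320, 256):
        problems.append("thumbnail must be exactly 320x256")
    if len(data) > 5 * MB:
        problems.append("thumbnail file is larger than 5 MB")
    return problems
```

To use it, read the file bytes (for example with `pathlib.Path("poster.png").read_bytes()`, where the filename is a placeholder) and pass them to `check_poster` or `check_thumbnail`; an empty list means no problem was found under the assumptions stated above.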

Paper Camera-Ready

De-anonymized, camera-ready versions of the workshop paper will be due on Microsoft CMT by Monday, Dec 4. Papers must indicate that they are NeurIPS MLSB workshop papers by using the modified NeurIPS style file here. Papers should be compiled with the 'final' argument, e.g. \usepackage[final]{neurips_mlsb_2023}

We plan to make all submitted papers available on the workshop website (https://www.mlsb.io/). If you would prefer that your work not be shared, please email the organizers as soon as possible. Additionally, please let us know if there is an arXiv/bioRxiv link for the paper that should be linked as well.

Travel Award

This year we will try to cover as many workshop registrations as possible for student/academic attendees with oral presentations or posters who need financial assistance. If you would like to be considered, please fill out the following form by Friday, Nov 17th. If you have any questions, please don't hesitate to contact us at workshopmlsb@gmail.com.

Key Dates

Application for Registration Reimbursement: Friday, November 17th, 2023, at 11:59PM, Anywhere on Earth.

Camera-Ready PDF due on Microsoft CMT: Monday, December 4th, 2023.

Poster due: Thursday, December 14th, 2023.

Invited Speakers

Bridget Carragher

Founding Technical Director of the Chan-Zuckerberg Imaging Institute.

Kyunghyun Cho

Associate Professor at NYU
Senior Director of Frontier Research at Prescient Design.

Rhiju Das

HHMI Investigator, Associate Professor of Biochemistry at Stanford University.

Polly Fordyce

Associate Professor of Genetics and Bioengineering at Stanford University.

Tanja Kortemme

Professor of Bioengineering at University of California, San Francisco.

Gevorg Grigoryan

Co-Founder and CTO of Generate Biomedicines.
Associate Professor at Dartmouth College.

RFdiffusion Team

A diffusion model for protein design.

Schedule (CST)

08:30 Opening Remarks
08:35 Invited Speaker - Kyunghyun Cho

Health system scale language models for clinical and operational decision making

09:00 Contributed Talk

Validation of de novo designed water-soluble and membrane proteins by in silico folding and melting
Alvaro Martin · Carolin Berner · Sergey Ovchinnikov · Anastassia Vorobieva

09:15 Invited Speaker - Tanja Kortemme

Accurate and tunable de novo protein shapes for new functions

09:40 Break
10:00 Invited Speaker - Bridget Carragher

A CryoET Data Portal to Foster a Collaboration between the Machine Learning and CryoET Communities

10:25 Contributed Talk

AlphaFold Meets Flow Matching for Generating Protein Ensembles
Bowen Jing · Dr. Bonnie Berger · Tommi Jaakkola

10:40 Contributed Talk

DSMBind: an unsupervised generative modeling framework for binding energy prediction
Wengong Jin · Caroline Uhler · Nir Hacohen

10:55 Invited Speaker - Polly Fordyce

Leveraging microfluidics for high-throughput and quantitative biochemistry and biophysics

11:20 Poster Session / Lunch
12:40 Invited Speaker - Gevorg Grigoryan

Illuminating protein space with a programmable generative model

01:05 Contributed Talk

Protein generation with evolutionary diffusion: sequence is all you need
Sarah Alamdari · Nitya Thakkar · Rianne van den Berg · Alex Lu · Nicolo Fusi · Ava P Amini · Kevin Yang

01:20 Invited Speaker - Jason Yim / Brian Trippe

De novo design of protein structure and function with RFdiffusion

01:45 Break
02:00 Contributed Talk

DiffDock-Pocket: Diffusion for Pocket-Level Docking with Sidechain Flexibility
Michael Plainer · Marcella Toth · Simon Dobers · Hannes Stärk · Gabriele Corso · Céline Marquet · Dr. Regina Barzilay

02:15 Contributed Talk

PoseCheck: Generative Models for 3D Structure-based Drug Design Produce Unrealistic Poses
Charles Harris · Kieran Didi · Arian R. Jamasb · Chaitanya Joshi · Simon V Mathis · Pietro Liò · Tom Blundell

02:30 Invited Speaker - Rhiju Das

World-wide competitions and the RNA folding problem

02:55 Break
03:00 Panel Discussion
04:00 Poster Session / Happy Hour
05:00 Closing Remarks

Organizers

Gabriele Corso
MIT

Gina El Nesr
Stanford University

Sergey Ovchinnikov
Harvard University

Roshan Rao
Meta AI

Hannah Wayment-Steele
Brandeis University

Ellen Zhong
Princeton University