Machine Learning for Structural Biology

Workshop at the 34th Conference on Neural Information Processing Systems

Saturday, December 12

About

Spurred on by recent advances in neural modeling and wet-lab methods, structural biology, the study of the three-dimensional (3D) atomic structure of proteins and other macromolecules, has emerged as an area of great promise for machine learning. The shape of a macromolecule is intrinsically linked to its biological function, much as the shape of a bike is critical to its purpose as transportation. Machine learning algorithms that can better predict and reason about these shapes therefore promise to unlock new scientific discoveries in human health and to increase our ability to design novel medicines.

Moreover, fundamental challenges in structural biology motivate the development of new learning systems that can more effectively capture physical inductive biases, respect natural symmetries, and generalize across atomic systems of varying sizes and granularities. Through the Machine Learning for Structural Biology (MLSB) workshop, we aim to include a diverse range of participants, spark a conversation on the required representations and learning algorithms for atomic systems, and dive deeply into how to integrate these with novel wet-lab capabilities.

To attend the MLSB workshop talks, poster sessions, and virtual hangouts, register for the main NeurIPS conference at neurips.cc by December 7th, 2020.

Call For Papers

We welcome submissions of short papers leveraging machine learning to address problems in structural biology, including but not limited to:

  • Structure prediction
  • Protein and RNA design
  • Experimental determination of structure
  • Interaction prediction
  • Conformational change and ensemble prediction
  • Molecular dynamics with learned samplers or potential functions
  • Function or property prediction
  • Structural systems biology
  • Model systems, such as lattice proteins or other toy ensembles
  • Learning representations of structure

We request anonymized PDF submissions through Microsoft CMT by Thursday, October 15th, 2020, 11:59 PM Anywhere on Earth (that is, in the timezone of your choice).

Submissions should be 4 to 9 pages long, in PDF format, and fully anonymized as per the requirements of NeurIPS. The 9-page maximum excludes references and appendices. We request use of the NeurIPS style files. The review process will be double-blind.

Authors of accepted papers will be invited to present a poster at the virtual workshop, with nominations for spotlight talks at the discretion of the organizers. This workshop is non-archival and does not publish proceedings; however, authors of accepted contributions will have the option to make their work available through the workshop website. Presentation of work that is concurrently in submission is welcome.

Important Dates

Submission Deadline: October 15, 2020, 11:59 PM, Anywhere on Earth.

Notification of Acceptance: October 30, 2020.

Workshop Date: December 12, 2020.

Invited Speakers

David Baker

Keynote

Director of the Institute for Protein Design, University of Washington.

Michael Levitt

Keynote

Professor of Structural Biology at Stanford University. Recipient of the 2013 Nobel Prize in Chemistry.

Mohammed AlQuraishi

Assistant Professor of Systems Biology at Columbia University.

Charlotte Deane

Professor of Structural Bioinformatics at Oxford University.

Po-Ssu Huang

Assistant Professor of Bioengineering at Stanford University.

Frank Noe

Professor of Mathematics and Computer Science at Freie Universität Berlin.

Chaok Seok

Professor of Chemistry at Seoul National University.

Andrea Thorn

Group leader at the Rudolf Virchow Center of Würzburg University.

Schedule (PST)

8:00 AM Opening Remarks
8:10 AM Keynote - Michael Levitt: Is Basic Science Needed for Significant and Fundamental Discoveries
8:50 AM Invited Talk - Charlotte Deane: Predicting the conformational ensembles of proteins
9:10 AM Invited Talk - Frank Noe: Deep Markov State Models versus Covid-19
9:30 AM Invited Talk - Andrea Thorn: Finding Secondary Structure in Cryo-EM maps: HARUSPEX
9:50 AM Break
10:20 AM Keynote - David Baker: Rosetta design of COVID antivirals and diagnostics
11:00 AM Contributed Talk: Predicting Chemical Shifts with Graph Neural Networks (Ziyue Yang, Maghesree Chakraborty, Andrew White)
11:10 AM Contributed Talk: Cryo-ZSSR: multiple-image super-resolution based on deep internal learning (Qinwen Huang, Ye Zhou, Xiaochen Du, Reed Chen, Jianyou Wang, Cynthia Rudin, Alberto Bartesaghi)
11:20 AM Contributed Talk: Wasserstein K-Means for Clustering Tomographic Projections (Rohan Rao, Amit Moscovich, Amit Singer)
11:30 AM Poster Session (held on gather.town)
12:30 PM Lunch
1:00 PM Panel Discussion: Future of ML for Structural Biology
2:00 PM Invited Talk - Po-Ssu Huang: Jump starting an evolution by protein design through deep learning of protein structures
2:20 PM Contributed Talk: ProGen: Language Modeling for Protein Generation (Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Alexander E Chu, Raphael R Eguchi, Po-Ssu Huang, Richard Socher)
2:30 PM Contributed Talk: Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences (Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Demi Guo, Myle Ott, Larry Zitnick, Jerry Ma, Rob Fergus)
2:40 PM Contributed Talk: SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning (Jonathan King, David Koes)
2:50 PM Contributed Talk: Generating 3D Molecular Structures Conditional on a Receptor Binding Site with Deep Generative Models (Tomohide Masuda, Matthew Ragoza, David Koes)
3:00 PM Contributed Talk: Learning from Protein Structure with Geometric Vector Perceptrons (Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael Townshend, Ron Dror)
3:10 PM Poster Session (held on gather.town)
4:10 PM Invited Talk - Mohammed AlQuraishi: (Nearly) end-to-end differentiable learning of protein structure
4:30 PM Invited Talk - Chaok Seok: Ab initio protein structure prediction by global optimization of neural network energy: Can AI learn physics?
4:50 PM Closing Remarks
5:00 PM Happy Hour (held on gather.town)

Accepted Papers

  • Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences

    Alexander Rives, Joshua Meier, Tom Sercu, Siddharth Goyal, Zeming Lin, Demi Guo, Myle Ott, Larry Zitnick, Jerry Ma, Rob Fergus

    [bioRxiv]

  • Combining variational autoencoder representations with structural descriptors improves prediction of docking scores

    Miguel Garcia-Ortegon, Carl Edward Rasmussen, Andreas Bender, Hiroshi Kajino, Sergio Bacallado

    [paper]

  • Conservative Objective Models: A Simple Approach to Effective Model-Based Optimization

    Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

    [paper]

  • Cross-Modality Protein Embedding for Compound-Protein Affinity and Contact Prediction

    Yuning You, Yang Shen

    [paper][bioRxiv]

  • Cryo-ZSSR: multiple-image super-resolution based on deep internal learning

    Qinwen Huang, Ye Zhou, Xiaochen Du, Reed Chen, Jianyou Wang, Alberto Bartesaghi, Cynthia Rudin

    [paper][arXiv]

  • DHS-Crystallize: Deep-Hybrid-Sequence based method for predicting protein Crystallization

    Azadeh Alavi, David Ascher

    [paper][bioRxiv]

  • Design-Bench: Benchmarks for Data-Driven Offline Model-Based Optimization

    Brandon Trabucco, Aviral Kumar, Xinyang Geng, Sergey Levine

    [paper]

  • Designing a Prospective COVID-19 Therapeutic with Reinforcement Learning

    Marcin J Skwark, Nicolas Lopez Carranza, Thomas Pierrot, Joe Phillips, Slim Said, Alexandre Laterre, Amine Kerkeni, Ugur Sahin, Karim Beguir

    [paper][arXiv]

  • ESM-1b: Optimizing Evolutionary Scale Modeling

    Joshua Meier, Jason Liu, Zeming Lin, Naman Goyal, Myle Ott, Tom Sercu, Alexander Rives

    [bioRxiv]

  • Exploring generative atomic models in cryo-EM reconstruction

    Ellen Zhong, Adam Lerer, Joseph Davis, Bonnie Berger

    [paper][arXiv]

  • Fast and adaptive protein structure representations for machine learning

    Janani Durairaj, Mehmet Akdel, Dick de Ridder, Aalt DJ van Dijk

    [paper]

  • GEFA: Early Fusion Approach in Drug-Target Affinity Prediction

    Tri Minh Nguyen, Thin Nguyen, Thao Minh Le, Truyen Tran

    [paper][arXiv]

  • Generating 3D Molecular Structures Conditional on a Receptor Binding Site with Deep Generative Models

    Tomohide Masuda, Matthew Ragoza, David Koes

    [paper][arXiv]

  • Is Transfer Learning Necessary for Protein Landscape Prediction?

    Amir Shanehsazzadeh, David Belanger, David Dohan

    [paper][arXiv]

  • Learning Super-Resolution Electron Density Map of Proteins using 3D U-Net

    Baishali Mullick, Yuyang Wang, Prakarsh Yadav, Amir Barati Farimani

    [paper]

  • Learning a Continuous Representation of 3D Molecular Structures with Deep Generative Models

    Matthew Ragoza, Tomohide Masuda, David Koes

    [paper][arXiv]

  • Learning from Protein Structure with Geometric Vector Perceptrons

    Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael Townshend, Ron Dror

    [paper][arXiv]

  • Molecular Mechanics-Driven Graph Neural Network with Multiplex Graph for Molecular Structures

    Shuo Zhang, Yang Liu, Lei Xie

    [paper][arXiv]

  • Pre-training Protein Language Models with Label-Agnostic Binding Pairs Enhances Performance in Downstream Tasks

    Modestas Filipavicius

    [paper][arXiv]

  • Predicting Chemical Shifts with Graph Neural Networks

    Ziyue Yang, Maghesree Chakraborty, Andrew White

    [paper][bioRxiv]

  • ProGen: Language Modeling for Protein Generation

    Ali Madani, Bryan McCann, Nikhil Naik, Nitish Shirish Keskar, Namrata Anand, Alexander E Chu, Raphael R Eguchi, Po-Ssu Huang, Richard Socher

    [paper][bioRxiv]

  • Profile Prediction: An Alignment-Based Pre-Training Task for Protein Sequence Models

    Pascal Sturmfels, Jesse Vig, Ali Madani, Nazneen Fatema Rajani

    [paper][arXiv]

  • Protein model quality assessment using rotation-equivariant, hierarchical neural networks

    Stephan Eismann, Patricia Suriana, Bowen Jing, Raphael Townshend, Ron Dror

    [paper][arXiv]

  • Sequence and structure based deep learning models for the identification of peptide binding sites

    Osama Abdin, Philip Kim, Han Wen

    [paper]

  • SidechainNet: An All-Atom Protein Structure Dataset for Machine Learning

    Jonathan King, David Koes

    [paper][arXiv]

  • The structure-fitness landscape of pairwise relations in generative sequence models

    Dylan Marshall, Peter K. Koo, Sergey Ovchinnikov

    [paper][bioRxiv]

  • Wasserstein K-Means for Clustering Tomographic Projections

    Rohan Rao, Amit Moscovich, Amit Singer

    [paper][arXiv]

Organizers

Raphael Townshend
Stanford University

Stephan Eismann
Stanford University

Ron Dror
Stanford University

Ellen Zhong
MIT

Namrata Anand
Stanford University

John Ingraham
Generate Biomedicines

Wouter Boomsma
University of Copenhagen

Sergey Ovchinnikov
Harvard University

Roshan Rao
UC Berkeley

Per Greisen
Novo Nordisk

Rachel Kolodny
University of Haifa

Bonnie Berger
MIT

Sponsors

Insitro
Generate Biomedicines