Structural biology, the study of the 3D structure or shape of proteins and other biomolecules, has been transformed by breakthroughs from machine learning algorithms. Experimentalists now routinely use machine learning models to predict structures that aid hypothesis generation and experimental design and to accelerate the experimental process of structure determination (e.g. computer vision algorithms for cryo-electron microscopy). These models have also become a new industry standard for bioengineering new protein therapeutics (e.g. large language models for protein design). Despite this progress, many challenges remain open for the field, such as modeling protein dynamics, predicting the structure of other classes of biomolecules such as RNA, learning and generalizing the underlying physics driving protein folding, and relating the structure of isolated proteins to the in vivo and contextual nature of their underlying function. These challenges are diverse and interdisciplinary, motivating new kinds of machine learning methods and requiring the development and maturation of standard benchmarks and datasets.
Machine Learning in Structural Biology (MLSB) seeks to bring together field experts, practitioners, and students from across academia, industry research groups, and pharmaceutical companies to focus on these new challenges and opportunities. This year, MLSB aims to bridge the theoretical and practical by addressing the outstanding computational and experimental problems at the forefront of our field. The intersection of artificial intelligence and structural biology promises to unlock new scientific discoveries and develop powerful design tools.
MLSB will be an in-person NeurIPS workshop on 15th December 2024 in MTG Rooms 11 & 12 at the Vancouver Convention Center.
Please contact the organizers at workshopmlsb@gmail.com with any questions.
Stay updated on changes and workshop news by joining our mailing list.
Congratulations to all accepted presenters! Please find some information on deadlines and expectations leading up to the MLSB Workshop!
We ask all authors to prepare a poster that can be presented as part of our workshop. Posters must be 24W x 36H inches and will be taped to the wall. Poster boards will not be provided at the workshop. We specifically ask for portrait layout because we will be tight on wall space.
Additionally, a virtual copy of each poster must be uploaded to the NeurIPS poster upload portal by Thursday, December 12. Posters must be PNG files of no more than 5120 width x 2880 height (and no more than 10 MB). Thumbnail images should be 320 width x 256 height PNG and no more than 5 MB. We know these dimensions differ from those we are asking for in-person posters; the poster upload dimensions are set by NeurIPS.
Users should log in using the neurips.cc account associated with their CMT email address. If they did not already have a neurips.cc account, one should have been created automatically and can be accessed by resetting the password.
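The upload limits above can be checked locally before using the portal. The following is a minimal sketch in Python using only the standard library: it reads the width and height from the PNG header and compares them, along with the file size, against the limits quoted above. The names `ImageSpec`, `png_size`, and `check_png` are illustrative and not part of any NeurIPS tooling. Treating 1 MB as 1,000,000 bytes and treating the thumbnail size as an exact requirement are assumptions.

```python
import struct
from dataclasses import dataclass

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


@dataclass(frozen=True)
class ImageSpec:
    width: int
    height: int
    max_bytes: int
    exact: bool  # True: dimensions must match; False: they are upper bounds


# Limits quoted in the text; 1 MB = 1,000,000 bytes is an assumption.
POSTER = ImageSpec(width=5120, height=2880, max_bytes=10_000_000, exact=False)
THUMBNAIL = ImageSpec(width=320, height=256, max_bytes=5_000_000, exact=True)


def png_size(data: bytes) -> tuple:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are two big-endian 32-bit integers at bytes 16..23.
    return struct.unpack(">II", data[16:24])


def check_png(data: bytes, spec: ImageSpec) -> list:
    """Return a list of problems; an empty list means the file fits the spec."""
    problems = []
    width, height = png_size(data)
    if spec.exact:
        if (width, height) != (spec.width, spec.height):
            problems.append(
                f"expected {spec.width}x{spec.height}, got {width}x{height}"
            )
    elif width > spec.width or height > spec.height:
        problems.append(
            f"{width}x{height} exceeds {spec.width}x{spec.height}"
        )
    if len(data) > spec.max_bytes:
        problems.append(f"{len(data)} bytes exceeds {spec.max_bytes} bytes")
    return problems
```

To check a real file, pass its bytes, for example `check_png(Path("poster.png").read_bytes(), POSTER)` with `pathlib.Path` imported; the file name here is a placeholder.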
De-anonymized, camera-ready versions of the workshop paper will be due on Microsoft CMT by Monday, Dec 2. Papers must indicate that they are NeurIPS MLSB workshop papers by using the modified NeurIPS style file here. Papers should be compiled with the `final` argument, e.g. \usepackage[final]{neurips_mlsb_2024}
We plan to make all camera-ready submitted papers available on the workshop website (https://www.mlsb.io/). If you would prefer that your work not be shared, then there is no need to submit a camera-ready version.
This year we will try to cover as many workshop registrations as possible for student/academic attendees with oral presentations or posters who need financial assistance.
If you would like to be considered, please fill out the following form by Friday, Nov 15 (previously Friday, Nov 8).
If you have any questions, please don't hesitate to contact us at workshopmlsb@gmail.com.
Application for Registration Reimbursement: Friday, November 15th, 2024 (previously November 8th, 2024), at 11:59PM, Anywhere on Earth.
Camera-Ready PDF due on Microsoft CMT: Monday, December 2nd, 2024.
Poster due: Thursday, December 12th, 2024.
We welcome submissions of short papers leveraging machine learning to address problems in structural biology, including but not limited to:
We request anonymized PDF submissions by Friday, September 20, 2024, at 11:59PM, AoE (anywhere on earth) through our submission website on CMT.
Papers should present novel work that has not been previously accepted at an archival venue at the time of submission. Submissions should be a maximum of 5 pages (excluding references and appendices) in PDF format, using the NeurIPS style files, and fully anonymized as per the requirements of NeurIPS. The NeurIPS checklist can be omitted from the submission. Submissions meeting these criteria will go through a light, double-blind review process. Reviewer comments will be returned to the authors as feedback.
Accepted papers will be invited to present a poster at the workshop, with nominations of spotlight talks at the discretion of the organizers.
New this year, we will have two special tracks for models for predicting protein-protein and protein-ligand interactions, evaluated on two new large-scale benchmarks, PINDER and PLINDER. The highest-performing open-source methods from these two tracks will receive invitations to a spotlight presentation. Stay tuned for more information on how to submit to these tracks.
Like last year, authors who commit to open-sourcing code, model weights, and datasets used in the work will be given precedence for spotlight talks. This policy only affects consideration for spotlights. Submissions that cannot make this commitment will still be considered for posters and will not be penalized in acceptance decisions.
This workshop is considered non-archival; however, authors of accepted contributions will have the option to make their work available through the workshop website. Presentation of work that is concurrently in submission is welcome. We welcome papers sharing encouraging work-in-progress results or forward-looking position papers that would benefit from feedback and community discussion at our workshop.
Submission Deadline: Friday, September 20th, 2024, at 11:59PM, Anywhere on Earth.
Notification of Acceptance: Wednesday, October 9th, 2024.
Workshop Date: December 15th 2024, Vancouver, Canada.
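Because "Anywhere on Earth" (AoE) deadlines are easy to misread, it can help to convert them to UTC. The sketch below assumes the usual convention that AoE corresponds to a fixed UTC-12 offset; the helper name `aoe_deadline_utc` is illustrative only.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "Anywhere on Earth" is the fixed UTC-12 offset.
AOE = timezone(timedelta(hours=-12), name="AoE")


def aoe_deadline_utc(year: int, month: int, day: int,
                     hour: int = 23, minute: int = 59) -> datetime:
    """Convert a deadline given in AoE local time to an aware UTC datetime."""
    local = datetime(year, month, day, hour, minute, tzinfo=AOE)
    return local.astimezone(timezone.utc)


# Submission deadline from the list above: September 20th, 2024, 11:59PM AoE.
submission = aoe_deadline_utc(2024, 9, 20)
print(submission.isoformat())  # 2024-09-21T11:59:00+00:00
```

In other words, an 11:59PM AoE deadline falls at 11:59 UTC on the following calendar day.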
Assistant Professor, Department of Pharmacology, Northwestern University.
This year we are running a challenge on the PINDER and PLINDER datasets to evaluate how well the community is currently doing at protein-protein interaction prediction and protein-ligand complex prediction.
To submit your trained model, you will need to build an inference Docker image on Hugging Face Spaces using the following templates:
Protein-ligand track (PLINDER), inputs: (SMILES, monomer protein structure, monomer FASTA, monomer MSA)
Protein-protein track (PINDER), inputs: (monomer protein structure 1, monomer protein structure 2, FASTA 1, FASTA 2, MSA 1, MSA 2)
Please find the technical documentation for how to use the datasets for the challenge:
The submission system will use Hugging Face Spaces. To qualify for submission, each team must:
Provide a `requirements.txt` file to capture all dependencies.
Provide an `inference_app.py` file. This contains a `predict` function that should be modified to reflect the specifics of inference using their model.
Provide a `train.py` file to ensure that training and model selection use only the PINDER/PLINDER datasets and to clearly show any additional hyperparameters used.
Other metrics computed by PINDER/PLINDER will be displayed on the leaderboard but will not influence the ranking.
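To give a rough idea of what adapting a `predict` function might involve, here is a hypothetical sketch for the protein-ligand inputs listed earlier (SMILES, monomer protein structure, FASTA, MSA). It is not the official template: the `LigandInputs` class, the function signature, the returned dictionary, and the `run_model` placeholder are all assumptions made for illustration, and the actual template should be treated as authoritative.

```python
import time
from dataclasses import dataclass
from pathlib import Path


@dataclass
class LigandInputs:
    """Hypothetical container for the protein-ligand inputs."""
    smiles: str
    structure_path: Path  # monomer protein structure
    fasta_path: Path      # monomer FASTA
    msa_path: Path        # monomer MSA


def run_model(inputs: LigandInputs, output_dir: Path) -> Path:
    """Placeholder for a team's own inference code (hypothetical)."""
    raise NotImplementedError("replace with model-specific inference")


def predict(inputs: LigandInputs, output_dir: Path, model=run_model) -> dict:
    """Validate inputs, run the model, and report the output path and runtime."""
    if not inputs.smiles.strip():
        raise ValueError("SMILES string is empty")
    paths = (inputs.structure_path, inputs.fasta_path, inputs.msa_path)
    missing = [str(p) for p in paths if not Path(p).is_file()]
    if missing:
        raise FileNotFoundError(f"missing input files: {missing}")

    output_dir.mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    prediction_path = model(inputs, output_dir)  # model-specific step
    runtime_s = time.perf_counter() - start
    return {"prediction": prediction_path, "runtime_s": runtime_s}
```

Keeping the model call behind a single function argument is one way to keep input validation separate from model-specific code; a protein-protein variant would take two structures, two FASTA files, and two MSAs instead.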
The winners will be invited to present their work at the MLSB workshop.
Although the exact composition of the eval set will be shared at a future date, below we provide an overview of the dataset and what to expect.
Training Workshop: September 24th, 2024, virtual (Register here).
Leaderboard Opens: October 9th, 2024 (following acceptance notifications for MLSB).
Leaderboard Closes: November 9th, 2024.
Winner Notification: Wednesday, November 27th, 2024.
If you have trouble, we invite you to join the PINDER/PLINDER Discord server.