Bioinformatics Training and Education Program

June 8, 2021, 1:00 pm - 2:00 pm

ATOM Modeling Pipeline (AMPL) for Drug Discovery: A hands-on tutorial

Event Details

Do you want to know how to use machine learning (ML) to accelerate drug discovery? Join us on June 8 from 1:00 pm to 2:00 pm ET for the first in a series of workshops on the ATOM Modeling PipeLine (AMPL), an open-source, conda-based software package that automates key drug discovery steps. AMPL is designed to take molecular binding data (e.g., IC50, Ki) and carry out key ML steps with minimal user intervention. The first workshop will introduce AMPL and highlight its capabilities for creating ML-ready datasets. Follow-on workshops will be offered during the summer and will cover modeling methods and inference.

Location: Webex

Registration: Not required

Presenter: Sarangan Ravichandran, PhD, PMP, Senior Data Scientist, ATOM Consortium/Frederick National Laboratory for Cancer Research (FNLCR), and Adjunct Professor in Bioinformatics, Hood College

Supporting materials: Tutorial and AMPL: A Data-Driven Modeling Pipeline for Drug Discovery

The workshop on June 8 will have two parts: a short presentation followed by a hands-on tutorial.

Part 1: A 20-minute presentation that will cover the following topics:

  • Introduction to small-molecule binding and the database sources
  • Issues associated with data ingestion and curation
  • Exploratory data analysis of the ingested and curated datasets
  • Use of different featurization methods, such as molecular fingerprints or computed properties (molecular weight, number of hydrogen-bond acceptors, etc.)
  • Creation of ML-ready datasets
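
To give a flavor of the curation topics above, here is a minimal sketch in plain Python. It is not AMPL code: the function names `ic50_nm_to_pic50` and `curate` and the sample records are made up for illustration. It assumes IC50 values are reported in nanomolar, converts them to pIC50 (the negative base-10 logarithm of the molar IC50), drops rows with missing or non-physical values, and averages replicate measurements of the same compound. Real curation would also standardize the chemical structures, since different SMILES strings can describe the same molecule; that step is omitted here.

```python
import math
from collections import defaultdict
from statistics import mean


def ic50_nm_to_pic50(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    # 1 nM = 1e-9 M, so -log10(x * 1e-9) = 9 - log10(x)
    return 9.0 - math.log10(ic50_nm)


def curate(records):
    """Drop unusable rows and average replicate pIC50 values per compound.

    `records` is an iterable of (smiles, ic50_nm) pairs (hypothetical layout).
    Returns a dict mapping each SMILES string to its mean pIC50.
    """
    grouped = defaultdict(list)
    for smiles, ic50_nm in records:
        smiles = (smiles or "").strip()
        if not smiles or ic50_nm is None or ic50_nm <= 0:
            continue  # skip missing structures and non-physical values
        grouped[smiles].append(ic50_nm_to_pic50(ic50_nm))
    return {smiles: mean(values) for smiles, values in grouped.items()}


# Made-up example records: two replicates, one missing value,
# one missing structure, and one negative (invalid) measurement.
raw = [
    ("CCO", 1000.0),
    ("CCO", 10.0),
    ("c1ccccc1", 100.0),
    ("c1ccccc1", None),
    ("", 50.0),
    ("CCN", -5.0),
]
curated = curate(raw)
print(curated)
```

Averaging on the pIC50 scale rather than on raw IC50 values is a design choice in this sketch: it keeps a single weak measurement from dominating the replicate mean.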

Part 2: A 35-minute AMPL code demonstration followed by a 5-minute Q&A. We will share a Python Jupyter notebook that will cover the following ML steps: data ingestion/curation, featurization, and visualization to create ML-ready datasets. Here are the key sections of the notebook:

  • Highlights of AMPL functions that are designed to address the common issues encountered during the data ingestion and curation of drug discovery or small-molecule-focused projects
  • Introduction to the extensible AMPL featurizer module and a demonstration of how simple keyword choices can lead to the computation of a range of different feature sets
  • Exploratory Data Analysis and visualization code templates that can be adopted for other drug discovery projects with very little modification
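
The idea of selecting a feature set with a keyword can be sketched as a small registry that maps names to functions. This is an illustration only and does not reproduce AMPL's featurizer module or its keywords: `FEATURIZERS`, `featurize`, `element_counts`, and `hashed_ngrams` are hypothetical names, and the two toy featurizers work on the raw SMILES text instead of computing real chemical fingerprints or descriptors.

```python
import hashlib
import re

# Toy tokenizer: a few element symbols, with two-letter symbols matched first.
ATOM_PATTERN = re.compile(r"Cl|Br|[BCNOSPFI]|[bcnops]")
ELEMENTS = ("C", "N", "O", "S", "F", "Cl", "Br")


def element_counts(smiles):
    """Toy 'property' features: counts of selected element symbols."""
    tokens = [t.capitalize() for t in ATOM_PATTERN.findall(smiles)]
    return [tokens.count(element) for element in ELEMENTS]


def hashed_ngrams(smiles, n_bits=16, n=2):
    """Toy 'fingerprint' features: hash character n-grams into a bit vector."""
    bits = [0] * n_bits
    for i in range(len(smiles) - n + 1):
        digest = hashlib.sha256(smiles[i:i + n].encode("utf-8")).digest()
        bits[int.from_bytes(digest[:4], "big") % n_bits] = 1
    return bits


# Registry: adding a new feature set means adding one entry here.
FEATURIZERS = {
    "element_counts": element_counts,
    "hashed_ngrams": hashed_ngrams,
}


def featurize(smiles_list, featurizer="element_counts", **kwargs):
    """Compute one feature vector per molecule using the named featurizer."""
    if featurizer not in FEATURIZERS:
        raise ValueError(
            f"unknown featurizer {featurizer!r}; choose from {sorted(FEATURIZERS)}"
        )
    func = FEATURIZERS[featurizer]
    return [func(smiles, **kwargs) for smiles in smiles_list]


molecules = ["CCO", "c1ccccc1Cl"]
count_features = featurize(molecules, featurizer="element_counts")
bit_features = featurize(molecules, featurizer="hashed_ngrams", n_bits=32)
print(count_features)  # [[2, 0, 1, 0, 0, 0, 0], [6, 0, 0, 0, 0, 1, 0]]
```

Changing only the `featurizer` keyword switches between the two feature sets, so the calling code does not change when a new featurizer is registered.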

To learn more about the software, visit the AMPL GitHub repository.

Questions? Contact the NCI Data Science Learning Exchange

Time

(Tuesday) 1:00 pm - 2:00 pm

Location

Online

Organizer

Data Science Learning Exchange
