Course Details

  • Date: February 21st, 2017 - February 22nd, 2017
  • Time: 9:30 am - 4:00 pm
  • Location: Bldg 10 FAES room 4 (B1C205)
  • Presenter(s): Andrea O'Hara (BioDiscovery Nexus), Justin Lack (NIAID CBR)

This BTEP Workshop will cover the fundamentals and best practices of Exome-Seq analysis, including downstream interpretation of variants using a variety of in-house and NCI-licensed software solutions. There will be hands-on training on the CCBR Exome-Seq Pipeline, CLC Biomedical Workbench, Genomatix GeneGrid, Ingenuity Variant Analysis, and BioDiscovery Nexus Copy Number.

NOTE: This is a BYOC (Bring Your Own laptop Computer) class. Government-issued or personal computers are permitted. We can supply only a very limited number of computers, so if you want to take the class but cannot bring your own computer, please say so in the Comment section of the registration form. Please register only if you intend to attend the workshop.



[1] Download the Trial License for CLC Biomedical Genomics Workbench here:…

[2] Download the Trial License for Ingenuity Variant Analysis here:

[3] For access to BioDiscovery Nexus Copy Number, please login to, and navigate to Scientific Software under ‘Request Something’.

[4] Information on access to Genomatix GeneGrid will be provided to attendees present at the workshop, prior to the start of the session.


Day 1: Tuesday, February 21, 2017

9:30 – 10:30 am         

Title: Introduction to Exome-Seq: What, Why, How?

Presenter: Chunhua Yan, PhD

This will be an introduction to Exome-Seq, covering:

    •    Brief overview of next-generation sequencing technology
    •    Exome sequencing (Cost, Speed, Gene coverage, Biological implication)
    •    Experimental design (Sample size, Coverage, Whole/Targeted exome-seq, Sample submission)
    •    Mutation calling resources (DREAM Challenge, Genome in a Bottle, exome databases)

10:30 am – 12:30 pm

Title: Exome-Seq Data Analysis Pipeline: From Reads to Results

Presenter: Justin Lack, PhD

This talk will provide an overview of the CCBR Exome-Seq pipeline workflow, with recommended best practices.

Some of the topics covered will be:

  • Raw data processing and QC
  • Short read mapping and alignment QC
  • Approaches to improving processing alignments
  • Germline SNP and small INDEL calling
  • Somatic SNP and small INDEL calling
  • Germline and somatic structural variant calling
  • Multi-tool variant annotation (AVIA, SnpEff, Oncotator, etc.)
  • Example processing and analysis of a tumor/germline comparison data set

LUNCH BREAK 12:30 – 1:00 pm

1:00 – 4:00 pm

Title: CLC Biomedical Workbench for Analysis of Exome-Seq Data

Presenter: Jennifer Poitras, Field Application Specialist

Biomedical Genomics Workbench is a comprehensive and accurate data analysis platform that enables you to find the signal in the noise in your cancer and hereditary disease NGS data. With its broad selection of end-to-end analysis workflows, tools, and visualization modules, it enables easy and accurate discovery, verification, and validation of novel disease biomarkers. In this training, we will use prebuilt workflows, or analysis pipelines, to identify somatic variants in tumor samples and tumor/normal pairs. Within the workflow, we will map reads to the reference, identify variants, and annotate those variants not only with nucleic and amino acid changes, but also with information from third-party sources, such as 1000 Genomes, dbSNP, and ClinVar. By the end of the training, you will appreciate that Biomedical Genomics Workbench is your one-stop shop for analysis and visualization of NGS data.

Day 2: Wednesday, February 22, 2017

9:30 – 11:00 am

Title: Using the Genomatix GeneGrid Analyzer for Your Exome-Seq Data

Presenter: Justin Lack, PhD

This talk will cover the ease of use and applications of GeneGrid:

  • Import and annotate variants
  • Compare samples for multiple experimental designs
  • Filter and prioritize variants
  • Generate extensive reports
  • Analyze affected pathways
  • Browse variants on the genome

11:00 am – 12:30 pm

Title: Ingenuity Variant Analysis (IVA) Software for Identifying Clinically Impactful Variants

Presenter: Jennifer Poitras, Field Application Specialist

Ingenuity Variant Analysis (IVA) combines analytical tools and integrated content to help you rapidly identify and prioritize variants by drilling down to a small, targeted subset of compelling variants based on both published biological evidence and your own knowledge of disease biology. This workshop will focus on how users can upload their datasets, efficiently apply the different filters within Variant Analysis to identify causal variants, and export data. It will also go over the recent IVA updates. With IVA, you can interrogate your variants from multiple biological perspectives, explore different biological hypotheses, and identify the most promising variants for follow-up.

LUNCH BREAK 12:30 – 1:00 pm

1:00 – 2:00 pm            OPEN Q & A with Presenters

2:00 – 4:00 pm            

Title: Using BioDiscovery Nexus for Copy Number Analysis

Presenter: Andrea O’Hara, Field Application Specialist

Nexus Copy Number version 8.0 offers copy number estimation from whole-genome sequencing (WGS), whole-exome sequencing (WES), and targeted sequencing panels. The sophisticated algorithm in Nexus Copy Number requires only BAM files as input and, in addition to copy number, also derives B-allele frequencies (BAF) from the BAM files. The interactive visualization and powerful statistical tools allow detection of structural variations (e.g. copy number, homozygous regions), association with sequence variations (point mutations, InDels, inversions, etc.), and identification of statistically significant co-occurring up/down-regulated genes (from mRNA, miRNA, and RNA-Seq data). In this workshop, we will evaluate matched and unmatched tumor-normal cohorts for copy number and sequence variant analysis, using the built-in statistical analyses and integrated graphical display to rapidly explore and mine vast amounts of data in minutes. Comprehensive downstream analysis will include statistical comparisons, concordance, clustering, survival, and enrichment analysis.