Spring 2022 Next Generation Sequencing Workshops | crc.pitt.edu

These workshops were supported in part by the University of Pittsburgh seed project titled "University of Pittsburgh Computational Genomics Training Program".

High-throughput sequencing has brought abundant sequence data along with a wealth of new “-omics” protocols, and this explosion of data can be as bewildering as it is exciting. Our multi-day hands-on workshops give researchers the open-source research tools to plan and execute successful bioinformatics and genomics experiments. These workshops, taught by experienced Bioinformatics core faculty, cover both the theoretical and practical aspects of analyzing a wide range of NGS data using the HTC cluster.

These workshops have hands-on components that require the following prerequisites to be set up before a workshop begins.

Participants should have an account on the HTC cluster, which is the cluster we will use for demonstration purposes. (page 1 of this documentation)
Participants must either be on a Pitt network (hard-line) or connected through a VPN. (page 2 of this documentation)
Participants should be able to submit jobs; that is, your group’s account has not expired and your group’s service units (CPU-hours) have not been exhausted. (page 4 of this documentation)

As a general rule, we offer no troubleshooting for technical setup issues at the workshops themselves! Therefore, be aware that if you do not set up the workshop's technical prerequisites well in advance, you may not be able to participate fully in its hands-on activities.

You can find the titles of past NGS workshops and links to their recordings. For each workshop, there may also be slides used in the workshop and other additional relevant resources.

Familiarity with Linux and the Bash shell is vital for these workshops. Submitting, monitoring, and managing jobs on the HTC cluster largely involves command-line operations. We do not routinely teach beginning Linux classes. If you are new to Linux environments, we highly recommend that you work your way through one of the past recordings (Introduction to Linux for NGS, spring or fall 2021 workshop).

Next Generation Sequencing and Single Cell Techniques

Tuesday, Feb. 1
1:00pm - 4:00pm

This workshop will cover the basics of next-generation sequencing library preparation for Illumina sequencers. Different library preparation techniques (DNA-seq, ChIP-seq, RNA-seq, Methyl-seq) are explained, along with quality control steps for the starting input material and the final libraries. The workshop will also discuss considerations for experimental design and the end goals of analysis prior to sequencing. Basics of sequencing and cost estimates will be discussed as part of the experimental design process. Presented by Amanda Poholek. 1:00pm - 2:30pm

This workshop will give an overview of 10x Genomics techniques for single-cell sequencing, including RNA, ATAC, TCR, etc. The workshop will also highlight tips and potential approaches related to single-cell sequencing and introduce the major methods and tools available for it. We will discuss considerations for experimental design, the end goals of single-cell analysis, and cost estimates for single-cell applications on the NovaSeq and the NextSeq 500 sequencers. Presented by Robert Lafyatis. 2:30pm – 4:00pm

Overview of NGS data analysis using Pitt ondemand and R

Tuesday, Feb. 8
1:00pm - 4:00pm

This workshop will present general workflows for NGS data analysis using Pitt ondemand and the HTC cluster. I will briefly introduce ondemand file management, various genomics apps, RStudio Server, and Shiny Server. A hands-on session running Seurat through RStudio Server will follow. We will then go through the command-line interface of the HTC cluster, focusing on R programming and containers for package management. We will conclude the workshop with a second hands-on session running Shiny applications through ondemand. Presented by Fangping Mu.

RNASeq data analysis

Tuesday, Feb. 15
1:00pm - 4:00pm

The focus of the workshop will be on running RNA-seq pipelines, from raw FASTQ files through FastQC, alignment to a reference genome, and generation of gene expression counts. To facilitate learning, the workshop will be centered on a hands-on tutorial that guides students in processing data from raw reads through read counts, using a real case-study-based approach appropriate for Illumina read data. Presented by Uma Chandran.

Emerging NGS techniques: Long-read sequencing

Thursday, Feb. 17
1:00pm - 2:00pm

This seminar will review long-read sequencing approaches in NGS technology and the bioinformatic challenges caused by coverage biases, high error rates in base calling, limited scalability, and limited availability of appropriate pipelines. Presented by Silvia Liu.

Differential Expression and Functional Analysis

Tuesday, Feb. 22
1:00pm - 4:00pm

This hands-on workshop will introduce participants to the statistical methods and considerations used to perform differential gene expression analysis on bulk RNA-seq data using DESeq2. The workshop will also provide an overview of tools for functional analysis of DE genes to make biological inferences from large gene lists. Presented by Dhivyaa Rajasundaram.

Introduction to epigenomics and ChIP-seq/ATAC-seq data analysis

Thursday, Feb. 24
1:00pm - 4:00pm

This workshop will provide both a theoretical and a practical introduction to ChIP-seq and ATAC-seq data analysis. In the first section, we will present the principles of ChIP and ATAC sequencing, the bioinformatics pipeline for peak calling, data visualization, methods of motif discovery, and a brief introduction to CUT&RUN sequencing. In the second half of the workshop, we will work hands-on with a real ChIP-seq dataset to practice the pipeline using the HTC cluster. Presented by Silvia Liu.

Emerging NGS techniques: TCR/BCR data analysis

Tuesday, March 1

1:00pm - 2:00pm

B cell receptors (BCRs) and T cell receptors (TCRs) make up an essential network of defense molecules. The analysis of BCR and TCR repertoires plays an important role in basic immunology. This seminar will review software methods to extract meaningful information from BCR and TCR sequence data. Presented by Dhivyaa Rajasundaram.

Introduction to nf-core pipelines and metagenomics (mag) data analysis

Tuesday, March 15

1:00pm - 4:00pm

Nextflow is a workflow manager developed specifically to ease the creation and execution of bioinformatics pipelines. nf-core is a community effort to collect a curated set of analysis pipelines built using Nextflow. nf-core pipelines are compatible with the HTC cluster's computational infrastructure, such as the Slurm job scheduler and Singularity containers for integrated software dependency management. We will introduce how to set up nf-core pipelines on the HTC cluster. We will then work through a hands-on exercise focusing on the mag pipeline (https://nf-co.re/mag), which is a bioinformatics best-practice analysis pipeline for assembly, binning and annotation of metagenomes. Presented by Paul Cantalupo.

Introduction to NGS data analysis and WES/WGS variant calling

Tuesday, March 22

1:00pm - 4:00pm

High-throughput sequencing technology involves a number of concepts and techniques that shape a project before application-specific processes are utilized. This workshop covers common file formats for sequence data and limitations of sequencing technologies. This workshop introduces the more “universal” aspects of high-throughput sequence analysis. We will explore a hands-on exercise focusing on WES/WGS data processing for variant calling. Presented by Riyue Bao.

scRNASeq data processing using cellranger and singlecellTK shiny app

Tuesday, March 29

1:00pm - 4:00pm

Cell Ranger is a set of analysis pipelines that process 10X single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more. Cell Ranger can be run in cluster mode, using the HTC cluster to run the stages on multiple nodes via batch scheduling. This allows highly parallelizable stages to utilize hundreds of cores concurrently, dramatically reducing time to solution. The Single Cell ToolKit (SCTK) is an analysis platform that provides an R interface to several popular single-cell RNA-sequencing (scRNAseq) data preprocessing, quality control, analysis, and visualization tools. We will work hands-on with a real single-cell RNA-seq dataset to practice using the singleCellTK Shiny UI on the HTC cluster. Presented by Fangping Mu.

scRNASeq data analysis using Seurat

Tuesday, April 5

1:00pm - 4:00pm

This workshop will briefly review the cellranger pipelines that process raw reads into expression values. The hands-on training will cover the Seurat pipeline: reading the count data into R, quality control, normalization, dimensionality reduction, cell clustering, and finding marker genes. It also includes the integrative analysis of stimulated vs. control data using the anchoring framework in Seurat v3. Presented by Dhivyaa Rajasundaram.

Emerging NGS techniques: Spatial transcriptomics

Tuesday, April 12

1:00pm - 2:00pm

Spatial transcriptomics is a molecular profiling method that allows scientists to measure all the gene activity in a tissue sample and map where the activity is occurring. This seminar will review spatial gene expression solutions and the data analysis methods. Presented by Riyue Bao.