SSSC Seminar: Dimension Reduction Methods: from PCA to TSNE and UMAP
In this seminar series, I will review some basic concepts of statistical analyses that are commonly used in everyday research. I will talk about supervised versus unsupervised learning. My talk will focus on unsupervised methods, in particular on dimension reduction methods, which are fundamental to most state-of-the-art technologies used in single cell data analysis. Principal Component Analysis (PCA) provides a foundation for understanding various dimension reduction methods. Principal components are linear combinations of the original variables, constructed so that the 1st principal component captures the largest variance, each subsequent component captures the largest remaining variance, and all components are mutually orthonormal. Multidimensional Scaling (MDS) provides an alternative strategy for dimension reduction, and I will explain why MDS is equivalent to PCA. PCA and MDS are linear dimension reduction methods. In 2000, two Science papers described nonlinear dimension reduction methods: ISOMAP and locally linear embedding (LLE). ISOMAP is similar to MDS except that it uses geodesic distance instead of Euclidean distance. ISOMAP and LLE provided a conceptual framework that led to the development of current state-of-the-art dimension reduction methods such as TSNE and UMAP, which have much improved performance and better visual representation. I will discuss the connection between dimension reduction methods and clustering analysis, and I will talk about integrated data analysis using Canonical Correlation Analysis and trajectory analysis using reversed graph embedding.
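The PCA/MDS equivalence mentioned in the abstract can be sketched numerically. The following is a minimal illustration on synthetic data (NumPy only; the data and variable names are invented for illustration, not taken from the talk): PCA projects centered data onto the eigenvectors of the covariance matrix, while classical MDS double-centers the squared Euclidean distance matrix and eigen-decomposes it; the two embeddings agree up to the sign of each axis.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 50 samples, 4 correlated variables (illustrative only)
X = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 4))
Xc = X - X.mean(axis=0)          # center each column

# PCA: eigen-decompose the covariance matrix; the components are
# orthonormal directions ordered by descending explained variance.
cov = Xc.T @ Xc / (len(Xc) - 1)
evals, evecs = np.linalg.eigh(cov)
order = np.argsort(evals)[::-1]  # eigh returns ascending order
evals, evecs = evals[order], evecs[:, order]
pca_scores = Xc @ evecs[:, :2]   # project onto the first two PCs

# Classical MDS: double-center the squared Euclidean distance matrix
# and eigen-decompose; this recovers the same configuration as PCA.
n = len(Xc)
D2 = ((Xc[:, None, :] - Xc[None, :, :]) ** 2).sum(axis=-1)
J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
B = -0.5 * J @ D2 @ J                        # Gram matrix Xc @ Xc.T
mvals, mvecs = np.linalg.eigh(B)
morder = np.argsort(mvals)[::-1]
mds_scores = mvecs[:, morder[:2]] * np.sqrt(mvals[morder[:2]])

# The two embeddings agree up to the sign of each axis.
for k in range(2):
    s = np.sign(pca_scores[:, k] @ mds_scores[:, k])
    assert np.allclose(pca_scores[:, k], s * mds_scores[:, k], atol=1e-8)
```

Note that MDS only needs the pairwise distance matrix, not the original variables; replacing Euclidean distances with geodesic distances in this same recipe is exactly the step ISOMAP takes.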
Thursday, April 30, 2020, 12noon – 1pm
‘Dimension Reduction Methods: from PCA to TSNE and UMAP – Part III’