Metabolomic Analyses: Comparison of PCA, PLS-DA and OPLS-DA

MetwareBio data analysis blog series


Papers that use metabolomics routinely report PCA, PLS-DA, and OPLS-DA results. What distinguishes these three methods, and what different conclusions does each support in biological research? This article explains how PCA, PLS-DA, and OPLS-DA differ and what each is used for.


What is PCA analysis?

Principal Component Analysis (PCA) is an unsupervised multivariate statistical method. It applies an orthogonal transformation that converts potentially correlated variables into linearly uncorrelated variables called principal components. In effect, PCA compresses the raw data into a few principal components that summarize the main characteristics of the original dataset. PC1 captures the direction of greatest variance in the multidimensional data matrix, PC2 the next greatest, and so on (Eriksson et al., 2006).
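As a rough illustration of the transformation described above, the sketch below computes principal components with a singular value decomposition in NumPy. The data matrix is simulated and purely hypothetical, and scaling choices that are common in metabolomics (unit-variance or Pareto scaling) are left out for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 8 samples x 5 metabolite intensities (simulated, not real measurements).
X = rng.normal(size=(8, 5))
X[:, 1] += 2.0 * X[:, 0]          # make two variables correlated on purpose

# Mean-center each variable; PCA is defined on centered data.
Xc = X - X.mean(axis=0)

# Singular value decomposition: the rows of Vt are the orthonormal loading vectors.
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

scores = U * s                     # sample coordinates on PC1, PC2, ...
explained = s**2 / np.sum(s**2)    # fraction of total variance per component

# The score columns are mutually uncorrelated, and variance decreases from PC1 onward.
score_cov = np.cov(scores, rowvar=False)
print(np.round(explained, 3))
```

The off-diagonal entries of `score_cov` are zero up to rounding, which is what "linearly uncorrelated" means in practice, and `explained` is sorted from PC1 downward.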


What is PLS-DA analysis?

Partial Least Squares Discriminant Analysis (PLS-DA) is a multivariate dimensionality reduction method that has been used in chemometrics for more than two decades and is recommended for omics data analysis. It can be regarded as a "supervised" version of PCA: it reduces dimensionality while taking group information into account. As a result, it supports not only dimensionality reduction but also feature selection and classification.
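To make the "supervised" aspect concrete, here is a minimal one-component PLS-DA sketch in NumPy, assuming two groups coded as 0 and 1. The data are simulated, the 0.5 decision threshold is an illustrative assumption, and the fit is evaluated on the training samples only; a real analysis would add cross-validation and permutation testing.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 6 control and 6 treated samples, 4 metabolites (simulated).
y = np.array([0.0] * 6 + [1.0] * 6)     # dummy-coded group labels
X = rng.normal(size=(12, 4))
X[:, 0] += 3.0 * y                       # only metabolite 0 differs between groups

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# First PLS weight vector: the direction in X with maximal covariance with y.
w = Xc.T @ yc
w /= np.linalg.norm(w)

t = Xc @ w                               # sample scores on the first PLS component
q = (yc @ t) / (t @ t)                   # regression of y on the scores

# Predict group membership by thresholding the fitted response (assumed cutoff 0.5).
y_hat = y.mean() + q * t
predicted = (y_hat > 0.5).astype(float)
print(np.round(w, 2), predicted)
```

Unlike PCA, the direction `w` is chosen using the labels `y`, so the scores `t` are pulled toward the group difference rather than toward the largest overall variance.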


What is OPLS-DA analysis?

Orthogonal Partial Least Squares Discriminant Analysis (OPLS-DA), as the name suggests, combines orthogonal signal correction (OSC) with PLS-DA. It decomposes the X matrix into variation that is related to Y and variation that is unrelated (orthogonal) to Y, which makes it easier to select differential variables. Unlike PCA, and like PLS-DA, OPLS-DA is a supervised discriminant analysis method.
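The decomposition into Y-related and Y-unrelated variation can be sketched for a single response and a single orthogonal component as follows. This follows the general shape of the published OPLS algorithm but is only a simplified illustration on simulated data; the variable names and the `nuisance` effect are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 10 samples, 5 metabolites (simulated).
y = np.array([0.0] * 5 + [1.0] * 5)      # group labels
nuisance = rng.normal(size=10)            # structured variation unrelated to the design
X = rng.normal(scale=0.3, size=(10, 5))
X[:, 0] += 2.0 * y                        # group effect on metabolite 0
X[:, 1:3] += nuisance[:, None]            # nuisance effect on metabolites 1 and 2

Xc = X - X.mean(axis=0)
yc = y - y.mean()

# Predictive direction, as in PLS.
w = Xc.T @ yc
w /= np.linalg.norm(w)
t = Xc @ w
p = Xc.T @ t / (t @ t)

# Orthogonal weight: the part of the loading p that does not lie along w.
w_o = p - (w @ p) * w
w_o /= np.linalg.norm(w_o)
t_o = Xc @ w_o                            # orthogonal scores
p_o = Xc.T @ t_o / (t_o @ t_o)            # orthogonal loadings

# Remove the Y-orthogonal component from X.
X_filtered = Xc - np.outer(t_o, p_o)

print(float(t_o @ yc))                    # zero up to rounding: t_o carries no Y information
```

Because `w_o` is constructed to be orthogonal to `w`, the orthogonal scores `t_o` have zero covariance with `y`, so subtracting them removes variation from X without removing group information.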


Comparison of PCA, PLS-DA and OPLS-DA analysis








Method | Type | Advantages | Disadvantages
PCA | Unsupervised | Data visualization, evaluation of biological replicates | Unable to identify differential metabolites
PLS-DA | Supervised | Identify differential metabolites, build classification models | May be affected by noise
OPLS-DA | Supervised | Improve the accuracy and reliability of differential analysis | Higher computational complexity



What is PCA analysis used for?

PCA serves two purposes. First, it shows whether the biological replicates are suitable for downstream analysis. For instance, in the left panel of Figure 1 [1] the biological replicates cluster well, so the data are suitable for subsequent differential metabolite screening. The right panel, by contrast, contains outlier samples; removing such samples is recommended to avoid false positives or false negatives when differential metabolites are selected.
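Outliers are often judged by eye on the score plot, but a numeric check can help. The sketch below flags samples with a Hotelling-style T² statistic on the first two PCA scores. The data are simulated, and the cutoff (the 95% quantile of a chi-square distribution with 2 degrees of freedom) is an assumption made here for simplicity; an F-distribution limit is commonly used for small sample sizes.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: 11 well-behaved replicates plus one aberrant sample (simulated).
X = rng.normal(size=(12, 5))
X[-1] += 20.0                              # the last sample is shifted on every metabolite

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
scores = (U * s)[:, :2]                    # PC1 and PC2 scores

# Hotelling-style T^2: squared scores scaled by each component's variance.
t2 = np.sum(scores**2 / scores.var(axis=0, ddof=1), axis=1)

# Assumed cutoff: 95% chi-square quantile with 2 degrees of freedom, i.e. -2*ln(0.05).
cutoff = -2.0 * np.log(0.05)
flagged = np.flatnonzero(t2 > cutoff)
print(flagged)
```

A flagged sample is a candidate for inspection, not automatic removal; whether to exclude it remains a judgment about the experiment.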




Second, PCA reveals which factors contribute most to the differences among samples. In a study with two factors (breed and treatment temperature) that yield four groups of samples, PCA may show that breed accounts for the largest difference, along PC1, followed by treatment temperature along PC2.


What are PLS-DA and OPLS-DA analyses used for?

PLS-DA builds on PCA by incorporating group information, so the model is driven to separate the predefined groups. This makes differences between groups easy to inspect and makes PLS-DA an important tool for screening differential metabolites. PLS-DA analysis pinpoints the metabolites that contribute most to the differences between treatments or groups and therefore deserve focused attention.
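The text does not name a specific statistic for ranking metabolites. One widely used choice is the Variable Importance in Projection (VIP) score, and the sketch below assumes it. It fits a single-response PLS model with the NIPALS algorithm in NumPy and computes VIP per variable; the function name `pls1_vip` and the simulated data are hypothetical.

```python
import numpy as np

def pls1_vip(X, y, n_components=2):
    """Fit a single-response PLS model by NIPALS and return one VIP score per variable."""
    Xr = X - X.mean(axis=0)
    yr = y - y.mean()
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))   # unit-length weight vectors
    ss = np.zeros(n_components)            # y-variation explained by each component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w
        tt = t @ t
        p = Xr.T @ t / tt
        q = (yr @ t) / tt
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - q * t                    # deflate y
        W[:, a] = w
        ss[a] = q**2 * tt
    return np.sqrt(n_vars * (W**2 @ ss) / ss.sum())

rng = np.random.default_rng(4)

# Hypothetical data: 6 control and 6 treated samples, 6 metabolites (simulated).
y = np.array([0.0] * 6 + [1.0] * 6)
X = rng.normal(size=(12, 6))
X[:, 2] += 4.0 * y                         # metabolite 2 carries the group difference

vip = pls1_vip(X, y)
ranking = np.argsort(vip)[::-1]            # most important variable first
print(ranking[0], np.round(vip, 2))
```

By construction the squared VIP scores average to 1, which is why VIP > 1 is a common rule of thumb for shortlisting metabolites, usually combined with a univariate test and fold change.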




Both PLS-DA and OPLS-DA can be used to select differential metabolites. The key difference is that OPLS-DA adds orthogonal signal correction, which helps filter out variation introduced by non-experimental factors. In a study of drought-treated plants, for instance, slight differences in light intensity among the treated plants could cause metabolite variation that is unrelated to drought. OPLS-DA separates out this variation, reducing such false positives and directing attention to the metabolites of genuine interest.




PCA, PLS-DA, and OPLS-DA analyses are commonly used statistical analysis methods in metabolomics research. The choice of method depends on the research purpose and data characteristics. At MetwareBio's Boston laboratory, we offer extensive metabolomics and multi-omics testing services, alongside comprehensive data analysis services. Access our free and user-friendly Metware Cloud Platform for seamless analysis of your multi-omics data. Have questions? We're here to offer guidance and support every step of the way!


