WGCNA Explained: Everything You Need to Know

Multi-omics data analysis blog series

01 What is WGCNA?

WGCNA, short for Weighted Gene Co-expression Network Analysis, is a widely used method for analyzing gene co-expression networks. It is also known as weighted correlation network analysis. Weighted correlation network analysis, typically run through the WGCNA R package, examines correlation structures in high-dimensional datasets such as gene expression and proteomics data.

02 When is WGCNA used?

WGCNA is suited to complex transcriptome datasets with many samples, particularly studies of developmental regulation across different organs, tissues, and stages. It is also used to investigate response mechanisms to biotic and abiotic stresses at multiple time points. In these settings, differential network analysis helps identify changes in connectivity patterns between conditions.

03 How to interpret WGCNA results?

3.1 Identifying gene co-expression network sets

WGCNA groups genes with similar expression patterns into modules, based on the pairwise correlation of their expression across all samples. This condenses thousands of genes into a small number of modules, typically a dozen or more. Using transcriptome data from tomato fruit development research as an example, WGCNA identified 12 gene modules. The representation commonly used in the literature is shown in the figure below: the upper part is a hierarchical clustering dendrogram in which each leaf represents a gene, and the colored bar beneath it marks the module each gene is assigned to.
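The grouping step starts from a gene-by-gene similarity matrix. As a rough sketch of that first step (not the WGCNA R package itself), the Python below computes pairwise Pearson correlations and raises their absolute values to a soft-thresholding power β to obtain an unsigned weighted adjacency. The gene names, expression values, and β = 6 are illustrative assumptions; in practice β is chosen per dataset, and module detection then proceeds by hierarchical clustering on a topological overlap measure derived from this adjacency.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def weighted_adjacency(expr, beta=6):
    """Unsigned weighted adjacency: a_ij = |cor(gene_i, gene_j)| ** beta."""
    genes = list(expr)
    adj = {g: {} for g in genes}
    for gi in genes:
        for gj in genes:
            if gi == gj:
                adj[gi][gj] = 1.0
            else:
                adj[gi][gj] = abs(pearson(expr[gi], expr[gj])) ** beta
    return adj


# Hypothetical toy data: one list of per-sample expression values per gene.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.9],  # tracks geneA closely
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],  # unrelated pattern
}

adj = weighted_adjacency(expr, beta=6)
for gene, row in adj.items():
    print(gene, {other: round(weight, 4) for other, weight in row.items()})
```

Raising the correlation to a power keeps strong correlations close to 1 while pushing weak ones towards 0, which is why geneA and geneB stay tightly connected and geneC is effectively disconnected from both.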

3.2 Filtering key modules for functional enrichment analysis

Method 1: Filtering based on module eigengene expression patterns.

After identifying modules, WGCNA calculates a module eigengene (ME) for each module: the first principal component of the module's expression matrix, which summarizes the expression pattern of all genes in the module. Comparing eigengene values across samples helps identify modules closely associated with particular samples. For instance, in the tomato data, the "brown" module has high (positive) eigengene values in samples from the first period, making it a key module for subsequent analysis. This step requires the gene expression data as input, since the eigengenes are computed from it.
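To make the idea of a module-level summary value concrete, here is a minimal sketch that computes a first principal component across samples for a small module, using power iteration on standardized expression values. The expression matrix is hypothetical, and the starting vector and sign convention (aligning with the mean standardized profile) are assumptions made for this illustration rather than a description of the WGCNA package's internals.

```python
import math


def standardize(values):
    """Scale one gene's values to mean 0 and sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def module_eigengene(module_expr, iterations=200):
    """First principal component across samples, found by power iteration.

    module_expr is a list of genes, each a list of per-sample values.
    Returns one unit-length value per sample.
    """
    z = [standardize(row) for row in module_expr]
    n_genes = len(z)
    n_samples = len(z[0])
    # Start from the first gene's standardized profile (an assumption; any
    # vector not orthogonal to the leading component would do).
    vec = z[0][:]
    for _ in range(iterations):
        # Project the current vector onto every gene, then back onto samples.
        scores = [sum(z[g][s] * vec[s] for s in range(n_samples))
                  for g in range(n_genes)]
        new_vec = [sum(z[g][s] * scores[g] for g in range(n_genes))
                   for s in range(n_samples)]
        norm = math.sqrt(sum(v * v for v in new_vec))
        vec = [v / norm for v in new_vec]
    # Orient the sign so the result follows the module's average expression.
    mean_profile = [sum(z[g][s] for g in range(n_genes)) / n_genes
                    for s in range(n_samples)]
    if sum(a * b for a, b in zip(vec, mean_profile)) < 0:
        vec = [-v for v in vec]
    return vec


# Hypothetical module of three genes over six samples; the first two samples
# stand in for an early developmental period with high expression.
module_expr = [
    [9.0, 8.5, 3.0, 2.5, 2.0, 2.2],
    [7.0, 7.4, 2.1, 2.0, 1.8, 1.5],
    [12.0, 11.0, 5.0, 4.0, 4.5, 4.2],
]

eigengene = module_eigengene(module_expr)
print([round(v, 3) for v in eigengene])
```

In this toy module the resulting values are positive for the two early samples and negative for the rest, which is the kind of pattern used to link a module to a particular period.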

[Figure: Correlation heatmap between samples and modules]

Method 2: Filtering through module-sample or phenotype correlation analysis. Calculating the correlation coefficient between modules and sample or phenotype data identifies modules highly correlated with specific samples or phenotypes. In the tomato data, a specific correlation between JS3 and the "pink" module indicates that this module deserves special attention. If phenotype measurements are available for tomato fruit development, such as lycopene content, the modules with the highest correlation to lycopene content can also be selected.
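The module-phenotype correlation can be sketched as follows, assuming each module is already summarized by one value per sample. All numbers here are made up for illustration, including the lycopene measurements; the module colors are reused only as labels.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample summary values for three modules (five samples).
module_values = {
    "brown": [0.7, 0.3, -0.1, -0.4, -0.5],
    "pink": [-0.6, -0.4, -0.1, 0.4, 0.7],
    "yellow": [0.1, -0.5, 0.6, -0.3, 0.1],
}

# Hypothetical phenotype measured on the same five samples.
lycopene = [0.5, 1.0, 2.5, 6.0, 9.0]

correlations = {name: pearson(values, lycopene)
                for name, values in module_values.items()}
best_module = max(correlations, key=correlations.get)

for name, r in sorted(correlations.items(), key=lambda item: -item[1]):
    print(f"{name}: r = {r:.3f}")
print("Most positively correlated module:", best_module)
```

Ranking by the raw coefficient picks the most positively correlated module; ranking by absolute value instead would also surface strongly negative associations, such as the "brown" module in this toy data.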

Method 3: Filtering through module gene function enrichment. Running functional enrichment analysis, such as Gene Ontology (GO) enrichment, on each module helps identify modules associated with biological processes related to the traits of interest. For example, in tomato fruit development, processes such as carotenoid metabolism and ethylene signaling are relevant to fruit ripening, so modules enriched in the corresponding GO terms deserve focus. Additionally, the network as a whole can be visualized by plotting its connectivity distribution, with the logarithm of the corresponding frequency on the y-axis.
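Enrichment of a GO term in a module is commonly assessed with a one-sided hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes in a module of that size by chance. The sketch below is one way to compute that tail probability; the gene counts are invented for illustration, and a real analysis would also correct for testing many terms.

```python
from math import comb


def hypergeom_enrichment_p(background_size, annotated_in_background,
                           module_size, annotated_in_module):
    """P(X >= annotated_in_module) when module_size genes are drawn without
    replacement from a background containing annotated_in_background hits."""
    upper = min(module_size, annotated_in_background)
    total = comb(background_size, module_size)
    tail = sum(
        comb(annotated_in_background, k)
        * comb(background_size - annotated_in_background, module_size - k)
        for k in range(annotated_in_module, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 background genes, 150 of them annotated with a
# carotenoid-related GO term; a 300-gene module contains 12 annotated genes.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"Enrichment p-value: {p_value:.3g}")
```

With these counts only about two annotated genes would be expected in the module by chance, so observing twelve gives a very small p-value.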

Method 4: Filtering modules through target gene selection. Based on research objectives, previous findings, and published literature, modules containing target genes of interest can be selected directly for further analysis. In tomato fruit development, key genes involved in pectin degradation, such as PG2A and PL1, are found in the "yellow" module, making it a candidate for further investigation.

3.3 Identifying key genes

After narrowing down to candidate modules through the analyses above, the next step is to examine their internal composition and identify the key genes within them, often referred to as hub genes. This can be done by analyzing intramodular gene connectivity (TOM values, kME, or kIM) and selecting the genes with the highest connectivity in the network. Attention can also be directed towards genes with regulatory functions, such as transcription factors, as they generally act as upstream regulators in the module's regulatory network. RNA-Seq datasets from the Gene Expression Omnibus (GEO) are a valuable resource for this kind of transcriptomics research, covering many species and biological sample groups.
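As an illustration of connectivity-based ranking, the sketch below computes intramodular connectivity (kIM) as the sum of a gene's connection weights to the other genes in the same module, then sorts genes by that value. The gene labels and weights are hypothetical; kME, by comparison, is the correlation between a gene's expression and its module eigengene.

```python
# Hypothetical pairwise connection weights between four genes in one module.
# Each unordered pair appears once.
pair_weights = {
    ("g1", "g2"): 0.8,
    ("g1", "g3"): 0.7,
    ("g1", "g4"): 0.6,
    ("g2", "g3"): 0.3,
    ("g2", "g4"): 0.2,
    ("g3", "g4"): 0.1,
}


def intramodular_connectivity(weights):
    """Sum each gene's connection weights to all other genes in the module."""
    kim = {}
    for (gene_a, gene_b), weight in weights.items():
        kim[gene_a] = kim.get(gene_a, 0.0) + weight
        kim[gene_b] = kim.get(gene_b, 0.0) + weight
    return kim


kim = intramodular_connectivity(pair_weights)

# Genes with the highest connectivity are the hub gene candidates.
ranked = sorted(kim.items(), key=lambda item: item[1], reverse=True)
for gene, value in ranked:
    print(f"{gene}: kIM = {value:.2f}")
```

Here g1 is strongly connected to every other gene and therefore ranks first as the hub candidate.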

04 Metware Cloud Platform: Simplifying WGCNA Analysis

MetwareBio’s Metware Cloud Platform offers powerful tools for WGCNA, enabling seamless network construction, module identification, and visualization. With an intuitive interface and fast processing, researchers can focus on insights rather than technical hurdles. Watch how the Metware Cloud Platform streamlines WGCNA analysis, helping you visualize gene co-expression networks effortlessly.

MetwareBio has extensive experience in LC-MS/MS detection services and multi-omics data analysis. Explore how MetwareBio’s solutions in proteomics, transcriptomics, metabolomics, and multi-omics can advance your research today!
