Aneta Sawikowska
Comprehensive data analyses for high-throughput LC-MS data are presented. Methods of statistical analysis and data integration for multifactorial experiments are shown. Example data sets come from studies on the response of cereals to pathogen infection and on barley (Hordeum vulgare) under drought stress. Primary metabolites, secondary metabolites and proteins were analyzed.
Data preprocessing, analysis and visualization were done in the R system. Statistical analyses were performed using procedures in the Genstat package. Methods of omics data integration and visualization by networks are presented.
Correlation networks and differential correlation networks were constructed to compare relations between metabolites and proteins under different conditions. Traits are represented by nodes, and lines (edges) correspond to correlations between pairs of traits. Modules, that is, clusters of highly correlated traits, are detected. Hubs, which are traits with many connections (correlations with other traits), are indicated.
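The network vocabulary above (nodes, edges, hubs) can be illustrated with a minimal Python sketch. It is not the analysis used in the study: the trait names and values are invented toy data, and the hard cutoff of 0.8 on the absolute correlation is an assumption made only to keep the example small.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_network(traits, cutoff=0.8):
    """Return {(trait_a, trait_b): r} for pairs with |r| >= cutoff.

    The cutoff is a hypothetical simplification, not a value from the study.
    """
    edges = {}
    for a, b in combinations(sorted(traits), 2):
        r = pearson(traits[a], traits[b])
        if abs(r) >= cutoff:
            edges[(a, b)] = r
    return edges


def degrees(traits, edges):
    """Count the edges attached to each node; high-degree nodes are hubs."""
    deg = {name: 0 for name in traits}
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


# Invented toy measurements for four traits across five samples.
traits = {
    "met_A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "met_B": [2.1, 3.9, 6.2, 8.0, 9.9],
    "prot_C": [5.0, 4.1, 2.9, 2.0, 1.1],
    "prot_D": [3.0, 1.0, 4.0, 1.0, 5.0],
}

edges = correlation_network(traits)
deg = degrees(traits, edges)
for name in sorted(deg, key=deg.get, reverse=True):
    print(name, deg[name])
```

In this toy example the first three traits are mutually connected, while prot_D stays isolated because its correlations with the others are weak.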
Correlation network analysis was done using the WGCNA package in R: the Pearson correlation matrix was transformed into an adjacency matrix using a power function. Modules were detected by clustering. Differential correlation networks were created using a test based on Fisher's Z transformation, with Bonferroni correction. Networks were visualized in Cytoscape.
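The two numerical steps named above can be sketched in plain Python. This is an illustrative reimplementation under stated assumptions, not a call into WGCNA: the function names, the exponent beta = 6, the sample sizes and the correlation values are all hypothetical, and the test assumes independent samples in the two conditions.

```python
from math import atanh, erfc, sqrt


def soft_threshold_adjacency(r, beta=6):
    """Power-function adjacency |r|**beta (beta = 6 is only an example value)."""
    return abs(r) ** beta


def fisher_z_test(r1, n1, r2, n2):
    """Compare two correlations from independent samples of size n1 and n2.

    Returns the z statistic and the two-sided normal p-value.
    """
    z = (atanh(r1) - atanh(r2)) / sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    p = erfc(abs(z) / sqrt(2.0))
    return z, p


def differential_edges(corr_a, corr_b, n_a, n_b, alpha=0.05):
    """Keep trait pairs whose correlation differs between two conditions.

    Bonferroni correction: each p-value is compared with alpha / number of tests.
    """
    pairs = sorted(set(corr_a) & set(corr_b))
    threshold = alpha / len(pairs)
    result = {}
    for pair in pairs:
        z, p = fisher_z_test(corr_a[pair], n_a, corr_b[pair], n_b)
        if p < threshold:
            result[pair] = (z, p)
    return result


# Invented correlations for the same trait pairs under two conditions.
corr_control = {
    ("met_A", "prot_C"): 0.90,
    ("met_A", "met_B"): 0.85,
    ("met_B", "prot_C"): 0.10,
}
corr_drought = {
    ("met_A", "prot_C"): -0.60,
    ("met_A", "met_B"): 0.80,
    ("met_B", "prot_C"): 0.15,
}

diff = differential_edges(corr_control, corr_drought, n_a=30, n_b=30)
for pair, (z, p) in diff.items():
    print(pair, round(z, 2), p)
```

Only the pair whose correlation changes sign between the two conditions survives the corrected threshold here; the other two pairs differ too little relative to the sample size.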
The algorithms can be adapted to any high-throughput LC-MS data.