Correspondence should be addressed to:
Erik van den Akker
Molecular Epidemiology, Leiden University Medical Centre, Leiden, The Netherlands
Multiple studies have shown that gene expression profiling of primary breast cancers throughout the final stages of tumor development can provide valuable markers for predicting the risk of metastasis and for disease subtyping. However, identifying a biologically interpretable and universally shared set of markers has proved difficult. Here, we propose a method for de novo grouping of genes that dissects the protein-protein interaction network into disjoint subnetworks using pairwise gene expression correlation measures. We show that the resulting subnetworks are functionally coherent and are consistently identified when the method is applied to a compendium of six different breast cancer studies. Applying the proposed method with different integration approaches underlines the robustness of the identified subnetwork related to the cell cycle and identifies putative new subnetwork markers for metastasis related to cell-cell adhesion, the proteasome complex and JUN-FOS signalling. Although gene selection with the proposed method does not directly improve upon previously reported cross-study classification performance, it shows great promise for applications in data integration and result interpretation.
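The core idea of dissecting an interaction network by co-expression can be illustrated with a minimal sketch. This is not the procedure used in this work: it assumes one simple variant in which an interaction is kept only if the Pearson correlation between the two genes' expression profiles reaches a threshold, and the disjoint subnetworks are then the connected components of the remaining graph. The function names, the threshold `min_corr`, the gene labels and the expression values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def dissect_network(ppi_edges, expression, min_corr=0.7):
    """Split a PPI network into disjoint subnetworks of co-expressed genes.

    Assumption: an edge survives only if the correlation of its two genes
    is at least `min_corr`; genes left without any edge are omitted.
    """
    adjacency = {}
    for gene_a, gene_b in ppi_edges:
        if gene_a not in expression or gene_b not in expression:
            continue  # skip interactions without expression data
        if pearson(expression[gene_a], expression[gene_b]) >= min_corr:
            adjacency.setdefault(gene_a, set()).add(gene_b)
            adjacency.setdefault(gene_b, set()).add(gene_a)

    # Connected components of the filtered graph are the subnetworks.
    seen = set()
    subnetworks = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            gene = stack.pop()
            if gene in seen:
                continue
            seen.add(gene)
            component.add(gene)
            stack.extend(adjacency[gene] - seen)
        subnetworks.append(sorted(component))
    return subnetworks


# Hypothetical toy data: five genes measured in four samples.
expression = {
    "G1": [1, 2, 3, 4],
    "G2": [2, 4, 6, 8],
    "G3": [4, 3, 2, 1],
    "G4": [1, 3, 2, 5],
    "G5": [2, 6, 4, 10],
}
ppi_edges = [("G1", "G2"), ("G2", "G3"), ("G3", "G4"), ("G4", "G5")]
subnetworks = dissect_network(ppi_edges, expression)
print(subnetworks)
```

In this toy example the interactions G2-G3 and G3-G4 connect negatively correlated genes and are removed, which separates the chain into two subnetworks and leaves G3 unassigned. Whether to use signed or absolute correlation, and how to choose the threshold, are design choices that this sketch does not settle.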