Correspondence should be addressed to:
Department of Computer Science, University of Texas at San Antonio, TX 78249, USA
Revealing the complex interactions between trans- and cis-regulatory elements and identifying their potential binding sites are fundamental problems in understanding gene expression. Advances in ChIP-chip technology facilitate the identification of DNA sequences that are recognized by a specific transcription factor. However, protein-DNA binding is a necessary, but not sufficient, condition for transcriptional regulation. To further support a regulatory relationship, the expression levels of the transcription factor and its putative target gene should also be shown to be correlated. Here, instead of using a linear correlation coefficient, we use a non-linear function that seems to better capture possible regulatory relationships. By analyzing tissue-specific gene expression profiles of human and mouse, we delineate a list of transcription factor-gene pairs with highly correlated expression levels, which may have regulatory relationships. Using two closely related species (human and mouse), we perform comparative genome analysis to cross-validate the quality of our predictions. Our findings are confirmed by matching them against publicly available TFBS databases (such as TRANSFAC and ConSite) and by reviewing the biological literature. For example, according to our analysis, 80% and 85.71% of the target genes associated with the transcription factors E2F5 and RELB, respectively, have the corresponding known binding sites. We also substantiate our results for some oncogenes with the biomedical literature. Further analysis of these oncogenes suggests that BCR and DEK may be regulated by some common transcription factors. Similar results are also reported for the BTG1, FCGR2B and LCK genes.