Journal Article
Research Support, Non-U.S. Gov't

Gene Ontology analysis in multiple gene clusters under multiple hypothesis testing framework.

OBJECTIVE: Gene Ontology (GO) has become a routine resource for functional analysis of gene lists. Although a number of tools have been provided to identify enriched GO terms in one or two gene lists, two technical challenges remain. The first is how to handle multiple hypothesis testing when the tests are heavily correlated; the second is how to identify GO terms that are enriched in one gene cluster relative to multiple other gene clusters. We provide a statistical procedure that treats these problems rigorously and offer a software tool for applying GO to the analysis of gene clusters.

METHODS: We previously introduced a statistical procedure that handles hypothesis testing in a two-group comparison scenario. In this paper we extend the two-group comparison procedure into a general procedure that enables the analysis of any number of gene lists/clusters. The new procedure identifies GO terms enriched in any gene cluster while controlling for multiple hypothesis testing. It is implemented in a user-friendly analysis tool, GoSurfer. The current version of GoSurfer takes one or several gene lists as input and identifies the GO terms that are enriched in any of the input gene lists. GoSurfer estimates a conservative false discovery rate (FDR) for every GO term. The FDR estimation procedure in GoSurfer has two advantages: it does not rely on an independence assumption, and it does not assume that all the hypotheses are null hypotheses (the complete null). Thus GoSurfer's FDR estimates are mildly conservative rather than overly conservative.
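To make the idea of estimating an FDR without an independence assumption concrete, the following is a minimal sketch of a generic permutation-based estimate for GO enrichment across several gene clusters. It is not GoSurfer's procedure, which the abstract does not specify: the function names (`hypergeom_sf`, `enrichment_pvalues`, `permutation_fdr`), the one-sided hypergeometric test, and the assumption that clusters are disjoint are all illustrative choices. Note also that this sketch permutes under the complete null, which is the very assumption the abstract says GoSurfer avoids, so it would tend to be more conservative than the procedure described above. Correlation between GO terms is preserved because the gene-to-term annotations stay fixed while cluster membership is shuffled.

```python
import random
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment_pvalues(clusters, annotations, universe):
    """One-sided enrichment p-value for every (cluster, term) pair."""
    N = len(universe)
    pvals = {}
    for cname, genes in clusters.items():
        genes = set(genes)
        for term, annotated in annotations.items():
            annotated = set(annotated) & universe
            k = len(genes & annotated)
            pvals[(cname, term)] = hypergeom_sf(k, N, len(annotated), len(genes))
    return pvals


def permutation_fdr(clusters, annotations, threshold, n_perm=200, seed=0):
    """Estimate the FDR at a p-value threshold by shuffling cluster membership.

    Assumes the clusters are disjoint (hypothetical simplification). The
    annotations are left untouched, so dependence among GO terms carries
    over into every permuted data set.
    """
    rng = random.Random(seed)
    universe = set().union(*clusters.values())
    observed = enrichment_pvalues(clusters, annotations, universe)
    n_called = sum(p <= threshold for p in observed.values())
    if n_called == 0:
        return 0.0, observed

    genes = sorted(universe)
    sizes = [(name, len(set(members))) for name, members in clusters.items()]
    null_calls = 0
    for _ in range(n_perm):
        rng.shuffle(genes)
        shuffled, start = {}, 0
        for name, size in sizes:
            shuffled[name] = genes[start:start + size]
            start += size
        null = enrichment_pvalues(shuffled, annotations, universe)
        null_calls += sum(p <= threshold for p in null.values())

    # Expected number of false calls per permutation over observed calls.
    fdr = min(1.0, (null_calls / n_perm) / n_called)
    return fdr, observed
```

A call such as `permutation_fdr(clusters, annotations, threshold=0.001)`, where `clusters` maps cluster names to gene identifiers and `annotations` maps GO terms to gene identifiers, returns the estimated FDR at that threshold together with the observed p-values for each (cluster, term) pair.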

RESULTS: We implemented the new procedure for GO analysis of multiple gene clusters in the GoSurfer software. We provide three examples of using GoSurfer to analyze time course gene expression data sets on the differentiation of embryonic stem cells. In the example analysis of multiple gene clusters, we first used a typical clustering algorithm and identified five gene clusters, representing up-regulation, down-regulation and other patterns in the differentiation time course. Taking all five gene clusters as input data, GoSurfer reports "cell adhesion" and "muscle contraction" as significant GO terms for the up-regulated cluster and "amino acids metabolism" as a significant GO term for the down-regulated cluster. It also reports a number of GO terms related to RNA processing and RNA transport as significant for a cluster that is up-regulated at both early and late time points. This may suggest that genes for RNA processing and genes for RNA transport are coregulated in the differentiation process of embryonic stem cells.

CONCLUSION: The GoSurfer software is provided to analyze multiple gene clusters and identify GO terms that are enriched in any gene cluster. GoSurfer is available at: www.gosurfer.org.
