Gene Set Enrichment: Activities

Our work is related to Dempster-Shafer evidence theory. Dempster-Shafer theory offers an alternative to traditional probability theory for the mathematical representation of uncertainty. The significant innovation of this framework is that it allows a probability mass to be allocated to sets or intervals. Dempster-Shafer theory does not require an assumption regarding the probability of the individual constituents of the set or interval.
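To make the idea of allocating mass to sets concrete, the following is a minimal sketch rather than part of our toolchain. The frame of discernment (three hypothetical expression states) and the mass values are illustrative assumptions. Belief sums the mass of all focal sets contained in a hypothesis, and plausibility sums the mass of all focal sets that intersect it.

```python
# Hypothetical frame of discernment: three expression states of a gene.
UP, DOWN, FLAT = "up", "down", "flat"

# A basic mass assignment. Mass may sit on a set without being split
# among the members of that set (all values are illustrative).
mass = {
    frozenset({UP}): 0.5,
    frozenset({UP, FLAT}): 0.3,
    frozenset({UP, DOWN, FLAT}): 0.2,  # mass on the whole frame = ignorance
}


def belief(mass, hypothesis):
    """Total mass of focal sets fully contained in the hypothesis."""
    return sum(m for focal, m in mass.items() if focal <= hypothesis)


def plausibility(mass, hypothesis):
    """Total mass of focal sets that overlap the hypothesis."""
    return sum(m for focal, m in mass.items() if focal & hypothesis)


bel_up = belief(mass, frozenset({UP}))
pl_up = plausibility(mass, frozenset({UP}))
print("Bel(up) =", bel_up, " Pl(up) =", pl_up)
```

The interval between belief and plausibility expresses how much of the evidence is uncommitted, which a single probability value cannot represent.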

There are two critical and related issues concerning the combination of evidence obtained from multiple sources: one is the type of evidence involved and the other is how to handle conflicting evidence. We consider four types of evidence from multiple sources that affect the choice of how information is to be combined: consonant evidence, consistent evidence, arbitrary evidence, and disjoint evidence. Traditional probability theory cannot handle consonant, consistent, or arbitrary evidence without resorting to further assumptions about the probability distributions within a set, nor can it express the level of conflict between these evidential sets. Dempster-Shafer theory is a framework that can handle these various evidentiary types by combining a notion of probability with the traditional conception of sets. In addition, Dempster-Shafer theory offers many ways to incorporate conflict when combining multiple sources of information.
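One of the ways to combine sources is Dempster's rule of combination, sketched below under illustrative assumptions (the two sources and their mass values are hypothetical). The rule multiplies the masses of every pair of focal sets, assigns the product to their intersection, collects the mass that falls on empty intersections as the conflict K, and renormalizes by 1 - K.

```python
def combine_dempster(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Returns the combined mass function and the conflict K, the total
    product mass assigned to pairs of focal sets with no common element.
    """
    joint = {}
    conflict = 0.0
    for set_a, mass_a in m1.items():
        for set_b, mass_b in m2.items():
            common = set_a & set_b
            if common:
                joint[common] = joint.get(common, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule is undefined")
    combined = {s: m / (1.0 - conflict) for s, m in joint.items()}
    return combined, conflict


# Two hypothetical sources over the frame {a, b, c}.
frame = frozenset({"a", "b", "c"})
source_1 = {frozenset({"a"}): 0.6, frame: 0.4}
source_2 = {frozenset({"b"}): 0.5, frame: 0.5}

combined, k = combine_dempster(source_1, source_2)
print("conflict K =", k)
for focal, m in combined.items():
    print(sorted(focal), round(m, 4))
```

Renormalizing by 1 - K is only one choice. Other combination rules redistribute the conflicting mass differently, for example by assigning it to the whole frame.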

Characterizing the uncertainties in sensor measurements is still a challenging problem, first because there is no general analytical solution for non-linear and/or non-Gaussian situations, and second because both the environment and the sensor working conditions are time-varying in many practical applications. Monte Carlo methods provide an approach to approximating non-Gaussian distributions. Multiple models combined with adaptive model switching provide a divide-and-conquer approach to handling complicated uncertain situations. Fuzzy reasoning, as a general tool for coping with uncertainty, could be useful in characterizing sensor uncertainties.
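As a small illustration of the Monte Carlo idea, the sketch below pushes Gaussian position noise through a non-linear range computation and summarizes the resulting non-Gaussian distribution empirically. The sensor model, the noise level, and the function names are assumptions made for this example only.

```python
import math
import random
import statistics


def simulate_range_readings(true_x, true_y, sigma, n_samples=20000, seed=1):
    """Sample range readings r = sqrt(x^2 + y^2) when x and y each carry
    independent Gaussian noise with standard deviation sigma."""
    rng = random.Random(seed)
    return [
        math.hypot(rng.gauss(true_x, sigma), rng.gauss(true_y, sigma))
        for _ in range(n_samples)
    ]


def summarize(samples):
    """Empirical mean and 5% / 95% quantiles of the sampled readings."""
    ordered = sorted(samples)
    n = len(ordered)
    return {
        "mean": statistics.fmean(ordered),
        "p05": ordered[int(0.05 * n)],
        "p95": ordered[int(0.95 * n)],
    }


# Target at distance 1.0 with fairly large position noise.
summary = summarize(simulate_range_readings(1.0, 0.0, 0.5))
print(summary)
```

Because the range is a non-linear function of the noisy coordinates, the sampled mean lies above the true distance of 1.0, a bias that a purely Gaussian error model would miss.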

We decided to apply the statistical algorithms and Bayesian methods developed for multi-sensor information integration, in a grid-based network tool, to analyze and construct the gene regulatory networks of the fission yeast cell system. This problem is not directly related to Cooperative Autonomous Mobile Robotic Systems (CAMoRoS). However, the techniques applied in this work are similar, and any experience gained in solving this genomics problem may be applied to CAMoRoS. Through this effort, we are developing bioinformatics projects at our CREST Center. The biology department is the largest department in the College of Science and Technology, and NCCU also has two large institutes, one for biotechnology (BBRI) and one for biopharmaceutical science (BRITE). We thus see this project as a golden opportunity to start developing bioinformatics projects and, at the same time, to further advance the algorithms related to CAMoRoS.

Fission yeast cell cycle regulation has been studied on a genome-wide scale with different experiments and platforms. We used a differential meta-analytic approach to identify stress-responsive genes. We applied our method to a combination of ten genome-wide time-course expression experiments on the cell division cycle of the fission yeast Schizosaccharomyces pombe. We studied the difference between the significance level of the periodicity of oscillation and that of the expression regulation for every gene in S. pombe over the course of the cell cycle. This led to a comprehensive identification of two statistically significant gene sets showing markedly opposing patterns of expression: (1) highly periodic but weakly regulated genes, and (2) genes with high expression levels that do not follow a cyclic pattern. The second set is of more interest to us because it represents the genes that respond to the environmental stresses introduced by the cell growth arrest mechanisms of the different synchronization protocols used in biological labs. We performed Gene Set Enrichment Analysis (GSEA) on the second set of genes to validate the genome-wide ranking against the global enrichment patterns of well-characterized gene sets in S. pombe. We also conducted Bayesian network analysis to find the interactions between these stress-responsive genes. We identified a new regulatory network of genes that are known for their responses to environmental stress.
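The core of a GSEA-style validation is a running-sum enrichment score over the ranked gene list. The sketch below shows the unweighted (Kolmogorov-Smirnov-like) form of that score. It is an illustration under stated assumptions, not our analysis pipeline: the gene identifiers are hypothetical, the ranked list is assumed to contain no duplicates, and the permutation step that turns a score into a significance level is omitted.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list, stepping up by 1/n_hit at each gene in
    the set and down by 1/n_miss otherwise, and return the running-sum
    value that deviates most from zero.
    """
    members = set(gene_set) & set(ranked_genes)
    n_hit = len(members)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must cover part, but not all, of the list")
    running = 0.0
    extreme = 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in members else -1.0 / n_miss
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical ranking of ten genes, most stress-responsive first.
ranking = ["g%d" % i for i in range(1, 11)]
top_set = {"g1", "g2", "g3"}      # concentrated at the top of the ranking
bottom_set = {"g8", "g9", "g10"}  # concentrated at the bottom

print("ES(top set)    =", enrichment_score(ranking, top_set))
print("ES(bottom set) =", enrichment_score(ranking, bottom_set))
```

A score near +1 indicates that the set is concentrated at the top of the ranking, and a score near -1 that it is concentrated at the bottom.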

We evaluated the P-value for periodicity following the procedure described by de Lichtenberg et al. (Comparison of computational methods for the identification of cell cycle-regulated genes. Bioinformatics, 2005. 21(7): p. 1164-1171). The P-value for periodicity is calculated from the Fourier sum (to fit sinusoids) and permutation testing with 100,000 permutations. For computing the P-value for expression, we assumed that genes with higher variation are more likely to be differentially expressed. For a given experiment, we first estimate the population variance, then estimate the sample variance for each gene using a bootstrapping procedure, and from these obtain a final P-value for expression. A gene is classified as stress responsive if it is weakly cyclic and highly expressed.
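The periodicity test can be sketched as follows. This is a simplified illustration, not the exact published procedure: the Fourier score is taken at a single assumed period, the time points and expression profiles are synthetic, far fewer permutations are used than the 100,000 mentioned above, and the add-one correction to the P-value is a common convention we assume here.

```python
import math
import random


def fourier_score(times, values, period):
    """Magnitude of the Fourier sum of a profile at the given period."""
    omega = 2.0 * math.pi / period
    sin_sum = sum(math.sin(omega * t) * x for t, x in zip(times, values))
    cos_sum = sum(math.cos(omega * t) * x for t, x in zip(times, values))
    return math.hypot(sin_sum, cos_sum)


def periodicity_p_value(times, values, period, n_perm=2000, seed=0):
    """Permutation P-value: how often a time-shuffled profile scores at
    least as high as the observed profile (add-one corrected)."""
    rng = random.Random(seed)
    observed = fourier_score(times, values, period)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if fourier_score(times, shuffled, period) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Synthetic example: 20 samples, 15 minutes apart, assumed period 150 min.
times = [15 * i for i in range(20)]
cyclic_profile = [math.sin(2.0 * math.pi * t / 150.0) for t in times]
flat_profile = [1.0] * len(times)

print("cyclic:", periodicity_p_value(times, cyclic_profile, 150.0))
print("flat:  ", periodicity_p_value(times, flat_profile, 150.0))
```

Shuffling the time points destroys any oscillation while preserving the distribution of expression values, so a truly periodic profile rarely loses to its own permutations.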

To combine the results from the ten experiments, we use a differential meta-analytic procedure. For gene g in experiment j, let formula1 be the P-value for periodicity and formula be the P-value for expression. The test statistic formula3 can be approximated by formula4, where N is the number of experiments to be integrated (see for instance George, E.O. and G.S. Mudholkar, On the Convolution of Logistic Random Variables. Metrika, 1983. 30: p. 1-14). The meta P-value is calculated from this test statistic based on the approximate t-distribution with 10N + 4 degrees of freedom.
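The sketch below shows one way such a differential combination could be computed. The specific statistic is an assumption on our part, since the formulas are not reproduced here: it sums, over experiments, the logit of the periodicity P-value minus the logit of the expression P-value. Under the null hypothesis this is a sum of 2N logistic variables, and the George and Mudholkar scaling with k = 2N gives an approximate t-distribution with 5k + 4 = 10N + 4 degrees of freedom, which agrees with the degrees of freedom stated above. The function names and the example P-values are hypothetical.

```python
import math


def logit(p):
    """Log-odds of a P-value; requires 0 < p < 1."""
    return math.log(p / (1.0 - p))


def t_upper_tail(x, df, steps=4000):
    """Upper-tail probability of Student's t, obtained by integrating
    the density from 0 to |x| with Simpson's rule (steps must be even)."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))

    def density(u):
        return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(u * u / df))

    limit = abs(x)
    h = limit / steps
    total = density(0.0) + density(limit)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * density(i * h)
    upper = max(0.5 - total * h / 3.0, 0.0)
    return upper if x >= 0 else 1.0 - upper


def differential_meta_p(p_periodicity, p_expression):
    """Assumed differential logit statistic and its upper-tail P-value.

    Large positive values mean weak periodicity (large P-values) together
    with strong expression changes (small P-values).
    """
    n = len(p_periodicity)
    k = 2 * n  # number of logistic terms in the sum
    raw = sum(logit(a) - logit(b) for a, b in zip(p_periodicity, p_expression))
    scale = math.sqrt(3.0 * (5 * k + 4) / (math.pi ** 2 * k * (5 * k + 2)))
    statistic = raw * scale
    return statistic, t_upper_tail(statistic, 5 * k + 4)


# Hypothetical gene: weakly cyclic but strongly expressed in all ten experiments.
stat, meta_p = differential_meta_p([0.9] * 10, [0.01] * 10)
print("statistic =", round(stat, 3), " meta P-value =", meta_p)
```

P-values of exactly 0 or 1 would need to be clipped before taking logits, for example to the resolution of the permutation test.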