
     

Research Projects

 
Bioinformatics, Data Mining & Data Integration
>> Pennsylvania Cancer Alliance Data Warehouse for Cancer Bio-Geo Informatics
>> Spatio-temporal Organization of Nuclear Structure
>> Analysis of temporal gene expression data
>> Information Fusion and Data combination for Functional Genomics
   
Netcentric Computing
>> Routing of Dynamic Service Level Agreement Requests between Inter-domain Bandwidth Brokers and Intra-domain Multi-path Splitting of Differentiated Service Aggregated Flows
       

 
Bioinformatics, Data Mining & Databases
Pennsylvania Cancer Alliance Data Warehouse for Cancer Bio-Geo Informatics
 

The Pennsylvania Cancer Alliance Bioinformatics Consortium (PCABC) is a unique partnership comprising the University of Pittsburgh Cancer Institute (UPCI), Fox Chase Cancer Center, Kimmel Cancer Center of Thomas Jefferson University, Penn State Cancer Institute, University of Pennsylvania Cancer Center and the Wistar Institute. The Consortium's goal is to use the state’s investment to collaborate on a shared bioinformatics approach for identifying biomarkers that are useful in diagnosing cancer, in predicting cancer patients' response to treatment, or in predicting the clinical course of their disease.

The current project focuses on building a data warehouse that serves as a data access and mining platform for multiple cancer research data sets from various research centers. The data sets include the Pennsylvania Cancer Registry and clinical, tissue, biomarker, geomarker, and gene expression data. The goal is to integrate these distributed data sets and offer all participating research centers a single point of access through a website.

 


   
Spatio-temporal Organization of Nuclear Structure
 

The genetic information of all eukaryotic cells is located inside the cell nucleus. Scientists use fluorescence microscopy to biochemically label and detect molecules within the cell nucleus in order to understand fundamental relationships between the spatial structure and function of chromatin, which consists of DNA, RNA and various proteins, and which condenses into chromosomes during mitotic cell division. We are developing a system for analyzing 3D microscopic images of the labeled cell nucleus under various conditions and for various cell types.

Our research has led us in exciting directions. We have developed algorithms that help identify higher-order domains and networks of labeled sites within the cell nucleus that are undergoing replication and transcription. We are extending this analysis to the more difficult problem of quantifying domain and network changes in living cells that undergo complex 3D motion over time. We have identified salient structural features that enable us to formulate this as a graph matching problem. Heuristic methods are currently under development to improve both the runtime and the accuracy of the results. In another research direction, we have characterized chromatin as a space-filling fractal curve with a measurable fractal dimension. We have successfully applied this method to classify images of cell nuclei according to the stage of the cell cycle in which they were obtained. We hope to extend this analysis to all stages of the cell cycle and also to distinguish between normal and cancerous cells.
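As a rough illustration of what a "measurable fractal dimension" can mean for image data, the sketch below estimates a box-counting dimension from the coordinates of labeled pixels or voxels. Box counting is only one common estimator and is an assumption here; the text above does not say which estimator the project uses. The function name and the default box sizes are likewise hypothetical.

```python
import math


def box_counting_dimension(points, box_sizes=(1, 2, 4, 8, 16)):
    """Estimate the box-counting dimension of a set of integer coordinates.

    points: iterable of equal-length integer tuples, e.g. (x, y) or (x, y, z)
            for the foreground pixels/voxels of a segmented image.
    box_sizes: edge lengths of the boxes used to cover the set (assumed values).
    """
    points = list(points)
    if not points:
        raise ValueError("need at least one point")
    if len(set(box_sizes)) < 2:
        raise ValueError("need at least two distinct box sizes")

    log_inv_size, log_count = [], []
    for size in box_sizes:
        # A box is identified by the integer grid cell each coordinate falls into.
        occupied = {tuple(c // size for c in p) for p in points}
        log_inv_size.append(math.log(1.0 / size))
        log_count.append(math.log(len(occupied)))

    # Least-squares slope of log N(size) against log(1 / size).
    n = len(box_sizes)
    mean_x = sum(log_inv_size) / n
    mean_y = sum(log_count) / n
    sxx = sum((x - mean_x) ** 2 for x in log_inv_size)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_inv_size, log_count))
    return sxy / sxx


# Usage: a filled 32 x 32 square should give an estimate close to 2,
# and a straight line of 32 pixels an estimate close to 1.
square = [(x, y) for x in range(32) for y in range(32)]
line = [(x, 0) for x in range(32)]
print(box_counting_dimension(square), box_counting_dimension(line))
```

Real chromatin images would first need segmentation into foreground coordinates, and the estimate depends on the range of box sizes chosen.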

 


   
Analysis of temporal gene expression data
 

DNA arrays measure expression levels of thousands of mRNAs in a single experiment. Data mining tools are needed to mine this information-rich data, with the goal of identifying the patterns that characterize the underlying mechanisms of action. The kinetics of gene expression are commonly examined in many experimental designs to delineate the temporal sequence of transcriptional events that occur in response to a given stimulus. Identifying groups of genes with ‘similar’ temporal patterns of expression is usually a critical step in the analysis of kinetic data because it provides insights into gene–gene interactions and thereby facilitates the testing and development of mechanistic models for the regulation of the underlying biological processes.

Our research in this area has included the development of clustering algorithms specifically designed for temporal gene expression data. Information-theoretic dissimilarity measures, such as the Kullback-Leibler divergence for comparing the shapes of two gene profiles, have proven extremely useful.
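A minimal sketch of how a Kullback-Leibler-based dissimilarity between two temporal profiles might look is given below. It assumes non-negative expression levels, normalizes each profile so it sums to one (so only the shape matters, not the overall magnitude), and symmetrizes the divergence. The normalization, the pseudocount, the symmetrization, and all function names are assumptions made for illustration; the text above does not specify these details.

```python
import math


def normalize_profile(profile, pseudocount=1e-9):
    """Turn a non-negative expression profile into a distribution over time points."""
    if any(v < 0 for v in profile):
        raise ValueError("expression levels must be non-negative")
    # The pseudocount (an assumed value) avoids log(0) for time points with zero expression.
    shifted = [v + pseudocount for v in profile]
    total = sum(shifted)
    return [v / total for v in shifted]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) for two strictly positive distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def profile_dissimilarity(a, b):
    """Symmetrized KL divergence between the shapes of two expression profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must cover the same time points")
    p, q = normalize_profile(a), normalize_profile(b)
    return 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))


# Usage: two profiles with the same shape but different magnitude are nearly
# identical under this measure, while an early peak and a late peak are not.
early_peak = [1.0, 4.0, 2.0, 1.0]
scaled_early_peak = [10.0, 40.0, 20.0, 10.0]
late_peak = [1.0, 1.0, 2.0, 4.0]
print(profile_dissimilarity(early_peak, scaled_early_peak))
print(profile_dissimilarity(early_peak, late_peak))
```

A pairwise matrix of such values could then be passed to any clustering method that accepts a precomputed dissimilarity.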

 


   
Information Fusion and Data combination for Functional Genomics
 

Microarray experimental data are a valuable but limited source for inferring gene regulation mechanisms on a genomic scale. Additional information, such as promoter sequences of genes/DNA binding motifs, gene ontologies, and location data, can increase the statistical significance of the findings when combined with gene expression analysis.

To this end, our current research focuses on developing machine-learning-based algorithms for combining heterogeneous genomic data. Our current project involves developing a general computational system that integrates diverse data from multiple genomic sensors, such as gene expression data, upstream sequences of genes, genomic multi-species sequence alignment data, and putative regulatory regions, in order to provide improved genome-wide predictions of regulatory regions.

 


   
Netcentric Computing
Routing of Dynamic Service Level Agreement Requests between Inter-domain Bandwidth Brokers and Intra-domain Multi-path Splitting of Differentiated Service Aggregated Flows
  This research program uses a new hierarchical QoS routing scheme to route dynamic service level agreements (SLAs) in a differentiated-services-enabled network, and splits aggregated flows at the incoming line card of the ingress router in a transit domain. Our specific research goals are as follows:
  >> To study the use of hierarchical QoS routing for computing routes for dynamic SLAs, in the presence of multiple physical paths between an inter-domain source-destination pair, so as to maximize the probability of success for the SLA. We use hierarchical QoS routing in order to have a scalable solution. We intend to study the feasibility and trade-offs of end-to-end notification between inter-domain bandwidth brokers (BBs).
  >> To propose and study a new probabilistic scheme for hierarchical QoS routing that uses local statistics collected by individual routers and takes into account the total traffic handling capacity of a domain.
  >> To study the splitting up of aggregated DS flows at the ingress router of transit domains along multiple paths and recombining the split flows into an aggregated flow at the egress router. We propose to use scheduling of flows for different paths at the incoming line card of the ingress router. This is different from the traditional scheduling of packets at the link layer.
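
To make the difference between per-flow and per-packet scheduling concrete, the sketch below assigns each flow of an aggregate to one of several paths, so that every packet of a given flow follows the same path. It uses hash-based weighted selection, a generic technique chosen here for illustration; it is not the project's scheduling scheme, which is not described above. The function name, the flow identifier format, and the path names and weights are all hypothetical.

```python
import hashlib


def pick_path(flow_id, path_weights):
    """Map a flow to one path; the same flow always maps to the same path.

    flow_id: any value with a stable repr(), e.g. a
             (src, dst, src_port, dst_port, protocol) tuple.
    path_weights: dict of path name -> non-negative weight, e.g. the share
                  of the aggregate each path should carry (assumed values).
    """
    total = sum(path_weights.values())
    if total <= 0:
        raise ValueError("at least one path needs a positive weight")

    # Hash the flow identifier to a reproducible point in [0, total).
    digest = hashlib.sha256(repr(flow_id).encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2 ** 64 * total

    # Walk the paths in a fixed order and pick the one whose weight
    # interval contains the point.
    cumulative = 0.0
    for path, weight in sorted(path_weights.items()):
        cumulative += weight
        if point < cumulative:
            return path
    return path  # guard against floating-point rounding at the upper edge


# Usage: split an aggregate of flows over two paths in roughly a 3:1 ratio.
weights = {"path_a": 3.0, "path_b": 1.0}
flows = [("10.0.0.1", "10.0.1.9", 5000 + i, 80, "tcp") for i in range(1000)]
share_a = sum(pick_path(f, weights) == "path_a" for f in flows) / len(flows)
print(share_a)
```

Because the choice depends only on the flow identifier, packets within a flow are not reordered across paths, which is the practical motivation for scheduling flows rather than individual packets.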
   


     
       

© 2003 - 2004 Advanced Laboratory for Information Systems & Analysis
Department of Computer Science and Engineering
The Pennsylvania State University, University Park, PA, 16801