We are interested in using bioinformatics techniques to gain an understanding of key molecular features of stem cell biology.
Modern high-throughput techniques such as microarray expression profiling generate huge amounts of data. To interpret this data, it is essential to apply appropriate data storage, analysis and visualization techniques. Software resources must also be available that allow biological questions to be framed in the context of the wealth of available data.
Working within the EU project Eurostemcell, we developed a database designed to store and share the data outputs from this large consortium. This database, StemDB, now forms a key component of the data management strategy for the EU FP7 project Eurosystem.
We are interested in applying integrative analysis approaches to gain an understanding of complex data sets such as genome-scale expression data from stem cell populations. Non-trivial understanding of profiling data from these cell populations is especially challenging given the heterogeneity of these cells, their capacity for spontaneous commitment and differentiation, and their ability to adapt to their culture environment.
We have refined data analysis approaches that support the identification of complex patterns across the huge body of available data. We use these integrative approaches to dissect stem cell transcriptional profiles and identify key regulatory components within them. We are applying these methods to identify the key genes involved in self-renewal and differentiation of mouse and human embryonic stem cells.
Approaches and progress
Analysis work within the group makes use of commercial packages such as Genespring in combination with a wealth of public resources, especially those provided as part of the Bioconductor project for microarray analysis and EMBOSS for sequence analysis, together with our in-house software components. Our analysis work is supported by locally held genomic resources and our own custom microarray database.
Our software development work mostly makes use of Java and C/C++, occasionally combined with a variety of scripting languages. Most of our web development uses Tomcat/JSP with MySQL as our database back-end. We have also developed several Swing-based data-mining plugins for Genespring. Most of our C++ development has focused on native applications for Windows using MFC and .NET-based technologies. Software development projects have so far been in the areas of image analysis, data processing and integration, and the interpretation of expression data in the context of genomic data.
All the collaborative projects make use of the considerable expertise within the Centre in isolating and maintaining progenitor and stem cell populations. Much of the work exploits methodologies that allow the cells to be grown as relatively pure, well-defined cell populations in feeder-free (and serum-free) conditions. The projects also make use of a range of gene targeting techniques that allow the activity of specific genes to be modulated under precisely controlled conditions.