The stem cell bioinformatics group uses computational methods to explore the molecular mechanisms underpinning stem cells. To accomplish this we develop and apply advanced analysis techniques that make it possible to dissect complex collections of data from a wide range of technologies and sources.
The fields of stem cell biology and regenerative medicine research are fundamentally about understanding dynamic cellular processes such as development, reprogramming, repair, differentiation and the loss, acquisition or maintenance of pluripotency. In order to precisely decipher these processes at a molecular level, it is critical to identify and study key regulatory genes and transcriptional circuits. Modern high-throughput molecular profiling technologies provide a powerful approach to addressing these questions as they allow the profiling of tens of thousands of gene products in a single experiment. A central focus of our work is to use bioinformatics to interpret the information produced by such technologies. We work extensively with data from public repositories and collaborations over a wide variety of platforms such as microarrays, RNA-seq and ChIP-seq, using the latest methods to integrate studies and identify subtle but biologically significant patterns.
We have built two bioinformatics software platforms: StemDB (www.stemdb.org) and GeneProf (www.geneprof.org). These are complex projects built using Java enterprise technology.
StemDB is focused on sharing data within international research communities and is currently used by two EU projects, Thymistem (www.thymistem.org) and StemBANCC (stembancc.org). StemDB allows capture, sharing and dissemination of data generated by member consortia.
GeneProf is focused on the analysis and interpretation of ChIP-seq and RNA-seq data and has been used for the analysis of over 1000 studies by hundreds of registered users. Data from ~200 studies have been made publicly available, with many more to follow.
We are just about to release StemDB v3, which contains an updated set of components that allow StemDB to act as the central data repository for StemBANCC data. This release will be closely followed by GeneProf v2, which generalises the GeneProf analysis platform to support a wider range of data types.
The group maintains a complex compute infrastructure hosting our enterprise platforms and analysis resources. This infrastructure also hosts a range of web resources including the Eurostemcell web portal (www.eurostemcell.org), the CRM sites (including this site) as well as a variety of outreach and project sites such as Thymistem (www.thymistem.org) and Optistem (www.optistem.org).
We have a strong interest in applying bioinformatics methods to further understand pluripotent stem cells. Work in the group follows a number of different strands. One key strand centres on Nanog (in collaboration with Ian Chambers), and another uses integrative approaches to explore pluripotency at the genomic level. Complementary to these efforts is work on cellular reprogramming, where we have an ongoing collaboration with Keisuke Kaji as well as a programme of work within the StemBANCC consortium. StemBANCC is a joint industry/academia (IMI) project that aims to generate and characterise in detail 1500 human iPS lines from patients with different disease backgrounds.
Although our main focus at the moment is on pluripotent stem cells, we also work with other systems. Two examples are a project on thymus development (with Clare Blackburn) and a project on the ontogeny of haematopoietic stem cells (with Alexander Medvinsky). These two projects are part of extensive programmes of work for which the group is funded through the EU FP7 project Thymistem and BBSRC EASTBIO, respectively.
To find out more about the group's work, a good place to start is our selected publications. Halbritter et al 2012 and Halbritter et al 2014 describe the GeneProf software system and data resource, and Skylaki & Tomlinson 2012 provides an example of our integrative analysis methods. Festuccia et al 2012 and Karwacki-Neisius et al 2013 are examples of our work with Ian Chambers, while O'Malley et al 2013 is an example of our collaboration with Keisuke Kaji. StemDB is currently unpublished, but full details of this resource can be found on the site itself (www.stemdb.org).
To provide wider access to our bioinformatics expertise than is possible through collaboration alone, the group started a cost-recovery bioinformatics service in 2014. This service is provided primarily for CRM scientists, but we are also interested in taking on projects from the wider community. More details can be found at bioinfservice.stembio.org.
The group has a very strong interest in training the next generation of scientists at all stages of their careers. If you are interested in joining the group, please feel free to contact Simon Tomlinson using the contact details above to discuss the possibilities.
- Halbritter F, Kousa AI, Tomlinson SR. 2014. GeneProf data: a resource of curated, integrated and reusable high-throughput genomics experiments. Nucleic Acids Res. 42(database issue):D851-8.
- O'Malley J, Skylaki S, Iwabuchi KA, Chantzoura E, Ruetz T, Johnsson A, Tomlinson SR, Linnarsson S, Kaji K. 2013. High-resolution analysis with novel cell-surface markers identifies routes to iPS cells. Nature 499(7456):88-91.
- Karwacki-Neisius V, Göke J, Osorno R, Halbritter F, Ng JHui, Weiße AY, Wong F, Gagliardi A, Mullin NP, Festuccia N et al. 2013. Reduced Oct4 expression directs a robust pluripotent state with distinct signaling activity and increased enhancer occupancy by Oct4 and Nanog. Cell Stem Cell 12(5):531-45.
- Festuccia N, Osorno R, Halbritter F, Karwacki-Neisius V, Navarro P, Colby D, Wong F, Yates A, Tomlinson SR, Chambers I. 2012. Esrrb is a direct Nanog target gene that can substitute for Nanog function in pluripotent cells. Cell Stem Cell 11(4):477-90.
- Halbritter F, Vaidya H, Tomlinson SR. 2012. GeneProf: analysis of high-throughput sequencing experiments. Nat Methods 9(1):7-8.
- Skylaki S, Tomlinson SR. 2012. Recurrent transcriptional clusters in the genome of mouse pluripotent stem cells. Nucleic Acids Res. 40(19):e153.