Developmental Biology, Bioinformatics and Systems Approaches

(Adapted from Bard, J. 2007. Systems developmental biology: the use of ontologies in annotating models and in identifying gene function within and across species. Mammalian Genome. 18: 402–411.)

In developmental biology, as in every other area of biology, the major resources for up-to-date knowledge, if not understanding, are the web-accessible molecular databases and their associated bioinformatics tools (to be found at, for example, PubMed at www.ncbi.nlm.nih.gov and the European Bioinformatics Institute site at www.ebi.ac.uk).

The richness of development does, however, require more than just these databases, tools, literature, and, of course, access to Google and other search engines. This is because developmental biology is as much based on model organisms and their tissues, mutations, and associated abnormal phenotypes as it is on genes and proteins. To handle the complexity of development, every major model organism now has its own web-accessible database (see Table 1), and these databases are not only places for holding organism-associated data, but are major community resources with newsletters, jobs, major research groups, etc. (see Figure 1). Anyone interested in the development of some aspect of a particular organism needs to explore its database (Table 1).

Figure 1   The home page of the zebrafish database (http://zfin.org/cgi-bin/webdriver?MIval=aa-ZDB_home.apg).

Table 1: Major model organism databases
Organism                 Database
Arabidopsis (plant)      www.arabidopsis.org
C. elegans (nematode)    www.wormbase.org
Drosophila (fly)         www.flybase.org
Mus (mouse)              www.informatics.jax.org
Xenopus (frog)           www.xenbase.org
Danio (zebrafish)        www.zfin.org
Gallus (chicken)         www.geisha.arizona.edu/geisha/
Sea urchins              www.spbase.org/SpBase/
Canis (dogs)             www.research.nhgri.nih.gov/dog_genome/
Homo (human)             www.genomics.energy.gov/; www.ncbi.nlm.nih.gov/omim/

The molecular pages of these databases are much the same as those in non-organism resources, but tissue-associated data (e.g., gene expression, phenotype abnormalities, and cell types) have had to be handled differently. This is because tissue organisation, unlike molecular data, which can readily be stored in the tables of relational databases, is naturally hierarchical (for the mouse, the digit is part of the paw, which is part of the forelimb, and so on), and hierarchies cannot easily be stored in tabular formats. If, say, one needs to know the genes expressed in the mouse brain at a particular stage of development, GXD (the mouse gene expression database) has to “know” the parts hierarchy of the brain at that stage so that the programme can collate the genes expressed in each tissue component. The organism database thus has to hold what can be seen as a formalized textbook of developmental anatomy detailing the relationships between its various tissues.
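The collation step described above can be sketched in a few lines of Python. This is only a minimal illustration, not how GXD is actually implemented: the tissue names follow the limb example in the text, while the gene symbols, the two dictionaries, and the function name genes_expressed_in are hypothetical and chosen purely for demonstration.

```python
# Hypothetical part-of hierarchy: each tissue maps to its immediate parts.
PARTS = {
    "forelimb": ["paw"],
    "paw": ["digit"],
    "digit": [],
}

# Hypothetical expression annotations attached directly to each tissue.
EXPRESSION = {
    "forelimb": {"geneA"},
    "paw": {"geneB"},
    "digit": {"geneA", "geneC"},
}


def genes_expressed_in(tissue):
    """Collect genes annotated to a tissue or to any of its parts."""
    genes = set(EXPRESSION.get(tissue, set()))
    for part in PARTS.get(tissue, []):
        # Recurse so that annotations on sub-parts are gathered as well.
        genes |= genes_expressed_in(part)
    return genes


print(sorted(genes_expressed_in("paw")))
```

A query on "paw" therefore returns genes annotated to the paw itself and to the digit, which is the behaviour a flat table of tissue names could not provide without knowledge of the hierarchy.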


In practice, this book of developmental anatomy is organised in what is known as an ontology: a way of formalizing an area of knowledge as linked triads of the form term, relationship, term (e.g., digit, is part of, paw). To help on the computational side, each tissue has a unique identifier (ID). Data are linked to the terms, and the logic of the relationships is used to collate the answers to queries. A particular advantage of ontology IDs is that they support interoperability, that is, they allow programs on different computers to access information from one another’s databases directly.
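One way to picture such an ontology is as a list of (subject, relationship, object) records keyed on term IDs. The sketch below is an assumption-laden illustration rather than the format of any real anatomy ontology: the "EX:" identifiers, the relation name part_of, and the function ancestors are all hypothetical, and part_of is assumed to be transitive.

```python
from collections import namedtuple

# One ontology statement: subject ID, relationship, object ID.
Triple = namedtuple("Triple", ["subject", "relation", "object"])

# Hypothetical IDs mapped to human-readable tissue names.
LABELS = {"EX:0001": "digit", "EX:0002": "paw", "EX:0003": "forelimb"}

# Hypothetical statements: digit part_of paw, paw part_of forelimb.
TRIPLES = [
    Triple("EX:0001", "part_of", "EX:0002"),
    Triple("EX:0002", "part_of", "EX:0003"),
]


def ancestors(term_id, relation="part_of"):
    """Follow `relation` transitively and return the IDs reached, in order found."""
    found = []
    frontier = [term_id]
    while frontier:
        current = frontier.pop()
        for t in TRIPLES:
            if t.subject == current and t.relation == relation and t.object not in found:
                found.append(t.object)
                frontier.append(t.object)
    return found


print([LABELS[i] for i in ancestors("EX:0001")])
```

Because queries operate on IDs rather than names, two databases that share the same identifiers could exchange such statements without having to agree on how each tissue is spelled or labelled.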

Such hierarchies are widely used across informatics, and perhaps the best known is the Gene Ontology (GO) (www.geneontology.org), in which knowledge about the types, roles, and cellular locations of genes is linked to a database of hundreds of thousands of proteins (indeed, many molecular databases include GO-associated data). The ontologies for all the anatomies of the main model organisms and other areas of biology, together with the tools for handling them, can be found at www.obofoundry.org.

In more formal terms, ontologies are mathematical graphs (not to be confused with data graphs), and this graph formalism, together with ontology-associated IDs for genes, tissues, etc. (see Bard, 2008), is also used in systems biology, which seeks to understand how molecules work together to produce functions and phenotypes (see en.wikipedia.org/wiki/Systems_biology). The area can be seen as a reaction to the reductionist paradigm of the last 20 years, which focused on discovering the molecules underpinning developmental events. Systems biology actually started in developmental biology (see Weiss, 1971 and von Bertalanffy, 1968), and developmental biologists up until 20 years ago were, of course, mainly systems biologists, experimenting on phenotype generation and generally ignoring genetic regulation because so few regulatory genes were known.

It is now clear that developmental phenotypes depend on the integration of systems of complex genetic pathways (which can be visualised in graphs) rather than on individual genes, and the pathways themselves are beginning to become known (the best data come from work on the sea urchin; see Materna and Davidson, 2007). The key experimental approach for systems developmental biology is to use expression and mutation analysis to see how abnormal or absent expression affects phenotype (an approach first used by Waddington [1940], perhaps the greatest of developmental biologists, who, through his epigenetic landscape, first provided a metaphor for systems developmental biology [see Bard, 2007 for details]). The main theoretical task for systems biologists is to work out a general framework for understanding this integration, analysing the graphs, predicting phenotypes, and suggesting experiments. This task is turning out to be very difficult, but those interested in the area should, as a first step, read the work of Uri Alon (2007) at www.weizmann.ac.il/mcb/UriAlon.
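As a very rough illustration of what analysing such a pathway graph can mean, the sketch below stores regulatory links as a directed graph and lists every gene lying downstream of a chosen gene, that is, the genes whose expression might be altered if that gene were mutated. The gene names, the links, and the function downstream_of are hypothetical; real regulatory networks also carry activation or repression signs and timing, so reachability alone indicates only which genes could be affected and does not predict a phenotype.

```python
# Hypothetical regulatory links: regulator -> genes it directly regulates.
REGULATES = {
    "geneA": ["geneB", "geneC"],
    "geneB": ["geneD"],
    "geneC": ["geneD"],
    "geneD": [],
}


def downstream_of(gene):
    """Return every gene reachable from `gene` along regulatory links."""
    seen = set()
    stack = list(REGULATES.get(gene, []))
    while stack:
        g = stack.pop()
        if g not in seen:
            seen.add(g)
            # Continue the search through the targets of this gene.
            stack.extend(REGULATES.get(g, []))
    return seen


print(sorted(downstream_of("geneA")))
```

In this toy network a mutation in geneA could in principle disturb all three other genes, whereas a mutation in geneD has no downstream targets.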

Literature Cited

Alon, U. 2007. An introduction to systems biology. Chapman & Hall.

Bard, J. 2007. Systems developmental biology: the use of ontologies in annotating models and in identifying gene function within and across species. Mammalian Genome. 18: 402–411.

Materna, S.C. and E.H. Davidson. 2007. Logic of gene regulatory networks. Current Opinion in Biotechnology. 18: 351–354.

von Bertalanffy, L. 1968. General System Theory: Foundations, Development, Applications. George Braziller. New York.

Waddington, C.H. 1940. The genetic control of wing development in Drosophila. J. Genetics 25: 75–139.

Weiss, P. A., et al. 1971. Hierarchically Organized Systems in Theory and Practice. Hafner. New York.

© All the material on this website is protected by copyright. It may not be reproduced in any form without permission from the copyright holder.


