Evolutionary Biology and Biocomplexity


We study fundamental properties of the evolutionary process, using theoretical and computational methods. Evolutionary theory claims universality, in the sense that the theory makes no reference to its instantiation, that is, to how information is encoded. We therefore often use populations of self-replicating computer programs (also known as digital life) to perform simple evolutionary experiments. We believe that evolutionary theory can be treated just like any theory in physics, where theories inspire experiments, which in turn can be designed to validate or falsify theories. Below is a list of our publications in this area, with links to the Los Alamos archive, online journals, or local PDF files.


1995-2000
  • Interaction Between Directional Epistasis and Average Mutational Effects (2001)
  • Evolution of Digital Organisms at High Mutation Rate Leads to Survival of the Flattest (2001)
  • Optimal adaptive performance and delocalization in NK fitness landscapes (2002)
  • Ab Initio Modeling of Ecosystems with Artificial Life (2002)
  • Design of Evolvable Computer Languages (2002)
  • The Biology of Digital Organisms (2002)
  • Viral Evolution Under the Pressure of an Adaptive Immune System: Optimal Mutation Rates for Viral Escape (2002)
  • Sequence Complexity in Darwinian Evolution (2002)
  • What is Complexity? (2002)
  • Evolution of Mutational Robustness (2003)
  • Compensatory Mutations Cause Excess of Antagonistic Epistasis in RNA Secondary Structure Folding (2003)
  • Selective Pressures on Genomes in Molecular Evolution (2003)
  • The Evolutionary Origin of Complex Adaptive Features (2003)
  • Modeling Stochastic Clonal Interference (2003)
  • Experiments in Digital Evolution (2004)
  • Bifurcation into Functional Niches in Adaptation (2004)
  • Evolution of Robustness in Digital Organisms (2004)
  • Influence of Chance, History, and Adaptation on Digital Evolution (2004)
  • Adaptive Radiation from Resource Competition in Digital Organisms (2004)

    For more information on our group, see the Digital Life Laboratory's Home page.


  • Other Topics in Computational Biology

    Information Theory in Molecular Biology

    Information Theory (IT) has a bad name in molecular biology: most serious researchers do not think that the methods of information theory apply to the concept of information in biological genes. This is due for the most part to an erroneous treatment of the concept of information in most of the published literature on the subject. Applied correctly, however, information theory can be a very useful tool. It can be shown that, in statistical ensembles of symbolic sequences, it is possible to distinguish between entropy and information in genes, and to reveal associations between molecules. Apart from quantifying the information content of genes, IT has applications in molecular sequence analysis, as well as in drug design.
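
    One common way to look for associations of the kind mentioned above is to estimate the mutual information between two positions across an ensemble of aligned sequences. The following is a minimal sketch of that idea, not the method of any particular paper listed here. The function names are hypothetical, and the estimate uses raw observed frequencies with no correction for small sample sizes.

```python
import math
from collections import Counter


def entropy(symbols):
    """Shannon entropy (in bits) of the observed symbol frequencies."""
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(symbols).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two equally long symbol columns.

    xs and ys hold the symbols seen at two positions, one entry per
    sequence in the ensemble (a hypothetical input layout).
    """
    if len(xs) != len(ys):
        raise ValueError("columns must have the same number of entries")
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)


if __name__ == "__main__":
    # Two perfectly covarying columns versus two unrelated ones.
    print(mutual_information("AAGG", "UUCC"))
    print(mutual_information("AAGG", "UCUC"))
```

    Each column on its own has the same entropy in both calls; only the joint statistics differ, which is the distinction between entropy and shared information.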




    Protein Evolution and Protein-Protein Interactions

  • Apparent Dependence of Protein Evolutionary Rate on Number of Interactions is Linked to Biases in Protein-Protein Interactions Data Sets (2004)
  • Stability and the Evolvability of Function in a Model Protein (2004)
  • Thermodynamic Prediction of Protein Neutrality (2004)

  • The complexity of symbolic sequences

    C. Adami and N. Cerf

    We introduce a practical measure for the complexity of sequences of symbols ("strings") that is rooted in automata theory but avoids the problems of Kolmogorov-Chaitin complexity. This physical complexity can be estimated for ensembles of sequences, for which it reduces to the difference between the maximal entropy of the ensemble and the actual entropy given the specific environment within which the sequence is to be interpreted. Thus, the physical complexity measures the amount of information about the environment that is coded in the sequence, and is conditional on such an environment. In practice, an estimate of the complexity of a string can be obtained by counting the number of loci per string that are fixed in the ensemble, while the volatile positions represent randomness, again with respect to the environment. We apply this measure to tRNA sequence data.
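
    The estimate described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure used in the paper: sites are treated as independent, entropies are taken from raw frequencies without finite-sample corrections, and the function names and the default alphabet are hypothetical. Logarithms are taken to the base of the alphabet size, so the maximal entropy of a string equals its length.

```python
import math
from collections import Counter


def per_site_entropy(column, alphabet_size):
    """Entropy of one aligned position, in units where the maximum is 1."""
    total = len(column)
    h = 0.0
    for count in Counter(column).values():
        p = count / total
        h -= p * math.log(p, alphabet_size)
    return h


def physical_complexity(sequences, alphabet="ACGU"):
    """Maximal entropy (the length) minus the summed per-site entropy.

    Assumes an ensemble of equal-length, aligned sequences and ignores
    correlations between sites (an assumption of this sketch).
    """
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    d = len(alphabet)
    actual = sum(per_site_entropy([s[i] for s in sequences], d)
                 for i in range(length))
    return length - actual


if __name__ == "__main__":
    # Three fixed sites and one fully volatile site.
    ensemble = ["ACGU", "ACGA", "ACGC", "ACGG"]
    print(physical_complexity(ensemble))
```

    In the toy ensemble the first three positions are fixed and the last is uniformly random, so the estimate comes out at about 3, matching the count of fixed loci.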

  • Physical complexity of symbolic sequences (2000)

    We applied this measure of complexity to study the evolution of complexity in:

  • Evolution of Biological Complexity (2000)

    Complexity measures, including physical complexity, are reviewed in:

  • Sequence Complexity in Darwinian Evolution (2002)
  • What is Complexity? (2002)


  • Evolution and development of neural networks

    Artificial neural networks (ANNs) based on McCulloch-Pitts neurons and the standard connectionist paradigm are analytically tractable, but do not appear to help in understanding biological information processing in the brain. Instead of constructing ANNs, we attempt to model decentralized growth and development of neural networks inspired by the molecular biology and physiology of real nervous systems. In this model, each individual artificial neuron is an autonomous unit whose behavior is determined only by the genetic information it harbors and by local concentrations of substrates. The chemicals and substrates, in turn, are modeled by a simple artificial chemistry. The combination of local substrate concentrations and genetic information leads to gene expression, manifested as axon and dendrite growth, cell division and differentiation, substrate production, and cell stimulation. While genomes leading to naturally grown ANNs can evolve according to Darwin's principle of selection and survival of the fittest, we demonstrate the power of the artificial chemistry with engineered (user-written) genomes that lead to the growth of simple networks with behaviors similar to known physiology, such as deterministically structured networks, pacemaker behavior, sensitization, habituation, associative classical conditioning, computation of logical functions, and self-limiting network growth. To evolve more complex structures, we implemented a platform-independent, asynchronous, distributed Genetic Algorithm (GA) for the genomes that code for the development, expression, and physiology of the neurons. The GA allows users on the Internet to participate in evolutionary experiments via the World Wide Web.

  • Development and Evolution of Neural Networks in an Artificial Chemistry (1998)
  • A Developmental Model for the Evolution of Artificial Neural Networks (1999)
  • Evolution of Robust Developmental Neural Networks (2004)

  • Critical and Near-Critical Branching Processes

    C. Adami and J. Chu

    Branching processes have surprisingly universal dynamics, which become scale-free under certain circumstances. The simplest branching process (the Galton-Watson process) is also the oldest, having been invented to study the disappearance of family names from the British peerage. It can also be used any time a process leads to branchings based on a probability distribution. The "critical" parameter in such a system is the average number m of "daughters", i.e., replicas, that a node in the branching system has. In an infinite system this number can exceed one, but not if the system is finite and selection is present. In that case, the critical value is m=1, a situation in which selection is extremely strong: there are (almost) no competing nodes in such a tree. A simple mean-field treatment of this model shows that in this limit the "taxon" abundance distribution, i.e., the probability distribution for an initial node to give rise to a "family" with n sub-nodes, is a pure power law. As selection becomes weaker, allowing more competing neutral nodes in the system, m becomes smaller than one and the power law degenerates into an exponential. We have studied this process theoretically and numerically, and applied it to taxon distributions in the fossil record, taxon distributions of digital organisms, and avalanche-size distributions in sandpile models. The latter study shows that sandpiles are critical only in a very small region of parameter space, which is in fact unphysical. Physical sandpiles are non-critical, while evolutionary abundance distributions are driven towards the critical regime by the system's dynamics.
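
    The contrast between the critical and sub-critical regimes can be seen in a small simulation. The sketch below is an illustration, not the model or code used in the papers: it assumes a hypothetical offspring law in which each node has two daughters with probability m/2 and none otherwise (so the mean is m), and it caps each family at max_size nodes so that runs at m=1 terminate. For such finite-variance offspring laws, the standard result at m=1 is a family-size distribution whose tail decays as a power law with exponent 3/2.

```python
import random


def family_size(m, rng, max_size=10_000):
    """Total number of nodes in one Galton-Watson tree (capped).

    Assumed offspring law: two daughters with probability m/2, else none.
    Growth stops once the family reaches roughly max_size nodes.
    """
    p_split = m / 2.0
    active = 1  # nodes that have not reproduced yet
    total = 1   # all nodes created so far, including the initial one
    while active > 0 and total < max_size:
        active -= 1
        if rng.random() < p_split:
            active += 2
            total += 2
    return total


def size_distribution(m, trials=20_000, seed=0):
    """Empirical probability of each family size over many trees."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(trials):
        n = family_size(m, rng)
        counts[n] = counts.get(n, 0) + 1
    return {n: c / trials for n, c in sorted(counts.items())}


def tail_weight(dist, threshold):
    """Fraction of families larger than threshold."""
    return sum(p for n, p in dist.items() if n > threshold)


if __name__ == "__main__":
    for m in (1.0, 0.5):
        dist = size_distribution(m)
        print(m, tail_weight(dist, 100))
```

    With these assumptions, large families remain common at m=1 and become vanishingly rare at m=0.5, which is the qualitative change from a power law to an exponential described above.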

  • Critical and near-critical branching processes (1999)
  • A simple explanation for taxonomic abundance patterns (2000)
  • Non-critical sandpiles