Aevol

Model Description

Introduction

Aevol is a forward-in-time evolutionary simulator that models the evolution of a population of organisms through a process of variation and selection. Each artificial organism has an explicit genome, on which RNAs and genes can be identified. The design of the model focuses on the realism of the genome structure and of the mutational process: mutations directly affect the sequence, without any a priori fitness effect. Aevol can therefore be used to decipher the effect of different operators or processes on genome evolution.

Aevol exists in three flavors: Standard, Eukaryote and 4-Bases. The first two use a binary genomic sequence while the third uses an ATGC-based sequence. Most of the model is identical among the different flavors, and variations will be presented explicitly throughout the model presentation. In short:

  • Standard Aevol is inspired by prokaryotic organisms: each organism is asexual, haploid, and owns a single circular chromosome. The genome is encoded as a double-strand binary string. Most experiments historically used this version (see Publications).
  • In Eukaryote Aevol, each organism owns two linear chromosomes, and reproduction is sexual and includes a meiotic recombination event. The genome is encoded as a double-strand binary string.
  • In 4-Bases Aevol, the genome is not binary but encoded with the 4 letters ATGC. This changes the genotype-to-phenotype map, but not the rest of the model. It has been used in Liard et al. 2017 and Daudey et al. 2024.

Note: although the Eukaryote and 4-Bases versions could possibly work together, this has not been sufficiently tested yet, and no executable files are currently provided.

Overview

Genome model

Each individual owns an explicit DNA sequence, composed of \(0\)s and \(1\)s for Standard and Eukaryote Aevol, and ATGCs in 4-Bases Aevol. Genomes are double-stranded, with complementary bases on each strand. Leading and Lagging strands are read in opposite directions.
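The relationship between the two strands can be illustrated with a minimal Python sketch. The function name and the dictionary are illustrative and are not taken from the Aevol code base; the only modeling assumption is that 0 pairs with 1 in the binary flavors and that A/T and G/C pair in 4-Bases Aevol.

```python
# Illustrative complement table (not Aevol's own data structure).
# Binary flavors: 0 pairs with 1. 4-Bases flavor: A pairs with T, G with C.
COMPLEMENT = {"0": "1", "1": "0", "A": "T", "T": "A", "G": "C", "C": "G"}


def lagging_strand(leading: str) -> str:
    """Return the lagging strand in its own reading direction.

    The lagging strand carries the complementary bases and is read in the
    opposite direction, so it is the reverse complement of the leading strand.
    """
    return "".join(COMPLEMENT[base] for base in reversed(leading))


print(lagging_strand("0011101"))  # 0100011
print(lagging_strand("ATGCCA"))   # TGGCAT
```

Applying the function twice returns the original sequence, which is a convenient sanity check for any strand-handling code.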

On the sequence, promoters are recognized using a consensus pattern and mark the mRNA start sites. The activity of a promoter decreases as its distance from the consensus increases, and it determines the level of expression of all the protein-coding genes located on the corresponding mRNA. An mRNA ends at the first terminator encountered; terminators are sequences that would form a stem-loop structure, akin to ρ-independent bacterial terminators.
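As a rough illustration of consensus-based promoter detection, the sketch below scans the leading strand of a circular binary genome. The consensus string, the maximum tolerated distance `max_dist`, the use of the Hamming distance and the linear decay of activity are all assumptions chosen for the example; they are not values or formulas taken from this page.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(genome: str, consensus: str, max_dist: int):
    """Scan a circular genome and return (position, activity) pairs.

    A window counts as a promoter when it is within max_dist mismatches of
    the consensus. The activity formula is an assumed linear decay:
    1.0 for a perfect match, lower for each additional mismatch.
    """
    n, k = len(genome), len(consensus)
    hits = []
    for pos in range(n):
        # Wrap around the end of the sequence because the chromosome is circular.
        window = "".join(genome[(pos + i) % n] for i in range(k))
        d = hamming(window, consensus)
        if d <= max_dist:
            hits.append((pos, 1.0 - d / (max_dist + 1)))
    return hits


# Toy genome and toy consensus, both hypothetical.
print(find_promoters("0101100", consensus="0110", max_dist=1))
# [(2, 1.0), (6, 0.5)]
```

The second hit starts at the last position and wraps around to the beginning of the sequence, which is only meaningful for the circular chromosome of Standard Aevol.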

On each mRNA, another layer of pattern recognition defines potential genes. The translation is initiated by a Shine-Dalgarno-like sequence followed by a Start codon, and continues until a Stop codon is reached. Each codon lying between the initiation and termination signals is translated into an abstract “amino-acid” using an artificial genetic code, thus giving rise to the protein’s primary sequence.
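The gene-finding step can be sketched as follows for an ATGC sequence. Every signal used here (the `SHINE_DALGARNO` motif, the `SPACER` length, the Start and Stop codons) is a hypothetical placeholder picked for readability, not Aevol's actual signal; the binary flavors would use binary motifs instead.

```python
# Hypothetical signals, chosen only for illustration.
SHINE_DALGARNO = "AGGA"
SPACER = 2                      # assumed gap between the motif and the Start codon
START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}


def find_genes(mrna: str):
    """Return one list of codons per gene found on a linear mRNA string.

    A gene begins at a Shine-Dalgarno-like motif followed, after SPACER bases,
    by a Start codon. Codons are then read in frame until a Stop codon.
    An open reading frame with no Stop codon before the end of the mRNA is
    discarded.
    """
    genes = []
    lead = len(SHINE_DALGARNO) + SPACER
    for i in range(len(mrna)):
        if not mrna.startswith(SHINE_DALGARNO, i):
            continue
        start = i + lead
        if mrna[start:start + 3] != START:
            continue
        codons = []
        for j in range(start + 3, len(mrna) - 2, 3):
            codon = mrna[j:j + 3]
            if codon in STOPS:
                genes.append(codons)
                break
            codons.append(codon)
    return genes


print(find_genes("CCAGGACCATGGCTAAATAGCC"))  # [['GCT', 'AAA']]
```

Each codon in the returned lists would then be mapped to an abstract amino-acid through the artificial genetic code, which is not reproduced here.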

Each protein’s primary sequence is then “folded” into a mathematical function defining the protein’s contribution to the phenotype. This phenotypic contribution is defined as a triangular function. The x-axis represents functional traits, and the y-axis the activation level of said trait. The phenotype of an individual is the sum of all these protein functions.
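A minimal sketch of this sum of triangles is given below. The parameterization by a center `m`, a half-width `w` and a height `h` is an assumption made for the example, since this page only states that each contribution is triangular.

```python
def triangle(x: float, m: float, w: float, h: float) -> float:
    """Value at x of a triangle centered on m, with half-width w and height h."""
    if w <= 0:
        return 0.0
    return h * max(0.0, 1.0 - abs(x - m) / w)


def phenotype(x: float, proteins) -> float:
    """Activation level of trait x: the sum of all protein contributions.

    proteins is an iterable of (m, w, h) tuples (assumed parameterization).
    """
    return sum(triangle(x, m, w, h) for m, w, h in proteins)


# Two hypothetical proteins with overlapping triangles.
proteins = [(0.5, 0.1, 1.0), (0.55, 0.1, 0.5)]
for x in (0.4, 0.5, 0.55, 0.9):
    print(x, round(phenotype(x, proteins), 3))
```

Because the phenotype is a plain sum, a protein with a negative height in this sketch would lower the activation of the traits it covers.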

Mutations

Upon replication of an individual, mutations may occur in the sequence. They do not have a predefined fitness effect, as they are applied to positions drawn at random in the genome and the new phenotype is computed afterward.

Mutations can be chromosomal rearrangements (inversions, translocations, duplications, or deletions), or local events (point mutations, InDels). Their rates are per-base and are defined for each mutation type separately.
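The sketch below illustrates what per-base rates mean in practice for two of the mutation types on a binary sequence. It is a simplified illustration under stated assumptions, not Aevol's implementation: the number of events of each type is drawn from a binomial distribution over the genome length, positions are uniform, an inversion replaces a segment by its reverse complement, and the circularity of the chromosome is ignored.

```python
import random


def flip(base: str) -> str:
    """Complement of a binary base."""
    return "1" if base == "0" else "0"


def binomial(rng: random.Random, n: int, p: float) -> int:
    """Number of successes among n independent trials of probability p."""
    return sum(rng.random() < p for _ in range(n))


def mutate(genome: str, point_rate: float, inversion_rate: float,
           rng: random.Random) -> str:
    """Apply point mutations, then inversions, at the given per-base rates."""
    g = list(genome)
    n = len(g)
    # Point mutations: flip the base at a position drawn uniformly at random.
    for _ in range(binomial(rng, n, point_rate)):
        pos = rng.randrange(n)
        g[pos] = flip(g[pos])
    # Inversions: replace a random segment by its reverse complement.
    for _ in range(binomial(rng, n, inversion_rate)):
        a, b = sorted(rng.sample(range(n + 1), 2))
        g[a:b] = [flip(base) for base in reversed(g[a:b])]
    return "".join(g)


rng = random.Random(42)  # fixed seed so the run is reproducible
child = mutate("0110100111010010", point_rate=0.1, inversion_rate=0.05, rng=rng)
print(child)
```

Nothing in `mutate` looks at the phenotype: positions are drawn first, and any effect on fitness only appears once the mutated sequence is decoded again.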

Selection

The population follows a generational model: the whole population is replaced at each generation, and individuals compete to populate the next generation. This competition can be either global or local, within a predefined neighborhood, except in the Eukaryote model, which only allows for global selection. Reproduction is asexual in Standard and 4-Bases Aevol, while it is sexual in Eukaryote Aevol. In the latter case, there can be selfing at a predefined rate, and there is always one meiotic recombination upon gamete production.
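A generational, asexual step with global competition could be sketched as follows. The selection scheme is an assumption: this page does not say how reproduction probabilities are derived from fitness, so the example uses plain fitness-proportionate sampling, and `fitness` is a hypothetical user-supplied function.

```python
import random


def next_generation(population, fitness, rng: random.Random):
    """Draw a full replacement population under global competition.

    Each slot of the next generation is filled by a parent drawn with
    probability proportional to its fitness (an assumed scheme). In a full
    simulation, each drawn parent would be replicated with mutations.
    """
    weights = [fitness(individual) for individual in population]
    return rng.choices(population, weights=weights, k=len(population))


# Toy example: individuals are binary strings and fitness counts the 1s.
rng = random.Random(0)
population = ["0000", "0011", "0111", "1111"]
offspring = next_generation(population, lambda ind: ind.count("1"), rng)
print(offspring)
```

The population size stays constant, and an individual with zero fitness, such as "0000" here, leaves no offspring.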

References

  • Vincent Liard, Jonathan Rouzaud-Cornabas, Nicolas Comte, Guillaume Beslon (2017). A 4-base model for the Aevol in-silico experimental evolution platform. Proceedings of ECAL 2017, the Fourteenth European Conference on Artificial Life.
  • Hugo Daudey, David P. Parsons, Eric Tannier, Vincent Daubin, Bastien Boussau, Vincent Liard, Romain Gallé, Jonathan Rouzaud-Cornabas, Guillaume Beslon (2024). Aevol_4b: Bridging the gap between artificial life and bioinformatics. Proceedings of the 2024 Artificial Life Conference.