Objective and Overview
Aevol is a forward-in-time evolutionary simulator that models the evolution of a population of organisms through a process of variation and selection. The design of the model focuses on the realism of the genome structure and of the mutational process. Aevol can therefore be used to decipher the effect of chromosomal rearrangements on genome evolution, including their interactions with other types of mutational events.
In short, Aevol is made of three components:
- a Genotype-to-Phenotype map: Each individual’s genomic sequence is decoded into a phenotype, and the corresponding fitness value is computed.
- a Population of organisms: Each organism has its own genome, phenotype and fitness. At each generation, the organisms compete to populate the next generation.
- a process of Genome replication with variation: During reproduction, genomes can undergo several kinds of mutational events, including chromosomal rearrangements and local mutations. Seven types of mutation are modelled: 3 local mutations (substitutions, small insertions and small deletions), 2 balanced rearrangements, which conserve the genome size (inversions and translocations), and 2 unbalanced rearrangements (duplications and deletions). This allows the user to study the effect of chromosomal rearrangements and their interaction with other kinds of events such as substitutions and InDels.
Each individual owns an explicit DNA sequence, composed of \(0\)s and \(1\)s for Standard and Eukaryote Aevol, and ATGCs in 4-Bases Aevol. Genomes are double-stranded, with complementary bases on each strand. Leading and Lagging strands are read in opposite directions.
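The relationship between the two strands can be sketched in a few lines of Python for the binary alphabet. This is only an illustration: the function names `complement` and `lagging_strand` are hypothetical and are not taken from the Aevol code base.

```python
def complement(strand: str) -> str:
    """Return the base-by-base complement of a binary strand (0 <-> 1)."""
    return strand.translate(str.maketrans("01", "10"))


def lagging_strand(leading: str) -> str:
    """Return the lagging strand in its own reading direction.

    The lagging strand carries the complementary bases and is read in the
    opposite direction, hence the complement followed by a reversal.
    """
    return complement(leading)[::-1]


leading = "0011010"
lagging = lagging_strand(leading)
print(leading, lagging)
```

Applying `lagging_strand` twice returns the original sequence, which is a convenient sanity check for any strand-handling code.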
On the sequence, promoters are recognized using a consensus pattern and mark the mRNA start sites. The activity of a promoter decreases with its distance from the consensus and determines the level of expression of all the protein-coding genes located on the corresponding mRNA. An mRNA ends at the first terminator encountered; terminators are sequences that would form a stem-loop structure, akin to ρ-independent bacterial terminators.
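As a minimal sketch of this kind of pattern recognition, the Python snippet below scans a binary sequence for promoters and tests for stem-loop terminators. Everything specific in it is an assumption made for illustration: the consensus `CONSENSUS`, the mismatch threshold `max_dist`, the linear activity formula, the stem and loop lengths, and the treatment of the genome as a linear string.

```python
CONSENSUS = "010110"  # hypothetical consensus, not the one used by Aevol


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_promoters(genome: str, consensus: str, max_dist: int):
    """Return (position, distance, activity) for every accepted promoter.

    Assumption: activity decreases linearly with the distance to the
    consensus, from 1.0 for a perfect match down to 1 / (max_dist + 1).
    """
    k = len(consensus)
    hits = []
    for pos in range(len(genome) - k + 1):
        d = hamming(genome[pos:pos + k], consensus)
        if d <= max_dist:
            hits.append((pos, d, 1.0 - d / (max_dist + 1)))
    return hits


def is_terminator(genome: str, pos: int, stem: int = 4, loop: int = 3) -> bool:
    """True if a stem-loop could form at pos (stem and loop sizes assumed).

    The first `stem` bases must pair with the last `stem` bases read
    backwards; in a binary alphabet, pairing means the bases differ.
    """
    end = pos + 2 * stem + loop
    if end > len(genome):
        return False
    return all(genome[pos + i] != genome[end - 1 - i] for i in range(stem))


genome = "1101011011010010"
hits = find_promoters(genome, CONSENSUS, max_dist=1)
for pos, dist, activity in hits:
    print(f"promoter at {pos}: distance {dist}, activity {activity}")
```

A stricter `max_dist` yields fewer but stronger promoters, which is the trade-off any threshold-based recognition scheme of this kind exposes.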
On each mRNA, another layer of pattern recognition defines potential genes. Translation is initiated by a Shine-Dalgarno-like sequence followed by a Start codon, and continues until a Stop codon is reached. Each codon lying between the initiation and termination signals is translated into an abstract “amino-acid” using an artificial genetic code, thus giving rise to the protein’s primary sequence.
Each protein’s primary sequence is then “folded” into a mathematical function defining the protein’s contribution to the phenotype. This phenotypic contribution is defined as a triangular function. The x-axis represents functional traits, and the y-axis the activation level of each trait. The phenotype of an individual is the sum of all these protein functions.
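The translation step can be sketched as follows for a binary mRNA. The signal sequences, the spacer length, the 3-base codon size and the codon table below are illustrative placeholders, not a statement of the values Aevol uses; the function name `translate` is likewise hypothetical.

```python
# All constants below are assumptions chosen for illustration.
SHINE_DALGARNO = "011011"
SPACER = 4          # bases between the Shine-Dalgarno-like motif and Start
START = "000"
STOP = "001"
GENETIC_CODE = {    # abstract "amino-acids"
    "100": "M0", "101": "M1",
    "010": "W0", "011": "W1",
    "110": "H0", "111": "H1",
}


def translate(mrna: str):
    """Return the primary sequence of every gene found on the mRNA.

    A gene needs an initiation signal (motif, spacer, Start codon) and an
    in-frame Stop codon; an open reading frame without Stop yields nothing.
    """
    proteins = []
    signal_len = len(SHINE_DALGARNO) + SPACER + len(START)
    for pos in range(len(mrna) - signal_len + 1):
        if not mrna.startswith(SHINE_DALGARNO, pos):
            continue
        start_at = pos + len(SHINE_DALGARNO) + SPACER
        if mrna[start_at:start_at + 3] != START:
            continue
        protein = []
        i = start_at + 3
        while i + 3 <= len(mrna):
            codon = mrna[i:i + 3]
            if codon == STOP:
                proteins.append(protein)
                break
            # Simplification of this sketch: an in-frame Start codon has
            # no entry in the table and is skipped.
            if codon in GENETIC_CODE:
                protein.append(GENETIC_CODE[codon])
            i += 3
    return proteins


mrna = "11" + "011011" + "1010" + "000" + "100" + "011" + "110" + "001" + "11"
print(translate(mrna))
```

Because recognition is purely pattern-based, several genes can be found on the same mRNA, and a single mutation in a signal can create or destroy a gene.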
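To make the triangular contribution concrete, here is a small Python sketch. It assumes each protein is summarised by three hypothetical parameters, a position `m` on the trait axis, a half-width `w` and a height `h`; how these are derived from the primary sequence is not covered here.

```python
def triangle(x: float, m: float, w: float, h: float) -> float:
    """Triangular function peaking at h for x == m and reaching 0 at m +/- w."""
    return h * max(0.0, 1.0 - abs(x - m) / w)


def phenotype(x: float, proteins) -> float:
    """Activation level of trait x: the sum of all protein contributions."""
    return sum(triangle(x, m, w, h) for (m, w, h) in proteins)


# Two hypothetical proteins with overlapping triangles, as (m, w, h) tuples.
proteins = [(0.3, 0.1, 0.8), (0.35, 0.1, 0.4)]

for x in (0.2, 0.3, 0.35, 0.5):
    print(x, phenotype(x, proteins))
```

Where triangles overlap, their heights add up, so several proteins can jointly shape the activation level of the same traits.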
Upon replication of an individual, mutations may occur in the sequence. They do not have a predefined fitness effect, as they are applied to positions drawn at random in the genome and the new phenotype is computed afterward.
Mutations can be chromosomal rearrangements (inversions, translocations, duplications, or deletions), or local events (point mutations, InDels). Their rates are per-base and are defined for each mutation type separately.
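The sketch below illustrates a few of these events on a binary genome, together with one way of turning a per-base rate into a number of events. It rests on several assumptions: the genome is treated as a linear string, an inversion is modelled as replacing a segment by its reverse complement, and the number of events is obtained from one Bernoulli trial per base. The function names are hypothetical.

```python
import random


def reverse_complement(seq: str) -> str:
    """Complement a binary sequence (0 <-> 1) and reverse it."""
    return seq.translate(str.maketrans("01", "10"))[::-1]


def substitution(genome: str, pos: int) -> str:
    """Local mutation: flip the base at pos."""
    flipped = "1" if genome[pos] == "0" else "0"
    return genome[:pos] + flipped + genome[pos + 1:]


def inversion(genome: str, a: int, b: int) -> str:
    """Balanced rearrangement: genome[a:b] becomes its reverse complement."""
    return genome[:a] + reverse_complement(genome[a:b]) + genome[b:]


def duplication(genome: str, a: int, b: int, target: int) -> str:
    """Unbalanced rearrangement: a copy of genome[a:b] is inserted at target."""
    return genome[:target] + genome[a:b] + genome[target:]


def deletion(genome: str, a: int, b: int) -> str:
    """Unbalanced rearrangement: genome[a:b] is removed."""
    return genome[:a] + genome[b:]


def count_events(length: int, rate: float, rng: random.Random) -> int:
    """Number of events for a per-base rate: one Bernoulli trial per base."""
    return sum(rng.random() < rate for _ in range(length))


rng = random.Random(42)
genome = "0011010111001010"
for _ in range(count_events(len(genome), 0.1, rng)):
    genome = substitution(genome, rng.randrange(len(genome)))
print(genome)
```

Since positions are drawn at random and the phenotype is recomputed afterwards, the fitness effect of each event emerges from the decoding process rather than being assigned in advance.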
The population follows a generational model: the whole population is replaced at each generation, and individuals compete to populate the next generation. This competition can be either global or local, within a predefined neighborhood, except in the Eukaryote model, which only allows for global selection. Reproduction is asexual in Standard and 4-Bases Aevol, while it is sexual in Eukaryote Aevol. In the latter case, selfing can occur at a predefined rate, and there is always one meiotic recombination upon gamete production.
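A generational step with global competition and asexual reproduction could look like the Python sketch below. The selection scheme is an assumption: parents are drawn with probability proportional to fitness, which is only one of several possible schemes and is not claimed to be the one Aevol applies. The names `next_generation`, `fitness` and `mutate` are hypothetical.

```python
import random


def next_generation(population, fitness, mutate, rng: random.Random):
    """Replace the whole population by offspring of fitness-weighted parents.

    Assumptions: global competition, constant population size, and a
    fitness-proportionate draw of one parent per offspring.
    """
    weights = [fitness(individual) for individual in population]
    parents = rng.choices(population, weights=weights, k=len(population))
    return [mutate(parent) for parent in parents]


# Toy example: individuals are binary genomes, fitness counts the 1s,
# and replication is error-free.
rng = random.Random(0)
population = ["0000", "0110", "1111", "0001"]
offspring = next_generation(
    population,
    fitness=lambda genome: genome.count("1"),
    mutate=lambda genome: genome,
    rng=rng,
)
print(offspring)
```

In the toy example the genome `"0000"` has zero fitness and therefore never reproduces. A local variant would restrict the candidate parents of each offspring to a neighborhood instead of the whole population.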