Aevol is a forward-in-time evolutionary simulator that simulates the evolution of a population of organisms through a process of variation and selection.
The design of the model focuses on the realism of the genome structure and of the mutational process: mutations act directly on the sequence, without any a priori fitness effect.
Aevol can therefore be used to decipher the effect of different operators or processes on genome evolution.
To run Aevol, you must first initiate a population of organisms with the desired parameters (Sections Parameter File and Initiate a Simulation),
and then run the simulation for the desired number of steps (Section Run a simulation).
The simulations can be further analyzed afterward with dedicated tools (Section Post-Treatments).
Aevol exists in three flavors: Standard, 4-Bases and Eukaryote.
Most of the model is identical across the different flavors, and variations will be presented explicitly throughout the model presentation.
In short:
In Standard Aevol, each organism is asexual, haploid, and has a single circular chromosome. The genome is encoded as a double-strand binary string. This flavor is inspired by prokaryotic organisms.
In 4-Bases Aevol, the genome is not binary but encoded with $4$ letters. That changes the genotype-to-phenotype map, but not the rest of the model.
In Eukaryote Aevol, each organism owns two linear chromosomes, and the reproduction is sexual and includes a meiotic recombination event. The genome is binary.
Note: although the 4-Bases and Eukaryote versions should work together, this has not been sufficiently tested yet.
This guide assumes that Aevol is fully installed on your computer and that its executables can be run directly from the command line.
If this is not the case, you can use the full path to the Aevol executables instead of their name.
For example, if you have built Aevol in /home/login/aevol/build, you can use:
/home/login/aevol/build/bin/aevol_2b_create --help # beware of the additional bin level
instead of
aevol_2b_create --help
Please also note that each independent Aevol simulation should be run in its own separate directory.
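The two notes above can be combined in a short shell sketch. It assumes the hypothetical build location used in the example (/home/login/aevol/build); the `runs/seed_N` directory names are purely illustrative and are not something Aevol requires.

```shell
# Hypothetical build location (taken from the example above); adjust to your setup.
AEVOL_BIN="/home/login/aevol/build/bin"

# Put the executables on PATH so they can be called by name.
if [ -d "$AEVOL_BIN" ]; then
    PATH="$AEVOL_BIN:$PATH"
    export PATH
fi

# Report whether the Standard-flavor creation tool is reachable by name.
if command -v aevol_2b_create >/dev/null 2>&1; then
    echo "aevol_2b_create found"
else
    echo "aevol_2b_create not found: use its full path instead"
fi

# One directory per independent simulation (names are illustrative).
for seed in 1 2 3; do
    mkdir -p "runs/seed_$seed"
done
```

Each simulation would then be created and run from inside its own `runs/seed_N` directory.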
Parameter File
To run a simulation, you will generally want to provide a parameter file to specify the experimental conditions for the run (it is also possible to use default parameters, but it is not recommended).
A parameter file contains one parameter per line: a keyword followed by its arguments, separated by spaces.
Comments must be preceded by a # sign.
Examples are provided in the example directory on the GitLab.
| Parameter name | Default value(s) | Description |
|---|---|---|
| SEED | \(5,000\) | Seed used to initialize the random number generators |
| WORLD_SIZE | \(32 \times 32\) | Width and height of the world toroidal grid. It gives the total number of individuals and their geographic structure |
| CHROMOSOME_INITIAL_LENGTH | \(5,000\) | Used only when no chromosome file is provided: initial length of the chromosome to generate |
| SELECTION_SCOPE | local \(3 \times 3\) | Type of selection scope (local or global), and in case of a local selection, width and height of the patch on which the local competition is done |
| SELECTION_SCHEME | fitness_proportionate 1000 | Selection method and associated parameter |
| POINT_MUTATION_RATE | \(5 \times 10^{-5}\) | Per base substitution rate |
| SMALL_INSERTION_RATE | \(5 \times 10^{-5}\) | Per base small insertion rate |
| SMALL_DELETION_RATE | \(5 \times 10^{-5}\) | Per base small deletion rate |
| DUPLICATION_RATE | \(5 \times 10^{-5}\) | Per base duplication rate |
| DELETION_RATE | \(5 \times 10^{-5}\) | Per base deletion rate |
| TRANSLOCATION_RATE | \(5 \times 10^{-5}\) | Per base translocation rate |
| INVERSION_RATE | \(5 \times 10^{-5}\) | Per base inversion rate |
| MAX_INDEL_SIZE | 6 | Maximal size of the small deletions and small insertions |
| ENV_ADD_GAUSSIAN | | Add a Gaussian component to the phenotypic target |
| MAX_TRIANGLE_WIDTH \(^{1}\) | \(0.033333333\) | Maximum width of the metabolic contribution of a gene to the phenotype (~level of pleiotropy) |
| CHECKPOINT_STEP | \(1,000\) | Interval between 2 checkpoints |
| RECORD_TREE | ON \(1,000\) | Whether to record the genealogical trees (containing all the mutational events) and at which interval |
| STATS_BEST | ON \(1\) | Whether to record statistics about the best individual and at which interval |
| STATS_POP | ON \(1\) | Whether to record statistics about the whole population and at which interval |

(1): Note that MAX_TRIANGLE_WIDTH is a scaling factor for the \(w\) parameter of a protein (see this section of the model description)
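
As an illustration, the sketch below writes a small parameter file from the shell, using keywords and default values from the table above. The spelling of multi-valued arguments (such as `WORLD_SIZE 32 32` or `RECORD_TREE ON 1000`) and the scientific notation used for the rates are assumptions inferred from the "keyword followed by space-separated arguments" rule; check them against the files in the example directory before relying on them.

```shell
# Write a minimal parameter file (the name matches the one used later in this guide).
# The argument spellings below are assumptions, not a verified reference.
cat > parameter_file.in <<'EOF'
# Lines starting with '#' are comments.
SEED 5000
WORLD_SIZE 32 32
CHROMOSOME_INITIAL_LENGTH 5000
POINT_MUTATION_RATE 5e-5
SMALL_INSERTION_RATE 5e-5
SMALL_DELETION_RATE 5e-5
MAX_INDEL_SIZE 6
CHECKPOINT_STEP 1000
RECORD_TREE ON 1000
EOF

# Count the parameter lines (non-empty, non-comment) as a quick sanity check.
grep -c -v -e '^#' -e '^$' parameter_file.in
```

Parameters left out of the file keep their default values, so only the settings that define the experiment need to be listed.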
Some parameters are specific to Eukaryote Aevol:
| Parameter name | Default value(s) | Description |
|---|---|---|
| SELFING_RATE | \(0\) | Probability of self-fertilization at each reproduction event |
| ALIGN_SCORE | | Minimal alignment score required to perform a meiotic recombination |

Initiate a simulation (aevol_create)
Warning
Any new simulation must be run in a new directory.
There are two main ways to initiate a simulation: from scratch, using a randomly generated initial genome, or by providing a sequence (usually a WildType).
From scratch
When creating a new simulation from scratch, a simple bootstrapping method is used to generate the initial genome:
randomly generated genomes whose fitness is lower than that of a genome with no genes are discarded.
This implies that the retained genome encodes at least one beneficial gene.
For Standard Aevol
aevol_2b_create parameter_file.in
For 4-Bases Aevol
aevol_4b_create parameter_file.in
For Eukaryote Aevol
aevol_eukaryote_2b_create parameter_file.in
For Eukaryote Aevol, the current recommendation is to use the provided WildTypes.
Starting from scratch generally yields one functional chromosome and one empty chromosome.
This is due to dosage imbalance when the first genes are duplicated and to a strong founder effect.
The recommended way to bootstrap a eukaryotic run is to generate a haploid organism (with the Standard version of the model) with a halved phenotypic target, and then perform a whole genome duplication by creating a second copy of the obtained chromosome.
However, going from a circular chromosome to a linear chromosome may break essential genes and reduce fitness in the process.
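The duplication step can be sketched on a toy sequence as follows. This is only an illustration: the file names, the header labels, and the idea that a two-chromosome genome is written as two FASTA records are all assumptions, not a documented Aevol format, and the toy sequence stands in for a chromosome evolved with the Standard version.

```shell
# Toy haploid genome standing in for an evolved Standard Aevol chromosome
# (hypothetical file name and content).
printf '>haploid\n0110100111\n0011\n' > haploid.fa

# Concatenate the sequence lines, then write the same sequence twice under
# two headers (the two-record layout is an assumption).
awk '!/^>/ { seq = seq $0 }
     END   { printf ">chromosome_1\n%s\n>chromosome_2\n%s\n", seq, seq }' \
    haploid.fa > diploid.fa

cat diploid.fa
```

Both records carry the same sequence, so each gene is present in two copies, which is why the haploid organism is evolved against a halved phenotypic target.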
From a WildType
Note that example sequence files with pre-evolved organisms are provided in the example directory.
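A minimal sketch of this second route for Standard Aevol, based on the `--fasta` option listed in the usage below. The helper name `create_from_wild_type` and the sequence file name `wild_type.fa` are hypothetical; the latter stands for one of the provided example sequence files. The command is only run when the executable and both files are present.

```shell
# create_from_wild_type PARAM_FILE SEQ_FILE
# Hypothetical helper: creates a Standard Aevol simulation from a given sequence.
create_from_wild_type() {
    if ! command -v aevol_2b_create >/dev/null 2>&1; then
        echo "aevol_2b_create not found on PATH" >&2
        return 1
    fi
    if [ ! -f "$1" ] || [ ! -f "$2" ]; then
        echo "missing parameter file or sequence file" >&2
        return 1
    fi
    # Load the genome from the sequence file instead of generating it.
    aevol_2b_create "$1" --fasta "$2"
}

# wild_type.fa is a placeholder name for a provided example sequence file.
create_from_wild_type parameter_file.in wild_type.fa || echo "creation skipped"
```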
Usage of aevol_create (output of aevol_create --help)
    aevol_create: create an experiment with setup as specified in PARAM_FILE.

    Usage : aevol_create -h or --help
       or : aevol_create -V or --version
       or : aevol_create [PARAM_FILE] [--fasta SEQ_FILE]

    Options
      -h, --help
          print this help, then exit
      -V, --version
          print version number, then exit
      --fasta SEQUENCE_FILE
          load sequences from given file (in fasta format) instead of generating it

Run a simulation (aevol_run)
For Standard Aevol
aevol_2b_run
For 4-Bases Aevol
aevol_4b_run
For Eukaryote Aevol
aevol_eukaryote_2b_run
Usage of aevol_run (output of aevol_run --help)
    aevol_run: run an aevol simulation.

    Usage : aevol_run -h or --help
       or : aevol_run -V or --version
       or : aevol_run [-b TIMESTEP] [-e TIMESTEP] [-p NB_THREADS] [-v]

    Options
      -h, --help
          print this help, then exit
      -V, --version
          print version number, then exit
      -b, --begin TIMESTEP
          specify time t0 to resume simulation at (default read in last_gener.txt)
      -e, --end TIMESTEP
          specify time of the end of the simulation
          (if omitted, run for 1000 timesteps)
      -p, --parallel NB_THREADS
          run on NB_THREADS threads (use -1 for system default)
      -v, --verbose
          be verbose
      --ui-output-dir UI_OUTDIR
          directory in which to output data for the UI
      --ui-output-frequency NB_GENER
          frequency at which to output data for the UI

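As an example, a run can be split into two legs using only the options listed above. This sketch assumes Standard Aevol, a simulation already created in the current directory, and that a checkpoint exists at the timestep passed to `-b` (with the default `CHECKPOINT_STEP` of 1,000, timestep 5000 would qualify). The helper name `run_leg`, the timesteps, and the thread count are arbitrary illustrations.

```shell
# run_leg BEGIN END
# Hypothetical helper: resume at timestep BEGIN and run until timestep END.
run_leg() {
    if ! command -v aevol_2b_run >/dev/null 2>&1; then
        echo "aevol_2b_run not found on PATH" >&2
        return 127
    fi
    # 4 threads is an arbitrary choice; -1 would use the system default.
    aevol_2b_run -b "$1" -e "$2" -p 4
}

# First leg from timestep 0 to 5000, then resume from 5000 up to 10000.
run_leg 0 5000 && run_leg 5000 10000 || echo "run skipped or failed"
```

Splitting a long run this way lets intermediate results be inspected before the remaining timesteps are committed.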
Post-Treatments
Reconstruct a lineage
The lineage of a given individual can be reconstructed from the tree files, provided these tree files have been saved at runtime (see Section Parameter File).
Usage of aevol_post_lineage (output of aevol_post_lineage --help)
    aevol_post_lineage:
    Reconstruct the lineage of a given individual from the tree files

    Usage : aevol_post_lineage -h or --help
       or : aevol_post_lineage -V or --version
       or : aevol_post_lineage [-b TIMESTEP] [-e TIMESTEP] [-I INDEX] [-F] [-v]

    Options
      -h, --help
          print this help, then exit
      -V, --version
          print version number, then exit
      -b, --begin TIMESTEP
          specify time t0 up to which to reconstruct the lineage
      -e, --end TIMESTEP
          specify time t_end of the indiv whose lineage is to be reconstructed
      -I, --index INDEX
          specify the index of the indiv whose lineage is to be reconstructed
          (default: treat only the best)
      -F, --full-check
          perform genome checks whenever possible
      -v, --verbose
          be verbose

Note that running aevol_2b_post_lineage with no options reconstructs the lineage over the whole simulation, starting from the best individual of the final generation (the beginning is \(0\), and the end is the last computed generation).
Examples
    # Reconstruct the lineage of the best individual at generation 1000 (Standard Aevol)
    aevol_2b_post_lineage -b 0 -e 1000

    # Reconstruct the lineage of the individual with index 42 at generation 1000,
    # starting at generation 500 (4-Bases Aevol)
    aevol_4b_post_lineage -b 500 -e 1000 -I 42

Usage of aevol_post_ancestor_stats (output of aevol_post_ancestor_stats --help)
    aevol_post_ancestor_stats:
    Compute statistics on ancestry described in provided lineage file.

    Usage : aevol_post_ancestor_stats -h or --help
       or : aevol_post_ancestor_stats -V or --version
       or : aevol_post_ancestor_stats [-FMv] [-p NB_THREADS] LINEAGE_FILE PARAM_FILE

    Options
      -h, --help
          print this help, then exit
      -V, --version
          print version number, then exit
      -F, --full-check
          perform genome checks whenever possible
      -M, --trace-mutations
          outputs the fixed mutations (in a separate file)
      -v, --verbose
          be verbose
      -p, --parallel NB_THREADS
          run on NB_THREADS threads (use -1 for system default)

Examples
    # Compute stats along the provided lineage (Standard Aevol)
    aevol_2b_post_ancestor_stats LINEAGE_FILE

    # Compute stats and trace mutations along the provided lineage (4-Bases Aevol)
    aevol_4b_post_ancestor_stats -M LINEAGE_FILE

Other post-evolution analyses
With perfect knowledge of everything that happened during evolution, it is possible to dig further into the experiments and study specific evolutionary behaviors.
Additional post-treatments of the data can be developed for specific needs.