Massive Eco-Evolutionary Synthesis Simulations (MESS) - API Tutorial
This is the second part of the MESS tutorial, in which we introduce the API mode using Jupyter notebooks. It is meant as a broad introduction, but it assumes you have already completed the CLI tutorial and are familiar with some of the parameters and terminology. As an example in this tutorial we will use the spider community data set from La Reunion published by Emerson et al. (2017). You can follow along with one of the other example datasets if you like; the procedure will be identical, although your results will vary.
- Running and connecting to a Jupyter notebook
- Create and parameterize a new MESS Region
- Run MESS serial simulations in API mode
- Run MESS parallel simulations in API mode
- Adding more simulations to a region
- Using simulations to perform statistical inference
- References
Running and connecting to a Jupyter notebook
For the purposes of this tutorial, all command interactions will take place inside a Jupyter notebook running on your personal computer. For the most part we will be writing and executing Python commands. Jupyter is already installed as a dependency of MESS, but if you need help getting a server running, see the excellent Jupyter notebook documentation.
Create and parameterize a new MESS Region
MESS API mode lets you dive under the hood of the CLI mode a bit. You
have all the power of the CLI mode, yet more flexibility. The first step
in API mode is to create a MESS Region. A Region encompasses a very
large Metacommunity and a much smaller local community which is
connected to it by colonization. In creating a Region the only thing
you’re required to pass in is a name, so let’s go with “LaReunion” (no
spaces!), as this is the region the empirical data is drawn from.
import MESS

reunion = MESS.Region("LaReunion")
print(reunion.get_params())
------- MESS params file (v.0.1.0)----------------------------------------------
LaReunion ## [0] [simulation_name]: The name of this simulation scenario
./default_MESS ## [1] [project_dir]: Where to save files
0 ## [2] [generations]: Duration of simulations. Values/ranges Int for generations, or float [0-1] for lambda.
neutral ## [3] [community_assembly_model]: Model of Community Assembly: neutral, filtering, competition
point_mutation ## [4] [speciation_model]: Type of speciation process: none, point_mutation, protracted, random_fission
2.2e-08 ## [5] [mutation_rate]: Mutation rate scaled per base per generation
2000 ## [6] [alpha]: Abundance/Ne scaling factor
570 ## [7] [sequence_length]: Length in bases of the sequence to simulate
------- Metacommunity params: --------------------------------------------------
100 ## [0] [S_m]: Number of species in the regional pool
750000 ## [1] [J_m]: Total # of individuals in the regional pool
2 ## [2] [speciation_rate]: Speciation rate of metacommunity
0.7 ## [3] [death_proportion]: Proportion of speciation rate to be extinction rate
2 ## [4] [trait_rate_meta]: Trait evolution rate parameter for metacommunity
1 ## [5] [ecological_strength]: Strength of community assembly process on phenotypic change
------- LocalCommunity params: Loc1---------------------------------------------
Loc1 ## [0] [name]: Local community name
1000 ## [1] [J]: Number of individuals in the local community
0.01 ## [2] [m]: Migration rate into local community
0 ## [3] [speciation_prob]: Probability of speciation per timestep in local community
These are all the parameters of the model. The defaults are chosen to
reflect a typical oceanic island arthropod community. Don’t worry at
this point about all the parameters; let’s focus for now on
community_assembly_model, the size of the local community (J), and the
rate of migration from the metacommunity to the local community (m). We
will set parameter ranges for these, and each simulation will sample a
random value from this range. In a new cell use the set_param() method
to change these values:
reunion.set_param("community_assembly_model", "*")
reunion.set_param("J", "1000-10000")
reunion.set_param("m", "0.001-0.01")
NB: Setting the community_assembly_model to * indicates that we want to sample uniformly among all three of the model types: neutral, competition, and environmental filtering.
Print the params again to prove to yourself that the ranges are now set:
print(reunion.get_params())
------- MESS params file (v.0.1.0)----------------------------------------------
LaReunion ## [0] [simulation_name]: The name of this simulation scenario
./default_MESS ## [1] [project_dir]: Where to save files
0 ## [2] [generations]: Duration of simulations. Values/ranges Int for generations, or float [0-1] for lambda.
* ## [3] [community_assembly_model]: Model of Community Assembly: neutral, filtering, competition
point_mutation ## [4] [speciation_model]: Type of speciation process: none, point_mutation, protracted, random_fission
2.2e-08 ## [5] [mutation_rate]: Mutation rate scaled per base per generation
2000 ## [6] [alpha]: Abundance/Ne scaling factor
570 ## [7] [sequence_length]: Length in bases of the sequence to simulate
------- Metacommunity params: --------------------------------------------------
100 ## [0] [S_m]: Number of species in the regional pool
750000 ## [1] [J_m]: Total # of individuals in the regional pool
2 ## [2] [speciation_rate]: Speciation rate of metacommunity
0.7 ## [3] [death_proportion]: Proportion of speciation rate to be extinction rate
2 ## [4] [trait_rate_meta]: Trait evolution rate parameter for metacommunity
1 ## [5] [ecological_strength]: Strength of community assembly process on phenotypic change
------- LocalCommunity params: Loc1---------------------------------------------
Loc1 ## [0] [name]: Local community name
1000-10000 ## [1] [J]: Number of individuals in the local community
0.001-0.01 ## [2] [m]: Migration rate into local community
0 ## [3] [speciation_prob]: Probability of speciation per timestep in local community
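To make the idea of parameter ranges concrete, here is a minimal sketch of how a specification such as "1000-10000" or "*" could be turned into one concrete value per simulation. The helper sample_param() is hypothetical and is not part of the MESS API. It assumes integer ranges are sampled uniformly with both endpoints included and float ranges uniformly over the interval; MESS itself may sample differently.

```python
import random

# The three model names listed in the params file for community_assembly_model.
ASSEMBLY_MODELS = ["neutral", "filtering", "competition"]


def sample_param(value, rng):
    """Draw one concrete value from a parameter specification string.

    Hypothetical helper for illustration only; not part of the MESS API.
    """
    if value == "*":
        # Wildcard: pick one of the assembly models with equal probability.
        return rng.choice(ASSEMBLY_MODELS)
    low, sep, high = value.partition("-")
    if not sep:
        # No range separator, so this is a fixed value.
        return value
    try:
        if "." in value:
            # Float range such as "0.001-0.01" (assumed uniform).
            return rng.uniform(float(low), float(high))
        # Integer range such as "1000-10000" (assumed uniform, inclusive).
        return rng.randint(int(low), int(high))
    except ValueError:
        # Contains "-" but is not a low-high range, e.g. "2.2e-08".
        return value


rng = random.Random(42)  # seeded so the sketch is repeatable
for _ in range(3):
    draw = {
        "community_assembly_model": sample_param("*", rng),
        "J": sample_param("1000-10000", rng),
        "m": sample_param("0.001-0.01", rng),
    }
    print(draw)
```

Each printed dictionary stands for the parameter values one simulation would use, which is why a batch of simulations run from the same Region can differ in model, J, and m.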
Run MESS serial simulations in API mode
Now we can run community assembly simulations given our new parameterization
using the run() method. There is one required argument to this method
(sims) which indicates the number of independent community assembly
realizations to perform.
reunion.run(sims=1)
Generating 1 simulation(s).
[####################] 100% Finished 0 simulations | 0:01:02 |
Run MESS parallel simulations in API mode
Like the CLI, the MESS API can make use of all the cores you can throw at it thanks to integration with the very nice ipyparallel library. Take a moment to launch an ipcluster instance.
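If you have not used ipyparallel before, the following is one possible way to get a client object, sketched under the assumption that a cluster was started beforehand in a terminal with something like `ipcluster start -n 4` and that the default profile is in use. The function name connect_ipyclient() is made up for this tutorial; only ipyparallel's Client class is a real API.

```python
def connect_ipyclient():
    """Connect to an already-running ipcluster and report its engine count.

    Illustrative helper; the name is hypothetical, not part of MESS.
    """
    # Imported inside the function so this cell can be defined even in an
    # environment where ipyparallel is not importable.
    import ipyparallel as ipp

    client = ipp.Client()  # connects using the default profile
    print(f"Connected to {len(client)} engine(s)")
    return client


# In a notebook, after the cluster has been started in a terminal:
# ipyclient = connect_ipyclient()
```

Engines can take a few seconds to register after the cluster starts, so if the reported count is lower than expected, wait a moment and check len(ipyclient) again.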
Now we assume you have an ipyclient object initialized in your notebook.
The Region.run() method also accepts an optional argument (ipyclient)
for specifying a connection to an ipyparallel backend, allowing for
massive parallelization. Let’s check how many cores our ipyclient is
attached to:
len(ipyclient)
40
Now call run and generate 40 simulations on the 40 cores:
reunion.run(sims=40, ipyclient=ipyclient)
Generating 40 simulation(s).
[####################] 100% Performing Simulations | 0:01:31 |
[####################] 100%
Finished 40 simulations
We have now generated 40 simulations in parallel in (roughly) the same
time it took to generate 1 simulation in serial. I say ‘roughly’ here for
two reasons. First, the simulations are stochastic, and the amount of
time any given simulation will take is Poisson distributed, so sometimes
you’ll get ‘unlucky’ with one that takes much longer. Second, by default
the generations parameter is 0, which indicates that a lambda value
should be sampled uniformly from the half-open interval [0, 1).
Simulations with small lambda will (on average) run faster than those
with large lambda, so this is another source of variability in runtime.
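The way the generations setting is read can be sketched as follows. This is an illustration of the rules stated in the params file comment and in the paragraph above, not the code MESS uses; resolve_generations() is a hypothetical helper, and treating a float in (0, 1] as a fixed lambda is an assumption based on the "float [0-1] for lambda" comment.

```python
import random


def resolve_generations(value, rng):
    """Return ("lambda", x) or ("generations", n) for a generations setting.

    Hypothetical helper for illustration only; not part of the MESS API.
    """
    if value == 0:
        # Default: draw lambda uniformly from the half-open interval [0, 1).
        return ("lambda", rng.random())
    if isinstance(value, float) and 0 < value <= 1:
        # A float in (0, 1] is read as a fixed lambda (assumption).
        return ("lambda", value)
    if isinstance(value, int) and value > 0:
        # A positive integer is read as a fixed number of generations.
        return ("generations", value)
    raise ValueError(f"unsupported generations value: {value!r}")


rng = random.Random(1)  # seeded so the sketch is repeatable
for setting in (0, 0.5, 500):
    print(setting, "->", resolve_generations(setting, rng))
```

With the default of 0, every simulation gets its own lambda, so two simulations launched together can stop at very different points.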
Adding more simulations to a region
As with the CLI mode, if you find you need to add more simulations to a
Region, for whatever reason, you can simply call run() again, and this
will append the new simulations to what has already been run.
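Because repeated calls to run() accumulate, a large set of simulations can be built up in smaller batches. The wrapper below is a minimal sketch of that pattern. run_in_batches() and StubRegion are hypothetical names introduced here; the stub only stands in for a real Region so that the example runs without MESS installed. The only MESS calls assumed are run(sims=...) and run(sims=..., ipyclient=...), as used earlier in this tutorial.

```python
def run_in_batches(region, total, batch_size, ipyclient=None):
    """Call region.run() repeatedly until `total` simulations are requested.

    Hypothetical convenience wrapper; `region` is expected to be a MESS
    Region, but any object with a compatible run() method works.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    done = 0
    while done < total:
        n = min(batch_size, total - done)
        if ipyclient is None:
            region.run(sims=n)
        else:
            region.run(sims=n, ipyclient=ipyclient)
        done += n
    return done


class StubRegion:
    """Stand-in for MESS.Region so the sketch runs without MESS installed."""

    def __init__(self):
        self.total_sims = 0

    def run(self, sims, ipyclient=None):
        # Each call adds to the running total, mimicking the append behaviour.
        self.total_sims += sims


stub = StubRegion()
run_in_batches(stub, total=100, batch_size=40)
print(stub.total_sims)  # 100, requested as batches of 40, 40 and 20
```

With a real Region you would pass reunion (and optionally ipyclient) in place of the stub. Working in batches also means an interrupted session loses at most one batch.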
Using simulations to perform statistical inference
You can now proceed to the MESS Machine Learning Tutorial to learn how to use the simulations to perform model selection and parameter estimation on real data.
References
Emerson, B. C., Casquet, J., López, H., Cardoso, P., Borges, P. A.,
Mollaret, N., … & Thébaud, C. (2017). A combined field survey and
molecular identification protocol for comparing forest arthropod
biodiversity across spatial scales. Molecular Ecology Resources, 17(4),
694-707.