This tutorial is modified from Taming the BEAST material. You can find the source tutorial, data and author information here.

Requirements

The programs used in this tutorial are listed below. In case you have not done it yet, please follow the installation instructions here before moving on.

Background

In this tutorial, we will study the phylodynamics of the Respiratory syncytial virus subgroup A (RSVA). RSVA infects the human lower respiratory tract, causing symptoms that are often indistinguishable from the common cold. By age 3, nearly all children will be infected, and a small percentage (<3%) will develop a more serious inflammation of the bronchioles requiring hospitalisation.

Goals

The goal of our phylodynamic analysis is to estimate:

  • the rate of molecular evolution (of RSVA’s G gene);
  • the date of the most recent common ancestor (MRCA) of all viral samples (i.e., the tree’s root), interpreted as the first infection event with respect to the samples at hand;
  • the phylogenetic relationships among viral samples.

Data

Our data is comprised of 129 molecular sequences coding for RSVA’s G protein (Zlateva et al., 2004; Zlateva et al., 2005), a glycoprotein that allows the virus to attach itself to the host cells’ membranes, starting the infection. Importantly, our data is time stamped (also referred to as heterochronous or serial data) because it was collected at multiple time points, from 1956–2002.

Below we further detail how we will treat our molecular data in our analyses.

The NEXUS alignment

The data is available in a file named RSV2.nex. If you have installed LPhy Studio, you don’t need to download the data separately. It comes bundled with the studio and can be accessed in the “tutorials/data” subfolder within the installation directory of LPhy Studio.

Multiple partitions

Our molecular alignment corresponds to a 629-bp coding sequence of RSVA’s G protein. We are going to partition our alignment into three compartments corresponding to the first, second and third codon positions.

Tip dates

In phylodynamic and macroevolutionary analyses, quite often our samples (viral sequences from infected individuals in the former case and from different species in the latter, for example) were collected only at the present moment. In such cases we assign our samples an age of 0.0 to represent the present – this is our default.

As mentioned above, however, our sequences have been sampled at various dates going back to the 1950s. We can see the date of each of our samples (sequences) in the NEXUS file, after the last lower-case s. This date can be read as years since 1900.

Whenever tip dates are heterochronous and used as data, they can be read in natural forward time (here, we refer to them simply as “dates”) or backward time (in which case we call them “ages”).

Whether we treat the tip times as dates or ages internally in the code, however, depends on how the models we want to use are parameterized. The same goes for how we output the internal node times when we estimate them during phylogenetic inference. Regardless, the times are first converted into scalar values from 0.0 to some value > 0.0 (in days, years, millions of years, etc.).

LPhy’s behavior is to treat tip times always as ages.

Inputting the script into LPhy Studio

LPhy Studio implements a GUI for users to specify and visualize probabilistic graphical models, as well as for simulating data under those models. These tasks are executed according to an LPhy script the user types (or loads) on LPhy Studio’s interactive terminal. If you are not familiar with LPhy language, we recommend reading the language features before getting started.

You can input a script into LPhy Studio either by using the File menu or by directly typing or pasting the script code into the console in LPhy Studio. Please note if you are working in the console, you do not need to add data and model keywords and curly brackets to define the code blocks. We are supposed to add the lines without the data { } and model { } to the command line console at the bottom of the window, where the data and model tabs in the GUI are used to specify which block we are working on.

Below, we will build an LPhy script in two parts, the data and the model blocks.

Data block

The data { ... } block is necessary when we use LPhy Studio to prepare instruction input files for inference software (e.g., BEAST 2, RevBayes, etc.). The purpose of this block is to tell LPhy which nodes of our graphical model are to be treated as known constants (and not to be sampled by the inference software) because they are observed data. Elsewhere, this procedure has been dubbed “clamping” (Höhna et al., 2016).

In this block, we will either type strings representing values to be directly assigned to scalar variables, or use LPhy’s syntax to extract such values from LPhy objects, which might be read from file paths given by the user.

(Note that keyword data cannot be used to name variables because it is reserved for defining scripting blocks as outlined above.)

In order to start specifying the data { ... } block, make sure you type into the “data” tab of the command prompt, by clicking “data” at the bottom of LPhy Studio’s window.

data {
  options = {ageDirection="forward", ageRegex="s(\d+)$"};
  D = readNexus(file="data/RSV2.nex", options=options);
  codon = D.charset(["3-629\3", "1-629\3", "2-629\3"]);
  L = codon.nchar();
  n = length(codon);
  taxa = D.taxa();
}

Our data here consist of molecular alignments and the sample dates.

We start by parsing the latter with regular expression "s(\d+)" when defining some options in options={}. This tells LPhy how to extract the sample times from the NEXUS file, and then we specify that we want to read those times as dates (i.e., forward in time). After all, our sample times are given to us in natural time.

In order to check if you have set the sample times correctly, click the graphical component taxa and check the column “Age”. The most recent sequences (i.e., from 2002 and that end with s102) should have an age of 0.0, while all other tips should be > 0.0. This is because, as mentioned above, LPhy always treats sample times as ages.

The alignment D is imported from a Nexus file “RSV2.nex”. The argument file="data/RSV2.nex" is used to look for this file under the relative path “data”, with its parent directory being the current working directory. However, this could lead to the issue if the working directory is not the parent directory of “data”, especially when using the LPhy Studio console. To avoid potential issues with relative paths, a simple solution is to use the absolute path in the argument file="...".

Then we must parse the molecular alignments, which we do when initializing variable D. Note that our open reading frame (ORF) starts in position 3, which must be reflected by the arguments we pass on to the charset() function: first ("3-629\3"), second ("1-629\3") and third ("2-629\3") codon positions. Finally, we use the last three lines to set the number of loci L, the number of taxa n, and the taxa themselves, taxa. Note that n=length(codon); is equivalent to n=3; because length() here is extracting the number of partitions in codon.

If you want to double-check everything you have typed, click the “Model” tab in the upper right panel.

Model block

The model { ... } block is the main protagonist of our scripts. This is where you will specify the many nodes and sampling distributions that characterize a probabilistic model.

(Note that keyword model cannot be used to name variables because it is reserved for defining scripting blocks as outlined above.)

In order to start specifying the model { ... } block, make sure you type into the “model” tab of the command prompt, by clicking “model” at the bottom of LPhy Studio’s window.

We will specify sampling distributions for the following parameters, in this order:

  1. π, the equilibrium nucleotide frequencies (with 4 dimensions, one for each nucleotide);
  2. κ, twice the transition:transversion bias assuming equal nucleotide frequencies (with 3 dimensions, one for each partition);
  3. r, the relative substitution rates (with 3 dimensions, one for each partition);
  4. μ, the global (mean) clock rate;
  5. Θ, the effective population size;
  6. ψ, the time-scaled phylogenetic tree, where the time unit is years.
model {
  π ~ Dirichlet(replicates=n, conc=[2.0, 2.0, 2.0, 2.0]);
  κ ~ LogNormal(sdlog=0.5, meanlog=1.0, replicates=n);
  r ~ WeightedDirichlet(conc=rep(element=1.0, times=n), weights=L);
  μ ~ LogNormal(meanlog=-5.0, sdlog=1.25);
  Θ ~ LogNormal(meanlog=3.0, sdlog=2.0);
  ψ ~ Coalescent(taxa=taxa, theta=Θ);
  codon ~ PhyloCTMC(L=L, Q=hky(kappa=κ, freq=π, meanRate=r), mu=μ, tree=ψ);
}

Note that parameters π, κ and r are 3-dimensional vectors, because they represent the nucleotide equilibrium frequencies, (ts:tv)/2, and relative substitution rates of each of the three partitions, respectively. LPhy conveniently uses vectorization, so PhyloCTMC recognizes that three parameters above are vectors, and automatically takes care of building three separate HKY models, one per partition!

One final remark is that we use rep(element=1.0, times=n) (where n evaluates to 3) to create three concentration vectors (one vector per partition) for the WeightedDirichlet sampling distribution.

Again, if you want to double-check everything you have typed, click the “Model” tab in the upper right panel.

The whole script

Let us look at the whole thing:

data {
  options = {ageDirection="forward", ageRegex="s(\d+)$"};
  D = readNexus(file="data/RSV2.nex", options=options);
  codon = D.charset(["3-629\3", "1-629\3", "2-629\3"]);
  L = codon.nchar();
  n = length(codon);
  taxa = D.taxa();
}
model {
  π ~ Dirichlet(replicates=n, conc=[2.0, 2.0, 2.0, 2.0]);
  κ ~ LogNormal(sdlog=0.5, meanlog=1.0, replicates=n);
  r ~ WeightedDirichlet(conc=rep(element=1.0, times=n), weights=L);
  μ ~ LogNormal(meanlog=-5.0, sdlog=1.25);
  Θ ~ LogNormal(meanlog=3.0, sdlog=2.0);
  ψ ~ Coalescent(taxa=taxa, theta=Θ);
  codon ~ PhyloCTMC(L=L, Q=hky(kappa=κ, freq=π, meanRate=r), mu=μ, tree=ψ);
}

If the script contains no errors, you will be able to see the specified graphical model. Simulated data, such as the alignment and the tree, can be seen in the right-hand panel.

LPhyStudio
Figure 1: A screenshot of LPhy Studio.

As alternatives to typing or copying-and-pasting the script above, you can find it in HTML format here, or you can load the script with “File” > “Examples” and select RSV2.lphy.

Phylogenetic inference with BEAST 2

Once we have read in our data and defined our model, we can move to the task of carrying out statistical inference. We shall try to estimate our parameters of interest, and for the purposes of this tutorial we will do that using the BEAST 2 program.

Producing a BEAST 2 .xml using LPhyBEAST

BEAST 2 retrieves the data and model specifications from a XML file. One of our goals with LPhy is to make configuring Bayesian phylogenetic analysis as painless, clear and precise as possible. In order to achieve that, we will utilize an additional companion application called LPhyBEAST, which acts as a bridge between the LPhy script and the BEAST 2 xml. It is distributed as a BEAST 2 package, please follow the instruction to install it.

To run LPhyBEAST and produce a BEAST 2 xml, you need to use the terminal and execute the script lphybeast. The LPhyBEAST usage can help you get started. In our RSV2.lphy script, the alignment file is assumed to be located under the subfolder tutorials/data/. To generate the XML, navigate to the tutorials folder where LPhy is installed, and run the following command in your terminal. After it completes, check the message in the end to find the location of the generated XML file.

For example, on a Mac, after replacing all “x” with the correct version number, you can execute the following lphybeast command in the terminal to launch LPhyBEAST:

# go to the folder containing lphy script
cd /Applications/lphystudio-1.x.x/tutorials
# run lphybeast
'/Applications/BEAST 2.7.x/bin/lphybeast' RSV2.lphy

If you are not familiar with inputting valid paths in the command line, here is our Tech Help that may help you.

For Windows users, please note that “C:\Program Files” is usually a protected directory. However, you can copy the “examples” and “tutorials” folders with “data” into your “Documents” folder and work in that location to avoid any permission issues. Alternatively, run lphybeast.bat on a Windows terminal like this:

# go to the subfolder containing lphy script
cd "C:\Users\<YourUserName>\Documents\tutorials"
# run lphybeast
"C:\Program Files\BEAST2.7.x\bat\lphybeast.bat" RSV2.lphy

Please note that the single or double quotation marks ensure that the whitespace in the path is treated as valid.

Running BEAST 2

After LPhyBEAST generates a BEAST 2 .xml file (e.g., RSV2.xml), we can use BEAST 2 to load it and start the inferential MCMC analysis. BEAST 2 will write its outputs to disk into specified files in the XML, and also output the progress of the analysis and some summaries to the screen.

                        BEAST v2.7.5, 2002-2023
             Bayesian Evolutionary Analysis Sampling Trees
                       Designed and developed by
 Remco Bouckaert, Alexei J. Drummond, Andrew Rambaut & Marc A. Suchard
                                    
                   Centre for Computational Evolution
                         University of Auckland
                       r.bouckaert@auckland.ac.nz
                        alexei@cs.auckland.ac.nz
                                    
                   Institute of Evolutionary Biology
                        University of Edinburgh
                           a.rambaut@ed.ac.uk
                                    
                    David Geffen School of Medicine
                 University of California, Los Angeles
                           msuchard@ucla.edu
                                    
                      Downloads, Help & Resources:
                           http://beast2.org/
                                    
  Source code distributed under the GNU Lesser General Public License:
                   http://github.com/CompEvol/beast2
                                    
                           BEAST developers:
   Alex Alekseyenko, Trevor Bedford, Erik Bloomquist, Joseph Heled, 
 Sebastian Hoehna, Denise Kuehnert, Philippe Lemey, Wai Lok Sibon Li, 
Gerton Lunter, Sidney Markowitz, Vladimir Minin, Michael Defoin Platel, 
          Oliver Pybus, Tim Vaughan, Chieh-Hsi Wu, Walter Xie
                                    
                               Thanks to:
          Roald Forsberg, Beth Shapiro and Korbinian Strimmer

Random number seed: 1690761638808
    ...

    ...
         950000     -6087.2825     -5472.3274      -614.9550         0.4757         0.2427         0.0811         0.2003         0.3013         0.4030         0.1153         0.1802         0.3066         0.4051         0.0690         0.2192         6.3611         8.3711        11.8388         0.7467         0.9401         1.3119         0.0020        43.5413 1m10s/Msamples
        1000000     -6086.6790     -5492.1138      -594.5652         0.5164         0.2214         0.0844         0.1776         0.2826         0.4603         0.0982         0.1587         0.3159         0.3883         0.0789         0.2167         6.2125         9.8228        12.0430         0.7304         0.8693         1.3989         0.0022        36.3590 1m10s/Msamples

Operator                                                                    Tuning    #accept    #reject      Pr(m)  Pr(acc|m)
ScaleOperator(Theta.scale)                                                 0.60746       1349       3173    0.00450    0.29832 
ScaleOperator(kappa.scale)                                                 0.46834       2790       6812    0.00970    0.29056 
ScaleOperator(mu.scale)                                                    0.78815       1139       3342    0.00450    0.25418 
beast.base.inference.operator.UpDownOperator(muUppsiDownOperator)          0.91110       5756     129034    0.13497    0.04270 Try setting scaleFactor to about 0.955
beast.base.inference.operator.DeltaExchangeOperator(pi_0.deltaExchange)    0.12309       1769       7818    0.00970    0.18452 
beast.base.inference.operator.DeltaExchangeOperator(pi_1.deltaExchange)    0.13164       1562       8115    0.00970    0.16141 
beast.base.inference.operator.DeltaExchangeOperator(pi_2.deltaExchange)    0.16248       1059       8446    0.00970    0.11142 
Exchange(psi.narrowExchange)                                                     -      34229     100430    0.13424    0.25419 
ScaleOperator(psi.rootAgeScale)                                            0.70177        598       3984    0.00450    0.13051 
ScaleOperator(psi.scale)                                                   0.87726       3442     130335    0.13424    0.02573 Try setting scaleFactor to about 0.937
SubtreeSlide(psi.subtreeSlide)                                            26.63974       4626     130081    0.13424    0.03434 Try decreasing size to about 13.32
Uniform(psi.uniform)                                                             -      72553      61985    0.13424    0.53928 
Exchange(psi.wideExchange)                                                       -        357     133871    0.13424    0.00266 
WilsonBalding(psi.wilsonBalding)                                                 -        795     133107    0.13424    0.00594 
beast.base.inference.operator.DeltaExchangeOperator(r.deltaExchange)       0.31697       1511       5933    0.00730    0.20298 

     Tuning: The value of the operator's tuning parameter, or '-' if the operator can't be optimized.
    #accept: The total number of times a proposal by this operator has been accepted.
    #reject: The total number of times a proposal by this operator has been rejected.
      Pr(m): The probability this operator is chosen in a step of the MCMC (i.e. the normalized weight).
  Pr(acc|m): The acceptance probability (#accept as a fraction of the total proposals for this operator).


Total calculation time: 72.275 seconds
End likelihood: -6086.679077440407
Done!

Analysing BEAST 2’s output

We start by opening the Tracer program, and dropping the RSV2.log file onto Tracer’s window. Alternatively you can load the trace (.log) file by clicking “File” > “Import Trace File…”.

Tracer will display several summary statistics computed from the trace file in the left-hand and top-right panels. Other summaries will be displayed graphically. It is all pretty self-evident, just click around to familiarize yourself with the program.

The first thing you will notice from this first run is that the effective sample sizes (ESSs) for many of the logged quantities are small (ESSs less than 100 will be highlighted in red by Tracer). This is not good. A low ESS means that the trace contains a lot of correlated samples and thus may not represent the posterior distribution well. In the bottom-right panel you will see a histogram that looks a bit rough. This is expected when ESSs are low.

The posterior histogram of mu
Figure 2: A screenshot of Tracer showing the sampling histogram of `μ` from a short run.

If you click the “Trace” tab in the upper-right corner, you will see the raw trace, that is, the sampled values against the step in the MCMC chain (Fig. 3).

The trace shows the value of a given statistic for each sampled step of the MCMC chain. It will help you see if the chain is mixing well and to what extent the samples are correlated. Here you can see how the samples are correlated.

The trace of short run
Figure 3: A screenshot of Tracer showing the trace of a short run.

In LPhyBEAST, the default MCMC chain length is 1,000,000. After we remove the burn-in (set to 10%), there will be 1,800 samples in the trace (we ran the MCMC for steps sampling every 500). Note how adjacent samples often tend to have similar values – not great.

The ESS for the mean molecular evolution rate (μ) is about 25 so we are only getting 1 independent sample every 72 samples (to a total of ~1800/25 “effective” samples). With such a short run, it may also be the case that the default burn-in of 10% of the chain length is inadequate. Not excluding enough of the start of the chain as burn-in will render estimates of ESS unreliable.

Let us try to fix this low ESS issue by running the MCMC chain for longer. We will set the chain length to 20,000,000, but still log the same number of samples (2,000). You can now run LPhyBEAST again create a new .xml file:

$BEAST_DIR/bin/lphybeast -l 20000000 -o RSV2long.xml RSV2.lphy

Two extra arguments:

  • -l changes the MCMC chain length (default to 1 million) in the XML;
  • -o replaces the output file name (default to use the same file steam as the lphy input file).

Now run BEAST 2 with the new XML again. It may take about half an hour to complete using modern computers.
If you do not want to wait, you can download the completed log and tree files by running the longer MCMC chain.

Load the new log file into Tracer after it is finished (you can leave the old one loaded for comparison). Click on the “Trace” tab and look at the raw trace plot.

The trace of long run
Figure 4: A screenshot of Tracer showing the trace of a long run.

With this much longer MCMC chain, we still have 1,800 samples after removing the first 10% as burn-in, but our ESSs are all > 200. A large ESS does not mean there is no auto-correlation between samples overall, but instead that we have at least 200 effectively independent samples. More effective samples means we are approximating the posterior better than before, when ESSs were less than 100.

(You should keep an eye for weird trends in the trace plot suggesting that the MCMC chain has not yet converged, or is struggling to converge, like different value “plateaus” attracting the chain for a large number of steps.)

Now that we are satisfied with the length of the MCMC chain and its mixing, we can move on to one of the parameters of interest: the global substitution rate, μ. This is the average substitution rate across all sites in the alignment.

First, select “mu” in the left-hand panel, and click the “Marginal Density” tab in the upper-right corner to see the posterior density plot for this parameter. You should see a plot similar to this:

marginal density
Figure 5: A screenshot of Tracer showing the marginal posterior probability density of the mean substitution rate, `μ`.

As we can see in Figure 4, the marginal posterior probability density of μ is roughly bell-shaped. If the curve does not look very smooth, it is because there is some sampling noise: we could smooth it out more if we ran the MCMC chain for longer or took more samples. Regardless, our mean and highest posterior density interval (HPD; also known as the credible interval) estimates should be good given the ESSs.

Note that you can overlay the density plots of multiple traces in order to compare them (it is up to you to determine whether they are comparable on the the same axis or not). For example: select the relative substitution rates for all three codon positions (partitions) in the left-hand panel (labelled “r_0”, “r_1” and “r_2”).

You will now see the posterior probability densities for the relative substitution rate at all three codon positions overlaid:

relative substitution rates
Figure 6: A screenshot of Tracer showing the marginal posterior probability densities for the relative substitution rates, `r_0`, `r_1` and `r_2`.

Summarising the trees

Because of how peculiar and discrete tree space is, it is a bit harder to summarize and visualize the posterior distribution over phylogenetic trees, as compared to the mean molecular rate μ, for example. We will use a special tool for that, TreeAnnotator.

To summarize the trees logged during MCMC and create a maximum clade credibility (MCC) tree, launch the program TreeAnnotator and follow these steps:

  1. Set 10% as the burn-in percentage;
  2. Select “Maximum clade credibility tree” as the “Target tree type”;
  3. For “Node heights”, choose “Mean heights”;
  4. Load the tree log file that BEAST 2 generated (it will end in “.trees” by default) as “Input Tree File”. For this tutorial, the tree log file is called RSV2long.trees. If you select it from the file chooser, the parent path will automatically fill in and “YOUR_PATH” in the screen shot will be that parent path.
  5. Finally, for “Output File”, copy and paste the input file path but replace the RSV2long.trees with RSV2long.tree. This will create the file containing the resulting maximum clade credibility tree.

The image below shows a screenshot of TreeAnnotator with the necessary settings to create the summary tree. “YOUR_PATH” will be replaced to the corresponding path.

TreeAnnotator
Figure 7: A screenshot of TreeAnnotator showing how to create a summary tree from a posterior tree set.

The setup described above will take the set of trees in the tree log file and summarize it with a single MCC tree. The MCC tree is the tree that has the largest clade probability product across all nodes. Divergence times will reflect the mean ages of each node, and these times will be annotated with their 95% highest posterior density (HPD) intervals. TreeAnnotator will also display the posterior clade credibility of each node in the MCC tree. More details on summarizing trees can be found in beast2.org/summarizing-posterior-trees/.

Note that TreeAnnotator only parses a tree log file into an output text file, but the visualization has to be done with other programs (see next section).

Visualizing the trees

Summary trees can be viewed using FigTree and DensiTree, the latter being distributed with BEAST 2. In FigTree, just load TreeAnnotator’s output file as the input (if you were to load the entire tree log file, FigTree would show you each tree in the posterior individually).

After loading the MCC tree in FigTree, click on the “Trees” tab on the left-hand side, then select “Order nodes” while keeping the default ordering (“increasing”). In addition, check “Node Bars”, choosing “height_95%_HPD” for “Display”. This last step shows the estimated 95% HPD intervals for all node heights. You should end up with a tree like the one in Figure 8.

MCC tree
Figure 8: The maximum clade credibility (MCC) tree for the G gene of 129 RSVA-2 viral samples.

In DensiTree, load the tree log file as the input. Here, the tree in blue is the MCC tree, where the green “fuzzy” trees are all the other trees in the posterior set.

MCC tree
Figure 9: The posterior tree set visualised in DensiTree.

Questions

In what year did the common ancestor of all RSVA viruses sampled live? What is the 95% HPD?

Programs used in this tutorial

The following software will be used in this tutorial. We also offer a Tech Help page to assist you in case you encounter any unexpected issues with certain third-party software tools.

  • Java 17 or a later version is required by LPhy Studio.

  • LPhy Studio - This software will specify, visualize, and simulate data from models defined in LPhy scripts. At the time of writing, the current version is v1.4.4. It is available for download from LPhy releases.

  • LPhy BEAST - this BEAST 2 package will convert LPhy scripts into BEAST 2 XMLs. The installation guide and usage can be found from User Manual.

  • BEAST 2 - the bundle includes the BEAST 2 program, BEAUti, DensiTree, TreeAnnotator, and other utility programs. This tutorial is written for BEAST v2.7.5 or higher version. It is available for download from http://www.beast2.org. Use Package Manager to install the required BEAST 2 packages.

  • BEAST labs package - this contains some generally useful stuff used by other BEAST 2 packages.

  • BEAST feast package - this is a small BEAST 2 package which contains additions to the core functionality.

  • Tracer - this program is used to explore the output of BEAST (and other Bayesian MCMC programs). It graphically and quantitatively summarises the distributions of continuous parameters and provides diagnostic information. At the time of writing, the current version is v1.7.2. It is available for download from Tracer releases.

  • FigTree - this is an application for displaying and printing molecular phylogenies, in particular those obtained using BEAST. At the time of writing, the current version is v1.4.3. It is available for download from FigTree releases.

XML and log files

References

  • Zlateva, K. T., Lemey, P., Vandamme, A.-M., & Van Ranst, M. (2004). Molecular evolution and circulation patterns of human respiratory syncytial virus subgroup a: positively selected sites in the attachment g glycoprotein. J. Virol., 78(9), 4675–4683.
  • Zlateva, K. T., Lemey, P., Moës, E., Vandamme, A.-M., & Van Ranst, M. (2005). Genetic variability and molecular evolution of the human respiratory syncytial virus subgroup B attachment G protein. J. Virol., 79(14), 9157–9167. https://doi.org/10.1128/JVI.79.14.9157-9167.2005
  • Drummond, A. J., & Bouckaert, R. R. (2014). Bayesian evolutionary analysis with BEAST 2. Cambridge University Press.
  • Höhna, S., Landis, M. J., Heath, T. A., Boussau, B., Lartillot, N., Moore, B. R., Huelsenbeck, J. P., Ronquist, F. (2016). RevBayes: Bayesian phylogenetic inference using graphical models and an interactive model-specification language. Syst. Biol., 65(4), 726-736.