========
Glossary
========

.. glossary::

   Augur

      A command-line application used for phylogenetic analysis. :doc:`Documentation`

   Auspice

      A web application used for phylogenetic visualization and analysis. :doc:`Documentation`

   pathogen repository

      A version-controlled folder containing all files necessary to run a pathogen's :term:`workflows`.

   core repository

      A :term:`pathogen repository` maintained by the Nextstrain team.

   workflow

      A reproducible process comprised of one or more :term:`builds` producing :term:`datasets`.
      Implementation varies per workflow, but generally they are run by workflow managers such as Snakemake.

      A Nextstrain :term:`pathogen repository` typically consists of these different workflows:

      1. :term:`phylogenetic workflow`
      2. :term:`ingest workflow`
      3. :term:`Nextclade workflow`

      Our :term:`core workflows` can be divided into two types:

      1. Single-build workflow (e.g. Zika workflow): one build producing one dataset.
      2. Multi-build workflow (e.g. SARS-CoV-2 workflow): multiple builds producing multiple datasets.

      .. note::

         The individual builds in a multi-build workflow are also "workflows" in the definition of workflow managers like Snakemake.

   phylogenetic workflow

      also *Nextstrain workflow*

      A :term:`workflow` consisting of :term:`build(s)` that execute bioinformatic analyses with :term:`Augur` to generate :term:`phylogenetic dataset(s)` for visualization with :term:`Auspice`.

      The phylogenetic workflow is often considered the primary workflow in a pathogen repository (e.g. "the Zika workflow" typically means "the phylogenetic workflow in the Zika pathogen repository").

   ingest workflow

      A :term:`workflow` consisting of :term:`build(s)` that curate public metadata and sequences to generate :term:`ingest dataset(s)` that are typically used as input files for :term:`phylogenetic workflows` and :term:`Nextclade workflows`.

   Nextclade workflow

      A :term:`workflow` consisting of :term:`build(s)` that generate :doc:`reference tree(s)` to be packaged with other dataset files to create :term:`Nextclade dataset(s)`.

   core workflow

      A default :term:`workflow` maintained by the Nextstrain team that can usually be run without additional configurations or customizations.

   build

      also *Nextstrain build*, *phylogenetic build*, *ingest build*, *Nextclade build*

      *(noun)* A sequence of commands, parameters and input files which work together to reproducibly generate a :term:`dataset`.

   build (verb)

      A general term for running a :term:`workflow` (e.g. ``nextstrain build``).

   build step

      A modular instruction of a :term:`build` which can be run standalone (e.g. ``augur filter``), often with clear input and output files.

   dataset

      A collection of output files produced by a :term:`build`.

      A Nextstrain :term:`pathogen repository` typically produces multiple types of datasets:

      1. :term:`phylogenetic dataset`
      2. :term:`ingest dataset`
      3. :term:`Nextclade dataset`

   phylogenetic dataset

      also *Auspice JSONs*

      A :term:`dataset` consisting of :term:`JSONs` produced by a :term:`build` of a :term:`phylogenetic workflow`.
      It is also the shared file prefix of the JSONs.

      For example, ``flu/seasonal/h3n2/ha/2y`` identifies a dataset which corresponds to the :ref:`files `:

      - ``flu_seasonal_h3n2_ha_2y.json``: primary JSON file
      - ``flu_seasonal_h3n2_ha_2y_root-sequence.json``: sidecar file
      - ``flu_seasonal_h3n2_ha_2y_tip-frequencies.json``: sidecar file

      Some phylogenetic workflows produce a single, synonymous dataset, like Zika. Others, like seasonal flu, produce many datasets.

      The phylogenetic dataset is often considered the primary dataset in a pathogen repository (e.g. "the Zika dataset" typically means "the phylogenetic dataset from the Zika pathogen repository").

   ingest dataset

      A :term:`dataset` consisting of curated files produced by a :term:`build` of an :term:`ingest workflow`.
      Typically consists of the files:

      * metadata.tsv
      * sequences.fasta

      If the ingest workflow includes Nextclade :term:`build steps`, then the dataset will typically include :doc:`Nextclade output files` as well.

   Nextclade dataset

      A :term:`dataset` consisting of files required for a :doc:`Nextclade` analysis, usually produced by a :term:`build` of a :term:`Nextclade workflow`.
      See :doc:`documentation` for more details.

   narrative

      A method of data-driven storytelling with interactive views of :term:`phylogenetic datasets` displayed alongside multiple pages (or slides) of text and images.
      Saved as a Markdown file with extended syntax to support additional displays.
      Viewable on nextstrain.org or with :term:`Auspice` via the :doc:`cli:commands/view` or :doc:`auspice view ` commands.
      See also :doc:`/guides/communicate/narratives-intro` and :doc:`/tutorials/narratives-how-to-write`.

   JSONs

      Special ``.json`` files produced by :term:`Augur` and visualized by :term:`Auspice`.
      These files make up a :term:`phylogenetic dataset`.
      See :doc:`data formats`.

   Nextstrain CLI

      The Nextstrain command-line interface (**Nextstrain CLI**) provides a consistent way to run and visualize :term:`pathogen builds` and access Nextstrain components like :term:`Augur` and :term:`Auspice` across :term:`runtimes` such as Docker, Conda, and AWS Batch.

      :doc:`Documentation `

   runtime

      also *Nextstrain runtime*

      When installing and using the :term:`Nextstrain CLI`, there are different configuration options, or **runtimes**, depending on the operating system.

      1. Docker runtime
      2. Conda runtime
      3. Ambient runtime (:ref:`formerly "native" `)
      4. AWS Batch runtime (only for ``nextstrain build``)
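
The naming pattern shown in the *phylogenetic dataset* entry, where ``flu/seasonal/h3n2/ha/2y`` corresponds to ``flu_seasonal_h3n2_ha_2y.json`` plus sidecar files, can be sketched in a few lines of Python. This is only an illustration inferred from that one example: ``dataset_files`` is a hypothetical helper, not part of Augur, Auspice, or the Nextstrain CLI, and the two sidecar suffixes are simply the ones listed in the entry. Which sidecar files actually exist depends on the workflow.

```python
def dataset_files(dataset_name, sidecars=("root-sequence", "tip-frequencies")):
    """Map a dataset name to the JSON file names suggested by the glossary example.

    Hypothetical helper for illustration only. It assumes that slashes in the
    dataset name become underscores in the shared file prefix, as in the
    flu/seasonal/h3n2/ha/2y example.
    """
    # Shared file prefix: drop stray leading/trailing slashes, then "/" -> "_".
    prefix = dataset_name.strip("/").replace("/", "_")
    # Primary JSON file.
    primary = f"{prefix}.json"
    # Sidecar files share the prefix and add a suffix before ".json".
    sidecar_files = [f"{prefix}_{suffix}.json" for suffix in sidecars]
    return primary, sidecar_files


primary, sidecar_files = dataset_files("flu/seasonal/h3n2/ha/2y")
print(primary)
for name in sidecar_files:
    print(name)
```

For a single-dataset name with no slashes, such as ``zika``, the same sketch yields ``zika.json`` as the primary file.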