VCF2MST

Introduction

VCF2MST identifies SARS-CoV-2 PANGO lineages and builds a phylogenetic tree graph with the Minimum Spanning Tree (MST) algorithm, starting from VCF files of Single Nucleotide Polymorphisms (SNPs), without alignment-based phylogenomic inference.

The software builds a Minimum Spanning Tree based on Hamming distances. The Hamming distance between two sequences (strings) of equal length is the number of positions at which they differ, that is, the number of substitutions needed to transform one into the other.
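As an illustration of the idea, the sketch below computes pairwise Hamming distances between a few short SNP profiles and connects them with Prim's algorithm. It is not VCF2MST's implementation (the tool relies on GrapeTree for tree building). The sample names, the profile strings and the helper functions `hamming` and `minimum_spanning_tree` are hypothetical and exist only for this example.

```python
def hamming(a, b):
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(profiles):
    """Prim's algorithm on the complete graph weighted by Hamming distance.

    profiles: dict mapping a sample name to its SNP profile string.
    Returns a list of (node_in_tree, new_node, distance) edges.
    """
    names = list(profiles)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge that links the tree to a sample outside it.
        d, u, v = min(
            (hamming(profiles[u], profiles[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, d))
        in_tree.add(v)
    return edges


# Hypothetical SNP profiles, one character per variable site.
profiles = {
    "sample_A": "ACGTAC",
    "sample_B": "ACGTAT",
    "sample_C": "TCGTAT",
    "sample_D": "TCGAAT",
}

mst = minimum_spanning_tree(profiles)
total = sum(d for _, _, d in mst)
for u, v, d in mst:
    print(f"{u} -- {v} (distance {d})")
```

With these toy profiles each sample differs from the next by a single substitution, so the tree is a chain of three edges of weight 1.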

More information on the software is available at VCF2MST's official GitHub page ("VCF2MST - Hamming Distance based Minimum Spanning Tree from Samples vcf using graptree") or at the corresponding research article.

uml diagram

Run Analysis VCF2MST

The VCF2MST analysis can be selected from the run analyses page.

The input selection UI offers an advanced input selection mode, which allows all supported input file types to be selected at once.

VCF2MST accepts inputs from step_2AS_mapping. The input selection interface also includes a field for the reference genome, which is auto-filled.

A link to Check analysis is created after the requested analysis is launched. The system notifies the user after a successful launch and again once execution has ended.

The analysis summary lists the output directory and additional options, such as access to some of the metadata, log and output files and direct graph visualization thanks to GrapeTree's integration in Cohesive.

Output directory

Please refer to Cohesive's specific Wiki page for information on file download.

The output directory is available at the link in the download page or at the link in the analysis' summary card. The results directory is located directly in the root directory. Inside results there are 2 subdirectories:

  • meta ("metadata"): contains log and configuration files.
  • result: contains the analysis' output files.

The following table lists output files stored in results.

File        Description                            Location
tree.nwk    MST treefile in nwk (Newick) format    results directory
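For readers who want a quick look at a Newick treefile outside GrapeTree, the following minimal sketch extracts leaf labels from a Newick string using only the Python standard library. The example string and the helper `newick_leaf_names` are hypothetical. The function handles unquoted labels only and does not reflect the exact content of a tree.nwk produced by VCF2MST. A dedicated Newick parser is a better fit for real work.

```python
import re


def newick_leaf_names(newick):
    """Return unquoted leaf labels from a Newick string.

    A leaf label directly follows '(' or ','. Internal node labels
    follow ')' and are therefore skipped.
    """
    return re.findall(r"[(,]\s*([^(),:;\s]+)", newick)


# Hypothetical Newick string with branch lengths.
example = "((sample_A:1,sample_B:1):0,sample_C:2);"
leaves = newick_leaf_names(example)
print(leaves)
```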