4TY_cgMLST

Introduction

4TY_cgMLST performs core genome Multi-Locus Sequence Typing (cgMLST), a bacterial isolate characterization protocol that allows the identification of clones within a microbial population.

[Figure: UML diagram]

Run Analysis 4TY_cgMLST

Once the 4TY_cgMLST analysis has been selected from the run analyses interface, the user can choose which bioinformatic tool to use. The available tool is:

chewBBACA - BSR-Based Allele Calling Algorithm

cgMLST tools rely on schemas defined for specific bacteria. The schemas available for chewBBACA are listed in the table below.

Tool	Available schemas
chewBBACA	Listeria monocytogenes, Campylobacter jejuni, Campylobacter coli, Staphylococcus aureus, Brucella, Brucella melitensis, Klebsiella pneumoniae.

Note 1: Running 4TY_cgMLST on a microorganism for which there is no corresponding cgMLST schema will cause the run to fail.
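
Since a missing schema makes the run fail, it can help to check the organism before launching. The following is only a minimal sketch and not part of the platform: the `AVAILABLE_SCHEMAS` set is copied by hand from the table above, and `has_schema` is a hypothetical helper name.

```python
# Schemas listed in the table above for chewBBACA. This set is copied by hand
# for illustration; the platform's own list is authoritative and may change.
AVAILABLE_SCHEMAS = {
    "listeria monocytogenes",
    "campylobacter jejuni",
    "campylobacter coli",
    "staphylococcus aureus",
    "brucella",
    "brucella melitensis",
    "klebsiella pneumoniae",
}


def has_schema(organism: str) -> bool:
    """Return True if the organism name matches a listed schema (hypothetical helper)."""
    # Lowercase and collapse repeated whitespace before comparing.
    normalized = " ".join(organism.lower().split())
    return normalized in AVAILABLE_SCHEMAS


if __name__ == "__main__":
    for name in ("Listeria monocytogenes", "Escherichia coli"):
        status = "schema available" if has_schema(name) else "no schema, the run would fail"
        print(f"{name}: {status}")
```

The comparison is an exact match on the normalized name, so strain designations or abbreviations (for example "L. monocytogenes") would need to be resolved to the full species name first.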

The input selection UI offers an advanced input selection mode, which allows all supported input file types to be selected at once.

Accepted inputs: 4TY_cgMLST requires input sequences produced by de novo assembly or by mapping; if sequences from mapping are provided, the reference genome that was used for mapping is also required.

A link to Check analysis will be created after launching the requested analysis. The system will notify the user after a successful analysis launch and once execution has ended.

Output directory

Please refer to Cohesive's specific Wiki page for information on file download.

The output directory is available at the link in the download page or at the link in the analysis' summary card, and has the following structure: results > YEAR > ID > 4TY_cgMLST > DSXXXXXXXX-DTXXXXXX_chewbbaca. At that path there are 3 directories:

meta: ("metadata") contains log and configuration files.
result: contains the analysis' output files.
qc: ("quality check") contains 2 directories (meta and result). In this case the quality check is performed with Quast.
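
The layout described above can be sketched as a small path builder. This is an illustrative assumption rather than platform code: `output_dir` and `subdirs` are hypothetical helper names, and the arguments in the usage example are placeholders, not real identifiers.

```python
from pathlib import PurePosixPath


def output_dir(year: int, sample_id: str, run_prefix: str) -> PurePosixPath:
    """Build results/YEAR/ID/4TY_cgMLST/<run_prefix>_chewbbaca (hypothetical helper)."""
    return PurePosixPath(
        "results", str(year), sample_id, "4TY_cgMLST", f"{run_prefix}_chewbbaca"
    )


def subdirs(base: PurePosixPath) -> dict:
    """Return the directories expected under the output directory."""
    return {
        "meta": base / "meta",            # log and configuration files
        "result": base / "result",        # analysis output files
        "qc_meta": base / "qc" / "meta",  # quality check metadata
        "qc_result": base / "qc" / "result",  # quality check results
    }


if __name__ == "__main__":
    # Placeholder values only; real runs use the platform's own identifiers.
    base = output_dir(2021, "ID", "DSXXXXXXXX-DTXXXXXX")
    for name, path in subdirs(base).items():
        print(f"{name}: {path}")
```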

Output files from the chewBBACA allele call are available in 3 different encodings:

IZS encoding: each allele is identified with a progressive numeric ID. ID assignment spans all loci, so numbering DOES NOT restart from 1 at each new locus.

Pasteur encoding: each identified allele is coded as a numeric value, and numbering restarts from 1 at each locus. Each execution starts again from the unmodified, downloaded database. The schema used is labelled with its download date.

MD5 encoding: each allele is identified with an alphanumeric code of 16 characters (MD5 code), obtained through a "hash" applied to the allele's sequence.
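
The difference between the three encodings can be illustrated on a toy schema. This is a sketch under stated assumptions, not the platform's implementation: the loci and sequences are made up, the function names are hypothetical, and the order in which the platform actually assigns progressive IDs is not described here. For the MD5 case the sketch uses Python's standard `hashlib.md5(...).hexdigest()`, which returns the 16-byte digest as 32 hexadecimal characters; how the platform normalizes the sequence or formats the final code is not specified in this page.

```python
import hashlib

# Toy schema: locus -> allele sequences in database order (made-up data).
schema = {
    "locusA": ["ATGAAA", "ATGAAC"],
    "locusB": ["ATGTTT", "ATGTTC", "ATGTTG"],
}


def per_locus_ids(schema):
    """Pasteur-style numbering: IDs restart from 1 at each locus."""
    return {
        locus: {seq: i for i, seq in enumerate(seqs, start=1)}
        for locus, seqs in schema.items()
    }


def progressive_ids(schema):
    """IZS-style numbering: one counter shared by all loci (assumed order)."""
    ids = {}
    counter = 0
    for locus, seqs in schema.items():
        ids[locus] = {}
        for seq in seqs:
            counter += 1
            ids[locus][seq] = counter
    return ids


def md5_id(sequence):
    """Hash-based identifier derived only from the allele sequence."""
    return hashlib.md5(sequence.encode("ascii")).hexdigest()


if __name__ == "__main__":
    seq = "ATGTTT"  # first allele of locusB
    print("per-locus ID:", per_locus_ids(schema)["locusB"][seq])
    print("progressive ID:", progressive_ids(schema)["locusB"][seq])
    print("hash ID:", md5_id(seq))
```

In this toy case the first allele of `locusB` gets ID 1 with per-locus numbering and ID 3 with progressive numbering, while the hash-based identifier depends only on the sequence and not on any database order.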

File	Description	Location
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_new_alleles.txt	sequences of newly-identified alleles	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_alleles.tsv	allele call with Pasteur encoding in tsv format	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_contigsInfo.tsv	info about the contig mapped on each locus	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_izsam.csv	allele call with IZS encoding	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_md5.csv	allele call with MD5 encoding	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_pasteur_2021-05-28.csv	allele call with Pasteur encoding in csv format	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_statistics.tsv	metrics on loci encoded as EXC, INF, LNF, PLOT, NIPH, ALM, ASM	result directory
DSXXXXXXXX-ID_import_chewbbaca_check.csv	quality check with info on calledPerc, calledNum, annotated, new, notFound, discarded	qc > result directory
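
As a rough illustration of how an allele call table relates to quality metrics such as the number and percentage of called loci, the sketch below summarizes a tab-separated allele call matrix. Several points are assumptions: the layout (a first column with the sample name followed by one column per locus), the convention that numeric values and `INF-<number>` values count as called alleles while any other code (such as LNF or NIPH) does not, and the function name `summarize_calls`. The formulas the platform uses for its own check file are not documented here, so the numbers are not guaranteed to match.

```python
import csv
import io


def is_called(value: str) -> bool:
    """Treat plain integers and 'INF-<integer>' as called alleles (assumption)."""
    if value.startswith("INF-"):
        value = value[4:]
    return value.isdigit()


def summarize_calls(tsv_text: str) -> dict:
    """Count called and newly inferred alleles per sample in a TSV matrix."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    header = next(reader)
    n_loci = len(header) - 1  # first column holds the sample name
    summary = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        sample, calls = row[0], row[1:]
        called = sum(1 for c in calls if is_called(c))
        new = sum(1 for c in calls if c.startswith("INF-") and is_called(c))
        perc = 100.0 * called / n_loci if n_loci else 0.0
        summary[sample] = {"called": called, "new": new, "called_perc": perc}
    return summary


if __name__ == "__main__":
    # Made-up example with four loci and one sample.
    example = "FILE\tl1\tl2\tl3\tl4\nsample1\t1\tINF-7\tLNF\tNIPH\n"
    print(summarize_calls(example))
```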

For more information on locus encoding and on chewBBACA's output files, please refer to chewBBACA's official guide.
