4TY_wgMLST

Introduction

4TY_wgMLST performs the "whole genome Multi-Locus Sequence Typing" (wgMLST) in silico for the Campylobacter jejuni and Campylobacter coli bacteria. Differently than cgMLST (core genome MLST), characterization takes into account the whole genome.

uml diagram

Run Analysis 4TY_wgMLST

Once the analysis 4TY_wgMLST has been selected from the run analyses interface, the user will be able to select which bioinformatic tool to use. The available tool is chewBBACA - BSR-Based Allele Calling Algorithm.

Note: Only Campylobacter jejuni and Campylobacter coli wgMLST schemas are available. Running the analysis on a different microorganism wiil cause the run to fail.

The input selection UI delivers an advanced input selection mode, to allow selection of all types of supported input files at once.

Accepted inputs can be from:

The software requires the input sequences from de novo assembly or from mapping and, in case the latter is selected, the reference genome used for mapping.

A link to Check analysis will be created after launching the requested analysis. The system will notify the user after a succesful analysis launch and once execution has ended.

Output directory

Please refer to Cohesive's specific Wiki page for information on file download.

The output directory is guida ufficiale di available at the link in the download page or at the link presente in the analysis' summary card, and will have the following structure: results > YEAR > ID > 4TY_wgMLST > DSXXXXXXXX-DTXXXXXX_chewbbaca. At that path there will be 3 directories:

meta: ("metadata") contains log and configuration files.
result: contains the analysis' output files.
qc: ("quality check") it contains 2 directories (meta and result). In this case quality check is performed with Quast.

Output files from allele call with chewBBACA are available with 3 different encoding:

IZS encoding: each allele is identified with a progressive numeric ID. ID assignment considers all loci, thus it DOES NOT restart from 1 at each new locus.

Pasteur encoding: the code for identified alleles consists of a numeric value. Progression restarts from 1 at each locus. For each execution the analysis restarts from the unmodified, downloaded database. Used schema displays download date.

MD5 encoding: each allele is identified with an alphanumeric code of 16 characters (MD5 code), obtained through a "hash" applied to the allele's sequence.

Please note that the example file "DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_ccoli_curated_210722.csv" mentioned in the table below can have either the name C. coli or C. jejuni as suffix, depending on the analyzed species.

File	Description	Location
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_new_alleles.txt	sequence of newly-identified alleles	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_alleles.tsv	allele call with Pasteur encoding in csv format	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_ccoli_curated_210722.csv	allele csv table. Rows: samples, columns: loci	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_contigsInfo.tsv	info about the contig mapped on each locus	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_crc32.csv	allele csv table encoded in CRC32. Rows: samples, columns: loci	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_izsam.csv	allele call with IZS encoding	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_md5.csv	allele call with md5 encoding	result directory
DSXXXXXXXX-DTXXXXXX_ID_chewbbaca_results_statistics.tsv	metrics on loci encoded as EXC, INF, LNF, PLOT, NIPH, ALM, ASM	result directory
DSXXXXXXXX-DTXXXXXX_ID_import_chewbbaca_check.csv	quality check with info on calledPerc, calledNum, annotated, new, notFound, discarded	qc > result directory

For more information on locus encoding and on chewBBACA's output files, please refer to chewBBACA's official guide.

Introduction
Run Analysis 4TY_wgMLST
Output directory

Topics

4TY_wgMLST

Introduction

Run Analysis 4TY_wgMLST

Output directory

Contents

Previous

4 in silico typing, 4TY_MLST

Next

SNP-based clustering, CFSAN

Topics

4TY_wgMLST

Introduction

Run Analysis 4TY_wgMLST

Output directory

Contents

Search within the documentation

Previous

4 in silico typing, 4TY_MLST

Next

SNP-based clustering, CFSAN