4AN_genes

Introduction

4AN_genes performs functional genome annotation: it identifies open reading frames (ORFs) and the proteins they may encode.

uml diagram

Run Analysis 4AN_genes

Once the 4AN_genes analysis has been selected from the run analyses interface, the user can choose which bioinformatic tool to use. The only tool currently available for this analysis is Prokka, a tool for annotating bacterial, archaeal and viral genomes.

The input selection UI offers an advanced input selection mode that allows all supported input file types to be selected at once.

The first required parameter is the kingdom (i.e. virus or bacteria, plus "host", an artificial group which includes possible host organisms). The second parameter is a reference genome.

Accepted inputs can be from:

If output from mapping is provided, the reference genome that has been used for mapping will also be required.

A link to Check analysis will be created after the requested analysis is launched. The system will notify the user when the analysis has launched successfully and again once execution has ended.

Output directory

Please refer to Cohesive's specific Wiki page for information on file download.

The output directory is available at the link on the download page or at the link in the analysis summary card, and has the following structure: results > YEAR > ID > 4AN_genes > DSXXXXXXXX-DTXXXXXX_prokka. That path contains two directories:

meta: ("metadata") contains log and configuration files.
result: contains the analysis' output files.
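
As a rough illustration of the layout described above, the following Python sketch builds the expected paths of the meta and result directories. The function name, the root folder "downloads" and the year and ID values are hypothetical placeholders, not part of the platform; only the segment order comes from the structure given above.

```python
from pathlib import Path


def prokka_output_dirs(root, year, item_id, run_folder):
    """Return the expected run, meta and result directories for one Prokka run.

    Only the segment order (results > YEAR > ID > 4AN_genes > run folder)
    is taken from the documentation; all argument values are supplied by
    the caller and are assumptions in this example.
    """
    run_dir = Path(root) / "results" / str(year) / str(item_id) / "4AN_genes" / run_folder
    return {
        "run": run_dir,
        "meta": run_dir / "meta",      # log and configuration files
        "result": run_dir / "result",  # analysis output files
    }


# Hypothetical values, for illustration only.
dirs = prokka_output_dirs("downloads", 2024, "12345", "DSXXXXXXXX-DTXXXXXX_prokka")
print(dirs["meta"])
print(dirs["result"])
```

The sketch only composes paths and does not check that they exist; a call to `Path.is_dir()` on each entry would be one way to verify a downloaded folder.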

The following table lists Prokka's output files.

File	Description	Location
log_errore_controlli_esami.log	run's warning and error log	main directory
metadata_samples.tsv	samples' metadata summary table in tsv format	main directory
results.csv	summary table separated by semicolon (";") containing sample IDs and information	main directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.err	text report file with run's errors	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.faa	amino acid sequences translated from the identified coding genes (faa format - FASTA amino acid)	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.ffn	nucleotide sequences of the identified coding genes (ffn format - FASTA nucleotide)	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.fna	nucleotide sequences of the input contigs (fna format - FASTA nucleic acid)	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.fsa	nucleotide sequences of the input contigs in fsa format (FASTA with Sequin tags, used to create the .sqn file)	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.gbk	output file in GenBank format	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.gff	output file in gff format (General Feature Format)	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.log	Prokka's run log	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.sqn	file for submission to GenBank, in Sequin format	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.tbl	text file with information on sequence and loci	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.tsv	tsv list of loci and proteins from mapped coding genes	results directory
DSXXXXXXXX-DTXXXXXX_ID_prokka_REFID_result.txt	metrics on identified CDS	results directory
proteins.faa	protein sequences in faa format	results directory

For more details on Prokka's output files, please refer to Prokka's official manual.
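
As a minimal sketch of how two of the files listed in the table might be inspected, the Python example below parses a semicolon-separated summary table such as results.csv and counts the records in a FASTA file such as a .faa output. It assumes that results.csv starts with a header row, which this page does not state; the column names and sequences in the in-memory samples are invented for illustration.

```python
import csv
import io


def read_results_table(handle):
    """Parse a semicolon-separated table into a list of dicts.

    Assumes the first row is a header row (an assumption, not confirmed
    by the documentation).
    """
    return list(csv.DictReader(handle, delimiter=";"))


def count_fasta_records(handle):
    """Count FASTA records by counting header lines, which start with '>'."""
    return sum(1 for line in handle if line.startswith(">"))


# Hypothetical in-memory samples; with real files, pass handles such as
# open("results.csv", newline="") or open("proteins.faa") instead.
sample_csv = io.StringIO("sample;status\nS1;ok\nS2;ok\n")
rows = read_results_table(sample_csv)
print(len(rows), rows[0]["sample"])

sample_faa = io.StringIO(">gene_1 hypothetical protein\nMKT\n>gene_2 hypothetical protein\nMSA\n")
print(count_fasta_records(sample_faa))
```

Both helpers take an open file handle rather than a path, so the same code works on downloaded files and on in-memory test data.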
