Panaroo

Introduction

Panaroo builds the pan-genome and produces presence/absence matrices of the samples' annotated genes, starting from Prokka's genome annotation.

For in-depth information about Panaroo and how it works, please refer to Panaroo's official guide.

Panaroo's GitHub Page: https://github.com/gtonkinhill/panaroo

[Figure: UML diagram]

Run Analysis Panaroo

Once the Panaroo analysis has been selected from the run analyses interface, the wizard presents a confirmation screen. No tool selection is needed, since the only available tool is "Panaroo - An updated pipeline for pangenome investigation".

The input selection wizard lets the user confirm the input for Panaroo, which is designed to work on Prokka's output (so the annotation files from 4AN_genes are the input). Fields are pre-filled and no further selection by the user is required.

A link to Check analysis is created after the requested analysis is launched. The system notifies the user when the analysis has launched successfully and again once execution has ended.

Output directory

Please refer to Cohesive's specific Wiki page for information on file download.

The output directory can be reached from the link on the download page or from the link in the analysis summary. The results directory is located directly in the root directory and contains the following two directories:

  • meta: ("metadata") contains log and configuration files.
  • result: contains the analysis' output files.
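The layout described above can be checked after download with a short script. This is only a minimal sketch: the function name `check_output_layout` and the path passed to it are hypothetical, and the only thing assumed from this page is that the results directory contains the two subdirectories `meta` and `result`.

```python
from pathlib import Path

# Subdirectory names taken from the list above.
EXPECTED_SUBDIRS = ("meta", "result")


def check_output_layout(results_dir):
    """Map each expected subdirectory to the sorted names of the files in it.

    Raises FileNotFoundError if an expected subdirectory is missing.
    """
    results_dir = Path(results_dir)
    layout = {}
    for name in EXPECTED_SUBDIRS:
        subdir = results_dir / name
        if not subdir.is_dir():
            raise FileNotFoundError(f"missing expected directory: {subdir}")
        layout[name] = sorted(p.name for p in subdir.iterdir() if p.is_file())
    return layout
```

For example, `check_output_layout("path/to/results")` (a placeholder path) returns a dictionary with the keys `meta` and `result`, each holding the list of file names found there.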

The following table lists the files created by Panaroo, alongside some useful information. More information on Panaroo's output files is available in the official guide to Panaroo's outputs.

| File | Description | Location |
|---|---|---|
| combined_DNA_CDS.fasta | FASTA file of nucleotide sequences from annotated genes and genes identified by Panaroo | results directory |
| combined_protein_CDS.fasta | FASTA file of amino acid sequences from annotated genes and genes identified by Panaroo | results directory |
| combined_protein_cdhit_out.txt | log of Panaroo's CD-HIT phase | results directory |
| combined_protein_cdhit_out.txt.clstr | CD-HIT cluster info | results directory |
| core_alignment_header.embl | | results directory |
| core_gene_alignment.aln | alignment file | results directory |
| final_graph.gml | pan-genome graph | results directory |
| gene_data.csv | CSV table with sequences of annotated genes and corresponding Panaroo internal codes | results directory |
| gene_presence_absence.Rtab | gene presence/absence binary matrix | results directory |
| gene_presence_absence.csv | CSV file of gene presence in samples | results directory |
| gene_presence_absence_roary.csv | CSV file of gene presence in samples (Roary model) | results directory |
| pan_genome_reference.fa | pan-genome reference FASTA for genes in the dataset | results directory |
| pre_filt_graph.gml | raw pan-genome graph | results directory |
| struct_presence_absence.Rtab | presence/absence binary matrix for gene rearrangement events | results directory |
| summary_statistics.txt | metrics summary file | results directory |
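As an illustration of how the binary matrix in gene_presence_absence.Rtab can be used, the sketch below splits genes into core and accessory sets. It assumes the usual Roary-style layout: a tab-separated file whose first column is the gene name and whose remaining columns hold one 0/1 value per sample. The function name `classify_genes`, the 0.99 threshold, and the gene and sample names in the example are illustrative choices, not part of Panaroo.

```python
import csv
import io


def classify_genes(rtab_handle, core_threshold=0.99):
    """Split genes into core and accessory lists from a 0/1 presence matrix.

    A gene is called core when the fraction of samples carrying it is at
    least core_threshold (an illustrative default, not a Panaroo setting).
    """
    reader = csv.reader(rtab_handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]  # first column holds the gene names
    if not samples:
        raise ValueError("no sample columns found in the header")
    core, accessory = [], []
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene, flags = row[0], row[1:]
        present = sum(1 for flag in flags if flag == "1")
        if present / len(samples) >= core_threshold:
            core.append(gene)
        else:
            accessory.append(gene)
    return samples, core, accessory


# Tiny in-memory example with made-up gene and sample names.
example = "Gene\tS1\tS2\tS3\ngeneA\t1\t1\t1\ngeneB\t1\t0\t0\n"
samples, core, accessory = classify_genes(io.StringIO(example))
print(core)       # ['geneA']
print(accessory)  # ['geneB']
```

With a real file, the same function can be called on an open handle, for example `with open("gene_presence_absence.Rtab", newline="") as handle: classify_genes(handle)`.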

Panaroo's authors suggest Cytoscape for graph visualization. More information on pan-genome graph visualization is available on Panaroo's official documentation page.
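Before opening a graph in Cytoscape, it can be useful to get a rough idea of its size. The sketch below counts `node [` and `edge [` blocks in GML text using only the standard library. It is a heuristic line-based count, not a full GML parser, and the function name `count_gml_elements` is hypothetical.

```python
import re

# A GML node or edge block opens with the keyword followed by "[" at the
# start of a line (ignoring indentation).
NODE_RE = re.compile(r"^[ \t]*node[ \t]*\[", re.MULTILINE)
EDGE_RE = re.compile(r"^[ \t]*edge[ \t]*\[", re.MULTILINE)


def count_gml_elements(gml_text):
    """Return (number of node blocks, number of edge blocks) in GML text."""
    return len(NODE_RE.findall(gml_text)), len(EDGE_RE.findall(gml_text))
```

For example, reading final_graph.gml into a string and passing it to `count_gml_elements` gives a quick node and edge count to compare against what Cytoscape reports after import.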