Welcome to bacannot pipeline documentation

About

Bacannot is a pipeline designed to provide an easy-to-use framework for comprehensive annotation of prokaryotic genomes. It is developed with Nextflow and Docker. It can annotate resistance genes, virulence factors, genomic islands, prophages, methylation and more.

Workflow

The pipeline's main steps are:

| Analysis step | Software or databases used |
| --- | --- |
| Genome assembly (if raw reads are given) | Flye and Unicycler |
| Identification of the 10 closest NCBI RefSeq genomes | RefSeq Masher |
| Generic annotation and gene prediction | Prokka or Bakta |
| rRNA prediction | barrnap |
| Classification within multi-locus sequence types (STs) | mlst |
| KEGG KO annotation and visualization | KofamScan and KEGGDecoder |
| Annotation of secondary metabolites | antiSMASH |
| Methylation annotation | Nanopolish |
| Annotation of antimicrobial resistance (AMR) genes | AMRFinderPlus, ARGminer, ResFinder and RGI |
| Annotation of virulence genes | Victors and VFDB |
| Annotation of prophage sequences and genes | PHASTER, Phigaro and PhiSpy |
| Annotation of integrative and conjugative elements | ICEberg |
| Annotation of bacterial integrons | Integron Finder |
| Focused detection of insertion sequences | digIS |
| In silico detection and typing of plasmids | PlasmidFinder, Platon and MOB-typer |
| Prediction and visualization of genomic islands | IslandPath-DIMOB and gff-toolbox |
| Custom annotation from formatted FASTA or NCBI protein IDs | BLAST |
| Merging of annotation results | bedtools |
| Genome browser rendering | JBrowse |
| Circos plot generation | easy_circos |
| Rendering of automatic reports and a Shiny app for interrogating results | R Markdown, Shiny and SequenceServer |

Quickstart

A quickstart guide is available to give you the gist of the pipeline's capabilities.

About Prokka annotation

To increase the accuracy of Prokka annotation, this pipeline adds an extra HMM database to Prokka's defaults. It can be either TIGRFAM (smaller but curated) or PGAP (a larger, comprehensive NCBI database that contains TIGRFAM).

Usage

Typical usage of the pipeline is simple, as shown below:

# usual command-line
nextflow run fmalmeida/bacannot \
    --bacannot_db "./bacannot_databases" \
    --input "bacannot_samplesheet.yml"
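Because the command above depends on two paths, it can help to check them before launching a long run. The following is a minimal sketch, not part of the pipeline itself: the `run_bacannot` function and the `DRY_RUN` variable are hypothetical names introduced here for illustration, and only the two flags shown above (`--bacannot_db` and `--input`) are used.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with bacannot): validates the two
# paths used in the usual command-line before calling nextflow.

run_bacannot() {
    db_dir="$1"       # directory passed to --bacannot_db
    samplesheet="$2"  # file passed to --input

    if [ ! -d "$db_dir" ]; then
        echo "error: database directory not found: $db_dir" >&2
        return 1
    fi
    if [ ! -f "$samplesheet" ]; then
        echo "error: samplesheet not found: $samplesheet" >&2
        return 1
    fi

    # DRY_RUN is an assumption of this sketch: when set to 1, the
    # command is printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo nextflow run fmalmeida/bacannot \
            --bacannot_db "$db_dir" \
            --input "$samplesheet"
    else
        nextflow run fmalmeida/bacannot \
            --bacannot_db "$db_dir" \
            --input "$samplesheet"
    fi
}
```

For example, after sourcing the function, `DRY_RUN=1 run_bacannot ./bacannot_databases bacannot_samplesheet.yml` prints the command that would be run, or reports which path is missing.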

Some parameters are required and others are optional. Please read the pipeline's manual reference to understand each parameter.

Support contact

If you have any questions, feel free to contact me at almeidafmarques@gmail.com.