When Whole-Genome Alignments Just Won't Work: kSNP v2 Software for Alignment-Free SNP Discovery and Phylogenetics of Hundreds of Microbial Genomes

Overview of attention for article published in PLOS ONE, December 2013

Mentioned by

2 policy sources
7 X users
2 patents
1 Wikipedia page

Citations

210 Dimensions

Readers on

265 Mendeley
1 CiteULike

Title
When Whole-Genome Alignments Just Won't Work: kSNP v2 Software for Alignment-Free SNP Discovery and Phylogenetics of Hundreds of Microbial Genomes
Published in
PLOS ONE, December 2013
DOI 10.1371/journal.pone.0081760
Pubmed ID
Authors

Shea N. Gardner, Barry G. Hall

Abstract

Effective use of rapid and inexpensive whole genome sequencing for microbes requires fast, memory efficient bioinformatics tools for sequence comparison. The kSNP v2 software finds single nucleotide polymorphisms (SNPs) in whole genome data. kSNP v2 has numerous improvements over kSNP v1 including SNP gene annotation; better scaling for draft genomes available as assembled contigs or raw, unassembled reads; a tool to identify the optimal value of k; distribution of packages of executables for Linux and Mac OS X for ease of installation and user-friendly use; and a detailed User Guide. SNP discovery is based on k-mer analysis, and requires no multiple sequence alignment or the selection of a single reference genome. Most target sets with hundreds of genomes complete in minutes to hours. SNP phylogenies are built by maximum likelihood, parsimony, and distance, based on all SNPs, only core SNPs, or SNPs present in some intermediate user-specified fraction of targets. The SNP-based trees that result are consistent with known taxonomy. kSNP v2 can handle many gigabases of sequence in a single run, and if one or more annotated genomes are included in the target set, SNPs are annotated with protein coding and other information (UTRs, etc.) from GenBank file(s). We demonstrate application of kSNP v2 on sets of viral and bacterial genomes, and discuss in detail analysis of a set of 68 finished E. coli and Shigella genomes and a set of the same genomes to which have been added 47 assemblies and four "raw read" genomes of O104:H4 strains from the recent European E. coli outbreak that resulted in both bloody diarrhea and hemolytic uremic syndrome (HUS), and caused at least 50 deaths.
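
The abstract states only that SNP discovery is k-mer based and needs neither an alignment nor a reference genome. As a rough illustration of how such a scheme can work, the Python sketch below assumes one common formulation: k is odd, and a SNP locus is a pair of flanking sequences that occurs in several genomes with different central bases. This is a toy under stated assumptions, not kSNP v2's implementation. The function names `kmer_snp_loci` and `call_snps` and the `min_fraction` parameter are hypothetical, and reverse-complement strands are ignored for brevity.

```python
from collections import defaultdict


def kmer_snp_loci(genomes, k=5):
    """Map each flanking context to the central bases seen in each genome.

    genomes: dict of genome name -> DNA string.
    Returns {(left_flank, right_flank): {genome_name: {central bases}}}.
    """
    if k % 2 == 0:
        raise ValueError("k must be odd so that every k-mer has a central base")
    half = k // 2
    loci = defaultdict(lambda: defaultdict(set))
    for name, seq in genomes.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) - set("ACGT"):
                continue  # skip k-mers containing ambiguous bases such as N
            context = (kmer[:half], kmer[half + 1:])
            loci[context][name].add(kmer[half])
    return loci


def call_snps(genomes, k=5, min_fraction=0.0):
    """Return contexts whose central base differs between genomes.

    min_fraction is the share of genomes that must contain the locus:
    1.0 keeps only "core" SNPs, 0.0 keeps every SNP (assumed semantics).
    """
    snps = {}
    for context, per_genome in kmer_snp_loci(genomes, k).items():
        # Drop loci that are ambiguous within a single genome.
        if any(len(bases) > 1 for bases in per_genome.values()):
            continue
        alleles = {name: next(iter(bases)) for name, bases in per_genome.items()}
        if len(set(alleles.values())) < 2:
            continue  # same central base everywhere: not a SNP
        if len(alleles) / len(genomes) >= min_fraction:
            snps[context] = alleles
    return snps


demo = {"g1": "ACGTACGGA", "g2": "ACGTTCGGA", "g3": "ACGTACGGA"}
print(call_snps(demo, k=5))
# {('GT', 'CG'): {'g1': 'A', 'g2': 'T', 'g3': 'A'}}
```

Because every genome is indexed by its own k-mers, no genome acts as the reference, and the `min_fraction` filter mirrors the abstract's choice between all SNPs, core SNPs, and SNPs present in a user-specified fraction of targets.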

X Demographics

The data shown below were collected from the profiles of 7 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 265 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States         5     2%
United Kingdom        4     2%
Sweden                3     1%
Australia             2    <1%
Belgium               2    <1%
Slovenia              1    <1%
France                1    <1%
Spain                 1    <1%
Germany               1    <1%
Other                 0     0%
Unknown             245    92%

Demographic breakdown

Readers by professional status                  Count   As %
Student > Ph. D. Student                           72    27%
Researcher                                         68    26%
Student > Master                                   33    12%
Student > Bachelor                                 15     6%
Other                                              13     5%
Other                                              41    15%
Unknown                                            23     9%

Readers by discipline                           Count   As %
Agricultural and Biological Sciences              103    39%
Biochemistry, Genetics and Molecular Biology       51    19%
Computer Science                                   27    10%
Immunology and Microbiology                        19     7%
Medicine and Dentistry                             13     5%
Other                                              14     5%
Unknown                                            38    14%