krust

Counts k-mers, written in Rust

Usage: krust <k> <path> [reader]

Arguments:
  <k>       provides k length, e.g. 5
  <path>    path to a FASTA file, e.g. /home/lisa/bio/cerevisiae.pan.fa
  [reader]  select *rust-bio* or *needletail* as FASTA reader [default: rust-bio]

Options:
  -h, --help     Print help information
  -V, --version  Print version information

krust is a k-mer counter, a bioinformatics 101 tool for counting the frequency of substrings of length k within strings of DNA data. It is written in Rust and runs from the command line. It takes a FASTA file of DNA sequences and outputs every canonical k-mer and its frequency across all records in the file. Counts are canonical because DNA is double-stranded: each k-mer has a reverse complement on the opposite strand, and the two are counted as one. krust is tested for accuracy against jellyfish.
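
To make the idea of canonical counting concrete, here is a minimal sketch using only the standard library. It is not krust's actual implementation. It assumes that the canonical form is the lexicographically smaller of a k-mer and its reverse complement, and that windows containing a base other than A, C, G or T are skipped. The function names `reverse_complement` and `count_canonical_kmers` are hypothetical.

```rust
use std::collections::HashMap;

/// Reverse complement of an uppercase DNA k-mer.
/// Returns None if the k-mer contains a base other than A, C, G or T.
fn reverse_complement(kmer: &[u8]) -> Option<Vec<u8>> {
    kmer.iter()
        .rev()
        .map(|b| match *b {
            b'A' => Some(b'T'),
            b'C' => Some(b'G'),
            b'G' => Some(b'C'),
            b'T' => Some(b'A'),
            _ => None,
        })
        .collect()
}

/// Count canonical k-mers in a single sequence.
/// Assumption: canonical = lexicographically smaller of the k-mer and its
/// reverse complement; windows with ambiguous bases are skipped.
fn count_canonical_kmers(seq: &[u8], k: usize) -> HashMap<Vec<u8>, u64> {
    let mut counts = HashMap::new();
    // `windows` panics on a size of 0, so guard against it.
    if k == 0 || seq.len() < k {
        return counts;
    }
    for window in seq.windows(k) {
        let forward = window.to_ascii_uppercase();
        if let Some(rc) = reverse_complement(&forward) {
            let canonical = if rc < forward { rc } else { forward };
            *counts.entry(canonical).or_insert(0) += 1;
        }
    }
    counts
}
```

For example, `count_canonical_kmers(b"ATGCAT", 3)` sees the windows ATG, TGC, GCA and CAT. Under the assumption above, TGC folds into GCA and CAT folds into ATG, so each of the two canonical 3-mers is counted twice.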

krust supports either rust-bio or needletail to read FASTA records.

Run krust with rust-bio's FASTA reader to count 5-mers like this:

cargo run --release 5 your/local/path/to/fasta_data.fa

Or count 21-mers with needletail as the FASTA reader like this:

cargo run --release 21 your/local/path/to/fasta_data.fa needletail

krust prints to stdout, alternating between a line holding a k-mer's count (prefixed with >) and a line holding the k-mer itself:

>114928
ATGCC
>289495
AATCA
...
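
For downstream processing, the alternating output can be read back into a map. The following is a rough sketch, assuming the output consists only of `>count` lines each followed by one k-mer line, as in the sample above. `parse_krust_output` is a hypothetical helper and not part of krust.

```rust
use std::collections::HashMap;

/// Parse text in the alternating ">count" / "KMER" layout into a map
/// from k-mer to count. Blank lines are ignored.
fn parse_krust_output(text: &str) -> Result<HashMap<String, u64>, String> {
    let mut counts = HashMap::new();
    let mut lines = text.lines().filter(|l| !l.trim().is_empty());
    while let Some(header) = lines.next() {
        // The count line must start with '>' followed by an integer.
        let count = header
            .trim()
            .strip_prefix('>')
            .ok_or_else(|| format!("expected a '>' count line, got {:?}", header))?
            .parse::<u64>()
            .map_err(|e| format!("bad count in {:?}: {}", header, e))?;
        // The next non-blank line holds the k-mer for that count.
        let kmer = lines
            .next()
            .ok_or_else(|| format!("missing k-mer line after {:?}", header))?;
        counts.insert(kmer.trim().to_string(), count);
    }
    Ok(counts)
}
```

Applied to the first four lines of the sample, this yields a map with ATGCC at 114928 and AATCA at 289495.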