Slorado

Slorado is a simplified version of Dorado built on top of the S/BLOW5 format. It is an extremely lean basecaller with fewer external dependencies and is thus relatively easier to compile than Dorado. Slorado is written in C/C++ and depends on libtorch. Currently, slorado only supports the Linux operating system (or Windows through WSL). It can utilise NVIDIA or AMD GPU accelerators on x86_64 CPUs and also works on ARM64-based NVIDIA Jetson devices.

Slorado is mainly intended for our research and educational purposes. Thus, only a minimal set of basecalling features is supported, and these may not be up to date with Dorado. For a feature-rich and up-to-date S/BLOW5-based basecaller for routine use, please see buttery-eel.

Quick start

We provide compiled binaries for Linux on x86_64 CPUs with NVIDIA (CUDA) or AMD (ROCm) GPU accelerators. Download the latest relevant binary release, which bundles the most recent supported basecalling models, from the releases page as shown below:

VERSION=v0.2.0-beta
GPU=cuda   # GPU=rocm for AMD GPUs
wget "https://github.com/BonsonW/slorado/releases/download/$VERSION/slorado-$VERSION-x86_64-$GPU-linux-binaries.tar.gz"
tar xvf slorado-$VERSION-x86_64-$GPU-linux-binaries.tar.gz
cd slorado-$VERSION
./bin/slorado basecaller models/[email protected] reads.blow5 -o out.fastq -x cuda:all
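
If the run completes successfully, out.fastq should contain one FASTQ record (four lines) per basecalled read. A quick sanity check using plain shell (not part of slorado):

head -n 4 out.fastq                   # inspect the first FASTQ record
echo $(( $(wc -l < out.fastq) / 4 ))  # number of reads (4 lines per record)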

Detailed instructions are available at:

Binaries for the CPU-only version are not provided, as basecalling on the CPU is impractically slow. Nevertheless, the CPU-only version is easier to build than the GPU version (see below).

Refer to the troubleshoot documentation for help resolving common problems.

Compilation and running

Compilation

Compilation instructions differ depending on the system. Please pick the option below that matches yours:

Running

We have tested this version of slorado with the basecalling models [email protected], [email protected] and [email protected]. You can download them using the provided script (the binary releases already include these models):

scripts/download-models.sh
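
If the download completes, each model should be present as its own directory under models/, with the directory name matching the model name used on the command line. A quick way to confirm:

ls models/
ls models/[email protected]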

Now run on a test dataset:

# for CPU
./slorado basecaller -x cpu models/[email protected] test/5khz_r10/one_5khz.blow5 -o reads.fastq
# for GPU
./slorado basecaller -x cuda:all models/[email protected] test/5khz_r10/one_5khz.blow5 -o reads.fastq
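
The same command works on your own S/BLOW5 files. For example, the sketch below (the input path is a placeholder) basecalls on a single GPU and relies on the default behaviour of writing to stdout, which can then be redirected or piped:

# cuda:0 selects the first GPU; output goes to stdout by default (see Options below)
./slorado basecaller -x cuda:0 models/[email protected] /path/to/your_reads.blow5 > your_reads.fastq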

Refer to the troubleshoot documentation for help resolving common problems. We are currently working on supporting the newer v5 basecalling models.

Testing

After running on a test dataset, you can use minimap2 to align the reads to a reference genome and compute identity score statistics. If the identity scores are close to what we expect from these models, then everything is working as intended.

A script to calculate basecalling accuracy is provided:

# Set the MINIMAP2 environment variable if minimap2 is not in your PATH.
scripts/calculate_basecalling_accuarcy.sh hg38noAlt.fa reads.fastq
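
Alternatively, identity statistics can be estimated manually from minimap2's PAF output, where column 10 is the number of residue matches and column 11 is the alignment block length. A minimal sketch (not necessarily the exact method used by the script above):

# Align with the ONT preset, keep primary alignments only, and write PAF with CIGARs
minimap2 -cx map-ont --secondary=no hg38noAlt.fa reads.fastq > reads.paf
# Mean identity = residue matches (column 10) divided by alignment block length (column 11)
awk '{ sum += $10 / $11; n++ } END { if (n) printf "mean identity: %.4f over %d alignments\n", sum / n, n }' reads.paf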

Options

All options supported by slorado basecaller are detailed below:

Option           Description                                               Default value
-t INT           number of processing threads                              8
-K INT           batch size (max number of reads loaded at once)           2000
-C INT           GPU batch size (max number of chunks loaded at once)      500
-B FLOAT[K/M/G]  max number of bytes loaded at once                        500.0M
-o FILE          output to FILE                                            stdout
-c INT           chunk size                                                10000
-p INT           overlap                                                   150
-x DEVICE        specify device (e.g., cpu; cuda:0; cuda:1,2; cuda:all)    cuda:all (GPU version) or cpu (CPU version)
-h               show help message and exit                                -
--verbose INT    verbosity level                                           4
--version        print version and exit                                    -

Batch sizes

A large batch size (-K and -B) may take up significant RAM during run-time. Similarly, your GPU batch size (-C) will determine how much GPU memory is used. Slorado currently does not implement automatic batch size selection based on available memory. Thus, if you see an out-of-RAM error, reduce the batch size using -K or -B. If you see an out-of-GPU memory error, reduce the GPU batch size using the -C option.
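
For example, if a run fails with a GPU out-of-memory error under the default settings, a reasonable first step is to halve the GPU batch size and the bytes loaded at once (the values below are only illustrative, not tuned recommendations):

# Halve the GPU batch size (-C) and the bytes loaded at once (-B) relative to the defaults
./slorado basecaller -x cuda:all -C 250 -B 250.0M models/[email protected] reads.blow5 -o out.fastq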

Acknowledgement