Crate fastq_stats

§Fastq file statistics

This is a minimal Python package to quickly count reads and basepairs in .fastq.gz files.

To meet its performance goals, the package is implemented in Rust and packaged for Python using Maturin.

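The package contains two functions: analyze_file(), which processes a single file, and analyze_files(), which processes several files and can spread the work across multiple threads. Both return an instance of ReadsStats, which exposes the read_count and basepair_count of the input; for analyze_files() these are the totals across all given files.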

§Python Usage Example

from pathlib import Path
from fastq_stats import ReadsStats, analyze_file, analyze_files

# single thread
result = analyze_file(Path("file.fastq.gz"))
assert result == ReadsStats(read_count=10, basepair_count=120)

# multiple threads (default: up to os.cpu_count())
result = analyze_files([Path("file1.fastq.gz"), Path("file2.fastq.gz")])
assert result == ReadsStats(read_count=12, basepair_count=250)

# multiple threads (explicit)
result = analyze_files([Path("file1.fastq.gz"), Path("file2.fastq.gz")], num_threads=3)
assert result == ReadsStats(read_count=12, basepair_count=250)
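
To make the two counts concrete, here is a minimal pure-Python sketch of what they mean. It is not the crate's Rust implementation and will be far slower. The names Counts, count_fastq_gz and count_many are hypothetical, and the sketch assumes strict four-line FASTQ records (header, sequence, "+", qualities) with no wrapped sequence lines.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, NamedTuple, Optional


class Counts(NamedTuple):
    # Hypothetical stand-in for ReadsStats, used only in this sketch.
    read_count: int
    basepair_count: int


def count_fastq_gz(path: Path) -> Counts:
    """Count reads and basepairs, assuming strict four-line FASTQ records."""
    reads = 0
    basepairs = 0
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for line_number, line in enumerate(handle):
            # In each four-line record, the second line holds the sequence.
            if line_number % 4 == 1:
                reads += 1
                basepairs += len(line.rstrip("\r\n"))
    return Counts(reads, basepairs)


def count_many(paths: Iterable[Path], num_threads: Optional[int] = None) -> Counts:
    """Sum the per-file counts, processing files in a thread pool."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(count_fastq_gz, paths))
    return Counts(
        sum(r.read_count for r in results),
        sum(r.basepair_count for r in results),
    )
```

A sketch like this can serve as a slow cross-check on small files; for real workloads the compiled functions above are the intended entry points.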

Structs§

Enums§

  • Collection of Path-like Python types

Functions§

  • analyze_file: Count all reads and basepairs in a .fastq.gz file.
  • analyze_files: Count all reads and basepairs in multiple .fastq.gz files.