§Fastq file statistics
This is a minimal Python package to quickly count reads and basepairs in .fastq.gz files.
To achieve its performance goals, the package is implemented in Rust and packaged for Python using the Maturin project.
The package contains two functions: analyze_file() and analyze_files(). Both return an instance of ReadsStats, which exposes the read_count and basepair_count of the input files.
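To make the two reported numbers concrete, here is a minimal pure-Python sketch of what counting reads and basepairs in a .fastq.gz file can look like. It is not the package's Rust implementation. It assumes the common four-line FASTQ record layout (header, sequence, separator, qualities), and the names PyReadsStats and count_fastq_gz are hypothetical, introduced only for this illustration.

```python
import gzip
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PyReadsStats:
    """Hypothetical stand-in for the package's ReadsStats (illustration only)."""

    read_count: int
    basepair_count: int


def count_fastq_gz(path: Path) -> PyReadsStats:
    """Count reads and basepairs, assuming strict four-line FASTQ records."""
    reads = 0
    basepairs = 0
    with gzip.open(path, "rt", encoding="ascii") as handle:
        for index, line in enumerate(handle):
            # In a four-line record, the second line holds the sequence.
            if index % 4 == 1:
                reads += 1
                basepairs += len(line.rstrip("\r\n"))
    return PyReadsStats(read_count=reads, basepair_count=basepairs)
```

A reference like this can be handy for cross-checking results on small test files, though it will not handle multi-line sequences.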
§Python Usage Example
from pathlib import Path
from fastq_stats import ReadsStats, analyze_file, analyze_files
# single thread
result = analyze_file(Path("file.fastq.gz"))
assert result == ReadsStats(read_count=10, basepair_count=120)
# multiple threads (default: up to os.cpu_count())
result = analyze_files([Path("file1.fastq.gz"), Path("file2.fastq.gz")])
assert result == ReadsStats(read_count=12, basepair_count=250)
# multiple threads (explicit)
result = analyze_files([Path("file1.fastq.gz"), Path("file2.fastq.gz")], num_threads=3)
assert result == ReadsStats(read_count=12, basepair_count=250)
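The multi-file calls above return a single total for all inputs. The sketch below shows one way such an aggregation could be structured in plain Python: each file is counted on a worker thread and the per-file results are summed. It is an assumption about the general shape, not a description of how the Rust code works. The names sum_stats and count_one are hypothetical, and count_one stands for any per-file counter that returns a (read_count, basepair_count) tuple.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, Optional, Tuple

Counts = Tuple[int, int]  # (read_count, basepair_count)


def sum_stats(
    paths: Iterable[Path],
    count_one: Callable[[Path], Counts],
    num_threads: Optional[int] = None,
) -> Counts:
    """Run count_one on every path in a thread pool and sum the results.

    num_threads=None lets ThreadPoolExecutor pick its own default size.
    """
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        per_file = list(pool.map(count_one, paths))
    total_reads = sum(reads for reads, _ in per_file)
    total_basepairs = sum(basepairs for _, basepairs in per_file)
    return total_reads, total_basepairs
```

Passing the per-file counter as an argument keeps the aggregation logic independent of how a single file is parsed.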
Structs§
- ReadsStats: Statistics about NGS reads
Enums§
- Collection of Path-like Python types
Functions§
- analyze_file: Count all reads and basepairs in a .fastq.gz file.
- analyze_files: Count all reads and basepairs in multiple .fastq.gz files.