Hi all,
We are happy to announce the release of GMSC-mapper v0.2.0. GMSC-mapper is a command line tool to query the Global Microbial smORFs Catalog (GMSC) and find candidate small proteins.
What's new in v0.2.0:
New features
- Database downloads are now significantly smaller: FASTA indexes are downloaded xz-compressed and habitat/taxonomy files are downloaded compressed, both decompressed on the fly.
- Generated TSV outputs and summary.txt now include a "# GMSC-mapper version ..." header comment for reproducibility.
- New "gmsc-mapper citation" subcommand prints the paper citation.
- New --version flag.
- Download progress feedback (file size) is now shown while downloaddb runs.
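If your downstream scripts parse the TSV outputs directly, they may need to skip the new header comment. A minimal sketch using only the Python standard library follows; the function name, the column names and the example content are hypothetical, and the version string is left elided as above:

```python
import csv
import io


def split_header_comments(lines):
    """Separate lines starting with '#' from the tab-separated rows."""
    comments, data = [], []
    for line in lines:
        if line.startswith("#"):
            comments.append(line.rstrip("\n"))
        else:
            data.append(line)
    rows = list(csv.reader(data, delimiter="\t"))
    return comments, rows


# Hypothetical content: real GMSC-mapper outputs have different columns.
example = io.StringIO(
    "# GMSC-mapper version ...\n"
    "col_a\tcol_b\n"
    "1\t2\n"
)
comments, rows = split_header_comments(example)
print(comments)  # the comment line(s), kept for provenance
print(rows)      # header row followed by data rows
```

If you read the outputs with pandas, passing comment="#" to read_csv should have a similar effect.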
Bug fixes
- Fixed inverted condition in contig length check.
- Fixed duplicate sequence filtering in filter_smorfs().
- Fixed overly broad exception handling in generate_fasta().
- Fixed resource leak in predicted_smorf_count().
- Fixed typo in output filename: predicted.filterd.smorf.faa is now predicted.filtered.smorf.faa.
- Fixed FutureWarning from newer pandas in taxonomy mapping.
Breaking changes
- DIAMOND and MMseqs2 must now be installed before running GMSC-mapper. The tool no longer attempts to auto-download them and will show a clear error if they are missing.
- The output file predicted.filterd.smorf.faa has been renamed to predicted.filtered.smorf.faa (typo fix). Pipelines that depend on the old filename will need updating.
- Packaging has moved from setup.py to pyproject.toml. Install with "pip install ." or via conda/pixi.
- The --mode argument in createdb is now validated; invalid values are rejected instead of silently ignored.
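For pipelines that wrap GMSC-mapper, the first two items above can be handled with a small pre-flight check. This is only a sketch: the helper names are hypothetical, and it assumes the aligners are on PATH under their usual executable names (diamond and mmseqs).

```python
import shutil
from pathlib import Path


def missing_tools(names=("diamond", "mmseqs")):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def find_filtered_faa(outdir):
    """Locate the filtered smORF FASTA, preferring the v0.2.0 filename.

    Falls back to the pre-0.2.0 (misspelled) name so that output
    directories produced by older versions can still be read.
    """
    for name in ("predicted.filtered.smorf.faa", "predicted.filterd.smorf.faa"):
        candidate = Path(outdir) / name
        if candidate.exists():
            return candidate
    return None


missing = missing_tools()
if missing:
    print("Please install before running GMSC-mapper:", ", ".join(missing))
```

Checking for both filenames is only useful during a transition period; once all outputs have been regenerated with v0.2.0, the fallback can be dropped.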
Bioconda packages are up-to-date and you can install with conda or pixi:
conda install gmsc-mapper
Full changelog:
https://github.com/BigDataBiology/GMSC-mapper/blob/main/ChangeLog
If you use GMSC-mapper in a publication, please cite:
Duan, Y., Santos-Junior, C.D., Schmidt, T.S. et al. A catalog of small proteins from the global microbiome. Nat Commun 15, 7563 (2024).
https://doi.org/10.1038/s41467-024-51894-6
As usual, we welcome feedback and bug reports.
Thank you,
Luis Pedro Coelho
Luis Pedro Coelho | Queensland University of Technology |
https://luispedro.org
https://orcid.org/0000-0002-9280-7885