
Bug#1022243: ITP: thefuzz -- Fuzzy string matching in Python (was fuzzywuzzy)


Edward Betts

Oct 22, 2022, 12:40:03 PM
Package: wnpp
Severity: wishlist
Owner: Edward Betts <edw...@4angle.com>
X-Debbugs-Cc: debian...@lists.debian.org, debian...@lists.debian.org

* Package name    : thefuzz
  Version         : 0.19.0
  Upstream Author : Adam Cohen <ad...@seatgeek.com>
* URL             : https://github.com/seatgeek/thefuzz
* License         : GPL-2
  Programming Lang: Python
  Description     : Fuzzy string matching in Python

Various methods for fuzzy matching of strings in Python, including:
.
- String similarity: Gives a measure of string similarity between 0 and 100.
- Partial string similarity: Inconsistent substrings are a common problem
  when string matching. To get around it, use a "best partial" heuristic
  when two strings are of noticeably different lengths.
- Token sort: This approach involves tokenizing the string in question,
  sorting the tokens alphabetically, and then joining them back into a
  string.
- Token set: A slightly more flexible approach. Tokenize both strings, but
  instead of immediately sorting and comparing, split the tokens into two
  groups: intersection and remainder.
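
For readers unfamiliar with these four methods, here is a rough sketch of the ideas using only difflib from the Python standard library. It is an illustration under my own assumptions, not thefuzz's implementation: the function names mirror the method names above for readability, the scoring comes from difflib.SequenceMatcher, and the results will not necessarily match what thefuzz returns.

```python
from difflib import SequenceMatcher


def ratio(a: str, b: str) -> int:
    """String similarity as an integer between 0 and 100."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def partial_ratio(a: str, b: str) -> int:
    """'Best partial': score the shorter string against every
    same-length window of the longer one and keep the best score."""
    shorter, longer = sorted((a, b), key=len)
    if not shorter:
        return 0
    size = len(shorter)
    windows = (longer[i:i + size] for i in range(len(longer) - size + 1))
    return max(ratio(shorter, window) for window in windows)


def _sorted_tokens(s: str) -> str:
    """Lowercase, split on whitespace, sort the tokens, join them back."""
    return " ".join(sorted(s.lower().split()))


def token_sort_ratio(a: str, b: str) -> int:
    """Token sort: compare the strings after sorting their tokens."""
    return ratio(_sorted_tokens(a), _sorted_tokens(b))


def token_set_ratio(a: str, b: str) -> int:
    """Token set: split tokens into the intersection and the two
    remainders, then compare the recombined strings pairwise."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    common = " ".join(sorted(tokens_a & tokens_b))
    rest_a = " ".join(sorted(tokens_a - tokens_b))
    rest_b = " ".join(sorted(tokens_b - tokens_a))
    combined_a = f"{common} {rest_a}".strip()
    combined_b = f"{common} {rest_b}".strip()
    return max(
        ratio(common, combined_a),
        ratio(common, combined_b),
        ratio(combined_a, combined_b),
    )


if __name__ == "__main__":
    print(ratio("this is a test", "this is a test!"))
    print(partial_ratio("yankees", "new york yankees"))
    print(token_sort_ratio("fuzzy wuzzy was a bear", "wuzzy fuzzy was a bear"))
    print(token_set_ratio("fuzzy was a bear", "fuzzy fuzzy was a bear"))
```

The token-based variants score reordered or repeated words as a perfect match, which is the behaviour the list above describes.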

I plan to maintain this package as part of the Python team.

This Python library was previously known as fuzzywuzzy before being renamed to
thefuzz.

There are five packages in Debian that depend on fuzzywuzzy:

gnome-pass-search-provider
python3-fluids
wajig
sublime-music
python3-fluids

Once these packages have switched to using thefuzz, I will write to FTP master
and ask for fuzzywuzzy to be removed from the archive.