Parsing special characters to standard a-z

Jonn Doe

May 15, 2024, 12:31:22 PM
to beautifulsoup

What I mean is text where it would say "this content taken from ...." and the letters all come from different char tables that look similar to standard a-z. Converting them is so you can search for such text easily and remove it.

So, does anyone know of a way of 1. locating all non-standard chars easily without too much overhead and 2. converting them to standard chars, all done via the wonderful BeautifulSoup? Has anyone had to do this already?

Thanks

Chris Papademetrious

May 15, 2024, 1:30:32 PM
to beautifulsoup
Hi jonn,

As a starting point, perhaps you could do a search for "converting Unicode to ASCII" and find a library that works the way that you want? Then you could iterate through all the NavigableString objects and apply that library's processing.
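
A minimal sketch of that idea, assuming Unicode NFKD normalization from Python's standard `unicodedata` module is an acceptable "Unicode to ASCII" step (a dedicated transliteration library may suit better). The helper names `find_non_ascii`, `to_ascii` and `asciify_soup` are made up for illustration and are not part of Beautiful Soup. NFKD folds compatibility look-alikes such as fullwidth or mathematical-bold letters back to a-z, but it does not map letters from other scripts (for example Cyrillic look-alikes), which are simply dropped here.

```python
import re
import unicodedata

# One or more consecutive characters outside the 7-bit ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7f]+")


def find_non_ascii(text):
    """Return every run of non-ASCII characters found in text."""
    return NON_ASCII.findall(text)


def to_ascii(text):
    """Fold look-alike characters to plain ASCII where NFKD allows it."""
    # NFKD splits accented letters into base letter + combining mark and
    # replaces compatibility forms (fullwidth, math bold, ...) with the
    # plain character they stand for.
    decomposed = unicodedata.normalize("NFKD", text)
    # Assumption: anything still non-ASCII afterwards may be discarded.
    return decomposed.encode("ascii", "ignore").decode("ascii")


def asciify_soup(soup):
    """Rewrite the text nodes of an already parsed document in place."""
    # find_all(string=True) returns a list of the NavigableString objects,
    # so replacing nodes while looping over it is safe.
    for node in soup.find_all(string=True):
        cleaned = to_ascii(node)
        if cleaned != node:
            # Rebuild with the same class so comments stay comments.
            node.replace_with(type(node)(cleaned))


# Example usage (needs the beautifulsoup4 package installed):
#   from bs4 import BeautifulSoup
#   soup = BeautifulSoup(html, "html.parser")
#   print(find_non_ascii(soup.get_text()))  # step 1: locate
#   asciify_soup(soup)                      # step 2: convert
```

Locating the odd characters is a single regular-expression pass over the text, and the conversion touches only the strings that actually change, so the overhead should stay small for ordinary pages.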

 - Chris

Jonn Doe

May 15, 2024, 1:38:21 PM
to beauti...@googlegroups.com
Thanks, I was just wondering if there was a pre-cooked routine lol.
