[newbie] Recommended way to clean URL of parameters?


Heck Lennon

Jan 20, 2023, 10:53:32 AM
to beautifulsoup
Hello,

Before I go regex on them, is there a better way to remove parameters from URLs?

================
from bs4 import BeautifulSoup

# Open the file in a with-block so it is closed once parsing is done
with open("a.html", "r", encoding="utf8") as f:
    soup = BeautifulSoup(f, features="lxml")

for link in soup.find_all("a", {"class": "myclass"}):
    # Get rid of params after URL?
    # href="/myurl?a=b&c=d"
    # I just need to keep "/myurl"
    print(link.get("href"))
================

Thank you.
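
For reference, one option that avoids regex is the standard library's urllib.parse module. The following is only a minimal sketch: strip_params is a hypothetical helper name, not part of Beautiful Soup, and it assumes the fragment (anything after "#") should be dropped along with the query string.

```python
from urllib.parse import urlsplit, urlunsplit

def strip_params(url):
    """Return url without its query string and fragment (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep scheme, host and path; blank out the query and fragment
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

print(strip_params("/myurl?a=b&c=d"))  # /myurl
```

Inside the loop this could be called as strip_params(link.get("href")), assuming every matched tag actually has an href attribute (link.get returns None otherwise).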