Unable to capture multiple HARs with the Python library?


Linda Nguyen

Apr 28, 2020, 10:37:58 AM
to BrowserMob Proxy
Hello!

I am using the Python library together with Selenium WebDriver. I can successfully capture one HAR, but the next capture comes back empty.

I am making two requests because I need to authenticate first and pass the resulting cookies to the second request, which is the page I actually want a HAR of.

    import random
    from time import sleep

    import requests
    from browsermobproxy import Server
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    server = Server("./Resources/browsermob-proxy-2.1.4/bin/browsermob-proxy")
    server.start()
    sleep(1)
    # WARNING: trustAllServers disables upstream certificate verification and is
    # a security risk, but it works around browsermob-proxy SSL cert errors.
    # Gotcha: the BMP CA certificate still needs to be installed in the browser.
    proxy = server.create_proxy(params={"trustAllServers": "true"})
    sleep(1)

    options = Options()
    options.headless = True

    # configure the browser proxy in Firefox
    profile = webdriver.FirefoxProfile()
    profile.set_proxy(proxy.selenium_proxy())
    browser = webdriver.Firefox(
        options=options,
        firefox_profile=profile,
        executable_path="./Resources/geckodriver",
        proxy=proxy.selenium_proxy(),
    )
    
    user = random.choice(list(creds.keys()))
    proxy.new_har(ref="Auth", options={"captureHeaders": True, "captureContent": True})
    browser.get("http://127.0.0.1:4000/login")
    response_har = proxy.har
    
    #storing the cookies generated by the browser
    request_cookies_browser = browser.get_cookies()

    #making a persistent connection using the requests library
    s = requests.Session()
    data = {"name":user, "password":creds[user]}
    r = s.post("http://127.0.0.1:4000/login", data=data)

    #passing the cookie of the response to the browser
    dict_resp_cookies = s.cookies.get_dict()
    response_cookies_browser = [{'name':name, 'value':value} for name, value in dict_resp_cookies.items()]
    
    for cookie in response_cookies_browser:
        browser.add_cookie(cookie)

    proxy.new_har(ref="Content", options={"captureHeaders": True, "captureContent": True})
    browser.get(url)
    print(proxy.har)

    print(url)
    # crude wait to let the page finish loading before reading the final HAR
    sleep(5)

    # returns network logs (HAR) as JSON
    result = extract_resource_urls(proxy.har, weight)
    server.stop()
    browser.quit()
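
As a side note on the cookie hand-off above: the dict-to-list conversion between `requests` and Selenium can be isolated in a small helper. This is just a sketch of the same conversion my snippet does inline (`to_selenium_cookies` is a name I made up):

```python
def to_selenium_cookies(session_cookie_dict):
    """Convert a requests.Session cookie dict ({name: value}) to the
    list-of-dicts shape that selenium's driver.add_cookie() expects."""
    return [{"name": name, "value": value}
            for name, value in session_cookie_dict.items()]
```

For example, `to_selenium_cookies({"session": "abc123"})` gives `[{"name": "session", "value": "abc123"}]`.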

So for /login I get a normal HAR file, but for `url` I get one with a page recorded and no entries:

{'log': {'version': '1.2', 'creator': {'name': 'BrowserMob Proxy', 'version': '2.1.4', 'comment': ''}, 'pages': [{'id': 'Content', 'startedDateTime': '2020-04-28T10:19:08.885-04:00', 'title': 'CTFd', 'pageTimings': {'comment': ''}, 'comment': ''}], 'entries': [], 'comment': ''}}

Any idea why? Is it because I'm using localhost? Do I need to reset the HAR file in between (I thought new_har does that)?
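
In case it helps anyone reproducing this, here is a minimal check (assuming only the HAR dict shape shown above) that distinguishes an empty capture from a populated one:

```python
def har_entry_count(har):
    """Number of request/response entries recorded in a HAR dict;
    0 means the proxy captured no traffic for that page."""
    return len(har.get("log", {}).get("entries", []))

# The second capture above has a "Content" page but no entries:
empty_har = {"log": {"version": "1.2",
                     "pages": [{"id": "Content"}],
                     "entries": []}}
```

Here `har_entry_count(empty_har)` returns 0, which is exactly the failure I'm seeing.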