Can Robot Framework automate Web Scraping?


Ru eL

Feb 4, 2020, 7:43:24 AM
to robotframework-devel
Can Robot Framework automate Web Scraping?

Ru eL

Feb 4, 2020, 8:10:47 AM
to Raju Penumatsa, robotframework-devel
Hi Raju, 

Thank you for your response.


I prefer Python libraries, but I don't know how to create them. I also need help with that so that I can automate web scraping.

By the way, I use Python 3.


Thanks,
Ruel

On Tue, 4 Feb 2020, 8:07 PM Raju Penumatsa, <ptb...@gmail.com> wrote:
Hi Ruel, 


Yes, we can automate web scraping using Robot Framework. However, Robot Framework itself may not have a built-in capability for it. You should use an external library or Python libraries to do the web scraping.

If you can give more details about what you want to achieve and which programming language you prefer, we can help you with that.


Raju 



On Tue, Feb 4, 2020 at 6:43 AM Ru eL <ruelar...@gmail.com> wrote:
Can Robot Framework automate Web Scraping?


Ru eL

Feb 4, 2020, 8:13:51 AM
to Raju Penumatsa, robotframework-devel
By the way, the scenario is simply to do web scraping using only Robot Framework.

Thanks,
Ruel

Raju Penumatsa

Feb 4, 2020, 10:34:27 AM
to Ru eL, robotframework-devel
OK, I will send some examples by tomorrow.

Raju 

Ru eL

Feb 4, 2020, 6:40:39 PM
to Raju Penumatsa, robotframework-devel
Great, thanks Raju. 

Thanks,
Ruel

Raju Penumatsa

Feb 5, 2020, 12:03:46 AM
to Ru eL, robotframework-devel
Hi Ruel, 

Please find the example below. 

There are different libraries for web scraping. I chose Bs4 (BeautifulSoup 4), but you may choose a different one.

My code (this is very basic code to read HTML elements):

RF code:
*** Settings ***
Documentation     Suite description
Library           ../../Bs4.py
Library           RequestsLibrary

*** Test Cases ***
Test title
    [Tags]    DEBUG
    Create Session    test    https://python.org/
    ${Resp}    Get Request    test    /
    #Log To Console    ${Resp}
    #Log To Console    ${Resp.text}
    ${Text}    Get All Text From Html    ${Resp.text}    main-header
    Log To Console    ${Text}

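Bs4.py code:
from bs4 import BeautifulSoup
from robot.api.deco import keyword


@keyword("Parse Html")
def parse_html(text):
    """
    Parse Html
    :param text: html text
    :return: BeautifulSoup object for the parsed document
    """
    return BeautifulSoup(text, 'html.parser')


@keyword("Get All Text From Html")
def get_all_text_html(text, html_class):
    """
    Get all Text from Html
    :param text: html text
    :param html_class: CSS class of the element to look up
    :return: first element with that class (None if there is no match)
    """
    soup = parse_html(text)
    return soup.find(class_=html_class)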
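
For anyone who cannot install bs4, here is a rough sketch of the same idea using only Python's standard `html.parser` module instead of BeautifulSoup. This is not the code from the post above: the module contents, the `_ClassTextCollector` helper, and the function name `get_text_by_class` are all hypothetical. Robot Framework derives keyword names from function names, so this function should be usable as `Get Text By Class` once the file is imported with `Library`. It assumes reasonably well-formed HTML where non-void tags are closed.

```python
from html.parser import HTMLParser

# Only expose the keyword function when the module is used as a library.
__all__ = ["get_text_by_class"]

# Void elements never get a closing tag, so they are skipped in depth tracking.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class _ClassTextCollector(HTMLParser):
    """Hypothetical helper: collect text inside the first element with a class."""

    def __init__(self, html_class):
        super().__init__()
        self.html_class = html_class
        self.depth = 0      # greater than 0 while inside the matching element
        self.done = False   # True once the matching element has been closed
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
        elif self.html_class in (dict(attrs).get("class") or "").split():
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID_TAGS:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth and data.strip():
            self.parts.append(data.strip())


def get_text_by_class(text, html_class):
    """Return the text of the first element whose class list has html_class."""
    collector = _ClassTextCollector(html_class)
    collector.feed(text)
    collector.close()
    return " ".join(collector.parts)
```

Unlike `soup.find(...)`, which returns the element itself, this sketch returns a plain string, and an empty string when nothing matches.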


Raju 

Ru eL

Feb 5, 2020, 12:10:22 AM
to Raju Penumatsa, robotframework-devel
Hi Raju,

Thank you very much for helping. I appreciate it. I will try to implement this using your code later.
This is interesting! Keep up the good work!

Thank you.

Ru eL

Feb 5, 2020, 3:50:29 AM
to Raju Penumatsa, robotframework-devel
Hi Raju,

I got a passing result with your script, but I also found what looks like an error message in the console output. See the highlighted text below (the InsecureRequestWarning lines):

| PASS |
------------------------------------------------------------------------------
Web Scraping :: Suite description                                     | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed
==============================================================================
Output:  C:\Users\ruela\Documents\robot_scripts\web_scraping\web_scraping_results\output.xml
Log:     C:\Users\ruela\Documents\robot_scripts\web_scraping\web_scraping_results\log.html
c:\program files (x86)\python37-32\lib\site-packages\urllib3\connectionpool.py:1004: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning,
Report:  C:\Users\ruela\Documents\robot_scripts\web_scraping\web_scraping_results\report.html

Process finished with exit code 0

Raju Penumatsa

Feb 5, 2020, 7:50:07 AM
to Ru eL, robotframework-devel
Hi Ruel, 

That is just a warning. You can ignore it or add disable_warnings=1. Please find the code below; with it, the warning should no longer be shown.

Create Session   test   https://python.org/   disable_warnings=1
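
For context, here is a small sketch of what silencing a warning category looks like in plain Python, using only the standard `warnings` module. It does not show how RequestsLibrary implements `disable_warnings`; the `InsecureRequestWarning` class here is a hypothetical stand-in for the urllib3 one, and `fetch_unverified` and `count_warnings` are made-up helpers for illustration.

```python
import warnings


class InsecureRequestWarning(Warning):
    """Hypothetical stand-in for urllib3's InsecureRequestWarning."""


def fetch_unverified():
    # Pretend to make an HTTPS request without certificate verification.
    warnings.warn("Unverified HTTPS request is being made.",
                  InsecureRequestWarning)
    return "response body"


def count_warnings(suppress):
    """Return how many warnings are emitted, with or without the filter."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if suppress:
            # Added last, so this filter takes precedence over "always".
            warnings.simplefilter("ignore", InsecureRequestWarning)
        fetch_unverified()
    return len(caught)
```

Note that hiding the warning does not make the request any safer. The warning appears because certificate verification is off, so enabling verification is the more thorough fix where that is possible.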
Raju 

Ru eL

Feb 5, 2020, 7:56:04 AM
to Raju Penumatsa, robotframework-devel
Hi Raju, 

Thanks, I'll check it out.

Thanks,
Ruel