Unable to handle the pdf file


Tribhuvan Yadav

Apr 23, 2012, 1:58:23 PM
to Selenium Users
Hi,
I have a link in my application that opens a PDF file when clicked.
I have to verify whether the correct PDF file is opened. I tried
selenium.selectWindow(), but couldn't succeed.

I came to know about AutoIt, but I am very new to it. If anyone
knows how I can get the AutoIt code and integrate it into my Selenium
code, that would be much appreciated.


Specification
1. I am using Selenium RC + Chrome browser


Thanks
Tribhuvan

Mark Collin

Apr 24, 2012, 4:00:41 AM
to seleniu...@googlegroups.com

This all depends upon how much detail you need to go into.

If you just want to check based on filename, all you need to do is parse the href attribute of your link. If you want to physically load up the PDF and check it, you'll need to use some external libraries, and the process is not straightforward.

Before going down this path, ask yourself what the benefit is.

· Is the PDF likely to have an incorrect filename?
· If you load the PDF up, how much actual checking are you going to do?
· Would it be enough to just know the filename and filesize?
· Could you download it and do an MD5/SHA1 hash to determine it’s the correct file without loading it up and checking the content?
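
The filename-only check can be sketched in a few lines of plain Java. This is only an illustration: the class name, method name and example URL are made up, and it assumes the link target has already been read from the page as a string (with Selenium RC, probably via getAttribute with a locator@href style attribute locator).

```java
import java.net.URI;

final class PdfLinkCheck {

    /** Returns the last path segment of a link target, ignoring any query string or fragment. */
    static String fileNameOf(String href) {
        // getPath() drops "?query" and "#fragment"; it assumes a normal http(s) or file URL
        String path = URI.create(href).getPath();
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        // Hypothetical href value, as it might be read from the link under test
        String href = "http://example.com/docs/report-2012.pdf?session=abc";
        System.out.println(fileNameOf(href)); // prints report-2012.pdf
    }
}
```

Comparing the returned name with the expected one is then an ordinary string assertion in whatever test framework you use.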

 

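If you are determined to do more than check the filename, you will first of all need to download the PDF using something like my Downloader Class (https://github.com/Ardesco/Ebselen/blob/master/ebselen-core/src/main/...om/lazerycode/ebselen/customhandlers/FileDownloader.java). You can then check filesize and/or perform an MD5/SHA1 hash to confirm that it is the file you expect.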
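
For readers who only need the general idea, here is a much simpler stand-in, not the Downloader Class itself. It uses only the JDK, the class name and URL are invented, and it assumes the PDF can be fetched without the browser session's cookies (no login required).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SimplePdfDownload {

    /** Copies whatever the URL serves into a temporary file and returns its path. */
    static Path download(String url) throws IOException {
        Path target = Files.createTempFile("downloaded-", ".pdf");
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
        return target;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical URL, e.g. the href read from the link under test
        Path pdf = download("http://example.com/docs/report-2012.pdf");
        System.out.println(pdf + " is " + Files.size(pdf) + " bytes");
    }
}
```

Files.size gives the filesize check directly; if the application needs an authenticated session, the request would also have to carry the session cookies, which this sketch does not attempt.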

 

If you want to go further and load the PDF up and validate parts of its content, you will then need to write a PDF handler. A quick Google search brings up http://pdfbox.apache.org/, which looks like it will probably meet your requirements.
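
A minimal sketch of such a handler, assuming PDFBox 2.x is on the classpath (PDFBox 3.x replaces PDDocument.load with Loader.loadPDF). The class name, file path and expected text are made up for illustration, and the PDF is assumed to have been downloaded already.

```java
import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;

final class PdfTextCheck {

    /** Extracts the text of every page of the given PDF as one String. */
    static String extractText(File pdf) throws IOException {
        // PDDocument holds file resources, so close it when done
        try (PDDocument document = PDDocument.load(pdf)) {
            return new PDFTextStripper().getText(document);
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical path to the downloaded file and hypothetical expected text
        String text = extractText(new File("downloaded.pdf"));
        System.out.println(text.contains("Expected heading"));
    }
}
```

Text extraction only works for PDFs that contain real text; a scanned document that is just page images would come back empty.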

 

This is all very Java orientated (because I use Java).  The principle will be the same for other languages.

--

You received this message because you are subscribed to the Google Groups "Selenium Users" group.

To post to this group, send email to seleniu...@googlegroups.com.

To unsubscribe from this group, send email to selenium-user...@googlegroups.com.

For more options, visit this group at http://groups.google.com/group/selenium-users?hl=en.

 

Tribhuvan Yadav

Apr 24, 2012, 9:07:14 AM
to seleniu...@googlegroups.com
Hi Mark,

Thanks for the mail.

What I actually have to test is this: when I click the link, a new PDF file opens, and I have to verify the text in that newly opened PDF file. Can you tell me how I can proceed?
Thanks
Tribhuvan

daitha shankar

Apr 24, 2012, 9:32:36 AM
to Selenium Users
Hi Tribhuvan Yadav,

I don't know how to handle the scenario you describe; I am also
looking for a solution to the same problem.

But I can show you how to integrate AutoIt code with Selenium.
I am working with WebDriver in Mozilla.

======================

Create an AutoIt script for your scenario. Write your code in the
SciTE script editor (it comes with the AutoIt installation).

Compile it to an EXE file.

Add the following code to your Selenium program.
---------------------------------------------------------------------

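import java.io.IOException;

try {
    // Location of the compiled AutoIt executable
    String[] commands = new String[]{"exe file path"};
    Runtime.getRuntime().exec(commands);
} catch (IOException e) {
    // Fail loudly instead of silently ignoring a script that could not start
    throw new RuntimeException("Could not start the AutoIt executable", e);
}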



I used this code for uploading images and files in my application.
and i scu



Mark Collin

Apr 24, 2012, 9:45:56 AM
to seleniu...@googlegroups.com

I have told you that already; read my previous message again, specifically the part about creating a PDF handler.

I suspect all you really need to do is download the file, compute its MD5/SHA1 hash, and compare it to the MD5/SHA1 hash of a known good copy of the file. It will be much easier than writing a PDF handler.
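
The hash comparison could look roughly like this, using only the JDK. The class and method names are invented for the example, and it assumes both the downloaded file and a known good copy are already on disk. It is only useful when the server returns a byte-identical file every time; a PDF that is generated on request may hash differently on each download.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class FileHashCheck {

    /** Reads the stream to the end and returns its SHA-1 digest as lowercase hex. */
    static String sha1Hex(InputStream in) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is required to be present in every Java runtime
            throw new IllegalStateException(e);
        }
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            digest.update(buffer, 0, read);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True when both files have the same SHA-1 digest. */
    static boolean sameContent(Path downloaded, Path knownGood) throws IOException {
        try (InputStream a = Files.newInputStream(downloaded);
             InputStream b = Files.newInputStream(knownGood)) {
            return sha1Hex(a).equals(sha1Hex(b));
        }
    }
}
```

Swapping "SHA-1" for "MD5" or "SHA-256" in getInstance changes the algorithm without touching the rest of the code.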

SantoshSarma

Apr 24, 2012, 12:06:25 PM
to seleniu...@googlegroups.com
Hi Tribhuvan,

Does the PDF file open in the browser or in another application?

If it opens in the browser and you want to search for some names or other text content, then try the isTextPresent() method.


Krishnan Mahadevan

Apr 24, 2012, 12:22:50 PM
to seleniu...@googlegroups.com
I don't think you will be able to extract any text from a web browser window that is showing a PDF file.


Thanks & Regards
Krishnan Mahadevan

"All the desirable things in life are either illegal, expensive, fattening or in love with someone else!"



Tribhuvan Yadav

Apr 24, 2012, 12:55:34 PM
to seleniu...@googlegroups.com
Hi Santosh,

I have tried it the way you mentioned, but Selenium is not able to handle it.


thanks
Tribhuvan

Tribhuvan Yadav

Apr 24, 2012, 12:58:34 PM
to seleniu...@googlegroups.com
Hi Shankar,

Thanks for your reply. I will try to implement this and will let you know if I face any issues.


Thanks
tribhuvan



SantoshSarma

Apr 24, 2012, 12:59:07 PM
to seleniu...@googlegroups.com
@Krishnan: It may not be possible, but I think we can verify the text using the selenium.isTextPresent() method. (I haven't tried it; this is just my assumption.)

Tribhuvan Yadav

Apr 24, 2012, 1:02:56 PM
to seleniu...@googlegroups.com
Santosh,

When I click the link, the PDF file opens in the browser. I am doing this:

selenium.selectWindow() to switch to the newly opened window

and then selenium.isTextPresent();

When I run it, Selenium is unable to identify the new window, and my test case fails.

thanks
Tribhuvan


Krishnan Mahadevan

Apr 24, 2012, 1:03:12 PM
to seleniu...@googlegroups.com
Santosh,
No, you can't get hold of any text within the PDF using the isTextPresent() method.


Thanks & Regards
Krishnan Mahadevan

"All the desirable things in life are either illegal, expensive, fattening or in love with someone else!"

SantoshSarma

Apr 24, 2012, 1:07:59 PM
to seleniu...@googlegroups.com
OK. Thank you, Krishnan!

Raju

Apr 24, 2012, 2:34:28 PM
to Selenium Users

Once the PDF file has been opened, copy its content into a String
variable and check your data against that String. If you want,
you can parse the String as needed and check your output
accordingly.
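
A rough sketch of that check in plain Java. It assumes the PDF text has already been extracted into a String by some PDF library (Selenium itself cannot do this); the class name, method names and sample text are invented for illustration. Whitespace is collapsed first because extracted PDF text often has line breaks in unexpected places.

```java
import java.util.ArrayList;
import java.util.List;

final class ExtractedTextCheck {

    /** Collapses every run of whitespace (including line breaks) into a single space. */
    static String normalize(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    /** Returns the expected phrases that do not occur in the extracted text. */
    static List<String> missingPhrases(String extracted, String... expected) {
        String haystack = normalize(extracted);
        List<String> missing = new ArrayList<>();
        for (String phrase : expected) {
            if (!haystack.contains(normalize(phrase))) {
                missing.add(phrase);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Hypothetical text, standing in for what a PDF library returned
        String extracted = "Invoice  No:\n 42\nTotal due";
        System.out.println(missingPhrases(extracted, "Invoice No: 42", "Paid")); // prints [Paid]
    }
}
```

An empty result means every expected phrase was found, so a test can simply assert that the returned list is empty and print it on failure.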


Mark Collin

Apr 24, 2012, 3:12:02 PM
to seleniu...@googlegroups.com
Good luck getting AutoIT to *read* a PDF.



Mark Collin

Apr 24, 2012, 3:13:26 PM
to seleniu...@googlegroups.com

There are times when I wonder why I bother posting.

You cannot handle PDFs with Selenium. The only way you can do it is to write your own PDF handler, and this *will* involve downloading the PDF to parse it.

PDFs are displayed in the browser by a PDF viewer plugin; they are not rendered as HTML, so Selenium will never be able to read them.

 


Moises Siles

Apr 24, 2012, 3:28:08 PM
to seleniu...@googlegroups.com
Basically, the answer was given in the second post of this email chain :)

Madan Singh

Oct 24, 2013, 4:59:53 AM
to seleniu...@googlegroups.com
Can anybody help me? I have a page with a link to download a PDF, and I want to read some content from this PDF file to verify that the PDF is correct.

Thanks in advance

Madan






--
M P Singh
9971360313

Priyanga Asokan

Oct 26, 2013, 6:55:25 AM
to seleniu...@googlegroups.com
Kindly help; I am only just learning Selenium. In my application we need to validate the PDF content as well as the correct filename. Is that possible in Selenium or not?

**I once tried writing scripts in AutoIt, but I failed.


Specification:
I am using Selenium RC in IE