Is it possible to copy all the text from a pdf opened in a browser?

Chema del Barco

Feb 6, 2012, 1:59:18 PM
to webdriver
Hi all!

I am creating a WebDriver test that needs to read all the text from a
PDF opened in a browser (Firefox, for instance). I had a Selenium RC
version that works perfectly, but the plan is to switch the whole
framework to WebDriver, and I have not been able to make it work
there.

What I am trying to do is copy all the text to the clipboard (by
pressing "CTRL+A", then "CTRL+C") and then process the copied text.
Please note that because it is a PDF window, the only element to
interact with inside the page is an "embed" element inside the body.
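
The "process the copied text" step can be sketched with the JDK's AWT clipboard API. This is a minimal sketch, not code from the original framework: the class name ClipboardText and its two methods are hypothetical, and it assumes the test runs on the same desktop session as the browser, since it reads the local clipboard only.

```java
import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;
import java.awt.datatransfer.UnsupportedFlavorException;
import java.io.IOException;

/** Hypothetical helper for reading copied text; not part of WebDriver. */
final class ClipboardText {

    /** Returns the local clipboard's text, or "" when it holds no plain text. */
    static String read() throws IOException, UnsupportedFlavorException {
        // Assumption: a desktop session is available (this throws HeadlessException otherwise).
        Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
        if (!clipboard.isDataFlavorAvailable(DataFlavor.stringFlavor)) {
            return "";
        }
        return (String) clipboard.getData(DataFlavor.stringFlavor);
    }

    /** Converts Windows line endings to "\n", collapses runs of spaces/tabs, and trims the ends. */
    static String normalize(String raw) {
        return raw.replace("\r\n", "\n").replaceAll("[ \\t]+", " ").trim();
    }
}
```

A test would call ClipboardText.normalize(ClipboardText.read()) after the copy shortcut has been sent and assert on the result. The normalization step is an assumption about what "processing" needs; text copied from a PDF viewer often differs in whitespace between platforms.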

So far I have tried several approaches (Selenium 2.17, Firefox 9):

- The one in http://groups.google.com/group/webdriver/browse_thread/thread/69e9ddfad8dcb1e6:

new Actions(driver())
    .click() // tried it in case the pdf did not have the focus
    .sendKeys(Keys.LEFT_CONTROL + "a")
    .sendKeys(Keys.LEFT_CONTROL + "c")
    .build()
    .perform();

- The previous one without click()

- A more careful attempt ("waits()" is a custom WebDriverWait
implementation):

Actions builder = new Actions(driver());

Action click =
builder.moveToElement(driver().findElement(By.tagName("embed"))).click().build();

Action control = builder.keyDown(Keys.CONTROL).build();
Action a = builder.sendKeys("a").build();
Action c = builder.sendKeys("c").build();

click.perform();
waits().pause(1000);

control.perform();
waits().pause(1000);

a.perform();
waits().pause(4000);

control.perform();
waits().pause(1000);

c.perform();

- Switching to WebDriverBackedSelenium and reusing the code that works
in Selenium RC (the backed Selenium reported that keyPressNative is
not implemented):

Selenium selenium = new WebDriverBackedSelenium(driver(), <url of the pdf>);
selenium.keyPressNative(String.valueOf(KeyEvent.VK_CONTROL)); // Stands for CONTROL
waits().pause(1000);
selenium.keyPressNative("65"); // Stands for A (ASCII code for A)
waits().pause(4000);
selenium.keyPressNative(String.valueOf(KeyEvent.VK_CONTROL)); // Stands for CONTROL
waits().pause(1000);
selenium.keyPressNative("67"); // Stands for C (ASCII code for C)

None of these implementations worked, and right now I'm out of
ideas... Could anyone shed some light on this, please? Help is much
appreciated!
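
The key codes passed to keyPressNative above ("65", "67") are the values of java.awt.event.KeyEvent.VK_A and VK_C, so they can be written with the named constants instead of literals. The sketch below also shows the same Ctrl+A, Ctrl+C sequence sent through the JDK's java.awt.Robot. It is a sketch under assumptions, not a confirmed fix for this thread: the class NativeCopyAll and its methods are hypothetical, the browser window must already have focus on the machine running the code, and Robot acts on the local desktop only, so it would not reach a browser on a remote node.

```java
import java.awt.AWTException;
import java.awt.Robot;
import java.awt.event.KeyEvent;

/** Hypothetical helper; not part of Selenium. */
final class NativeCopyAll {

    /** Key code as the decimal string that the keyPressNative calls above pass in. */
    static String code(int virtualKey) {
        return String.valueOf(virtualKey);
    }

    /** Holds Ctrl, taps the given key, then releases Ctrl. */
    static void ctrlChord(Robot robot, int virtualKey) {
        robot.keyPress(KeyEvent.VK_CONTROL);
        try {
            robot.keyPress(virtualKey);
            robot.keyRelease(virtualKey);
        } finally {
            // Always release the modifier so it is not left stuck down.
            robot.keyRelease(KeyEvent.VK_CONTROL);
        }
    }

    /** Sends Ctrl+A then Ctrl+C to whichever local window currently has focus. */
    static void selectAllAndCopy() throws AWTException {
        Robot robot = new Robot();   // assumption: a local, non-headless desktop
        robot.setAutoDelay(100);     // short pause after each generated event
        ctrlChord(robot, KeyEvent.VK_A);
        ctrlChord(robot, KeyEvent.VK_C);
    }
}
```

With this helper, code(KeyEvent.VK_A) and code(KeyEvent.VK_C) yield the same strings as the literals above, which keeps the intent readable. Whether the PDF plugin reacts to these events still depends on it having keyboard focus.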



Mark Collin

Feb 7, 2012, 7:21:41 AM
to webd...@googlegroups.com
I assume you mean google chrome:

http://code.google.com/p/selenium/wiki/ChromeDriver

Mark Collin

Feb 7, 2012, 8:29:29 AM
to webd...@googlegroups.com
Looks like this has gone to the wrong thread... Oops

Chema del Barco

Feb 7, 2012, 8:26:16 AM
to webdriver
No, I am using firefox with native events and JS enabled. Here's the
part of the method to create the webdriver session:

public void startWebDriverSession(DesiredCapabilities capability)
{
    String browser = capability.getBrowserName();
    capability.setJavascriptEnabled(true);

    // Enable native events in the Firefox driver
    if (browser.equals(DesiredCapabilities.firefox().getBrowserName()))
    {
        FirefoxProfile profile = new FirefoxProfile();
        profile.setEnableNativeEvents(true);
        capability.setCapability(FirefoxDriver.PROFILE, profile);
    }

    this.driver = new RemoteWebDriver(
        new URL("http://" + serverHost + ":" + serverPort + "/wd/hub"), capability);
}

