Connecting to printers: Firefox webdriver gets HTML code. PhantomJS gets XML


Manuel Gonzalez

Feb 9, 2015, 5:27:09 AM
to seleniu...@googlegroups.com
Hi:

I'm writing code that connects to 30 printers to read their total number of copies and prints.

When I connect with Firefox, I get HTML and can work with it. But because there is no graphical environment (I need to launch the script through cron), I switched to PhantomJS. When I run the same code with PhantomJS instead of Firefox, I get XML instead of HTML. How can I get the HTML source?

I attach some screenshots to show what I mean.

I use Linux and Python 2.

Thanks in advance. Regards.
Konica1.png
Konica2-UsingFirefox.png
Konica3-UsingPhantomJS.png

Selenium Framework

Feb 9, 2015, 7:29:24 PM
to seleniu...@googlegroups.com
I thought HTML was basically XML with more markup. Can you explain the difference between the XML source and the HTML source, please?
Then maybe I can help. Selenium's page source generally returns the page's HTML.

cheers,

Manuel Gonzalez

Feb 10, 2015, 3:40:14 AM
to seleniu...@googlegroups.com

Hi, thanks for your answer.

As you can see in attachments 2 and 3, it's not the same. I think browsers internally convert the XML into HTML.

For example, this is the source code of the Konica Minolta front page:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="top.xsl" type="text/xsl"?>
<MFP>
<SelNo>Es</SelNo>
<LangNo>Es</LangNo>
<Service><Setting><AuthSetting><AuthMode><AuthType>Device</AuthType><MiddleServerUse>Off</MiddleServerUse>
<ListOn>false</ListOn>
<PublicUser>true</PublicUser>
<DefaultAuthType></DefaultAuthType>
<BoxAdmin>false</BoxAdmin>
<EnableAuthDeviceType2Mode2Auth>Off</EnableAuthDeviceType2Mode2Auth></AuthMode><TrackMode><TrackType>None</TrackType></TrackMode></AuthSetting><MiddleServerSetting><ControlList><ArraySize>0</ArraySize></ControlList><Screen><Id>0</Id></Screen></MiddleServerSetting>
<PswcForm>HtmlFlash</PswcForm>
</Setting></Service><LangDummy>false</LangDummy><FuncVer>5</FuncVer>
<DN70B5>Off</DN70B5><DN70B1>Off</DN70B1></MFP>
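
The second line of that source is an xml-stylesheet processing instruction, which is the usual way a browser is told to apply a stylesheet (here top.xsl) before rendering. As a minimal sketch, assuming only the Python standard library and a shortened copy of the XML above (the helper name stylesheet_href is hypothetical), the reference can be read from the raw XML like this:

```python
import re
from xml.dom import minidom

# Shortened copy of the printer's front page source (illustration only).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="top.xsl" type="text/xsl"?>
<MFP><SelNo>Es</SelNo><LangNo>Es</LangNo><FuncVer>5</FuncVer></MFP>"""


def stylesheet_href(xml_bytes):
    """Return the href of the first xml-stylesheet instruction, or None."""
    doc = minidom.parseString(xml_bytes)
    # Processing instructions placed before the root element are
    # direct children of the document node.
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == "xml-stylesheet"):
            match = re.search(r'href="([^"]+)"', node.data)
            if match:
                return match.group(1)
    return None


print(stylesheet_href(SAMPLE))
```

If the helper returns a stylesheet name, the HTML seen in Firefox is most likely the result of applying that stylesheet on the client, and the server itself only ever sends XML.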





For example, to select the Administrator option and hit RETURN I have to do:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
md = webdriver.Firefox()
md.get("http://xxxxxxxx")
md.find_element_by_id("Admin").click()
md.find_element_by_id("Admin").send_keys(Keys.RETURN)

This works perfectly because Firefox returns HTML. But this other code (the same, with PhantomJS in place of Firefox):


from selenium import webdriver
from selenium.webdriver.common.keys import Keys
md = webdriver.PhantomJS()
md.get("http://xxxxxxxx")
md.find_element_by_id("Admin").click()
md.find_element_by_id("Admin").send_keys(Keys.RETURN)

fails with an error because it can't locate the element.

If you want to give it a try, you can find some Konica Minolta printers using Google (searching "inurl:wcd/top.xml"). For example:

http://tpshedr.anu.edu.au/wcd/top.xml
http://131.247.144.27/wcd/top.xml

Thanks!!!!

Selenium Framework

Feb 10, 2015, 10:56:14 AM
to seleniu...@googlegroups.com
Yes, I see your point now. I was able to reproduce your problem using Java, GhostDriver and PhantomJS.

1) I also monitored the HTTP traffic using Fiddler, comparing the traffic with PhantomJS against the traffic with ChromeDriver.

Observation
==========
Surprisingly, when PhantomJS is used, the resources (HTML, CSS, ...) are NOT downloaded to the client side, whereas the same works fine with ChromeDriver. At this point I am out of suggestions from the client side. We would have to understand the server-side layers to see whether they check the user agent and respond differently based on it. Does anybody else have ideas?

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.DesiredCapabilities;

public class SimpleTest {
    public static void main(String args[]) throws InterruptedException {

        WebDriver driver = new ChromeDriver();
        DesiredCapabilities caps = new DesiredCapabilities();
        // caps.setCapability(PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,"C://softwares//phantomjs-1.9.8-windows//phantomjs.exe");
        // WebDriver driver = new PhantomJSDriver(caps);
        driver.get("http://tpshedr.anu.edu.au/wcd/top.xml");
        Thread.sleep(5000);
        System.out.println(driver.getPageSource());
        WebElement admin_element = driver.findElement(By.id("Admin"));
        System.out.println(admin_element.getAttribute("outerHTML"));
        driver.quit();
    }
}
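
One way to check the user-agent idea without any browser is to request the same URL with two different User-Agent headers and compare the bodies. The following is only a sketch under assumptions: the function names and the injectable opener parameter are hypothetical, the agent strings are whatever you choose to pass in, and it uses plain urllib rather than Selenium.

```python
try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def fetch_body(url, user_agent, opener=urlopen):
    """Fetch url with the given User-Agent header and return the raw body."""
    request = Request(url, headers={"User-Agent": user_agent})
    response = opener(request, timeout=10)
    try:
        return response.read()
    finally:
        response.close()


def bodies_by_agent(url, agents, opener=urlopen):
    """Map each label in agents to the body received for its User-Agent."""
    return {label: fetch_body(url, agent, opener)
            for label, agent in agents.items()}


def responses_differ(bodies):
    """True if at least two of the fetched bodies are not identical."""
    return len(set(bodies.values())) > 1
```

Calling bodies_by_agent with the printer URL and a dictionary such as {"firefox": "...", "phantomjs": "..."} and passing the result to responses_differ would show whether the server answers differently per agent. If the bodies are identical, the difference more likely comes from how each browser processes the XML than from the server.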




cheers,

Manuel Gonzalez

Feb 11, 2015, 7:21:25 AM
to seleniu...@googlegroups.com
Hi:

Thanks for your efforts. I need it to run in a console, because it will be launched by cron. PhantomJS is the only option I know for that.

Really, this is very strange.

Thanks again. Regards: