UserAgent caching error


David Garcia

Jul 8, 2011, 8:10:16 PM
to BrowserMob Proxy
I ran into a weird problem today. After some debugging I found that the
user agent caching was failing to parse a bad userAgentString.txt
file, which is downloaded from http://user-agent-string.info/rpc/get_data.php?key=free&format=ini

It turns out the file is refreshed once a day, and this morning my code
broke.
It happened only when I called
proxy.newHar()

If I didn't use any HAR, it worked fine.


The wrong file was something like this:

<br />
<b>Notice</b>: Undefined index: ver in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>7</b><br />
<br />
<b>Notice</b>: Undefined index: botIP-All in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>16</b><br />
<br />
<b>Notice</b>: Undefined index: botIP-Googlebot in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>32</b><br />
<br />
<b>Notice</b>: Undefined index: botIP-Yahoo in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>47</b><br />
<br />
<b>Notice</b>: Undefined index: botIP-MSN-Bing in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>62</b><br />
<br />
<b>Notice</b>: Undefined index: botIP-Baiduspider in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>77</b><br />
<br />
<b>Notice</b>: Undefined index: md5 in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>95</b><br />
<br />
<b>Notice</b>: Undefined index: sha1 in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>108</b><br />
<br />
<b>Notice</b>: Undefined index: ico in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>121</b><br />
<br />
<b>Notice</b>: Undefined index: ico in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>150</b><br />
<br />
<b>Notice</b>: Undefined index: UASparser in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>180</b><br />
<br />
<b>Notice</b>: Undefined index: uaslist in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>212</b><br />
<br />
<b>Notice</b>: Undefined index: uasOSlist in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>262</b><br />
<br />
<b>Notice</b>: Use of undefined constant verze - assumed 'verze' in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>311</b><br />
<br />
<b>Notice</b>: Undefined index: download in <b>/mnt/space/weby/virtual/user-agent-string.info/htdocs/rpc/get_data.php</b> on line <b>320</b><br />
; Data (format ini) for UASparser - http://user-agent-string.info/download/UASparser
; Version: 20110701-01
; Checksum:
; MD5 - http://user-agent-string.info/rpc/get_data.php?format=ini&md5=y
; SHA1 - http://user-agent-string.info/rpc/get_data.php?format=ini&sha1=y
;
[robots]
; bot_id[] = "bot useragentstring"
; bot_id[] = "bot Family"
; bot_id[] = "bot Name"
; bot_id[] = "bot URL"
; bot_id[] = "bot Company"


So the file parser used by CachingOnlineUpdateUASparser was failing
(it is called from org.browsermob.proxy.http.BrowserMobHttpClient.java:381),
and the exception caused all requests to get 502 Bad Gateway.

The exception was:
java.lang.StringIndexOutOfBoundsException: String index out of range: -1
at java.lang.String.substring(String.java:1937)
at cz.mallat.uasparser.fileparser.PHPFileParser.loadFile(PHPFileParser.java:65)
at cz.mallat.uasparser.fileparser.PHPFileParser.<init>(PHPFileParser.java:29)
at cz.mallat.uasparser.UASparser.loadDataFromFile(UASparser.java:212)
at cz.mallat.uasparser.CachingOnlineUpdateUASparser.<init>(CachingOnlineUpdateUASparser.java:55)
at cz.mallat.uasparser.CachingOnlineUpdateUASparser.<init>(CachingOnlineUpdateUASparser.java:26)
at org.browsermob.proxy.http.BrowserMobHttpClient.execute(BrowserMobHttpClient.java:381)

The workaround was to delete the file /tmp/usrAgentString.txt so it is
downloaded again.
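
For anyone who wants to automate that workaround, here is a rough sketch that could run before the proxy starts. It is not part of BrowserMob Proxy or uasparser: the class name UaCacheCheck and both method names are made up for illustration, and the "looks like INI" test (first non-blank line starts with ';' or '[') is just an assumption based on the good part of the file shown above.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper, not part of BrowserMob Proxy or uasparser.
public class UaCacheCheck {

    // Assumption: a healthy data file starts with an INI comment (';')
    // or a section header ('['), as in the sample above.
    static boolean looksLikeIni(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip leading blank lines
            }
            return trimmed.startsWith(";") || trimmed.startsWith("[");
        }
        return false; // an empty file is not usable either
    }

    // Deletes the cache file when it does not look like INI.
    // Returns true only if a file was actually deleted.
    static boolean deleteIfCorrupt(Path cacheFile) throws IOException {
        if (!Files.exists(cacheFile)) {
            return false;
        }
        // ISO-8859-1 maps every byte to a char, so decoding cannot fail.
        List<String> lines = Files.readAllLines(cacheFile, StandardCharsets.ISO_8859_1);
        if (looksLikeIni(lines)) {
            return false;
        }
        Files.delete(cacheFile);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the workaround above; pass another as args[0].
        Path cache = Paths.get(args.length > 0 ? args[0] : "/tmp/usrAgentString.txt");
        System.out.println(deleteIfCorrupt(cache) ? "deleted corrupt cache" : "cache left alone");
    }
}
```

This only helps once the online service is serving a good file again; if the next download is also broken, the parser will fail the same way.
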
And the fix is to catch the exception in BrowserMobHttpClient, which
as of now only catches IOExceptions.
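
Until the patch is ready, this is roughly the shape of the fix I have in mind. It is a standalone sketch, not the actual BrowserMobHttpClient code: SafeParserInit and loadOrFallback are hypothetical names, and the lambda in main only simulates the parser blowing up on a bad file. The point is to catch unchecked exceptions as well as IOException, so that a broken data file degrades user-agent detection instead of failing every request.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch, not the real BrowserMobHttpClient code.
public class SafeParserInit {

    // Runs the loader and returns its result. If the loader throws any
    // exception, checked or unchecked, the fallback is returned instead.
    static <T> T loadOrFallback(Callable<T> loader, T fallback) {
        try {
            return loader.call();
        } catch (Exception e) {
            // Covers IOException and also RuntimeExceptions such as the
            // StringIndexOutOfBoundsException in the stack trace above.
            System.err.println("User-agent data failed to load, continuing without it: " + e);
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Stand-in for the parser choking on an HTML line with no '=' in it:
        // indexOf returns -1 and substring(0, -1) throws.
        String line = "<br />";
        String result = loadOrFallback(
                () -> line.substring(0, line.indexOf('=')),
                "unknown");
        System.out.println(result);
    }
}
```

In the real client the fallback would probably be "no user-agent info for this HAR entry" rather than a string, but that depends on how the parser result is used there.
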

I'll send a patch soon.

Hope this helps.

Patrick Lightbody

Jul 8, 2011, 9:05:49 PM
to browserm...@googlegroups.com
Wow, very interesting. I just grabbed that user-agent code but never looked closely at it. We probably don't want to rely so heavily on an external service like that. Maybe we can just check a version of that userAgentString.txt file into the project and update it every release?

--
Patrick Lightbody
+1 (415) 830-5488

Srinivasa Raja

Jul 24, 2012, 4:18:57 PM
to browserm...@googlegroups.com
Hi,

I am facing the same problem. Can you elaborate on what I have to do to fix it?

Thanks,
Srini.

runwolf

Jul 24, 2012, 4:29:54 PM
to browserm...@googlegroups.com
I am guessing a lot of people are seeing big logs today :-)

Same issue for me.

pat...@lightbody.net

Jul 27, 2012, 12:09:37 PM
to browserm...@googlegroups.com
David,
Did you ever send the patch in?

Patrick

runwolf

Jul 31, 2012, 11:03:43 AM
to browserm...@googlegroups.com
It's no longer happening. I guess the online service is working again?


On Friday, July 27, 2012 12:09:37 PM UTC-4, Patrick Lightbody wrote:
David,
Did you ever send the patch in?

Patrick