Size of standard browscap.ini


Anthony Lalande

Sep 20, 2014, 4:46:14 AM
to brow...@googlegroups.com
Hi,

I recently updated the browscap.ini file on my site, to be able to recognize iOS 8 UAs, and was surprised to find that between the file I downloaded in March (v5024; normal) and the one I downloaded yesterday (v5032; normal), the file size has more than tripled! (2.6 MiB vs. 8 MiB!)

Even the lite edition of v5032 doubles the size of the file (2.6 MiB vs. 5.2 MiB)!


Could someone here help me understand what changes were made to the file over the summer that explain this substantial growth?

I tried looking at diffs but there were simply too many to get a comprehensive understanding of what the major changes were. I also tried looking at the issues and commits in GitHub, but I'm still left scratching my head.

Was there an explosion in new web browsers over the summer?

Thanks for your help,
- Anthony

James Titcumb

Sep 20, 2014, 5:42:10 AM
to browscap on behalf of Anthony Lalande

Hi Anthony

It is a great question. We have made numerous improvements to the quality of the data contained in the ini. However, the trade-off is that the file is now much larger.

We acknowledge that this is a problem in some common cases (including running out of memory parsing the file), and we are trying to take an effective course of action.

Because I do this in my spare time, I have not yet had time to remedy this, but there are definitely plans to address these issues :)

Hope that helps!

Thanks
James


Dino Termini

Sep 20, 2014, 6:04:09 AM
to brow...@googlegroups.com
Hi,

Like I mentioned a while ago, our workaround to avoid memory issues with our WordPress plugin was to split the file into chunks and then loop over them until a match is found:

https://plugins.trac.wordpress.org/browser/wp-slimstat/trunk/databases

We left the most popular matches in the first chunk, mobile agents in the second, and so on. You may want to follow a similar approach.
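
The chunked lookup described above could be sketched roughly as follows. This is not the plugin's actual code: the names `lookup` and `pattern_to_regex`, the in-memory chunk format, and the "longest matching pattern wins" rule are assumptions made for illustration. Only the browscap-style `*` and `?` wildcards are handled.

```python
import re
from typing import Callable, Dict, Iterable, Optional

# A chunk maps a browscap-style pattern to its properties (assumed format).
Chunk = Dict[str, Dict[str, str]]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a pattern using only '*' and '?' wildcards into a regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def lookup(user_agent: str,
           chunk_loaders: Iterable[Callable[[], Chunk]]) -> Optional[Dict[str, str]]:
    """Load one chunk at a time and stop at the first chunk that matches."""
    for load_chunk in chunk_loaders:
        chunk = load_chunk()  # later chunks are never loaded if this one matches
        matches = [p for p in chunk if pattern_to_regex(p).fullmatch(user_agent)]
        if matches:
            # Assumption: prefer the longest (most specific) pattern.
            return chunk[max(matches, key=len)]
    return None
```

Ordering the loaders from most to least popular means that most requests only ever pay for the first, smallest chunk.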

Dino.

Anthony Lalande

Sep 20, 2014, 8:45:59 PM
to brow...@googlegroups.com

Hi James,

Thanks for your prompt reply.

I think an interesting solution to this problem would be to cross-reference browser definitions with their popularity in the wild, in order to build files that include the most popular browsers.

Rather than (or perhaps in addition to) manually curating a standard and a lite version, it would be interesting to be able to build your own definition files on the fly. Since usage statistics would be associated with each browser family, one could generate a definitions file to cover the 90th percentile, 95th percentile, 99.9th percentile, etc., possibly manually including other browser families of interest as well.

e.g.:
`bin/browscap build --percentile=99 --include="BlackBerry 3.7"`
`bin/browscap build --size=3M --include="Safari 5.0" --include="Nagios"`
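
To make the percentile idea concrete, here is a minimal sketch of the selection step only. It assumes a per-family usage table exists; the `select_families` function and the sample numbers are hypothetical and are not part of browscap.

```python
from typing import Dict, Iterable, List


def select_families(usage: Dict[str, float],
                    percentile: float,
                    include: Iterable[str] = ()) -> List[str]:
    """Pick the most-used families until `percentile` percent of usage is covered."""
    target = sum(usage.values()) * percentile / 100.0
    selected: List[str] = []
    covered = 0.0
    # Walk families from most to least used.
    for family, share in sorted(usage.items(), key=lambda kv: kv[1], reverse=True):
        if covered >= target:
            break
        selected.append(family)
        covered += share
    # Forced includes (e.g. crawlers) are added regardless of popularity.
    for family in include:
        if family not in selected:
            selected.append(family)
    return selected


# Hypothetical usage shares, for illustration only.
usage = {"Chrome": 50, "Firefox": 25, "IE": 15, "Safari": 8, "BlackBerry 3.7": 2}
print(select_families(usage, 90, include=["Nagios"]))
```

The forced-include list is what would keep rarely seen but needed agents, such as crawlers, in the output.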


There would need to be ways of including browsers which aren't common out in the wild, but which are needed, e.g. crawlers.


Thanks for all of your work on this project, it has definitely made UA detection much easier.
- Anthony

James Titcumb

Sep 21, 2014, 12:30:52 AM
to browscap on behalf of Anthony Lalande

Hi Anthony

Indeed - this sort of solution is not unknown to us, and although technically it works fine, in practice there are issues... At the moment, the cost to run the server for the project is minimal, but if we allowed people to generate their own from the website, the processing power required would undoubtedly increase dramatically - and thus the cost of running the servers would become unmanageable for me.

So where does that leave us? Well, the end user could generate their own, e.g. using a command similar to the one you suggested. But those who are having memory issues would still have them: the memory required to generate the INI in the first place is much higher (mostly, in fact, due to generating the different formats), so this wouldn't actually solve the problem for those users.

Given these issues, I think there are two main things we need to change, at least to start with, in order:

- Reduce the size of the "standard" version by removing uncommon properties (like JavaScript, CSS version etc.) - these I think you'll agree are better detected by something like Modernizr. The lite version should include fewer UAs too, and we should better manage which UAs are included.

- Substantially improve the memory usage of browscap-php. No easy feat, and this will require a lot of work, possibly even breaking BC, or requiring a newer PHP version etc.

The reason I list this second is because changing the PHP parser will not help anyone not using PHP... so is actually not that effective. Nevertheless it is something we can work towards as we support that component and it is definitely an issue for many.

Thanks
James

DennisG

Oct 21, 2014, 3:20:26 PM
to brow...@googlegroups.com
Is there a way to maybe allow someone to generate a file from the website (in a way that cannot be automated, to avoid the huge overhead) according to a "last active" date or a "how common" field for the definitions? For example, if a definition has a last seen date of over 2 years ago, I might want to skip it. Or, if it is really rare (maybe on a scale of 1-10), we could exclude it below a certain level. This MIGHT dramatically reduce the file and should be pretty easy to implement. Just spit-ballin' here! :)
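
As a rough illustration of that filter, the sketch below assumes each definition carries two hypothetical fields, `last_seen` and `commonness` (1-10), which browscap does not actually provide.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List


@dataclass
class Definition:
    pattern: str
    last_seen: date   # hypothetical field: when this UA was last observed
    commonness: int   # hypothetical field: 1 (rare) to 10 (very common)


def filter_definitions(definitions: Iterable[Definition],
                       today: date,
                       max_age: timedelta = timedelta(days=730),
                       min_commonness: int = 1) -> List[Definition]:
    """Keep definitions seen within `max_age` and at least `min_commonness` common."""
    cutoff = today - max_age
    return [d for d in definitions
            if d.last_seen >= cutoff and d.commonness >= min_commonness]
```

The filtering itself is cheap; the hard part is obtaining trustworthy values for the two fields.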

James Titcumb

Oct 21, 2014, 3:35:11 PM
to browscap on behalf of DennisG
Hi DennisG,

Anthony suggested a somewhat similar idea earlier and, as I explained, if this custom generation of the INI files became a well-used feature of the website, it would cost a fortune to run the servers, whereas it currently costs us $20 a month. Generating the current set of INI files takes about 5 minutes or so, a lot of memory and a lot of CPU, so it is not realistic to offer custom downloads. We'd pretty much only be able to generate one file per CPU at a time, which would get very expensive.

Additionally, how would we determine how frequently a UA is used? We don't actually have a database of user agents seen (Gary Keith used to maintain one, but I don't know where this data was sourced from so this was not carried over), so there is no real way for us to know frequency of use... so yeah, if we had access to that sort 

I have plans to change the categorisation of the files as I mentioned, which should significantly reduce the size. You can read more about this here: https://github.com/browscap/browscap/wiki/Proposal---re-arrange-files-for-efficiency

Any feedback or ideas are of course welcomed, but unfortunately at this time solutions like this are quite impractical I think.

Thanks
James


DennisG

Oct 21, 2014, 4:09:00 PM
to brow...@googlegroups.com
Thanks James, makes sense.

Antti29

Jan 13, 2015, 3:50:08 AM
to brow...@googlegroups.com
> It is a great question. We have made numerous improvements to the quality of the data contained in the ini. However, the trade-off is that the file is now much larger.
>
> We acknowledge that this is a problem in some common cases (including running out of memory parsing the file), and we are trying to take an effective course of action.
>
> Because I do this in my spare time, I have not yet had time to remedy this, but there are definitely plans to address these issues :)


We recently updated our browscap files. The resulting cache file increased in size to a whopping 13 megabytes and now takes several seconds to load, while the old version was quick and not really noticeable. And yes, this is the "lite" version I'm talking about.

For us, browscap worked well in reliably and unobtrusively checking the browser version but this change really defeats the purpose. I noticed that we're now presented with info such as support for tables, cookies, and javascript. Do we really need those? No matter what you're developing, if (any of) these aren't supported, there really is nothing you can do about it, there is no workaround.

We have now reverted back to our venerable regexp, which identifies the five major browsers and their versions, along with the user's OS.

James Titcumb

Jan 13, 2015, 4:12:07 AM
to browscap on behalf of Antti29
Hi Antti29

We are working on reducing the size of the files, but we are one of the most comprehensive suites for identifying user agents. We go far beyond just identifying the "five major browsers".

The 6000 build that will be released soon will have approximately 6-7mb filesize for the "lite" version, and we are working hard to reduce the size of the "standard" version too (we've shaved off about 1mb, but I'd like to do more).

Oh, by the way, browscap has always had info such as tables/cookies/javascript, that is a hangover from legacy days where this information was useful. The new lite/standard versions will not contain these properties, but the "full" version will still have these.

Thanks
James

Antti29

Jan 13, 2015, 4:36:59 AM
to brow...@googlegroups.com
> We are working on reducing the size of the files, but we are one of the most comprehensive suites for identifying user agents. We go far beyond just identifying the "five major browsers".
>
> The 6000 build that will be released soon will have approximately 6-7mb filesize for the "lite" version, and we are working hard to reduce the size of the "standard" version too (we've shaved off about 1mb, but I'd like to do more).
>
> Oh, by the way, browscap has always had info such as tables/cookies/javascript, that is a hangover from legacy days where this information was useful. The new lite/standard versions will not contain these properties, but the "full" version will still have these.


In my opinion, a "lite" version should not be comprehensive (at the expense of usability), and not go far beyond the bare essentials. Right now, there is no alternative for those who only need the major browsers and little else, and exhaustive has become exhausting.

James Titcumb

Jan 13, 2015, 4:46:35 AM
to browscap on behalf of Antti29
Yes :) I agree with you, that's basically what I was saying :) the lite version should still include common crawlers though, e.g. Googlebot and so on

Locke Hajo

Mar 27, 2015, 7:55:12 AM
to brow...@googlegroups.com
Hello,

I just want to support the thread starter. Currently we use an older browscap.ini of 785K.
Updating to the new ones, at 12M or 6M for the lite version, seems impossible. Every PHP process loads the ini separately, so our servers come under high load because they run out of RAM. Unfortunately we (a hosting company) are thinking about removing browscap.ini from php.ini.
It would be nice if there were a real lite version which doesn't crash our servers ;)

Thanks,
Hajo

James Titcumb

Mar 27, 2015, 8:53:12 AM
to brow...@googlegroups.com
Hello there,

The issue is that so many user agents exist that, in order to hold this much data, the files need to be quite large.

The size of the files has been cut from version "6000" dramatically, but I feel there is still more work to do, these issues exist:


Our build of user agents will probably never shrink back down to 785k... there is simply no way to include the huge number of detailed user agents in such a small amount of space any more, but we are working to reduce the size of the files as much as possible!

Thanks
James

Locke Hajo

Mar 27, 2015, 9:42:58 AM
to brow...@googlegroups.com
Hello,

Understood. But the more info a file contains, the more unusable it gets, and its practical benefit decreases.
Maybe there is a way to reduce the file size to a real lite version by excluding exotic or outdated browsers, and only including the major clients in a few versions.

Thanks,
Hajo

James Titcumb

Mar 27, 2015, 9:44:17 AM
to browscap on behalf of Locke Hajo
That's the plan! :)

Thanks
James

Locke Hajo

Mar 31, 2015, 10:43:33 AM
to brow...@googlegroups.com
Ahh, great.

One last question: when do you expect to release a new real-lite version?
My boss is asking, and I am afraid of being given the job of creating a new browscap on my own ;(
I am not very familiar with clients of all kinds.

Thanks,
Hajo

James Titcumb

Mar 31, 2015, 10:48:41 AM
to browscap on behalf of Locke Hajo
Hi Hajo

Not sure what the roadmap is - we maintain this in our free time, so we can't really make any promises... perhaps submit a PR to us, so everyone can benefit ;)

Thanks
James

Locke Hajo

Apr 1, 2015, 2:31:09 AM
to brow...@googlegroups.com
Hello,

Hmm, pity.

Can you give a rough estimate (days, months, etc.)? Then I could tell my boss whether to wait or whether I should create a file on my own, which I am trying to avoid.

Thanks,
Hajo

James Titcumb

Apr 1, 2015, 2:58:47 AM
to browscap on behalf of Locke Hajo

Hey

As this is an open source project that is run by only a tiny handful of people (pretty much 1-2 people), I'm afraid we can't give you an idea. The project is not invested enough to have a roadmap or anything. However, like I said, perhaps instead of going and creating your own version, you could help us (and thus everyone) by contributing a change to Browscap that makes for even lighter files :) That way, you'd be helping your boss, the project and everyone else in the process! If you need help contributing, all you need to do is ask! ;)

Thanks
James

Jeff Harkavy

Apr 2, 2015, 12:27:15 PM
to brow...@googlegroups.com
We use the lite ASP version, and while it's smaller than the others, version 6000 (6,580,550 bytes, 12-Mar-2015) is still 14x larger than version 5020 (446,038 bytes, 29-Jul-2013). The biggest problem (stating the obvious and preaching to the choir) is these darn browser manufacturers and their new(ish) rapid release cycle version wars. 43 versions of Chrome, 37 versions of Firefox, etc. Oy.

For our situation there's gobs of overkill even in the lite version. Our deployment is internal and not exposed to the world at large, so we don't care one bit about crawlers, spiders, bots, or many (most?) of the browsers. I looked into customizing the ini file a while back, but the organization of the file at the time caused me to accidentally nuke too much and I just reverted to the source for lack of time.

What would be helpful is some form of tool that can safely create a customized version of the ini file based on the original. Something like "I only care about Firefox, Chrome, and IE" or "Firefox versions 25-37, IE >=8, Chrome >=35 on Windows platforms, plus all Android or iOS platforms". Maybe in my copious spare time (sarcasm)...
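
One possible shape for such a tool is sketched below. It works on sections that have already been parsed into dictionaries, and assumes each section inherits from its ancestor through a `Parent` property and that the browser name lives in a `Browser` property, possibly inherited. The helpers `resolve` and `filter_sections` are hypothetical names, and version or platform constraints are left out.

```python
from typing import Dict, Optional, Set

Sections = Dict[str, Dict[str, str]]  # section name -> properties


def resolve(sections: Sections, name: Optional[str], key: str) -> Optional[str]:
    """Look up `key` on a section, walking up the Parent chain if needed."""
    seen: Set[str] = set()
    while name is not None and name not in seen:
        seen.add(name)  # guards against accidental Parent cycles
        props = sections.get(name, {})
        if key in props:
            return props[key]
        name = props.get("Parent")
    return None


def filter_sections(sections: Sections, wanted_browsers: Set[str]) -> Sections:
    """Keep sections for the wanted browsers, plus every ancestor they rely on."""
    keep: Set[str] = set()
    for name in sections:
        if resolve(sections, name, "Browser") in wanted_browsers:
            current: Optional[str] = name
            while current is not None and current in sections and current not in keep:
                keep.add(current)
                current = sections[current].get("Parent")
    # Preserve the original section order.
    return {name: props for name, props in sections.items() if name in keep}
```

Keeping the ancestors is the point of the sketch: dropping a parent section would leave its children without their inherited properties.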

James & crew - thanks for all you folks do to keep the project alive.

James Titcumb

Apr 8, 2015, 5:53:22 AM
to brow...@googlegroups.com
For all who are looking for a lighter "lite" version, I've been working a bit on one. It may not be a perfect solution, but it has dramatically reduced the size of the files.

You can review the change and also download example files to see how well it works for you from this PR: https://github.com/browscap/browscap/pull/614

Any comments/suggestions welcome, but please add them to the PR if they are relevant!

Thanks
James

Locke Hajo

Apr 10, 2015, 7:49:58 AM
to brow...@googlegroups.com
Hello,

Thanks for your work. The size looks really good.
We will do some tests in our server environment.

Thanks,
Hajo