I'm working on getting node.js running on mobile platforms, and one of the questions I was curious about is: how common is it for node packages to use node-gyp during install?
Figuring this out is a bit tricky because, as near as I can tell, npm install has a semi-magic behavior: if it finds a binding.gyp in the package root, it automatically puts in a preinstall command to compile with node-gyp. But I haven't figured out a way to determine whether a package has a binding.gyp in its root without actually downloading the package. That's a challenge because there are literally terabytes of packages and I really don't want to download them all.
So I am using a heuristic. I was hoping folks on this list might take a gander at the heuristic and the numbers it produced and let me know if this all sounds right.
The code uses PouchDB to sync with the NPM registry, then cleans up the data and looks for packages it identifies as directly using node-gyp. The heuristic is defined in lib/describeGypUsage.js/isGypRoot(). It flags an entry if the entry has the gypfile flag set, if it has an install script, or if it has an explicit dependency on node-gyp or on NAN. I don't look for OS or CPU dependencies because I'm interested specifically in node-gyp usage during install. It's fine if a package has a binary that is provided some other way.
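To make the heuristic concrete, here is a minimal sketch of what a check like this could look like. It is not the actual isGypRoot() code; the input shape (a single version's package.json-style metadata with gypfile, scripts and dependencies fields) and the choice to look only at scripts.install and at dependencies are assumptions made for illustration.

```javascript
// Hypothetical sketch of the heuristic, NOT the real lib/describeGypUsage.js.
// Assumes `versionDoc` is package.json-style metadata for one package version.
function isGypRoot(versionDoc) {
  const scripts = versionDoc.scripts || {};
  const deps = versionDoc.dependencies || {};
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

  // Flag set in the registry metadata for packages with a binding.gyp.
  const hasGypfileFlag = versionDoc.gypfile === true;
  // Any install script at all is flagged (deliberately aggressive).
  const hasInstallScript = typeof scripts.install === "string";
  // Explicit dependency on node-gyp or on NAN (published as "nan").
  const dependsOnGypTooling = has(deps, "node-gyp") || has(deps, "nan");

  return hasGypfileFlag || hasInstallScript || dependsOnGypTooling;
}

console.log(isGypRoot({ name: "plain", dependencies: { lodash: "^3.0.0" } }));
console.log(isGypRoot({ name: "native", gypfile: true }));
```

Any one signal is enough to flag a package, which is why the check errs on the side of over-counting.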
Note that this approach should catch anyone using node-pre-gyp since that requires an install script and we flag everything with an install script. Strictly speaking I probably should check if the "--fallback-to-build" flag is being used, since if it's not then technically node-gyp isn't being used on install.
The heuristic I'm using is clearly too aggressive (it will flag some packages that never actually run node-gyp), but given how few results it finds even so, that doesn't worry me.
When I ran the program on 2/18 I got:
Total number of Packages: 126,535
Total number of node-gyp packages: 2,006
Total number of packages that use node-gyp themselves or anywhere in their dependency tree: 16,322
So right off the bat only 13% or so of packages have node-gyp anywhere in their dependency tree.
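For anyone wondering how the "anywhere in their dependency tree" count can be derived from the directly flagged packages, one way is to walk the dependency graph backwards from the flagged set. The sketch below is only an illustration of that idea and not necessarily what my program does; the function name and the depsByName input (package name mapped to its direct dependency names) are hypothetical.

```javascript
// Hypothetical sketch: given direct dependencies per package and the set of
// packages flagged as using node-gyp, find every package that has a flagged
// package anywhere in its dependency tree (including the flagged ones).
function findGypAffected(depsByName, gypRoots) {
  // Build reverse edges: dependency name -> packages that depend on it.
  const dependents = new Map();
  for (const [name, deps] of Object.entries(depsByName)) {
    for (const dep of deps) {
      if (!dependents.has(dep)) dependents.set(dep, []);
      dependents.get(dep).push(name);
    }
  }

  // Breadth-first walk upward from the flagged packages. The `affected`
  // set doubles as the visited set, so dependency cycles terminate.
  const affected = new Set(gypRoots);
  const queue = [...gypRoots];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const parent of dependents.get(current) || []) {
      if (!affected.has(parent)) {
        affected.add(parent);
        queue.push(parent);
      }
    }
  }
  return affected;
}

const exampleDeps = { app: ["lib"], lib: ["native"], native: [], other: [] };
console.log([...findGypAffected(exampleDeps, ["native"])].sort());
```

In the toy graph, app and lib are counted along with native, while other is not.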
Total number of NPM Package Downloads in January 2015: 860,888,414
Total number of downloads of packages that use node-gyp themselves or anywhere in their dependency tree in January 2015: 20,950,038
So roughly 2.4% of all downloads in January 2015 involved node-gyp anywhere in their dependency tree.
This argues that node-gyp usage during install is really rare.
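For reference, the two percentages come straight from dividing the counts listed above. The helper name here is just for illustration.

```javascript
// Recompute the two shares from the counts reported in this message.
function percent(part, total) {
  return (100 * part / total).toFixed(1) + "%";
}

// Packages with node-gyp anywhere in their dependency tree / all packages.
const packageShare = percent(16322, 126535);
// January 2015 downloads of those packages / all January 2015 downloads.
const downloadShare = percent(20950038, 860888414);

console.log(packageShare);  // 12.9%
console.log(downloadShare); // 2.4%
```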
That said, I don't really trust the download data from NPM. I already found what appear to be anomalies, such as dependencies having fewer downloads than their dependents. I filed a bug on this [1] but nothing came of it. My guess is that I just don't understand what the NPM download stats are actually counting.
In any case, I was wondering what people thought. Do the numbers seem reasonable?
Thanks,
Yaron
P.S. According to the program, the 10 most popular node-gyp related packages, in order of popularity, are ws, socket.io, socket.io-client, chokidar, npm, phantomjs, karma, fsevents, mongodb and kerberos.
I realized I needed a TL;DR. See subject. 
Yaron