Removing vendoring of libxml2


Mike Dalessio

Mar 29, 2016, 9:46:28 PM
to nokogi...@googlegroups.com, nokogiri-talk
Hello world,

TL;DR: I had a chat with Aaron last night, and we came to the conclusion that we'd like to try *not* vendoring libxml2 in the next minor release of Nokogiri.


Historically, we began vendoring libxml2 (and libxslt) when two things happened simultaneously:
  • libxml2 2.9.0 contained a bug[1] that broke much of Nokogiri's CSS parsing
  • homebrew started installing libxml2 2.9.0 by default
which led to all kinds of crazy.

As a response, we started shipping Nokogiri with a version of libxml2 that was known to work.


Since then, we've had mixed results. On the one hand, we've greatly improved the install experience for many people, we've been able to deliver modern libxml2 to many people on older or ill-maintained systems, and we've been able to focus on keeping Nokogiri support to a single version of libxml2 (rather than maintaining complex kludges to work around bugs in specific versions of the library).

On the other hand, we're redistributing something that should really be on people's systems, and thus taking on the responsibility of keeping Nokogiri's libxml2 up-to-date with patches.


I'd like to propose that we reverse the default value for NOKOGIRI_USE_SYSTEM_LIBRARIES, so that we'd use system libraries by default, and only if the variable was set to a falsey value (e.g., "0", "f", "F", "false", "FALSE", etc.) would we try to compile and use the vendored version of libxml2.
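To make the proposed default concrete, here is a minimal sketch of how the check might look. It is an illustration only, not Nokogiri's actual extconf.rb code: the helper name `use_system_libraries?`, the `FALSEY_VALUES` list, and the choice to treat an unset or empty variable as "use system libraries" are all assumptions.

```ruby
# Hypothetical sketch of the proposed default; not Nokogiri's real build code.
# Falsey spellings taken from the examples above, compared case-insensitively.
FALSEY_VALUES = %w[0 f false].freeze

# Returns true when the build should link against system libxml2/libxslt.
# `env` defaults to the process environment; any Hash-like object works.
def use_system_libraries?(env = ENV)
  value = env["NOKOGIRI_USE_SYSTEM_LIBRARIES"]
  # Assumption: unset or empty means "use the new default" (system libraries).
  return true if value.nil? || value.strip.empty?

  # Only an explicit falsey value opts back in to the vendored libraries.
  !FALSEY_VALUES.include?(value.strip.downcase)
end
```

Under these assumptions, `use_system_libraries?({})` selects the system libraries, while `use_system_libraries?("NOKOGIRI_USE_SYSTEM_LIBRARIES" => "FALSE")` selects the vendored copy.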


I'd love to hear everyone's thoughts, as I think this is simpler and perhaps also just better than the proposal[2] I made last year for Nokogiri to auto-detect whether its vendored version is better than the installed system version.


If there are no major objections, I'll widen this conversation up to the Internet At Large and make it a Github Issue RFC.

Cheers,
-m



Mike Dalessio

May 27, 2016, 12:46:00 PM
to nokogi...@googlegroups.com, nokogiri-talk
It's notable that nobody responded to this proposal.

Because there were no objections, I'm formally proposing to use system libraries by default in the next big release of Nokogiri, and remove vendored libraries altogether in a future release.

Should that release be 1.7.0, or should it be 2.0.0? I've been meaning to move towards semver, and this might be a good opportunity.

John Shahid

Jun 7, 2016, 1:11:51 PM
to nokogi...@googlegroups.com, nokogiri-talk
I think 1.7.0 makes more sense. There are no major API changes that justify a major version bump, and the gem can fall back to using vendored libraries if it fails to find the system libraries. From the user's point of view there are no major/breaking changes.


Aaron Patterson

Jun 7, 2016, 5:02:50 PM
to nokogi...@googlegroups.com, nokogiri-talk
+1.  It's up to you though Mike.

Mike Dalessio

Jun 7, 2016, 7:27:56 PM
to nokogi...@googlegroups.com, nokogiri-talk
... he replies, 12 hours after I declare 2.0.0 as "next" in the CHANGELOG and in Github Milestones. ;)

I'd like to make it 2.0.0, and perhaps use this as a rare opportunity to think about other things that might make sense to stop supporting. Some examples:

But I don't feel strongly either way at this point.

Dwayne M

Jun 20, 2016, 1:13:59 PM
to nokogiri-talk, nokogi...@googlegroups.com
I'm not sure this is the correct place to post this comment, but it seems close enough that I decided to continue.

I've recently tried to install nokogiri on a Windows 2012 Server. This node did not have internet access. I could not get it to install, and I tried the versions with the prebuilt DLLs too. Based on the errors in the logs, it seemed to be intent on compiling the libiconv-1.14 and zlib-1.2.8 libraries. I also found that neither of these source bundles was included in the .../ports/archives folder within the nokogiri GEM file, while the source bundles for libxml2-2.9.2 and libxslt-1.1.28 were.

The install process did not log an attempt to download either of the missing source bundles, though I see in some other threads that it logged activity when successfully downloading the bundles.  Since my server did not have internet access, it could not download the bundles.

So I unpacked the GEM, added the two source bundles, created a new gemspec from the old GEM, added the two new files, and repackaged the GEM. Installing the new GEM worked like a champ (though it did take a while to install).

I was using nokogiri-1.6.7.2, and this is the version that I repackaged.  I tried to install nokogiri-1.6.8, but it was also failing, even with the prebuilt DLLs.  But I did not examine the GEM file to see if it was also missing the two source bundles.

So, I think it would be valuable in a small way to continue to package the source bundles for the desired versions of those libraries within the nokogiri GEM file.

Mike Dalessio

Jun 20, 2016, 1:16:12 PM
to nokogiri-talk
Hi Dwayne,

Can you please open a Github issue for this? If we're not packaging these source tarballs, that's a bug/regression that we should fix.



Mike Dalessio

Jul 11, 2016, 7:33:24 AM
to nokogiri-talk
Just following up here: Dwayne kindly created this issue to track the problem:


Thank you, Dwayne!

John Smart

Jul 14, 2016, 8:16:11 PM
to nokogiri-talk, nokogi...@googlegroups.com
I love that the nokogiri install process is much, much easier now. The old process was tripping up our development team over and over again before the change, so I'd like to see vendoring continue to be a part of Nokogiri, with an option to use system libraries if desired. However, at runtime it appears to be using the system libraries, which causes a warning. Nokogiri should be consistent: if it was built using the included libxml2, then it should use it at runtime also.

William Entriken

Mar 26, 2017, 4:24:39 PM
to nokogiri-talk, nokogi...@googlegroups.com
Hi all,

I am just a Nokogiri user. This proposal makes lots of sense to me.

The RFC discussion at https://github.com/sparklemotion/nokogiri/issues/1220 very clearly describes a path forward to using system libxml2.

It is nice that Nokogiri can fall back on vendored libraries for exceptional circumstances. And maybe it would be better to set strict versioning requirements as a prerequisite for installing Nokogiri. But either way, the default and best practice should be to use system libraries if possible.

Maybe there aren't a lot of people here on the Google Group. But if you're here, any feedback YOU have is appreciated.

Regards,
Will