libpsl v0.20.0 released

16 views
Skip to first unread message

Tim Rühsen

unread,
Feb 22, 2018, 4:31:28 AM2/22/18
to libps...@googlegroups.com
Hi,

this release comes with a new function flag PSL_TYPE_NO_STAR_RULE to be
used with psl_is_public_suffix2().
It disables the PSL 'prevailing star rule' and thus returns false for
TLDs that are not explicitly listed in the PSL.

The new flag makes only sense in combination with either PSL_TYPE_ANY,
PSL_TYPE_ICANN or PSL_TYPE_PRIVATE.

Thanks to Claudio Saavedra (libsoup), who was the driving force behind
this change.

22.02.2018 Release V0.20.0
* Remove hard-coded gcc flag in Makefile.am
* Prevent excessive CPU cycles on large inputs
* New flag PSL_TYPE_NO_STAR_RULE to skip star rule

Project Website
https://github.com/rockdaboot/libpsl

Online Documentation
https://rockdaboot.github.io/libpsl/

Getting the Source Code
git clone https://github.com/rockdaboot/libpsl

Source Code Test Coverage
https://coveralls.io/github/rockdaboot/libpsl?branch=master

Regards, Tim

signature.asc

Daniel Kahn Gillmor

unread,
Feb 22, 2018, 9:51:59 PM2/22/18
to Tim Rühsen, libps...@googlegroups.com
Hi Tim--

Thanks for this new version of libpsl!

On Thu 2018-02-22 10:31:24 +0100, Tim Rühsen wrote:
> this release comes with a new function flag PSL_TYPE_NO_STAR_RULE to be
> used with psl_is_public_suffix2().
> It disables the PSL 'prevailing star rule' and thus returns false for
> TLDs that are not explicitly listed in the PSL.
>
> The new flag makes only sense in combination with either PSL_TYPE_ANY,
> PSL_TYPE_ICANN or PSL_TYPE_PRIVATE.

Layering in this flag on the "type" argument makes the API tougher to
reason about than i was expecting. I'm trying to follow the logic in
the implementation, but i think i might be missing it.

there are tests in psl.c that compare type directly to PSL_TYPE_ICANN,
without necessarily stripping off any extra flag bits:

e.g.:

/* check for correct rule type */
if (type == PSL_TYPE_ICANN && !(rc & _PSL_FLAG_ICANN))
goto suffix_no;
else if (type == PSL_TYPE_PRIVATE && !(rc & _PSL_FLAG_PRIVATE))
goto suffix_no;

if (rc & _PSL_FLAG_EXCEPTION)
goto suffix_no;


I note that type hasn't been touched at this point (unless
suffix.nlabels == 1, which isn't always the case). These == comparisons
are guaranteed to fail if you treat type as a bitfield and more than one
bit is set :( Am i missing something?

Come to think of it, i don't know how this would work with PSL_TYPE_ANY
either -- can you explain that to me?

Maybe the test suite needs to be augmented to test this more rigorously.
I note that the live PSL has several entries with * rules with multiple
suffix labels, which might be useful to test against. here are three
that might be interesting test cases:

- *.nom.br
- *.sch.uk
- *.compute.amazonaws.com


> 22.02.2018 Release V0.20.0
> * Remove hard-coded gcc flag in Makefile.am
> * Prevent excessive CPU cycles on large inputs
> * New flag PSL_TYPE_NO_STAR_RULE to skip star rule

this release appears to update LIBPSL_SO_VERSION like so:

-AC_SUBST([LIBPSL_SO_VERSION], [7:0:2])
+AC_SUBST([LIBPSL_SO_VERSION], [8:0:2])

This results in libpsl.so.6, when before it was libpsl.so.5. but i
don't see how this makes sense. :/

Following the instructions in configure.ac for updating
current:revision:age i don't even see how it's possible to arrive at
8:0:2. It looks like you took steps 3 and 4, but missed steps 5 and 6:

5. If any interfaces have been added since the last public release, then increment age.
6. If any existing interfaces have been removed or changed since the last public release, then set age to 0.

I wish SONAME management was less confusing!

If you think that the inclusion of this flag warrants an SONAME bump
(i.e., steps 5 and 6 were triggered), then LIBPSL_SO_VERSION should be
8:0:0 (resulting in libpsl.so.8, and triggering a rebuild of all tools
against this new package).

But if you think the changes of this version do not warrant an SONAME
bump (i.e. only step 5 was triggered), then LIBPSL_SO_VERSION should be
8:0:3 (resulting in sticking with libpsl.so.5, and no recompilation
needed).

There's a caveat here -- some package could be built against a newer
version of libpsl.so.5, knowing about the flag, and then it could be
transferred to a machine with an older version of libpsl.so.5 that
*doesn't* know about the flag.

I don't think any older versions of libpsl check whether any of the
unknown bits in type are set (though as you can see above, i don't grok
the logic here very well), which means that older versions might
silently ignore this new flag. If another tool is depending on the flag
having the intended effect, they could get in trouble if they're
transferred to a machine with an older version of libpsl.so.5. But an
soname bump is a pretty heavyweight tool just for adding a flag.

Let me know what you think we should dohere. I'm reluctant to include
this version in debian without sorting out both of these concerns, but
if you've got a plan for making them make sense, i'd be happy to think
it through with you.

all the best,

--dkg

Tim Rühsen

unread,
Feb 23, 2018, 6:44:33 AM2/23/18
to Daniel Kahn Gillmor, libps...@googlegroups.com
Hi Daniel,

thanks for your thorough review, shame on me !

I relied very much on the test suite - I remembered it as pretty
complete, but it didn't catch the wrong code.

Improved the test suite to stumble over the bad code and fixed the psl.c
accordingly.


More comments follow inline...
You are right, the PSL_TYPE_NO_STAR_RULE flag must be reset before going
further in the code.

The == comparisons work out since 'type' either is equal to
PSL_TYPE_ICANN, PSL_TYPE_PRIVATE or (PSL_TYPE_ICANN|PSL_TYPE_PRIVATE).

We assume a rule to be either in the private or in the icann section,
not both. With these assumptions, the == comparison makes sense.

E.g.
if (type == PSL_TYPE_ICANN && !(rc & _PSL_FLAG_ICANN))
means
If the lookup is restricted to ICANN and the found rule is not in
ICANN section, we return 'not a public suffix'.


> Maybe the test suite needs to be augmented to test this more rigorously.
> I note that the live PSL has several entries with * rules with multiple
> suffix labels, which might be useful to test against. here are three
> that might be interesting test cases:
>
> - *.nom.br
> - *.sch.uk
> - *.compute.amazonaws.com

We, I added *.compute.amazonaws.com tests to test-is-public.c and
test-is-public-builtin.c without triggering the issue.

But test-is-public-all.c triggered on some exceptions when extending the
code. It should now test any combination of flags with all entries in
the PSL.

What we still don't have is a test for arbitrary domain that are *not*
in the PSL. They all should return 0 (except for TLDs when
PSL_TYPE_NO_STAR_RULE is not given).
Have to think about a practical algorithm to generate and test those.


>> 22.02.2018 Release V0.20.0
>> * Remove hard-coded gcc flag in Makefile.am
>> * Prevent excessive CPU cycles on large inputs
>> * New flag PSL_TYPE_NO_STAR_RULE to skip star rule
>
> this release appears to update LIBPSL_SO_VERSION like so:
>
> -AC_SUBST([LIBPSL_SO_VERSION], [7:0:2])
> +AC_SUBST([LIBPSL_SO_VERSION], [8:0:2])
>
> This results in libpsl.so.6, when before it was libpsl.so.5. but i
> don't see how this makes sense. :/
>
> Following the instructions in configure.ac for updating
> current:revision:age i don't even see how it's possible to arrive at
> 8:0:2. It looks like you took steps 3 and 4, but missed steps 5 and 6:
>
> 5. If any interfaces have been added since the last public release, then increment age.
> 6. If any existing interfaces have been removed or changed since the last public release, then set age to 0.
>
> I wish SONAME management was less confusing!

Me too :-(
I know we had this issue before and really tried to not mess it up
again. It was my intention to *not* bump up SONAME, lol.

> If you think that the inclusion of this flag warrants an SONAME bump
> (i.e., steps 5 and 6 were triggered), then LIBPSL_SO_VERSION should be
> 8:0:0 (resulting in libpsl.so.8, and triggering a rebuild of all tools
> against this new package).
>
> But if you think the changes of this version do not warrant an SONAME
> bump (i.e. only step 5 was triggered), then LIBPSL_SO_VERSION should be
> 8:0:3 (resulting in sticking with libpsl.so.5, and no recompilation
> needed).
>
> There's a caveat here -- some package could be built against a newer
> version of libpsl.so.5, knowing about the flag, and then it could be
> transferred to a machine with an older version of libpsl.so.5 that
> *doesn't* know about the flag.
>
> I don't think any older versions of libpsl check whether any of the
> unknown bits in type are set (though as you can see above, i don't grok
> the logic here very well), which means that older versions might
> silently ignore this new flag. If another tool is depending on the flag
> having the intended effect, they could get in trouble if they're
> transferred to a machine with an older version of libpsl.so.5. But an
> soname bump is a pretty heavyweight tool just for adding a flag.
>
> Let me know what you think we should dohere. I'm reluctant to include
> this version in debian without sorting out both of these concerns, but
> if you've got a plan for making them make sense, i'd be happy to think
> it through with you.

You are right with your above concerns... there *might* be cases where
someone gets burned (app using new flag linked with old library).

We had that before and we fixed it by releasing a version reverting the
SONAME bump. Not 100% perfect, and I am not sure how much this affects
real-life systems. What's your advice ?

Regards, Tim

signature.asc

Tim Rühsen

unread,
Feb 23, 2018, 6:57:11 AM2/23/18
to Daniel Kahn Gillmor, libps...@googlegroups.com
Regarding the SONAME bump...


# 4. If any interfaces have been added, removed, or changed since the
last update, increment current, and set revision to 0.


Well, adding or removing a flag *is* a change of the interface. Not
clear here if 'interface' means 'function'. From a logical point of
understanding it could be anything changing the behavior of the library
*or* the application. That means it could be flags, enum values,
functions, function params, defines, ...

An *automated* stable (packaging and runtime) behavior could only be
achieved if SONAME gets bumped in any of the above cases. If we revert
the bump, there will be chances to get an unstable behavior (as you
stated as well). Even if only one person is affected and even if that is
unlikely - why open up a hole ?


With Best Regards, Tim

signature.asc

Daniel Kahn Gillmor

unread,
Feb 23, 2018, 7:23:01 AM2/23/18
to Tim Rühsen, libps...@googlegroups.com
On Fri 2018-02-23 12:57:09 +0100, Tim Rühsen wrote:
> Well, adding or removing a flag *is* a change of the interface. Not
> clear here if 'interface' means 'function'. From a logical point of
> understanding it could be anything changing the behavior of the library
> *or* the application. That means it could be flags, enum values,
> functions, function params, defines, ...

agreed, the semantics are fuzzy here.

> An *automated* stable (packaging and runtime) behavior could only be
> achieved if SONAME gets bumped in any of the above cases. If we revert
> the bump, there will be chances to get an unstable behavior (as you
> stated as well). Even if only one person is affected and even if that is
> unlikely - why open up a hole ?

well, bumping the SONAME also means that it's a "library transition",
which means that everything which builds against libpsl. Gratuitous
library transitions make the buildd maintainers grumpy :/

I note that if all previous versions had returned something like
ENOTIMPL if they detected a flag that they didn't know about, an SONAME
bump *definitely* wouldn't be required (at the expense of the library
user needing to be able to handle an additional error case at runtime).

Anyway, i'm inclined to revert the SONAME bump for this one, and move it
back to 8:0:3 -- but if you add code to provide ENOTIMPL-ish semantics,
then i'd be inclined to consider that worth doing the SONAME bump for,
since it means we'll have a clear way to add new flags in the future.

what do you think?

--dkg

Tim Rühsen

unread,
Feb 23, 2018, 8:53:20 AM2/23/18
to Daniel Kahn Gillmor, libps...@googlegroups.com
On 02/23/2018 01:22 PM, Daniel Kahn Gillmor wrote:
> On Fri 2018-02-23 12:57:09 +0100, Tim Rühsen wrote:
>> Well, adding or removing a flag *is* a change of the interface. Not
>> clear here if 'interface' means 'function'. From a logical point of
>> understanding it could be anything changing the behavior of the library
>> *or* the application. That means it could be flags, enum values,
>> functions, function params, defines, ...
>
> agreed, the semantics are fuzzy here.
>
>> An *automated* stable (packaging and runtime) behavior could only be
>> achieved if SONAME gets bumped in any of the above cases. If we revert
>> the bump, there will be chances to get an unstable behavior (as you
>> stated as well). Even if only one person is affected and even if that is
>> unlikely - why open up a hole ?
>
> well, bumping the SONAME also means that it's a "library transition",
> which means that everything which builds against libpsl. Gratuitous
> library transitions make the buildd maintainers grumpy :/

'grumpy' because there is manual work involved ? In this case it
shouldn't be - since no app is using the new feature yet, everything
should work fully automatic and smoothly.

> I note that if all previous versions had returned something like
> ENOTIMPL if they detected a flag that they didn't know about, an SONAME
> bump *definitely* wouldn't be required (at the expense of the library
> user needing to be able to handle an additional error case at runtime).
>
> Anyway, i'm inclined to revert the SONAME bump for this one, and move it
> back to 8:0:3 -- but if you add code to provide ENOTIMPL-ish semantics,
> then i'd be inclined to consider that worth doing the SONAME bump for,
> since it means we'll have a clear way to add new flags in the future.

I agree with moving back to 8:0:3.
Prepared the release in branch 'prepare-0.20.1'. Please ACK if it looks
good to you.

Regards, Tim

signature.asc

Daniel Kahn Gillmor

unread,
Feb 23, 2018, 4:47:13 PM2/23/18
to Tim Rühsen, libps...@googlegroups.com
On Fri 2018-02-23 14:53:14 +0100, Tim Rühsen wrote:
> On 02/23/2018 01:22 PM, Daniel Kahn Gillmor wrote:
>> well, bumping the SONAME also means that it's a "library transition",
>> which means that everything which builds against libpsl. Gratuitous
>> library transitions make the buildd maintainers grumpy :/
>
> 'grumpy' because there is manual work involved ? In this case it
> shouldn't be - since no app is using the new feature yet, everything
> should work fully automatic and smoothly.

the concern isn't about "no app using the new feature", it's about
having to rebuild everything that depends on libpsl so that they can
link against libpsl8, otherwise they'll all break once libpsl5 drops out
of the archive.

some library transitions can happen automatically. others require a
little effort.

> I agree with moving back to 8:0:3.
> Prepared the release in branch 'prepare-0.20.1'. Please ACK if it looks
> good to you.

those changes look sensible to me thus far.

I'd appreciate a little bit more detail in the psl documentation itself
about the expected semantics of NO_STAR_RULE -- it apparently doesn't
mean "ignore all rules that have stars in them", right? But i don't see
where someone reading the documentation would learn that.

thanks for sorting this out!

--dkg
Reply all
Reply to author
Forward
0 new messages