On Tuesday 11 August 2015 23:21:31 Daniel Kahn Gillmor wrote:
> hi there--
>
> the only function that uses a non-const psl_ctx_t* is psl_free. It
> knows to avoid trying to actually free() the builtin psl ctx.
>
> I'm wondering if we shouldn't just return a regular (non-const)
> psl_ctx_t* from psl_builtin -- it would make it easier to initialize a
> variable and then forget about how the caller initialized it -- just
> always free the thing.
As you write above, just fire psl_free() and forget.
If you want to avoid a compiler warning you'll have to use a cast like
psl_free((psl_ctx_t *)psl).
You can argue back and forth if to use const or not , in the end it is a
matter of taste (in this special case).
But just think of future extensions like e.g. psl_add_rule(psl_ctx_t *psl,
const char *rule). You simply can't do that with a builtin psl context. If we
change psl_builtin() to return non-const, you won't get a compiler
warning/error when calling psl_add_rule() - you'll see the problem only at
runtime. Staying with const allows the compiler to detect the problem at
compile time, which I prefer.
This is maybe not just a theoretical case, as I heard that Google uses their
own 'PSL' list builtin into Chrome/Chromium. We could have a function to merge
Mozillas PSL and Google's !?
> Alternately, i was considering a new function psl_builtin_or_update()
> which would look for the builtin filename in the current filesystem,
> compare its mtime with the builtin mtime, and either build a new
> psl_ctx_t and return it, or return the psl_builtin. this would need to
> be a non-const return value, i think.
Having builtin data requires regular recompilation of libpsl (whenever the PSL
becomes updated). That's how we designed it from the beginning.
Having the PSL data as git submodule since 0.8.0 makes it very easy to do
exactly this automatically.
AFAIR, for Debian there is no rule like 'if package x becomes rebuilt, also
rebuild package y'. If there is, libpsl could be automatically rebuilt each
time the package publicsuffix changes. Did you start a request for such ? (I
wouldn't even know where to ask.)
Maybe we could merge the publicsuffix and libpsl into one source package ?
To avoid the builtin / non-built conflict for a function like
psl_builtin_or_update(), we could instead have something like
if (psl_builtin_outdated())
psl = psl_load_file(psl_builtin_filename());
// fallback if something goes wrong
if (!psl)
psl = psl_builtin();
> another couple thoughts:
>
> --------------
> what if passing NULL to any of the functions that require a psl_ctx_t
> would automatically use the psl_builtin?
>
> that would turn:
>
> psl_is_public_suffix(psl_builtin(), domain);
>
> into:
>
> psl_is_public_suffix(NULL, domain);
Basically a good idea... but i have to check if builtin data exists before I
call psl_is_public_suffix(), else I possibly get a misleading 0 as return
value.
These kind of simplifications tend to provoke erroneous code.
> --------------
>
> Also: what if we had a non-threadsafe function like the proposed
> psl_builtin_or_update() that initialized a single new builtin psl object
> at runtime on the basis of the local filesystem copy of the builtin file
> being updated?
It wouldn't create or return a new builtin psl object. It either returns the
builtin psl or a new dynamically created. That makes such a function thread-
safe.
> --------------
>
> I'm just thinking here trying to streamline the use case of the common
> user who just wants to use the latest system copy, if we can't keep the
> the C library updated with the same frequency as the baseline package.
>
> I'm not sure about the tradeoffs for runtime efficiency
> vs. packaging/distribution efficiency.
>
> any thoughts?
Thanks for your ideas.
I also think, something must be changed to pass the frequent PSL changes to
applications.
Aggregated from above
1.
if (psl_builtin_outdated())
psl = psl_load_file(psl_builtin_filename());
We could add some code examples and/or 'best practice' into the docs.
2.
Maybe we could merge the publicsuffix and libpsl into one source package ?
That's for Debian only, I guess. That of course only makes sense if the PSL
data is not updated too often (once a week seems ok, daily updates makes it
inefficient).
3. (new point)
Just in theory. A web service, like e.g. OCSP responders (for certificate
revocation checking). :-) Maybe the DNS system would be able to give such
information ?
Best Regards, Tim