On Mon, 16 Aug 2021, Josh Bressers wrote:
> On Mon, Aug 16, 2021 at 1:38 PM Ariadne Conill <ari...@dereferenced.org> wrote:
> GNU grep 3.7 was released over the weekend with this bugfix:
> This was a performance regression caused by a defect in the way patterns
> were being cached.
> For example, the command `: | grep -Ff <(seq 6400000 | tr 0-9 A-J)` was
> characterized as taking multiple days to complete with affected versions,
> instead of a few seconds as with non-affected versions.
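As a quick inline aside, the regression is easy to see with a sketch like the following (assumes GNU grep and coreutils; the report used 6,400,000 patterns, scaled down here so a fixed grep finishes almost instantly):

```shell
# Build a file of fixed-string patterns: seq emits 1..100000 and tr maps
# the digits 0-9 to the letters A-J, so every pattern is a short
# alphabetic string ("B", "C", ..., "BAAAAA").
patterns=$(mktemp)
seq 100000 | tr 0-9 A-J > "$patterns"

# Feed an empty stdin to grep -F with that pattern file. Affected
# versions (grep 3.5/3.6) spend pathological time in the pattern cache;
# grep 3.7 returns right away. grep exits 1 here since nothing matches.
time grep -Ff "$patterns" </dev/null || true
rm -f "$patterns"
```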
> This is obviously a security-impacting issue, because CGI scripts and
> whatnot which use grep can be abused to exhaust CPU on a host.
> But MITRE, as far as I understand their policies, is unlikely to issue a
> CVE for this bug, even though it needs to be handled like any other kind
> of security update.
> Should we track these kinds of bugs in UVI?
> I think yes. I would also compare this to the fuzzing findings from OSV. The legacy vulnerability databases are constrained by their human handlers, so they have to be overly selective; this is why
> UVI was created.
> The question becomes how to track and handle these. Ideally we don't want humans gatekeeping anything. The DWF project had a web form anyone could fill out to request an ID (it's still in the
> uvi-tools repo). I'm not against this approach, but I would prefer something more automated. For example, there are probably plenty of other commits in the grep repo:
> if we search for "performance", I see a few others that I think are comparable to this.
Right, exactly, we want to try to map out the entire software defect
ecosystem. Humans can enrich the data already ingested by adding
additional tags and metadata, such as labelling a record as a
security-impacting issue. And that, too, can be automated: if a
distribution releases a security update with UVI-2021-XXXXXXX as an
associated identifier, then obviously that record can be automatically
updated to be a security issue.
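That feedback loop could start as something as simple as scanning advisory text for identifiers. A hypothetical sketch (the UVI-YYYY-NNNNNNN format and the advisory line are made up for illustration):

```shell
# Pretend this line arrived from a distribution's security advisory feed.
advisory='grep security update for the pattern-cache DoS; see UVI-2021-1234567'

# Extract any UVI identifiers; each match would flip the corresponding
# record's metadata to security-impacting automatically.
printf '%s\n' "$advisory" | grep -Eo 'UVI-[0-9]{4}-[0-9]+'
# prints: UVI-2021-1234567
```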
> I've been thinking about ways to automate digging through git commits for certain keywords. Many projects would have different keywords: "performance" matters for grep, but it might not for a
> compiler. Another favorite example is this query:
> 3.6 million GitHub issues that mention "prototype pollution". A bit over 200K CVE IDs have been handed out so far. Even if only 10% of those 3.6 million issues are real problems, that's 360K, more
> IDs than we've ever issued up to this point.
> Of course how to consume all of this data is another problem, but we can't really figure that out before we have data :)
This also sounds really interesting.
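The keyword mining discussed above can be sketched in a few lines of shell. This is a hypothetical illustration: it builds a throwaway repository so it is self-contained, but against a real clone of grep you would only run the final `git log` line.

```shell
# Throwaway repository standing in for a real project checkout.
repo=$(mktemp -d)
git init -q "$repo"
git -C "$repo" -c user.name=uvi -c user.email=uvi@example.org \
    commit -q --allow-empty -m "dfa: fix performance regression in pattern caching"
git -C "$repo" -c user.name=uvi -c user.email=uvi@example.org \
    commit -q --allow-empty -m "doc: clarify wording in the manual"

# The mining step itself: a case-insensitive keyword search over commit
# messages; each hit is a candidate record for the database.
git -C "$repo" log --oneline -i --grep='performance'
```

Per-project keyword lists could then be a small config file checked alongside each ingested repository.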