handling changes to distro updaters

Paul Aldridge

Mar 23, 2023, 6:47:46 AM

to clair-dev

We recently updated Clair from v4.4.4 (with claircore v1.4.4) to v4.6.0 (with claircore v1.4.17) and have some questions about the distro changes.
The dynamic updater changes for ubuntu and debian also changed the updater name format from `debian-stretch-updater` to `debian/updater/stretch`. Is the intention for both to be active in the database, and which will be matched on? It looks like some vulns have been duplicated in the database but not all (see screenshot).

Basically, we're wondering whether the old-style updater content could or should be purged from the database, or whether it needs to be left. We also use the updater_status table to alert if any updaters fail to run over time, and it now rightly shows that the old-style updaters are no longer running, so removing the record of them would help there as well.

I think most of these changes are in claircore v1.4.5: https://github.com/quay/claircore/releases/tag/v1.4.5

One other thing I noticed was that support for ubi9 was introduced with the sqlite changes. I'm finding that new ubi9 images scan OK, but existing images that had already been scanned are not having their scans refreshed and are therefore not working. Is there something we can do to cause a rescan in situations like this?

[Attached image: Screenshot 2023-03-23 at 09.57.22.png]

Joseph Crosland

Mar 24, 2023, 11:16:13 AM

to clair-dev

Hi Paul, thanks for the message. Yes, the change to the updater names can cause duplicates, and the old-style updaters can be purged from the DB (I believe that if you delete the update_operations with the old updater names, the associations should be cleaned up during the next GC cycle).
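
As a minimal sketch of how the old-style names might be picked out before purging (for example from the names seen in updater_status), the snippet below classifies updater names by format. The two patterns are an assumption inferred only from the example names in this thread (`debian-stretch-updater` and `debian/updater/stretch`), and `classify_updater` and `stale_updaters` are hypothetical helpers, not part of Clair or claircore. It does not touch the database.

```python
import re

# Assumption: both formats are inferred from the two example names in this
# thread and are applied to debian and ubuntu only.
OLD_STYLE = re.compile(r"^(debian|ubuntu)-[a-z0-9.]+-updater$")
NEW_STYLE = re.compile(r"^(debian|ubuntu)/updater/[a-z0-9.]+$")


def classify_updater(name):
    """Return "old", "new", or "other" for an updater name."""
    if OLD_STYLE.match(name):
        return "old"
    if NEW_STYLE.match(name):
        return "new"
    return "other"


def stale_updaters(names):
    """Return the sorted, de-duplicated old-style names from an iterable."""
    return sorted(n for n in set(names) if classify_updater(n) == "old")


if __name__ == "__main__":
    seen = [
        "debian-stretch-updater",
        "debian/updater/stretch",
        "debian-jessie-updater",
    ]
    # Candidates to review before deleting their update operations.
    print(stale_updaters(seen))
```

Reviewing the resulting list by hand before deleting anything is probably wise, since the patterns above are a guess at the naming scheme.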

Regarding the manifests that fall into the abyss before support is added: I'm not familiar with your submission mechanism, but you can always use the delete manifest endpoint to delete any manifests you determine need to be re-indexed. It's worth noting that this problem is often eventually worked out if you are using the /index_state endpoint to determine when to rescan: when a scanner's version changes, the /index_state changes, and any resubmission will force a re-indexing with that scanner (note: only the updated scanners will be re-run).

Let me know if that makes sense,

Crozzy

Paul Aldridge

Mar 27, 2023, 11:40:44 AM

to clair-dev

Thanks for the reply, Crozzy, and for explaining. If we didn't purge the old entries from the database, would it match on both? It looks to me like some vulns are being duplicated by the new updaters, but others are not being added, I assume because they already exist from the old updater. For example, in the screenshot, jessie has the same vuln under both the old and new updater styles, but stretch only has the one. I'm also not seeing any vulns in our database with the new-style stretch updater, even though I can see update operations under that name. I think this is because the vulns it's inserting are already present under the old updater name, based on me finding those vulns when looking in the db.

Aha, that is interesting about index state, thank you! Am I understanding the flow correctly: after a scanner's version has changed, any manifest re-submission will cause a re-index (without having to separately check /index_state or do anything with it)? But we could also check the index_state endpoint to tell us that we need to re-submit (some?) manifests for a potential re-index?

Thanks

Paul

Paul Aldridge

May 5, 2023, 7:44:19 AM

to clair-dev

Hi Crozzy,

We’ve been looking a bit more at a re-indexing strategy. Previously we would index a manifest once and there’d be no mechanism to resubmit it.

From what you said above, am I correct that if we track the index_state and re-submit all manifests for indexing when it changes, they will be successfully re-indexed without having to use delete_manifest?

We are quite keen to have a way of knowing which index state each manifest was indexed at, so that we know whether all are up to date, how far through re-indexing we are, and can confirm that everything was successfully re-indexed. Is that something you’ve considered before or would be open to? We were thinking ideally it would be part of the index report so that it is persisted, but it could be returned as a header or similar from the indexer scan request. We can’t really use the index_state endpoint for this, as it would mean making a second request before/after the scan request, which might hit another instance of a Clair indexer, which could be on a different index state (e.g. part way through a deployment of a new version of Clair).
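
To illustrate the kind of bookkeeping meant here, the following is a rough sketch of computing re-index progress from a caller-side record of which index state each manifest was last indexed at. The mapping, the state strings, and the `reindex_progress` helper are all hypothetical; nothing here is returned by Clair today.

```python
from collections import Counter


def reindex_progress(recorded_states, current_state):
    """Summarise how many manifests were indexed at the current index state.

    recorded_states maps a manifest digest to the index state recorded when
    it was last indexed (None if it was never indexed successfully).
    """
    total = len(recorded_states)
    up_to_date = sum(1 for s in recorded_states.values() if s == current_state)
    return {
        "total": total,
        "up_to_date": up_to_date,
        "pending": total - up_to_date,
        # Count of manifests per recorded state, useful for spotting stragglers.
        "by_state": dict(Counter(recorded_states.values())),
    }


if __name__ == "__main__":
    # Hypothetical digests and state values, for illustration only.
    recorded = {"sha256:aaa": "state-1", "sha256:bbb": "state-2", "sha256:ccc": None}
    print(reindex_progress(recorded, "state-2"))
```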

It would also be really interesting to know your re-indexing approach to see if it’s something we can learn from.

Thanks,

Paul

Joseph Crosland

May 5, 2023, 2:46:44 PM

to clair-dev

Hi Paul, Clair itself purposely does not have a mechanism for re-indexing; we just surface the index_state. In Quay's case the process looks something like:

There is a table that holds indexing statuses (manifest, error/success, index_state)
When the security worker starts it grabs the /index_state and looks up all manifests that don't have that state
Quay submits those to Clair and updates the DB row with the new index_state (gleaned from the original call to /index_state)
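
The three steps above could be sketched roughly as follows. This is not Quay's actual code: the in-memory `table`, the `IndexStatus` record, and the injected `fetch_index_state` and `submit_manifest` callables are hypothetical stand-ins for the status table, the call to /index_state, and the submission to Clair. Recording the state even when a submission fails is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class IndexStatus:
    manifest: str
    success: bool
    index_state: Optional[str]  # state recorded at the last submission


def reindex_pass(
    table: Dict[str, IndexStatus],
    fetch_index_state: Callable[[], str],
    submit_manifest: Callable[[str], bool],
) -> List[str]:
    """Run one worker pass and return the manifests that were resubmitted."""
    # Fetch the state once, up front, and reuse it for every row updated below.
    current = fetch_index_state()
    # Look up every manifest whose recorded state differs from the current one.
    pending = [m for m, row in table.items() if row.index_state != current]
    for manifest in pending:
        ok = submit_manifest(manifest)
        table[manifest] = IndexStatus(manifest, ok, current)
    return pending


if __name__ == "__main__":
    # Hypothetical digests and state values, for illustration only.
    table = {
        "sha256:aaa": IndexStatus("sha256:aaa", True, "state-1"),
        "sha256:bbb": IndexStatus("sha256:bbb", True, "state-2"),
    }
    done = reindex_pass(table, lambda: "state-2", lambda manifest: True)
    print(done)
```

Because the state is read once at the start of the pass, a row can end up bound to a state that a different indexer instance did not actually use, which is the rollout case discussed in this thread.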

Pretty much how you described. I think it's a good idea to give the caller the index_state as part of the index_report request, to better bind the state to the report and avoid the "rollout corner case". I haven't looked, but I think getting the state in context wouldn't be difficult. The question I would have is: should it be a part of report creation and report retrieval? If that's a yes, then I think it would need to be a part of the index_report so that it is persisted. That would also give one the ability to see, from a Clair context, how consistent manifests are with the current state (although you might have to parse JSON, depending on how it'd be implemented).

Thanks,

Crozzy

Paul Aldridge

May 10, 2023, 4:34:09 AM

to clair-dev

Great info, thanks Crozzy!

Your process looks similar to what we had in mind: recording each manifest's index state in a table, and using that to trigger re-indexes.

I think it's most important to have the index_state returned from the report creation request, but I can see it would be nice to include the index_state at the time of report creation when an index report is retrieved. That would mean users wouldn't be responsible for separately storing manifests' index states, as we're currently doing to support re-indexing. So I think including it in the index_report makes good sense.

Is this something you're happy for us to start working on a change for? Shall I write this up in an issue?

As an aside, it might even be nice if indexers returned their current index state as part of any request, depending on how complex that is, just for more exposure. It could be logged, or compared against the state from index report retrieval to see if a re-index is needed.

Thanks,

Paul

Joseph Crosland

May 10, 2023, 5:15:08 PM

to clair-dev

I would like to get some more input, so I created a discussion here: https://github.com/quay/clair/discussions/1744 (feel free to add anything).

Paul Aldridge

May 11, 2023, 5:31:34 AM

to clair-dev

Nice one, thank you.
