Hi,
Thanks a lot for your responses. For some reason I was not subscribed to this thread and didn't get any email notifications, so I thought the post had gone unnoticed. Thanks for your feedback, and sorry for not responding sooner.
@Mike,
One question might be is it better to report many vulnerabilities through multiple paths. And thus over report, but provide accurate locations. Or to only report each vulnerable package once, on the assumption that once updated it will 'fix' all the vulnerabilities. What do you think of each approach? - The usability of how alerts are presented was not a goal of this study, so I don't have much to comment on that. In any case, however the results are grouped, I think there should be some way to see each detail. Since different vulnerabilities in the same package can be fixed in different versions, developers may want to assess each risk individually if for some reason they don't want to simply adopt the latest release.
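To make the point about grouping concrete, here is a minimal sketch of reporting each package once while still keeping the per-vulnerability detail. The package names, vulnerability IDs, and version numbers are all hypothetical placeholders, not data from the study or output from any real tool.

```python
from collections import defaultdict

# Hypothetical alerts: (package, vulnerability id, first fixed version).
alerts = [
    ("example-lib", "VULN-1", "1.2.0"),
    ("example-lib", "VULN-2", "1.4.0"),
    ("other-lib", "VULN-3", "2.0.1"),
]


def group_by_package(alerts):
    """Group alerts per package, keeping each vulnerability's own fixed version."""
    grouped = defaultdict(dict)
    for package, vuln_id, fixed_in in alerts:
        grouped[package][vuln_id] = fixed_in
    return dict(grouped)


grouped = group_by_package(alerts)
for package, vulns in grouped.items():
    # One entry per package, but each vulnerability stays individually visible.
    print(package, vulns)
```

With a layout like this, a developer who cannot move to the latest release can still see which intermediate version fixes which vulnerability.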
Also the commercial tools allow false positives to be 'marked' or 'accepted', so that they are excluded from future runs. This isn't something that free tools I've seen currently provide. I'd be interested to hear of any that do. - GitHub has an option to dismiss alerts for various reasons. Also, Jeremy pointed out a way to suppress them in OWASP DC.
Are there any plans to do a similar analysis of dynamic analysis tools? - Are you talking about DAST tools like a fuzzer, or SCA tools that incorporate dynamic analysis? If the former, evaluating DAST tools is not in my research plan, but I think quite a lot of work has already been done on dynamic analysis tools. If you want, I can try to find some references.
@Jeremy,
However, FP and FN are always an issue. - One thing we pointed out in this paper is how OWASP DC inflates alerts by reporting the same CVE for multiple related packages. This is probably due to its reliance on the CPE configurations listed in the CVE data. This discussion is presented in Section 5.1. As one example, CVE-2014-3625 was reported only for spring-webmvc by MSV, Snyk, Steady, and Commercial A. However, OWASP DC reported this CVE for five separate spring packages. I would appreciate your opinion on this matter as we're currently revising the paper.
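As a rough illustration of how this kind of inflation could be measured, the sketch below counts how many distinct packages each tool attaches to the same CVE. The tool names, CVE ID, and package names are hypothetical placeholders, not the actual data from the paper.

```python
from collections import defaultdict

# Hypothetical report rows: (tool, CVE id, package). All names are placeholders.
rows = [
    ("tool-a", "CVE-XXXX-0001", "pkg-core"),
    ("tool-b", "CVE-XXXX-0001", "pkg-core"),
    ("tool-b", "CVE-XXXX-0001", "pkg-web"),
    ("tool-b", "CVE-XXXX-0001", "pkg-beans"),
]


def packages_per_cve(rows):
    """Count the distinct packages each tool reports for each CVE."""
    seen = defaultdict(set)
    for tool, cve, package in rows:
        seen[(tool, cve)].add(package)
    return {key: len(packages) for key, packages in seen.items()}


counts = packages_per_cve(rows)
for (tool, cve), n in sorted(counts.items()):
    print(f"{tool} reports {cve} for {n} package(s)")
```

A count above one for a single CVE does not prove a false positive on its own, but it flags the cases worth checking by hand.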
Regarding false negatives, what do you think are possible reasons besides an incomplete database? Could there be dependencies used in a project that tools are not able to pick up? What do you think a good tool should do in that regard?
Also, as far as I can tell, OWASP DC does not maintain its own vulnerability database like Snyk or GitHub, but only incorporates data from third-party sources. Is that right?
@Steve,
There’s some great research in here. Kudos. - Thanks a lot. As a student, this is really encouraging for me.
I am not sure if I have fully understood your comment. Can you elaborate on the term "component identity"? Does it simply mean the list of components/dependencies (inventory)? In the study, we assume the tools do both things: 1) identify the components, and 2) report whether they contain any known vulnerabilities based on some vulnerability database. Our initial assumption was that most tools would output the same results (before any further advanced vulnerability analysis); however, we found that there are differences even in the basic cases.
you’ll also be able to account for newer, non-SCA tools in the mix, which don’t have to make educated guesses on component identity, - Can you elaborate on how we can incorporate non-SCA tools? Are you referring to the part where some tools perform reachability analysis, which can be done by non-SCA tools as well if the vulnerability report is already known?
I agree with the limitations on evaluating reachability through both static and dynamic analysis. Code analysis has many pitfalls, and we probably need further research to understand how acceptable such analysis is to developers when assessing the risk of vulnerabilities in dependencies.
The paper incorporates a lot of different types of tools, with varying methodologies, and analysis methods. As a potential follow-up idea, it may be really interesting to see how these differences benefit certain use cases or industries. - Thank you for the suggestion. Do you think interviewing developers about which analysis methods fit which use cases would be a good starting point for such research?
Thanks,
Nasif