Hi Aman,
My original philosophy of testing was that testbenches are very "expensive" to maintain, so having a few testbenches with a ton of tests is the right approach. This is why one testbench each was originally allocated for FE, BE, ME, and TOP. Over time, I've softened on this: unit tests are valuable for raising code coverage, while integration tests work better as a nightly regression. I still believe that building testbenches along well-defined interfaces is the best way to avoid accidental coupling.
My understanding of LLM capabilities here has changed over the last year. A year ago, I would have said they're unlikely to actually speed up the debug process. Now, I believe they may be useful in two distinct ways:
- Spinning up an isolated testbench for a specific issue. I have heard reports of agents being presented with a bug, generating a testbench, and reading dumps to find the root cause. This seems like a great use case, since the code is thrown away after the bug is identified.
- Generating scenarios for existing UVM-style testbenches. This is the direction I would like to explore after the GSoC period. Once the testbenches are solid and industry-standard, I believe LLMs could generate code specifically for exposing certain behaviors, enhancing coverage, etc.
I still don't trust an LLM to generate the testbench itself. And since GSoC is primarily for learning, I expect the student to write it by hand.
> From your experience — is manual testbench writing actually the bottleneck, or is the harder problem somewhere else in the verification flow?
So to answer this more directly: testbench maintenance and manipulation are the current bottlenecks. Populating scenarios efficiently, in a way that exposes the most coverage, remains a difficult and unsolved problem.
Good questions, and I don't claim to be the authority on these issues so I welcome being proved wrong by useful tools!
Best,
-Dan