GSoC 2026 Interest: AI-Powered XPath Generator


Daksh Jain

Feb 21, 2026, 7:42:53 AM

to checkstyle-devel

Hi everyone,

I'm Daksh, a Checkstyle contributor since Hacktoberfest 2025. I've been following the XPath Generator project with a lot of interest and wanted to share where my head is at before writing my proposal.

I spent time going through the existing PoC, and I think I understand why a complete redesign makes sense: it's missing a validation layer, and the prompt structure is pretty basic. My thinking for the redesign is to use a RAG-based approach, where we build a knowledge base of known-correct XPath suppressions and AST patterns, so the LLM gets relevant examples as context instead of generating blindly. I expect this would noticeably improve precision, though I haven't measured it yet.
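
To make the idea a bit more concrete, here is a rough Java sketch of the two pieces I have in mind: retrieving the closest known examples for a violation, and rejecting malformed output before it is returned. Everything in it is an assumption for illustration. The names (XPathRagSketch, Example, retrieve, isSyntacticallyValid) are hypothetical, token-overlap (Jaccard) similarity stands in for a real embedding search, and the JDK's built-in XPath 1.0 compiler stands in for whatever engine the final design would validate against. It only checks syntax, not whether the expression selects the right AST node.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;

// Hypothetical sketch: not part of Checkstyle or the existing PoC.
class XPathRagSketch {

    /** One knowledge-base entry: a violation description and a known-good XPath. */
    static final class Example {
        final String description;
        final String xpath;

        Example(String description, String xpath) {
            this.description = description;
            this.xpath = xpath;
        }
    }

    /** Lower-cased word tokens of a text, with empty tokens dropped. */
    static Set<String> tokens(String text) {
        Set<String> result = new HashSet<>(Arrays.asList(text.toLowerCase().split("\\W+")));
        result.remove("");
        return result;
    }

    /** Jaccard similarity of the two token sets; a stand-in for embedding similarity. */
    static double similarity(String a, String b) {
        Set<String> left = tokens(a);
        Set<String> right = tokens(b);
        Set<String> union = new HashSet<>(left);
        union.addAll(right);
        if (union.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(left);
        intersection.retainAll(right);
        return (double) intersection.size() / union.size();
    }

    /** Returns up to k entries whose descriptions are most similar to the query. */
    static List<Example> retrieve(String query, List<Example> knowledgeBase, int k) {
        List<Example> sorted = new ArrayList<>(knowledgeBase);
        sorted.sort(Comparator
                .comparingDouble((Example e) -> similarity(query, e.description))
                .reversed());
        return new ArrayList<>(sorted.subList(0, Math.min(k, sorted.size())));
    }

    /** Validation layer, syntax only: true if the JDK XPath compiler accepts the expression. */
    static boolean isSyntacticallyValid(String xpath) {
        try {
            XPathFactory.newInstance().newXPath().compile(xpath);
            return true;
        } catch (XPathExpressionException e) {
            return false;
        }
    }
}
```

The retrieved examples would be placed into the prompt as context, and any generated expression that fails isSyntacticallyValid would be retried or rejected. A real validation layer would also need to run the expression against the file's AST and confirm it matches exactly the violating node.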

Two things I wanted to get clarity on before I go too deep:

Should the solution be Java-native, or is a Python LLM service with a Java wrapper acceptable? And does the RAG direction make sense to you, or do you have a different strategy for improving accuracy in mind?

Would love any feedback ASAP.

Thanks,

Daksh R Jain

GitHub: github.com/DakshRJain737
