Hi,
My name is Lavanaya. I am a second-year B.Tech student at IIT Kanpur.
I have spent the last few weeks digging into the SymPy codebase to get familiar with the workflow, and I have been working on the following PRs:
PR #29338: Cleaning up matrix delimiter logic and adding type hints in sympy/parsing/latex/__init__.py.
PR #28806: Adding strict type hints to dense matrix rotation functions in sympy/matrices/dense.py.
Through my work on the LaTeX parsing module, I came across Tirthankar’s 2023 GSoC report and the master tracking issue (#25365) for the Lark LaTeX parser. I would love to pick up where that work left off as a medium-sized GSoC project, and I wanted to start a discussion with the community and potential mentors to check whether my thinking aligns with the project's current goals.
To keep the scope of the project realistic, I was thinking of focusing on the two major challenges mentioned in the "Future Work" section:
Ambiguous Expressions (Issue #25482): Differentiating multiplication from function calls (e.g., f(x) vs f * x).
Context-Sensitive Differentials (Issue #23551): Properly parsing dx within integrals without confusing d and x for standard variables.
Before I start drafting a formal proposal, I would love to get your thoughts on a couple of things:
Do these issues feel like the right priorities to help us eventually deprecate the ANTLR backend?
For the context-sensitive differentials, Tirthankar mentioned that standard CFGs struggle here. Is there a general preference in the community for a "two-pass" parsing approach (e.g., walking the AST post-parse to bind d and x inside integrals), or should we try to resolve this directly within the Lark grammar or Transformer?
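To make the "two-pass" option concrete, here is a rough sketch of what I have in mind. It is plain Python on a toy AST, not Lark or SymPy code: the node classes (Sym, Mul, Differential, Integral) and the function bind_differentials are hypothetical names I made up for illustration, and I am assuming the first pass has already parsed d and x as ordinary symbols.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical mini-AST for illustration only; these are not SymPy or Lark classes.
@dataclass(frozen=True)
class Sym:
    name: str

@dataclass(frozen=True)
class Mul:
    args: tuple

@dataclass(frozen=True)
class Differential:
    var: str

@dataclass(frozen=True)
class Integral:
    body: "Node"

Node = Union[Sym, Mul, Differential, Integral]


def bind_differentials(node, inside_integral=False):
    """Second pass: inside an Integral, rewrite adjacent factors d, x as Differential(x)."""
    if isinstance(node, Integral):
        # Everything below this node is in "integral context".
        return Integral(bind_differentials(node.body, True))
    if isinstance(node, Mul):
        args = [bind_differentials(a, inside_integral) for a in node.args]
        if not inside_integral:
            # Outside an integral, d and x stay ordinary symbols.
            return Mul(tuple(args))
        out = []
        i = 0
        while i < len(args):
            cur = args[i]
            nxt = args[i + 1] if i + 1 < len(args) else None
            if cur == Sym("d") and isinstance(nxt, Sym):
                # Merge the pair (d, x) into a single differential node.
                out.append(Differential(nxt.name))
                i += 2
            else:
                out.append(cur)
                i += 1
        return Mul(tuple(out))
    return node


# Assumed first-pass output for "\int x dx": d and x are plain symbols.
first_pass = Integral(Mul((Sym("x"), Sym("d"), Sym("x"))))
second_pass = bind_differentials(first_pass)
```

The same d, x pair outside an Integral node is left untouched, which is the context sensitivity a plain CFG cannot express. I have not yet checked how cleanly this would map onto the existing Lark Transformer, so please treat it as a starting point for discussion.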
I would really appreciate any guidance or thoughts you have.
Best regards,