Hello SymPy community,
I am preparing a GSoC proposal for the project “Code Generation: Efficient Jacobian and Hessian Evaluation for Optimization and ODE Integration” from the ideas page, and I would really appreciate feedback on whether my current direction is reasonable.
I have been studying the previous work around this idea, especially the earlier PR #25801, PR #26773 (_forward_jacobian), and the mailing-list discussion from last year about Jacobian/Hessian code generation for ODE/optimization.
My current understanding is:
1. _forward_jacobian already provides a faster Jacobian construction path for large systems, but it is still an internal building block rather than a complete user-facing codegen pipeline.
2. The larger remaining gap seems to be integrating this kind of derivative computation with lambdify / codegen workflows in a way that avoids unnecessary expansion and repeated work.
3. For this project, it probably makes more sense to solve the dense Jacobian/Hessian workflow first before thinking about sparse structure or larger backend changes.
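To make the dense baseline concrete, here is a minimal sketch of the workflow as I understand it today. It uses only the public Matrix.jacobian and lambdify(..., cse=True) APIs, not the internal _forward_jacobian helper; the toy system and the names f, J, jac are my own illustration.

```python
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")

# Toy 2x2 system (illustrative only); sin(x*y) and exp(x*y) share the
# subexpression x*y, which CSE can factor out.
f = sp.Matrix([sp.sin(x * y) + x**2, sp.exp(x * y) * y])

# Dense symbolic Jacobian via the public API.
J = f.jacobian([x, y])

# cse=True asks lambdify to run common-subexpression elimination
# on the expression before generating the NumPy function.
jac = sp.lambdify((x, y), J, modules="numpy", cse=True)

val = jac(0.5, 2.0)  # 2x2 ndarray
```

As I understand it, the Jacobian here is fully built as a symbolic matrix before CSE runs, which is the step the CSE-based derivative path is meant to avoid for large systems.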
Based on that, my current proposal idea is roughly:
1. Build on the work from #26773 and extend the CSE-based derivative workflow from first derivatives to second derivatives, so that Hessians can also be constructed from the CSE representation.
2. Provide SciPy-friendly numerical outputs for optimization and ODE workflows, i.e. functions that return Jacobians/Hessians in ndarray-compatible form.
3. Investigate whether a symjit-style execution backend could be used after the symbolic/CSE derivative planning stage to accelerate numerical evaluation.
I am currently thinking of this as an optional backend direction rather than the core deliverable.
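As a rough sketch of what I mean by SciPy-friendly outputs in item 2, the snippet below builds a gradient and Hessian with the existing public API (sympy.hessian and lambdify with cse=True) and wraps them so they accept a 1-D parameter vector. The names grad and hess and the Rosenbrock test function are hypothetical choices for illustration; the proposed CSE-based Hessian construction would replace the symbolic hessian call, not the wrapper shape.

```python
import numpy as np
import sympy as sp

x, y = sp.symbols("x y")
xs = (x, y)

# Rosenbrock function, used here only as a familiar test objective.
obj = (1 - x)**2 + 100 * (y - x**2)**2

grad_expr = sp.Matrix([obj]).jacobian(xs)  # 1 x n row matrix
hess_expr = sp.hessian(obj, xs)            # n x n matrix

# Nested argument list: the generated functions take one vector argument.
_grad = sp.lambdify([xs], grad_expr, modules="numpy", cse=True)
_hess = sp.lambdify([xs], hess_expr, modules="numpy", cse=True)

def grad(p):
    """Gradient as a 1-D float array of shape (n,)."""
    return np.asarray(_grad(p), dtype=float).ravel()

def hess(p):
    """Hessian as a 2-D float array of shape (n, n)."""
    return np.asarray(_hess(p), dtype=float)

p0 = np.array([1.0, 1.0])
g0 = grad(p0)
H0 = hess(p0)
```

Callables with these shapes should be usable as the jac and hess arguments of scipy.optimize.minimize (for example with method="trust-ncg"), though I have kept SciPy out of the sketch to keep it self-contained.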
My main questions are:
1. Does this overall scope sound reasonable, or should I narrow the proposal further and focus first on the dense Jacobian + codegen/lambdify path?
2. Is extending the CSE-based workflow from Jacobian to Hessian the right architectural direction, or would maintainers prefer a different approach?
3. Is it reasonable to build this proposal around cse(), or are there important limitations / edge cases that I should explicitly account for?
Thank you very much for your time and feedback.
Evan