Parsing/Codegen GSoC Project


Tirthankar Mazumder

Mar 28, 2023, 12:59:28 PM
to sympy
Hello everyone, my name is Tirthankar Mazumder and I am a third-year undergraduate mathematics student at the Indian Institute of Technology, Bombay. This year, I would like to do a GSoC project in either the parsing or codegen submodules of SymPy.

I've looked at the project ideas for both submodules, and also the previous work done in those areas. I saw that there was an early GSoC project in 2010 by Øyvind Jensen which essentially gave birth to the autowrap stuff in the sympy.codegen submodule.

Along with this, there were two more recent projects in 2015 and 2019 respectively, by Ankit Pandey and Nikhil Maan. Ankit added functionality for generating Fortran code from the equivalent SymPy code using LFortran, and also did some work on optimizing certain matrix codegen operations. Nikhil set up the framework for, and wrote most of, the current C and Fortran parsers.

With this prior work in mind, I want to ask the community about three things:
1) Is there a mentor for this project, or someone willing to mentor for this project? I know that last year, Anurag had a very strong proposal which was unfortunately not accepted as a GSoC project due to a lack of mentors.
2) Even though I have used SymPy over the past two years for some assignments and other projects, I have not used much of SymPy's codegen or parsing capabilities beyond lambdify. What kind of use cases are these submodules usually used for, and what kind of improvements (bug fixes, feature requests, more documentation, etc.) would the community like to see in these areas?
3) What are some possible GSoC projects I could do for these two submodules? (Note that while I am asking for information about both submodules, I am fine with contributing to just one of them.)

The GSoC ideas page mentions that a lot of work has been done in these submodules since the previous GSoC projects, and recommends asking about their current state.

parsing:
In the C parser, there are currently a few bugs (one of which I have been working on), and the transform_string_literal node code is left as a TODO. One thing I could do as part of a parsing-related GSoC project is fix these issues.

Along with that, I could also perhaps add another language to the parsing submodule (like, say, Rust/Julia/C++), but is there a demand for that? Would it be worth the developer and maintainer effort to do something like that? Are there other parsing related projects I could do?

codegen:
While I admit that I didn't poke around in the codegen submodule too much (and hence am not too familiar with the current state of affairs), the GSoC ideas page mentions that adding support for more fnodes for Fortran, adding support for OpenMP directives, and looking into optimized matrix codegen calls could be a solid GSoC project. What are your thoughts on that?

Alan Bromborsky

Mar 28, 2023, 3:02:42 PM
to sy...@googlegroups.com

If you are interested in codegen for different languages you might look at Asymptote Code as a target -

https://asymptote.sourceforge.io/

The coding language is close to C/C++.  Look at the galleries to see what you can plot (output can be eps, pdf, webgl, html, etc.).  Here is my favorite -

https://asymptote.sourceforge.io/gallery/3Dwebgl/Klein.html

You can rotate, zoom, and pan the image with your mouse.  Here is the wiki page -

https://en.wikipedia.org/wiki/Asymptote_(vector_graphics_language)

--
You received this message because you are subscribed to the Google Groups "sympy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to sympy+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/sympy/e6fe81be-fb43-4ffd-b189-c6682e31b217n%40googlegroups.com.

Aaron Meurer

Mar 28, 2023, 5:46:22 PM
to sy...@googlegroups.com
There's been a lot of progress on the codegen module. I think you missed this GSoC project which created it: https://github.com/sympy/sympy/wiki/GSoC-2017-Report-Bj%C3%B6rn-Dahlgren:-Improved-code-generation-facilities. We also currently have Sam Brockie working on codegen (as well as physics.mechanics) as part of our CZI grant. The good news though is that there is plenty to do, so having multiple people working on this is definitely welcome. It might be a good idea to sync up with Sam if you are interested in working on codegen to see what a good project might be.

If you're interested in *any* parsing related project, I'd say the highest priority project relating to parsing right now is to work on the LaTeX parser, which is by far the most popular parsing tool in SymPy. We need to rewrite it so that it uses a much lighter dependency than antlr. Lark has been suggested, but we are open to other ideas as well. I would also like to see it be possible for users to extend the parser at runtime, which is currently impossible with the antlr parser. General improvements to its parsing capabilities are needed as well (search the issue tracker for issues related to parse_latex).
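To make the "extend the parser at runtime" idea concrete, here is a minimal sketch in plain Python. It is not how parse_latex works today and it is not a Lark grammar: a small hand-written recursive-descent parser is swapped in only to keep the example dependency-free. The names TinyLatexParser and register are hypothetical, the grammar covers only a toy subset of LaTeX, and the output is nested tuples rather than SymPy expressions.

```python
import re

# Hypothetical sketch: TinyLatexParser and register() are illustrative
# names only and are not part of SymPy, antlr, or Lark.
# Characters that match no alternative (e.g. spaces) are silently skipped.
TOKEN = re.compile(r"\\[A-Za-z]+|[A-Za-z]|\d+|[{}+\-]")


class TinyLatexParser:
    def __init__(self):
        # Maps a command such as "\frac" to (number of {..} args, builder).
        self.commands = {}
        self.register(r"\frac", 2, lambda num, den: ("div", num, den))
        self.register(r"\sqrt", 1, lambda arg: ("sqrt", arg))

    def register(self, name, arity, builder):
        """Add or override a command while the program is running."""
        self.commands[name] = (arity, builder)

    def parse(self, text):
        self.tokens = TOKEN.findall(text)
        self.pos = 0
        tree = self._sum()
        if self.pos != len(self.tokens):
            raise ValueError(f"unexpected token {self.tokens[self.pos]!r}")
        return tree

    def _peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def _next(self):
        tok = self._peek()
        if tok is None:
            raise ValueError("unexpected end of input")
        self.pos += 1
        return tok

    def _sum(self):
        # sum := atom (("+" | "-") atom)*
        node = self._atom()
        while self._peek() in ("+", "-"):
            op = "add" if self._next() == "+" else "sub"
            node = (op, node, self._atom())
        return node

    def _group(self):
        # group := "{" sum "}"
        if self._next() != "{":
            raise ValueError("expected '{'")
        node = self._sum()
        if self._next() != "}":
            raise ValueError("expected '}'")
        return node

    def _atom(self):
        tok = self._next()
        if tok in self.commands:
            # Look the command up at parse time, so later register()
            # calls take effect without rebuilding the parser.
            arity, builder = self.commands[tok]
            return builder(*[self._group() for _ in range(arity)])
        if tok.isdigit():
            return int(tok)
        if tok.isalpha():
            return tok
        raise ValueError(f"unexpected token {tok!r}")


parser = TinyLatexParser()
print(parser.parse(r"\frac{x + 1}{\sqrt{y}}"))
# ('div', ('add', 'x', 1), ('sqrt', 'y'))

# A user teaches the running parser a new command.
parser.register(r"\binom", 2, lambda n, k: ("binomial", n, k))
print(parser.parse(r"\binom{n}{2}"))
# ('binomial', 'n', 2)
```

The point of the sketch is the lookup table: because commands are resolved from a dictionary at parse time, a user can add one without regenerating any parser code, which is the property that is missing with a generated antlr parser.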

I would try to pick one or the other, parsing or codegen. While they can be related, they are disjoint enough that they shouldn't be mashed together into a single project proposal.

You may also be interested in lfortran, which is a different GSoC organization (https://summerofcode.withgoogle.com/programs/2023/organizations/fortran-lang), but which has close ties to SymPy (Ondřej Čertík, the creator of lfortran, was also the original creator of SymPy).

Aaron Meurer

Tirthankar Mazumder

Mar 30, 2023, 4:03:29 AM
to sympy
On Wednesday, March 29, 2023 at 3:16:22 AM UTC+5:30 asme...@gmail.com wrote:
> There's been a lot of progress on the codegen module. I think you missed this GSoC project which created it: https://github.com/sympy/sympy/wiki/GSoC-2017-Report-Bj%C3%B6rn-Dahlgren:-Improved-code-generation-facilities. We also currently have Sam Brockie working on codegen (as well as physics.mechanics) as part of our CZI grant. The good news though is that there is plenty to do, so having multiple people working on this is definitely welcome. It might be a good idea to sync up with Sam if you are interested in working on codegen to see what a good project might be.

Ah, you're right, I did miss that 2017 GSoC project 😅

> If you're interested in *any* parsing related project, I'd say the highest priority project relating to parsing right now is to work on the LaTeX parser, which is by far the most popular parsing tool in SymPy. We need to rewrite it so that it uses a much lighter dependency than antlr. Lark has been suggested, but we are open to other ideas as well. I would also like to see it be possible for users to extend the parser at runtime, which is currently impossible with the antlr parser. General improvements to its parsing capabilities are needed as well (search the issue tracker for issues related to parse_latex).

That's actually a pretty interesting project idea. The fact that it's widely used is a great motivator for me.

> I would try to pick one or the other, parsing or codegen. While they can be related, they are disjoint enough that they shouldn't be mashed together into a single project proposal.

That sounds like a good idea to me.

Tirthankar Mazumder

Mar 30, 2023, 4:04:41 AM
to sympy
I guess I'll write up a GSoC project proposal for improving the LaTeX parser and send it here soon.

As for mentor availability, do you have any ideas about who might be a good mentor for this project?

Aaron Meurer

Mar 30, 2023, 4:06:17 AM
to sy...@googlegroups.com
On Thu, Mar 30, 2023 at 2:04 AM Tirthankar Mazumder
<greenw...@gmail.com> wrote:
>
> I guess I'll write up a GSoC project proposal for improving the LaTeX parser and send it here soon.
>
> As for mentor availability, do you have any ideas about who might be a good mentor for this project?

If we decide to accept the project, we'll decide on a mentor for it.
This is a project that most likely any of the listed mentors could
mentor. It really depends on which other projects are accepted, since
each mentor generally only mentors one project. For now, it's best to
just discuss the idea here on the list.

Aaron Meurer

Tirthankar Mazumder

Apr 4, 2023, 12:40:01 PM
to sympy
I've written up a proposal, which can be found here:
GSoC Project Proposal - SymPy.pdf