Anyone interested in writing a "Cantera" in Julia?


Weiqi Ji

May 18, 2020, 2:00:28 PM
to Cantera Users' Group
Hi,

I am thinking of writing a mini-version of Cantera in Julia. Based on my experience, calling Cantera in Julia via PyCall seems to be unstable, so it might be good to have a chemical kinetics code written purely in Julia. I have written a PyTorch implementation before, and with Cantera's YAML mechanism format it is very smooth to parse the mechanism file.

Our motivation is mainly to explore the adjoint sensitivity analysis in Julia's DifferentialEquations.jl package, especially for the ignition delay time. To give a rule of thumb: for a large mechanism such as ARAMCO (492 species, 2716 reactions), forward continuous sensitivity analysis in Cantera might take 30 minutes, while adjoint sensitivity analysis could, I guess, bring that below 1 minute, i.e. one to two orders of magnitude of speedup. Then a lot of gradient-descent-based optimization algorithms can be applied, which have been very successful in optimizing neural networks.
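
To make this concrete, below is a minimal sketch of the workflow I have in mind, using a toy two-step "ignition" model instead of a real mechanism. The package and solver choices here (DiffEqSensitivity, InterpolatingAdjoint, Rodas5) are just one possible, untested combination, not a recommendation:

```julia
# Toy sketch of adjoint sensitivities through an ODE solve (not a real mechanism).
# DiffEqSensitivity has since been renamed SciMLSensitivity in newer releases.
using DifferentialEquations, DiffEqSensitivity, Zygote

# u[1] = fuel, u[2] = temperature proxy; p = [A, Ta, q] (Arrhenius-style toy).
function rhs!(du, u, p, t)
    k = p[1] * exp(-p[2] / u[2])   # rate constant
    du[1] = -k * u[1]              # fuel consumption
    du[2] =  p[3] * k * u[1]       # heat release
end

u0   = [1.0, 1.0]
p    = [1.0e3, 8.0, 5.0]
prob = ODEProblem(rhs!, u0, (0.0, 1.0), p)

# Objective: temperature proxy at the final time, as a crude stand-in for an
# ignition metric. Reverse-mode AD through solve() uses the continuous adjoint
# rather than forward sensitivities, so the cost scales weakly with the number
# of parameters.
loss(ps) = solve(prob, Rodas5(); p = ps, saveat = [1.0],
                 sensealg = InterpolatingAdjoint())[2, end]

grad, = Zygote.gradient(loss, p)
@show grad
```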

I have just started working on it and have made a repo for it; see https://github.com/DENG-MIT/reactorch.jl

If you are also interested in it, it would be great to team up.

Ray Speth

May 26, 2020, 11:55:54 AM
to Cantera Users' Group
Hi Weiqi,

I'm not sure why using Cantera through PyCall would be unstable, but I'd suggest that the preferable approach would be to skip the Python interface entirely and access the Cantera C++ code more directly through Julia's `ccall` mechanism (calling the C-compatible Cantera functions defined in `clib`). We are planning on providing an interface that works this way as part of an overhaul of some of the external language interfaces (see this enhancement proposal), but building a wrapper for just a few key functions manually should be fairly simple.
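
As a rough, untested sketch of what hand-wrapping a couple of clib functions with `ccall` could look like (the function names and signatures below are from memory and should be checked against `ct.h`, and the shared-library name will vary by platform and installation):

```julia
# Hand-written Julia bindings for a few Cantera clib entry points via ccall.
# Names/signatures follow the C declarations in clib's ct.h as I recall them;
# verify against your installed headers before relying on this.
const libct = "libcantera_shared"   # adjust for your platform (.so/.dylib/.dll)

# Create a ThermoPhase from a YAML input file; returns an integer handle.
newthermo(file::AbstractString, phase::AbstractString = "") =
    ccall((:thermo_newFromFile, libct), Cint, (Cstring, Cstring), file, phase)

# Getters and setters operate on that handle.
temperature(n::Integer) =
    ccall((:thermo_temperature, libct), Cdouble, (Cint,), n)
set_temperature!(n::Integer, T::Real) =
    ccall((:thermo_setTemperature, libct), Cint, (Cint, Cdouble), n, T)

# Usage, assuming gri30.yaml is on Cantera's data path:
gas = newthermo("gri30.yaml", "gri30")
set_temperature!(gas, 1200.0)
@show temperature(gas)
```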

I'm glad you're finding the YAML format useful -- being able to use the format in tools outside of Cantera is certainly one of the intended benefits of moving away from the CTI format.

I agree that adjoint sensitivity analysis offers a great opportunity for improving performance, especially alongside the other integrator enhancements that we are already working on (such as this enhancement proposal). Rather than rewriting Cantera from scratch, the approach that I've had in mind for a while is to use an automatic differentiation library such as Adept to get the necessary analytical derivatives, and then use the adjoint sensitivity analysis capabilities of CVODES.

Regards,
Ray

Weiqi Ji

Feb 20, 2021, 12:22:49 AM
to Cantera Users' Group
Hi Ray,

Some updates on this thread, which might also be useful for future Cantera development.

As for the Julia-Python interface, I believe it will gradually improve, although I personally still find it painful to work with.

Instead, I have written a native Julia package that computes the reaction source term. The code is at a pretty early stage, at https://github.com/DENG-MIT/Arrhenius.jl, but it is already sufficient for most zero-D simulations, and we can embed a neural network into the reaction model and train it using backpropagation. The code is relatively short since it relies on Cantera to interpret the reaction mechanism for now.
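
To show what a differentiable source term buys us, here is a stripped-down toy (not the actual Arrhenius.jl code): the rate of progress of a single irreversible reaction, differentiated with respect to its Arrhenius parameters with ForwardDiff.

```julia
# Toy example: q = k(T)*[A][B] with k = A * T^b * exp(-Ea/(R*T)),
# differentiated w.r.t. the kinetic parameters p = [A, b, Ea].
# This is the kind of gradient that backpropagation/training needs.
using ForwardDiff

const R = 8.314462618   # J/(mol*K)

function rate_of_progress(p, T, C)
    A, b, Ea = p
    k = A * T^b * exp(-Ea / (R * T))
    return k * C[1] * C[2]
end

p = [1.0e6, 0.5, 5.0e4]   # pre-exponential factor, temperature exponent, Ea
T = 1500.0                # K
C = [2.0, 1.0]            # concentrations, mol/m^3

q  = rate_of_progress(p, T, C)
dq = ForwardDiff.gradient(pp -> rate_of_progress(pp, T, C), p)
@show q dq
```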

Best,
Weiqi

Ingmar Schoegl

Feb 20, 2021, 9:42:52 AM
to Cantera Users' Group
Dear all,

While I have no background in Julia (at least not yet), I have some questions at this point that I would like to put out there. I do understand that Julia is popular in data science and there are useful tools that make a reimplementation of core functions desirable. It's also great to hear that it appears to be possible to write some wrappers 'in a couple of hours' (see other post).

However, from a community perspective I am really afraid that some really useful initiatives (like Arrhenius.jl) won't be supported long term and will end up as 'yet another abandoned project'. It's certainly a big step up from before open source really took off, when all of us wrote our own code, but out of personal *painful* experience, maintenance becomes prohibitive. As a somewhat related example, all of the code that I wrote as a grad student became obsolete when Cantera changed its Python interface to Cython (and/or Cantera->cantera). So - at least from my perspective - it would be really useful to start a discussion on how to get Julia capability for Cantera on Cantera/enhancements. Personally, I hope to see another subfolder named 'julia' in cantera/interfaces/, at which point I can rest assured that it will be maintained for the long run.

On a related matter, now that Cantera 2.5.1 is finally out, I believe there's an opportunity to start designing some currently missing core functionality from the ground up. It appears that a flexible interpretation of reactions keeps coming up on the user group and elsewhere (see other recent posts, the discussion of surface reactions, etc.). Whether this is interpreted by C++ or Julia (and how it's implemented) is another point, but it may make sense to address some common features.

Just 2 cents from another contributor,

-ingmar-

Weiqi Ji

Feb 20, 2021, 10:28:29 AM
to Cantera Users' Group
Hi Ingmar,

Thanks for sharing your great perspective! To be honest, I am also relatively new to Julia and I am constantly learning it through my research code. Here are some of my tentative thoughts on those points.

It is very true that a lot of new packages will be phased out, even when backed by a big name, just like startups. Before starting Arrhenius.jl, I spent quite a lot of time looking for existing solutions; it is really difficult to justify a new package when there are already a lot of great ones, like Cantera, Chemkin Pro (Ansys), CONVERGE, etc. I believe we need some killer apps to motivate a new package. That is to say, if a Julia version only runs two times faster, people may not buy it; not to mention that beating C++ on speed is not easy, since C++ is already fast enough. Chemistry acceleration techniques, like mechanism reduction, QSSA, and better ODE solvers with preconditioning/sparsity, can be built on top of Cantera as well.

What I see as promising with Julia is bringing in more data science algorithms, like machine learning, automatic differentiation, and neural networks, to do tasks that we could not do in the past. Those directions actually popped up twenty years ago in combustion and are popular again these days, but I guess for many people in combustion it is still unclear how far this wave can travel. Therefore, discussing what to build with Julia is a bit different from the past, when we knew what the task was. The good news is that TensorFlow and PyTorch have already told us what the core functionality for machine learning is. I am trying to build some killer apps that hybridize existing kinetic models with neural network models for missing species and pathways, for modeling complex systems like condensed-phase fuels and surface chemistry. For such a task, we have to enable automatic differentiation over the entire reaction source term. Quite a lot of ML research is about learning sub-models inside a big model, and those research directions could be the driving force.
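
As a cartoon of what I mean by a hybrid model, the sketch below adds a small hand-written neural-network correction to a known source term and backpropagates a loss through the whole thing with Zygote. Everything here is illustrative toy code, not Arrhenius.jl:

```julia
# Hybrid source term = known pathway + small neural-network correction,
# with gradients of a loss w.r.t. the network weights from reverse-mode AD.
using Zygote

known_source(u) = [-0.5 * u[1], 0.5 * u[1]]            # fixed, "known" pathway

# Two-layer network written out by hand; parameters packed in a named tuple.
nn_correction(θ, u) = θ.W2 * tanh.(θ.W1 * u .+ θ.b1) .+ θ.b2

hybrid_source(θ, u) = known_source(u) .+ nn_correction(θ, u)

θ = (W1 = 0.1 * randn(8, 2), b1 = zeros(8),
     W2 = 0.1 * randn(2, 8), b2 = zeros(2))

u      = [1.0, 0.0]
target = [-0.4, 0.4]                                   # pretend "data"
loss(θ) = sum(abs2, hybrid_source(θ, u) .- target)

grads, = Zygote.gradient(loss, θ)                      # named tuple of gradients
@show loss(θ) grads.W1[1, 1]
```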

My personal experience with Arrhenius.jl is that implementing the equations is relatively easy compared to interpreting reaction mechanisms, not to mention defining a standard for mechanism files, like YAML. This reinforces your point about putting more effort into making Cantera's interpretation of reactions more flexible and into exchanging information with other platforms.

Regards,
Weiqi

Ingmar Schoegl

Feb 20, 2021, 10:53:03 AM
to Cantera Users' Group
Hi Weiqi,

There's no doubt that ML will be highly important. I've seen some really great work on kinetics submodels using neural networks just the other week. It's indeed great to see efforts that build on Cantera, so please don't take my comments, which come from a community perspective, as a sign of disrespect.

-ingmar-

Ingmar Schoegl

Feb 20, 2021, 11:29:36 AM
to Cantera Users' Group
PS: to expand on my point, there are already numerous great projects out in our field that unfortunately are of limited use as they're simply too cumbersome for various reasons (complex compilation toolchains and lack of development and/or documentation are some of them). Take https://github.com/LLNL/zero-rk (orders of magnitude faster than Cantera, happens to use an ancient version of Cantera's Chemkin parser), https://github.com/SLACKHA/pyJac (implements much-needed Jacobians), https://github.com/speth/ember (transient solvers), just to name a few.

Weiqi Ji

Feb 20, 2021, 11:51:26 AM
to Cantera Users' Group
Those examples echo the point that we need something genuinely unique. For instance, I used Ember in one of my recent works (https://www.sciencedirect.com/science/article/abs/pii/S0010218020302455) since I had to compute the sensitivity of the extinction strain rate robustly, and there were no alternatives I could use (Chemkin can in principle do that, but for some reason it just wasn't working for me).

(I am not really a CFD expert, so please correct me if I am wrong.) For zero-RK and pyJac, the major motivation is acceleration, and I guess acceleration is more attractive to people from CFD than to people developing kinetic models; it is not that beneficial to speed up an ignition delay simulation from 2 minutes to 12 seconds. For the CFD community, I believe zero-RK and pyJac are definitely very attractive, and I have also advertised them to people who asked me whether there are tools to compute Jacobians. The challenge in the CFD community is that commercial software like CONVERGE already has that functionality, so there are good alternatives there. But I have good faith that zero-RK and pyJac will gradually be adopted in the OpenFOAM community. By the way, the tricky thing is that the combustion CFD community is still far from one that endorses open source, so those packages are sometimes used behind the scenes without rewarding the ecosystem.

Therefore, we should keep thinking of features that could be unique killer apps, in addition to features that do similar jobs to existing software.

Ingmar Schoegl

Feb 20, 2021, 12:09:44 PM
to Cantera Users' Group
At least to me personally, Cantera already is the 'killer app' that helps in our field of research. It provides an extensible framework that you can easily 'hack' into as needed (which is where some of the other examples fail). I disagree that Jacobians, fast simulations, and the ability to truly run in parallel are only of interest to the CFD community. Some of the routine examples that are included in the code date back many years, and could benefit from being overhauled. None of this is glorious work of course.

-ingmar-

Weiqi Ji

Feb 20, 2021, 12:14:47 PM
to Cantera Users' Group
Aha, indeed. Many good words for Cantera, which is so successful that I did not even think to include it in that list.

Ray Speth

Feb 20, 2021, 12:42:51 PM
to Cantera Users' Group
Hi all,

I think this is a good discussion to have (never mind the rather provocative thread title). Like Ingmar, I've also noticed the proliferation of codes that are, at least at their starting point, reimplementations of a subset of what Cantera or Chemkin does. For me, the questions that arise from this are (1) what motivates these projects to start over from scratch, rather than leveraging the work that's already been done in Cantera, and (2) what can be done to make it more attractive for others to build capabilities on top of Cantera and make those contributions part of Cantera. Of course, Cantera itself initially reimplemented a lot of what Chemkin did, but in that case the rationale is quite clear, since an open source package couldn't be based on Chemkin. For my own part, Ember is a bit different -- it builds directly on top of Cantera, and the main reason I never attempted to make it a part of Cantera was to avoid introducing several new dependencies to Cantera.

It's less clear to me what drove the choices made for some of the other packages. I know the difficulty of extending Cantera models due to the current requirement that they be written in C++, which is not an easy language to work in for many scientific users, is part of it. I think this is a requirement that can be relaxed, and it should be possible to allow users to provide various models such as reaction rates or reactor governing equations in other languages. I have been planning to introduce a version of this to allow some model code to be provided in Python, but the concept could be extended to any language that provides an interface to be called from C/C++ (including Julia). Another common thread among these projects is the desire to have better access to Jacobians / gradients / other derivatives for the purpose of using more sophisticated numerical algorithms, whether those be better ODE integrators or various ML applications. I think this could be achieved in Cantera using automatic differentiation tools for C++. While this would probably be a fairly invasive change throughout the library, I think it would be easier than starting over from scratch in Julia, and would still make it possible to use Cantera as a library from a wide range of other languages.

I suppose on each of these topics, I have some enhancement proposals that I should really get around to writing up in more detail.

Regards,
Ray

Ingmar Schoegl

Feb 20, 2021, 1:21:29 PM
to Cantera Users' Group
Hi all,

Thanks for your thoughts, Ray. From my perspective, the difference between the approaches is the scope/duration of the work. It's relatively simple to implement things for an individual paper, a PhD/postdoctoral project, or even some 5-year initiative and just push the code to GitHub (which is a huge improvement over back in the day when things collected 'dust' on some hard drives). The hard stuff starts when it comes to 'future-proofing' your code for the long haul, i.e. making sure that the code is extensible, writing documentation, unit tests, etc. Understanding the code base of established projects, figuring out how what you need fits in, and eventually getting PRs merged takes effort and time (not just on the side of the contributor); bypassing this process is probably more time-effective. While I believe much of the short-term work that is being done is excellent, I don't think that those repos will survive, and betting on being the exception is treacherous. Based on the limited amount of man-hours a single researcher (or small team of researchers) can invest, those efforts are doomed to fail. I mean this in the nicest possible way.

-ingmar-

Weiqi Ji

Feb 20, 2021, 1:52:44 PM
to Cantera Users' Group

Hi Ingmar,

I fully agree with your point on the cost. I also think this point applies to many communities beyond combustion; it is a pretty common phenomenon in scientific computing.

I would also like to mention some good sides of those potentially short-lived packages. Scientific communities ultimately place more value on innovation, new ideas, algorithms, etc., which somewhat discourages investment in software. But meanwhile, developing code is less risky than developing algorithms, since we know it will definitely work if we put in enough effort. Therefore, those short-lived packages mostly serve as proofs of concept; if one later becomes popular, we can then put more effort into the ecosystem. A good example is how the machine learning frameworks evolved: several of them, like Torch, were eventually phased out, but they did not disappear; instead, their core now lives on in the PyTorch backend at Facebook, and similarly autograd lives on in JAX at Google. Therefore, from the perspective of scientific innovation, those packages were a great success. Of course, they have to be open source.

One way to lower the barrier for a new package is to have a decentralized system. Similar to Ray's point, if we can easily build on top of Cantera, or treat Cantera as a module, the workload becomes much lighter. From this perspective, I would welcome more discussion on what Cantera can enhance at its base to offer more opportunities for data science in combustion.

As a side note, a deep learning paper without source code is very tricky and very challenging to reproduce (too many hyperparameters can blow up the model). I believe this has substantially slowed down the pace of deep learning in combustion, so there is definitely an urgent need for an open-source machine learning package for combustion.

Weiqi

Ingmar Schoegl

Feb 20, 2021, 2:08:16 PM
to Cantera Users' Group
I agree that there are legitimate reasons to 'move fast' (or to put code for a paper on GitHub, for that matter). Regarding ML in combustion, I am aware of some excellent work at Argonne at the moment, so things are already in progress. There is likely much more out there, and we need tools to enhance this.

To Ray's point, I believe it's important to know what exactly is needed to make Cantera more flexible, and it's definitely worth having those discussions: please do engage in them. If Arrhenius.jl is a short-lived package that ultimately helps Cantera remain the 'killer app', I believe it's a fair effort. Just re-coding some portions in Julia is, imho, just a proof of concept and not necessarily viable as an external package. A more central point is automatic differentiation, and there are numerous approaches, as Ray mentioned; some are more helpful to the overall goals of the Cantera community than others. This is of course a personal opinion and says nothing about the quality of your work. It's more a philosophical point that I am trying to raise here.

-ingmar-

Weiqi Ji

Feb 20, 2021, 2:24:26 PM
to Cantera Users' Group
Aha, I didn't read your comments as judging the quality of those packages :). Apologies for my poor writing.

I am also very glad to see more progress in discussing the details of automatic differentiation in Cantera.

Ray Speth

Feb 20, 2021, 3:55:09 PM
to Cantera Users' Group
Hi Ingmar,

Yes, you're right that the "last step" of actually integrating something with an established code base can be a challenge. I think it's worth encouraging people to at least start with what already exists, even if they never get through all those steps to see something merged into the core of Cantera (or any other mature project). There are a variety of forks of Cantera at this point that are what you mentioned, e.g. implementations of some feature for a specific project or paper. Some of those have gotten merged back into Cantera over the years, and others may get left behind, but at least there is some possibility of reintegration if someone has the time and interest to do so. Projects which start from scratch, though, are far less likely to ever be something that can be integrated into Cantera, for reasons varying from clashing licenses to differences in code structure.

As an additional note, for those who are interested, I've now written up my plans to enable development of models written in high-level languages and to use automatic differentiation to calculate derivatives.

Regards,
Ray

Ingmar Schoegl

Feb 20, 2021, 4:22:59 PM
to Cantera Users' Group
Ray,

Thanks for writing your thoughts up. I hope this is a starting point for fruitful discussions on Cantera/enhancements (and, of course, eventual implementation).

-ingmar-