MemoryError: bad allocation


H.C. Lee

Sep 14, 2013, 9:31:49 PM
to canter...@googlegroups.com
Hi all, 

I am encountering "MemoryError: bad allocation" when I try to restore a previous solution that was computed using a smaller mechanism.

Error message:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda\lib\site-packages\spyderlib\widgets\externalshell\sitecustomize.py", line 523, in runfile
    execfile(filename, namespace)
  File "C:\Users\hsulee\Desktop\Cantera Bug\RestoreSolution.py", line 31, in <module>
    sim1.restore(filename='GRI3.0_Stoichiometric.xml',name='solution',loglevel=2)
  File "onedim.pyx", line 793, in cantera._cantera.Sim1D.restore (cantera/_cantera.cpp:44636)
MemoryError: bad allocation

Interestingly, if I reduce the number of species to 170, the error goes away and the iteration process starts. The error is thrown whenever the mechanism contains more than 170 species. Also, if the premixed gas is H2/air, the error does not occur, and the problem solves flawlessly using the same approach and the same mechanisms (GRI 3.0 and the larger mechanism).

I have attached all the files necessary to reproduce the above error.

The attached files are:

RestoreSolution.py - Restores a previously saved solution and solves using the larger mechanism
GRI3.0_Stoichiometric.xml - The solution file computed using GRI 3.0
AramcoMech_1.3_C5chem.xml - The larger mechanism (316 species; courtesy of Dr. Henry Curran, NUI Galway)

Hopefully there is a solution to my problem. 

Thank you very much in advance.

Thanks,
Lee 
RestoreSolution.py
GRI3.0_Stoichiometric.xml
AramcoMech_1.3_C5chem.xml

Ray Speth

Sep 17, 2013, 1:41:01 PM
to canter...@googlegroups.com
Hi Lee, 

I tried running this example and I did not observe the reported problem. I did have to change your script to remove the extra 'FlameLoc' argument from the call to set_initial_guess. I guess this is a modification you made to the Python module?

Is this a copy of Cantera that you compiled yourself, or one from the binary installers for 2.1.0b3? One possible concern is that the binary installers, which were compiled against the stock Python from python.org, may have issues with Anaconda. I would hope that's not the case, but it might be worth checking.

If you did compile this yourself, then since I can't replicate the error you're getting, all I can suggest is that you try to pin down where in Sim1D::restore (in Sim1D.cpp) the error occurs. You can either add debug output messages to that method (and the methods it calls), or run your script inside the Visual Studio debugger and have the debugger stop when it encounters an exception.

Regards,
Ray

H.C. Lee

Sep 17, 2013, 2:41:06 PM
to canter...@googlegroups.com
Hi Ray,

Thanks for replying. I did make that modification to the Python module, and I compiled Cantera myself. Thanks for letting me know the problem is on my end. I will try to debug it.

Thank you so much.

Thanks,
Lee

H.C. Lee

Sep 17, 2013, 6:46:26 PM
to canter...@googlegroups.com
Hi Ray,

I think I found the culprit, but I am not entirely sure whether the 32-bit build is what caused the error.

I downloaded the 32-bit Python 2.7 and 32-bit Python 3.3 packages from the binary installers for 2.1.0b3, and they gave me the exact same error.

However, when I installed the 64-bit Python 3.3 package from the binary installers, I was able to proceed to the iteration process.

Are you using a 64-bit build to run the code? Is it possible for you to run the attached script on a 32-bit Cantera?

I will recompile Cantera in a 64-bit environment instead of 32-bit and let you know if the problem still persists.

Nevertheless, thank you so much for your help and also your time! I really appreciate it.

Hope to hear from you.

Thanks,
Lee

H.C. Lee

Sep 18, 2013, 1:27:15 AM
to canter...@googlegroups.com
Hi Ray,

I found out what happened! The mechanism requires 4.3 GB of memory, which is why it fails in a 32-bit environment.

It is not because of Cantera!

Thank you.

Thanks,
Lee

Ray Speth

Sep 19, 2013, 11:31:28 PM
to canter...@googlegroups.com
Lee,

Yes, it seems that it is actually running into memory limitations for 32-bit applications. The Jacobian data for the 1D flame requires O(nPoints * nSpecies^2) storage, which gets pretty large with a 316 species mechanism. Part of the problem is that Cantera's banded matrix implementation isn't very efficient, as it stores the Jacobian and its LU factorization separately, effectively doubling the memory requirement.
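To put rough numbers on this, here is a back-of-the-envelope estimate of the banded-Jacobian memory. The constants are assumptions for illustration (the "+5" non-species variables per point and the factor-of-3 band width are not exact Cantera internals), but the scaling matches the O(nPoints * nSpecies^2) behavior described above:

```python
# Rough estimate of banded-Jacobian memory for a 1D flame.
# Assumptions (illustrative, not exact Cantera internals): each grid
# point carries n_species + 5 unknowns, the band spans ~3 blocks of
# neighboring points, and storing the Jacobian and its LU factorization
# separately doubles the requirement, as noted above.

def jacobian_bytes(n_points, n_species, extra_vars=5, copies=2):
    n_vars = n_species + extra_vars           # unknowns per grid point
    band_width = 3 * n_vars                   # point couples to its neighbors
    doubles = n_points * n_vars * band_width  # banded storage, not dense
    return doubles * 8 * copies               # 8 bytes per double

# Lee's case: ~1000 points, 316 species.
est = jacobian_bytes(1000, 316)
print(f"{est / 1e9:.1f} GB")  # -> 4.9 GB
```

With these assumptions the estimate comes out around 5 GB, consistent with the 4.3-5.7 GB figures reported in this thread, and well beyond what a 32-bit process can address.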

Regards,
Ray

H.C. Lee

Sep 20, 2013, 7:49:39 PM
to canter...@googlegroups.com
Hi Ray,

Agreed. When my grid went up to 1000 points, it took approximately 5.7 GB of memory, and approximately 18 hours of wall-clock time just to get the solution with the 316-species mechanism. Nevertheless, I was able to obtain the solution.

Have you tried Chemkin before? Does it converge faster for 1-D problems given such a large mechanism? Is it possible to parallelize Cantera so it can run on multiple cores? Maybe it would run faster.

Thank you so much for your help all this while in reducing the complexity of Cantera for me. :)

Thanks,
Lee 

Nick Curtis

Sep 23, 2013, 5:27:20 PM
to canter...@googlegroups.com
HC, if you want to make Cantera run faster, look into getting an optimized BLAS/LAPACK library (e.g., OpenBLAS or MKL) or into implementing a preconditioned GMRES solver (see "Adaptive Preconditioning Strategies for Integrating Large Kinetic Mechanisms" by Matthew J. McNenly, Russell A. Whitesides, and Daniel L. Flowers).

H.C. Lee

Sep 25, 2013, 11:59:59 AM
to canter...@googlegroups.com
Hi Nick,

That's very useful information, and I believe it will be useful to the entire Cantera community. Thank you very much.

I will look into it and try to implement it on my computer.

Thanks,
Lee

ischoegl

Sep 26, 2013, 10:58:36 PM
to canter...@googlegroups.com
Nick,

... to follow up on your thoughts: I believe that an optimized BLAS/LAPACK is definitely a very good idea. The really nice part is that it is easily achieved without having to touch the code base.

Alternative matrix solvers would definitely be interesting, but this addresses only part of the performance issues. While the Newton/time-stepping solver periodically tries to find a steady-state solution, the solution is advanced by a (pseudo-)transient algorithm that is only first-order accurate in time (backward Euler) and, in my experience, also prone to getting stuck. I believe this is the root cause of some of the convergence issues that have recently been discussed for the counterflow cases.

I am pretty certain that the (pseudo-)transient stepping forms the current bottleneck from a performance point of view. Unfortunately, creating an infrastructure for improvements would involve a *lot* of pain due to the 1D code base: there are no clean interfaces. The current solver is more or less hard-coded and cannot easily be replaced or overloaded: the same code is used for both the time step and the Newton solve, inheritance involves a custom MultiNewton object defined in OneDim (MultiNewton->MultiJac->BandMatrix->Lapack), and the solution vector is defined as a C++ vector in Sim1D. Pulling these apart would be extremely tedious. While Cantera is actually compiled against SUNDIALS (which IMHO would be great here due to the potential for parallel processing), it is currently only used for the zero-D objects.

Another point for improvement would be the computational grid, which is low-order (the derivatives are hard-coded) and non-adaptive during time-stepping. Still, Cantera works pretty well as it stands, is well maintained (there have been some pretty substantial improvements in usability), and is overall a terrific piece of software.

-ingmar-

Ray Speth

Sep 29, 2013, 3:43:56 PM
to canter...@googlegroups.com
Hi All,

There are a lot of ways in which the performance and stability of the 1D solver could be improved. The current implementation basically follows the solution algorithm of the original PREMIX and OPPDIF codes, which are about 15 years old at this point. These codes work well for mechanisms with relatively small numbers of species, but don't scale well to the much larger mechanisms that many people are now interested in. Factorizing the banded Jacobian takes O(N*K^3) time, where N is the number of points and K is the number of species. As a point of comparison, I did a bit of profiling, and for the mechanism that Lee posted (316 species), over 50% of the time is spent on this factorization, compared to only 19% when solving with GRI-Mech 3.0 (53 species). Also, as noted here, the memory requirement scales as O(N*K^2), which can be a problem as well.
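The cubic dependence on species count makes the gap between the two mechanisms concrete. A quick calculation, using only the species counts quoted in this thread and ignoring constants and non-species variables, shows the expected factorization cost ratio at a fixed grid size:

```python
# Banded LU factorization costs O(N * K^3), so at a fixed number of
# grid points N, the cost ratio between two mechanisms scales as the
# cube of the ratio of their species counts K.

k_gri = 53       # GRI-Mech 3.0
k_aramco = 316   # AramcoMech 1.3, as posted in this thread

ratio = (k_aramco / k_gri) ** 3
print(f"factorization cost ratio: ~{ratio:.0f}x")  # -> ~212x
```

A roughly 200-fold increase in per-factorization cost is consistent with the factorization going from a 19% share of runtime with GRI-Mech 3.0 to the dominant cost with the 316-species mechanism.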

Using an optimized BLAS/LAPACK implementation will certainly help some, but all that does is reduce the constant multiplier on the time, when what you'd really like is to reduce the exponent on the species term, which means getting rid of the dense/banded factorization.

The sparse preconditioned Krylov methods like GMRES are great, but you still need to form a preconditioner, which effectively brings you back to the problem of building and factorizing the Jacobian. These methods will tolerate some approximations in the Jacobian that you can't get away with when using a direct solver, which is why it's possible to consider an inexact factorization that exploits the sparsity of the Jacobian. In my experience, these can work well, but they're not exactly easy to implement, and it can be difficult to know how accurate the Jacobian needs to be to get acceptable solver performance. The paper Nick linked to is a pretty good summary of this case. One additional concern with trying to use a sparse solver in the 1D problem is that the transport-related terms in the Jacobian may generate a lot of additional fill-in, which reduces the benefit of using the sparse solver.

I don't think that the first-order nature of the transient solver should be a source of convergence trouble, given that the stability regions of the higher-order BDF formulas are smaller than that of the first-order method. That said, it might be interesting to see whether using Sundials to provide the higher-order methods would improve the overall rate of convergence. Of course, the convoluted relationship among Sim1D/OneDim/MultiNewton/etc. would make this a nontrivial undertaking.

Using higher-order spatial derivatives is an interesting idea. One problem is that this would increase the bandwidth of the Jacobian, assuming you're working with the existing linear solver.

Of course, another option is to step away from the fully implicit solver entirely, which is what I decided to do with my time-dependent solver, Ember. By going to an operator-split method, you can implement different solvers for each term, and the individual problems are much simpler. If you stick with the direct solver for the reaction terms, you only need O(K^2) memory for the Jacobian (no dependence on N), since you're only solving one point at a time. And if you want to use a sparse solver, you are solving almost exactly the problem described in the McNenly et al. paper. I think that implementing higher-order spatial derivatives would be much easier in an operator-split code as well, since there's better separation of each of the terms in the original equations, compared to the monolithic structure you usually end up with in the fully implicit approach. Additionally, operator splitting provides a very natural opportunity for implementing parallelism.
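The operator-splitting idea can be illustrated on a toy reaction-diffusion problem. This is a generic sketch of second-order Strang splitting, not Ember's actual implementation; the grid size, diffusivity, and first-order decay reaction are arbitrary choices for illustration:

```python
# Toy Strang operator splitting for du/dt = D*u_xx - k*u on a 1D grid.
# Each substep solves only one physical term, which is the point of the
# technique: the reaction step is purely local (one point at a time).
import math

def diffusion_step(u, D, dx, dt):
    """One explicit Euler step of pure diffusion with zero-flux walls."""
    n = len(u)
    new = u[:]
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        new[i] = u[i] + D * dt / dx**2 * (left - 2 * u[i] + right)
    return new

def reaction_step(u, k, dt):
    """Exact solution of the local ODE du/dt = -k*u over dt, pointwise."""
    decay = math.exp(-k * dt)
    return [ui * decay for ui in u]

def strang_step(u, D, k, dx, dt):
    """Half reaction, full diffusion, half reaction: 2nd-order splitting."""
    u = reaction_step(u, k, dt / 2)
    u = diffusion_step(u, D, dx, dt)
    u = reaction_step(u, k, dt / 2)
    return u

# Usage: a hot spot decays (reaction) while spreading out (diffusion).
u = [0.0] * 20
u[10] = 1.0
for _ in range(50):
    u = strang_step(u, D=0.1, k=1.0, dx=1.0, dt=0.1)
```

In a real flame code, the reaction substep would be a stiff ODE integration per grid point (where the O(K^2) Jacobian comment above applies), and each substep can use whichever solver suits it best.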

Regards,
Ray

rahul kumar

Jun 19, 2020, 11:42:51 AM
to Cantera Users' Group
A memory error means that your program has run out of memory. If you get an unexpected MemoryError and you think you should have plenty of RAM available, it might be because you are using a 32-bit Python installation. The error can be caused either by the 2 GB per-process limit that Windows imposes on 32-bit programs, or by a lack of available RAM on your computer. The easy solution, if you have a 64-bit operating system, is to switch to a 64-bit installation of Python. The underlying issue is that 32-bit Python only has access to ~4 GB of RAM, and even less if your operating system is itself 32-bit, because of operating system overhead.
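A quick way to check which kind of interpreter you are running, using only the standard library:

```python
import struct
import sys

# struct.calcsize("P") is the size of a C pointer in bytes, which
# distinguishes a 32-bit interpreter (4) from a 64-bit one (8).
bits = struct.calcsize("P") * 8
print(f"Python {sys.version.split()[0]} ({bits}-bit)")

if bits == 32:
    # A 32-bit process can address at most 4 GiB (only 2 GB by default
    # on 32-bit Windows) -- far less than large flame cases need.
    print("32-bit build: large cases may exhaust the address space")
```

If this reports 32-bit and your OS is 64-bit, installing a 64-bit Python (and a matching Cantera build) is the fix described earlier in this thread.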