Shelve an op2 object?


benjamine...@gmail.com

Oct 2, 2017, 4:11:46 PM
to pyNastran Discuss
Hello,

New to pyNastran (although I've known about it for quite some time), but I just wanted to know if anyone else is having trouble (or luck!) using Python's shelve module to store a persistent op2 object after reading it in. My op2 files are quite large (>4 GB) and take a long time to read in. I was hoping to shelve them and save some time in the future. Here's the error I got on the first try.

TypeError: can't pickle instancemethod objects
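
For reference, here's roughly what I ran to get that error (the file and shelf names are just placeholders):

    import shelve
    from pyNastran.op2.op2 import OP2

    model = OP2()
    model.read_op2('model.op2')   # placeholder path; mine is >4 GB
    db = shelve.open('op2_shelf')  # shelve pickles whatever you store in it
    db['model'] = model            # <-- the TypeError is raised here
    db.close()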


Follow-on question: If this functionality simply isn't possible at this point, are there any plans to make this (or a similar solution) possible in the future?

Thanks in advance for your response!

Steven Doyle

Oct 2, 2017, 5:07:33 PM
to pyNastran Discuss
I'm not familiar with the shelve module.  Is that different than just pickle?

I thought it was possible though.  It looks like there is some issue with some of the objects.

My op2 files are quite large (>4 GB) and take a long time to read in.

It really shouldn't.  A 4 GB file should only take 8 seconds.  

Also, assuming you're using the master, you can also just dump everything to HDF5 using **model.export_to_hdf5(hdf5_filename)**.
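
A rough sketch of that (the filenames are placeholders):

    from pyNastran.op2.op2 import OP2

    model = OP2()
    model.read_op2('model.op2')       # placeholder filename
    model.export_to_hdf5('model.h5')  # dump everything to HDF5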

Steve


benjamine...@gmail.com

Nov 20, 2017, 12:11:58 PM
to pyNastran Discuss
Hi Steve,

Thank you for your response. I am not currently using the master. When are you planning to do another PyPI release? We are looking forward to this (and other) improved functionality in future releases.

Thank you,
Ben

Steven Doyle

Nov 20, 2017, 3:19:14 PM
to pyNastran Discuss
There aren't any releases planned right now, as the last version was released just a few months ago. They're a lot of extra work with regard to documentation. Master is about as bug-free as any official release. With new features (e.g., the system vs. executive parsing or the much faster case control deck), I like to let things gel for a bit to make sure there aren't new bugs.

Steve


ste...@gmail.com

Nov 28, 2017, 12:44:33 PM
to pyNastran Discuss
Hi, I'm also new to pyNastran. We have been using it for a couple of months to help with extracting transient stress results for a couple hundred elements over 300,000+ time steps. Using Python to automate this is much better than using a commercial program and extracting the results via a GUI. The question I have is similar to the first post, about OP2 read times. I saw the comment saying a 4 GB OP2 file should be read in about 8 seconds; however, in my script I added a simple timer, and for my 4.6 GB file it takes 9.8 minutes to read the OP2. I'm doing this on Windows 7 with an Intel Xeon E5-2623 3 GHz CPU and 128 GB of memory. The version of pyNastran is the latest official release. However, when I tried installing it initially I was behind a firewall and couldn't install it properly, so I'm just calling the files that were in the release download. Any suggestions as to why the read speeds are slower than expected? Any testing and changes I can do to try and improve this? Thanks
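
The timer is nothing fancy; roughly this (the path is a placeholder):

    import time
    from pyNastran.op2.op2 import OP2

    model = OP2()
    t0 = time.time()
    model.read_op2('model.op2')  # placeholder for the 4.6 GB transient file
    print('read time = %.1f s' % (time.time() - t0))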
[Attachment: Pynastran_OP2_read.png]

Steven Doyle

Nov 28, 2017, 1:06:50 PM
to pyNastran Discuss
Did you run that using test_op2 or your code?  I ask because that looks like test_op2.  By default, test_op2 runs twice.  First it runs in vectorized mode, which is fast.  Then it runs in non-vectorized mode, which is ~100x slower.  Then it compares the results.  The flag to run slowly is undocumented, so if you're writing your own script, you're not using that.  If you want to run test_op2 faster, you can use:

    test_op2 -c op2_filename

If you're using read_op2, my guess would be you're using an uncommon result type, so while all objects are vectorized, not all objects have vectorized readers.  I'd need a smaller example in order to see.

The version of pyNastran is the latest official release. However, when I tried installing it initially I was behind a firewall and couldn't install it properly, so I'm just calling the files that were in the release download. Any suggestions as to why the read speeds are slower than expected? Any testing and changes I can do to try and improve this? Thanks

I guess I don't really follow how exactly you're running it, but that won't be a problem unless you're using something like PortablePython, and even then, that's a factor of 2. SSDs also make a big difference, far more than any processor/memory. You'll need an SSD to get full speeds, but you probably have one.

Steve


ste...@gmail.com

Nov 28, 2017, 3:46:00 PM
to pyNastran Discuss
Hi Steve,

Thanks for the response. I'm using read_op2 with my own script. The output says read_mode=1 (array sizing) then read_mode=2 (array filling).
I'm actually running over a network drive: Python is on a network drive and the results are on a network drive. Our connection to the network drives is 10 Gb, so that shouldn't be an issue, since I can copy the file from network to local in ~10 seconds. I did check, though, and the read time drops to about 6 minutes with the OP2 on a local SSD, which is still not as fast as ~8 seconds. The models are from Optistruct 2017. To make this work I edited your script that checks for the version and made it more generic to accept all Optistruct versions. I'm not sure if this is part of the issue, or if it's what you mentioned about an uncommon result type. I'll check and see if that makes any difference, but I don't think we are doing anything uncommon right now.
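
For context, the read portion of the script is basically just this (the path and result name are placeholders; we pull transient element stresses for one subcase):

    from pyNastran.op2.op2 import OP2

    model = OP2()
    model.read_op2('transient.op2')  # placeholder path
    stress = model.cquad4_stress[1]  # placeholder result type; isubcase=1
    print(stress.data.shape)         # (ntimes, ntotal, nresults)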

Steve

benjamine...@gmail.com

Jan 4, 2018, 12:42:50 PM
to pyNastran Discuss
Hey guys. I don't know if this is the true solution, but I switched to Python 3.5.4 (previously I was using 64-bit Anaconda with Python 2.7.13), and the op2 files seem to be loading much faster. I'm still using the .read_op2 method, and I'm running Python 3.5.4 from a network drive using 64-bit WinPython.

Steve D.,
Do you think Python 3 vs. Python 2 might be the culprit here?

Thanks,
Ben

Steven Doyle

Jan 4, 2018, 2:32:53 PM
to pyNastran Discuss
Is that bad? Maybe I finally have a reason to update to Python 3 :)

Assuming it's not the network, it's probably the dictionary improvements in Python 3.x.  They're used almost everywhere in Python (e.g., passing arguments to a function/class), so I wouldn't be surprised.

I'd bet Anaconda 3.5/6 would be even faster.  They bundle in the Intel MKL libraries, which are used by numpy.



Steve Walters

Jan 6, 2018, 10:29:32 PM
to pynastra...@googlegroups.com
I've been using Anaconda 4.3 with Python 3.6 in all my testing. I had some time the last couple of days to look at this again. The read speed stated in the documentation is 500 MB/s, so I did some testing with several of my files and the examples included with the installation to see what read speeds I could get. I was doing the testing on my work laptop, which does not have an SSD, and for SOL 103 models I could average >300 MB/s, which is good for a SATA drive. I did have some strange results where the first time I read a file it was anywhere from 2x to 10x slower than subsequent reads. I'm assuming that has to do with the SATA drive having the file indexed after the first read, and not a memory issue, since the file was erased from memory once the scripts finished running.

Either way, I tested three main models for read speed: a 1.3 GB SOL 103, a 718 MB SOL 103, and a 714 MB SOL 109 with 720 time steps and ~4,120 elements. What I found was that SOL 103 models, large or small, read at the highest speeds of >300 MB/s, but the SOL 109 model reads slower. The amount of slowdown appears to depend on how many time steps are in the transient calculation. For the case I just mentioned, it was ~10% slower than a similar-sized SOL 103 model (718 MB). When I went back to my earlier example, which is 6.4 GB with 300,000 time steps and 315 elements with results stored, the read speed calculated out to ~30 MB/s. I believe the issue is in the _read_subtables function, as it uses a while loop to read the 300,000 tables of data, with several other function calls that also have loops. Since Python is known to be slow in long loops, this is currently my theory for the slowdown on transient models with many time steps.

All that said, I was wondering if you had any ideas that could be implemented to speed this up. Is there a way around the "while" loops for reading the tables? Is it possible to know how many tables need to be read and specify that upfront so the reading can be vectorized?
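
To be concrete about what I mean, here's a hypothetical sketch (this is not pyNastran's actual reader, and parse_one_subtable is a made-up stand-in):

    import numpy as np

    ntimes, nelements, nvalues = 1000, 315, 2  # toy sizes

    def parse_one_subtable(itime):
        # made-up stand-in for reading one time step's subtable
        return np.zeros((nelements, nvalues), dtype=np.float32)

    # growing a list inside a long Python loop (what I suspect is slow):
    rows = []
    for itime in range(ntimes):
        rows.append(parse_one_subtable(itime))
    data = np.array(rows, dtype=np.float32)

    # vs. preallocating when the number of tables is known up front:
    data = np.empty((ntimes, nelements, nvalues), dtype=np.float32)
    for itime in range(ntimes):
        data[itime, :, :] = parse_one_subtable(itime)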

Thanks




--
Steve

Steven Doyle

Jan 7, 2018, 2:12:45 PM
to pyNastran Discuss
I guess my first question is: what is the breakdown of the results you're processing from the 3 models? If you run `test_op2 -c model.op2`, you'll see:

    DEBUG:   op2.py:530                   -------- reading op2 with read_mode=1 (array sizing) -----
    INFO:    op2_scalar.py:1305           op2_filename = 'loadstep_elements.op2'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM1S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM2S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM3S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM4S'
    DEBUG:   op2_scalar.py:1536             table_name='EPTS'
    DEBUG:   op2_scalar.py:1536             table_name='MPTS'
    DEBUG:   op2_scalar.py:1536             table_name='DIT'
    DEBUG:   op2_scalar.py:1536             table_name='OESCP'
    DEBUG:   op2.py:542                   -------- reading op2 with read_mode=2 (array filling) ----
    DEBUG:   op2_scalar.py:1536             table_name='GEOM1S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM2S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM3S'
    DEBUG:   op2_scalar.py:1536             table_name='GEOM4S'
    DEBUG:   op2_scalar.py:1536             table_name='EPTS'
    DEBUG:   op2_scalar.py:1536             table_name='MPTS'
    DEBUG:   op2_scalar.py:1536             table_name='OESCP'
    DEBUG:   op2.py:750                   combine_results
    DEBUG:   op2.py:550                   finished reading op2

    displacements[1]
      isubcase = 1
      type=RealDisplacementArray ntimes=4 nnodes=43, table_name=OUGV1
      data: [t1, t2, t3, r1, r2, r3] shape=[4, 43, 6] dtype=float32
      node_gridtype.shape = (43, 2)
      sort1
      lftsfqs = [ 0.25  0.5   0.75  1.  ]

    crod_stress[1]
      type=RealRodStressArray ntimes=4 nelements=2
      data: [ntimes, nnodes, 2] where 2=[axial, torsion]
      data.shape = (4, 2, 2)
      element.shape = (2,)
      element name: CROD
      sort1
      load_steps = [ 0.25  0.5   0.75  1.  ]

etc.

Without that, I'd be guessing. Even then, I need an example. SOL 109 is not used as much, so while you get a numpy array, the read method might not be vectorized.

It's just very suspicious that on the same geometry there is such a discrepancy in the read rates. The vectorized OP2 is ~500x faster than a non-vectorized read, so seeing a 10x slowdown indicates one of the following:
 - 1+ non-trivial, non-dominating-size results that aren't vectorized
 - you are suppressing 95% of your elements in the SOL 109 case
 - you just have a very small model relative to the number of time steps

The 300,000 time step loop is certainly not helping, but I'm not sure that's the most important problem to solve. It certainly doesn't have the easiest solution. The real issue is that your data wants to be in SORT2 format, but you're reading it as SORT1. pyNastran doesn't support SORT2 well, and neither do MSC and NX Nastran (MSC is better, though), so despite you requesting SORT2, it gives you SORT1.

It's probably possible to speed up the 300,000 time step loop, but at a significant memory cost. You could stack the data, cast it all at once, and then reshape it, but that's a major change to the way the code works internally.
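
Roughly the idea, as a sketch only (this is not how the reader is written today, and the sizes are made up):

    import numpy as np

    ntimes, nelements, nvalues = 1000, 315, 2  # made-up sizes

    # collect the raw bytes for each time step instead of casting per step...
    chunks = [np.random.bytes(nelements * nvalues * 4) for _ in range(ntimes)]

    # ...then stack, cast once, and reshape at the end; the join is the
    # memory cost, since the data briefly exists twice
    data = np.frombuffer(b''.join(chunks), dtype=np.float32)
    data = data.reshape(ntimes, nelements, nvalues)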

Steve