Best practice for improving speed with large datasets


Richard Smith

Apr 25, 2014, 4:08:59 PM
to pyqt...@googlegroups.com
I'm plotting data sets with 500k points up into the several million range.

With this many data points the zoom functions basically become unusable.  I know there has been some work recently on improving speed.  I've pulled and installed git HEAD as of today but I don't see any real difference in speed.

What are the best practices for dealing with graphs with this many data points?


Luke Campagnola

Apr 26, 2014, 2:59:31 PM
to pyqt...@googlegroups.com
Hi Richard, 
Have you tried out the new downsampling options? See:

These options are also available via the PlotItem context menu. 
 

Richard Smith

Apr 27, 2014, 6:58:17 PM
to pyqt...@googlegroups.com
Thanks.  That's just the info I was looking for. I've started playing with these options.

setClipToView() 

This makes a noticeable difference once you zoom in beyond the initial display.
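To illustrate why clipping helps once you zoom in, here is a minimal NumPy sketch of the general idea: only the samples inside the visible x-range are handed to the renderer. This is not pyqtgraph's actual implementation; the function name `clip_to_view` and its parameters are hypothetical, and it assumes the x values are sorted in ascending order.

```python
import numpy as np

def clip_to_view(x, y, x_min, x_max):
    """Return the samples with x in [x_min, x_max], plus one neighbour on each side.

    Hypothetical helper for illustration; assumes x is sorted ascending.
    """
    # searchsorted locates the slice boundaries in O(log n) rather than
    # testing every point against the view range.
    start = max(np.searchsorted(x, x_min, side="left") - 1, 0)
    stop = min(np.searchsorted(x, x_max, side="right") + 1, len(x))
    return x[start:stop], y[start:stop]

# One million samples, but the view only covers x in [2000, 3000].
t = np.arange(1_000_000, dtype=float)
signal = np.sin(t / 1000.0)
tv, sv = clip_to_view(t, signal, 2000.0, 3000.0)
```

The extra neighbour on each side keeps the line connected to the edge of the view. At full zoom-out nothing is clipped, which matches the observation that the benefit only appears after zooming in.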

My attempts with downsampling were not successful.  I tried combinations of mode, setting auto to True, and setting ds to a number, but all of them yielded errors similar to this:

/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotDataItem.py:530: RuntimeWarning: divide by zero encountered in double_scalars
  ds = int(max(1, int(0.2 * (x1-x0) / width)))
Traceback (most recent call last):
  File "/home/rsmith/bin/process_hr.py", line 732, in <module>
    process_files(args.filenames,args)
  File "/home/rsmith/bin/process_hr.py", line 622, in process_files
    w1p1.plot(t,red_source,pen='r')
  File "/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotItem/PlotItem.py", line 622, in plot
    self.addItem(item, params=params)
  File "/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotItem/PlotItem.py", line 515, in addItem
    item.setDownsampling(*self.downsampleMode())
  File "/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotDataItem.py", line 323, in setDownsampling
    self.updateItems()
  File "/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotDataItem.py", line 462, in updateItems
    x,y = self.getData()
  File "/usr/local/lib/python2.7/dist-packages/pyqtgraph-0.9.8_develop_c8ee4a86be_-py2.7.egg/pyqtgraph/graphicsItems/PlotDataItem.py", line 530, in getData
    ds = int(max(1, int(0.2 * (x1-x0) / width)))
OverflowError: cannot convert float infinity to integer
 

Is there more I have to do besides just setting .setDownsampling(auto=True,mode='peak') on the plot?

Luke Campagnola

Apr 27, 2014, 10:22:09 PM
to pyqt...@googlegroups.com

It appears you are doing it correctly, but the revision you have (c8ee4a) is rather old, and there happens to be a divide-by-zero bugfix at that line further ahead in the develop branch. So pulling the latest code may well solve the problem. If not, it would be awesome if you could dig into the lines surrounding the error and tell me what it is about your data that triggers it.
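For readers hitting the same traceback, the failing expression is `int(max(1, int(0.2 * (x1-x0) / width)))`. A minimal sketch of one way to guard such an expression follows. It assumes the trigger is a zero `width`, which the thread does not confirm, and it is not the actual fix in the develop branch; the function name `auto_downsample_factor` and the `scale` parameter are hypothetical.

```python
import math

def auto_downsample_factor(x0, x1, width, scale=0.2):
    """Pick an integer downsampling factor from a visible span and a view width.

    Hypothetical helper mirroring the expression in the traceback.
    """
    span = x1 - x0
    # A zero width would make the division blow up (infinity with NumPy
    # scalars, ZeroDivisionError with plain floats), so fall back to
    # "no downsampling" for degenerate inputs.
    if width <= 0 or not math.isfinite(span) or span <= 0:
        return 1
    return max(1, int(scale * span / width))

print(auto_downsample_factor(0, 4_000_000, 800))  # large span, normal width
print(auto_downsample_factor(0, 4_000_000, 0))    # degenerate width
```

Returning 1 in the degenerate case simply means "plot every point"; a correct factor can be computed on a later update once the width is known.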


Luke  

Richard Smith

Apr 27, 2014, 10:53:38 PM
to pyqt...@googlegroups.com
That was it.  I pulled develop HEAD and it's working now with no errors.  I've only played with mode='peak' so far, but it looks like it makes a big difference.  My 4 million point plot is actually usable with the zoom now.
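As a rough picture of why a peak-style mode keeps a plot usable, here is a minimal NumPy sketch of min/max downsampling: each chunk of `ds` samples is replaced by its minimum and maximum, so narrow spikes survive the reduction. This is an illustration of the general technique, not pyqtgraph's code; the function name `downsample_peak` is hypothetical.

```python
import numpy as np

def downsample_peak(x, y, ds):
    """Reduce (x, y) by keeping the min and max of every chunk of ds samples.

    Hypothetical helper for illustration only.
    """
    if ds <= 1:
        return x, y
    n = len(y) // ds  # whole chunks only; a trailing partial chunk is dropped
    if n == 0:
        return x, y
    chunks = y[: n * ds].reshape(n, ds)
    # Two output points per chunk: the minimum, then the maximum.
    y_out = np.empty(2 * n, dtype=y.dtype)
    y_out[0::2] = chunks.min(axis=1)
    y_out[1::2] = chunks.max(axis=1)
    # Both points share the x value from the middle of their chunk.
    x_mid = x[: n * ds].reshape(n, ds)[:, ds // 2]
    return np.repeat(x_mid, 2), y_out

# Four million samples with a single one-sample spike.
x = np.arange(4_000_000, dtype=float)
y = np.sin(x / 1000.0)
y[1_234_567] = 50.0
xd, yd = downsample_peak(x, y, 1000)
```

With `ds=1000` the 4 million samples shrink to 8,000 points, and the spike is still present in the output. Plain subsampling (taking every 1000th point) would almost certainly miss it.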

Huge thanks to everyone who helped with these features.  This addition was perfect timing for me.
