Why does the computer's memory usage keep increasing with the amount of data when drawing with pyqtgraph?


xiangyu ning

Nov 25, 2020, 8:37:46 PM
to pyqtgraph

 

Recently I ran into a problem with pyqtgraph. I wrote a host-computer (PC) program that records data sent by an MCU over a serial port and stores it in a database. The problem is that when I use this program to read the saved data back out of the database and display it with pyqtgraph, the computer's memory usage grows steadily as the amount of data grows, until it finally runs out. If I comment out the line self.plotWidget_ted.plot(y=b, x=a, pen='b'), i.e. do no drawing at all, the memory usage stays flat. What causes this? Are the plotted points being kept somewhere in memory? My program reads the data from the database in a loop, loading it batch by batch, and clears each batch once it has been loaded.

Patrick

Nov 25, 2020, 9:14:36 PM
to pyqtgraph
Hi,

I'm guessing you are calling self.plotWidget.plot(x, y) a lot, which creates a new plot item each time. You should instead make one plot, and then use .setData(x, y) to update the existing plot with the new data.

self.plot = self.plotWidget.plot()
# .... later, when data is received
self.plot.setData(x, y)
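
(If you do want several separate curves on screen at once, one alternative, sketched here for illustration, is to remove the old items explicitly before drawing new ones; self.plotWidget is assumed to be a pyqtgraph PlotWidget:)

# Each .plot() call adds a new PlotDataItem to the scene and keeps it there,
# which is why memory keeps growing. If separate curves are really needed,
# clear the old ones first:
self.plotWidget.clear()                        # remove all existing items
curve = self.plotWidget.plot(x, y, pen='b')    # then draw the fresh curve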

Patrick

xiangyu ning

Nov 26, 2020, 10:20:48 AM
to pyqtgraph
Thank you for your answer. I tried your suggestion, but after switching to self.plot.setData(x, y) I found that the curve drawn previously could no longer be kept on the PyQt5 interface. The pyqtgraph documentation for setData(*args, **kargs) says: "Clear any data displayed by this item and display new data. See __init__() for details; it accepts the same arguments." So setData clears the previously loaded data and the earlier points can no longer be retrieved, which is a problem because my software uses a LinearRegionItem to select the data in a chosen interval. [attachment: 企业微信截图_16064039279384.png]
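
(A possible workaround, sketched here for reference and not taken from the thread: accumulate the received points yourself and pass the complete arrays to setData on every update, so the earlier data stays on screen for the LinearRegionItem selection. curve is assumed to be the item returned by plotWidget.plot():)

xs, ys = [], []                      # history buffers, created once

def on_new_point(x, y):              # hypothetical handler for each incoming sample
    xs.append(x)
    ys.append(y)
    curve.setData(xs, ys)            # one PlotDataItem, redrawn with the full history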

Magne Lauritzen

Nov 26, 2020, 10:57:18 AM
to pyqt...@googlegroups.com
Also remember that when you provide data to plot, internal copies are made, so plotting 1 GB of data will result in several GB of memory being consumed. This has caused problems for me in the past.

On Thu, Nov 26, 2020, 15:14 xiangyu ning <ning...@gmail.com> wrote:

Thank you for your answer. Because it takes a few days to collect a large amount of data, I have modified the code according to your suggestion and am testing it now; results may take a few days.


xiangyu ning

Nov 26, 2020, 12:12:32 PM
to pyqtgraph
Thank you. How did you solve the problem in the end? My current situation is that there is about 1.5 GB of data in MongoDB; if I display all of it with pyqtgraph, it occupies about 90% of my computer's memory, roughly 0.9 × 24 GB ≈ 21.6 GB, which is frightening, and that is only three days' worth of data. My plan is to run the test for more than 10 days, so the data could reach well over ten GB. As things stand, that does not look feasible.

Ognyan Moore

Nov 26, 2020, 12:49:03 PM
to pyqt...@googlegroups.com
When dealing with lots of data like that, consider if you need to plot all that data, or if you can subsample to get the total data down to something more manageable.
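
(For illustration, the crudest form of this, assuming x and y are NumPy arrays and curve is an existing PlotDataItem; the 100k cap is arbitrary:)

step = max(1, len(y) // 100_000)     # keep roughly 100k points on screen
curve.setData(x[::step], y[::step])  # stride-subsample before handing the data to pyqtgraph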

xiangyu ning

Nov 26, 2020, 7:23:38 PM
to pyqtgraph
Thank you for your answer. My project implements the functionality of a DC analyzer in Python, so all of the data needs to be displayed; I then drag with the mouse to select the interval of interest using a LinearRegionItem and analyze the data inside that interval further. I have also lowered my sampling rate, but that only alleviates the excessive memory usage rather than fixing it fundamentally; once the experiment runs long enough, memory still runs out. Also, as mentioned above, is the memory-copying behaviour caused by Qt itself or by pyqtgraph? I chose Python because it has so many modules, which avoids reinventing the wheel, but if the memory problem cannot be solved, could I build this directly in Qt instead? Would the same problem occur there?

Ognyan Moore

Nov 26, 2020, 7:26:57 PM
to pyqt...@googlegroups.com
I want to do some experimentation; can you give the shape of the array you're plotting? I'm not sure it helps with the memory issue, but you can also try pyqtgraph's plot downsampling methods.
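
(For example, these can be enabled directly on the curve returned by plot(); a minimal sketch:)

curve.setDownsampling(auto=True, method='peak')  # let pyqtgraph thin the points per screen pixel
curve.setClipToView(True)                        # only draw what falls inside the visible x-range
# Note: this speeds up drawing, but the full arrays passed to setData are still held in memory.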

Patrick

Nov 26, 2020, 8:08:31 PM
to pyqtgraph
Hi,

I know these are the pyqtgraph forums, and we all love pyqtgraph here, but have you looked at some other options? For example, a dedicated time series database (InfluxDB, Prometheus) and Grafana for the plotting interface.
I've used InfluxDB and Grafana with a home-made temperature logger which communicates over a serial port. The Grafana web interface lets you select time and date ranges and zoom into the regions of data. It's accessible over the network and works really well!
You'll still need to write some code to actually log your data and put it into the database, but that's relatively easy. The program I wrote got a little out of hand, and I ended up releasing it publicly here. You'd just need to write your own data source plugin. Even if you don't use that framework, there are a lot of instructions for setting up the database etc.

If you want to stay with writing your own plotting program, then what Ognyan is saying above about resampling your data is correct. Trying to load 1+ GB of raw data and plot all of it will be slow and run into memory limits. You need to load only specific data from disk depending on the zoom level and range of your plot. As your view changes, you'll need to load different parts of the data from disk (this is what Grafana does when accessing its database). In this case a MongoDB database might not be a good choice, and why a time series database would be better. Another alternative that might work is a memory-mapped array --- I was looking at using zarr for a project I'm working on at the moment where the size of the collected data may be larger than the computer's RAM size.
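
(A rough sketch of the "load only what is visible" idea, using a plain memory-mapped NumPy array rather than zarr; the file name, dtype and 100k-point cap are made up for illustration:)

import numpy as np
import pyqtgraph as pg

data = np.memmap('samples.dat', dtype='float32', mode='r')   # full dataset stays on disk
plot_widget = pg.PlotWidget()
curve = plot_widget.plot(pen='b')

def refresh(viewbox, xrange):
    # reload a thinned slice of the data every time the visible x-range changes
    lo = int(max(xrange[0], 0))
    hi = int(min(xrange[1], len(data)))
    step = max(1, (hi - lo) // 100_000)          # cap how many points are actually drawn
    xs = np.arange(lo, hi, step)
    curve.setData(xs, data[lo:hi:step])          # only this slice is read into RAM

plot_widget.getViewBox().sigXRangeChanged.connect(refresh)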

Patrick

xiangyu ning

Nov 26, 2020, 9:12:25 PM
to pyqtgraph
Thank you for your enthusiastic guidance, I have learned a lot. As you said, we all like pyqtgraph and think it is a wonderful productivity tool, so many of my projects are built with it. I looked up all the professional tools and modules you mentioned on Google, but they don't seem to fit what I need. Also, I only know Python and my English is not very good (these forum posts are written with the help of Google Translate), so learning new tools or technologies is quite difficult for me. Could you confirm something for me: if only PyQt5 and pyqtgraph are used, is this memory usage inherent and impossible to avoid?

Magne Lauritzen

Nov 27, 2020, 3:07:37 AM
to pyqt...@googlegroups.com
I have also had the problem of needing to plot very large datasets (hundreds of GB). I solved it by dynamically subsampling the dataset so that only a manageable number of points is plotted. The subsampling takes a lot of time, though, so once a subsampled view has been computed I store the result in an LRU cache. Maybe you can use a similar approach.

When you select the region to analyze in the plot, you can grab the same region from the original dataset (in chunks to prevent memory overflow) and process it. This method decouples the representation (the plot) from the dataset.
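
(A bare-bones sketch of that caching idea; the chunked analysis part is left out, and the file name, dtype, cache size and step are arbitrary. curve is assumed to be an existing PlotDataItem:)

from functools import lru_cache
import numpy as np

data = np.memmap('samples.dat', dtype='float32', mode='r')   # full dataset stays on disk

@lru_cache(maxsize=32)
def subsampled(start, stop, step):
    # slicing a memmap only touches the pages it needs; the cache keeps
    # recently used views so panning back and forth stays cheap
    return data[start:stop:step].copy()

xs = np.arange(0, len(data), 1000)                 # original sample indices of the thinned view
curve.setData(xs, subsampled(0, len(data), 1000))  # plot the subsampled representation only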
