I do not have a solution, but my two cents on this:
OpenCV?
To do this in a DSP buffer, you probably want to use the array operators defined in OpenCV (ADD, OR, XOR, AND, etc.), which use SIMD instructions to process vectors of values in memory efficiently.
I would think of this as a (smaller) sample buffer that is written, at the right place, into a (bigger) display buffer. The right place is probably a MOD operation: the running index of the sample, modulo the size of the bigger buffer, gives the address to write to.
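To make the MOD idea concrete, here is a minimal sketch in plain Python (no OpenCV involved). The names `display`, `head` and `write_samples` are made up for illustration, and a real version would copy whole blocks instead of looping per sample:

```python
def write_samples(display, head, samples):
    """Write `samples` into the ring buffer `display`, starting at `head`.

    Returns the new write position. All names here are hypothetical.
    """
    n = len(display)
    for s in samples:
        display[head] = s
        head = (head + 1) % n  # wrap around at the end of the bigger buffer
    return head


display = [0.0] * 8  # the (bigger) display buffer
head = 0             # next write position

# Two (smaller) sample buffers arriving one after the other.
head = write_samples(display, head, [1.0, 2.0, 3.0])
head = write_samples(display, head, [4.0, 5.0, 6.0, 7.0, 8.0, 9.0])
```

The second call runs past the end of the buffer, so the last sample wraps around and overwrites the oldest value at index 0.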
I doubt this is what you want, though, since I am not sure why you need it or what scale is intended: with the OpenCV solution you would be sending a bigger buffer to the graphics card than you really need. Every update sends the whole display buffer, including the values that did not change.
Shaders?
I think the solution is just to send the most recent sample(s) to the shader and handle the update of the display buffer in the shader itself.
That should be much more performant, and maybe easier to implement and scale. It may even be very similar to what is already done in the timeSeries visualizer.
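A rough CPU-side simulation of that idea, again in plain Python and not actual shader code. I am assuming the GPU keeps a ring buffer (`gpu_ring`), the CPU only uploads the new samples plus a write position (`head`), and the shader computes per column which entry to read. All of these names are hypothetical, and I have not checked how the timeSeries visualizer does it:

```python
def push_samples(gpu_ring, head, new_samples):
    """Stand-in for uploading only the new samples to the GPU-side buffer."""
    n = len(gpu_ring)
    for s in new_samples:
        gpu_ring[head] = s
        head = (head + 1) % n
    return head


def column_value(gpu_ring, head, x):
    """What a shader could compute for screen column x (0 = oldest sample)."""
    return gpu_ring[(head + x) % len(gpu_ring)]


gpu_ring = [0.0] * 6  # hypothetical buffer that stays on the GPU
head = push_samples(gpu_ring, 0, [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])

# Read the buffer back in display order, oldest to newest.
columns = [column_value(gpu_ring, head, x) for x in range(len(gpu_ring))]
```

The point is that the stored data never has to be shifted or re-sent: only `new_samples` and `head` change per update, and the reordering happens at read time.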