Fastest way to convert UMat to JavaFX image

Benjamin Xiao

Jun 22, 2020, 11:13:40 PM
to javacv
Hello fellow JavaCV users!

I've been working on a security camera project in Java and I am currently looking at ways to reduce overall CPU usage. Thanks to the Bytedeco OpenCV library, I've been able to switch over to using UMats, which allow me to use GPU acceleration to run my object detection. However, displaying the security camera footage inside my JavaFX application still costs quite a bit of CPU. This is my current approach:

public static Image toJFXImage(UMat mat) {
    // JavaFX native format is BGRA so let's convert on GPU
    UMat bgraMat = new UMat();
    cvtColor(mat, bgraMat, COLOR_BGR2BGRA);

    var w = bgraMat.cols();
    var h = bgraMat.rows();
    var channels = 4;

    var imageArray = new byte[w * h * channels];
    // Get regular Mat from UMat so that we can access memory.
    // Couldn't figure out a way to do it otherwise.
    // Wrap in try-resource block because tempMat needs to be closed
    // otherwise we crash during GC.
    try (Mat tempMat = bgraMat.getMat(ACCESS_READ)) {
        tempMat.data().get(imageArray);
    }
    var buffer = new PixelBuffer<ByteBuffer>(w, h, ByteBuffer.wrap(imageArray),
            PixelFormat.getByteBgraPreInstance());

    return new WritableImage(buffer);
}

As far as I know, there's only one buffer copy happening on the CPU here. This is as optimized as I know how to make this, but it is still incurring some CPU cost.

Now, I know that modern applications usually leverage hardware-accelerated video overlays to paint to the screen. However, I am at a loss for how to do that with UMats and JavaCV/OpenCV. Does anyone know how to efficiently render webcam video to a JavaFX GUI?

Thanks in advance and thank you Bytedeco for all the amazing work.

Ben

Samuel Audet

Jun 22, 2020, 11:32:28 PM
to jav...@googlegroups.com, Johan Vos, Robert Ladstätter
Hi,

We can access GPU memory that OpenCV uses, but I'm not sure if JavaFX has any hooks for this.
Johan, would you know? Or know who we should be talking to about this?

In any case, you'll get more speed by using a direct NIO buffer instead of an array.
See Robert's blog post about that here:

Samuel
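
For readers following along, here is a minimal sketch of the distinction Samuel is drawing, using only java.nio. The class FrameBuffers and its method names are hypothetical and are not part of JavaCV or JavaFX. A heap buffer wraps a Java byte[] that lives on the garbage-collected heap, while a direct buffer is allocated outside the heap so that native code can read and write it in place.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a JavaCV or JavaFX API.
final class FrameBuffers {
    private FrameBuffers() {}

    // Size of one frame in bytes; multiplyExact throws on int overflow.
    static int frameBytes(int width, int height, int channels) {
        return Math.multiplyExact(Math.multiplyExact(width, height), channels);
    }

    // Heap-backed buffer: the pixels live in a byte[] on the Java heap,
    // as in the first version of toJFXImage (ByteBuffer.wrap(imageArray)).
    static ByteBuffer heapFrame(int width, int height, int channels) {
        return ByteBuffer.wrap(new byte[frameBytes(width, height, channels)]);
    }

    // Direct buffer: the pixels live in native memory outside the Java heap.
    static ByteBuffer directFrame(int width, int height, int channels) {
        return ByteBuffer.allocateDirect(frameBytes(width, height, channels));
    }
}
```

Calling isDirect() on the result tells the two kinds apart. Whether the direct variant is actually faster in a given pipeline depends on what consumes the buffer, so it is worth measuring.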

Benjamin Xiao

Jun 23, 2020, 1:59:29 AM
to javacv
Hi Samuel

Thanks for that suggestion! I was looking for a direct way to get ByteBuffers from Mats, but I couldn't find a way until now. Now my code looks like this:
public class VideoUtils {
    public static Image toJFXImage(UMat mat) {
        UMat bgraMat = new UMat();
        cvtColor(mat, bgraMat, COLOR_BGR2BGRA);
        
        var w = bgraMat.cols();
        var h = bgraMat.rows();
        
        try (Mat tempMat = bgraMat.getMat(ACCESS_READ)) {
            var pixelBuf = new PixelBuffer<ByteBuffer>(w, h, tempMat.createBuffer(), PixelFormat.getByteBgraPreInstance());
            return new WritableImage(pixelBuf);
        }    
    }
}
The results are promising. On my Surface Pro, my CPU usage used to peak around 45%, now it peaks around 33%. On my Ryzen 3900x, my CPU usage dropped from 4% to 2.5%.

Is there any copying when I call bgraMat.getMat(ACCESS_READ)? Or is it just a wrapper around a UMat?

Samuel Audet

Jun 23, 2020, 4:26:27 AM
to javacv
It gets copied. Mat can't hold data on GPUs.


Benjamin Xiao

Jun 23, 2020, 5:08:15 AM
to javacv
Thanks for the info, Samuel.

Unfortunately, I did some further testing that passes direct NIO buffers to JavaFX images, and on my Surface Pro it consistently crashes outside of the JVM. I've attached a log file. Curiously enough, the crash does not happen on my Ryzen machine. From a cursory glance at the logs, I am wondering if it's a video driver issue?

hs_err_pid5248.log

Benjamin Xiao

Jun 23, 2020, 5:22:53 AM
to javacv
Also, if umat.getMat(ACCESS_READ) returns a copy of the buffer in a Mat, then why is there a limitation that all derived Mats need to be closed before the parent UMat gets GC'ed?

I did some more digging into the above crash. Is it crashing because I get a ByteBuffer from a Mat, feed that into a WritableImage for display by JavaFX, and then close the Mat? Here's the relevant function:
       try (Mat tempMat = bgraMat.getMat(ACCESS_READ)) {
           var pixelBuf = new PixelBuffer<ByteBuffer>(w, h, tempMat.createBuffer(), PixelFormat.getByteBgraPreInstance());
           return new WritableImage(pixelBuf);
       }
Does tempMat.createBuffer() in this case create a new buffer that can outlast tempMat.close()? Or is tempMat.close() basically destroying that buffer before JavaFX has a chance to display it, causing the crash?

Sorry for inundating you with questions. It's just hard to judge what getMat() and createBuffer() actually do from the API docs.

Samuel Audet

Jun 23, 2020, 6:31:32 AM
to jav...@googlegroups.com, Benjamin Xiao
There is a bug in OpenCV concerning closing Mat and UMat:
Please report upstream if this is important to your application.

No copy happens with Mat.createBuffer(), so yes it will crash if you try to access it after Mat is destroyed.

Benjamin Xiao

Jun 23, 2020, 3:45:53 PM
to javacv
Darn, okay. These C++ / Java interops get tricky. I did some more tests, and sure enough if I do a deep copy on the ByteBuffer, the application doesn't crash, but now I've reintroduced another buffer copy. Is there a way I can directly copy from a UMat into a ByteBuffer so I can skip that intermediary umat.getMat(ACCESS_READ) step?
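
The deep copy mentioned above could look roughly like the following sketch. It uses only java.nio, and the class BufferCopies and the method deepCopy are hypothetical names chosen for illustration. The idea is to copy the bytes into a buffer the Java side owns, so the copy stays valid after the Mat that backed the original buffer has been closed.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a JavaCV API.
final class BufferCopies {
    private BufferCopies() {}

    // Copies the remaining bytes of source into a newly allocated direct
    // buffer. The result shares no memory with source.
    static ByteBuffer deepCopy(ByteBuffer source) {
        // duplicate() shares the bytes but has its own position and limit,
        // so reading through it leaves the caller's buffer state untouched.
        ByteBuffer view = source.duplicate();
        ByteBuffer copy = ByteBuffer.allocateDirect(view.remaining());
        copy.put(view);
        copy.flip(); // make the copied bytes readable from position 0
        return copy;
    }
}
```

In the thread's code this would be applied to the result of tempMat.createBuffer() before tempMat is closed. It allocates a new buffer on every call, which is the extra cost being discussed here.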

Samuel Audet

Jun 23, 2020, 8:07:15 PM
to jav...@googlegroups.com, Benjamin Xiao
UMat.copyTo(Mat) might work? If not, we can surely do something with the pointers in UMat.u...

Benjamin Xiao

Jun 23, 2020, 9:57:39 PM
to javacv
I've already tried UMat.copyTo(Mat) and it still crashes. Here's a version of my function that does it this way.

    public static Image toJFXImage(UMat mat) {
        UMat bgraMat = new UMat();
        cvtColor(mat, bgraMat, COLOR_BGR2BGRA);

        Mat tempMat = new Mat();
        bgraMat.copyTo(tempMat);

        // The buffer behind tempMat.createBuffer() is probably freed when tempMat gets GC'ed.
        var pixelBuf = new PixelBuffer<ByteBuffer>(tempMat.cols(), tempMat.rows(), tempMat.createBuffer(), PixelFormat.getByteBgraPreInstance());
        return new WritableImage(pixelBuf);
    }

As you can see, I create a tempMat using UMat#copyTo. Then, I create a ByteBuffer from it and feed it into a WritableImage. This code still crashes on my Surface Pro and I believe it is because tempMat gets GC'ed and the buffer behind tempMat.createBuffer() is destroyed in the process. Later, when I try to display the WritableImage, it crashes because the buffer behind it is gone.

This seems like weird behavior to me from a Java language standpoint. Normally, the WritableImage should have a valid reference to the ByteBuffer created by tempMat.createBuffer(). When tempMat gets GC'ed, the buffer should technically still be okay because there's a valid reference to it. This is probably just one of those C++ intricacies that don't jibe well with Java.

Anyways, this behavior is why I have to deep copy the ByteBuffer that I get from tempMat. However, this is an additional buffer copy which I'd like to avoid, which is why I asked if there's some way to directly copy from whatever buffer is behind a UMat, directly to a ByteBuffer that I create. This way, there's only one buffer copy from UMat and the ByteBuffer won't be destroyed when the UMat goes away. So ideally, I want something like this:

    public static Image toJFXImage(UMat mat) {
        UMat bgraMat = new UMat();
        cvtColor(mat, bgraMat, COLOR_BGR2BGRA);
        
        var w = bgraMat.cols();
        var h = bgraMat.rows();
        var ch = bgraMat.channels();

        ByteBuffer buf = ByteBuffer.allocate(w * h * ch);
        buf.put(bgraMat.asByteBuffer());
        
        var pixelBuf = new PixelBuffer<ByteBuffer>(w, h, buf, PixelFormat.getByteBgraPreInstance());
        return new WritableImage(pixelBuf);
    }

This code doesn't work because bgraMat.asByteBuffer() gives me an invalid buffer with a length of 1. I know that to get data out of a UMat, I'll need to do a copy since it's in GPU memory. But I want to make the copy directly into a ByteBuffer, rather than going through an intermediary step with a Mat, etc.

I looked into UMat.u, but I am at a loss for how to get the buffer using the resulting UMatData object. I've already tried doing UMat.u().data().asByteBuffer(). But that gives me another invalid buffer with a length of 88. Any tips on how to proceed?

Thanks again for your help with all this, Samuel.

Samuel Audet

Jun 23, 2020, 10:11:50 PM
to jav...@googlegroups.com, Benjamin Xiao
Native memory allocation can be pretty expensive, so you shouldn't be reallocating that way all the time anyway.
Just allocate the buffer once, and never deallocate it, so you won't have any issues concerning its lifetime either!
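
One way to read this advice is sketched below, using only java.nio. The class ReusableFrameBuffer and its load method are hypothetical names. The sketch keeps a single direct buffer alive for the lifetime of the object and copies each new frame into it, reallocating only if a frame is larger than anything seen before.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a JavaCV API.
final class ReusableFrameBuffer {
    // Allocated lazily on first use, then kept for the object's lifetime.
    private ByteBuffer buffer = ByteBuffer.allocateDirect(0);

    // Copies the remaining bytes of source into the long-lived buffer and
    // returns that buffer, ready to be read from position 0.
    ByteBuffer load(ByteBuffer source) {
        ByteBuffer view = source.duplicate(); // leave the caller's position alone
        if (buffer.capacity() < view.remaining()) {
            // Grow only when a frame does not fit; same-size frames reuse memory.
            buffer = ByteBuffer.allocateDirect(view.remaining());
        }
        buffer.clear();
        buffer.put(view);
        buffer.flip();
        return buffer;
    }
}
```

The trade-off is that the returned buffer is overwritten by the next call to load, so whoever displays the frame has to be done with it before the next frame arrives.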

Benjamin Xiao

Jun 24, 2020, 12:15:02 AM
to javacv
Usually, yes, I'd agree. But the problem here is that the WritableImages and UMats are used at different moments. I have one thread that grabs frames (UMats) from a webcam and sends them into two queues, one for a display thread and another for object detection. This allows me to maintain UI fluidity and camera video smoothness. But this also means that UMat lifetimes won't line up with WritableImage lifetimes, which is why I want to do a copy. The problem right now is that I am forced to do two copies instead of just one.

If there really is no way around the two-copy issue, I'll restructure that part of the code to use a circular buffer or something similar so that the lifetimes hopefully line up.
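
A circular buffer along those lines might be sketched as follows, using only java.nio. The class FrameRing, its constructor parameters, and its write method are hypothetical, and the slot count would have to be tuned to how far the display thread can lag behind the grabber. All slots are allocated once up front, and each incoming frame overwrites the oldest slot.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a JavaCV API.
final class FrameRing {
    private final ByteBuffer[] slots;
    private int next; // index of the slot the next frame will overwrite

    FrameRing(int slotCount, int frameBytes) {
        if (slotCount < 1) {
            throw new IllegalArgumentException("slotCount must be at least 1");
        }
        slots = new ByteBuffer[slotCount];
        for (int i = 0; i < slotCount; i++) {
            slots[i] = ByteBuffer.allocateDirect(frameBytes); // allocated once
        }
    }

    // Copies the remaining bytes of source into the oldest slot and returns it.
    // Throws BufferOverflowException if the frame is larger than a slot.
    synchronized ByteBuffer write(ByteBuffer source) {
        ByteBuffer slot = slots[next];
        next = (next + 1) % slots.length;
        slot.clear();
        slot.put(source.duplicate()); // duplicate() keeps source's position intact
        slot.flip();
        return slot;
    }
}
```

A slot stays untouched until slotCount further frames have been written, which gives the display side a bounded window in which the buffer contents are stable. The sketch does not coordinate with readers beyond that, so a consumer that falls further behind would see a frame change underneath it.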

I am still curious to see if there's any way to display directly from the GPU, so if you or Johan could point me in the right direction that'd be great!

Thanks for all the help! In the meantime, I'll restructure my code to do fewer allocations.

Ben