Dynamic cropping


Can

Sep 1, 2021, 6:57:34 AM
to Bonsai Users
Hello

I would like to discuss how I can implement dynamic cropping (introduced in v2.1) in Bonsai. This is not to be confused with static cropping, which is a pre-existing function in Bonsai.

Link to GitHub issue with some relevant links.

Kind regards
Can

brunocruz

Sep 2, 2021, 9:10:06 AM
to Bonsai Users
I'm not sure what you mean by static cropping. The Crop node can be changed dynamically on every frame if it needs to be. I took a quick stab at it and came up with two different solutions:
1. Use the Crop node, passing the centroid of the animal as the coordinates of your crop plus a couple of ints that define the size of the crop. You might need to do a bit of math to offset the center of the crop, and add some conditions to make sure the code doesn't crash, since if Centroid.X + CropWidth > ImageWidth you will get an error...

2. What I would say is a more general and thus more elegant solution :P. Use the AffineTransform function to center your frame on the centroid of the animal (you can also correct for rotation, if you need to). After you move the image, just do a static crop at the center of the image and you are good to go. The advantage of this solution is that Bonsai already "pans" the rest of the picture, so you don't need to worry about running "out of bounds".
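As an aside, the geometry behind both suggestions can be sketched in plain Python. This is only an illustration of the arithmetic with hypothetical helper names, not the Bonsai API (the actual workflows do this graphically):

```python
def crop_window(cx, cy, crop_w, crop_h, img_w, img_h):
    """Solution 1: fixed-size crop centred on the centroid (cx, cy),
    with the top-left corner clamped so the rectangle never leaves
    the image (avoiding the Centroid.X + CropWidth > ImageWidth error)."""
    x = int(cx - crop_w / 2)
    y = int(cy - crop_h / 2)
    x = max(0, min(x, img_w - crop_w))
    y = max(0, min(y, img_h - crop_h))
    return x, y, crop_w, crop_h

def centering_transform(cx, cy, img_w, img_h):
    """Solution 2: a 2x3 affine matrix that translates the frame so
    the centroid lands on the image centre; a fixed central crop then
    always contains the animal, with no out-of-bounds concerns."""
    return [[1.0, 0.0, img_w / 2 - cx],
            [0.0, 1.0, img_h / 2 - cy]]
```

The clamping in solution 1 is exactly the "conditions to make sure the code doesn't crash" mentioned above; solution 2 avoids it because the translation pans the whole image before the static crop.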

Let me know if this works and good luck!
B

dynamicCropping_withAffine.bonsai
dynamicCropping_withCrop.bonsai

Can

Sep 2, 2021, 9:22:21 AM
to Bonsai Users
Thank you for the reply, brunocruz. I should've underlined that this is with respect to the DeepLabCut package for Bonsai, so your comment doesn't answer my request; my apologies for not making this clear. If you click the link to the issue, you can see exactly how dynamic cropping works in DeepLabCut; it is a very elegant method to speed up inference in live applications.

Gonçalo Lopes

Sep 2, 2021, 9:27:17 AM
to Can, Bonsai Users
Hi Can,

Reading the link you sent, DLC dynamic cropping seems to be doing what Bruno suggested (from the first paragraph):

> Namely, if you have large frames and the animal/object occupies a smaller fraction, you can crop around your animal/object to make processing speeds faster

So I guess I'm not sure why you say that Bruno's answer is not an answer to your question? As long as you send the cropped image to the Bonsai-DLC node, inference will be done only on the image subset.

Is there anything else you would need it to do?



Can

Sep 2, 2021, 9:50:37 AM
to Bonsai Users
The cropping is done with respect to the current position of the animal, and the bounding box around that position includes all the body parts, as recorded from the annotations in the DLC data set. The suggestion in #1 includes no method to detect the current centroid of the animal.

From the docs:
dynamic: triple containing (state, detectiontreshold, margin)

        If the state is true, then dynamic cropping will be performed. That means that if an object is detected (i.e., any body part > detectiontreshold), then object boundaries are computed according to the smallest/largest x position and smallest/largest y position of all body parts. This window is expanded by the margin and from then on only the posture within this crop is analyzed (until the object is lost; i.e., <detectiontreshold). The current position is utilized for updating the crop window for the next frame (this is why the margin is important and should be set large enough given the movement of the animal).
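The documented behaviour boils down to a small bounding-box computation. A plain-Python sketch of it (hypothetical names; this is not the DLC implementation itself, just the rule described in the docstring above):

```python
def dynamic_crop(bodyparts, threshold, margin, img_w, img_h):
    """bodyparts: list of (x, y, confidence) from the last detection.
    Returns (x0, y0, x1, y1) crop bounds over all body parts above
    `threshold`, expanded by `margin` and clamped to the image, or
    None if the object is lost (no part above threshold), in which
    case analysis falls back to the full frame."""
    pts = [(x, y) for x, y, p in bodyparts if p > threshold]
    if not pts:
        return None  # object lost: revert to full-frame analysis
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    x0 = max(0, min(xs) - margin)
    y0 = max(0, min(ys) - margin)
    x1 = min(img_w, max(xs) + margin)
    y1 = min(img_h, max(ys) + margin)
    return x0, y0, x1, y1
```

The margin matters because this window, computed from the current frame, is reused as the crop for the next frame, so it must be large enough to absorb the animal's movement between frames.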

Gonçalo Lopes

Sep 2, 2021, 9:57:02 AM
to Can, Bonsai Users
I just opened the two examples that Bruno sent two emails ago, and both of them are computing the centroid of the animal to produce an image which is centered on the animal position. Are you referring to a different example?


brunocruz

Sep 2, 2021, 10:15:05 AM
to Bonsai Users
The function you are describing is a pre-processing function from DLC that does not exist in Bonsai. However, you can easily replicate the preprocessing routine using Bonsai.
With that in mind I will try to explain a bit better what the workflow is doing (which can obviously be improved upon):
From a video file (FileCapture), convert the frames to grayscale and threshold the image (in this case I coded it to detect black objects on a white background). Then find the contours (FindContours) in the binarized image and the largest binary region in the frame (BinaryRegionAnalysis + LargestBinaryRegion). We take the Centroid (which is just a Point2f with the X and Y coordinates of the center of your animal) and use it to set the Crop X and Y position.
If you want to dynamically change the size of the crop to match the size of the animal you could, for instance, use the BinaryRegionExtremes node, which computes the extremes of your binary region, and use these as the input for your Crop Height and Width properties (see updated workflow).
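The chain described here can be approximated outside Bonsai with a short numpy sketch. This is a rough stand-in, not the .bonsai workflow: for brevity it pools all foreground pixels rather than isolating the largest connected region as LargestBinaryRegion does, and the function name is hypothetical:

```python
import numpy as np

def threshold_centroid_and_extent(gray, thresh=50):
    """Binarise a grayscale frame (dark animal on a light background),
    then return the centroid of the foreground pixels (the Crop X/Y
    position) and their bounding extent (the Crop Width/Height input).
    Returns None if nothing is below the threshold."""
    mask = gray < thresh                       # dark-object threshold
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    centroid = (xs.mean(), ys.mean())          # centre of the animal
    extent = (xs.min(), ys.min(), xs.max(), ys.max())
    return centroid, extent
```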
B

dynamicCropping_withCrop.bonsai

Can

Sep 2, 2021, 10:15:14 AM
to Bonsai Users
I understand that you are trying to avoid re-inventing the wheel.

However, this algorithm relies on the largest binary region and is not as robust as the method used by DeepLabCut (mentioned above and in the docs). We would also need to map the coordinates back to the original frame.

Other critiques:
- The largest binary region might not be the animal.
- Because of thresholding, irregular lighting can make the bounding box larger or smaller than the animal.

Can

Sep 2, 2021, 10:16:38 AM
to Bonsai Users
Please understand that I want to avoid thresholding altogether and use the DeepLabCut inference pipeline to crop my video.

Gonçalo Lopes

Sep 2, 2021, 10:49:42 AM
to Can, Bonsai Users
Hi Can,

Mapping the coordinates back to the original frame is a matter of applying an offset to the detected points, the same as what is supported by FindContours. I guess the option of adding such an offset could usefully be added to the DetectPose node.
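The offset mapping is just an addition per point. A trivial sketch (hypothetical name, assuming the crop's top-left corner is known):

```python
def to_full_frame(point, crop_origin):
    """Map a point detected inside a cropped image back into the
    coordinate frame of the original image by adding the crop's
    top-left offset."""
    (px, py), (ox, oy) = point, crop_origin
    return px + ox, py + oy
```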

If you want to compute a crop based on the last detected pose, then you could compute a bounding rectangle from the detected pose points and feed that back into Crop. This could also be included in the Pose output, as it would be similar to what is in the LargestBinaryRegion node (the Contour > Rect property gives you the contour bounding box, so you could use that rect to dynamically determine your crop based on animal size, for example).

Of course using the output of pose itself means you have to first analyse the video using the full-frame until you detect an animal, otherwise there is no information to help you crop.

The argument about robustness I believe is more complicated than you seem to be implying. Thresholding and DLC inference have different sources of error, which can lead to quite serious failure modes in both cases. I actually believe they can be complementary to each other, of course depending on your setup.

The above two changes seem reasonable to me (adding an Offset or Crop property to the DetectPose node, and adding the bounding rect to the output Pose). I don't think hard-coding the specific dynamic cropping algorithm in the node is very useful, as there can be all kinds of different ways to compute the crop and/or the input images to detect poses on, and we want to maximise and expand the potential uses of the node.

For example, in the past people have used thresholding with dynamic cropping to track body parts of multiple animals in one image, by first finding the animals using thresholding and then running DLC twice per frame, once on each of the cropped windows. This flexibility to compose your pre-processing pipeline is something we definitely want to keep.




Can

Sep 2, 2021, 12:16:09 PM
to Bonsai Users
Hello,

Before continuing, and to avoid talking in circles, I would like to underline that I am not saying that thresholding is unreasonable; my guess is that it is faster than the DLC method. Nevertheless, there are scenarios where it just won't work, depending on the environmental lighting, the animal, and the contrast between the animal and the environment. The speed increase from the DLC method is well documented, and I think its performance improvement is beyond question at this point. Both methods should be in Bonsai; having one but not the other would be a lost opportunity for experiment design. I also misrepresented brunocruz's method, and for that I would like to apologize; I should've worded myself better.

My thoughts on some of the statements

brunocruz

>The function you are describing is a pre-processing function from DLC that does not exist in Bonsai. 

Is this a problem? If nobody can do this, I can do it. I just thought it would be a no-brainer to implement this function (apparently it isn't), and Gonçalo would probably implement it faster than I can. Moreover, I am always eager to learn, and I am up for it if nobody else wants to implement it.

Gonçalo Lopes

>If you want to compute a crop based on the last detected pose, then you could compute a bounding rectangle with the detected pose points and feed that back into Crop. This could also be included in the Pose output, as it would be the same to what is in the LargestBinaryRegion node (Contour > Rect property gives you the contour bounding box so you could use that rect to dynamically determine your crop based on animal size for example).

This is a very cool idea; however, it is not what I need at the moment. Do you have a project file with this? I might use it down the line.

>Thresholding and DLC inference have different sources of error, which can lead to quite serious failure modes in both cases"

This would imply that the DLC model being used for inferencing isn't competent and acquiret better/more data for the training. With that said, thresholding only provides speed, and not robustness.

>Of course using the output of pose itself means you have to first analyse the video using the full-frame until you detect an animal, otherwise there is no information to help you crop.

This is certainly true; however, it is still better than thresholding in my opinion. This method ensures that the animal isn't cropped out, which might occur with thresholding.


Thank you for your time, guys; we have been discussing this and sharing headspace for hours. Means a lot :)
Can

Can

Sep 2, 2021, 12:25:45 PM
to Bonsai Users
 Correction:
>This would imply that the DLC model being used for inferencing isn't competent and acquiret better/more data for the training. With that said, thresholding only provides speed, and not robustness.

This implies that the DLC model being used for inferencing isn't competent and you should acquire better/more data for the training. With that said, in my humble opinion, thresholding only provides speed, not robustness, provided that the DLC model is fit.

Gonçalo Lopes

Sep 2, 2021, 1:38:59 PM
to Can, Bonsai Users
Hi Can,

> thresholding only provides speed, and not robustness.

This depends on what you are looking for in your choice of tracking method. Under certain degenerate conditions, thresholding can actually be more robust and accurate than DLC tracking, especially in very high contrast situations where there is no texture information for machine learning to exploit, but again, this is a much longer debate which is probably outside the scope of this discussion.

>The function you are describing is a pre-processing function from DLC that does not exist in Bonsai. 
 
> Is this a problem? If nobody can do this I can do it, I just thought it would be no-brainer to implement this function (apparently it isn't)

This exchange I find unfortunate. Bruno just stated such a pre-processing function didn't exist, and provided a valid workaround. He did not make any judgments about whether it should exist, or about coding skills necessary to implement this. We are all here to help each other using our free time, as you have acknowledged, and these kinds of statements can easily drain motivation from fellow contributors.

We are not aware of the background of everyone who joins us in the forum, and certainly from the way the original question was worded it was not clear that this was about adding this feature to the DLC package, so Bruno tried to help the best he could. Taking the time to phrase things more carefully would certainly have spared Bruno's time and energy.

There are a lot of features we would like to add to Bonsai, all the time. Unlike DeepLabCut, Bonsai is a general-purpose programming language which covers a lot of different areas, from behavior to data acquisition, signal processing, visual stimulus generation, physics simulation, VR, networking, etc. Of course we would like to add support not just for DLC but for other tracking libraries, cameras, operating systems, language interfaces and the like, but we don't have infinite time and resources, so we have to prioritize. We usually carefully weigh the pros and cons of not just adding, but maintaining, features. To be honest, the fewer features we have to maintain the better: fewer bugs that can creep in and fewer dependencies we have to keep up to date. From a maintainer's point of view it is indeed extremely useful if we can co-opt existing features rather than keep adding more options.

If we do add new features, it is best if they can be reused in many different possible situations, so we expand a lot of the space of applications with little effort. Adding dynamic cropping in the DLC node might well be one of these "enabler" features, but in that case the appropriate forum to discuss this is probably the original github issue. Ideally come prepared to discuss the pros and cons of different implementations in detail, hear proposals from other people, and possibly provide implementation suggestions, etc.

At this stage, I would propose we move the discussion back to the github issue (just reopen it) and we can evaluate the best way to provide support for this.

Can

Sep 2, 2021, 3:36:08 PM
to Gonçalo Lopes, Bonsai Users
Hello,

>Under certain degenerate conditions...

I understand, this problem might be somewhat nuanced after all.

>This exchange I find unfortunate.

Oh, you must have misunderstood me. Looking back, I think I should've worded myself better. I just meant to say that it is not a problem for me if nobody is willing to take up the mantle on this one, as I am up for it myself. I hope it wasn't demotivating for Bruno; I really appreciate your help.

See you on github!
Can

Gonçalo Lopes

Sep 2, 2021, 3:53:49 PM
to Can, Bonsai Users
Sounds good, see you on github!