Re: GPU Module No available contact pair


Ruochun Zhang

May 13, 2022, 5:43:36 PM
to ProjectChrono
Hi David,

This issue is a weakness in the default assumption we made that a sphere can have at most 12 contacts. The assumption saves GPU memory and helps identify large-penetration problems in a simulation, which are typical when the time step is not small enough. It is fine for near-rigid sphere-sphere contacts, but problematic when meshes are involved (each mesh facet in contact with a sphere eats up one slot as well). Imagine a sphere sitting on the tip of a needle made of mesh: it could be in contact with tens of mesh facets, and that is before counting the sphere neighbors it can potentially have.

The fix is easy. Please go to the file ChGpuDefines.h (in chrono\src\chrono_gpu), and replace
#define MAX_SPHERES_TOUCHED_BY_SPHERE 12
by
#define MAX_SPHERES_TOUCHED_BY_SPHERE 20
or some even larger number if you need it. Rebuild and your script should run fine. Note that the error messages are hard-coded to say 12 is not enough whenever MAX_SPHERES_TOUCHED_BY_SPHERE is exceeded, so if 20 is not enough and you need even more, just change it and do not let the error messages confuse you.
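
To illustrate why a fixed per-sphere limit both saves memory and can run out, here is a minimal C++ sketch. It is not the C::GPU implementation: `kMaxContactsPerSphere`, `ContactSlots`, `count_rejected`, and `slot_bytes` are hypothetical names used only for this example, and the real solver's data layout may differ.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the compile-time limit discussed above.
constexpr std::size_t kMaxContactsPerSphere = 12;

// Illustrative per-sphere contact list with a fixed number of slots.
struct ContactSlots {
    std::array<int, kMaxContactsPerSphere> partner{};  // IDs of touching spheres or mesh facets
    std::size_t used = 0;

    // Returns false when every slot is already taken. In this sketch, that is
    // the situation analogous to running out of contact pair slots.
    bool add(int partner_id) {
        if (used == kMaxContactsPerSphere) return false;
        partner[used++] = partner_id;
        return true;
    }
};

// Counts how many contacts of one sphere could not be stored.
inline std::size_t count_rejected(std::size_t sphere_neighbors, std::size_t mesh_facets) {
    ContactSlots slots;
    std::size_t rejected = 0;
    for (std::size_t i = 0; i < sphere_neighbors + mesh_facets; ++i) {
        if (!slots.add(static_cast<int>(i))) ++rejected;
    }
    return rejected;
}

// Storage for the slot arrays of n spheres; it grows linearly with the limit.
inline std::size_t slot_bytes(std::size_t n_spheres, std::size_t slots_per_sphere) {
    assert(slots_per_sphere > 0);
    return n_spheres * slots_per_sphere * sizeof(int);
}
```

With 6 sphere neighbors and 20 touching facets, `count_rejected(6, 20)` reports 14 contacts that do not fit into 12 slots, which is the needle-tip scenario described above. Raising the limit removes the rejections at the cost of proportionally more storage per sphere.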

Another thing is that it is better to use meshes with relatively uniform triangle sizes. I attached a rebuilt mesh based on your original one. It's optional and does not seem to affect this simulation, but it's a good practice.

To answer your other questions: Unfortunately, C::GPU does not currently have an efficient way of streaming particles into the system. The method you are using (re-initialization) is probably what I would do too if I had to. With a problem size similar to yours, it should be fine. C::GPU also does not have an official API for manually changing particle positions. However, this should be fairly straightforward to implement. The naive approach is, of course, to do it on the host side with a for loop. If you care about efficiency, then we should instead add one custom GPU kernel call at the end of each iteration that scans the z coordinates of all particles and adds an offset to them if they are below a certain value. It would be nice if you could tailor it to your needs, but if you need help implementing this custom kernel, let us know (it may be good to add it as a permanent feature).
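
The naive host-side loop can be sketched as follows. This is only an illustration under stated assumptions: `Vec3` and `recycle_particles` are hypothetical names, and the code operates on a plain array of positions rather than through any C::GPU API. Getting the positions out of the solver and writing them back (for example via the re-initialization approach already used in this thread) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical position type for this sketch.
struct Vec3 {
    double x, y, z;
};

// Moves every particle whose z coordinate is below z_min upward by z_offset.
// Returns the number of particles that were relocated.
inline std::size_t recycle_particles(std::vector<Vec3>& pos, double z_min, double z_offset) {
    assert(z_offset > 0.0);
    std::size_t moved = 0;
    for (Vec3& p : pos) {
        if (p.z < z_min) {
            p.z += z_offset;
            ++moved;
        }
    }
    return moved;
}
```

A GPU kernel version would apply the same test and offset with one thread per particle. In either case, it is worth checking that the relocated particles do not land on top of particles already near the top of the domain.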

Lastly, I don't know if you are interested or not but in the new generation of DEM simulator that we are currently developing, apart from supporting non-trivial particle geometries, there will be efficient ways to do both things (sleeper and active entities; periodic boundary with no extra cost). It is not out yet, however.

Thank you,
Ruochun

On Thursday, May 12, 2022 at 10:47:27 PM UTC-5 dreg...@gmail.com wrote:
Hello,

I have been working on trying to use the GPU module in Project Chrono to fill a vessel with spherical particles. I have been able to do so by using the method in the demos of generating particle sheets and allowing them to settle in the vessel. Recently, however, I have been attempting to fill the vessel with a "particle source" method that continuously streams particles into the domain until a certain number of particles is reached. I am unsure if this method is officially supported with the GPU module, and I am encountering a crash that happens consistently. I receive the error "No available contact pair slots for body # and body #" after the simulation has progressed. It seems to occur sometime after the particles hit the bottom of the vessel. I have tried reducing my timestep, reducing the "flow rate" of incoming particles, changing the height of the particle inflow, and altering some stiffness/damping constants, but this error always seems to happen soon after the particles make contact with the vessel. I have attached my input files; any help would be appreciated.

An unrelated question, but does the GPU module support the changing of particle positions during the simulation (i.e. taking all particles below a certain z and moving them to the top to "recycle" them continuously during the simulation)?

Thanks!
David


GPBR_Vessel_Fine.obj

David Reger

May 16, 2022, 10:55:47 AM
to ProjectChrono
Hi Ruochun,

Thanks for the help, it seems to be working now! I was able to get the particle relocation working as well.

I am interested in the new solver. Let me know when a release/test build is available for it, I’d like to try it out to see if it’s faster for these applications. 

Thanks!
David

David Reger

May 16, 2022, 11:41:03 AM
to ProjectChrono
Actually, it looks like the particle source still isn’t working, even when increasing MAX_SPHERES_TOUCHED_BY_SPHERE up to 200. The simulation runs for longer, but still fails with the same contact pairs error. Interestingly, it seems to fail sooner if I make the particle source radius smaller (it fails after 627 pebbles added (step 34) when the source radius is 0.26, and after 31499 pebbles added (step 85) when the source radius is 1.1). Do I just need to increase the number further, or is this a different issue?

Thanks!
David

Ruochun Zhang

May 16, 2022, 2:23:06 PM
to ProjectChrono
Hi David,

I am pretty sure that script worked for me until reaching a steady state, like in the picture attached. One thing is that I'd be quite surprised if MAX_SPHERES_TOUCHED_BY_SPHERE = 200 and the kernels did not fail to compile... I'd say something like 32 is the maximum that you should assign it. Maybe you should try something like 30 to see if it works. But if it still gives the same error, we have to have a look at the script. Is it still the same script you attached?

Changing particle sizes has a large impact on the physics, and the "contacts over limit" problem can happen naturally (like in your first question) or as a result of non-physical behavior in the simulation, which is often related to improper simulation parameters with respect to the sphere radius. So it's hard to say without context. One thing you should do, of course, is visualize the simulation results before the crash and see if there is something non-physical.

Thank you,
Ruochun
vessel.jpg

Ruochun Zhang

May 16, 2022, 2:29:58 PM
to ProjectChrono
Hi David,

Oh sorry, before you do that, could you try this: I assume you cloned Chrono and built from source. Can you check out the feature/gpu branch first, then apply the MAX_SPHERES_TOUCHED_BY_SPHERE change, and then build and try again with the script you failed to run initially? I did apply a bug fix in the feature/gpu branch that is probably not in the develop branch yet, and I hope to rule out the possibility that this bug was hurting you.

Thank you,
Ruochun

David Reger

May 16, 2022, 3:43:17 PM
to ProjectChrono
Hi Ruochun,

Sorry, I had made some changes to my script. I redownloaded the original scripts I provided here earlier, and rebuilt chrono with the feature/gpu branch from a fresh repo clone with the touched by sphere change. After doing all of this and running the exact same script that I had uploaded originally, I now got a “negative local pod in SD” error around frame 90. This is a bit strange since you had managed to run that script successfully, and everything was a clean install with the same script that I uploaded, so it should’ve had the same outcome as your run. Did you make any changes to the script/json? 

Ruochun Zhang

May 16, 2022, 4:28:23 PM
to ProjectChrono
Hi David,

It's a bit weird; I checked and I changed almost nothing. I did comment out lines 120~122 (because your json file does not define rolling friction), but I tested adding them back and it affected nothing; I can still run it. Are you running it with your original mesh? If so, can you try the mesh I attached in an earlier post and let me know if it helps? If it does not help, we can go from there; however, I'd be very confused at that point.

Thank you,
Ruochun

David Reger

May 16, 2022, 4:47:21 PM
to ProjectChrono
I gave it a try with my original mesh and your new mesh, and both still gave the negative local… error around frame 90. You’re just using the Chrono version from the repo with the feature/gpu branch, right? If you haven’t already, could you try a fresh clone of the repo, apply the max_touched change, and then run the script to see if it’s successful, just to make sure that we’re both doing the exact same thing and seeing a different outcome?

Thanks!
David

David Reger

May 16, 2022, 8:40:36 PM
to ProjectChrono
Hi Ruochun,

I just tried the script on a different machine, using the feature/gpu branch and increasing max_touched to 20, and the script worked, so the issue must be something with the setup on the system I was using. I'll put an update in here once I find out what the differences are between the two machines, in case anyone else has a similar issue.

Thanks a lot for your help!
David

Ruochun Zhang

May 16, 2022, 9:34:23 PM
to ProjectChrono
Hi David,

Glad that worked for you. In general, the "negative SD" problem means that particles got out of the simulation "world" somehow, which is usually a consequence of unusually large penetrations (and subsequent huge velocities). To avoid that, the typical thing to do is to reduce the time step size and check that you don't instantiate particles overlapping with each other. I know that the GPU execution order will make each DEM simulation slightly different from the others, but statistically they should be the same, and since I (and you on the second machine) can consistently run that script, I don't think this is the cause; it is more likely that the operating systems caused the code to compile differently on these 2 machines.
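
The overlap check mentioned above can be done before a new batch of spheres is handed to the solver. The following is a minimal brute-force sketch, assuming sphere centers and radii are available as plain data; `Sphere`, `overlaps`, and `batch_has_overlap` are hypothetical names and not part of C::GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sphere description for this sketch: center and radius.
struct Sphere {
    double x, y, z, r;
};

// Two spheres overlap when the distance between centers is smaller than the
// sum of their radii (exact touching does not count as overlap here).
inline bool overlaps(const Sphere& a, const Sphere& b) {
    assert(a.r > 0.0 && b.r > 0.0);
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    const double rsum = a.r + b.r;
    return dx * dx + dy * dy + dz * dz < rsum * rsum;
}

// Returns true if any sphere in `batch` overlaps another sphere in `batch`
// or any sphere in `existing`. Brute force, O(n^2) comparisons.
inline bool batch_has_overlap(const std::vector<Sphere>& existing,
                              const std::vector<Sphere>& batch) {
    for (std::size_t i = 0; i < batch.size(); ++i) {
        for (std::size_t j = i + 1; j < batch.size(); ++j) {
            if (overlaps(batch[i], batch[j])) return true;
        }
        for (const Sphere& e : existing) {
            if (overlaps(batch[i], e)) return true;
        }
    }
    return false;
}
```

For a particle source that adds a modest number of spheres per step, the quadratic cost is likely acceptable; for large batches a spatial grid would be the usual replacement.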

I would be interested in knowing what you find out in the end, it would be a help to me.

Thank you!
Ruochun

David Reger

May 17, 2022, 10:50:13 PM
to ProjectChrono
Hi Ruochun,

It looks like the problem was the CUDA version that was used on the original machine. The machine that was having issues was using CUDA 11.2.2, but the other system was using CUDA 10.1.243. After switching the original problematic machine to 10.1.243, the script ran without issue.

Thanks!
David

Ruochun Zhang

May 17, 2022, 11:22:18 PM
to ProjectChrono
Hi David,

I vaguely remember CUDA 11.2 was quite a bugged version, for our purposes at least. Maybe we used to have problems with that version too, but I don't recall clearly. Thankfully 11.3 came out soon enough and right now, we are using CUDA 11.6 and having no problem. I'm letting you know this because I don't think you are stuck with CUDA 10, you can give the newest version a try should you be interested.

Thank you,
Ruochun

David Reger

May 18, 2022, 12:53:47 PM
to ProjectChrono
Thanks!
Also, I have another unrelated question. I want to assign particles a group ID based on their position when a simulation is first started (starting from a checkpoint file) so that I can track the particles in each group as the simulation progresses. I just want to dump the group IDs as a column in the particle output file. Could you give some guidance on which files I will need to modify to add this functionality?
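
For illustration, the idea of position-based group IDs written as an extra output column can be sketched outside the solver like this. It is not the modification shared later in this thread and does not use any C::GPU API: `Particle`, `assign_groups_by_height`, and `to_csv` are hypothetical names, and grouping by equal-height layers in z is just one possible rule.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical particle record for this sketch.
struct Particle {
    double x, y, z;
    int group;
};

// Slices [z_min, z_max) into n_groups equal layers and stores the layer index
// as the group ID. Particles outside the range are clamped to the end layers.
inline void assign_groups_by_height(std::vector<Particle>& ps, double z_min,
                                    double z_max, int n_groups) {
    assert(n_groups > 0 && z_max > z_min);
    const double layer = (z_max - z_min) / n_groups;
    for (Particle& p : ps) {
        double g = std::floor((p.z - z_min) / layer);
        if (g < 0.0) g = 0.0;
        if (g > n_groups - 1) g = n_groups - 1;
        p.group = static_cast<int>(g);
    }
}

// Writes positions plus the group ID as the last column.
inline std::string to_csv(const std::vector<Particle>& ps) {
    std::ostringstream out;
    out << "x,y,z,group\n";
    for (const Particle& p : ps) {
        out << p.x << ',' << p.y << ',' << p.z << ',' << p.group << '\n';
    }
    return out.str();
}
```

Because the IDs are assigned once at start-up and then only carried along, the grouping rule can be swapped (radial zones, inlet batches, and so on) without touching the output code.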

Thanks!
David

David Reger

May 18, 2022, 4:59:25 PM
to ProjectChrono
Sorry, disregard the previous message, I was able to figure it out haha. I can give you my modifications if you’d like to make the group functionality available for others as well, let me know!

Thanks!
David

Ruochun Zhang

May 18, 2022, 8:23:24 PM
to ProjectChrono
Hi David,

That would be great! If you have your own fork of Chrono, can you create a pull request on branch feature/gpu, so I can incorporate your changes? That is preferable, but don't worry if you can't; you can also share with me through whatever method is easy for you.

Thank you!
Ruochun

Yves Eric Maxime Robert

Jun 4, 2022, 3:02:14 PM
to ProjectChrono
Hello,

I would be very interested in this feature for my project. Would it be possible to share the changes you made, especially for the particle relocation?
Please let me know; we can also exchange by email.

Yves

Ruochun Zhang

Jun 6, 2022, 12:29:33 AM
to ProjectChrono
Hi Yves,

If by "relocation" you mean something similar to the "particle source" method mentioned in this thread, then you can already see how it's done from the scripts shared therein: basically, add more spheres to the system and then "re-initialize". If you meant establishing a periodic boundary, then I don't believe it is discussed in this thread, and as I said, this is a more dedicated issue and worth a separate discussion.

Thanks,
Ruochun

David Reger

Jun 6, 2022, 11:34:24 AM
to ProjectChrono
Realized I forgot to share my changes to add the group functionality. I’m not super familiar with GitHub so I’ll just upload my chrono_gpu source files here. I believe that I only made changes to ChGpuDefines.h, ChSystemGpu.h and .cpp, and ChSystemGpu_impl.h and .cpp to add the SetParticleGroup function.

Thanks!
David

7693838451459351512chrono_gpu.zip

Ruochun Zhang

Jun 8, 2022, 12:07:16 PM
to ProjectChrono
Hi David,

Thanks for sharing!

Ruochun

Yves Eric Maxime Robert

Jun 14, 2022, 11:16:05 AM
to ProjectChrono
Hello,

I am actually running into the exact same problem.

For instance, with the scripts attached in the first post, and modifying the MAX_SPHERES_TOUCHED_BY_SPHERE to 32 (then rebuilding the solver), I get the "No available contact pair" error after around 40 steps.
I am on the updated version of the feature/gpu branch, so I am a bit confused.

Do you have any clue of what it could be?

Best regards,
Yves

Ruochun Zhang

Jun 14, 2022, 2:29:26 PM
to ProjectChrono
Not sure; the scripts in this post run well for me in the end. Make sure you are using the newest CUDA, not 11.2. Beyond that, the common causes were already mentioned in this thread. If you are gradually refining the particle size in a problem (or just directly running with a small particle size and a large time step size) and at some point it stops working and gives the "No available contact pair" error, it is more likely that the time step size is not sufficiently small for this particle size and the simulation crashes with large particle penetration. To debug, you can:
1. Visualize to see if the simulation already becomes non-physical before the crash;
2. Try a larger particle size to see if it works or, if you don't mind a longer testing time, try smaller time steps.
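
To see why a smaller particle size calls for a smaller time step, here is a rough estimate sketched in C++. It assumes a simple undamped linear-spring contact between two equal spheres, which may differ from the force model the solver actually uses; the function names, and the default of 20 steps per contact, are illustrative assumptions rather than C::GPU settings.

```cpp
#include <cassert>
#include <cmath>

constexpr double kPi = 3.14159265358979323846;

// Mass of a solid sphere.
inline double sphere_mass(double radius, double density) {
    return density * (4.0 / 3.0) * kPi * radius * radius * radius;
}

// Duration of an undamped linear-spring contact between two equal spheres:
// t_c = pi * sqrt(m_eff / k), with effective mass m_eff = m / 2.
inline double contact_duration(double radius, double density, double stiffness) {
    assert(radius > 0.0 && density > 0.0 && stiffness > 0.0);
    const double m_eff = 0.5 * sphere_mass(radius, density);
    return kPi * std::sqrt(m_eff / stiffness);
}

// Time step that resolves one contact with the requested number of steps.
// The default of 20 is an illustrative choice, not a solver requirement.
inline double suggested_time_step(double radius, double density, double stiffness,
                                  double steps_per_contact = 20.0) {
    assert(steps_per_contact > 0.0);
    return contact_duration(radius, density, stiffness) / steps_per_contact;
}
```

Since mass scales with the cube of the radius, halving the radius at fixed stiffness and density shortens the estimated contact duration by a factor of about 2.83 (the square root of 8). A time step that was adequate for the larger particles can therefore be too coarse for the smaller ones.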

I don't think MAX_SPHERES_TOUCHED_BY_SPHERE being 12 causes that many problems in itself, except with some specific meshes, so maybe you should debug with it set back to 12.

Thank you,
Ruochun

David Reger

Jun 15, 2022, 4:39:46 PM
to ProjectChrono
Hi Yves,

Make sure that after making the change in ChGpuDefines.h you rerun make in both your chrono_build directory and the application directory, so that everything is properly linked with the changes. The script previously posted should work once the changes are made; I only increased MAX_SPHERES_TOUCHED_BY_SPHERE to 20 and it worked for me.

Thanks,
David

Yves Eric Maxime Robert

Jun 19, 2022, 4:09:32 AM
to ProjectChrono
Hello, 

Thank you for your answers. 

It seems that after a clean install, it works. 
I must have done something wrong in the steps. 
I appreciate your support.

Best regards,
Yves