Reserving cores not behaving as expected.


markeljfp

May 5, 2020, 11:35:47 AM
to Royal Render Knights Tavern
Greetings Knight!

After much experimentation I have concluded that reserving cores for the artists is not working as I would expect or hope.

Now that we have 128-core machines (hyperthreaded), I am attempting to reserve 30 cores for the artist. What appears to be happening is this: rendering takes up 98/128 of the total CPU power, spread across all the cores.

This means that >75% of every core is in use. This results in poor performance for the artist - they do not get a full core for single thread tasks, causing stuttering and laggy behaviour, particularly for animators.
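
As a rough check of the arithmetic above, here is a minimal sketch. The function name is hypothetical, and it assumes the scheduler spreads the render threads perfectly evenly over all logical cores, which is only an approximation of what Windows does:

```python
def per_core_load(render_threads: int, logical_cores: int) -> float:
    """Average load per logical core if render threads are spread evenly."""
    if logical_cores <= 0:
        raise ValueError("logical_cores must be positive")
    return min(1.0, render_threads / logical_cores)


total_cores = 128      # logical cores on the workstation
reserved_cores = 30    # cores intended for the artist
load = per_core_load(total_cores - reserved_cores, total_cores)
print(f"{load:.1%} average load on every logical core")
```

Under that assumption no core is ever left idle for the artist, which matches the stuttering described above.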

Having checked in the Task Manager, I find that mayabatch is running on low priority, which is good. Unlike other tasks running on the machine, I cannot change its affinity, or even see the current setting (it returns "access denied").

Interestingly, the behaviour on the machine itself is not reflected in rrControl, which clearly states how many cores are in use in the Client window, and the bar graph concurs with that. But on the node itself I see what is shown in the attached screenshot. This is an AMD Threadripper 3, and we have previously been dealing with Intel Xeon chips.

We were really hoping that background rendering would be the answer to some of our render capacity issues. Can you give me some pointers on this for troubleshooting? I'd expect to see individual cores sitting idle when mayabatch kicks in, but that just doesn't seem to be the case.

I hope you can help with this!

Cheers

markel

rr_cores_in_use.jpg

RR, Schoenberger

May 6, 2020, 4:37:19 AM
to rrKn...@googlegroups.com

Hi

 

>This means that >75% of every core is in use.

Yes, Windows likes to distribute a process onto all cores evenly.

Just start any application that uses one core only (or change the renderer settings to 1 thread)

 

But no matter what, reducing the cores even more is the only way to control how much texture data is read and how fast the memory usage changes.

Which has a big effect on the performance for the artist.

(The low process priority cannot control the memory and network access)

You can use a test scene that mostly uses CPU only to do a test like this one to see if there is a difference for the artist:

www.RoyalRender.de/download/support/RRMaya2018_spheres_longrender_arnold.zip

 

 

>I am unable to manipulate the affinity as with other tasks running on a machine,

It might be that there is some new security for services. Or you have to run the Task Manager “as admin”.

 

But there is another issue: Arnold and VRay have had a “new feature” for 2(?) years.

If you change the core affinity and disable a few cores, then the renderer changes it back.

Which is why RR cannot change the core usage on the fly if you log out.

The only way to control the renderer is to specify the number of threads in their settings.

From my tests I remember the core affinity changed back once a frame starts to render.
You should be able to test it within any of your users' Maya sessions (and the scene file above).

Perhaps a combination of the thread reduction and core affinity works better.

The rrClient could verify the core affinity from time to time, perhaps once every minute?

But it will not work if your render time is e.g. 30 seconds.

The function is already in the rrClient, I might be able to enable it for the update this week.

I think the only thing missing was some research.
I should check how Physical cores are split into logical hyperthreading cores for Intel and AMD.

E.g. If physical core number 0 is split into logical core 0+1, then RR has to disable 0+1 to get a real free core.

But if physical core number 0 is split into logical core 0+64 (on your 64 core system), then RR has to disable 0+64 to get a free core.
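
The two layouts above can be sketched as follows. This is only an illustration with hypothetical function names, not the rrClient code, and it ignores any limit on the size of a single affinity mask:

```python
def sibling_threads(physical_core: int, physical_total: int, layout: str) -> tuple:
    """Logical core indices that share one physical core.

    layout "adjacent": physical p -> logical 2p and 2p+1
    layout "split":    physical p -> logical p and p + physical_total
    """
    if not 0 <= physical_core < physical_total:
        raise ValueError("physical_core out of range")
    if layout == "adjacent":
        return (2 * physical_core, 2 * physical_core + 1)
    if layout == "split":
        return (physical_core, physical_core + physical_total)
    raise ValueError(f"unknown layout: {layout}")


def render_mask(reserve: int, physical_total: int, layout: str) -> int:
    """Bit mask of logical cores left for rendering after reserving
    the first `reserve` physical cores (both hyperthreads of each)."""
    mask = (1 << (2 * physical_total)) - 1
    for core in range(reserve):
        for logical in sibling_threads(core, physical_total, layout):
            mask &= ~(1 << logical)
    return mask


# Small 4-core / 8-thread example, reserving one physical core:
print(bin(render_mask(1, 4, "adjacent")))  # logical 0 and 1 cleared
print(bin(render_mask(1, 4, "split")))     # logical 0 and 4 cleared
```

Choosing the wrong layout would free two half-cores instead of one real core, which is why the mapping has to be checked per CPU.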

 

 

 

 

Different topic:

On the other hand, machines with a huge number of cores would actually need more testing.

In my old tests, machines with Hyperthreading have been ~20% faster than without (which means that without HT a core is used at around 83%, i.e. 1/1.2).

But there have been tests by some larger animation studios (Siggraph BOF discussion) that machines with more than 32 cores do not scale up really well in rendering.

Which means a 64 core machine is not twice as fast as a 32 core machine. And a 128 core machine not twice as fast as a 64 core machine.

I have not done tests myself, so I cannot tell you

- if the issue is related to the fact that fewer and fewer image tiles are left towards the end, so more and more cores sit idle at the end.
   (This could be solved by reducing the image tile size, but not too much)

- or if the issue is simply that the renderers do not scale well.
   In such a case it might not make a difference if you enable or disable Hyperthreading.
   But: machines without HT can be controlled better.
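
To illustrate the first possibility (idle cores at the end of a frame), here is a minimal sketch. It assumes every tile takes exactly the same time to render, which real scenes do not, and the resolution and bucket sizes are made-up example values:

```python
import math


def tile_count(width: int, height: int, bucket: int) -> int:
    """Number of square buckets needed to cover the image."""
    return math.ceil(width / bucket) * math.ceil(height / bucket)


def core_utilization(tiles: int, cores: int) -> float:
    """Fraction of core time spent rendering if all tiles take equally long.

    Tiles are processed in 'rounds' of up to `cores` tiles; the last round
    is usually only partly filled, so some cores wait for the others.
    """
    rounds = math.ceil(tiles / cores)
    return tiles / (rounds * cores)


for bucket in (128, 64):
    tiles = tile_count(1920, 1080, bucket)
    for cores in (32, 64, 128):
        print(f"bucket {bucket:3d}: {tiles:3d} tiles on {cores:3d} cores "
              f"-> {core_utilization(tiles, cores):.0%} busy")
```

In this simplified model the larger bucket size leaves a 128 core machine idle for a large part of the frame, while the smaller bucket keeps it busy. It says nothing about the second possibility, renderer scaling itself.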

 

If you would like to verify it, you can disable HT on one machine.

Then choose some default production scene
(You might want to test my scene above as well, but it will only test CPU and not texture load and memory)

Modify it:

-change the output image
-sequence length to 5 or 10 frames

-remove all animation to get the same image and render time

-enable Render Settings/Arnold Renderer/Sampling/Advanced/Lock Sampling Pattern

-save it

Open the rrSubmitter.

-Change frame range to e.g. 10-19 and assign Client A,  submit, keep open.

-Change frame range to e.g. 20-29 and assign Client B,  submit.

(You might want to change “Start After” to 3am if the machines are used right now)



If you like, you can do further tests with the Tile Size.

Once with half the tile size and once with doubled size.
(Render Settings/System/Render Settings/Bucket size)

 

 

regards,
Holger Schönberger

 

Please use the rrKnights Tavern
or our support system for new questions.

RR, Schoenberger

May 7, 2020, 3:57:50 PM
to rrKn...@googlegroups.com

Hi

 

The new version is almost ready (release on Friday).

It modifies the core assignment and checks every 60 seconds if the assignment is the same.

You can enable log message level Debug-Jobs via the menu of rrClientWatch.

Then the client logs the modifications into the app log file RR/sub/log/C_...txt

I have tested it with Maya Arnold: once a new frame starts, the setting is lost, but RR changes it again after 60 seconds.

RR, Schoenberger

May 8, 2020, 3:47:01 AM
to rrKn...@googlegroups.com

Please download and run this tool:

https://docs.microsoft.com/en-us/sysinternals/downloads/coreinfo

At some point it states:

 

Logical to Physical Processor Map:

**----------  Physical Processor 0 (Hyperthreaded)

--**--------  Physical Processor 1 (Hyperthreaded)

----**------  Physical Processor 2 (Hyperthreaded)

------**----  Physical Processor 3 (Hyperthreaded)

--------**--  Physical Processor 4 (Hyperthreaded)

----------**  Physical Processor 5 (Hyperthreaded)

 

On our machines, I would need to disable core 0+1 to disable the first physical core.
Which is what I have implemented for now.
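
For anyone who wants to check their own machines, the map can be read with a few lines of script. This is a sketch only: the helper name is hypothetical, it assumes the leftmost column is logical processor 0, and it has only been reasoned through against the six-core sample above, not against output from a 128 thread machine:

```python
import re

SAMPLE = """\
**----------  Physical Processor 0 (Hyperthreaded)
--**--------  Physical Processor 1 (Hyperthreaded)
----**------  Physical Processor 2 (Hyperthreaded)
------**----  Physical Processor 3 (Hyperthreaded)
--------**--  Physical Processor 4 (Hyperthreaded)
----------**  Physical Processor 5 (Hyperthreaded)
"""

# One map line: a run of '*' and '-' followed by the physical core number.
MAP_LINE = re.compile(r"^([*-]+)\s+Physical Processor (\d+)")


def parse_core_map(text: str) -> dict:
    """Return {physical core: [logical core indices]} from the map section."""
    mapping = {}
    for line in text.splitlines():
        match = MAP_LINE.match(line.strip())
        if match:
            pattern, core = match.group(1), int(match.group(2))
            mapping[core] = [i for i, ch in enumerate(pattern) if ch == "*"]
    return mapping


core_map = parse_core_map(SAMPLE)
print(core_map[0])  # logical cores that share physical core 0
```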

markeljfp

May 13, 2020, 7:54:22 AM
to Royal Render Knights Tavern
Hi Holger

That sounds great. Is it in the 8.2.30 release (#7893)?

I'll try to get that implemented and see how we get on.

Many thanks!

m

markeljfp

May 13, 2020, 11:55:15 AM
to Royal Render Knights Tavern
Hi All

Sadly, something is still not working. 

Despite being logged into six different machines the problem has become worse.

May 13. 16:32.32| NEW   {T37a} job received 69-70,1
May 13. 16:32.32|     {T37a} Job Received: {T37a} shangrila|Maya|--004|fallA_mist_FX_Lighting_v004 (Render  1070-1071,1)
May 13. 16:32.32|     {T37a} Rendering to: C:\RR_localdata\renderout\A\waterfallA_mist_FX\Lighting\maya\images\waterfallA_mist_FX_Lighting_v004\waterfallA_mist_FX_Lighting_v004_mist_BG_beauty.  copy to:\\islay\projects\shangrila\work\assets\FX\
May 13. 16:32.32|     {T37a} Auto-version change. Job: 2019.02000  client config: 2019.00000
May 13. 16:32.32|     {T37a} Starting 1 Job instances... {T37a} shangrila|Maya|--004|fallA_mist_FX_Lighting_v004 (System Memory:  Available/Free: 103534MB  System Cache: 103MB  Kernel: 1380MB)
May 13. 16:32.33|     {T37a} Job Started
May 13. 16:32.45| WRN {T37a} Unable to set cores for render process: Error 87: The parameter is incorrect..
May 13. 16:33.02|     ******* Sending client status '{T37a} Rendering Job 1070-1071' userIdle: -1*******
May 13. 16:34.35| WRN {T37a} Unable to set cores for render process: Error 87: The parameter is incorrect..
May 13. 16:35.16|     {T37a} Render task done.
May 13. 16:35.17|     {T37a} Job Done: {T37a} shangrila|Maya|--004|fallA_mist_FX_Lighting_v004
May 13. 16:35.20|     ******* Sending client status '{T37a} Render Successfull 1070-1071' userIdle: -1*******


This is directly from the UI log. It's rendering but now it's using all the cores 100% instead of even reserving some power, which is actually worse than before.

These machines use the workstation mix preset - I changed the reserved cores from 30 to 32 (just in case the setting needed updating or something). All nodes have been rebooted at least once (Windows updates)

I hope this can be resolved. Thanks for fixing the reserve core setting!

m

RR, Schoenberger

May 13, 2020, 12:18:08 PM
to rrKn...@googlegroups.com

Hi

 

The new core assignment should be an addition to the render threads set in the render app.

So the total usage should not change with the update.

 

There is still one issue in the new and all old versions.

Reserve cores has to be active when the scene is sent to the client for the first time.

If the job does not have KeepScenOpen set, then reserve cores has to be active when the client gets a new frame chunk to render.

 

 

Please check the render log file in rrControl, there is a section after Maya starts:

R 13| _______________________________________________________ Maya started ____________________________________________________________________

R 29| ' 00:13.40 rrMaya      : Flag  threads        : '4'

 

 

 

>The parameter is incorrect..

Might be an issue with the processor.

I did not have a Threadripper CPU to test with, only Intel machines with 64 cores.

RR, Schoenberger

May 19, 2020, 4:32:26 AM
to rrKn...@googlegroups.com

FYI:

I just spoke to Jeremy, I will test the new core affinity function on your hardware, probably next week.

RR, Schoenberger

Jun 9, 2020, 11:25:05 AM
to rrKn...@googlegroups.com

Hi

 

I have done some tests on the Threadripper.

It will take more time to completely change the affinity function.

Therefore I have to move it into a dev ticket which is scheduled for summer.

 

 

 

Explanation:

It is a bit more complicated as you can see in the Task Manager Affinity control as well.

The affinity mask works for up to 64 cores (some OS limitation).

If you have 128 cores, then Windows creates 2 groups of 64 cores.

Then Windows assigns a process to one of the 64-core-groups.

If you want to change the affinity via the Task manager, you have to decide in which of the 64-cores-group the process should be running.

 

But how can Arnold use all 128 cores?

Arnold starts 128 threads for rendering.

Then the threads call a function to switch themselves to one of the 64-Core-groups.

(Which is probably the reason why the affinity is lost once Arnold starts to render a new frame)

 

 

In the end it means that RR has to get all threads and set the affinity for each thread.

As RR does not know in which 64-Core-group the artist's processes are running,

RR has to move half the application threads into each 64-cores-group.
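
The plan above can be sketched roughly as follows. This is not the rrClient implementation; the function names, the choice to reserve the lowest cores of each group, and the thread IDs are all assumptions for illustration. On Windows the resulting (group, mask) pairs would presumably be applied per thread with something like SetThreadGroupAffinity, which this sketch does not call:

```python
GROUP_SIZE = 64  # a Windows processor group holds at most 64 logical cores


def group_masks(logical_total: int, reserved_per_group: int) -> dict:
    """Return {group: affinity mask} with the lowest `reserved_per_group`
    logical cores of every group cleared (kept free for the artist)."""
    masks = {}
    groups = (logical_total + GROUP_SIZE - 1) // GROUP_SIZE
    for group in range(groups):
        size = min(GROUP_SIZE, logical_total - group * GROUP_SIZE)
        full = (1 << size) - 1
        reserved = (1 << min(reserved_per_group, size)) - 1
        masks[group] = full & ~reserved
    return masks


def assign_threads(thread_ids, masks: dict) -> list:
    """Distribute threads round-robin over the groups.

    Returns a list of (thread id, group, mask) tuples, so that about half
    of the render threads end up in each 64-core-group."""
    groups = sorted(masks)
    plan = []
    for i, tid in enumerate(thread_ids):
        group = groups[i % len(groups)]
        plan.append((tid, group, masks[group]))
    return plan


# 128 logical cores, 16 reserved in each group (32 in total):
masks = group_masks(128, 16)
plan = assign_threads([1001, 1002, 1003, 1004], masks)  # made-up thread IDs
for tid, group, mask in plan:
    print(f"thread {tid} -> group {group}, mask {mask:#018x}")
```

Reserving cores in both groups costs render capacity in each group, but it covers the case where it is unknown which group the artist's application landed in.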

 

 

 

Side-Note:

The processor groups could be the issue with the “old” thread limitation that RR used with Arnold.

The old way tells Arnold to use 64 render threads and Arnold might assign all 64 threads to the first 64-core-group, the second one is left free.

If Windows assigns the artist's process to the first group as well, then the second 64-core-group is not used at all and the artist and RR fight for resources inside the first group.

Windows assigns newly started processes to the 64-core-groups in a round-robin manner.

 

 

PS:
Perhaps it might be possible to change the I/O (file access) and memory priority for threads this way as well.

This might have a big effect on the foreground app performance (or not; in that case we still have the core reduction).

markeljfp

Jun 9, 2020, 2:14:15 PM
to Royal Render Knights Tavern
Thanks Holger!

This is great to know and have confirmed.

Till then, we will try to figure out another way to manage.

Look forward to your success!

m

JeremySmithJFP

Jun 12, 2020, 5:27:06 AM
to Royal Render Knights Tavern
Hello Holger,

Thanks for looking at this the other day, apologies that I didn't have a lot of time to spend with you.

I came across these links a while ago and thought I would pass them on, as they may be a good reference.


Thanks and will speak to you later!

jeremy

Jeremy Smith

Sep 12, 2021, 4:01:46 PM
to rrKn...@googlegroups.com

Hello Holger

 

Trust that you are well. We would like to test this again on a 3990X CPU (64 cores, 128 threads). I know that you were going to look at this again in the summer, and I am wondering whether it was included in the CPU-limiting update that you gave to Mark recently, or whether it still needs to be done?

 

We would really like to get this working, and it would be great to try this again. Essentially, we want to be in a position where, when an artist is logged in, the artist has something like 16 threads (for Maya/Houdini/Nuke etc.) and the remaining 112 threads can be used for rendering. When the artist logs out, all 128 threads would be used for rendering.

 

Thank you and will speak to you later!

 

Jeremy

 


Jeremy Smith
CTO

Jellyfish Pictures
Visual Effects & Animation

