Windows Tune

Celena Angolo
Aug 3, 2024, 11:17:53 AM

I am working on a customised gymnasium SAC environment under RLlib, using Ray Tune for hyperparameter tuning, on Windows 10/11. After around 10,000 steps, regardless of how long the expected episode length is (I tried more than a million), the following error messages always appear:

It seems quite clear to me that the syncer is trying to copy the .is_checkpoint to itself, causing the second error and stopping me from completing the training. But since I am running entirely locally, I am not sure why the syncer comes into play - I was under the impression that it is for distributed environments.

In any case, you can pass SyncConfig(syncer=None) to disable syncing. But again, it would be good to know what it currently looks like so we can see what the problem may be. It might be a Windows-specific problem.
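If it helps, here is a minimal sketch of what that looks like with the Ray 2.6-era API (the do-nothing trainable and the Windows path are placeholders, and exact import locations have moved between releases):

    from ray import air, tune

    # Placeholder trainable, standing in for the real RLlib/SAC setup.
    def my_trainable(config):
        air.session.report({"score": 0.0})

    tuner = tune.Tuner(
        my_trainable,
        run_config=air.RunConfig(
            # Keep results in the project folder instead of C:\Users\<user>\ray_results.
            storage_path="C:/projects/sac_tuning/ray_results",
            # Purely local run: turn the syncer off.
            sync_config=tune.SyncConfig(syncer=None),
        ),
    )
    tuner.fit()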

I did specify both local_dir and storage_path earlier (when syncer="auto"), because if I did not, a ray_results folder is created in C:\Users\[username]\ray_results, and I prefer having it in my project folder (excuse my OCD).

When I tried SyncConfig(syncer=None) with only local_dir set, Ray churned out a UserWarning saying I should use RunConfig.storage_path, then a ValueError: "upload_dir enables syncing to cloud storage, but syncer=None disables syncing. Either remove the upload_dir, or set syncer to 'auto' or a custom syncer." No training could happen before Ray terminated.

When I tried SyncConfig(syncer=None) with only storage_path set, Ray gave the same ValueError: "upload_dir enables syncing to cloud storage, but syncer=None disables syncing. Either remove the upload_dir, or set syncer to 'auto' or a custom syncer." Again, no training could happen before Ray terminated.

Since my last post I have tried Stable Baselines 3, and the environment works perfectly, so my customised environment is probably not the culprit. It is only with newer versions of Ray (I tried 2.4, 2.6.0 and 2.6.1) that this error pops up. As @Sushwyzr mentioned, the same error pops up regardless of whether Ray Tune is used, and it still happens even in a WSL environment under Windows.
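For reference, the sanity check on the SB3 side amounts to something like this (a stock Pendulum environment stands in here for my custom one):

    import gymnasium as gym
    from stable_baselines3 import SAC

    # Swap Pendulum-v1 for the custom environment; SAC needs a continuous action space.
    env = gym.make("Pendulum-v1")
    model = SAC("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=10_000)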

Hi @Teenforever, thank you for your response. I am using Ray Datasets for my project, and it looks like Datasets has undergone significant changes from 2.3 to 2.6, which is breaking the data-processing parts of my project.
This leaves me with two options to explore:

Thanks for raising this and following up. This is indeed a bug, and it should be fixed here: [train/tune] Use posix paths throughout library code by krfricke · Pull Request #38319 · ray-project/ray · GitHub

Starting the second sftp session does not affect the speed of the first by one iota. This makes sense to me, as I noticed that my iperf tests were limited per TCP connection: each connection ran at about 16 Mbps, which leaves room on the wire for more connections. I was able to make iperf saturate my line by running many parallel connections, and the bandwidth increase was linear. That is, with 5 parallel connections I achieved a bandwidth of about 5 × 16 = 80 Mbps.

I have reviewed "What is the best way to transfer a single large file over a high-speed, high-latency WAN link?" as well as other serverfault questions, and there are a lot of interesting tools, but what I am looking for is a way to adjust Windows' TCP stack to make sure I get my full 2 MB/s at least... and maybe more, since the iperf tests were across untuned machines.
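For the window arithmetic, the number to hit is the bandwidth-delay product: the TCP window must cover bandwidth × RTT, or the sender sits idle waiting for ACKs. A rough sketch with assumed numbers (the ~90 ms Chicago-London RTT is a guess, not a measurement):

    # Bandwidth-delay product: bytes of TCP window needed to keep the pipe full.
    bandwidth_bps = 80e6   # assumed target: 5 parallel connections x 16 Mbps
    rtt_s = 0.090          # assumed ~90 ms Chicago-London round trip
    bdp = bandwidth_bps / 8 * rtt_s
    print(f"window needed: {bdp / 1e6:.2f} MB")                  # ~0.90 MB

    # For contrast, a classic un-scaled 64 KB window caps throughput at:
    print(f"64 KB window cap: {65535 / rtt_s / 1e3:.0f} KB/s")   # ~728 KB/s

That ~728 KB/s ceiling is suspiciously close to the 500 KB/s I see from the desktop, which at least points at window scaling as a suspect.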

I notice, too, that when I drag and drop a file onto a London machine from my Chicago fileserver, it seems to transfer at quite a nice clip (judging by the details shown during the transfer), but after a few minutes it drops off a cliff into the tens of Kbps. Very odd.

I have a Linux VM that is "close" to my Windows desktop, network-wise. That is, they are on different subnets but both machines go into switches that both go into another switch that goes to London. Right now I am running an sftp from Linux to London and it is running at 8 MB/s while my desktop continues at 500 KB/s.

The only thing I can conclude is that although there may be TCP tweaks that will help, the app you use can make a big difference! Furthermore, I don't know why my file copy acted so strangely the other day. If it happens again I will attempt to debug it.

With the ActiveCare feature on, this PC tune-up software can automatically scan for and fix up to 30,000 issues on your PC in real time, to keep performance at its peak while gaming, streaming, video editing, and downloading.

Useful PC tune-up software offers file-management tools to uninstall programs, wipe a PC or hard drive, and shred deleted files, as well as system-management tools to find memory-intensive background tasks, optimize Windows startup programs, scan network connections, and remove harmful software.

Use the information in this topic to tune the performance of network adapters on computers that are running Windows Server 2016 and later versions. If your network adapters provide tuning options, you can use these options to optimize network throughput and resource usage.

Do not use the offload features IPsec Task Offload or TCP Chimney Offload. These technologies are deprecated in Windows Server 2016, and might adversely affect server and networking performance. In addition, these technologies might not be supported by Microsoft in the future.

For example, consider a network adapter that has limited hardware resources. In that case, enabling segmentation offload features might reduce the maximum sustainable throughput of the adapter. However, if the reduced throughput is acceptable, you should go ahead and enable the segmentation offload features.

RSS can improve web scalability and performance when there are fewer network adapters than logical processors on the server. When all the web traffic is going through the RSS-capable network adapters, the server can process incoming web requests from different connections simultaneously across different CPUs.

Avoid using both non-RSS network adapters and RSS-capable network adapters on the same server. Because of the load-distribution logic in RSS and Hypertext Transfer Protocol (HTTP), performance might be severely degraded if a non-RSS-capable network adapter accepts web traffic on a server that has one or more RSS-capable network adapters. In this circumstance, you should use only RSS-capable network adapters, or disable RSS on the Advanced Properties tab of the network adapter's properties.

The default RSS predefined profile is NUMAStatic, which differs from the default that the previous versions of Windows used. Before you start using RSS profiles, review the available profiles to understand when they are beneficial and how they apply to your network environment and hardware.

For example, if you open Task Manager and review the logical processors on your server, and they seem to be underutilized for receive traffic, you can try increasing the number of RSS queues from the default of two to the maximum that your network adapter supports. Your network adapter might have options to change the number of RSS queues as part of the driver.
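As a quick way to inspect the current RSS profile and queue count without digging through driver dialogs, here is a small Python wrapper around the NetAdapter PowerShell module (Windows only; the module and the exposed properties depend on the OS and driver):

    import subprocess

    # Query RSS state via PowerShell; run on Windows.
    out = subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         "Get-NetAdapterRss | Format-List Name, Enabled, Profile, NumberOfReceiveQueues"],
        capture_output=True, text=True, check=True,
    )
    print(out.stdout)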

Some network adapters set their receive buffers low to conserve allocated memory from the host. The low value results in dropped packets and decreased performance. Therefore, for receive-intensive scenarios, we recommend that you increase the receive buffer value to the maximum.
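Where the driver exposes it, the buffer count is an advanced property; here is a sketch of raising it (the adapter name "Ethernet", the display name "Receive Buffers", and the value 4096 are all driver-dependent assumptions, and the command needs an elevated prompt):

    import subprocess

    # Raise the receive buffer count; names and the valid range vary by driver.
    subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         'Set-NetAdapterAdvancedProperty -Name "Ethernet" '
         '-DisplayName "Receive Buffers" -DisplayValue 4096'],
        check=True,
    )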

To control interrupt moderation, some network adapters expose different interrupt moderation levels, different buffer coalescing parameters (sometimes separately for send and receive buffers), or both.

You should consider interrupt moderation for CPU-bound workloads. When using interrupt moderation, consider the trade-off between the host CPU savings and added latency of moderating interrupts versus the increased host CPU usage, but lower latency, of taking more interrupts. If the network adapter does not perform interrupt moderation, but it does expose buffer coalescing, you can improve performance by increasing the number of coalesced buffers to allow more buffers per send or receive.

Many network adapters provide options to optimize operating-system-induced latency. Latency here is the elapsed time between the network driver processing an incoming packet and the network driver sending the packet back. This time is usually measured in microseconds. For comparison, the transmission time for packets over long distances is usually measured in milliseconds (three orders of magnitude larger). This tuning will not reduce the time a packet spends in transit.

Set the computer BIOS to High Performance, with C-states disabled. However, note that this is system and BIOS dependent, and some systems will provide higher performance if the operating system controls power management. You can check and adjust your power management settings from Settings or by using the powercfg command. For more information, see Powercfg Command-Line Options.
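For example, switching to the built-in High performance plan from a script (SCHEME_MIN is powercfg's alias for that plan):

    import subprocess

    # Activate the built-in High performance power plan via its powercfg alias.
    subprocess.run(["powercfg", "/setactive", "SCHEME_MIN"], check=True)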

Disable the Interrupt Moderation setting for network card drivers that require the lowest possible latency. Remember, this configuration can use more CPU time and it represents a tradeoff.
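Where the driver exposes this setting as an advanced property, the same Set-NetAdapterAdvancedProperty pattern as above applies (the display name and value strings are common conventions, not universal):

    import subprocess

    # Disable interrupt moderation; names vary by driver, and this needs an elevated prompt.
    subprocess.run(
        ["powershell", "-NoProfile", "-Command",
         'Set-NetAdapterAdvancedProperty -Name "Ethernet" '
         '-DisplayName "Interrupt Moderation" -DisplayValue "Disabled"'],
        check=True,
    )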

Handle network adapter interrupts and DPCs on a core that shares CPU cache with the core being used by the program (user thread) that handles the packet. CPU affinity tuning can be used together with RSS configuration to direct a process to certain logical processors and accomplish this. Using the same core for the interrupt, DPC, and user-mode thread exhibits worse performance as load increases, because the ISR, DPC, and thread contend for the use of the core.

Many hardware systems use System Management Interrupts (SMI) for a variety of maintenance functions, such as reporting error correction code (ECC) memory errors, maintaining legacy USB compatibility, controlling the fan, and managing BIOS-controlled power settings.

The SMI is the highest-priority interrupt on the system, and places the CPU in a management mode. This mode preempts all other activity while SMI runs an interrupt service routine, typically contained in BIOS.
