To generate traces for the DNN application

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Mar 31, 2021, 11:18:46 AM

to accel-sim

Hello,

I want to generate traces for a DNN application such as the one below:

https://github.com/XianweiCheng01/Cuda/tree/main/LeNet5

The linked code trains and tests the LeNet architecture on the MNIST dataset. Could I get the detailed procedure for generating the traces and running them in the Accel-Sim framework? What directory structure should be used for the CUDA code and the data files? Which commands should be used? Is there any important strategy for avoiding memory problems?

I really appreciate any help you can provide. Thanks.

Mahmoud Khairy

<khairy2011@gmail.com>

Mar 31, 2021, 11:32:51 AM

to accel-sim

Hello,

I would highly encourage you to go through the Accel-Sim README tutorial and the NVBit tracer tutorial listed below:

https://github.com/accel-sim/accel-sim-framework

https://github.com/accel-sim/accel-sim-framework/tree/release/util/tracer_nvbit

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 3, 2021, 5:11:12 PM

to accel-sim

Hi,

I am reading the tutorial on generating traces for a specific individual application:

https://github.com/accel-sim/accel-sim-framework/tree/release/util/tracer_nvbit

If our application, e.g., vectoradd, requires input data, how do we specify it in the following command?

LD_PRELOAD=./tracer_tool/tracer_tool.so ./nvbit_release/test-apps/vectoradd/vectoradd

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 3, 2021, 5:23:02 PM

to accel-sim

Just pass it as a normal command-line argument: run the program as you normally would and prefix the command with LD_PRELOAD. For example, if vectoradd takes an argument value of 1000:

LD_PRELOAD=./tracer_tool/tracer_tool.so ./nvbit_release/test-apps/vectoradd/vectoradd 1000

You can google and read more about the LD_PRELOAD trick. It is a feature of the Linux dynamic linker (ld.so).
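As a minimal sketch of the invocation pattern described above: the two paths are copied from the commands in this thread, and the helper name trace_run is hypothetical. The function forwards every argument to the application unchanged and sets LD_PRELOAD for that one command only. If the tracer or the application is not present, it just prints the command it would have run.

```shell
# Paths taken from the commands in this thread; adjust to your checkout.
TRACER=./tracer_tool/tracer_tool.so
APP=./nvbit_release/test-apps/vectoradd/vectoradd

# trace_run is a hypothetical helper, not part of Accel-Sim or NVBit.
# It runs APP under the tracer and passes all arguments through as-is.
trace_run() {
  if [ -f "$TRACER" ] && [ -x "$APP" ]; then
    # The VAR=value prefix sets LD_PRELOAD for this single command only.
    LD_PRELOAD="$TRACER" "$APP" "$@"
  else
    # Dry run: show the command instead of executing it.
    echo "LD_PRELOAD=$TRACER $APP $*"
  fi
}

trace_run 1000
```

Because the variable is set as a command prefix, it does not stay set in the shell afterwards, so later untraced runs are unaffected.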



Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 3, 2021, 5:27:08 PM

to accel-sim

It's simple! Thank you very much.

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 3, 2021, 9:15:13 PM

to accel-sim

Hi Mahmoud Sir,

Now, I want to generate traces for the training and testing of LeNet on the MNIST dataset. I built the application executable, and the same directory has a /data subdirectory that contains the training and testing datasets (images, labels). The code works well on the real GPU, but when I issue the following command, it does not generate the traces.

$ LD_PRELOAD=./tracer_tool/tracer_tool.so ./nvbit_release/test-apps/LeNet5_Training_Testing/lenet

It generates a 'traces' directory containing a kernelslist file (with CUDA memcpy commands only) and an empty 'stats.csv' file.

Please guide me on how to get the traces for this application. Thank you.

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 4, 2021, 9:20:47 AM

to accel-sim

Hi,

1- Have you set either of the kernel-limit environment variables, DYNAMIC_KERNEL_LIMIT_START or DYNAMIC_KERNEL_LIMIT_END? If so, please make sure to clear them, or start from scratch in a new terminal window.

2- This is the code of the NVBit tracer, and this is where we trace kernels. You can add checkpoint print statements there to see what is going on.

https://github.com/accel-sim/accel-sim-framework/blob/4c2bf09a79d6b57bb10fe1898700930a5dd5531f/util/tracer_nvbit/tracer_tool/tracer_tool.cu#L285

3- Please make sure you meet the NVBit requirements listed here:

https://github.com/NVlabs/NVBit

4- If you have any errors or questions regarding NVBit, you can ask the NVBit team, as they know NVBit better than we do.

You can report an issue to them here:

https://github.com/NVlabs/NVBit/issues
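For point 1 above, a small sketch of how the two variables could be checked and cleared in the current shell. The variable names come from the message; the function name clear_kernel_limits is hypothetical and only used for illustration.

```shell
# clear_kernel_limits is a hypothetical helper, not part of Accel-Sim.
# It reports whether each kernel-limit variable is exported and unsets it.
clear_kernel_limits() {
  for var in DYNAMIC_KERNEL_LIMIT_START DYNAMIC_KERNEL_LIMIT_END; do
    if env | grep -q "^${var}="; then
      echo "clearing $var"
      unset "$var"
    else
      echo "$var is not set"
    fi
  done
}

clear_kernel_limits
```

Opening a fresh terminal has the same effect, provided the variables are not exported from a shell startup file.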

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 5, 2021, 10:09:26 AM

to accel-sim

Hello Sir,

My tracer works well with rodinia_2.0-ft, so I guess the environment requirements are satisfied.

So, I am trying to see what is going on behind the scenes. I am using the LeNet example given here. I generate the executable with the following command, as compute_20 is not supported in CUDA 11:

$ nvcc -arch=sm_60 *.cu -lcublas -o lenet

When I launch the application on the real GPU, I get the following output:

millisecond : 0.003392
millisecond : 0.017792
millisecond : 0.012160
millisecond : 0.017056
millisecond : 0.012288
millisecond : 0.039968
millisecond : 0.023648
millisecond : 0.012480
Learning
error: 6.247417e-01, time_on_gpu: 10.856199

Time - 10.856199
Error Rate: 22.60%

But when I launch it for trace generation, I get the following output:

millisecond : 0.003232
millisecond : 0.061184
millisecond : 0.037984
millisecond : 0.039712
millisecond : 0.037376
millisecond : 0.061504
millisecond : 0.049216
millisecond : 0.038272
Learning
error: -nan, time_on_gpu: 0.000000

Time - 0.000000
Error Rate: -nan%

I guess that the dataset is not being read during trace generation, which is why both the error and the Error Rate are nan. Does the tracer tool have any requirement on the data format? Or am I missing a tracer-specific argument when building the executable with nvcc? Kindly help. Thank you.

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 5, 2021, 10:18:35 AM

to Ajinkya Bankar, accel-sim

What command did you use on real HW without the tracer? And what command did you use for tracing?

Have you tried digging into the tracer code yourself and adding checkpoint print statements, as mentioned in my last email, to see how the process and kernel launches are handled in the tracer?


Thanks!

-Mahmoud

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 5, 2021, 10:26:33 AM

to accel-sim

Hello,

Thanks for the reply. I generate the application executable with:

$ nvcc -arch=sm_60 *.cu -lcublas -o lenet

and use $ ./lenet to execute it on the real HW without the tracer. I use the following command to get the traces:

$ LD_PRELOAD=./tracer_tool/tracer_tool.so ./nvbit_release/test-apps/LeNet5_Training_Testing/lenet

I have kept the /data directory in /LeNet5_Training_Testing, where my application executable is located.

It's difficult to understand the tracer code, so I haven't yet tested how kernel launches are handled in the tracer.

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 5, 2021, 11:14:38 AM

to accel-sim

Hi,

I checked that the tracer enters here: https://github.com/accel-sim/accel-sim-framework/blob/release/util/tracer_nvbit/tracer_tool/tracer_tool.cu#L276

But it never enters this else if: https://github.com/accel-sim/accel-sim-framework/blob/release/util/tracer_nvbit/tracer_tool/tracer_tool.cu#L288

Does that mean the kernel is not launched?

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 7, 2021, 11:55:14 AM

to accel-sim

Hi,

I figured out that the problem was that the /data directory path was not specified properly when launching the trace generation process.
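To illustrate the kind of path problem described here, the following self-contained sketch uses a hypothetical stand-in script instead of the real lenet binary. It assumes the application opens its dataset through a relative path such as data/..., which the thread does not confirm. A relative path is resolved against the directory the program is launched from, not the directory the executable lives in.

```shell
# Build a throwaway layout: app/run.sh reads data/input.txt relatively.
# All file names here are placeholders, not the real LeNet5 files.
demo=$(mktemp -d)
mkdir -p "$demo/app/data"
echo "mnist-placeholder" > "$demo/app/data/input.txt"
printf 'cat data/input.txt 2>/dev/null || echo "data not found"\n' > "$demo/app/run.sh"

# Launched from the parent directory: "data/" is looked up under $demo.
from_parent=$(cd "$demo" && sh app/run.sh)

# Launched from the app's own directory: "data/" is found.
from_app=$(cd "$demo/app" && sh run.sh)

echo "from parent:  $from_parent"
echo "from app dir: $from_app"
rm -rf "$demo"
```

Under the same assumption, one option for the tracer is to change into the application directory first and give LD_PRELOAD the absolute path of tracer_tool.so, since a relative path to the shared object would also be resolved against the new working directory.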

But, after specifying it correctly, I get the following error before it starts to generate the traces:

lenet: arch/gm10x_hal.cpp:181: void set_imm_relative_control_flow(uint64_t*, int64_t): Assertion `!IS_LARGER_THAN_24BIT(imm)' failed.
Aborted (core dumped)

Please help me debug the problem. Thanks.

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 7, 2021, 12:22:33 PM

to accel-sim

What GPU hardware platform do you have?

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 7, 2021, 1:10:34 PM

to accel-sim

Please see this:

https://github.com/NVlabs/NVBit/issues/20

It seems this is an NVBit bug on Kepler and Maxwell cards. If you move to Volta, Turing, or Ampere cards, this bug may go away.

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 7, 2021, 2:32:03 PM

to accel-sim

I have an NVIDIA GeForce GTX 1080 Ti card. It can generate traces for other applications but gives this error for LeNet. So, can we say that it's a card problem?

Mahmoud Khairy

<khairy2011@gmail.com>

Apr 7, 2021, 2:45:14 PM

to Ajinkya Bankar, accel-sim

Yes, it is a hardware-related problem, as the NVBit issue below shows. You can follow up with the NVBit team on this, or you can move to one of the other hardware platforms I mentioned.

https://github.com/NVlabs/NVBit/issues/20

Do you have multiple GPUs in your system? Please run the "nvidia-smi" command to see how many GPUs you have, and make sure you are using the right one.
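A short sketch of that check, assuming a standard NVIDIA driver install. It lists the visible GPUs when nvidia-smi is available and then pins later CUDA programs in the same shell to one device through CUDA_VISIBLE_DEVICES. The index 0 is only an example value.

```shell
# List the GPUs the driver can see, if nvidia-smi is installed.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi -L || echo "nvidia-smi could not query the driver"
else
  echo "nvidia-smi not found on this machine"
fi

# Restrict CUDA programs started from this shell to a single device.
# 0 is an example index; pick the device you actually want to trace on.
export CUDA_VISIBLE_DEVICES=0
echo "CUDA_VISIBLE_DEVICES=$CUDA_VISIBLE_DEVICES"
```

Note that the CUDA runtime's default device ordering may differ from the order nvidia-smi prints, so it is worth confirming which physical card an index maps to before relying on it.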


Thanks!

-Mahmoud

Ajinkya Bankar

<ajinkyasbankar@gmail.com>

Apr 7, 2021, 2:48:48 PM

to accel-sim

I have two GPUs in my machine but, unfortunately, they are the same model. I will try other options.

Thank you for the help.
