Running the AstraSim simulation together with the HTSim network backend


Sharon Fraiman

May 28, 2025, 7:58:00 AM
to astrasi...@googlegroups.com, Lior Friedman
Hello.
I'm running the AstraSim simulation together with the HTSim network backend, using Chakra execution trace (ET) files from Meta (Llama2) as the workload.
I added logging in AstraSim at the start and end of each flow (in Workload::issue and Workload::call), and on the HTSim side in tcp.cpp, also at flow start and end, printing the time in ticks.
The durations that AstraSim reports appear to match exactly the durations stored in the Chakra ET files (as seen after running chakra_jsonizer on the ET files), while the HTSim flow start and end times do not seem to correspond to the AstraSim logs.
Could you please help me understand the issue?

Thanks,
Sharon Fraiman
Optimal Nets

Morris, Jalil

May 28, 2025, 8:40:41 AM
to Sharon Fraiman, astrasi...@googlegroups.com, Lior Friedman
Hi,

Does the problem include both computation and communication nodes?

If so, there is a strong possibility this is not due to the network backend.
Locate your system configuration file (should be *.json) and look for a field named "replay-only". Set this value to 0.
If this key/value pair is not present in your system configuration, there is a chance it is enabled by default, so I would add it and again set it to 0.
After doing so, you may also have to set the "roofline-enabled" field to 1 to ensure everything runs smoothly, particularly the computation tasks.
See image below for an example.
Then everything should run as expected.
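
The configuration change described above could also be scripted. The following is only a sketch: the two key names come from this message, while the helper name `patch_system_config` and the assumption that the system configuration is a flat JSON object are illustrative and not part of ASTRA-sim.

```python
import json
from pathlib import Path


def patch_system_config(path, replay_only=0, roofline_enabled=1):
    """Set "replay-only" and "roofline-enabled" in a system config file.

    Assumes the file is a single JSON object; all other keys are kept.
    Returns the updated configuration as a dict.
    """
    cfg_path = Path(path)
    cfg = json.loads(cfg_path.read_text())
    cfg["replay-only"] = replay_only
    cfg["roofline-enabled"] = roofline_enabled
    # Write the file back in place with readable indentation.
    cfg_path.write_text(json.dumps(cfg, indent=2) + "\n")
    return cfg
```

Calling `patch_system_config("system.json")` rewrites the file in place, so keep a copy of the original if you want to compare runs.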

I hope this helps!

image.png

Best,
Jalil

--
You received this message because you are subscribed to the Google Groups "ASTRA-sim Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to astrasim-user...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/astrasim-users/TL0P290MB0558632734EB9F34AC47525ED267A%40TL0P290MB0558.ISRP290.PROD.OUTLOOK.COM.
For more options, visit https://groups.google.com/d/optout.

Sharon Fraiman

May 29, 2025, 5:23:52 AM
to Morris, Jalil, astrasi...@googlegroups.com, Lior Friedman
Hello Jalil.
Thank you very much for your detailed response.
Yes, both the computation (COMP) and communication (COMM) nodes show the same duration values as in the Chakra ET nodes themselves.
"replay-only" is zero by default, but I added "replay-only": 0 anyway, along with "roofline-enabled": 1.
The "roofline-enabled" flag causes the HTSim simulation to time out (".......Warning: Simulation timed out"), so I removed it.
The following graph visualizes the logs I added in AstraSim and HTSim. The HTSim COMM logs are out of sync with the AstraSim COMM logs, both in duration and in start/end times:


I construct this graph from the flow start and end log lines of both AstraSim and HTSim, using the tick value to compute each flow's start time and duration. (The tick values of AstraSim's Sys::boostedTick() and HTSim's eventlist().now() differ by a factor of 1000, which I correct in the Python script that builds the JSON for the Perfetto visualizer.) The logging code I added:

AstraSim : 
void Workload::issue(shared_ptr<Chakra::ETFeederNode> node) {
    // ... snip ...
    } else if (!node->is_cpu_op() &&
               (node->type() == ChakraNodeType::COMM_COLL_NODE ||
                node->type() == ChakraNodeType::COMM_SEND_NODE ||
                node->type() == ChakraNodeType::COMM_RECV_NODE)) {
        if (sys->trace_enabled) {
            logger->debug("issue,sys->id={}, tick={}, node->id={}, "
                          "node->name={}, node->type={}, node->comm_size={}",
                          sys->id, Sys::boostedTick(), node->id(),
                          node->name(),
                          static_cast<uint64_t>(node->type()),
                          node->comm_size());
    // ... snip ...

void Workload::call(EventType event, CallData* data) {
    // ... snip ...
    if (event == EventType::CollectiveCommunicationFinished) {
        IntData* int_data = (IntData*)data;
        hw_resource->tics_gpu_comms += int_data->execution_time;
        uint64_t node_id = collective_comm_node_id_map[int_data->data];
        shared_ptr<Chakra::ETFeederNode> node = et_feeder->lookupNode(node_id);

        if (sys->trace_enabled) {
            LoggerFactory::get_logger("workload")
                ->debug("callback,sys->id={}, tick={}, node->id={}, "
                        "node->name={}, node->type={}, "
                        "node->comm_size={}",
                        sys->id, Sys::boostedTick(), node->id(), node->name(),
                        static_cast<uint64_t>(node->type()),
                        node->comm_size());
        }

HTSim : 
void TcpSrc::startflow() {
    // ... snip ...
    std::cout << "Node: " << nodename() << " started Sending flow ID: " << tag
              << " from: " << src_id << " to: " << dst_id << " Size: " << _flow_size
              << " at: " << eventlist().now() << std::endl;

    send_packets();
}

void TcpSrc::receivePacket(Packet& pkt) {
    // ... snip ...
    if (seqno >= _flow_size) {
        // ... snip ...
        std::cout << "Node: " << nodename() << " finished Sending flow ID: " << tag
                  << " from: " << src_id << " to: " << dst_id << " Size: " << _flow_size
                  << " at: " << eventlist().now() << std::endl;



Any thoughts would be greatly appreciated.
Thank you very much,
Sharon
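
For readers who want to reproduce the visualization described in this message, the log-to-Perfetto step might look roughly like the sketch below. It is not the original script. The regular expressions mirror the format strings quoted above and assume numeric flow tags and source/destination IDs; the function name `to_trace_events` and the pid values are made up for illustration. It emits "complete" events (`"ph": "X"`) in the Chrome Trace Event JSON format, which the Perfetto UI can open, and it treats AstraSim ticks as nanoseconds and HTSim times as picoseconds.

```python
import json
import re

# Patterns follow the log statements quoted above; adjust if yours differ.
ASTRA_RE = re.compile(
    r"(?P<kind>issue|callback),sys->id=(?P<sys>\d+), tick=(?P<tick>\d+), "
    r"node->id=(?P<node>\d+), node->name=(?P<name>.*?), node->type=\d+"
)
HTSIM_RE = re.compile(
    r"Node: (?P<node>\S+) (?P<kind>started|finished) Sending flow ID: "
    r"(?P<flow>\d+) from: (?P<src>\d+) to: (?P<dst>\d+) Size: \d+ "
    r"at: (?P<ps>\d+)"
)


def to_trace_events(lines):
    """Pair start/end log lines into trace events (ts and dur in microseconds)."""
    open_spans = {}  # key -> (start time in ns, event metadata)
    events = []
    for line in lines:
        m = ASTRA_RE.search(line)
        if m:
            key = ("astra", m["sys"], m["node"])
            is_start = m["kind"] == "issue"
            now_ns = float(m["tick"])  # AstraSim tick, assumed to be ns
            meta = {"name": m["name"], "cat": "astrasim",
                    "pid": 1, "tid": int(m["sys"])}
        else:
            m = HTSIM_RE.search(line)
            if m is None:
                continue  # unrelated log line
            key = ("htsim", m["src"], m["dst"], m["flow"])
            is_start = m["kind"] == "started"
            now_ns = int(m["ps"]) / 1000.0  # HTSim picoseconds -> ns
            meta = {"name": f"flow {m['flow']} ({m['src']}->{m['dst']})",
                    "cat": "htsim", "pid": 2, "tid": int(m["src"])}
        if is_start:
            open_spans[key] = (now_ns, meta)
        elif key in open_spans:
            start_ns, start_meta = open_spans.pop(key)
            events.append({**start_meta, "ph": "X",
                           "ts": start_ns / 1000.0,
                           "dur": (now_ns - start_ns) / 1000.0})
    return events


def to_trace_json(lines):
    """Serialize the events as a JSON object with a traceEvents array."""
    return json.dumps({"traceEvents": to_trace_events(lines)})
```

Writing the result of `to_trace_json(open("sim.log"))` to a `.json` file (the log file name is a placeholder) gives a trace with one track group per simulator, which makes timing offsets between the two easy to spot.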



Krishna, Tushar

May 30, 2025, 2:58:12 PM
to Morris, Jalil, Sharon Fraiman, astrasi...@googlegroups.com, Lior Friedman, Senad Durakovic
I'm cc:ing Senad Durakovic from Marvell, whose team contributed the HTsim network backend into ASTRA-sim.
Senad Durakovic  - can you please forward this to any relevant folks who can respond?
 
Thanks,
Tushar

Senad Durakovic

Jun 2, 2025, 6:00:55 PM
to Krishna, Tushar, Morris, Jalil, Sharon Fraiman, astrasi...@googlegroups.com, Lior Friedman

Hello,

Thank you for the note; I wanted to follow up on the email thread.

We will have Adam Latos reply to your message tomorrow, EU time. He is trying to set up his Google account with the proper Marvell account. If that does not work in a timely manner, he will respond by email.

Regards,
Senad


Adam Latos

Jun 3, 2025, 4:29:23 AM
to ASTRA-sim Users
Hello,

A couple of notes:
Issuing one COMM node in AstraSim can start many flows in HTSim, and completing a COMM node requires many flows to finish. It is not trivial to attribute a flow to a COMM node when multiple COMM nodes are involved. Perhaps the events in the Perfetto graph do not correspond to each other exactly?

An AstraSim Sys object issues a COMM node, which causes flow(s) to start on the HTSim side. We should see the same time on both ends, since AstraSim's boostedTick() relies on the network backend (HTSim in this case) as its time source. The 1000x factor arises because HTSim internally uses picoseconds, while AstraSim's API expects nanoseconds. In the HTSim snippet, the following would fix the print:
timeAsNs(eventlist().now())
 
Another possible source of confusion is the "duration" field that the .et file has for COMM nodes. With the defaults (no replay-only, no roofline model), the COMP ops should have the same duration in your log as in the input .et file, but the COMM nodes should have a different duration, because it depends on the network backend.
If the above does not help, could you share the .et trace that exhibits the issue, or a minimal example file that reproduces it?
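
The duration check described above can be run offline against existing logs. The sketch below assumes you have already extracted three dictionaries keyed by node ID: durations from the .et file, durations measured from the simulation log, and node types. These inputs and the function name `comm_nodes_matching_trace` are hypothetical. It lists COMM nodes whose simulated duration equals the trace duration, which may indicate that the network backend did not determine their timing.

```python
def comm_nodes_matching_trace(et_duration_ns, sim_duration_ns, node_type,
                              tolerance_ns=0):
    """Return IDs of COMM nodes whose simulated duration equals the .et duration.

    et_duration_ns:  node ID -> duration recorded in the .et file (ns)
    sim_duration_ns: node ID -> end tick minus start tick from the log (ns)
    node_type:       node ID -> type string such as "COMM_COLL_NODE"
    """
    matching = []
    for node_id, sim_ns in sim_duration_ns.items():
        # Only communication nodes are expected to differ from the trace.
        if not node_type.get(node_id, "").startswith("COMM_"):
            continue
        et_ns = et_duration_ns.get(node_id)
        if et_ns is not None and abs(sim_ns - et_ns) <= tolerance_ns:
            matching.append(node_id)
    return sorted(matching)
```

An empty result is the expected outcome with the default settings; a long list suggests looking at the replay-related settings before looking at the backend.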

Best regards,
Adam

sfsharon

Jun 6, 2025, 7:05:23 AM
to ASTRA-sim Users

Hi Adam,

Thank you for your reply.

I believe I've identified the issue (though not yet the root cause): missing dependencies between collective communication nodes and computation nodes that should depend on them.

Issue Details:
I'm using Llama2 Chakra files from the MLCommons public Google Drive as workload input for Astra-Sim simulation:
- Path: `MLC Public -> Working Groups (Public) -> Chakra -> Chakra Traces -> v0.0.4 -> Llama2`
- Specific file tested: `llama_chakra.0.et`

What I Found:
When I applied `chakra_jsonizer` to convert the ET file to JSON, I discovered that certain collective communication nodes have no dependent computation or communication nodes waiting for them to complete. This issue specifically affects:
- `COMM_COLL_NODE` nodes with `ncclKernel_` prefix in their names
- Nodes where `is_cpu_op` attribute is `false`

Example:
The node shown below (ID 1178) is a collective communication operation, but no subsequent nodes reference it as a dependency:

```json
{
  "id": "1178",
  "name": "ncclKernel_AllGather_RING_LL_Sum_int8_t(ncclDevComm*, unsigned long, ncclWork*)",
  "type": "COMM_COLL_NODE",
  "ctrlDeps": ["1177"],
  "dataDeps": ["1179", "93", "95", "827", "829", "1174", "1176"],
  "durationMicros": "3898",
  "inputs": {
    "values": "[[226, 92, 0, 25297920, 2, 'cuda:0']]",
    "shapes": "[[25297920]]",
    "types": "['Tensor(c10::Half)']"
  },
  "outputs": {
    "values": "[]",
    "shapes": "[]",
    "types": "[]"
  },
  "attr": [
    {
      "name": "is_cpu_op",
      "boolVal": false
    },
    {
      "name": "tid",
      "int64Val": "84"
    },
    {
      "name": "comm_type",
      "int64Val": "2"
    },
    {
      "name": "comm_size",
      "int64Val": "50595840"
    },
    {
      "name": "involved_dim",
      "boolList": {
        "values": [true]
      }
    }
  ]
}
```

This suggests that the dependency graph may be incomplete, potentially causing simulation inaccuracies where collective operations appear to have no impact on subsequent computations. Astra-Sim schedules other nodes in parallel with these `ncclKernel_` `COMM_COLL_NODE` nodes, so the reported Job Completion Time (JCT) is wrong.
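
A check like the one described above can be automated. The sketch below rests on assumptions: it expects the `chakra_jsonizer` output to be one or more JSON objects written back to back, with node objects shaped like the example (`id`, `type`, `ctrlDeps`, `dataDeps`), and the function names are illustrative. It reports `COMM_COLL_NODE` nodes that no other node lists as a dependency.

```python
import json


def load_json_objects(text):
    """Parse one or more JSON values written back to back in a string."""
    decoder = json.JSONDecoder()
    objects, pos = [], 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        objects.append(obj)
    return objects


def collectives_without_dependents(nodes):
    """Return IDs of COMM_COLL_NODE nodes that no node depends on."""
    referenced = set()
    for node in nodes:
        referenced.update(node.get("ctrlDeps", []))
        referenced.update(node.get("dataDeps", []))
    return [
        node["id"]
        for node in nodes
        if node.get("type") == "COMM_COLL_NODE" and node["id"] not in referenced
    ]


def check_trace_text(text):
    """Keep only objects that look like nodes, then run the check."""
    nodes = [o for o in load_json_objects(text)
             if isinstance(o, dict) and "id" in o]
    return collectives_without_dependents(nodes)
```

Running `check_trace_text` on the converted file's contents gives the list of collectives whose completion nothing waits for; for a well-formed trace that list would normally be short or empty.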

Has anyone else encountered similar dependency issues with the Llama2 Chakra traces? Any insights into potential causes or solutions would be greatly appreciated.

Best regards,
Sharon

Joongun Park

Jun 6, 2025, 1:44:23 PM
to ASTRA-sim Users

Dear Sharon,

I saw the discussion regarding the missing dependencies.
It appears that the trace and Chakra version used for linking/conversion are quite outdated.

  • Path: MLC Public → Working Groups (Public) → Chakra → Chakra Traces → v0.0.4 → Llama2

  • File tested: llama_chakra.0.et

Since then, Chakra has been updated multiple times to improve its correctness.
Could you try using a more recent trace and the latest Chakra version?

You might find this PR helpful. The trace I used previously (Mixtral 8x3B) is there:
https://github.com/mlcommons/chakra/pull/185

Best regards,
Joongun
