Question regarding partitioning of Google Cluster Trace v3 JSON files.

8oom Choi

Jul 11, 2023, 3:22:39 AM

to googlecluste...@googlegroups.com

Hello,

I have a question regarding the partitioning of the Google Cluster Trace v3 files.

When I downloaded the files, I found 1 machine-events.csv file, 1 machine_attributes.csv file, 9 collection_events-*.csv files, 56 instance_event-*.csv files, and 1555 instance_usage-*.csv files, for a total of 1622 files.

What are the criteria for dividing the collection_events, instance_events, and instance_usage files?

Are they split simply to keep the file sizes manageable?

Or are they split by machine, switch, or collection?
The data set is huge, so I would like to use only some of the files. For example, if I want to analyze a specific switch, are the instances that belong to that switch randomly distributed across the 1555 instance_usage-*.csv files?

john wilkes

Jul 11, 2023, 8:07:17 PM

to googlecluste...@googlegroups.com

Hi. The documentation talks about this a bit - take a careful read. It's mainly a file-size issue.

john
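
If the shards are split mainly by size, as the reply above suggests, then rows for any one machine may appear in any shard, and selecting a subset would mean scanning every file and filtering by key. A minimal sketch of that approach follows. It assumes the shards are CSV files with a header row, as named in the question, and that a `machine_id` column exists; the column name, the example ID, and the `filter_shards` helper are illustrative assumptions, so check them against the trace documentation.

```python
import csv
import glob
from typing import Iterable, Iterator


def filter_shards(paths: Iterable[str], column: str, wanted: set[str]) -> Iterator[dict]:
    """Yield rows from every shard whose `column` value is in `wanted`.

    Hypothetical helper: every shard is scanned, because a size-based split
    gives no guarantee about which file holds a given machine's rows.
    """
    for path in sorted(paths):
        with open(path, newline="") as f:
            # DictReader takes the column names from the header row.
            for row in csv.DictReader(f):
                if row.get(column) in wanted:
                    yield row


if __name__ == "__main__":
    # Assumed file pattern and column name; adjust to the real schema.
    shards = glob.glob("instance_usage-*.csv")
    for row in filter_shards(shards, "machine_id", {"12345"}):
        print(row)
```

Because the function is a generator, it holds only one row in memory at a time, which matters when walking all 1555 usage shards.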

--
You received this message because you are subscribed to the "Google cluster data - discussions" group. To post to this group, send email to googlecluste...@googlegroups.com. To unsubscribe from this group, send email to googleclusterdata-...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/googleclusterdata-discuss?hl=en-US.
To view this discussion on the web visit https://groups.google.com/d/msgid/googleclusterdata-discuss/51AF1A59-5C65-4733-B9DF-4E8286E2185E%40gmail.com.
