Question regarding partitioning of Google Cluster Trace v3 JSON files.

8oom Choi

Jul 11, 2023, 3:22:39 AM
to googlecluste...@googlegroups.com
Hello, 
I have a question regarding the partitioning of the Google Cluster Trace v3 JSON files.


When I downloaded the trace, I found 1 machine-events.csv file, 1 machine_attributes.csv file, 9 collection_events-*.csv files, 56 instance_event-*.csv files, and 1555 instance_usage-*.csv files, for a total of 1622 files.
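
For anyone checking their own download, the tally above can be reproduced with a short sketch like the one below. The function name `count_shards` and the 12-digit shard suffix are assumptions made for illustration; the thread only gives the `*` wildcard, so adjust the pattern to whatever the real file names look like.

```python
import re
from collections import Counter


def count_shards(filenames):
    """Tally files per table by dropping the extension and a trailing '-<digits>' shard suffix."""
    counts = Counter()
    for name in filenames:
        stem = name.rsplit(".", 1)[0]        # drop the ".csv" extension
        table = re.sub(r"-\d+$", "", stem)   # drop a numeric shard suffix, if any
        counts[table] += 1
    return counts


# Hypothetical listing that mirrors the counts reported above.
# The zero-padded shard numbering is an assumption, not taken from the trace.
names = ["machine-events.csv", "machine_attributes.csv"]
names += [f"collection_events-{i:012d}.csv" for i in range(9)]
names += [f"instance_event-{i:012d}.csv" for i in range(56)]
names += [f"instance_usage-{i:012d}.csv" for i in range(1555)]

counts = count_shards(names)
for table, n in sorted(counts.items()):
    print(table, n)
print("total", sum(counts.values()))
```

In practice the `names` list would come from a directory listing (for example `os.listdir`) rather than being generated.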



What are the criteria for splitting the collection_events, instance_events, and instance_usage tables into multiple files?
Are they split only to keep the file sizes manageable?
Or are they split by machine, switch, or collection?
Because the data set is so large, I would like to use only some of the files. For example, if I want to analyze a specific switch, are the instances that belong to that switch spread randomly across the 1555 instance_usage-*.csv files?

john wilkes

Jul 11, 2023, 8:07:17 PM
to googlecluste...@googlegroups.com
Hi. The documentation talks about this a bit - take a careful read.  It's mainly a file-size issue.
  john

--
You received this message because you are subscribed to the "Google cluster data - discussions" group. To post to this group, send email to googlecluste...@googlegroups.com. To unsubscribe from this group, send email to googleclusterdata-...@googlegroups.com. For more options, visit this group at https://groups.google.com/d/forum/googleclusterdata-discuss?hl=en-US.
To view this discussion on the web visit https://groups.google.com/d/msgid/googleclusterdata-discuss/51AF1A59-5C65-4733-B9DF-4E8286E2185E%40gmail.com.