CloudSim Plus 4.0 beta: Reading trace files from Google Cluster Data (call for contribution)

Manoel Campos

Aug 10, 2018, 11:00:35 AM
to CloudSim Plus
Hello everyone,

I've been working on the release of CloudSim Plus 4.0, which will introduce one of the most useful features for researchers: reading trace files from the Google Cluster Data.
The feature is in beta stage.
Currently, it processes only the following files:
  • machine_events: creates Hosts with CPU and RAM from the trace data. (The machine_attributes files don't contain any data that is useful for creating Hosts.) Enables dynamic addition and removal of Hosts. All Hosts that the trace adds at timestamp 0 are immediately available to the Datacenter and are returned by the GoogleMachineEventsTraceReader class. Hosts added at a timestamp greater than 0 are scheduled to be created dynamically during simulation runtime.
  • task_events: creates Cloudlets, each requiring a specific number of CPU cores and amount of RAM, as well as DatacenterBrokers, from the trace data. Enables starting, pausing, finishing and canceling/destroying Cloudlets at the times specified in the trace. The trace file contains lines that request the creation of tasks at different timestamps. Unlike Hosts, however, Cloudlets can be submitted to a broker with a specific delay, so all tasks (Cloudlets) in the trace are returned by the GoogleTaskEventsTraceReader class.
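To make the timestamp handling described above more concrete, here is a minimal, self-contained sketch in plain Java. It does not use the CloudSim Plus readers at all: GoogleTraceSketch, MachineEvent and every method name below are hypothetical and exist only for illustration. It assumes the machine_events column order from the Google Cluster Data format description (timestamp, machine ID, event type, platform ID, CPU capacity, memory capacity), that event type 0 means a machine was added, and that timestamps are in microseconds. The sample lines are made up. How the real readers convert timestamps into delays may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GoogleTraceSketch {
    /** Assumed machine_events event type code for "machine added". */
    static final int ADD = 0;

    /** One parsed machine_events line (hypothetical type, not a CloudSim Plus class). */
    public static final class MachineEvent {
        final long timestampMicros;
        final long machineId;
        final int eventType;
        final double cpu; // normalized CPU capacity
        final double ram; // normalized memory capacity

        MachineEvent(long timestampMicros, long machineId, int eventType, double cpu, double ram) {
            this.timestampMicros = timestampMicros;
            this.machineId = machineId;
            this.eventType = eventType;
            this.cpu = cpu;
            this.ram = ram;
        }
    }

    /** Parses one CSV line; empty capacity fields are read as 0. */
    public static MachineEvent parse(String line) {
        String[] f = line.split(",", -1); // -1 keeps trailing empty fields
        return new MachineEvent(
            Long.parseLong(f[0]), Long.parseLong(f[1]), Integer.parseInt(f[2]),
            parseOrZero(f[4]), parseOrZero(f[5]));
    }

    private static double parseOrZero(String s) {
        return s.isEmpty() ? 0.0 : Double.parseDouble(s);
    }

    /** ADD events at timestamp 0: Hosts that exist when the simulation starts. */
    public static List<MachineEvent> availableAtStart(List<MachineEvent> events) {
        List<MachineEvent> result = new ArrayList<>();
        for (MachineEvent e : events) {
            if (e.eventType == ADD && e.timestampMicros == 0) result.add(e);
        }
        return result;
    }

    /** ADD events after timestamp 0: Hosts to be created during the simulation. */
    public static List<MachineEvent> addedLater(List<MachineEvent> events) {
        List<MachineEvent> result = new ArrayList<>();
        for (MachineEvent e : events) {
            if (e.eventType == ADD && e.timestampMicros > 0) result.add(e);
        }
        return result;
    }

    /** Converts a trace timestamp (microseconds) into a delay in seconds. */
    public static double delaySeconds(long timestampMicros) {
        return timestampMicros / 1_000_000.0;
    }

    public static void main(String[] args) {
        // Made-up sample lines in the assumed machine_events layout.
        List<MachineEvent> events = Arrays.asList(
            parse("0,5,0,platformA,0.5,0.25"),
            parse("1000000,6,0,platformA,1.0,0.5"),
            parse("2000000,5,1,platformA,,"));
        System.out.println("Hosts available at start: " + availableAtStart(events).size());
        for (MachineEvent e : addedLater(events)) {
            System.out.println("Host " + e.machineId + " added after "
                + delaySeconds(e.timestampMicros) + " s");
        }
    }
}
```

The same microseconds-to-seconds conversion would apply to a task_events line if you wanted to derive a submission delay for a Cloudlet by hand, again under the assumptions stated above.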
A spreadsheet, google-cluster-data-samples.xlsx, that makes the structure of the traces easier to understand is provided in the docs dir.
One example for each of the traces mentioned above is provided in the cloudsim-plus-examples/src/main/java/org/cloudsimplus/examples/googletraces dir.
These examples use reduced sample trace files available at the cloudsim-plus-examples/src/main/resources/workload/google-traces dir.

If you want to use the original Google Cluster trace files, you can download them with the script/download-google-cluster-data.sh script.
Execute the script with -h to show the usage help.

We know that using data from real cloud computing environments is crucial for assessing algorithms with different goals, such as VM allocation policies for VM placement and migration, VM and Cloudlet scheduling, etc. I'm sure the Google Cluster Data will help researchers in their experiments.

Since these features are still under development, they are available in the google-cluster-data branch.
I'd like to kindly ask anyone who is able to try these features out to do so and give some feedback.
Any kind of help and feedback is appreciated and extremely welcome.

Best regards,

Manoel Campos

Aug 16, 2018, 9:30:56 PM
to CloudSim Plus
Hello everyone,

I'd like to inform you that the first version of the Google Cluster Data feature is nearly finished.
The google-cluster-data branch was just merged into the dev branch and deleted.
After some more tests, it will be merged into the master branch to be released.