Measuring only algorithm runtime (not loading)


agala...@gmail.com

Jun 8, 2014, 3:51:01 PM
to stratosp...@googlegroups.com
Hi,

I would like to determine exactly how long the execution of my workflow takes, not including loading/parsing the data. Is it possible to load the input data first and then time only how long the algorithm takes? In this case, I am running kmeans using CSV files as input.

Thanks!

Robert Metzger

Jun 8, 2014, 4:10:04 PM
to stratosp...@googlegroups.com
Hi,

check out the web interface of Stratosphere (running on port 8081). For finished jobs, it shows a timeline for each job. This helps to determine the runtime of the operators.
Note that this timeline does not show the runtime of the individual iterations (I guess your kmeans example is using the "iterate" operator).

In addition to that, it's difficult to measure the exact runtime of the DataSource operators (loading) since they immediately stream the data they've read to the next operator. This means that while the data is still being read, the system already starts processing it.
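
To illustrate why the two phases are hard to separate, here is a minimal sketch in plain Python. It does not use the Stratosphere API; `source` and `process` are hypothetical stand-ins for a streaming data source and its successor operator, and the event log only records the order in which things happen.

```python
def source(records, log):
    """Hypothetical data source: hands over records one at a time, like a streaming reader."""
    for r in records:
        log.append(("read", r))
        yield r


def process(stream, log):
    """Hypothetical successor operator: consumes records as soon as they arrive."""
    total = 0
    for r in stream:
        log.append(("process", r))
        total += r
    return total


# Pipelined: reading and processing interleave, so there is no point in time
# where "loading" ends and "the algorithm" begins.
pipelined_log = []
pipelined_sum = process(source([1, 2, 3], pipelined_log), pipelined_log)

# Materialized: every read finishes before processing starts. Only in this
# arrangement could the two phases be timed separately.
materialized_log = []
data = list(source([1, 2, 3], materialized_log))
materialized_sum = process(data, materialized_log)

print([kind for kind, _ in pipelined_log])
print([kind for kind, _ in materialized_log])
```

In the pipelined case the log alternates between reads and processing steps, while in the materialized case all reads come first. A timer wrapped around the successor in the pipelined case would therefore also include the time spent waiting for reads.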

If this information is not sufficient, I think you'll find some more information in the JobManager log file. If I'm not wrong, it should show when an iteration started and ended.


Robert




Stephan Ewen

Jun 9, 2014, 12:59:49 PM
to stratosp...@googlegroups.com

One caveat though!

Stratosphere currently pipelines the data between operators. The source and the next operator definitely run concurrently, so the reading still influences the successor operator's measured time. You can see that from the fact that the operator runtimes overlap in the execution time breakdown (accessible after the program has completed).
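
As a rough sketch of how one might read such a breakdown, the following plain Python computes how much of an operator's interval overlaps with the source and how much lies after the source has finished. The interval values and the names `source_interval` and `kmeans_interval` are made up for illustration; they are not real measurements or Stratosphere output.

```python
def overlap(a, b):
    """Length of the intersection of two (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def time_after(op, source):
    """Portion of op's interval that lies after the source has finished."""
    return max(0, op[1] - max(op[0], source[1]))


# Hypothetical timings in milliseconds since job start (assumed values).
source_interval = (0, 4000)
kmeans_interval = (200, 9000)

shared = overlap(source_interval, kmeans_interval)    # time both were running
alone = time_after(kmeans_interval, source_interval)  # time after reading ended

print(shared, alone)
```

With these assumed numbers, the operator ran for 3800 ms alongside the source and for 5000 ms after it. The second value is only a lower bound on the algorithm's own runtime, because some of the overlapping time was also spent on useful processing rather than purely on waiting for input.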
