D3 dagre Graph Visualization

Luis Gonzalez

Apr 22, 2015, 6:19:04 AM
to luigi...@googlegroups.com


We are adding some features we need for our luigi deployment (AWS/OpenStack luigi tasks, more history information), but we also wanted to build a new visualization of the dependency graph with D3 and dagre to make it more readable and zoomable.



We have been testing it for a couple of days now and it's working fine. You can find the code in the d3visualizer branch of our fork at https://github.com/hadesbox/luigi/tree/d3visualizer (it's up to date with the official master branch in the luigi repo) in case you guys want to test it. We don't know if it will be helpful for anyone else or if it makes sense to have both visualizations (d3 and svg); if so, we can fix our code to fit any contribution guidelines and make a pull request. We added the total execution time on top of the dependency arrows, the colors show status and current state, the body of the "task box" shows some information about the task, and we added a tooltip that shows information about that particular task.



Everything is in a single commit if you want to review the changes.

cheers,

Luis.

Alexander Krasnukhin

Apr 22, 2015, 7:08:56 AM
to Luis Gonzalez, luigi...@googlegroups.com
Question to the community: is it possible to have luigi visualizers as separate pip packages? You could then choose the one you want (default/d3) in the luigi configuration, for example.

--
Regards,
Alexander

Arash Rouhani

Apr 22, 2015, 7:30:13 AM
to Alexander Krasnukhin, Luis Gonzalez, luigi...@googlegroups.com
Wow! This looks really cool! We (Spotify) undoubtedly want to merge this in, I think.

Though as Alexander pointed out, I'm not certain how this is best packaged. As I see it, though, I don't see why you (as a user of the luigi visualizer) wouldn't want to have both at the same time. Would that be possible?

/Arash

Erik Bernhardsson

Apr 22, 2015, 8:29:16 AM
to Arash Rouhani, Alexander Krasnukhin, Luis Gonzalez, luigi...@googlegroups.com
Yeah this looks much better for the default scheduler.

Why would we want to have both? Do they accomplish different things?

Luis Gonzalez

Apr 22, 2015, 9:14:03 AM
to luigi...@googlegroups.com, ar...@spotify.com, er...@malfunction.org, the.m...@gmail.com, hade...@gmail.com
I don't think they accomplish different things, but I wasn't sure if anyone else prefers the current visualization.

I would suggest that you guys test it whenever you can with a fat workflow and let us know if anything weird happens with the graph so we can fix it. It shouldn't, but just in case (we don't have really big luigi workflows to test with right now; we have used a dummy foo/bar example and it's not big enough). It should be lightning fast anyway.

We didn't touch any REST service in the luigi back-end server; we only changed the static content of the visualization page.

Erik Bernhardsson

Apr 22, 2015, 9:29:49 AM
to Luis Gonzalez, luigi...@googlegroups.com, Arash Rouhani, themalkolm
It seems better than the existing solution so I think we should replace it. The existing one is good but doesn't work for larger graphs.

You could always try to generate a synthetic graph just to test what it looks like for big graphs.
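
One possible way to build such a synthetic graph is sketched below. It only produces a plain dictionary of task names and their dependencies, arranged in layers so the result is acyclic by construction; turning it into real luigi tasks is not shown. The function name, parameters, and the `task_<layer>_<index>` naming scheme are all made up for illustration.

```python
import random


def synthetic_dag(layers=6, width=40, max_parents=3, seed=0):
    """Return {task_name: [names of tasks it depends on]} for a layered DAG."""
    rng = random.Random(seed)
    graph = {}
    previous = []
    for layer in range(layers):
        current = ["task_%d_%d" % (layer, i) for i in range(width)]
        for name in current:
            # Tasks in the first layer have no dependencies; later layers
            # depend on a few randomly chosen tasks from the layer before.
            k = min(len(previous), rng.randint(1, max_parents))
            graph[name] = rng.sample(previous, k) if previous else []
        previous = current
    return graph


graph = synthetic_dag()
print(len(graph), "nodes,", sum(len(deps) for deps in graph.values()), "edges")
```

Raising `width` gives many nodes on the same level, which is the case most likely to stress the layout.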

Luis Gonzalez

Apr 22, 2015, 9:35:24 AM
to luigi...@googlegroups.com, the.m...@gmail.com, er...@malfunction.org, hade...@gmail.com, ar...@spotify.com
Ok then, I'll test with a really big graph and see how it works. If everything is fine I'll send the pull request for review.

This is the example we based the new graph on.


So the features are similar.

Erik Bernhardsson

Apr 22, 2015, 9:39:10 AM
to Luis Gonzalez, luigi...@googlegroups.com, themalkolm, Arash Rouhani
pretty cool!

i'm curious to see how left to right works vs top to bottom. my hunch is left to right is better because you have a lot of nodes on the same level, and since boxes are wide it's better to stack them vertically

Luis Gonzalez

Apr 22, 2015, 9:45:15 AM
to Erik Bernhardsson, luigi...@googlegroups.com, themalkolm, Arash Rouhani
The graph is dynamically scaled depending on the number of nodes (and fitted into the HTML canvas). The boxes get smaller, but you can zoom and explore with the mouse (mouse wheel), so that shouldn't be a problem. D3 is well known for its performance, so it shouldn't be slow.
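
As a rough illustration of the fit-to-canvas idea (not the code from the branch, which lives in the page's JavaScript), the scale factor could be computed as below. The function name, the `padding` parameter, and the rule of never enlarging small graphs are assumptions.

```python
def fit_scale(graph_w, graph_h, canvas_w, canvas_h, padding=20):
    """Largest uniform scale at which the laid-out graph fits the canvas."""
    usable_w = canvas_w - 2 * padding
    usable_h = canvas_h - 2 * padding
    # Use the tighter of the two dimensions so nothing is cut off.
    scale = min(usable_w / graph_w, usable_h / graph_h)
    # Assumption: never enlarge small graphs beyond their natural size.
    return min(scale, 1.0)


print(fit_scale(2000, 500, 1040, 540))
```

More nodes means a larger laid-out graph and therefore a smaller scale, which is why the boxes shrink as the workflow grows.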

Alexander Krasnukhin

Apr 22, 2015, 10:58:47 AM
to Erik Bernhardsson, Arash Rouhani, Luis Gonzalez, luigi...@googlegroups.com
I'm used to the current one and would love to have the ability to keep working with it.
--
Regards,
Alexander

Erik Bernhardsson

Apr 22, 2015, 1:25:24 PM
to Alexander Krasnukhin, Arash Rouhani, Luis Gonzalez, luigi...@googlegroups.com
Ok – maybe we can make it an option in the web interface to render the new vs old one? That way we can see what people prefer

Luis Gonzalez

Apr 29, 2015, 10:42:13 AM
to luigi...@googlegroups.com, ar...@spotify.com, er...@malfunction.org, hade...@gmail.com, the.m...@gmail.com

I'm doing this as an option in the [core] section of the config: you select either svg or d3 (svg is the default).
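
To make the idea concrete, here is a small sketch of reading such a setting with Python's standard `configparser`. The option name is not given in this thread, so `visualizer` below is a placeholder, and luigi's own configuration machinery is not used here.

```python
import configparser

SAMPLE = """
[core]
# "visualizer" is a hypothetical option name; the thread only says
# that the choice lives in the [core] section.
visualizer = d3
"""


def pick_visualizer(config_text):
    """Return 'svg' or 'd3', falling back to 'svg' when nothing is set."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    choice = parser.get("core", "visualizer", fallback="svg").strip().lower()
    if choice not in ("svg", "d3"):
        raise ValueError("unknown visualizer: %r" % choice)
    return choice


print(pick_visualizer(SAMPLE))
```

With the option left out, the function returns svg, matching the default described above.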

Here are some screenshots from a bigger flow, svg vs d3.

With SVG



With D3


D3 with zoom (mouse wheel)

Erik Bernhardsson

Apr 29, 2015, 10:47:39 AM
to Luis Gonzalez, luigi...@googlegroups.com, Arash Rouhani, themalkolm
That's really cool

I think we should consider making it the default visualizer

What does the "33/s" mean?

Luis Gonzalez

Apr 29, 2015, 10:53:12 AM
to luigi...@googlegroups.com, hade...@gmail.com, the.m...@gmail.com, ar...@spotify.com, er...@malfunction.org
It's the total execution time for that task (after its completion). We calculate it by taking the difference between the start time of the current node (parent) and that of its children.
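
A minimal sketch of that calculation, assuming the start times are available as `datetime` objects. The function name is made up, and the absolute value is taken because it is not stated here which of the two start times comes first.

```python
from datetime import datetime


def edge_label(parent_start, child_start):
    """Format the gap between two start times as whole seconds."""
    # abs() because the order of the two timestamps is an open question.
    seconds = abs((parent_start - child_start).total_seconds())
    return "%ds" % round(seconds)


child = datetime(2015, 4, 29, 10, 0, 0)
parent = datetime(2015, 4, 29, 10, 0, 33)
print(edge_label(parent, child))
```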

Arash Rouhani

Apr 29, 2015, 10:56:31 AM
to Luis Gonzalez, luigi...@googlegroups.com, themalkolm, Erik Bernhardsson
Cool feature! But I read "/s" as per second. Maybe just have "33s" if it's time and not velocity?

Luis Gonzalez

Apr 29, 2015, 11:00:23 AM
to luigi...@googlegroups.com, er...@malfunction.org, hade...@gmail.com, the.m...@gmail.com
You are right, we should remove the slash.

Erik Bernhardsson

Apr 29, 2015, 11:01:14 AM
to Luis Gonzalez, luigi...@googlegroups.com, themalkolm
It's also a bit confusing that the labels are on the edges not the vertices

I thought it was measuring some kind of flow rather than a total time