Question regarding GSoC 2019 Data Visualization Project


Karthik Ramesh Iyer

Mar 1, 2019, 12:37:59 PM
to Swift for TensorFlow
Hi!

I am interested in the data visualization project listed in the project ideas.
I am not familiar with Swift, and while getting started with it I learned that the Core Graphics framework is available only on macOS, whereas I use Ubuntu 18.04 LTS and Windows 10.
So is the project goal just to have support on macOS?
If not, what else can be done (maybe using C++ and OpenGL)? Can someone please shed some light on how I can go about it?

Thanks,
Karthik

Brennan Saeta

Mar 1, 2019, 12:45:10 PM
to Karthik Ramesh Iyer, Swift for TensorFlow
Hi Karthik!

Thanks for reaching out! It's a great question. For this project, we are interested in cross-platform solutions that work across macOS, Linux, and Windows. Examples of cross-platform solutions are Python's Matplotlib and R's ggplot2.

Hope that helps!

All the best,
-Brennan

--
You received this message because you are subscribed to the Google Groups "Swift for TensorFlow" group.
To unsubscribe from this group and stop receiving emails from it, send an email to swift+un...@tensorflow.org.

Karthik Ramesh Iyer

Mar 11, 2019, 8:40:48 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Hi Brennan,

This did help!
I took a brief look at matplotlib's implementation.
I think a similar solution should be feasible and apt for this project, i.e. using the Anti-Grain Geometry (AGG) C++ rendering library. AGG provides platform support for windowed output, but we could also try Qt if needed.
Sorry for the late response; I've been reading up on Swift basics and the AGG library.
I'll have a sample plot using this approach ready in a few days.
Does this sound like a good solution? If so, could I send a draft proposal before the application period opens and get reviews, so that I can make a better final proposal?

Thanks,
Karthik

Anthony Platanios

Mar 11, 2019, 9:30:51 AM
to Karthik Ramesh Iyer, Swift for TensorFlow
Hi Karthik and Brennan,

This sounds exciting! I just want to make a small comment. I believe that the backend should be completely separate from the plotting specification, and in principle we should be able to support multiple backends. One alternative to matplotlib that I believe is also very nice for visualization purposes is the Vega visualization grammar from the University of Washington.

Cheers,
Anthony


Karthik Ramesh Iyer

Mar 11, 2019, 9:51:44 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Hi Anthony!

I'm not sure I understand what supporting multiple backends means. Does it mean we should support multiple rendering engines?
If so, why is that necessary? Won't supporting multiple rendering engines be difficult?
We'd need to write plotting code against each engine's API, and we'd also need to ensure consistent output images from all of them.

Karthik

Brad Larson

Mar 11, 2019, 10:43:20 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
I think what Anthony might be suggesting, which I would agree with, is that there should be layers to the implementation: an external API for laying out and providing data to a plot, the actual generation of plots in terms of vector elements, and then the conversion of those vector elements into rasterized results. The top two layers would be defined in platform-independent Swift code. It's what to do with the bottom layer that's the question.

On iOS and Mac, you have Core Graphics (and Core Animation for layering). I was one of the founding contributors to the Core Plot project years ago (an Objective-C plotting library for Mac and iOS):


and because we were solely targeting platforms where you had Core Graphics for rasterization, we didn't worry about anything else. We used Core Animation layers for the composited elements in the plot, and Core Graphics to draw lines, arcs, text, etc.

From working on that, I can see how you could define the interface to the lower rasterization layer in terms of a protocol that the upper two layers could interact with. The protocol would define things like telling the rasterizer to draw a line from one coordinate to another, fill an area with a color, draw text at a location, etc. With a generic interface like that, you could have the possibility of using Core Graphics on Mac and iOS for easy UI integration, maybe OpenGL for accelerated rendering on some platforms, or another rasterization implementation elsewhere.
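Sketched in Swift, such a protocol might look roughly like this. To be clear, every name here is illustrative only, not an actual API from Core Plot or anywhere else:

```swift
// Illustrative sketch only; none of these names come from a real library.
struct PlotPoint {
    var x: Float
    var y: Float
}

struct PlotColor {
    var r, g, b, a: Float   // components in 0...1
}

protocol Renderer {
    /// Draw a straight line between two points in device coordinates.
    func drawLine(from start: PlotPoint, to end: PlotPoint,
                  width: Float, color: PlotColor)
    /// Fill a rectangular area with a solid color.
    func fillRect(origin: PlotPoint, width: Float, height: Float,
                  color: PlotColor)
    /// Draw a text string at a location (axis labels, titles, legends).
    func drawText(_ text: String, at location: PlotPoint, size: Float)
    /// Produce the rasterized output, e.g. write a PNG to disk.
    func saveImage(to path: String)
}
```

A Core Graphics, OpenGL, or AGG-backed type would each conform to something like this, and the upper two layers would only ever talk to the protocol.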

Ideally, it would be nice to have the minimal number of external dependencies be pulled in to use this graphing library, even to the point where we could look at rolling our own Swift rasterization engine as a fallback for platforms that don't have a native one. Even there, I imagine there will be platform-specific elements, like image outputs, display interaction, etc.

I'd be glad to help with this, especially on the developer-facing external API. We made a few mistakes on the early interface for Core Plot that made it a lot harder to use (eventually corrected), and I think some of those lessons could be applied here, especially with what Swift provides in language capabilities.

Karthik Ramesh Iyer

Mar 11, 2019, 11:07:19 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Hi Brad,

Thanks a lot for the explanation! Now I understand. 
But does implementing the top two layers and using one specific rasterization implementation sound like a good and sufficient primary goal for the GSoC period, with multiple backends as a secondary goal? (I'm asking so I can frame a proper proposal.) I'd be happy to extend the rasterization implementations even after the GSoC period ends.

I'm a beginner in Swift, so could you please tell me about some of the mistakes you made with Core Plot, so that I can be careful from the beginning and proceed accordingly?

Thanks, 
Karthik

Brad Larson

Mar 11, 2019, 12:25:08 PM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
That sounds like a pretty solid proposal to me. If you can get the general API laid out, put in place internal plotting support for a couple of the most common use cases, have a protocol for rasterization backends to comply to, and then get this operable with one of those, that sounds like a well-scoped project to me. As long as different backends can be evaluated and implemented later if needed, and the structure is flexible enough to gradually add different plot types, that'd be a great place to start from.

Two big problems we had from the start with Core Plot were exposing every configuration option to someone trying to draw a simple graph, and an impedance mismatch between the high-precision calculations and datatypes under the hood and the floats and doubles most people would provide. You want to make it as simple as possible for someone who just wants a basic scatter plot. Let them do that in one or two lines of code, with the graph having sensible defaults, autodetected axis limits, etc. Gradually expose customization options as people need them, but don't require all aspects of an axis to be set just to change tick spacing. Some of our first examples required dozens of lines of setup just to draw something simple onto the screen.

Under the hood, we had used NSDecimal for internal calculations, a Foundation datatype that stores and performs base-10 decimal math to 32-34 significant digits and avoids IEEE 754 floating point artifacts. That's what happens when you have a group of former researchers and scientists putting together a graphing framework. It really didn't need that kind of precision when rendering to pixels or even vectors. We originally exposed this decimal datatype for all the parameters on a graph, but converting to and from the floating point or integer values people actually were providing to the plots became a real headache for developers. It also slowed down the layout calculations and made internal logic more complex.

Swift can be a huge help in both of these areas. Good defaults, strong reliance on types (Swift enums alone could do a lot to make for a good plotting interface), and human-readable methods and parameters will help with getting people up and going. Operator overloading and the use of generics would address the data input and parameter setting problems we experienced. Closures would make it easy to provide custom functions for plotting on a graph.

I recommend starting from a common case that you'd encounter with Swift for TensorFlow, such as plotting loss as a function of training epochs, and working backwards from that to figure out the simplest interface that could do this, and then all the backend functionality needed to make it work. Added difficulty: multiple Y axes (for both loss and accuracy), and live updating in a notebook or other display during training.
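As a purely hypothetical sketch of that kind of minimal interface (LinePlot and every member below are made-up names, not an existing API):

```swift
// Purely hypothetical sketch of a minimal user-facing API; LinePlot
// and all of its members are invented names, not a real library.
let losses: [Float] = [2.31, 1.87, 1.42, 1.10, 0.95]

// One line for a sensible default plot: x defaults to the epoch
// indices 0..<losses.count, and axis limits are autodetected.
var plot = LinePlot(y: losses)
plot.title = "Training loss"   // optional customization, defaulted otherwise
plot.save(to: "loss.png")
```

The point is that everything beyond the first line is optional, which is exactly the "gradually expose customization" idea above.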

Karthik Ramesh Iyer

Mar 11, 2019, 12:31:55 PM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Thanks a lot! I'll keep these guidelines in mind as I proceed.

Karthik Ramesh Iyer

Mar 19, 2019, 6:22:34 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Hi everyone!

An update...

I've made a sample line graph using the AGG rendering library. I haven't built all three layers yet; I just wanted to get a rough idea of how to use Swift with C++. The code currently supports plotting multiple plots on one graph, with legends.
I haven't yet added windowed output, because I'm facing some trouble with the platform support (using SDL) that AGG provides. I'll figure it out soon, or try using Qt. I've attached the generated plots; I'm saving them as PNG and PPM images for now, using lodepng for the PNGs.

Here's the link to the github repo for the code: https://github.com/KarthikRIyer/swift-agg-line-graph

Any feedback would be appreciated. Based on it I'll get started with writing my proposal.

I also have a question I'd like to clear up: to pass a float array from Swift to C++, I get a float pointer to the array, so I'm unable to determine the array's length in C++. I'm currently computing the length in Swift and passing it as a separate parameter. Is there another way to do this? A web search didn't help.

Thanks,
Karthik
agg_test_2.png
agg_test_1.png

Brad Larson

Mar 19, 2019, 4:58:56 PM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Working directly with C and (especially) C++ in Swift can be a little more challenging. You will spend some time wrangling raw pointers.
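On the length question from the previous message: a C pointer carries no length information, so passing the count alongside the pointer is the standard pattern, not a workaround. As an illustration (plot_points here is a hypothetical C function imported through a shim header, not a real API):

```swift
// Hypothetical C declaration, made visible through a shim header:
//     void plot_points(const float *xs, const float *ys, size_t count);
let xs: [Float] = [0, 1, 2, 3]
let ys: [Float] = [0, 1, 4, 9]

xs.withUnsafeBufferPointer { xBuffer in
    ys.withUnsafeBufferPointer { yBuffer in
        // The count must travel alongside the pointers; C has no way
        // to recover it from the pointers alone.
        plot_points(xBuffer.baseAddress, yBuffer.baseAddress, xs.count)
    }
}
```

At a call site Swift will also implicitly bridge an Array to a const pointer parameter, as in `plot_points(xs, ys, xs.count)`, but either way the count stays explicit.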

If you are serious about using AGG as a possible rendering backend, I'd highly recommend wrapping it in a package via the Swift Package Manager. That will let you do something like "import AGG" in your Swift code and go from there. It'll get rid of the command-line compilation and linking you need now and place that within the Swift build system. You can also set up C shim headers to help with interfacing into Swift, for things like functions with variadic parameters or C++ types that Swift doesn't directly support. There's an example of this shown here:


Structurally, I'd lift a lot of the C++ code that you have in line_agg.cpp into Swift. We'll want as much of this in Swift as possible. We'll also want to design this to interface with different rendering backends, not just AGG. The issues you're encountering with passing lists of X, Y values or other data will largely go away if the plot generation and layout is handled in Swift and your interaction with the C / C++ rendering engine consists of creating lines to and from points, setting colors, and so on. Also, and this is just my personal take as someone who has written a lot of code in both Swift and C++, I think you might find it easier in the long run to implement this more complex logic in Swift.

If you look at matplotlib as a template, they have everything in Python all the way down to a generic interface for rendering backends, and even the interfaces for those backends. The rendering backend API is relatively simple. The rendering backend manages graphics contexts, rendering primitives (lines, arcs, text, etc.), output to binary data or display, and a few other minor tasks. This lets you easily implement new backends and swap between them from platform to platform.

For example, I think having Core Graphics as a backend on Mac (and eventually iOS) makes sense to take advantage of capabilities on those platforms (and to reduce external dependencies there), and we might want to hook into ipykernel for direct display into Jupyter / Colab notebooks as another backend.

Karthik Ramesh Iyer

Mar 20, 2019, 5:09:53 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Thanks a lot for the feedback.
Most of the C++ code in the sample just draws things. The only logic specific to this plot, I think, is how the points are scaled to the image size and how the axis markers are placed, so that's what I need to move to Swift, right? I don't see how moving that to Swift would help with passing the X and Y arrays to C++: I'd still need to pass an array of coordinates as float arrays, and I'd still get a float pointer in C++.
I'll try redoing the sample following the guidelines suggested above and get back once I have something, or if I face any problems.

Brad Larson

Mar 20, 2019, 11:16:20 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
While most of the C++ code is used to draw your graph, the significant majority of it is being used to lay out a drawing at a higher level than pure direct drawing. For example, these are all of your classes / functions / methods that I think should be lifted to Swift:

class Point
class SubPlot
class LineGraph
LineGraph::addSubPlot
setPoints
getNumberOfDigits
getMaxX
getMaxY
draw_border
addTitle
addXLabel
addYLabel
drawLegend
draw_markers
drawLine
drawGraph
drawAndSaveGraph
addSubPlot
initialize()
initializeAndAddPts

Each of those has to do with higher-level organization of the graph and its various elements. The backend renderer should handle primitive and generic drawing components, so these are the only functions / methods I'd leave behind in C++ for dealing with AGG directly:

drawLine
addTextPathToRasterizerAbsCoord
text
drawColorRect
saveImage
write_png
write_ppm

Each of those (or a C shim equivalent) would be called from a Swift wrapper for the backend rasterizer. That specific Swift wrapper for the backend would conform to a protocol that would define the Swift interface to any backend (SVG, Core Graphics, etc.) and allow for easy swapping.

In such a structure, points to be plotted, plot layout parameters, axis titles, etc. would be provided from Swift code to Swift code and you'd avoid the conversions to and from C++ datatypes. You'll still have conversions at the point of drawing, but the most complex ones would be limited to changing arrays of points for a polyline into a buffer of points to draw or String into some kind of byte buffer for text drawing. Once that was defined a single time, you won't have to worry about it again. All of the more involved logic will happen at a higher level in Swift.
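Those boundary conversions, sketched once in Swift, look something like the following. drawPolyline and drawText stand in for hypothetical C shim functions; they are not from any existing library:

```swift
// Sketch of the drawing-boundary conversions described above.
// drawPolyline and drawText are hypothetical C shim functions:
//     void drawPolyline(const float *xy, size_t pointCount);
//     void drawText(const char *text, float size);
func render(points: [(x: Float, y: Float)], label: String) {
    // Flatten the points into the interleaved buffer a C renderer
    // typically expects: [x0, y0, x1, y1, ...].
    let flattened = points.flatMap { [$0.x, $0.y] }
    flattened.withUnsafeBufferPointer { buffer in
        drawPolyline(buffer.baseAddress, points.count)
    }
    // Strings become null-terminated C strings for text drawing.
    label.withCString { cLabel in
        drawText(cLabel, 12.0)
    }
}
```

Once a wrapper like this exists, the rest of the Swift code never touches a raw pointer.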

Karthik Ramesh Iyer

Mar 20, 2019, 11:24:40 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
Yes, I realised this a little later and am currently writing all the logic in Swift.

But I'm facing a problem making a C++ package with SPM. I followed the blog you mentioned above, but I'm getting this error on building:
'SwiftCpp' /home/karthik/Swift_programs/SwiftCpp: error: target at '/home/karthik/Swift_programs/SwiftCpp/Sources' contains mixed language source files; feature not supported

Brad Larson

Mar 20, 2019, 11:35:31 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
My guess is that the SwiftCpp directory or its subdirectories contains both Swift and C++ code. The C++, C, and Swift code need to be separated into individual targets. The Swift Package Manager only supports one language per target at present.

In the above-linked example, they have all the C++ code in the cpplib module and keep it separate from the others. They then make cpplib a dependency of the cwrapper target, and in turn made cwrapper a dependency of the Swift code. The Swift code accesses the C++ code via the C wrapper, and all three language targets are kept in separate directories.

It can take a bit of work to organize things properly for the Swift Package Manager when dealing with system libraries and multiple languages.

Karthik Ramesh Iyer

Mar 20, 2019, 11:57:04 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
I got it to work. I think the Package.swift file mentioned in the blog is outdated.

Instead of:
import PackageDescription

let package = Package(
    name: "Objc",
    targets: [Target(name: "objc-exec", dependencies:["objc"]),
              Target(name: "objxec", dependencies:["objc"]),
              Target(name: "swift-exec", dependencies:["objc"])]
)
I had to write this:
import PackageDescription

let package = Package(
    name: "SwiftCpp",
    targets: [
      .target(name: "cpplib", dependencies:[], path : "Sources/cpplib"),
      .target(name: "cpp-exec", dependencies:["cpplib"], path : "Sources/cpp-exec"),
      .target(name: "cwrapper", dependencies:["cpplib"], path : "Sources/cwrapper"),
      .target(name: "swift-exec", dependencies:["cwrapper"], path : "Sources/swift-exec"),
    ]
)
Earlier I hadn't given the paths to the individual modules; I thought I just had to give the path as "Sources". Specifying the individual directories fixed it.

Karthik Ramesh Iyer

Mar 21, 2019, 4:16:47 PM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
As suggested above, I've written all the logic in Swift and wrapped AGG and lodepng via the Swift Package Manager.

Everything builds fine, but I'm getting a segmentation fault when I run it, and I'm unable to figure out the cause.

Doing this in gdb:

file .build/debug/GraphPlot
run

gives me this:

Program received signal SIGSEGV, Segmentation fault.
0x000055555558c748 in get_text_width (s=<error reading variable: Cannot access memory at address 0x7fffff7feff0>, size=<error reading variable: Cannot access memory at address 0x7fffff7fefec>, 
    object=<error reading variable: Cannot access memory at address 0x7fffff7fefe0>) at /home/karthik/agg_eg/swift_line_2/LinePlot/Sources/RendererWrapper/RendererWrapper.cpp:32
32 float get_text_width(const char *s, float size, const void *object){

I don't have much experience debugging segmentation faults, so any guidance would help.

Thanks,
Karthik

Karthik Ramesh Iyer

Mar 22, 2019, 10:05:05 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
I can now get the output in a window using SDL. SDL works on Linux, Windows, and macOS, so does that sound like a feasible option? For SDL I just have to figure out how to get it to build with SPM, or how to add compiler flags with SPM so that it uses the SDL development library installed on the system.

For now I've added the window to the older code with all the logic in C++. I can do the same in the new SPM-based code once the segmentation fault is solved.
line_graph.png
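One option I'm looking at for the SPM build issue (an untested sketch, assuming SDL2's pkg-config file is installed and a module.modulemap is added under Sources/CSDL2; target names are my own placeholders) is declaring SDL as a system-library target:

```swift
// swift-tools-version:4.2
// Package.swift sketch: declare the system-installed SDL2 development
// library via pkg-config, so no manual compiler/linker flags are needed.
import PackageDescription

let package = Package(
    name: "GraphPlot",
    targets: [
        // Requires a module.modulemap in Sources/CSDL2 exposing SDL.h.
        .systemLibrary(
            name: "CSDL2",
            pkgConfig: "sdl2",
            providers: [.apt(["libsdl2-dev"]), .brew(["sdl2"])]
        ),
        .target(name: "GraphPlot", dependencies: ["CSDL2"]),
    ]
)
```

With this, the Swift code could just `import CSDL2` and SPM would pick up the include and link flags from pkg-config.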

Marc Rasi

Mar 22, 2019, 2:38:21 PM
to Karthik Ramesh Iyer, Swift for TensorFlow
I have a UI suggestion! Swift-Jupyter can display matplotlib plots inline (see the "rich output" section in the README). It would be really cool if it could also show plots from a native Swift library inline. (Though there's no need to do it now; if the Swift library is sufficiently layered, it should be easy to add an inline Jupyter display option later.)

Necessary pieces are:
  1. Swift-Jupyter should let you import packages that use C code, like the package that you are writing. I have a PR for that, so this should happen in a few days.
  2. Swift-Jupyter should expose some Swift API for displaying graphics, maybe like `displayInline(png: Data)`. This won't be too hard; the implementation just needs to construct and sign Jupyter "display_data" messages, and return them when asked for by this callback mechanism.
  3. The native Swift plotting library needs to pass the image data to the `displayInline` API. Perhaps the library can have some concept of registering a UI, and then the notebook can call `registerPlottingUI(displayInlineJupyter)`, where "displayInlineJupyter" is some object that receives data and passes it to `displayInline(png: Data)`.
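A rough sketch of the registration idea in step 3, where every name is hypothetical and `displayInline(png:)` is the API proposed in step 2:

```swift
import Foundation

// Hypothetical sketch of the registration concept; no name here is
// from an existing library.
protocol PlottingUI {
    func display(png: Data)
}

// The UI object a notebook would register: it receives finished plot
// images and forwards them to the (proposed) displayInline(png:) API.
struct JupyterInlineUI: PlottingUI {
    func display(png: Data) {
        displayInline(png: png)   // provided by Swift-Jupyter in step 2
    }
}

// The plotting library holds one registered UI; when a plot finishes
// rendering, it sends the PNG data there instead of only writing files.
var registeredPlottingUI: PlottingUI?

func registerPlottingUI(_ ui: PlottingUI) {
    registeredPlottingUI = ui
}
```

Then the notebook would just call `registerPlottingUI(JupyterInlineUI())` at startup.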


Karthik Ramesh Iyer

Mar 22, 2019, 3:17:47 PM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
This looks very interesting! I'd definitely like to work on this once the library is completed!

Karthik Ramesh Iyer

Mar 23, 2019, 8:50:37 AM
to Swift for TensorFlow, ki...@ch.iitr.ac.in
https://matplotlib.org/gallery/images_contours_and_fields/quiver_simple_demo.html

Is this what the term Fields refers to on the GSoC ideas page?