Pretty display of tabular data?

Matt Jadud

Mar 13, 2019, 2:19:07 PM
to Racket Users
Hi all,

I have a tabular data type that I'd like (I think) to be able to render either in ASCII or in a prettier way in the Interactions pane. I've explored gen:custom-write and friends, and can get the struct to display the way I want---with ASCII. Essentially easy-peasy.
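
For concreteness, here is a minimal sketch of the ASCII route via `gen:custom-write`. The struct name `tbl`, its `headers` and `rows` fields, and the sample data are assumptions made up for illustration, not the actual library.

```racket
#lang racket

;; Hypothetical table type; `tbl`, `headers`, and `rows` are illustrative names.
(struct tbl (headers rows)
  #:methods gen:custom-write
  [(define (write-proc t port mode)
     (define all-rows (cons (tbl-headers t) (tbl-rows t)))
     ;; Each column is as wide as its widest cell, rendered with ~a.
     (define widths
       (for/list ([i (in-range (length (tbl-headers t)))])
         (apply max
                (for/list ([row (in-list all-rows)])
                  (string-length (~a (list-ref row i)))))))
     ;; Pad every cell to its column width and join cells with " | ".
     (for ([row (in-list all-rows)])
       (define cells
         (for/list ([cell (in-list row)] [w (in-list widths)])
           (~a cell #:min-width w)))
       (write-string (string-join cells " | ") port)
       (newline port)))])

;; Made-up sample data.
(define cities (tbl '("City" "LatD") '(("Lowell" 42) ("Berea" 37))))
(print cities)
```

The `mode` argument is ignored here, so `print`, `write`, and `display` all produce the same text.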

What I wonder is: am I able to do something prettier? Can I encapsulate some kind of styled rendering as a snip%, or... something... so that I can render the first 5 and last 5 rows of a table with bolding of headers, etc.? 

I don't know where to start, essentially, if I wanted to try and do this. Or, perhaps it is not particularly doable. 

Pointers to examples in codebases are welcome (if such examples exist), and I can work from there. Or, indications that this might be really difficult are also welcome.

Cheers,
Matt

(Apologies if this somehow comes through twice... I sent it to plt-scheme first...)

Laurent

Mar 13, 2019, 2:25:32 PM
to Matt Jadud, Racket Users
Not sure how much this would help you, but there's a `text-table` package. It's fairly simple though.

--
You received this message because you are subscribed to the Google Groups "Racket Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to racket-users...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Jay McCarthy

Mar 13, 2019, 2:26:24 PM
to Matt Jadud, Racket Users
90% of the reason I made `raart` is because of this.

https://docs.racket-lang.org/raart/index.html#%28def._%28%28lib._raart%2Fdraw..rkt%29._table%29%29

(require raart)
(draw-here (table (text-rows THE-TABULAR-DATA)))

Jay



--
-=[ Jay McCarthy http://jeapostrophe.github.io ]=-
-=[ Associate Professor PLT @ CS @ UMass Lowell ]=-
-=[ Moses 1:33: And worlds without number have I created; ]=-

Jay McCarthy

Mar 13, 2019, 2:27:08 PM
to Matt Jadud, Racket Users
I started with the good text-table library, but found I wanted more
and more other drawing tools and ended up making something pict-like
for the terminal.

jackh...@gmail.com

Mar 13, 2019, 4:59:22 PM
to Racket Users
I've wanted this too, and got the sense that working with `snip%` instead of `gen:custom-write` was 1) the way to go and 2) very difficult. Are you planning on using this in some open source code you have right now in a github repo or something similar? I'd like to bookmark it.

Alex Harsanyi

Mar 13, 2019, 6:16:36 PM
to Racket Users

Here is an example:

    #lang racket
    (require racket/draw pict)

    (define (make-pretty-table items)
      (define column-count (length (car items)))
      (define picts '())
      (for (([row index] (in-indexed items)))
        (define font
          (if (= index 0) ;; header
              (send the-font-list find-or-create-font 12 'default 'normal 'normal)
              (send the-font-list find-or-create-font 10 'default 'normal 'normal)))
        (define color
          (if (= index 0)
              (make-object color% #x77 #x88 #x99)
              (make-object color% #x2f #x4f #x4f)))
        (define face (cons color font))
        (for ([item (in-list row)])
          (set! picts (cons (text (~a item) face) picts))))
      (let ((p0 (table column-count (reverse picts) lc-superimpose cc-superimpose 15 3)))
        (cc-superimpose
         (filled-rounded-rectangle (+ (pict-width p0) 20) (+ (pict-height p0) 20) -0.1
                                   #:draw-border? #f
                                   #:color "LightYellow")
         p0)))

This will produce:

[image attachment: Capture.PNG]

You can also control how the text in each column is aligned, and you have full flexibility in how each cell is displayed (it can be an arbitrary pict).

The disadvantage is that the output is not interactive, and it is impractical for large data sets. Still, you can get really nice results for small amounts of data.

A snip% would also be feasible, and it would allow you to implement scrolling and other nice interactive features, but that is more of a weekend project, not a "before my morning coffee" one like the above code :-)
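
As a possible middle ground between a one-off pict and a full snip%, a struct can implement `prop:pict-convertible` from `pict/convert`; as far as I know, DrRacket's interactions window uses that protocol when it prints values. The sketch below assumes a hypothetical `tbl` struct with `headers` and `rows` fields and shows only the first and last few rows under a bold header. `preview-rows` and `tbl->pict` are illustrative helpers, not part of any library.

```racket
#lang racket
(require pict pict/convert)

;; Illustrative helper: keep the first n and last n rows of a long table.
(define (preview-rows rows [n 5])
  (if (<= (length rows) (* 2 n))
      rows
      (append (take rows n) (take-right rows n))))

;; Build a pict: bold header cells, then the previewed data cells.
(define (tbl->pict headers rows)
  (define cells
    (append (for/list ([h (in-list headers)])
              (text (~a h) '(bold)))
            (for*/list ([row (in-list (preview-rows rows))]
                        [cell (in-list row)])
              (text (~a cell)))))
  (table (length headers) cells lc-superimpose cc-superimpose 15 3))

;; Hypothetical table struct; conversion to a pict is delegated to tbl->pict.
(struct tbl (headers rows)
  #:property prop:pict-convertible
  (lambda (t) (tbl->pict (tbl-headers t) (tbl-rows t))))

;; Made-up data: 20 rows of (n, n squared).
(define demo
  (tbl '("n" "n^2")
       (for/list ([i (in-range 20)]) (list i (* i i)))))
```

Calling `(pict-convert demo)` gives an ordinary pict, so it can be framed or superimposed on a background like any other.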

Alex.

Matt Jadud

Mar 13, 2019, 9:06:12 PM
to jackh...@gmail.com, Racket Users
First, thank you for all the great pointers in this thread. It is clear that different renderings will be useful in different contexts, and there are good libraries to leverage in the community. That's what I was hoping.


(I'll add Github as a second push destination shortly.)

There's a story about how I'm at the point of writing this library. The short version is that I would like to be able to do simple exploratory data analysis with relatively small data in simple ways. The word "simple" is grossly loaded in this context, so I'll just say that I want a library that supports introductory exploratory data analysis in an HtDP context, and I want it to have a pedagogic growth path, so that if students go off to use Python/NumPy/SciPy, or R/Tidyverse (or, horrors, plain R), then they've had the conceptual base to know what they want to do, even if the syntax, semantics, and learning materials are against them.

This was a first dive into starting to think seriously about syntax-case and syntax-parse, and that probably led me down roads that were not entirely productive. I did a lot of implementation work as I explored.

In the last few days, I threw out 3000 lines of exploration, and rewrote it in 300, much more of which is tests. In particular, I decided that everything I wanted to do could be handled by an in-memory SQLite database, that I could leverage Ryan's excellent 'sql' library, and that in doing so I could effectively design a small "language" (API? interface? perhaps someday a #lang?) that wraps operations on that data. However, the design of that is subject to discussion and debate, and it might be that the library ultimately encapsulates more than one interface, so that different kinds of data questions can be asked differently.
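
To make the in-memory SQLite idea concrete, here is a rough sketch that uses only the plain `db` library with SQL strings (not the 'sql' library's syntax, which I am deliberately not guessing at here). The table name, column names, and rows are made up for illustration.

```racket
#lang racket
(require db)

;; In-memory SQLite database; nothing is written to disk.
(define conn (sqlite3-connect #:database 'memory))

;; Table and column names are hypothetical.
(query-exec conn "CREATE TABLE cities (name TEXT, lat REAL, lon REAL)")

;; Made-up rows, inserted with parameterized statements.
(for ([row (in-list '(("Lowell" 42.6 71.3)
                      ("Berea" 37.6 84.3)
                      ("Yankton" 42.9 97.4)))])
  (apply query-exec conn "INSERT INTO cities VALUES (?, ?, ?)" row))

;; A "filter rows, pick columns" question expressed as SQL.
;; query-rows returns a list of vectors, one vector per row.
(define northern
  (query-rows conn
              "SELECT name, lat FROM cities WHERE lat > ? ORDER BY name"
              40))
```

A wrapper aimed at novices would presumably hide the SQL strings behind a few functions, but the substrate underneath can stay this small.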

So, the abstractive lift of using db/sql was huge, and I also like rackunit/chk. I'm also wrapping some parts of plot, so that I can have really, really short pathways to investigating data. It's early days on the pieces (in the rewrite, I skipped depth and instead stitched together a complete pipeline, as a proof of concept for the choice of SQLite as the backend). Wrapping everything under a single require, etc., hasn't happened yet, testing is reasonably underway, and documentation on the rewrite is currently lagging.

#lang racket
(require tbl/reading/gsheet
         tbl/plot)

;; The source Google Sheet: http://bit.ly/cities-gsheet
;; read-gsheet takes a version published/shared as a CSV
(define T (read-gsheet "http://bit.ly/cities-csv"))
(show (scatter T "LonD" "LatD"))
 
These two lines let me read in a CSV published via Google Sheets and get a scatterplot in DrRacket.

So, that's a long story. However, I'd welcome dialogue. I may come back with some specific questions. For the moment, I'm exploring. I had (and will have again) the ability to slurp in SQL databases (SQLite, MySQL, etc.), I currently do CSV files, and would like to output a number of these formats as well. In terms of plotting, I'd like to support basics (think early chapters of Tukey) with some customization, but ultimately know that I can always drop down to full 'plot' if I need to. 

The output of the table question is so that students can have a richer view into the tables they're working with. A lot of good pointers were in this thread.

That's long, but there you go. That's the story. A short version may be "I'm standing on the shoulders of giants," because the rewrite feels like a wrapper around sql and db... which, frankly, is lovely. (And, I'm almost starting to understand how to use the various quasiquoting syntactic forms in the sql language to build my own frankensteined queries...)

Cheers,
M





Alex Harsanyi

Mar 14, 2019, 1:26:39 AM
to Racket Users
On Thursday, March 14, 2019 at 9:06:12 AM UTC+8, Matt Jadud wrote:
> First, thank you for all the great pointers in this thread. It is clear that different renderings will be useful in different contexts, and there are good libraries to leverage in the community. That's what I was hoping.
>
> (I'll add Github as a second push destination shortly.)
>
> There's a story about how I'm at the point of writing this library.


There are now several projects announced on this list, all of which deal with
data analysis in one way or another.  Would it be possible to join forces
and merge these projects so that we end up with one library that serves
multiple purposes equally well?  Something where the final product is greater
than the sum of its parts...

Or perhaps these libraries have aims that are so different from each other
that the only thing they share is a very abstract concept of "table"?

Alex.

Matt Jadud

Mar 14, 2019, 7:42:36 AM
to Alex Harsanyi, Racket Users
> There are now several projects announced on this list, all of which deal with
> data analysis in one way or another.  Would it be possible to join forces
> and merge these projects so that we end up with one library that serves
> multiple purposes equally well?  Something where the final product is greater
> than the sum of its parts...
>
> Or perhaps these libraries have aims that are so different from each other
> that the only thing they share is a very abstract concept of "table"?

Yes?

It makes complete sense, from a practical perspective, to not duplicate work.

Without bikeshedding ("where should the conversation happen?", "what color should the logo be?"), there are easier and harder design constraints. For example, I realized that any sufficiently interesting table interface would, ultimately, embed a copy of LISP... wait... would be a half-assed reimplementation of SQL. So, in my rethink, I just set things on top of the sql library, thus providing the "base language" from which I would work. If SQL can't do it, it's possible I don't need to do it, and it is 100% certain that a first-year, who has been programming for 6 weeks, will not have introductory data questions that cannot be handled by my "target language."

Keeping a non-leaky abstraction (or, as non-leaky as possible) that lets a student who is early in HtDP do work with data (in a principled way... another loaded perspective...) is very important to me. I'll trade all the fancy databases in the world (as well as the full expressivity of SQL, and performance for datasets beyond 100K rows, and and and...) for an interface that does a small number of things very well for novices. If we can do our design work so that there are demarked shells of increasing complexity, then yes, I'm confident that we could find ways to combine forces.

If nothing else, I'm already eyeing other libraries that I want to "wrap," so that they operate on the substrate I'm laying. I want simplified plotting (with possibly reduced levels of customization from full 'plot', either enforced through interfaces or simply enforced by reducing the documentation for the interface), basic tools for summarizing and analyzing data (I'm thinking of wrapping the "data-science" library that is floating around, but not packaged)... so, yes. I don't want to reimplement everything, but I do need a common substrate. At the moment, I've decided that anywhere Racket runs (and that my students will use it) is powerful enough to also have SQLite, and for my time, energy, and task, there are worse choices than just using SQL.

But, back to bike-shedding... at the least, it might be interesting to kick around a set of requirements/wants/needs/desires, and from there think about next steps in design. However, blank-whiteboard design phase work is challenging in distributed/asynchronous modes, so I'm also concerned that a mailing list amongst people who do not know each-other is a hard way to do good design on something nuanced... but, that could just be a failing of mine. Suggestions for "next steps" on collaboration are something I'm absolutely open to. 

At the end of the day, I need tools for next Fall (ideally, sooner, so I can begin developing course materials); that's a hard, non-optional design/implementation deadline, no matter how much interest and goodwill there is to collaborate.

Cheers,
Matt


matt...@ccs.neu.edu

Mar 14, 2019, 11:51:10 AM
to Racket Users


> On Mar 14, 2019, at 1:26 AM, Alex Harsanyi <alexha...@gmail.com> wrote:
>
> Would it be possible to join forces
> and merge these projects so that we end up with one library that serves
> multiple purposes equally well?


1. I have been at this point many times.

A big example is the testing libraries that we use for teaching (test-engine/) vs production programming (rackunit/). Initially I wanted the former to compile to the latter, but because we need John’s stepper to work and we focus on DrRacket and we have pedagogic constraints, that just didn't work out.

Another example would be drawing and event handling, see pict and images and big-bang vs the GUI tool box. They are even further apart than the testing libraries.

2. So Matt’s injection of two perspectives is a really good one.

Still, it might make sense to think of the one for devs as a scaled-up version of those for pedagogic settings, and vice versa. This can happen at

— the conceptual level: let me introduce concepts in CS1 or my high school course that are useful later when you use Real Racket
— the implementation level: the pedagogic one severely restricts the dev-focused one and substantially improves the error messages.

3. Which brings me to the biggest obstacle.

The key difference between the teaching languages and Racket is that the former assume that

— beginners make mistakes
— languages must explain mistakes
— beginners better understand these explanations

This is hard to get right because implementations bake in assumptions about who the users are w/o (usually) paying attention to this idea. Whoever works on the pedagogic version, consider linking in the rewriter library so you benefit from Guillaume Marceau’s research:

(require lang/private/rewrite-error-message)
(get-rewriten-error-message x)


— Matthias

Matt Jadud

Mar 14, 2019, 1:21:02 PM
to Matthias Felleisen, Racket Users
On Thu, Mar 14, 2019 at 11:51 AM <matt...@ccs.neu.edu> wrote:

> 3. Which brings me to the biggest obstacle.
>
> — beginners make mistakes
> — languages must explain mistakes
> — beginners better understand these explanations


Yes to all three of your points. I was thinking about your last point in my first exploration, and error handling/reporting is something that is at the front of my mind as I look at things like Pyret's sanitizers, as well as the many/manifest ways that I see students struggle/fail with R. I don't think I'll be able to "solve" this in the first instance, but keeping centered the lived experience of the novice when learning to work with data (which often comes in "from the wild," and is not consistent/reliable/etc.), how to manage that messiness, and how to report back in a learning-centric way when they encounter difficulty is going to be 1) hard and 2) critical. 

Which is perhaps a reflection/restatement/variation of your last point.

I'm generally familiar with the broad space in which Guillaume's work is situated, and have seen it presented/read it. So, yes.

And, thank you for the pointer to the rewriter library. The Racket stack is rich enough at this point that I forget the wealth of tools that are available.

Cheers,
M



Ryan Kramer

Mar 14, 2019, 1:28:41 PM
to Racket Users
On Thursday, March 14, 2019 at 12:26:39 AM UTC-5, Alex Harsanyi wrote:

> There are now several projects announced on this list, all of which deal with
> data analysis in one way or another.  Would it be possible to join forces
> and merge these projects so that we end up with one library that serves
> multiple purposes equally well?  Something where the final product is greater
> than the sum of its parts...
>
> Or perhaps these libraries have aims that are so different from each other
> that the only thing they share is a very abstract concept of "table"?

I think my project "plisqin" is one of those you are thinking of. Matt's "tbl" is also one. I'm also keeping an eye on Ryan's "sql". Are there any more you were thinking of?

Regarding joining forces/merging these projects, this is a good question that I think warrants discussion. So I'll share my thoughts.

Obviously I can't speak for all of us, but right now I only see the "very abstract concept of 'table'" as potential shared code. (Also, learning about snip% earlier in this thread was awesome. I'd love to use something like that in my project.)

I think the differences between plisqin and tbl are fairly obvious - plisqin is an alternative to SQL while tbl is an alternative to "Python/NumPy/SciPy, or R/Tidyverse (or, horrors, plain R)"

Now comparing Ryan's sql to plisqin is a different story. These projects are both alternatives to SQL. But I think there is enough difference between our approaches and scope to warrant separate projects, at least for now.
1) sql seems to be mostly implemented as macros. plisqin is mostly implemented as procedures.
2) plisqin has some design decisions that some might consider "too much magic", namely inline joins and "inject-able aggregates" (need better name) as documented here: https://docs.racket-lang.org/plisqin/intro.html. Whereas sql-the-package seems to more closely mirror SQL-the-language - it would be difficult to surprise yourself with the SQL you generate.
3) I am trying to design #lang plisqin so that people with no Lisp experience can use it. (Whether I will succeed is another matter...)

I apologize to Ryan C if I have mischaracterized sql. I'd like to have a longer conversation about this, but maybe this list is not the right place. (Also, Ryan, if you think our goals are more similar than I do, I'd be happy to work with you. You're definitely a more experienced Racketeer and it would surely boost my code quality.)

- Ryan Kramer

jackh...@gmail.com

Mar 15, 2019, 6:24:23 AM
to Racket Users
I think we should all work towards making our existing code in this area more discoverable, so we can get a better sense of what libraries for working with tables exist in the wild. To those of you who own Racket packages that provide any functionality related to data tables: I recommend adding the "tabular" tag to your package's description in the package catalog. There's no need to remove more-specific tags (like "data-frame") from your package, but even if you have a more specific tag please include the general "tabular" tag so it's easy to search for your package. So far there are only 3 packages tagged with "tabular" (and one of those is a package of mine that I just tagged while writing this post). I see several packages that are good candidates for the tag:
  • data-frame
  • sqlite-table
  • table-panel
  • tabular
  • rml-core (maybe?)
  • sinbad
  • spmatrix (maybe?)
  • spreadsheet-editor
  • csv
  • csv-reading
  • csv-writing
  • simple-csv
  • Most things with the "sql" tag
The more packages we have tagged and documented, the easier it will be to find real code using tables in the wild. That is information we'll need if we want to understand how a standard `racket/table` API might look.

Greg Hendershott

Mar 15, 2019, 11:31:20 PM
to Jay McCarthy, Matt Jadud, Racket Users
> 90% of the reason I made `raart` is because of this.
>
> https://docs.racket-lang.org/raart/index.html#%28def._%28%28lib._raart%2Fdraw..rkt%29._table%29%29
>
> (require raart)
> (draw-here (table (text-rows THE-TABULAR-DATA)))

Although I didn't see one in the docs, it looks like you have an
example in the tests:

https://github.com/jeapostrophe/raart/blob/master/t/draw.rkt#L24-L39

Greg Hendershott

Mar 15, 2019, 11:52:22 PM
to Jack Firth, Racket Users
This is a great idea. Also I want to point out that:

1. Sometimes it's OK to start by sharing a repo on Git{Hub Lab}. Not
everything needs to go on pkgs.racket-lang.org immediately, to be
visible and shared, especially early on.

(To be clear, I'm not saying, "oh only perfect 1.0 things should be a
package". I'm just pointing out that pkgs.r-l.org isn't fantastic for
discoverability, so if that's the main motivation, it's not your only
or even your best option.)

2. If you do have a package that does XYZ, and someone then makes a
package for X, sometimes it's OK to change your package just to
re-`provide` their module for X (and yours still does Y and Z).

For example, my rackjure package had a threading macro. Then Alexis
made a `threading` package. It was 99% compatible, she took a PR for
the 1%, and I changed rackjure to re-provide that. And the docs say
so. So, it didn't break users of rackjure. Plus people could switch to
using `threading` directly, if/when they wanted. And the Racket world
had one less bit of duplicate code. I think that worked out well.
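
The re-`provide` pattern can be sketched in a single file with submodules. Here `upstream` stands in for the focused package (like `threading`) and `umbrella` for the larger one (like rackjure); both module names and the `double`/`triple` functions are made up for illustration.

```racket
#lang racket/base

;; Stand-in for the focused package that now owns feature X.
(module upstream racket/base
  (provide double)
  (define (double x) (* 2 x)))

;; Stand-in for the larger package: it re-exports X from upstream
;; and keeps providing its own remaining functionality.
(module umbrella racket/base
  (require (submod ".." upstream))
  (provide (all-from-out (submod ".." upstream))
           triple)
  (define (triple x) (* 3 x)))

;; Existing users keep requiring the umbrella and still see both names.
(require 'umbrella)
(define results (list (double 4) (triple 4)))
```

Users who only want X can require `upstream` directly, and nothing breaks for those who stay on the umbrella.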

(In fact if that continued to where rackjure was "merely" a
"meta-package" that re-provided focused packages, I'd be fine with
that!)

Maybe that idea could apply here?

John Clements

Mar 16, 2019, 3:13:38 PM
to jackh...@gmail.com, Racket Users
Yep, excellent idea. I’ve added the ’tabular’ tag to csv-writing.

John

jackh...@gmail.com

Mar 16, 2019, 6:54:51 PM
to Racket Users
Hooray! Now we're up to 7 tagged packages (that was fast!)

travis.h...@gmail.com

Mar 22, 2019, 11:21:34 PM
to Racket Users
I just came across a post on tabular data structures in R, Python, and SQL. The post is written as a friendly intro to the subject, which the author claims is a gap that needs filling. Thus, the post might not contain much information that is new to this group. Perhaps the opportunity is for the Racket community to use that friendly intro as a springboard to a comparison for how to approach tabular data in Racket.