slides for TACC event today


Jason Baldridge

Sep 10, 2010, 10:41:56 AM
to textgrou...@googlegroups.com
In case any of you are curious, you can check out the slides I'm using for my short presentation at the TACC event today. Lots of pictures.

Mike -- I redid the barbecue word cloud without "barbecue" in it -- a nice thing is that Texas pops out as a reasonably high probability word. :) BTW, here's how I create the counts from a text file (e.g., cutting and pasting the text from the Wikipedia page on bbq):

> cat bbq_wikipedia.txt | tr 'A-Z' 'a-z' | tr -cs 'a-z' '\n' | sort | uniq -c | awk '{print $2":"$1}' > bbq_wordle.txt

Then I manually took out barbecue. (Nice ambiguity in meaning there... Yum.)
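
The manual removal could also be folded into the pipeline itself. A minimal sketch, assuming a hypothetical helper named wordle_counts (not part of the original command) that reads text on stdin and takes the word to drop as its only argument:

```shell
# Hypothetical helper: emit word:count lines, skipping one excluded word.
# $1 = word to drop (lowercase); the text is read from stdin.
wordle_counts() {
  tr 'A-Z' 'a-z' |                 # lowercase everything
    tr -cs 'a-z' '\n' |            # one word per line, non-letters become breaks
    grep -vx -e "$1" -e '' |       # drop the excluded word and any empty line
    sort | uniq -c |               # count occurrences of each word
    awk '{print $2":"$1}'          # reformat as word:count
}

# Usage, with the same file names as above:
#   wordle_counts barbecue < bbq_wikipedia.txt > bbq_wordle.txt
```

The -x flag makes grep match whole lines only, so a word that merely contains "barbecue" as a substring would still be counted.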

Jason


--
Jason Baldridge
Assistant Professor, Department of Linguistics
The University of Texas at Austin
http://comp.ling.utexas.edu/people/jason_baldridge

Mike Speriosu

Sep 10, 2010, 11:22:46 AM
to textgrou...@googlegroups.com
Cool, the slides look good! And thanks for the command line tip. -Mike

tmoon

Sep 10, 2010, 11:35:34 AM
to TextGrounder Open Discussion
Thanks!

Looking at the picture on p. 8, I think there's a better way to
present that data. It's not clear at all that certain words in the
image belong to certain delineated regions. This is future work, but
maybe it would be better if there were lines of demarcation indicating
the grid layout of the regions. Or you could color all cells with some
(respectively differently colored) translucent overlay and the words
associated just pop up in descending order of importance (in a stack
in the radial direction, rather than a transverse stack) if you hover
over or click on it. I think they solved the combinatorial problem
that states you can color any map with no more than 4 colors or
something in the 70s. Extend that to our case (or not, which is much
simpler anyway because there are always 4 and only 4 adjacent
regions). I think that will be visually clearer.

On Sep 10, 9:41 am, Jason Baldridge <jbald...@mail.utexas.edu> wrote:
> In case any of you are curious, you can check out the slides I'm using for
> my short presentation <http://groups.google.com/group/textgrounder-open/web/ECADT-baldridge.pdf>
> at the TACC event today <http://www.tacc.utexas.edu/education/humanities/emerging-communities-...>.

Jason Baldridge

Sep 10, 2010, 12:01:23 PM
to textgrou...@googlegroups.com
That's a good idea -- definitely a lot we can do to improve the visual aspect!

tmoon

Sep 10, 2010, 1:25:55 PM
to TextGrounder Open Discussion

> something in the 70s. Extend that to our case (or not, which is much
> simpler anyway because there are always 4 and only 4 adjacent
> regions). I think that will be visually clearer.

Stupid me. It's a checkerboard, so we only need two colors.
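
A rough sketch of the two-color idea, assuming each region is a cell in a regular grid addressed by integer row and column indices (the function name and the color names are made up for illustration). Moving to any of the 4 edge-adjacent cells changes row or column by exactly 1, which flips the parity of row + column, so neighbors never share a color:

```shell
# Hypothetical helper: pick one of two overlay colors for a grid cell.
# $1 = row index, $2 = column index (assumed integer grid coordinates).
cell_color() {
  if [ $(( ($1 + $2) % 2 )) -eq 0 ]; then
    echo red
  else
    echo blue
  fi
}

# Print the coloring of a small 3 x 4 grid, one row per line.
for r in 0 1 2; do
  line=""
  for c in 0 1 2 3; do
    line="$line$(cell_color "$r" "$c") "
  done
  echo "$line"
done
```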