Fwd: results of metaphor-fql test

38 views
Skip to first unread message

Scott Morrison

unread,
Apr 11, 2013, 7:45:19 PM4/11/13
to Ryan Wisnesky, David Spivak, categor...@googlegroups.com
Oops, reply all.


---------- Forwarded message ----------
From: Scott Morrison <sc...@tqft.net>
Date: Fri, Apr 12, 2013 at 9:44 AM
Subject: Re: results of metaphor-fql test
To: Ryan Wisnesky <ry...@cs.harvard.edu>


In the metaphor code there's currently a method

net.categoricaldata.ontology.Ontology#findIsomorphismsTo(other:
Ontology#Dataset): Iterable[Datamap]

which does all the heavy lifting. You can see it here:

https://bitbucket.org/categoricaldata/categoricaldata/src/761ff5371f5625abe7e87735fd2e8f213d402d35/src/main/scala/net/categoricaldata/ontology/Ontology.scala?at=master#cl-57

It's actually using a nice trick (although I'm somewhat surprised that
it's at all efficient)! We realize the isomorphisms I \to J of
instances over C as points in a certain limit diagram (and then take
advantage of the fact the metaphor tries to model the mathematics
closely, so we can compute limit diagrams independently of computing
pushforwards). What is this diagram?

The vertices come in two flavours, the objects of C and the generating
morphisms of C.
The arrows comes again in two flavours, called left and right. For
each generating morphism x of C, there is a left arrow source(x) \to
x, and a right arrow target(x) \to x.

What are the sets and maps we decorate this diagram with?

An object-vertex x gets labelled by all bijections (just as sets) from
I(x) \to J(x).
A generator-vertex g:x \to y gets labelled by all functions (just as
sets) from I(x) to J(y).
A left-arrow h:x \to y gets labelled by composition, i.e. it maps
bijections I(x) \to J(x) to functions I(x) \to J(y) by postcomposing
with J(h).
A right-arrow h:x \to y gets labelled by composition, i.e. it maps
bijections I(y) \to J(y) to functions I(x) \to J(y) by precomposing
with I(h).

A little thought shows that the instance isomorphisms are indeed the
limit points.

It's important for the implementation, of course, that whenever we
speak of sets of bijections or functions we generate these lazily, and
the algorithm for computing the limit also works lazily. It's also
relatively easy I think to construct examples on which this method
will perform poorly (e.g. where you often have to go right the "end"
of the set of bijections to find what you're looking for), but also
relatively easy to improve the limit algorithm (e.g. by giving our
lazy representation of a set a "filter" method, and then giving case
specific overrides that lazily evaluate the subset, e.g. if you only
want permutations of {1,...,n} fixing 7, you don't have to enumerate
all n! and discard (n-1) (n-1)! of them, you can just directly
enumerate the (n-1)!).

scott

On Fri, Apr 12, 2013 at 9:26 AM, Scott Morrison <sc...@tqft.net> wrote:
> I think in the long run allowing arbitrary strings as content is
> pretty important; people are going to want to store their data in
> these instances. If it's too much trouble to change the FQL syntax to
> allow spaces inside the strings, perhaps your conversion to and from
> JSON could just URL encode (i.e. change all spaces to "%20", etc) or
> base64 encode the strings? It makes the FQL notation much less
> readable, of course. Part of the reason I chose JSON was this problem
> --- it's a relatively human readable format that is still very
> expressive.
>
> scott
>
> On Apr 11, 2013, at 3:56, Ryan Wisnesky <ry...@cs.harvard.edu> wrote:
>
>> I've been able to compare FQL and metaphor on the delta sigma and pi examples from the FQL GUI (see attached screenshots). The good news is that, according to metaphor, the results are isomorphic for all 3 examples.
>>
>> The bad news is that FQL can't read the metaphor output for the sigma and pi examples, because the output contains values with spaces. This is like the kiss of death for FQL, because the first thing the FQL parser does is tokenize the input into words separated by spaces. I have no particular desire to fiddle with the depths of the FQL parser, but it might be possible to hack the FQL parser so as not to tokenize quoted strings. It might also be impossible...
>>
>> Ryan
>> <delta.png><pi.png><sigma.png>

David Spivak

unread,
Apr 12, 2013, 1:17:59 PM4/12/13
to Ryan Wisnesky, Scott Morrison, categor...@googlegroups.com
Hi Ryan,

Yeah, I think it's worth finishing this job (writing the new tokenizer), even if it isn't so exciting.... Thanks for asking though.

David


On Thu, Apr 11, 2013 at 8:01 PM, Ryan Wisnesky <ry...@cs.harvard.edu> wrote:
Wow, that's a sophisticated technique… for the purposes of testing I'm all for just relying on metaphors bijection checker (and FQL's, if the instances are small enough).  So far, the need for checking bijection hasn't come up in any of the other fql stuff.

Scott Morrison

unread,
Apr 13, 2013, 8:16:18 AM4/13/13
to Ryan Wisnesky, categor...@googlegroups.com, David Spivak
It occurred to me sometime later that my parser might well choke on
double quotes inside strings (even if escaped as \"), so it's not as
if I'm living up to my own expectations on string handling...

scott

On Sat, Apr 13, 2013 at 10:05 AM, Scott Morrison <sc...@tqft.net> wrote:
> https://groups.google.com/forum/#!forum/categoricaldata
>
> On Sat, Apr 13, 2013 at 9:23 AM, Ryan Wisnesky <ry...@cs.harvard.edu> wrote:
>> Thanks to the new tokenizer, FQL can now handle the metaphor instances, and
>> FQL itself can now have arbitrarily long values with spaces, as long as they
>> are quoted:
>>
>> schema C = {
>> c : C -> C;
>> }
>>
>> instance I : C = {
>> C = {(foo,foo),("baz bar", "baz bar")};
>> c = {(foo,"baz bar"),("baz bar", foo)}
>> }
>>
>> Also, how do I sign up for categorical data the google group? My last
>> message got bounced.

Ryan Wisnesky

unread,
Apr 13, 2013, 12:27:09 PM4/13/13
to Scott Morrison, categor...@googlegroups.com, David Spivak
Yeah I'm not handling escape sequences either - or recognizing any non-trivial whitespace characters besides carriage return, line feed, or tab.
Reply all
Reply to author
Forward
0 new messages