interface to keras

Miller, Timothy

Jun 22, 2016, 12:15:52 PM

to cleartk-d...@googlegroups.com

I've put together some DataWriter/ClassifierBuilder/Classifier classes
that interact with some Python code using the keras neural network layer.
Steve suggested I add them to cleartk with the Beta annotation.

Two questions:

1) Should this be its own module? I could foresee others writing
theano/tensorflow/caffe versions that modify my code for those libraries,
in which case it might make sense to call it cleartk-ml-python or
something general. Otherwise the most logical place might be cleartk-ml.

2) I wrote a bunch of utilities in python that read the data file into
numpy vectors and so forth. Is there any interest in making some kind of
formal python connector with cleartk branding so someone could do 'pip
install cleartk'?

Tim

Philip Ogren

Jun 22, 2016, 12:29:50 PM

to cleartk-d...@googlegroups.com

Hi Tim. Thanks for offering up this code! That's great. :)

My first thought is to put it into its own module. One of the main considerations for having so many modules in ClearTK is to isolate dependencies as much as possible. This has been really important to me for being able to use ClearTK in a commercial/industry setting. The first thing lawyers ask for when you want to use an open source library is a list of all the dependencies. The shorter the list, the better. So, when you isolate the dependencies, it makes it easier to selectively choose the modules you need and trim that list. A secondary consideration is that a large stack of dependencies can cause problems when you integrate into other systems/frameworks that have their own dependencies, which may conflict. This isn't such a big deal because you can always shade the dependencies.

That said, it occurs to me that perhaps you won't actually be adding any dependencies per se for this code you have written. In which case, a cleartk-ml-python module might make more sense. That way, to the extent that other wrappers can reuse the code you have put together for keras - the code is in one module and they can all be together. This assumes the other wrappers don't add dependencies either.

Question: Did you write an implementation of Classifier that actually performs the neural net classification in Java? That is, can you train a neural net in keras and then use the resulting model in a ClearTK annotator without a call out to python code? I think this might inform the discussion of whether or not we are introducing a "dependency". I'm inclined to think that if at runtime (i.e. classification time) you call out to keras/python, then we should probably put this in its own cleartk-ml-python-keras module (or something like that).

Thanks,

Philip

--
You received this message because you are subscribed to the Google Groups "cleartk-developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cleartk-develop...@googlegroups.com.
To post to this group, send email to cleartk-d...@googlegroups.com.
Visit this group at https://groups.google.com/group/cleartk-developers.
For more options, visit https://groups.google.com/d/optout.

Miller, Timothy

Jun 22, 2016, 2:44:32 PM

to cleartk-d...@googlegroups.com

> Question: Did you write an implementation of Classifier that actually
> performs the neural net classification in Java? That is, can you
> train a neural net in keras and then use the resulting model in a
> ClearTK annotator without a call out to python code?

No - the classifier creates and starts a process in the constructor and
then passes it instances through stdin in the classify method. We
discussed the future possibility that models could be stored such that
something like deeplearning4j could load them, but I don't know if that is
possible right now.
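
For readers who want to picture the python end of that stdin hand-off, here is a minimal sketch. Everything in it is an assumption made for illustration, not the code described above: the line format (one whitespace-separated feature vector per line), the names serve and predict, and the placeholder rule standing in for the trained keras model.

```python
import io


def predict(features):
    # Hypothetical stand-in: a real script would query the trained network here.
    return "positive" if sum(features) > 0 else "negative"


def serve(instream, outstream):
    """Read one instance per line and answer with one label per line."""
    for line in instream:
        line = line.strip()
        if not line:
            continue  # skip blank lines rather than failing on them
        features = [float(token) for token in line.split()]
        outstream.write(predict(features) + "\n")
        # Flush after every answer so a parent process waiting on the
        # pipe is not left blocked on buffered output.
        outstream.flush()


# A real script would call serve(sys.stdin, sys.stdout); in-memory
# streams are used here so the sketch runs on its own.
demo_out = io.StringIO()
serve(io.StringIO("1.0 2.0\n-3.0 1.0\n"), demo_out)
print(demo_out.getvalue(), end="")
```

Flushing after each label matters in this kind of setup because the calling process typically waits for the answer to one instance before sending the next.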

Philip Ogren

Jun 22, 2016, 3:26:37 PM

to cleartk-d...@googlegroups.com

In that case, I think a separate module probably makes the most sense. Even though there's no dependency (I assume) from Maven's point of view, there's essentially a direct runtime dependency on Keras from the user's perspective.

Miller, Timothy

Jul 19, 2016, 5:05:28 PM

to cleartk-d...@googlegroups.com

After playing with this for a few weeks and sharing it with a few people
here, I have an idea to throw out: what if the interface were even
simpler? Instead of having ml-script plus ml-script-{keras,theano}, etc.,
just have ml-script, and have the classifier builder expect only one file:
model.out or whatever. Then whether you use keras or theano, it's your
responsibility to wrap whatever data structures, parameters, models,
etc. you need into that one file. Even using keras we've noted already that
sometimes we want more than just the parameter files, like a mapping
from words to ints inside python, and we hacked around it by putting it in
the scripts directory. This would help us get around those types of
hacks while also simplifying the cleartk modules.
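
One way the single-file idea could look on the python side is sketched below. It is only an illustration under stated assumptions: the zip container, the entry names (weights.bin, vocab.json, params.json) and the helpers save_bundle and load_bundle are all hypothetical, and the weights are treated as opaque bytes rather than anything keras-specific.

```python
import json
import os
import tempfile
import zipfile


def save_bundle(path, weights, vocab, params):
    """Pack weights (bytes), a word-to-int map and parameters into one file."""
    with zipfile.ZipFile(path, "w") as bundle:
        bundle.writestr("weights.bin", weights)
        bundle.writestr("vocab.json", json.dumps(vocab))
        bundle.writestr("params.json", json.dumps(params))


def load_bundle(path):
    """Read back the three pieces written by save_bundle."""
    with zipfile.ZipFile(path) as bundle:
        weights = bundle.read("weights.bin")
        vocab = json.loads(bundle.read("vocab.json").decode("utf-8"))
        params = json.loads(bundle.read("params.json").decode("utf-8"))
    return weights, vocab, params


# Round trip through a temporary model.out to show that the word-to-int
# mapping travels with the weights instead of living in a scripts directory.
path = os.path.join(tempfile.mkdtemp(), "model.out")
save_bundle(path, b"\x00\x01", {"the": 1, "cat": 2}, {"hidden_units": 64})
weights, vocab, params = load_bundle(path)
```

With a layout like this the Java side only ever needs to know about model.out; what goes inside stays the script author's business.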

Tim
