what can new contributors work on?

David

Feb 22, 2009, 9:37:35 AM
to INQLE Development Team
Lots of directions to go in, and contributors can obviously work on
whatever they would like to do. Will try to capture my latest
thoughts in this thread, on what potential contributors (like
Benjamin) could work on. As the INQLE code base has taken years to
develop, I would expect a significant

I need to add a filtering capability, so that only selected experiment
results get stored (e.g. those with a correlation coefficient > 0.7).
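
A rough sketch of what such a filter might look like, in Python for brevity rather than INQLE's own code. The ExperimentResult class, its fields, and the select_results function are hypothetical names made up for illustration; only the 0.7 threshold comes from the note above.

```python
from dataclasses import dataclass


@dataclass
class ExperimentResult:
    # Hypothetical stand-in for a stored INQLE experiment result.
    experiment_id: str
    correlation: float


def select_results(results, min_correlation=0.7):
    """Keep only results whose correlation coefficient exceeds the threshold."""
    return [r for r in results if r.correlation > min_correlation]


results = [
    ExperimentResult("exp-1", 0.91),
    ExperimentResult("exp-2", 0.42),
    ExperimentResult("exp-3", 0.70),  # exactly at the threshold, so not kept
]
kept = select_results(results)
```

The comparison is strict, so a result sitting exactly at the threshold is dropped; whether that is the right call is open.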

Unit testing: RAP has a unit testing framework that I have not yet
investigated.

Employ SonarJ to review the code and restructure classes, packages,
and bundles where appropriate, to avoid dependency cycles and maximize
the cohesion of the code base.

Work on the machine learning aspects of INQLE. Do we need to add new
learning capabilities other than 10-fold cross-validation of
classification or regression? T-test? Storing selected models as a
set of RDF statements? I would think decision trees and rule-based
learners would lend themselves to this.

We could also use more learning algorithms that are tolerant of a
variety of attribute types, because the nature of sampling raw RDF
data is that you get many types of data. Ideally we would either
get RapidMiner to create new learners (e.g. M5Prime) or implement Weka
learners (especially M5Prime) or build our own learning algorithms.

We also need to do real world experiments using INQLE to identify what
we need to add to be able to generate publishable research.

We need to add some capabilities to fulfill the "Intelligent Network"
aspect of INQLE. INQLE is networked in the sense that INQLE servers
poll the Central INQLE Server (CIS) for subjects and properties to
use, sharing such RDF classes and properties from one INQLE instance to
another. But we need to add peer-to-peer querying, especially
pulling data from other INQLE servers. We need to add to the CIS the
ability to act as an index of the data classes that can be found on INQLE
servers. So the workflow is: my INQLE server polls the CIS for other
INQLE servers containing such and such RDF class, and the CIS answers with
the URLs of those INQLE servers. My INQLE server can now send (SPARQL)
requests to those INQLE servers and pull data.
Associated with these capabilities, we need to add SparqlEndpoint as a new
dataset type, and a new sampling algorithm which can use such remote or
remote + local data.
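
To make the lookup workflow above concrete, here is a minimal sketch, in Python for brevity. It assumes the CIS index is just a mapping from RDF class URI to the endpoint URLs of the servers holding that class. ClassIndex, register, servers_for, build_sample_query, and the example.org URLs are all made-up names for illustration, not existing INQLE code, and nothing is actually sent over the network.

```python
from collections import defaultdict


class ClassIndex:
    """Hypothetical in-memory stand-in for the CIS index of RDF classes."""

    def __init__(self):
        self._servers_by_class = defaultdict(set)

    def register(self, server_url, rdf_class_uris):
        # An INQLE server announces which RDF classes it holds data for.
        for uri in rdf_class_uris:
            self._servers_by_class[uri].add(server_url)

    def servers_for(self, rdf_class_uri, exclude=None):
        # Answer a poll: which servers hold this class (other than the asker)?
        servers = self._servers_by_class.get(rdf_class_uri, set())
        return sorted(s for s in servers if s != exclude)


def build_sample_query(rdf_class_uri, limit=100):
    """Build a SPARQL query pulling statements about instances of a class."""
    return (
        "SELECT ?s ?p ?o WHERE { "
        f"?s a <{rdf_class_uri}> . ?s ?p ?o "
        f"}} LIMIT {limit}"
    )


index = ClassIndex()
index.register("http://a.example.org/sparql", ["http://example.org/Patient"])
index.register(
    "http://b.example.org/sparql",
    ["http://example.org/Patient", "http://example.org/Visit"],
)

# Server "a" asks the index who else has Patient data, then prepares a query.
peers = index.servers_for(
    "http://example.org/Patient", exclude="http://a.example.org/sparql"
)
query = build_sample_query("http://example.org/Patient", limit=100)
```

The query string would then be sent to each URL in peers by whatever SPARQL client the new SparqlEndpoint dataset type ends up using.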

The "Intelligent" aspect of INQLE would be served by having algorithms that
start with past experiment results of significance, modify perhaps
one variable, and repeat them. Or repeat them on data from other
sources.
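
One possible reading of "modify one variable and repeat", sketched in Python for brevity. It assumes an experiment can be described as a label plus a list of input attributes; that dictionary layout, the attribute names, and one_variable_variants are hypothetical and only meant to illustrate the idea.

```python
def one_variable_variants(experiment, candidates):
    """Return copies of `experiment` that differ from it in exactly one attribute."""
    variants = []
    for position, current in enumerate(experiment["attributes"]):
        for replacement in candidates:
            # Skip no-op swaps and swaps that would duplicate an attribute.
            if replacement == current or replacement in experiment["attributes"]:
                continue
            attributes = list(experiment["attributes"])
            attributes[position] = replacement
            variants.append({**experiment, "attributes": attributes})
    return variants


# A past experiment of significance (made-up example data).
experiment = {"label": "bloodPressure", "attributes": ["age", "weight"]}
variants = one_variable_variants(experiment, candidates=["age", "height"])
```

Each variant could then be queued as a new experiment, and the original is left unmodified.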

I would like to take INQLE in some other new directions, e.g. implement
my survey idea. Basically, an investigator can configure a research
campaign, identifying the questions to be asked, how frequently they
should be asked, etc. Next, they add email addresses and start the survey
engine. The engine sends periodic questions (or just a link asking the
recipient to log in and answer those questions on a web page). This could be
powerful, as it would be a means of capturing research data directly
into INQLE.

Lots of other ideas too. Please let me know what you would like to
do!
Dave

David Donohue

Feb 22, 2009, 1:44:09 PM
to INQLE Development Team
More things on the to-do list that we could use help on:

Help resubmit an SBIR-STTR grant proposal. We submitted a grant
proposal in 2007 and were rejected, because the reviewers "did not
think it was possible". Well, I believe we have now proven that it is
possible:
http://code.google.com/p/inqle/wiki/INQLE_Benchmarking
And the NIH budget is 50% bigger now! So our chances of success have
likely improved.

Other work we have planned is to develop some important semantic
ontologies for scientific research. We hope to be able to codify the
results of any research study in RDF, such that machines can leverage
such findings. We have made steps toward this through our model for
storing the results of INQLE classification or regression experiments. As
INQLE gets to be a platform for more comprehensive, prospective
trials, INQLE will be able to represent publishable findings
similarly, as RDF. We envision other studies, not conducted by INQLE,
will some day be representable in the same format. Some time in the
future, organizations such as HIRU of McMaster University, or
Cochrane, would codify important research reports in this manner.
Some day scholarly journals (to the extent that they will still exist)
would require research to be thus tagged. So devising this ontology
and establishing it as a standard will become a big thing. Probably
best for us to "stake our claim" as quickly as possible, and release
this ontology soon, before INQLE is ready to generate such
comprehensive research data. So anybody who is interested in and able to
help model all the aspects of a research study in RDF should waste
no time!

David Donohue

Feb 22, 2009, 9:13:14 PM
to INQLE Development Team
Another thing I need to do is to revamp the code to fall more in line
with the offerings of the Eclipse RAP framework. This activity would
include any of: (1) making use of their infrastructure for running
asynchronous processes (for the learning agent, I created a plain old
thread); (2) adding more pluggable extension points; (3) retooling it to be
deployable either in its current form (RAP) or as a desktop version
(RCP).

Improved look & feel, using RAP's new CSS-based styling.

Text mining, using technologies like LingPipe, OpenCalais,
Web-Harvest, or perhaps the RapidMiner Text Mining plug-in, or other. This
would perhaps let you specify URLs from which to extract feature
vectors (or even semantic info), and then permit mining these feature
vectors.

David Donohue

Feb 27, 2009, 9:46:07 PM
to INQLE Development Team
Create an INQLE image for Amazon EC2 or other cloud computing
environment. Ideally this could be updated easily as we have new INQLE
releases.

Publish an example of INQLE on a server (perhaps EC2). This example app
would have to reset itself every so often, by simply overwriting the
root inqle directory. Perhaps it should have some preloaded data.

I am not sure we handle multiple sessions properly in INQLE,
particularly with respect to shared resources like databases, the persister
singleton, etc. Address any such issues.

Revamp INQLE code to be deployable as a RAP (web) or RCP (desktop)
application. Innoopract has published a tutorial/webinar on doing this.

Develop a service offering to create & run new inqle instances
automatically.

Mavenize INQLE? Not sure if this would be worth it. INQLE currently
depends on the Eclipse Feature exporting infrastructure.

David Donohue

Feb 28, 2009, 10:45:24 PM
to INQLE Development Team
Scheduler Agent: runs on a definable schedule, executing other
agent(s).
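
A minimal sketch of the Scheduler Agent idea, in Python for brevity. It assumes agents can be treated as plain callables and that the schedule is a fixed interval in seconds; SchedulerAgent and tick are hypothetical names, and the caller supplies the current time so the logic can be exercised without waiting on a real clock.

```python
class SchedulerAgent:
    """Hypothetical agent that runs other agents at a fixed interval."""

    def __init__(self, interval_seconds, agents):
        self.interval_seconds = interval_seconds
        self.agents = list(agents)
        self.next_run = 0.0  # due immediately on the first tick

    def tick(self, now):
        """Run all agents if the schedule is due; return True if they ran."""
        if now < self.next_run:
            return False
        for agent in self.agents:
            agent()
        self.next_run = now + self.interval_seconds
        return True


log = []
scheduler = SchedulerAgent(60, [lambda: log.append("learning agent ran")])

ran_at_0 = scheduler.tick(now=0)    # due, so the agents run
ran_at_30 = scheduler.tick(now=30)  # too early, nothing runs
ran_at_60 = scheduler.tick(now=60)  # due again
```

In a real deployment something would call tick periodically with the wall-clock time, or the interval could be replaced by a cron-like schedule.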


Add a plug-in which contains Web-Harvest or a similar web scraper.
Next, add a scraper agent, which executes a Web-Harvest scraping
scheme and stores the resulting data in a specified datamodel.
