ideas to implement dynamic forms and intuitive data import


Venkatesh U

Jul 3, 2009, 1:56:44 AM
to inqle...@googlegroups.com
Dave,

ideas to implement  dynamic forms and intuitive data import

 The user defines the data content, i.e. metadata, as mappings. For example, say
     I am uploading a CSV file which contains the columns Name, Age, Sex, Salary.

    In the mapping we define that
   the first column in the CSV is a resource of type person,
   the second is a resource of type age and it is minable (can be used in data mining analysis),
   the third is a resource of type sex,
   and the fourth is a resource of type salary.

   For this we should allow the user to define resources beforehand. E.g. if no resource of type person
  exists in the system ontology, the user can define that resource and assign it a URI. This resource definition
  can then be reused while loading data which are instances of this resource type.

  We will save this mapping file in XML or RDF format.
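
A minimal sketch of what such a mapping could look like in memory before it is saved. The class names (`ColumnMapping`, `CsvMappingExample`) and the `http://example.org/ontology#` namespace are assumptions for illustration; they are not inqle or XLWrap types, and only the Age column is marked minable because that is the only one the example calls out.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration only; not an inqle or XLWrap type.
final class ColumnMapping {
    final int columnIndex;        // zero-based position of the column in the CSV
    final String header;          // column header as it appears in the file
    final String resourceTypeUri; // URI of the resource type the column maps to
    final boolean minable;        // true if the column may be used in data mining

    ColumnMapping(int columnIndex, String header, String resourceTypeUri, boolean minable) {
        this.columnIndex = columnIndex;
        this.header = header;
        this.resourceTypeUri = resourceTypeUri;
        this.minable = minable;
    }
}

final class CsvMappingExample {
    // Placeholder namespace; a real mapping would use URIs from the system ontology.
    static final String NS = "http://example.org/ontology#";

    // Mapping for the Name, Age, Sex, Salary file described above.
    static List<ColumnMapping> nameAgeSexSalary() {
        return Collections.unmodifiableList(Arrays.asList(
            new ColumnMapping(0, "Name", NS + "Person", false),
            new ColumnMapping(1, "Age", NS + "Age", true),
            new ColumnMapping(2, "Sex", NS + "Sex", false),
            new ColumnMapping(3, "Salary", NS + "Salary", false)));
    }
}
```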

  For bulk loading of data through a CSV file, the user can define a mapping and just upload the CSV. Our importer will import the data based on the mapping definition. If any bad records exist which do not adhere to the mapping definition, we write them to a log file and notify the user.
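
A rough sketch of the "import good rows, log bad rows" behaviour, under stated assumptions: the mapping is reduced to an expected column count plus a list of columns that must be numeric, rejected rows are collected in a list rather than written to a log file, and the CSV split is naive (no quoted fields). `ImportResult` and `SimpleCsvImporter` are hypothetical names, not the existing FileDataImporter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical result holder: rows that passed validation and notes on rows that did not.
final class ImportResult {
    final List<String[]> accepted = new ArrayList<>();
    final List<String> rejected = new ArrayList<>(); // "line N: reason -> original line"
}

final class SimpleCsvImporter {
    // expectedColumns and numericColumns stand in for a real mapping definition.
    // numericColumns must hold indexes smaller than expectedColumns.
    static ImportResult importLines(List<String> lines, int expectedColumns, int[] numericColumns) {
        ImportResult result = new ImportResult();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            String[] cells = line.split(",", -1); // naive split: quoted commas are not handled
            String problem = validate(cells, expectedColumns, numericColumns);
            if (problem == null) {
                result.accepted.add(cells);
            } else {
                result.rejected.add("line " + (i + 1) + ": " + problem + " -> " + line);
            }
        }
        return result;
    }

    // Returns null when the row adheres to the mapping, otherwise a short reason.
    private static String validate(String[] cells, int expectedColumns, int[] numericColumns) {
        if (cells.length != expectedColumns) {
            return "expected " + expectedColumns + " columns, found " + cells.length;
        }
        for (int col : numericColumns) {
            try {
                Integer.parseInt(cells[col].trim());
            } catch (NumberFormatException e) {
                return "column " + col + " is not a whole number";
            }
        }
        return null;
    }
}
```

The caller could write `rejected` to a log file and show the user a count of skipped rows.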

 If the user wants to add ad hoc data, say a few records at a time through forms, we will read the mapping definition XML and create a dynamic form which has all the fields defined in the mapping. This dynamic form will capture the data entered by the user and save/add the data as statements in RDF.
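
One way the form idea could be sketched, assuming the mapping file has already been parsed into an ordered map of field label to property URI. `Triple` and `DynamicForm` are hypothetical names; a real implementation would build UI widgets from the labels and write the statements into an RDF model rather than a plain list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an RDF statement: subject, predicate, literal object.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

final class DynamicForm {
    // Field label -> property URI, kept in the order the fields should appear.
    private final Map<String, String> fieldToPropertyUri;

    DynamicForm(Map<String, String> fieldToPropertyUri) {
        this.fieldToPropertyUri = new LinkedHashMap<>(fieldToPropertyUri);
    }

    // Labels a UI layer would render as input fields.
    List<String> fieldLabels() {
        return new ArrayList<>(fieldToPropertyUri.keySet());
    }

    // Turns one submitted form into statements about a single subject.
    // Fields that are missing or blank are skipped.
    List<Triple> toStatements(String subjectUri, Map<String, String> submitted) {
        List<Triple> statements = new ArrayList<>();
        for (Map.Entry<String, String> field : fieldToPropertyUri.entrySet()) {
            String value = submitted.get(field.getKey());
            if (value != null && !value.trim().isEmpty()) {
                statements.add(new Triple(subjectUri, field.getValue(), value.trim()));
            }
        }
        return statements;
    }
}
```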

I think here we can make use of XLWrap. In XLWrap we need to define mappings and use those mappings to import data.
We can create a valid XLWrap mapping file through a wizard based on user inputs, then use this mapping to import data as RDF using XLWrap. We can read the same mapping file and parse it to create our dynamic forms.

This is getting very interesting.

Do you like this idea? Please feel free to suggest any modifications or correct any gaps.

Cheers,
Venki




--
Thanks and Best Regards,

Venkatesh Umaashankar
Mailto    : venka...@gmail.com

Arise, awake and stop not till the goal is reached - Swami Vivekanandha

David Donohue

Jul 3, 2009, 7:44:43 AM
to inqle...@googlegroups.com
Venki,
Great ideas. I suggest everything is stored as RDF, in one of our TDB
databases.
I like your ideas. You seem to suggest more usability. So maybe the
user's data gets imported immediately with a system-generated URI for
each thing and the superclass of each thing and for each property.
You could later tell it what URI a column header maps to. You keep a
local database of the mappings you are using.

I will note that the existing model and code might be able to deliver
all of this already. We might only have to add the ability to bulk
update a set of subjects, property URIs, etc. And a more intuitive
importer.

I have the infrastructure to easily save e.g. mappings with ~2 lines of
code. We make use of annotations, which define that the class is
persistable and which system datamodel it should be saved to by
default. Then to save such an object, you just do this:
    Persister persister = Persister.getInstance();
    persister.persist(myMappingObject);

I will note that we already have mapping objects. For each
spreadsheet we build a TableMapping object, which contains 1 or more
SubjectMappings, which in turn contain 1 or more DataMappings.

To import a TableMapping you do this:
    // get a CsvReader from the wizard, then
    FileDataImporter importer = new FileDataImporter(
        csvReader, myTableMapping, theOntModelToImportInto);
    importer.doImport();

If we create a new importing mechanism, we should keep the existing
one (since it works and only needs minor modification) and add a new
one. We can always retire the older one later.

I like the approach as it would give us an opportunity to flesh out
the browsing, etc.
And add the forms and updating mechanism.
Dave

David Donohue

Jul 3, 2009, 7:58:13 AM
to inqle...@googlegroups.com
We might consider first building the new capabilities:
- bulk updating of URIs
- dynamic model-driven forms
- widgets for looking up local mappings
- a central inqle server for storing and looking up such mappings, so that
  they can be shared among inqle servers

Then later assemble this into a new, more intuitive importer.
Start with whatever you have the most interest in.

David Donohue

Jul 3, 2009, 11:44:57 PM
to inqle...@googlegroups.com
Venki,
I reread your post and I think inqle almost supports this. It is
planned functionality, to make use of past mappings and to share
mappings across inqle servers. So upon uploading a spreadsheet, we
should query the local and remote mappings datamodel, and offer any
mappings that already exist. Or the user can create a new one.
The user could do an import with just 2 clicks or so by accepting the
default mapping, selected because its column headers match the
uploaded spreadsheet's column headers.
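
A minimal sketch of that header-matching step, assuming a hypothetical `SavedMapping` type and a simple rule: headers must match in the same order, ignoring case and surrounding whitespace. The real inqle mapping classes and the query against the mappings datamodel are not shown and may work differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical saved mapping: a name plus the column headers it was built for.
final class SavedMapping {
    final String name;
    final List<String> columnHeaders;

    SavedMapping(String name, List<String> columnHeaders) {
        this.name = name;
        this.columnHeaders = columnHeaders;
    }
}

final class MappingLookup {
    // Returns the first saved mapping whose headers match the uploaded file's headers,
    // or an empty Optional so the wizard can ask the user to create a new mapping.
    static Optional<SavedMapping> findDefault(List<String> uploadedHeaders, List<SavedMapping> saved) {
        List<String> wanted = normalize(uploadedHeaders);
        for (SavedMapping mapping : saved) {
            if (normalize(mapping.columnHeaders).equals(wanted)) {
                return Optional.of(mapping);
            }
        }
        return Optional.empty();
    }

    // Trim and lower-case so "Age" and " age" are treated as the same header.
    private static List<String> normalize(List<String> headers) {
        List<String> out = new ArrayList<>();
        for (String header : headers) {
            out.add(header.trim().toLowerCase(Locale.ROOT));
        }
        return out;
    }
}
```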

Anyway we can add that capability and it will help the use case of
repetitively importing data. However this use case might not be the
top priority?

It seems we could perhaps do some cooler stuff than improve the
mundane importing capability? Since we already can import?

Now your ideas for model-driven forms and the like are definitely cooler.

Also consider other options still. Like doing interesting things with
the results of experiments. I recently created 10 new visualization
widgets for RAP, which could be used to plot data or to plot
experiment results. How about visualizing predictive models using a
scatter chart? RapidMiner includes packages for flattening
multi-dimensional data sets into 2 dimensions, so they can be plotted
on X,Y scatter chart.
Also improving the UI for viewing experimental results: add dynamic
columns, add filtering. Also add logic to prevent experiments from
being repeated. Or add a "brute force" sampling algorithm.
Also some rules for highlighting some experimental results. E.g. if
correlation coefficient > 0.8 or more sophisticated. Also could add
the ability for user to annotate particular experiments. E.g. "This
is highly interesting" or "Publish this result to Central Inqle
Server". We might introduce an additional datamodel for storing
longer term experiments of interest. Like the "long term memory".
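
A sketch of the simplest highlighting rule mentioned above (correlation coefficient > 0.8). It assumes the Pearson coefficient is computed from two equal-length arrays of values; `ResultHighlighter` and its methods are hypothetical names and not part of inqle or RapidMiner.

```java
// Hypothetical helper for flagging experiment results worth a closer look.
final class ResultHighlighter {
    static final double THRESHOLD = 0.8; // cutoff taken from the example rule above

    // Pearson correlation coefficient of two equal-length samples.
    // Returns NaN when either sample has zero variance.
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two samples of equal length >= 2");
        }
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }
        return cov / Math.sqrt(varX * varY);
    }

    // Strictly greater than the threshold; NaN and negative correlations are not highlighted.
    static boolean isInteresting(double correlation) {
        return correlation > THRESHOLD;
    }
}
```

A more sophisticated rule might also flag strong negative correlations by comparing the absolute value against the threshold.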

Just trying to stimulate some more ideas. I hope your engagement will
be as interesting and rewarding for you as possible. I am happy to do
more mundane stuff that you identify needs to be done like improve the
data importer.

You just got on gmail now.
Dave

It seems that the latter might be more interesting for you, as they
pertain to the data mining side of things. This is an area where you
know more than I do I am sure.

Venkatesh U

Jul 4, 2009, 12:51:29 AM
to inqle...@googlegroups.com
Dave,

It seems we could perhaps do some cooler stuff than improve the
mundane importing capability?  Since we already can import?

 >>  Looks like a valid point to me


I recently created 10 new visualization
widgets for RAP, which could be used to plot data or to plot
experiment results.  How about visualizing predictive models using a
scatter chart?
    This is a great Idea. We can do this.

 I think I can concentrate more on the data mining part. I am fine with this idea. Since import is already working, we can make it more user friendly at a later point in time. I completely agree with your thoughts. I can focus on the data mining part and you can take care of improving the usability and user friendliness.

 Shall we talk at the same time tomorrow as well? We'll discuss your ideas on the data mining part and decide which one to work on and improve. Please feel free to direct me to any data mining area which you feel is appropriate for me. Meanwhile, today I'll try to roughly understand the inqle code by doing the modifications we discussed today for removing addsubjectpage.

Thanks,
Venki

Venkatesh U

Jul 4, 2009, 2:47:12 AM
to inqle...@googlegroups.com
Hi Dave,
I got inqle working now. I have attached a document which highlights the changes that need to be made in the importer wizard, based on our discussion. If you could get these changes done, I will start focusing on the other areas you have mentioned.

Venki
importer.pdf

David Donohue

Jul 4, 2009, 7:31:37 AM
to inqle...@googlegroups.com
Great list Venki! I will work on all your items. Feel free to ask me
questions about any particular task you would like to do.
Dave

David Donohue

Jul 7, 2009, 10:36:29 PM
to inqle...@googlegroups.com
Venki,
I made most of the changes you recommended to the FileDataImporter
Wizard. Fixed some bugs introduced by Jena TDB version 0.8. Much
simplified. Seems to work adequately. I am aware of no bugs though I
am sure they are there. Committed to subversion. Probably time for a
new release? 0.3.2?
Dave

Venkatesh U

Jul 8, 2009, 2:01:14 AM
to inqle...@googlegroups.com
Hi Dave,
 Went through the modified wizard. I find most of the changes in place, including saving a mapping. How do we reuse this saved mapping if I am uploading data of a similar structure? Should we be listing the saved mappings in the initial pages of the wizard?

Venki

David Donohue

Jul 8, 2009, 5:55:35 AM
to inqle...@googlegroups.com
OK, will add this feature!
Dave