Hi David,
> About a month ago I stumbled across the concepts of the semantic web
> and I have been gorging on as much documentation about it as I can. I
> work at a University and have begun trying to put together some of the
> semantic tools in ways that will make the lives of the
> Faculty/Staff/Students easier.
Sounds great. Which university do you work at?
> I would like to setup an xOperator agent to be available to anyone
> with a University LoginID so that they can ask questions like "What time
> is my next class?"
Does this mean that anyone with a University LoginID automatically has an
XMPP account?
> It looks like xOperator will be able to manage most of what I need
> after I setup a SPARQL endpoint that knows about all of the students'
> classes. The "personal agent" aspect of xOperator will also provide
> amazing opportunities for our userbase, but my first focus is providing
> an IM bot on which I can adjust responses via AIML.
I agree. Your use case matches the group agent scenario, which means the
agent has no proxy account.
> ) able to generate SPARQL queries that include the XMPP username of the
> person sending messages to the bot - perhaps some variable like
> %%USERNAME%%
This is not much of a problem. The script context just has to be extended
with that value.
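
A minimal sketch of what such a substitution could look like, written in
Python purely for illustration (inside xOperator this would live in a
Groovy script). The %%USERNAME%% placeholder is the one suggested above;
the function name fill_username, the example.org predicates and the
conservative JID pattern are assumptions, not part of xOperator.

```python
import re

PLACEHOLDER = "%%USERNAME%%"

# Deliberately strict pattern for a bare JID (local@domain). Real JIDs allow
# more characters; being strict keeps quotes and braces out of the query.
_BARE_JID = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+$")


def fill_username(query_template: str, sender_jid: str) -> str:
    """Replace %%USERNAME%% in a SPARQL template with the sender's bare JID."""
    bare_jid = sender_jid.split("/", 1)[0]  # drop the resource part, if any
    if not _BARE_JID.match(bare_jid):
        raise ValueError(f"refusing to embed unusual JID: {bare_jid!r}")
    return query_template.replace(PLACEHOLDER, bare_jid)


# Hypothetical template; the example.org predicates are placeholders.
template = (
    "SELECT ?class ?start WHERE { "
    '?student <http://example.org/jid> "%%USERNAME%%" ; '
    "<http://example.org/attends> ?class . "
    "?class <http://example.org/startsAt> ?start }"
)

print(fill_username(template, "jdoe@university.edu/laptop"))
```

Validating the JID before it is pasted into the query matters because the
value comes from the sender and ends up inside a string literal.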
> ) able to check the XMPP username's access permissions to SPARQL
> endpoints requested (via LDAP or SPARQL/LDAP)
What exactly do you mean by that? In general, Groovy scripts can use any
Java API you want, which means you can use an LDAP API too.
Another idea: why not manage the access control in the RDF triple store
or at the endpoint? In OntoWiki we have at least model-based access
control, so you can manage, at a rough (per-model) granularity, which
information is given back to users.
> ) some sort of encryption for queries leaving the bot
Do you mean SPARQL endpoints that use HTTPS?
> - I'm still trying to figure out how to encrypt access to SPARQL end-
> points.. we don't want to have a SPARQL endpoint that anyone can query
> without authenticating in some fashion first
You do not need to configure the endpoint in the config for this. You can
encode it in the Groovy script directly, which means that no "query"
command will be fired against the endpoint, and no queries from other
agents will reach it either.
In detail: normally, query scripts fire queries against all endpoints, all
neighbouring agents and the local store. You can also use the query
command to access only one specific endpoint that is not configured. Have
a look in the dbpedia*groovy files and look for
context.queryRemote(documentQuery,url)
...
> ) respond only to University LoginID's - eg. "only allow Jabber ID's
> that are in the @university.edu domain"
The university XMPP server can achieve this for you, so that only
university users are in the roster of the agent (I don't know how stable
huge rosters are ...)
> ) find some way to limit queries to only those SPARQL endpoints that
> have the information requested
> - perhaps an optional list of query-able attributes per namespace
> - if no attributes are listed for a namespace, then it always tries
> the query there
These requests are very hard to tackle. Which types of queries do you want
to use, and in which way should the agent fire the queries and answer the
user requests? Maybe you can manage this in your scripts?
> ) allow one agent to keep track of multiple users
> - eg. the bot would have a database where it can store user-specific
> settings like what their iCal stores are, what SPARQL endpoints they
> have configured, and possibly store their custom templates as well
> - most of our users won't know what the semantic web is and if I can
> have one xOperator agent running for all of the users it would make it
> easier for users to personalize the bot for their own needs
In this case, agents can't communicate in their neighbourhood, because they
do not act as their users and other agents will not allow them to query
them.
There are ideas to run only one xOperator to serve many users, but in a
more independent way (completely separate configs, scripts and files). But
for us, this development direction has a low priority for now. A first step
is scheduled for 0.3 as issue 29: Scripts to install xOperator as a service.
> If anyone has any ideas or feedback about these things I'm
> practically starving for information. Thanks for all the work ya'll
> have put into xOperator (as well as the other AKSW projects)! They've
> inspired me :^)
--
Sebastian Dietzold - Department of Computer Science; University of Leipzig
Tel/Fax: +49 341 97 323-66/-29 http://bis.uni-leipzig.de/SebastianDietzold
Hi!
I have some things to add and remark on. Find them inline.
2008/11/1 David Alston <david....@gmail.com>:
> Greetings!
>
> Responses inline..
>
> On Thu, Oct 30, 2008 at 7:45 AM, Sebastian Dietzold
> <diet...@informatik.uni-leipzig.de> wrote:
>>
>> quote david.alston (29.10.2008):
>>
>> Sounds great. In which university do you work?
>
> University of Texas at Dallas in the US.
>
Great!
With Groovy you can execute non-SPARQL queries inside a script as
well. Basically, you could even instantiate your own application server
in a script, which would be nonsense, but possible.
>
> Being able to use Java API's in Groovy is good news! I guess I'll have to
> learn a bit of java too.. :^)
>
>
>> ) some sort of encryption for queries leaving the bot
>>
>> You mean SPARQL Endpoints which use HTTPS?
>
> Yes! That's what I would like to do.. I just don't know how the SPARQL
> query clients (eg. the IM bot) will handle the certificates. I have some
> experience dealing with certs, but I haven't seen any documentation saying
> how a SPARQL client might use them.
Querying an endpoint that uses HTTPS should not be a problem at all,
at least for the client and as long as the certificate is valid. The
library used (HttpClient from Apache Commons) should handle it, see:
http://hc.apache.org/httpclient-3.x/sslguide.html
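
To show the principle from the client side, here is a minimal sketch in
Python (xOperator itself uses the Java HttpClient library, so this is an
illustration, not its code). It sends a SPARQL protocol GET request over
HTTPS and lets the standard library verify the server certificate. The
function names and the optional ca_file argument for a private
university CA are assumptions.

```python
import json
import ssl
import urllib.parse
import urllib.request
from typing import Optional


def build_sparql_request(endpoint_url: str, query: str) -> urllib.request.Request:
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_query(endpoint_url: str, query: str, ca_file: Optional[str] = None) -> dict:
    """Send the query over HTTPS and return the parsed JSON result document."""
    # create_default_context() checks the certificate chain and the hostname.
    # Passing ca_file makes it trust that CA bundle instead of the system store.
    context = ssl.create_default_context(cafile=ca_file)
    request = build_sparql_request(endpoint_url, query)
    with urllib.request.urlopen(request, context=context, timeout=30) as response:
        return json.load(response)
```

With a certificate signed by a well-known CA, nothing beyond the https://
URL is needed; a self-signed or privately signed certificate is the case
where the client has to be told which CA to trust.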
Creating a filter based upon the JID should be no problem and will
most likely be implemented in the next few days, together with the
new access control system.
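
Until that filter exists, the check itself is small. A minimal sketch in
Python, for illustration only (the real filter would be part of
xOperator); the names ALLOWED_DOMAINS, jid_domain and is_allowed are
hypothetical, and university.edu is the example domain from the original
request.

```python
ALLOWED_DOMAINS = {"university.edu"}  # assumption: example domain from the request


def jid_domain(jid: str) -> str:
    """Return the lower-cased domain of a JID such as local@domain/resource."""
    bare = jid.split("/", 1)[0]        # the resource starts at the first "/"
    domain = bare.rsplit("@", 1)[-1]   # a domain-only JID has no "@" at all
    return domain.lower()


def is_allowed(jid: str, allowed=ALLOWED_DOMAINS) -> bool:
    """True if the sender's domain is exactly one of the allowed domains."""
    return jid_domain(jid) in allowed


print(is_allowed("jdoe@university.edu/laptop"))
print(is_allowed("jdoe@university.edu.evil.org"))
```

Comparing the whole domain rather than testing for a suffix avoids
accepting look-alike domains that merely end with or contain the
university's name.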
>
>>
>>> ) find some way to limit queries to only those SPARQL endpoints that have
>>> the information requested
>>> - perhaps an optional list of query-able attributes per namespace
>>> - if no attributes are listed for a namespace, then it always tries
>>> the query there
>>
>> these request are very hard to tackle. which type of queries do you want
>> to use and in which way should the agent fire the queries and answer the
>> user requests? Maybe you can manage this in your scripts?
>
> I suppose (after I learn Groovy) I'll be able to write my own version of
> this for our private SPARQL endpoints.. but this is the algorithm I was
> thinking of..
>
> 1) translate AIML to SPARQL query
> 2) run SPARQL query against end-point
> 3) if there is an error, and there are other endpoints, then switch to next
> endpoint and goto step 2.
> 4) if no end-point returns success then print "No End-Points can answer your
> query"
> 5) if the user has specified a "debug" state, then print all the error
> messages from the queries back to the user
Having a fall-back sounds like a good idea to me, but if the endpoints
are identical, then maybe some kind of load balancing on the server
side would be better, as the client is currently not able to
efficiently detect that a server is gone or is back.
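
The five steps quoted above can be sketched as follows. This is a
minimal illustration in Python, not xOperator code: the AIML-to-SPARQL
translation (step 1) is assumed to have happened already, and run_query
is a hypothetical callable that takes an endpoint URL and a query,
returns a list of result rows, and raises an exception on any error.

```python
from typing import Callable, List, Optional, Tuple


def query_with_fallback(
    query: str,
    endpoints: List[str],
    run_query: Callable[[str, str], list],
    debug: bool = False,
) -> Tuple[Optional[list], List[str]]:
    """Try each endpoint in order and return (rows, messages for the user)."""
    errors: List[str] = []
    rows: Optional[list] = None
    for endpoint in endpoints:
        try:
            rows = run_query(endpoint, query)  # step 2
            break                              # first success wins
        except Exception as exc:               # step 3: remember, try the next
            errors.append(f"{endpoint}: {exc}")
    messages: List[str] = []
    if rows is None:                           # step 4: nobody could answer
        messages.append("No End-Points can answer your query")
    if debug:                                  # step 5: show collected errors
        messages.extend(errors)
    return rows, messages
```

An empty result list counts as success here, so the loop stops at the
first endpoint that answers without an error, even if it has no matching
data. Whether that is the desired behaviour depends on how the endpoints
overlap.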
>
> Of course, this doesn't handle the case where the user might want to include
> data from multiple end-points in the same query.. in which case I imagine
> the algorithm would look like this..
>
> 1) use AIML to break down sentence into multiple query strings
> 2) run through the known endpoints with each query
> 3) store the "select" variables from the SPARQL queries in a hash table
> 4) run the SPARQL queries against each end-point until all the "select"
> variables have been filled
> 5) return the response string with the variables replaced with their values
>
> Obviously this last algorithm would have to be modified for queries that
> need to be presented in table form..
Combining multiple queries into one result could be a bit of work. We
have, for example, the "where is * now" template, in which we execute
multiple queries, but in a more step-by-step fashion (find the
calendar of a person, then query that calendar, and show only the results
of the last query). But it is definitely doable and I would be happy
to help.
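
As a starting point, the hash-table idea from the quoted algorithm could
look roughly like this. It is a minimal sketch in Python under several
assumptions: the sentence has already been split into one query per
variable, run_query is a hypothetical callable returning a list of rows
(dicts keyed by variable name) or raising on error, and the response
template marks variables as ?name.

```python
import re
from typing import Callable, Dict, List


def fill_variables(
    queries: Dict[str, str],
    endpoints: List[str],
    run_query: Callable[[str, str], list],
) -> Dict[str, str]:
    """For each variable, ask the endpoints in order until one provides a value."""
    values: Dict[str, str] = {}
    for variable, query in queries.items():
        for endpoint in endpoints:
            try:
                rows = run_query(endpoint, query)
            except Exception:
                continue  # this endpoint cannot answer, try the next one
            if rows and variable in rows[0]:
                values[variable] = str(rows[0][variable])  # keep the first row only
                break
    return values


def render(response_template: str, values: Dict[str, str]) -> str:
    """Replace each ?name in the template; unknown variables are left as they are."""
    return re.sub(
        r"\?(\w+)",
        lambda match: values.get(match.group(1), match.group(0)),
        response_template,
    )
```

Only the first row of each result is used, so this covers single-value
answers; queries that need to be presented as a table would need a
different rendering step, as noted in the quoted text.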