Hello Florian,
Thanks for your interest.
We have a very high-level description of what Anzo is on our company web
site:
http://www.cambridgesemantics.com/technology/architecture
There is a fair amount of detailed information (although currently not well
structured) in the Open Anzo wiki, including the abstract description of the
API:
http://www.openanzo.org/projects/openanzo/wiki/AnzoClientDesign
You will likely find it important to distinguish between the Anzo quad store
(a data source service to us) and the rest of the Anzo middleware, which is a
real-time, asynchronous service-oriented architecture (SOA) in which services
are generally OSGi Java components addressed by URI. The purpose of this
design is to provide scalability through the distributed parallelism that one
can achieve with message-oriented architectures, and to make integration
simple with all manner of programs executing on multiple platforms.
The Anzo client APIs rely on these services (e.g. replication,
notification, model service, authentication, query, update, etc.) to operate.
Communication between components is generally through JSON/RDF messages over
JMS using the communications subsystem we call Combus. Other protocols, like
SOAP/REST, can be gatewayed into Combus in order to expose all connected
services. New service components can be added using either Java or
JavaScript and made available as "semantic services" or as alternatives to the
existing service types. For example, additional storage architectures (RDF,
RDBMS, etc.) can be added by implementing the data source service interfaces.
The authentication service is backed by LDAP and authentication is required
to even connect to the messaging cloud and begin making service calls.
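
To make the message flow above a little more concrete, here is a minimal
Python sketch of URI-addressed services behind an authenticated message bus.
It is only an illustration of the pattern, not Anzo or Combus code: the
MessageBus and Connection classes, the urn:example: service URI and the
message fields ("to", "user", "body") are all hypothetical, and an in-process
function call stands in for JMS.

```python
import json


class MessageBus:
    """Hypothetical in-process stand-in for a message bus (not Combus/JMS)."""

    def __init__(self, authenticate):
        self._authenticate = authenticate   # callable(user, password) -> bool
        self._services = {}                 # service URI -> handler

    def register(self, service_uri, handler):
        """Make a service reachable under its URI."""
        self._services[service_uri] = handler

    def connect(self, user, password):
        """Refuse the connection itself unless the caller authenticates."""
        if not self._authenticate(user, password):
            raise PermissionError("authentication is required to connect")
        return Connection(self, user)

    def dispatch(self, raw_message):
        """Route one JSON-encoded request to the service named in 'to'."""
        message = json.loads(raw_message)
        handler = self._services.get(message["to"])
        if handler is None:
            return json.dumps({"error": "unknown service: " + message["to"]})
        return json.dumps({"result": handler(message["user"], message["body"])})


class Connection:
    """An authenticated session; every call travels as a JSON message."""

    def __init__(self, bus, user):
        self._bus = bus
        self.user = user

    def call(self, service_uri, body):
        request = json.dumps({"to": service_uri, "user": self.user, "body": body})
        return json.loads(self._bus.dispatch(request))


# Usage: register an echo service, authenticate, then call it by URI.
bus = MessageBus(lambda user, password: (user, password) == ("alice", "secret"))
bus.register("urn:example:service/echo",
             lambda user, body: {"echo": body, "by": user})
connection = bus.connect("alice", "secret")
reply = connection.call("urn:example:service/echo", {"msg": "hello"})
```

Because callers only ever name a URI and exchange serialized messages, a
handler could in principle be replaced by an alternate implementation
without the caller changing.
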
One other major architectural feature of the Anzo middleware, built to
support scalability and offline use, is replication of data accessed through
the middleware to local quad stores that are part of the client API
libraries. Named graphs are the unit of granularity for replication and
access control. This also seems to be a convenient level of granularity for
programmers, who often treat a graph a bit like a single conceptual object.
All changes to graphs, including those cached in the local replicas, are
automatically synchronized in near real-time and propagated from their
master source by the replication and notification services. In the case of
anzo.js, the JavaScript version of the Anzo client API, this is achieved
using JMS over HTTP (described here
http://cometdaily.com/2008/07/09/implementing-a-bayeux-to-jms-bridge/). The
system supports transactions with optimistic concurrency through the use of
preconditions. The Anzo middleware also includes extensive support for secure
query caching across many users.
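
As a rough illustration of the replication and optimistic concurrency ideas
above, the Python sketch below keeps a revision number per named graph,
rejects a commit whose precondition no longer holds, and pushes each
committed graph to a local replica. Everything here is an assumption made for
illustration: the class names, the urn:example: URIs and, in particular, the
use of a plain revision check as the precondition. Preconditions in the real
system may be expressed quite differently.

```python
from dataclasses import dataclass, field


class PreconditionFailed(Exception):
    """Raised when a graph changed since the caller last read it."""


@dataclass
class NamedGraph:
    uri: str
    revision: int = 0
    triples: set = field(default_factory=set)


class MasterStore:
    """Hypothetical master source holding named graphs."""

    def __init__(self):
        self.graphs = {}      # graph URI -> NamedGraph
        self.listeners = []   # callables notified after every commit

    def commit(self, graph_uri, expected_revision, additions=(), removals=()):
        graph = self.graphs.get(graph_uri)
        if graph is None:
            graph = NamedGraph(graph_uri)
        # Precondition (assumed form): the graph must still be at the
        # revision the client based its changes on.
        if graph.revision != expected_revision:
            raise PreconditionFailed(
                f"{graph_uri} is at revision {graph.revision}, "
                f"expected {expected_revision}")
        graph.triples.difference_update(removals)
        graph.triples.update(additions)
        graph.revision += 1
        self.graphs[graph_uri] = graph
        # Notification: push the new graph state to every listener.
        for listener in self.listeners:
            listener(graph_uri, graph.revision, frozenset(graph.triples))
        return graph.revision


class LocalReplica:
    """Client-side copy of selected named graphs, updated on notification."""

    def __init__(self, master, graph_uris):
        self.watched = set(graph_uris)
        self.graphs = {}      # graph URI -> (revision, triples)
        master.listeners.append(self._on_commit)

    def _on_commit(self, graph_uri, revision, triples):
        if graph_uri in self.watched:
            self.graphs[graph_uri] = (revision, set(triples))


# Usage: one successful commit, then a stale one that is rejected.
master = MasterStore()
replica = LocalReplica(master, ["urn:example:graph/1"])
triple = ("urn:example:s", "urn:example:p", "a literal")
master.commit("urn:example:graph/1", 0, additions=[triple])
try:
    master.commit("urn:example:graph/1", 0, removals=[triple])
    rejected = False
except PreconditionFailed:
    rejected = True
```

A client whose commit is rejected would typically re-read the graph from its
replica, reapply its change and retry, which is what makes the scheme
optimistic rather than lock-based.
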
Other Anzo service components include the Glitter SPARQL engine and an
HTTP-accessible SPARQL endpoint, a text indexing component currently based on
Apache Lucene (and integrated with Glitter through "magic" predicates), and a
web services proxy (that can expose regular HTTP-accessible web services as
semantic services through Combus).
The Anzo quad store supports security for multiple users and named graphs
with versioning and metadata. The design emphasis of the Anzo store is on these
enterprise features and what is necessary for building deployable
applications rather than focusing on massive triple storage or benchmarks.
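
For readers less familiar with quad stores, the short Python sketch below
shows the general idea of security at the named-graph level: each statement
carries a fourth element naming its graph, and a pattern match only returns
quads from graphs the user may read. The QuadStore class, its method names
and the example graph names are hypothetical and are not the Anzo store's
interfaces; versioning and metadata are left out for brevity.

```python
from collections import defaultdict


class QuadStore:
    """Hypothetical in-memory quad store with per-graph read permissions."""

    def __init__(self):
        self._quads = set()                # (subject, predicate, object, graph)
        self._readers = defaultdict(set)   # graph URI -> users allowed to read

    def grant_read(self, graph, user):
        self._readers[graph].add(user)

    def add(self, s, p, o, graph):
        self._quads.add((s, p, o, graph))

    def match(self, user, s=None, p=None, o=None, graph=None):
        """Return readable quads matching the pattern; None is a wildcard."""
        pattern = (s, p, o, graph)
        results = []
        for quad in self._quads:
            # Access control is decided per named graph, not per statement.
            if user not in self._readers.get(quad[3], ()):
                continue
            if all(want is None or want == got
                   for want, got in zip(pattern, quad)):
                results.append(quad)
        return sorted(results)


# Usage: two graphs with different readers.
store = QuadStore()
store.add("ex:s1", "ex:p", "ex:o1", "ex:publicGraph")
store.add("ex:s2", "ex:p", "ex:o2", "ex:privateGraph")
store.grant_read("ex:publicGraph", "alice")
store.grant_read("ex:publicGraph", "bob")
store.grant_read("ex:privateGraph", "alice")
bob_view = store.match("bob", p="ex:p")
alice_view = store.match("alice", p="ex:p")
```
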
If you need more detail, please get in touch.
Kindest regards, Sean
--
Sean Martin,
President & CTO, Cambridge Semantics Inc.
email/XMPP:
se...@cambridgesemantics.com
phone:
+1 617 606 3411
FOAF:
http://www.cambridgesemantics.com/people/sean