[eobjects] #851: BackendProxyFeature Listener

eobjects

May 21, 2012, 4:58:42 PM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+---------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Keywords: listener
-------------------------+---------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+---------------------------
Premises:
- Simple data contexts perform all query work in memory, on the local CPU.
- A real RDBMS accessed over JDBC performs all query work on the server side.

Which features are executed locally differs per data context, and even more
so when (multiple and/or chained) CompositeDataContexts are used.

On the (GUI) frontend side we may be handed a generic DataContext object and
a Query object from somewhere, so the code path has no information about the
query or the backend.
In that case it makes sense to have a listener for certain software proxy
features, which is triggered by the combined DataContext and Query code path.

The implementing DataContext knows which features its backend supports and
which features it has to proxy/execute itself after the backend query (or
queries) has returned data.

So when these software implementations of missing backend features are
triggered, a listener could log the event, or let the user or some logic
decide whether it is safe to execute the query.

The features should cover all the different query parts, such as:
SORT, GROUP, MAX, AVG, WHERE, JOINS, etc.
plus the aggregated (compound) local impact of the missing features:
CPU, MEMORY, DISK (if supported)
This lets the implementing listener select either a wide range or a very
specific subset of backend proxy features to be notified about.

For example:
{{{
dataContext = new UnknownDataContext(...);
dataContext.addBackendProxyFeatureListener(new BackendProxyFeatureListener() {
    // The features this listener wants to be notified about.
    public BackendProxyFeature[] getBackendProxyFeatures() {
        return new BackendProxyFeature[] { BackendProxyFeature.COMPOUND_CPU };
    }

    public boolean allowProxyFeatureExecute(BackendProxyFeatureEvent event) {
        Query q = event.getQuery();
        DataContext dc = event.getDataContext();

        // Returns the one or more features in the query that triggered this event.
        List<BackendProxyFeature> triggers = event.getQueryBackendProxyTriggers();

        // Returns a magic 0-100 (??) scale number for how heavy the impact
        // on the local system is if the query is executed.
        int w = event.getQueryBackendProxyExecuteWeight();

        System.out.println("About to execute cpu bound query: " + q + " on dc: " + dc + " weight: " + w);
        return true; // let user choose in gui.
    }
});

public enum BackendProxyFeature {
    COMPOUND_CPU,
    COMPOUND_MEMORY,
    COMPOUND_DISK,

    WHERE_COLUMN,
    WHERE_IN,
    WHERE_SUBSELECT,
    SORT_BY,
    ORDER_BY,
    GROUP_BY,
    SUM, AVG, MIN, MAX // etc.
}
}}}

--
Ticket URL: <http://eobjects.org/trac/ticket/851>
eobjects <http://eobjects.org/trac>
The eobjects project management system, based on the trac system.

eobjects

May 22, 2012, 1:51:36 PM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------

Comment (by kasper):

I wonder why this needs to be a listener implementation. Why not simply
have a way of retrieving this information from a DataContext? In other
words: Is it required to be something which is notified (like a listener)
instead of having it simply as e.g. a "getFeatures()" method on the
DataContext interface?
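
As a rough illustration of this pull-style alternative, the sketch below shows what a plain "getFeatures()" accessor could look like. Everything in it is hypothetical: QueryFeature, FeatureAwareDataContext and the two example contexts are made-up names for illustration only and are not part of MetaModel's API.

```java
import java.util.EnumSet;
import java.util.Set;

public class GetFeaturesSketch {

    /** Hypothetical list of query features; not a MetaModel type. */
    enum QueryFeature { WHERE, GROUP_BY, HAVING, ORDER_BY, JOIN }

    /** Hypothetical pull-style accessor, standing in for a method on DataContext. */
    interface FeatureAwareDataContext {
        /** Features the backend executes natively. */
        Set<QueryFeature> getFeatures();
    }

    /** Stand-in for a backend that handles every feature itself. */
    static class FullBackendContext implements FeatureAwareDataContext {
        public Set<QueryFeature> getFeatures() {
            return EnumSet.allOf(QueryFeature.class);
        }
    }

    /** Stand-in for a backend that can only filter rows. */
    static class WhereOnlyContext implements FeatureAwareDataContext {
        public Set<QueryFeature> getFeatures() {
            return EnumSet.of(QueryFeature.WHERE);
        }
    }

    /** Features that would have to be emulated outside the backend. */
    static Set<QueryFeature> missingFeatures(FeatureAwareDataContext dc) {
        EnumSet<QueryFeature> missing = EnumSet.allOf(QueryFeature.class);
        missing.removeAll(dc.getFeatures());
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(missingFeatures(new FullBackendContext()));
        System.out.println(missingFeatures(new WhereOnlyContext()));
    }
}
```

With an accessor like this the caller asks before executing, instead of being notified while the query runs.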

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:1>

eobjects

May 22, 2012, 4:46:13 PM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------

Comment (by w.cazander):

Yes, that is the basic need: to know which features a backend supports AND
which missing features a Query uses when it executes.

Without a listener, the same result can be reached with a QueryInterceptor
of some sort.
{{{
dataContext = new UnknownDataContext(...);
dataContext.getNativeFeatures();
// Result should be SomeFeaturesEnum.values() minus getNativeFeatures(), but this is not enforced.
dataContext.getProxiedFeatures();
dataContext.addQueryInterceptor(new QueryInterceptorAdaptor() {
    @Override
    public boolean canExecute(DataContext dc, Query q) {
        List<SomeFeaturesEnum> queryProxyFeatures = dc.getProxiedFeaturesForQuery(q);

        if (queryProxyFeatures.contains(SomeFeaturesEnum.COMPOUND_CPU)) {
            System.out.println("Killed query for some reason: " + q + " on dc: " + dc);
            return false;
        }
        return true;
    }

    @Override
    public boolean preExecute(...) {}

    // Or a more generic listener, as there are many more events.
    public boolean postExecute(...) {}
});
}}}

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:2>

eobjects

May 24, 2012, 3:18:31 AM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------

Comment (by kasper):

We need to talk about (and maybe you need to explain) these feature enums.
I am not sure what you mean by COMPOUND_CPU, COMPOUND_MEMORY etc.

What I do see is that there are currently three different levels of query
handling:

* Those that handle 100% in backend (JDBC).
* Those that can handle WHERE clauses in backend (MongoDB + CouchDB, as
far as I remember).
* Those that handle everything in MetaModel's query postprocessor.

Additionally it is important to look at the features of the query
postprocessor. Specifically it is important if a query feature can be
applied in a streaming fashion or if it requires loading the complete
dataset into memory. This is determined by the query clauses:

* Streaming FROM clauses is a prerequisite for any streaming work. This
  is determined by the implementation of the materializeMainSchemaTable(...)
  method in each of the DataContext implementations. Most implementations
  are streaming, but NOT the XML-DOM implementation and the Excel .xls
  (pre-2003) implementation.
  * Since this is the main point where implementations deviate, this is
    also the thing that I think would be valuable for the API to inform about.
* Streaming SELECT is supported, including all aggregate functions.
* Streaming WHERE is fully supported.
* Streaming LIMIT (max rows and first row) is fully supported.
* But, streaming is not supported for:
  * GROUP BY
  * HAVING
  * ORDER BY
* Note to myself: Cannot remember how JOIN works with relation to
  streaming. I think one of the two joined datasets will be materialized
  completely in memory.
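
The list above could be condensed into a single check, roughly as sketched here. This is an illustration only, not MetaModel code: the Clause enum and canStream(...) are hypothetical names, and JOIN is treated as materializing purely as a cautious assumption, since its behaviour is left open above.

```java
import java.util.EnumSet;
import java.util.Set;

public class StreamingCheckSketch {

    /** Hypothetical clause names; not MetaModel types. */
    enum Clause { SELECT, WHERE, LIMIT, GROUP_BY, HAVING, ORDER_BY, JOIN }

    /** Clauses listed above as not streamable in the query postprocessor. */
    static final Set<Clause> MATERIALIZING =
            EnumSet.of(Clause.GROUP_BY, Clause.HAVING, Clause.ORDER_BY);

    /**
     * Returns true if the query could be processed row by row.
     *
     * @param fromClauseStreams whether the data context streams its FROM clause
     * @param used              the clauses the query uses
     */
    static boolean canStream(boolean fromClauseStreams, Set<Clause> used) {
        if (!fromClauseStreams) {
            return false; // a streaming FROM is the prerequisite for everything else
        }
        for (Clause c : used) {
            // Assumption: JOIN is counted as materializing to stay on the safe side.
            if (MATERIALIZING.contains(c) || c == Clause.JOIN) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(canStream(true, EnumSet.of(Clause.SELECT, Clause.WHERE, Clause.LIMIT)));
        System.out.println(canStream(true, EnumSet.of(Clause.SELECT, Clause.ORDER_BY)));
        System.out.println(canStream(false, EnumSet.of(Clause.SELECT)));
    }
}
```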

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:3>

eobjects

May 24, 2012, 8:45:42 PM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------

Comment (by w.cazander):

For a small list of features, see the SqlOperation and SqlOperationMatcher
example in:

http://code.google.com/p/tungsten-replicator/source/browse/trunk/replicator/src/java/com/continuent/tungsten/replicator/database/SqlOperation.java

For the full list, use the lines starting with "%token <keyword>" in:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob_plain;f=src/backend/parser/gram.y;hb=HEAD

The compound features are for higher-level, aggregated detection, for
example:
- COMPOUND_CPU (HAVING, SELECT_INNER)
- COMPOUND_MEMORY (GROUP_BY, HAVING, ORDER_BY)
- COMPOUND_DISK (all local file system backed contexts)
- COMPOUND_NETWORK (mongo, jdbc-based-on-url)

Let's try to explain with some samples:
{{{
// Let's open a large, many-GB csv file.
IterableDataContext csvDataContextIterable = new CsvIterableDataContext(testFile);

if (csvDataContextIterable.hasFeature(BackendProxyFeature.SELECT)) {
    // The iterable interface does not even have a query method.
    System.out.println("iterable backend should not have select support");
}

// Create a DataContext from the iterable which implements all features.
DataContext csvDataContextNormal = new FullQueryIterableDataContext(csvDataContextIterable);

// Which could be shorthand for a chain of data contexts which each add support for a feature.
//DataContext dc1 = new QueryableDataContext(csvDataContextIterable);
//DataContext dc2 = new SelectableDataContext(dc1);
//DataContext dc3 = new GroupableDataContext(dc2);
//DataContext dc4 = new OrderedDataContext(dc3);

if (csvDataContextNormal.getMissingFeatures().isEmpty()) {
    System.out.println("DataContext should support all features");
}

Query qStream = csvDataContextNormal.query().select().from().where().execute();
Query qMemory = csvDataContextNormal.query().select().from().groupBy().execute();

if (csvDataContextNormal.isQueryUsingFeature(qStream, BackendProxyFeature.GROUP_BY) == false) {
    System.out.println("qStream does not use group by");
}
if (csvDataContextNormal.isQueryUsingFeature(qMemory, BackendProxyFeature.GROUP_BY)) {
    System.out.println("qMemory uses group by");
}
if (csvDataContextNormal.isQueryUsingFeature(qMemory, BackendProxyFeature.COMPOUND_MEMORY)) {
    System.out.println("qMemory uses local memory feature");
}
if (csvDataContextNormal.isQueryUsingFeature(qMemory, BackendProxyFeature.COMPOUND_NETWORK) == false) {
    System.out.println("qMemory uses no network traffic.");
}

// Now some high level features
DataContext pgA = new JdbcDataContext(...);
DataContext pgB = new JdbcDataContext(...);
DataContext pgC = new JdbcDataContext(...);
DataContext pgAll = new CompositeDataContext(pgA, pgB, pgC);

Query qNative = pgAll.query().select().from("pgA.sch.table").execute();
// Reuses the qStream variable declared in the csv sample above.
qStream = pgAll.query().select().from("pgA.sch.table").join("pgB.sch.table").execute();

if (pgAll.isQueryUsingFeature(qNative, BackendProxyFeature.COMPOUND_MEMORY) == false) {
    System.out.println("qNative does not use any local memory feature");
}
if (pgAll.isQueryUsingFeature(qStream, BackendProxyFeature.COMPOUND_MEMORY)) {
    System.out.println("qStream uses local memory feature");
}
if (pgAll.isQueryUsingFeature(qStream, BackendProxyFeature.COMPOUND_NETWORK)) {
    System.out.println("qStream on jdbc will use network traffic.");
}


// Let's do strange stuff
public enum CacheType {
    QUERY_ONLY, /* copy data to tmp table to query data. */
    CACHE_LAZY, /* build cache while queries come in */
    CACHE_NOW   /* start prefetching all data now */
}

JdbcCachedDataSource conf = new JdbcCachedDataSource();
conf.setCacheType(CacheType.CACHE_LAZY);
conf.setCacheUpdateType(CacheUpdateType.NONE); // no refresh of data, else:
//conf.setSyncBatchSize(10000);
//conf.setSyncRunTimer(60); // refresh every hour
//conf.setSyncRunSelector(RunSelector.ONE_TABLE);
//conf.setSyncQueryResults(true); // refresh data from select query results in real dc.
DataContext cachedDataSource = new JdbcCachedDataSource(pgAll, "some-embedded-sql-db-url");

Query query = cachedDataSource.query().select().from("pgA.sch.table").join("pgB.sch.table").execute();

if (cachedDataSource.isQueryUsingFeature(query, BackendProxyFeature.COMPOUND_MEMORY) == false) {
    System.out.println("query is using no local memory");
}
}}}
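
To make the compound idea a bit more concrete, here is a minimal sketch of how COMPOUND_CPU and COMPOUND_MEMORY could be derived from the proxied features a query uses, following the groupings listed before the samples. The names Feature, Compound and compoundsFor(...) are hypothetical and only for illustration. COMPOUND_DISK and COMPOUND_NETWORK are left out because they depend on the type of data context rather than on the query.

```java
import java.util.Collections;
import java.util.EnumSet;
import java.util.Set;

public class CompoundFeatureSketch {

    /** Hypothetical fine-grained features; not MetaModel types. */
    enum Feature { WHERE, HAVING, SELECT_INNER, GROUP_BY, ORDER_BY }

    /** Hypothetical compound features with the members listed in the ticket. */
    enum Compound {
        CPU(EnumSet.of(Feature.HAVING, Feature.SELECT_INNER)),
        MEMORY(EnumSet.of(Feature.GROUP_BY, Feature.HAVING, Feature.ORDER_BY));

        private final Set<Feature> members;

        Compound(Set<Feature> members) {
            this.members = members;
        }

        /** True if at least one proxied feature belongs to this compound. */
        boolean isTriggeredBy(Set<Feature> proxied) {
            return !Collections.disjoint(members, proxied);
        }
    }

    /** Compound features touched by the locally emulated parts of a query. */
    static Set<Compound> compoundsFor(Set<Feature> proxiedFeaturesInQuery) {
        EnumSet<Compound> result = EnumSet.noneOf(Compound.class);
        for (Compound c : Compound.values()) {
            if (c.isTriggeredBy(proxiedFeaturesInQuery)) {
                result.add(c);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(compoundsFor(EnumSet.of(Feature.GROUP_BY)));
        System.out.println(compoundsFor(EnumSet.of(Feature.HAVING)));
        System.out.println(compoundsFor(EnumSet.of(Feature.WHERE)));
    }
}
```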

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:4>

eobjects

May 24, 2012, 9:00:19 PM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel 3.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------

Comment (by w.cazander):

This building of a chain for missing features is inspired by how I implement
missing features on generic CRUD backends (but without enums, as there are
only 5):
{{{
public interface VascBackend {
    ...
    public boolean isReadOnly();
    public boolean isSortable();
    public boolean isPageable();
    public boolean isSearchable();
    public boolean isRecordMoveable();
    ...
}

public class XpqlPersistanceVascBackend extends AbstractPersistenceVascBackend {
    ...
    public boolean isPageable() {
        if (queryTotal==null) {
            return false;
        }
        return true;
    }
    ...
}

public class VascBackendProxyPaged extends AbstractVascBackendProxy {
    ...
    public boolean isProxyNeeded() {
        if (backend.isPageable()) {
            return false;
        }
        return true;
    }
    ...
}

// In (defaults factory) config add all backend proxies in ORDER
vascConfigLocal.addVascBackendProxy(new VascBackendProxyTimerLogger());
vascConfigLocal.addVascBackendProxy(new VascBackendProxyEventExecutor());
vascConfigLocal.addVascBackendProxy(new VascBackendProxyCache());
vascConfigLocal.addVascBackendProxy(new VascBackendProxyFilter());
vascConfigLocal.addVascBackendProxy(new VascBackendProxySearch());
vascConfigLocal.addVascBackendProxy(new VascBackendProxySort());
vascConfigLocal.addVascBackendProxy(new VascBackendProxyPaged());

// To use backend in frontend code auto config proxy chain.
public VascBackend configVascBackendProxied(VascController vascController, VascEntry vascEntry, VascBackend realBackend) throws VascException {
    VascBackend backend = realBackend;
    for (VascBackendProxy proxy:backendProxies) {
        VascBackendProxy proxyClone;
        try {
            proxyClone = proxy.clone();
        } catch (CloneNotSupportedException e) {
            throw new VascException(e);
        }
        proxyClone.initProxy(backend, vascEntry);
        if (proxyClone.isProxyNeeded()==false) {
            continue;
        }
        backend = proxyClone;
    }
    return backend;
}
}}}

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:5>

eobjects

Jun 27, 2012, 5:50:28 AM
to dataclean...@googlegroups.com
#851: BackendProxyFeature Listener
-------------------------+----------------------------
Reporter: w.cazander | Owner:
Type: enhancement | Status: new
Priority: low | Milestone: MetaModel X.0
Component: MetaModel | Resolution:
Keywords: listener |
-------------------------+----------------------------
Influenced classes:
org.eobjects.metamodel.DataContext

-------------------------+----------------------------
Changes (by kasper):

* milestone: MetaModel 3.0 => MetaModel X.0


Comment:

Batch moving issues to version X.0 for further release planning.

--
Ticket URL: <http://eobjects.org/trac/ticket/851#comment:6>
