OK, so 2.4.0 isnt quite out yet, but I have started working on post 2.4 features.
And the first major change I’m looking at making is multiple database support (Issue 321).
As you probably all know, ZAP is a fork of Paros which was very much a desktop tool.
As such the HSQLDB it used was a good choice as its lightweight and doesnt require any maintenance.
ZAP is now being used in many different environments, and for some of those we need a much more robust way of storing data.
So this enhancement is to change ZAP so that it can support more than just HSQLDB.
The requirements I’ve identified are:
It should be relatively easy to support any SQL Db
At least HSQLDB and MySQL/MariaDB should be supported
All ‘old’ sessions (ie created by previous versions of ZAP) should still be usable
A zero config db (such as HSQLDB) should be the default ‘out of the box’
Performance should not be impacted
We should attempt to minimize the number of extra jars we depend on
Code changes should be minimized and isolated as much as possible
Add-ons should be able to add their own tables in a clean and supportable way (ie cleaner than now;)
My initial thoughts were to use something like Hibernate (which I’ve used before) but I’m now not so sure.
The db access is pretty isolated, and we already have POJOs that represent the data in the tables.
Our biggest need is to make our code independent of a specific flavour or SQL, and Hibernate feels like a very heavyweight solution to that.
I’ve also looked at options like MyBatis, but it looks like that actually requires you to write different SQL statements for each SQL variant anyway!
So I’m now thinking that the easiest and most maintainable option might be to just use property files for the SQL statements, so we would have things like:
history.delete = DELETE FROM HISTORY WHERE {0} = ?
history.lastindex = SELECT TOP 1 HISTORYID FROM HISTORY ORDER BY HISTORYID DESC LIMIT 1
history.containsuri = SELECT TOP 1 HISTORYID FROM HISTORY WHERE URI = ? AND METHOD = ? AND REQBODY = ? AND SESSIONID = ? AND HISTTYPE = ?
history.lastinsert = CALL IDENTITY()
history.lastindex = SELECT HISTORYID FROM HISTORY ORDER BY HISTORYID DESC LIMIT 1
history.containsuri = SELECT HISTORYID FROM HISTORY WHERE URI = ? AND METHOD = ? AND REQBODY = ? AND SESSIONID = ? AND HISTTYPE = ? LIMIT 1
history.lastinsert = LAST_INSERT_ID()
And the code would then look something like:
psRead = conn.prepareStatement(DbSQL.getSQL("history.read", HISTORYID));
psDelete = conn.prepareStatement(DbSQL.getSQL("history.delete", HISTORYID));
psContainsURI = conn.prepareStatement(DbSQL.getSQL("history.containsuri"));
Yes, I’ve started trying this out and so far it seems to work well (although its still early days;).
One problem this doesnt address is how to define the schema.
For that I’ve been playing with Apache DdlUtils: http://db.apache.org/ddlutils/
It allowed me to easily export the HSQLDB schema into an XML file.
That file did _not_ load straight into MySQL, but I was able to tweak it so that it did.
Its worth noting that there are a small number of places in the code outside of the org.parosproxy.paros.db package that access the Db tables directly.
These will have to change as this will not be allowed going forwards.
You may well also have noticed that my requirements included ‘SQL Db’s which appears to exclude NoSQL dbs.
I have no problem with ZAP supporting NoSQL dbs, but I havnt used them myself and dont really understand the full implications of supporting them.
One very flexible option would be to introduce an interface layer which would be the only way to access classes like TableHistory. The ‘SQL’ implementation would be provided by default but anyone could create alternative implementations (eg using NoSQL) and change the ZAP configs to use them (e.g. using DI).
Can anyone suggest another way that these could be easily supported?
Notwithstanding NoSQL support, I think this combination of SQL in property files and using DdlUtils to migrate the schema could work well, be relatively easy to adopt and straightforward to maintain.
I still need to do quite a bit more work in this area before I can say for certain that its a good way to go, but I wanted to let you all know the direction I’m thinking of going so you have time to object, make alternative suggestions, etc etc :)
So .. does this sound reasonable or do you think I should be looking at other ways of doing this?
Cheers,
Hibernate is definitely a 'full fat' option, but it does do an excellent job of insulating the developer from the specifics of the rdbms.... I would be a little reluctant to roll our own alternative, before we have even tried hibernate or an alternative. Having said that, the only thing that concerns me a little right now about putting the SQL in properties is the cost of loading the SQL from disk, particularly in a sql statement that is run very frequently. Do you know if properties are cached in memory at all?
I have mixed views on having the dB options exposed in the GUI. I would be happy without a GUI option, but I expect we would probably need to enable it in the GUI as soon as someone starts asking why the dB is 'not configurable'.
Colm
--
You received this message because you are subscribed to the Google Groups "OWASP ZAP Developer Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-develop+unsubscribe@googlegroups.com.
To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-develop+unsubscribe@googlegroups.com.