Multiple Db support

psiinon

unread,

Jan 6, 2015, 8:22:57 AM1/6/15

to zaproxy...@googlegroups.com

OK, so 2.4.0 isnt quite out yet, but I have started working on post 2.4 features.

And the first major change I’m looking at making is multiple database support (Issue 321).

As you probably all know, ZAP is a fork of Paros which was very much a desktop tool.

As such the HSQLDB it used was a good choice as its lightweight and doesnt require any maintenance.

ZAP is now being used in many different environments, and for some of those we need a much more robust way of storing data.

So this enhancement is to change ZAP so that it can support more than just HSQLDB.

The requirements I’ve identified are:

It should be relatively easy to support any SQL Db
At least HSQLDB and MySQL/MariaDB should be supported
All ‘old’ sessions (ie created by previous versions of ZAP) should still be usable
A zero config db (such as HSQLDB) should be the default ‘out of the box’
Performance should not be impacted
We should attempt to minimize the number of extra jars we depend on
Code changes should be minimized and isolated as much as possible
Add-ons should be able to add their own tables in a clean and supportable way (ie cleaner than now;)

My initial thoughts were to use something like Hibernate (which I’ve used before) but I’m now not so sure.

The db access is pretty isolated, and we already have POJOs that represent the data in the tables.

Our biggest need is to make our code independent of a specific flavour or SQL, and Hibernate feels like a very heavyweight solution to that.

I’ve also looked at options like MyBatis, but it looks like that actually requires you to write different SQL statements for each SQL variant anyway!

So I’m now thinking that the easiest and most maintainable option might be to just use property files for the SQL statements, so we would have things like:

hsqldb.properties:

history.read = SELECT TOP 1 * FROM HISTORY WHERE {0} = ?

history.delete = DELETE FROM HISTORY WHERE {0} = ?

history.lastindex = SELECT TOP 1 HISTORYID FROM HISTORY ORDER BY HISTORYID DESC LIMIT 1

history.containsuri = SELECT TOP 1 HISTORYID FROM HISTORY WHERE URI = ? AND METHOD = ? AND REQBODY = ? AND SESSIONID = ? AND HISTTYPE = ?

history.lastinsert = CALL IDENTITY()

mysql.properties:

history.read = SELECT * FROM HISTORY WHERE {0} = ? LIMIT 1
history.delete = DELETE FROM HISTORY WHERE {0} = ?

history.lastindex = SELECT HISTORYID FROM HISTORY ORDER BY HISTORYID DESC LIMIT 1

history.containsuri = SELECT HISTORYID FROM HISTORY WHERE URI = ? AND METHOD = ? AND REQBODY = ? AND SESSIONID = ? AND HISTTYPE = ? LIMIT 1

history.lastinsert = LAST_INSERT_ID()

And the code would then look something like:

psRead = conn.prepareStatement(DbSQL.getSQL("history.read", HISTORYID));

psDelete = conn.prepareStatement(DbSQL.getSQL("history.delete", HISTORYID));

psContainsURI = conn.prepareStatement(DbSQL.getSQL("history.containsuri"));

Yes, I’ve started trying this out and so far it seems to work well (although its still early days;).

One problem this doesnt address is how to define the schema.

For that I’ve been playing with Apache DdlUtils: http://db.apache.org/ddlutils/

It allowed me to easily export the HSQLDB schema into an XML file.

That file did _not_ load straight into MySQL, but I was able to tweak it so that it did.

Its worth noting that there are a small number of places in the code outside of the org.parosproxy.paros.db package that access the Db tables directly.

These will have to change as this will not be allowed going forwards.

You may well also have noticed that my requirements included ‘SQL Db’s which appears to exclude NoSQL dbs.

I have no problem with ZAP supporting NoSQL dbs, but I havnt used them myself and dont really understand the full implications of supporting them.

One very flexible option would be to introduce an interface layer which would be the only way to access classes like TableHistory. The ‘SQL’ implementation would be provided by default but anyone could create alternative implementations (eg using NoSQL) and change the ZAP configs to use them (e.g. using DI).

Can anyone suggest another way that these could be easily supported?

Notwithstanding NoSQL support, I think this combination of SQL in property files and using DdlUtils to migrate the schema could work well, be relatively easy to adopt and straightforward to maintain.

I still need to do quite a bit more work in this area before I can say for certain that its a good way to go, but I wanted to let you all know the direction I’m thinking of going so you have time to object, make alternative suggestions, etc etc :)

So .. does this sound reasonable or do you think I should be looking at other ways of doing this?

Cheers,

Simon

kingthorin+owaspzap

unread,

Jan 6, 2015, 9:16:45 AM1/6/15

to zaproxy...@googlegroups.com

I think this looks good so far. There's one major detail I'm missing though: HSQLDB will be supported out of the box but if a user chooses to leverage an existing MySQL (or whatever) installation how/where are they providing the connectivity info for ZAP to use?

psiinon

unread,

Jan 6, 2015, 9:29:08 AM1/6/15

to zaproxy...@googlegroups.com

Thats tbd, but I'd want it to be configurable via a config file and/or the command line.
And by that I mean both options should be available.
We could just use the existing config.xml file or we could introduce another one.
I think reusing config.xml is ok, but I'm happy to be persuaded otherwise.

Do we need to be able to change the db options via the UI and API?
I think that might be a nice option, particularly the API one, as that would allow you to roll over to a new db periodically.

Thoughts?

Colm O'Flaherty

unread,

Jan 6, 2015, 2:07:52 PM1/6/15

to zaproxy...@googlegroups.com

Hibernate is definitely a 'full fat' option, but it does do an excellent job of insulating the developer from the specifics of the rdbms.... I would be a little reluctant to roll our own alternative, before we have even tried hibernate or an alternative. Having said that, the only thing that concerns me a little right now about putting the SQL in properties is the cost of loading the SQL from disk, particularly in a sql statement that is run very frequently. Do you know if properties are cached in memory at all?

I have mixed views on having the dB options exposed in the GUI. I would be happy without a GUI option, but I expect we would probably need to enable it in the GUI as soon as someone starts asking why the dB is 'not configurable'.

Colm

--
You received this message because you are subscribed to the Google Groups "OWASP ZAP Developer Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-devel...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

yha...@gmail.com

unread,

Jan 7, 2015, 5:01:28 AM1/7/15

to zaproxy...@googlegroups.com, colm.p.o...@gmail.com

Dear all, I agree that putting SQL statement in property files is not a good idea. And usually it's difficult to optimize the queries id they're loaded separately and used inside specific classes.

I think we've two possible choices: the first one is to use the DAO pattern, I use it strongly and it's very effective if the DB structure is stable and the queries don't change so much along time. In this situation you define a specific DAO for each supported DBMS and load it together with all the needed connection parameters at the startup of a high level "Manager" class. Using the override capabilities you can then quickly define new DAO starting from the older one.

But if we have frequent changes of the queries and database structure, the ORM model is the better one. In this case I suggest to use a standard API like JPA to introduce this functionality, and then use Hibernate, Toplink or other ORM compatible frameworks besides that can be "linked" according to the needings.

To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-develop+unsubscribe@googlegroups.com.

psiinon

unread,

Jan 7, 2015, 5:12:21 AM1/7/15

to zaproxy...@googlegroups.com, colm.p.o...@gmail.com

We actually cache all of the prepared statements right now.
So the only difference between using a property file and the current code is a one off read of the relevant property file at start up and that the SQL is not hard coded and can be changed.

I have actually used Hibernate before. I think its a good option for a brand new project, but I'm still not convinced its the best option for ZAP ;)

Cheers,

Simon

To unsubscribe from this group and stop receiving emails from it, send an email to zaproxy-develop+unsubscribe@googlegroups.com.

psiinon

unread,

Jan 7, 2015, 5:20:35 AM1/7/15

to zaproxy...@googlegroups.com, colm.p.o...@gmail.com

If we implement an interface layer then people can provide different implementations.
I suspect I'm still going to go for the property option, but if anyone else wants to implement a Hibernate or NoSQL implementation then that would be a great addition.
I'll come up with a proposal for the interface layer.

Cheers,

Simon

Colm O'Flaherty

unread,

Jan 7, 2015, 5:47:26 AM1/7/15

to psiinon, zaproxy...@googlegroups.com

Ok. I get you now.

What's your driver for the prop file approach, compared to a DAO/ ORM, btw? Simplicity?

psiinon

unread,

Jan 7, 2015, 5:59:58 AM1/7/15

to zaproxy...@googlegroups.com, psi...@gmail.com, colm.p.o...@gmail.com

Good question :)

Simplicity, minimising code changes, ease of implementation and lack of extra dependencies.

Extracting the existing SQL statements to a property file is essentially just like i18n - the code is the same apart from the fact the SQL is read once at start up from a property file rather than hardcoded. No extra dependencies.
And I think that should allow us to support dbs like MySQL quite easily - if it doesnt then I'll admit its the wrong approach ;)

Using hibernate will involve much more work defining the mapping files, and will mean we need to include at least another 9 jars (according to http://stackoverflow.com/questions/13978020/hibernate-4-1-9-which-jar-files-do-i-need).

Cheers,

Simon

Reply all

Reply to author

Forward