[Mifos-developer] Data generator for Mifos performance testing

Aravind.D...@sungard.com

Nov 17, 2009, 9:30:01 AM

to mifos-d...@lists.sourceforge.net

Hi,

Extending John’s mail on data for performance testing, I want to add my views to the questions raised by him.

First I will try to define the problem

Aim:

The aim of this exercise is to generate sufficient data to load-test Mifos, in particular collection sheet entry.

Problem definition:

1. Identify a method to generate sufficient data to serve as a baseline for the test (750,000 to 1,000,000 records). The data should be heterogeneous and realistic, i.e. it should represent what production data looks like in MFIs.

2. Identify a method to generate the data required for each test. This should be fast enough that multiple tests can be planned in a day (the expected size of a Mifos DB with 1M records is 100 GB).

3. Identify a method to roll back the database to the baseline state after each test. Again, this should be quick enough to allow multiple tests to be executed.

Approaches for data insert:

A few of the approaches we discussed are:

1. Create Business Objects using Java and persist them

Pros:

Easy maintenance

Cons:

Needs time to develop

Might be slow

Might not be usable by people who can do performance testing but don't know Java

2. Extend the currently used stored procs to support these requirements

Pros:

Will be the fastest

Easy to modify in case any changes are required

Cons:

Maintenance is difficult

Knowledge of MySQL procs needed

Approaches for rollback:

1. Create delete scripts/procs for the data inserted

Pros:

The fastest way to drop the data

Cons:

The tester must know exactly which data to delete

2. Roll back using a backup of the baseline data

Pros:

Ensures a bug-free rollback

Cons:

Might be the slowest of all methods. Restoring a 100 GB database will take days to complete

3. Shadow copy and restore the folder where MySQL DB files are stored

This is yet to be explored

Let me know your ideas/suggestions on this.

Aravind Deivendran • Project Lead • SunGard • Technology Services • 6th Floor, Embassy Icon, #3, Infantry Road, Bangalore, India

Tel +91 80 30913200 3144 • Mobile +919980962300 • www.sungard.com/sts

Jeff Brewster

Nov 18, 2009, 8:06:39 PM

to Mifos software development, Aravind.D...@sungard.com

>Hi,
>Extending John's mail on data for performance testing, I want to add my
>views to the questions raised by him.

Thanks for kicking off this conversation, Aravind. I hope others who
might have ideas will be willing to share their suggestions or previous
experiences which might help direct our efforts.

>Approaches for data insert:
>Few of the approaches we discussed are
>1. Create Business Objects using Java and persist them

Seems like there are dependencies that must be followed for this to work,
e.g. an office must be created before a center, etc. Maybe this
aligns with the same sequencing that the current stored procedures
follow? I like the potential of being able to create data on the fly,
which would possibly assist other testing efforts including our
automated acceptance tests.

I also like this option because it helps test our application from the
business layer and below.
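
The ordering constraint mentioned above (an office has to exist before a center, and so on) can be sketched roughly as below. This is an in-memory illustration only: `HierarchyNode`, `HierarchyGenerator`, and the level names are hypothetical and are not Mifos classes; a real generator would call the application's own domain code at each step.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are Mifos classes.
class HierarchyNode {
    final String type;
    final String name;
    final HierarchyNode parent;

    HierarchyNode(String type, String name, HierarchyNode parent) {
        // Every level except the office needs an already-created parent.
        if (parent == null && !type.equals("office")) {
            throw new IllegalStateException(type + " requires a parent");
        }
        this.type = type;
        this.name = name;
        this.parent = parent;
    }
}

class HierarchyGenerator {
    // Walks top-down (office, then its centers, then their clients), so
    // every parent exists before any of its children are created.
    static List<HierarchyNode> generate(int offices, int centersPerOffice, int clientsPerCenter) {
        List<HierarchyNode> created = new ArrayList<>();
        for (int o = 0; o < offices; o++) {
            HierarchyNode office = new HierarchyNode("office", "office-" + o, null);
            created.add(office);
            for (int c = 0; c < centersPerOffice; c++) {
                HierarchyNode center = new HierarchyNode("center", office.name + "/center-" + c, office);
                created.add(center);
                for (int k = 0; k < clientsPerCenter; k++) {
                    created.add(new HierarchyNode("client", center.name + "/client-" + k, center));
                }
            }
        }
        return created;
    }

    public static void main(String[] args) {
        // 2 offices + 6 centers + 24 clients
        List<HierarchyNode> nodes = generate(2, 3, 4);
        System.out.println("created " + nodes.size() + " objects"); // created 32 objects
    }
}
```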

> 2. Extend the currently used stored procs to support these
> requirements

Any thoughts on how stored procs would incorporate the associated
transactions, scheduled meetings, etc. that would come along with 2x
users?

A couple of other approaches to add to the list, in case others have had
good experiences with any of them:

3. We discussed taking existing customer data and manipulating it
externally to grow it in size. This ensures very realistic data, but I
have concerns about whether we could do this accurately, and
manipulating such large files would be difficult.
4. One used previously by IBM for performance testing: generating test
data by driving the UI using a testing tool like LoadRunner or
Selenium. This seems slow when trying to build such a large dataset.
5. Use an external tool that is built for generating data? I don't have
experience with these tools. One example would be "Advanced Data
Generator" - http://www.upscene.com/products.adg.index.php.

>Approaches for rollback:
Another approach to add:
4. If we're using a virtual environment, we don't need to roll back the
data, right? When the test is finished, we just shut down the virtual
image and save the time required to do the rollback.

Jeff

John Woodlock

Nov 18, 2009, 8:54:26 PM

to Mifos software development

At 12:06 PM 19/11/2009, you wrote:

> >Approaches for data insert:
> >Few of the approaches we discussed are
> >1. Create Business Objects using Java and persist them

Way back during the September discussion I put up a patch that
generated test data using a business object creation method used in
some tests (TestObjectFactory). At the time Van said there were
problems with the way that worked. He was right. It is inaccurate
and shouldn't be used in this case.

Keith Woodlock was working on another, easier way to create centers
and customers etc for testing. I've used his method for some tests
but I don't think it has been widely enough implemented to build the
test data you are talking about.

So, as far as I can see, at the moment the business object creation
method wouldn't be available unless coding work was put into it.

On the 'restore': because rollback can take so long for large
insertions/deletions/updates, I still can't look past the operating
system 'file copy' type operation Adam M originally mentioned. I
haven't ever done shadow copying, but it is certainly commonplace
(for speed) in testing on top of many database systems to just copy
over the folder(s) / file(s) with the base data.
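
A minimal sketch of the 'file copy' style of restore, assuming the database server is stopped while its files are replaced. `BaselineRestore` is a hypothetical helper, and both directories are passed in by the caller; no default MySQL data location is assumed.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.StandardCopyOption;
import java.nio.file.attribute.BasicFileAttributes;

// Sketch only: assumes the database server is not running during the copy.
class BaselineRestore {
    // Copies every directory and file under baseline into target,
    // overwriting files that already exist there.
    static void restore(final Path baseline, final Path target) throws IOException {
        Files.walkFileTree(baseline, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult preVisitDirectory(Path dir, BasicFileAttributes attrs) throws IOException {
                Files.createDirectories(target.resolve(baseline.relativize(dir)));
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                Files.copy(file, target.resolve(baseline.relativize(file)),
                        StandardCopyOption.REPLACE_EXISTING);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    public static void main(String[] args) throws IOException {
        // Usage: java BaselineRestore <baseline-dir> <data-dir>
        restore(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Note that this sketch only overwrites; files created during a test run that have no counterpart in the baseline are left in place, so a real restore would probably clear the target directory first.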

bit more soon.

John

Keith Woodlock

Nov 18, 2009, 10:47:18 PM

to Mifos software development

Hi All,

Just wanted to jump in on this discussion and respond to some of the
points made.

This area of load/performance testing is vitally important and when
done on top of proper unit, integration and acceptance/functional
testing, gives us the confidence we need to release to production.
Aravind provided us with a description of the aim of what is trying to
be achieved:

> Aim:
>
> the aim of this exercise was to generate sufficient data to do load tests
> for Mifos in particular collection sheet entry
>

but I would change the aim to be something like:

The aim is to be able to do load/performance testing on the mifos
application. By making this easier/cheaper to do it will enable
load/performance testing to be executed more often (and ideally as
often as we run our other test suites).

As Aravind points out, this testing has to be done on top of a large
dataset, as this is more reflective of production environments and will
catch problems such as:
1. Database problems like missing indexes
2. Hibernate usage or query/algorithm problems

Ideally this large dataset closely mimics data that is seen in production.

So this brings us to Aravind's first problem:

> Problem definition:
>
> 1. Identify a method to generate sufficient data which can be used as a
> baseline for the test(750,000 to 1,000,000 records). The data should be
> heterogenenous
>
> and realistic ie should represent how production data will be in MFIs.
>

I can think of two ways of getting here:
1. We just take a copy of a current large production dataset that
has sufficient size (1M, 2M records or whatever's needed). Of course,
there might be data issues here, etc.

2. We create all the data needed!
- I am not sure that time is an issue here, even if it takes two
weeks to generate all the data needed (using whatever technique,
leveraging the current domain model and services or using stored
procedures), because at the end of it we have an asset that we can
call our baseline and just keep reusing. We won't need to keep
recreating the large dataset constantly.

Aravind's second problem was:

> 2. Identify a method to generate data required for the test. The time taken
> by this should be less, to make sure
> multiple tests can be planned in a day(The expected size of Mifos DB with 1M
> records will be 100 GB).
>

This is where we run into the question of what approach to take to get
the data we need inserted on top of our 'baseline' data. The
inserted data for the test must also be reflective of how the
application works and reflective of the type of data typically seen in
production. The example being used was the 'Collection sheet entry'
work.

In the end, I believe stored procedures were created to get the data
into a database to perform the load/performance testing. As John
pointed out in the previous mail, we showed an example of using Java
to insert the data. The problem with that approach was that it leveraged
'TestObjectFactory', which does not correctly duplicate how the
application works. But there was no real problem with speed; it was
pretty quick to get a 'good enough' dataset into the database so tests
could be carried out on it. In reality we should have been doing
exactly what the application code does and leveraging its domain model
and services for creating the 'collection sheet entries'. The code for
doing this did exist in the application code, it just wasn't nicely
put together in a 'servicey' way that was obvious and easy to use. I
believe John has been putting the finishing touches to that.

So in short, the application already inserts data into the database, so
the code that can be called does exist. The problem is that much of
the business logic is typically up at the Action/Struts layer and not
presented nicely in any 'service', but you could just call the Struts
method if needed.

The alternative approach of using stored procedures, while it too will
work, is very problematic in my view because:
1. It needs to duplicate the domain/business logic that already
exists in the Java code.
2. It will take time to create this procedure and more time to
verify it works as the application does!

When we change/refactor any code in Java land we will have to be
mindful of the performance tests, and we may break them without knowing,
thus introducing another reason to be afraid of refactoring the domain
model.

So basically I am completely against the stored procedure approach but
maybe we should have a more thorough discussion on it before making a
decision on which way to go.

John also said the following:

>
> Keith Woodlock was working on another, easier way to create centers
> and customers etc for testing. I've used his method for some tests
> but I don't think it has been widely enough implemented to build the
> test data you are talking about.
>
> So, as far as I can see, at the moment the business object creation
> method wouldn't be available unless coding work was put into it.
>

The approach John is talking about is a 'Builder' approach to creating
complex domain objects. The purpose of these is purely to clean up the
unit/integration tests and make them more communicative and less
fragile. They are not intended for any other use. As I have stated
earlier, any approach using Java should leverage the production code's
domain/services and not any test code.

For more on these just read Nat Pryce's blog entries:
http://nat.truemesh.com/archives/000714.html

As for rollback/restore, I also agree with what John is saying below
about what Adam M was saying.

>
> On the 'restore'. Because rollback can take so long for large
> insertions/deletions/updates, I still can't look past the operating
> system 'file copy' type operation Adam M originally mentioned. I've
> haven't ever done shadow copying but certainly it is common place
> (for speed) in testing on top of many database systems to just copy
> over the folder(s) / file(s) with the base data.
>

Hope this is some bit helpful.

Regards,
Keith W

Adam Feuer

Nov 18, 2009, 11:34:56 PM

to Mifos software development

On Tue, Nov 17, 2009 at 6:30 AM, <Aravind.D...@sungard.com> wrote:
> 1. Create Business Objects using Java and persist them

Folks,

Elaborating on this idea a bit, if we placed the business objects in a
separate jar, we could write programs that used them.

To do this, one simple way would be to place all the classes in the
application/ module into a jar. This seems like a straightforward
change in maven. Another way to do this would be to make a separate
module that held these objects.

An external program could use multiple threads to do concurrent
inserts at high speed.
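
As a rough sketch of that idea, an external program could split the records into slices and hand each slice to a worker thread. `RecordSink` and `ConcurrentLoader` are hypothetical names; `RecordSink` stands in for whatever call would actually create and persist one business object, which this sketch does not attempt to model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical: stands in for whatever persists one record.
interface RecordSink {
    void insert(int recordNumber) throws Exception;
}

class ConcurrentLoader {
    // Splits [0, total) into one contiguous slice per worker and inserts
    // the slices in parallel. Returns the number of records inserted.
    static int load(final RecordSink sink, int total, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        final AtomicInteger inserted = new AtomicInteger();
        try {
            List<Future<?>> futures = new ArrayList<>();
            int chunk = (total + workers - 1) / workers; // ceiling division
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(total, w * chunk);
                final int to = Math.min(total, from + chunk);
                futures.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        sink.insert(i);
                        inserted.incrementAndGet();
                    }
                    return null;
                }));
            }
            for (Future<?> f : futures) {
                f.get(); // surfaces any insert failure as an ExecutionException
            }
        } finally {
            pool.shutdown();
        }
        return inserted.get();
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger rows = new AtomicInteger();
        int n = load(i -> rows.incrementAndGet(), 10_000, 8);
        System.out.println(n + " records inserted");
    }
}
```

Whether this actually gives high insert rates would depend on how the persistence layer and the database handle concurrent sessions, which the sketch leaves open.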

-adam
--
Adam Feuer <adamf at pobox dot com>

John Woodlock

Nov 19, 2009, 12:29:42 AM

to Mifos software development

Personally, I think you could simultaneously implement a number of
approaches to performance testing depending on the 'question' under
test. I'm still not solid on the 'question'. I've imagined that the
question is to measure how scaleable current and future mifos is...
but I'm not sure.

However, I realize I'm overly biased towards "collection sheet save"
in GK. So, I find it hard, at the moment, to see the bigger
scaleability picture. But I'm going to try to.

Having declared my bias, I also see myself as a customer of the great
load testing asset that Sungard has created.

Currently, I want to be able to measure the value (or not) of changes
I've been making and intend to make to the save collection sheet
process. For me, the existing GK data (with 5 to 10 concurrent
users) is optimal (that's my belief... I'm often wrong). Even the
adhoc data from one GK center I got from Raghav pinpointed a few
obvious and significant areas of improvement that I hadn't seen by
looking through the code or from the generated data.

So my initial thought is that my needs (or the needs of anyone who is
interested in GK collection sheet save performance) can be satisfied
by allocating maybe 100 active GK centers to each of the 5 or 10
concurrent users. Whether this meets other needs or future needs I don't know.

I understand that the stored procs can be changed to reflect 'aged
data' but that's merely responding to the lesson learnt from looking
at the GK data so you might as well be using GK data (the more recent
the better).

Maybe it's because of my bias, but I can't get my head around why it's
valuable to generate new data on top of the GK baseline dataset?

If there is a requirement that the data generation be more generic
than GK, fair enough, except that so far the data generation has been
tailored to the GK case, e.g. it's basically repayment processing
with no fees or charges or savings accounts that other users may make use of.

John

Jeff Brewster

Nov 19, 2009, 1:31:55 AM

to Mifos software development

>-----Original Message-----
>From: John Woodlock [mailto:jo...@nassles.com]

>Personally, I think you could simultaneously implement a number of
>approaches to performance testing depending on the 'question' under
>test. I'm still not solid on the 'question'. I've imagined that the
>question is to measure how scaleable current and future mifos is...
>but I'm not sure.
>

Hi John,
Good observation! What is the question we're trying to answer? In
short, we are trying to jump ahead of our largest customers to ensure
that when they do arrive at 500,000 or 1 million clients we'll have a good
understanding of how Mifos will scale. Hopefully this draft scalability
test plan will give better context to this discussion -
http://www.mifos.org/developers/testing/scalability-test-strategy.
Sorry for not mentioning this document on the list sooner.

To optimize a specific area, the existing test data sets may be
perfectly fine. But we'd also like to get a grasp for what the Mifos
performance will be for defined scenarios as we grow the database to 2x
or 4x its current size.

Appreciate any feedback on the document mentioned above.

Regards,
Jeff

John Woodlock

Nov 22, 2009, 3:10:36 AM

to mifos software development

Jeff,

http://www.mifos.org/developers/testing/scalability-test-strategy.

So, it's just a small initiative then :)

Had a look through the document. Very good it is too. March 2010, with GK expecting over 500k clients seems very close.

There are so many interesting questions that could be tackled (in so many ways), and yet there's the niggling feeling that there could be a simple enough rough answer, and that getting it quickly might be more valuable than getting a more accurate answer later on. Probably both approaches are needed. I might be more than a little out of my capacity planning depth here but, since you asked, here's what I'm thinking in case any of it helps. Apologies for length and waffle.

X 1 baseline (current situation)
Are there any 'current' figures yet or is that still to be done? I think an "X 1" baseline would have to run against a recent GK backup with no generated data so it could be said that X customers results in certain figures for key functionality areas. Using "mifos 1.4" I guess? If GK users using it operationally (rather than in the lab) could express whether the 1.4 situation was exceeding, just at, or below expectations that would be helpful too.

I think a lot of decision making information could be gathered by seeing how sensitive mifos is to 1) data volume, 2) transaction throughput & 3) time.

1) Data volume question: "How does Mifos scale if the same load as in 'X 1' is pushed through at database sizes corresponding to 'X 2' and 'X 4'?"
There would have to be some deterioration but, intuitively, I think Mifos would do quite well here, largely because of the indexes Sungard applied in 1.3.1. The Sungard tests run so far suggest this too.

2) Transaction throughput question: "What if I 'X 2' and 'X 4' the load on the baseline database, and optionally on 'X 2' and 'X 4' databases as well?"

3) Time (activity based data growth where no. of customers is kept constant): "How much do previous transactions affect current transactions?"
I'm pretty sure that Mifos performance is quite sensitive to this (might even be the biggest factor). The extra growth associated with transactions like loan payment gets pulled in the next time you make a payment or look at loan details. So making the first payment is a lot cheaper than the 20th. Good news is that once identified much of it can be avoided or mitigated. To get a rough idea of how much this costs... you could run the same load through a GK Nov 2008 database and then a GK Nov 2009 database.

I think none of the tests above demand 'generated data' (which is basically new customers) but "X 2" and "X 4" do suggest "cloned data". I think this is what you are referring to here:

>3. We discussed taking existing customer data, and trying to externally
>manipulating it to grow it in size. This ensures very realistic data,
>but I have concerns about whether we could do this accurately, and
>manipulation of such large files would be difficult.

It's basically being able to clone a branch? So you could run through all the branches and clone them giving you "X 2". Because this is mostly about following database relationships, little of the domain knowledge that is held in java is required... so there wouldn't be any debate over duplicating business rules.
Stored procedures could do this (but it is at least as much work as it took for the current stored procs). Even better might be a tool. I had a quick look and I didn't see any open source tools that jumped out, but there must be something. I'd imagine the tool would expect you to define which relationships to follow.
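
To make the cloning idea concrete, here is a small in-memory sketch of copying a branch by following parent relationships and remapping keys. `TableRow` and `BranchCloner` are hypothetical; a real implementation would work against the actual Mifos tables and would need one such mapping per relationship that is followed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: stands in for any table row with a primary key and one
// foreign key to its parent row.
class TableRow {
    final int id;
    final Integer parentId; // may point outside the branch being cloned
    final String label;

    TableRow(int id, Integer parentId, String label) {
        this.id = id;
        this.parentId = parentId;
        this.label = label;
    }
}

class BranchCloner {
    // Returns copies of the rows with fresh ids starting at nextId. Foreign
    // keys pointing inside the branch are remapped to the new ids; keys
    // pointing outside it (e.g. to a head office) are kept unchanged.
    // Assumes parents are listed before their children.
    static List<TableRow> cloneBranch(List<TableRow> branch, int nextId) {
        Map<Integer, Integer> newIds = new HashMap<>();
        List<TableRow> copies = new ArrayList<>();
        for (TableRow row : branch) {
            int newId = nextId++;
            newIds.put(row.id, newId);
            Integer newParent = row.parentId == null
                    ? null
                    : newIds.getOrDefault(row.parentId, row.parentId);
            copies.add(new TableRow(newId, newParent, row.label));
        }
        return copies;
    }

    public static void main(String[] args) {
        List<TableRow> branch = Arrays.asList(
                new TableRow(1, 0, "branch"),   // parent 0 is outside the branch
                new TableRow(2, 1, "center"),
                new TableRow(3, 2, "client"));
        for (TableRow copy : cloneBranch(branch, 100)) {
            System.out.println(copy.id + " -> parent " + copy.parentId + " (" + copy.label + ")");
        }
    }
}
```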

However, if the cloning isn't a goer, maybe the current stored-proc-generated data could be used, but in the opposite way to how it's used now. It could be used to fill up the database to approximately "X 2" and "X 4" size, but the tests would be driven off the actual GK data.

Or if it was thought that the new customers coming on wouldn't have the data build up that existing customers have and it was important to model that then you could apply some ratio of GK centers to generated centers in the tests.

Regards

John

Adam Feuer

Nov 23, 2009, 12:40:44 AM

to Mifos software development

On Wed, Nov 18, 2009 at 9:29 PM, John Woodlock <jo...@nassles.com> wrote:
> However, I realize I'm overly biased towards "collection sheet save"
> in GK.

John,

The two primary "stress points" that I see are:

* the Collection Sheet Entry flow - used by the bank branches to enter
their daily transactions
* the batch jobs - that do various calculations

When we increase the number of clients, we increase the number of
branches that are doing data entry, and we increase the number of
accounts that the batch jobs need to iterate over.

Do you see other primary (business critical and already on the verge
of breaking) stress points?

Some secondary places where an increased number of clients puts stress on the system are:

* Slow running time for reports
* Slow running time for various operations (adding a client or
account; adding a custom field; etc.)

We're mainly concerned about the primary stress points, but we do want
to collect data about the secondary ones.

What are your thoughts?

cheers

adam
--
Adam Feuer <adamf at pobox dot com>

John Woodlock

Nov 23, 2009, 3:25:22 AM

to Mifos software development

Hi Adam,

I've really only worked in detail on the collection sheet area. I have some familiarity with the screen that shows center details and also some of the batch jobs so what I see tends to be aligned to what you've written. I haven't looked at reporting.

I've kind of assumed (like most it seems) that these are the big areas to start with, are reasonable indicators for capacity planning and are amenable to substantial improvement but I don't have direct GK feedback on what their user experience is.

You'd think that something somewhere would be able to track user transaction name, time and frequency (rather than coding it up, simple though that might be).
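
For what "coding it up" might look like, here is a minimal sketch that records, per transaction name, how often it ran and how long it took. `TransactionStats` is hypothetical and not part of Mifos; where the `record` call would be hooked into the application is left open.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not part of Mifos.
class TransactionStats {
    // name -> {call count, total elapsed nanoseconds}
    private final Map<String, long[]> stats = new HashMap<>();

    synchronized void record(String name, long elapsedNanos) {
        long[] s = stats.computeIfAbsent(name, k -> new long[2]);
        s[0]++;
        s[1] += elapsedNanos;
    }

    synchronized long count(String name) {
        long[] s = stats.get(name);
        return s == null ? 0 : s[0];
    }

    // Mean duration in milliseconds, or 0 if the transaction never ran.
    synchronized double meanMillis(String name) {
        long[] s = stats.get(name);
        return s == null ? 0.0 : s[1] / 1_000_000.0 / s[0];
    }

    public static void main(String[] args) {
        TransactionStats stats = new TransactionStats();
        long start = System.nanoTime();
        // ... the user transaction being measured would run here ...
        stats.record("collection sheet save", System.nanoTime() - start);
        System.out.println(stats.count("collection sheet save") + " call(s), mean "
                + stats.meanMillis("collection sheet save") + " ms");
    }
}
```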

John

Ryan Whitney

Nov 23, 2009, 7:18:03 PM

to Mifos software development

Adam F and John,

I think we should add an additional stress point: entering individual
transactions through the standard loan detail page. Not all MFIs use
the collection sheet report; some follow more of a teller model (i.e., ENDA).

Ryan

--
Ryan Whitney
Mifos Technical Program Manager
rwhi...@grameenfoundation.org
Mifos - Technology that Empowers Microfinance (www.mifos.org)
Our mission is to enable the poor, especially the poorest, to create a world
without poverty.

Jeff Brewster

Nov 23, 2009, 8:36:25 PM

to Mifos software development

John,
Thanks for the great comments. I'm planning to chat with Aravind
Tuesday (IST) so we'll respond back with more thoughts after we chat.
In the meantime, I do have a couple of questions about your email -

1. You mention data volume, transaction throughput, and time as 3
factors. I'd like to confirm I see what you mean by those three. For
the data volume question, it's essentially: what is the response for a
single user with the current data size vs. 2X or 4X? Scenarios that might
fall in this category include batch job performance (more records to
process) or actions against large data like the issue Jakub fixed for
Gazelle B - https://mifos.dev.java.net/issues/show_bug.cgi?id=2410.
Thanks Jakub!
2. I'm not sure I quite follow the time factor. When you say compare a
2008 database to a 2009 database, I think you mean the transactional
data (for example) has grown for a given account? How is that different
from the data volume factor?
3. I like the idea of making the data bigger by adding generic test
data and then acting on the actual GK data.

Much to discuss here for sure!

Thanks,
Jeff

John Woodlock

Nov 23, 2009, 8:42:58 PM

to Mifos software development

That would make sense.

Although some of the improvements intended in the collection sheet area are to do with the bulk nature of it... others are connected to the underlying loan (and other account) structure.

John Woodlock

Nov 23, 2009, 10:19:49 PM

to Mifos software development

Jeff,

comments below. I think you got the right idea even though I worded it a bit confusingly.

On Tue, Nov 24, 2009 at 12:36 PM, Jeff Brewster <jbre...@grameenfoundation.org> wrote:

> 1. You mention data volume, transaction throughput, and time as 3
> factors. I'd like to confirm I see what you mean by those three. For
> the data volume question, it's essentially what is the response for
> single user with current data size vs. 2X or 4X? Scenarios that might
> fall in this category include batch job performance (more records to
> process) or actions against large data like the issue Jakub fixed for
> Gazelle B - https://mifos.dev.java.net/issues/show_bug.cgi?id=2410.
> Thanks Jakub!

Exactly. Though I'm sure it would be useful to simulate a few concurrent users.
An example of this: I do something to a single client today with 1M clients and it takes 10 secs.
I double the number of clients and the same thing now takes nearly 20 secs. It's obviously affected by adding the extra unrelated data. Often this is down to bad indexes or some thresholds being hit.

> 2. I'm not sure I quite follow the time factor. When you say compare a
> 2008 database to a 2009 database, I think you mean the transactional
> data (for example) has grown for a given account? How is that different
> that the data volume factor?

I do mean transactional data growth for a given account.

The previous example's data growth is new branches and centers. New centers shouldn't affect other centers very much, i.e. if I process a collection sheet for a center I wouldn't expect the application to touch data for any other centers. However, until the Sungard indexes were put on, table scans meant that it did exactly that.

However, if my collection sheet process continually reads all its own account_payment table data, then it will take a bit longer each time a new account_payment is added. After a year this can make quite a difference. Indexes won't fix this. This is one of the 'time' sensitivities and, afaics, is down to our Hibernate use; hopefully it will be addressed soon.
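
The growth described above can be put into a simple cost model. This is an illustration under a stated assumption, not a measurement of Mifos: it assumes each new payment first reads every earlier payment row on the same account. `PaymentHistoryCost` is a hypothetical name.

```java
// Cost model only: assumes payment number n reads the n - 1 earlier
// payment rows on the same account before it is saved.
class PaymentHistoryCost {
    // Rows read when making payment number n (1-based).
    static long rowsReadForPayment(int n) {
        return n - 1;
    }

    // Total rows read over the first n payments: 0 + 1 + ... + (n - 1).
    static long totalRowsRead(int n) {
        return (long) n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        System.out.println(rowsReadForPayment(1));  // 0
        System.out.println(rowsReadForPayment(20)); // 19
        System.out.println(totalRowsRead(50));      // 1225
    }
}
```

Under that assumption the per-payment cost grows linearly with the account's history and the cumulative cost grows quadratically, with no new clients added at all.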

But I suppose what I was getting at is you don't have to take on new clients to build up bottlenecks.

John

Chand...@sungard.com

Jan 8, 2010, 4:54:38 AM

to mifos-d...@lists.sourceforge.net, Maheswari....@sungard.com

Hi!

I was just looking at some open source tools available for data generation and came across this: http://databene.org/databene-benerator/ (the site also mentions some other generators). I wanted to know if anyone has used any of these, and how complicated it is to configure one to actually generate data for a relational database such as the Mifos one. The documentation seems pretty extensive.

regards

Chandan

Jeff Brewster

Jan 8, 2010, 9:03:39 PM

to Mifos software development, Maheswari....@sungard.com

Hi Chandan,

This does look like a nice tool. I too would be interested in hearing if anyone has used this tool or a similar tool on testing projects.

The drawback I see with using a data generator for Mifos is the business rules we have around building loan schedules, meeting schedules, etc. Other test engineers who have done a lot of test data generation have recommended to me that we use the business layer (i.e. an API) to generate test data, as it's a more robust method and also stays in sync with changes to the application over time. For an application that has a more basic data model (e.g. a set of customers and their purchases), a data generator seems like the right answer.

Regards,
Jeff

Chand...@sungard.com

Jan 12, 2010, 7:32:59 AM

to mifos-d...@lists.sourceforge.net

Hi

In this thread John briefly mentioned that there was a TestObjectFactory class, but with its share of problems. I got curious and asked him about it. I have quoted his reply in this mail.

He pointed out one problem regarding the fees, and I was able to fix that. I added an extra function that retrieves the fee objects if they already exist instead of creating a new object every time:

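    // Added to TestObjectFactory (needs java.util.ArrayList and java.util.List).
    // Builds FeeViews from fees that already exist, looked up by the ids
    // passed in, instead of creating a new fee each time.
    public static List<FeeView> getFees(List<Short> feeIds) {
        List<FeeView> fees = new ArrayList<FeeView>();
        for (Short feeId : feeIds) {
            fees.add(new FeeView(getContext(), testObjectPersistence.getFee(feeId)));
        }
        return fees;
    }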

This fixes the problem with the fees. (Maybe getFees() could be made a bit more intelligent and not create new objects all the time; that would be simpler, I guess.)

But as John suggests, there are apparently more problems with using this class? I couldn't really make any headway there (apart from noticing that a few useful functions could be added if a similar class is used to generate data).

Thank you and Regards

Chandan

---------------------------------------------------------------------------------------------------------------------------------------------------

From: John Woodlock [mailto:john.w...@gmail.com]
Sent: Monday, January 11, 2010 5:17 PM
To: Rao, Chandan
Subject: Re: regarding testobjectfactory

Chandan,

Yes. A while ago I put up a patch for generating data using the TestObjectFactory approach.

http://groups.google.com/group/mifosdeveloper/msg/4bfb45e5ef067ada

I think Van mentioned he knew of a few problems with using this TestObjectFactory approach and I think Keith W felt it was better to use the production objects (or apis but there's few of those).

Afterwards, I came across a problem or two with the TestObjectFactory whilst using it in making integration tests. Unfortunately, I can't recall exactly what they were (date related I think) except I think it was in the loan area (it only creates 6 schedules but that wasn't my problem). They might or might not matter in data generation.

The specific problem I had with my own data generator patch was running out of numbers for fee ids! (the primary key is a Short) From what I remember my use of TestObjectFactory was creating (under the hood) a new fee for each loan and for each customer account I think. So I came to a halt far quicker than I would have liked. I got about 300 nice big centres which was ok for me but not for GK size.
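
The limit John ran into follows from the key type: a Java `Short` tops out at 32767. The sketch below is a hypothetical allocator, not the Mifos persistence code, and assumes ids are handed out sequentially starting at 1.

```java
// Hypothetical id allocator illustrating the Short key limit; it is not
// the Mifos persistence code. Assumes ids are issued sequentially from 1.
class ShortIdAllocator {
    private int next = 1;

    // Hands out 1..Short.MAX_VALUE (32767), then fails.
    short nextId() {
        if (next > Short.MAX_VALUE) {
            throw new IllegalStateException("no fee ids left");
        }
        return (short) next++;
    }

    public static void main(String[] args) {
        ShortIdAllocator ids = new ShortIdAllocator();
        int issued = 0;
        try {
            while (true) {
                ids.nextId();
                issued++;
            }
        } catch (IllegalStateException e) {
            System.out.println("ran out after " + issued + " ids"); // ran out after 32767 ids
        }
    }
}
```

If every generated loan and customer account consumes a fresh fee id, as John describes, that budget is spent long before a GK-sized dataset is reached, which is why reusing existing fees matters.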

Fees are not really present much in GK so maybe there's a use of TestObjectFactory that doesn't need to create fees. I haven't really looked into it since.

John

On Mon, Jan 11, 2010 at 5:55 PM, <Chand...@sungard.com> wrote:

Hi John

In the data generator thread you had mentioned that there was a problem with using the testobjectfactory class. I didn’t really get you there.. does the problem still exist and could you clarify what exactly the problem is?

Thank you and Regards

Chandan

Chandan Rao H • Associate Software Engineer • SunGard Technology Services •Embassy Icon, 6th Floor, Infantry Road, Bangalore 560001 India • Tel +91-80-2222-0501 Extn:3240 , Mobile +91-9686601284• http://www.sungard.com/sts

Email: Chand...@sungard.com

Van Mittal-Henkle

Jan 13, 2010, 12:25:51 PM

to Mifos software development

Hi Chandan,

TestObjectFactory is used in the Mifos integration tests. Although it constructs objects that can be used to test particular parts of the application, there are some methods which construct objects that are not in a valid state and/or do not follow the business rules. This happens when data has been forced into an object to test some particular scenario.

In some cases, forcing data into an object allows a particular feature to be tested in a valid way, in other cases tests based on data manipulation like this are suspect. When considering using TestObjectFactory for generic data generation, great care would need to be taken in order to avoid using methods that do this kind of forcing data into an object for a particular testing purpose.

In general our plan is to move away from the use of TestObjectFactory and build a cleaner, easier to use and maintain alternative.

Keith Woodlock proposed and started work on using the Builder pattern as a basis for constructing test objects. The original intent of this approach was for generating data for unit tests and integration tests, but it also seems promising as a way of generating generic test data for use in performance testing.

Take a look at the set of classes such as ClientBuilder, FeeBuilder, LoanAccountBuilder and the like (if you do an “open type” in Eclipse and search for “org.mifos.*Builder” you’ll get a list of them). There is still additional work to be done on these and there will probably be some distinctions that need to be made between data constructed for in-memory unit tests vs. database-based data.

Some code in TestObjectFactory could well be used to help understand what Builder classes will need to do. The nice thing about the Builder pattern is that it provides a clean way to construct both objects using many default settings and objects with very particular settings without having the explosion of similar methods each carrying various defaults that can be seen in TestObjectFactory.
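
A minimal sketch of the pattern being described, to show how defaults and particular settings coexist without a family of near-identical factory methods. `SampleClient` and `SampleClientBuilder` are invented for illustration; they are not the Mifos ClientBuilder, and the fields are assumptions.

```java
// Hypothetical sketch of the Builder pattern. These are not the Mifos
// builder or domain classes, and the fields are invented.
class SampleClient {
    final String name;
    final String office;
    final boolean active;

    SampleClient(String name, String office, boolean active) {
        this.name = name;
        this.office = office;
        this.active = active;
    }
}

class SampleClientBuilder {
    // Defaults used unless the caller overrides them.
    private String name = "Test Client";
    private String office = "Test Office";
    private boolean active = true;

    SampleClientBuilder withName(String name) {
        this.name = name;
        return this;
    }

    SampleClientBuilder withOffice(String office) {
        this.office = office;
        return this;
    }

    SampleClientBuilder inactive() {
        this.active = false;
        return this;
    }

    SampleClient build() {
        return new SampleClient(name, office, active);
    }

    public static void main(String[] args) {
        // All defaults.
        SampleClient byDefault = new SampleClientBuilder().build();
        // Only the settings that matter to this case are spelled out.
        SampleClient particular = new SampleClientBuilder().withName("Asha").inactive().build();
        System.out.println(byDefault.name + " / " + particular.name);
    }
}
```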

--Van

----

Van Mittal-Henkle

Mifos Software Developer

Grameen Foundation

va...@grameenfoundation.org
