[Mifos-developer] Proposal for improving acceptance test performance and a faster approach to managing test data, so that our CI build runs in < 10 minutes.


Vivek Singh

Dec 30, 2010, 5:19:32 AM
to Mifos software development
I have spiked an approach for considerably speeding up acceptance test runs and for managing test data. I feel confident that we can now attack these problems and bring the build time down to under 10 minutes on CI. I have explained the spikes and the approach below, along with the context for everyone's benefit.

This is a proposal and I want you to rate it between 1 and 10. Also let me know what it would take to make it a 10.

Current State
Acceptance tests
I think most of us are aware of this, but to recap: this is the case with most (if not all) acceptance tests (a typical per-test setup is sketched after this list).
  1. They delete the data in all the tables.
  2. They recreate their own data using a dbUnit dataset. (There are a lot of these datasets. Some are reused, but even when a dataset is reused, steps 1-4 still have to be performed.)
  3. They reinitialize the application by reloading the cache.
  4. They perform the test.
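For illustration, here is a minimal sketch of what steps 1-3 typically look like in a test's setup method, using dbUnit's CLEAN_INSERT operation. This is a generic dbUnit sketch, not our exact code; the dataset path and credentials are made up.

import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSetBuilder;
import org.dbunit.operation.DatabaseOperation;

import java.io.File;
import java.sql.DriverManager;

public class PerTestSetupSketch {
    public void setUp() throws Exception {
        IDatabaseConnection connection = new DatabaseConnection(
                DriverManager.getConnection("jdbc:mysql://localhost/mifos", "root", ""));
        IDataSet dataSet = new FlatXmlDataSetBuilder()
                .build(new File("acceptanceTests/src/test/resources/dataSets/some_dataset.xml"));
        // CLEAN_INSERT first deletes all rows from every table in the dataset,
        // then re-inserts the dataset's rows (steps 1 and 2).
        DatabaseOperation.CLEAN_INSERT.execute(connection, dataSet);
        // Step 3 (reloading the application cache) is application-specific and omitted here.
    }
}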
Test data
We have two kinds of test data: integration test data and acceptance test data. Integration test data is in SQL form in two files, custom_data.sql and testdbinsertionscript.sql. Acceptance test data, as mentioned earlier, is maintained in a set of dbUnit XML files.
To change the database schema we add our changes to latest-schema.sql. Whenever we do this, we also need to do the following for the test data, i.e. if we want to add specific data for the change (default values can also be used, in which case the steps below are not needed):
  1. Integration test data: go through the two SQL files and edit them.
  2. Acceptance test data: go through the dbUnit files and modify the XML. We also have another approach which automatically exports the data and rewrites all the dbUnit files (a sketch of the export side follows this list).
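For reference, here is a minimal sketch of what such an export can look like with dbUnit's standard export API. The file name and credentials are made up, and our actual export tool may differ.

import org.dbunit.database.DatabaseConnection;
import org.dbunit.database.IDatabaseConnection;
import org.dbunit.dataset.IDataSet;
import org.dbunit.dataset.xml.FlatXmlDataSet;

import java.io.FileOutputStream;
import java.sql.DriverManager;

public class ExportSketch {
    public static void main(String[] args) throws Exception {
        IDatabaseConnection connection = new DatabaseConnection(
                DriverManager.getConnection("jdbc:mysql://localhost/mifos", "root", ""));
        // Export every table the connection can see into one flat-XML dataset file.
        IDataSet fullDataSet = connection.createDataSet();
        FlatXmlDataSet.write(fullDataSet, new FileOutputStream("exported-dataset.xml"));
    }
}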
Consequence
The acceptance test suite takes a long time to run.
We cannot run tests in parallel.
We cannot run against a cluster of servers (this is important because it would tell us whether there are scaling-out issues).
Changing acceptance test data is quite painful; the dbUnit output files also need to be changed.
Test data setup takes a long time.

Proposed approach to solve acceptance tests run performance
Clearly we cannot do steps 1-4 for every test. Ideally they should be done only once; that would make for the most performant test suite. At the same time, how do we ensure that one test doesn't step on another by changing shared data?
These form our basic requirements. Our current approach ensures correctness but is rather unusable because of the performance we get. So we should probably relax foolproof correctness in favour of performance; that trade-off is what this proposal is based on.

Get the union of all data sets
This step has been spiked. The idea was to compute a single dataset which is the union of all the datasets, where an item in the set is a database table row and the set contains rows from different tables. In order to define a union we need to define equality, so two rows are considered equal when:
    • their primary keys match, or
    • their unique key columns are equal, or
    • there is no primary key and all column values are equal.
I used Guava to compute the union (the key programs are attached as files; it's throwaway code, so the quality is not great, but it has tests that help in understanding it). I ignored some datasets which were very large and targeted at a handful of reporting tests; we would need to handle those separately. I also recorded the rejected data using intersection (hopefully useful for troubleshooting).
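To make the mechanics concrete, here is a tiny sketch of the Guava calls involved. Strings stand in for Row objects; in the spike, Row.equals() encodes the key rules above.

import com.google.common.collect.Sets;

import java.util.Set;

public class UnionSketch {
    public static void main(String[] args) {
        Set<String> dataSetA = Sets.newHashSet("row1", "row2");
        Set<String> dataSetB = Sets.newHashSet("row2", "row3");

        Set<String> union = Sets.union(dataSetA, dataSetB);             // {row1, row2, row3}
        Set<String> duplicates = Sets.intersection(dataSetB, dataSetA); // {row2} -- recorded as rejected rows

        System.out.println(union);
        System.out.println(duplicates);
    }
}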

Set up data only once
The SQL version of the union found above (we have this after the spike; it's too large to attach) is run against the database once, before all tests start. In other words, instead of running custom_data.sql, testdbinsertionscript.sql and latest-data.sql, we run this union as a single functional-test-data.sql file.
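A minimal sketch of how the one-time load could be hooked in with TestNG follows. The statement splitting is crude (it assumes the generated file ends each statement with ';' on its own line) and the path and credentials are made up.

import org.testng.annotations.BeforeSuite;

import java.io.BufferedReader;
import java.io.FileReader;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class OneTimeDataSetup {
    @BeforeSuite
    public void loadFunctionalTestData() throws Exception {
        Connection connection = DriverManager.getConnection("jdbc:mysql://localhost/mifos", "root", "");
        Statement statement = connection.createStatement();
        BufferedReader reader = new BufferedReader(new FileReader("functional-test-data.sql"));
        StringBuilder sql = new StringBuilder();
        String line;
        while ((line = reader.readLine()) != null) {
            sql.append(line).append('\n');
            if (line.trim().endsWith(";")) {
                String stmt = sql.toString().trim();
                statement.execute(stmt.substring(0, stmt.length() - 1)); // drop the trailing ';'
                sql.setLength(0);
            }
        }
        reader.close();
        statement.close();
        connection.close();
    }
}

With this in place, the per-test CLEAN_INSERT and cache reload simply go away.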

Individual tests isolate their own data if required
Obviously this means one test can affect another when common data is changed. We wait for this to happen and for a test to fail because of it. Only when it happens (or when we are writing a new test) should the test define its data in a way that is not known to any other test. In other words, the tests themselves are responsible for maintaining the isolation of their data.
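A sketch of one simple way a test can keep its data private is to embed the test's name in the entities it creates. All names here, including createClient and CreateLoanTest, are hypothetical.

public class IsolatedDataNaming {
    // Embed the test's name in shared entities so two tests never touch the same row,
    // e.g. "Client_CreateLoanTest" vs. "Client_RedoLoanTest".
    public static String isolatedName(String entity, Class<?> testClass) {
        return entity + "_" + testClass.getSimpleName();
    }
}

A test would then call something like createClient(IsolatedDataNaming.isolatedName("Client", CreateLoanTest.class)) instead of a shared createClient("Client").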

Tests run without setting up their own data or reinitializing the application

Some indicative results from running the acceptance tests after this spike on my machine: 100 tests, 35 passed, 65 failed, in 407 seconds. Obviously we need to fix all the tests; as when we fixed the integration tests, we will need to put some effort into this.

Possible questions on above
  1. Why not use a transaction to roll back the data after every test, as in the integration tests? Acceptance tests differ from integration tests: they don't run in the same process as the application under test, they run far longer, and they perform multiple transactions. Given this, if we used transactions:
    • It would not be a black-box test.
    • We would have to modify the application under test to run tests in this mode.
    • We would have issues when running tests in parallel, as multiple long-running transactions would create database locks, deadlocks and waits.
  2. What if two tests need incompatible data or configuration? There are two scenarios here: running tests in sequence and running them in parallel. When running tests in sequence, we can create a REST interface in the application so these settings can be changed from the test. This would not work when we want to run tests in parallel; but if we do not have too many combinations of such data, we can group the tests (e.g. elsim, glim) and run the groups one after the other.
  3. Why do we need this improvement given that we are going to follow the test pyramid strategy? I think even when we go with the test pyramid approach we would still have some Selenium functional tests, as the kind of coverage they provide is not given by service-level tests. As Mifos matures, the number of such tests will only increase.

Proposed approach for faster creation of test data
Setting up test data for the integration tests and the acceptance tests (after we implement the above) would still take quite a lot of time: 10 minutes on my machine, even though it doesn't really provide value worth 10 minutes. We can reduce this to < 20 seconds with the following.

Maintain database dumps. Essentially we can treat the test data as production data and apply the database upgrade scripts to it. The only difference is that we would be applying them once per build, at the beginning of the test run. We can use mysqlhotcopy to export the data/index files from MySQL, zip them up and commit the archive to source control. (We can also copy these files directly from the MySQL datadir, e.g. the /var/lib/mysql/mifos folder.)
Restore the database from the checked-in dump. Unzip the dump into the MySQL datadir (e.g. /var/lib/mysql/mifos).
Apply the database upgrade scripts to these dumps.
Run the tests (integration or acceptance). A rough sketch of this sequence follows.
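Here is a rough sketch of what the restore step might look like if driven from Java. Every path and command is illustrative; this assumes a zipped copy of the MySQL data directory in source control and sufficient permissions on the build machine.

import java.io.IOException;
import java.util.Arrays;

public class RestoreDatabaseFromDump {
    public static void main(String[] args) throws IOException, InterruptedException {
        // Stop MySQL, replace the mifos data directory with the checked-in dump, restart.
        run("sudo", "service", "mysql", "stop");
        run("sudo", "rm", "-rf", "/var/lib/mysql/mifos");
        run("sudo", "unzip", "mifos-test-data.zip", "-d", "/var/lib/mysql");
        run("sudo", "service", "mysql", "start");
        // Then apply any pending database upgrade scripts and run the tests.
    }

    private static void run(String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        if (process.waitFor() != 0) {
            throw new RuntimeException("Command failed: " + Arrays.toString(command));
        }
    }
}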
AcceptanceDataSets.java
Row.java
RowTest.java
SetsTest.java

Michael Vorburger

Jan 2, 2011, 8:48:42 AM
to Mifos software development
This would certainly be very cool.

I'm wondering if the "how to isolate tests" issue could be addressed, instead of with "real" classical transactions that roll back to the pre-test state of the dataset, by something along the lines of restoring from dumps as you suggest, but before every test, if it can somehow be made very fast? Maybe http://dev.mysql.com/doc/refman/5.1/en/backup-and-recovery.html, and the "Bulk Data Loading" section of http://www.mysql.com/news-and-events/on-demand-webinars/display-od-460.html (which I haven't had time to watch), give some further useful ideas?

It's just a thought, and I'm happy to let others dig more into this to see if it's feasible and worthwhile.



Vivek Singh

Jan 4, 2011, 1:05:26 AM
to Mifos software development
We can definitely reduce the time taken to set up the data for an acceptance test by following the MySQL binary image restore approach. We would still need to reinitialize the application for every test, though that might not be very time consuming.
Changing acceptance test data would remain quite painful, and the pain would grow as we add more tests (more images). Life is much easier with one image.

In spite of the above, we might use this as an intermediate approach.

Vivek Singh

Jan 4, 2011, 2:40:59 AM
to Mifos software development
Jeff asked me some questions. I have added responses to them, and they are here now.

But please do use the mailing list for rating/responding to the proposal.

Thanks

Adam Monsen

Jan 7, 2011, 3:48:46 PM
to mifos-d...@lists.sf.net
I rate this a 3 out of 10.

I like:
* that it allows running tests in parallel
* that we avoid re-initializing test data over and over

To make it a 10:
* enforce data isolation. "Individual tests isolate their own data if
required" says otherwise, and I understand the reasons for pushing the
burden down to the actual acceptance tests. But I don't like it! This
got us into a heap of trouble in the integration tests (Udai spent a
Summer of Code decoupling them). Bugs caused by dependent/coupled data can lie dormant for a long time and are complex to debug and fix.


Here are some other random ideas:
* Create a tool to help developers isolate test data. Let's call it "checkClean". checkClean would make sure that each individual test leaves the database exactly as it was before it started mucking about. You might only run it on demand once a coupling/dependency is found between tests. (A sketch of what checkClean might look like follows this list.)
* Split up the acceptance tests (ideally, dynamically) and have several build slaves run them in parallel. For instance, say we have 100 acceptance tests: 1-20 are sent to slaveA, 21-40 to slaveB, 41-60 to slaveC, etc. A central job/server/script (whatever) parcels out groups of tests and compiles the results. If this parallelization were built into the TestNG/Selenium tests, we could simply set parallel=true for the Surefire plugin (and turn this on just for CI builds).
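A sketch of what the hypothetical checkClean tool might look like, using per-table row counts as a cheap approximation; a thorough version would diff actual row contents or checksums. All names here are made up.

import java.sql.*;
import java.util.HashMap;
import java.util.Map;

public class CheckClean {
    // Snapshot the row count of every table; call once before and once after a test.
    public static Map<String, Long> snapshot(Connection connection) throws SQLException {
        Map<String, Long> counts = new HashMap<String, Long>();
        ResultSet tables = connection.getMetaData().getTables(null, null, "%", new String[]{"TABLE"});
        while (tables.next()) {
            String table = tables.getString("TABLE_NAME");
            Statement statement = connection.createStatement();
            ResultSet rs = statement.executeQuery("select count(*) from " + table);
            rs.next();
            counts.put(table, rs.getLong(1));
            statement.close();
        }
        return counts;
    }

    public static void assertClean(Map<String, Long> before, Map<String, Long> after) {
        if (!before.equals(after)) {
            throw new AssertionError("Test did not leave the database as it found it: " + diff(before, after));
        }
    }

    private static String diff(Map<String, Long> before, Map<String, Long> after) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Long> entry : after.entrySet()) {
            Long old = before.get(entry.getKey());
            if (old == null || !old.equals(entry.getValue())) {
                sb.append(entry.getKey()).append(": ").append(old).append(" -> ").append(entry.getValue()).append("; ");
            }
        }
        return sb.toString();
    }
}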


Udai Gupta

Jan 8, 2011, 1:32:27 AM
to Mifos software development, mifos-d...@lists.sf.net
I rate it 9 out of 10.

I think by following the approach proposed by Vivek, we will make
acceptance tests simpler and faster.

The "Automated Acceptance Tests" should not be working like
"Functional Tests" (Service Facade level), i.e. running every
functional check in isolation.

The purpose of acceptance tests should be just to make sure that the application works from the front-end (UI) side for different workflows.

These acceptance tests automate some of the work a QA would do
manually from the UI.

In manual acceptance testing, we usually set up one database (or two, depending on the configuration) and then follow through the UI to make sure everything works as expected. Similarly, automated acceptance testing should not do too much in setUp() and tearDown().

I like the isolation of test data by target objects (different
customers and accounts).

I remember during the work on the integration tests, when we were not able to roll back because commits were being called from the main code, we isolated tests by changing the target object names, e.g. createClient("Client_TestName") in setup methods, whereas previously it was createClient("Client") in all the tests, causing collisions (or duplicate customer errors) when one test failed.

Acceptance tests won't roll back, hence isolation by target object name should work.

To make it a 10, I would still keep the dump in dbUnit format, because I don't want us to hardcode MySQL into Mifos wherever that can be avoided; it would become a blocker later if we decide to try out other databases.

Cheers,
Udai


Vivek Singh

Jan 10, 2011, 3:01:14 AM
to Mifos software development, mifos-d...@lists.sf.net
Thanks for your ideas. Right now I am only worrying about designing this so that we can parallelize it later. So when we do parallelize our tests your ideas will be useful; someone has to remember to revisit this proposal at that time.

>> This got us into a heap of trouble in the integration tests (Udai spent a Summer of Code decoupling them). Bugs caused by dependent/coupled data can lie dormant for a long time and are complex to debug and fix.
Bugs in tests always have the problem of producing false positives, although in my experience false positives caused by lack of data isolation are rare. This is the approach we tend to follow on most if not all projects, since providing complete data isolation is impractical because of performance issues (as we are facing now). It is a trade-off: given the choice, I would tend towards faster but slightly unreliable tests rather than really, really slow but completely reliable tests. To me it works somewhat like the CAP theorem; I don't know how to solve it. For the same reason, we shouldn't distribute the tests randomly (dynamically) across parallel runs unless we are willing to bear the cost of keeping every test isolated from the others.

Having said that, we should keep a few things in mind with regard to how bad this problem will be:
a) We will be running these tests all the time, so we cannot build up a pile of debt in the acceptance tests.
b) We have an approach for fixing a test that fails randomly (isolate its data).
c) We have far fewer acceptance tests than integration tests.


Vivek Singh

Jan 10, 2011, 3:45:30 AM
to Mifos software development, mifos-d...@lists.sf.net
>> I will still keep the dump in dbUnit format, because I don't want us to hardcode MySQL into Mifos wherever that can be avoided; it would become a blocker later if we decide to try out other databases.

Good point; I had overlooked how we would change the data when we need to. So we would need to keep dbUnit as the source of our data alongside the MySQL binary images. I have updated the proposal with this information.

Jakub Sławiński

Jan 13, 2011, 1:50:36 PM
to mifos-d...@lists.sourceforge.net

Hi Vivek,

Since my team is working on improving the current set of acceptance tests, I am specifically interested in your approach to preparing the union of all the data sets.

We currently have 75 data set XML files. However, 53 of them are used only for validating test results. Since we prefer validating tests by checking values in the UI, we will remove most of these data sets in the near future. Moreover, we are in the process of reviewing the existing acceptance test cases/tests (about 200) and the ones we will have to add (about 240), and of preparing the final list of test cases that will be left in the Mifos code base. Our current estimate is that in the end there will be no more than 200-250 acceptance test cases/tests in Mifos. We will also try to limit the number of data sets used in our tests.

If you do not want to wait until we finish our work (we will finish by the end of February), please do not include the results data sets in your union.


Regards,
  Jakub.
AcceptanceDataSets.java
package data;

import com.google.common.collect.Sets;
import org.dbunit.dataset.*;
import org.mifos.framework.util.DbUnitUtilities;

import java.io.File;
import java.io.IOException;
import java.util.*;

// Computes the union of all acceptance-test dbUnit datasets and writes it out as
// functional-test-data.sql; duplicate (rejected) rows are recorded in rejected.sql
// for troubleshooting.
public class AcceptanceDataSets {
    private static UniqueKeysMap uniqueKeysMap = new UniqueKeysMap();

    public static void main(String[] args) throws Exception {
        File directory = new File("acceptanceTests/src/test/resources/dataSets");
        File[] files = directory.listFiles();
        DbUnitUtilities dbUnitUtilities = new DbUnitUtilities();
        ITestDatabase testDatabase = TestDatabase.connect("root", "", "mifos");
        Set<Row> union = new HashSet<Row>();
        RowsForExactMatch rejectedRows = new RowsForExactMatch();
        IgnoredDataSets ignoredDataSets = new IgnoredDataSets();
        for (File file : files) {
            if (file.getName().endsWith(".xml") && !ignoredDataSets.isIgnored(file.getName())) {
                System.out.println("Constructing dataSet for " + file.getName());
                IDataSet dbUnitDataSet = dbUnitUtilities.getDataSetFromFile(file.getAbsolutePath());
                String[] tables = dbUnitDataSet.getTableNames();
                Set<Row> rows = getAllRows(testDatabase, dbUnitDataSet, tables);
                System.out.println("Number of items in the current data set: " + rows.size());
                // Rows already in the union are duplicates; record them as rejected.
                rejectedRows.addAll(Sets.intersection(rows, union), file.getName());
                union = Sets.union(rows, union);
                System.out.println("Number of items in the union data set: " + union.size());
            }
        }
        writeAllTablesToFile(union);
        writeAllRejectedToFile(rejectedRows, union);
    }

    private static void writeAllRejectedToFile(RowsForExactMatch rejectedRows, Set<Row> selectedRows) throws IOException {
        RowsForExactMatch selectedRowsForExactMatch = new RowsForExactMatch();
        selectedRowsForExactMatch.addAll(selectedRows, "");
        ArrayList<RowForExactMatch> nonDuplicateRejectedRows =
                new ArrayList<RowForExactMatch>(Sets.difference(rejectedRows, selectedRowsForExactMatch));
        Collections.sort(nonDuplicateRejectedRows, new Comparator<RowForExactMatch>() {
            @Override
            public int compare(RowForExactMatch row1, RowForExactMatch row2) {
                return row1.getTag().compareTo(row2.getTag());
            }
        });
        SimpleFile simpleFile = new SimpleFile("/home/vivek/projects/mifos/rejected.sql");
        String lastTag = null;
        for (RowForExactMatch rowForExactMatch : nonDuplicateRejectedRows) {
            if (!rowForExactMatch.getTag().equals(lastTag)) {
                lastTag = rowForExactMatch.getTag();
                simpleFile.writeLine(String.format("FILE %s;", lastTag));
                simpleFile.writeLine();
            }
            simpleFile.writeLine(rowForExactMatch.toInsertSql());
        }
        simpleFile.close();
    }

    // Converts every dbUnit table row into a Row object carrying its table's key metadata.
    private static Set<Row> getAllRows(ITestDatabase testDatabase, IDataSet dbUnitDataSet, String[] tables) throws DataSetException {
        Set<Row> rows = new HashSet<Row>();
        for (String tableName : tables) {
            ITable dbUnitTable = dbUnitDataSet.getTable(tableName);
            Table table = new Table(tableName, testDatabase, uniqueKeysMap.get(tableName));
            ITableMetaData tableMetaData = dbUnitTable.getTableMetaData();
            Column[] columns = tableMetaData.getColumns();
            int rowCount = dbUnitTable.getRowCount();
            for (int i = 0; i < rowCount; i++) {
                Row row = new Row(table);
                for (Column column : columns) {
                    Object value = dbUnitTable.getValue(i, column.getColumnName());
                    row.setColumnValue(column.getColumnName(), value);
                }
                rows.add(row);
            }
        }
        return rows;
    }

    private static ArrayList<String> writeAllTablesToFile(Set<Row> union) throws IOException {
        ArrayList<Row> result = new ArrayList<Row>(union);
        Collections.sort(result, new Comparator<Row>() {
            @Override
            public int compare(Row row1, Row row2) {
                return row1.getTableName().compareTo(row2.getTableName());
            }
        });
        IgnoredTables ignoredTables = new IgnoredTables();
        int numberOfRowsInATable = 0;
        ArrayList<String> allTables = new ArrayList<String>();
        String outputFile = "/home/vivek/projects/mifos/functional-test-data.sql";
        System.out.println("Writing to file " + outputFile);
        SimpleFile simpleFile = new SimpleFile(outputFile);
        try {
            simpleFile.writeLine("SET FOREIGN_KEY_CHECKS=0;");
            String lastTableName = "";
            for (Row row : result) {
                String tableName = row.getTable().getName();
                if (ignoredTables.contains(tableName)) continue;
                if (!tableName.equals(lastTableName)) {
                    System.out.printf("%s: %d%n", lastTableName, numberOfRowsInATable);
                    numberOfRowsInATable = 0;
                    lastTableName = row.getTable().getName();
                    allTables.add(lastTableName);
                    simpleFile.writeLine(String.format("select \"TABLE: %s\";", lastTableName));
                    simpleFile.writeLine(String.format("delete from %s;", lastTableName));
                    simpleFile.writeLine("commit;");
                }
                simpleFile.writeLine(row.toInsertSql());
                numberOfRowsInATable++;
            }
            simpleFile.writeLine("SET FOREIGN_KEY_CHECKS=1;");
        } finally {
            simpleFile.close();
        }
        return allTables;
    }
}
Row.java
package data;

import org.apache.commons.lang.StringUtils;

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;

public class Row {
    private Table table;
    private HashMap<String, Object> data = new HashMap<String, Object>();

    public Row(Table table) {
        this.table = table;
    }

    public void setColumnValue(String columnName, Object value) {
        data.put(columnName, value);
    }

    // Equality implements the union rules: same table, then primary key match,
    // then unique key match; if the table has no primary key, all columns must match.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Row row = (Row) o;
        if (!table.equals(row.table)) return false;
        boolean primaryKeyMatched = true;
        PrimaryKey primaryKey = table.getPrimaryKey();
        ArrayList<String> primaryKeyColumns = primaryKey.getColumns();
        if (primaryKeyColumns.size() != 0) {
            for (String column : primaryKeyColumns)
                primaryKeyMatched &= (columnValueEquals(row, column));
            if (primaryKeyMatched) return true;
        }
        if (table.getUniqueKeyColumn() != null && columnValueEquals(row, table.getUniqueKeyColumn()))
            return true;
        if (primaryKeyColumns.size() == 0) return data.equals(row.data);
        return primaryKeyMatched;
    }

    private boolean columnValueEquals(Row row, String column) {
        Object value = data.get(column);
        // Null-safe: two null column values compare as equal.
        return value == null ? row.data.get(column) == null : value.equals(row.data.get(column));
    }

    // Hashing only by table puts rows of the same table in the same bucket,
    // so the key-based equals() above decides actual equality.
    @Override
    public int hashCode() {
        return table.hashCode();
    }

    @Override
    public String toString() {
        return String.format("table=%s, %s", table, data);
    }

    public Table getTable() {
        return table;
    }

    public String toInsertSql() {
        return toInsertSql(table.getName(), data);
    }

    public static String toInsertSql(String tableName, HashMap<String, Object> data) {
        StringBuffer buffer = new StringBuffer();
        buffer.append("insert into ").append(tableName).append(" (");
        buffer.append(StringUtils.join(data.keySet().toArray(), ','));
        buffer.append(") values (");
        Iterator<String> iterator = data.keySet().iterator();
        while (iterator.hasNext()) {
            Object columnValue = data.get(iterator.next());
            if (columnValue == null)
                buffer.append("null");
            else
                buffer.append("'").append(columnValue.toString().replaceAll("'", "''")).append("'");
            if (iterator.hasNext()) buffer.append(",");
        }
        buffer.append(");");
        return buffer.toString();
    }

    public RowForExactMatch cloneForExactMatch(String tag) {
        return new RowForExactMatch(table.getName(), data, tag);
    }

    public String getTableName() {
        return table.getName();
    }

    public String primaryKeySqlCondition() {
        StringBuffer stringBuffer = new StringBuffer();
        PrimaryKey key = table.getPrimaryKey();
        for (String column : key.getColumns()) {
            stringBuffer.append(column).append("='").append(data.get(column)).append("'");
        }
        return stringBuffer.toString();
    }
}
RowTest.java
package data;

import org.testng.Assert;
import org.testng.annotations.BeforeMethod;
import org.testng.annotations.Test;

import java.util.HashMap;

@Test
public class RowTest {
    private HashMap map;
    private Table table;
    private Row row;
    private StubTestDatabase testDatabase;

    @BeforeMethod
    public void setup() {
        map = new HashMap();
        map.put("foo", new PrimaryKey("abc"));
        map.put("bar", new PrimaryKey());
        testDatabase = new StubTestDatabase(map);
        table = new Table("foo", testDatabase, null);
        row = new Row(table);
    }

    public void noPrimaryKeyAndUnequalUniqueKeyShouldBeUnequal() {
        Table tableWithNoPrimaryKey = new Table("bar", testDatabase, "abc");
        Row row1 = new Row(tableWithNoPrimaryKey);
        row1.setColumnValue("abc", 123);
        row1.setColumnValue("efg", 456);
        Row row2 = new Row(tableWithNoPrimaryKey);
        row2.setColumnValue("abc", 321);
        row2.setColumnValue("efg", 456);
        Assert.assertFalse(row1.equals(row2));
    }

    public void sameIsEqual() {
        row.setColumnValue("abc", 123);
        row.setColumnValue("efg", 456);
        Assert.assertEquals(row, row);
        Assert.assertEquals(row.hashCode(), row.hashCode());
    }

    public void equalWhenColumnOrderIsDifferent() {
        Row row1 = new Row(table);
        row1.setColumnValue("abc", 123);
        row1.setColumnValue("efg", 456);
        Row row2 = new Row(table);
        row2.setColumnValue("efg", 456);
        row2.setColumnValue("abc", 123);
        Assert.assertTrue(row1.equals(row2));
        Assert.assertTrue(row1.hashCode() == row2.hashCode());
    }

    public void rowFromDifferentTableAreNotEqual() {
        Row row1 = new Row(table);
        row1.setColumnValue("abc", 123);
        row1.setColumnValue("efg", 456);
        Table otherTable = new Table("bar", testDatabase, null);
        Row row2 = new Row(otherTable);
        row2.setColumnValue("abc", 123);
        row2.setColumnValue("efg", 456);
        Assert.assertFalse(row1.equals(row2));
        Assert.assertFalse(row1.hashCode() == row2.hashCode());
    }

    public void rowWithDifferentPrimaryKeyAreNotEqual() {
        Row row1 = new Row(table);
        row1.setColumnValue("abc", 124);
        row1.setColumnValue("efg", 456);
        Row row2 = new Row(table);
        row2.setColumnValue("abc", 123);
        row2.setColumnValue("efg", 456);
        Assert.assertFalse(row1.equals(row2));
    }

    public void rowWithSamePrimaryKeyButDifferentDataAreEqual() {
        Row row1 = new Row(table);
        row1.setColumnValue("abc", 123);
        row1.setColumnValue("efg", 456);
        Row row2 = new Row(table);
        row2.setColumnValue("abc", 123);
        row2.setColumnValue("efg", 457);
        Assert.assertTrue(row1.equals(row2));
        Assert.assertTrue(row1.hashCode() == row2.hashCode());
    }

    public void toInsertSql() {
        row.setColumnValue("abc", 123);
        row.setColumnValue("efg", 456);
        String expectedSql = "insert into foo (abc,efg) values ('123','456');";
        Assert.assertEquals(expectedSql, row.toInsertSql());
    }

    public void toInsertSqlWhenTheColumnValueIsNull() {
        row.setColumnValue("abc", 123);
        row.setColumnValue("efg", null);
        String expectedSql = "insert into foo (abc,efg) values ('123',null);";
        Assert.assertEquals(expectedSql, row.toInsertSql());
    }

    public void primaryKeySqlCondition() {
        row.setColumnValue("abc", 123);
        row.setColumnValue("efg", 456);
        Assert.assertEquals("abc='123'", row.primaryKeySqlCondition());
    }

    public void rowsEqualWhenNoPrimaryKeyPresent() {
        Table otherTable = new Table("bar", testDatabase, null);
        Row row1 = new Row(otherTable);
        row1.setColumnValue("abc", 123);
        Row row2 = new Row(otherTable);
        row2.setColumnValue("abc", 123);
        Assert.assertTrue(row1.equals(row2));
        Assert.assertTrue(row1.hashCode() == row2.hashCode());
    }

    public void rowsUnEqualWhenNoPrimaryKeyPresent() {
        Table otherTable = new Table("bar", testDatabase, null);
        Row row1 = new Row(otherTable);
        row1.setColumnValue("abc", 123);
        Row row2 = new Row(otherTable);
        row2.setColumnValue("abc", 124);
        Assert.assertFalse(row1.equals(row2));
    }
}
SetsTest.java
package data;

import com.google.common.collect.Sets;
import org.testng.Assert;
import org.testng.annotations.Test;

import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;

@Test
public class SetsTest {
    private HashMap map;
    private Table table;
    private Row row;
    private StubTestDatabase testDatabase;

    public void uniqueKeyRepeated() {
        map = new HashMap();
        map.put("foo", new PrimaryKey("abc"));
        testDatabase = new StubTestDatabase(map);
        table = new Table("foo", testDatabase, "efg");
        Row row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "456");
        Set<Row> firstSet = new HashSet<Row>();
        firstSet.add(row);
        row = new Row(table);
        row.setColumnValue("abc", "321");
        row.setColumnValue("efg", "456");
        Set<Row> secondSet = new HashSet<Row>();
        secondSet.add(row);
        Assert.assertEquals(1, Sets.intersection(firstSet, secondSet).size());
        Assert.assertEquals(1, Sets.union(firstSet, secondSet).size());
    }

    public void primaryKeyRepeated() {
        map = new HashMap();
        map.put("foo", new PrimaryKey("abc"));
        testDatabase = new StubTestDatabase(map);
        table = new Table("foo", testDatabase, "efg");
        Row row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "654");
        Set<Row> firstSet = new HashSet<Row>();
        firstSet.add(row);
        row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "456");
        Set<Row> secondSet = new HashSet<Row>();
        secondSet.add(row);
        Assert.assertEquals(1, Sets.intersection(firstSet, secondSet).size());
        Assert.assertEquals(1, Sets.union(firstSet, secondSet).size());
    }

    public void noPrimaryKeyWithUniqueKey() {
        map = new HashMap();
        testDatabase = new StubTestDatabase(map);
        table = new Table("foo", testDatabase, "efg");
        Row row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "654");
        Set<Row> firstSet = new HashSet<Row>();
        firstSet.add(row);
        row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "456");
        Set<Row> secondSet = new HashSet<Row>();
        secondSet.add(row);
        Assert.assertEquals(0, Sets.intersection(firstSet, secondSet).size());
        Assert.assertEquals(2, Sets.union(firstSet, secondSet).size());
    }

    public void noPrimaryKeyWithUniqueKeyAndSomeColumnsNull() {
        map = new HashMap();
        testDatabase = new StubTestDatabase(map);
        table = new Table("foo", testDatabase, "efg");
        Row row = new Row(table);
        row.setColumnValue("abc", null);
        row.setColumnValue("efg", "654");
        Set<Row> firstSet = new HashSet<Row>();
        firstSet.add(row);
        row = new Row(table);
        row.setColumnValue("abc", "123");
        row.setColumnValue("efg", "456");
        Set<Row> secondSet = new HashSet<Row>();
        secondSet.add(row);
        Assert.assertEquals(0, Sets.intersection(firstSet, secondSet).size());
        Assert.assertEquals(2, Sets.union(firstSet, secondSet).size());
    }
}

Vivek Singh

Jan 17, 2011, 2:50:13 AM
to Mifos software development
Thanks for explaining what you and your team have been working on. We would be stepping on each other's code if this work were started before the end of February, so that is something we should keep in mind. Right now I have just put this forward as a proposal, and it will take a while before we write stories and plan the work.

Regarding using only the input files for the union: I realized this during the spike, and in the code you can see that I have a class called IgnoredDataSets which filters such files out (I have not attached that file, but you can see a call to it).

2011/1/14 Jakub Sławiński <jslaw...@soldevelo.com>