Exporting convex hull

166 views
Skip to first unread message

digulla

unread,
Aug 20, 2012, 11:42:31 AM8/20/12
to jooq...@googlegroups.com
I'm looking for a tool that (should) be relatively simple to build with jOOQ (and maybe someone already did it): I'd like to pass it a list of records (or tables + PKs) and it should dump the convex hull of all FK relations to an XML file.

Usage: Extract test data from a testing/production system.

Anyone?

Lukas Eder

unread,
Aug 20, 2012, 2:10:37 PM8/20/12
to jooq...@googlegroups.com
That's an interesting problem. How about this canonical, yet a bit
inefficient algorithm:

-------------------------------------------------
public class Hull {
public static Set<Table<?>> hull(Table<?>... tables) {
Set<Table<?>> result = new HashSet<Table<?>>(Arrays.asList(tables));

// Loop as long as there are no new result tables
int size = 0;
while (result.size() > size) {
size = result.size();

for (Table<?> table : new ArrayList<Table<?>>(result)) {
// Follow all outbound foreign keys
for (ForeignKey<?, ?> fk : table.getReferences()) {
result.add(fk.getKey().getTable());
}

// Follow all inbound foreign keys from tables within
the same schema
for (Table<?> other : table.getSchema().getTables()) {
if (other.getReferencesTo(table).size() > 0) {
result.add(other);
}
}
}
}


return result;
}

public static void main(String[] args) {
System.out.println(hull(T_AUTHOR));
}
}
-------------------------------------------------

Note that this algorithm relies on a set of generated tables from a
single schema and doesn't follow foreign keys, if they lead to enum
types generated from "master data tables". A more reliable solution
would be to query the Oracle dictionary tables as such:

-------------------------------------------------
-- "graph" denotes an undirected foreign key reference graph
-- for schema "TEST"
with graph as (
select c1.table_name t1, c2.table_name t2
from all_constraints c1
join all_constraints c2
on c1.owner = c2.r_owner
and c1.constraint_name = c2.r_constraint_name
where c1.owner = 'TEST'
union all
select c2.table_name t1, c1.table_name t2
from all_constraints c1
join all_constraints c2
on c1.owner = c2.r_owner
and c1.constraint_name = c2.r_constraint_name
where c1.owner = 'TEST'
),
-- "paths" are all directed paths within that schema as a #-delimited string
paths as (
select sys_connect_by_path(t1, '#') || '#' path
from graph
connect by nocycle prior t1 = t2
),
-- "subgraph" are all those directed paths that go trough a given table T_AUTHOR
subgraph as (
select distinct regexp_replace(path, '^#(.*)#$', '\1') path
from paths
where path like '%#T_AUTHOR#%'
)
-- This XML-trick splits paths and generates rows for every distinct
select distinct cast(t.column_value.extract('//text()') as
varchar2(4000)) table_name
from subgraph,
table(xmlsequence(xmltype(
'<x><x>' || replace(path, '#', '</x><x>') ||
'</x></x>').extract('//x/*'))) t
order by table_name
-------------------------------------------------

Happy deciphering this beast! I'll pay you a beer if you manage to
phrase this query with jOOQ (contributing the required extensions, of
course) ;-)

Cheers
Lukas

Lukas Eder

unread,
Aug 20, 2012, 2:23:27 PM8/20/12
to jooq...@googlegroups.com
In fact, with only a little addition, you can calculate all convex
hulls within your database, for every "origin" table:
select distinct t.table_name, regexp_replace(p.path, '^#(.*)#$', '\1') path
from paths p
cross join all_tables t
where t.owner = 'TEST'
and p.path like '%#' || t.table_name || '#%'
),
-- This XML-trick splits paths and generates rows for every distinct
split_paths as (
select distinct table_name origin,
cast(t.column_value.extract('//text()') as varchar2(4000)) table_names
from subgraph,
table(xmlsequence(xmltype(
'<x><x>' || replace(path, '#', '</x><x>') ||
'</x></x>').extract('//x/*'))) t
)
select origin, listagg(table_names, ', ') within group (order by 1)
from split_paths
group by origin
-------------------------------------------------

Within the jOOQ integration test database, the above yields (result CSV-export):

-------------------------------------------------
"ORIGIN","LISTAGG(TABLE_NAMES,',')WITHINGROUP(ORDERBY1)"
"T_658_11","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_12","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_21","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_22","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_31","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_32","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_658_REF","T_658_11, T_658_12, T_658_21, T_658_22, T_658_31,
T_658_32, T_658_REF"
"T_AUTHOR","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE,
T_BOOK_STORE, T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"T_BOOK","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE, T_BOOK_STORE,
T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"T_BOOK_DETAILS","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE,
T_BOOK_STORE, T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"T_BOOK_STORE","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE,
T_BOOK_STORE, T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"T_BOOK_TO_BOOK_STORE","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE,
T_BOOK_STORE, T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"T_DIRECTORY","T_DIRECTORY"
"T_LANGUAGE","T_AUTHOR, T_BOOK, T_BOOK_DETAILS, T_BOOK_SALE,
T_BOOK_STORE, T_BOOK_TO_BOOK_STORE, T_LANGUAGE"
"X_TEST_CASE_64_69","X_TEST_CASE_64_69, X_TEST_CASE_71,
X_TEST_CASE_85, X_UNUSED"
"X_TEST_CASE_71","X_TEST_CASE_64_69, X_TEST_CASE_71, X_TEST_CASE_85, X_UNUSED"
"X_TEST_CASE_85","X_TEST_CASE_64_69, X_TEST_CASE_71, X_TEST_CASE_85, X_UNUSED"
"X_UNUSED","X_TEST_CASE_64_69, X_TEST_CASE_71, X_TEST_CASE_85, X_UNUSED"
-------------------------------------------------

Lukas Eder

unread,
Aug 20, 2012, 3:05:39 PM8/20/12
to jooq...@googlegroups.com
... this was such a geeky thing to write in SQL, it deserves a blog post:
http://blog.jooq.org/2012/08/20/serious-sql-a-convex-hull-of-correlated-tables/

I challenge anyone to write a more concise SQL statement to achieve the same!
Or to get Tom Kyte involved in the discussion :-)

Cheers
Lukas

Lukas Eder

unread,
Aug 31, 2012, 2:18:18 PM8/31/12
to jooq...@googlegroups.com
Hi Aaron,

So was any of this of use for you?

Cheers
Lukas

2012/8/20 Lukas Eder <lukas...@gmail.com>:
Reply all
Reply to author
Forward
0 new messages