I am in search of a downloadable oracle data profiling/quality tool.
Any recommendation is highly appreciated. Thanks!
Oracle provide a free tool called SQL Developer which can be
downloaded from here: http://www.oracle.com/technology/software/products/sql/index.html.
You need to create a Metalink account before you can download it but
this is free and quick to do.
HTH
-g
Thanks. No, this is not what I am looking for. I am looking for open
source tools that can do
column profiling, dependency profiling and redundancy profiling, etc.
N.
Let me first say I don't know of any free (I assume that is what you
mean by downloadable) tools that will do this, but that kind of tool
always scares me.
I never know the kind of queries it will be use to validate the
business data. I know that purists out there will say that a well
designed relational schema will not need to be validated, but I also
know people (designers) make mistakes. However, just turning a third
party tool loose on a production database to find table *data*
dependencies and redundancies (not just structurual dependencies or
redundancies) seems to be a risky proposition at best.
Regards,
Steve
Can you give us some examples of existing commercial products that
currently exist?
What are you thinking of exactly?
You can always develop sql scripts and kick them off via sqlplus.
Once you have a script that fits your purpose that's in sql or plsql
it can be used ( with version dependencies of course ) by everyone.
Steve,
Maybe I didn't make myself clear.
Here is what I am looking for: an open source tool to do the following
http://en.wikipedia.org/wiki/Data_profiling.
It won't be used in production. We want to have a quick
look at the data in the database as we are having problems
with the data.
N.
Like Oracle Discoverer. The idea is to analyze the data and verify
that the data conforms the
column definition (data type, length, etc) and other constraints, etc.
The data will always conform to the column definition and other
constraints. If the data does not conform to the column definition or
constraint, then an error will be raised for the DML statement
attempting to populate the column.
HTH,
Brian
--
===================================================================
Brian Peasland
d...@nospam.peasland.net
http://www.peasland.net
Remove the "nospam." from the email address to email me.
"I can give it to you cheap, quick, and good.
Now pick two out of the three" - Unknown
--
Posted via a free Usenet account from http://www.teranews.com
Well, independent of the discussion about whether data profiling is
sensible etc., are you aware that Oracle Warehouse Builder 10gR2
offers quite advanced data profiling? It's not open source, but the DB
EE and I believe Standard Edition do include a license of the basic
product. The following article by an Oracle ACE can be a starting
point: http://www.oracle.com/technology/pub/articles/rittman-owb.html.
Be sure to research the licensing of this 'option' - licensing is very
messy (and expensive!) with this OWB release, but I believe you need
the Data Quality option (that markets here as 222 Euro per named user
or Euro 11.105 per processor, should match the licensing of the
database.) Playing for a while should still be free. OWB is not a
fully intuitive product, the installation manual is definitively a
required read.
Regards,
Erik Ykema
> Here is what I am looking for: an open source tool to do the followinghttp://en.wikipedia.org/wiki/Data_profiling.
>
The page you linked has items such as the following:
* between tables: the customer number on an order should always
appear in the customer table
> It won't be used in production. We want to have a quick
> look at the data in the database as we are having problems
> with the data.
In order to get this information, I'm assuming you would have to query
a production database to get it (unless you have a copy of your
production database elsewhere where performance isn't as much of a
concern). So for example, for the item above, my fear is a "data
profiling" tool issues something like the following...
select * from order where customer_id not in (select customer_id from
customer);
In some versions of Oracle, that could be a performance nightmare.
It *may* not end up to be an issue, but it's the first thing I always
think of. I've worked with hibernate, and seen some nightmarish
statements generated. Admittedly, that's not a profiling tool, but it
does generate SQL, which I would think a tool like you are mentioning
would also do.
Regards,
Steve
Thanks so much for the infomation!
I have found a couple (even though they may not be free) -
1. Data Quality / Explorer by Informatica
2. Oracle Warehouse Builder
3. IBM WebSphere Information Analyzer
4. Pervasive Data Profiler (possible product for Unisys?)
Wonder if anyone has any comments, pro and cons. Thanks in advance!
-N
What exactly are you trying to do? Do you have a project you are
working on?
If there is a project give us some idea of what environment and what
kind of application design was used. Is there 3rd party software
( canned applications )? Is this theoretical knowledge or do you have
some goals and deadlines you are trying to achieve?
Answers to questions like that might give some people more room for
specific recommendations.
Yes, this is for real :)
We are in the process of migrating data from mainframe to Oracle. It
is much
desired to have a tool that can tell us the quality of the data in
Oracle.
N.
Well the quality of data in oracle is probably going to depend on the
quality of the data that you are migrating out from the mainframe.
Are you re-designing as you go or just dumping into oracle the
mainframe design? OLTP to OLTP or what?
Yes, we are redesigning as we go. How have we accomplished this is
really not my concern at this point and I don't pretend to know much
of it :)
Here we have the data in Oracle and in either case, bad data in
mainframe
to start with or data was currupted during migration, we want to have
a good
look at the quality of the data in Oracle.
N.
The point is, forget about mainframe. I mentioned it only to clarify
that this is
a real world situation. We may or may not have data problems at all
during the
migration but down the road, we definitely want to have a tool for
data profiling.
N.
This blog has had some interesting comments: http://www.rittmanmead.com/blog/
(The author of the article Erik mentioned, of course).
The subject of ETL (extract transform and load) is quite dependent on
data quality. Googling on:
open source etl data quality
seems to find things, I wouldn't know about the, er, quality of the
information.
jg
--
@home.com is bogus.
http://eyeonoracle.blogs.techtarget.com/2007/06/13/database-administration-is-for-suckers/