Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Very Large Database, dont know what my options are

1 view
Skip to first unread message

victor....@gmail.com

unread,
Oct 20, 2008, 8:57:54 PM10/20/08
to tammy...@gmail.com
Hi,

Im a newbie DBA and a little bit stuck with a problem with an
Oracle Database and I would appreciate your help. I have a database
of billions of records. I need to dump a set of data for
investigation. I was using SELECT * OUTFILE... Because of the size of
the DB and lack of indexes (data model was poorly designed from the
get-go), the query ran for 4 hours, and I was cut off as Oracle has
setup the profile of every user to only allow a query to run for 4
hours. So I was wondering

a) Without creating indexes, is there a way I can do a plaintext
file dump of a set of data from a table more efficiently? If not,
other than changing the user profile to allow unlimited query cpu/
session resource, are there any other means?

b) Im anticipating that the data the analyst needs to investigate
in will have to stored within a table of some sort for querying.
Aside from creating indexes on that table, what other efficiency
improver would you advice to make sure that this new table won't take
forever to query? Im thinking about denormalizing the table.

Thanks,
Victor

Steve Howard

unread,
Oct 20, 2008, 9:03:10 PM10/20/08
to
On Oct 20, 8:57 pm, "victor.s.em...@gmail.com"

Hi Victor,

It sounds like you don't have DBA access to the database if you are
getting knocked off after a period of time. I would suggest asking
the DBA to provide some suggestions based on exactly what data you
need.

I will say that if you are extracting the most of the data anyway, an
index may not help.

Also, always provide version information (OS and database) and as much
data as possible, as the insight you get will always be better.

HTH,

Steve

joel garry

unread,
Oct 21, 2008, 1:30:19 PM10/21/08
to
On Oct 20, 5:57 pm, "victor.s.em...@gmail.com"

It helps to state all versions, options and platform info. If you are
on a unix platform, there are tools that can help this kind of
investigation. The general idea is to select out a small
representative set of data, and use unix tools to manipulate it as you
need, then once you get it all working, you can pipe them all together
and do the massive work necessary without blowing out oracle sort or
buffer areas. Then you can either use external tables or sqlloader to
put it back into a db, depending on what exactly you need the analyst
to do.

In some cases, it can be more efficient to do everything within the
database, given the modern DW and DSS enhancements of Oracle. It just
depends. Just as, it depends on what you are doing, exactly, as to
whether you want or need indices. Sometimes it actually is better to
full table scan, doing this kind of analysis may require quite
different tuning than the db is currently set up for. And don't let
anyone tell you things like "if you are getting x% of the table, y
access is better." The issues are far more sophisticated than that.
See the performance tuning manual, google for writings by Jonathan
Lewis and Richard Foote, and generally learn about data warehousing.
You may even want to get an experienced DBA to help.

What you may want to avoid is doing all this DSS work on a production
OLTP system, if that is what you have.

jg
--
@home.com is bogus.
http://www.imdb.com/title/tt0074412/#comment

gazzag

unread,
Oct 22, 2008, 7:11:03 AM10/22/08
to
On 21 Oct, 01:57, "victor.s.em...@gmail.com"

I agree with Joel Garry here. I obviously don't know the business
case, but I am always suspicious of such requests. Surely the best
place for the data is the RDBMS? Surely the best tool for
manipulating the data is the RDBMS?

I was once asked by a manager if I could "dump" a database into MS
Excel (*shudder*) for him. The database, while not huge, is 9Gb in
size and consists of 400-odd tables. The politest way that I could
tell him that he obviously doesn't know what he's talking about is to
inform him that Excel is limited to 65535 rows :)

-g

DA Morgan

unread,
Oct 22, 2008, 9:30:47 AM10/22/08
to
gazzag wrote:

> I was once asked by a manager if I could "dump" a database into MS
> Excel (*shudder*) for him. The database, while not huge, is 9Gb in
> size and consists of 400-odd tables. The politest way that I could
> tell him that he obviously doesn't know what he's talking about is to
> inform him that Excel is limited to 65535 rows :)
>
> -g

Next time solve the problem this way:
http://www.google.com/search?hl=en&q=%22Excel%22+and+%22math+errors%22&btnG=Google+Search&aq=f&oq=
You'd think they could have figured this out after more than a decade.
--
Daniel A. Morgan
Oracle Ace Director & Instructor
University of Washington
damo...@x.washington.edu (replace x with u to respond)
Puget Sound Oracle Users Group
www.psoug.org

joel garry

unread,
Oct 22, 2008, 2:05:47 PM10/22/08
to

At the risk of disagreeing with myself :-) it depends. I was talking
to a fellow on the train, who turned out to be a statistician for a
big credit bureau. Turned out we had quite similar opinions about
unix tools, AIX idiosyncracies, silly user requests and so forth. So
I asked him about analytics, and boy, did he have opinions about
that! Definitely a vertical market that Oracle built-ins don't hit
there. No, I don't know enough to give any examples, I only see
people buying packages to do analytics by sucking from the Oracle db
into dark unmentionables.

>
> I was once asked by a manager if I could "dump" a database into MS
> Excel (*shudder*) for him.  The database, while not huge,  is 9Gb in
> size and consists of 400-odd tables.  The politest way that I could
> tell him that he obviously doesn't know what he's talking about is to
> inform him that Excel is limited to 65535 rows :)

:) Still easier than the dirty data I get _from_ Excel, not to mention
wasting time figuring out and explaining to people why their forumulae
don't match the properly rounded numbers from the db.

jg
--
@home.com is bogus.

Anyone else notice Yahoo stock down by 2/3 in 6 months? And yet
Kirkorian bails on Ford...

David Portas

unread,
Oct 22, 2008, 3:52:13 PM10/22/08
to
Actually it's over 1 million rows in Excel 2007 but no-one is silly enough
to create spreadsheets that large :)

http://office.microsoft.com/en-us/excel/HP100738491033.aspx

--
David Portas


Palooka

unread,
Oct 22, 2008, 5:11:52 PM10/22/08
to
David Portas wrote:
> Actually it's over 1 million rows in Excel 2007 but no-one is silly enough
> to create spreadsheets that large :)
>

I dimly remember a Windoze tool from about a decade or so ago. Its main
purpose was to take print files of reports from mainframe or relational
databases, "normalize" them, and feed them into some toy desktop
database, like M$ Access. Wish I could remember the name....

Palooka

R. Schierbeek

unread,
Oct 23, 2008, 2:25:56 PM10/23/08
to
<victor....@gmail.com> wrote

hello Victor
If you're spooling the SELECT * OUTFILE. from sqlplus, try:

SET trimspool on termout off

before your SELECT.
You might me surprised at the speed improvement.

cheers
Roelof Schierbeek

0 new messages