One of the quotes in the article reminded me of the CIO who has saved 9.5 million dollars on monitoring by switching to OEM. He was talking about thousands of databases. The interesting passage from the article (page 5) is here:
"Again, only very large customers with many interconnected Oracle databases would be likely to run a significant risk of being affected by this problem. But the larger the Oracle environment, the longer this restoration would take. Typically, large organizations have the least tolerance for downtime."
That's precisely the description of the company run by the guy who has saved millions. This could be funny. Of course, my confidence in Oracle is also a bit shaken; bugs at a level this fundamental are not supposed to happen. I should be able to trust my DB vendor as much as I trust my stock broker. I know that my stock broker is not going to securitize worthless "liar loans", get a deceptive AAA rating for the resulting security from an auditing agency owned by the same bank as the brokerage, sell that security to me, and then bet against it with an insurance company. I must have the same level of confidence in my DB vendor, too.
So, can script kiddies poison DNS to point at their own VM with a
compromised SCN in it and a user that is already linked to? How about
if they can steal a backup VM, or a plain old backup of an XE used in
production? Does OCM world-publish enough info to know what to
attack? Are employees ever disaffected?
I wonder if it's only 11g that's affected by the bug or also any older versions? I don't remember reading anything about this in the last PSU patch notes for 10g... And yes, there are still people using prehistoric technology like 10g! ;-)
Strange, these are the patch notes for the PSU 11.2.0.3.1 (JAN2012), released a couple of days ago:
PSU 11.2.0.3.1 contains the following new fixes:
Automatic Storage Management
9703627 - 11.2.0.2: ROOT USE OF ASMCMD PLACES ALERT.LOG IN USER DIRECTORY
12620823 - SOL-SP64-11203:ASM INSTANCE HANG DURING CRS STACK STARTING ON THE SECOND NODE
12797765 - SOL_SP64: AFTER ALL DISKS FAILURE, DG CAN'T BE DISMOUNTED ON T2000-3
12905058 - REBOOT 2 CELL NODES, CHECKFILE FOUND CORRUPTION BLOCK IN 3 UNDO DATAFILES
12938841 - 11203_ASM_SOL_SP64:RACE BETWEEN ADD DISK AND DISMOUNT MAY CAUSE KFGUSENUM01
12950644 - RBAL HIT ORA-07445:[KFDGLOBALOPEN()+738], ASM INST ABORT
Generic
9873405 - ORA-600 DURING FAST REFRESH AFTER 11.2.0.1.0 TO 11.2.0.2.0 UPDATE.
High Availability
12718090 - LNX64-11203-RAC:DB FG RROC HIT ORA-00600[KCLCHKBLK_3]
12834027 - ORA-00600 [KJBMPRLST:SHADOW] & [KJBRASR:PKEY] IN A READ MOSTLY & SKIP LOCK ENV
12847466 - AROLTP-C: HANG SIGNATURE: 'GC CURRENT REQUEST'<='GC BUFFER BUSY ACQUIRE'
12861463 - RAC PERF: DEFAULT VALUE FOR _LM_SINGLE_INST_AFFINITY_LOCK SHOULD BE FALSE
12917230 - QUERY WITH TEMP TABLE TRANSFORMATION RUNS 5X SLOWER WAITING FOR REMASTERING
12998795 - AROLTP-C: HANG SIGNATURE: 'GC CURRENT REQUEST'<='GC BUFFER BUSY ACQUIRE'
13035804 - LACK OF DLM PSEUDO RECONFIGURATION TEXTUAL REASON
Oracle Space Management
13041324 - HCC ON ZFS AND PILLAR STORAGE
13492735 - DISALLOW ADDING NON-HCC DATAFILE TO HCC TABLESPACE
Oracle Virtual Operating System Services
13362079 - HCC SHOULD NOT BE ENABLED FOR NON ZFS/ PILLAR STORAGE ARRAY
> I wonder if it's only 11g that's affected by the bug or also any older
> versions? I don't remember reading anything about this in the last PSU patch
> notes for 10g... And yes, there are still people using prehistoric
> technology like 10g! ;-)
Mladen, see Bug 12371955 - Hot Backup can cause increased SCN growth
rate leading to ORA-600 [2252] errors [ID 12371955.8]
I think there is confusion because that was in 11.2.0.3, but is also
available as a patch 12371955 for earlier versions. They don't seem
to put the old patches in the new listing you posted.
Thanks, that MOS article helped to clear up the confusion a bit :-)
Looks like the bug was already fixed in the 11.2.0.3 server patch set.
And this is what they say about pre-11g versions:
"This fix is *NOT* required in any release prior to 11g.
For 11g onwards this fix is already included in various Patch Set
Updates and bundles as listed above."
> *getting even more confused*
> Matthias Hoys
As I understand it, there are several issues working together. The
propagation of the SCN among distributed databases appears to have been
around a long time, but never really caused a problem because of the
large range of the variable. The bug that brought the problem to a head
seems to be ALTER DATABASE BEGIN BACKUP, which would elevate the SCN too
fast. That would only really be a problem for a large system with many
links and heavy use of bcp style backups, where people back up whole
dbs with a snapshot rather than individual tablespaces, and the SCN
jumps propagating over the links would multiply the problem. Since it
could happen, but usually doesn't, they distribute a script that reports
a red, amber or green light, so most people get warm and fuzzy green
lights.
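To make the "light" idea concrete, here is a rough sketch of how such a check could work. It is not Oracle's script. It assumes the widely reported soft limit of 16,384 SCNs per second counted from 1988-01-01, and the amber/red thresholds (in days of headroom) are made-up values for illustration only.

```python
from datetime import datetime, timezone

# Assumption: soft limit of 16,384 SCNs per second since 1988-01-01,
# as publicly reported; Oracle's own check may compute it differently.
SCN_PER_SECOND = 16 * 1024
EPOCH = datetime(1988, 1, 1, tzinfo=timezone.utc)


def max_reasonable_scn(now):
    """Approximate highest SCN considered reasonable at time `now`."""
    return int((now - EPOCH).total_seconds()) * SCN_PER_SECOND


def headroom_days(current_scn, now):
    """Days of headroom left if the SCN stopped advancing right now."""
    return (max_reasonable_scn(now) - current_scn) / (SCN_PER_SECOND * 86400)


def light(current_scn, now, amber_days=62, red_days=10):
    """Hypothetical thresholds: red below red_days, amber below amber_days."""
    days = headroom_days(current_scn, now)
    if days < red_days:
        return "red"
    if days < amber_days:
        return "amber"
    return "green"
```

The current SCN would come from the database itself (for example the CURRENT_SCN column of V$DATABASE); the sketch only does the arithmetic.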
But now that we know that, it is a simple matter to poison a system by
hacking the controlfiles of an obscure database, then propagate with a
mere access over a link. You don't need the unpatched backup to have
the problem happen, someone can make it happen. It may just be a
matter of time until it gets to the script-kiddie point (I haven't
looked yet this morning).
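A toy model of that propagation, purely for illustration: it assumes that every access over a link leaves both ends at the higher of the two SCNs, and it ignores any sanity checks Oracle applies before accepting a remote SCN. The database names and numbers are invented.

```python
def synchronize(scn, links):
    """Spread the highest SCN across every chain of linked databases.

    scn:   dict mapping database name -> current SCN
    links: list of (db_a, db_b) pairs that talk over a database link
    Assumption: each link access bumps both ends to the higher SCN.
    """
    scn = dict(scn)  # work on a copy, leave the caller's data alone
    changed = True
    while changed:
        changed = False
        for a, b in links:
            high = max(scn[a], scn[b])
            if scn[a] != high or scn[b] != high:
                scn[a] = scn[b] = high
                changed = True
    return scn


before = {"obscure": 9000, "hr": 120, "erp": 150, "dw": 90, "isolated": 80}
links = [("obscure", "hr"), ("hr", "erp"), ("erp", "dw")]
after = synchronize(before, links)
```

In this model one tampered database is enough to raise every database reachable through a chain of links, while a database with no links keeps its own SCN.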
This bug is not nearly as risky as the InfoWorld article made out. On
its own it is not likely to occur. As far as a DoS attack goes, if you
have proper control of your network and do not allow remote
non-controlled databases to link into yours, then you can wait the time
it takes to upgrade/patch to a protected version in the normal course of
business.
You can implement monitoring of your SCN and spit out an alert
or other form of warning message to identify an attack taking place.
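A minimal sketch of that kind of monitoring, assuming you already collect (timestamp, SCN) samples on a schedule, for example by polling the CURRENT_SCN column of V$DATABASE. The function name and the rate threshold are hypothetical; a sensible threshold depends on your own workload.

```python
def scn_alerts(samples, max_rate=1000.0):
    """Flag intervals where the SCN grew faster than max_rate per second.

    samples:  list of (epoch_seconds, scn) tuples, ordered by time
    max_rate: hypothetical alert threshold in SCNs per second
    Returns a list of (timestamp, observed_rate) for each offending interval.
    """
    alerts = []
    for (t0, s0), (t1, s1) in zip(samples, samples[1:]):
        rate = (s1 - s0) / (t1 - t0)
        if rate > max_rate:
            alerts.append((t1, rate))
    return alerts


# Three quiet minutes, then a sudden jump in the fourth sample.
samples = [(0, 1000), (60, 7000), (120, 13000), (180, 6013000)]
alerts = scn_alerts(samples)
```

A sudden jump like the last sample is what you would expect to see if a remote database pushed a much higher SCN across a link.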
> IMHO -- Mark D Powell --
Ah, I missed the bit about ORA-600 on the victim db if you went over a
reasonable SCN. Going through Bug 11767824: HIGH SCN VALUES / ORA-600
[2252] ERRORS while trying to understand what Jonathan said in some
places helped me understand much more. That shows the issue was there
in 10g, even if it was kind of solved (or at least known and
trackable) there, and was then made worse by the backup bug in 11g.
According to InfoWorld, a partial patch is in the January 2012 CPU. The magazine has issued an update that mentions another potential way the issue can be triggered. The actual bug related to manual hot backups is apparently limited to two releases per the article, which also contains a link to an article on what the SCN is and how Oracle uses it.
On Wed, 25 Jan 2012 07:56:29 -0800, Mark D Powell wrote:
> According to InfoWorld, a partial patch is in the January 2012 CPU. The
> magazine has issued an update that mentions another potential way the
> issue can be triggered. The actual bug related to manual hot backups is
> apparently limited to two releases per the article, which also contains
> a link to an article on what the SCN is and how Oracle uses it.