Bloom filters

Showing 1-5 of 5 messages
Bloom filters Mladen Gogala 12/9/12 11:53 AM
I have first heard of Bloom filters when I started reading about Exadata.
Bloom filtering is a probabilistic algorithm which for the given set can
very quickly decide whether an item at hand is a member or not. The
algorithm can produce false postives, but not false negatives.

http://en.wikipedia.org/wiki/Bloom_filter

The main use in the database, or so I thought, was for Exadata to search
the blocks requested by the DB query. However, a recent discussion on the
Metalink community forum has produced the following:

**************************************************************************
The Oracle database makes use of Bloom filters in the following 4
situations:

- To reduce data communication between slave processes in parallel joins:
mostly in RAC
- To implement join-filter pruning: in partition pruning, the optimizer
analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded
partitions when building the partition access list
- To support result caches: when you run a query, Oracle will first see
if the results of that query have already been computed and cached by
some session or user, and if so, it will retrieve the answer from the
server result cache instead of gathering all of the database blocks
- To filter members in different cells in Exadata: Exadata performs joins
between large tables and small lookup tables, a very common scenario for
data warehouses with star schemas. This is implemented using Bloom
filters as to determine whether a row is a member of the desired result
set.

You can identify a bloom pruning in a plan when you see :BF0000 in the
Pstart and Pstop columns of the execution plan and PART JOIN FILTER
CREATE in the operations column:

Two hints can be used: px_join_filter and no_px_join_filter; and there
are a couple of views for monitoring Bloom filters:

- v$sql_join_filter: information about Bloom filters; number of rows
filtered out and probed by Bloom filters
- v$pq_tqstat: check reduced communication due to usage of Bloom filters
The Oracle database makes use of Bloom filters in the following 4
situations:

- To reduce data communication between slave processes in parallel joins:
mostly in RAC
- To implement join-filter pruning: in partition pruning, the optimizer
analyzes FROM and WHERE clauses in SQL statements to eliminate unneeded
partitions when building the partition access list
- To support result caches: when you run a query, Oracle will first see
if the results of that query have already been computed and cached by
some session or user, and if so, it will retrieve the answer from the
server result cache instead of gathering all of the database blocks
- To filter members in different cells in Exadata: Exadata performs joins
between large tables and small lookup tables, a very common scenario for
data warehouses with star schemas. This is implemented using Bloom
filters as to determine whether a row is a member of the desired result
set.

You can identify a bloom pruning in a plan when you see :BF0000 in the
Pstart and Pstop columns of the execution plan and PART JOIN FILTER
CREATE in the operations column:

Two hints can be used: px_join_filter and no_px_join_filter; and there
are a couple of views for monitoring Bloom filters:

- v$sql_join_filter: information about Bloom filters; number of rows
filtered out and probed by Bloom filters
- v$pq_tqstat: check reduced communication due to usage of Bloom filters
**************************************************************************

The author is an Oracle employee, so I searched the Metalink. The search
has produced the document 912330.1 which, while less detailed than the
post above still confirms it. Of course, as with all new features, there
are "features":

Bug 12637294: BLOOM PARTITION PRUNING MAY CAUSE DEADLOCK ORA-60

Funny thing about this particular "feature" is that it is marked as fixed
in release 12.1.

There are also some good blog articles, like this one:

http://perfexpert2.wordpress.com/2011/05/21/undocumented-parameter/


--
Mladen Gogala
The Oracle Whisperer
http://mgogala.byethost5.com
Re: Bloom filters joel garry 12/10/12 8:34 AM
On Dec 9, 11:53 am, Mladen Gogala <gogala.mla...@gmail.com> wrote:

...
> Two hints can be used: px_join_filter and no_px_join_filter; and there
> are a couple of views for monitoring Bloom filters:
>
> - v$sql_join_filter: information about Bloom filters; number of rows
> filtered out and probed by Bloom filters
> - v$pq_tqstat: check reduced communication due to usage of Bloom filters
> **************************************************************************
>
> The author is an Oracle employee, so I searched the Metalink. The search
> has produced the document 912330.1 which, while less detailed than the
> post above still confirms it. Of course, as with all new features, there
> are "features":
>
> Bug 12637294: BLOOM PARTITION PRUNING MAY CAUSE DEADLOCK ORA-60
>
> Funny thing about this particular "feature" is that it is marked as fixed
> in release 12.1.

Those "fixed in" statements can be misleading, you have to look for
backports.  You can get the 11.2.0.3 patch for exadata, or, can I
believe my eyes, hp-ux Itanium.

jg
--
@home.com is bogus.
http://video-games.pricegrabber.com/cdos+readme+first/results.html/
Re: Bloom filters Mladen Gogala 12/10/12 9:59 AM
On Mon, 10 Dec 2012 08:34:03 -0800, joel garry wrote:

> Those "fixed in" statements can be misleading, you have to look for
> backports.  You can get the 11.2.0.3 patch for exadata, or, can I
> believe my eyes, hp-ux Itanium.

It looks like Oracle is starting to use Bloom filters extensively. The
first thing that comes to mind are "NOT IN" queries, since Bloom filter
doesn't generate false negatives. We may see many, many more
implementations of this algorithm in the future.
Re: Bloom filters Vsevolod Afanassiev 1/10/13 9:26 PM
Years ago we had several 10.2.0.3 crashes caused by Bloom filter. Oracle support recommended setting _bloom_filter_enabled = false. The crashes were caused by parallel query:

ORA-00600: internal error code, arguments: [17114], [0x70000011C7B2C30], [], [], [], [], [], []
ORA-10388: parallel query server interrupt (failure)

=== From Metalink Note 389621.1 ===
===================================
Applies to:
Oracle Server - Enterprise Edition - Version: 10.2.0.1 to 10.2.0.3
This problem can occur on any platform.

Symptoms

Database hangs, is not available for users, and possible to crash after a while. This condition has also been observed by pressing Control-C in a SQL*Plus session.

This is an intermittent problem

Alert log shows errors:
ORA-00600: internal error code, arguments: [KGHLKREM1], [0x7DA72C420], [], [], [], [], [], []

Trace files show HEAP corruption errors, and show the errors
occurring when pinning an object in memory. Additionally, there is much Parallel Query processing

We can see different Call stack:

kgesic1 kghfrunp kghfnd kghalo kghsupmm kghssgai qesblAlo qesblGetFilter qerblFetch . . .
or
kghalo kspcrec ksucre kxfpProcessJoin
or
kghnerror kghadd_reserved_extent kghget_reserved_ext ent kghgex ent kghfnd kghalo kghgex kghalf kghalp kghssgai kggsmInitCompact 1360 qkssmMigrateExec

Cause
This is the same issue as in
Bug 5914711 DATABASE HANG WITH MANY ORA-600 [KGHLKREM1] ERROR

The cause of the error is due the following unpublished bug.

Unpublished Bug 5736850 ABSTRACT: MULTI KILL SESSION LEADS TO SGA CORRUPTION

Fixed In Ver: 11.1

Solution

Workaround is to set _bloom_filter_enabled=FALSE in init.ora.

It is often possible to tell that bloom filters are being used because of the presence of qerblRop in the stack or because there are memory chunks in the trace with the label ’qesblFilter_seg’.

A bloom filter is a generic(non-Oracle) data structure/algorithm used for looking up elements in a set. You can find detailed information by searching for "bloom filter" in google or any other search engine. In Oracle 10.2 onwards, parallel query can use this operation internally if the parameter is true.

This might have a slight negative impact on PQ performance but since this is a new 10gR2 feature, it may not be noticeable.

The fix for the bug will be in 10.2.0.4 and Rel. 11. This fix changes a fixed sga structure and introduces a new wait event so cannot be backported.

References
Bug 5914711 - DATABASE HANG WITH MANY ORA-600 [KGHLKREM1] ERROR

Errors
ORA-600[KGHLKREM1]
===============================


Re: Bloom filters dombrooks 1/11/13 2:17 AM
Amongst other things, you should see that, for example, subquery pruning has mainly been replaced by bloom filtering in recent versions.

Christian Antognini has an excellent paper on bloom filters:
http://antognini.ch/papers/BloomFilters20080620.pdf