Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

DB2 - Issue with Hash Partitioned Table

411 views
Skip to first unread message

james...@gmail.com

unread,
Nov 8, 2013, 8:22:48 PM11/8/13
to
I am pretty new in DB2-partitioned world:

Using DB2 AESE 10.1, with 6 nodes (3 partitions each) = Total of 18 partitions,
I have created an *autoconfigure database*, along with *tablespace with automatic storage*.
Defining a simple table (just 2 columns) which is Hash-Partitioned on its PK, I've loaded 500M tuples (12GB size in total).
Now once I go into partitions, I see that TS folders have more-or-less the same size across all partitions, but when I try to get exact number of tuples per partition, using DATAPARTITIONNUM function, I see that all tuples are shown to be in first partition p0.

I am confused as I wanna make sure that my hash partitioned table is distributed properly, and I can get correct stats from it.

Any advice/suggestion as where I could be wrong ?

Below you can see my DDLs:

CREATE DATABASE SOCIALDB ON
/sdb/space/db2Device,/sdc/space/db2Device,/sdd/space/db2Device pagesize 32768 autoconfigure apply none

connect to SOCIALDB

create tablespace social_datax managed by automatic storage
create tablespace social_idx managed by automatic storage

CREATE TABLE myTable (id INTEGER NOT NULL, Name VARCHAR(40), PRIMARY KEY
(id) ) in social_datax index in social_idx DISTRIBUTE BY HASH(id)

load from /home/db2-my-gen/data.csv of del insert into myTable


When I run following command (to see how many rows, each partition
has), I see all rows are on first partition (p0):

SELECT DATAPARTITIONNUM(id) as Partition , COUNT(*) as Total_Rows FROM
myTable group by DATAPARTITIONNUM(id) order by DATAPARTITIONNUM(id)

PARTITION TOTAL_ROWS
----------- -----------
0 500000000

1 record(s) selected.


Thnx in advance,
James

Mark A

unread,
Nov 9, 2013, 9:46:59 AM11/9/13
to
I am not sure what the defaults are with automatic storage tablespaces, but ordinarily, you must first
define a partition group with CREATE DATABASE PARTITION GROUP that you indicate which nodes will
be involved in that partition group (probably all of them in your case). Then when you create the
tablespace, you indentify the partition group.

Typically, it is good to store small tables in a partition group that is only assigned to partition 0. This
typcially is the default partition group.

To check your complete configuration, I would run db2look -d <dbname> -e -a -l
last parm is letter l (lower case)

This will show you the complete tablespace definition and which partition groups are being used, and
whether that partition group is using all nodes or only node 0.

james...@gmail.com

unread,
Nov 10, 2013, 6:19:26 PM11/10/13
to
Thanks for reply.

Following your suggestion, I created a database partition group as:

CREATE DATABASE PARTITION GROUP SOCIAL_DAT ON ALL DBPARTITIONNUMS

and I re-created my tablespaces within that:

create regular tablespace social_datax in SOCIAL_DAT MANAGED BY AUTOMATIC STORAGE AUTORESIZE YES
create regular tablespace social_idx in SOCIAL_DAT MANAGED BY AUTOMATIC STORAGE AUTORESIZE YES

CREATE TABLE myTable (id INTEGER NOT NULL, Name VARCHAR(40), PRIMARY KEY (id) ) in social_datax index in social_idx compress yes adaptive

But again after I load, It seems all the tuples are going into partition 0 .

I used db2look, and it seems my tablespaces and table are created as expected (see below).

Even when I use OS command (du -sh) I see that NOD folders on all partitions are more or less of same size, but DATAPARTITIONNUM still tells me all the tuples are in first partition.

-bash-3.2$ db2look -d socialdb -e -a -l
-- Generate statistics for all creators
-- Creating DDL for table(s)
-- This CLP file was created using DB2LOOK Version "10.1"
-- Timestamp: Sun 10 Nov 2013 03:08:32 PM PST
-- Database Name: SOCIALDB
-- Database Manager Version: DB2/LINUXX8664 Version 10.1.3
-- Database Codepage: 1208
-- Database Collating Sequence is: IDENTITY

-- Binding package automatically ...
-- Bind is successful
-- Binding package automatically ...
-- Bind is successful

CONNECT TO SOCIALDB;
-- Binding package automatically ...
-- Bind is successful

-----------------------------------
-- DDL Statements for DATABASE PARTITION GROUPS --
-----------------------------------

CREATE DATABASE PARTITION GROUP "SOCIAL_DAT" ON DBPARTITIONNUMS
(0,
1,
2,
3,
4,
5,
6,
7,
8,
9,
10,
11,
12,
13,
14,
15,
16,
17);


--------------------------
-- Mimic Storage groups --
--------------------------

ALTER STOGROUP "IBMSTOGROUP"
OVERHEAD 6.725000
DEVICE READ RATE 100.000000
DATA TAG NONE
SET AS DEFAULT;

-- Warning:Remove the storage paths defined during database creation time
-- And un comment the remaining lines in the below DDL statement

-- ALTER STOGROUP "IBMSTOGROUP" ADD '/mnt/data/sdb/space/db2Device',
-- '/mnt/data/sdc/space/db2Device',
-- '/mnt/data/sdd/space/db2Device';

------------------------------------
-- DDL Statements for TABLESPACES --
------------------------------------


CREATE REGULAR TABLESPACE "SOCIAL_DATAX" IN DATABASE PARTITION GROUP SOCIAL_DAT
PAGESIZE 32768 MANAGED BY AUTOMATIC STORAGE
USING STOGROUP "IBMSTOGROUP"
AUTORESIZE YES
INITIALSIZE 32 M
MAXSIZE NONE
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL "IBMDEFAULTBP"
DATA TAG INHERIT
OVERHEAD INHERIT
TRANSFERRATE INHERIT
NO FILE SYSTEM CACHING
DROPPED TABLE RECOVERY ON;


CREATE REGULAR TABLESPACE "SOCIAL_IDX" IN DATABASE PARTITION GROUP SOCIAL_DAT
PAGESIZE 32768 MANAGED BY AUTOMATIC STORAGE
USING STOGROUP "IBMSTOGROUP"
AUTORESIZE YES
INITIALSIZE 32 M
MAXSIZE NONE
EXTENTSIZE 32
PREFETCHSIZE AUTOMATIC
BUFFERPOOL "IBMDEFAULTBP"
DATA TAG INHERIT
OVERHEAD INHERIT
TRANSFERRATE INHERIT
NO FILE SYSTEM CACHING
DROPPED TABLE RECOVERY ON;


CREATE LARGE TABLESPACE "SYSTOOLSPACE" IN DATABASE PARTITION GROUP IBMCATGROUP
PAGESIZE 32768 MANAGED BY AUTOMATIC STORAGE
USING STOGROUP "IBMSTOGROUP"
AUTORESIZE YES
INITIALSIZE 32 M
MAXSIZE NONE
EXTENTSIZE 4
PREFETCHSIZE AUTOMATIC
BUFFERPOOL "IBMDEFAULTBP"
DATA TAG INHERIT
OVERHEAD INHERIT
TRANSFERRATE INHERIT
NO FILE SYSTEM CACHING
DROPPED TABLE RECOVERY ON;


----------------------
-- Mimic tablespace --
----------------------

ALTER TABLESPACE "SYSCATSPACE"
PREFETCHSIZE AUTOMATIC
OVERHEAD INHERIT
NO FILE SYSTEM CACHING
AUTORESIZE YES
TRANSFERRATE INHERIT;


ALTER TABLESPACE "SYSCATSPACE"
USING STOGROUP "IBMSTOGROUP";


ALTER TABLESPACE "TEMPSPACE1"
PREFETCHSIZE AUTOMATIC
OVERHEAD INHERIT
FILE SYSTEM CACHING
TRANSFERRATE INHERIT;


ALTER TABLESPACE "USERSPACE1"
PREFETCHSIZE AUTOMATIC
OVERHEAD INHERIT
NO FILE SYSTEM CACHING
AUTORESIZE YES
TRANSFERRATE INHERIT
DATA TAG INHERIT;


ALTER TABLESPACE "USERSPACE1"
USING STOGROUP "IBMSTOGROUP";


------------------------------------------------
-- DDL Statements for Schemas
------------------------------------------------

-- Running the DDL below will explicitly create a schema in the
-- new database that corresponds to an implicitly created schema
-- in the original database.

CREATE SCHEMA "DB2INST ";



------------------------------------------------
-- DDL Statements for Table "DB2INST "."MYTABLE"
------------------------------------------------


CREATE TABLE "DB2INST "."MYTABLE" (
"ID" INTEGER NOT NULL ,
"NAME" VARCHAR(40) )
COMPRESS YES ADAPTIVE
DISTRIBUTE BY HASH("ID")
IN "SOCIAL_DATAX" INDEX IN "SOCIAL_IDX" ;


-- DDL Statements for Primary Key on Table "DB2INST "."MYTABLE"

ALTER TABLE "DB2INST "."MYTABLE"
ADD PRIMARY KEY
("ID");









COMMIT WORK;

CONNECT RESET;

TERMINATE;

Mark A

unread,
Nov 11, 2013, 1:36:01 AM11/11/13
to
On Sun, 10 Nov 2013 15:19:26 -0800, james.pirz wrote:

> Thanks for reply.
>
> Following your suggestion, I created a database partition group as:
>
> CREATE DATABASE PARTITION GROUP SOCIAL_DAT ON ALL DBPARTITIONNUMS
>
> and I re-created my tablespaces within that:
>
> create regular tablespace social_datax in SOCIAL_DAT MANAGED BY
> AUTOMATIC STORAGE AUTORESIZE YES create regular tablespace social_idx in
> SOCIAL_DAT MANAGED BY AUTOMATIC STORAGE AUTORESIZE YES
>
> CREATE TABLE myTable (id INTEGER NOT NULL, Name VARCHAR(40), PRIMARY KEY
> (id) ) in social_datax index in social_idx compress yes adaptive
>
> But again after I load, It seems all the tuples are going into partition
> 0 .

Although not related to your problem, you should not create a "regular" tablespace. Just say "create
tablespace" or "create large tablespace." This is especially true when you have 32K page size. Large
tablespaces are the default now.

I don't know what is wrong with the partitioning setup, but please show us the contents of the
db2nodes.cfg file ($HOME/sqllib/db2nodes.cfg).

Larry

unread,
Nov 11, 2013, 1:42:46 PM11/11/13
to
Have a look at this:
http://www.ibm.com/developerworks/data/library/techarticle/dm-1005partitioningkeys/index.html

especially "From the best practices perspective, the table data skew on
every individual database partition should be no more than 10%. To
achieve this goal, the partitioning key should be selected on the
columns that have high cardinality, or in other words, that contain a
large number of distinct values".

--
Larry E.

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Mark A

unread,
Nov 11, 2013, 1:50:59 PM11/11/13
to
On Mon, 11 Nov 2013 13:42:46 -0500, Larry wrote:

> Have a look at this:
> http://www.ibm.com/developerworks/data/library/techarticle/dm-1005partitioningkeys/index.html
>
> especially "From the best practices perspective, the table data skew on
> every individual database partition should be no more than 10%. To
> achieve this goal, the partitioning key should be selected on the
> columns that have high cardinality, or in other words, that contain a
> large number of distinct values".

It looks to me like the partitioning key is the PK, which is obviously unique.

james...@gmail.com

unread,
Nov 11, 2013, 2:12:31 PM11/11/13
to
My partitioning key is PK, and its range is [1,500M], so that should not be a problem.

I also ran
REORG TABLE myTable ON ALL DBPARTITIONNUMS
but it did not change anything.

Moreover, I made sure that my data folders, on all partitions have proper R/W permissions.

My db2nodes.cfg file is as below (I have 6 machines, named vader-0XX , and I have 3 partitions per machine (one per disk) ). My machines are vader-002, vader-003, vader-006, vader-007, vader-008 and vader-010.

0 vader-002 0
1 vader-002 1
2 vader-002 2
3 vader-003 0
4 vader-003 1
5 vader-003 2
6 vader-006 0
7 vader-006 1
8 vader-006 2
9 vader-007 0
10 vader-007 1
11 vader-007 2
12 vader-008 0
13 vader-008 1
14 vader-008 2
15 vader-010 0
16 vader-010 1
17 vader-010 2


Pouria

Mark A

unread,
Nov 11, 2013, 3:37:21 PM11/11/13
to
On Fri, 08 Nov 2013 17:22:48 -0800, james.pirz wrote:

> SELECT DATAPARTITIONNUM(id) as Partition , COUNT(*) as Total_Rows FROM
> myTable group by DATAPARTITIONNUM(id) order by DATAPARTITIONNUM(id)
>
> PARTITION TOTAL_ROWS
> ----------- -----------
> 0 500000000
>
> 1 record(s) selected.
>
>
> Thnx in advance,
> James

You are using the wrong scalar function. The one you are using is for Table (range) parititions. For DPF
partition numbers, use DBPARTITIONNUM.

james...@gmail.com

unread,
Nov 11, 2013, 3:54:35 PM11/11/13
to
Mark, you are absolutely right !

Switching to DBPARTITIONNUM(), now I see data is uniformly distributed across my 18 partitions.

Thanks for looking into it, and your suggestions.
:)
0 new messages