Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Separate word in a field

0 views
Skip to first unread message

RJ

unread,
Dec 8, 2006, 3:06:06 AM12/8/06
to
Hi All,

I have a table (table1) with 2 columns (ID, keywords)
The ID column is the PK (values 1..n), and the keywords column is
varchar with words separated with a space. EG:

ID KEYWORD
---- -----------------

1 CAR VEHICLE
2 CHICKEN FOOD CUISINE
3 HOUSE PROPERTY CONSTRUCTION

I need to separate the words to show like this:

ID KEYWORD
----- -----------------

1 CAR
1 VEHICLE
2 CHICKEN
2 FOOD
2 CUISINE

etc..

how can this be done in SQL? I know i can use the UNION command to show
each record, but how do I separate each word? I've tried using a
combination of Instr and substr but its not showing correctly.

Thanks in advance :)

Robert Klemme

unread,
Dec 8, 2006, 3:20:49 AM12/8/06
to

IMHO this is a typical case of bad data modeling. I would want to
correct that. I'd probably write a script that extracts the data and
writes out the proper separated values to a CSV file. Then I'd change
the table layout or constraints (depending on what's there and how it's
used) and import the data.

Kind regards

robert

RJ

unread,
Dec 8, 2006, 3:50:37 AM12/8/06
to
Yes it is bad data modeling, I'm trying to fix it.
If theres no simple way of doing it in SQL i'll have to do as you say
with the CSV file.

Any ideas?


On Dec 8, 10:20 am, Robert Klemme <shortcut...@googlemail.com> wrote:
> On 08.12.2006 09:06, RJ wrote:
>
>
>
> > Hi All,
>
> > I have a table (table1) with 2 columns (ID, keywords)
> > The ID column is the PK (values 1..n), and the keywords column is
> > varchar with words separated with a space. EG:
>
> > ID KEYWORD
> > ---- -----------------
>
> > 1 CAR VEHICLE
> > 2 CHICKEN FOOD CUISINE
> > 3 HOUSE PROPERTY CONSTRUCTION
>
> > I need to separate the words to show like this:
>
> > ID KEYWORD
> > ----- -----------------
>
> > 1 CAR
> > 1 VEHICLE
> > 2 CHICKEN
> > 2 FOOD
> > 2 CUISINE
>
> > etc..
>
> > how can this be done in SQL? I know i can use the UNION command to show
> > each record, but how do I separate each word? I've tried using a

> > combination of Instr and substr but its not showing correctly.IMHO this is a typical case of bad data modeling. I would want to

Charles Hooper

unread,
Dec 8, 2006, 6:53:56 AM12/8/06
to

On the surface, this appears to be very difficult. There are a couple
different ways that this can be done. Since you are working with INSTR
and SUBSTR, we will use that method.

First, the set up:
CREATE TABLE TABLE1 (ID NUMBER(10), KEYWORD VARCHAR2(255));
INSERT INTO TABLE1 VALUES (1,'CAR VEHICLE');
INSERT INTO TABLE1 VALUES (2,'CHICKEN FOOD CUISINE');
INSERT INTO TABLE1 VALUES (3,'HOUSE PROPERTY CONSTRUCTION');
And just for fun:
INSERT INTO TABLE1 VALUES (4,'MOOSE DUCK BEAR COW PIG DOG CAT MOUSE RAT
HAMSTER LION TIGER');

If we can trim the KEYWORD column to eliminate any leading and trailing
spaces, it will make the task much easier. To keep the task simple, we
will then add a space to the end of the KEYWORD column when making
comparisons.

SELECT
TRIM(KEYWORD)||' ' NEW_KEYWORD
FROM
TABLE1;

With this change, we can using INSTR to find the first, second, third,
etc. space in the KEYWORD column:
SELECT
ID,
INSTR(TRIM(KEYWORD)||' ',' ',1,1) SPACE_1,
INSTR(TRIM(KEYWORD)||' ',' ',1,2) SPACE_2,
INSTR(TRIM(KEYWORD)||' ',' ',1,3) SPACE_3,
INSTR(TRIM(KEYWORD)||' ',' ',1,4) SPACE_4,
INSTR(TRIM(KEYWORD)||' ',' ',1,5) SPACE_5,
INSTR(TRIM(KEYWORD)||' ',' ',1,6) SPACE_6,
INSTR(TRIM(KEYWORD)||' ',' ',1,7) SPACE_7,
INSTR(TRIM(KEYWORD)||' ',' ',1,8) SPACE_8,
INSTR(TRIM(KEYWORD)||' ',' ',1,9) SPACE_9,
INSTR(TRIM(KEYWORD)||' ',' ',1,10) SPACE_10,
INSTR(TRIM(KEYWORD)||' ',' ',1,11) SPACE_11,
INSTR(TRIM(KEYWORD)||' ',' ',1,12) SPACE_12,
INSTR(TRIM(KEYWORD)||' ',' ',1,13) SPACE_13,
TRIM(KEYWORD)||' ' NEW_KEYWORD
FROM
TABLE1;

ID S_1 S_2 S_3 S_4 S_5 S_6 S_7 S_8 S_9 S_10 S_11 S_12 S_13 NEW_KEYWORD
1 4 12 0 0 0 0 0 0 0 0 0 0 0 CAR VEHICLE
3 6 15 28 0 0 0 0 0 0 0 0 0 0 HOUSE
PROPERTY CONSTRUCTION
2 8 13 21 0 0 0 0 0 0 0 0 0 0 CHICKEN FOOD
CUISINE
4 6 11 16 20 24 28 32 38 42 50 55 61 0 MOOSE DUCK
BEAR COW PIG DOG CAT MOUSE RAT HAMSTER LION TIGER

Now that we know where the spaces are located in the KEYWORD column, if
we slide the above into an inline view, we can use SUBSTR to pull out
each of the keywords. If one of the SPACE_n values is 0, that
indicates that the keyword in the previous position does not exist, so
we need to use DECODE to check the SPACE_n value:
SELECT
ID,
DECODE(SPACE_1,0,NULL,SUBSTR(NEW_KEYWORD,1,SPACE_1-1)) SUB_KEYWORD1,
DECODE(SPACE_3,0,NULL,SUBSTR(NEW_KEYWORD,SPACE_2+1,SPACE_3-SPACE_2))
SUB_KEYWORD3,
NEW_KEYWORD
FROM
(SELECT
ID,
TRIM(KEYWORD)||' ' NEW_KEYWORD,
INSTR(TRIM(KEYWORD)||' ',' ',1,1) SPACE_1,
INSTR(TRIM(KEYWORD)||' ',' ',1,2) SPACE_2,
INSTR(TRIM(KEYWORD)||' ',' ',1,3) SPACE_3,
INSTR(TRIM(KEYWORD)||' ',' ',1,4) SPACE_4,
INSTR(TRIM(KEYWORD)||' ',' ',1,5) SPACE_5,
INSTR(TRIM(KEYWORD)||' ',' ',1,6) SPACE_6,
INSTR(TRIM(KEYWORD)||' ',' ',1,7) SPACE_7,
INSTR(TRIM(KEYWORD)||' ',' ',1,8) SPACE_8,
INSTR(TRIM(KEYWORD)||' ',' ',1,9) SPACE_9,
INSTR(TRIM(KEYWORD)||' ',' ',1,10) SPACE_10,
INSTR(TRIM(KEYWORD)||' ',' ',1,11) SPACE_11,
INSTR(TRIM(KEYWORD)||' ',' ',1,12) SPACE_12,
INSTR(TRIM(KEYWORD)||' ',' ',1,13) SPACE_13
FROM
TABLE1);

ID SUB_KEYWORD1 SUB_KEYWORD3 NEW_KEYWORD
1 CAR CAR VEHICLE
3 HOUSE CONSTRUCTION HOUSE PROPERTY CONSTRUCTION
2 CHICKEN CUISINE CHICKEN FOOD CUISINE
4 MOOSE BEAR MOOSE DUCK BEAR COW PIG DOG CAT MOUSE RAT
HAMSTER LION TIGER

If you make the above a named query in a SELECT statement using WITH,
you could then use a UNION ALL to pull out all of the non-null
SUB_KEYWORD1 values with their associated IDs, non-null SUB_KEYWORD2
values with their associated IDs, SUB_KEYWORD3 values with their
associated IDs, etc.

There are probably more efficient methods for accomplishing the above.

Charles Hooper
PC Support Specialist
K&M Machine-Fabricating, Inc.

RJ

unread,
Dec 8, 2006, 8:11:42 AM12/8/06
to
hey thanx man!


On Dec 8, 1:53 pm, "Charles Hooper" <hooperc2...@yahoo.com> wrote:
> RJ wrote:
> > Hi All,
>
> > I have a table (table1) with 2 columns (ID, keywords)
> > The ID column is the PK (values 1..n), and the keywords column is
> > varchar with words separated with a space. EG:
>
> > ID KEYWORD
> > ---- -----------------
>
> > 1 CAR VEHICLE
> > 2 CHICKEN FOOD CUISINE
> > 3 HOUSE PROPERTY CONSTRUCTION
>
> > I need to separate the words to show like this:
>
> > ID KEYWORD
> > ----- -----------------
>
> > 1 CAR
> > 1 VEHICLE
> > 2 CHICKEN
> > 2 FOOD
> > 2 CUISINE
>
> > etc..
>
> > how can this be done in SQL? I know i can use the UNION command to show
> > each record, but how do I separate each word? I've tried using a
> > combination of Instr and substr but its not showing correctly.
>

> > Thanks in advance :)On the surface, this appears to be very difficult. There are a couple

Maxim Demenko

unread,
Dec 8, 2006, 10:51:13 AM12/8/06
to RJ
RJ schrieb:

Yet another approach ( assuming, between all words is exactly one space,
otherwise, it can be adjusted to reflect multiple spaces):

scott@ORA102> create table table1
2 (
3 id number,
4 keyword varchar2(4000)
5 )
6 ;
scott@ORA102> insert into table1 (id, keyword)
2 values (1, 'CAR VEHICLE');
scott@ORA102> insert into table1 (id, keyword)
2 values (2, 'CHICKEN FOOD CUISINE');
scott@ORA102> insert into table1 (id, keyword)
2 values (3, 'HOUSE PROPERTY CONSTRUCTION');
scott@ORA102> with t as (
2 select id, keyword,length(keyword)-length(replace(keyword,' '))+1
words
3 from table1),
4 g as (select rownum rn
5 from dual
6 connect by level<=(select
max(length(keyword)-length(replace(keyword,' ')))+1 from table1))
7 select id,regexp_substr(keyword,'\w+',1,rn) word
8 from t,g
9 where g.rn<=t.words
10 order by id,rn
11 /

ID WORD
---------- ---------------


1 CAR
1 VEHICLE
2 CHICKEN
2 FOOD
2 CUISINE

3 HOUSE
3 PROPERTY
3 CONSTRUCTION


Best regards

Maxim

Charles Hooper

unread,
Dec 8, 2006, 11:45:14 AM12/8/06
to

Maxim, I prefer your SQL statement to the one that I posted, although
it would have been even more impressive if you would have explained how
it worked (it only takes 2 minutes of experimentation to see how it
works).

SELECT
ID,
REGEXP_SUBSTR(KEYWORD,'\w+',1,number_of_word_here) WORD
FROM
TABLE1;

I have been looking for a good reason to use REGEXP_SUBSTR, and it
looks like the above is a very good use for this function.

Maxim Demenko

unread,
Dec 8, 2006, 12:08:24 PM12/8/06
to Charles Hooper
Charles Hooper schrieb:

Yes, for the purpose string splitting into tokens it might be very handy
to use it. Here is another nice example (courtesy www.sql.ru ) which
splits a string into words

scott@ORA102> SELECT LEVEL word_number,regexp_substr(s,'\w+',1,LEVEL) word
2 FROM ( SELECT 'Use new regexp functions in oracle 10g' s FROM dual)
3 CONNECT BY regexp_substr(s,'\w+',1,LEVEL) IS NOT NULL
4 /

WORD_NUMBER WORD
----------- ---------------
1 Use
2 new
3 regexp
4 functions
5 in
6 oracle
7 10g

Best regards

Maxim

0 new messages