Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

How do I parse one column into different columns?

9 views
Skip to first unread message

Kay E

unread,
Dec 15, 2006, 11:04:13 PM12/15/06
to
I've inherited a big spreadsheet with a column that is full of entries
like

"Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN Symptom :
SORENESS Cause : STRAIN Rx : TYLENOL"

Does anyone know of a good way to change this column into three
separate columns like

Symptom1 Cause1 Rx1 Symptom2 Cause2 Rx2
PAIN INFECTION AMOXICILLIN SORENESS STRAIN TYLENOL

Thanks for any suggestions. (I've looked at a few books and couldn't
find an example, and I thought somebody out there may have encountered
this problem...)

Kay

Jef Gorbach

unread,
Dec 16, 2006, 3:18:11 AM12/16/06
to
\tools\data\text to columns, delimiting using : should do it

"Kay E" <k...@nosp.com> wrote in message
news:80s6o2p21is6vnqbl...@4ax.com...

macropod

unread,
Dec 16, 2006, 4:46:56 AM12/16/06
to
Hi Kay,

If your data are actually structured like:


Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN
Symptom : SORENESS Cause : STRAIN Rx : TYLENOL

and not all on one line, like:


Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN Symptom : SORENESS Cause : STRAIN Rx : TYLENOL

then, provided each category consists of one word only, it's relatively easy.

With:


Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN
Symptom : SORENESS Cause : STRAIN Rx : TYLENOL

the first step would be to use Date|Text to Columns|Delimited> check Space & Other. For other, insert a colon (:)
Click Finish.

This will give six columns


Symptom|PAIN|Cause|INFECTION|Rx|AMOXICILLIN
Symptom|SORENESS|Cause|STRAIN|Rx |TYLENOL

Now, insert a blank row and copy the new second row and paste it into the worksheet at B1
Finally, select columns A, C, E & G (eg select A then hold down the Ctrl key as you select C, E & G) then press delete.

Voila! Data in three columns with headers.

If your data structure isn't so simple, it'll take more work to parse

Cheers


--
macropod
[MVP - Microsoft Word]


"Kay E" <k...@nosp.com> wrote in message news:80s6o2p21is6vnqbl...@4ax.com...

Dave Peterson

unread,
Dec 16, 2006, 5:56:27 AM12/16/06
to
If the descriptions (like PAIN, INFECTION, ...) can be more than one word, then
data|Text to columns delimited by both colon and space will cause trouble.

I'd do a little housecleaning first.
Copy that column (just in case) to another column.
Select that new column and do a series of edit|Replaces

Edit|Replace
what: Symptom_:_ (where _ represents a space character)
with: | (some character that isn't used in that column)
replace all

And repeat with "Cause : ", "Rx :", etc.

Then use data|Text to columns to separate the fields (delimited by that |
character).

And add the headers manually.

--

Dave Peterson

Ron Rosenfeld

unread,
Dec 16, 2006, 7:05:14 AM12/16/06
to

Your description is unclear.

1. Since your column contents is within quotes, am I correct in assuming that
your example is all in one cell?

2. In your desired result, you write "three" separate columns; but you show
"six" column headings. Can you resolve this discrepancy?

3. Are the contents (e.g. PAIN INFECTION etc) always a single word, or might
they be several words?
--ron

Kay E

unread,
Dec 16, 2006, 10:05:04 AM12/16/06
to
On Sat, 16 Dec 2006 07:05:14 -0500, Ron Rosenfeld
<ronros...@nospam.org> wrote:

>On Fri, 15 Dec 2006 22:04:13 -0600, Kay E <k...@nosp.com> wrote:
>
>>I've inherited a big spreadsheet with a column that is full of entries
>>like
>>
>>"Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN Symptom :
>>SORENESS Cause : STRAIN Rx : TYLENOL"
>>
>>Does anyone know of a good way to change this column into three
>>separate columns like
>>
>>Symptom1 Cause1 Rx1 Symptom2 Cause2 Rx2
>>PAIN INFECTION AMOXICILLIN SORENESS STRAIN TYLENOL
>>
>>Thanks for any suggestions. (I've looked at a few books and couldn't
>>find an example, and I thought somebody out there may have encountered
>>this problem...)
>>
>>Kay
>
>Your description is unclear.
>
>1. Since your column contents is within quotes, am I correct in assuming that
>your example is all in one cell?

Yes, that particular column has over 8,000 one-line cells like

Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN Symptom : SORENESS
Cause : STRAIN Rx : TYLENOL

i.e. all on one line. It was exported out of some database, which has
been lost. The only thing that's consistent is the prompts "Symptom
:", "Cause :", "Rx :"

So

"Symptom : PAIN Cause : INFECTION Rx : AMOXICILLIN" refers to the
first group of data
"Symptom : SORENESS Cause : STRAIN Rx : TYLENOL" refers to the
second group of data

etc. Actually, some cells have as many as 5 groups of data, but if I
know how to do 2 groups, the other ones should be similar (right?).

>2. In your desired result, you write "three" separate columns; but you show
>"six" column headings. Can you resolve this discrepancy?

Yes, you're right. It should be 6 distinct columns - not 3.

>3. Are the contents (e.g. PAIN INFECTION etc) always a single word, or might
>they be several words?
>--ron

Many data entries have more than one word - so instead of just "PAIN",
some people entered "PATIENT COMPLAINED OF HEADACHE", etc., or
instead of "AMOXICILLIN" it's "QUINACRINE 7 DAYS" etc.

Kay

Ron Rosenfeld

unread,
Dec 16, 2006, 2:57:43 PM12/16/06
to

Well, you can do it with formulas or with VBA. I think using a VBA solution is
easier to debug and support, so that's what I'll give you.

This routine uses Regular Expressions. It also assumes that your data types
(e.g. Symptom, Cause, etc) consist of a

space or beginning of line
Single Word
one or more spaces
Colon
one or more spaces

It also assumes that the groups of data always have the three (Symptom, Cause,
Rx) and always in that order.

The routine will take the line in the selected cell, and parse out the data
into the adjacent columns.

If it works with your real data, you should be able to modify it to step
through your entire data set.

But some debugging may be required.

To enter the routine, <alt-F11> opens the VB Editor. Ensure your project is
highlighted in the Project Explore window, then Insert/Module and paste the
code below into the window that opens.

Select Tools/References from the top menu and Select Microsoft VBScript Regular
Expressions 5.5 from the drop down list.

Then select a cell with the data in it. <alt-F8> opens the macro dialog box.
Select the macro and <RUN>.

Let's see what happens. It should handle multiple word descriptions, and as
many data sets as are in the cell.

If the quote marks are also in the cell, it will include the terminal quote,
but that's a simple fix, if that is the only quote mark in the cell.

=================================================
Option Explicit

Sub ParseData()
'Be sure to set Reference to Microsoft VBScript Regular Expressions 5.5
'See Tools/References on Main Menu Bar

Dim objRegExp As RegExp
Dim objMatch As Match
Dim colMatches As MatchCollection
Dim Match As Match
Dim Pattern As String

Dim Str As String
Dim c As Range
Dim i As Long 'counter

' Create a regular expression object.
Set objRegExp = New RegExp

objRegExp.Pattern = Pattern

' Set Case Insensitivity.
objRegExp.IgnoreCase = True

'Set global applicability.
objRegExp.Global = True

'set multiline
objRegExp.MultiLine = True

'Set Pattern to pick up data types
objRegExp.Pattern = "(\b\w+\s+:\s+)(.*?)(?=(\b\w+\s+:\s+)|$)"

For Each c In Selection 'set the range to parse here

Str = c.Text

'Test whether the String can be compared.
If (objRegExp.Test(Str) = True) Then

'Get the matches.
Set colMatches = objRegExp.Execute(Str) ' Execute search.

i = 1
For Each Match In colMatches
c.Offset(0, i).Value = Match.SubMatches(1)
i = i + 1
Next Match

End If

Next c


End Sub
=======================================
--ron

Kay E

unread,
Dec 16, 2006, 4:20:01 PM12/16/06
to
On Sat, 16 Dec 2006 14:57:43 -0500, Ron Rosenfeld
<ronros...@nospam.org> wrote:

Wow... It works!!! That's just awesome.

Thank you so very much!

Kay

Ron Rosenfeld

unread,
Dec 17, 2006, 12:03:06 AM12/17/06
to
On Sat, 16 Dec 2006 15:20:01 -0600, Kay E <k...@nosp.com> wrote:

>Wow... It works!!! That's just awesome.
>
>Thank you so very much!
>
>Kay

You're very welcome. Thanks for the feedback. But the fact that it works is
due in no small part to YOU being able to accurately convey your requirements.
That's usually the most dificult part of these kinds of things.

Best wishes,
--ron

0 new messages