Deleting Characters from a String Variable

Ryan

unread,

Mar 24, 2010, 4:34:18 PM3/24/10

to

Hello:

I have a string variable that begins with the name of the month
followed by the date.

For example,

January 12/1/2010
March 4/3/1998
February 8/17/1990
.
.
.

What would be the simplest approach in SPSS to remove *only* the names
of the months from each cell? So, sticking with the example, I want it
to look like this:

12/1/2010
4/3/1998
8/17/1990
.
.
.

Any help would be appreciated!

Ryan

Bruce Weaver

unread,

Mar 24, 2010, 4:54:09 PM3/24/10

to

Hi Ryan. You can use CHAR.INDEX to find the position of the first
blank space in your string variable, and go from there. E.g.,

data list list / datestr(a25).
begin data

"January 12/1/2010"
"March 4/3/1998"
"February 8/17/1990"

end data.

string newstring(a10).
* Get position of first blank space.
compute #blankpos = char.index(datestr," ").

* Now use it to extract a substring from the original variable .

compute newstring = substr(datestr,#blankpos+1).

* Date and Time Wizard: newdate.
COMPUTE newdate=number(newstring, ADATE10).
VARIABLE LABELS newdate "".
VARIABLE LEVEL newdate (SCALE).
FORMATS newdate (ADATE10).
VARIABLE WIDTH newdate(10).

list.

HTH.

--
Bruce Weaver
bwe...@lakeheadu.ca
http://sites.google.com/a/lakeheadu.ca/bweaver/Home
"When all else fails, RTFM."

Ryan

unread,

Mar 24, 2010, 5:04:32 PM3/24/10

to

> bwea...@lakeheadu.cahttp://sites.google.com/a/lakeheadu.ca/bweaver/Home
> "When all else fails, RTFM."- Hide quoted text -
>
> - Show quoted text -

Perfect! Thanks!

ross.d...@gmail.com

unread,

Dec 12, 2013, 7:16:05 PM12/12/13

to

This is interesting, but what if there is no space separating the months and dates? How can you separate those into two columns?

Thanks

David Marso

unread,

Dec 13, 2013, 3:48:22 AM12/13/13

to

maybe
COMPUTE #1=CHAR.INDEX(var,"0123456789",1)-1.
COMPUTE #2=CHAR.INDEX(var," ").
verify that #1 and #2 match? Parse on #1 if not or maybe just use #1 and don't bother with #2 if you don't know that you have odd data.

Art Kendall

unread,

Dec 13, 2013, 8:02:32 AM12/13/13

to

data list list / datestr(a25).
begin data
"January 12/1/2010"
"March 4/3/1998"
"February 8/17/1990"

"January12/1/2010"
"March4/3/1998"
"February8/17/1990"
end data.
string workstring (a25).
compute workstring ="".
do repeat index= 1 to 25.
do if not range(substr(datestr,index,1), "A","Z","a", "z").
compute workstring =
concat(rtrim(workstring),substr(datestr,index,1)).
end if.
end repeat.
numeric mydate(adate10).
compute mydate = number(workstring,adate10).
list.

Art Kendall
Social Research Consultants

Bruce Weaver

unread,

Dec 13, 2013, 9:18:18 AM12/13/13

to

On 13/12/2013 8:02 AM, Art Kendall wrote:
> data list list / datestr(a25).
> begin data
> "January 12/1/2010"
> "March 4/3/1998"
> "February 8/17/1990"
> "January12/1/2010"
> "March4/3/1998"
> "February8/17/1990"
> end data.
> string workstring (a25).
> compute workstring ="".
> do repeat index= 1 to 25.
> do if not range(substr(datestr,index,1), "A","Z","a", "z").
> compute workstring =
> concat(rtrim(workstring),substr(datestr,index,1)).
> end if.
> end repeat.
> numeric mydate(adate10).
> compute mydate = number(workstring,adate10).
> list.
>
>
> Art Kendall
> Social Research Consultants

Here's a variation on Art's code without the workstring variable (which
you would probably end up deleting afterwards).

NUMERIC MyDate(adate10).
COMPUTE #i = 0. /* #i = index variable for looping.
LOOP.
- COMPUTE #i = #i+1.
END LOOP if range(substr(datestr,#i,1), "0","9").
COMPUTE MyDate = NUMBER(CHAR.SUBSTR(datestr,#i),adate10).
LIST.

Output:

datestr MyDate

January 12/1/2010 12/01/2010
March 4/3/1998 04/03/1998
February 8/17/1990 08/17/1990
January12/1/2010 12/01/2010
March4/3/1998 04/03/1998
February8/17/1990 08/17/1990

Number of cases read: 6 Number of cases listed: 6

HTH.

--
Bruce Weaver
bwe...@lakeheadu.ca

David Marso

unread,

Dec 13, 2013, 5:00:16 PM12/13/13

to

REALLY GUYS? ;-)) Must be late on Friday?
--
COMPUTE mydate=NUMBER(CHAR.SUBSTR(datestr,CHAR.INDEX(datestr,"0123456789",1)),ADATE10).
FORMATS mydate (ADATE10).
LIST.

David Marso

unread,

Dec 13, 2013, 5:02:09 PM12/13/13

to

People always seem to forget that magickal 3rd argument for INDEX!

Art Kendall

unread,

Dec 13, 2013, 6:49:00 PM12/13/13

to

nice one!

Art Kendall
Social Research Consultants

Bruce Weaver

unread,

Dec 13, 2013, 9:35:41 PM12/13/13

to

Yes, very nice. From the FM:

CHAR.INDEX(haystack, needle[, divisor]). Numeric. Returns a number
indicating the character position of the first occurrence of needle in
haystack. The optional third argument, divisor, is a number of
characters used to divide needle into separate strings. Each substring
is used for searching and the function returns the first occurrence of
any of the substrings. For example, CHAR.INDEX(var1, 'abcd') will return
the value of the starting position of the complete string "abcd" in the
string variable var1; CHAR.INDEX(var1, 'abcd', 1) will return the value
of the position of the first occurrence of any of the values in the
string; and CHAR.INDEX(var1,'abcd', 2) will return the value of the
first occurrence of either "ab" or "cd". Divisor must be a positive
integer and must divide evenly into the length of needle. Returns 0 if
needle does not occur within haystack.

The key bit of that that David's solution takes advantage of is this:

CHAR.INDEX(var1, 'abcd', 1) will return the value of the position of the
first occurrence of any of the values in the string; and
CHAR.INDEX(var1,'abcd', 2) will return the value of the first occurrence
of either "ab" or "cd".

Interesting that the last example returns the position of the first
occurrence of "ab" or "cd", but not of "bc".

David Marso

unread,

Dec 14, 2013, 12:10:30 PM12/14/13

to

"Interesting that the last example returns the position of the first
occurrence of "ab" or "cd", but not of "bc". "

It chops the needle into substrings the size of divisor then scans the haystack for each parsed element. If one wanted to do what you propose you could build an array of moving chunks and then INDEX each of those. Not going to write it today. Would be easy to do.

Bruce Weaver

unread,

Dec 14, 2013, 3:28:51 PM12/14/13

to

So CHAR.INDEX(var1,'abbccd',2) would find the position of the first
occurrence of 'ab', 'bc' or 'cd' in var1.

David Marso

unread,

Dec 14, 2013, 7:15:42 PM12/14/13

to

Inded!
DATA LIST FREE /teststring (A10).
BEGIN DATA
abdghfhjkj
dfggsbcwer
ghfhhdjjjs
END DATA.
COMPUTE abbccd=CHAR.INDEX(teststring,"abbccd",2).
LIST.

teststring abbccd

abdghfhjkj 1.00
dfggsbcwer 6.00
ghfhhdjjjs .00

Number of cases read: 3 Number of cases listed: 3

SS

unread,

May 16, 2014, 3:55:42 AM5/16/14

to

Hi guys,
I have a similar situation.
two string variables have same or different information in same or reverse order separated by "," e.g:
ab12,cd56 cd56,ab12
ab12,cd56 ab12,cd56
ab12 fx54
ab12,ac34,k4 lx1,km55

I want to create one variable with only unique entries from the two variables.
any suggestions?
Regards,
e.g

David Marso

unread,

May 16, 2014, 4:21:13 AM5/16/14

to

Please start a NEW thread rather than jumping an old one from 4 years ago.
Next time also include what you would want as the result!
What do you expect from the 3rd and 4th example?
For case 1 and 2 look at CHAR.INDEX, CHAR.SUBSTR, CONCAT and MIN and MAX functions

SS

unread,

May 16, 2014, 6:49:04 AM5/16/14

to

I have started new post.
My problem is very clearly stated. Can you help?