Find substring in cell array of numbers and strings

paul.d...@gmail.com

Jan 24, 2017, 7:25:47 PM
I have a cell array consisting of numbers, strings, and empty arrays. I want to find the position (linear or indexed) of all cells containing a string in which a certain substring of interest appears.

mixedCellArray = {
'adpo' 2134 []
0 [] 'daesad'
'xxxxx' 'dp' 'dpdpd'
}

If the substring of interest is 'dp', then I should get the indices for three cells.

The only solutions I can find work when the cell array contains only strings:

* http://www.mathworks.com/matlabcentral/answers/2015-find-index-of-cells-containing-my-string

* http://www.mathworks.com/matlabcentral/newsreader/view_thread/255090

One work-around is to find all cells not containing strings, and fill them with '', as hinted by http://stackoverflow.com/questions/21931954/matlab-find-substring-in-cell-array. Unfortunately, my approach requires a variation of that solution, probably something like `cellfun('ischar',mixedCellArray)`. This causes the error:

Error using cellfun
Unknown option.

Thanks for any suggestions on how to figure out the error.

I've posted this to http://stackoverflow.com/questions/41841376/find-substring-in-cell-array-of-numbers-and-strings.

dpb

Jan 24, 2017, 7:58:34 PM
On 01/24/2017 6:25 PM, paul.d...@gmail.com wrote:
> I have a cell array consisting of numbers, strings, and empty arrays. I want to find the position (linear or indexed) of all cells containing a string in which a certain substring of interest appears.
>
> mixedCellArray = {
> 'adpo' 2134 []
> 0 [] 'daesad'
> 'xxxxx' 'dp' 'dpdpd'
> }
>
> If the substring of interest is 'dp', then I should get the indices for three cells.
...

I posted a complaint about there not being generic searching routines a
week or so ago in the Answers forum under a "What's Matlab Missing?"
thread...there should be some simpler ways to do this, I agree
wholeheartedly.

Best for such problems I've come up with is something like--

>> ~cellfun(@isempty,cellfun(@(x)strfind(x,'dp'),mixedCellArray,'uniform',0))
ans =
1 0 0
0 0 0
0 1 1
>>

albeit it is ugly because there's not a way to write an anonymous
function for the guts of it and pass the searched-for string inside the
CELLFUN call.

You either have to keep repeating the search inline or dynamically
create an anonymous function that embeds the search string or write a
standalone m-file that can accept the two arguments.

As I say, "there otta' be a way" built in...
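
A minimal sketch of the second option mentioned above, an anonymous function that embeds the search string. The names `pat` and `findSub` are illustrative and do not come from the thread. The sketch assumes that strfind returns an empty result for non-string cells, as in the MATLAB output shown above; other environments may reject numeric input instead.

```matlab
mixedCellArray = {
    'adpo'   2134  []
    0        []    'daesad'
    'xxxxx'  'dp'  'dpdpd'
};

% pat is captured by the anonymous function when the handle is created.
pat = 'dp';
findSub = @(c) ~cellfun(@isempty, ...
    cellfun(@(x) strfind(x, pat), c, 'UniformOutput', false));

% Logical mask, true where a cell contains pat (assumption: strfind
% returns [] rather than erroring for numeric and empty cells).
mask = findSub(mixedCellArray);
```

Changing the search string then means redefining `pat` and recreating `findSub`, since the captured value is fixed at creation time.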

--



paul.d...@gmail.com

Jan 24, 2017, 10:58:54 PM
I found something under "What is missing from MATLAB?".

Your solution is intriguing. I didn't bring my work laptop home, so I tried it on an old Octave installation. It doesn't work:

error: strfind: first argument must be a string or cell array of strings

error: evaluating argument list element number 2

So strfind is tripping on numerical input. I will try it using Matlab tomorrow.
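
One possible way around the numeric-input error is to test each cell with ischar before calling strfind. This is only a sketch; the names `pat` and `hasPat` are illustrative and not from the thread.

```matlab
mixedCellArray = {
    'adpo'   2134  []
    0        []    'daesad'
    'xxxxx'  'dp'  'dpdpd'
};

pat = 'dp';

% && short-circuits, so strfind is only called on character data.
hasPat = @(x) ischar(x) && ~isempty(strfind(x, pat));

% Each call returns a logical scalar, so cellfun can return a plain
% logical array without the 'UniformOutput' option.
mask = cellfun(hasPat, mixedCellArray);
```

Because numeric and empty cells never reach strfind, the result should not depend on how a given strfind implementation treats non-string input.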

However, I was wondering what was wrong with the cellfun('ischar',mixedCellArray) in my original post? It seems to work on ye olde Octave installation.

>> mixedCellArray = {
'adpo' 2134 []
0 [] 'daesad'
'xxxxx' 'dp' 'dpdpd'
}

mixedCellArray =
{
[1,1] = adpo
[2,1] = 0
[3,1] = xxxxx
[1,2] = 2134
[2,2] = [](0x0)
[3,2] = dp
[1,3] = [](0x0)
[2,3] = daesad
[3,3] = dpdpd
}

>> % One of the following
>> %
>> mixedCellArray( ~cellfun('ischar',mixedCellArray) ) = {''}
>> %[ mixedCellArray{ ~cellfun('ischar',mixedCellArray) } ] = deal('')

mixedCellArray =
{
[1,1] = adpo
[2,1] =
[3,1] = xxxxx
[1,2] =
[2,2] =
[3,2] = dp
[1,3] =
[2,3] = daesad
[3,3] = dpdpd
}

>> cellfun('ischar',mixedCellArray)

ans =
1 1 1
1 1 1
1 1 1

paul.d...@gmail.com

Jan 24, 2017, 11:15:05 PM
> I was wondering what was wrong with the
> cellfun('ischar',mixedCellArray) in my original post? It seems to
> work on ye olde Octave installation.

I think I might have the explanation. The documentation says that cellfun's 1st argument is a handle. It can only be a string if the string corresponds to certain function names, with 'isempty' being one of the names that can be specified as a string. That's why it doesn't work for me when I specify 'ischar', but it works for 'isempty' in the example cited in my original post. Or so I suspect. I will verify tomorrow, when I have access to Matlab.
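
A short sketch of the distinction described above, assuming the legacy names listed in the cellfun documentation ('isempty' and 'isclass' among them). The variable names are illustrative only.

```matlab
C = {
    'adpo'   2134  []
    0        []    'daesad'
    'xxxxx'  'dp'  'dpdpd'
};

% General form: pass a function handle.
isStrHandle = cellfun(@ischar, C);

% Legacy string forms: only a fixed set of names is accepted.
% 'isclass' takes the class name as a third argument.
isStrLegacy   = cellfun('isclass', C, 'char');
isEmptyLegacy = cellfun('isempty', C);
```

Both `isStrHandle` and `isStrLegacy` should mark the same cells; the handle form is the one that generalizes to arbitrary functions.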

Bruno Luong

Jan 25, 2017, 2:55:08 AM
paul.d...@gmail.com wrote in message <57a7311a-02b9-4cdc...@googlegroups.com>...

> I think I might have the explanation. The documentation says that cellfun's 1st argument is a handle. It can only be a string if the string corresponds to certain function names,

That's for sure

To use 'ischar' simply call

cellfun(@ischar, ....)

paul.d...@gmail.com

Jan 25, 2017, 12:21:09 PM
Thanks to Bruno for the tip on the proper format for cellfun's 1st argument. I know it should have been obvious from dpb's example code, but I just lost track of the punctuation (not using anonymous functions much, and all). I confirmed that my approach to converting nonstring cells to strings works:

>> mixedCellArray = {
'adpo' 2134 []
0 [] 'daesad'
'xxxxx' 'dp' 'dpdpd'
}

mixedCellArray =
'adpo' [2134] []
[ 0] [] 'daesad'
'xxxxx' 'dp' 'dpdpd'

>> ~cellfun(@ischar,mixedCellArray)

ans =
0 1 1
1 1 0
0 0 0
>> mixedCellArray( ~cellfun(@ischar,mixedCellArray) ) = {''}

mixedCellArray =
'adpo' '' ''
'' '' 'daesad'
'xxxxx' 'dp' 'dpdpd'

>> cellfun(@ischar,mixedCellArray)

ans =
1 1 1
1 1 1
1 1 1

But I didn't bother following it through to find a substring. dpb's example works so much better. And being able to run it in Matlab allowed me to actually figure out what it does. For reference, here is his example, best viewed in fixed-width font:

>> mixedCellArray = {
'adpo' 2134 []
0 [] 'daesad'
'xxxxx' 'dp' 'dpdpd'
}

mixedCellArray =
'adpo' [2134] []
[ 0] [] 'daesad'
'xxxxx' 'dp' 'dpdpd'

>> ~cellfun( @isempty , ...
cellfun( @(x)strfind(x,'dp') , ...
mixedCellArray , ...
'uniform',0) ...
)

ans =
1 0 0
0 0 0
0 1 1

The inner cellfun is able to apply strfind to even numerical cells because, I presume, Matlab treats numerical arrays and strings the same way. A string is just an array of numbers representing the character codes. The outer cellfun identifies all cells for which the inner cellfun found a match, and the prefix tilde turns that into all cells for which there was NO match.

Thanks, dpb.

dpb

Jan 25, 2017, 1:22:47 PM
On 01/25/2017 11:21 AM, paul.d...@gmail.com wrote:
...

> The inner cellfun is able to apply strfind to even numerical cells
> because, I presume, Matlab treats numerical arrays and strings the same
> way. A string is just an array of numbers representing the character
> codes. The outer cellfun identifies all cells for which the inner
> cellfun found a match, and the prefix tilde turns that into all cells
> for which there was NO match.
>
> Thanks, dpb.

Above is correct interpretation for STRFIND and friends; however, REGEXP
will choke in place thereof.
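
If REGEXP is wanted anyway, one possible approach is to keep non-string cells away from it with an ischar test. This is a sketch under that assumption; `hasMatch` is an illustrative name, not something from the thread.

```matlab
mixedCellArray = {
    'adpo'   2134  []
    0        []    'daesad'
    'xxxxx'  'dp'  'dpdpd'
};

% With 'once', regexp returns the start index of the first match, or an
% empty result when there is none. ischar is evaluated first, so regexp
% is never called on numeric or empty cells.
hasMatch = @(x) ischar(x) && ~isempty(regexp(x, 'dp', 'once'));

mask = cellfun(hasMatch, mixedCellArray);
```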

Where my real beef is with Matlab/TMW on cell array searching syntax is
that you can't write a single expression that will look for either
numeric OR string in mixedCellArray without special-casing the search
syntax for the target. My contention is that since cell arrays are
designed for disparate data types the methods to access them should also
be designed for that purpose and they're not.

Bruno Luong

Jan 25, 2017, 2:04:07 PM
"dpb" wrote in message <o6aqae$g3h$1...@dont-email.me>...

>
> Where my real beef is with Matlab/TMW on cell array searching syntax is
> that you can't write a single expression that will look for either
> numeric OR string in mixedCellArray without special-casing the search
> syntax for the target. My contention is that since cell arrays are
> designed for disparate data types the methods to access them should also
> be designed for that purpose and they're not.

So you want to ask for expansion of strfind, strcmp to numerical, structure, etc....

You would get people mad, as shown recently with "+" in R2016b.

dpb

Jan 25, 2017, 2:18:29 PM
On 01/25/2017 1:04 PM, Bruno Luong wrote:
> "dpb" wrote in message <o6aqae$g3h$1...@dont-email.me>...
>
>>
>> Where my real beef is with Matlab/TMW on cell array searching syntax
>> is that you can't write a single expression that will look for either
>> numeric OR string in mixedCellArray without special-casing the search
>> syntax for the target. My contention is that since cell arrays are
>> designed for disparate data types the methods to access them should
>> also be designed for that purpose and they're not.
>
> So you want to ask for expansion of strfind strcmp to numerical,
> structure, etc....

Not at all, no. There should be a more extensive class of functions to
handle using cell arrays (and should have been from the time of their
introduction into the language). Searching is a very common task that
receives a lot of queries similar to this one and the answers to date
are always ugly at best...

--

dpb

Jan 25, 2017, 5:51:55 PM
On 01/25/2017 11:21 AM, paul.d...@gmail.com wrote:
...

> >> ~cellfun( @isempty , ...
> cellfun( @(x)strfind(x,'dp') , ...
> mixedCellArray , ...
> 'uniform',0) ...
...

> ... The outer cellfun identifies all cells for which the inner
> cellfun found a match, and the prefix tilde turns that into all cells
> for which there was NO match.
>
> Thanks, dpb.

Actually the logic is the reverse--isempty finds the locations that did
_NOT_ have a match (a failed match returns empty result, a match returns
an array of one or more matching location indices) and then the negation
operator changes the sense to return the locations that were found.

The need there is to reduce the result of the strfind() operation to a
single element per cell in case there is more than one match (as there
is in the last cell in your example data).
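
The steps described above can be traced with the intermediate results kept in named variables. In this sketch the non-string cells are first replaced by '' (the workaround from earlier in the thread), so that it does not depend on how strfind treats numeric input; the names `C`, `hits`, `noMatch`, and `found` are illustrative.

```matlab
mixedCellArray = {
    'adpo'   2134  []
    0        []    'daesad'
    'xxxxx'  'dp'  'dpdpd'
};

% Replace every non-string cell by an empty string.
C = mixedCellArray;
C(~cellfun(@ischar, C)) = {''};

% Inner step: one cell per input cell, holding the match positions.
% For example hits{1,1} is 2 and hits{3,3} is [1 3] (two matches).
hits = cellfun(@(x) strfind(x, 'dp'), C, 'UniformOutput', false);

% Outer step: isempty marks the cells that did NOT match ...
noMatch = cellfun(@isempty, hits);

% ... and the negation flips that into the cells that did.
found = ~noMatch;
```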

--

paul.d...@gmail.com

Jan 25, 2017, 7:09:06 PM
On Wednesday, January 25, 2017 at 5:51:55 PM UTC-5, dpb wrote:
Yes. My mistake. I knew what you meant, and it was what I was trying
to say. Thanks.

dpb

Jan 26, 2017, 9:30:28 AM
There is one simplification if one can remember to not use ISEMPTY but
ANY instead--then you can eliminate the negation. The first time I saw
the idiom used (maybe S Lord?) was with ISEMPTY and it's stuck...

cellfun(@any, cellfun(@(x) strfind(x,'dp'),mixedCellArray, 'uniform',0))
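
Since the original question asked for positions rather than a mask, a logical result like the one shown earlier in the thread can be converted with find. A small sketch, with the mask written out literally and illustrative variable names:

```matlab
% Logical mask as returned by the expressions in this thread.
mask = logical([1 0 0
                0 0 0
                0 1 1]);

% Linear indices, counted down the columns (column-major order).
linearIdx = find(mask);

% Row and column subscripts of the same cells.
[row, col] = find(mask);
```

For this mask, `linearIdx` is [1; 6; 9], and `row` and `col` are [1; 3; 3] and [1; 2; 3], matching the three cells 'adpo', 'dp', and 'dpdpd'.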

--