Dear regex experts:
I am writing a JSON parser in MATLAB. I am looking for a faster way to match array brackets ([...]) without being confused by any brackets that appear inside pairs of double quotes.
Here is a test string:
["a", "b\\", "c\"", [ "d\\\"","e\"[" ], "f\\\"[", [ "g[\\","h]\\\"" ] ]
Are there any tricks to match the real array brackets (there are 6 of them)
and skip the ones inside the double quotes (there are 4 of them)?
I tried the following 3 steps in MATLAB:
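s='["a", "b\\", "c\"", [ "d\\\"","e\"[" ], "f\\\"[", [ "g[\\","h]\\\"" ] ]';
% step 1: replace every escaped backslash (two backslashes) with a space
s=regexprep(s,'\\\\',' ');
% step 2: replace every remaining escaped double quote (\") with a space
s=regexprep(s,'\\\"',' ');
% step 3: overwrite each quoted string with spaces of equal length (dynamic expression)
s=regexprep(s,'(\"[^"]*\")','${char($1*0+char(32))}');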
That gives:
s =
[ , , , [ , ], , [ , ] ]
It first blanks out the double-quoted strings before finding the brackets. But I feel it is
not so robust (maybe it fails under some strange conditions?). I am wondering
if anyone knows a more robust approach, ideally one that does not involve dynamic
regexp.
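To make the question concrete, here is a minimal sketch of the kind of non-dynamic variant I have in mind. It assumes every string in the input is well-formed (properly closed, with backslash as the only escape character). Instead of removing the escapes first, it matches each complete quoted string in one pass, treating a backslash plus the character after it as a single unit, and then blanks the matched ranges by index. The variable names (pat, st, en, masked, open_pos, close_pos) are only my own choices for illustration.

```matlab
% Test string from above (single quotes: backslashes are literal in MATLAB).
s = '["a", "b\\", "c\"", [ "d\\\"","e\"[" ], "f\\\"[", [ "g[\\","h]\\\"" ] ]';

% One complete JSON string: an opening quote, then any run of either
% (a) a character that is neither a quote nor a backslash, or
% (b) a backslash followed by any one character, then the closing quote.
pat = '"(?:[^"\\]|\\.)*"';

% Start and end index of every quoted string (no dynamic expression needed).
[st, en] = regexp(s, pat);

% Blank the quoted ranges in a copy; the length and all indices are preserved.
masked = s;
for k = 1:numel(st)
    masked(st(k):en(k)) = ' ';
end

% Whatever brackets remain are the structural ones
% (3 opening and 3 closing for the test string).
open_pos  = find(masked == '[');
close_pos = find(masked == ']');
```

Because masked has the same length as s, open_pos and close_pos index directly into the original string. I do not know how this behaves on malformed input (for example an unterminated string), which is part of why I am asking.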
A Perl or MATLAB sample command would be great!
Thanks in advance,
Qianqian