"Hello\, World,Hi".split_escapable(',' '\')
# => ["Hello, World", "Hi"]
Through a number of permutations with regexps, scan and the rest of the
family, I was unable to find a solution. I could parse the given string
myself, going through it character by character, but I'd prefer a less
pedestrian approach.
Michael
--
Michael Schuerig All good people read good books
mailto:mic...@schuerig.de Now your conscience is clear
http://www.schuerig.de/michael/ --Tanita Tikaram, Twist In My Sobriety
Michael
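class String
  def split_escapable(split_char, escape_char)
    arr = []
    split(split_char).each do |x|
      # If the previous piece ends in the escape character, the separator
      # was escaped: drop the escape and glue the pieces back together.
      if arr[-1] && arr[-1][-1, 1] == escape_char
        arr[-1].chop!
        arr[-1] << split_char << x
      else
        arr << x
      end
    end
    arr
  end
end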
It's not very good, because you can't escape the backslash, but some
magic should sort that out, and depending on what you're doing, you
might not care. Also, it's not all that elegant, so maybe you already
got this kind of solution.
Also, you should note that here:
"Hello\, World,Hi".split_escapable(',' '\')
# => ["Hello, World", "Hi"]
Since you're using double-quotes, the backslash is already being
consumed as an escape character, and it won't compile because the
backslash in the single quotes needs to be escaped because it precedes a
single quote. And also because you missed the comma between arguments.
'Hello\, World,Hi'.split_escapable(',','\\')
# => ["Hello, World", "Hi"]
works.
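The quoting point is easy to check in isolation. A minimal sketch (the variable names `double` and `single` are only for illustration):

```ruby
# In a double-quoted literal, "\," is not a recognized escape sequence,
# so the backslash is dropped and only the comma remains.
double = "Hello\, World,Hi"

# In a single-quoted literal the backslash survives; only \\ and \' are
# treated as escapes there.
single = 'Hello\, World,Hi'

p double.include?("\\")   # false
p single.include?("\\")   # true
p '\\'.length             # 1 (a single backslash)
```

So the escape character only reaches split_escapable when the string is single-quoted, or when the backslash is doubled inside double quotes.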
Your above example is missing a couple of \, but I assume I know what
you meant.
Is the following elegant or not?
class String
  def split_escapable( separator, escape_char=nil )
    results = []
    re = /(.+?)(?:#{escape_char ? "([^\\#{escape_char}])" : ''}#{separator}|$)/
    self.scan( re ){ |str,last_char|
      results << str + last_char.to_s
    }
    results
  end
end
p "Hello\\, World,Hi".split_escapable( ',', '\\' )
#=> ["Hello\\, World", "Hi"]
Note that the above does not account for the case of:
Hello \\,World
(where an escaped backslash is intended to end the first entry)
but if that was important, that's just a matter of a bit of odd/even
backslash counting.
Something like (untested):
re = /(.+?)(?:#{escape_char ? "([^\\#{escape_char}](\\#{escape_char}\\#{escape_char})*)" : ''}#{separator}|$)/
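The odd/even counting can also be done without one large regexp. The following is only a sketch of that idea, not anyone's posted solution: `split_counting` is a hypothetical helper name, and it leaves the escape characters in the fields rather than removing them.

```ruby
# Split on every separator first, then re-join a piece with its successor
# whenever it ends in an odd number of escape characters (which means the
# separator that followed it was escaped).
def split_counting(str, sep = ',', esc = '\\')
  fields = []
  pending = nil
  str.split(sep, -1).each do |piece|
    current = pending ? pending + sep + piece : piece
    # Length of the run of escape characters at the very end of the piece.
    trailing = current[/#{Regexp.escape(esc)}*\z/].length
    if trailing.odd?
      pending = current          # separator was escaped, keep accumulating
    else
      fields << current          # even run: escaped escapes only, field ends
      pending = nil
    end
  end
  fields << pending if pending   # input ended on a dangling escape
  fields
end

p split_counting('Hello\\, World,Hi')   # => ["Hello\\, World", "Hi"]
p split_counting('Hello \\\\,World')    # => ["Hello \\\\", "World"]
```

An even run of escapes before a separator is read as escaped escape characters, so the separator still ends the field.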
class String
  def split_escapable( splitter, escaper )
    escaper = escaper*2  if escaper=='\\'
    re = %r{ \G
      # Make sure at least 1 character remains.
      (?= . )
      (
        (?:
          [^#{ splitter }#{ escaper }]
          |
          (?: #{ escaper } . )
        ) *
      )
      (?:
        #{ splitter }
        |
        \Z
      )
    }xm
    scan( re ).map{|x| x.first.gsub( /#{escaper}(.)/, '\1' ) }
  end
end
s = <<HERE
Hello@, World!,Hi.
Alarm rings@, lights flash.,One escaper @@
HERE
s.split("\n").each {|x|a=x.split_escapable(',','@');p a; puts a}
puts "----"
s = <<'HERE'
Hello\, World!,Hi.
Alarm rings\, lights flash.,One escaper \\
HERE
s.split("\n").each {|x|a=x.split_escapable(',','\\');p a; puts a}
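For anyone wondering how the pattern above does its work, here is an annotated sketch of the same idea with the separator fixed to a comma and the escape fixed to a backslash. The names `FIELD` and `unescaped_fields` are made up for this illustration, and `\z` is used where the posted code has `\Z`.

```ruby
FIELD = %r{
  \G               # begin exactly where the previous match stopped
  (?= . )          # give up once no input is left
  (                # capture one field, made of:
    (?:
      [^,\\]       #   an ordinary character (neither comma nor backslash)
      |
      \\ .         #   or a backslash together with the character after it
    )*
  )
  (?: , | \z )     # the field ends at a bare comma or at the end of input
}xm

def unescaped_fields(str)
  # scan returns one array of captures per match; strip the escapes afterwards.
  str.scan(FIELD).map { |captures| captures.first.gsub(/\\(.)/m, '\1') }
end

p unescaped_fields('Hello\\, World!,Hi.')   # => ["Hello, World!", "Hi."]
p unescaped_fields('One escaper \\\\')      # => ["One escaper \\"]
```

Because a backslash is always consumed together with the character that follows it, an escaped comma can never be taken for a field boundary, and an escaped backslash cannot escape the comma after it.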
With the new Regex engine in cvs ruby you can use a negative lookbehind
assertion in your split:
>> s = "Hello\\, World, Hi"
=> "Hello\\, World, Hi"
>> s.split /(?<!\\),/
=> ["Hello\\, World", " Hi"]
$ ruby -v
ruby 1.9.0 (2005-09-08) [i686-linux]
Regards,
Jason
http://blog.casey-sweat.us/
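A small sketch of the lookbehind split with the escapes removed afterwards, assuming a Ruby whose regexp engine supports `(?<!...)` (1.9 and later). The variable names are illustrative only.

```ruby
s = "Hello\\, World, Hi"

# Split at commas that are not directly preceded by a backslash ...
fields = s.split(/(?<!\\),/)

# ... then turn each escaped comma back into a plain comma.
unescaped = fields.map { |f| f.gsub('\\,', ',') }

p unescaped   # => ["Hello, World", " Hi"]
```

The lookbehind only inspects one character, so a comma that follows an escaped backslash is still treated as escaped.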
I printed that and put it on my wall. I do that all the time.
-Ben
> On 9/27/05, Michael Schuerig <mic...@schuerig.de> wrote:
>>
>> I'm trying to come up with an *elegant* way to split a string into an
>> array at a separator with the additional feature that the separators
>> can be escaped. It should work like this
>>
>> "Hello\, World,Hi".split_escapable(',' '\')
>> # => ["Hello, World", "Hi"]
>>
>> Through a number of permutations with regexps, scan and the rest of
>> the family, I was unable to find a solution. I could parse the given
>> string myself, going though it character by character, but I'd prefer
>> a less pedestrian approach.
>>
>> Michael
>
> With the new Regex engine in cvs ruby you can use a negative lookback
> assertion in your split:
> >> s = "Hello\\, World, Hi"
> => "Hello\\, World, Hi"
> >> s.split /(?<!\\),/
> => ["Hello\\, World", " Hi"]
That must be the most elegant solution. Unfortunately I can't use cvs
ruby and can't wait for it either.
Michael
--
Michael Schuerig Airtight arguments have
mailto:mic...@schuerig.de vacuous conclusions.
http://www.schuerig.de/michael/ --A.O. Rorty, Explaining Emotions
>> I'm trying to come up with an *elegant* way to split
>> a string into an array at a separator with the
>> additional feature that the separators can be
>> escaped.
> With the new Regex engine in cvs ruby you can use a
> negative lookback assertion in your split:
> >> s = "Hello\\, World, Hi"
> => "Hello\\, World, Hi"
> >> s.split /(?<!\\),/
> => ["Hello\\, World", " Hi"]
With the current Ruby RE engine, you can use zero-width negative
lookahead if you don't mind reversing the string before and after the
split:
irb(main):001:0> s = "Hello\\, World, Hi"
=> "Hello\\, World, Hi"
irb(main):002:0> s.reverse.split(/,(?!\\)/).map {|ss| ss.reverse}
=> [" Hi", "Hello\\, World"]
You might also consider handling escaped escape characters by
ignoring pairs of escape characters:
irb(main):003:0> s = "Test, Test\\, Test\\\\, Test\\\\\\, Test"
=> "Test, Test\\, Test\\\\, Test\\\\\\, Test"
irb(main):004:0> s.reverse.split(/,(?!(\\\\)*\\([^\\]|$))/).map {|ss| ss.reverse}
=> [" Test\\\\\\, Test", " Test\\, Test\\\\", "Test"]
I hope this helps.
- Warren Brown
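Note that the fields come back in reverse order in the transcripts above. A sketch that wraps the trick and restores the original order; `split_via_reverse` is a hypothetical name, and the groups are made non-capturing here so that split cannot return them as extra elements:

```ruby
def split_via_reverse(str)
  str.reverse
     .split(/,(?!(?:\\\\)*\\(?:[^\\]|$))/)  # comma not followed by an odd run of backslashes
     .map(&:reverse)                        # un-reverse each field
     .reverse                               # put the fields back in input order
end

p split_via_reverse("Hello\\, World, Hi")
# => ["Hello\\, World", " Hi"]
p split_via_reverse("Test, Test\\, Test\\\\, Test\\\\\\, Test")
# => ["Test", " Test\\, Test\\\\", " Test\\\\\\, Test"]
```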
> Michael Schuerig wrote:
>> I'm trying to come up with an *elegant* way to split a string into an
>> array at a separator with the additional feature that the separators
>> can be escaped.
[snip]
> class String
> def split_escapable( splitter, escaper )
[snip]
> end
> end
Thanks, that appears to work indeed, although I can't claim to
understand how or why.
Michael
--
Michael Schuerig The more it stays the same,
mailto:mic...@schuerig.de The less it changes!
http://www.schuerig.de/michael/ --Spinal Tap, The Majesty of Rock
> Note that the above does not account for the case of:
> Hello \\,World
> (where an escaped backslash is intended to end the first entry)
> but if that was important, that's just a matter of a bit of odd/even
> backslash counting.
That's a thing I'd need. Opportunistically, I'll go with William's
suggestion from a sibling post.
> Something like (untested):
> re = /(.+?)(?:#{escape_char ? "([^\\#{escape_char}](\\#{escape_char}\\#{escape_char})*)" : ''}#{separator}|$)/
Thanks.
Michael
LOL, what a creative solution! :) I'll have to remember that trick.
"To use lookbehinds in a regexp engine that doesn't have them,
reverse the string (and your thinking), then use a lookahead."
class String
  def split_escapable(separator, escape_char, *args)
    istr = dup
    impossible = "\x01"
    replace = "#{escape_char}#{separator}"
    changed = istr.gsub!(replace, impossible)
    fields = istr.split(separator, *args)
    if changed
      fields.each do |f|
        f.gsub!(impossible, separator)
      end
    end
    fields
  end
end
a = "Hello\\, World,Hi"
puts a.split_escapable( ',', '\\' )
Cheers,
Han Holl
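The placeholder approach depends on "\x01" never occurring in the input, and it does not treat a doubled escape character specially. A sketch of the same idea as a standalone method that at least refuses input containing the placeholder; `split_with_placeholder` is a hypothetical name, not part of the code above:

```ruby
def split_with_placeholder(str, separator, escape_char, *split_args)
  placeholder = "\x01"
  if str.include?(placeholder)
    raise ArgumentError, 'input already contains the placeholder character'
  end

  str.gsub(escape_char + separator) { placeholder }            # hide escaped separators
     .split(separator, *split_args)                            # plain split, limit passed through
     .map { |field| field.gsub(placeholder) { separator } }    # restore them unescaped
end

p split_with_placeholder("Hello\\, World,Hi", ',', '\\')
# => ["Hello, World", "Hi"]
p split_with_placeholder('a\\,b,c,d', ',', '\\', 2)
# => ["a,b", "c,d"]
```

The block form of gsub is used so that a separator or escape containing a backslash is inserted literally instead of being read as a replacement pattern.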