Austin Hastings  
Newsgroups: perl.perl6.language
From: austin_hasti...@yahoo.com (Austin Hastings)
Date: Mon, 28 Jun 2004 11:27:34 -0700 (PDT)
Local: Mon, Jun 28 2004 2:27 pm
Subject: Re: The .bytes/.codepoints/.graphemes methods
--- Dan Sugalski <d...@sidhe.org> wrote:

> On Mon, 28 Jun 2004, Juerd wrote:

> > Dave Whipp skribis 2004-06-28  9:55 (-0700):
> > > > substr($string, 2 bytes, 4 bytes) = $substitute;
> > > substr($string, 2, 4 :bytes)

> > substr($string, 2 but graphemes, 4 but bytes);

> > I think "but" even makes sense, if substr defaults to something.

> I think mixing strings, bytes, graphemes, and code points together
> is a phenomenally bad idea, likely to lead to many tears, much
> gnashing of teeth, and quite a few rampages with sharp objects,
> not to mention a lot of code guaranteed to fail at the edge cases.

Hmm. Suppose that I have a system that is friendly to 80 byte records.
I want to output "meaningful" strings, so I want to partition a buffer
into 80-ish byte substrings, but preserve any graphemes (i.e., store
the data in a legible format).

How would I do that?

The obvious answer is a gnarly little loop, but I think I'd like to
have perl do that for me. Can I say something like:

  while ($buffer)
  {
    $output = substr($buffer, 0, 80 but bytes, units => graphemes);
    $buffer = substr($buffer, length $output :graphemes);

    $cout << $output << nl; # :-)
  }

and get some dwimmery?
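
For comparison, here is roughly what the "gnarly little loop" looks like when written out by hand. This is only a sketch in Python, not the proposed Perl 6 syntax. The names `clusters` and `records` are made up for illustration, the byte count assumes UTF-8 output, and a "grapheme" is approximated as a base character plus any combining marks that follow it, which is cruder than full Unicode grapheme segmentation.

```python
import unicodedata


def clusters(text):
    """Yield approximate grapheme clusters (ASSUMPTION: a base character
    followed by zero or more combining marks; this is not full Unicode
    text segmentation)."""
    cluster = ""
    for ch in text:
        # A non-combining character starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


def records(text, limit=80):
    """Split text into records of at most `limit` UTF-8 bytes without
    cutting through a cluster. A single cluster larger than `limit`
    is emitted on its own, oversized."""
    record, size = "", 0
    for cluster in clusters(text):
        n = len(cluster.encode("utf-8"))
        if record and size + n > limit:
            yield record
            record, size = "", 0
        record += cluster
        size += n
    if record:
        yield record


if __name__ == "__main__":
    buffer = "e\u0301" * 50  # 'e' + COMBINING ACUTE ACCENT, 3 bytes each
    for output in records(buffer):
        print(len(output.encode("utf-8")), output)
```

The length is measured in bytes while the cut points are chosen in graphemes, which is exactly the mixing of units being asked about.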

=Austin

> If, as a programmer, you *really* want to run with scissors then
> convert your string to a binary byte buffer and go from there. At
> least then when you poke out an eye you won't be nearly so surprised.
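
A small Python sketch of what the byte-buffer route involves, assuming the buffer is UTF-8. The helper name `codepoint_cut` is hypothetical. Once you are down at the byte level, a naive 80-byte slice can land in the middle of a character, and backing up to a code point boundary is your own job. Even then, nothing here stops the cut from separating a base character from its combining marks.

```python
def codepoint_cut(raw, limit):
    """Return the largest cut position <= limit that does not fall
    inside a multi-byte UTF-8 sequence (ASSUMPTION: raw is valid UTF-8)."""
    if limit >= len(raw):
        return len(raw)
    cut = limit
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    while cut > 0 and (raw[cut] & 0xC0) == 0x80:
        cut -= 1
    return cut


if __name__ == "__main__":
    raw = ("a" + "\u00e9" * 50).encode("utf-8")  # 1 + 50 * 2 = 101 bytes
    try:
        raw[:80].decode("utf-8")  # slices through the middle of an 'é'
    except UnicodeDecodeError as err:
        print("naive slice failed:", err)
    cut = codepoint_cut(raw, 80)
    print(cut, raw[:cut].decode("utf-8"))
```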

>                                    Dan

> --------------------------------------"it's like this"-------------------
> Dan Sugalski                          even samurai
> d...@sidhe.org                         have teddy bears and even
>                                       teddy bears get drunk


 