String concatenation and repetition


Jacek Generowicz

Nov 23, 2012, 4:50:20 AM
to julia-dev
I kinda like the choice of * and ^ rather than + and * as the string
concatenation and repetition operators. But I'm wondering whether
there's some deeper reason behind the choice. Are there some patterns of
usage that are made easier by this choice?

Tim Holy

Nov 23, 2012, 6:30:04 AM
to juli...@googlegroups.com
I believe Stefan can tell you more, but I think the main reason is that
algebras over fields are commutative with respect to + but not necessarily with
respect to *, and string concatenation is definitely not commutative. I can't
remember whether this was part of the original motivation, but I also tend to
think of string concatenation as a lot like taking an outer product: you get an
object with the combined dimensionality of both factors.

So in a language designed to appeal to people who know math, * and ^ seem more
correct.
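
To make the non-commutativity point concrete, here is a minimal sketch, assuming a recent Julia 1.x; the variable names a and b are only for illustration:

```julia
a, b = "foo", "bar"

# Concatenation is order-sensitive, like a non-commutative product.
println(a * b)            # foobar
println(b * a)            # barfoo
println(a * b == b * a)   # false

# It is still associative, and the empty string acts as the identity.
println((a * b) * "baz" == a * (b * "baz"))   # true
println(a * "" == a)                          # true
```

Associativity plus an identity element is what makes strings under * behave like a (non-commutative) monoid.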

--Tim

Stefan Karpinski

Nov 23, 2012, 10:07:30 AM
to Julia Dev
Yes. I also did some literature review of math papers about strings and string concatenation and they all use multiplication – typically expressed as juxtaposition – to express string concatenation. None of them use + for this. And if you think about repetition, it has always struck me as unclear whether to write n*str or str*n, whereas it's perfectly clear that you need to write str^n rather than n^str.
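
A small sketch of the repetition point, again assuming a recent Julia 1.x (the exact error type for the reversed form is what I would expect there, not something stated in this thread):

```julia
s = "ab"

# Repetition is repeated concatenation, so it reads naturally as a power.
println(s^3)                # ababab
println(s^3 == s * s * s)   # true
println(s^0 == "")          # true: the empty string plays the role of 1

# The usual exponent law carries over.
println(s^2 * s^3 == s^5)   # true

# The reversed form has no method, so the argument order cannot be confused.
try
    3^s
catch err
    println(err isa MethodError)   # true
end
```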

Another thing that could be added as an extension is having str1+str2 produce a pattern matching r"(str1|str2)", which dovetails with using fast boolean matrix multiplication for CFG parsing, as in http://www.cs.cornell.edu/home/llee/talks/bmmtalk.pdf
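
A rough sketch of what that alternation could look like. The helper name alt is hypothetical and not part of Base, + is deliberately left alone here, and the inputs are assumed to contain no regex metacharacters:

```julia
# Hypothetical helper: build a regex that matches either literal string.
# Assumes neither argument contains regex metacharacters.
alt(a::AbstractString, b::AbstractString) = Regex("(" * a * "|" * b * ")")

pat = alt("cat", "dog")

println(occursin(pat, "hotdog"))     # true
println(occursin(pat, "bird"))       # false
println(match(pat, "a cat").match)   # cat
```

With * as concatenation and an alternation operator like this one, string patterns start to look like the sum and product of a semiring.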







Mike Nolta

Nov 23, 2012, 11:16:26 AM
to juli...@googlegroups.com


On 2012-11-23, at 10:07, Stefan Karpinski <ste...@karpinski.org> wrote:

> Yes. I also did some literature review of math papers about strings and string concatenation and they all use multiplication – typically expressed as juxtaposition – to express string concatenation. None of them use + for this. And if you think about repetition, it has always struck me as unclear whether to write n*str or str*n, whereas it's perfectly clear that you need to write str^n rather than n^str.
>
> Another thing that could be added as an extension is having str1+str2 produce a pattern matching r"(str1|str2)", which dovetails with using fast boolean matrix multiplication for CFG parsing, as in http://www.cs.cornell.edu/home/llee/talks/bmmtalk.pdf


Ha, Kleene algebra FTW!

-Mike


