> The problem with overloading is that the language stops forcing a readable syntax.
This argument has been made before, e.g. by Bertrand Meyer, and I believe it has been proven wrong. A matrix addition is more readable written A+B than A.add(B), for example. Now, of course, no language can prevent you from writing code such that A+B zeroes matrix A and multiplies matrix B by two. But forcing the programmer to write A.add(B) will not prevent any kind of obfuscation; it only starts from a higher level of obfuscation...
The real test is what a thing that is well written looks like, not what something that is obfuscated looks like. In other words, I do not think that arguments based on misuses of a feature really matter that much. You can misuse most features of Go as well.
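To make the contrast concrete, here is a minimal Go sketch of the method-call notation the matrix case forces (the Matrix type and Add method are my own illustration, not from any library):

```go
package main

import "fmt"

// Matrix is a small illustrative 2x2 matrix type. Go has no
// operator overloading, so addition must be a named method.
type Matrix [2][2]float64

// Add returns the element-wise sum of m and n.
func (m Matrix) Add(n Matrix) Matrix {
	var r Matrix
	for i := range m {
		for j := range m[i] {
			r[i][j] = m[i][j] + n[i][j]
		}
	}
	return r
}

func main() {
	a := Matrix{{1, 2}, {3, 4}}
	b := Matrix{{5, 6}, {7, 8}}
	fmt.Println(a.Add(b)) // where overloading exists, this reads a + b
}
```

Whether `a.Add(b)` or `a + b` is the more readable spelling is exactly the point under debate here.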
> In Go the line:
>
> for i := 0; i < 26; i++ { println(i) }
For what it's worth, you are not convincing me with this example, because IMHO, the semi-colons, the i++, the <, the braces and the parentheses are all clutter, so I'd like to get rid of them.
In XL, the same loop is:
for I in 0..25 loop
writeln i
> Reads as such: "While variable i, initialized as integer type 0, is less than 26, and increases by one per iteration, do the following: print on a line the current value of i."
Do you really have trouble reading the XL loop above as plain English?
Now, the real issue to me is: how do you iterate over all elements of a rectangular matrix in Go. In XL, you can write it as:
for I,J in M loop ...
I guess in Go, that would be (pardon me if there is a better way to compute the size of M...)
for i := 0; i < unsafe.Sizeof(M[0])/unsafe.Sizeof(M[0][0]); i++
for j := 0; j < unsafe.Sizeof(M[0][0])/unsafe.Sizeof(M[0][0][0]); j++
...
How is Go more readable here? More importantly, why is it better for the programmer to have to type this, rather than the compiler do it for you?
> Each portion of the line of code has a definite and succinct meaning. Indeed, one could code entirely with full sentences and punctuation from natural language (and have it sound grammatically correct, if somewhat lack-luster) if they took the symbols of Go and transferred them almost directly to English or the like.
That remains true with an extensible syntax. The only difference is who implements the definite and succinct meaning.
In one case, it's hard-coded in the Go compiler, meaning that when it is no longer succinct (as in my matrix example above), you are stuck with verbose code.
In the XL case, it's in the standard library. I argue that the second approach leaves more room for growth and refinement and, more to the point, for highly readable and mistake-proof notations even in the more complicated cases.
> This definition and compression allows us to build a program that can be read by another programmer as well as provide a set of tools to allow the concepts to be transferred to code. Even in XL, there are basic structures such as functions, loops, computations, generics, etc. So, the only selling point that I am seeing in XL is the ability to overload anything and everything with ease.
The key selling point in my opinion is that XL doesn't lock you in whatever concepts I came up with. That's what I'm arguing Go should also have. The title was not "Why Go should have X and Y feature", but "Why Go should be extensible."
Here is an example. Go has imaginary literals. Why should the core language have to define that? In XL, I can write 2+3i, but it's not anything special built into the compiler, it's just a declaration in the complex package.
Here is another example. Go doesn't have 24-bit fixed-point literals. Why not? How do I program a DSP56000 with Go? In XL, all you need is to add a type to the standard library or some CPU-specific library, and it will behave like any other type. You want to write 3dsp for DSP 24-bit fixed-point literals? That works. You want complex numbers where the imaginary and real part are 24-bit fixed point? That works. You need a 3x3 matrix of complex numbers with fixed point 24-bit elements? Again, that just works.
Now just think for a minute about the amount of work it would take to get that in Go if you needed it... And replace "fixed-point literal" with something that matters to you...
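To give a feel for what that work looks like, here is a hedged Go sketch of a 24-bit fixed-point type (the Fix24 name and the 8-fractional-bit layout are assumptions of mine, not an actual DSP56000 format): every operation becomes an explicit method call, and the type gets none of the literal or operator support a built-in type enjoys.

```go
package main

import "fmt"

// Fix24 is a hypothetical fixed-point type stored in the low bits
// of an int32, with 8 fractional bits: value = raw / 256.
type Fix24 int32

const fracBits = 8

func FromFloat(f float64) Fix24 { return Fix24(f * (1 << fracBits)) }
func (x Fix24) Float() float64  { return float64(x) / (1 << fracBits) }

// Without operator overloading, every arithmetic operation
// must be spelled out as a method call.
func (x Fix24) Add(y Fix24) Fix24 { return x + y }
func (x Fix24) Mul(y Fix24) Fix24 { return Fix24(int64(x) * int64(y) >> fracBits) }

func main() {
	a, b := FromFloat(1.5), FromFloat(2.25)
	fmt.Println(a.Add(b).Float()) // 3.75
	fmt.Println(a.Mul(b).Float()) // 3.375
}
```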
> In addition to code obfuscation, there would likely be a large performance hit for the compiler if the symbol table could change in any arbitrary piece of code, and would likely need to either throw syntax errors if the overloading function was below the executed code, or read the code at least twice to make sure it got all of the symbology correct. Indeed, if one has an overload that relies on another overload they can suddenly be stuck with overload dependencies, requiring time to sort out the dependency tree, then compiling through the tree.
Well, true, compilation performance is a big selling point in Go (a little too much so, IMHO). By contrast, my generalized rewrite algorithm is relatively expensive.
However, experience compiling the native XL compiler, for example, shows that it's not the XL front-end that is expensive, it's code generation:
Front-end:
real 0m4.959s user 0m4.511s sys 0m0.394s
Back-end (debug build):
real 0m14.24s user 0m12.76s sys 0m0.82s
Back-end (opt build):
real 0m45.713s user 0m43.837s sys 0m1.275s
> As for the basic structures being extensible, I doubt there are too many things that cannot be modeled with a fair degree of ease in Go. Of course, if a revolutionary concept emerges that replaces the current paradigm of "Object + Action" (which appears to be the most basic abstraction one can do in life) then I'm sure that C, C++, Go, Java, etc. will all become obsolete, where XL may overload into the new model.
The proof of the pudding is that you eat your own dog food :-) In XL, all the basic concepts (if statements, for loops, procedures, basic types) are built on the same extensibility mechanisms. It's with these simple things that you can test whether the mechanism is sound, solid, and flexible. In other words, Go gives you a fish; XL gives you a fishing rod, and one that I have tested can catch the most common kinds of fish.
That being said, I disagree with your statement that there aren't too many things that cannot be modeled in Go. What about Prolog-style back-tracking? What about symbolic differentiation (one of the examples I gave)? What about SIMD instructions? What about Cg-style code for graphics accelerators? What about Ada-like rendez-vous based tasking? Or all the things that have to be "special cases" in Go: sizeof, or pointers, or arrays, or slices, or ... What if I don't like the way Go does strings (I don't, actually)? Can I do my own strings? Can I get the Go compiler to compile regexps at compile-time for me from a string like "[A-Z]+[0-9]"?
Now let's take something simpler. What about text I/O? This last one is interesting. Go has variadics, you write ...string and you get a slice of strings. But with text I/O, you don't want a slice of strings, you want a slice of different things. So Go's variadics cannot be used to implement Go's println. Unless I'm mistaken, print and println are "special" in Go, i.e. you can't write them in Go. IMO, that's a pretty big "thing that cannot be modeled with a fair degree of ease in Go".
> Meta-programming in general can be achieved via a pre-processor taking macros and other special syntax and transforming it into a more normal Go code. This may be added at some point to allow for the sort of overloading that XL does, but it seems unnecessary, as all of the examples shown can be implemented with a more static form in Go, allowing for code to be uniform as well as readable to anyone who read and understood the Go specification (which is pretty short).
Are you sure that you really analyzed all my examples carefully enough?
- WriteLn example: Could you explain, for example, how you would implement println in Go? My example shows how you do that in XL.
- Complex data type example: Go has a complex type, but no matrix and no quaternions. How would you implement these in Go? Assuming you implement a type for quad precision floating point, or a floating-point range type (i.e. where all operations return an upper and lower bound for the result), or a symbolic type, can you leverage Go's "complex" definition to use these types as representation? With XL, you can.
- Basic integer data types: How would you add a DSP fixed-point type to Go, or the kind of SIMD instructions now found in most CPUs (MMX, Altivec, ...)? And, again, how would it integrate with the rest of what you built, e.g. text I/O, complex numbers, matrix operations and the like?
- Compiler plug-ins: You claim it would be possible with a pre-processor. Could you give an example of what the code would look like to implement the d/dx(exp(x)) notation I demonstrated? BTW, for XL, I forgot to give the pointer to the corresponding plug-in, here it is: http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.plugin.differentiation.xl;h=deae41e5fdb44be5ea77f792b8c0cf7b98854505.
What really matters in that link: the whole plug-in is 191 lines of XL code, including 36 lines of comments. And with that simple plug-in, you can write d/dt(sin(omega * t) * exp(-t/t0)) in your code and let the compiler do the work for you. Now, I'd love to see you do that with a simple pre-processor in Go, but quite frankly, I believe that it would be really hard.
And in case you think it doesn't matter, Google "symbolic differentiation in C++"; you should get something like 38000 hits. So there are folks who would like that.
> In the end, those are just my two cents too, but I believe that the producers of Go are searching for simplicity. They've proven this on many occasions, avoiding practically everything that they haven't found a simple and effective method of doing (generics being one of them, because there has not been a simple and effective enough method come up yet).
Actually, it may be hard to believe for people on this list, but I find Go complicated relative to XL :-) That was not my original point, but since you bring it up, I think it's worth sharing some of the reasons:
- Not counting comments, Go has 10 lexical element types (tokens, semicolons, identifiers, keywords, operators/delimiters, integer literals, floating-point literals, imaginary literals, character literals, string literals). XL has 4: integer, real, symbol/name and text.
- Go has three special notations for integer literals: 10, 010 and 0xA. These rules don't apply to floating-point. XL has one special case that works for both integer and real literals: 16#FF, 2#1.1001.
- Go has 20 types that are special to the compiler: 5 uint, 5 int, 3 float, 3 complex, bool, byte, uintptr and string. XL has none.
- Go has special rules for building the following 8 structured types: arrays, slices, structs, pointers, functions, interfaces, maps, channels. XL has "special rules" for 4 kinds of types: functions, iterators, structs and generic types, but it can build all the others in such a way that you wouldn't be able to tell. Furthermore, the same tools would allow me to build bit vectors, disk-based pointers, 1-based arrays, sparse matrices, ... and use them as easily as arrays in Go. Finally, the "special rules" in XL are not that special; they are plug-ins just like the symbolic differentiation plug-in.
- Go has one big set of special rules in the unsafe package. What is this ArbitraryType thing? Is it a generic type in disguise or what?
- Finally, to use a somewhat objective measure of complexity, the Go compiler is 37Kloc (only the sources under gc), the XL compiler 21Kloc of XL (sources under native, see http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=tree;f=xl2/native). Build times are around 38s for go, 30s for XL, and the XL build actually builds 5 different compilers during that time (a bootstrap compiler written in C++, a bootstrap compiler written in simplified XL, 2 test compilers to validate the bootstrap, and an XL compiler written in XL).
Thanks for letting me know about XL, from your description it appears
to have most of the features I hope Go, and any other language I ever
use, will never have.
Now when somebody asks me what I don't want in a programming language,
I know where to point them.
It also seems to be a perfect example what are all the wrong
priorities when designing a language.
uriel
Go is already pretty far down the design path. We have a relatively
stable language spec, two working compilers, and a large amount of
code already written. Adopting these ideas wouldn't require mere
changes to the language, but fundamental differences in design
philosophy from day one.
It seems that languages like XL and the Lisp family already do these
things well. Go is a different kind of language entirely.
> One interesting consequence of taking the problem that way is that
> language designers cannot know which concepts will matter ahead of
> time.
I am curious: could you implement goroutines and channels in XL?
Andrew
That was an inappropriate and rude response to someone who has taken
the time to share their hard work with us. Please be civil when
sharing your opinions here.
Andrew
> Go is already pretty far down the design path. We have a relatively
> stable language spec, two working compilers, and a large amount of
> code already written. Adopting these ideas wouldn't require mere
> changes to the language, but fundamental differences in design
> philosophy from day one.
Maybe so. If that's the case, I'm sorry I shared this too late. But then, I think the parse tree structure may still be simple enough that it can be made public, or something like that. I.e. the Go parser and compiler infrastructure could be made accessible to Go programs. That would be a good first step and go a long way...
> I am curious: could you implement goroutines and channels in XL?
I see no reason why not. Implementing as a compiler plug-in, certainly. Implementing without any meta-programming, I don't think so...
The 'go' semantics implies that you start the thread, then evaluate the argument in that thread. So it can't be implemented with a mere function call (where arguments would be evaluated before the thread starts). There may be a way with lazy evaluation of arguments (similar to the way you implement a lazy "and"), but I don't remember if that works today.
Channel semantics, on the other hand, looks like it can be done entirely with function calls, so that one doesn't need anything special.
Thanks
Christophe
Actually, when you launch a goroutine it's just like a normal function
call. The function's arguments are evaluated in the calling
goroutine, and the function is executed in the new goroutine.
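This evaluation order is easy to check with a small Go program (the helper names here are mine, purely for illustration):

```go
package main

import (
	"fmt"
	"sync"
)

// arg runs in the calling goroutine: a go statement evaluates
// the function's arguments before the new goroutine starts.
func arg() int {
	fmt.Println("argument evaluated in the caller")
	return 42
}

func main() {
	var wg sync.WaitGroup
	wg.Add(1)
	go func(v int) { // only this body runs in the new goroutine
		defer wg.Done()
		fmt.Println("goroutine received", v)
	}(arg()) // arg() is called here, in main's goroutine
	wg.Wait()
}
```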
> Channel semantics, on the other hand, looks like it can be done entirely with function calls, so that one doesn't need anything special.
But how will it perform? Performance is very important to the Go project.
Andrew
>> Channel semantics, on the other hand, looks like it can be done entirely with function calls, so that one doesn't need anything special.
>
> But how will it perform? Performance is very important to the Go project.
By "function calls", I didn't mean to imply an actual call. For example, in XL, A+B where A and B are integers is a function call, to a function declared as:
function Add(X, Y : integer) return integer written X+Y is XL.BYTECODE.add_int
The XL.BYTECODE.add_int means the code generator will look up a file called xl.bytecode for the corresponding code sequence. That could be a function call, a single machine instruction, or something more complicated. So as long as I can steal the code sequence for x <- y from Go, I guess I could have not just channels, but even Go-compatible channels.
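For the "channels as function calls" view, here is a sketch of what such wrappers would look like in Go itself (the send/recv names are my own):

```go
package main

import "fmt"

// send and recv wrap the channel operators as ordinary functions,
// illustrating that channel semantics fits a function-call shape.
func send(c chan int, v int) { c <- v }
func recv(c chan int) int    { return <-c }

func main() {
	c := make(chan int, 1) // buffered, so send won't block here
	send(c, 42)
	fmt.Println(recv(c)) // 42
}
```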
BTW, performance is very important for XL too, although I tend to focus only on runtime performance, not compile-time.
> Thanks for letting me know about XL, from your description it appears
> to have most of the features I hope Go, and any other language I ever
> use, will never have.
Would you mind telling me which features you'd like Go to stay away from? It doesn't matter much to me, obviously, but I'm sure it's useful for the Go designers to know what they should avoid at all cost if they want to keep you...
> Now when somebody asks me what I don't want in a programming language, I know where to point them.
Glad to be of service. Send them my way, maybe some of them will have a different opinion.
> It also seems to be a perfect example what are all the wrong priorities when designing a language.
Could you elaborate a little? What put you off that way? Syntax? Extensibility? Solutions to problems you didn't know existed or don't care about? Talking on golang-nuts about another language in development?
Conversely, what are the priorities in Go you think I should take inspiration from?
> I think you need to use Go a bit more before you start talking about what Go can and cannot do.
That is certainly true. I am absolutely not an expert in Go. I plead guilty (by ignorance).
> For example, if you want to see how you can implement println in Go, just look at fmt.Println: http://golang.org/pkg/fmt/#Println.
Ah, I see. Thanks for answering my question "how would you implement println in Go".
So, is the Go specification out of date? It states "Current implementation provides several built-in functions useful during bootstrapping. These functions are documented for completeness but are not guaranteed to stay in the language." It looks to me like it should mention the right way to do it, which is (if I understood correctly) the fmt package.
Apparently, you can use interface{} to denote an arbitrary type. I didn't know that. If that's the case, I don't clearly understand the difference between ArbitraryType (as declared below) and interface{}, both used in package unsafe:
type ArbitraryType int // shorthand for an arbitrary Go type; it is not a real type
> The use of println is actually discouraged, since it was only added to get Go going while its libraries matured.
So I assume the replacement is the stuff in package fmt. That definitely looks better than what I had understood. It's extensible through the Formatter interface too, which is what matters in this thread. OTOH, I can't say that I'm really fond of the "BigSwitch" on reflective RTTI, or the general complexity of the thing. Regarding simplicity, the XL text I/O module is 115 lines, the Go code is 996, but then it does formatted I/O which the XL module doesn't...
>
> Further, your loop is way too complex for traversing a matrix. A better example is:
>
> for i := range M {
> for j := range M[i] {
>
> }
> }
Ah, yes, this is slightly better. I missed the "range" syntax in the for loop; what I had found was "length and capacity", and there wasn't a function for arrays. In my defense, I did write "pardon me if there is a better way to compute the size of M", though :-)
> This isn't nearly as complicated as what you proposed. In fact, it seems like you intentionally looked for ways of making the loop more complex than it needed to be in order to achieve a false contrast favouring your preference. People normally do this, but your example was ridiculously extreme.
Never attribute to malice that which can adequately be explained by stupidity. I was looking for a range or size operator on arrays, didn't think to look for it in the description of the for loop...
On the other hand, you seem to go out of your way to totally miss my point about extensibility. Replace my example with a loop that iterates over a file directory, over all lines/chars in a file, over all the non-zero elements of a sparse matrix, and so on... Is there any way to factor out such loops in Go?
In C++, there is at least a common idiom, which is
for (iterator i = x.begin(); i != x.end(); ++i).
Some other languages have foreach. It's unclear to me how you achieve the same thing in Go. I will no longer claim you can't :-)
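One Go answer available today is a higher-order function: the container exposes an Each method taking a callback, which factors the traversal out of the loop body. A sketch using a hypothetical Sparse type of my own:

```go
package main

import "fmt"

// Sparse is a hypothetical sparse matrix: only non-zero elements
// are stored, keyed by (row, column).
type Sparse map[[2]int]float64

// Each factors out the traversal: callers supply only the loop body.
func (s Sparse) Each(f func(i, j int, v float64)) {
	for k, v := range s {
		f(k[0], k[1], v)
	}
}

func main() {
	m := Sparse{[2]int{0, 1}: 2.5, [2]int{3, 3}: -1}
	sum := 0.0
	m.Each(func(i, j int, v float64) { sum += v })
	fmt.Println(sum) // 1.5
}
```

It is more machinery than a built-in foreach, but the pattern does compose across container types.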
> There is a place in the world for all different languages; different tools for different jobs.
If I didn't believe that, I wouldn't visit the Go community to say "here are some ideas you might want to borrow", now would I?
Please remember that these are not just vague and unsubstantiated ideas either. I did design a language and implement all the ideas I shared with you, so I know for certain it works. I also shared metrics showing how it actually simplifies compiler design. Remember: the XL compiler is less than 2/3 the size of gc, I believe this means something...
> Go is simple (if you aren't approaching it with a laundry list of expectations that you've been brainwashed into considering essential ;) ) and regular. A single person can learn the whole language in a few days, making it easy to maintain.
I wrote elsewhere why I personally see Go as way too complicated. But that's probably hard to hear when you have been brainwashed with the idea that Go is the final word in simplicity :-)
> Most of the features of XL simply wouldn't be appropriate in a language like Go. Its not that they're not good features, they're just targeted to a different purpose.
Well, I wouldn't have submitted the ideas here if I thought the features were not appropriate for Go. But ultimately, it's really your call (by "you", I mean this community).
Go seems to strive for simplicity, this is where we meet. Here is my philosophy: you don't simplify by removing features, you simplify by removing rules. In my opinion, Go has too many rules, each one of them being too specific. For example, it has several rules explaining arithmetic operators, comparison operators, logical operators, whereas XL has a single rule for expression reduction. Go has four variants of 'for' statement, XL has only one that covers more cases. Go has keywords, XL doesn't. And so on.
But then, my definition of simplicity is not necessarily yours. I'm sharing with you, you don't have to accept it.
> In the end, some people will like having complete control to customize their toolkit, while others want a regular framework to work within. Remember, Go is being made at Google with their own use cases in mind. Businesses, including Google, often put restrictions on how programming languages are used since you really need a lot of structure in order to have a large number of people working well together.
Lisp hackers sometimes refer to this philosophy as "Bondage and discipline programming languages" ;-)
> Are there some things Go needs to work on? Certainly. Does it need to have every feature anyone ever thought of added to it? No.
This is not what I'm advocating. I'm advocating that it should have room for growth, that it should be easy to add features when you need them, that it's better to have a small number of powerful rules than a large number of simple rules.
Right now, there are many interesting things I just can't do in Go, and that bothers me.
> The key selling point in my opinion is that XL doesn't lock you in
> whatever concepts I came up with. That's what I'm arguing Go should
> also have. The title was not "Why Go should have X and Y feature", but
> "Why Go should be extensible."
XL sounds like an interesting language. It also sounds like a language
that is fundamentally different from Go. The goals of XL are entirely
different from the goals of Go; the latter can be found in Go's Language
Design FAQ. Go isn't going to adopt these ideas from XL, because they
do not fit the purpose of the language.
> So, is the Go specification out of date? It states "Current
> implementation provides several built-in functions useful during
> bootstrapping. These functions are documented for completeness but are
> not guaranteed to stay in the language." It looks to me like it should
> mention the right way to do it, which is (if I understood correctly)
> the fmt package.
The language specification describes the language proper and rarely
refers to the libraries. I don't consider the fact that the spec does
not mention the fmt library to indicate that the spec is out of date.
> Apparently, you can use interface{} to denote an arbitrary type. I
> didn't know that. If that's the case, I don't clearly understand the
> difference between ArbitraryType (as declared below) and interface{},
> both used in package unsafe:
>
> type ArbitraryType int // shorthand for an arbitrary Go type;
> // it is not a real type
The type interface{} has a specific runtime representation. The type
*interface{} is a pointer to that representation. The description of
the unsafe package uses ArbitraryType to make clear that what you get
with unsafe.Pointer is a pointer to an actual value of some type, not a
pointer to a value of the type interface{}.
In other words, the type interface{} does not denote an arbitrary type.
It denotes a specific interface type. A value of any other type may be
converted to that specific interface type, a conversion which involves a
change in the way that the value is represented (specifically, the value
is boxed).
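A small Go illustration of that boxing and unboxing (the values are arbitrary):

```go
package main

import "fmt"

func main() {
	// Converting to interface{} boxes the value with its dynamic
	// type; a type assertion recovers (unboxes) the original.
	var x int = 7
	var i interface{} = x // boxing: the representation changes

	n, ok := i.(int) // safe unboxing via type assertion
	fmt.Println(n, ok)

	_, ok = i.(string) // wrong type: ok is false, no panic
	fmt.Println(ok)
}
```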
Ian
I'm not sure the goals are so different. Here is what the FAQ says about creating a language:
>> Go was born out of frustration with existing languages and environments for systems programming. Programming had become too difficult and the choice of languages was partly to blame. One had to choose either efficient compilation, efficient execution, or ease of programming; all three were not available in the same mainstream language. Programmers who could were choosing ease over safety and efficiency by moving to dynamically typed languages
That pretty much sums up why I created XL.
Also, from http://golang.org/doc/talks/go_talk-20091030.pdf, Go is a New, Experimental, Concurrent, Garbage-collected, Systems language. I'm pointing out where I think (personal opinion, nothing more) Go could be improved in the New, Experimental and Systems categories.
Again, my objective is not to convert you to XL, only to share ideas that might make sense for Go. You don't have to take them. You don't have to improve Go; you can consider it the ultimate language design. Or you can wait until it's too late to fix anything. Maybe it's already too late, actually...
> Go isn't going to adopt these ideas from XL, because they do not fit the purpose of the language.
Why is having a standardized parse-tree a bad idea for Go? Why is having an extensible for syntax a bad idea for Go? Since generics are being considered (just "not urgent"), why not learn about a better way to do generics than C++?
Again, don't use these ideas if you don't want to, but don't rule them out too quickly either. Just make sure you understand what you are leaving out.
> The type interface{} has a specific runtime representation [...]
Thanks for the explanation. So interface{} is really a way to get boxing. It's much like C# then.
One stated goal of Go is performance. Naively, boxing and unboxing seem to go against that goal. How does Go solve that? In XL, Write is a generic function, so there's no performance penalty at all (other than code bloat).
> I'm not sure the goals are so different. Here is what the FAQ says about creating a language:
>
>>> Go was born out of frustration with existing languages and
>>> environments for systems programming. Programming had become too
>>> difficult and the choice of languages was partly to blame. One had
>>> to choose either efficient compilation, efficient execution, or ease
>>> of programming; all three were not available in the same mainstream
>>> language. Programmers who could were choosing ease over safety and
>>> efficiency by moving to dynamically typed languages
>
> That pretty much sums up why I created XL.
You should go on to read the second paragraph.
> Again, my objective is not to convert you to XL, only to share ideas
> that might make sense for Go. You don't have to take them. You don't
> have to improve Go; you can consider it the ultimate language
> design. Or you can wait until it's too late to fix anything. Maybe it's
> already too late, actually...
I'm not saying that Go does not have to be improved, nor that it is the
ultimate language design. I'm saying that the ideas you expressed are
not the direction that Go is going to take.
>> Go isn't going to adopt these ideas from XL, because they do not fit
>> the purpose of the language.
>
> Why is having a standardized parse-tree a bad idea for Go?
There is nothing wrong with a standardized parse tree, it's just not
important. As you know there are at least three different parsers for
Go, all implemented differently.
> Why is having an extensible for syntax a bad idea for Go?
* A language with an extensible syntax is not a single language, it is a
family of related languages.
* A language with an extensible syntax is slower to compile, because the
compiler has to integrate a set of plugins which operate on the syntax
tree. I understand that your XL compiler is fast. However, it seems
clear to me that a compiler with plugins will be slower than a
compiler without plugins.
* I don't know how the backend for such a compiler works, but perhaps it
also operates via plugins. That would imply that a language with an
extensible syntax tree is limited in its optimization capabilities,
because it is difficult to optimize interactions between different
plugins.
* When I use a plugin to define a new type, I need to define conversions
to other types. When I use two independent plugins to define two new
types, how do I convert from one type to the other?
* If I want Common Lisp macros, I know where to find them.
> Since
> generics are being considered (just "not urgent"), why not learn about
> a better way to do generics than C++?
I don't see extensible syntax as giving me the same thing as generics.
The first thing everybody wants from generics is efficient type-safe
containers. How do I get that from an extensible syntax?
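For context, the pre-generics Go workaround is an interface{} container with type assertions at the boundary, trading compile-time safety for run-time checks. A minimal sketch (the Stack type is my own illustration):

```go
package main

import "fmt"

// Stack holds interface{} values: any type can be pushed, but the
// concrete type is only checked (by assertion) when popping.
type Stack []interface{}

func (s *Stack) Push(v interface{}) { *s = append(*s, v) }

// Pop removes and returns the top element; it panics if s is empty.
func (s *Stack) Pop() interface{} {
	v := (*s)[len(*s)-1]
	*s = (*s)[:len(*s)-1]
	return v
}

func main() {
	var s Stack
	s.Push(1)
	s.Push(2)
	fmt.Println(s.Pop().(int) + s.Pop().(int)) // 3
}
```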
>> The type interface{} has a specific runtime representation [...]
>
> Thanks for the explanation. So interface{} is really a way to get
> boxing. It's much like C# then.
>
> One stated goal of Go is performance. Naively, boxing and unboxing
> seem to go against that goal. How does Go solve that? In XL, Write is
> a generic function, so there's no performance penalty at all (other
> than code bloat).
Code bloat is a runtime cost. Compiling different code to handle
different types is a compile-time cost. Boxing and unboxing are a
compromise between compile time and run time aimed at getting
acceptable results for both.
Ian
Whilst it may not be important, it is very handy for providing a standardised meta-programming model. My fear with Go is that any and all significant meta-programming will have to be done in a third-party language and hence will be less maintainable and less portable between projects.
I know that experience with the C preprocessor has taught many systems programmers to be suspicious of any kind of meta-programming, but languages which support it well have a considerable advantage in development speed. Measuring my current projects against my C baseline dev speed, Go is a good degree faster, but compared to my Ruby baseline it's still hideously slow. Now that's partly due to a difference in expertise (9 years with Ruby, 8 months with Go, although 22 years of C apparently counts for nothing here), but without meta-programming there's a lot of mechanical repetition in Go that just wouldn't exist in equivalent Ruby code.
This isn't an argument for the wholesale importation of dynamic code generation but the idea of a clean interface to Go's parse tree during compilation is surely worth considering?
>> Why is having an extensible for syntax a bad idea for Go?
>
> * A language with an extensible syntax is not a single language, it is a
> family of related languages.
Well Forth has extensible syntax thanks to direct access to parser input, but it's clearly a language in its own right.
> * A language with an extensible syntax is slower to compile, because the
> compiler has to integrate a set of plugins which operate on the syntax
> tree. I understand that your XL compiler is fast. However, it seems
> clear to me that a compiler with plugins will be slower than a
> compiler without plugins.
Not true. A compiler built on a plugin architecture requires a processing pipeline, and that processing pipeline can be more easily decoupled to scale across cores/processors/boxes.
> * I don't know how the backend for such a compiler works, but perhaps it
> also operates via plugins. That would imply that a language with an
> extensible syntax tree is limited in its optimization capabilities,
> because it is difficult to optimize interactions between different
> plugins.
That could be true, and is indeed one of the questions I've been pondering in the context of GoLightly (because of its extensible instruction-set model). If I come up with any kind of workable solution I'll be more than happy to share.
> * When I use a plugin to define a new type, I need to define conversions
> to other types. When I use two independent plugins to define two new
> types, how do I convert from one type to the other?
I'm assuming there's work that already has to be done to define type conversions in the existing compiler so surely you're just talking about passing the related metadata along the pipeline and using a data format that is itself extensible?
> * If I want Common Lisp macros, I know where to find them.
Hence Greenspun's tenth rule. Many problems need macro-like capabilities but are constrained to a language such as C which lacks them, leading to half-arsed reimplementations of macros to get the job done. The end result is code that's fragile and difficult to maintain but which passes the commercial yardstick of "deployed by an arbitrary deadline". That's a major loss for everyone concerned.
I realise that adding a decent meta-programming model within Go's design constraints is no easy task - it would require statements which were explicitly about compilation and finding an elegant vocabulary which works within Go's other design constraints could take a long time - but if Go's going to stem the brain drain to dynamic languages it really does have to offer one. Otherwise it's going to remain considerably less productive than Python, Ruby or (God help us) Perl - even as it leaves C++ and Java in the dust.
Ellie
Eleanor McHugh
Games With Brains
http://feyeleanor.tel
----
raise ArgumentError unless @reality.responds_to? :reason
On 26 July 2010, at 11:13, Ian Lance Taylor wrote:
> You should go on to read the second paragraph.
Yes, I did. I agree until the point "to meet these goals required". To me, what follows "required" is a very weak proof.
> I'm saying that the ideas you expressed are not the direction that Go is going to take.
Fine. I have little to bring to the table, then.
> There is nothing wrong with a standardized parse tree, it's just not important.
I have a different opinion.
> As you know there are at least three different parsers for Go, all implemented differently.
I don't see why that's a good thing.
> * A language with an extensible syntax is not a single language, it is a family of related languages.
... not much more so than a language with modules, classes or functions.
> * A language with an extensible syntax is slower to compile, because the compiler has to integrate a set of plugins which operate on the syntax tree.
It doesn't have to be: Apache or Firefox are not slow, yet they allow plug-ins.
> I understand that your XL compiler is fast. However, it seems clear to me that a compiler with plugins will be slower than a compiler without plugins.
My experience implementing a compiler with plug-ins tells me otherwise.
> * I don't know how the backend for such a compiler works, but perhaps it also operates via plugins. That would imply that a language with an extensible syntax tree is limited in its optimization capabilities, because it is difficult to optimize interactions between different plugins.
LLVM is a nice counter-example. Plug-ins enable more optimizations, not less.
> * When I use a plugin to define a new type, I need to define conversions to other types. When I use two independent plugins to define two new types, how do I convert from one type to the other?
Explaining XL types is off-topic, but a good plug-in system fosters collaboration between plug-ins. For example, XL generic types and functions share much of the same overload resolution code.
> * If I want Common Lisp macros, I know where to find them.
... which means you don't have the excuse of ignorance when you decide to ignore meta-programming in a new language design today.
> I don't see extensible syntax as giving me the same thing as generics.
It doesn't. XL also implements generics (using plug-ins, but that's beside the point). It's a different approach from C++'s, worth looking at when/if you design Go's generics.
> The first thing everybody wants from generics is efficient type-safe containers. How do I get that from an extensible syntax?
The primary thing you get from C++ generics is, indeed, efficient type-safe containers. With XL generics, you get better text I/O, a better complex type, a better "maximum" function, library-defined pointer and array types, and more.
If Go had started with generics, you wouldn't need a string or map type to be hard-coded in the language.
> Code bloat is a runtime cost. Compiling different code to handle different types is a compile-time cost. Boxing and unboxing are a compromise between compile time and run time aimed at getting good results that are acceptable.
You may be ready to accept it, but I see no reason I would pay the price if I don't have to.
> Disclaimer: Creating animosity is not my objective. Since there seems
> to be a consensus that XL ideas are not too relevant for Go, I'll
> probably refrain from posting further on this list.
I also do not intend to create animosity. I am trying to respond
seriously to your suggestions.
>> As you know there are at least three different parsers for Go, all
>> implemented differently.
>
> I don't see why that's a good thing.
It's not a good thing in itself. But there is an advantage to being
able to parse a language using whatever approach is appropriate for the
task at hand.
>> * A language with an extensible syntax is slower to compile, because
>> the compiler has to integrate a set of plugins which operate on the
>> syntax tree.
>
> It doesn't have to be: Apache or Firefox are not slow, yet they allow plug-ins.
I don't understand how my statement could be false. On the one hand you
have the time required to find and load a set of required plugins. On
the other hand you do not. You can say that the time required is
irrelevant--that could be true--but I don't see how you can say that it
does not exist.
The examples of Apache and Firefox are not immediately relevant because
they start once and run for a long time. A compiler starts many times
and runs for a short time. Startup time matters.
>> * I don't know how the backend for such a compiler works, but perhaps
>> it also operates via plugins. That would imply that a language with
>> an extensible syntax tree is limited in its optimization
>> capabilities, because it is difficult to optimize interactions
>> between different plugins.
>
> LLVM is a nice counter-example. Plug-ins enable more optimizations, not less.
LLVM keeps the IR the same, and permits plugins which optimize the fixed
IR. It seems to me that a language which has extensible syntax must
have an extensible IR, or it must have a low-level IR, or it must have
an extremely general IR. As far as I can see, any of those choices will
inhibit optimizations. What am I missing?
>> * When I use a plugin to define a new type, I need to define
>> conversions to other types. When I use two independent plugins to
>> define two new types, how do I convert from one type to the other?
>
> Explaining XL types is off-topic, but a good plug-in system fosters
> collaboration between plug-ins. For example, XL generic types and
> functions share much of the same overload resolution code.
I'm still curious about the issue. I went to look at the XL docs, but
the entry for "Type System" appears to be as yet unwritten.
>> * If I want Common Lisp macros, I know where to find them.
>
> ... which means you don't have the excuse of ignorance when you decide
> to ignore meta-programming in a new language design today.
But Common Lisp was around when I was in school 20 years ago, and it
hasn't caught on in any serious way. And back when I wrote programs in
Common Lisp, the main effect of macros was to make code nearly
incomprehensible. So it's not, for me, a particularly strong example.
>> Code bloat is a runtime cost. Compiling different code to handle
>> different types is a compile-time cost. Boxing and unboxing are a
>> compromise between compile time and run time aimed at getting good
>> results that are acceptable.
>
> You may be ready to accept it, but I see no reason I would pay the
> price if I don't have to.
What I'm saying is that you always pay a price. You have to choose
which price you are going to pay.
Ian
> This isn't an argument for the wholesale importation of dynamic code
> generation but the idea of a clean interface to Go's parse tree during
> compilation is surely worth considering?
I guess I don't know how to think about that in isolation. What would
it look like in practice? What would it permit you to do that you can't
do today? Since all language choices are tradeoffs, what would be the
downside?
>> * A language with an extensible syntax is slower to compile, because the
>> compiler has to integrate a set of plugins which operate on the syntax
>> tree. I understand that your XL compiler is fast. However, it seems
>> clear to me that a compiler with plugins will be slower than a
>> compiler without plugins.
>
> Not true. A compiler built on a plugin architecture requires a
> processing pipeline, and that processing pipeline can be more easily
> decoupled to scale across cores/processors/boxes.
It's conceptually straightforward to distribute the optimization passes
of a compiler. You can cut up the program into functions and regions
within the functions, optimize them independently, and combine the
results. For example, gcc's WHOPR project does this at the function
level. However, an extensible syntax implies plugins which affect the
parser. It is far less straightforward to distribute the parsing phase
of a compiler since, conceptually, what you see on line 10 can affect
how you parse line 20.
>> * When I use a plugin to define a new type, I need to define conversions
>> to other types. When I use two independent plugins to define two new
>> types, how do I convert from one type to the other?
>
> I'm assuming there's work that already has to be done to define type
> conversions in the existing compiler so surely you're just talking
> about passing the related metadata along the pipeline and using a data
> format that is itself extensible?
But consider type conversions like Go's string to []int conversion.
It's hard for me to imagine any plausible type metadata which can define
that conversion in any way other than procedurally. And a procedural
type conversion can only be defined for known types, not for unknown
ones.
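To make Ian's example concrete, here is roughly what that procedural conversion has to do, sketched in Go itself (a hand-rolled equivalent of string to []int, not the compiler's actual implementation):

```go
package main

import "fmt"

// string -> []int is inherently procedural: it decodes UTF-8, so the
// number of resulting code points can differ from the number of bytes.
// That behaviour can't be captured by declarative type metadata alone.
func runesOf(s string) []int {
	var out []int
	for _, r := range s { // ranging over a string decodes UTF-8
		out = append(out, int(r))
	}
	return out
}

func main() {
	s := "héllo"
	fmt.Println(len(s), len(runesOf(s))) // 6 bytes, 5 code points
}
```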
> I realise that adding a decent meta-programming model within Go's
> design constraints is no easy task - it would require statements which
> were explicitly about compilation and finding an elegant vocabulary
> which works within Go's other design constraints could take a long
> time - but if Go's going to stem the brain drain to dynamic languages
> it really does have to offer one. Otherwise it's going to remain
> considerably less productive than Python, Ruby or (God help us) Perl -
> even as it leaves C++ and Java in the dust.
Well, I'm not personally convinced. But I would certainly be interested
in seeing a proposal that fits in with the existing Go language.
Ian
However, I do not think such facilities jibe with Go. Go excels at
allowing clean, clear, crisp expression of your code, moreso than I've
seen in any metaprogramming-free language. There are a lot of areas
where metaprogramming pays huge dividends, but system programming is
not, imo, one of them. There's a lot to be said for what you see is
what you get when you're working at the lower levels.
One of the reasons I hate C++ is that I have to read three pages of
code to comprehend the effect of what should be a simple expression.
There's an argument for that being because all of C++'s fancy
mechanisms are being jammed into a relatively strict syntactic
framework, but that's another reason I'd like to avoid trying to pour
too much coffee into one mug, so to speak (that metaphor's not very
apt but I could use some coffee so it's all I can think about at the
moment).
In sum, I'm one metaprogrammer who's quite happy with Go's lack
of metaprogramming facilities. Different tools for different jobs.
> I also do not intend to create animosity. I am trying to respond seriously to your suggestions.
I know, but you are probably right that this is not the direction Go is going. So if I'm not contributing, I will just shut up.
>> It doesn't have to be: Apache or Firefox are not slow, yet they allow plug-ins.
>
> I don't understand how my statement could be false. On the one hand you have the time required to find and load a set of required plugins. On the other hand you do not. You can say that the time required is irrelevant--that could be true--but I don't see how you can say that it does not exist.
On modern OSes, it doesn't make much of a difference if you load 1MB of code from a shared library or from a.out. In both cases, the OS doesn't actually load anything until you start executing it, it's all demand-paged. So load time is similar, assuming you have the same amount of code.
That being said, XL plug-ins are currently configured statically. The only cost is a small initialization cost to add the various plug-ins to the main symbol table.
The real cost in the case of XL is not the plug-ins, but the generalized rewrite rules. Everything is in the same symbol table: functions, generics, and standard statements like "if then else". The positive side is that the rewrite engine tends to always be in the cache. The negative side is that you can't special-case integer operations to make them faster; you go through overload resolution even for 1+1.
> The examples of Apache and Firefox are not immediately relevant because they start once and run for a long time. A compiler starts many times and runs for a short time. Startup time matters.
Agreed. But again, nothing prevents you from making plug-ins a compile-time option of the compiler. In other words, plug-ins mean only that you can add or remove them, not that you have to load them dynamically.
>>> * I don't know how the backend for such a compiler works, but perhaps
>>> it also operates via plugins. That would imply that a language with
>>> an extensible syntax tree is limited in its optimization
>>> capabilities, because it is difficult to optimize interactions
>>> between different plugins.
>>
>> LLVM is a nice counter-example. Plug-ins enable more optimizations, not less.
>
> LLVM keeps the IR the same, and permits plugins which optimize the fixed IR. It seems to me that a language which has extensible syntax must have an extensible IR, or it must have a low-level IR, or it must have an extremely general IR. As far as I can see, any of those choices will inhibit optimizations. What am I missing?
XL has a simple IR, 8 node types in all. Rewrite rules are based on the "shape" of the tree, not on its "kind". In other words, I don't have an "if statement" node, I have an entry in the rewrite table for rewriting a tree with the shape "if X then Y else Z".
This enables optimizations. I can for example decide that "if false then Y else Z" is Z and ignore Y. The XL compiler actually does that, to the point where Y may contain completely bogus code. See line 20 in the "case" statement example I shared earlier, http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/TESTS/08.Aggregates/case_statement.xl;h=e01a88586baf8afd3d2fcc76b4d3159002a9bae7. This means I can use "if" statements in XL where we would use #ifdef in C++, and "switch" statements where we would use #elif chains.
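For comparison, Go gets a related (if weaker) effect from ordinary constant folding rather than tree rewriting: a branch on a constant false is dropped from the generated code, which is the idiomatic Go substitute for #ifdef - though unlike the XL example above, the dead branch must still be legal, type-checked code.

```go
package main

import "fmt"

// Compile-time configuration via a constant: the compiler folds the
// condition and eliminates the dead branch, but that branch must still
// type-check (no "completely bogus code" allowed, unlike #ifdef).
const debug = false

func mode() string {
	if debug {
		return "debug" // dead code, removed by the compiler
	}
	return "release"
}

func main() {
	fmt.Println(mode()) // release
}
```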
Drifting a little too much towards XL, sorry...
>> Explaining XL types is off-topic, but a good plug-in system fosters
>> collaboration between plug-ins. For example, XL generic types and
>> functions share much of the same overload resolution code.
>
> I'm still curious about the issue. I went to look at the XL docs, but the entry for "Type System" appears to be as yet unwritten.
Well, if you ask me about documentation, you are going to corner me very quickly :-( XL documentation: lousy.
To answer the question, types in XL are attributes attached to some particular nodes of the parse tree. There is a general type system (http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.semantics.types.xs;h=c002f80183ec572911885d722866b23babd45490;hb=HEAD) and then other components in the compiler will derive from that, e.g. for generics (http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.semantics.types.generics.xs;h=ce21b32c1a5eaf9942d27a9ba976caf4f37e96a6)
So in the compiler, there is a single "type" representation, with derived classes for function types, generic types, and so on. Nothing really fancy there. The only aspect where the "plug-in" architecture comes up is that you can attach any other data to tree nodes, not just types, see http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.semantics.types.xl;h=c1827a22932ddc3339e30cf2a89039b38d68839c;hb=HEAD#l56.
Not sure if I'm answering the question, but for more details, I think it would be more appropriate to discuss that off-list.
> But Common Lisp was around when I was in school 20 years ago, and it hasn't caught on in any serious way. And back when I wrote programs in Common Lisp, the main effect of macros was to make code nearly incomprehensible. So it's not, for me, a particularly strong example.
Most programmers would agree with you (and I do). But I don't think the criticism applies to meta-programming in general, just to the approach taken in Lisp. You can see XL as having the objective of getting meta-programming right. Maybe I didn't succeed, but I can say with some confidence that, at the very least, XL's meta-programming abilities don't make code less readable.
Consider the following code:
function IsGenericType (tp : any_type) return boolean
function IsGenericDeclaration (decl : declaration) return boolean
Does that seem unreadable to you? Yet there is not a single keyword in there, and the internal representation of that code is extremely Lisp-ish:
ddd[git-xl] native> ./nxl -parse /tmp/glop.xl -style debug -show
(infix <CR>
(prefix
function
(infix return
(prefix
IsGenericType
(block( )
(infix :
tp
any_type)))
boolean))
(prefix
function
(infix return
(prefix
IsGenericDeclaration
(block( )
(infix :
decl
declaration)))
boolean)))
And here is the code that recognizes the structure in the XL compiler: http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.semantics.functions.xl;h=1a5d13bfb1086b27fe08a1deebb3d92436b9f510;hb=HEAD#l371
371 translation XLDeclarations
372 // ------------------------------------------------------------------------
373 // Translation of function declarations
374 // ------------------------------------------------------------------------
375 when
376 function 'FunctionStuff'
377 then
378 return EnterFunction (FunctionStuff)
> What I'm saying is that you always pay a price. You have to choose which price you are going to pay.
I believe I differ from the Go design philosophy in that I'm willing to pay a little more in terms of compile time in return for much improved control over what kind of code I can generate in the end. If my machine has Altivec or MMX, I'd prefer a compiler that lets me take advantage of that.
That's my choice, I understand it's not the design choice in Go and I respect that. I'm trying to see if there's stuff in what I did that benefits Go. If not, sorry.
Startup time is certainly relevant, but is always a balance between how much work is involved in loading additional code and how much time that additional code saves on the overall operation of the tool concerned. Except for trivial cases this balance is not something that can easily be determined in advance, so the question is whether or not a compiler is a trivial case...
There is also an assumption in your argument that plugin loading will be a serial process with all plugins loaded at startup, however these architectural choices are themselves open to discussion. It might be just as reasonable to run each plugin as a separate process connected over pipes/sockets and leave frequently used plugins as long-running servers.
>>> * I don't know how the backend for such a compiler works, but perhaps
>>> it also operates via plugins. That would imply that a language with
>>> an extensible syntax tree is limited in its optimization
>>> capabilities, because it is difficult to optimize interactions
>>> between different plugins.
>>
>> LLVM is a nice counter-example. Plug-ins enable more optimizations, not less.
>
> LLVM keeps the IR the same, and permits plugins which optimize the fixed
> IR. It seems to me that a language which has extensible syntax must
> have an extensible IR, or it must have a low-level IR, or it must have
> an extremely general IR. As far as I can see, any of those choices will
> inhibit optimizations. What am I missing?
Language syntax has very little to do with IR, which should be a representation of the program subsequent to parsing. So whilst syntax extensibility would complicate a language's parser pipeline, probably requiring multiple parses to generate a complete parse tree, there's no specific reason why that should have any impact at all on the IR used to then manipulate and optimise the parse tree. Forth for example exposes the parse buffer for manipulation in its outer interpreter which in principle allows arbitrarily complex syntax rules to be added to the language without having any impact at all on how the resulting low-level representation of the language is executed.
Or look at Ruby, which has syntax rules which are sufficiently generic to allow new constructs to be added through meta-programming in a manner not entirely dissimilar to Lisp macros. Indeed one of the underlying principles of Ruby is that good syntax produces readable - and hence maintainable - code.
>>> * If I want Common Lisp macros, I know where to find them.
>>
>> ... which means you don't have the excuse of ignorance when you decide
>> to ignore meta-programming in a new language design today.
>
> But Common Lisp was around when I was in school 20 years ago, and it
> hasn't caught on in any serious way. And back when I wrote programs in
> Common Lisp, the main effect of macros was to make code nearly
> incomprehensible. So it's not, for me, a particularly strong example.
Lisp is a language for the few not the many, by which I mean most people are actually quite skilled at using languages with rich syntax (human languages being inclined that way) and find the idea of working with raw parse trees unappealing. I personally loathe the form of Lisp but greatly admire its flexibility and as a result use many of the same tricks in my Ruby code that macros in Common Lisp and Scheme allow. But the situation with all these languages (as with Forth) is rather different in that they all include a compilation mode as part of their natural runtime environment.
Go is determinedly single-pass and that means meta-programming features would probably take a subtly different form. Indeed arguably Go would make a good compilation target for programs written in those languages...
The main advantage would be in adding syntactic optimisations so that fewer lines of code could achieve the same ends, and do so in a manner which would reduce the cost of validating and verifying that code. For an idea of how this might work in practice take a look at http://github.com/coatl/rubymacros which is a Lisp-like macro system for manipulating Ruby parse trees.
>>> * A language with an extensible syntax is slower to compile, because the
>>> compiler has to integrate a set of plugins which operate on the syntax
>>> tree. I understand that your XL compiler is fast. However, it seems
>>> clear to me that a compiler with plugins will be slower than a
>>> compiler without plugins.
>>
>> Not true. A compiler built on a plugin architecture requires a
>> processing pipeline, and that processing pipeline can be more easily
>> decoupled to scale across cores/processors/boxes.
>
> It's conceptually straightforward to distribute the optimization passes
> of a compiler. You can cut up the program into functions and regions
> within the functions, optimize them independently, and combine the
> results. For example, gcc's WHOPR project does this at the function
> level. However, an extensible syntax implies plugins which affect the
> parser. It is far less straightforward to distribute the parsing phase
> of a compiler since, conceptually, what you see on line 10 can affect
> how you parse line 20.
I'm not suggesting it's easy, but it certainly can be done. One obvious constraint is that syntax changes have to be localised rather than global in impact, but Go already has a package mechanism which prevents circular dependencies so having pluggable syntax within a package would certainly be plausible.
>>> * When I use a plugin to define a new type, I need to define conversions
>>> to other types. When I use two independent plugins to define two new
>>> types, how do I convert from one type to the other?
>>
>> I'm assuming there's work that already has to be done to define type
>> conversions in the existing compiler so surely you're just talking
>> about passing the related metadata along the pipeline and using a data
>> format that is itself extensible?
>
> But consider type conversions like Go's string to []int conversion.
> It's hard for me to imagine any plausible type metadata which can define
> that conversion in any way other than procedurally. And a procedural
> type conversion can only be defined for known types, not for unknown
> ones.
One of the reasons I keep arguing for a fundamental bitstring data type is that all higher-level in-memory types can be phrased in terms of such a type. A string can then easily be expressed as a bitstring plus associated meta-data, and the fact that an int is x bytes long makes it very easy to identify the equivalence of []int with []byte - including the rewriting of bounds to suit the bitstring type being expressed, as opposed to the one originally used for storage.
This after all is the whole point of Turing machines - they're all about operations on bitstrings ;)
>> I realise that adding a decent meta-programming model within Go's
>> design constraints is no easy task - it would require statements which
>> were explicitly about compilation and finding an elegant vocabulary
>> which works within Go's other design constraints could take a long
>> time - but if Go's going to stem the brain drain to dynamic languages
>> it really does have to offer one. Otherwise it's going to remain
>> considerably less productive than Python, Ruby or (God help us) Perl -
>> even as it leaves C++ and Java in the dust.
>
> Well, I'm not personally convinced. But I would certainly be interested
> in seeing a proposal that fits in with the existing Go language.
As would I :)
what about:
type X struct {i int; left, right *X}
?
i get the feeling that your bitstring data type would apply
only to data types that are contiguous in memory,
which leaves out several of Go's fundamental data
types (map, interface, slice, channel)
Are you arguing that something about this data type leaves it incompatible with a Turing-complete representation?
> i get the feeling that your bitstring data type would apply
> only to data types that are contiguous in memory,
> which leaves out several of Go's fundamental data
> types (map, interface, slice, channel)
These may be fundamental types from the perspective of a Go programmer, but from the perspective of memory layout they're a composite of bitstring types and can in that sense be treated as an abstract type separate from any in-memory representation. Hence the caveat in my original statement.
i'm not sure i understand. what conversions would you allow on
bitstring types? in what way would they be different to the way
the "unsafe" package allows conversion of any type to
any other type?
> Language syntax has very little to do with IR, which should be a
> representation of the program subsequent to parsing.
But when the language is generic to the point of permitting you to
define new numeric types, those types must be represented in the IR.
Ian
The point isn't what conversions I'd allow on bitstrings; it's self-evident that, being a dimensioned array of bits, they can be manipulated in a similar manner to vectors without too many conceptual headaches. Of much more interest is the fact that all data types which are represented by allocated memory are in fact examples of bitstrings, and by embracing that, the pressure to keep adding basic types to the language core becomes much less pronounced.
Any Turing complete language is capable of expressing anything that's computable, so assuming a Turing complete IR I find it hard to understand why this should be a significant problem. More practically, I can quite happily define new numeric types in both Ruby and Forth without placing any strains on their respective IRs at all - and as MacRuby runs on LLVM by definition the IR of LLVM is itself capable of supporting new numeric types.
Now it's quite possible these numeric types will be represented less efficiently than numeric types which have hard-coded IR forms, but that's more about the inflexibility of the particular IR than it is about the fundamental inability of an IR to support new numeric types.
it's not self-evident to me. how is a bitstring different from []byte?
> On 26 Jul 2010, at 19:54, Ian Lance Taylor wrote:
>> Eleanor McHugh <ele...@games-with-brains.com> writes:
>>
>>> Language syntax has very little to do with IR, which should be a
>>> representation of the program subsequent to parsing.
>>
>> But when the language is generic to the point of permitting you to
>> define new numeric types, those types must be represented in the IR.
>
>
> Any Turing complete language is capable of expressing anything that's
> computable, so assuming a Turing complete IR I find it hard to
> understand why this should be a significant problem. More practically,
> I can quite happily define new numeric types in both Ruby and Forth
> without placing any strains on their respective IRs at all - and as
> MacRuby runs on LLVM by definition the IR of LLVM is itself capable of
> supporting new numeric types.
>
> Now it's quite possible these numeric types will be represented less
> efficiently than numeric types which have hard-coded IR forms, but
> that's more about the inflexibility of the particular IR than it is
> about the fundamental inability of an IR to support new numeric types.
My original point in this mini-subthread was that an extensible syntax
is difficult to optimize. So I think we agree.
Of course you can represent anything in an IR via a set of function
calls or complex operations, but the optimizers can't do anything useful
with that. E.g., they can't implement constant subexpression
elimination.
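A concrete illustration of the difference (with a hypothetical function name): for a built-in operator the compiler knows the algebra of the expression and can fold the repetition; behind an opaque call it must assume side effects unless it can prove purity.

```go
package main

import "fmt"

// For x*3 + x*3 the compiler can compute x*3 once (subexpression
// elimination). For triple(x) + triple(x) it can only do the same if
// it can see into triple and prove it pure - which, for an operation
// supplied by an unknown plugin, it generally cannot.
func triple(x int) int { return x * 3 }

func main() {
	x := 7
	a := x*3 + x*3             // folded to one multiply
	b := triple(x) + triple(x) // two calls unless inlined and proven pure
	fmt.Println(a, b)          // 42 42
}
```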
Ian
Well []byte is [][8]bit, so in one sense it isn't. However, in terms of which is the fundamental concept, []byte can be built atomically from []bit but the converse isn't true.
I would caveat that with a "might be", but certainly to support extensible syntax there would need to be some thought about how that would be reflected in the IR and what the implications for optimisation would be.
> Of course you can represent anything in an IR via a set of function
> calls or complex operations, but the optimizers can't do anything useful
> with that. E.g., they can't implement constant subexpression
> elimination.
That's not a good example as the concept of "constant" is different from the concept "numeric" and there's no reason constancy couldn't be asserted based on criteria known to a plugin - such as lack of memory side effects. However I agree that as with everything computational there are tradeoffs at work here. Given the size of the Go dev team it makes little sense to be innovating with the IR at this time, but I hope that won't always be the case as extensible syntax implemented in a Goish style could be a very useful feature.
that's true. however all other types in the language are represented
in units of bytes,
so []byte should be sufficient, surely.
i just fail to see how your bitstring idea would help in solving the
problems you'd like it to solve (e.g. reducing the pressure to
add basic types to the language)
is there another language that uses the same concept? (i found
a "bitstring" library for python but it just seems to be a way of manipulating
bit vectors)
Should booleans be represented by bytes? And what about the many communications protocols which pack multiple bitfields into a byte or a word? Currently to handle the (de)compression involved in handling those requires ugly and potentially error-prone bitwise operations which could more properly be subsumed into the compiler and exposed via a more natural form.
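For a concrete sketch - using an invented one-byte layout with a 3-bit version, a 1-bit flag and a 4-bit length, purely for illustration - the manual shifting and masking looks like:

```go
package main

import "fmt"

// An invented one-byte header layout, purely for illustration:
// bits 7-5: version (3 bits), bit 4: flag (1 bit), bits 3-0: length (4 bits).

func pack(version, flag, length uint8) uint8 {
	return (version&0x7)<<5 | (flag&0x1)<<4 | length&0xF
}

func unpack(b uint8) (version, flag, length uint8) {
	return b >> 5, (b >> 4) & 0x1, b & 0xF
}

func main() {
	b := pack(5, 1, 12)
	v, f, l := unpack(b)
	fmt.Printf("%08b -> %d %d %d\n", b, v, f, l)
}
```

Every shift amount and mask here is a hand-maintained magic number; get one wrong and the bug only shows up as corrupt data on the wire.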
> i just fail to see how your bitstring idea would help in solving the
> problems you'd like it to solve (e.g. reducing the pressure to
> add basic types to the language)
Because the vast majority of basic types are nothing more than a way of tagging bitstrings as either integers or floating-point values of a given resolution so they can be loaded into registers or stored in words. Having bitstrings provides a way of generalising these tags without requiring a plethora of basic types.
> is there another language that uses the same concept? (i found
> a "bitstring" library for python but it just seems to be a way of manipulating
> bit vectors)
Wikipedia has a nice overview of some of the typical uses of bitstrings [1] which also covers language support, although it misses out ASN.1 where bitstrings are a basic type. Bitstrings are also one of the approaches used to implement arbitrary precision maths where their compressibility can make for a compact representation.
Ellie
[1] http://en.wikipedia.org/wiki/Bit_array
the hardware can operate at byte-level granularity. sure, using a byte
to represent a boolean is inefficient in space, but it's a lot more
efficient in time.
> And what about the many communications protocols which pack
> multiple bitfields into a byte or a word? Currently to handle the
> (de)compression involved in handling those requires ugly and
> potentially error-prone bitwise operations which could more properly
> be subsumed into the compiler and exposed via a more natural form.
communications protocols are about communicating
with the outside world, not about how types are represented
internally to the language. it would be straightforward to
make a Go package automatically pack fields together when
packing a structure for external consumption - in fact, i presume
that the protobuf implementation already does.
>> i just fail to see how your bitstring idea would help in solving the
>> problems you'd like it to solve (e.g. reducing the pressure to
>> add basic types to the language)
>
> Because the vast majority of basic types are nothing more than a
> way of tagging bitstrings as either integers or floating-point values
> of a given resolution so they can be loaded into registers or stored
> in words. Having bitstrings provides a way of generalising these tags
> without requiring a plethora of basic types.
how do pointers fit in here?
(sorry if i'm coming across as a bit short here - i'm genuinely interested
to understand what you're driving at...)
That's an argument about the code a compiler should generate, not about the semantics of the language from which it generates that code. As a programmer I care about the latter and trust compiler writers to care about the former.
Ellie
No it isn't. Testing whether a bit is set or absent can usually be performed with just a couple of instructions (depending on whether it has to be done with a loaded bitmask or can be performed directly by a machine op) and being able to keep associated flags packed into the same machine word allows for a much higher associativity with both machine registers and the cache. That can sufficiently reduce memory access times that for critical sections which rely heavily on boolean sets their performance outstrips the byte-per-flag alternative by a substantial margin.
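A sketch of what I mean, with invented flag names: a whole combination of packed flags is tested with one mask and one compare, where byte-per-flag booleans would need a separate load and test each.

```go
package main

import "fmt"

// Invented flag names, packed into a single word.
const (
	Ready uint32 = 1 << iota
	Valid
	Dirty
	Locked
)

func main() {
	flags := Ready | Dirty
	// One AND and one compare test a whole combination of flags.
	if flags&(Ready|Dirty) == Ready|Dirty {
		fmt.Println("ready and dirty")
	}
	if flags&Locked == 0 {
		fmt.Println("not locked")
	}
}
```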
The only real advantage I can think of to using a whole byte for a boolean is that if you ever have to write a cosmic ray degradation conformance document you can describe it as an eight-way redundant data correction strategy. However by the time you get involved in that kind of pseudo-engineering you're far past concerns over either space or time efficiency.
>> And what about the many communications protocols which pack
>> multiple bitfields into a byte or a word? Currently to handle the
>> (de)compression involved in handling those requires ugly and
>> potentially error-prone bitwise operations which could more properly
>> be subsumed into the compiler and exposed via a more natural form.
>
> communications protocols are about communicating
> with the outside world, not about how types are represented
> internally to the language. it would be straightforward to
> make a Go package automatically pack fields together when
> packing a structure for external consumption - in fact, i presume
> that the protobuf implementation already does.
Actually communications protocols are about ensuring that multiple disjoint systems share the same view of both a problem domain and a specific data space. Type is very much at the core of delivering this. If I have a field that can only ever contain a 3-bit integer and should only ever contain a 3-bit integer then it seems thoroughly reasonable that a 3-bit integer representation be used for that.
I'm well aware of the technical difficulties in handling the manipulation of that 3-bit integer correctly because I've spent a large part of my career having to do this sort of thing manually. I consider that a language design failure for many of the languages I've had to use because the code that results could just as easily be generated automatically by a compiler, leaving me to focus on the problems that are really of interest.
>>> i just fail to see how your bitstring idea would help in solving the
>>> problems you'd like it to solve (e.g. reducing the pressure to
>>> add basic types to the language)
>>
>> Because the vast majority of basic types are nothing more than a
>> way of tagging bitstrings as either integers or floating-point values
>> of a given resolution so they can be loaded into registers or stored
>> in words. Having bitstrings provides a way of generalising these tags
>> without requiring a plethora of basic types.
>
> how do pointers fit in here?
Well pointers can be many things, depending on context. We generally model them as word-sized unsigned integers because of the desire to present memory as a single cohesive space. Viewed at that level a pointer is just an n-bit value that happens to correspond to the location of a machine word. One of the things I like in Go is that pointers take that basic premise seriously: you can hold their value in a uintptr because it happens to be an appropriately-sized integer, but you can't do math on a pointer without deliberately converting it to that numeric type - and when you do there is no guarantee that the resulting number can itself be converted back to a valid pointer.
In the general case though there's no reason why, from a language perspective, a pointer should be defined as having a fixed-size representation; that way they can be used to reference fully disjoint memory spaces - such as disk storage, network resources, or even multiple memory spaces which require addressing outside of the standard word-size boundary.
> (sorry if i'm coming across as a bit short here - i'm genuinely interested
> to understand what you're driving at...)
Well that's probably not an insight I can share in the absence of alcohol... it's one of "those" engineering things and a lot of my thinking has been shaped by particular problems I've dealt with using assembler and C on embedded systems.
Ellie
Languages tend to gravitate to where they make people's lives easier, and Go does have a bare-metal ARM port which would be very appealing for a large segment of the embedded world. C/C++ suck at both concurrency and type safety, as well as being relatively slow languages to develop in. Type safety in particular is a big selling point as there's often no possibility of fixing bugs in embedded kit short of replacing entire board assemblies, and that's an expensive process.
> In the worst case scenario, make your bit strings out of int8 and 1, 2, 4, 8, 16, 32 as your flags. It's pretty easy to work with, especially when you make a constant mask for those, and you don't get the performance hit of packing and unpacking an int3 into something the machine likes.
This is pretty much what a large number of programmers do on a daily basis in the embedded world, including the packing and unpacking. It's error-prone precisely because the resulting code is mechanical in nature and could just as easily be written by a compiler, which would also make a better job of register assignment and could optimise the packing/unpacking cycle much more consistently than tends to be the case with manual code.
> Also, how would the compiler take care of that performance hit for you, without automatically converting int3 into int8, and thus leaving you with no idea of the storage space?
The storage space matters in memory or on disk, but it's no longer so important once data is in the cache or loaded into registers - in much the same way that most modern hardware is only word addressable but you can still load those words into registers and perform byte operations on them without having to consciously think of them as truncated word operations.
Actually, I think it's better to simply illustrate this with code. http://xlr.git.sourceforge.net/git/gitweb.cgi?p=xlr/xlr;a=blob;f=xl2/native/xl.semantics.instructions.xl;h=6c1ccad5576f905f8e41f2456e31d1e95f04f3a2;hb=HEAD#l617 shows how the XL compiler recognizes various instructions.
In terms of role, the code shown above is roughly similar to the combination of the parse code generated in http://golang.org/src/cmd/gc/go.y (which parses the tokens and builds the typed if statement), and the code that later processes the typed nodes (grep for OIF in the source code).
With type-based trees, we have something like:
switch(n->op) {
...
case OIF:
... code dealing with if
With shape-based processing, we have instead:
617 translation XLSemantics
637 when
638 if 'Cond' then 'TrueCase'
639 then
640 return DoIf(Cond, TrueCase, nil)
It's really a matter of taste.
Christophe