x = "some string"
x is a reference to "some string"
foo(x)
Reference is passed to function.
In foo:
x += " change"
Strings are immutable, so x in foo() now points to a different string
than x outside foo().
Right?
Back outside foo.
x = ["some string"]
x is a reference to a list whose first element is a reference to a
string.
foo(x)
Within foo:
x[0] += " other"
Another string is created, the first element of x is modified to point
to the new string and back outside foo(), x[0] will point to the new
string.
Right?
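Both claims can be checked directly. A minimal sketch (modern Python 3 syntax; the helper names are made up for illustration):

```python
def rebind(s):
    # Augmented assignment on a string builds a new object and
    # rebinds only the local name; the caller's name is untouched.
    s += " change"
    return s

def mutate(lst):
    # Assigning to lst[0] mutates the list object itself, which
    # the caller also references.
    lst[0] += " other"

x = "some string"
rebind(x)           # x outside is unaffected

y = ["some string"]
mutate(y)           # y[0] now refers to the new string
```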
> Is the following correct?
>
> [lots of references to "references"]
All good so far.
> x[0] += " other"
>
> Another string is created, the first element of x is modified to point
> to the new string and back outside foo(), x[0] will point to the new
> string.
Change these to talk about "references" again and it'll be true also:
"Another string is created, the first element of x now refers to
the new string and back outside foo(), x is still a reference to
the same list (so its first element is a reference to the same
string)."
> Right?
Right. In Python, all names, and all elements of container objects,
are references to the corresponding objects. Python has no concept of
"pointers" in the style of C-like languages.
--
\ "I fly Air Bizarre. You buy a combination one-way round-trip |
`\ ticket. Leave any Monday, and they bring you back the previous |
_o__) Friday. That way you still have the weekend." -- Steven Wright |
Ben Finney
Somewhat colloquial/abbreviated. x is a reference. It does not have
elements. You mean "... the first element of the list to which x
refers is modified ...".
> to the new string and back outside foo(), x[0] will point to the new
> string.
>
> Right?
Close enough.
Right?
I'd rephrase that as:
* Both the global context and the inside of foo see the same list
* They can therefore both update the list
* If a new string is put in the first element of the list, they can
both see the same new string.
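Those three bullets can be sketched directly (the name `foo` is just illustrative): both scopes hold references to one and the same list object, so an update through either is visible to both.

```python
shared = ["some string"]

def foo(lst):
    # lst and shared are two references to the same list object.
    same_object = lst is shared
    lst[0] = "new string"   # the update is visible through both names
    return same_object

same = foo(shared)
```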
> Right?
You know you can get Python to answer your question - yes? Might be slightly
more illuminating than twisting round English... :-)
OK, you're passing in a string in a list. You have 2 obvious ways of doing
that - either as an argument:
def foo(y):
    y[0] += " other"
    print id(y[0])
... or as a global: (which of course you wouldn't do :-)
def bar():
    global x
    x[0] += " another"
    print id(x[0])
So let's see what happens.
>>> x = ["some string"] # create container with string
>>> x[0] # Check that looks good & it does
'some string'
>>> id(x[0]) # What's the id of that string??
3082578144L
>>> foo(x) # OK, foo thinks the new string has the following id
3082534160
>>> x[0] # Yep, our x[0] has updated, as expected.
'some string other'
>>> id(x[0]) # Not only that the string has the same id.
3082534160L
>>> bar() # Update the global var, next line is new id
3082543416
>>> x[0] # Check the value's updated as expected
'some string other another'
>>> id(x[0]) # Note that the id is the same as the output from bar
3082543416L
Does that perhaps answer your question more precisely ?
Michael.
x is a name bound to a string object with value 'some string'.
Some people find it useful to call that a 'reference', as you seem to have.
Others get confused by that viewpoint. It depends on exactly what one means
by 'reference'.
| foo(x)
|
| Reference is passed to function.
The first parameter name of foo gets bound to the object referred to by
'x'.
Calling that 'passing by reference' sometimes misleads people as to how
Python behaves.
| In foo:
| x += " change"
|
| Strings are immutable, so x in foo() now points to a different string
| than x outside foo().
| Right?
A function local name x has no particular relationship to a global name
spelled the same, except to confuse things. Best to avoid when possible.
The effect of that statement would be the same outside of the function as
well, pretty much for the reason given. In general, 'y op= x' is the same
as 'y = y op x' except for any side-effects of expression y. Lists are an
exception.
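That distinction can be shown with an alias, no id() needed (a Python 3 sketch): for strings, `+=` builds a new object and rebinds the name; for lists, `+=` mutates the same object in place.

```python
s = "ab"
s_alias = s
s += "c"                  # builds a new string, rebinds s only
string_rebound = s_alias == "ab" and s == "abc"

lst = [1, 2]
lst_alias = lst
lst += [3]                # list.__iadd__ extends the same object
list_in_place = lst_alias is lst and lst_alias == [1, 2, 3]
```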
tjr
Sort-of, but I would say that it's misleadingly correct. Try this:
http://starship.python.net/crew/mwh/hacks/objectthink.html
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
"Typing is cheap. Thinking is expensive." --Roy Smith
... and for bonus marks, explain why the "global x" in this function
is not required.
--
\S -- si...@chiark.greenend.org.uk -- http://www.chaos.org.uk/~sion/
"Frankly I have no feelings towards penguins one way or the other"
-- Arthur C. Clarke
her nu becomeþ se bera eadward ofdun hlæddre heafdes bæce bump bump bump
Sion Arrowsmith wrote:
> Michael Sparks <m...@cerenity.org> wrote:
> > def bar():
> >     global x
> >     x[0] += " another"
> >     print id(x[0])
>
> ... and for bonus marks, explain why the "global x" in this function
> is not required.
Because the bare name x is never rebound in bar() -- `x[0] += ...` mutates
the list x refers to, it does not assign to the name x itself. Just about
the first thing I learned here.
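In other words, `global` is only required when a function rebinds the bare name; mutating the object the name refers to needs no declaration. A hedged sketch (Python 3; the function names are made up):

```python
x = ["some string"]

def mutate_only():
    # No `global x` needed: the name x is only read, and the list it
    # refers to is mutated through the subscript.
    x[0] += " another"

def rebind_name():
    global x              # required: this assignment rebinds the name x
    x = ["fresh list"]

mutate_only()
after_mutate = x[0]
rebind_name()
```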
More seriously, I can and do use lots of globals. In the tokenizer I'm
writing, for example, all the token types (COMMENT_EOL = 0,
CONSTANT_INTEGER = 1, ...) are global constants. The text to be
tokenized is a global variable. (Actually, the text is unchanging once
the Tok object is created, so this "variable" may be another
constant.) Passing global constants to functions is a case of CPU
abuse.
Structured purists gave globals a bad rap, years ago. Time to stick up
for them. They're good citizens. Don't blame them if some dumb coder
abuses them. It's not their fault.
We all do, FWIW - since everything is name/object binding, all the
classes, functions, modules etc defined or imported in a module are,
technically, globals (for the Python definition of 'global').
> In the tokenizer I'm
> writing, for example, all the token types(COMMENT_EOL = 0,
> CONSTANT_INTEGER = 1, ...) are global constants.
Technically, they are not even constants !-)
> The text to be
> tokenized is a global variable.
Now *this* is bad. Really bad.
> (Actually, the text is unchanging once
> the Tok object is created, so this "variable" may be another
> constant.)
It isn't.
> Passing global constants to functions is a case of CPU
> abuse.
Remember that Python doesn't copy objects when passing them as function
params, and that function-local names are faster to lookup than global
ones.
There are indeed reasons not to pass module constants to the module's
functions, but that have nothing to do with CPU. And in your case, the
text to be tokenised is definitively not a constant.
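The no-copy part is easy to verify (a Python 3 sketch): the parameter is just another reference to the same object, regardless of its size.

```python
big = list(range(100_000))

def consume(data):
    # data is a second reference to the caller's list; passing the
    # argument copied nothing.
    return data is big

no_copy = consume(big)
```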
> Structured purists gave globals a bad rap, years ago. Time to stick up
> for them. They're good citizens. Don't blame them if some dumb coder
> abuses them.
Once you learned why you should not do something - and how to avoid
doing it -, chances are you also know when it's ok to break the rule.
By "constant" I meant that it did not change during the lifetime of
the Toker.
<aol />
> By "constant" I meant that it did not change during the lifetime of
> the Toker.
That's still a variable to me. It's even the essence of the variable,
since it's the main input of your program. And that's definitively not
something I'd store in global.
> More seriously, I can and do use lots of globals. In the tokenizer I'm
> writing, for example, all the token types(COMMENT_EOL = 0,
> CONSTANT_INTEGER = 1, ...) are global constants. The text to be
> tokenized is a global variable. (Actually, the text is unchanging once
> the Tok object is created, so this "variable" may be another
> constant.) Passing global constants to functions is a case of CPU
> abuse.
>
> Structured purists gave globals a bad rap, years ago. Time to stick up
> for them. They're good citizens. Don't blame them if some dumb coder
> abuses them. It's not their fault.
*grin*
It is good to see that I am not the only person in the squad who hears
the beat of this drum.
I wonder if you have some COBOL data divisions under your belt?
- Hendrik
Hendrik van Rooyen wrote:
> I wonder if you have some COBOL data divisions under your belt?
Hendrik, I go way back but somehow I missed COBOL.
Martin
Bruno Desthuilliers wrote:
> ... that's definitively not
> something I'd store in global.
So where would you put it?
Context is all gone, so I'm not sure that I remember what "it" is. I
think it is the text that you're parsing.
I believe you are currently doing something like this:
TEXT = "placeholder"
def parse():
    while True:
        token = get_next_token() # looks at global TEXT
        yield token

# And finally actually run your parser:
TEXT = open("filename", "r").read()
for token in parse():
    print token
If I were doing this, I would do something like this:
def parse(text):
    while True:
        token = get_next_token(text) # looks at local text
        yield token

# Run as many independent parsers as I need:
parser1 = parse(open("filename", "r").read())
parser2 = parse(open("filename2", "r").read())
parser3 = parse("some text")

for token in parser1:
    print token
# etc.
Unless the text you are parsing is truly enormous (multiple hundreds of
megabytes) you are unlikely to run into memory problems. And you gain the
ability to run multiple parsers at once.
--
Steven
Steven D'Aprano wrote:
> Context is all gone, so I'm not sure that I remember what "it" is. I
> think it is the text that you're parsing.
Yes. I'm tokenizing today. Parsing comes after Christmas.
> TEXT = "placeholder"
>
> def parse():
>     while True:
>         token = get_next_token() # looks at global TEXT
>         yield token
Classic, but I'm not going to go there (at least until I fail
otherwise).
My tokenizer returns an array of Token objects. Each Token includes
the text from which it is created, locations in the original text and,
for something like CONSTANT_INTEGER, an intValue data member.
> # Run as many independent parsers as I need:
> parser1 = parse(open("filename", "r").read())
> parser2 = parse(open("filename2", "r").read())
> parser3 = parse("some text")
Interesting approach, that. Could have a separate parser for each
statement. Hmmm. Maybe my tokenizer should return a list of arrays of
Tokens, one array per statement. Hmmm.
I'm thinking about an OO language construction that would be very easy
to extend. Tentatively, I'll have Production objects, Statement
objects, etc. I've already got Tokens.
Goal is a really simple language for beginners. Decaf will be to Java
as BASIC was to Fortran, I hope.
You don't have to "put" functions arguments anywhere - they're already
local vars.
def tokenize(text):
    do some work
    returns or (yields) a list of tokens or whatever
If you want the tokenizer module to work as a self-contained appliction
*too*, then :
if __name__ == '__main__':
    text = reads the text from a file or stdin
    for token in tokenize(text):
        do something with token
HTH
Dennis Lee Bieber wrote:
> Great if one is using a teletype as editor
The original Dartmouth computer room was a basement that featured 8
teletypes.
The original BASIC, Dennis, was implemented on a time-shared
"mainframe" with a gigantic 8k words (20-bit words, if I remember) of
core memory. Designing a language for such a machine, I'd bet you,
too, would choose single-letter names. ('A' was a numeric. 'A$' a
string.)
If you compare the teletype to a tube it was lame. But that's not the
right comparison. The Fortran technology was cards, punched on a card
punch, carried to the operator. Wait your turn (hours more commonly
than minutes). Get a report off the line printer. Repunch the
offending cards.
Indeed, the teletype with line numbers was a giant step forward. No
operator. No waiting. Compiler complains. Retype the offending line. A
miracle in its day. You didn't even have to start your statements in
column 7!
Bruno Desthuilliers wrote:
> MartinR...@gmail.com wrote:
> >
> > Bruno Desthuilliers wrote:
> >
> >>... that's definitively not
> >>something I'd store in global.
> >
> >
> > So where would you put it?
>
> You don't have to "put" functions arguments anywhere - they're already
> local vars.
Bruno, right now I've got this:
def __init__ ( self, t ):
    """ Constructor, called with array of strings. """
    self.text = t
    ...
Some other program will say:
tok = Toker( text_array )
tokens = tok.tokenize()
So how does the constructor make the array of strings available to the
tokenize() method?
> Bruno, right now I've got this:
>
> def __init__ ( self, t ):
>     """ Constructor, called with array of strings. """
>
>     self.text = t
>     ...
>
> Some other program will say:
> tok = Toker( text_array )
> tokens = tok.tokenize()
>
> So how does the constructor make the array of strings available to the
> tokenize() method?
Assuming the `__init__()` above belongs to the `Toker` class, the
`tokenize()` method can access the array via `self.text`, of course.
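A minimal sketch of that (the whitespace-splitting `tokenize()` body is purely hypothetical, standing in for the real tokenizer):

```python
class Toker:
    def __init__(self, text_lines):
        # Store the array of strings on the instance; every method
        # can then reach it as self.text.
        self.text = text_lines

    def tokenize(self):
        # Hypothetical stand-in: split each line on whitespace.
        tokens = []
        for line in self.text:
            tokens.extend(line.split())
        return tokens

tok = Toker(["some string", "more text"])
tokens = tok.tokenize()
```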
Ciao,
Marc 'BlackJack' Rintsch