class SimpleGraph:
    def __init__(self):
        self._spo = {}
        self._pos = {}
        self._osp = {}

This is to create indexes that are permutations of the pattern subject,
predicate, object. So, using this:
self._pos = {predicate:{object:set([subject])}}
We have the first dictionary keyed off the first term, the
second dictionary keyed off the second term, and the set containing the
third terms (note: terms, plural). I guess I somewhat understand that sets
are used to test for membership. Couldn't that be done with dictionaries,
though?
Also, if you ran the for loop below, what is being yielded,
the key or the value in the dictionary? I refer to this for loop:

for subject in self._pos[predicate][object]:
    yield (subject, predicate, object)
Thanks for any help and insights into this,
Bruce
--
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
B. M. Whealton - Web Developer, Programmer, Owner:
Future Wave Designs: http://FutureWaveDesigns.com
"On Being a Poet and Other Existential Ideas":
http://WordSaladPoetryMagazine.com/words/
Publisher of Word Salad Poetry Magazine:
http://WordSaladPoetryMagazine.com/ - Submit your poetry
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Indeed, CPython's sets are implemented as something like dictionaries
with dummy values (the keys being the members of the set), with some
optimization(s) that exploit this lack of values.
Cheers,
Chris
--
http://blog.rebertia.com
> I did get a bit confused in reading about the concept of
> sets in python and why you would use them instead of a dictionary for
> example.
Why would you use a spoon instead of a paper clip? Why would you use a
hat-stand instead of a pencil sharpener?
Sets aren't an alternative to dictionaries. They have a completely
different purpose.
It so happens that sets and dicts *in Python* happen to share some
implementation details, and they happen to share similar syntax, but
don't be fooled by these incidental details. They could be implemented
differently.
Dicts are used when you need a 1:1 mapping between a key and a value:
{name: address} # each person has exactly one home address
Sets are used when you have a list of items, but you don't care about the
order of the items, only whether or not something is in the list.
Since lists are ordered, searching a list to see if something is in it
can be slow. Sets give up order so that they can make membership testing
blindingly fast:
>>> from timeit import Timer
>>> Timer("100042 in L", "L=range(100000)").timeit(number=10000)
53.221637010574341
>>> Timer("100042 in S", "S=set(range(100000))").timeit(number=10000)
0.0022180080413818359
> For example, here we have:
> self._pos = {predicate:{object:set([subject])}}
>
> This is said to be a
> dictionary, containing dictionaries, which in turn contain sets. I don't
> know if it would be possible to explain why one would use a set,
> especially in this context without showing more. I was a bit stumped by
> this for some odd reason. I don't know if other languages lack certain
> structures and features.
Of course they do. That's why we have more than one programming language.
Earlier versions of Python didn't have sets. Standard Pascal doesn't have
dicts. Neither Fortran nor COBOL supported recursion, at least early on.
> It's been a long time since I did C
> programming too.
> Anyway, this would be used in a class as follows:
>
> class SimpleGraph:
>     def __init__(self):
>         self._spo = {}
>         self._pos = {}
>         self._osp = {}
>
> This is to create indexes that are permutations of the pattern subject,
> predicate, object. So, using this:
>
> self._pos = {predicate:{object:set([subject])}}
Sounds complicated. What are the indexes for? Why not just have the
permutations directly?
> We have the first dictionary keyed off the first term, the
> second dictionary keyed off the second term, and the set containing the
> third terms (note: terms, plural). I guess I somewhat understand that sets
> are used to test for membership. Couldn't that be done with dictionaries,
> though?
> Also, if you ran this for loop, below, what is being yielded,
> the key or the value in the dictionary? I refer to this for loop here:
> for subject in self._pos[predicate][object]:
>     yield (subject, predicate, object)
What is yielded is exactly what you tell Python to yield: a tuple
containing three items (subject, predicate, object), whatever they are.
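To make that concrete, here is a minimal, self-contained sketch of the
pattern under discussion. The triple data ("likes", "python", "alice",
"bob") is hypothetical, purely for illustration:

```python
# A predicate -> object -> set-of-subjects index, with a generator that
# yields one (subject, predicate, object) tuple per subject in the set.
_pos = {"likes": {"python": {"alice", "bob"}}}

def triples(predicate, object):
    # Iterating the set yields its members (the subjects); each pass
    # through the loop body yields a three-item tuple.
    for subject in _pos[predicate][object]:
        yield (subject, predicate, object)

result = sorted(triples("likes", "python"))
# result == [("alice", "likes", "python"), ("bob", "likes", "python")]
```

So the loop variable is bound to each member of the set, and the tuple is
what gets yielded.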
--
Steven
> I did get a bit confused in reading about the concept of sets in
> python and why you would use them instead of a dictionary for example.
Use a set when something is naturally modelled as a set... it's a
collection of unordered objects that you can test for membership, form
unions, intersections etc. reasonably efficiently.
Use a dictionary when you want something that's naturally modelled as a
dictionary - i.e. you have an unordered collection of keys, each of which
has a value associated with it, and you want reasonably efficient lookup
of those values from the keys.
Certainly you can model a set as a dictionary, but that's likely to be
less efficient than using a set, and quite possibly you'll need to roll
your own operations on your sets, whereas they're provided for the
built-in implementation.
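A quick sketch of that trade-off, with hand-picked toy data:

```python
# Modelling sets as dicts with dummy values means rolling your own
# operations; the built-in set type provides them directly.
a = {1: None, 2: None, 3: None}
b = {2: None, 3: None, 4: None}

# Hand-rolled intersection over the dict "sets":
common = {k: None for k in a if k in b}

# With real sets the operation is built in (and clearer):
sa, sb = set(a), set(b)
assert sa & sb == {2, 3}
assert set(common) == sa & sb
```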
(Of course if you know more information about your problem domain you
might have a better representation than the built in set... for example
you might imagine an application where you want to form sets of objects
from a fixed, relatively small, universe. Then you could perhaps model a
set as a bit array, and lots of set operations can then be done by very
fast bit fiddling operations.)
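The bit-array idea sketched in Python, with a hypothetical four-element
universe:

```python
# With a small fixed universe, each element maps to a bit position, and
# set operations become integer bit operations.
universe = ["red", "green", "blue", "yellow"]  # hypothetical universe
bit = {name: 1 << i for i, name in enumerate(universe)}

warm = bit["red"] | bit["yellow"]      # the set {red, yellow}
primary = bit["red"] | bit["blue"]     # the set {red, blue}

both = warm & primary                  # intersection by bitwise AND
assert both == bit["red"]
assert bool(both & bit["red"])         # membership test by AND
```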
Sets are faster and more convenient for intersections, unions, and
differences. They also use less space than dictionaries. Finally, they
help convey the intent of the code.
Many things can be emulated using dictionaries. In languages like Lua
there are no sets, and no tuples, and no lists*. This makes sense if
your goal is to create a very small embedded language. Using specific
structures on the other hand has the benefit that operations common to
that structure are more convenient to express and that those
operations are optimized for space and time.
*"Tables are the sole data structuring mechanism in Lua; they can be
used to represent ordinary arrays, symbol tables, sets, records,
graphs, trees, etc." http://www.lua.org/manual/5.1/manual.html#2.2
> self._pos = {predicate: {object: {subject: None}}}
>
> That's a bit ugly because the None serves no purpose here; the value
> associated with the subject has no meaning in this context.
This is what we did in Python before sets were added.
> It also
> uselessly takes up space in memory that could add up to significant
> wastage if the data structure grows to be large.
>
> The other major difference between dicts and sets is that sets provide
> mathematical operations such as intersection and difference that dicts
> do not, but that doesn't appear to be relevant in this context.
These two reasons are why we added sets.
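For anyone who never saw the old idiom, dict.fromkeys builds exactly that
{key: None, ...} shape:

```python
# The pre-set idiom: a dict whose values are all None, standing in for
# a set. Membership testing works, but the None slots are dead weight.
pseudo_set = dict.fromkeys(["spam", "eggs"])
assert pseudo_set == {"spam": None, "eggs": None}
assert "spam" in pseudo_set          # membership works the same way

real_set = set(pseudo_set)           # the modern equivalent
assert real_set == {"spam", "eggs"}
```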
--
Terry Jan Reedy
> Sets aren't an alternative to dictionaries. They have a completely
> different purpose.
A dict/mapping is a specialized set -- a set of ordered pairs in which
each first member (the 'key') only appears once as a first member. The
set union of two mappings may not be a mapping, which is why dicts have
a specialized .update method instead of .union.
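A short illustration of why the union of two mappings may not be a
mapping, and what .update does instead:

```python
# Merging two dicts that disagree on a key cannot keep both pairs and
# remain a mapping, so dict.update lets the right-hand pairs win.
a = {"x": 1, "y": 2}
b = {"y": 99, "z": 3}

# As sets of pairs, the union contains *two* pairs with first member "y":
pairs = set(a.items()) | set(b.items())
assert ("y", 2) in pairs and ("y", 99) in pairs   # not a valid mapping

a.update(b)                  # dict resolution: b's value for "y" wins
assert a == {"x": 1, "y": 99, "z": 3}
```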
> It so happens that sets and dicts *in Python* happen to share some
> implementation details, and they happen to share similar syntax, but
> don't be fooled by these incidental details. They could be implemented
> differently.
>
> Dicts are used when you need a 1:1 mapping between a key and a value:
The mapping does not have to be 1:1 (which would mean that each value
also appears only once, so that the mapping could be inverted). Values
can appear more than once - that is, many keys can map to one value -
so the mapping is many:1. Collections (sets or lists) are used as values
to implement 1:many mappings.
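That last point sketched, with hypothetical names, using a dict whose
values are sets:

```python
from collections import defaultdict

# A 1:many mapping: each key maps to a *set* of values.
children = defaultdict(set)
children["pat"].add("alex")
children["pat"].add("sam")
children["lee"].add("alex")   # different keys may share a value (many:1 too)

assert children["pat"] == {"alex", "sam"}
```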
--
Terry Jan Reedy
> Certainly you can model a set as a dictionary, but that's likely to be
> less efficient than using a set, and quite possibly you'll need to roll
> your own operations on your sets, whereas they're provided for the
> built-in implementation.
Did you know that applying the “set” or “frozenset” functions to a dict
returns a set of its keys?
Seems a bit dodgy, somehow.
> Did you know that applying the “set” or “frozenset” functions to a dict
> returns a set of its keys?
> Seems a bit dodgy, somehow.
That's just a consequence of the fact that dicts produce their
keys when iterated over, and the set constructor iterates over
whatever you give it.
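For instance:

```python
# set() iterates over whatever it is given; a dict iterates over its
# keys, so set(d) is a set of keys.
d = {"a": 1, "b": 2}
assert set(d) == {"a", "b"}
assert frozenset(d) == frozenset({"a", "b"})
# Iterating the dict directly yields the same keys:
assert sorted(iter(d)) == ["a", "b"]
```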
--
Greg
Hmm. It seems that “iter(<dict>)” iterating over the keys has been around a
long time. But a dict has both keys and values: why are language constructs
treating them so specially as to grab the keys and throw away the values?
Is this an artifact of older directions in the language evolution, before
sets were introduced?
What's so special about it? If you want the value, ask for it;
iterating over the dict without asking specifically for the values does
not give them. Furthermore, if you did get the key:value pairs how
would you store them in a set such that you could query the set to see
if a key were there?
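Both halves of that in code:

```python
# Iterating a dict yields keys; the value is one lookup away.
d = {"a": 1, "b": 2}
got = {key: d[key] for key in d}   # ask for each value explicitly
assert got == d

# Storing key:value *pairs* in a set would break key-only queries:
pairs = set(d.items())
assert "a" not in pairs            # the pair ("a", 1) is there, not "a"
assert ("a", 1) in pairs
```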
~Ethan~
Language constructs are not treating anything specially much less "throwing
away" anything. The language construct in question does exactly the same thing
with every object nowadays: call the .__iter__() method to get the iterator and
call .next() on that iterator until it raises StopIteration. It is the
responsibility of the dict object itself to decide how it wants to be iterated over.
The reasoning for this decision is spelled out in the PEP introducing the
iterator feature:
http://www.python.org/dev/peps/pep-0234/
"""
- There has been a long discussion about whether
for x in dict: ...
should assign x the successive keys, values, or items of the
dictionary. The symmetry between "if x in y" and "for x in y"
suggests that it should iterate over keys. This symmetry has been
observed by many independently and has even been used to "explain"
one using the other. This is because for sequences, "if x in y"
iterates over y comparing the iterated values to x. If we adopt
both of the above proposals, this will also hold for
dictionaries.
The argument against making "for x in dict" iterate over the keys
comes mostly from a practicality point of view: scans of the
standard library show that there are about as many uses of "for x
in dict.items()" as there are of "for x in dict.keys()", with the
items() version having a small majority. Presumably many of the
loops using keys() use the corresponding value anyway, by writing
dict[x], so (the argument goes) by making both the key and value
available, we could support the largest number of cases. While
this is true, I (Guido) find the correspondence between "for x in
dict" and "if x in dict" too compelling to break, and there's not
much overhead in having to write dict[x] to explicitly get the
value.
For fast iteration over items, use "for key, value in
dict.iteritems()". I've timed the difference between
for key in dict: dict[key]
and
for key, value in dict.iteritems(): pass
and found that the latter is only about 7% faster.
Resolution: By BDFL pronouncement, "for x in dict" iterates over
the keys, and dictionaries have iteritems(), iterkeys(), and
itervalues() to return the different flavors of dictionary
iterators.
"""
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
> The reasoning for this decision is spelled out in the PEP introducing
> the iterator feature:
>
> http://www.python.org/dev/peps/pep-0234/
There's also the question of whether 'if x in dict' should
compare keys only or both keys and values. This was also
hotly debated back when dicts were given an 'in' operator
(before that you had to use dict.has_key(x)). Can't remember
all the arguments, but it was definitely thought about.
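The resolution, in two lines:

```python
# "x in dict" tests keys only, matching the old dict.has_key(x):
d = {"a": 1}
assert "a" in d
assert 1 not in d    # values are not consulted
```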
--
Greg
The PEP seems to refer to such discussion, although there is no
specific footnote to the mailing list archives.
- Regarding "if key in dict": there is no doubt that the
dict.has_key(x) interpretation of "x in dict" is by far the
most useful interpretation, probably the only useful one. There
has been resistance against this because "x in list" checks
whether x is present among the values, while the proposal makes
"x in dict" check whether x is present among the keys. Given
that the symmetry between lists and dictionaries is very weak,
this argument does not have much weight.
Symmetry is always a tricky balance in programming languages. Python
discounts symmetry considerations with respect to lists and
dictionaries when it comes to "in" (see above), while it goes for a
certain symmetry between "if" and "for" with respect to the "in"
keyword and dictionaries (see below):
While this is true, I (Guido) find the correspondence between "for x in
dict" and "if x in dict" too compelling to break, and there's not much
overhead in having to write dict[x] to explicitly get the value.
There is no major inconsistency here; "symmetry" is subjective and
needs to be applied with a heavy dose of pragmatism.
> Symmetry is always a tricky balance in programming languages.
Is that what we used to call “orthogonality”?
No, orthogonality is something else. "Orthogonal" means "perpendicular
to." See the Wikipedia article for a discussion of the word's
application to computer science.
--
D'Arcy J.M. Cain <da...@druid.net> | Democracy is three wolves
http://www.druid.net/darcy/ | and a sheep voting on
+1 416 425 1212 (DoD#0082) (eNTP) | what's for dinner.
> On Tue, 12 Oct 2010 23:34 +1300
> Lawrence D'Oliveiro <l...@geek-central.gen.new_zealand> wrote:
>
>>> Symmetry is always a tricky balance in programming languages.
>>
>> Is that what we used to call “orthogonality”?
>
> No, orthogonality is something else. "Orthogonal" means "perpendicular
> to."
The appropriate meaning is “being able to combine independently” (as in the
orthogonal decomposition of a Fourier transform). An example of contemporary
usage, from the “Revised Report on the Algorithmic Language ALGOL 68” (1974,
I think):
0.1.2 Orthogonal design
The number of independent primitive concepts has been minimized in
order that the language be easy to describe, to learn, and to implement.
On the other hand, these concepts have been applied “orthogonally” in
order to maximize the expressive power of the language while trying to
avoid deleterious superfluities.
So “orthogonality” has to do with using a minimum number of primitive
components (operators, object types) in a maximum number of meaningful
combinations.
Lawrence, I was actually talking about symmetry, not orthogonality. I
was making the observation that Python doesn't always strive for
symmetry as its #1 driving concern (as well it shouldn't). I think
your definition of orthogonality is more what Python is about. I'd
say that symmetry and orthogonality go hand in hand for most
situations. Symmetry often helps minimize the number of primitive
components, for example. But sometimes symmetry arguments can be
forced or just be made on a weak foundation.
Someone (Niklaus Wirth?) once said something about how a simpler
language isn't necessarily easier to learn. Pascal, for example, makes
a distinction between procedures and functions, which can help to
prevent some bugs, and Python forbids assignment in expressions, which
also helps to prevent some bugs.
> Lawrence, I was actually talking about symmetry, not orthogonality.
So what’s the difference?
I don't think there's a big difference between the two--you snipped
away my comment about them often going hand in hand, so I will repeat
it here that they do often go hand in hand.
I guess a lot depends on how you define "symmetry." Is your
definition of "symmetry" equivalent to your definition of
"orthogonality"?
I don't have a precise definition of "symmetry" myself, I will admit.
Perhaps for lack of a better word, I throw around the term "symmetry"
for situations where analogous constructs get used in slightly
different contexts. The example in this discussion is the "in"
keyword. You can use "in" for strings, sets, lists, and dictionaries,
which in and of itself is an example of "symmetry" in my mind.
Another example is the relationship of the "+" operator to the "*"
operator. With numbers "*" connotes repeated use of "+", as it does
with strings, even though the underlying operations for "+" are not
exactly equivalent. String star-ification is repeated concatenation,
whereas integer star-ification is repeated addition, but since
concatenation and addition are both represented by "+", it creates
some sort of symmetry to have "*" as the string-concatenation-
repetition operator, even if it is not something folks commonly
encounter outside of programming languages.
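That symmetry is easy to demonstrate:

```python
# "*" connotes repeated "+" for both numbers and strings, even though
# "+" means addition for one and concatenation for the other.
assert 2 + 2 + 2 == 2 * 3 == 6
assert "ab" + "ab" + "ab" == "ab" * 3 == "ababab"
```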
> I guess a lot depends on how you define "symmetry." Is your
> definition of "symmetry" equivalent to your definition of
> "orthogonality"?
No idea. It’s just that the example being discussed in this thread seemed to
come under the old term “orthogonality”, so I wondered why a different term
was being used.
So far no-one has come up with a good excuse for using a different term.
Ask the authors of PEP 234 why they use the term symmetry:
http://www.python.org/dev/peps/pep-0234/
That was the original context of my comment. The term "symmetry" gets
used a couple times in that PEP, and I think we're in violent
agreement that the concept of "symmetry" is wishy-washy at best.
Here is just one example from the PEP:
The symmetry between "if x in y" and "for x in y" suggests that it
should iterate over keys. This symmetry has been observed by many
independently and has even been used to "explain" one using the other.
I think I'm just making the same observation as you coming from a
different angle. Why talk about "symmetry" when it's such a tricky
balance?
> That was the original context of my comment. The term "symmetry" gets
> used a couple times in that PEP, and I think we're in violent
> agreement that the concept of "symmetry" is wishy-washy at best.
>
> Here is just one example from the PEP:
>
> The symmetry between "if x in y" and "for x in y"
> suggests that it should iterate over keys.
Maybe "analogy" or "similarity" would be a better word here.
--
Greg
Agreed. "Analogy" seems particularly appropriate.
Except that, while analogies can be handy for illustrating things, they are
useless for actually supporting arguments.
And only a fool would try to support his argument by actually
illustrating a point, right? ;)