Too many warnings from S and C: best way forward?

28 views
Skip to first unread message

Joachim Durchholz

unread,
Sep 28, 2012, 3:13:00 AM9/28/12
to sy...@googlegroups.com
Hi all,

I'm having some trouble with S and C and would like to find out the root
cause, and which ways to deal with that are available.

The basic situation is that S and C do not show their members in the
source code, the members are created dynamically during import.
This is a nifty usage of Python's dynamic features, but it's giving code
analysis tools a really hard time.
For example, for every occurrence of C.<whatever> in SymPy, Eclipse
Pydev spits out a warning that C.<whatever> isn't present in class C.
Last time I ran that check, that was a four-digit number of warnings.

This in itself wouldn't be a problem. Pydev allows suppressing specific
kinds of warnings.
However, suppressing warnings can hide real problems. E.g. if a member
of C is renamed/removed during refactoring, but there's still a
reference to it somewhere inside SymPy, and unit tests happen to miss
the situation that would trigger the error.

For this reason, I'd like to eliminate the irrelevant warnings.

There are two issues that are blocking me from moving forward with this:

1) I do not fully understand what C and S actually do.
C says that it is needed to avoid problems with cyclic imports, but it
does not describe what the problem exactly is, or how having C avoids
those problems.
For S, it's similar to C but not saying anything about cyclic imports,
so I'm wondering if that's just not documented or whether S does indeed
serve a different purpose than C. S' purpose also remains unclear - it's
a singleton mechanism, I can see that easily, but if it's just an
implementation vehicle, I don't understand why all singletons are made
into members of S - the construction should rather be that S provides
the singleton machinery and classes that need a singleton use a member
that's a subclass of S.

2) I'm not sure how C and S are being used "in the wild", so I don't
know which kinds of publicly visible changes will break code of SymPy
users. This makes it hard to decide how to fix the problem, or whether
fixing it is even worth the potential breakage.

Here's a short list of approaches that could be explored. I'd like to
know what you people think about each:

a) Turn S into a pure implementation vehicle, making S instances into
hidden members of all classes that need a singleton.
I think some initialization code populates S with singletons that are
available to user code, so that would break a lot of user code. It would
be the conceptually cleanest approach for S (if S was indeed intended to
be just an implementation vehicle).

b) Eliminate C, solve the import cycle problems in some other way (one
could shortcut the second invocation of the import, or the cyclic
dependency could be factored out into a separate module that's imported
by all classes that are part of the cycle).
This could break huge amounts of user code because people tend to use
C.<whatever>. It would require changing all the call sites to C inside
SymPy itself (something I'm willing to do so, sacrifices must be made).

C) Make sure each member exists as source code.
E.g. extend C and S with a stub for each member that gets introduced by
some other module. The stub just throws an exception that says "Please
import module Foo to use C.bar" so that people get meaningful error
messages if they forget to import something that should have populated C
(or S).
This adds another clerical task on anybody who wishes to extend C or S
with a new member. On the other hand, it will give them a heads-up if
they create a name conflict, or make them look whether what they're
adding isn't already available under a slightly different name, so it
can be argued that that's not just bad.

D) Leave everything as it is, declaring it's not worth it.

Obviously, I would like to do something else than D, but I need input to
decide whether that's actually a good idea and whether it should be A+B
or C, or something else altogether.

Thoughts? Comments?

Regards,
Jo

Ronan Lamy

unread,
Sep 28, 2012, 2:06:19 PM9/28/12
to sy...@googlegroups.com
Le vendredi 28 septembre 2012 à 09:13 +0200, Joachim Durchholz a écrit :
> Hi all,
>
> I'm having some trouble with S and C and would like to find out the root
> cause, and which ways to deal with that are available.
>
> The basic situation is that S and C do not show their members in the
> source code, the members are created dynamically during import.
> This is a nifty usage of Python's dynamic features, but it's giving code
> analysis tools a really hard time.
> For example, for every occurrence of C.<whatever> in SymPy, Eclipse
> Pydev spits out a warning that C.<whatever> isn't present in class C.
> Last time I ran that check, that was a four-digit number of warnings.
>
> This in itself wouldn't be a problem. Pydev allows suppressing specific
> kinds of warnings.
> However, suppressing warnings can hide real problems. E.g. if a member
> of C is renamed/removed during refactoring, but there's still a
> reference to it somewhere inside SymPy, and unit tests happen to miss
> the situation that would trigger the error.

There is another problem: C is a flat, global namespace, which makes
name collisions likely. Worse, these collisions can happen behind the
scenes, causing spooky action at a distance - for instance:

>>> trigsimp(cos(x)**2 + sin(x)**2)
1
>>> c = Function('cos')
>>> trigsimp(cos(x)**2 + sin(x)**2)
sin(x)**2 + cos(x)**2

(what happened is that Function('cos') registered C.cos = c, but
trigsimp relies on C.cos == cos ...)

>
> For this reason, I'd like to eliminate the irrelevant warnings.
>
> There are two issues that are blocking me from moving forward with this:
>
> 1) I do not fully understand what C and S actually do.
> C says that it is needed to avoid problems with cyclic imports, but it
> does not describe what the problem exactly is, or how having C avoids
> those problems.

C is used to gain access to a class without importing the module where
it's defined - which prevents a cyclic import if that module imports the
current one. However, doing the import at function/module-level would
solve the issue just as well.

> For S, it's similar to C but not saying anything about cyclic imports,
> so I'm wondering if that's just not documented or whether S does indeed
> serve a different purpose than C. S' purpose also remains unclear - it's
> a singleton mechanism, I can see that easily, but if it's just an
> implementation vehicle, I don't understand why all singletons are made
> into members of S - the construction should rather be that S provides
> the singleton machinery and classes that need a singleton use a member
> that's a subclass of S.

S registers all singleton instances, but it's not an essential part of
the implementation of singletons. It should really be thought of as a
simple dictionary that happens to use the nicer attribute access syntax
"S.foo" instead of the usual item access syntax "S['foo']".
>
> 2) I'm not sure how C and S are being used "in the wild", so I don't
> know which kinds of publicly visible changes will break code of SymPy
> users. This makes it hard to decide how to fix the problem, or whether
> fixing it is even worth the potential breakage.

I suspect that C isn't used much, but S is probably used, due to the
existence of S() as a shortcut for sympify() and to the fact that S.One
looks like the nicest way of accessing the One() singleton.

In any case, changing or removing them would require a significant
deprecation period.

> Here's a short list of approaches that could be explored. I'd like to
> know what you people think about each:
>
> a) Turn S into a pure implementation vehicle, making S instances into
> hidden members of all classes that need a singleton.
> I think some initialization code populates S with singletons that are
> available to user code, so that would break a lot of user code. It would
> be the conceptually cleanest approach for S (if S was indeed intended to
> be just an implementation vehicle).
>
> b) Eliminate C, solve the import cycle problems in some other way (one
> could shortcut the second invocation of the import, or the cyclic
> dependency could be factored out into a separate module that's imported
> by all classes that are part of the cycle).

More simply, it should be the case that every use of C.* can be replaced
by a function-level import (because by the time 'C.Foo' is executed, Foo
must have been defined, which means that its module has been imported by
the interpreter already). Of course, it's better to move the import to
the top of the module whenever possible.

> This could break huge amounts of user code because people tend to use
> C.<whatever>. It would require changing all the call sites to C inside
> SymPy itself (something I'm willing to do so, sacrifices must be made).

Thanks for offering to do it. That would be very useful.
>
> C) Make sure each member exists as source code.
> E.g. extend C and S with a stub for each member that gets introduced by
> some other module. The stub just throws an exception that says "Please
> import module Foo to use C.bar" so that people get meaningful error
> messages if they forget to import something that should have populated C
> (or S).
> This adds another clerical task on anybody who wishes to extend C or S
> with a new member. On the other hand, it will give them a heads-up if
> they create a name conflict, or make them look whether what they're
> adding isn't already available under a slightly different name, so it
> can be argued that that's not just bad.

Let's not do that. Just not. Please!

> D) Leave everything as it is, declaring it's not worth it.

Given the problems with C, I think we should really do something about
it. Dealing with S is probably worth it as well, but less of a priority,
I think.

Aaron Meurer

unread,
Sep 28, 2012, 2:42:23 PM9/28/12
to sy...@googlegroups.com
I agree that C is probably not necessary, and really is not necessary
to be included in "from sympy import *". So I'm +1 to remove it,
unless we determine that it really is needed. Should we deprecate it
first, or just remove it?

S on the other hand is quite useful. Because the members are
singletonized, you can use "is" comparison (much faster than ==
comparison). But this only works if you have the element itself, so
for example, if you want to compare against 0, you have to first
sympify 0. "a is S.Zero" is the cleanest way to do this. I'm also
rather fond of the tab-completeable shortcut for 1/2 by S.Half in
interactive use.

Also, unlike C, S only eats up those objects that specifically set it
as their metaclass. So the name clash issue that Ronan mentioned does
not apply. I also seriously doubt any external code uses S to create
its own singletons, though something like that would be perfectly
acceptable and supported, so long as the names don't clash with
existing names.

Aaron Meurer
> --
> You received this message because you are subscribed to the Google Groups "sympy" group.
> To post to this group, send email to sy...@googlegroups.com.
> To unsubscribe from this group, send email to sympy+un...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/sympy?hl=en.
>

Joachim Durchholz

unread,
Sep 29, 2012, 4:49:20 AM9/29/12
to sy...@googlegroups.com
Am 28.09.2012 20:42, schrieb Aaron Meurer:
> I agree that C is probably not necessary, and really is not necessary
> to be included in "from sympy import *". So I'm +1 to remove it,
> unless we determine that it really is needed. Should we deprecate it
> first, or just remove it?

+1 to deprecate, -1 to remove.

Removal runs the risk of enraging people who use C and can't quickly
change their code.

Deprecation would be enough for inside SymPy: the warnings happen in the
call site, which is what's getting changed anyway.

> S on the other hand is quite useful. Because the members are
> singletonized, you can use "is" comparison (much faster than ==
> comparison).

That's generally true for each singleton.

I'd suggest moving the singletons into the classes where they're created.

> But this only works if you have the element itself, so
> for example, if you want to compare against 0, you have to first
> sympify 0. "a is S.Zero" is the cleanest way to do this. I'm also
> rather fond of the tab-completeable shortcut for 1/2 by S.Half in
> interactive use.

Hm... Zero could be a class.
This is Python, things can be class, function, and dictionary all at once.

> Also, unlike C, S only eats up those objects that specifically set it
> as their metaclass. So the name clash issue that Ronan mentioned does
> not apply. I also seriously doubt any external code uses S to create
> its own singletons, though something like that would be perfectly
> acceptable and supported, so long as the names don't clash with
> existing names.

They may clash with names created by other people who're using SymPy, so
freely combining code written by others isn't possible anymore.
Also, they may clash with names that future versions of SymPy place in S.

For this reason, adding stuff to S by anybody other than the SymPy team
would actually be a bad idea IMNSHO.

Just my thoughts.
Regards,
Jo

Aaron Meurer

unread,
Sep 29, 2012, 4:58:13 PM9/29/12
to sy...@googlegroups.com
On Sep 29, 2012, at 2:49 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> Am 28.09.2012 20:42, schrieb Aaron Meurer:
>> I agree that C is probably not necessary, and really is not necessary
>> to be included in "from sympy import *". So I'm +1 to remove it,
>> unless we determine that it really is needed. Should we deprecate it
>> first, or just remove it?
>
> +1 to deprecate, -1 to remove.
>
> Removal runs the risk of enraging people who use C and can't quickly change their code.
>
> Deprecation would be enough for inside SymPy: the warnings happen in the call site, which is what's getting changed anyway.
>
>> S on the other hand is quite useful. Because the members are
>> singletonized, you can use "is" comparison (much faster than ==
>> comparison).
>
> That's generally true for each singleton.
>
> I'd suggest moving the singletons into the classes where they're created.

I'm not sure if I understand you. The classes are the singletons.
"sympify(0) is S.Zero is Zero" is True.

>
> > But this only works if you have the element itself, so
>> for example, if you want to compare against 0, you have to first
>> sympify 0. "a is S.Zero" is the cleanest way to do this. I'm also
>> rather fond of the tab-completeable shortcut for 1/2 by S.Half in
>> interactive use.
>
> Hm... Zero could be a class.
> This is Python, things can be class, function, and dictionary all at once.

I definitely don't understand what you're saying here.

>
>> Also, unlike C, S only eats up those objects that specifically set it
>> as their metaclass. So the name clash issue that Ronan mentioned does
>> not apply. I also seriously doubt any external code uses S to create
>> its own singletons, though something like that would be perfectly
>> acceptable and supported, so long as the names don't clash with
>> existing names.
>
> They may clash with names created by other people who're using SymPy, so freely combining code written by others isn't possible anymore.
> Also, they may clash with names that future versions of SymPy place in S.

Well, it will be supported modulo any name clashes. If you're worried
about it, you can choose a name that you know will never clash.

Aaron Meurer

>
> For this reason, adding stuff to S by anybody other than the SymPy team would actually be a bad idea IMNSHO.
>
> Just my thoughts.
> Regards,
> Jo
>

Joachim Durchholz

unread,
Sep 30, 2012, 4:37:41 AM9/30/12
to sy...@googlegroups.com
Am 29.09.2012 22:58, schrieb Aaron Meurer:
>> I'd suggest moving the singletons into the classes where they're created.
>
> I'm not sure if I understand you. The classes are the singletons.
> "sympify(0) is S.Zero is Zero" is True.

Heh. Pythonic ignorance on my part I guess.
Backtracking and asking questions to get a picture of the situation
now... here goes:

What's the advantage of writing down "S.Zero" if we can write "Zero"?
Ah... *can* we even write "Zero"? (Not Zero(), that much I do understand
:-) )

>>> Also, unlike C, S only eats up those objects that specifically set it
>>> as their metaclass. So the name clash issue that Ronan mentioned does
>>> not apply. I also seriously doubt any external code uses S to create
>>> its own singletons, though something like that would be perfectly
>>> acceptable and supported, so long as the names don't clash with
>>> existing names.
>>
>> They may clash with names created by other people who're using
>> SymPy, so freely combining code written by others isn't possible anymore.
>> Also, they may clash with names that future versions of SymPy place in S.
>
> Well, it will be supported modulo any name clashes.

I agree with that.

> If you're worried
> about it, you can choose a name that you know will never clash.

I don't understand - how can I know that a name will never clash?

As a SymPy user, I would be unable to control how others (including the
SymPy team) choose names, so couldn't prevent them from accidentally
come up with the same name as I do.

Regards,
Jo

Aaron Meurer

unread,
Sep 30, 2012, 3:37:35 PM9/30/12
to sy...@googlegroups.com
On Sep 30, 2012, at 2:37 AM, Joachim Durchholz <j...@durchholz.org> wrote:

> Am 29.09.2012 22:58, schrieb Aaron Meurer:
>>> I'd suggest moving the singletons into the classes where they're created.
>>
>> I'm not sure if I understand you. The classes are the singletons.
>> "sympify(0) is S.Zero is Zero" is True.
>
> Heh. Pythonic ignorance on my part I guess.
> Backtracking and asking questions to get a picture of the situation now... here goes:
>
> What's the advantage of writing down "S.Zero" if we can write "Zero"?
> Ah... *can* we even write "Zero"? (Not Zero(), that much I do understand :-) )
>
>>>> Also, unlike C, S only eats up those objects that specifically set it
>>>> as their metaclass. So the name clash issue that Ronan mentioned does
>>>> not apply. I also seriously doubt any external code uses S to create
>>>> its own singletons, though something like that would be perfectly
>>>> acceptable and supported, so long as the names don't clash with
>>>> existing names.
>>>
>>> They may clash with names created by other people who're using
>>> SymPy, so freely combining code written by others isn't possible anymore.
>>> Also, they may clash with names that future versions of SymPy place in S.
>>
>> Well, it will be supported modulo any name clashes.
>
> I agree with that.
>
> > If you're worried
>> about it, you can choose a name that you know will never clash.
>
> I don't understand - how can I know that a name will never clash?

If your module is called My Fun Module, you could name your class
MyFunModuleClass. Of course, your could shortcut it as just Class
everywhere in your code. The long name would just be to prevent
clashes in the SingletonRegistry.

Aaron Meurer

>
> As a SymPy user, I would be unable to control how others (including the SymPy team) choose names, so couldn't prevent them from accidentally come up with the same name as I do.
>
> Regards,
> Jo
>

Joachim Durchholz

unread,
Sep 30, 2012, 4:17:49 PM9/30/12
to sy...@googlegroups.com
>> I don't understand - how can I know that a name will never clash?
>
> If your module is called My Fun Module, you could name your class
> MyFunModuleClass. Of course, your could shortcut it as just Class
> everywhere in your code. The long name would just be to prevent
> clashes in the SingletonRegistry.

Well... assume somebody else is writing a My Fun Module, too.
By following the same logic, he'll end with MyFunModuleClass.

Clash.

Aaron Meurer

unread,
Sep 30, 2012, 4:19:46 PM9/30/12
to sy...@googlegroups.com
Yes, but then even "import myfunmodule" will be ambiguous. Of course
you *can* create a name clash, but realistically, it can be avoided.

Aaron Meurer

Joachim Durchholz

unread,
Sep 30, 2012, 7:09:23 PM9/30/12
to sy...@googlegroups.com
Am 30.09.2012 22:19, schrieb Aaron Meurer:
> On Sun, Sep 30, 2012 at 2:17 PM, Joachim Durchholz <j...@durchholz.org> wrote:
>>>> I don't understand - how can I know that a name will never clash?
>>>
>>>
>>> If your module is called My Fun Module, you could name your class
>>> MyFunModuleClass. Of course, your could shortcut it as just Class
>>> everywhere in your code. The long name would just be to prevent
>>> clashes in the SingletonRegistry.
>>
>>
>> Well... assume somebody else is writing a My Fun Module, too.
>> By following the same logic, he'll end with MyFunModuleClass.
>>
>> Clash.
>
> Yes, but then even "import myfunmodule" will be ambiguous. Of course
> you *can* create a name clash, but realistically, it can be avoided.

How would one avoid such a clash then?

Aaron Meurer

unread,
Oct 1, 2012, 12:03:46 AM10/1/12
to sy...@googlegroups.com
If you're so worried about it, use your own separate singleton registry.

Joachim Durchholz

unread,
Oct 1, 2012, 1:48:23 AM10/1/12
to sy...@googlegroups.com
Am 01.10.2012 06:03, schrieb Aaron Meurer:
> On Sun, Sep 30, 2012 at 5:09 PM, Joachim Durchholz <j...@durchholz.org> wrote:
>> Am 30.09.2012 22:19, schrieb Aaron Meurer:
>>
>>> On Sun, Sep 30, 2012 at 2:17 PM, Joachim Durchholz <j...@durchholz.org>
>>> wrote:
>>>>>>
>>>>>> I don't understand - how can I know that a name will never clash?
>>>>>
>>>>>
>>>>>
>>>>> If your module is called My Fun Module, you could name your class
>>>>> MyFunModuleClass. Of course, your could shortcut it as just Class
>>>>> everywhere in your code. The long name would just be to prevent
>>>>> clashes in the SingletonRegistry.
>>>>
>>>>
>>>>
>>>> Well... assume somebody else is writing a My Fun Module, too.
>>>> By following the same logic, he'll end with MyFunModuleClass.
>>>>
>>>> Clash.
>>>
>>>
>>> Yes, but then even "import myfunmodule" will be ambiguous. Of course
>>> you *can* create a name clash, but realistically, it can be avoided.
>>
>>
>> How would one avoid such a clash then?
>
> If you're so worried about it, use your own separate singleton registry.

Heh. I wouldn't use a singleton registry that's prone to name clashes
like this. I've been bitten by these over several times over the
decades, that's why I'm so insistent.

I conclude that S shouldn't be used by anybody but the SymPy team,
unless that somebody is 150% sure that his code won't ever be used by
anybody else.

Joachim Durchholz

unread,
Oct 3, 2012, 7:22:02 AM10/3/12
to sy...@googlegroups.com
Okay, here's my current plan:

Leave S alone for the moment - I don't feel we have reached a consensus
there.

Remove all uses of C inside SymPy, replacing them by imports from the
original module.
Fix any import cycles that that causes. Techniques I have seen:
- Move the import statement into the function that uses it
(dependencies less clear)
- Move the import to the bottom of the file
(dependencies less clear)
(nonidiomatic)
(can cause other problems)
- Replace "import foo from mod" with "import mod"
(requires changing foo() to mod.foo())
(seems to be the least intrusive change for most situations)
- Eliminate the cycle by
- importing just submodules
- splitting a module
(possible if import graph is cyclic, call graph is acyclic)
(sometimes this actually improves the overall program structure)
(sometimes it rips apart stuff that belongs together)
Probably it's mostly the third option, and the last option in obvious cases.

Make a separate pull request per module, except when refactoring.
This should avoid huge pull requests that accumulate conflicts until
they're approved.

Once all pull requests are accepted, deprecate C.


Should I go forward with that?
Advice? Comments?

Regards,
Jo

Aaron Meurer

unread,
Oct 3, 2012, 10:39:48 AM10/3/12
to sy...@googlegroups.com
Sounds good, though I think its mostly the first option, not the
third. And anyway, if importing a function from a module causes
circular import problems, so will just importing the module.

Aaron Meurer

Ondřej Čertík

unread,
Oct 3, 2012, 12:11:00 PM10/3/12
to sy...@googlegroups.com
On Wed, Oct 3, 2012 at 4:22 AM, Joachim Durchholz <j...@durchholz.org> wrote:
> Okay, here's my current plan:
>
> Leave S alone for the moment - I don't feel we have reached a consensus
> there.
>
> Remove all uses of C inside SymPy, replacing them by imports from the
> original module.
> Fix any import cycles that that causes. Techniques I have seen:
> - Move the import statement into the function that uses it
> (dependencies less clear)

I would point out, that the in my experience, it's quite slow to import
things inside a function, which might or might not be a problem, depending
on the situation. I don't know at the moment, how C.something is doing
performance wise.

However, I agree with this cleanup, thanks for doing this.

Ondrej

Joachim Durchholz

unread,
Oct 3, 2012, 3:15:58 PM10/3/12
to sy...@googlegroups.com
Am 03.10.2012 16:39, schrieb Aaron Meurer:
> And anyway, if importing a function from a module causes
> circular import problems, so will just importing the module.

I can only agree after reading and re-reading the piece on the import
statement in the Python docs.

On the other hand, the folks on Stack Overflow are usually
knowledgeable, and they do maintain there's a difference.

I suspect there's a difference in what module initialization code can
expect. So different styles of initialization code could very well
affected by having import foo from bar vs. import bar, not because the
import itself works differently but because the latter style forces
module code to use imported functions in a circular-import-safe manner.

Exploring these subtleties is going to be an, erm, "interesting"
endeavour :-)

Joachim Durchholz

unread,
Oct 3, 2012, 3:38:55 PM10/3/12
to sy...@googlegroups.com
Am 03.10.2012 18:11, schrieb Ondřej Čertík:
> I would point out, that the in my experience, it's quite slow to import
> things inside a function,

I can imagine. The import will run on every call to the function, and
that's always slower than simply looking up the imported name in the
dictionary of the class object.

> which might or might not be a problem, depending
> on the situation.

It's probably proportional to the number of calls to the function.

> I don't know at the moment, how C.something is doing
> performance wise.

It's a call to a Python function that does a dictionary lookup.
I suspect it's slower than a direct call but faster than an import
inside the function.

That's a very vague estimate and probably not worth the bits needed to
store it.
For performance, there's another aspect: all the usually-used
workarounds for circular dependency problems will get some attention by
whoever is tuning Python for performance. Atypical stuff like C won't
get this love, so even if it's fast now, it will fall back behind the
more typical techniques as Python improves.
So while I wouldn't disregard benchmarking results, looking at how
Python is being used in the wild is as important.

BTW I counted around 890 call sites to C.
That's not too bad. Changing the calls to raises() wasn't much different.

Aaron Meurer

unread,
Oct 3, 2012, 7:01:20 PM10/3/12
to sy...@googlegroups.com
If the performance becomes an issue, maybe something like

global sin, cos, imported

def _import_stuff():
global sin, cos, imported
if not imported:
from sympy import sin, cos
imported = True

at the module level would be an alternative. Or if global scares you,
use some kind of namespace.

Aaron Meurer

Sent from my iPad
Reply all
Reply to author
Forward
0 new messages