splitting up python script into smaller files...


e955...@gmail.com

Mar 18, 2015, 6:11:41 PM
to python_in...@googlegroups.com
Hi yo,

I have a long Python script, over 1000 lines now and growing. Would it be wise to start splitting this big program up into smaller Python files? I have two questions:

1) Would this increase the time it takes to run the script? Say it had to load 10 separate Python scripts.

2) How do you do it? Could someone explain this with a simple example?

thanks,
Sam

Justin Israel

Mar 18, 2015, 9:30:31 PM
to python_in...@googlegroups.com
On Thu, Mar 19, 2015 at 11:11 AM <e955...@gmail.com> wrote:
1) Would this increase the time it takes to run the script? Say it had to load 10 separate Python scripts.

Technically, it does add some extra work, since on first import Python has to do more filesystem operations to find and load each file. But it isn't something you should really be concerned with when structuring your application, especially since the files will all be on the same PYTHONPATH and you will benefit from filesystem caching. Once a module has been imported it is also cached in sys.modules, so importing it again from another file costs next to nothing.

Basically... no, don't worry about this.
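
To get a feel for the cost, here is a minimal sketch, assuming Python 3. It writes a throwaway module (the name tiny_helper is made up for this illustration), imports it twice, and times both imports. The exact numbers depend on the machine, so none are quoted here.

```python
import importlib
import sys
import tempfile
import time
from pathlib import Path

# Write a tiny throwaway module to a temporary folder (hypothetical name).
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "tiny_helper.py").write_text("VALUE = 42\n")
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()

start = time.perf_counter()
import tiny_helper  # first import: find the file, compile it, run it
first_import = time.perf_counter() - start

start = time.perf_counter()
import tiny_helper  # second import: just a lookup in sys.modules
second_import = time.perf_counter() - start

print("first import : %.6f s" % first_import)
print("second import: %.6f s" % second_import)
print("cached:", "tiny_helper" in sys.modules)
```

Both timings are normally tiny compared with the work a real script does after its imports.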
 

2) How do you do it? Could someone explain this with a simple example?


It isn't so much a matter of starting to write a script and then going "well, I have hit 1000 lines, so it's time to split!". What you should probably focus on is grouping common functionality into modules. This will allow you to keep doing related work within a given module, as opposed to jumping between pieces of logic that were split just for the sake of splitting. It also promotes re-usability, because you may have some interesting operations that can be imported by some other app without pulling in dependencies that some of your other modules require. Let's say you require a certain GUI library in part of your application, but you also have a bunch of great file parsing logic. If you kept them both in the same module, then other applications would need the GUI library available just to import the code and get at the file parsing, even though the two may be completely unrelated.

For example, if you were developing a GUI application, you could group together the logic for the various views and keep your business logic separate, such as the logic that talks to a database or interacts with a network API.
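
As a rough sketch of that separation (the module name fileparsing.py and the function parse_settings are hypothetical, not taken from anyone's project), the parsing side might look like this. It imports nothing GUI-related, so any other tool can reuse it.

```python
# fileparsing.py (hypothetical module name)
# Only plain Python here, so any tool can reuse this module
# without having a GUI toolkit installed.

def parse_settings(text):
    """Parse 'key = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip()
    return settings


# A separate GUI module (say, views.py) would then do:
#     from fileparsing import parse_settings
# and keep all of the widget code on its own side.

if __name__ == "__main__":
    print(parse_settings("# demo\nname = rig\nversion = 2\n"))
```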

 

--
You received this message because you are subscribed to the Google Groups "Python Programming for Autodesk Maya" group.
To unsubscribe from this group and stop receiving emails from it, send an email to python_inside_maya+unsub...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/python_inside_maya/55b56fbf-0525-427b-b8b8-a21028df5bc5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Joe Weidenbach

Mar 18, 2015, 11:06:15 PM
to python_in...@googlegroups.com
Hi Sam,

Honestly, it doesn't sound like your file is too big yet, from a Python perspective.  I've seen MUCH longer single scripts, and without an idea of the purpose or structure of your script it's tough to say what the best way to split it up is.  In my experience (I come from a C++/C# background), Python likes to have much more monolithic scripts than I'm used to.  I've always preferred to make one class per file, and maybe several files with utility functions.  In Python, however, this can quickly run you into problems with circular dependencies, especially if you have classes that communicate in both directions.  There are definitely things you can do design-wise to mitigate this, but at a certain point you're working against the way Python was designed. I wish I had a concrete example for you, but unfortunately that's about 200 iterations ago on my current project.  I've since redesigned twice as I wrapped my head around more of the Python "way", and learned deeper techniques like Dependency Injection and Inversion of Control that let me decouple classes more effectively.

The first thing to realize is that an import is not the same as an include, and we don't have a #pragma once or #ifndef __HEADER_H guard we can set up.  Python does cache each imported module in sys.modules, so a module's code only runs once no matter how many files import it, but that doesn't help two modules that need each other's names while they are still loading.  Technically, of course, there are ways to work around that, but again with that approach I'd argue you're trying to force Python into working like C++, which is not how it's intended to be used.

Python's import system is built around the idea of modules, where a module maintains some specific standalone functionality, be it audio, a rig system, etc.  The idea is that each module can stand on its own without dependencies internal to your overall project.

The second part of the import system is packages, which are basically a folder of modules, containing a special file called __init__.py.  This file can have no content, and the whole thing will work as a single module.  If you do want special content in __init__.py, you can put it in, and that code will execute when you import the package.  I personally use this functionality to explicitly expose the classes I intend for use outside the package.  The key thing to remember with a package is that from Python's perspective, the whole thing is basically one module once it's loaded.  So, with that in mind, Packages should also be self-sufficient.
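
Here is a minimal sketch of a package whose __init__.py exposes one class, assuming Python 3. The package name rigtools and the class Rig are invented for the illustration, and the files are written to a temporary folder only so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are made up):
#   rigtools/
#       __init__.py   <- exposes the public class
#       _common.py    <- implementation detail
root = Path(tempfile.mkdtemp())
package_dir = root / "rigtools"
package_dir.mkdir()
(package_dir / "_common.py").write_text(
    "class Rig(object):\n"
    "    def __init__(self, name):\n"
    "        self.name = name\n"
)
(package_dir / "__init__.py").write_text(
    "from ._common import Rig  # runs when the package is imported\n"
    "__all__ = ['Rig']\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import rigtools

# Callers use the name exposed by __init__.py and never touch _common directly.
rig = rigtools.Rig("arm")
print(rig.name)
```

Code outside the package only ever writes rigtools.Rig, so the internal files can be reorganised later without breaking callers.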

Basically, here's what this means in a nutshell.  When you start splitting things up, you need to have a solid design already figured out, or you're going to be in for headaches down the road.  And when I say solid design, I mean far more solid than most programs you write in other languages.  

Take C++, for example.  In C++ I can make two classes that rely on each other.  Good design? Most likely not.  I surround the headers with compilation guards to make sure each one only gets loaded into memory once, and I'm good to go.  That's not to say I have good design, and I'll probably have headaches down the road as I try to maintain this as a large project, but it will probably compile and run, which in the prototype stage is all I'd need.

In Python, there'd be no such luck.  If I put those classes in two separate files, and then have each file import the other at the top, the program will usually fail with a cryptic message about a name that can't be imported or isn't defined.  Why? Well, because the first file will start to load, and will then try to import the other, which, of course, leads back to importing the first file.  Python doesn't load the first file a second time; it hands back the half-loaded module it already has in sys.modules.  At that point the class in the first file hasn't been defined yet, so the second file can't find it, leading to those errors.
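
The failure can be reproduced with two throwaway modules, sketched here under the assumption of Python 3, where it surfaces as an ImportError. The module names circ_a and circ_b are made up for the demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two throwaway modules that import each other at the top (made-up names).
tmp_dir = Path(tempfile.mkdtemp())
(tmp_dir / "circ_a.py").write_text(
    "from circ_b import B\n"
    "class A(object):\n"
    "    pass\n"
)
(tmp_dir / "circ_b.py").write_text(
    "from circ_a import A\n"
    "class B(object):\n"
    "    pass\n"
)
sys.path.insert(0, str(tmp_dir))
importlib.invalidate_caches()

try:
    import circ_a
    failed = False
except ImportError as exc:
    # circ_b asked for circ_a.A before circ_a had finished running,
    # so the name A did not exist yet.
    failed = True
    print("ImportError:", exc)
```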

In this case, the easy solution would be to ensure that cross-dependent classes are in the same file, and then everything will work.  That doesn't solve the underlying design problem, but it will again make it work.  The better long-term solution is to redesign your classes so that they don't rely on each other.
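
One way to redesign, sketched below with invented class names, is to pass one object's method into the other instead of having the two classes import each other. This is only an illustration of the dependency-injection idea, not the design of any project in this thread.

```python
# Hypothetical classes; in a real project each could live in its own module,
# and neither module would need to import the other.

class Skeleton(object):
    """Knows nothing about Controller; it just calls whatever it was given."""

    def __init__(self, on_change):
        self._on_change = on_change  # injected dependency (any callable)
        self.joints = []

    def add_joint(self, name):
        self.joints.append(name)
        self._on_change(name)


class Controller(object):
    """Knows nothing about Skeleton; it only exposes a handler."""

    def __init__(self):
        self.log = []

    def handle_change(self, joint_name):
        self.log.append("joint added: " + joint_name)


# The wiring happens in one place, typically the main script.
controller = Controller()
skeleton = Skeleton(on_change=controller.handle_change)
skeleton.add_joint("shoulder")
print(controller.log)
```

Only the script that does the wiring needs to know about both classes, so the import graph stays one-directional.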

I bring all of this up because it bit me hard early on in my Python journey, when my instincts were more along the lines of putting everything in separate files.  Splitting things up will definitely bring any design flaws you might have to the surface, which, if you're still developing on a prototype, could be more of a headache than it's worth.  If, on the other hand, you've already got a working system that you want to break up, what Justin recommended is exactly right.  Look at your common functionalities, and put them in their own modules, based on area of operation.  I use a structure similar to the following:

--Project Root
    --base # This contains global functionality that is specific to my overall project
    --modules # This is where I put plugins that I'm going to be using, and is where most of my active development happens. I develop each piece of functionality in my tools as a standalone plugin that I can include or not at will.  That's backed by a dynamic loading system (part of settings) that will check this directory and find available packages. Each of these modules is separate from the others, and with the exception of one or two of the base required modules, they don't talk to each other.  They can only import from base, settings, or utils.  This is actually one I plan on refactoring further at some point in the future when I'm further along, so that my base modules will go in a completely different place.  That way, no Plugin module will rely on another plugin.  I'll probably accomplish this by moving required modules into a modules package in base.
    --settings # This contains my base settings system to manage options etc
    --ui # Global Project UI functionality. Mostly, this is my main (non-module-specific) windows.  These windows will call into my modules for their specific UI elements.
    --utils # Useful functions that I include in any project I use them in.  I have a library of files I can pull in, separated by functionality.  For example, I have Sequencing Utilities, Functions for Containers, Namespaces, and so on.  The key with this directory is that the modules within have no dependencies on anything else internal to my project.  So, the only things they'd import are Python libraries, maya.whatever, or PySide stuff.  I can import these from any other scripts except from utils safely.
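
The post doesn't show how the dynamic loading works, so the following is only a guess at the general idea, assuming Python 3 and the standard pkgutil module. The plugin names and the PLUGIN_LABEL attribute are hypothetical.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path

# Fake "modules" directory holding two plugin packages (made-up names).
plugins_dir = Path(tempfile.mkdtemp())
for plugin_name in ("demo_arm_plugin", "demo_leg_plugin"):
    package_dir = plugins_dir / plugin_name
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text(
        "PLUGIN_LABEL = %r\n" % plugin_name.replace("_", " ")
    )

sys.path.insert(0, str(plugins_dir))
importlib.invalidate_caches()

# pkgutil.iter_modules lists what can be imported from the given folders.
loaded = {}
for module_info in pkgutil.iter_modules([str(plugins_dir)]):
    if module_info.ispkg:  # only treat packages as plugins
        loaded[module_info.name] = importlib.import_module(module_info.name)

for name in sorted(loaded):
    print(name, "->", loaded[name].PLUGIN_LABEL)
```

Dropping a new package into the folder is then enough for it to be picked up on the next run, with no import lines to edit elsewhere.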

The reason I show this is that this is the level of thought you have to put into it when you start to break things up on a complex project.  I'm obsessed with clean, easy to read, maintainable code, so as I mentioned earlier, I like to break things up.  But, when you do that, it comes with complexity that doesn't tend to rear its ugly head in single file programs.  With that said, this project has complexity yours might not--it's a flexible auto-rigger development framework that supports dynamic modules and abstracts away a lot of the more technical aspects of the rigging process.  It's currently sitting at close to 15,000 lines of code across 40-50 files, and I'm anticipating that when I'm "done" with it that'll be closer to 30-40,000 lines of code.  In short, it's a huge project, for one person.

So yeah, the methodology for splitting up larger files really depends on what your code is trying to do, how you separate concerns, etc.

For a simpler setup, here's what one of my modules looks like:

--Module Root
    --blocks # A subdirectory for a module that supports plugins.  This uses functionality similar to my global module loading.
    --_common.py # This module contains most of my data-handling classes and functions.
    --_settings.py  # This module hooks into my global settings system and registers the settings this module uses
    --_ui.py           # This module contains my UI specific code, so the PySide stuff for this module.
    --__init__.py    # The init file primarily establishes overall module information for the larger system, and imports the primary externally accessible Class for this module into the package namespace.

Hope that helps. If you can give some more info on the specifics of your tool, I can give more specifics on the process of re-combining things once you separate.


Jesse Kretschmer

Mar 19, 2015, 3:55:02 AM
to python_in...@googlegroups.com
Sam,
As Justin pointed out, don't worry about the load time increase. Focus on organization.

If you have never made a module or package, read this answer: http://stackoverflow.com/a/15747198

Now, here is my opinion: There is no maximum line count. I can't be sure that your project needs to be split into multiple files. But since you asked, you should explore this aspect of Python. Jump in and do it. Even if you only put a couple of functions in a module, you will learn something. Have fun!

Lastly, try to write code that is easy for people to read, not computers. 
>>> import this

-jesse

Joe Weidenbach

Mar 19, 2015, 4:12:48 AM
to python_in...@googlegroups.com
Yeah, definitely the right answer.  There are a lot of ups and downs in Python, but you can't learn it without trying.  That whole brain dump of mine came from a long time fighting with it, and I wouldn't trade it for anything.  You just have to be aware of what's going on.

That's a really good link, as well.  I'll try to write up a basic import example tomorrow :)


Fredrik Averpil

Mar 19, 2015, 4:21:48 AM
to python_in...@googlegroups.com
This is a very interesting topic!

I'm trying to picture myself splitting up a generic and quite large python script. It almost requires you to rewrite it from scratch, I would say, although with a lot of copy-pasting from the old script of course.

I wouldn't say I'm an expert in how to lay out your code, but ;) I've designed an application of some 10,000 lines of code in which, basically, a series of classes inherit from each other. I've also grouped the classes into separate files. Example:

 
So, myApp.py is my main application. It imports the configuration appConfig.py and makes the class "Launcher" in myApp inherit the configuration.

Then appBrowser.py and appToolbox.py are modules which in turn import and inherit a shared library called appCommon.py. For example, using this setup all database queries can be performed through the DBQuery class in appCommon only, so this is the only file which needs to import the psycopg2 package to access the db. Since appCommon.DBQuery is talking to the database, it also needs stuff from the appConfig.py which it therefore inherits.

In appCommon.py, I'm daisy-chaining classes so that one final class has access to all of the classes' methods. If you don't want to do this, you can instead create classes that inherit only the methods you are interested in passing on to a module.
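
A stripped-down sketch of that kind of chain could look like the following. The class names AppConfig, FileTools and Browser and all of the method bodies are invented stubs based only on the description above, and the database call is faked so nothing here depends on psycopg2.

```python
# All names mirror the description in the post, but the bodies are invented stubs.

class AppConfig(object):
    """Stands in for appConfig.py: shared settings."""
    db_host = "localhost"


class DBQuery(AppConfig):
    """Stands in for appCommon.DBQuery; a real one would talk to the database."""

    def query(self, sql):
        # Stub: report what would be sent instead of touching a database.
        return "would run %r on %s" % (sql, self.db_host)


class FileTools(DBQuery):
    """Next link in the chain: adds helpers, keeps everything above it."""

    def normalise(self, path):
        return path.replace("\\", "/")


class Browser(FileTools):
    """Stands in for a class in appBrowser.py; inherits the whole chain."""


browser = Browser()
print(browser.query("SELECT 1"))
print(browser.normalise("C:\\shots\\010"))
```

The trade-off is that every class at the end of the chain carries everything above it, whether it needs it or not.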

Does it make sense?
I would be very happy to get any kind of feedback on this kind of layout. It's just something I've come up with and based a large application on without really having anyone to bounce ideas off. I'm sure there are lots of cons with this setup, but for me/us it works quite well so far.

// Fredrik






e955...@gmail.com

Mar 19, 2015, 8:03:08 AM
to python_in...@googlegroups.com
Wow!

Thanks everyone for taking the time to reply to this. I've only had a chance to glance at the responses, but I will dive into this soon. It seems very interesting.

thanks,
Sam