A Comparison of Python Class Objects and Init Files for Program Configuration

metaperl 9/11/06 11:57 PM
A Comparison of Python Class Objects and Init Files for Program Configuration
================================================================================

Terrence Brannon
bau...@metaperl.com
http://www.livingcosmos.org/Members/sundevil/python/articles/a-comparison-of-python-class-objects-and-init-files-for-program-configuration/view

Abstract
--------

Init files serve as a convenient mini-language for configuring
applications. The advantages of such a mini-language are:

* Non-technical people can edit them.
* Program behavior can be configured without modifying the source.

The author `has always been suspicious of mini-languages
<http://perlmonks.org/?node_id=428053>`_, initially in the context of
dynamic HTML generation.

This document compares two approaches to program configuration: the
use of `a popular Python mini-language
<http://www.voidspace.org.uk/python/configobj.html>`_ and the use of
`Python class objects <http://www.mired.org/home/mwm/config.html>`_.

The Problem Space
-----------------

I work for a company that takes data in various formats (e.g., CSV,
XML, Filemaker) and integrates them all into our database. The 30 or
so scripts that did this were written in Perl. I was told to rewrite
them and elected to use Python.

The Initial Rewrite Using Init Files
------------------------------------

In the initial version, I used a generic config file and overrode its
values with those from a local, script-specific one::

  from configobj import ConfigObj

  # Start from the generic settings, then let the local values override them.
  cnf = ConfigObj("../generic.ini")
  cnf.merge(ConfigObj("local.ini"))
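
The layering idea (generic defaults overridden by local values) can be illustrated without ConfigObj. What follows is only a minimal sketch with plain nested dictionaries. It is not ConfigObj's implementation, and ``merge_settings`` and the sample values are hypothetical names chosen for illustration:

```python
import copy


def merge_settings(base, override):
    """Return a new dict: `base` updated recursively with `override`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are sections: merge them key by key.
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalar (or new section): the override wins.
            merged[key] = copy.deepcopy(value)
    return merged


# Placeholder data standing in for generic.ini and local.ini.
generic = {"ke_ftp": {"host": "ftp.example.com", "cwd": "/", "filepattern": ".gz"}}
local = {"ke_ftp": {"cwd": "DMS/companies"}}

cnf = merge_settings(generic, local)
```

Here the local file only needs to mention the keys it changes; everything else falls through to the generic defaults.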

I then proceeded to log in to an FTP server, check for a certain
file, and download it if it existed::

  host = ftputil.FTPHost(cnf['ke_ftp']['host'],
                       cnf['ke_ftp']['user'],
                       cnf['ke_ftp']['pass'])

  host.chdir(cnf['ke_ftp']['cwd'])
  found = ""
  for f in host.listdir(host.curdir):
     if f.endswith( cnf['ke_ftp']['filepattern'] ):
        found = f
        print "Downloading", found
        host.download(f, f, 'b')

  if found == "":
     print "No file with pattern", cnf['ke_ftp']['filepattern'], "on server"
     sys.exit()

Now let's see the object-oriented configuration
------------------------------------------------

Instead of generic and specialized config files, one would initially
think of using inheritance when going for an OO approach. However,
each program can get configuration information from a number of
places. As a result, HAS-A will work better than IS-A in this case::

 class config(object):

   # best to use has-a because we need config info from all over the place
   import data.config.ke

   ke = data.config.ke.ftp()
   ke.user     = 'dmsdatasystems'
   ke.password = 'sredna?'
   ke.cwd      = 'DMS/companies'
   ke.filepattern = 'companies.csv.gz'

   import data.config.snap
   snap = data.config.snap.ftp()


   """
   import data.storage
   storage = data.storage.logic()
   print dir(storage)
   sys.exit()"""

 class proc(object):

   def __init__(self):
      self.cnf = config()

   def fetch_file(self):
      host = ftputil.FTPHost(self.cnf.ke.host, self.cnf.ke.user,
                             self.cnf.ke.password)
      host.chdir(self.cnf.ke.cwd)

      self.downloaded_file = ""
      for f in host.listdir(host.curdir):
         if f.endswith( self.cnf.ke.filepattern ):
            self.downloaded_file = f
            print "Downloading", f
            host.download(f, f, 'b')
            print "Downloaded", f

      if self.downloaded_file == "":
         print "No file with pattern", self.cnf.ke.filepattern, "on", self.cnf.ke.host
         sys.exit()
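
The HAS-A arrangement can also be sketched in a self-contained form. The class names, hosts, and defaults below are hypothetical stand-ins (``FtpSettings`` plays the role of something like ``data.config.ke.ftp()``), so this shows the shape of the idea rather than the actual modules used above:

```python
class FtpSettings(object):
    """Hypothetical settings object for one FTP endpoint."""

    def __init__(self, host, user="anonymous", cwd="/"):
        self.host = host
        self.user = user
        self.cwd = cwd


class Config(object):
    """HAS-A: the program's config holds one settings object per source."""

    def __init__(self):
        # Each attribute is an independent settings object; nothing is inherited.
        self.ke = FtpSettings("ke.example.com", cwd="DMS/companies")
        self.snap = FtpSettings("snap.example.com", user="uploader")


cnf = Config()
```

Creating the settings objects in ``__init__`` rather than in the class body is a design choice in this sketch: each ``Config`` instance then gets its own copies, so changing one instance does not affect another.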


Evaluation
==========

Appearance
----------

Accessing object attributes, IMHO, is much cleaner looking. Compare::

  config['munger']['outfile']

with::

  config.munger.outfile


The second takes fewer characters to type and lacks the noisy quote marks
(that's one of the things I miss from Perl hashes: the ability to
index into hashes using bare words without quotes).
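
If the data already lives in nested dictionaries, attribute-style access can be layered on top. The following is a rough sketch only; ``AttrView`` is a hypothetical helper, not part of ConfigObj or the standard library, and it is read-only:

```python
class AttrView(object):
    """Read-only attribute-style view over a nested dictionary."""

    def __init__(self, data):
        self._data = data

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        try:
            value = self._data[name]
        except KeyError:
            raise AttributeError(name)
        # Wrap nested sections so that chained access keeps working.
        return AttrView(value) if isinstance(value, dict) else value


# Placeholder data; the keys mirror the example above.
config = AttrView({"munger": {"outfile": "companies.out"}})
outfile = config.munger.outfile
```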

Power
-----

A mini-language is a moving target. Mini-languages tend to grow over
time, accumulating ad hoc semantics and peculiarities. I'm on
the ConfigObj mailing list, and they are currently discussing how to
handle multi-line config values. In Python, you simply break
out a triple-quoted string (``"""``) and you are done. This epitomizes
why I prefer a library to a mini-language. The library has the full
power of a widely used, widely debugged language, while the
mini-language has a smaller user community and fewer man-hours of
development behind it.
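
As a small illustration of the multi-line case, a class-based config can hold such a value as an ordinary string. The class name and text below are made up for the example, and ``textwrap.dedent`` is used only to keep the source indentation out of the value:

```python
import textwrap


class ReportConfig(object):
    # A multi-line value needs no special syntax: it is an ordinary string.
    header = textwrap.dedent("""\
        Company import report
        Generated by the nightly job
        """)


header_lines = ReportConfig.header.splitlines()
```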

Learning Curve
--------------

Again, once you learn Python, you can use it for configuration as well
as programming. And Python is a very readable, clean and regular
language - it doesn't look very different from a configuration
language.

The cost of studying the syntax and semantics of a separate
configuration language can outweigh its benefits.


Maintenance
-----------

It is harder to hire people with a good background in object-oriented
programming than it is to find a "scripter".

By taking the
object-oriented route with this program, I have made it harder for
people to maintain my code in one sense but easier in another. A
well-decomposed OO program, especially in a language as elegant and
powerful as Python, is truly something to enjoy. To those versed in
such technologies, it is probably easier and more fun to extend such a
program.

On the other hand, a config file and a script would allow a person
with average scripting skills, perhaps even with no Python background,
to make small edits and extensions with little difficulty.

Manipulation
------------

Indexing into a hash with strings is a bit easier to parameterize
than accessing object attributes. For example, to unpack several
hash values into variables, you can do this::

  [a, b, c] = map(lambda k: config[k], "key1 key2 key3".split())

Getting attributes from an object is slightly wordier::

  [a, b, c] = map(lambda k: getattr(cnf, k), "key1 key2 key3".split())
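
For what it is worth, the standard ``operator`` module narrows this gap: ``itemgetter`` and ``attrgetter`` accept several names and return the values as a tuple. A brief sketch, with placeholder keys and values:

```python
from operator import attrgetter, itemgetter

keys = "key1 key2 key3".split()

# Dictionary-style configuration (sample values are placeholders).
config = {"key1": 1, "key2": 2, "key3": 3}
a, b, c = itemgetter(*keys)(config)


# Object-style configuration with the same placeholder values.
class Cnf(object):
    key1, key2, key3 = 1, 2, 3


x, y, z = attrgetter(*keys)(Cnf)
```

With these helpers the two forms are the same length, so the difference comes down to which container the settings live in.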

Self-documenting problem decomposition
--------------------------------------

One thing I really like about the OO approach is that it provided an
affordance to break down the steps of the processing into separate
methods. This leads to self-documenting code::

   if __name__ == '__main__':

      p = proc()

      p.fetch_file()
      p.unzip_file()
      p.munge_data()
      p.snap_upload()
      p.archive_zipdir()

In contrast, when I wrote the program using config files, I had
comments interspersed in the linear stream-of-consciousness script to
break out sections::

  # --------------------------------------------------------------------------
  # Fetch DMS file from KE FTP
  # --------------------------------------------------------------------------

Now, granted, one *could* create a bunch of functions and decompose the
problem in that way. It's just that structuring a program around one
or more mini-languages, as opposed to object-oriented libraries, does
not provide as much of an affordance for such an approach.
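
To make that concession concrete, here is a rough sketch of a config-file-driven script decomposed into functions. The step bodies are placeholders that do no real FTP or unzip work, and the function names simply mirror the method names used earlier:

```python
def fetch_file(cnf):
    """Placeholder step: would download the file named by the config."""
    return cnf["ke_ftp"]["filepattern"]


def unzip_file(cnf, filename):
    """Placeholder step: would decompress `filename`; here it only strips '.gz'."""
    return filename[:-3] if filename.endswith(".gz") else filename


def main(cnf):
    # The top-level flow reads as a list of steps, as in the OO version.
    downloaded = fetch_file(cnf)
    return unzip_file(cnf, downloaded)


# A plain dict stands in for the parsed config file.
result = main({"ke_ftp": {"filepattern": "companies.csv.gz"}})
```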

This Document
=============

This document was prepared using `reStructuredText
<http://docutils.sourceforge.net/rst.html>`_.