Config-file driven development

When I write software, I try to imagine how the users will use it and anticipate what changes in behaviour or new features they may want in the future. I then try to make that behaviour available through simple changes to human-readable configuration files instead of code changes. I like being in a meeting, representing my department, to discuss what changes, timelines and costs are required for some new functionality, and while the other department is claiming 6 months and hundreds of thousands of dollars, I can state that my end will require 37 seconds. (Yes, I really have said this. More than once.) That’s how long I estimated it would take me to change a few lines in a config file.

Much of the software I have written over the last several years has been diagnostic software for call-centres and Tier-X level support for devices in your home, such as Cable-Modems, Set-Top-Boxes, (telephone) Media Terminal Adapters, etc. That software behaves differently for each group or department depending on which config file is used. So instead of writing several similar programs with slightly different behaviour, I try to write a single program that uses different configuration files depending on the caller.

Years ago, I decided I wanted a specific config file format that I considered human readable, that could be recursive by including other config files, that could have continuation lines and comments, and that could be given an optional definitions file to change the behaviour from the default of keyword = value, where value is a scalar (simple value). So I wrote a library function, initially in Perl. I then used this library function for almost anything new that I wrote, so config file formats were consistent across all my diagnostic software.

I noticed that as I started to write new software for new projects, the first thing I did was to write the config file. I would look at what I had written in the config file, try to think of anything that a user might want to add or change in the future, and stick that in the config file too. Only then would I write the code to implement the behaviour for that config file. I’ll call this config-file driven development. For example, let’s say I was writing a web service that may, or may not, use syslog. I might have a section in the config file that has:

syslog:
    use-syslog          = yes
    SYSLOG-priority     = LOG_INFO
    SYSLOG-facility     = LOG_LOCAL7
    SYSLOG-ident        = MyProgram
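
As a rough illustration of how a program might act on that section, here is a minimal sketch using Python's standard syslog module (Unix only). It assumes the config library has already handed back the section as a plain dict of strings; that return shape, and the helper names syslog_constant and make_logger, are assumptions made for this example and are not part of the library described here.

```python
import syslog

# The 'syslog' section as it might come back from the config library.
# A plain dict of strings is an assumption made for this sketch.
settings = {
    "use-syslog": "yes",
    "SYSLOG-priority": "LOG_INFO",
    "SYSLOG-facility": "LOG_LOCAL7",
    "SYSLOG-ident": "MyProgram",
}


def syslog_constant(name):
    """Translate a name such as 'LOG_INFO' into the syslog module's constant."""
    if not name.startswith("LOG_") or not hasattr(syslog, name):
        raise ValueError("unknown syslog constant: " + name)
    return getattr(syslog, name)


def make_logger(settings):
    """Return a function that logs one message, or does nothing if disabled."""
    if settings.get("use-syslog", "no") != "yes":
        return lambda message: None
    priority = syslog_constant(settings["SYSLOG-priority"])
    facility = syslog_constant(settings["SYSLOG-facility"])
    syslog.openlog(settings["SYSLOG-ident"], 0, facility)
    return lambda message: syslog.syslog(priority, message)
```

Turning logging off, or moving it to another facility, is then a one-line change in the config file, and the program itself is not touched.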

File format

The syslog example above uses a few simple keyword = value lines where the values are simple scalars. They are grouped into a section called ‘syslog’. A section is just a way of organizing keywords. If no definitions file is given when calling the config library routine, then the keywords (use-syslog, etc.) can be anything and the values are simple scalars. But an optional definitions file can be given in which you specify that keywords have different types. The valid types are scalar (default), array, and hash. The definitions file can also specify allowed-values for a keyword. So for the example keyword ‘use-syslog’ above, you could specify that the only allowed values are ‘yes’ and ‘no’. If a definitions file is provided, then only keywords defined in that file can be used, unless the option ‘AcceptUndefinedKeywords’ is set to true when calling the config routine.
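
To make the basic format concrete, here is a minimal sketch of a parser for sections, keyword = value lines, comments and continuation lines. It is not the library described in this post. It assumes that comments start with ‘#’ (the comment marker is not stated here), that a continuation line is marked by a trailing backslash as in the examples in this post, and that indentation is not significant. It ignores included files and definitions files entirely, and the name parse_config is made up for the example.

```python
def parse_config(text):
    """Parse 'section:' headers and 'keyword = value' lines into nested dicts."""
    sections = {}
    current = None  # name of the section we are currently inside
    pending = ""    # accumulates a logical line across continuation lines
    for raw in text.splitlines():
        line = raw.strip()
        if not pending and (not line or line.startswith("#")):
            continue  # blank line or comment ('#' is an assumed marker)
        if line.endswith("\\"):
            # Trailing backslash: keep collecting, even if spaces follow it
            # in the file (strip() above has already removed them).
            pending += line[:-1].rstrip() + " "
            continue
        logical = pending + line
        pending = ""
        if "=" not in logical and logical.endswith(":"):
            current = logical[:-1].strip()
            sections.setdefault(current, {})
            continue
        if "=" not in logical or current is None:
            raise ValueError("cannot parse line: " + raw)
        keyword, value = logical.split("=", 1)
        sections[current][keyword.strip()] = value.strip()
    if pending:
        raise ValueError("file ends in the middle of a continued line")
    return sections
```

Every value comes back as a plain string here, which matches the default scalar behaviour when no definitions file is given.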

An example of a config section that uses array and hash types, with only certain allowed values for one of its keywords, might be:

cable-modem:
    ip-domains = ip4, ip6
    ip4    = %SERIAL-NUMBER%.%SUB-DOMAIN1-NET4%.mydomain.net, \
             %SERIAL-NUMBER%.%SUB-DOMAIN2-NET4%.mydomain.net, \
             %SERIAL-NUMBER%.%SUB-DOMAIN3-NET4%.mydomain.net
    ip6    = %SERIAL-NUMBER%.%SUB-DOMAIN1-NET6%.mydomain.net
    variables  = SUB-DOMAIN1-NET4 = a.ip4-subnet1, \
                 SUB-DOMAIN2-NET4 = b.ip4-subnet2, \
                 SUB-DOMAIN3-NET4 = c.ip4-subnet3, \
                 SUB-DOMAIN1-NET6 = a.ip6-subnet1

where the definitions file for the above would contain:

keyword         = variables
type            = hash
separator       = ,

keyword         = ip4
type            = array

keyword         = ip6
type            = array

keyword         = ip-domains
type            = array
allowed-values  = ip4, ip6

The cable-modem example above is a bit contrived, but hopefully you get the idea. Any name surrounded by percent signs is a variable that the calling program would expand. Most of those variables are defined in the config file itself, and presumably the program knows the serial number of the cable-modem device. On a system that supported both IPv4 and IPv6, you would have ‘ip-domains’ set as above to ‘ip4, ip6’. But for an IPv4-only system, you would remove the ‘, ip6’ from the ip-domains line. (You could keep the IPv6 definitions in the file for future use and let the ip-domains line determine which protocols are supported.)
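
Here is a minimal sketch of how a calling program might consume the cable-modem section: convert the raw values according to the definitions, check allowed-values, expand the %NAME% variables, and let ip-domains decide which host lists are used. The dict layout of DEFINITIONS, the function names, the ‘NAME = value’ form of each hash entry and the default comma separator for arrays are all assumptions made for this illustration; the real library may do this differently.

```python
import re

# Mirrors the definitions file above; the dict layout is an assumption.
DEFINITIONS = {
    "variables": {"type": "hash", "separator": ","},
    "ip4": {"type": "array"},
    "ip6": {"type": "array"},
    "ip-domains": {"type": "array", "allowed-values": ["ip4", "ip6"]},
}


def convert_value(keyword, text, definitions):
    """Turn the raw text of a value into a scalar, an array or a hash."""
    definition = definitions.get(keyword, {})
    kind = definition.get("type", "scalar")
    if kind == "scalar":
        items = [text]
    else:
        separator = definition.get("separator", ",")
        items = [item.strip() for item in text.split(separator) if item.strip()]
    allowed = definition.get("allowed-values")
    if allowed is not None:
        for item in items:
            if item not in allowed:
                raise ValueError("%s: value %r is not allowed" % (keyword, item))
    if kind == "hash":
        pairs = (item.partition("=") for item in items)
        return {name.strip(): value.strip() for name, _, value in pairs}
    return text if kind == "scalar" else items


def expand(template, variables):
    """Replace each %NAME% with variables[NAME]; unknown names are left as-is."""
    return re.sub(r"%([A-Za-z0-9_-]+)%",
                  lambda m: variables.get(m.group(1), m.group(0)),
                  template)


def hostnames(section, serial_number, definitions=DEFINITIONS):
    """List the expanded host names for every protocol named in ip-domains."""
    typed = {k: convert_value(k, v, definitions) for k, v in section.items()}
    variables = dict(typed.get("variables", {}))
    variables["SERIAL-NUMBER"] = serial_number  # supplied by the program
    return [expand(host, variables)
            for domain in typed["ip-domains"]
            for host in typed[domain]]
```

With ‘ip-domains = ip4’, hostnames() returns only the IPv4 names even though the ip6 line is still present in the section, which is the behaviour described above.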

Programming methodology

When I wrote the first version of the config library routine in Perl, I used procedural programming techniques. This worked well, but the routine returned a pointer to a complex data-structure and the calling program needed intimate knowledge of this structure. For this reason, whenever I started coding a new program that called this routine for a new config file, the first thing I would do was write debug code to dump out what I thought I had read in, just to be sure I was walking the data-structure correctly. This wasn’t very elegant. Although I am rarely an object-oriented programmer, this is one case that seems to cry out for it. It is much more elegant to be able to make calls like:

# Config is the class provided by the config library
conf = Config('blah.conf', 'blah-defs.conf')
for section in conf.get_sections():
    print("\t" + section)
    for keyword in conf.get_keywords(section):
        # values may be a scalar, an array or a hash, depending on the definitions
        values = conf.get_values(section, keyword)
        print("\t\t" + keyword + " = ", values)

The above Python code would, for each section, print out each keyword and its value(s). This is a lot cleaner than diving down into some data structure. (For brevity, I removed the error checking around getting the Config object.)
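
To show why the accessor style keeps callers away from the underlying structure, here is a minimal sketch of a class offering the same three calls. SketchConfig is a hypothetical stand-in, not the actual library class: it wraps a dict that has already been parsed, whereas the real Config reads a config file and an optional definitions file.

```python
class SketchConfig:
    """Hypothetical stand-in for Config; wraps an already-parsed dict."""

    def __init__(self, data):
        # {section: {keyword: value}}; callers never touch this directly
        self._data = data

    def get_sections(self):
        return list(self._data)

    def get_keywords(self, section):
        return list(self._data[section])

    def get_values(self, section, keyword):
        return self._data[section][keyword]


def dump(conf):
    """Return one line per section and keyword, using only the accessors."""
    lines = []
    for section in conf.get_sections():
        lines.append(section)
        for keyword in conf.get_keywords(section):
            value = conf.get_values(section, keyword)
            lines.append("    %s = %r" % (keyword, value))
    return lines
```

Because dump() only uses the three methods, the internal layout of the data can change without any calling program having to be rewritten.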

Sample config-file code

I first wrote the code for the config file format described above in Perl, while employed to write diagnostic software. I like this format and I will probably continue to develop code by writing a config file first whenever the program I need or want is well suited to it. So I am rewriting this library code from scratch to be sure there is no conflict over code ownership. I have just rewritten it in Python, partly as an excuse to (re)learn Python. As of this writing, this object-oriented Python version is not being called by anything, so it has not been torture tested. But if it is of any use to anyone, it is available at GitHub.

About

RJ is a freelance consultant living in Toronto specializing in software development and systems administration on Unix/Linux systems.
