I was reading Ryan Tomayko's recent article about the success of Ruby on Rails, and he raises a couple of excellent questions about package and module layout and design that have been bothering me lately:
This may sound silly but I waste a ton of time trying to organize my project's source files.
- How should database scripts, templates, other web resources, scripts, documentation, etc. be organized?
- How should my packages be laid out? Do I create separate sub-packages for model, view, and controller? How about just separate modules? Or should I organize by object, keeping the model, view, and controller implementations in a single module? I've tried all of these approaches and found that each has pros and cons but that consistency is always king.
Another aspect of Rails that I appreciated is that it seems to promote—and possibly even require—a strict organizational structure of source files. This structure can be generated for you by helper scripts.
This is the kind of stuff I think is underrated. It significantly lowers the barrier to entry for new programmers to get moving with the framework and lowers the amount of time it takes for experienced programmers to create new projects.
Not silly at all, actually. I spend a lot of time doing this too. It's trickier than it may seem, and getting it right can greatly lower the barrier to entry for would-be developers and extenders of one's work. For many people, including myself, this is still somewhat of an afterthought, nowhere near the typical "design" tasks on the priority list.
I don't think much about package layout up-front. I'll typically start with a single main package, with subpackages organized by functional category ('protocols', 'scripts', 'utilities', 'core', etc.). Eventually, it becomes apparent that one or more of these packages are unnecessary, and their contents get moved back to the top level. As the API approaches completion, new packages typically get created that weren't envisioned at the beginning. Finally, I add a top-level facade module or two that offer convenience APIs and define the main API of the package.
I've also noticed a few other distinct package layout patterns in popular Python software:
The Single Python File
RSS.py and SOAP.py come to mind here. These modules are usually protocol implementations with few dependencies, and are well-suited for this format.
Psychologically, adding one of these to a project doesn't feel as 'heavyweight' a dependency as adding a distutils-based tarball does. That impression isn't really accurate, but it's one of the peculiarities of a module's marriage to the filesystem. I was never really comfortable with the Smalltalk approach, where the code I'm extending is a monolithic image that includes the IDE and all manner of other cruft. When I'm working with someone else's code, a part of me wants to know what file a given class is in.
The Messy Flat Namespace
These are common. The Python standard library is one, but an excusable one given the nature of its evolution. Small teams and single developers are better placed to design and maintain a sensible module hierarchy, yet the MFN layout is common among packages that should know better (including many of my earlier efforts).
Script Package
Twisted and PEAK use this one, where the logic of command-line scripts is kept in modules in a 'scripts' package, instantiated by a single frontend script (`twistd` and `peak`, respectively).
The script module invoked is usually responsible for returning an object that conforms to some sort of 'Command' or 'Application' interface.
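A minimal sketch of the idea, not how Twisted or PEAK actually implement it. The module name `demo_scripts_hello`, the `make_command` factory, and the `run(argv)` method are all hypothetical names chosen for illustration, and the script module is built in memory only so the example runs as a single file.

```python
import importlib
import sys
import types


class HelloCommand:
    """A command object; the frontend relies only on run(argv)."""

    def run(self, argv):
        name = argv[0] if argv else "world"
        return "hello, " + name


def make_command():
    """Factory that each script module is assumed to expose."""
    return HelloCommand()


# Stand-in for a file such as scripts/hello.py (hypothetical layout).
# Registering it in sys.modules lets import_module find it by name.
hello = types.ModuleType("demo_scripts_hello")
hello.make_command = make_command
sys.modules["demo_scripts_hello"] = hello


def main(argv):
    """Single frontend: the first argument selects the script module."""
    script_name, rest = argv[0], argv[1:]
    module = importlib.import_module("demo_scripts_" + script_name)
    command = module.make_command()
    return command.run(rest)


result = main(["hello", "Ryan"])
```

The frontend never needs to know which commands exist; adding a new one means adding a module that exposes the same factory.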
Facade Module
__init__.py is one of these - an abstraction point for the functionality of the modules in a package. Frameworks like PEAK and Zope have extended package conventions, usually including an 'api' module and an 'interfaces' module.
Advanced versions of the Facade module, as seen in PEAK, support funky things like lazy loading and module inheritance.
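To give a feel for lazy loading in a facade, here is a rough sketch using a module-level `__getattr__` hook (available in Python 3.7 and later). This is not PEAK's mechanism. The facade name `api` and the `_LAZY` table are assumptions made for the example, standard-library modules stand in for a package's own submodules, and the facade is built in memory so the sketch runs as one file.

```python
import importlib
import types

# Public name on the facade -> (module to import, attribute to fetch).
_LAZY = {
    "dumps": ("json", "dumps"),
    "OrderedDict": ("collections", "OrderedDict"),
}

# Hypothetical facade; in a real package this would be api.py or __init__.py,
# and the hook below would be a top-level 'def __getattr__(name)'.
api = types.ModuleType("api")
loaded = []  # records which names were resolved lazily, for demonstration


def _lazy_getattr(name):
    """Called only when normal attribute lookup on the facade fails."""
    try:
        module_name, attr = _LAZY[name]
    except KeyError:
        raise AttributeError("module 'api' has no attribute %r" % name) from None
    value = getattr(importlib.import_module(module_name), attr)
    loaded.append(name)
    setattr(api, name, value)  # cache so later lookups bypass this hook
    return value


api.__getattr__ = _lazy_getattr

first = api.dumps([1, 2])   # triggers the import of json
second = api.dumps([1, 2])  # served from the cached attribute
```

After both calls, `loaded` contains `"dumps"` only once, since the second lookup finds the cached attribute and never reaches the hook.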
These patterns address module layout, but none address what to do with the other project files: static templates, images, and data that should probably live 'closer' to the code than /usr/share or /var/whatever. It would be nice to have a distutils binary format that does something like what py2app and py2exe do - put everything in a single executable for deployment. There are many cases where one doesn't want to tell a customer to install Python, or where one doesn't trust a customer not to bollocks his copy of Python. And again, the Single File is nice.
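One partial answer to the single-file wish is the standard library's `zipapp` module, which packs a directory of code and data into one runnable archive. It is not a distutils format and it does not bundle an interpreter, so it covers the 'Single File' half only. The sketch below assumes a hypothetical package named `app` with a data file `greeting.txt` kept right next to the code, and reads that file back from inside the archive with `pkgutil.get_data`.

```python
import subprocess
import sys
import tempfile
import zipapp
from pathlib import Path


def build_and_run():
    """Build a one-file archive holding code plus data, then run it."""
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "src"
        pkg = src / "app"  # hypothetical package name
        pkg.mkdir(parents=True)
        (pkg / "__init__.py").write_text("")
        # The data file lives beside the code, not in /usr/share.
        (pkg / "greeting.txt").write_text("hello from inside the archive")
        # __main__.py is the entry point zipapp runs.
        (src / "__main__.py").write_text(
            "import pkgutil\n"
            "print(pkgutil.get_data('app', 'greeting.txt').decode())\n"
        )
        target = Path(tmp) / "app.pyz"
        zipapp.create_archive(src, target)
        done = subprocess.run(
            [sys.executable, str(target)],
            capture_output=True,
            text=True,
            check=True,
        )
        return done.stdout.strip()


output = build_and_run()
```

The resulting `app.pyz` can be copied around as one file, though the customer still needs a working Python to run it.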
Finally, I'll agree with Ryan that this stuff is indeed underrated. One major difference between my personal projects and work projects is the amount of time spent on ease-of-use bells and whistles like stub code generation in the latter.