Speno's Pythonic Avocado 2004/1

2004-01-28

MAC address normalizer: Now with string iteration and 100% less regexes!

In response to Ian Bicking's Python nit, chapter 3, I commented that I had just used string iteration to validate hardware addresses.

Here's my fairly liberal hardware address normalizer thingy that uses iterating through a string:



HEXDIGITS = '0123456789abcdefABCDEF'
MAC_ADDR_LENGTH = 12

def normalize_mac(address):
    """
    Return valid MAC address in our favored format. 
    It is very liberal in what it accepts as there can be any number
    of separator characters in any combination in the input address.

    Arguments:

        address -- (string) A possible MAC address.

    Returns:

        The possible MAC address in our chosen format (xx:xx:xx:xx:xx:xx)

    Raises:

        TypeError -- if address isn't a string.
        ValueError -- when address isn't a valid MAC address.
    """
    if not isinstance(address, str):
        raise TypeError, "address must be a string"

    separators = '-:. ' # NOTE: contains space as a separator
    for sep in separators:
        address = address.replace(sep, '') 
   
    count = 0 

    for digit in address:
        if digit not in HEXDIGITS:
            err = 'Invalid MAC: Address contains bad digit (%s)' % digit
            raise ValueError, err
        else:
            count = count + 1
            if count > MAC_ADDR_LENGTH: 
                err = 'Invalid MAC: Address too long'
                raise ValueError, err 

    if count < MAC_ADDR_LENGTH:
        err = 'Invalid MAC: Address too short'
        raise ValueError, err 

    address = address.lower()
    parts = [address[i:i+2] for i in range(0, MAC_ADDR_LENGTH, 2)]
    return ':'.join(parts)

So yeah, I think iterating over strings is nifty, and eschewing regular expressions is doubly so. The code once used the following to find bad digits in MAC addresses:


    illegal_chars = re.compile(r'[^0-9a-f]', re.IGNORECASE).search
    match = illegal_chars(address)
    if match is not None:
        err = 'Invalid MAC: address contains bad digit (%s)' % match.group()
        raise ValueError, err

but I changed it to avoid using a regex. I'm happier that way.
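
For anyone reading this on a newer interpreter, here is a rough Python 3 sketch of the same string-iteration idea. It is not a line-for-line port: the name SEPARATORS and the use of string.hexdigits are choices made for this sketch, and the length checks run after the digit check rather than inside the loop.

```python
import string

MAC_ADDR_LENGTH = 12
SEPARATORS = "-:. "  # same liberal separator set as the post, space included


def normalize_mac(address):
    """Return address as xx:xx:xx:xx:xx:xx, or raise TypeError/ValueError."""
    if not isinstance(address, str):
        raise TypeError("address must be a string")

    # Drop every separator, then validate what is left one character at a time.
    digits = [ch for ch in address if ch not in SEPARATORS]
    for ch in digits:
        if ch not in string.hexdigits:
            raise ValueError("Invalid MAC: Address contains bad digit (%s)" % ch)

    if len(digits) > MAC_ADDR_LENGTH:
        raise ValueError("Invalid MAC: Address too long")
    if len(digits) < MAC_ADDR_LENGTH:
        raise ValueError("Invalid MAC: Address too short")

    cleaned = "".join(digits).lower()
    return ":".join(cleaned[i:i + 2] for i in range(0, MAC_ADDR_LENGTH, 2))


print(normalize_mac("00-0A-95-9D-68-16"))  # 00:0a:95:9d:68:16
print(normalize_mac("000a.959d.6816"))     # 00:0a:95:9d:68:16
```

Still no regexes in sight.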

Take care.

This post references topics: python

posted at 22:07:44

2004-01-25

load testing BIND using dnspython and queryperf

A new name server running BIND needed to be load tested. Included in BIND's distribution is a utility called queryperf which can be used to test a name server's performance. As input, queryperf takes a list of domain resource names and resource types separated by whitespace and uses that list to generate queries to a name server. The input file should look like this:


foo.example.com. A
bar.example.com. A
mail.example.com.   MX
1.0.0.127.in-addr.arpa. PTR
2.0.0.127.in-addr.arpa. PTR

It would have been fairly easy to parse our zonefiles, looking for $ORIGIN tags and the like to construct our input file for performance testing. However, dnspython, an open-source DNS toolkit written in python, made it even easier. Here's the code:


# zoner.py
import dns.zone
import os.path
import sys

TYPES = ['A', 'PTR', 'MX']

for filename in sys.argv[1:]:
    zone = dns.zone.from_file(filename, os.path.basename(filename),
                              relativize=False)

    for type in TYPES:
        for (name, ttl, rdata) in zone.iterate_rdatas(type):
            print '%s\t%s' % (name, type)

This program parsed our zonefiles to produce an input file for queryperf. As for the new name server, it passed the test with flying circuses.
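
If dnspython isn't at hand, the input format itself is easy to produce from any list of names. The following is a minimal Python 3 sketch using only built-ins; the function name queryperf_lines and the sample records are made up for illustration, and adding a trailing dot simply mirrors the fully qualified names in the sample input above.

```python
def queryperf_lines(records):
    """Turn (name, record_type) pairs into 'name<TAB>type' lines."""
    lines = []
    for name, rtype in records:
        # The sample input uses fully qualified names, so add the
        # trailing dot when it is missing.
        if not name.endswith("."):
            name += "."
        lines.append("%s\t%s" % (name, rtype))
    return lines


# Hypothetical records, not taken from a real zone file.
sample = [("foo.example.com", "A"), ("mail.example.com.", "MX")]
print("\n".join(queryperf_lines(sample)))
```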

This post references topics: python

posted at 10:55:44

2004-01-21

My so called development environment.

I wanted to talk about my development environment for Python. I do all my main development under MacOS X, but deploy to other UNIX systems. I write all my code using Vim. I don't use syntax coloring, nor do I use folding. Sometimes I'll use a split buffer when I'm feeling daring. :-)

I make heavy use of PyChecker to check my code before running it. I really love it because it catches all of the stupid mistakes I make.

Like many python programmers, I spend a lot of time typing small code snippets directly into the python interpreter just to test things. This is amazingly useful. Beyond using tab-completion with readline, I like to have my python history saved between sessions, so I use this in my .pythonrc file:


import rlcompleter
import atexit
import os
import readline
from pydoc import help

# Make TAB complete stuff
readline.parse_and_bind("tab: complete")

# Save our interpreter history between sessions
historyPath = os.path.expanduser("~/.pyhistory")

def save_history(historyPath=historyPath):
    import readline
    readline.write_history_file(historyPath)

if os.path.exists(historyPath):
    readline.read_history_file(historyPath)

atexit.register(save_history)

# clean up
del os, atexit, readline, rlcompleter, save_history, historyPath

I didn't write those history saving bits myself. They were found on the net. Score! I've recently tried using IPython, an enhanced interactive python shell. I haven't switched over to it yet, but I think its @edit command is really cool.

When it comes time to debug, I'm a big fan of just using print statements. Beyond that, I like to use 'python -i' to inspect the state of variables in the interpreter, and rarely I'll use 'pdb.run()' to run code under the debugger.

That's basically it. Nothing fancy, but the job gets done.

Take care.

This post references topics: python

posted at 20:14:40

2004-01-19

PIL vs. CoreGraphics

In my previous entry, Goodbye PIL. Hello CoreGraphics, I talked about how I had trouble getting PIL packaged up into an application. Today, I went back and got a pre-built PIL from the "official unofficial" PackageManager repository and tried it again. It worked! No muss, no fuss. Thanks, Bob!

Now I had both a PIL version and a CoreGraphics version, so of course I raced them to see which was faster. Since this is a python blog, here is my quick and dirty main program for getting timing information:

if __name__ == '__main__':
    import HTMLPhotoGallery
    import sys
    import time

    f = open('/tmp/pylog', 'a')
    sys.stdout = sys.stderr = f
    then = time.clock()
    HTMLPhotoGallery.main(sys.argv[1:])
    print time.clock() - then

After several test runs using the same inputs, my best time for the CoreGraphics version was 4.14 seconds and for PIL it was 1.94 seconds. Timing from run to run was consistent. The PIL version was always at least twice as fast as the CoreGraphics version. Hello PIL! Goodbye CoreGraphics! :-)
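
A note for newer interpreters: time.clock() was removed in Python 3.8. The "best of several runs" approach described above can be sketched with time.perf_counter() instead. The helper best_of and the stand-in workload fake_gallery_build are invented for this example; the real program would call HTMLPhotoGallery.main() in its place. Also keep in mind that perf_counter() measures wall-clock time, whereas time.clock() on UNIX measured CPU time (time.process_time() is the closer match for that).

```python
import time


def best_of(func, repeats=5):
    """Call func() repeats times; return the fastest elapsed time in seconds."""
    best = None
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed
    return best


def fake_gallery_build():
    # Hypothetical stand-in for HTMLPhotoGallery.main(sys.argv[1:])
    sum(i * i for i in range(100000))


print("best of 5: %.4f seconds" % best_of(fake_gallery_build))
```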

Take care.

This post references topics: python

posted at 21:59:12

2004-01-18

Goodbye PIL. Hello CoreGraphics.

While I was repackaging my photo gallery application, I ran into difficulties getting PIL bundled up with it. The application used PIL to create thumbnails from JPEGs. I decided to try a different solution instead of struggling with PIL.

MacOS X 10.3 comes with python 2.3. Apple also provided a python wrapper around its CoreGraphics library, aka Quartz. I knew there was an easy way to use CoreGraphics to resize images from reading Andrew Shearer's The fastest way to resize images with Panther.

My PIL replacement code is basically Andrew's code with a few additions to make it behave more like PIL's Image.thumbnail():

import CoreGraphics

class JPEGImage:
    def __init__(self, path):
        """A JPEG image.

        Arguments:

            path -- Path to a JPEG file
        """
        self.path = path
        self.image = CoreGraphics.CGImageCreateWithJPEGDataProvider(
            CoreGraphics.CGDataProviderCreateWithFilename(self.path),
            [0,1,0,1,0,1], 1, CoreGraphics.kCGRenderingIntentDefault)
        self.width = self.image.getWidth()
        self.height = self.image.getHeight()

    def thumbnail(self, thumbnail, size):
        """Reduces a JPEG to size with aspect ratio preserved.

        Arguments:

            thumbnail -- pathname for resulting thumbnail of image
            size -- max size of thumbnail as tuple of (width, height)
        """
        new_width, new_height = self.width, self.height
        want_width, want_height = size

        if new_width > want_width:
            new_height = new_height * want_width / new_width
            new_width = want_width
        if new_height > want_height:
            new_width = new_width * want_height / new_height
            new_height = want_height

        cs = CoreGraphics.CGColorSpaceCreateDeviceRGB()
        c = CoreGraphics.CGBitmapContextCreateWithColor(new_width,
                new_height, cs, (0,0,0,0))
        c.setInterpolationQuality(CoreGraphics.kCGInterpolationHigh)
        new_rect = CoreGraphics.CGRectMake(0, 0, new_width, new_height)
        c.drawImage(new_rect, self.image)
        c.writeToFile(thumbnail, CoreGraphics.kCGImageFormatJPEG)

        # inplace replace our data with thumbnail's like PIL does
        self.path = thumbnail
        self.image = CoreGraphics.CGImageCreateWithJPEGDataProvider(
            CoreGraphics.CGDataProviderCreateWithFilename(self.path),
            [0,1,0,1,0,1], 1, CoreGraphics.kCGRenderingIntentDefault)
        self.width = self.image.getWidth()
        self.height = self.image.getHeight()

And that does everything I needed from PIL plus it makes the application easier to build and smaller. Of course, it also means it will only work on MacOS X 10.3 and not on previous versions. This is acceptable since I'm just writing stuff for my own personal use.
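
The aspect-ratio arithmetic in thumbnail() doesn't depend on CoreGraphics at all, so it can be tried anywhere. Here is a small Python 3 sketch of just that part; the function name fit_within is made up for this example, and it uses // because Python 3's / would produce floats where the Python 2 code got integers.

```python
def fit_within(width, height, max_size):
    """Shrink (width, height) to fit inside max_size, keeping the aspect ratio."""
    want_width, want_height = max_size
    new_width, new_height = width, height

    # First clamp the width, scaling the height by the same factor.
    if new_width > want_width:
        new_height = new_height * want_width // new_width
        new_width = want_width
    # Then clamp the height if it is still too tall.
    if new_height > want_height:
        new_width = new_width * want_height // new_height
        new_height = want_height
    return new_width, new_height


print(fit_within(1600, 1200, (128, 128)))  # (128, 96)
print(fit_within(1200, 1600, (128, 128)))  # (96, 128)
print(fit_within(100, 50, (128, 128)))     # (100, 50)
```

Like the original, this never enlarges an image that already fits.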

posted at 21:27:12

2004-01-16

Building MacOS X applications with Python

Update: This tutorial was expanded again and posted here.

This is a cheap entry since I just added this answer to the MacPython FAQ on how to build a stand-alone MacOS X application using Python.

Such an application can be very easy to build using bundlebuilder.py.

First create your app building script like so:

from bundlebuilder import buildapp

buildapp(
    name='Application.app',   # what to build
    mainprogram='main.py',    # your app's main()
    argv_emulation=1,         # drag&dropped filenames show up in sys.argv
    iconfile='myapp.icns',    # file containing your app's icons
    standalone=1,             # make this app self contained.
    includeModules=[],        # list of additional Modules to force in
    includePackages=[],       # list of additional Packages to force in
)

Besides building a stand-alone application which will bundle up everything you need to run your application including the Python.Framework, you can build using 'semi_standalone=1' instead. This will make your application bundle smaller as it won't include python, but it does require that your users already have a working python framework installed. On MacOS X 10.3, Apple has included a full 2.3 Python.Framework for you which will work with semi-standalone applications just fine.

There are other options you can set in the call to buildapp(). See the bundlebuilder.py module for the rest of them.

Then do this:

% python makeapplication.py build

That command should create a build/Application.app bundle, which may even work, but probably needs a little help. There will most likely be a few warnings. These can usually be ignored safely. Also, there may be additional modules and libraries which need to be included that buildapp() couldn't determine. You can add those it missed using the includeModules and includePackages arguments to buildapp().

For example, if your application used Tkinter, you may need to copy Tcl.Framework and Tk.Framework into build/Application.app/Contents/Frameworks. You can use those from the standalone AquaTcl/TK package (i.e. the Wish.app/Contents/Frameworks/*).

Take care.

This post references topics: python

posted at 20:59:28

2004-01-15

Name Change and PyDS passwords

Welcome to Speno's Pythonic Avocado, formerly known as Qu'est-ce que c'est Python. I upgraded to Python Desktop Server 0.7 and had my Python Community Server password reset to something I know. I also changed the name because, hey, I love avocados and python both.

The Python Desktop Server is still a huge confusing black box to me. Luckily I can peek in at its source code and almost figure things out. For example, I worked out how to change my password without having to register a new blog. Here's what I did:

import metakit

db = metakit.storage('/path/to/my/pyds/var/upstream.data', 1)
prefs = ("prefs[rpcurl:S,password:S,pinginterval:I,upstreaminterval:I,"
         "doupstream:I,doping:I,usernum:S,proxy:S,commentsurl:S,cloudurl:S,"
         "bytesused:I,maxbytes:I,maxfilesize:I,rankingurl:S,refererurl:S,"
         "mailtourl:S,updatesurl:S,webbugurl:S,cancomment:I,trackbackurl:S,"
         "cantrackback:I,canmirrorposts:I,hassearch:I,urlsearch:S]")
vw = db.getas(prefs)
for r in vw:
    if r.usernum:
        r.password = 'md5 hash of my new password'
db.commit()

If you can read this, then that worked. Hopefully this will be easier by the time PyDS is out of beta. Maybe it already is but I just don't know how yet. :-) In the meantime, I hope to make use of the newly created PyDS-Users mailing list.
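
The 'md5 hash of my new password' placeholder above has to come from somewhere. Here is a minimal Python 3 sketch for producing one with hashlib. It assumes the stored value is the hex digest of the UTF-8 encoded password, which is a guess and not something checked against the PyDS source; the helper name md5_hex and the example password are made up.

```python
import hashlib


def md5_hex(password):
    """Return the MD5 hex digest of password (assumed UTF-8 encoded)."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


digest = md5_hex("my new password")  # hypothetical password
print(len(digest))  # 32 hex characters
```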

Oh. I just figured out how to use shortcuts. It looks like you put things in double-quotes and they'll be expanded for you. One less question to ask on the list now.

Take care.

This post references topics: python

posted at 21:54:56

2004-01-14

EasyDialogs on MacOS X

My photo gallery application needed an update for MacOS X. MacPython's buildapplet program turned out a Python 2.2 application which still works fine. However, I wanted to turn it into a "real" MacOS X application and that meant turning my script into an application bundle.

One of the consequences of doing this was losing a console window which handled standard input and output. Buildapplet had given me this feature for free. Using the console, my script could just use print statements for feedback. I wanted the new version to have better looking feedback, but I didn't want to spend much time on making it work. Luckily I didn't have to.

I knew that Python had a large mac toolbox as part of the standard python install in its plat-darwin directory. Just take a look at all those (Mac) designations on the Global Module Index and you'll see what I mean. The one module in particular which caught my eye was EasyDialogs.

Using EasyDialogs.py, I was quickly able to add a nice progress bar to my application as well as a dialog box to display some useful feedback. Here's an example of how to use them:

import EasyDialogs
import time

images = [str(x) + '.jpg' for x in range(50)]
image_count = len(images)
progress = EasyDialogs.ProgressBar(title='Creating Gallery',
                                   maxval=image_count)
for image in images:
    progress.label('Processing %s' % image)
    progress.inc()
    time.sleep(0.1)

EasyDialogs.Message("Finished processing %d images" % image_count)

If you are running this from the command-line on MacOS X, make sure you use pythonw instead of python since you need to use the window manager.

Take care.

posted at 21:29:20

2004-01-04

Simple thread pools

It's easy to write multi-threaded programs with python. I've always done it using a pool of worker threads that get their input from one queue and send their output to another. Another thread uses their output queue as its own input queue and processes the results in some manner, e.g. by inserting it into a database. Here's a basic framework for doing this:

import Queue
import threading
import time
import random

class Worker(threading.Thread):
    """A worker thread."""
    def __init__(self, input, output):
        self._get_job = input.get
        if output:
            self._put_job = output.put
        threading.Thread.__init__(self)

    def run(self):
        """Get a job and process it. Stop when there's no more jobs"""
        while True:
            job = self._get_job()
            if job is None:
                break
            self._process_job(job)

    def _process_job(self, job):
        """Do useful work here."""
        time.sleep(random.random())
        result = job + 1
        self._put_job(result)

class Recorder(Worker):
    def _process_job(self, job):
        """Override Worker's _process_job method. Just print our input"""
        print job

def main():
    NUM_WORKERS = 20
    job_queue = Queue.Queue(0)
    results_queue = Queue.Queue(0)

    # Create our pool of worker threads
    for x in range(NUM_WORKERS):
        Worker(job_queue, results_queue).start()

    # Create our single recording thread
    Recorder(results_queue, None).start()

    # Give the workers some numbers to crunch
    for x in range(NUM_WORKERS*2):
        job_queue.put(x)

    # Insert end of job markers
    for x in range(NUM_WORKERS):
        job_queue.put(None)

    # Wait for all workers to end
    while threading.activeCount() > 2:
        time.sleep(0.1)

    # Tell recording thread it can stop
    results_queue.put(None)

    # Wait for recording thread to stop
    while threading.activeCount() == 2:
        time.sleep(0.1)

if __name__ == '__main__':
    main()

As you can see, this doesn't do anything useful, but you could easily subclass Worker with its own _process_job method to make it do what you want. Any questions?
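
Here is a compact Python 3 sketch of the same two-queue layout. It keeps the None end-of-jobs markers, but waits with Thread.join() instead of polling the active thread count, and collects results in a list instead of printing them. The function name run_pool and its arguments are invented for this example.

```python
import queue
import threading


def run_pool(jobs, num_workers=4):
    """Add 1 to every job on a pool of worker threads; return the results."""
    job_queue = queue.Queue()
    results_queue = queue.Queue()
    recorded = []

    def worker():
        while True:
            job = job_queue.get()
            if job is None:  # end-of-jobs marker
                break
            results_queue.put(job + 1)

    def recorder():
        while True:
            result = results_queue.get()
            if result is None:  # workers are done
                break
            recorded.append(result)

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    recorder_thread = threading.Thread(target=recorder)
    for t in workers:
        t.start()
    recorder_thread.start()

    for job in jobs:
        job_queue.put(job)
    for _ in workers:  # one marker per worker
        job_queue.put(None)

    for t in workers:
        t.join()
    results_queue.put(None)  # tell the recording thread it can stop
    recorder_thread.join()
    return recorded


print(sorted(run_pool(range(10))))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Results arrive in whatever order the workers finish, hence the sorted() in the usage line.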

Take care.

posted at 12:15:44

2004-01-01

Faking ident for IRC

While abusing Twisted, I needed to seek out some experts on the #twisted IRC channel. Unfortunately, the freenode IRC network requires clients to support the ident protocol. I didn't want to install a "real" identd, so I faked one using Twisted. Here it is:

#!/usr/bin/python
from twisted.internet import reactor
from twisted.internet.protocol import Protocol, Factory

class FakeIdent(Protocol):
    def dataReceived(self, data):
        """Send fake ident on any data received"""
        data = data.strip().replace(' ,',',')
        out = data + ' : USERID : UNIX : ircuser\r\n'
        self.transport.write(out)

def main():
    f = Factory()
    f.protocol = FakeIdent
    reactor.listenTCP(8000, f)
    reactor.run()

if __name__ == '__main__':
    main()

I had my firewall redirect the standard ident port of 113 to port 8000 on my iBook. Once this program was running, I was able to join the freenode IRC network and talk to some of the Twisted developers.
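
The reply-building step can be tried without Twisted or a network connection. Below is a small Python 3 sketch of just that string handling; the function name ident_reply is made up, and it works on bytes because that is what Twisted hands to dataReceived() under Python 3.

```python
def ident_reply(query, user=b"ircuser"):
    """Build a fake ident reply for a 'server-port , client-port' query."""
    # Same cleanup as the server above: trim the line ending and tighten
    # the space before the comma.
    ports = query.strip().replace(b" ,", b",")
    return ports + b" : USERID : UNIX : " + user + b"\r\n"


print(ident_reply(b"6191 , 23\r\n"))
# b'6191, 23 : USERID : UNIX : ircuser\r\n'
```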

If you want a much better fake ident server written in python, use fauxident.

posted at 20:07:12

One python programmer's search for understanding and avocados. This isn't personal, only pythonic.