Archive for September, 2010
I’ve had to do some timing using time.clock(), but I like seeing output as hh:mm:ss.000. divmod is the perfect tool for this, as it does the division and remainder in a single call (rekindling my love for Python’s ease in returning multiple values from a function):
def to_hhmmss(sec):
    min,sec = divmod(sec,60)
    hr,min = divmod(min,60)
    return "%d:%02d:%06.3f" % (hr,min,sec)
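To make the two divmod steps concrete, here is a small sketch of the same function with a couple of sample calls. The name to_hhmmss_fmt and the local names minutes and hours are my own (chosen to avoid shadowing the built-in min), and print is called as a function so the snippet also runs on Python 3.

```python
def to_hhmmss_fmt(sec):
    # First divmod: whole minutes and the leftover seconds.
    minutes, sec = divmod(sec, 60)
    # Second divmod: whole hours and the leftover minutes.
    hours, minutes = divmod(minutes, 60)
    return "%d:%02d:%06.3f" % (hours, minutes, sec)

print(divmod(3725, 60))         # (62, 5): 62 minutes, 5 seconds
print(to_hhmmss_fmt(3725))      # 1:02:05.000
print(to_hhmmss_fmt(3601.001))  # 1:00:01.001
```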
Right about the time I wrote that, I happened to see an example using reduce, and thought about using successive calls to divmod to build up the tuple to be passed in to this function’s string interpolation. That is, I needed to call reduce in such a way as to convert 3601.001 to the tuple (1, 0, 1, 1) (1 hour, 0 minutes, 1 second, and 1 millisecond).
reduce() applies a binary operation to a sequence by taking the first two items of the sequence, performing the operation on them, and saving the result to a temporary accumulator. It then applies the binary operation to the accumulated value and the 3rd item in the sequence, then to the 4th item, etc. For instance, to use reduce to sum up a list of integers [1,5,10,50], we would call reduce with the binary operation:
fn = lambda a,b: a+b
and reduce would work through the list as if running this explicit code:
acc = 1
acc = fn(acc, 5)
acc = fn(acc, 10)
acc = fn(acc, 50)
return acc
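As a quick check of that description, the sketch below compares reduce against a hand-written loop doing the same accumulation. The helper name explicit_sum is hypothetical. reduce is a built-in on Python 2; importing it from functools works on 2.6 and later, including Python 3.

```python
from functools import reduce

fn = lambda a, b: a + b
values = [1, 5, 10, 50]

def explicit_sum(seq):
    # Seed the accumulator with the first item, then fold in the rest.
    acc = seq[0]
    for item in seq[1:]:
        acc = fn(acc, item)
    return acc

print(reduce(fn, values))    # 66
print(explicit_sum(values))  # 66
```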
I need reduce to build up a tuple, which is a perfectly good thing to do with tuple addition.
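A minimal illustration of reduce growing a tuple through tuple addition; the lambda name grow is mine. Note that the first item of the sequence is itself a tuple, so every later step has a tuple to add to.

```python
from functools import reduce

# Adding two tuples builds a new, longer tuple.
print((1,) + (2, 3))  # (1, 2, 3)

# Start from a 1-tuple and append each following item as it is reduced.
grow = lambda acc, item: acc + (item,)
print(reduce(grow, [(1,), 5, 10, 50]))  # (1, 5, 10, 50)
```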
I’m also going to split the milliseconds out into their own integer field, so I’ll need to convert our initial value in seconds to a value in milliseconds (which also lets me round off any decimal portion smaller than 1 msec). Then I’ll use divmod with a succession of divisors to get the correct time value for each field.
So the sequence of values that I will pass to reduce will be the succession of divisors for each field in the tuple, working right to left. To convert to hh, mm, ss, and msec values, these divisors are (1000, 60, 60), and these will be the 2nd thru nth values of the input sequence – the 1st value will be the value in milliseconds to be converted.
The last thing to do is to define our binary function, which will perform the successive divmods while accumulating our growing tuple of time fields. Maybe it would be easiest to just map out how our sample conversion of 3601.001 seconds (or 3601001 msec) would work, with some yet-to-be-defined function F. Listing the steps from last to first, the reduce would give us:
(1,0,1,1) = F((0,1,1), 60)
(0,1,1) = F((1,1), 60)
(1,1) = F(msec, 1000)
Well, that last line (representing the first binary reduction) looks bad, since all the others are taking a tuple as the first value. It seems that each successive reduction grows that value by one more term, so the first term should be a 1-value tuple, containing our value in milliseconds:
(1,1) = F((msec,), 1000)
Also, it’s not really true that this step would have to return (1,1). Just so long as the final tuple ends with (…,1,1). So I’ll redo the sequence of steps showing X’s for the unknown intermediate values:
(1,0,1,1) = F((X,1,1), 60)
(X,1,1) = F((X,1), 60)
(X,1) = F((msec,), 1000)
The heart of our mystery function F is essentially a call to divmod, using the 0’th element in the current accumulator:
F = lambda a,b : divmod(a[0],b)
But F has to also carry forward (accumulate) the previous divmod remainders, found in the 1-thru-n values in the tuple in a. So we modify F to include these:
F = lambda a,b : divmod(a[0],b) + a[1:]
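To see the unknown X values fill in, here is a sketch that applies F by hand to the sample value, one divisor at a time. It assumes F divides the 0'th element of the accumulator (a[0]) and carries the rest of the tuple forward, and it uses print as a function so it runs on Python 3.

```python
F = lambda a, b: divmod(a[0], b) + a[1:]

acc = (3601001,)  # 3601.001 seconds expressed as whole milliseconds
steps = []
for divisor in (1000, 60, 60):
    # divmod splits the leading value; earlier remainders ride along in a[1:].
    acc = F(acc, divisor)
    steps.append(acc)
    print(divisor, acc)
# 1000 (3601, 1)
# 60 (60, 1, 1)
# 60 (1, 0, 1, 1)
```

So the X placeholders turn out to be 3601 (total seconds) and then 60 (total minutes).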
Now we have just about all the pieces we need. We have our binary function F. The sequence of values to pass to F is composed of the initial accumulator (msec,), followed by the divisors (1000, 60, 60). We just need to build our call to reduce, and use it to interpolate into a suitable hh:mm:ss-style string:
def to_hhmmss(s):
    return "%d:%02d:%02d.%03d" % \
        reduce(lambda a,b : divmod(a[0],b) + a[1:], [(s*1000,),1000,60,60])
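For anyone trying this on a newer Python, here is a hedged variant. The name to_hhmmss_ms and the explicit int(round(...)) step are my assumptions, not part of the function above: rounding to a whole millisecond first keeps floating-point noise in s*1000 from being truncated by the %03d field.

```python
from functools import reduce  # needed on Python 3; harmless on 2.6+

def to_hhmmss_ms(seconds):
    # Assumption: round to the nearest whole millisecond before splitting.
    msec = int(round(seconds * 1000))
    # Divisors work right to left: msec -> sec, sec -> min, min -> hr.
    fields = reduce(lambda a, b: divmod(a[0], b) + a[1:],
                    [(msec,), 1000, 60, 60])
    return "%d:%02d:%02d.%03d" % fields

print(to_hhmmss_ms(3601.001))  # 1:00:01.001
print(to_hhmmss_ms(59.9996))   # 0:01:00.000
```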
This may not be the most easy-to-read version of to_hhmmss, but it is good to stretch our thinking into some alternative problem-solving approaches once in a while.
I came across a very interesting wrinkle on the behavior of super() yesterday at work. In our most recent major release, we upgraded the Python we use from 2.4 to 2.6, and have been learning a lot about forward compatibility issues between the two versions.
In one part of our code, we import plugins at runtime, and the plugins expose a method process. One particular plugin PluginA has a subclass PluginAAlias that adds some simple aliasing to one of the arguments. So once the subclass does the argument translation, it finishes by calling the process method in its superclass PluginA, using this statement:
return super(PluginAAlias, self).process(a,b,c)
We got a customer report that this code, which worked under Python 2.4, was now failing under 2.6 with this error:
TypeError: super(type, obj): obj must be an instance or subtype of type
Now the simplest solution would be to dump super() and just hardwire the call to PluginA.process. But we use super() in a number of places in the product, and I didn’t want to chase them all down and change them too. I really wanted to better understand what the problem was.
Our first guess (suggested by colleague Mike Thornton) was that PluginA was not a new-style class. Our codebase includes some old and dusty code in places, and it is not inconceivable that we still have an old-style class floating around here and there. But sure enough, PluginA inherits from Plugin, which inherits from object. So that theory, while a good guess, was busted.
I googled for super(), Python 2.5 and 2.6 release notes, and the error message itself, and found a number of cautionary articles on the perils of super(), particularly in class hierarchies with multiple inheritance – but in this case, we are strictly singly inheriting. Just about all of the examples showed a subclass making upcalls to the superclass __init__ method, which also didn’t quite match my situation. I did learn that in 2.5, super was tightened up to ensure that the object passed to super was in fact an instance of the named class, but I knew this was the case since the problem code statement was *in* a method of that class, so how else would we have gotten there if it weren’t?
So then I started to add some print statements around the problem line of code, and sure enough, what I thought was impossible was in fact possible.
print PluginAAlias
print self.__class__
print isinstance(self, PluginAAlias)

<class pluginAAlias.PluginAAlias>
<class pluginAAlias.PluginAAlias>
False
Huh??!!! I then expanded the print statements to print the id() of the classes, and sure enough the id’s were different – so that was why isinstance was failing, and super() was raising the TypeError.
With a little more instrumenting, I found that our code for loading the plugin modules can get run repeatedly. And this was the final piece of the puzzle. Our plugin-loading code was not using import, but imp.load_module(). The docs for imp.load_module() tell us that repeated calls with the same module reference will act like a reload. And so here is the smoking gun.
Completely outside of the product, I created a little module, a.py, containing an empty class A. Then from the Python prompt, I ran these commands:
>>> import imp
>>> m = imp.find_module("a")
>>> a = imp.load_module("a", *m)
>>> a.A
<class 'a.A'>
>>> aobj = a.A()
>>> aobj.__class__
<class 'a.A'>
Now that I had an object created, I reloaded the a module:
>>> a = imp.load_module("a", *m)
>>> print a.A
<class 'a.A'>
>>> isinstance(aobj, a.A)
False
I had recreated my “impossible” condition, an instance of a.A that fails isinstance(aobj, a.A).
The final proof, calling super() as in the original bug:
>>> super(a.A, aobj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: super(type, obj): obj must be an instance or subtype of type
Voila! The root cause was found! Because the repeated calls to load_module act as a reload, the objects created using the old class no longer satisfy the isinstance test, so super will fail.
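The same effect can be reproduced without a file on disk or the imp module (which recent Python 3 releases no longer ship). The sketch below is an approximation, not the plugin loader itself: it stands in for a.py with a source string, and re-executing that source in the module's namespace plays the role of the reload. The names SOURCE, mod, and old_A are hypothetical.

```python
import types

SOURCE = "class A(object):\n    pass\n"

mod = types.ModuleType("a")   # in-memory stand-in for the module a
exec(SOURCE, mod.__dict__)    # first "load"
old_A = mod.A
aobj = mod.A()

exec(SOURCE, mod.__dict__)    # second "load": rebinds mod.A to a new class object

print(mod.A is old_A)           # False: same name, different class object
print(isinstance(aobj, mod.A))  # False
print(isinstance(aobj, old_A))  # True

try:
    super(mod.A, aobj)
except TypeError as exc:
    print("TypeError:", exc)
```

The exact wording of the TypeError message varies between Python versions, so the sketch prints whatever the interpreter reports.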
My solution? Originally, I thought I would memoize our plugin loader, so that plugins won’t get reloaded for the same module name. But I’m still a bit new on this team, and it occurred to me that reloading of plugins without having to restart a daemon might be of some advantage. So instead, I added an __init__ method to PluginAAlias.
def __init__(self, *args, **kwargs):
    self.as_super = super(PluginAAlias, self)
    self.as_super.__init__(*args, **kwargs)
My readings on super told me that super doesn’t just do a cast of self, but returns a proxy object bound to self that delegates attribute lookups along the MRO, beginning with the class after the one named in the call. I knew that super() must succeed at the original init time, when the plugin object was first created, since we had not yet had any chance to reload the plugin.
Then I changed the offending line to read:
return self.as_super.process(a,b,c)
The as_super attribute, being built in the original __init__ method, contains a proxy to the correct superclass of the plugin, even if the plugin module gets reloaded later. (I also thought about saving the superclass’s process method in a super_process attribute, and just calling that; but I wanted a more general solution, in case there were other methods on plugins that I hadn’t found yet.)
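Here is a self-contained sketch of that workaround surviving a reload. Everything in it is a stand-in: the class bodies, the process_fragile method (which keeps the original super() call for comparison), and the in-memory module that is re-executed to simulate the plugin loader running twice. It is not the product code.

```python
import types

PLUGIN_SOURCE = '''
class PluginA(object):
    def process(self, a, b, c):
        return ("PluginA.process", a, b, c)

class PluginAAlias(PluginA):
    def __init__(self, *args, **kwargs):
        # Build the proxy now, while self is certainly an instance
        # of the class object currently bound to this name.
        self.as_super = super(PluginAAlias, self)
        self.as_super.__init__(*args, **kwargs)

    def process_fragile(self, a, b, c):
        # Looks up the name PluginAAlias at call time; after a reload
        # that name refers to a different class object.
        return super(PluginAAlias, self).process(a, b, c)

    def process(self, a, b, c):
        return self.as_super.process(a, b, c)
'''

mod = types.ModuleType("pluginAAlias")  # hypothetical stand-in module
exec(PLUGIN_SOURCE, mod.__dict__)       # initial load
plugin = mod.PluginAAlias()

exec(PLUGIN_SOURCE, mod.__dict__)       # simulated reload

print(plugin.process(1, 2, 3))          # ('PluginA.process', 1, 2, 3)
try:
    plugin.process_fragile(1, 2, 3)
except TypeError:
    print("fragile call failed after reload")
```

The saved proxy keeps pointing at the class objects that existed when the instance was created, which is why the call through as_super still resolves.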
So that was the solution. I thought I’d write this up, as it was a slightly different form of super() failure than any I had found in my own googling, so it might be of interest to someone else struggling with crazy-looking TypeError messages, when you just know the object is of the correct class.