I came across a very interesting wrinkle on the behavior of super() yesterday at work. In our most recent major release, we upgraded the Python we use from 2.4 to 2.6, and have been learning a lot about forward compatibility issues between the two versions.
In one part of our code, we import plugins at runtime, and each plugin exposes a method named process. One particular plugin, PluginA, has a subclass, PluginAAlias, that adds some simple aliasing to one of the arguments. Once the subclass has done the argument translation, it finishes by calling the process method in its superclass PluginA, using this statement:
return super(PluginAAlias, self).process(a,b,c)
We got a customer report that this code, which worked under Python 2.4, was now failing under 2.6 with this error:
TypeError: super(type, obj): obj must be an instance or subtype of type
Now the simplest solution would be to dump super() and just hardwire the call to PluginA.process. But we use super() in a number of places in the product, and I didn’t want to chase them all down and change them too. I really wanted to better understand what the problem was.
Our first guess (suggested by colleague Mike Thornton) was that PluginA was not a new-style class. Our codebase includes some old and dusty code in places, and it is not inconceivable that we still have an old-style class floating around here and there. But sure enough, PluginA inherits from Plugin, which inherits from object. So that theory, while a good guess, was busted.
I googled for super(), Python 2.5 and 2.6 release notes, and the error message itself, and found a number of cautionary articles on the perils of super(), particularly in class hierarchies with multiple inheritance – but in this case, we are strictly singly inheriting. Just about all of the examples showed a subclass making upcalls to the superclass __init__ method, which also didn’t quite match my situation. I did learn that in 2.5, super was tightened up to ensure that the object passed to super was in fact an instance of the named class, but I knew this was the case since the problem code statement was *in* a method of that class, so how else would we have gotten there if it weren’t?
So then I started to add some print statements around the problem line of code, and sure enough, what I thought was impossible was in fact possible.
print PluginAAlias
print self.__class__
print isinstance(self, PluginAAlias)
<class 'pluginAAlias.PluginAAlias'>
<class 'pluginAAlias.PluginAAlias'>
False
Huh??!!! I then expanded the print statements to print the id() of each class, and sure enough the ids were different. The two classes had the same name but were distinct class objects, so that was why isinstance was failing and super() was raising the TypeError.
With a little more instrumenting, I found that our code for loading the plugin modules can get run repeatedly. And this was the final piece of the puzzle. Our plugin-loading code was not using import, but imp.load_module(). The docs for imp.load_module() tell us that repeated calls with the same module reference will act like a reload. And so here is the smoking gun.
Completely outside of the product, I created a little module, a.py, containing an empty class A. Then from the Python prompt, I ran these commands:
>>> import imp
>>> m = imp.find_module("a")
>>> a = imp.load_module("a", *m)
>>> a.A
<class 'a.A'>
>>> aobj = a.A()
>>> aobj.__class__
<class 'a.A'>
Now that I had an object created, I reloaded the a module:
>>> a = imp.load_module("a", *m)
>>> print a.A
<class 'a.A'>
>>> isinstance(aobj, a.A)
False
I had recreated my “impossible” condition, an instance of a.A that fails isinstance(aobj, a.A).
The final proof, calling super() as in the original bug:
>>> super(a.A, aobj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: super(type, obj): obj must be an instance or subtype of type
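Voila! The root cause was found! Because repeated calls to load_module act as a reload, each call re-executes the module and binds a brand-new class object to the same name. Objects created from the old class still refer to that old class, so they no longer satisfy the isinstance test against the reloaded one, and super() fails.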
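For anyone who wants to reproduce this without an interactive session, here is a minimal sketch of the same experiment. It assumes Python 3, where the imp module has been removed in recent versions, so importlib.reload stands in for the repeated imp.load_module call. The module name reload_demo_a and the temporary directory are made up for the illustration.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module containing an empty class A.
# The module name "reload_demo_a" is hypothetical, chosen for this sketch.
tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "reload_demo_a.py"), "w") as f:
    f.write("class A(object):\n    pass\n")
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

a = importlib.import_module("reload_demo_a")
old_class = a.A
aobj = a.A()

# Reloading re-executes the module body, which binds a new class object to a.A.
importlib.reload(a)

same_name = old_class.__name__ == a.A.__name__   # the names still match
same_object = old_class is a.A                   # but the class objects differ
still_instance = isinstance(aobj, a.A)           # so the old instance fails the check

# super() performs the same instance check and rejects the stale object.
try:
    super(a.A, aobj)
    super_failed = False
except TypeError:
    super_failed = True
```

The sketch keeps a reference to the old class only so the two class objects can be compared directly; the instance alone is enough to trigger the failure.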
My solution? Originally, I thought I would memoize our plugin loader, so that plugins won’t get reloaded for the same module name. But I’m still a bit new on this team, and it occurred to me that reloading of plugins without having to restart a daemon might be of some advantage. So instead, I added an __init__ method to PluginAAlias.
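The memoizing idea I set aside could look roughly like the sketch below. This is not our actual loader: memoize_loader, fake_load, and load_plugin are hypothetical names, and fake_load merely stands in for whatever function really loads a module.

```python
def memoize_loader(load):
    """Wrap a loader so each module name is loaded at most once."""
    cache = {}  # module name -> whatever the loader returned the first time

    def load_once(name):
        if name not in cache:
            cache[name] = load(name)
        return cache[name]

    return load_once


# Stand-in loader for the demonstration: records each call and
# returns a fresh object, the way a reload returns fresh classes.
calls = []

def fake_load(name):
    calls.append(name)
    return object()


load_plugin = memoize_loader(fake_load)
first = load_plugin("pluginAAlias")
second = load_plugin("pluginAAlias")  # served from the cache, no second load
```

The trade-off is the one described above: once a name is cached, a changed plugin is never picked up until the process restarts.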
def __init__(self, *args, **kwargs):
    self.as_super = super(PluginAAlias, self)
    self.as_super.__init__(*args, **kwargs)
My readings on super told me that super doesn’t just do a cast of self, but returns a proxy object that looks attributes up along the MRO of self’s class, beginning with the class after the one named in the call, and binds whatever it finds to self. I knew that super() must succeed at the original init time, when the plugin object was created, since this was the initial creation of the object and we had not yet had any chance to reload the plugin.
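The proxy behavior is easy to see in a small sketch. The class bodies here are invented for illustration (our real plugins do more than return tuples); only the class names come from the story above.

```python
# Hypothetical stand-ins for the real plugin classes.
class Plugin(object):
    def process(self, a, b, c):
        return ("Plugin", a, b, c)

class PluginA(Plugin):
    def process(self, a, b, c):
        return ("PluginA", a, b, c)

class PluginAAlias(PluginA):
    def process(self, a, b, c):
        return ("PluginAAlias", a, b, c)

obj = PluginAAlias()
proxy = super(PluginAAlias, obj)

proxy_type = type(proxy)                  # the built-in super type, not PluginA
bound_to_same_object = proxy.__self__ is obj
result = proxy.process(1, 2, 3)           # found on PluginA, the next class in the MRO
mro_names = [cls.__name__ for cls in PluginAAlias.__mro__]
```

Here result comes from PluginA.process even though obj is a PluginAAlias, because the lookup starts one step past PluginAAlias in the MRO.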
Then I changed the offending line to read:
return self.as_super.process(a,b,c)
The as_super attribute, being built in the original __init__ method, contains a proxy to the correct superclass of the plugin, even if the plugin module gets reloaded later. (I also thought about saving the superclass’s process method in a super_process attribute, and just calling that; but I wanted a more general solution, in case there were other methods on plugins that I hadn’t found yet.)
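Here is a rough end-to-end sketch of why the cached proxy survives a reload while a fresh super() call does not. It assumes Python 3 and importlib.reload in place of imp.load_module. The module name reload_demo_plugin, the alias table, and the process_uncached method are all hypothetical, added only to show the two behaviors side by side.

```python
import importlib
import os
import sys
import tempfile

# Source for a throwaway plugin module; the bodies are invented for this sketch.
SOURCE = '''
class PluginA(object):
    def process(self, a, b, c):
        return (a, b, c)

class PluginAAlias(PluginA):
    def __init__(self, *args, **kwargs):
        # Build the proxy once, while self is certainly an instance of this class.
        self.as_super = super(PluginAAlias, self)
        self.as_super.__init__(*args, **kwargs)

    def process(self, a, b, c):
        a = {"alias": "real"}.get(a, a)        # hypothetical argument aliasing
        return self.as_super.process(a, b, c)  # uses the proxy saved in __init__

    def process_uncached(self, a, b, c):
        # Looks up the name PluginAAlias again on every call.
        return super(PluginAAlias, self).process(a, b, c)
'''

tmpdir = tempfile.mkdtemp()
with open(os.path.join(tmpdir, "reload_demo_plugin.py"), "w") as f:
    f.write(SOURCE)
sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

mod = importlib.import_module("reload_demo_plugin")
plugin = mod.PluginAAlias()
result_before = plugin.process("alias", 2, 3)

importlib.reload(mod)  # the name PluginAAlias now refers to a new class object

# The saved proxy still points at the original class hierarchy, so this works.
result_after = plugin.process("alias", 2, 3)

# A fresh super() call resolves the reloaded class and rejects the old instance.
try:
    plugin.process_uncached("alias", 2, 3)
    uncached_failed = False
except TypeError:
    uncached_failed = True
```

One thing to keep in mind with this approach: an existing plugin object keeps running the old class's code after a reload, so only objects created afterwards see the new version.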
So that was the solution. I thought I’d write this up, as it was a slightly different form of super() failure than any I had found in my own googling, so it might be of interest to someone else struggling with crazy-looking TypeError messages, when you just know the object is of the correct class.