The execution of a Python program is mostly a sequence of function calls involving the special methods described in the earlier section "Special Methods." If you find that a program runs slowly, you should first check whether you’re using the most efficient algorithm. After that, considerable performance gains can be made simply by understanding Python’s object model and trying to reduce the number of special method calls that occur during execution.
One approach is to minimize the number of name lookups on modules and classes. For example, consider the following code:
    import math
    d = 0.0
    for i in xrange(1000000):
        d = d + math.sqrt(i)
In this case, each iteration of the loop involves two name lookups. First, the math module is located in the global namespace; then it’s searched for a function object named sqrt. Now consider the following modification:
    from math import sqrt
    d = 0.0
    for i in xrange(1000000):
        d = d + sqrt(i)
In this case, one name lookup is eliminated from the inner loop, resulting in a considerable speedup.
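The difference between the two loops can be measured with the standard timeit module. The following is only a sketch: the function names with_module_lookup and with_direct_name and the reduced loop count N are illustrative choices, not part of the original example, and range is used in place of xrange so that the code also runs on Python 3. The actual speedup depends on the interpreter version and the machine.

```python
import math
import timeit
from math import sqrt

N = 100000  # assumed loop count; smaller than the text's 1000000 to keep runs short

def with_module_lookup():
    d = 0.0
    for i in range(N):
        d = d + math.sqrt(i)   # finds 'math' first, then its 'sqrt' attribute
    return d

def with_direct_name():
    d = 0.0
    for i in range(N):
        d = d + sqrt(i)        # finds 'sqrt' directly; one lookup fewer
    return d

if __name__ == "__main__":
    t1 = timeit.timeit(with_module_lookup, number=10)
    t2 = timeit.timeit(with_direct_name, number=10)
    print("math.sqrt(i): %.3f s    sqrt(i): %.3f s" % (t1, t2))
```

Both functions compute the same sum, so any difference in the reported times comes from the extra lookup rather than from the arithmetic.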
Unnecessary method calls can also be eliminated by making careful use of temporary values and avoiding unnecessary lookups in sequences and dictionaries. For example, consider the following two classes:
    import math

    class Point(object):
        def __init__(self,x,y,z):
            self.x = x
            self.y = y
            self.z = z

    class Poly(object):
        def __init__(self):
            self.pts = [ ]
        def addpoint(self,pt):
            self.pts.append(pt)
        def perimeter(self):
            d = 0.0
            self.pts.append(self.pts[0])    # Temporarily close the polygon
            for i in xrange(len(self.pts)-1):
                d2 = ((self.pts[i+1].x - self.pts[i].x)**2 +
                      (self.pts[i+1].y - self.pts[i].y)**2 +
                      (self.pts[i+1].z - self.pts[i].z)**2)
                d = d + math.sqrt(d2)
            self.pts.pop()                  # Restore original list of points
            return d
In the perimeter() method, each occurrence of self.pts[i] involves two special-method lookups: one involving a dictionary (finding the pts attribute on self) and another involving a sequence (indexing the list). You can reduce the number of lookups by rewriting the method as follows:
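    class Poly(object):
        ...
        def perimeter(self):
            d = 0.0
            pts = self.pts                  # Look up the attribute once
            pts.append(pts[0])              # Temporarily close the polygon
            for i in xrange(len(pts)-1):
                p1 = pts[i+1]               # Index each point once per iteration
                p2 = pts[i]
                d2 = ((p1.x - p2.x)**2 +
                      (p1.y - p2.y)**2 +
                      (p1.z - p2.z)**2)
                d = d + math.sqrt(d2)
            pts.pop()                       # Restore original list of points
            return d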
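To check that a rewrite of this kind preserves behavior, the two versions can be placed side by side and compared on a shape whose perimeter is known. The sketch below makes several assumptions: the method names perimeter_slow and perimeter_fast and the helper make_unit_square are hypothetical, the polygon is closed by appending its first point (pts[0]), and range replaces xrange so the code also runs on Python 3.

```python
import math
import timeit

class Point(object):
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

class Poly(object):
    def __init__(self):
        self.pts = []

    def addpoint(self, pt):
        self.pts.append(pt)

    def perimeter_slow(self):
        # Repeats the self.pts attribute lookup and the list indexing
        # for every coordinate.
        d = 0.0
        self.pts.append(self.pts[0])        # temporarily close the polygon
        for i in range(len(self.pts) - 1):
            d2 = ((self.pts[i+1].x - self.pts[i].x)**2 +
                  (self.pts[i+1].y - self.pts[i].y)**2 +
                  (self.pts[i+1].z - self.pts[i].z)**2)
            d = d + math.sqrt(d2)
        self.pts.pop()                      # restore the original list
        return d

    def perimeter_fast(self):
        # Holds the list and the two current points in local variables.
        d = 0.0
        pts = self.pts
        pts.append(pts[0])                  # temporarily close the polygon
        for i in range(len(pts) - 1):
            p1 = pts[i+1]
            p2 = pts[i]
            d2 = (p1.x - p2.x)**2 + (p1.y - p2.y)**2 + (p1.z - p2.z)**2
            d = d + math.sqrt(d2)
        pts.pop()                           # restore the original list
        return d

def make_unit_square():
    # Hypothetical helper: a 1 x 1 square in the z = 0 plane.
    poly = Poly()
    for x, y in [(0, 0), (1, 0), (1, 1), (0, 1)]:
        poly.addpoint(Point(float(x), float(y), 0.0))
    return poly

if __name__ == "__main__":
    square = make_unit_square()
    print(square.perimeter_slow(), square.perimeter_fast())
    print("slow: %.3f s" % timeit.timeit(square.perimeter_slow, number=100000))
    print("fast: %.3f s" % timeit.timeit(square.perimeter_fast, number=100000))
```

Both methods return the same value and leave the point list unchanged after the call. The measured times will vary between interpreters and machines.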
Although the performance gains made by such modifications are often modest (15%–20%), an understanding of the underlying object model and the manner in which special methods are invoked can result in faster programs. Of course, if performance is extremely critical, you can often move functionality into a Python extension module written in C or C++.