Category Archives: NumPy

Watermarking images with OpenCV and NumPy

Here's a simple Python example of adding a watermark to an image using NumPy and OpenCV. I created a black/white version of the OpenCV logo for this example but feel free to use any image you like. In this case I wanted to have a white watermark image, white text, and the rest of the image unchanged. If you have watermark images in color or grayscale the same process should work. Here's the code using the OpenCV license plate sample image:

import numpy as np
import cv2

# read images
original = cv2.imread('data/licenseplate_motion.jpg')
mark = cv2.imread('logo.png')

m,n = original.shape[:2]

# create overlay image with mark at the upper left corner, use uint16 to hold sum
overlay = np.zeros_like(original, "uint16")
overlay[:mark.shape[0],:mark.shape[1]] = mark

# add the images and clip (to avoid uint8 wrapping)
watermarked = np.array(np.clip(original+overlay, 0, 255), "uint8")

# add some text 5 pixels in from the bottom left
cv2.putText(watermarked, "Watermarking with OpenCV", (5,m-5), cv2.FONT_HERSHEY_PLAIN, fontScale=1.0, color=(255,255,255), thickness=1)

cv2.imshow("original", original)
cv2.imshow("watermarked", watermarked)
Note that I created an array of type "uint16" to hold the sum of the two images before clipping with NumPy's clip function. The text is added with OpenCV's putText function which will need position, font (out of the available OpenCV fonts) and text scale and a color tuple (here white). The result looks like this:

Reading Gauges – Detecting Lines and Circles

I received a question from a reader on how I would approach reading a simple gauge with one needle on a good frontal image of a circular gauge meter. This makes a good example to introduce Hough transforms. Detecting circles or lines using OpenCV and Python is conceptually simple (each particular use-case requires some parameter tuning though). Below is a simple example using the OpenCV Python interface for detecting lines, line segments and circles. The documentation for the three relevant functions are here. You can also find more on using the Python interface and the plotting commands in Chapter 10 of my book.

import numpy as np
import cv2

Script using OpenCV's Hough transforms for reading images of
simple dials.

# load grayscale image
im = cv2.imread("gauge1.jpg")
gray_im = cv2.cvtColor(im, cv2.COLOR_RGB2GRAY)

# create version to draw on and blurred version
draw_im = cv2.cvtColor(gray_im, cv2.COLOR_GRAY2BGR)
blur = cv2.GaussianBlur(gray_im, (0,0), 5)

m,n = gray_im.shape

# Hough transform for circles
circles = cv2.HoughCircles(gray_im,, 2, 10, np.array([]), 20, 60, m/10)[0]

# Hough transform for lines (regular and probabilistic)
edges = cv2.Canny(blur, 20, 60)
lines = cv2.HoughLines(edges, 2, np.pi/90, 40)[0]
plines = cv2.HoughLinesP(edges, 1, np.pi/180, 20, np.array([]), 10)[0]

# draw
for c in circles[:3]:
# green for circles (only draw the 3 strongest), (c[0],c[1]), c[2], (0,255,0), 2)

for (rho, theta) in lines[:5]:
# blue for infinite lines (only draw the 5 strongest)
x0 = np.cos(theta)*rho
y0 = np.sin(theta)*rho
pt1 = ( int(x0 + (m+n)*(-np.sin(theta))), int(y0 + (m+n)*np.cos(theta)) )
pt2 = ( int(x0 - (m+n)*(-np.sin(theta))), int(y0 - (m+n)*np.cos(theta)) )
cv2.line(draw_im, pt1, pt2, (255,0,0), 2)

for l in plines:
# red for line segments
cv2.line(draw_im, (l[0],l[1]), (l[2],l[3]), (0,0,255), 2)


# save the resulting image
This will in turn; read an image, create a graylevel version for the detectors, detect circles using HoughCircles(), run edge detection using Canny(), detect lines with HoughLines(), detect line segments with HoughLinesP(), draw the result (green circles, blue lines, red line segments), show the result and save an image. The result can look like this: From these features, you should be able to get an estimate on the gauge reading. If you have large images, you should probably scale them down first. If the images are noisy, you should adjust the blurring for the edge detection. There are also threshold parameters to play with, check the documentation for what they mean. Good luck.

Adjacency matrix for image pixel graphs

One of my favorite NumPy functions is roll(). Here's an example of using this function to get neighborhood indices to create adjacency matrices for images.

An adjacency matrix is a way of representing a graph and shows which nodes of the graph are adjacent to which other nodes. For n nodes is is an n*n matrix A where a_ij is the number of edges from vertex i to vertex j. The number of edges can also be replaced with edge weights depending on the application.

For images, a pixel neighborhood defines a graph which is sparse since each pixel is only connected to its neighbors (usually 4 or 8). A sparse representation, for example using dictionaries, is preferable if only the edges are needed. For clustering using spectral graph theory, a full matrix is needed. The following function creates an adjacency matrix in sparse or full matrix form given an image:

from numpy import *

def adjacency_matrix(im,connectivity=4,sparse=False):
""" Create a pixel connectivity matrix with
4 or 8 neighborhood. If sparse then a dict is
returned, otherwise a full array. """

m,n = im.shape[:2]

# center pixel index
c = arange(m*n)
ndx = c.reshape(m,n)

if sparse:
A = {}
A = zeros((m*n,m*n))

if connectivity==4: # 4 neighborhood
hor = roll(ndx,1,axis=1).flatten() # horizontal
ver = roll(ndx,1,axis=0).flatten() # vertical
for i in range(m*n):
A[c[i],hor[i]] = A[hor[i],c[i]] = 1
A[c[i],ver[i]] = A[ver[i],c[i]] = 1

elif connectivity==8: # 8 neighborhood
hor = roll(ndx,1,axis=1).flatten() # horizontal
ver = roll(ndx,1,axis=0).flatten() # vertical
diag1 = roll(roll(ndx,1,axis=0),1,axis=1).flatten() # diagonal
diag2 = roll(roll(ndx,1,axis=0),1,axis=0).flatten() # diagonal
for i in range(m*n):
A[c[i],hor[i]] = A[hor[i],c[i]] = 1
A[c[i],ver[i]] = A[ver[i],c[i]] = 1
A[c[i],diag1[i]] = A[diag1[i],c[i]] = 1
A[c[i],diag2[i]] = A[diag2[i],c[i]] = 1

return A

Use it like this:

from pylab import *

im = random.random((10,10))
A = adjacency_matrix(im,8,False)


Which should give you an image like the one below.

RQ Factorization of Camera Matrices

A common operation on camera matrices is to use RQ factorization to obtain the camera calibration matrix given a 3*4 projection matrix. The simple example below shows how to do this using the scipy.linalg module. Assuming the camera is modeled as P = K [R | t], the goal is to recover K and R by factoring the first 3*3 part of P.

The scipy.linalg module actually contains RQ factorization although this is not always clear from the documentation (here is a page that shows it though). To use this version, import rq like this:

from scipy.linalg import rq

Alternatively, you can use the more common QR factorization and with some modifications write your own RQ function.

from scipy.linalg import qr

def rq(A):
Q,R = qr(flipud(A).T)
R = flipud(R.T)
Q = Q.T
return R[:,::-1],Q[::-1,:]

RQ factorization is not unique. The sign of the diagonal elements can vary. In computer vision we need them to be positive to correspond to focal length and other positive parameters. To get a consistent result with positive diagonal you can apply a transform that changes the sign. Try this on a camera matrix like this:

# factor first 3*3 part of P
K,R = rq(P[:,:3])

# make diagonal of K positive
T = diag(sign(diag(K)))

K = dot(K,T)
R = dot(T,R) #T is its own inverse

Multidimensional meshgrid with Python

Looking for an easy way to plot cubes in 3D I found NumPy's mgrid that can do meshgrid in any number of dimensions. The syntax is a bit confusing but once you understand the trick, it works.

Here's a simple example that creates points for a unit cube and plots the points.

from pylab import *
from numpy import *
from mpl_toolkits.mplot3d import axes3d

# create 3D points
x,y,z = mgrid[0:2,0:2,0:2]
xx = x.flatten()
yy = y.flatten()
zz = z.flatten()

# plot 3D points
fig = figure()
ax = fig.gca(projection='3d')


Converting video to NumPy arrays

Using OpenCV it is possible to read video frames from a file and convert them to NumPy arrays. Frames returned by QueryFrame() are in OpenCV's IplImage format where the pixels are stored as arrays of color in BGR order. Here's an example of converting the 100 first frames of a video to standard RGB and storing them in a NumPy array.

from numpy import *
import cv

capture = cv.CaptureFromFile('')
frames = []
for i in range(100):
img = cv.QueryFrame(capture)
tmp = cv.CreateImage(cv.GetSize(img),8,3)
frames = array(frames)

Here the CvtColor() function converts from BGR to RGB and NumPy's asarray() does the final conversion. The array will have dimensions (100,width,height,3) in this case.

There is no good way to determine the end of the file, QueryFrame() will just loop around to the beginning. If you want to you can add a check to break when the first frame appears a second time.

Here's the video I used:

NLTK under Python 2.7 and SciPy 0.9.0

Python 2.7 has been out for months, but I have been stuck using Python 2.6 since SciPy was not working for Python 2.7. SciPy 0.9 Beta 1 binary distribution has just been released.
Normally I try to stay clear of beta quality software, but I really like some of the new features in Python 2.7 especially the argparse module, so despite my better judgement I installed Python 2.7.1 and SciPy 0.9.0 Beta 1, to run with a big NLTK based library. This is blog post describes the configuration that I use; and my first impression of the stability.

SciPy 0.9 RC1 was released January 2011.
SciPy 0.9 was released February 2011.
I tried both of them and found almost the same result as for SciPy 0.9 Beta 1, which this review was originally written for.

Direct downloads
Here is a list of the programs I installed directly:

Installation of NLTK

The install was very simple just type:

\Python27\lib\site-packages\ nltk

Other libraries installed with

  • CherryPy
  • ipython
  • PIL
  • pymongo
  • pyodbc

    YAML Library
    On a Windows Vista computer with no MS C++ compiler were I tested this NLTK install I also had to do a manual install of YAML from:

    Libraries from unofficial binary distributions
    There are a few packages that have build problems, but can be loaded from Christoph Gohlke's site with Unofficial Windows Binaries for Python Extension Packages: I downloaded and installed:
    • matplotlib-1.0.0.win32-py2.7.exe
    • opencv-python-2.2.0.win32-py2.7.exe


    The installation was simple. Everything installed cleanly. I ran some bigger scripts and they ran fine. Development and debugging also worked fine. Out of 134 NLTK related unit tests only one failed under Python 2.7

    Problems with SciPy algorithms

    The failing unit test was maximum entropy training using the LBFGSB optimization algorithm. These were my settings:
    nltk.MaxentClassifier.train(train, algorithm='LBFGSB', gaussian_prior_sigma=1, trace=2)

    First the maximum entropy training would not run because it was calling the method rmatvec() in scipy/sparse/ This method has been deprecated for a while and has been taken out of the SciPy 0.9. I found this method in SciPy 0.8 and added it back. My unit test ran, but instead of finishing in a couple of seconds it took around 10 minutes eating up 1.5GB before it crashed. After this I gave up on LBFGSB.

    If you do not want to use LBFGSB, megam is another efficient optimization algorithm. However it is implemented in OCaml and I did not want to install OCaml on a Windows computer.

    This problem occurred for both SciPy 0.9 Beta 1 and RC1.

    Python 2.6 and 2.7 interpreters active in PyDev

    Another problem was that having both Python 2.6 and 2.7 interpreters active in PyDev made it less stable. When I started scripts from PyDev sometime they timed out before starting. PyLint would also show errors in code that was correct. I deleted Python 2.6 interpreter under PyDev Preferences, and PyDev worked fine with just Python 2.7.

    I also added a version check the one failing unit test, since it caused problems for my machine.
    if (2, 7) < sys.version_info: return

    Multiple versions of Python on Windows

    If you install Python 2.7 and realize that some code is only running under Python 2.6 or that you have to rollback. Here are a few simple suggestions:

    I did a Google search for:
    python multiple versions windows
    This will show many ways to deal with this problem. One way is calling a little Python script that change the Windows register settings.

    Multiple versions of Python have not been a big problem for me. So I favor a very simple approach. The main issue is file extension binding. What program gets called when you double click a py file or type on the command line.

    Changing file extension binding for rollback to Python 2.6

    Under Windows XP You can change file extensions in Windows Explorer:
    Under Tools > Folder Option > File Types
    Select the PY Extension and press Advanced then press Change
    Select open press Edit
    The value is:
    "C:\Python27\python.exe" "%1" %*
    You can change this to use a different interpreter:
    "C:\Python26\python.exe" "%1" %*

    Or even simpler when I want to run the older Python interpreter I just type:
    Instead of typing

    Is Python 2.7 and SciPy 0.9.0 Beta 1 stable enough for NLTK use?

    The installation of all the needed software was fast and unproblematic. I would certainly not use it in a production environment. If you are doing a lot of numerical algorithms you should probably hold off. If you are impatient and you do not need to do new training it is worth trying it, you can always roll back.

    Installing Numpy and Matplotlib on Mac OS X 10.6 Snow Leopard

    This is partly an update to my previous post on installing for Leopard, partly a note to myself. After upgrading to Snow Leopard (and 64bit) I could still use my MacPort installs but not upgrade or install new ones (since all previous builds were for 32bit). So, time for another fresh install.

    With MacPorts:

    sudo port install python26
    sudo port install py26-numpy
    sudo port install py26-matplotlib

    After this I can start python and load pylab but I get problems with the plotting window totally missing. My colleague Jerome Piovano found a fix over at stackoverflow. Replace the last line with

    sudo port install py26-matplotlib +cairo+gtk2

    Then edit ~/.matplotlib/matplotlibrc (it should be in your home directory, if no such file exists just make a new file) by adding:

    backend: MacOSX

    Now everything works as it should.

    sparray – sparse n-dimensional arrays in Python

    Yesterday I wrote about sparse arrays for Python using dictionaries. I couldn't get it out of my head so tonight I actually made a complete module containing a class for sparse n-dimensional arrays. The module supports any number of dimensions and any size. Sparray has all basic operations like adding, subtracting, multiplication, division, mod, pow, printing, sum() and can easily be converted to full NumPy arrays.

    The sparray module can be downloaded from here. Running will show a series of examples and print the results in the console.

    Enjoy. Bug reports and feedback welcome!

    Below is some output from the example:

    [[ 3. 3. 3.]
    [ 3. 13. 3.]
    [ 3. 3. 3.]]
    [[-3. -3. -3.]
    [-3. 7. -3.]
    [-3. -3. -3.]]
    [[ 0. 0. 0.]
    [ 0. 30. 0.]
    [ 0. 0. 0.]]
    [[ 9. 9. 9.]
    [ 9. 9. 9.]
    [ 9. 9. 9.]]
    [[ 0. 0. 0.]
    [ 0. 3. 0.]
    [ 0. 0. 0.]]
    [[ 0. 0. 0.]
    [ 0. 1. 0.]
    [ 0. 0. 0.]]
    [[ 0. 0. 0.]
    [ 0. 1000. 0.]
    [ 0. 0. 0.]]
    [[ 3. 3. 3.]
    [ 3. 13. 3.]
    [ 3. 3. 3.]]
    sum of elements...
    mix with NumPy arrays...
    [[ 6. 6. 6.]
    [ 6. 26. 6.]
    [ 6. 6. 6.]]
    Frobenius norm...

    Sparse arrays in Python using dictionaries

    UPDATE: I made a complete module for sparse n-dimensional array. See the next post.

    A great application of Python's dictionary type is to use it for representing sparse arrays and matrices (in arbitrary dimensions). Consider the following code example:

    a = {}
    a[0,3] = 12
    a[2,1] = 5
    a[3,3] = 7

    Now, this dictionary looks like this:

    {(0, 3): 12, (3, 3): 7, (2, 1): 5}

    It is easy to define operations on these dictionaries. For example, the Euclidean (L2) distance between two arrays (Frobenius norm in the case of matrices) is simply:

    def L2(a1,a2):
    ndx = set(a1.keys()+a2.keys())
    return sum([ (a1.get(i,0)-a2.get(i,0))**2 for i in ndx])

    Here we used set() to remove duplicates after concatenating the keys and the dictionary get() method that returns a default value (in this case zero) if a key is missing.

    With another example array like this:

    b = {}
    b[0,3] = 10
    b[1,1] = 15
    b[1,3] = 4

    the value of L2(a,b) is 319.

    There are some interesting efforts for sparse multidimensional arrays in Python out there. Ndsparse is a new one, which uses lists instead of dictionaries and looks very nice and clean. Unfortunately I found lots of basic things broken when testing today. Hope to see a new release with bugs fixed soon! There is also the SciPy sparse module with data structures for 2D sparse matrices but that looks a little messy (and is only 2D).

    CVXOPT – convex optimization with Python

    The other day I was looking around for a good optimization package for Python and found something that looks great.

    CVXOPT is a free software package (GPL license) for convex optimization based on Python and developed by Joachim Dahl and Lieven Vandenberghe. CVXOPT is NumPy compatible which means that you can use your arrays as usual.

    The CVXOPT website contains lots of examples from the book Convex Optimization by Boyd and Vandenberghe. For example this total variation reconstruction which is quite neat.

    After playing around with it and trying some examples I really like it and it looks solid. Give it a try next time you have a linear program or something in need of optimization.

    Is your NumPy using the right ATLAS?

    Here's a tip I got from my colleague Martin. Making sure that your NumPy installation uses the right back-end can drastically improve performance. ATLAS is the software library used by NumPy (and Matlab, Octave etc.) to do linear algebra.

    Try running the following 1000*1000 matrix multiplication test:

    >>> from numpy import *
    >>> import time
    >>> A = random.random((1000,1000))
    >>> B = random.random((1000,1000))
    >>> t = time.time(); dot(A,B); print time.time()-t

    Disappointed about the time this takes? Take a minute to check that you are running the right ATLAS for your computer. On my Mac this test was fast (0.2-0.3s) but on my desktop running Ubuntu very slow (5-6s). In Ubuntu, open the Package Manager and search for "atlas". If you find something like "libatlas3gf-base" you are using a generic version that works on most hardware. Unless your computer is antique, you can probably do better. Try selecting "libatlas3gf-sse2" (or even libatlas3gf-sse1 if this doesn't work) instead and run the test again. (on other linux platforms the process should be similar)

    Tests I did on a couple of machines show speedup of a factor 4-10. Well worth the minute it takes to check and fix. Thanks Martin!

    Exporting from NumPy to Weka

    Weka is a very useful tool for experimenting with different classifiers, pre-processing of data and much more. Weka is an open source (GPL) collection of machine learning algorithms for data mining written in Java. Not only algorithms, Weka also has a GUI for exploring data, running classification experiments and visualization without having to write any code. Some screen shots:

    Loading and exploring some data (in this case the iris sample data set).

    Running cross-correlation with different classifiers.


    If you are used to working in NumPy, the following function will write NumPy arrays to Weka .arff data files that can be loaded directly from the GUI.

    def write_to_weka(filename,relationname,attributenames,attributes,comment=''):
    """ writes NumPy arrays with data to WEKA format .arff files

    input: relationname (string with a description), attributenames (list
    of the names of different attributes), attributes (array of attributes,
    one row for each attribute, WEKA treats last row as classlabels by
    default), comment (short description of the content)."""

    nbrattributes = len(attributenames)
    if attributes.shape[1] != nbrattributes:
    raise Exception('Number of attribute names is not equal to length of attributes')

    f = open(filename, 'w')
    f.write('% '+comment+'\n')
    f.write('@RELATION '+relationname+'\n')

    for a in attributenames:
    f.write('@ATTRIBUTE '+str(a)+' NUMERIC\n') #assume values are numeric

    f.write('@DATA\n') #write the data, one attribute vector per line
    for i in range(attributes.shape[0]):
    for j in range(nbrattributes):
    if j < nbrattributes-1:
    f.write(', ')

    Weka is one of those tools I always forget I have, I hope this post will help me remember and hopefully introduce Weka to new people.

    Tip: If you run out of memory, start Weka with "java -Xmx512m weka.jar weka.gui.GUIChooser" and you will have 512MB... or whatever you choose.