The ImageMagick Sharpen Operation

ImageMagick has a sharpen operation. It's not an unsharp mask, the standard weapon of image sharpening - that's unsharp. So what is it? The documentation says "Sharpen the image. Use a Gaussian operator of the given radius and standard deviation (sigma).". And that's it. Helpful!

Luckily, ImageMagick is open source, so we can see for ourselves. The key code is in effect.c, in the SharpenImageChannel function (called from SharpenImage). For what it's worth, the comment says:

%  SharpenImage() sharpens the image.  We convolve the image with a Gaussian
%  operator of the given radius and standard deviation (sigma).  For
%  reasonable results, radius should be larger than sigma.  Use a radius of 0
%  and SharpenImage() selects a suitable radius for you.
%  Using a separable kernel would be faster, but the negative weights cancel
%  out on the corners of the kernel producing often undesirable ringing in the
%  filtered result; this can be avoided by using a 2D gaussian shaped image
%  sharpening kernel instead.

And, arch as this is, the code bears this out: It consists of two phases: preparation of a kernel, and then convolution of the image with it. The convolution appears to be completely standard, and so it's the preparation of the kernel that's of interest. The code for that is as follows (paraphrased in a dialect of C that may not actually compile anywhere):

  const double PI = 3.1415926535897931 ;
  long width = GetOptimalKernelWidth2D(radius, sigma) ;
  double *kernel = malloc(sizeof(double) * width * width) ;
  long i = 0 ;
  long u = 0 ;
  long v = 0 ;
  double normalize = 0.0 ;
  for (v = -width/2; v <= width/2; ++v)
    for (u = -width/2; u <= width/2; ++u)
      double alpha = exp(-(u*u + v*v) / (2.0 * sigma * sigma)) ;
      kernel[i] = -alpha / (2.0 * PI * sigma * sigma)) ;
      if ((width < 3) || (u != 0) || (v != 0))
        normalize += kernel[i] ;
      ++i ;
  kernel[i/2] = -2.0 * normalize ;

GetOptimalKernelWidth2D is in gem.c, and i'm not going to worry about what it does for now.

So, what does this do? Well, a lot of it is fairly obvious shunting as comprises most of any program - call a helper function (defined in gem.c, and which i haven't delved into yet) to compute the size of the kernel, allocate memory for the kernel, set up some zero variables. Then, loop over the kernel in two dimensions, with v and u instead of y and x, for some reason. Along with those loops, we maintain an counter, i, which is an index into the kernel array, pointing to the entry corresponding to the coordinate (u, v); incrementing it separately saves having to calculate it from u and v every time, which would be a minor pain. In the heart of the loop, do two things: calculate the value of the kernel at (u, v) and put it in kernel[i], and update the 'normalize' variable. After we finish the loops, set the central element of the kernel (at position i/2, where i has by now reached the size of the kernel array) to a particular value.

So, how are the kernel values calculated? There's a load of maths type stuffs with alphas, the exponential function, sum of squares of coordinates, the square of sigma, etc. Like it says in the comment, this is computing a Gaussian distribution: take a line through this kernel, and you have a bell curve. Why a Gaussian? It's a standard for convolution kernels. Yes, but why? I think it's because it's an approximation to the two-dimensional sinc function (aka the Airy disc, which is the way diffraction spreads out light, which is what we're trying to reverse here) without all the irritating higher-order maxima, and it has some nice analytical properties (self-inverse in Fourier space or something, i don't know). Note that the kernel values are all negative: alpha, being the output of exp, is always positive, and sigma is always positive, so -alpha over a positive constant times sigma is always negative.

So, what's the normalisation bit about? Well, we add up all the values in the kernel, excluding the central element, which fails the (u != 0) || (v != 0) test (except when the kernel is tiny, although i don't see how a tiny kernel is even possible), and set the central element of the kernel to -2 times the total. This means that the total value of the kernel is -normalize.

So what does this all do? Well, the shape of the kernel is a negative Gaussian with a big positive spike in the middle. The image produced by convolution by that kernel is equal to the convolution of the source with a positive Gaussian subtracted from two times the source image. Right? Think about it. What does that look like? Hey, that's exactly what an unsharp mask is! A basic one, anyway: there's no thresholding, and no control over how much of the blurred image is subtracted from the source - 'amount' in the usual terminology. The lack of control over amount is slightly odd, since it could easily be done by changing the value of -2.