A new road to saliency

Posted: February 17, 2013 in Work

I have been searching for good papers that I could implement to improve my current implementation. After a lot of googling and reading a few papers, my eyes were half red. It was around 3 a.m. when I came across Ioannis Katramados and Toby Breckon’s paper, “Real-Time Visual Saliency by Division of Gaussians”. It was not exactly what I was looking for, but it seemed to satisfy my need, so I decided to implement it. Since then I have been discussing various things with both I. Katramados and T. Breckon. They are both really helpful, and so is their paper.

Coming back to the implementation: the paper is very well written, and it seems easy to implement using OpenCV (SEEMS!).

This is what I am doing:
1) I converted the image to grayscale (32F).
2) According to step one in the paper: “The Gaussian pyramid U comprises of n levels, starting with an image U1 as the base with resolution w × h. Higher pyramid levels are derived via downsampling using a 5 × 5 Gaussian filter. The top pyramid level has a resolution of (w/2^(n−1)) × (h/2^(n−1)). Let us call this image Un.”
    This means I simply have to perform the pyrDown() operation in OpenCV. I did it 8 times.
3) According to step two in the paper: “Un is used as the top level Dn of a second Gaussian pyramid D in order to derive its base D1. In this case, lower pyramid levels are derived via upsampling using a 5×5 Gaussian filter.”
    I simply applied pyrUp() to the image 8 times.
4) Then comes the pixel-by-pixel division of values, as described in the paper.
5) I normalized the result matrix to 0-255.
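The size bookkeeping behind steps 2 and 3 can be sketched as follows. This is only a rough sketch, assuming OpenCV's default output sizes: pyrDown() gives ((w+1)/2) × ((h+1)/2) and pyrUp() gives 2w × 2h. The helper names are hypothetical and nothing here calls OpenCV; it only does the integer arithmetic. It suggests that the rebuilt base may not match the input size unless the width and height are multiples of 256, in which case a resize would be needed before the division in step 4.

```cpp
#include <utility>

// Hypothetical helpers: they only mirror the default size arithmetic of
// pyrDown() and pyrUp() (an assumption), they do not touch any pixels.
using Size = std::pair<int, int>;  // (width, height)

// Size after applying pyrDown() `times` times: each step halves, rounding up.
Size sizeAfterPyrDown(Size s, int times) {
    for (int i = 0; i < times; ++i) {
        s.first = (s.first + 1) / 2;
        s.second = (s.second + 1) / 2;
    }
    return s;
}

// Size after applying pyrUp() `times` times: each step doubles.
Size sizeAfterPyrUp(Size s, int times) {
    for (int i = 0; i < times; ++i) {
        s.first *= 2;
        s.second *= 2;
    }
    return s;
}

// True when going down and back up `times` levels restores the input size.
bool roundTripKeepsSize(Size s, int times) {
    return sizeAfterPyrUp(sizeAfterPyrDown(s, times), times) == s;
}
```

For a 640 × 480 input, for example, eight reductions give 3 × 2 and eight expansions give 768 × 512, so the two bases could not be divided element by element without a resize.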
I. Katramados has been really generous, and both authors of the paper have been really helpful. I guess a few changes to the implementation might result in a better output.
3:41 PM: So now it's time to follow some more advice from Ioannis Katramados.
  1. ameya005 says:


    I tried implementing the same code in OpenCV recently. But I am getting an output image where the saliency map is repeated multiple times. Did you face a similar problem?

    • Hey! I can give you a better answer if you can share a link to your input and output images. It could be due to a wrong interpretation of image dimensions or channels. Check that you are passing the right number of image channels to every function. I never faced the problem you describe (repeating patterns). Also check that you are handling float (32F) data properly.

  2. Jon Lee says:

    Hello! I started learning opencv a few weeks ago and recently discovered this subreddit and was wondering if I could get some help in implementing the following.


    My current problem is that I’m not quite sure where my code is wrong (there’s a strange pattern showing up). My output images simply do not match the output images described in the paper.

    Here’s the source code: pastebin.com/25bTqZHh8
    Here’s the output image: imgur.com/EMSl4xn.png

    • Hi there.
      Your code has been removed from pastebin. It is in there image, I will forward my implementation to you.

      Your outputs are very different; maybe the following will help:
      While calculating the minimum ratio, make sure you avoid dividing by zero. An easy way is to add 1 to both the numerator and the denominator before the division. It is not mathematically exact, but it results in a very small error.
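The offset trick above can be sketched in plain C++ on float values rather than OpenCV matrices. This is a minimal sketch, not the code from either implementation: the names minRatioOffset and minRatioOffsetMap are hypothetical, and it assumes non-negative pixel values on a 0-255 scale and two buffers of equal length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: minimum ratio of two non-negative pixel values, with
// 1 added to numerator and denominator so neither division can hit zero.
float minRatioOffset(float a, float b) {
    const float x = a + 1.0f;
    const float y = b + 1.0f;
    return std::min(x / y, y / x);
}

// Element-wise version over two buffers.
// Assumption: u and d have the same length and hold values >= 0.
std::vector<float> minRatioOffsetMap(const std::vector<float>& u,
                                     const std::vector<float>& d) {
    std::vector<float> out(u.size());
    for (std::size_t i = 0; i < u.size(); ++i) {
        out[i] = minRatioOffset(u[i], d[i]);
    }
    return out;
}
```

The result always lies in (0, 1], and equal inputs (including two zeros) give exactly 1, so identical pixels end up with zero saliency after subtracting from 1.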

      The other thing I can tell you is that, while implementing this paper, I never got the exact results shown in the paper either. But my results were far better than the one you posted. I also discussed the difference in results with one of the authors and shared my code with him. I had implemented the paper in C++, whereas the author's implementation was in C. He said that my implementation was right, but the output was significantly different for the same set of input images. He assumed this could be because of differences between the C and C++ implementations of OpenCV. So he shared his C code:

      // Calculate Minimum Ratio (MiR) matrix
      cvDiv(pyramid->level_image[0], pyramid_inv->level_image[0], matrix_ratio);
      cvDiv(pyramid_inv->level_image[0], pyramid->level_image[0], matrix_ratio_inv);
      cvMin(matrix_ratio, matrix_ratio_inv, matrix_min_ratio);

      // Derive salience by subtracting from unit matrix
      cvSub(unit_matrix, matrix_min_ratio, saliency_matrix);
      cvConvertScale(saliency_matrix, image_8u, 255.0);

      // Filter
      avg = cvAvg(image_8u);
      cvSubS(image_8u, avg, image_8u);

      // Normalization to range 0-255
      cvNormalize(image_8u, image_8u, 0, 255, CV_MINMAX);

      I did not give it a try in C because by then I had already optimized my results in C++. But if you do not have time constraints, and since you are learning, do give it a try.
      Hope it helps!
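For readers without the old C API at hand, the steps in the C snippet above can be read roughly as follows. This is a plain C++ sketch of one possible reading, not the author's code: the function name saliencyFromBases is hypothetical, it assumes image_8u is an unsigned 8-bit image (so the scale and subtract steps clamp to 0-255), it assumes both bases are strictly positive, and its rounding may differ slightly from OpenCV's.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical plain-C++ reading of the snippet: u and d are the two pyramid
// bases as float buffers of equal length with strictly positive values.
std::vector<std::uint8_t> saliencyFromBases(const std::vector<float>& u,
                                            const std::vector<float>& d) {
    const std::size_t n = u.size();
    std::vector<std::uint8_t> img(n);
    if (n == 0) return img;

    // Minimum ratio, subtract from 1, scale to 8 bits (rounded and clamped).
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const float mir = std::min(u[i] / d[i], d[i] / u[i]);
        const float s = std::min(255.0f, std::max(0.0f, (1.0f - mir) * 255.0f));
        img[i] = static_cast<std::uint8_t>(std::lround(s));
        sum += img[i];
    }

    // Filter: subtract the mean; values below the mean clamp at zero.
    const double mean = sum / static_cast<double>(n);
    for (auto& p : img) {
        const double v = std::max(0.0, p - mean);
        p = static_cast<std::uint8_t>(std::lround(v));
    }

    // Min-max normalization to 0-255 (a constant image is set to 0 here).
    const auto mm = std::minmax_element(img.begin(), img.end());
    const int lo = *mm.first;
    const int hi = *mm.second;
    if (hi == lo) {
        std::fill(img.begin(), img.end(), 0);
        return img;
    }
    for (auto& p : img) {
        p = static_cast<std::uint8_t>(std::lround((p - lo) * 255.0 / (hi - lo)));
    }
    return img;
}
```

One thing this reading makes visible is that the mean subtraction discards everything below the average saliency before the final stretch, which may account for part of the difference from a version that only normalizes.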
