8 Steps to Generating Content Aware Images


There are many image resizing and cropping technologies available on the market, but almost none of them answer a simple question: how do I resize an image so that it preserves the relevant visual information without distorting the content? How do I make content aware images?

Every front-end developer or designer can relate to this situation: mocking up content on various devices results in a distorted image. The root of the problem is that the image loses its relevant visual information due to the aspect ratio constraints.

With the wide adoption of mobile devices, it is imperative for a web developer to show the same pictures on each device while maintaining their important features.

Nowadays almost every website is developed to be responsive and to seamlessly adapt its content to different resolutions. HTML, as well as other standards, can support dynamic changes of page layout and text.

But this does not resolve the issue with images.

Although being one of the key elements in digital media, images typically remain rigid in size and cannot automatically deform to fit different layouts.

This is why it’s so important to develop a method that not only rescales the image but also preserves the aspect ratio of its content.

In this article we present a technique developed by Shai Avidan and Ariel Shamir known as seam carving, described in their paper Seam Carving for Content-Aware Image Resizing.

Approaching Content Aware Images

Let’s consider the image below (Fig.1). It’s a nice and clean picture with a wide open background. Now suppose that we want to make it smaller. We have two options: either to crop it, or to scale it.

Cropping is limited since it can only remove pixels from the image periphery. Advanced features like smart cropping cannot resolve our issue either, since they will either remove the person at the left margin or crop a small part of the castle. Scaling is also not sufficient, since it is not aware of the image content and typically can only be applied uniformly.

 

Fig.1: Sample Image

 

Seam carving was developed precisely for this kind of use case. It works by establishing a number of seams (a seam is a connected path of low-energy pixels) crossing the image from top to bottom or from left to right, where the energy defines the importance of the pixels. By successively removing or inserting seams, we can reduce or enlarge the size of the image in both directions.

Fig.2 illustrates the process.

Fig.2: The seam carving method illustrated

 

Now let’s skim through the details and summarize the important steps.

The 8 Steps of Generating Content Aware Images

  1. An energy map (edge detection) is generated from the provided image.
  2. The algorithm tries to find the least important parts of the image taking into account the lowest energy values.
  3. Using a dynamic programming approach, the algorithm will generate individual seams crossing the image from top to bottom, or from left to right (depending on whether we resize horizontally or vertically), and will allocate a cost to each seam. The least important pixels have the lowest energy cost and the most important ones have the highest cost.
  4. Traverse the image from the second row to the last row and compute the cumulative minimum energy for all possible connected seams for each entry.
  5. The minimum energy level is calculated by summing up the current pixel value with the lowest value of the neighboring pixels from the previous row.
  6. Traverse the image from top to bottom (or from left to right in case of vertical resizing) and compute the minimum energy level. For each pixel in a row, we compute the energy of the current pixel plus the energy of one of the three possible pixels above it.
  7. Find the lowest cost seam from the energy matrix starting from the last row and remove it.
  8. Repeat the process.

Implementation

Seam carving can support several types of energy functions, such as gradient magnitude, entropy and the Sobel filter. These functions have something in common: they value a pixel by measuring its contrast with its neighboring pixels.

We will use the Sobel filter operator in our case. In image processing the Sobel operator is typically used to detect image edges. Here it produces an energy distribution matrix that differentiates the sensitive image information from the less sensitive (Fig. 3).

Fig.3: Sobel threshold applied to the original image
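
To make the idea of an energy map more concrete, here is a minimal sketch of a Sobel gradient magnitude computed on a plain grid of grayscale values. The function name `sobelEnergy`, the `gray[y][x]` layout and the border clamping are assumptions made for this illustration; it is not the `SobelFilter` used later in the article, and it leaves out the threshold parameter.

```go
package main

import (
	"fmt"
	"math"
)

// sobelEnergy returns the Sobel gradient magnitude for every cell of a
// grayscale grid (gray[y][x]). Cells outside the grid are replaced by the
// nearest valid cell (clamping), which is only one possible border policy.
func sobelEnergy(gray [][]float64) [][]float64 {
	h, w := len(gray), len(gray[0])
	// Horizontal and vertical Sobel kernels.
	kx := [3][3]float64{{-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1}}
	ky := [3][3]float64{{-1, -2, -1}, {0, 0, 0}, {1, 2, 1}}

	clamp := func(v, lo, hi int) int {
		if v < lo {
			return lo
		}
		if v > hi {
			return hi
		}
		return v
	}

	energy := make([][]float64, h)
	for y := 0; y < h; y++ {
		energy[y] = make([]float64, w)
		for x := 0; x < w; x++ {
			var gx, gy float64
			for j := -1; j <= 1; j++ {
				for i := -1; i <= 1; i++ {
					v := gray[clamp(y+j, 0, h-1)][clamp(x+i, 0, w-1)]
					gx += kx[j+1][i+1] * v
					gy += ky[j+1][i+1] * v
				}
			}
			energy[y][x] = math.Hypot(gx, gy)
		}
	}
	return energy
}

func main() {
	// A flat region on each side and a vertical edge in the middle.
	gray := [][]float64{
		{0, 0, 10, 10},
		{0, 0, 10, 10},
		{0, 0, 10, 10},
	}
	for _, row := range sobelEnergy(gray) {
		fmt.Println(row) // each row prints [0 40 40 0]
	}
}
```

The flat areas get zero energy while the two columns around the edge get a high value, which is exactly the property the seam search relies on.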

 

Once we obtain the energy distribution matrix, we can advance to the next step and generate individual seams, each one pixel wide.

We’ll use a dynamic programming approach to store the results of sub-calculations in order to simplify calculating the more complex ones. For this purpose we define some setter and getter methods whose role is to set and get the pixel energy value.

[code lang="go"]// Seam struct contains the seam pixel coordinates.
type Seam struct {
	X int
	Y int
}

// Carver is the main entry struct having as parameters the newly generated image width, height and seam points.
type Carver struct {
	Width  int
	Height int
	Points []float64
}

// NewCarver returns an initialized Carver structure.
func NewCarver(width, height int) *Carver {
	return &Carver{
		width,
		height,
		make([]float64, width*height),
	}
}

// Get energy pixel value.
func (c *Carver) get(x, y int) float64 {
	px := x + y*c.Width
	return c.Points[px]
}

// Set energy pixel value.
func (c *Carver) set(x, y int, px float64) {
	idx := x + y*c.Width
	c.Points[idx] = px
}[/code]

After generating the energy map there are two important steps we need to follow:

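Traverse the image from the second row to the last row and compute the cumulative minimum energy M for all possible connected seams for each entry (i, j). M is the two-dimensional array of cumulative energies we are building up. The minimum energy level is calculated by adding the current pixel value to the minimum value of the neighboring pixels from the previous row. In other words, M(i, j) = e(i, j) + min(M(i-1, j-1), M(i-1, j), M(i-1, j+1)), where e is the energy map. Finding the cheapest seam is a shortest-path problem, so it could be solved with Dijkstra’s algorithm, but because every seam moves strictly from one row to the next, a single dynamic programming pass over the rows is enough. Suppose that we have a matrix with the following values: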

 

A matrix used for content aware images
Original Matrix

 

To compute the minimum cumulative energies of the second row, we take each cell of the second row and add to it the minimum value of its neighboring cells in the first row (the cells directly above, above-left and above-right). After this operation is carried out for every pixel in the second row, we move on to the third row, and so on.

 

Using Dijkstra's algorithm to produce content aware images
Matrix with calculated energy values
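
The row-by-row accumulation can also be followed on a tiny example. The sketch below uses a hypothetical 3×4 energy matrix (not the one from the figures) and a helper called `cumulativeEnergy`, a name chosen only for this illustration.

```go
package main

import (
	"fmt"
	"math"
)

// cumulativeEnergy returns m, where m[y][x] is the energy of the cheapest
// connected vertical path that starts in the first row and ends at (x, y).
func cumulativeEnergy(e [][]float64) [][]float64 {
	h, w := len(e), len(e[0])
	m := make([][]float64, h)
	for y := range m {
		m[y] = make([]float64, w)
		copy(m[y], e[y])
	}
	for y := 1; y < h; y++ {
		for x := 0; x < w; x++ {
			// Cheapest of the (up to three) neighbors in the previous row.
			best := m[y-1][x]
			if x > 0 {
				best = math.Min(best, m[y-1][x-1])
			}
			if x < w-1 {
				best = math.Min(best, m[y-1][x+1])
			}
			m[y][x] = e[y][x] + best
		}
	}
	return m
}

func main() {
	// Hypothetical energy values, chosen only for illustration.
	e := [][]float64{
		{1, 4, 3, 5},
		{3, 2, 5, 2},
		{5, 3, 4, 2},
	}
	for _, row := range cumulativeEnergy(e) {
		fmt.Println(row)
	}
	// Prints:
	// [1 4 3 5]
	// [4 3 8 5]
	// [8 6 7 7]
}
```

The smallest value in the last row (6) is the total energy of the cheapest seam, here the path 1 → 2 → 3.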

 

The following function calculates the minimum cumulative energy values with this row-by-row pass.

[code lang="go"]// ComputeSeams computes the minimum cumulative energy of every pixel.
// It needs the "image", "image/draw" and "math" packages. Processor, Grayscale
// and SobelFilter are not shown in this article.
func (c *Carver) ComputeSeams(img *image.NRGBA, p *Processor) []float64 {
	newImg := image.NewNRGBA(image.Rect(0, 0, img.Bounds().Dx(), img.Bounds().Dy()))
	draw.Draw(newImg, newImg.Bounds(), img, image.Point{}, draw.Src)

	src := SobelFilter(Grayscale(newImg), float64(p.SobelThreshold))

	// Fill the energy map with the normalized Sobel response.
	for x := 0; x < c.Width; x++ {
		for y := 0; y < c.Height; y++ {
			r, _, _, a := src.At(x, y).RGBA()
			c.set(x, y, float64(r)/float64(a))
		}
	}

	var left, middle, right float64

	// Traverse the image from top to bottom and compute the minimum energy level.
	// For each pixel in a row we compute the energy of the current pixel
	// plus the energy of one of the three possible pixels above it.
	for y := 1; y < c.Height; y++ {
		for x := 1; x < c.Width-1; x++ {
			left = c.get(x-1, y-1)
			middle = c.get(x, y-1)
			right = c.get(x+1, y-1)
			min := math.Min(math.Min(left, middle), right)
			// Set the minimum energy level.
			c.set(x, y, c.get(x, y)+min)
		}
		// Special cases: the far left and far right pixels have only two neighbors above.
		left = c.get(0, y) + math.Min(c.get(0, y-1), c.get(1, y-1))
		c.set(0, y, left)
		right = c.get(c.Width-1, y) + math.Min(c.get(c.Width-1, y-1), c.get(c.Width-2, y-1))
		c.set(c.Width-1, y, right)
	}
	return c.Points
}[/code]

Once we have calculated the minimum cumulative energy values for each row, we select the last row as the starting position and search for the pixel with the smallest cumulative energy value. Then we move up the matrix one row at a time, each time picking the smallest cumulative energy value among the (up to three) pixels directly above the current one, until we reach the first row. The pixels obtained this way make up the seam that should be removed.

[code lang="go"]// FindLowestEnergySeams find the lowest vertical energy seam.
func (c *Carver) FindLowestEnergySeams() []Seam {
	// Find the lowest cost seam from the energy matrix starting from the last row.
	var min = math.MaxFloat64
	var px int
	seams := make([]Seam, 0)

	// Find the pixel on the last row with the minimum cumulative energy and use this as the starting pixel
	for x := 0; x < c.Width; x++ {
		seam := c.get(x, c.Height-1)
		if seam < min {
			min = seam
			px = x
		}
	}
	seams = append(seams, Seam{X: px, Y: c.Height - 1})
	var left, middle, right float64

	// Walk up in the matrix table, check the immediate three top pixel seam level and
	// add the one which has the lowest cumulative energy.
	for y := c.Height - 2; y >= 0; y-- {
		middle = c.get(px, y)
		// Leftmost seam, no child to the left
		if px == 0 {
			right = c.get(px+1, y)
			if right < middle {
				px++
			}
			// Rightmost seam, no child to the right
		} else if px == c.Width-1 {
			left = c.get(px-1, y)
			if left < middle {
				px--
			}
		} else {
			left = c.get(px-1, y)
			right = c.get(px+1, y)
			min := math.Min(math.Min(left, middle), right)

			if min == left {
				px--
			} else if min == right {
				px++
			}
		}
		seams = append(seams, Seam{X: px, Y: y})
	}
	return seams
}[/code]
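
To see the backtracking in isolation, the following sketch runs the same walk on a small, hypothetical cumulative energy matrix stored as a plain 2D slice. The helper name `backtrackSeam` and the return format (one column index per row) are assumptions made for this example only.

```go
package main

import "fmt"

// backtrackSeam takes a cumulative energy matrix m (m[y][x]) and returns the
// column of the seam in every row, indexed by row.
func backtrackSeam(m [][]float64) []int {
	h, w := len(m), len(m[0])
	seam := make([]int, h)

	// Start at the cheapest cell of the last row.
	x := 0
	for i := 1; i < w; i++ {
		if m[h-1][i] < m[h-1][x] {
			x = i
		}
	}
	seam[h-1] = x

	// Move up one row at a time, staying within one column of the previous pick.
	for y := h - 2; y >= 0; y-- {
		best := x
		if x > 0 && m[y][x-1] < m[y][best] {
			best = x - 1
		}
		if x < w-1 && m[y][x+1] < m[y][best] {
			best = x + 1
		}
		x = best
		seam[y] = x
	}
	return seam
}

func main() {
	// Hypothetical cumulative energies, chosen only for illustration.
	m := [][]float64{
		{1, 4, 3, 5},
		{4, 3, 8, 5},
		{8, 6, 7, 7},
	}
	fmt.Println(backtrackSeam(m)) // prints [0 1 1]
}
```

The result reads top to bottom: column 0 in the first row, then column 1 in the second and third rows.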

Removing the seam requires two steps:

  1. Obtain the pixel coordinates of the seam.
  2. On each iteration, check whether the coordinates of the processed pixel match the position of a seam pixel.

We can check the accuracy of our logic by drawing the removable seams on top of the image. (Fig.4)

A content aware image with seams
Fig.4: Seams applied on the original image

 

[code lang="go"]// RemoveSeam remove the least important columns based on the stored energy (seams) level.
func (c *Carver) RemoveSeam(img *image.NRGBA, seams []Seam, debug bool) *image.NRGBA {
	bounds := img.Bounds()
	// Reduce the image width with one pixel on each iteration.
	dst := image.NewNRGBA(image.Rect(0, 0, bounds.Dx()-1, bounds.Dy()))

	for _, seam := range seams {
		y := seam.Y
		for x := 0; x < bounds.Max.X; x++ {
			if seam.X == x {
				if debug {
					dst.Set(x-1, y, color.RGBA{255, 0, 0, 255})
				}
				continue
			} else if seam.X < x {
				dst.Set(x-1, y, img.At(x, y))
			} else {
				dst.Set(x, y, img.At(x, y))
			}
		}
	}
	return dst
}[/code]

The same logic can be applied to enlarge images, only this time we compute the optimal vertical or horizontal seam s and duplicate its pixels by averaging them with their left and right neighbors (top and bottom in the horizontal case).
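
As a rough illustration of the enlarging step, the sketch below inserts one vertical seam into a grid of grayscale values. It keeps the seam pixel and places next to it a new pixel that is the mean of the seam pixel and its existing left and right neighbors. This is one possible reading of the averaging described above, and the name `insertSeam` and the data layout are assumptions for this example.

```go
package main

import "fmt"

// insertSeam widens a grayscale grid by one column. seam[y] is the column of
// the seam in row y. The seam pixel is kept, and a new pixel, the mean of the
// seam pixel and its existing left and right neighbors, is inserted after it.
func insertSeam(gray [][]float64, seam []int) [][]float64 {
	out := make([][]float64, len(gray))
	for y, row := range gray {
		x := seam[y]
		sum, n := row[x], 1.0
		if x > 0 {
			sum += row[x-1]
			n++
		}
		if x < len(row)-1 {
			sum += row[x+1]
			n++
		}
		widened := make([]float64, 0, len(row)+1)
		widened = append(widened, row[:x+1]...)
		widened = append(widened, sum/n)
		widened = append(widened, row[x+1:]...)
		out[y] = widened
	}
	return out
}

func main() {
	gray := [][]float64{
		{10, 20, 60},
		{30, 40, 50},
	}
	seam := []int{1, 0} // hypothetical seam: column 1 in row 0, column 0 in row 1
	for _, row := range insertSeam(gray, seam) {
		fmt.Println(row)
	}
	// Prints:
	// [10 20 30 60]
	// [30 35 40 50]
}
```

Inserting the same lowest-energy seam over and over would only stretch one strip of the image, so when enlarging by several pixels it helps to select a set of different seams first and then duplicate each of them.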

 

A content aware image
Fig.5: The final resized image

 

Below are some results for both shrunk and enlarged images:

Content aware images
Enlarged and shrunk content aware images

 

Face Detection In Content Aware Images

For content that includes faces, the relations between features are important. Automatic face detection can be used to identify the areas that need protection. This is an important requirement, since sensitive image regions like faces, even when they are picked up by the Sobel filter operator, can get distorted when they are compressed into a small area (see Fig.6).

To prevent this, once these regions are detected we can increase the pixel intensity to a very high level before running the edge detector. This way we can ensure that the Sobel detector will consider these regions important, which also means that they will receive high energy values.

A content aware image
The original photo

 

content aware images without face detection
Resized without face detection

 

content aware images with face detection
Resized with face detection

Conclusion

Seam carving is a technique that can be used for a variety of image manipulations, including:

  • Aspect Ratio Change
  • Image Retargeting
  • Object Removal
  • Content Amplification

It can also be integrated with a convolutional neural network trained for specific object recognition, making a strong toolset for content delivery solutions. There are numerous other domains this technique could be applied or extended to, such as video resizing or continuous resizing in real time.

As a concrete example, Twitter recently introduced a solution in their tech stack which crops the picture previews to their most interesting parts by training a neural network to recognize the picture zones a person is looking at when freely viewing the image.

The solution we presented can be easily integrated into this kind of tech stack, with potentially better results, since the removed area is not limited to the periphery of the picture but is freely adaptable.

Of course, every technology has its limitations. When the processed image is very condensed, in the sense that it does not contain “less” important areas, ugly artifacts might appear.

The algorithm also does not perform very well when the image, although not very condensed, is laid out in a manner that does not allow the seams to bypass some important parts. In certain situations these limitations can be overcome by tweaking the parameters, for example by using a higher Sobel threshold or applying a blur filter.

Please let us know if you have any questions left unanswered about making content aware images!
