Image segmentation is one of the most essential
steps in numerous image processing and computer vision tasks. Some formal
definitions of image segmentation are as follows:
In computer vision, image segmentation is the process of partitioning a digital
image into multiple segments (sets of pixels, also known as super-pixels).
Segmentation is a process of grouping together pixels that have similar attributes.
Segmentation is the process of partitioning an image into non-intersecting
regions such that each region is homogeneous and the union of no two adjacent
regions is homogeneous (Pal, 1994).
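The last definition can also be stated in set notation. The following is a sketch under stated assumptions: I denotes the set of all pixels, R_1, ..., R_n are the regions, and P(.) is a homogeneity predicate. These symbols are introduced here for illustration and do not appear in the original text.

```latex
\begin{align*}
&\bigcup_{i=1}^{n} R_i = I, \\
&R_i \cap R_j = \emptyset \quad \text{for all } i \neq j, \\
&P(R_i) = \text{true} \quad \text{for all } i, \\
&P(R_i \cup R_j) = \text{false} \quad \text{for all adjacent } R_i, R_j,\ i \neq j.
\end{align*}
```

The first two conditions say that the regions cover the image without overlapping; the last two say that each region is homogeneous and that merging two adjacent regions breaks homogeneity.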
Need for Segmentation:
In general, image processing can be categorized in three ways, according to the types
of input and output: low-level image processing, mid-level image processing,
and high-level image processing.
Low-level image processing: This category includes all the
processes that take an image as input and produce an image as output (Fig. 1).
Low-level processes involve primitive operations such as image preprocessing to
reduce noise, contrast enhancement, and image sharpening.
Fig. 1 Low-level image processing
Mid-level image processing:
A mid-level process is characterized by the fact that its inputs generally are
images, but its outputs are attributes extracted from those images (e.g.,
edges, contours, and the identity of individual objects) (Fig.2). Mid-level
processing on images involves tasks such as partitioning an image into regions
or objects (segmentation), description of those objects to reduce them to a
form suitable for computer processing, and classification (recognition) of
individual objects.
Fig. 2 Mid-level image processing
High-level image processing:
These are the processes that take an image as input and whose output may be in the form
of decisions. Such processes are the foundation of the field called computer vision, which can be defined as
all the tasks for acquiring, processing, analyzing and understanding digital
images, and for extracting high-dimensional data from the real world in order to
produce numerical or symbolic information, e.g., in the form of decisions.
Fig. 3 High-level image processing
From the above, it is clear that the last two categories depend on segmentation. Hence,
segmentation plays a crucial role in image analysis and computer vision.
Techniques of image segmentation:
The most basic attribute for segmentation is image
luminance amplitude for a monochrome image and color components for a color
image. Image edges and texture are also useful attributes for segmentation.
Fig. 4 Classification of image segmentation methods
1. Discontinuity-based Segmentation: Basically, there are three types of discontinuity
in a gray-level image: points, lines, and edges.
1.1 Point Detection:
An isolated point can be defined as a point whose gray level is significantly
different from its background and which is located in a homogeneous or nearly
homogeneous area. Using the masks shown in Fig. 5, we say that a point has been
detected at the location on which the mask
is centered if the absolute value of the mask response there is greater than some nonnegative threshold.
Basically, this formulation measures
the weighted differences between the center point and its neighbors. The idea is that an isolated point will be quite
different from its surroundings, and thus be easily detectable by this type of
mask. Results of point detection are shown in Fig. 6.
Fig. 5 3×3 Laplacian masks to detect points
Fig. 6 Example of point detection: (a) Input image,
(b) Result of point detection using the mask in Fig. 5(b), (c) Result
after applying thresholding
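Matlab code for point detection:
m1_point_lapl = [-1 -1 -1; -1 8 -1; -1 -1 -1];   % 3x3 Laplacian point-detection mask
I = imread('input_image.jpg');                   % assumed to be a grayscale image
I1 = imfilter(double(I), m1_point_lapl);         % mask response at every pixel
T = 0.9 * max(abs(I1(:)));                       % assumed threshold (not given in the source); tune per image
I2 = abs(I1) > T;                                % keep only strong responses
figure, imshow(I2)
title('Image after thresholding')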
1.2 Line Detection:
While edges (i.e. boundaries
between regions with relatively distinct gray levels) are by far the most
common type of discontinuity in an image, instances of thin lines in an image
occur frequently enough that it is useful to have a separate mechanism for
detecting them. Here we present a convolution based
technique which produces an image description of the thin lines in an input
image. Note that the Hough
transform can be used to detect lines; however, in that case, the
output is a parametric description
of the lines in an image.
The line detection operator consists of a
convolution kernel tuned to detect the presence of lines of a particular width
n, at a particular orientation θ. Fig. 7 shows a collection of four such
kernels, which each respond to lines of single pixel width at the particular
orientation shown.
Fig. 7 Masks to detect single-pixel-wide lines: (a) mask to detect horizontal lines, (b) mask to detect lines
oriented at 135°, (c) mask to detect vertical lines, (d) mask to
detect lines oriented at 45°
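In practice, we run every mask over the image and combine the responses:
R(x, y) = max(|R1(x, y)|, |R2(x, y)|, |R3(x, y)|, |R4(x, y)|)
where Ri(x, y) is the response of the i-th mask at pixel (x, y). If R(x, y) > T for a chosen threshold T, then a discontinuity (a line point) is declared at (x, y).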
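The combination rule can be tried on a small synthetic image. The sketch below is only an illustration: the 9×9 test image, the threshold T = 4, and all variable names are assumptions chosen for this toy case, and conv2 from base MATLAB stands in for imfilter.

```matlab
% Synthetic 9x9 test image: dark background with one vertical line in column 5.
I = zeros(9, 9);
I(:, 5) = 1;

% The four 3x3 line masks: horizontal, 45 degrees, vertical, 135 degrees.
masks = cat(3, ...
    [-1 -1 -1;  2  2  2; -1 -1 -1], ...
    [-1 -1  2; -1  2 -1;  2 -1 -1], ...
    [-1  2 -1; -1  2 -1; -1  2 -1], ...
    [ 2 -1 -1; -1  2 -1; -1 -1  2]);

% Response of every mask; 'same' keeps the output the size of the input.
% All four masks are unchanged by a 180-degree rotation, so convolution
% (conv2) and correlation give identical results here.
Rk = zeros([size(I), 4]);
for k = 1:4
    Rk(:, :, k) = conv2(I, masks(:, :, k), 'same');
end

% Combine: R(x, y) = max over k of |R_k(x, y)|, then threshold.
[R, best] = max(abs(Rk), [], 3);   % best(x, y) = index of the winning mask
T = 4;                             % assumed threshold for this toy image
lines_found = R > T;
```

Here `best` records which mask produced the maximum, so it also gives a coarse estimate of the line orientation at each detected pixel.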
Fig. 8 Line detection: (a) input image, (b) horizontal lines highlighted, (c) vertical lines
highlighted, (d) lines oriented at 45° highlighted, (e) lines oriented at 135° highlighted
Fig. 9 Example of line detection in a sample image
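Matlab code for line detection:
% Mask variable names below are placeholders; the original names are lost in the source.
m_horiz = [-1 -1 -1; 2 2 2; -1 -1 -1];     % horizontal lines
m_45    = [-1 -1 2; -1 2 -1; 2 -1 -1];     % lines oriented at 45 degrees
m_vert  = [-1 2 -1; -1 2 -1; -1 2 -1];     % vertical lines
m_135   = [2 -1 -1; -1 2 -1; -1 -1 2];     % lines oriented at 135 degrees
I = imread('input_image.jpg');             % assumed to be a grayscale image
I_h = imfilter(double(I), m_horiz);        % response of the horizontal mask
figure, imshow(abs(I_h), [])               % repeat the last two lines for the other masks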
Fig. 10 Line detection in an image
1.3 Edge detection
An edge is defined as a sudden change in
intensity, i.e. a boundary between regions whose pixel intensity values differ markedly.
There are many ways to perform edge detection. However, edge detection methods can
be grouped into two categories: gradient based and Laplacian zero crossing
based. The gradient method detects the edges by looking for the maximum and minimum
in the first derivative of the image. The Laplacian method searches for zero crossings
in the second derivative of the image to find edges. Edge
detection is one of the fundamental steps in image processing, image analysis,
image pattern recognition, and computer vision techniques. A complete
classification is given in Fig. 12.
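The difference between the two categories is easiest to see in one dimension. The sketch below uses a made-up intensity profile across a smoothed edge (the values and variable names are assumptions for illustration) and locates the edge both ways with finite differences.

```matlab
% 1-D intensity profile across a smoothed (ramp-like) edge.
f = [10 10 10 15 30 50 65 70 70 70];

d1 = diff(f);       % first derivative (forward differences)
d2 = diff(f, 2);    % second derivative (second differences)

% Gradient view: the edge sits where |d1| is largest.
[peak, k1] = max(abs(d1));

% Laplacian view: the edge sits where d2 changes sign.
% zc = k means the sign change lies between d2(k) and d2(k+1),
% which straddles d1(k+1).
zc = find(d2(1:end-1) .* d2(2:end) < 0);
```

Both criteria point at the same place: the first derivative peaks at d1(5), and the second derivative changes sign between d2(4) and d2(5), which brackets that same sample.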
In general, edges are of four types:
step, line, ramp and roof edges. Step edges are where the image intensity
abruptly changes from one value on one side of the discontinuity to a different
value on the opposite side. In line edges, the image intensity abruptly changes
value but then returns to the starting value within some short distance. In
real images, step and line edges are very rare: because of low frequency
components or the smoothing introduced by most sensing devices, sharp
discontinuities rarely exist in real signals. Due to smoothing, step edges
become ramp edges and line edges become roof edges, where intensity changes are
not instantaneous but occur over a finite distance. Illustrations of these edge
shapes are shown in Fig. 11.
Fig. 11 Types of edges: (a) step
edge, (b) ramp edge, (c) line edge, (d) roof edge
1.3.1 Gradient-based edge detection
The Prewitt Detection:
The Prewitt edge detector is an appropriate way to estimate the magnitude and
orientation of an edge. Although differential gradient edge detection needs a
rather time consuming calculation to estimate the orientation from the
magnitudes in the x and y-directions, the compass edge detection obtains the
orientation directly from the kernel with the maximum response.
Fig. 12 Classification of edge detection techniques
Fig. 13 Roberts mask
Fig. 13 Edge Detection using
gradient based techniques: (a) input image, (b) Result of Prewitt mask, (c)
Results of Sobel mask, (d) Results of Roberts mask
The Prewitt edge detection technique was proposed to overcome the problem faced in
Sobel edge detection due to the absence of smoothing modules. The operator
adds a vector value in order to provide smoothing.
The Roberts Detection:
The Roberts Cross operator performs a simple, quick to compute, 2-D spatial
gradient measurement on an image. It thus highlights regions of high spatial
frequency which often correspond to edges. In its most common usage, the input
to the operator is a grayscale image, as is the output. Pixel values at each point
in the output represent the estimated absolute magnitude of the spatial
gradient of the input image at that point.
The Roberts Cross edge detector is based on the 2-D spatial gradient measurement of
the image. Edges are detected from the high spatial frequencies. The
magnitude of the spatial gradient of the input image at each pixel
is provided as the output for that pixel, giving a grayscale image. The convolution
kernels used are shown in Fig. 13.
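As a minimal sketch of the Roberts Cross measurement, the code below applies the usual pair of 2×2 kernels to a synthetic step edge. The test image and variable names are assumptions for illustration, and conv2 from base MATLAB is used so that no toolbox is required.

```matlab
% Synthetic image: dark left half, bright right half (a vertical step edge).
I = [zeros(6, 3), 100 * ones(6, 3)];

% 2x2 Roberts Cross kernels (one is the other rotated by 90 degrees).
kx = [1 0; 0 -1];
ky = [0 1; -1 0];

% 'valid' keeps only positions where the 2x2 window fits inside the image.
% conv2 flips the kernels, which changes only the sign of the responses,
% not the gradient magnitude computed below.
Gx = conv2(I, kx, 'valid');
Gy = conv2(I, ky, 'valid');

G_exact  = sqrt(Gx.^2 + Gy.^2);   % gradient magnitude
G_approx = abs(Gx) + abs(Gy);     % cheaper approximation often used in practice
```

The response is nonzero only in the output column that straddles the step, which is the behaviour described above: high output where the spatial frequency is high.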
The Sobel Detection:
The Sobel method is similar to the Roberts operator. It finds the approximate absolute
gradient magnitude at each point. Here the operator consists of a pair of 3×3 convolution
kernels. One kernel is the other rotated by 90 degrees. Finally, the gradient
magnitude is thresholded. The magnitude and orientation are given by
|G| = sqrt(Gx^2 + Gy^2), θ = arctan(Gy / Gx).
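The magnitude and orientation formulas can be checked on a toy image. This is a sketch under stated assumptions: the test image, the threshold T, and the variable names are illustrative, and atan2 is used in place of a plain arctangent so that the case Gx = 0 is handled.

```matlab
% Synthetic image with a vertical step edge between columns 3 and 4.
I = [zeros(5, 3), 10 * ones(5, 3)];

% Sobel kernels: sx responds to horizontal changes, sy = sx' to vertical ones.
sx = [-1 0 1; -2 0 2; -1 0 1];
sy = sx';

% conv2 flips the kernel, so rotate it by 180 degrees first to obtain
% correlation (the convention used by imfilter). 'valid' avoids border effects.
Gx = conv2(I, rot90(sx, 2), 'valid');
Gy = conv2(I, rot90(sy, 2), 'valid');

Gmag   = sqrt(Gx.^2 + Gy.^2);     % |G|
Gtheta = atan2(Gy, Gx);           % orientation in radians

T = 20;                           % assumed threshold for this toy image
edges = Gmag > T;
```

For this image the gradient points along the positive x direction, so the orientation at the edge pixels is 0 and only the two output columns next to the step exceed the threshold.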
Matlab code for edge detection
using gradient based filters
I = imread('input_image.jpg');   % assumed to be a grayscale (2-D) image
figure, imshow(I) % Input image
I1 = edge(I, 'prewitt');
I2 = edge(I, 'sobel');
I3 = edge(I, 'roberts');
figure, imshow(I1) % RESULTS OF Prewitt mask
figure, imshow(I2) % RESULTS OF Sobel mask
figure, imshow(I3) % RESULTS OF Roberts mask
The Canny Detection:
The Canny edge detector is regarded as one of the best edge detectors currently in use.
Canny’s edge detector ensures good noise immunity and at the same time detects
true edge points with minimum error. Canny optimized the edge detection
with regard to the following criteria:
1. Maximizing the signal-to-noise ratio of the gradient.
2. An edge localization factor, which ensures that
the detected edge is localized as accurately as possible.
3. Minimizing multiple responses to a single edge.
The steps of the Canny algorithm are as follows:
1. Smoothing: Blur the image to remove noise
by convolving it with a Gaussian filter.
2. Finding gradients: Edges should be marked
where the gradient of the image has a large magnitude. The gradient is found
by feeding the smoothed image through a convolution operation with
the derivative of the Gaussian in both the vertical and horizontal directions.
3. Non-maximum suppression: Only local maxima should
be marked as edges. This step finds the local maxima in the direction of the gradient
and suppresses all others, minimizing false edges.
4. Double thresholding: Potential edges are
determined by thresholding. Instead of using a single static threshold value
for the entire image, the Canny algorithm introduced hysteresis thresholding,
which has some adaptivity to the local content of the image. There are two threshold
levels, th (high) and tl (low), where th > tl. Pixel values above the th value
are immediately classified as edges, and pixel values between tl and th are kept as candidate (weak) edges.
5. Edge tracking by hysteresis: Final edges are
determined by suppressing all edges that are not connected to a very strong
edge.
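Steps 4 and 5 can be sketched on a toy gradient-magnitude map. This is a simplified illustration, not a full Canny implementation: the matrix M, the threshold values, and the variable names are assumptions, and the tracking is done by repeatedly growing the strong set into 8-connected weak pixels.

```matlab
% Toy gradient-magnitude map (as it might look after non-maximum suppression).
M = [0 0 0 0 0 0 0;
     0 9 5 5 0 0 0;
     0 0 0 0 0 5 0;
     0 0 0 0 0 5 0;
     0 0 0 0 0 0 0];

th = 8;   % high threshold (assumed value)
tl = 4;   % low threshold (assumed value), th > tl

strong = M > th;                 % immediately accepted as edges
weak   = (M > tl) & ~strong;     % candidates, kept only if linked to a strong edge

% Grow the edge set from the strong pixels into 8-connected weak pixels
% until nothing changes.
edges = strong;
changed = true;
while changed
    grown = conv2(double(edges), ones(3), 'same') > 0;   % 3x3 dilation
    new_edges = edges | (grown & weak);
    changed = ~isequal(new_edges, edges);
    edges = new_edges;
end
```

The weak pixels in row 2 survive because they chain back to the strong pixel at (2,2), while the two weak pixels in column 6 are suppressed because no strong pixel is connected to them.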
Marker-Controlled Watershed Segmentation:
This example shows how to use watershed segmentation to separate touching objects in
an image. The watershed transform is often applied to this problem. The
watershed transform finds “catchment basins” and “watershed
ridge lines” in an image by treating it as a surface where light pixels
are high and dark pixels are low.
Segmentation using the watershed transform works
better if you can identify, or “mark,” foreground objects and
background locations. Marker-controlled watershed segmentation follows this basic
procedure:
1. Compute a segmentation function. This is an image
whose dark regions are the objects you are trying to segment.
2. Compute foreground markers. These are connected
blobs of pixels within each of the objects.
3. Compute background markers. These are pixels that
are not part of any object.
4. Modify the segmentation function so that it only
has minima at the foreground and background marker locations.
5. Compute the watershed transform of the modified
segmentation function.
This example highlights
many different Image Processing Toolbox™ functions, including fspecial,
imfilter, watershed, label2rgb, imopen, imclose, imreconstruct, imcomplement,
imregionalmax, bwareaopen, graythresh, and imimposemin.
Step 1: Read in the Color Image and Convert it to Grayscale
rgb = imread('pears.png');
I = rgb2gray(rgb);
imshow(I)
text(732,501,'Image courtesy of Corel(R)')   % further text styling arguments are elided in the source
Step 2: Use the Gradient Magnitude as the Segmentation Function
Use the Sobel edge masks, imfilter, and
some simple arithmetic to compute the gradient magnitude. The gradient is high
at the borders of the objects and low (mostly) inside the objects.
hy = fspecial('sobel');
hx = hy';
Iy = imfilter(double(I), hy, 'replicate');
Ix = imfilter(double(I), hx, 'replicate');
gradmag = sqrt(Ix.^2 + Iy.^2);
figure
imshow(gradmag,[]), title('Gradient magnitude (gradmag)')
Step 3: Mark the Foreground Objects
A variety of procedures could be applied here to find the
foreground markers, which must be connected blobs of pixels inside each of the
foreground objects. In this example you’ll use morphological techniques called
“opening-by-reconstruction” and “closing-by-reconstruction”
to “clean” up the image. These operations will create flat maxima
inside each object that can be located using imregionalmax.
Opening is an erosion followed by a dilation, while
opening-by-reconstruction is an erosion followed by a morphological
reconstruction. Let’s compare the two. First, compute the opening using imopen.
se = strel('disk', 20);
Io = imopen(I, se);
figure
imshow(Io), title('Opening (Io)')
Next compute the opening-by-reconstruction using imerode and imreconstruct.
Ie = imerode(I, se);
Iobr = imreconstruct(Ie, I);
figure
imshow(Iobr), title('Opening-by-reconstruction (Iobr)')
Following the opening with a closing can remove the dark spots
and stem marks. Compare a regular morphological closing with a
closing-by-reconstruction. First try imclose:
Ioc = imclose(Io, se);
figure
imshow(Ioc), title('Opening-closing (Ioc)')
Now use imdilate followed by imreconstruct.
Notice you must complement the image inputs and output of imreconstruct.
Iobrd = imdilate(Iobr, se);
Iobrcbr = imreconstruct(imcomplement(Iobrd), imcomplement(Iobr));
Iobrcbr = imcomplement(Iobrcbr);
figure
imshow(Iobrcbr), title('Opening-closing by reconstruction (Iobrcbr)')
As you can see by comparing Iobrcbr with Ioc,
reconstruction-based opening and closing are more effective than standard
opening and closing at removing small blemishes without affecting the overall
shapes of the objects. Calculate the regional maxima of Iobrcbr to obtain good foreground markers.
fgm = imregionalmax(Iobrcbr);
figure
imshow(fgm), title('Regional maxima of opening-closing by reconstruction (fgm)')
To help interpret the result, superimpose the foreground marker
image on the original image.
I2 = I;
I2(fgm) = 255;
figure
imshow(I2), title('Regional maxima superimposed on original image (I2)')
Notice that some of the mostly-occluded and shadowed objects are
not marked, which means that these objects will not be segmented properly in
the end result. Also, the foreground markers in some objects go right up to the
objects’ edge. That means you should clean the edges of the marker blobs and
then shrink them a bit. You can do this by a closing followed by an erosion.
se2 = strel(ones(5,5));
fgm2 = imclose(fgm, se2);
fgm3 = imerode(fgm2, se2);
This procedure tends to leave some stray isolated pixels that
must be removed. You can do this using bwareaopen, which
removes all blobs that have fewer than a certain number of pixels.
fgm4 = bwareaopen(fgm3, 20);
I3 = I;
I3(fgm4) = 255;
figure
imshow(I3)
title('Modified regional maxima superimposed on original image (fgm4)')
Step 4: Compute Background Markers
Now you need to mark the background. In the cleaned-up image, Iobrcbr, the
dark pixels belong to the background, so you could start with a thresholding
operation.
bw = imbinarize(Iobrcbr);
figure
imshow(bw), title('Thresholded opening-closing by reconstruction (bw)')
The background pixels are in black, but ideally we don’t want
the background markers to be too close to the edges of the objects we are
trying to segment. We’ll “thin” the background by computing the
“skeleton by influence zones”, or SKIZ, of the foreground of bw. This
can be done by computing the watershed transform of the distance transform of bw, and
then looking for the watershed ridge lines (DL == 0) of
the result.
D = bwdist(bw);
DL = watershed(D);
bgm = DL == 0;
figure
imshow(bgm), title('Watershed ridge lines (bgm)')
Step 5: Compute the Watershed Transform of the Segmentation Function
The function imimposemin can be used to modify an image so that it
has regional minima only in certain desired locations. Here you can use imimposemin to modify the gradient magnitude image so
that its only regional minima occur at foreground and background marker pixels.
gradmag2 = imimposemin(gradmag, bgm | fgm4);
Finally we are ready to compute the watershed-based
segmentation.
L = watershed(gradmag2);
Step 6: Visualize the Result
One visualization technique is to superimpose the foreground
markers, background markers, and segmented object boundaries on the original
image. You can use dilation as needed to make certain aspects, such as the
object boundaries, more visible. Object boundaries are located where L == 0.
I4 = I;
I4(imdilate(L == 0, ones(3, 3)) | bgm | fgm4) = 255;
figure
imshow(I4)
title('Markers and object boundaries superimposed on original image (I4)')
This visualization illustrates how the locations of the
foreground and background markers affect the result. In a couple of locations,
partially occluded darker objects were merged with their brighter neighbor
objects because the occluded objects did not have foreground markers.
Another useful visualization technique is to display the label
matrix as a color image. Label matrices, such as those produced by watershed and bwlabel, can
be converted to truecolor images for visualization purposes by using label2rgb.
Lrgb = label2rgb(L, 'jet', 'w', 'shuffle');
figure
imshow(Lrgb)
title('Colored watershed label matrix (Lrgb)')
You can use transparency to superimpose this pseudo-color label
matrix on top of the original intensity image.
figure
imshow(I)
hold on
himage = imshow(Lrgb);
himage.AlphaData = 0.3;
title('Lrgb superimposed transparently on original image')