# Automatic object detection and segmentation of the histocytology images using reshapable agents.

INTRODUCTION

Automated image analysis of cells and tissues has been an active research field in biomedical informatics for the past three decades (Mulrane et al., 2008; Gurcan et al., 2009). It has recently attracted increased attention due to developments in computer and microscopy hardware. With the continual improvement of microscopy imaging technology, an increasing volume of high-quality medical images is becoming available. This huge volume of images, both in routine clinical work and in research and development, calls for an increasing degree of automation of image analysis processes.

The aim of any segmentation method is to extract boundary elements belonging to the same structure and integrate these elements into a coherent and consistent model of the structure. The recent literature suggests that the search for robust and practical cell/nuclei segmentation methods, a critical step of automated image analysis, is still ongoing. According to Wang et al. (2007), automated approaches to segmentation can be categorized into three classes: supervised (model based), unsupervised (inspired by low-level image properties) and weakly supervised approaches.

Supervised segmentation methods use classification algorithms such as k-nearest-neighbor, Bayes classifiers, neural networks, and Support Vector Machines (SVMs). These classifiers learn models of the characteristics of different tissue types from labeled examples and adapt the resulting models to segment new images. Supervised algorithms can be slow to train and may require a substantial amount of manually segmented data. In contrast, unsupervised segmentation methods divide an image into homogeneous regions based on an objective measure of homogeneity. Such unsupervised techniques do not require any training data. However, they can lead to groupings that do not correspond to the desired conceptual tissue categories. Furthermore, their ad-hoc nature prevents them from being applied to a wide range of microscopy images. Weakly supervised approaches arise from the idea of combining a large amount of unlabeled data, which is often easy to obtain, with a small amount of labeled data, which is hard to obtain because it requires human experts.

To overcome the segmentation problem in histocytology images, a number of different approaches have been proposed in the context of various biomedical applications. Thresholding approaches with fixed or adaptive threshold values are the most straightforward segmentation techniques; they were employed in Korde et al. (2009) for nuclei segmentation of bladder and skin tissue images. However, global thresholding assumes that the nuclei have a range of intensities that is sufficiently different from the background. This is generally not true, since the background varies significantly. The result may be improved by adaptive thresholding, but large intensity variations between and within the nuclei will still cause the segmentation procedure to fail.

Moreover, there have been other approaches that incorporate more complex segmentation techniques such as region growing, active contours and edge/contour based methods. Region growing methods are based on the assumption that the objects consist of connected regions of similar pixels. Region growing and merging methods are commonly used for segmentation of cells and nuclei from fluorescence microscopy images as shown in Adiga et al. (2006). However, large intensity variations between and within the nuclei may cause these methods to fail.

Although active contour models, or snakes, are widely used in medical image segmentation, these methods are sensitive to the initialization of a start contour or a seed inside each object of interest. Furthermore, they may lead to poor segmentation results when applied to cluttered images. Several methods have been proposed to adapt active contours to the nature of histocytology images. For instance, the method suggested in Ali and Madabhushi (2012) uses an active contour algorithm for segmentation of histopathological images of breast and prostate tissues.

Edge/contour based methods are another class of methods that have been widely used in medical image segmentation. Typically they are considered unsupervised or automatic methods. In recent years there have been efforts to apply these methods to histocytology images. For instance, a gradient flow tracking method for segmentation of touching cells is proposed in Li et al. (2008). Another recent study, Wienert et al. (2012), proposes a contour-based cell detection and segmentation algorithm. However, these methods can fail due to the complicated spatial and color patterns of histocytology images.

The aim of this paper is to develop a method for localization and segmentation of the target objects in histocytology images. Unlike fully automatic (unsupervised) methods, which are suitable only for very specific kinds of microscopy images, the proposed method is model based. That is, to achieve a more flexible and general solution, the features of sample objects of interest are given to the system through a training stage. The framework of the proposed approach is depicted in Fig. 1. The input images are processed in five steps, including preprocessing, object detection by a rectangular window, stochastic reshaping and evaluation of the contour's cost.

Localization of the potential objects is carried out by scanning the whole image and matching its rectangular regions with a template obtained from the training stage. Afterwards, the contours of the detected rectangular regions are reshaped iteratively to achieve finer segmentation levels. An iterative stochastic contour reshaping algorithm is proposed to reshape the contours so that they fit the objects of interest properly. The reshaping process is controlled by a cost function that includes prior shape, regional texture and gradient terms. The performance of the proposed method is evaluated at both the detection (rectangular regions) and finer segmentation levels and compared with the well-known region growing method. The precision and recall measures are used to assess object localization. Furthermore, the segmentation performance is compared against the manually segmented ground truth using the Jaccard and Zijdenbos similarity indices.

The remainder of this paper is organized as follows. The next section introduces the dataset and describes the preprocessing stage. Rectangular detection of the target objects is described next. Afterwards, we describe the stochastic contour reshaping algorithm, which is proposed for finer segmentation of the rectangular detected objects. The results section reports the detection and segmentation performance of the proposed method. Finally, we conclude with a discussion of our results.

MATERIALS AND METHODS

As a publicly available dataset, the acute lymphoblastic leukemia image database (ALL-IDB1) is employed in our experiments. It is introduced in Labati et al. (2011) and includes 109 images in JPG format with 24 bit color depth. The images are captured with a PowerShot G5 camera and their resolution is 2592 x 1944. The dataset contains about 39000 blood elements, where the lymphoblasts (immature lymphocytes) as the target objects have been labeled by the oncology experts. The number of labeled lymphoblasts presented in the ALL-IDB1 is 510. An example image of the dataset is depicted in Fig. 2. The target objects are identified by the yellow dots.

PREPROCESSING

Prior to applying the method to the dataset images, a preprocessing step is carried out. This step consists of color quantization of the images. Since co-occurrence matrices are used to describe the image content, their dimensionality can become too large, depending on the intensity or color ranges. Fortunately, stained blood smear or tissue images have a considerably limited color spectrum. As confirmed by the sample image presented in Fig. 2, there are few dominant colors (hues of blue, purple and pink) in images obtained from staining techniques. This allows us to efficiently reduce the color space down to k quantized colors using a uniform quantization algorithm. To avoid large and sparse color co-occurrence matrices, which degrade the time performance of the method, the number of quantized colors should be small. On the other hand, it should be large enough that different regions of the image can be described and identified correctly. Depending on the spatial structure and color patterns of the images, the value of k could be 16, 32 or even more quantized colors. In this paper, considering time efficiency and detection performance, we have experimentally selected 32 quantized colors.
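As an illustration of the quantization step, the following is a minimal sketch of uniform color quantization in Python with NumPy. The function name `quantize_uniform` and the split of 32 colors into 4 x 4 x 2 bins for the R, G and B channels are assumptions made for this example; the text only states that a uniform quantization algorithm with k = 32 was used.

```python
import numpy as np

def quantize_uniform(rgb, levels=(4, 4, 2)):
    """Map an HxWx3 uint8 image to color indices in [0, k).

    k is the product of `levels` (bins per R, G, B channel).
    The default 4 * 4 * 2 = 32 split is an assumed choice.
    """
    rgb = np.asarray(rgb, dtype=np.uint16)
    index = np.zeros(rgb.shape[:2], dtype=np.int32)
    for channel, n_bins in enumerate(levels):
        # Uniform bins over 0..255: value * n_bins // 256 lies in 0..n_bins-1.
        bin_id = (rgb[..., channel] * n_bins) // 256
        # Combine the per-channel bins into a single color index.
        index = index * n_bins + bin_id
    return index
```

The resulting index image can be used directly as input to a co-occurrence computation, since every pixel then carries one of k color labels.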

TRAINING

A training set [S.sub.l] of regions representing the structure of interest (lymphoblasts) is obtained before the detection procedure starts. Each region Reg [member of] [S.sub.l] of the training set consists of a set of pixels whose texture characteristics are described by color co-occurrence matrices, as shown in Kovalev et al. (2011). The texture is sufficiently represented by a three-dimensional matrix W([[DELTA].sub.ij], [c.sub.i], [c.sub.j]), where [c.sub.i] and [c.sub.j] are indices of suitably quantized RGB color intensities of the pixels i and j, [[DELTA].sub.ij] is the Euclidean distance between pixels i = ([x.sub.i], [y.sub.i]) and j = ([x.sub.j], [y.sub.j]), and W is the frequency of the spatial occurrence of such elementary image structures on the image plane.
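The co-occurrence descriptor can be sketched as follows for a rectangular region of quantized color indices. This is only an illustrative reading of the definition above: the function name `color_cooccurrence`, the rounding of the Euclidean distance to integer bins, and the cut-off `d_max` are assumptions, not details given in the text.

```python
import math
import numpy as np

def color_cooccurrence(indexed, k, d_max=3):
    """Return W of shape (d_max, k, k).

    W[d - 1, ci, cj] counts pixel pairs with colors ci and cj whose
    Euclidean distance, rounded to an integer, equals d (assumed binning).
    """
    indexed = np.asarray(indexed)
    h, w = indexed.shape
    W = np.zeros((d_max, k, k), dtype=np.int64)
    for dy in range(0, d_max + 1):
        for dx in range(-d_max, d_max + 1):
            if dy == 0 and dx <= 0:
                continue  # visit each unordered pixel pair only once
            if dy >= h or abs(dx) >= w:
                continue  # offset does not fit inside the region
            d = int(round(math.hypot(dy, dx)))
            if d < 1 or d > d_max:
                continue
            # Pixels a and their partners b = a + (dy, dx), both inside the region.
            a = indexed[0:h - dy, max(0, -dx):w - max(0, dx)]
            b = indexed[dy:h, max(0, dx):w - max(0, -dx)]
            np.add.at(W[d - 1], (a.ravel(), b.ravel()), 1)
            np.add.at(W[d - 1], (b.ravel(), a.ravel()), 1)  # keep W symmetric
    return W
```

Flattening W with `W.ravel()` gives a feature vector of length d_max * k * k, which is one way to obtain the vectors compared in the feature space.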

As can be seen from Fig. 2, the target objects (lymphoblasts) are similar to each other, whereas the remaining objects of the non-target classes (i.e., any types of objects except the lymphoblasts) are visually different from the target class. This is confirmed by multidimensional scaling analysis of the texture feature vectors of samples belonging to the target and non-target classes. Fig. 3 depicts an approximated 2D distance distribution of the scaled feature vectors of these samples. The green circles in Fig. 3 represent the scaled vectors of the target class, the red ones represent the background of the images (red blood cells), and the black, yellow and pink icons represent other types of white blood cells such as basophils, eosinophils, monocytes and non-blast lymphocytes. The target samples are close to each other in the feature space, whereas samples of the non-target classes are spread out. Since the lymphoblasts can be distinguished and separated from the others by their distance in the feature space, we considered lymphoblast as the only class in the training set, and all the remaining objects are considered as background.

For these reasons, there is no need to annotate many objects as training samples. The ratio of annotated items to the total number of target objects is about 10% (51 annotated out of 510 manually labeled target items). Moreover, due to the homogeneity of the target objects, they can be distinguished from the others by a proper distance threshold in the feature space, as confirmed in Fig. 3. Therefore, there is no need to employ classifiers such as SVMs.

DETECTION OF THE TARGET OBJECTS

The detection procedure is carried out by scanning the images with a rectangular scanning object. Since the size of lymphoblasts is homogeneous, the size of the rectangular scanning object, [s.sub.rec], is determined from the average size of the bounding boxes in the training set. The detection procedure consists of three main steps, which are executed iteratively until the image has been scanned completely. The first step is moving the rectangular scanning object over the image. It starts from the (0, 0) coordinate of the image and moves to the end of it row by row in 10-pixel increments. The second step is texture feature extraction for the region identified by the rectangular object, which is carried out by computing the co-occurrence matrix (W) of that region. The third and most critical step is the classification of the rectangular region. Whether a region belongs to the target class or not is decided by measuring the distance between the feature vector of that region and the centroid of the training set in the feature space. Let v [member of] [R.sup.n] denote the co-occurrence matrix of an image region identified by the rectangular scanning object during the detection process, and let u [member of] [R.sup.n] denote the texture vector of the training set's centroid, which is obtained from the co-occurrence matrices of the training set items. To classify the region into the proper class, the v and u vectors are normalized via:

$$\bar{v} = \frac{v}{\lVert v \rVert}, \qquad \bar{u} = \frac{u}{\lVert u \rVert}, \tag{1}$$

then the city block distance of normalized vectors is determined by:

$$d_c = \sum_{i=1}^{n} \left| \bar{v}_i - \bar{u}_i \right|. \tag{2}$$

Let [[bar.d].sub.c] denote the distance [d.sub.c] normalized to [0, 1]. If the value of [[bar.d].sub.c] is less than or equal to a distance threshold [t.sub.dist], [t.sub.dist] [member of] [0, 1], the region is regarded as a target region; otherwise it is assigned to the background class. To optimize the detection performance, a range of [t.sub.dist] values is examined in our experiments and the optimum [t.sub.dist] value is selected based on the precision and recall values of the detection procedure.
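The scanning and classification steps can be sketched as follows. This is a minimal illustration under stated assumptions: the norm in Eq. 1 is taken to be the L1 norm (the text does not say which norm is used), so that the city block distance of two non-negative normalized vectors is at most 2 and can be mapped to [0, 1] by halving. The function names are hypothetical; the default threshold 0.53 is the optimum value reported in the results section.

```python
import numpy as np

def normalized_city_block(v, u):
    """City block distance of Eq. 2 between normalized vectors, scaled to [0, 1]."""
    v = np.asarray(v, dtype=float).ravel()
    u = np.asarray(u, dtype=float).ravel()
    v_bar = v / np.abs(v).sum()  # Eq. 1 with an assumed L1 norm
    u_bar = u / np.abs(u).sum()
    d_c = np.abs(v_bar - u_bar).sum()  # Eq. 2
    # Non-negative unit-sum vectors differ by at most 2, so halve the distance.
    return d_c / 2.0

def is_target(v, u, t_dist=0.53):
    """Classify a region as target if its normalized distance is within t_dist."""
    return normalized_city_block(v, u) <= t_dist

def scan_positions(height, width, win_h, win_w, step=10):
    """Yield top-left (row, col) corners of the scanning window, row by row."""
    for y in range(0, height - win_h + 1, step):
        for x in range(0, width - win_w + 1, step):
            yield y, x
```

In use, each position returned by `scan_positions` would select a window of the quantized image, whose flattened co-occurrence matrix is passed as `v` together with the centroid `u` of the training set.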

The graphical output of applying detection procedure to an example image is depicted in Fig. 4. The yellow rectangles show the correctly detected target objects whereas the red ones indicate the false alarms.

This phase results in the localization (detection) of the target objects. The contour pixels of a rectangular agent that indicates a target object are regarded as the initial set of pixels of a reshapable agent. The agents are reshaped in the next phase of the method's processing pipeline to achieve finer segmentation of the target objects.

STOCHASTIC RESHAPING

To accomplish segmentation of the targets, a stochastic contour reshaping algorithm is proposed. It is a customized implementation of region-, shape- and gradient-based active contours. The notion of active contours is adapted to make it applicable to a range of histology/cytology images. The basic idea of the active contour model is to evolve a curve, subject to constraints from a given image I, in order to reach an optimal state and thereby outline the object. Active contours and level set methods have been widely used and progressively improved for medical image segmentation. The concept of active contours was introduced in Kass et al. (1988) for segmentation of objects in images using dynamic curves.

Let [OMEGA] be a bounded open subset of [R.sup.2], with [partial derivative][OMEGA] as its boundary. Let I : [bar.[OMEGA]] [right arrow] R be a given image. Usually, [bar.[OMEGA]] is a rectangle in the plane and I takes values between 0 and 255. Denote by C(s): [0, 1] [right arrow] [R.sup.2] a parameterized curve. According to Chan and Vese (2001), the energy functional of the active contour model is expressed as:

E(C) = [E.sub.1](C) - [E.sub.2](C). (3)

The first term controls the rigidity and elasticity of the contour and represents the internal energy of the active contour and is defined as:

[E.sub.1](C) = [alpha] [[integral].sup.1.sub.0] [[absolute value of C'(s)].sup.2] ds + [beta][[integral].sup.1.sub.0] [[absolute value of C"(s)].sup.2] ds, (4)

where [alpha], [beta] are positive parameters. The second term ([E.sub.2](C)) represents external energy which attracts the model to the target objects in the image I and is defined as:

[E.sub.2](C) = [lambda] [[integral].sup.1.sub.0] [[absolute value of [nabla]I(C(s))].sup.2] ds, (5)

where [lambda] is a positive parameter and [nabla]I(C(s)) represents the gradient of the image I evaluated along the contour.
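For illustration, the energy of Eq. 3 can be approximated for a closed polygonal contour by replacing the derivatives with finite differences and the integrals with sums over the contour points. The sketch below assumes unit spacing in the curve parameter and a precomputed gradient magnitude image `grad_mag`; both are simplifications introduced here, and the function name is hypothetical.

```python
import numpy as np

def snake_energy(contour, grad_mag, alpha=1.0, beta=1.0, lam=1.0):
    """Discrete E(C) = E1(C) - E2(C) for a closed contour.

    contour: (n, 2) sequence of (row, col) points.
    grad_mag: 2D array holding |grad I| at every pixel.
    """
    c = np.asarray(contour, dtype=float)
    grad_mag = np.asarray(grad_mag, dtype=float)
    nxt = np.roll(c, -1, axis=0)
    prv = np.roll(c, 1, axis=0)
    d1 = nxt - c            # forward difference approximating C'(s)
    d2 = nxt - 2 * c + prv  # central difference approximating C''(s)
    # Internal energy (Eq. 4): elasticity and rigidity terms.
    e1 = alpha * (d1 ** 2).sum() + beta * (d2 ** 2).sum()
    # External energy (Eq. 5): squared gradient magnitude sampled at the
    # nearest pixel to each contour point.
    rows = np.clip(np.rint(c[:, 0]).astype(int), 0, grad_mag.shape[0] - 1)
    cols = np.clip(np.rint(c[:, 1]).astype(int), 0, grad_mag.shape[1] - 1)
    e2 = lam * (grad_mag[rows, cols] ** 2).sum()
    return e1 - e2
```

Because E2 enters with a negative sign, a contour lying on strong edges has a lower energy than the same contour on a flat region, which is why minimizing E pulls the curve toward gradient maxima.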

Regardless of how the internal and external forces of an active contour based model are implemented, the evolving curve C in [OMEGA], as the boundary of an open subset [omega] of [OMEGA] (i.e. [omega] [subset] [OMEGA] and C = [partial derivative][omega]), is represented by a Lipschitz function [phi] : [OMEGA] [right arrow] R such that:

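$$
\begin{cases}
C = \partial\omega = \{(x, y) \in \Omega : \phi(x, y) = 0\}, \\
\text{inside}(C) = \omega = \{(x, y) \in \Omega : \phi(x, y) > 0\}, \\
\text{outside}(C) = \Omega \setminus \bar{\omega} = \{(x, y) \in \Omega : \phi(x, y) < 0\}.
\end{cases} \tag{6}
$$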

Here, inside(C) denotes the region [omega], outside(C) denotes the region [OMEGA] \ [bar.[omega]].
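A small sketch of this implicit representation is given below, assuming the usual convention that the function [phi] is zero on C, positive inside C and negative outside. A circle is used as the contour only because its signed distance is easy to write down; the function name is hypothetical.

```python
import numpy as np

def circle_level_set(shape, center, radius):
    """Signed distance to a circle: > 0 inside, 0 on the curve, < 0 outside."""
    rows, cols = np.indices(shape)
    dist = np.hypot(rows - center[0], cols - center[1])
    return radius - dist

phi = circle_level_set((5, 5), center=(2, 2), radius=1.5)
inside = phi > 0    # region omega
outside = phi < 0   # complement of the closure of omega
```

The binary masks `inside` and `outside` recover the two regions directly from the sign of the level set function, without storing the curve explicitly.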

In its classical implementation, the active contour model is driven toward locations of maximal [absolute value of [nabla]I], subject to the constraints imposed by the two internal energy terms, by minimizing the energy in Eq. 3. Boundary-based approaches such as geodesic/geometric active contours have become popular on account of their reliable performance when strong object gradients are present. However, as only the edge information is used, their performance is limited by the strength of the image gradient. These models are typically unable to handle object occlusion or scene clutter, and as a result multiple overlapping objects are often segmented as a single object.

The basic idea behind our stochastic reshaping approach is to change the behavior of the active contour and make it applicable to a range of histology/cytology images. Hence, we propose a cost function that is used inside the reshaping algorithm and consists of two components. The proposed cost function is defined as:

F = [F.sub.1] + [F.sub.2], (7)

where the first component ([F.sub.1]) represents the internal force of the proposed cost function and is responsible for preserving the integrity of the contour. It controls whether the integrity of the border is kept or not under the shrinkage or expansion actions applied to border points [P.sub.i]. It is defined as:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (8)

The second component of the cost function is defined as follows:

[F.sub.2] = [gamma][DELTA][V.sub.texture] + [iota][DELTA][V.sub.shape] - [eta][nabla]I(C), (9)

where [gamma], [iota] and [eta] are weighting coefficients. The first term of [F.sub.2] forces the agent's contour toward regions whose texture is similar to that of its prototype. The prototype is the training item most similar to the initial state of an agent in terms of internal texture and contour shape. Due to the homogeneity of the target objects, we chose the centroid of the training set as the prototype. Similarly, the second term of [F.sub.2] forces the agent's contour to take a shape similar to that of its prototype. Finally, the third term draws the contour toward the edges. Having [bar.u], [bar.v] [member of] [R.sup.n] as the normalized texture feature vectors of [omega] (i.e. the area bordered by the contour) and its prototype respectively, [DELTA][V.sub.texture] is the city block distance between [bar.u] and [bar.v] at any reshaping step. Similarly, having [rho], [psi] as the shape feature vectors of the contour (C) and its prototype respectively, and [bar.[rho]], [bar.[psi]] as their normalized forms, [DELTA][V.sub.shape] is the city block distance between [bar.[rho]] and [bar.[psi]] at any iteration step of the reshaping process. The final term [nabla]I(C) of [F.sub.2] represents the average gradient value along the agent's contour at any reshaping step. To keep the computational complexity reasonable, simple shape features are used to describe the shape of the contour in every reshaping step. The shape feature vector is defined as follows:

[rho] = ([[rho].sub.1], [[rho].sub.2], [[rho].sub.3], [[rho].sub.4]), (10)

where [[rho].sub.1] = [P.sub.c] is the perimeter of contour, [[rho].sub.2] = [A.sub.c] is the area limited by the contour and [[rho].sub.3] = [R.sub.c] is the roundness of the contour which is defined as:

[R.sub.c] = 4[pi][A.sub.c]/[P.sup.2.sub.c]. (11)

The last item of shape feature vector, [[rho].sub.4] = [E.sub.c] represents the eccentricity of the contour which is defined as:

[E.sub.c] = [d.sub.max]/[d.sub.min], (12)

where [d.sub.max], [d.sub.min] are the length and the width of contour's bounding box respectively.
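The four shape features of Eq. 10 can be computed from an ordered list of contour vertices as sketched below. The sketch treats the contour as a closed polygon, uses the shoelace formula for the area, and takes the bounding box to be axis-aligned; these are assumptions for illustration, since the text does not specify how the features are measured on the pixel grid.

```python
import numpy as np

def shape_features(contour):
    """Return (perimeter, area, roundness, eccentricity) of a closed polygon.

    contour: (n, 2) sequence of (x, y) vertices in order along the contour.
    """
    p = np.asarray(contour, dtype=float)
    nxt = np.roll(p, -1, axis=0)
    # Perimeter P_c: sum of the edge lengths.
    perimeter = np.hypot(*(nxt - p).T).sum()
    # Area A_c by the shoelace formula.
    area = 0.5 * abs(np.sum(p[:, 0] * nxt[:, 1] - nxt[:, 0] * p[:, 1]))
    # Roundness R_c (Eq. 11): equals 1 for a perfect circle.
    roundness = 4 * np.pi * area / perimeter ** 2
    # Eccentricity E_c (Eq. 12): ratio of the longer to the shorter side of
    # the axis-aligned bounding box (assumed orientation).
    extent = p.max(axis=0) - p.min(axis=0)
    eccentricity = extent.max() / extent.min()
    return np.array([perimeter, area, roundness, eccentricity])
```

For a 4 x 2 rectangle, for example, the perimeter is 12, the area is 8 and the eccentricity is 2, while the roundness falls below 1 because the shape is not circular.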

The reshaping process is described in Table 1. The three major components of the reshaping algorithm are the "compute distance to centroid", "Shrink" and "Expand" actions. The "compute distance to centroid" component extracts the shape, texture and gradient features of the current state of an agent and calculates the cost of that state using Eq. 7 in every reshaping state (initial, shrunk or expanded). According to the algorithm, a trial shrinkage or expansion of the agent is accepted only if the value of the cost function F of Eq. 7 becomes smaller than its value in the previous step. In other words, the iterative changes of the contour made by the Shrink and Expand actions should decrease the value of the cost function. Otherwise, the effect of the shrinkage or expansion action on the contour is rolled back.

The shrinkage action starts with the selection of a random shrink point, [P.sub.shrink], from the list of the agent's border points. Then, the contour points located within distance r of the shrink point are shrunk in such a way that the integrity of the contour is kept. Similarly, expansion starts with the selection of a random expansion point, [P.sub.expand], such that the Euclidean distance between the expansion point and the shrink point is greater than a distance threshold, [d.sub.t]. To avoid overlap between the expansion and shrinkage areas on the contour and to keep them far from each other, [P.sub.expand] should be located on the contour so that [d.sub.t] > 2 x r. After [P.sub.expand] has been located in a proper position, the contour points located within distance r of the expansion point are expanded in such a way that the integrity of the contour is kept. Fig. 5 depicts a schematic view of the proposed reshaping process.
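The accept-or-roll-back logic of the reshaping loop can be sketched as follows. This is a simplified illustration and not the algorithm of Table 1: the agent is represented by a binary mask instead of a contour, the Shrink and Expand actions remove or add a disk of radius r around a random border point, the integrity term [F.sub.1] is omitted, the choice d_t = 2r + 1 is an assumed value satisfying the stated constraint, and the cost is any user-supplied callable standing in for Eq. 7. All function names are hypothetical.

```python
import numpy as np

def border_points(mask):
    """Agent pixels that touch the background (4-neighborhood)."""
    padded = np.pad(mask, 1, constant_values=False)
    interior = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                padded[1:-1, :-2] & padded[1:-1, 2:])
    return np.argwhere(mask & ~interior)

def disk_mask(shape, center, r):
    """Boolean mask of all pixels within distance r of center."""
    rows, cols = np.indices(shape)
    return np.hypot(rows - center[0], cols - center[1]) <= r

def accept_if_better(mask, best, trial, cost):
    """Keep the trial state only if it lowers the cost; otherwise roll back."""
    c = cost(trial)
    return (trial, c) if c < best else (mask, best)

def reshape_agent(mask, cost, r=2.0, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    mask = mask.copy()
    best = cost(mask)
    d_t = 2 * r + 1  # assumed threshold with d_t > 2r
    for _ in range(n_iter):
        border = border_points(mask)
        if len(border) == 0:
            break
        # Shrink: remove the pixels around a random border point.
        p_shrink = border[rng.integers(len(border))]
        far = border[np.hypot(*(border - p_shrink).T) > d_t]
        trial = mask & ~disk_mask(mask.shape, p_shrink, r)
        mask, best = accept_if_better(mask, best, trial, cost)
        # Expand: add pixels around a border point far from the shrink point.
        if len(far) > 0:
            p_expand = far[rng.integers(len(far))]
            trial = mask | disk_mask(mask.shape, p_expand, r)
            mask, best = accept_if_better(mask, best, trial, cost)
    return mask, best
```

Because a trial state is kept only when it lowers the cost, the cost of the agent never increases over the iterations, whatever cost function is supplied.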

Fig. 6 represents a reshapable agent imposed on an example image. The evolution of the agent's contour is shown from left to right.

METHOD VALIDATION

To evaluate the performance of the proposed method, its accuracy is measured in both the detection and segmentation steps. Furthermore, the segmentation quality of the method is compared with that of a state-of-the-art method. At the detection level, accuracy is measured by the precision and recall of the detection. Precision is the probability that a detection reported by the detector is correct. Recall is the probability that a ground-truth object is detected. The detection procedure is applied to the dataset images, and the true positive (Tp), false negative (Fn) and false positive (Fp) events are determined using the ground truth information of the target objects. The precision and recall are computed as follows:

Precision = Tp/[Tp + Fp], (13)

Recall = Tp/[Tp + Fn]. (14)
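As a small worked example of Eqs. 13 and 14, the sketch below counts Tp, Fp and Fn from detected rectangles and ground-truth points and then computes both measures. The matching rule (a detection is a true positive if it contains a not yet matched ground-truth point) is an assumption made for this example, and the coordinates are invented for illustration.

```python
def count_detections(boxes, dots):
    """boxes: (x, y, w, h) rectangles; dots: (x, y) ground-truth points."""
    matched = set()
    tp = fp = 0
    for (x, y, w, h) in boxes:
        # Find the first unmatched ground-truth point inside this rectangle.
        hit = next((i for i, (px, py) in enumerate(dots)
                    if i not in matched
                    and x <= px < x + w and y <= py < y + h), None)
        if hit is None:
            fp += 1  # false alarm
        else:
            matched.add(hit)
            tp += 1
    fn = len(dots) - len(matched)  # ground-truth objects never detected
    return tp, fp, fn

def precision_recall(tp, fp, fn):
    """Eqs. 13 and 14."""
    return tp / (tp + fp), tp / (tp + fn)

# Hypothetical detections and labels, for illustration only.
boxes = [(0, 0, 10, 10), (20, 20, 10, 10), (50, 50, 10, 10)]
dots = [(5, 5), (25, 25), (80, 80)]
tp, fp, fn = count_detections(boxes, dots)
precision, recall = precision_recall(tp, fp, fn)
```

In this toy case two of the three rectangles contain a labeled point and one labeled point is missed, so both precision and recall equal 2/3.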

At the segmentation level, the stochastic reshaping algorithm is applied to the detected rectangular agents, and the segmentation similarity index, or segmentation agreement with the ground truth, is measured. The results are compared with the segmentation results of a state-of-the-art method (a region growing based method). Two segmentation similarity indices (the Zijdenbos and the Jaccard) are used to measure the segmentation performance. The Zijdenbos similarity index, as shown by Zijdenbos et al. (1994), is a well-known metric for performance assessment of region-based segmentation methods. It measures the degree of overlap between the two shapes A (automatically segmented area) and M (manually segmented area, or ground truth). It is defined as:

$$\mathrm{ZSI} = \frac{2\,|A \cap M|}{|A| + |M|}, \tag{15}$$

where A and M are the binary images generated by the proposed method and manual segmentation of image (ground truth), respectively.

In addition to the ZSI, the Jaccard similarity index is calculated to provide a more comprehensive evaluation of the method. The Jaccard similarity index is defined as:

$$\mathrm{JSI} = \frac{|A \cap M|}{|A \cup M|}. \tag{16}$$

Furthermore, the segmentation error indices are calculated via:

$$\mathrm{EF} = \frac{|\bar{M} \cap A|}{|M|}, \tag{17}$$

$$\mathrm{MF} = \frac{|M \cap \bar{A}|}{|M|}, \tag{18}$$

where the EF stands for extra fraction and shows the over segmentation fraction and the MF stands for miss fraction and represents the under segmentation fraction of any segmentation method.
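The four indices of Eqs. 15 to 18 can be computed from a pair of binary masks as in the following sketch, where set sizes are taken as pixel counts. The function name is hypothetical.

```python
import numpy as np

def segmentation_scores(auto, manual):
    """Return (ZSI, JSI, EF, MF) for two binary masks of the same shape."""
    a = np.asarray(auto, dtype=bool)
    m = np.asarray(manual, dtype=bool)
    inter = np.logical_and(a, m).sum()
    zsi = 2 * inter / (a.sum() + m.sum())        # Eq. 15
    jsi = inter / np.logical_or(a, m).sum()      # Eq. 16
    ef = np.logical_and(~m, a).sum() / m.sum()   # Eq. 17, over segmentation
    mf = np.logical_and(m, ~a).sum() / m.sum()   # Eq. 18, under segmentation
    return zsi, jsi, ef, mf
```

For example, with A = [1, 1, 1, 0] and M = [0, 1, 1, 1] the overlap is two pixels, giving ZSI = 2/3, JSI = 1/2 and EF = MF = 1/3.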

RESULTS

The experimental results are organized into two subsections. In the first part, we report the manner of the parameters choice and the performance of the rectangular detection algorithm. Afterwards, in the second part of the results, we report the parameters choice of the reshaping algorithm and the segmentation quality of the proposed method using ZSI, JSI, EF and MF indices. Then, the segmentation results are compared to the results of a region growing based method.

DETECTION RESULTS

Three main parameters influence the accuracy of the rectangular detection procedure: the number of quantized colors (k), the scanning window size ([s.sub.rec]) and the threshold on the distance between a rectangular region and the centroid of the training set in the feature space ([t.sub.dist]). Considering the trade-off between performance and computational cost, we ran several experiments with different values of these parameters to find their optimal combination experimentally. With k and [t.sub.dist] fixed and the window size varied, we found that the optimal window size is the average size of the training items' bounding boxes. Similarly, with the window size fixed and the values of k [member of] {16, 32, 64} and [t.sub.dist] [member of] [0,1] varied, we measured the precision and recall of the detection procedure. Fig. 7 depicts the precision-recall curves of the detection procedure. For each value of the parameter k (i.e., the number of quantized colors) there is one curve, built from different [t.sub.dist] values.

As the figure shows, the detection accuracy for the curves with k = 64 and k = 32 is considerably higher than for the curve with k = 16. The detection performance for k = 64 is only slightly higher than for k = 32, while k = 64 requires 64 x 64 co-occurrence matrices. We therefore chose k = 32: its performance is almost the same as that of k = 64, and it needs smaller data structures and less computation time.

The optimal combination of the mentioned parameters, i.e., [s.sub.rec] = average bounding box size of the training items, k = 32 and [t.sub.dist] = 0.53, resulted in acceptable detection accuracy with precision = 0.94 and recall = 0.88.

SEGMENTATION RESULTS

The parameters with the main influence on segmentation performance are the three weighting coefficients of Eq. 9 ([gamma], [eta] and [iota]). Moreover, the parameter a [member of] [0, 1] of Table 1, which controls the reshaping iterations, plays a significant role in the final segmentation quality. Again, we ran several experiments to find the optimal combination of parameters. Since the sizes of the target objects in the current dataset are homogeneous and all comparisons in the reshaping process are made to the centroid of the training set, we noticed through experiments that the influence of the size coefficient in this specific dataset is negligible. Therefore, we set the value of [iota] to zero and then varied the texture and gradient weighting coefficients ([gamma], [eta]) over a range of values to find their optimal values. The number of reshaping iterations was set to a fixed number while investigating the optimal values of the [gamma] and [eta] parameters. The curve shown in Fig. 8 indicates how the segmentation quality changes with the values of [gamma] and [eta]. The optimal combination of the mentioned parameters, i.e., [gamma] = 0.51, [eta] = 0.49 and [iota] = 0, for a fixed number of reshaping iterations leads to a segmentation agreement with the ground truth of ZSI = 0.83.

Having the optimal values of the weighting parameters of the cost function ([gamma], [eta] and [iota]), regarding time efficiency and segmentation quality, a set of values has been assigned to a [member of] [0, 1] to measure its effect on final segmentation agreement. Table 2 represents the effect of the parameter a on the final segmentation agreement of the proposed method.

As can be seen from Table 2, the value a = 0.4 results in a final segmentation agreement of [ZSI.sub.reshaped] = 0.835. In principle, lower values of a should lead to better segmentation agreement. However, for smaller values (a < 0.4) the number of reshaping iterations increases dramatically while the segmentation agreement rises only slightly. Considering time efficiency, the value of this parameter is therefore set to a = 0.4.

To enable a comparison with another state-of-the-art segmentation method, the dataset images are also segmented with a well-known method based on region growing followed by thresholding, as used in Adiga et al. (2006). Table 3 provides the segmentation results for both the rectangular and reshaped levels of the proposed method, contrasted with the results of Adiga's method.

Here, the [bar.ZSI] and [bar.JSI] are average similarity indices between segmented images and the ground-truth, the [bar.EF] and [bar.MF] are average segmentation errors and time shows the mean processing time per image in seconds.

In Table 3, the first two rows represent the results of the proposed method at the rectangular and reshaped levels, and the third row represents the results of the method used in Adiga et al. (2006). The average similarity index ([bar.ZSI]) of the proposed method at the rectangular level (proposed rect) is 0.78, which increased significantly to 0.83 due to the stochastic reshaping process. Similarly, the Jaccard index ([bar.JSI]) increased from 0.6 to 0.67. The over segmentation error rate dropped significantly by 0.09 (50%). However, there is a slight increase in the under segmentation error rate.

The performance of the proposed method at the rectangular level is almost the same as that of Adiga's method, whereas the proposed method at the reshaped level outperforms Adiga's method on both the similarity and error indices. However, the stochastic reshaping process needs more processing time to delineate the borders of the target objects exactly.

Fig. 9 depicts an example of the binary images that are used together for segmentation assessment of the methods, including images produced by the proposed method or Adiga's method and the ground truth.

Figs. 10 and 11 depict example images segmented by the proposed method and Adiga's method, respectively. The target objects are marked with white dots inside, and the borders of the final segmented objects are shown in yellow.

DISCUSSION

Existing bottom-up segmentation approaches such as Al-Kofahi et al. (2010), which use low- or mid-level image features such as edges or gradient values, lead to poor segmentation results when applied to occluded and cluttered images (i.e., images that contain complex spatial color patterns). These methods are mostly unsupervised; they may offer acceptable time performance and skip the training step. However, due to their ad-hoc nature, edge and gradient based methods may fail even when applied to rather similar subcategories of a specific image type. Furthermore, in daily pathological routine, pathologists might be interested in a specific subgroup of the target objects, for instance malignant cells, or lymphoblasts in the present case. This confirms the need for a flexible supervised method that can be tuned and trained for specific categories of images.

To overcome these limitations, we proposed a supervised method which utilizes both low- and high-level image features. Since the method utilizes content, shape and gradient information, it is robust enough to handle various subcategories of stained histocytology images. The processing pipeline of the method consists of rectangular detection and segmentation components. The stochastic reshaping component can optionally be activated when a more precise delineation of the target objects is needed. Unlike level set methods, the proposed method can handle segmentation of multiple regions in a single image. Another significant property of the method is that it can be adapted to various types of histocytology images. The adaptation is carried out by adjusting the weighting parameters of the texture, shape and gradient terms of the proposed cost function (Eq. 8). For instance, if an image dataset has no clear borders and the target objects are described mainly by their texture rather than by other features, the gradient weighting parameter ($\eta$) should be set to a smaller value than the texture parameter ($\gamma$). However, finding the optimal combination of the parameters is a challenge that needs to be addressed experimentally.
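The effect of this weighting can be illustrated with a small sketch. Eq. 8 is not reproduced in this section, so the purely additive form below is an assumption, as are the function name `weighted_cost`, the parameter name `shape_weight` and all numeric values; only the roles of $\gamma$ (texture) and $\eta$ (gradient) come from the text.

```python
def weighted_cost(texture, shape, gradient, gamma=1.0, shape_weight=1.0, eta=1.0):
    """Hypothetical additive cost of one candidate agent (lower is better).

    texture, shape, gradient: per-term costs of the candidate (assumed inputs).
    gamma, shape_weight, eta: weights of the texture, shape and gradient terms.
    """
    return gamma * texture + shape_weight * shape + eta * gradient


# Two hypothetical candidates described by their per-term costs.
candidates = {
    # Matches the target texture well, but sits on a blurred border.
    "texture_match": {"texture": 0.2, "shape": 0.5, "gradient": 0.9},
    # Sits on a sharp edge, but the enclosed texture is a poor match.
    "edge_match": {"texture": 0.7, "shape": 0.5, "gradient": 0.3},
}


def best(candidates, **weights):
    """Return the name of the candidate with the lowest weighted cost."""
    return min(candidates, key=lambda name: weighted_cost(**candidates[name], **weights))


equal_weights_choice = best(candidates)                       # all weights equal to 1
texture_heavy_choice = best(candidates, gamma=1.0, eta=0.2)   # eta < gamma
```

With equal weights the edge-aligned candidate has the lower cost (1.5 versus 1.6), whereas lowering `eta` below `gamma` makes the texture-matching candidate win (0.88 versus 1.26), which mirrors the tuning advice for datasets without clear borders.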

The segmentation performance of the proposed method was compared with that of a state-of-the-art method (Adiga et al., 2006). Although neither method is extremely precise, according to Table 3 the proposed method achieved better average segmentation agreement than Adiga's method. However, due to its unsupervised nature, Adiga's method needs less processing time than the proposed method. The final segmentation performance of the proposed method depends on the initial rectangular detection. Although the stochastic reshaping algorithm can converge to the object's borders even after improper initialization, it then needs a significant amount of processing time. Hence proper initialization affects both processing time and segmentation quality. The proposed method matched the ground truth with an average ZSI score of 0.83. According to Zijdenbos et al. (1994), it is generally accepted that a ZSI > 0.7 represents very good agreement. Therefore, the average agreement of the proposed method at the reshaped level is satisfactory. The prototype software developed on the basis of the proposed method could be considered a potential tool for pathologists in daily diagnostic routines, and it could also be used in research projects.

CONCLUSION

A new method is proposed for rectangular detection and segmentation of the immature cells found in peripheral blood (lymphoblasts). The method is flexible enough to be tuned and applied to other similar histocytology images. It demonstrated an appropriate level of detection accuracy (precision = 0.94, recall = 0.88) and of segmentation agreement with the ground truth ($\overline{ZSI}$ = 0.83). The prototype software developed on the basis of the method could be considered a potential CAD tool for the diagnosis of acute lymphoblastic leukemia in clinical practice. Moreover, it can be used by researchers investigating the computer-aided analysis of histocytology images.

doi: 10.5566/ias.v32.p89-99

REFERENCES

Adiga U, Malladi R, Fernandez-Gonzalez R, de Solorzano C (2006). High-throughput analysis of multispectral images of breast cancer tissue. IEEE T Image Process 15(8):2259-68.

Ali S, Madabhushi A (2012). An integrated region-, boundary-, shape- based active contour for multiple object overlap resolution in histological imagery. IEEE Trans Med Imag 31(7):1448-60.

Al-Kofahi Y, Lassoued W, Lee W, Roysam B (2010). Improved automatic detection and segmentation of cell nuclei in histopathology images. IEEE Trans Biomed Eng 57:841-52.

Chan T, Vese L (2001). Active contours without edges. IEEE T Image Process 10(2):266-77.

Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B (2009). Histopathological image analysis: A review. IEEE Rev Biomed Eng 2:147-71.

Kass M, Witkin A, Terzopoulos D (1988). Snakes: Active contour models. Int J Comput Vision 1(4):321-31.

Korde VR, Bartels H, Barton J, Ranger MJ (2009). Automatic segmentation of cell nuclei in bladder and skin tissue for karyometric analysis. Anal Quant Cytol Histol 31(2):83-9.

Kovalev V, Dmitruk A, Safonau I, Frydman M, Shelkovich S (2011). A method for identification and visualization of histological image structures relevant to the cancer patient conditions. In: Real P, Diaz-Pernil D, Molina-Abril H, Berciano A, Kropatsch W, eds. Lect Not Comput Sci 6854:460-8.

Labati R, Piuri V, Scotti F (2011). ALL-IDB: The acute lymphoblastic leukemia image database for image processing. In: Proc 18th IEEE Int Conf Image Process. Sep 11-14. Brussels, Belgium. 2045-8.

Li G, Liu T, Nie J, Guo L, Chen J, Zhu J, Xia W, Mara A, Holley S, Wong S (2008). Segmentation of touching cell nuclei using gradient flow tracking. J Microsc 231(1):47-58.

Mulrane L, Rexhepaj E, Penney S, Callanan JJ, Gallagher WM (2008). Automated image analysis in histopathology: a valuable tool in medical diagnostics. Expert Rev Mol Diagn 8:707-25.

Wang L, Shi J, Song G, Shen IF (2007). Object detection combining recognition and segmentation. In: Yagi Y, Kang S, Kweon I, Zha H, eds. Lect Not Comput Sci 4843:189-99.

Wienert S, Heim D, Saeger K, Stenzinger A, Beil M, Hufnagl P, Dietel M, Denkert C, Klauschen F (2012). Detection and segmentation of cell nuclei in virtual microscopy images: A minimum-model approach. Sci Rep 2:503.

Zijdenbos A, Dawant B, Margolin R, Palmer A (1994). Morphometric analysis of white matter lesions in MR images: Method and validation. IEEE Trans Med Imag 13(4):716-24.

MEHDI ALILOU (1,2) and VASSILI KOVALEV (2)

(1) Department of Computer Science, Khoy Branch, Islamic Azad University, Khoy, Iran; (2) Department of Biomedical Image Analysis, United Institute of Informatics Problems, National Academy of Sciences, Minsk, Belarus

e-mail: me.alilou@gmail.com, vassili.kovalev@gmail.com

(Received December 15, 2012; revised May 14, 2013; accepted June 11, 2013)

Table 1. Reshaping procedure.

Input: C ∪ ω, contour and body pixels of the rectangular agent.

Output: new C ∪ ω fitting the region of interest.

    Cost_initial ← compute rectangular agent's distance to centroid using cost function F
    Cost_min ← Cost_initial
    repeat
        C_t, ω_t ← Shrink(C, ω)
        Cost_t ← compute shrunk agent's distance to centroid using cost function F
        if Cost_t < Cost_min then
            Cost_min ← Cost_t
            C, ω ← C_t, ω_t
        end if
        C_t, ω_t ← Expand(C, ω)
        Cost_t ← compute expanded agent's distance to centroid using cost function F
        if Cost_t < Cost_min then
            Cost_min ← Cost_t
            C, ω ← C_t, ω_t
        end if
    until (Cost_min < a × Cost_initial)

Table 2. The effect of the parameter a on the final segmentation agreement of the proposed method at both the rectangular and reshaped levels. ($ZSI_{rect}$: average segmentation agreement of the images segmented by the rectangular agents; $ZSI_{reshaped}$: average segmentation agreement of the images segmented by the reshaped agents; iteration: the number of reshaping iterations.)

| a | $ZSI_{rect}$ | $ZSI_{reshaped}$ | iteration |
|---|---|---|---|
| 1 | 0.78 | 0.785 | 1 |
| 0.9 | 0.78 | 0.791 | 58 |
| 0.8 | 0.78 | 0.795 | 146 |
| 0.7 | 0.78 | 0.812 | 198 |
| 0.6 | 0.78 | 0.818 | 238 |
| 0.5 | 0.78 | 0.824 | 303 |
| 0.4 | 0.78 | 0.835 | 355 |
| 0.3 | 0.78 | 0.837 | 574 |
| 0.2 | 0.78 | 0.839 | 611 |
| 0.1 | 0.78 | 0.841 | 755 |

Table 3. The segmentation results of the proposed method at both the rectangular and reshaped levels compared with Adiga's method. ($\overline{ZSI}$, $\overline{JSI}$: similarity indices; $\overline{EF}$, $\overline{MF}$: segmentation errors; time: processing time per image.)

| Method | $\overline{ZSI}$ | $\overline{JSI}$ | $\overline{EF}$ | $\overline{MF}$ | time |
|---|---|---|---|---|---|
| Proposed rect | 0.78 | 0.60 | 0.18 | 0.14 | 10.1 |
| Proposed reshaped | 0.83 | 0.67 | 0.09 | 0.18 | 301.4 |
| Adiga's | 0.78 | 0.56 | 0.15 | 0.13 | 7.03 |
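The control flow of the reshaping procedure in Table 1 can be sketched in Python as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the agent is reduced to a one-dimensional interval instead of contour and body pixels, `toy_cost` stands in for the cost function F, the random shrink and expand moves are hypothetical, and the `max_iter` guard is added here so that the loop always terminates.

```python
import random


def reshape(agent, cost, shrink, expand, a=0.5, max_iter=10_000):
    """Greedy shrink/expand loop with the stopping rule Cost_min < a * Cost_initial."""
    cost_initial = cost(agent)
    cost_min = cost_initial
    iterations = 0
    while iterations < max_iter:          # "repeat ... until" with a safety cap (assumed)
        iterations += 1
        for move in (shrink, expand):     # try a shrunk, then an expanded candidate
            candidate = move(agent)
            candidate_cost = cost(candidate)
            if candidate_cost < cost_min:  # keep the candidate only if it lowers the cost
                cost_min, agent = candidate_cost, candidate
        if cost_min < a * cost_initial:
            break
    return agent, cost_min, iterations


# Hypothetical toy problem: fit the interval (lo, hi) to a known target interval.
rng = random.Random(0)
TARGET = (10, 20)


def toy_cost(agent):
    lo, hi = agent
    return abs(lo - TARGET[0]) + abs(hi - TARGET[1])


def toy_shrink(agent):
    lo, hi = agent
    return (lo + 1, hi) if rng.random() < 0.5 else (lo, hi - 1)


def toy_expand(agent):
    lo, hi = agent
    return (lo - 1, hi) if rng.random() < 0.5 else (lo, hi + 1)


final_agent, final_cost, n_iter = reshape((5, 27), toy_cost, toy_shrink, toy_expand, a=0.5)
```

As in Table 2, a smaller value of `a` demands a larger cost reduction before the loop stops, so it trades additional iterations for a tighter fit.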

Publication: Image Analysis and Stereology, Jun 1, 2013 (Original Research Paper).