Printer Friendly

Automatic segmentation and classification of cells from broncho alveolar lavage.


Broncho alveolar lavage is the most commonly used diagnostic tool for confirming alveolar hemorrhage. Golde has introduced a ranking score, based on the hemosiderin content of macrophages which enables ranking cells from 0 to 4 based on the degree of Prussian blue stain. We propose a complete image analysis scheme to automatically perform both the extraction of the cellular objects and the ranking of each cell according to the Golde score. The image analysis techniques used mainly involve clustering and mathematical morphology. A 2D histogram is clustered to extract the main cellular components, a color watershed is used to determine and refine the regions. Finally, the cellular components of interest are firstly classified according to their hue and secondly according to their staining repartition. The proposed image analysis technique is very fast and produces reliable and accurate results.

Keywords: clustering, color, Golde score, mathematical morphology, segmentation.


Pathologists today need a high degree of precision and reproducibility in their analysis results. The key to this is objectivity and standardization in their laboratory practice. Static laboratory imaging is now well established since the first attempts to incorporate digital imaging in pathology. Many industrial companies provide such imaging workstations whose investigation fields are not reduced to pathology but more generally to biology. In particular, broncho alveolar lavage (BAL) is a simple investigation method carried out during a bronchial endoscopy by injection of sterile physiological salt solution (250 ml) in a pulmonary segment. This is followed by a soft aspiration recovery. BAL is the most commonly used diagnostic tool for confirming alveolar hemorrhage (AH). The diagnosis of alveolar hemorrhage is made by progressively bloodier BAL returns when hemorrhage is recent or by an increased number of hemosiderin-laden macrophages using Prussian blue staining (Lassence et al., 1995). To establish the diagnosis of alveolar hemorrhage in cells recovered by broncho alveolar Lavage, Golde and colleagues created a score (Golden et al., 1975) based on the hemosiderin content of alveolar macrophages stained with Prussian blue. Each cell is graded for hemosiderin content on a scale of 0 to 4 (0 = no stain, 4 = dense staining). The ranking for hemosiderin content uses the following scale: 0, no stain; 1, faint blue in one portion of the cytoplasm; 2, deep blue in a minor portion of the cell; 3, deep blue in most areas of the cytoplasm; and 4, deep blue throughout the cell. The total score for an average of 200 cells is divided by two to obtain the Golde score (GS) (Moumouni et al., 1993). As reported by Kahn et al. (1987), AH is defined by a Golde score greater than or equal to 20 and is considered mild and severe when the Golde score is 20 to 100 and higher than 100, respectively. To automate the computation of the Golde score on microscopical color images, we propose a complete image analysis scheme to automatically perform both the extraction of the cellular objects and the ranking of each cell according to the Golde score. The section "Materials and Methods" describes the imaging system and its calibration. The section "Image Analysis Scheme" presents the proposed segmentation and classification scheme. The section "Results and Discussion" presents experimental results. The last section presents conclusions.


The imaging workstation we worked with consists in an Olympus BX 50 microscope with a Marzhauser motorized scanning and a 3CCD JVC KY-F75 camera connected to a computer by IEEE 1394. Before attempting to acquire images, the system needs to be allowed sufficient time to warm up. To determine this thermal equilibrium, a flat field image of a slide has been acquired every 5 minutes during three hours. By computing the difference image between two successive flat field images and taking the mean gray value of the whole difference image, we can plot the time course of color values after the system has been switched on. The data indicates that the thermal stabilization of the system has been located after 90 minutes (only 0.1 mean gray level of difference). The 90 min warming-up period was therefore used in our experiments. The lighting level of the microscope can be modified by the user, but we have fixed it to a constant voltage of 9 V which correspond to a color temperature of 5500 K (D55 illuminant) representing the recommended one for microscopy. Once the lighting level has been fixed, the user can adjust the microscope condenser aperture which levels the amount of light passing through the optical lens. To determine the best suited aperture level of the condenser, a calibration slide with a set of neutral density filters with known transmissions (5, 10, 30, 50 and 80%) was used. Theoretically, photodiodes have a linear response with respect to incident light intensity (Puech and Giroud, 1999). We used this assumption to determine the correct aperture. However, some aperture levels of the condenser are not appropriated because too much light passing through the lens results in a saturation of the 3CCD camera sensors. Therefore, we first plot the mean luminance of a flat field image according to the condenser aperture for several neutral density filters and this allows us to reduce the range of the aperture to [0.2-0.5]. For the evaluation of linearity, the mean luminance of a flat field image has been determined according to the transmission of neutral density filters for several condenser apertures within its allowed range. According to medical standards, the coefficient of determination of the regression line plotted over a set of measurements with different neutral density filters should be close to 0.99 (Puech and Giroud, 1999). We have then chosen the regression line having the closest determination factor which corresponds to an aperture of 0.25. This aperture is appropriated only for a given objective lens (20x magnification) and has to be determined for each objective (corresponds to 0.5 for 40x magnification). Once the microscope settings are fixed, images can be acquired in a reproducible way. Fig. 1 presents an example of a typical Prussian blue stained color image digitized by our capture system. All the digitized images are color images of size 752 x 576 at a 20x magnification.



To segment images stained by Prussian blue, we consider the following scheme (resumed in Fig. 2):

A) Clustering, which extracts the main components of the image according to their color,

B) Spatial refinement, which refines the boundaries of objects obtained from the clustering,

C) Cell separation, which disaggregates touching or clustered cells,

D) Cell classification, which classifies cells according to their hue and staining repartition.

In the sequel, we present each steps of the proposed segmentation and classification scheme.



To perform the clustering of a color image, several strategies can be used. They generally differ on the dimension of the data used to cluster the image but they share the same assumptions: homogeneous regions in the image plane give rise to explicit clusters in the colorimetric projection representation. A colorimetric projection representation can range from 1D to 3D (or higher) dimensions. Usually, the spatial repartition of the colors in a color space is used to cluster an image assuming that the colors of the objects (describing homogeneous regions in the image plane) are grouped around dominant color clusters in a colorimetric projection (the center of the colorimetric projection classes). A cluster therefore corresponds to a class of pixels which have similar color properties. Clusters are peaks for the 1D case (1D histogram projections) and clouds of colors for the 3D case (3D histogram projection). In the 1D case, the histograms of each band are considered separately (marginal approach) and the method is reduced to finding thresholds dissecting the band histograms (Celenk, 1990; Lezoray and Cardot, 2002; Busin et al., 2004). Since one has one 1D histogram per color band, several clusterings are obtained. The resulting clustering can be combined with different methods, such as intersection, majority vote, Demspter-Shafer or Bayesian theory. This method is known as multi thresholding (Kurugollu et al., 2001). In the 3D case, the vectorial aspect of color is taken into account (vectorial approach) and the clustering is considered as the classification of multi spectral data (Postaire et al., 1993; Soille, 1996; Park et al., 1998; Geraud et al., 2001). One has to note that a clustering is not itself a segmentation. A clustering produces a clustering map where to each pixel is associated a class. Since a color cluster in a color space is not necessarily associated to connected colored pixels in the image plane, the clustering can assign the same class to pixels that are not connected. It is therefore necessary to perform a labeling of the clustering map to obtain a real segmentation. The 1D clustering method suffers from the fact that a color cluster is not always present in each band and the combination of the different segmentations cannot always catch this spatial property of the colors. The 3D clustering method is handicapped with data sparseness on the one hand and with the complexity of the search algorithm on the other hand.

An interesting alternative to 1D and 3D clustering methods relies on the use of pairwise projections and especially bivariate histograms (2D histograms) which use two color bands together (pairwise associations) namely RG, RB and GB in the RGB color space. This can bring several advantages (Matas, 1995; Kurugollu et al., 2001; Xue et al., 2003; Lezoray and Cardot, 2003; Lezoray, 2004). The paucity of the data encountered in the 3D case is partially overcame and the search complexity is drastically reduced. Moreover it partially uses the spatial repartition of colors and offers an intermediate method to the 1D and 3D ones. Another advantage to be considered is the fact that a 2D histogram is nothing more than a gray-level image, therefore classical and fast gray-level image processing algorithms can be used to cluster a 2D histogram. A 2D histogram is the projection of a 3D histogram onto band-pair planes. Fig. 3 presents the 3D histogram (Fig. 3a) in the RGB color space and the three 2D histograms (Figs. 3b-d) obtained by band-pair projections for the image of Fig. 1. One can see that most of the information is present in a single 2D projection. The RG projection has a distribution which is sufficient to encode the color content of the image and we will further consider only this band-pair projection. We choose this projection not only because this projection seems accurate but because it is the one which provides the better clusterings. One has to note that Figs. 3b-d, which represents the projections, are in color for understanding purposes but a 2D histogram is a gray-level image of reduced size (256 x 256). Fig. 4a shows such an image which has been down sampled by a factor 4 for speed computation reasons (the histogram has therefore a size of 64 x 64). This down-sampling has few influence on the final clustering and simplifies it. To cluster this 2D histogram, we assume that the different objects of the image are present in the histogram around the dominant clusters (homogeneous regions in this gray-level image). Obviously these clusters also correspond to clusters in the 3D case. The main clusters of a 2D histogram (the main connected regions) are considered as the cluster centroids. The clustering of a 2D histogram H proceeds in several steps. Since the histograms are generally noisy (this is mainly due to the sparseness of the colors in the images), the latter is smoothed by mean curvature non linear diffusion named [phi]. Fig. 4b presents the result of this smoothing. The result [phi](H) is reconstructed (by a morphological reconstruction process [psi]) in the original histogram image H to obtain a noiseless and regularized version of the histogram [H.sup.(2)] = y([phi](H), H). Fig. 4c presents the result of this reconstruction. From this regularized histogram image we seek the main clusters with a k-means clustering. k = 3 since we want to extract three classes of pixels the color of which corresponds to background, mucus elements and cells. To ensure reproducibility, cluster centers are not initialized at random but spreaded out over the luminance axis in the band-pair plane projection. Fig. 4d presents the result of the clustering.



From the clustered 2D histogram (which has to be up sampled to to its original resolution by replication of the labels), a clustering map is obtained. Indeed, each region in the clustered histogram image corresponds to a set of colors in the original image: to each (RGB) vector in the image plane is associated the corresponding label in the clustered 2D histogram. Fig. 5a present the clustering map obtained after establishing the class to which each pixel belongs. A clustering map not being a segmentation since unconnected pixels in the image plane can have the same label, this clustering map is labeled to obtain a segmentation map (a partition of the original image in regions) and not only a clustering.


The above clustering method provides relatively coarse segmentations since the only information that is used to construct the segmentation comes from the way colors do organize in the color space. This is a strong limitation since colors close one to each other in the color feature space do not always correspond to adjacent pixels in the image plane. The spatial information in the image has therefore to be taken into account to refine the segmentation. To do this we use a mathematical morphology way of proceeding. A label erosion is performed: in the segmentation map (Fig. 5b), if, for a pixel of label j, any of its neighbors has a label k [not equal to] j, then the pixel is set to unassigned (see Fig. 6a). This corresponds to impose the boundaries of the regions to be unassigned. Unassigned regions are extended in the image by performing a morphological erosion with a square structuring element of size 5 x 5 (see Fig. 6b). The effect one attends with this processing is to remove small objects and to keep only the core part of each object to further refine its boundaries. The remaining connected components are then considered as seeds and a color watershed algorithm (Lezoray, 2003) is used to fill the unassigned areas (see Fig. 6c).

However, when spatially refining the segmentation, one looses the initial clustering information which was assigned to each pixel in the clustering map assessing whether a pixel belongs to the following classes: background, mucus elements and cells. We can recover this information by assigning to each spatially refined region its mean class in terms of number of pixels: to each region is assigned its majority class according to the initial clustering map. Performing this, one obtains a spatially refined clustering map (Fig. 6d). With this information, we can once again simplify the segmentation map by giving the same label to all regions which belong to the background class (see Fig. 6e).



In order to enable individual analysis of each cell, cells that are clustered together must be separated one from each other. The separation of clustered or overlapping objects has been extensively studied (Cong and Parvin, 2000), but no method has proved to be superior in all situations. We choose to use a classical but very efficient morphological process to separate clustered elements. First, the Euclidean distance to the objects boundaries is computed, its ultimate erode set is extracted which provides a seed for each individual cellular component to be extracted. A watershed is then applied on the inverse of the distance to obtain a correct segmentation map where clustered cells have been separated. Fig. 7 illustrates all the steps of the cell separation.



Once an accurate segmentation is obtained, we perform a classification of the cellular objects. Cellular objects are regions of the segmentation map which correspond to the class of cells in the refined clustering map. The number of cells to be classified is therefore lower than the whole number of regions in the image since the regions which correspond to background and mucus are not considered. The classification proceeds in two steps. Since Prussian Blue staining colors iron in blue, the first thing to take into account to classify a cell is the presence of blue pixels within its area. To this aim, each region is analyzed and classified among three classes according to their mean hue (Fig. 8). If the mean hue is close to red ([bar.h] [member of] [0; [pi]/3] [union] [5[phi]/3;2[phi]], i.e.,, from magenta to yellow) then the cell belongs to the first class (Golde score of 0: no blue pixels). If the mean hue is close to blue ([bar.h] [member of] [[pi];3 x [pi]/2], i.e.,, from cyan to deep blue), the cell belongs to the third class (Golde score higher than 1: substantial presence of blue pixels).



If the mean hue does not correspond to one of the two precedent configurations, i.e.,, the hue is between magenta and deep blue, the cell belongs to the second class (Golde score of 1: very few blue pixels). For this first classification based on hue, we assume that the hue of any pixel in an image colored by Prussian blue never belongs to [[pi]/3;[pi]] which is coherent with the staining process.


Fig. 9b presents the corresponding classification on the cells depicted in Fig. 9a, where the label LUT is provided by Fig. 9c. Cells having a score higher than one have to be distributed among the possible scores ranging from one to four. The main information that can be used to quantify the score of cells, which increases with the degree of staining, are color and staining intensity. Therefore, for all the candidate cells, their colors and staining intensity are analyzed. In terms of color two attributes are considered:

--The mean hue of the cell [bar.h]

--The standard deviation of the hue of the cell [[sigma].sub.h],

In terms of staining intensity (which is close to extracting the texture), the pixels of each cell are classified among three classes: Intense staining (I), Middle staining (M), Low staining (L) using the rules defined by Wolf et al. (1995) to partition the texture of cells in three classes. Fig. 10a presents the extraction of the different classes of staining intensity for cells of interest (the ones for which the Golde score is assumed to be higher than one, see Fig. 9b). In Fig. 10a, gray level corresponds to a class of staining intensity: I in dark gray, M in white and L in light gray. For a cell, three staining degree indicators are considered, one by class of cell pixels. For all the pixels of a cell belonging to a given staining class C among the three above mentioned, we consider the following quantity:

[T.sub.C] = [summation over (St(p)=C)](1/8 [summation over (St(q)=St(p))])[absolute value of h(q)s(q) - h(p)s(p)]) / [n.sub.C], (1)

where p denotes a pixel in a cell, St(p) the staining class of a pixel (St(p) [member of] {I, M, L}), h(p) the hue of the pixel p, s(p) the saturation of the pixel p, q is a 8-neighbor of p and [n.sub.C] the number of pixels of class C in the cell. To a cell are therefore attached three staining degree indicators, one by staining class of pixels. [T.sub.C] quantifies, for a given class of pixels, the mean color dispersion (considered as the product of hue and saturation) around the pixels of this class. Hue is an information the relevance of which is variable (Carron and Lambert, 1994). When color is highly saturated (chromatic pixels), hue is very relevant, and when saturation is low (achromatic pixels), hue is irrelevant. Therefore we blend hue and saturation together for a better accuracy of color dispersion measure.

All the cells are then classified among the remaining four possible Golde scores (from 1 to 4) with a k-means classifier. Each cell is described by a vector of five features ([bar.h], [[sigma].sub.h], [T.sub.I], [T.sub.M], [T.sub.L]). Fig. 10b presents the final classification (Golde scores) of the cells of Fig. 9b the Golde score of which was established as higher than one, the final label LUT is provided by Fig. 10c.



We have applied the proposed segmentation and classification scheme on 50 images of BAL stained by Prussian blue. These images were taken at random for a given slide sample. Once the whole scheme is performed, 5 sets of cells are obtained which corresponds to the different scores cells can have. Figs. 11-15 present these sets on the pool of tested images. For Cells of scores 0 and 1, only subsets of the whole set of cells are provided. Table 1 resumes the proportions of cellular objects extracted from the images.

A visual evaluation of the different categories of cells enables us to see that they effectively correspond to an increasing hemosiderin content (the blue coloration becomes more intense and distributed in cells). At the end of the whole segmentation and classification scheme we propose, the computation of the Golde score is however still not possible. Indeed, the Golde score has to be computed on macrophages and the cellular component that have been extracted and classified by our scheme are not necessarily macrophages: one can for instance encounter lymphocytes. This will be done in future works. However, since each extracted cell has to be qualified as macrophage or non macrophage, a reference database of macrophages and non macrophages has to be constructed. The problem of designing a cell as a macrophage then becomes a supervised machine learning problem as opposed to all the classification techniques used in our scheme which are unsupervised. Therefore, to enable to compute the Golde score, we are currently working to the building of a learned model of macrophages.







An automated robust approach for the segmentation and classification of cells from broncho alveolar Lavage stained by Prussian blue has been presented. Segmentation is based on an hybrid combination of bivariate histogram clustering which is spatially refined by a watershed. Classification is based on a priori information on cells to classify. A first coarse classification uses mean hue to extract initial classes of cells. For cells having a blue hue, the staining intensity is analyzed according to its repartition in the cell. The whole segmentation and classification scheme is unsupervised since no parameter is used in the segmentation and unsupervised classifiers are used in the classification. The preliminary experimental results allow us to conclude that the algorithms perform well for the images acquired by our imaging system. But anyway, the system must be extensively tested on a more important number of images. On the other hand, some ways of improvement are currently under investigation. The first one we are currently investigating is to categorize extracted cells as macrophages or non macrophages in order to be able to compute the Golde score.


This work is supported in part by a research grant of FNADT and FEDER funds.

(Accepted October 2, 2007)


Busin L, Vandenbroucke N, Macaire L, Postaire JG (2004). Color space selection for unsupervised color image segmentation by histogram multithresholding. In: Proceedings of the IEEE International Conference on Image Processing (ICIP'04). Singapore. October 24-27, 2004.

Carron T, Lambert P (1994). Color edge detector using jointly hue, saturation and intensity. In: Proceedings of IEEE International Conference on Image Processing (ICIP'94). Austin, USA. November 13-16, 1994.

Celenk M (1990). A color clustering technique for image segmentation. Computer Vision Graph 52:145-70.

Cong G, Parvin B (2000). Model-based segmentation of nuclei. Pattern Recognition 33:1383-93.

Geraud T, Strub P, Darbon J (2001). Color image segmentation based on automatic morphological clustering. In: Proceedings of IEEE International Conference on Image Processing (ICIP'01), vol. 3. Thessaloniki, Greece. October 7-10, 2001.

Golden D, Drew W, Klein H, Finley T, Cline M (1975). Occult pulmonary haemorrhage in leukaemia. Brit Med J 2:166-8.

Kahn F, Jones J, England D (1987). Diagnosis of pulmonary hemorrhage in the immunocompromised host. Am Rev Respir Dis 136:155-60.

Kurugollu F, Sankur B, Harmanci A (2001). Color image segmentation using histogram multithresholding and fusion. Image Vison Comput 19:915-28.

Lassence AD, Fleury-Feith J, Escudier E, Beaune J, Bernaudin J, Cordonnier C (1995). Alveolar hemorrhage: diagnostic criteria and results in 194 immunocompromised hosts. Am J Respir Crit Care Med 151:157-63.

Lezoray O (2003). Supervised automatic histogram clustering and watershed segmentation. application to microscopic medical images. Image Anal Stereol 22:113-20.

Lezoray O (2004). An unsupervised color image segmentation based on morphological 2d clustering and fusion. In: Proceedings of 2nd European Conference on Color in Graphics, Imaging and Vision (CGIV'2004).

Lezoray O, Cardot H (2002). Histogram and watershed based segmentation of color images. In: Proceedings of 1st European Conference on Color in Graphics, Imaging and Vision (CGIV'2002). Poitiers, France. April 2-5, 2002.

Lezoray O, Cardot H (2003). Hybrid color image segmentation using 2d histogram clustering and region merging. In: Proceedings of International Conference on Image and Signal Processing (ICISP'2003), vol. 1. Agadir, Morocco. June 25-27, 2003.

Matas J (1995). Colour-based Object Recognition. Ph.D. thesis, University of Surrey.

Moumouni H, Lamotte F, Anthonioz P (1993). Sideroses of alveolar macrophages. analysis of a continuous series of 360 bronchoalveolar lavages. Pathol Biol 41:604-9.

Park S, Yun I, Lee S (1998). Color image segmentation based on 3-d clustering: morphological approach. Pattern Recogn 31:1061-76.

Postaire J, Zhang R, Lecocq-Botte C (1993). Cluster analysis by binary morphology. IEEE Trans Pattern Anal 15:170-80.

Puech M, Giroud F (1999). Standardization of dna quantitation by image analysis: quality control of instrumentation. Cytometry 36:11-7.

Soille P (1996). Morphological partitioning of multispectral images. Journal Electron Imaging 18:252-65.

Wolf G, Beil M, Guski H (1995). Chromatin structure analysis based on a hierarchic texture model. Anal Quant Cytol 17:25-34.

Xue H, Geraud T, Duret-Lutz A (2003). Multi-band segmentation using morphological clustering and fusion application to color image segmentation. In: Proceedings of the IEEE International Conference on Image Processing (ICIP'03), vol. 1. Barcelona, Spain. September 14-18, 2003.


(1) Universite de Caen, GREYC CNRS UMR 6072, 6 Bd. Marechal Juin, 14050 Caen, France; (2) Service d'anatomie et de cytologie pathologiques, Centre Hospitalier Public du Cotentin, Rue du Val de Saire, F-50130 Cherbourg-Octeville, France

Table 1. Number of cells in each class.

Class Number of cells

0 557
1 131
2 20
3 26
4 51
Total: 785
COPYRIGHT 2007 Slovenian Society for Stereology and Quantitative Image Analysis
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2007 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Lezoray, Olivier; Lecluse, Michel
Publication:Image Analysis and Stereology
Article Type:Report
Geographic Code:1USA
Date:Nov 1, 2007
Previous Article:Morphological segmentation of hyperspectral images.
Next Article:Percolation of random cylinder aggregates.

Terms of use | Privacy policy | Copyright © 2019 Farlex, Inc. | Feedback | For webmasters