Automatic identification and classification of bacterial cells.
Microscopy is one of the most important techniques in microbial ecology, since this is the most direct approach to examine the microbe's world from its own perspective. The value of quantitative microscopy in studies of microbial ecology can be increased even further when used in conjunction with computer-assisted image analysis. There are two main advantages of using digital image processing and pattern recognition techniques in conjunction with microscopy for quantitative studies of microbial ecology. First, automatic image analysis reduces the amount of tedious work with microscopes needed to perform a more accurate quantitative analysis of in situ microbial abundance and metabolic activity. Secondly, these techniques provide an important quantitative tool to analyze the structures and spatial features of complex microbial communities in situ without cultivation. Five major types of information useful in microbial ecology can be extracted from resolved and segmented microscopical images of growing microbial communities in situ. These include recognition of cellular morphological diversity, cell abundance, and spatial, metabolic, and phylogenetic relationships of cells to each other and their surrounding environment.
The process of semi-automatic image analysis of cells to evaluate these aspects of microbial communities can be principally divided into four stages: (i) interactive image acquisition, digitization, and segmentation to locate cells; (ii) automatic measurement to extract features of interest; (iii) classification of different bacterial cells; and (iv) statistical analysis, computations, and interpretation of the data. One of the most important and yet most tedious tasks performed during microscopic analysis of microbial communities is the classification of observed cells into known morphological categories and recognition of new categories as well if new distinct characteristics are captured. A major challenge in microbial ecology is to develop reliable and facile methods of computer assisted microscopy that can analyze digital images of complex microbial communities at single cell resolution, and compute useful quantitative characteristics of their organization and structure without cultivation.
Bacteria are unicellular microscopic organisms that can be seen only through a microscope. They occur in different sizes and shapes, and their dimensions are measured in micrometers (a micrometer is one millionth of a meter). Bacteria are found everywhere and in all types of environments, and there are numerous types of bacteria in the world. Bacteria are mainly classified based on their shapes, biochemistry and staining methods. More recently, along with morphology, other profiles such as metabolic activity, the conditions required for growth, biochemical reactions, antigenic properties, and other characteristics have also proved helpful in classifying bacteria. However, each type of bacteria has its own characteristics. Most bacteria are characterized by three main shapes: rod (rod-shaped bacteria are called bacilli), sphere (sphere-shaped bacteria are called cocci) and spiral (spiral-shaped bacteria are called spirilla or spiral). Some bacteria possess shapes that are more complex than those mentioned above.
A statistical imaging method for automatic identification of bacterial types was proposed by Trattner and Greenspan. An artificial neural network approach to bacterial classification was investigated by Nicholas Blackburn et al. Data mining techniques were employed for the classification of HEp-2 cells by Petra Perner, using a simple set of shape features for classification. Hiremath and Parashuram [4,5,6] have investigated the automatic classification of bacterial cells and their different growth phases in digital microscopic images using geometric shape features. A computer-aided system for the image analysis of bacterial morphotypes in microbial communities using geometric shape features was investigated by J. Liu et al. Thomas Posch et al. proposed a new image analysis tool to study biomass and morphotypes of three major bacterioplankton groups in an alpine lake using geometric features. Carolina Wahlby et al. investigated algorithms for cytoplasm segmentation of fluorescence labeled cells using statistical analysis techniques based on shape descriptive features.
In this paper, the objective is to propose a method for automatic identification and classification of bacterial cells in digital microscopic images using geometric features that characterize the shapes of bacterial cells. The experimental results are compared with the manual results obtained by a microbiology expert and demonstrate the efficacy of the proposed method.
Materials and methods
The spread plate technique is used for the separation of a dilute mixed population of micro-organisms, so that individual colonies can be isolated. A scoopful of mixed culture is aseptically transferred onto Nutrient Agar medium and spread uniformly over the surface of the medium plates with an L-shaped spreader. After spreading, the plates are incubated at 37 °C for 24-48 hours, after which single colonies appear on the Nutrient Agar plates. A colony is then picked up for further identification using staining techniques. A smear of mixed culture bacteria is deposited on a glass slide and thoroughly air-dried. It is stained for 1 min in Crystal Violet solution and 1 min in iodine solution, washed for 20 s in ethanol and, finally, counterstained with safranin for 1 min. The glass slide is examined under oil immersion at 1000x-2500x magnification with direct illumination in a Dialux 20 microscope equipped with a 3 CCD Sony color camera and connected to a PC [6, 9]. We have considered 100 color images for the present study, and these are converted into gray scale images.
The purpose of the automated image analysis of a digital bacterial cell image is to identify the type of bacteria (bacilli, cocci or spiral) based on geometric features, using different classification techniques, namely, the 3σ classifier, the K-NN classifier, the Neural Network classifier and the Fuzzy classifier.
Out of many geometric features used by various authors in the literature[4,7], it is observed that there are five geometric features, namely, circularity, compactness, eccentricity, tortuosity and length-width ratio, which provide better classification results. Hence, we have used these five features, which are defined as given below:
Circularity ($x_1$): $4\pi \cdot \mathrm{Area} / \mathrm{Perimeter}^2$
Compactness ($x_2$): a measure of compactness, $\mathrm{Perimeter}^2 / (4\pi \cdot \mathrm{Area})$
Eccentricity ($x_3$): the ratio of the length of the longest chord of the shape to the longest chord perpendicular to it, i.e. $\mathrm{Length}_{\text{major axis}} / \mathrm{Length}_{\text{minor axis}}$
Tortuosity ($x_4$): major axis / perimeter
Length-width ratio ($x_5$): major axis / minor axis
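The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function names and the assumption that area, perimeter and axis lengths have already been measured for a segmented region are ours. The text defines eccentricity and length-width ratio by the same ratio, whereas the eccentricity values reported in Table 1 lie below 1 and agree numerically with the eccentricity of an ellipse, $\sqrt{1 - (\text{minor}/\text{major})^2}$; the helper `ellipse_eccentricity` covers that reading and is an assumption on our part.

```python
import math


def shape_features(area, perimeter, major_axis, minor_axis):
    """Five geometric features, following the definitions given in the text.

    All four inputs are assumed to be measured beforehand on one segmented
    region (hypothetical interface, not taken from the paper).
    """
    return {
        "circularity": 4.0 * math.pi * area / perimeter ** 2,
        "compactness": perimeter ** 2 / (4.0 * math.pi * area),
        # As written in the text: major axis length / minor axis length.
        "eccentricity": major_axis / minor_axis,
        "tortuosity": major_axis / perimeter,
        "lw_ratio": major_axis / minor_axis,
    }


def ellipse_eccentricity(major_axis, minor_axis):
    """Eccentricity of an ellipse with the given axes (assumed alternative)."""
    return math.sqrt(1.0 - (minor_axis / major_axis) ** 2)


if __name__ == "__main__":
    # A circle of radius 2 as a sanity check: circularity and compactness are 1.
    r = 2.0
    print(shape_features(math.pi * r * r, 2.0 * math.pi * r, 2.0 * r, 2.0 * r))
```

Note that compactness is simply the reciprocal of circularity, so the two features carry the same information on different scales.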
Bacterial cell images generally contain noise, small debris and artifacts that depend on the staining method. To remove this debris, we preprocess the image by applying morphological operations, namely, erosion, reconstruction and dilation. This stage is important for achieving good results in segmentation and in the subsequent processing. The gray scale image of cells is segmented using global thresholding, which yields a binary image. After labeling the segmented image, the geometric features $x_i$, $i = 1, 2, \ldots, 5$, are extracted for each labeled segment. These features are used as the basis for cell classification. Using the training set of images (with known cell classification), for each feature $x_i^k$, $i = 1, 2, \ldots, 5$, of the $k$th cell type, we compute the mean $\bar{x}_i^k$ and standard deviation $\sigma_i^k$ of the sampling distribution of the feature values and store them as the knowledge base. In the testing phase, for a given test image, the feature values $x_i^{(\mathrm{test})}$ of the segmented regions (cells) are computed, and cell classification is then done using the 3σ rule: for a segmented region in the test image, if the feature values $x_i^{(\mathrm{test})}$ lie in the interval $\bar{x}_i^k \pm 3\sigma_i^k$, $i = 1, 2, \ldots, 5$, then the region is a cell of type $k$. The values $k = 1, 2, 3$ correspond to bacilli, cocci and spiral, respectively.
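The 3σ rule can be illustrated with a short Python sketch. The means and standard deviations below are copied from Table 2; the function names are hypothetical, and returning a list of all matching cell types (possibly empty, meaning "unknown") is our assumption, since the text does not say how a region matching several types, or none, is resolved.

```python
# Knowledge base copied from Table 2; feature order is
# circularity, compactness, eccentricity, tortuosity, LW ratio.
KNOWLEDGE_BASE = {
    "bacilli": {"mean": [0.4425, 2.5441, 0.9425, 0.1220, 3.4023],
                "sd": [0.2203, 0.7594, 0.0373, 0.0433, 0.9740]},
    "cocci": {"mean": [0.6427, 1.5580, 0.4810, 0.2369, 1.1761],
              "sd": [0.0236, 0.0581, 0.1407, 0.0144, 0.1397]},
    "spiral": {"mean": [0.0881, 12.3827, 0.9846, 0.0628, 7.7203],
               "sd": [0.0239, 3.9769, 0.0086, 0.0211, 5.0824]},
}


def within_three_sigma(features, mean, sd, n_sigma=3.0):
    """True if every feature lies in [mean - n_sigma*sd, mean + n_sigma*sd]."""
    return all(abs(x - m) <= n_sigma * s for x, m, s in zip(features, mean, sd))


def classify_three_sigma(features, knowledge_base=KNOWLEDGE_BASE):
    """Return every cell type whose 3-sigma box contains the feature vector.

    An empty list means the region is left unclassified (assumed behaviour).
    """
    return [
        cell_type
        for cell_type, stats in knowledge_base.items()
        if within_three_sigma(features, stats["mean"], stats["sd"])
    ]


if __name__ == "__main__":
    # Hypothetical test region lying close to the cocci means.
    print(classify_three_sigma([0.65, 1.55, 0.45, 0.24, 1.15]))
```

Because the rule is a set of independent interval checks, it needs no distance metric and no feature scaling, which is one reason it is computationally cheap.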
The K-nearest neighbor (K-NN) classification is performed by using a reference data set (training set) which contains both the input (feature set) and the target variables (known cells) and then by comparing the unknown (test data) which contains only the input variables (features) to that reference set. The distance of the unknown to the K nearest neighbors determines its class assignment by either averaging the class numbers of the K nearest reference points or by obtaining a majority vote from them.
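A majority-vote K-NN classifier over the five shape features might look like the following sketch. The Euclidean distance is an assumption (the text does not name the metric), the training vectors are made-up illustrative values rather than data from the paper, and `knn_classify` is a hypothetical name.

```python
import math
from collections import Counter


def knn_classify(train_features, train_labels, query, k=1):
    """Assign the majority label among the k nearest training points.

    Distances are Euclidean (assumed). Ties in the vote go to the label
    whose nearest member was encountered first, i.e. the closer one.
    """
    ranked = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Illustrative (hypothetical) reference set: two cells per type.
TRAIN_X = [
    [0.44, 2.54, 0.94, 0.12, 3.40],
    [0.50, 2.10, 0.93, 0.13, 3.00],
    [0.64, 1.56, 0.48, 0.24, 1.18],
    [0.66, 1.52, 0.30, 0.26, 1.05],
    [0.09, 12.40, 0.98, 0.06, 7.70],
    [0.05, 20.00, 0.99, 0.04, 15.00],
]
TRAIN_Y = ["bacilli", "bacilli", "cocci", "cocci", "spiral", "spiral"]

if __name__ == "__main__":
    print(knn_classify(TRAIN_X, TRAIN_Y, [0.65, 1.55, 0.40, 0.25, 1.10], k=3))
```

With K=1 this reduces to picking the label of the single closest reference cell. Since compactness and length-width ratio span a much wider range than the other features, they dominate an unscaled Euclidean distance; whether the features were normalized is not stated in the text.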
Neural Network Classifier
The input layer has 5 neurons, one for each of the 5 shape features, and the output layer has three outputs (bacilli, cocci and spiral). The transfer function used is 'tan sigmoidal', the training function is Levenberg-Marquardt back propagation, the weight/bias learning function is the 'gradient descent' function, and the performance function is 'mean square error (mse)', whose goal is set to 0.01. In the case of the radial basis neural network, the shape features are likewise used as inputs. The error function is 'mean square error (mse)', whose goal is set to 0.15. The spread for the radial basis function is 1.0, and the maximum number of neurons allowed to be added during training is 300.
The fuzzy rule based classification is performed using the mean and standard deviation of the data set (training set). The Sugeno model can be used to model any inference system in which the output membership functions are either linear or constant; it is employed here because the expected output is the constant membership function of the class number to which the bacterial cell belongs. A simple Gaussian membership function is used, with the mean and standard deviation of each geometric feature as the parameters of its linguistic variables.
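As a rough illustration of a zero-order Sugeno system of this kind, the sketch below builds one rule per cell type, with Gaussian memberships parameterized by the Table 2 means and standard deviations and the class number k as the constant rule output. Combining the memberships with a product AND and defuzzifying by the weighted average are assumptions on our part; the text does not state which operators were used, and the function names are hypothetical.

```python
import math

# (means, standard deviations) per class number k, copied from Table 2.
RULES = {
    1: ([0.4425, 2.5441, 0.9425, 0.1220, 3.4023],
        [0.2203, 0.7594, 0.0373, 0.0433, 0.9740]),   # bacilli
    2: ([0.6427, 1.5580, 0.4810, 0.2369, 1.1761],
        [0.0236, 0.0581, 0.1407, 0.0144, 0.1397]),   # cocci
    3: ([0.0881, 12.3827, 0.9846, 0.0628, 7.7203],
        [0.0239, 3.9769, 0.0086, 0.0211, 5.0824]),   # spiral
}


def gaussian_membership(x, mean, sd):
    """Gaussian membership value in [0, 1], equal to 1 at the mean."""
    return math.exp(-((x - mean) ** 2) / (2.0 * sd ** 2))


def rule_strengths(features):
    """Firing strength of each rule: product of the five memberships (assumed AND)."""
    strengths = {}
    for k, (means, sds) in RULES.items():
        w = 1.0
        for x, m, s in zip(features, means, sds):
            w *= gaussian_membership(x, m, s)
        strengths[k] = w
    return strengths


def sugeno_output(features):
    """Weighted average of the constant rule outputs k, or None if no rule fires."""
    strengths = rule_strengths(features)
    total = sum(strengths.values())
    if total == 0.0:
        return None
    return sum(k * w for k, w in strengths.items()) / total


if __name__ == "__main__":
    # Hypothetical region close to the cocci means; the output is near 2.
    print(sugeno_output([0.65, 1.55, 0.45, 0.24, 1.15]))
```

When the classes are well separated, one rule dominates and the output sits very close to an integer class number; rounding it gives the predicted type.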
The proposed method for the classification of bacterial cells based on their geometric features is given below:
Algorithm 1: Extraction of features for knowledge base
Step 1: Input bacterial cell image (RGB color training image).
Step 2: Convert the RGB image into gray scale image.
Step 3: Preprocess the gray scale image and segment it to obtain a binary image.
Step 4: After removing border-touching cells, label the segmented image.
Step 5: For each labeled segment, compute the geometric shape features $x_i$, $i = 1, 2, \ldots, 5$ (i.e. eccentricity, compactness, circularity, tortuosity and length-width ratio) for each cell type $k$.
The values $k = 1, 2, 3$ correspond to bacilli, cocci and spiral, respectively.
Step 6: Repeat Steps 1 to 5 for all the training images.
Step 7: Compute the mean $\bar{x}_i^k$ and standard deviation $\sigma_i^k$ of the sampling distribution of the feature values for each cell type $k$ and store them as the knowledge base.
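Step 7 of Algorithm 1 amounts to grouping the training feature vectors by cell type and taking per-feature statistics. A minimal sketch follows, assuming the features have already been extracted as five-element lists; `build_knowledge_base` is a hypothetical name, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev


def build_knowledge_base(samples):
    """Compute per-type mean and standard deviation of each feature.

    samples: iterable of (cell_type, [x1, x2, x3, x4, x5]) pairs.
    Each cell type needs at least two samples for the sample SD.
    """
    grouped = {}
    for cell_type, features in samples:
        grouped.setdefault(cell_type, []).append(features)

    knowledge_base = {}
    for cell_type, rows in grouped.items():
        columns = list(zip(*rows))  # one tuple of values per feature
        knowledge_base[cell_type] = {
            "mean": [mean(col) for col in columns],
            "sd": [stdev(col) for col in columns],
        }
    return knowledge_base


if __name__ == "__main__":
    # Hypothetical training vectors, for illustration only.
    training = [
        ("cocci", [0.64, 1.56, 0.48, 0.24, 1.18]),
        ("cocci", [0.66, 1.52, 0.30, 0.26, 1.05]),
        ("bacilli", [0.44, 2.54, 0.94, 0.12, 3.40]),
        ("bacilli", [0.50, 2.10, 0.93, 0.13, 3.00]),
    ]
    print(build_knowledge_base(training))
```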
Algorithm 2: Classification of bacterial cells.
Step 1: Input bacterial cell image (RGB color test image).
Step 2: Convert the RGB image into gray scale image.
Step 3: Preprocess the gray scale image and segment it to obtain a binary image.
Step 4: After removing border-touching cells, label the segmented image.
Step 5: For each labeled segment, compute the geometric shape features $x_i$, $i = 1, 2, \ldots, 5$ (i.e. eccentricity, compactness, circularity, tortuosity and length-width ratio) and store these features as $x_i^{(\mathrm{test})}$.
Step 6: Apply the 3σ rule for classification of the bacterial cells: a segmented region is of cell type $k$ if its features $x_i^{(\mathrm{test})}$ lie in the interval $\bar{x}_i^k \pm 3\sigma_i^k$, $i = 1, 2, \ldots, 5$. The values $k = 1, 2, 3$ correspond to bacilli, cocci and spiral, respectively.
Step 7: Repeat the Steps 5 and 6 for all labeled segments and output the classification of identified cells.
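Steps 2 to 4, which both algorithms share, can be sketched in plain Python as follows. This is a simplified illustration and not the MATLAB pipeline used in the paper: the morphological preprocessing (erosion, reconstruction, dilation) is omitted, the threshold is passed in by the caller rather than selected automatically, cells are assumed to be darker than the background, and 4-connectivity is assumed for labeling. All function names are hypothetical.

```python
from collections import deque


def to_gray(rgb_image):
    """Convert rows of (r, g, b) tuples to luminance using the common 0.299/0.587/0.114 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb_image]


def global_threshold(gray, threshold):
    """Binary image: 1 where the pixel is darker than the threshold (assumed cell)."""
    return [[1 if value < threshold else 0 for value in row] for row in gray]


def label_regions(binary):
    """Label 4-connected foreground regions with 1, 2, ...; return (labels, count)."""
    rows, cols = len(binary), len(binary[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:  # breadth-first flood fill of one region
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def remove_border_regions(labels):
    """Set to 0 every region that has at least one pixel on the image border."""
    rows, cols = len(labels), len(labels[0])
    touching = {
        labels[r][c]
        for r in range(rows)
        for c in range(cols)
        if labels[r][c] and (r in (0, rows - 1) or c in (0, cols - 1))
    }
    return [[0 if value in touching else value for value in row] for row in labels]
```

The surviving labels identify the regions for which the shape features of Step 5 would then be measured.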
The above algorithm for classification phase can be modified to apply K-NN classifier, Neural Network classifier and Fuzzy classifier to the feature set in the Step 6 and the classification performance of the different classifiers can be compared. The K-NN classifier with K=1 is the minimum distance classifier.
Experimental results and discussions
For the purpose of experimentation, 300 color digital bacterial cell images containing different types of non-overlapping bacterial cells, namely, bacilli, cocci and spiral, are considered (as described in section 2). The implementation is done on a Pentium Core 2 Duo @ 2.83 GHz machine using MATLAB 7.9. In the training phase, each input color image of bacterial cells (Fig. 1(a)) is converted into a gray scale image (Fig. 1(b)), and morphological operations such as erosion, reconstruction and dilation are applied. The resulting image is thresholded to obtain a segmented binary image (Fig. 1(c)). The segmented image is labeled and, for each segmented region (known cells), the geometric features are computed. Table 1 presents the geometric feature values computed for the segmented cell regions of the image in Fig. 1(d)-(f).
[FIGURE 1 OMITTED]
The mean and standard deviation of the sampling distribution of these features obtained from the training images are stored in the knowledge base of the cells: bacilli, cocci and spiral, as shown in Table 2. Some sample training images are shown in Fig. 2.
[FIGURE 2 OMITTED]
In the testing phase, for a test image, the feature extraction algorithm is applied and the test feature values $x_i^{(\mathrm{test})}$ for each segmented region are used for classification with the 3σ rule, the K-NN classifier, the Neural Network classifier and the Fuzzy classifier. The classification results are given in Table 3. Fig. 3 shows some sample test images used for classification of bacterial cells.
Table 3 summarizes the average classification rates obtained by the different classification techniques. For the testing images, the 3σ classifier has yielded an accuracy in the range of 94% to 96%, and the K-NN classifier has yielded 85% to 96% for K=1 (i.e. the minimum distance classifier). The neural network classifier has yielded 90% to 100%, and the fuzzy classifier has yielded 99% to 100% accuracy. The performance comparison indicates that the fuzzy classifier has the best classification ability among the four.
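The per-type rates in Table 3 can be combined into a single cell-weighted figure per classifier, which the paper itself does not report. The sketch below is therefore only an illustrative derivation: the cell counts and percentages are copied from Table 3, and the weighting by number of test cells is our choice.

```python
# Number of test cells per type and per-type accuracy (%), copied from Table 3.
CELL_COUNTS = {"bacilli": 270, "cocci": 100, "spiral": 90}
ACCURACY = {
    "3-sigma": {"bacilli": 96, "cocci": 94, "spiral": 95},
    "K-NN (K=1)": {"bacilli": 96, "cocci": 90, "spiral": 85},
    "Neural network": {"bacilli": 98, "cocci": 90, "spiral": 100},
    "Fuzzy": {"bacilli": 99, "cocci": 100, "spiral": 100},
}


def weighted_accuracy(per_type_accuracy, counts):
    """Average of per-type accuracies weighted by the number of test cells."""
    total_cells = sum(counts.values())
    return sum(per_type_accuracy[t] * counts[t] for t in counts) / total_cells


if __name__ == "__main__":
    for name, rates in ACCURACY.items():
        print(f"{name}: {weighted_accuracy(rates, CELL_COUNTS):.1f}%")
```

Under this weighting the four classifiers come out at roughly 95.4%, 92.5%, 96.7% and 99.4% respectively, which preserves the ordering suggested by the per-type ranges.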
[FIGURE 3 OMITTED]
Although comparing the classification performance of the various state-of-the-art methods in the literature is difficult because different cell image data sets were used for experimentation, the following may be observed. The statistical modeling techniques of Trattner and Greenspan, applied to Staphylococcus aureus cells, yielded 98%. The data mining approach of Perner for HEp-2 cells yielded a classification rate of 86.67%. The neural network approach of Blackburn et al. yielded classification rates above 90% for various types of bacterial cells. The statistical methodology of Wahlby et al. yielded classification rates in the range 89% to 98% for different categorization methods for fluorescent labeled cells. The statistical analysis method of Posch et al. for classification of various bacterioplankton groups yielded 80% overall accuracy. The proposed method is computationally less expensive and yet yields comparable classification rates. The 3σ classifier has yielded an accuracy in the range of 94% to 96%, and the K-NN classifier has yielded 85% to 96% for K=1 (i.e. the minimum distance classifier). The neural network classifier has yielded 90% to 100%, and the fuzzy classifier has yielded 99% to 100% accuracy for the different bacterial cell types.
[FIGURE 4 OMITTED]
Fig. 4 shows some sample cell images corresponding to misclassification results. In Fig. 4(a) and (e), and also in Fig. 4(b) and (f), a coccus is classified as a bacillus. In Fig. 4(c) and (g), a bacillus is not classified (i.e. unknown) due to over-segmentation. Likewise, in Fig. 4(d) and (h), a spiral cell is not classified (unknown) due to over-segmentation. These problems can be overcome by employing better segmentation methods. Further, the classification results can be improved by using better classification techniques. These aspects will be considered in our future work.
In this paper, we have proposed an automated method for cell identification and classification that segments digital microscopic bacterial cell images and extracts geometric features of the cells. The experimental results are compared with the manual results obtained by an expert. The proposed method is computationally less expensive and yet yields comparable classification rates. The 3σ classifier has yielded an accuracy in the range of 94% to 96%, and the K-NN classifier has yielded 85% to 96% for K=1 (i.e. the minimum distance classifier). The neural network classifier has yielded 90% to 100%, and the fuzzy classifier has yielded 99% to 100% accuracy for the different bacterial cell types. The method could be improved further by better preprocessing methods and feature sets, which will be taken up in our future work.
The authors are grateful to the referees for their valuable comments and suggestions. Further, the authors are indebted to Dr. A. Dayanand, Professor of Microbiology, Gulbarga University, Gulbarga and Dr. Ramakrishna, Department of Microbiology, Government Degree College, Gulbarga, for providing bacterial cell images and manual results of the cell images by visual inspection. This research work was funded by UGC-SWRO, Bangalore vide No. MRP(S)-715/2010-11/KAGU009/UGC-SWRO.
Aneja, K. R. (2002) Experiments in Microbiology Plant Pathology Tissue Culture and Mushroom Culture, New Age International Publications, New Delhi, India.
Carolina Wahlby, et al. (2002) Algorithms for cytoplasm segmentation of fluorescence labeled cells, Analytical Cellular Pathology, 24, pp. 101-111.
Dennis Kunkel Microscopy, Inc, Science Stock Photography, http://denniskunkel.com/DK/Bacteria/
Hiremath P.S. and Parashuram Bannigidad (2009) Automated Gram-staining Characterization of Digital Bacterial Cell Images, Proc. Int'l. Conf. on Signal and Image Processing (ICSIP 2009), pp. 209-211.
Hiremath P.S. and Parashuram Bannigidad (2010) Automatic Identification and Classification of Bacilli Bacterial Cell Growth Phases, IJCA Special Issue on Recent Trends in Image Processing and Pattern Recognition (RTIPPR-2010), Vol. 1 (2), pp. 48-52.
Hiremath P.S. and Parashuram Bannigidad (2010) Automatic identification and classification of Bacterial Cells on Digital Microscopic Images, 2nd International Conference on Digital Image Processing (ICDIP-2010), Proc. of SPIE, Vol. 7546-53, Feb. 26-28, 2010, Singapore, pp. 754613-1-6.
Liu, J., F.B. Dazzo, O. Glagoleva, B. Yu, A.K. Jain (2001) CMEIAS: A Computer-Aided System for the Image Analysis of Bacterial Morphotypes in Microbial Communities, Springer-Verlag, Microb. Ecol. 41, pp. 173-194.
Jeffrey C. Pommerville (2010) Alcamo's Fundamentals of Microbiology, Body Systems Edition, Jones and Bartlett Publishers.
Nicholas Blackburn, et al. (1998) Rapid Determination of Bacterial Abundance, Biovolume, Morphology, and Growth by Neural Network-Based Image Analysis, Applied and Environmental Microbiology, 64(9), pp. 3246-3255.
Pattan Prakash C., V.D. Mytri and P.S. Hiremath (2010) Classification of Cast Iron based Graphite Grain Morphology using Neural Network Approach, 2nd International Conference on Digital Image Processing (ICDIP-2010), Proc. of SPIE, Vol. 7546-53, Feb. 26-28, 2010, Singapore.
Petra Perner (2001) Classification of HEp-2 Cells using Fluorescent Image Analysis and Data Mining, Medical Data Analysis, Springer Verlag, LNCS 2199, pp. 219-224.
Rafael C. Gonzalez and Richard E. Woods (2002) Digital Image Processing, Pearson Education Asia.
Sigal Trattner and Greenspan (2004) Automatic Identification of Bacterial Types Using Statistical Imaging Methods, IEEE Transactions on Medical Imaging, 23(7), pp. 807-820.
Thomas Posch et al. (2009) New image analysis tool to study biomass and morphotypes of three major bacterioplankton groups in an alpine lake, Aquatic Microbial Ecology, 54, pp. 113-126.
Venkataraman, S., et al. (2006) Automated image analysis of atomic force microscopy images of rotavirus particles, Ultramicroscopy, Elsevier, 106, pp. 829-837.
Hiremath P.S. and Parashuram Bannigidad (2010) Digital Image Analysis of Bacilli Bacterial Cell Growth Phases, Journal of Computational Intelligence in Bioinformatics, Vol. 3 No. 2, pp. 137-145.
P.S. Hiremath (1) and Parashuram Bannigidad (2)
(1,2) Department of Computer Science, Gulbarga University, Gulbarga, Karnataka, India
Table 1: The geometric feature values of the cell regions of the image in Fig. 1(d)-(f).

Cell features           Bacilli   Cocci    Spiral
Circularity ($x_1$)     0.3695    0.6595   0.0469
Compactness ($x_2$)     2.7060    1.5163   21.3002
Eccentricity ($x_3$)    0.9640    0.2849   0.9988
Tortuosity ($x_4$)      0.3786    0.2643   0.3847
LW ratio ($x_5$)        3.7630    1.0432   20.2628

Table 2: Mean and standard deviation of geometric features of bacterial cells of types: Bacilli, Cocci and Spiral.

Cell features           Bacilli                                Cocci                                  Spiral
                        Mean ($\bar{x}_i^1$)  SD ($\sigma_i^1$)   Mean ($\bar{x}_i^2$)  SD ($\sigma_i^2$)   Mean ($\bar{x}_i^3$)  SD ($\sigma_i^3$)
Circularity ($x_1$)     0.4425                0.2203              0.6427                0.0236              0.0881                0.0239
Compactness ($x_2$)     2.5441                0.7594              1.5580                0.0581              12.3827               3.9769
Eccentricity ($x_3$)    0.9425                0.0373              0.4810                0.1407              0.9846                0.0086
Tortuosity ($x_4$)      0.1220                0.0433              0.2369                0.0144              0.0628                0.0211
LW ratio ($x_5$)        3.4023                0.9740              1.1761                0.1397              7.7203                5.0824

Table 3: Classification accuracy for the different bacterial cells in the testing set images.

Bacterial    No. of cells     Classification accuracy (%)
cell type    in test images   3σ classifier   K-NN (K=1)   K-NN (K=3)   Neural Network classifier   Fuzzy classifier
Bacilli      270              96%             96%          96%          98%                         99%
Cocci        100              94%             90%          90%          90%                         100%
Spiral       90               95%             85%          85%          100%                        100%
Author: Hiremath, P.S.; Bannigidad, Parashuram
Publication: International Journal of Computational Intelligence Research
Date: Jan 1, 2011