Tips for Studies with Quantitative Morphology (Morphometry and Stereology).
Morphologists (anatomists, histologists, cell biologists, electron microscopists, pathologists and others) have for some time been using and developing quantitative approaches that have made it possible to estimate the regular composition of various organs and tissues, as well as to verify how external agents or diseases can modify the structure.
In the current text, 'morphometry' (from the Greek morphe, 'shape, form,' and metria, 'measurement') is used whenever a ruler is used to measure lengths (distances). The ruler can be microscopic (stage micrometer). 'Stereology' refers to the quantitative analysis of 3-D objects based on their 2-D appearance on cut sections. 'Stereo-' is a combining form from the Greek stereos, 'solid,' used in compound words to convey three-dimensionality.
Stereology traditionally followed a 'model-based' approach, whereas more recent stereological techniques are 'design-based.' Classical stereology uses frames, points, and lines for data acquisition. The data are entered into previously defined formulas (derived from probabilistic geometry). The number of points and lines, and even the size of the test area, can be modified for data acquisition. Some may think that simply increasing the number of points or lines improves the stereological analysis but, as we will see, the relationship is not linear.
Morphometry and stereology should be carried out on an adequate number of cases and samples, respecting randomness and distribution. However, the literature contains examples that have visibly neglected these methodological principles and report results that cannot be trusted.
The purpose of the present text is not to teach how to perform a point count or how to consider the intercepts customary in stereological studies, nor is it intended as a manual of stereology. Several references explain how to execute the stereological techniques: textbooks (Weibel, 1979; Aherne & Dunnill, 1982; Elias et al., 1983; Russ & Dehoff, 2000; Howard & Reed, 2005; Mouton, 2011; West, 2012) and articles (Abercrombie, 1946; Sterio, 1984; Gundersen et al., 1988a; Gundersen et al., 1988b).
The present manuscript aims to fill the gap between the initial idea and the execution of the project, informing about the numerous possibilities of application of quantitative methods in morphology.
Statistical background. A random variable is a variable that has a unique value (determined randomly) for each result of an experiment. The word 'random' indicates that, in general, we only know the value after the experiment is performed.
Quantitative variables can be measured on a quantitative scale, that is, they have numeric values that make sense. They can be 'continuous' or 'discrete.'
a) Continuous variables: measurable information that takes values on a continuous scale, for which fractional values make sense. Usually, they are measured with an instrument. Examples: mass (balance), height (ruler), time (clock), blood pressure, age.
b) Discrete variables: measurable characteristics that can take on only a finite or countably infinite number of values, so that only whole values are meaningful. They are usually the result of counts. Examples: number of children, number of steps on a ladder.
Besides these two types of variables, there is a third possibility, the 'nominal' variable (the result of an interview, for example), but the nominal variable does not exist in the quantitative morphological world. In anatomopathological diagnosis, 'scores' are sometimes used to 'quantify' lesions. Scores are qualitative variables (ranked categories rather than measurements) that try to express the severity of tissue change in increasing numbers (score 0, score 1, score 2). The biggest problem with scores is that they are subjective and depend on the experience of the pathologist who does the analysis, so they are poorly reproducible. It is wrong to calculate the mean and standard deviation of lesion scores, and these values cannot be used in parametric statistical tests (this will be commented on below).
Roughly speaking, continuous random variables are found in studies with morphometry, whereas discrete random variables are more common in stereological studies (because they are based on the counts of points and intercepts).
Continuous random variables are often approximately normally distributed, with an expected population mean [mu] and a standard deviation [delta]; by the central limit theorem, the means of sufficiently large samples tend toward a normal distribution. A discrete variable has an expected value (the mean M approached as the sample size increases), calculated as the sum of (value x probability) over all possible values (the product value x probability is computed separately for each value and the products are then added). The [delta] of a discrete variable describes the spread, or variability, of the data; it is computed with the same [delta] equation, replacing [mu] with the expected value. Fortunately, discrete random variables in stereological studies behave like continuous random variables when enough data are collected, that is, they can be approximated by a normal distribution.
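The expected value and spread of a discrete variable described above can be sketched in a few lines of Python. The distribution used here (number of test points hitting a structure, with made-up probabilities) is purely hypothetical and serves only to illustrate the sum of value x probability.

```python
import math

def discrete_mean_sd(values, probabilities):
    """Expected value and standard deviation of a discrete random variable."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Expected value: sum of (value x probability) over all values
    expected = sum(v * p for v, p in zip(values, probabilities))
    # Variance: probability-weighted squared distance from the expected value
    variance = sum(p * (v - expected) ** 2 for v, p in zip(values, probabilities))
    return expected, math.sqrt(variance)

# Hypothetical distribution: number of test points (0 to 4) hitting a structure
hits = [0, 1, 2, 3, 4]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
expected_hits, sd_hits = discrete_mean_sd(hits, probs)
print(f"expected value = {expected_hits:.2f}, SD = {sd_hits:.2f}")
```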
A population is a group of individual units with some commonality, for example, diabetics. Imagine a study to estimate the glomerular size in people with diabetes. Persons with diabetes would be the population analyzed in the survey, but it would be impossible to collect information from all people with diabetes in a country. Therefore, we select individuals from whom to collect the data, which is called sampling. If the group from which the data are drawn is a representative sample of the population, we may generalize the results of the study to the population. As we can note, defining the sample size (number of individuals) is a crucial step of the research.
We always study a sample (not the population) in quantitative morphological work. It is essential to analyze data that are 'normally' distributed, since a population has an expected mean [mu] and a standard deviation [delta]. The mean is the arithmetic average of the values and can be biased by extreme values. Therefore, the median is a more robust measure of location and more suitable for irregularly shaped distributions (Krzywinski & Altman, 2013a). The [delta] is calculated from the squared distance of each value from the mean, and [[delta].sup.2] is the variance.
Usually, we can draw various samples from the population, each one with a mean M and a standard deviation SD (note that [mu] and [delta] refer to the population, whereas M and SD refer to the sample). Hence, we can have [M.sub.1] and S[D.sub.1] for sample 1, [M.sub.2] and S[D.sub.2] for sample 2, and so on (the sample M will vary from sample to sample). Any one sample is only one of many possible samples of the population. The way M varies among samples is the 'sampling distribution' of the mean. The SD of the sampling distribution tells us how much the sample M will vary, and it is called the standard error (SE) of the estimate of M. The SE of M depends on both the SD and the sample size n (SE = SD/[square root of n]). Error bars may show confidence intervals, SE, SD, or other quantities. Different types of error bars give entirely different information, so the figure legends must make clear what the error bars represent (Cumming et al., 2007).
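As a minimal sketch of the relation between M, SD, and SE, the following Python lines summarize one sample. The five glomerular diameters are invented values used only for illustration; `statistics.stdev` computes the sample SD (n - 1 in the denominator).

```python
import math
import statistics

def summarize(sample):
    """Return the mean (M), sample standard deviation (SD), and standard error (SE)."""
    n = len(sample)
    m = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample SD, uses n - 1
    se = sd / math.sqrt(n)          # SE = SD / sqrt(n)
    return m, sd, se

# Hypothetical glomerular diameters (micrometers) from one sample of n = 5
glomerular_diameters_um = [98.0, 102.0, 100.0, 97.0, 103.0]
m, sd, se = summarize(glomerular_diameters_um)
print(f"M = {m:.1f}, SD = {sd:.2f}, SE = {se:.2f}")
```

Because SE divides SD by the square root of n, quadrupling the sample size only halves the SE, while SD itself is not expected to shrink.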
The question is whether the sample is representative of the population. If a second sample is taken from the same population, will the results be consistent with the first sample? It is easy to understand that we should always do a 'pilot' study to define the best sample to be examined, the one that best represents the population.
Increasing the number of individuals in the sample is a general recommendation that improves statistical analysis (a larger sample usually has a better chance of being close to the population). The SE falls as 'n' increases (the extent of chance variation is reduced), but the SD does not tend to change as we increase 'n.' As the sample size increases, or as the experiment is repeated more times, the resulting M will tend to approximate the true [mu] (the mean of the population). Likewise, when the experiment is repeated several times, the resulting SD will tend to approximate the actual [delta] of the population. M and SD do not change systematically with 'n'; thus, we can use SD as the best estimate of the unknown deviation of the population, whatever the value of 'n' (Cumming et al., 2007). When the 'power' is low, only large effects can be detected, and negative findings cannot be reliably interpreted. A power analysis is used to estimate the sample size. Using too many animals conflicts with the current guidelines of experimentation (it wastes animals, time, and effort, and is not ethical). Conversely, if too few animals are studied, the experiment may lack power and miss a scientifically relevant effect. In this case, the sample size can be increased to facilitate the observation of the effects of interest (Krzywinski & Altman, 2013b).
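To give an idea of what a power analysis returns, the sketch below uses the common normal-approximation formula for comparing the means of two groups, n = 2 x [(z for alpha/2 + z for power) / effect size] squared. This formula is not taken from the text; it is an assumption chosen for illustration, and the effect size (expected difference divided by the SD) must come from a pilot study or the literature. Dedicated software, or a formula based on the t distribution, gives slightly larger numbers for small samples.

```python
import math
from statistics import NormalDist

def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate cases per group for a two-sided comparison of two means.

    effect_size is the expected difference between groups divided by the SD
    (normal approximation, an illustrative assumption, not the authors' method).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

for d in (0.5, 1.0, 2.0):
    print(f"effect size {d}: about {n_per_group(d)} cases per group")
```

The sketch makes the trade-off explicit: halving the effect size to be detected roughly quadruples the number of cases needed.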
In a well-defined sample, the data should have a normal distribution. The normal distribution implies that the data follow specific known patterns and that the data set can be summarized by measures (M and SD) that help in its interpretation. SD indicates how much the probability distribution disperses around M. A large SD reflects considerable dispersion, whereas a smaller SD reflects less dispersion, with values relatively close to the mean. In a normal distribution, about 38 % of values fall within [+ or -] 0.5 SD. Widening the interval captures more and more values: 68 % of values fall within [+ or -] 1 SD, 95 % within [+ or -] 2 SD, and 99.7 % within [+ or -] 3 SD (Krzywinski & Altman, 2013a). It is easy to understand that, with the same M = 10, the dispersion of the data differs if SD = 2 or SD = 1 (Fig. 1).
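The coverage of a normal distribution within a given number of SDs can be checked numerically. The short sketch below relies on the identity that the fraction within [+ or -] k SD equals erf(k / sqrt(2)); the function name is only illustrative.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within +/- k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (0.5, 1, 2, 3):
    print(f"+/- {k} SD: {100 * fraction_within(k):.1f} %")
```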
Often, tests are applied to check whether the measured values differ between groups (e.g., experimental group vs. control group). The aim of statistics is to assess whether we can reject the null hypothesis ([H.sub.0], no difference) in favor of an alternative hypothesis ([H.sub.alt], a difference exists). A P-value < 0.05 merely informs that an improbable event has occurred in the context of the null assumption. The degree of improbability is evidence against [H.sub.0] and supports [H.sub.alt], i.e., that the sample indeed came from a population with a mean different from [mu]. Statistical significance suggests (but does not imply) biological relevance (Krzywinski & Altman, 2013c).
It was already mentioned that continuous random values readily follow a normal (Gaussian) distribution when they are in sufficient quantity (sample size). If this is confirmed (and if the group variances do not differ significantly, i.e., there is 'homogeneity' of variances or 'homoscedasticity'), we can use parametric tests to check for possible differences between groups (a two-sided P-value is recommended).
Discrete random values, when samples are not large enough, will hardly have a normal distribution, which leads us to nonparametric tests. The problem with nonparametric tests is no longer their execution, since the various statistical software packages available, many of them free, can perform these tests without significant effort. The problem is that their results do not have the strength (power) of parametric tests. When a nonparametric test indicates a difference, the equivalent parametric test would most likely also show the difference. The difficulty arises when the nonparametric test cannot detect a difference (Krzywinski & Altman, 2014). The greater strength of parametric tests is a good reason always to try to improve the samples so that they have a Gaussian distribution.
In summary (Zar, 2010):
a) Large data sets present no problems. It is easy to know whether the data come from a Gaussian population, nonparametric tests are powerful, and parametric tests are robust with large data sets.
b) Small data sets present a problem. It is hard to know whether the data come from a Gaussian population, nonparametric tests are not powerful, and parametric tests are not robust with small data sets.
A recurring and challenging question is how many cases are needed to compose each study group. Cruz-Orive and Weibel have mentioned that at least five cases per group are needed to allow a statistically significant result, because stereology usually is based on discrete random values, which often fall into a 'binomial distribution' (each value may increase or decrease) (Cruz-Orive & Weibel, 1990). Thus, the probability that all n cases change in the same direction by chance is calculated as P = [(1/2).sup.n]. With n = 5, P = 1/32 = 0.031, which is below 0.05, whereas with n = 4, P = 1/16 = 0.0625.
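The rule of thumb of five cases per group follows directly from P = [(1/2).sup.n]. A minimal sketch, assuming the usual significance threshold of 0.05 (the function name is illustrative):

```python
def smallest_n(alpha=0.05):
    """Smallest n for which P = (1/2)**n falls below the significance level."""
    n = 1
    while 0.5 ** n >= alpha:
        n += 1
    return n

for n in range(3, 7):
    print(f"n = {n}: P = {0.5 ** n:.4f}")
print("minimum cases per group at alpha = 0.05:", smallest_n())
```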
Morphometry. In this text, the term 'morphometry' refers to several techniques of measuring objects (although 'morphometry' is used by others in the sense of 'stereology') (Aherne & Dunnill, 1982). It was already mentioned that rulers are used for making measurements (e.g., a macroscopic measurement may use a Vernier caliper (or pachymeter), while a microscopic measurement may use a stage micrometer). For example, ophthalmologists routinely perform corneal pachymetry to measure the thickness of the cornea. Likewise, tailors take the client's body measurements before they cut the cloth to make the clothes. Measurements based on photographs are problematic when the ruler and the subject are not in the same plane of focus (in police photos, the ruler is attached to the back wall and the person stands in front of it), which leads to significant distortions.
It was already remarked that morphometric measures are usually continuous random values. Currently, with digital images, several software packages are available to make measurements. It should be remembered that the software 'does not' make the measurement (we do!). It is up to the researcher to give the necessary information for the proper execution of the method, including calibrating the software correctly (usually with a known standard, for example, the stage micrometer in microscopes). If the software is not precisely calibrated (each objective of the microscope must be calibrated independently), the result will hardly be correct. As a rule in the lab, each investigator must perform their own calibration when making morphometric measurements and not use the calibration of their colleagues. In this way, each one is responsible for their results. As will be discussed further, with digital images the image size (pixels/inch) and format (JPEG, TIFF, other) substantially change the calibration value, affecting the measurements.
Another technical problem that affects the accuracy of the measurements is that the tissue undergoes dramatic dimensional changes during processing and microtomy. Sometimes, correction factors should be determined for retraction (shrinkage), which occurs in 3-D, and for compression, which happens in the direction of the cut of the microtome knife (Weibel, 1979). The expansion of the paraffin-embedded material after microtomy (usually in a water bath) is also not uniform throughout the preparation of the histological slides and may introduce a bias when morphometry is performed on the slides (Mandarim-de-Lacerda, 1987). Material embedded in resin undergoes less retraction and compression. Epon-embedded material, for example, is highly satisfactory, with a volume change of only 3-5 percent (Aherne & Dunnill, 1982).
Also, a measurement should be made between appropriate reference points. When the references are well marked, the measurement is straightforward (e.g., the length of the tibia between the tubercles of the intercondylar eminence and the extremity of the medial malleolus, or the diameter of the glomerulus in the equatorial plane passing through the urinary pole or the vascular pole). However, when the references are not well defined and depend on the observer's choice, the measurements are less reliable (the same observer, at different times, can 'choose' different reference points and, consequently, the measurements will differ). Imagine the variability with several investigators.
Whenever a morphometric measurement is made, the reference points must be carefully defined so that all observers use them, thus reducing the variability due to the method; this step is crucial for the results. Remember, 'nothing is stronger than its weakest part.' From this point of view, stereology is preferable to morphometry, which will become more evident with the example of adipocytes (or glomeruli, or other nearly spherical structures), a case that lies between morphometry and stereology.
Figure 2 illustrates a typical problem of measuring diameters in relatively circular structures. Hypothetically, a globular structure (nearly a sphere) was sectioned with four consecutive cuts of thickness less than the diameter of the structure (numbers 1 to 4). It is easy to see that only the sections passing through the equatorial plane of the structure yield an image compatible with the actual size of the structure. The sections farthest from the equatorial plane yield profiles smaller than the actual size of the structure. However, the structures are dispersed in the tissue and are randomly sectioned by the cutting planes. Some structures are cut at the poles (as in sections 1 and 4), while other structures are sectioned near the equatorial plane. Looking at the histological section, an observer may conclude that there are small and large structures when, in fact, the structures are all the same size.
A plausible solution to overcome the problem is to use stereology. The average sectional area of an object ([bar.a]) is the ratio between its Vv [structure, tissue] and twice its numerical density per area (N/[A.sub.T]) ([A.sub.T] = test area).
$$\bar{a} = \frac{V_v}{2 \times N/A_T}$$
With the use of stereology to estimate the average sectional area of an object, it is not necessary to define or use reference points, which reduces the bias of the measurement. Instead, stereology relies on probabilistic statistics and on the dispersion and size of the structure.
The 'loss of caps' is another artifact that arises when independent objects are analyzed in sectioned material embedded in paraffin or similar media. Not all the objects can be seen on the slide because some were lost during processing (Hedreen, 1998). The 'caps' of objects that have been sectioned tangentially are lost when chemical agents remove the paraffin. Consequently, fewer objects than the correct number are observed in the section, and the result is distorted.
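The estimate of the average sectional area can be sketched as follows, using the relation given in the text (Vv divided by twice the numerical density per area). The adipocyte counts and the test area are hypothetical values chosen only to show the arithmetic and the units.

```python
def mean_sectional_area(vv, n_profiles, test_area):
    """Average sectional area: Vv / (2 x N/A_T), in the units of test_area."""
    numerical_density_per_area = n_profiles / test_area  # N/A_T
    return vv / (2 * numerical_density_per_area)

# Hypothetical adipose tissue field: Vv = 0.80, 40 profiles in 100,000 um^2
a_bar = mean_sectional_area(vv=0.80, n_profiles=40, test_area=100_000.0)
print(f"average sectional area = {a_bar:.0f} um^2")
```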
Digital images. Film on a celluloid base has practically disappeared. Today, digital images can be obtained with various cameras, in different formats and sizes (pixels/inch). However, it is still correct to consider that photography (photomicrography) depends on the lighting and on the quality of the lenses of the equipment.
Photomicrographs (traditional, with film, or digital) are not just photographs taken with the microscope. To be useful in quantitative studies, photomicrographs must be obtained with an accurate technique on quality equipment. The light microscope is a precision instrument that must be operated by someone who knows how to use it. The ease of automated equipment makes it appear that everything is just a matter of pressing buttons, but this is not entirely so. The microscope for professional photomicrographs should have at least flat-field (plan achromatic) objectives (plan apochromatic objectives are recommended). The observer should adjust the lighting for Köhler illumination. The entire optical system must be clean, free of oil and dust.
The sensor size of digital cameras was a limitation in the early period, but now most cameras have large sensors that produce photomicrographs with pixel counts that even exceed everyday needs. The images obtained with the microscope can be stored in various formats, usually JPEG or TIFF (TIFF is recommended).
The JPEG format (Joint Photographic Experts Group) is a lossy compression format for digital images, in widespread use mainly for images produced by digital photography (including cell phones). The degree of compression can be adjusted, allowing a selectable tradeoff between storage size and image quality. TIFF (Tagged Image File Format) handles images and data within a single file. TIFF is a useful image archive format because it supports layers and, unlike standard JPEG files, uses lossless compression (or none). RAW is another image format that some digital cameras generate. An image made in RAW carries full details of color and brightness that photos in other formats do not have. Therefore, images saved in RAW are large. Because it records a vast range of colors and intensities, shooting in RAW can bring full fidelity to photos. RAW is a required format for the expert (forensic) examination of photographs because it stores all information about equipment and manipulations. However, whatever the format in which the images are saved, they must carry all the information needed to be reproduced (printed) with quality.
Warning: if digital images are used to measure structures (morphometry or stereology) and compare groups, all the images should be made in the same format and the same size (pixels), and with the same equipment if possible.
Classical stereology. The 'model-based stereology' (MBS) assumes the material is 'statistically homogeneous.' Thus, MBS is well suited to most applications in materials science, geological science, food science and other fields, which focus attention on the typical contents of a homogeneous material, having virtually infinite extent on the scale of observation (Baddeley & Jensen, 2005).
Points within a frame can be used to estimate the 'volume density' (Vv) of a structure. Lines within a frame and their intercepts can be used to assess the 'surface density' (Sv). Knowing the frame area is essential for counting structures that occupy the interior of the frame. The 'length density' (Lv) is estimated by counting the transects, within the test area, of objects that have a length character (Weibel, 1979; Howard & Reed, 2005).
Warning: only with randomness is there a chance to make quantitative estimates that are valid from the statistical point of view. Therefore, the sampling of the tissue and of the sections should follow a design named 'isotropic and uniformly random' (IUR sections).
'Isotropic' and 'anisotropic' are two adjectives used to describe the properties of materials (stereologists have borrowed these terms from geology). 'Isotropy' means uniformity in all directions. Anisotropy is the opposite of isotropy: the measured properties of the material depend on the direction (they differ in various directions).
There are isotropic organs and tissues (i.e., they have the same histological appearance no matter how they are oriented when embedded and cut). Liver, salivary glands, exocrine pancreas, and others are isotropic. Other organs (and tissues) are anisotropic (i.e., their appearance changes depending on how they are oriented when embedded and sectioned). Skeletal muscle is anisotropic, as are all stratified organs and tissues.
It is easier to achieve IUR sections in isotropic tissues than in anisotropic ones. In anisotropic tissues, to overcome this difficulty, methodological procedures have been proposed such as the 'orientator' sampling (Mattfeldt et al., 1990), or the so-called 'vertical sections' analyzed with 'cycloid arcs' (see figure 5D) (Baddeley et al., 1986).
Also, some stereological estimates are more robust than others. For example, the estimation of Vv [structure, tissue] accepts without significant bias the fact that the sections are not entirely IUR. On the contrary, the Sv [structure, tissue] is sensitive and requires IUR sections in its execution.
Increasing the number of points, or frame lines, increases the likelihood that one of these points (or lines) hits the studied structure. However, it should be considered that increasing the number of points or lines counted in a section only refines the estimate within that subject; it does not reduce the variability among subjects. The analysis is improved only by increasing the number of cases in the sample, not the counts in a subject. This idea was formulated as 'do more, less well,' encompassing the concept of efficiency in a stereological study (Gundersen & Osterby, 1981).
Estimating a volume. The volume is a significant 3-D estimate. For large organs, it is readily obtained using the 'Archimedes principle' (displacement of a liquid). However, the volume of a small organ (of an experimental animal, for example) requires a little more work to be estimated.
A procedure resembling the Archimedes principle has been proposed to estimate the volume of small organs (Scherle, 1970). A graduated vessel filled with physiological saline (specific density approximately equal to 1) is placed on a scale, and the organ is put into the saline suspended by a thread so that it touches neither the walls nor the bottom of the vessel. The increase in the reading of the scale corresponds to the mass of the saline displaced by the organ. Volume and mass relate to each other through the density, and so:
Volume = mass/density
Therefore, using the above-described apparatus, volume ([cm.sup.3]) = mass (g).
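As a minimal sketch of this calculation, the function below converts the mass recorded on the scale into a volume. The default density of 1.0 g/[cm.sup.3] is the approximation used in the text; the parameter is exposed so that a measured density of the liquid can be supplied instead. The mass in the example is hypothetical.

```python
def scherle_volume_cm3(displaced_mass_g, liquid_density_g_per_cm3=1.0):
    """Organ volume (cm^3) from the mass of displaced liquid (g): volume = mass / density."""
    if liquid_density_g_per_cm3 <= 0:
        raise ValueError("density must be positive")
    return displaced_mass_g / liquid_density_g_per_cm3

# Hypothetical reading: the scale increases by 1.25 g when the organ is suspended
organ_volume = scherle_volume_cm3(1.25)
print(f"organ volume = {organ_volume:.2f} cm^3")
```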
However, the procedure is not indicated to estimate the volume of a structure within an organ (or tissue) that can only be evaluated on cuts to the organ (a granuloma, for example), or a microscopic structure.
Bonaventura Cavalieri was an Italian cleric (a Jesuate), mathematician, and geometer of the 17th century, considered a disciple of Galileo, who worked on the beginnings of the infinitesimal (integral) calculus. Cavalieri gave stereology the Cavalieri principle, or Cavalieri estimator of volume. Cavalieri sections (and, more recently, Cavalieri slices, especially in combination with noninvasive scanning) are widely used to estimate volumes (Cruz-Orive, 1999). The volume of an object exhaustively sectioned in a series of cuts is the product of the sum of the cut areas (from the first to the last section, the sectional area of the structure 'i', [A.sub.i], should be measured) and the thickness of the section (t). Thus:
$$\mathrm{Vol}[i] = t \times \sum_{i=1}^{n} A_i$$
Various techniques can be used to determine [A.sub.i]. In digital images, any image analysis software allows planimetry and the determination of areas. [A.sub.i] may also be estimated by counting points: a number 'x' of equidistant points is drawn inside a frame of known area ([A.sub.T]), so that each point represents the fraction 1/x of [A.sub.T]. The number of points that hit the structure, multiplied by the area per point ([A.sub.T]/x), estimates [A.sub.i] (the area of the structure 'i') in the section (Fig. 3).
To facilitate the understanding of the text, it is necessary to introduce here the notation used in stereology. V is used for the volume, L for the length, S for the surface (area), and N for the number. A relative parameter (or density) is indicated by placing 'v' after the parameter letter; thus, Vv is a volume density. The structure and the organ are indicated in brackets. Then, Vv [steatosis, liver] designates the volume density of steatosis in the liver.
The 'reference volume' (Vol [ref], of an organ or tissue) is a vital quantity that should always be mentioned in the stereological estimates. Also, absolute quantities (V, L, S, N) can be obtained by multiplying the relative parameters (Vv, Lv, Sv, Nv) by the Vol [ref].
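The Cavalieri estimate combined with point counting can be sketched as below. The point counts, the frame, and the thickness are hypothetical; 't' is taken here as the distance between the consecutive sections that were measured, which equals the section thickness when the object is exhaustively sectioned as described in the text.

```python
def cavalieri_volume(points_per_section, frame_area, points_in_frame, thickness):
    """Volume = t x sum(A_i), with each A_i estimated by point counting."""
    area_per_point = frame_area / points_in_frame           # A_T / x
    areas = [p * area_per_point for p in points_per_section]  # A_i per section
    return thickness * sum(areas)

# Hypothetical object cut into 5 sections, 0.5 mm apart;
# frame of 36 mm^2 with 36 points (each point represents 1 mm^2)
hits = [4, 9, 12, 9, 4]
volume = cavalieri_volume(hits, frame_area=36.0, points_in_frame=36, thickness=0.5)
print(f"estimated volume = {volume:.1f} mm^3")
```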
Frames and Length density. It is common to count how many structures are in a frame (test area). However, not all structures lie wholly within the frame. The counts may be underestimated (or overestimated) depending on how objects lying partially within the frame are treated (the 'edge effect'). Therefore, objects that touch two consecutive sides of the frame, or the extensions of these sides, should be disregarded, whereas objects that cross the other two sides of the frame should be counted. In practice, the two consecutive sides of the frame are called 'forbidden lines,' and nothing cut by these lines is considered in the counts (Fig. 4) (Gundersen, 1977).
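The counting rule can be illustrated with a simplified sketch. Two assumptions are made that are not in the text: each object is approximated by its axis-aligned bounding rectangle, and the left and bottom sides of the frame are taken as the forbidden lines (which pair of adjacent sides is used is a matter of convention). For rectangles, an object that crosses an extension of a forbidden line and still enters the frame necessarily touches the forbidden side itself, so the extensions need no separate test here.

```python
from typing import NamedTuple

class Box(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float

def is_counted(obj: Box, frame: Box) -> bool:
    """True if the object lies at least partly in the frame and touches no forbidden line."""
    intersects_frame = (obj.xmax >= frame.xmin and obj.xmin <= frame.xmax and
                        obj.ymax >= frame.ymin and obj.ymin <= frame.ymax)
    # Forbidden lines (assumed convention): left and bottom sides of the frame
    touches_forbidden = obj.xmin <= frame.xmin or obj.ymin <= frame.ymin
    return intersects_frame and not touches_forbidden

frame = Box(0, 0, 100, 100)
objects = [
    Box(10, 10, 20, 20),      # wholly inside: counted
    Box(90, 50, 110, 60),     # crosses the right side: counted
    Box(50, 95, 60, 105),     # crosses the top side: counted
    Box(-5, 40, 5, 50),       # crosses the left (forbidden) side: not counted
    Box(40, -5, 50, 5),       # crosses the bottom (forbidden) side: not counted
    Box(150, 150, 160, 160),  # outside the frame: not counted
]
count = sum(is_counted(o, frame) for o in objects)
print("objects counted:", count)
```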
A 2-D measurement can eventually give some information about the tissue (primarily when the tissue has no clear boundaries) as the numeric density per area (QA, from German 'Querschnitte' or transects in the test area). The area of the frame is known, then it is straightforward to make a count (considering the 'forbidden lines').
The design and application of different frames with points, lines, or points and lines in stereological research has been facilitated by STEPanizer (http://www.stepanizer.com/), a web application for constructing frames with points and lines (Tschanz et al., 2011). It is easy to vary the number of test points or test lines according to the needs of the study. However, STEPanizer does not answer the question: do we increase the counts in a subject, or do we increase the number of cases in the sample? It is up to the researcher to define the sample. With this thought, a 'multi-purpose test system' was designed around fifty years ago by Weibel and coworkers, to the delight of a generation of stereologists (Weibel et al., 1966).
The number of points (and lines) to count on the images depends on the abundance of the structure studied. Currently, the images are digital (STEPanizer can use the JPEG format). A structure that occupies more of the tissue (i.e., that has a higher Vv [structure, tissue]) requires a frame with fewer points (or lines). The opposite occurs if the structure is rarer and less likely to be sampled. Figure 5 illustrates how easily frames with points or lines are constructed with STEPanizer.
Lv [structure, tissue] is calculated in IUR sections for structures having a length (vessels, axons, fibers). Lv [structure, tissue] has as unit mm / [mm.sup.3] or [mm.sup.-2].
$$L_v = 2 \times Q_A = 2 \times \frac{N}{A_T} \qquad (A_T = \text{test area})$$
Therefore, the total length of the structure in a reference volume of the tissue (or organ) is estimated by the relation:
L = Lv x Vol[ref]
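A minimal sketch of the two relations above, with hypothetical counts for capillary transects: Lv is obtained from the number of transects per test area, and the total length from Lv and the reference volume. Units must be consistent (here mm, [mm.sup.2], and [mm.sup.3]).

```python
def length_density(n_transects, test_area):
    """Lv = 2 x Q_A = 2 x N / A_T (length per unit volume)."""
    return 2 * n_transects / test_area

def total_length(lv, reference_volume):
    """L = Lv x Vol[ref]."""
    return lv * reference_volume

# Hypothetical field: 25 vessel transects in a test area of 0.05 mm^2,
# reference volume of 200 mm^3
lv = length_density(n_transects=25, test_area=0.05)
length_mm = total_length(lv, reference_volume=200.0)
print(f"Lv = {lv:.0f} mm/mm^3, total length = {length_mm / 1000:.0f} m")
```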
Points and Volume density. The volume density is the simplest (and fastest) way to obtain useful quantitative information. Counting points can estimate Vv as the ratio between the number of points that hit the structure (partial points, or [P.sub.P]) and the total number of points ([P.sub.T]) (the unit is [mm.sup.0], i.e., dimensionless, or it is expressed as a percentage):
$$V_v = \frac{P_P}{P_T}$$
The total volume of the structure in a reference volume of the tissue or organ can be calculated as:
Volume = Vv x Vol [ref]
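The point-counting estimate and the conversion to an absolute volume can be sketched as follows; the counts and the reference volume are hypothetical.

```python
def volume_density(points_on_structure, total_points):
    """Vv = P_P / P_T (dimensionless fraction between 0 and 1)."""
    if not 0 <= points_on_structure <= total_points:
        raise ValueError("points on the structure must lie between 0 and the total")
    return points_on_structure / total_points

def absolute_volume(vv, reference_volume):
    """Volume of the structure = Vv x Vol[ref]."""
    return vv * reference_volume

# Hypothetical counts: 84 of 400 test points hit steatosis; liver volume 1,500 mm^3
vv = volume_density(84, 400)
steatosis_volume = absolute_volume(vv, 1500.0)
print(f"Vv = {100 * vv:.0f} %, volume = {steatosis_volume:.0f} mm^3")
```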
The advent of various image analysis systems has changed the way Vv [structure, tissue] is estimated, often by relating the area of the structure to the total area of the image. Among the available software, ImageJ is free (National Institutes of Health webpage: https://imagej.nih.gov/ij/download.html).
This is possible because of 'Delesse's principle' (Achille Ernest Oscar Joseph Delesse, a 19th-century French geologist). Delesse showed that the 'volume density' (Vv) is equal to the 'area density' (Aa) (Delesse, 1847). Afterward, the concept was expanded to other quantitative relationships. Therefore, Vv can be obtained by relating any partial measure (in the structure, indicated with a 'P' after the parameter designation) to its comparable total measure (in the tissue, indicated by a 'T' after the parameter designation).
$$\frac{P_P}{P_T} = \frac{L_P}{L_T} = \frac{A_P}{A_T} = \frac{V_P}{V_T}$$
In the study of nonalcoholic fatty liver disease, for example, the estimate of Vv [steatosis, liver] by 'point counting' is as accurate as that by 'image analysis' (and even faster to perform) (Catta-Preta et al., 2011; St Pierre et al., 2016).
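The equivalence between the area fraction (as measured by image analysis) and the point fraction can be illustrated on a synthetic binary image. The image (a disc on a 100 x 100 pixel field) and the 10 x 10 point grid are hypothetical and are not meant to reproduce ImageJ or STEPanizer; they only show the two ways of obtaining the same ratio.

```python
def make_disc_image(size=100, center=50, radius=30):
    """Synthetic binary image: True where the pixel belongs to the 'structure'."""
    return [[(x - center) ** 2 + (y - center) ** 2 <= radius ** 2
             for x in range(size)] for y in range(size)]

def area_fraction(image):
    """A_P / A_T: structure pixels divided by all pixels (image-analysis approach)."""
    total = sum(len(row) for row in image)
    return sum(sum(row) for row in image) / total

def point_count_fraction(image, step=10, offset=5):
    """P_P / P_T: fraction of systematically spaced test points that hit the structure."""
    hits = total = 0
    for y in range(offset, len(image), step):
        for x in range(offset, len(image[0]), step):
            total += 1
            hits += image[y][x]
    return hits / total

image = make_disc_image()
aa = area_fraction(image)
pp = point_count_fraction(image)
print(f"area fraction = {aa:.3f}, point fraction = {pp:.3f}")
```

On a single image a 100-point grid gives only a rough estimate; the agreement between the two approaches reported in the literature refers to counts accumulated over many fields and subjects.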
Warning: points (or lines) must be counted in a set of images per subject, and in several subjects per group. How many points 'n' should be counted on the structure to get meaningful results? An equation relating Vv and 'n' is used to start the study (Hally, 1964; Aherne & Dunnill, 1982) (SE is the relative standard error, which is expected to be 0.05):
$$SE = \sqrt{\frac{1 - Vv}{n}}$$
A pilot study is usually needed to obtain a provisional Vv [structure, tissue] to enter into the equation. In a hypothetical example with Vv equal to 50 % (i.e., Vv = 0.5):
$$0.05 = \sqrt{\frac{1 - 0.5}{n}} \;\Rightarrow\; (0.05)^2 = \frac{0.5}{n} \;\Rightarrow\; n = 200 \text{ points}$$
However, these 200 points are points that hit the structure in each group. If the structure occupies only 50 % of the tissue, the total number of test points is obtained by dividing the calculated 'n' by Vv (here, 0.5). Thus, the corrected 'n' is 400 points. Figure 6 illustrates the results of Hally's equation for all possible Vv (in %).
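The calculation above can be repeated for other pilot values of Vv. The following Python sketch solves the relation SE = sqrt((1 - Vv)/n) for 'n' and then applies the correction n/Vv. The function names are hypothetical, and the small rounding step is only there to avoid floating-point artefacts before rounding up.

```python
import math


def points_on_structure(vv, rel_se=0.05):
    """Points that must hit the structure: n = (1 - Vv) / SE^2."""
    if not 0 < vv < 1:
        raise ValueError("vv must be a fraction strictly between 0 and 1")
    return math.ceil(round((1 - vv) / rel_se ** 2, 6))


def total_test_points(vv, rel_se=0.05):
    """Corrected number of test points over the whole tissue: n / Vv."""
    return math.ceil(round(points_on_structure(vv, rel_se) / vv, 6))


# Pilot estimates of Vv (hypothetical) and the point counts they imply.
for vv in (0.05, 0.10, 0.25, 0.50, 0.75):
    n = points_on_structure(vv)
    print(f"Vv = {vv:4.0%}: n = {n:4d} on the structure, {total_test_points(vv):5d} in total")
```

For Vv = 0.5 the sketch returns 200 and 400 points, as in the worked example. The total rises steeply as Vv decreases, so rare structures need denser grids or more fields.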
Lines and Surface density. Sv [structure, tissue] is more complicated to obtain than Vv [structure, tissue] because section orientation and randomness have a significant influence on this parameter. Sv [structure, tissue] provides suitable results only with IUR sections.
Sv [structure, tissue] expresses the surface area of the structure per unit of reference volume. The estimate relies on the 'Buffon's needle' concept (Georges-Louis Leclerc, Comte de Buffon, 18th-century French naturalist, mathematician, and cosmologist). On a floor made of parallel strips of wood of the same width, with needles dropped at random onto the floor, the probability that a needle crosses the line between two strips can be calculated. By the same reasoning, the probability that test lines of known length within the frame intersect the boundaries of the structure (the intercepts) can also be estimated.
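Buffon's experiment is easy to simulate. The sketch below is a Monte Carlo illustration, not part of any stereological protocol. It assumes a needle no longer than the strip width and compares the simulated crossing fraction with the classical result 2l/(pi d), a textbook formula that is not given in the text above. The function name and parameters are hypothetical.

```python
import math
import random


def buffon_crossing_fraction(needle_length, strip_width, n_throws, rng):
    """Fraction of randomly thrown needles that cross a line between two strips.

    Assumes needle_length <= strip_width.
    """
    crossings = 0
    for _ in range(n_throws):
        # Distance from the needle centre to the nearest line (uniform),
        # and acute angle between the needle and the lines (uniform).
        distance = rng.uniform(0, strip_width / 2)
        theta = rng.uniform(0, math.pi / 2)
        if distance <= (needle_length / 2) * math.sin(theta):
            crossings += 1
    return crossings / n_throws


rng = random.Random(7)
length, width = 1.0, 2.0
simulated = buffon_crossing_fraction(length, width, 100_000, rng)
expected = 2 * length / (math.pi * width)
print(f"simulated = {simulated:.4f}, 2l/(pi d) = {expected:.4f}")
```

The crossing count depends only on the needle length and the line spacing. This is the idea that allows intercept counts on test lines of known length to be converted into a boundary measurement.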
A frame containing a set of lines is used to assess Sv [structure, tissue]. The lines can be horizontal, vertical, both horizontal and vertical, incomplete, straight or curved. STEPanizer can draw all possibilities, increasing or decreasing the number of lines.
$$Sv = \frac{2 \times I}{L_T}$$
Sv [structure, tissue] is the ratio between twice the number of intercepts of the structure with the test line (I) and the total line length ($L_T$) (the unit is $\mathrm{mm}^2/\mathrm{mm}^3$ or $\mathrm{mm}^{-1}$). The full surface of the structure is determined by multiplying the surface density by the reference volume:
$$S = Sv \times \mathrm{Vol}_{\mathrm{ref}}$$
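As a minimal sketch of the two surface formulas, the Python functions below take an intercept count and a total test-line length and return Sv and S. The names and the example values are hypothetical. The line length is assumed to be expressed at the scale of the tissue (that is, already corrected for magnification).

```python
def surface_density(intercepts, total_line_length):
    """Sv = 2 x I / L_T, in reciprocal length units (e.g. mm^-1)."""
    if total_line_length <= 0:
        raise ValueError("total_line_length must be positive")
    return 2 * intercepts / total_line_length


def total_surface(sv, reference_volume):
    """S = Sv x Vol[ref], e.g. mm^-1 x mm^3 = mm^2."""
    return sv * reference_volume


# Hypothetical data: 150 intercepts on 20 mm of test line (summed over all
# fields), in an organ with an assumed reference volume of 40 mm^3.
sv = surface_density(150, 20.0)
print(f"Sv = {sv:.1f} mm^-1")
print(f"S  = {total_surface(sv, 40.0):.1f} mm^2")
```

The formula only holds when the test lines and the surface meet isotropically, which is why IUR sections are required.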
Modern stereology. The designation 'design-based stereology' (DBS) covers methods that appeared in the early 1980s and have gradually become more important in quantitative studies (Mayhew & Gundersen, 1996). DBS and the new stereological techniques became possible because of the 'random sampling approach': new random sampling designs combined with alternative interpretations of the existing stereological formulae, with probability theory underlying these new methods (Baddeley & Jensen, 2005).
The critical point is that design-based inference does not require assumptions about the material. In DBS survey sampling, estimates of population parameters are unbiased by the randomization of the sample, without any need for assumptions about the population structure. Therefore, DBS depends on adherence to the random sampling protocol (Tschanz et al., 2014).
Estimating the size of objects. As explained in the section on morphometry, measuring diameters in histological sections is subject to significant distortions because the structures can be sectioned at different heights (in addition to the 'lost caps' problem). DBS avoids the problem.
Point sampling of linear intercept lengths on single sections provides the 'volume-weighted nuclear volume.' The only requirement is that individual objects can be identified by their profiles in random sections (Gundersen & Jensen, 1985). The method is valid for objects of arbitrary shape, including any combination of ellipsoids (spheres, oblates, prolates and triaxial ellipsoids). The estimator reduces to a function of measurements of diameters in the section plane, as illustrated in the prostate cancer example (Leze et al., 2014).
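The point-sampled intercepts estimator is usually written as (pi/3) times the mean of the cubed intercept lengths. This form is the standard one associated with Gundersen & Jensen (1985) and is stated here as an assumption, since the text above does not give it. A minimal Python sketch, with a hypothetical function name and invented intercept lengths:

```python
import math


def volume_weighted_mean_volume(intercept_lengths):
    """(pi / 3) x mean of the cubed point-sampled intercept lengths."""
    if not intercept_lengths:
        raise ValueError("at least one intercept length is required")
    mean_cubed = sum(length ** 3 for length in intercept_lengths) / len(intercept_lengths)
    return math.pi / 3 * mean_cubed


# Hypothetical intercept lengths (micrometres), each measured through a
# profile that was hit by a test point, along an isotropic direction.
lengths_um = [6.2, 7.5, 5.8, 8.1, 6.9, 7.2]
print(f"volume-weighted mean volume = {volume_weighted_mean_volume(lengths_um):.1f} um^3")
```

Because the lengths are cubed, a few long intercepts dominate the estimate, so larger objects weigh more heavily in the result.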
Estimating the number of objects. The estimation of the number of objects was first addressed in a practical way by Abercrombie (Abercrombie, 1946) using MBS. Afterward, others proposed modifications and updates to the Abercrombie method, such as Weibel-Gomez (Weibel & Gomez, 1962) and Aherne (Aherne, 1967). In 1984, a DBS method was developed and proved suitable for estimating the number of objects without any assumption about object size, shape, and distribution (Sterio, 1984).
There is an advantage in using DBS, mainly for estimating the number of objects in a reference volume (N [structure, tissue]). With MBS, the calculation of N [structure, tissue] depends on a homogeneous distribution and similar geometric shape of the structure, which is a significant difficulty in biology. Also, tissue processing, retraction, and compression are challenging to overcome. In general, the count in MBS is more affected by these artifacts than in DBS, although object number estimation in a disector x Vol [ref] design on paraffin sections is not immune to bias, because deformation may occur during the histological processing of the tissue. In particular, the widely used optical disector may be biased by dimensional changes in the z-axis (i.e., the direction perpendicular to the section plane), which is often the case when frozen sections or vibratome sections are used for the stereological measurements (Dorph-Petersen et al., 2001).
DBS presents two complementary methods for the estimation of N [structure]: 'fractionator' and 'disector.' Literature has various examples of the use of fractionator, of disector, and of fractionator / disector combination. What is sometimes called the 'fractionator' in the literature implicitly includes the application of the 'disector' at the very last sampling step. Some software packages are marketed with the promise of performing these methods on automated equipment connected to the internet.
The 'optical fractionator' uses thick sections and estimates the total number of objects from the number of objects sampled randomly from a set of sections covering the entire Vol [ref] with a uniform distance between sections. For example, cerebellar Purkinje cell nucleoli were used as counting units to obtain unbiased (fractionator) estimates of the number of Purkinje neurons in adult mammalian cerebella. Also, the estimates were used as variables in an allometric bivariate study, which was consistent with the suggestions that neuronal packing densities decrease with increasing brain size (Mayhew, 1991).
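The fractionator arithmetic can be sketched as follows. The sketch assumes the usual three sampling levels of the optical fractionator (section, area, and thickness sampling fractions). These names are common usage, not terms from the text above, and the function name and counts are hypothetical.

```python
from fractions import Fraction


def fractionator_total(counted, ssf, asf=Fraction(1), tsf=Fraction(1)):
    """Total number = objects counted / (product of the sampling fractions).

    ssf: section sampling fraction (e.g. every 10th section -> 1/10)
    asf: area sampling fraction (counting-frame area / step area)
    tsf: thickness sampling fraction (disector height / section thickness)
    """
    overall = Fraction(ssf) * Fraction(asf) * Fraction(tsf)
    if not 0 < overall <= 1:
        raise ValueError("the overall sampling fraction must lie in (0, 1]")
    return counted / overall


# Hypothetical study: 250 objects counted in every 10th section, with frames
# covering 1/25 of the area and a disector height of half the section thickness.
total = fractionator_total(250, Fraction(1, 10), Fraction(1, 25), Fraction(1, 2))
print(f"Estimated total number = {int(total)}")
```

No reference volume enters the calculation, which is why fractionator estimates are insensitive to tissue shrinkage as long as the sampling fractions are known.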
The 'optical disector' (the so-called 'NvVref' method) calculates the numerical density of objects (Nv [structure, tissue]) in a Vol [ref] (which can be an organ, a region, or a nucleus). As mentioned, the total number of objects (N [structure]) is obtained by multiplying Nv [structure, tissue] by Vol [ref] (which may be determined by Cavalieri's method). In addition, the 'nucleator' is an unbiased estimator based on disector sampling of objects and thus provides the 'number-weighted mean volume' (Gundersen, 1988).
In a study counting the number of cells within the glomerulus, investigators compared two approaches (Basgen et al., 2006). With the Weibel-Gomez method (MBS), the cellular density in each glomerulus was estimated, and the cell number was then obtained by multiplying the density by the glomerular volume (Weibel & Gomez, 1962). The disector/fractionator method (DBS) counted the number of cells in a fraction of sections. The Weibel-Gomez method requires assumptions about the glomerular size distribution and shape that are potential sources of bias and are difficult to verify (Samuel et al., 2007). In the comparison, the Weibel-Gomez method produced an overestimation, whereas the disector/fractionator method was considered unbiased (Basgen et al., 2006).
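The 'NvVref' calculation can be sketched in Python as below. The sketch assumes that Nv is obtained by dividing the total disector count by the total volume sampled (number of disectors x counting-frame area x disector height), which is the usual form of the estimator but is not spelled out in the text above. The function names and numbers are hypothetical.

```python
def numerical_density(total_count, n_disectors, frame_area, disector_height):
    """Nv = total count / (number of disectors x frame area x disector height)."""
    sampled_volume = n_disectors * frame_area * disector_height
    if sampled_volume <= 0:
        raise ValueError("the sampled volume must be positive")
    return total_count / sampled_volume


def total_number(nv, reference_volume):
    """N = Nv x Vol[ref]."""
    return nv * reference_volume


# Hypothetical data: 120 objects counted in 50 disectors, each with a
# 0.0025 mm^2 counting frame (50 x 50 um) and a height of 0.01 mm (10 um).
nv = numerical_density(120, 50, 0.0025, 0.01)
print(f"Nv = {nv:.0f} per mm^3")
# Assumed reference volume of 1.5 mm^3, e.g. from Cavalieri's method.
print(f"N  = {total_number(nv, 1.5):.0f}")
```

Any shrinkage that changes Vol [ref] or the disector height after they were measured propagates directly into N, which is the source of bias discussed above.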
In the 'disector,' objects are counted in two paired sections of the same field, separated by a distance of approximately 1/4 to 1/3 of the object size (Gundersen et al., 1988a). An object is counted when it is seen in only one of the sections (the section used for counting should be defined 'a priori'). No knowledge of the shape, size, or distribution of the object in the tissue is needed.
Figure 7 summarizes a comparison between morphometry and stereology and their possibilities in a quantitative morphological study. The images were taken from a previous publication on the pineal gland (Ferreira-Medeiros et al., 2007).
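The counting rule of the disector amounts to a set difference. The sketch below assumes that the profiles have already been matched between the two sections and given identifiers. The identifiers and the function name are hypothetical, and in a real study the matching is done by the observer.

```python
def disector_count(counting_section_ids, lookup_section_ids):
    """Number of objects seen in the counting section but absent from the other section."""
    return len(set(counting_section_ids) - set(lookup_section_ids))


# Hypothetical profiles identified in the two paired sections of one field.
counting_section = ["a", "b", "c", "d", "e"]   # section defined a priori for counting
lookup_section = ["b", "d", "f"]               # paired section

q_minus = disector_count(counting_section, lookup_section)
print(f"Objects counted in this disector: {q_minus}")  # a, c and e are counted
```

Object 'f' appears only in the paired section and is not counted, because the counting section was fixed beforehand. Counting in both directions would double the apparent sample without following the rule described above.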
Abbreviations. $A_P$, partial area; $A_T$, test area; DBS, design-based stereology; FAQ, frequently asked question; $H_{alt}$, alternative hypothesis; $H_0$, null hypothesis; IUR, isotropic and uniformly random; JPEG, Joint Photographic Experts Group; L, length; $L_P$, partial length; $L_T$, total line length; Lv, length density; MBS, model-based stereology; N, number; $N_A$, numerical density per area; Nv, numerical density; NvVref, numerical density estimated by disector method; $P_P$, partial point; $P_T$, total number of points; QA, 'Querschnitte' or transects in the test area; S, surface (area); SD, sample standard deviation; SE, standard error; Sv, surface density; TIFF, Tagged Image File Format; V, volume; Vol[ref], reference volume; $V_P$, partial volume; Vv, volume density; $\delta$, populational standard deviation; $\delta^2$, variance; $\mu$, populational mean.
The Laboratory in UERJ is supported by CNPq (Conselho Nacional de Desenvolvimento Cientifico e Tecnologico, grant number 302.920/2016-1) and FAPERJ (Fundacao Carlos Chagas Filho do Amparo a Pesquisa do Rio de Janeiro, grant number E-26/201.186/2014). The association UERJ/UFRO is funded by FAPERJ (grant number E-26/010.003.093/2014).
Abercrombie, M. Estimation of nuclear population from microtome sections. Anat. Rec, 94(2):239-47, 1946.
Aherne, W. A. & Dunnill, M. S. Morphometry. London, Edward Arnold Pu, 1982.
Aherne, W. Methods of counting discrete tissue components in microscopical sections. J. Microsc, 87(3-4):493-508, 1967.
Baddeley, A. J. & Jensen, E. B. V. Stereology for Statisticians. Boca Raton, Chapman & Hall/CRC, 2005.
Baddeley, A. J.; Gundersen, H. J. & Cruz-Orive, L. M. Estimation of surface area from vertical sections. J. Microsc., 142(Pt. 3):259-76, 1986.
Basgen, J. M.; Nicholas, S. B.; Mauer, M.; Rozen, S. & Nyengaard, J. R. Comparison of methods for counting cells in the mouse glomerulus. Nephron. Exp. Nephrol., 103(4):e139-48, 2006.
Catta-Preta, M.; Mendonca, L. S.; Fraulob-Aquino, J.; Aguila, M. B. & Mandarim-de-Lacerda, C. A. A critical analysis of three quantitative methods of assessment of hepatic steatosis in liver biopsies. Virchows Arch, 459(5):477-85, 2011.
Cruz-Orive, L. M. & Weibel, E. R. Recent stereological methods for cell biology: a brief survey. Am. J. Physiol, 258(4Pt. 1):L148-56, 1990.
Cruz-Orive, L. M. Precision of Cavalieri sections and slices with local errors. J. Microsc, 193(3):182-98, 1999.
Cumming, G.; Fidler, F. & Vaux, D. L. Error bars in experimental biology. J. Cell. Biol., 177(1):7-11, 2007.
Delesse, M. A. Procede mecanique pour determiner la composition des roches. C. R. Acad. Sci. Paris, 25:544-5, 1847.
Dorph-Petersen, K. A.; Nyengaard, J. R. & Gundersen, H. J. Tissue shrinkage and unbiased stereological estimation of particle number and size. J. Microsc, 204(Pt. 3):232-46, 2001.
Elias, H.; Hyde, D. M. & Scheaffer, R. L. A Guide to Practical Stereology. Basel, Karger, 1983.
Ferreira-Medeiros, M.; Mandarim-de-Lacerda, C. A. & Correa-Gillieron, E. M. Pineal gland post-natal growth in rat revisited. Anat. Histol. Embryol, 36(4):284-9, 2007.
Gundersen, H. J. & Jensen, E. B. Stereological estimation of the volume-weighted mean volume of arbitrary particles observed on random sections. J. Microsc, 138(Pt. 2):127-42, 1985.
Gundersen, H. J. & Osterby, R. Optimizing sampling efficiency of stereological studies in biology: or 'do more less well!'. J. Microsc., 121(Pt. 1):65-73, 1981.
Gundersen, H. J. G. Notes on the estimation of the numerical density of arbitrary profiles: the edge effect. J. Microsc., 111(2):219-23, 1977.
Gundersen, H. J. The nucleator. J. Microsc, 151(Pt. 1):3-21, 1988.
Gundersen, H. J.; Bagger, P.; Bendtsen, T. F.; Evans, S. M.; Korbo, L.; Marcussen, N.; Moller, A.; Nielsen, K.; Nyengaard, J. R.; Pakkenberg, B.; Sorensen, F. B.; Vesterby, A. & West, M. J. The new stereological tools: disector, fractionator, nucleator and point sampled intercepts and their use in pathological research and diagnosis. APMIS, 96(10):857-81, 1988a.
Gundersen, H. J.; Bendtsen, T. F.; Korbo, L.; Marcussen, N.; Moller, A.; Nielsen, K.; Nyengaard, J. R.; Pakkenberg, B.; Sorensen, F. B.; Vesterby, A. & West, M. J. Some new, simple and efficient stereological methods and their use in pathological research and diagnosis. APMIS, 96(5):379-94, 1988b.
Hally, A. D. A counting method for measuring the volumes of tissue components in microscopical sections. J. Cell Sci., 105(S3): 503-17, 1964.
Hedreen, J. C. Lost caps in histological counting methods. Anat. Rec., 250(3):366-72, 1998.
Howard, C. V. & Reed, M. G. Unbiased stereology: three-dimensional measurement in microscopy. 2nd ed. New York, BIOS Scientific Pu., 2005.
Krzywinski, M. & Altman, N. Points of significance: Importance of being uncertain. Nat. Methods, 10(9):809-10, 2013a.
Krzywinski, M. & Altman, N. Points of significance: Nonparametric tests. Nat. Methods, 11(5):467-8, 2014.
Krzywinski, M. & Altman, N. Power and sample size. Nat. Methods, 10(12):1139-40, 2013b.
Krzywinski, M. & Altman, N. Significance, P values and t-tests. Nat. Methods, 10(11):1041-2, 2013c.
Leze, E.; Maciel-Osorio, C. F. E. & Mandarim-de-Lacerda, C. A. Advantages of evaluating mean nuclear volume as an adjunct parameter in prostate cancer. PLoS One, 9(7):e102156, 2014.
Mandarim-de-Lacerda, C. A. Quantitative Development of the Human Embryonic Heart in the Post-Somitic Period. Cardiac Morphometry in Staged Embryos. Thesis, Full Professor. Rio de Janeiro, University of the State of Rio de Janeiro (UERJ), 1987. pp.157.
Mattfeldt, T.; Mall, G.; Gharehbaghi, H. & Moller, P. Estimation of surface area and length with the orientator. J. Microsc., 159(Pt. 3):301-17, 1990.
Mayhew, T. M. & Gundersen, H. J. 'If you assume, you can make an ass out of u and me': a decade of the disector for stereological counting of particles in 3D space. J. Anat, 188(Pt. 1):1-15, 1996.
Mayhew, T. M. Accurate prediction of Purkinje cell number from cerebellar weight can be achieved with the fractionator. J. Comp. Neurol., 308(2):162-8, 1991.
Mouton, P. R. Unbiased Stereology: A Concise Guide. Baltimore, Johns Hopkins University Press, 2011.
Russ, J. C. & Dehoff, R. T. Practical Stereology, 2nd ed. New York, Kluwer Academic/Plenum Publishers, 2000.
Scherle, W. A simple method for volumetry of organs in quantitative stereology. Mikroskopie, 26(1):57-60, 1970.
St. Pierre, T. G.; House, M. J.; Bangma, S. J.; Pang, W.; Bathgate, A.; Gan, E. K.; Ayonrinde, O. T.; Bhathal, P. S.; Clouston, A.; Olynyk, J. K. & Adams, L. A. Stereological analysis of liver biopsy histology sections as a reference standard for validating non-invasive liver fat fraction measurements by MRI. PLoS One, 11(8):e0160789, 2016.
Sterio, D. C. The unbiased estimation of number and sizes of arbitrary particles using the disector. J. Microsc., 134(Pt. 2):127-36, 1984.
Tschanz, S. A.; Burri, P. H. & Weibel, E. R. A simple tool for stereological assessment of digital images: the STEPanizer. J. Microsc., 243(1):47-59, 2011.
Tschanz, S.; Schneider, J. P. & Knudsen, L. Design-based stereology: Planning, volumetry and sampling are crucial steps for a successful study. Ann. Anat., 196(1):3-11, 2014.
Weibel, E. R. & Gomez, D. M. A principle for counting tissue structures on random sections. J. Appl. Physiol., 17:343-8, 1962.
Weibel, E. R. Stereological Methods. Practical Methods for Biological Morphometry. London, Academic Press, 1979.
Weibel, E. R.; Kistler, G. S. & Scherle, W. F. Practical stereological methods for morphometric cytology. J. Cell Biol., 30(1):23-38, 1966.
West, M. J. Basic stereology for biologists and neuroscientists. New York, Cold Spring Harbor, 2012.
Zar, J. H. Biostatistical Analysis. 5th ed. Upper Saddle River, Prentice Hall, 2010.
Dr. Carlos Alberto Mandarim-de-Lacerda
Universidade do Estado do Rio de Janeiro
Departamento de Anatomia
Instituto de Biologia
Laboratorio de Morfometria metabolismo e Doenca Cardiovascular.
Av. 28 de Setembro 87 fds
Rio de Janeiro, RJ
Carlos Alberto Mandarim-de-Lacerda (1) & Mariano Del Sol (2)
(1) Department of Anatomy, Institute of Biology, Biomedical Center, Laboratory of Morphometry, Metabolism and Cardiovascular Diseases, The University of the State of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil.
(2) Doctoral Program in Morphological Sciences, Universidad de La Frontera (UFRO), Temuco, Chile. The review was written to celebrate the agreement between the universities UERJ and UFRO for the advancement of morphological sciences.
Caption: Fig. 1. The normal distribution. Data with a normal (Gaussian) distribution have a 'bell-shaped' frequency distribution. Both curves have mean = 10; in A the standard deviation (SD) = 2, and in B the SD = 1 (that is, the curve is narrower, denoting less data variability than in A).
Caption: Fig. 2. Measuring diameters. The scheme represents a spherical structure with diameter 'd'. The structure was sectioned at various heights (sections 1-4). On the right side, depending on the height at which the structure was cut, its profile appears with a diameter that differs from 'd', as indicated by the red arrows. This effect distorts (underestimates) 'd'.
Caption: Fig. 3. Cavalieri estimator of volume. Schematic drawing of a serially sectioned elliptical structure with section thickness 't' = 5 µm. The sections can be grouped into sets of four sections (the thickness of each set will be 'T' = 20 µm). The area of the profile should be determined in the upper section of each set (Ai). The volume is calculated by multiplying T by the sum of Ai. In detail, the profile area can be determined by 'counting points.' In a frame of known area with 'n' equidistant points, each point represents a fraction 1/n of the frame area (that is, the 'area of the point,' aP). By counting the points that hit the profile and multiplying by aP, the area of the profile can be determined (by planimetry).
Caption: Fig. 4. Unbiased counting frame. Some structures within a frame may be partially outside the frame, moving outwards. Counting all these structures will overestimate their number. Therefore, two consecutive edges of the frame (and their extensions) are considered 'forbidden lines' (all structures exceeding the 'forbidden lines' should not be considered). Consequently, the structures surpassing the opposite consecutive edges of the frame are counted. The example shows 12 profiles, but four should not be counted (numbers 1, 8, 11, and 12). Only eight profiles are computed here.
Caption: Fig. 5. STEPanizer. On a digital image of a skeletal muscle section, four examples of grids in frames with 'forbidden lines' (solid lines) were produced by STEPanizer. A and B: the systems have only test points, in different numbers. C: the system has straight segments with test points at their ends (inspired by Weibel's 'multi-purpose' test system with points and lines). D: the system is composed of 'cycloid arcs' with test points at their ends, and the section should be aligned with the arrow of the frame.
Caption: Fig. 6. How many points should be counted in the study? Using the 'Hally's formula' from an initial volume density estimate, the expected number of points to count can be calculated with a relative standard error of 0.05. Therefore, the grid of points can be made (for example, in STEPanizer) increasing or decreasing the number of points in combination with the number of photomicrographs per case we have. It should be done as a 'pilot study' before starting the study itself.
Caption: Fig. 7. Morphometry vs. Stereology. An example of rat pineal gland and pinealocytes (based on Ferreira-Medeiros et al., 2007): A--Morphometry, a stage micrometer was used to measure larger diameters (D1), and smaller diameters (D2) of pinealocytes. B--Stereology, the two planes of a 'disector' (up and down) spaced apart by three micrometers are shown. Pinealocytes are counted only when they are seen on one of the planes. It was possible to determine the number of pinealocytes in the pineal gland using this method. The table below summarizes the comparison between morphometry and stereology.

Data                       Morphometry          Stereology
Measurement in the plane   Yes                  No
3-D information            No                   Yes
Caliper                    Yes                  No
Test-system                No                   Yes
Unit                       mm                   1/mm3
Image analysis             Usually Yes          Usually No
Variables                  Continuous           Discrete
Statistics                 Usually Parametric   Usually Nonparametric