# Bayesian estimation of population density and visibility.

ABSTRACT. -- It is assumed that a sample region may be subjected either to "exhaustive search", in which case the number of animals found is Poisson distributed with a mean proportional to the population density $\lambda$, or else a cheaper sort of "cursory search", in which case the mean count is multiplied by an unknown "visibility" parameter $p$ between 0 and 1. We consider Bayesian, or at least formally Bayesian, estimation of $\lambda$ and $p$ based on independent cursory searches, exhaustive searches, prior information about $\lambda$, and prior information about $p$. The examples treated concern gallinule nests and bicycles which may or may not be visible from the street. Key words: population density; visibility bias; Bayesian estimation; equivalent sample information; posterior distribution.

**********

Let $A$ be the size of a sample region, let $T$ be the number of animals present in this region, and let $X = T - Y$ be the number of animals sighted in a cursory search, $Y$ being the number overlooked. We assume that the distribution of $T$ given $\lambda$ (and $p$) is Poisson with mean $\lambda A$. Assume, also, that the distribution of $X$ given $T$ and $p$ (and $\lambda$) is binomial with parameters $T$ and $p$.

The a priori probability density $f(\lambda,p)$ is assumed to be noninformative or, more generally, to be the product of a gamma density for $\lambda$ and a beta density for $p$. Then the posterior density $f(\lambda,p \mid T)$, which becomes available if the area is searched exhaustively, is of the same (conjugate) form. Under some circumstances it may be possible to search the area cursorily and then to follow up with an exhaustive search of the same area. One then obtains a posterior density $f(\lambda,p \mid T,X)$ which is again of the conjugate form.

We assume that the results of exhaustive searches, and of any other experiments yielding independent information about $\lambda$ and $p$, are absorbed into the assumed a priori density

$$f(\lambda,p) = \frac{e^{-\lambda a}\,\lambda^{v-1}a^{v}}{\Gamma(v)} \cdot \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)} \qquad (1)$$

where $B(\alpha,\beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha+\beta)$. Figure 1 depicts a double sampling scheme for which $\alpha = x + 1$, $\beta = y + 1$, and $v = t + 1$ yields a likelihood of the form given by density (1). Therefore, density (1) may represent nothing more than a factor of the likelihood function discussed above, or it may have a more thoroughly Bayesian meaning. In any case, we now adopt the assumption that our "prior" estimate of the parameters is to be updated by observation of $X$ alone.

[FIGURE 1 OMITTED]
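As a concrete reading of the "equivalent sample information" in Figure 1, the following sketch (ours, purely illustrative) builds the prior parameters from a double sample and reports the implied prior means; the numbers are the gallinule training data used later in the paper.

```python
def prior_from_double_sample(x, y):
    """Conjugate prior parameters (alpha, beta, v) from a double sample
    in which x animals were sighted cursorily and y were overlooked but
    recovered by the follow-up exhaustive search (Fig. 1 correspondence:
    alpha = x + 1, beta = y + 1, v = t + 1 with t = x + y)."""
    t = x + y                       # total found by the exhaustive search
    return x + 1, y + 1, t + 1

# Gallinule training data from the example below: 4 nests sighted,
# 7 overlooked, over a = 0.5 (1000 feet) of canal edge.
a = 0.5
alpha, beta, v = prior_from_double_sample(4, 7)

prior_mean_lambda = v / a                  # gamma prior mean of lambda
prior_mean_p = alpha / (alpha + beta)      # beta prior mean of p
```

With these data the prior density (1) has $\alpha = 5$, $\beta = 8$, $v = 12$, and prior means $E(\lambda) = v/a = 24$ nests per 1000 feet and $E(p) = 5/13 \approx 0.38$.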

MATHEMATICAL PRELIMINARIES AND RESULTS

The formulas presented below, and much more fully in Kutran (1975), may be derived by essentially straightforward methods. The results are presented in terms of hypergeometric functions and confluent hypergeometric functions. It is of interest to note that the confluent hypergeometric function, M, yields the prior moment generating function of p from density (1).

$$M(\alpha;\alpha+\beta;z) = \int_0^1 \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}\,e^{pz}\,dp = \sum_{k=0}^{\infty}\frac{\Gamma(\alpha+k)}{\Gamma(\alpha)}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+k)}\,\frac{z^{k}}{k!} \qquad (2)$$

Consider the average of the cursory count $X$ given $p$ and $\lambda$, that is, $E(X \mid p,\lambda) = Ap\lambda$. For random $p$ and $\lambda$, $Ap\lambda$ is a random variable whose moment generating function is given by the hypergeometric function $F$.

$$F(v,\alpha;\alpha+\beta;z) = \int_0^1 \frac{p^{\alpha-1}(1-p)^{\beta-1}}{B(\alpha,\beta)}\,(1-pz)^{-v}\,dp = \sum_{k=0}^{\infty}\frac{\Gamma(v+k)}{\Gamma(v)}\,\frac{\Gamma(\alpha+k)}{\Gamma(\alpha)}\,\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha+\beta+k)}\,\frac{z^{k}}{k!} \qquad (3)$$

Johnson and Kotz (1969) provide an adequate discussion of the behavior of these functions. The terms of the above series satisfy a simple recursion and are easily programmed for evaluation. If $B_k$ denotes the $k$th term of $F(t,a;c;z)$, then

$$B_k = B_{k-1}\,\frac{(t+k-1)(a+k-1)}{(c+k-1)}\cdot\frac{z}{k}.$$
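The recursion translates directly into code. The routine below is our own illustrative sketch, not part of the original analysis; for negative arguments $z$ it first applies the standard Pfaff transformation $F(t,a;c;z) = (1-z)^{-t}F(t,\,c-a;\,c;\,z/(z-1))$, which maps the series into its region of convergence $|z| < 1$. The confluent function $M$ is summed by the analogous recursion.

```python
from math import exp

def hyp_F(t, a, c, z, tol=1e-12, kmax=100000):
    """Gauss hypergeometric F(t,a;c;z), summed with the term recursion
    B_k = B_{k-1} (t+k-1)(a+k-1)/(c+k-1) * z/k (converges for |z| < 1).
    For z < 0, the Pfaff transformation maps the argument into (0,1)."""
    if z < 0:
        return (1.0 - z) ** (-t) * hyp_F(t, c - a, c, z / (z - 1.0), tol, kmax)
    B, s = 1.0, 1.0
    for k in range(1, kmax):
        B *= (t + k - 1.0) * (a + k - 1.0) / (c + k - 1.0) * z / k
        s += B
        if abs(B) <= tol * abs(s):
            break
    return s

def hyp_M(a, c, z, tol=1e-12, kmax=100000):
    """Confluent hypergeometric M(a;c;z), via B_k = B_{k-1} (a+k-1)/(c+k-1) * z/k."""
    B, s = 1.0, 1.0
    for k in range(1, kmax):
        B *= (a + k - 1.0) / (c + k - 1.0) * z / k
        s += B
        if abs(B) <= tol * abs(s):
            break
    return s
```

Two degenerate identities make convenient checks: $F(t,a;t;z) = (1-z)^{-a}$ and $M(a;a;z) = e^{z}$.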

The prior information in (1) is now updated to include the observation $X$ over the sample region $A$. Averaging the parameters (and the unobserved total $T$) out of the conditional probability model for $X$, we obtain

$$f(X) = \frac{B(X,\alpha+\beta)}{B(X,v)\,B(X,\alpha)}\left(\frac{A}{A+a}\right)^{\!X}\left(\frac{a}{A+a}\right)^{\!v}\frac{1}{X}\,F\!\left(X+v,\,\beta;\,X+\alpha+\beta;\,\frac{A}{A+a}\right), \qquad X = 1,2,3,\ldots \qquad (4)$$
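As a numerical check on (4), the sketch below (ours, not the paper's) evaluates the pmf on the log scale and verifies that it accounts for all of the probability not carried by the event $X = 0$; by (3), $P(X=0) = E[e^{-Ap\lambda}] = F(v,\alpha;\alpha+\beta;-A/a)$. The parameter values are the gallinule figures quoted later in the paper.

```python
from math import lgamma, log, exp

def hyp_F(t, a, c, z, tol=1e-13, kmax=100000):
    # F(t,a;c;z) via the term recursion; Pfaff transformation for z < 0.
    if z < 0:
        return (1.0 - z) ** (-t) * hyp_F(t, c - a, c, z / (z - 1.0), tol, kmax)
    B, s = 1.0, 1.0
    for k in range(1, kmax):
        B *= (t + k - 1.0) * (a + k - 1.0) / (c + k - 1.0) * z / k
        s += B
        if abs(B) <= tol * abs(s):
            break
    return s

def lbeta(x, y):
    return lgamma(x) + lgamma(y) - lgamma(x + y)

def marginal_pmf(X, A, a, alpha, beta, v):
    """f(X) of display (4), X = 1, 2, 3, ..., computed in logs for stability."""
    z = A / (A + a)
    logf = (lbeta(X, alpha + beta) - lbeta(X, v) - lbeta(X, alpha)
            + X * log(z) + v * log(a / (A + a))
            + log(hyp_F(X + v, beta, X + alpha + beta, z)) - log(X))
    return exp(logf)

# Gallinule values: prior alpha=5, beta=8, v=12, a=0.5; main sample A=4.3.
A, a, alpha, beta, v = 4.3, 0.5, 5.0, 8.0, 12.0
p0 = hyp_F(v, alpha, alpha + beta, -A / a)          # P(X = 0)
total = p0 + sum(marginal_pmf(X, A, a, alpha, beta, v) for X in range(1, 500))
```

The truncated sum plus $P(X=0)$ should recover essentially all of the probability mass, confirming that (4) is a proper pmf on $X \geq 1$.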

Hence, the posterior marginal distributions of the visibility $p$ and the density $\lambda$ are, respectively,

$$f(p \mid X) = \frac{B^{-1}(X+\alpha,\beta)\,p^{X+\alpha-1}(1-p)^{\beta-1}}{F(X+v,\,X+\alpha;\,X+\alpha+\beta;\,-A/a)}\left(1+\frac{pA}{a}\right)^{\!-(X+v)}, \qquad 0 < p < 1, \qquad (5)$$

(which might be called a hypergeometric function density) and

$$f(\lambda \mid X) = \frac{a^{X+v}}{\Gamma(X+v)}\,\lambda^{X+v-1}e^{-a\lambda}\,\frac{M(X+\alpha;\,X+\alpha+\beta;\,-A\lambda)}{F(X+v,\,X+\alpha;\,X+\alpha+\beta;\,-A/a)}, \qquad \lambda > 0. \qquad (6)$$

The information in (5) and (6) may be summarized in several ways, for example by using the modes to estimate the respective parameters. Here the means are used.

$$E(\lambda \mid X) = \frac{X+v}{A+a}\cdot\frac{F(X+v+1,\,\beta;\,X+\alpha+\beta;\,A/(A+a))}{F(X+v,\,\beta;\,X+\alpha+\beta;\,A/(A+a))} \qquad (7)$$

$$E(p \mid X) = \frac{X+\alpha}{X+\alpha+\beta}\cdot\frac{F(X+v,\,\beta;\,X+\alpha+\beta+1;\,A/(A+a))}{F(X+v,\,\beta;\,X+\alpha+\beta;\,A/(A+a))} \qquad (8)$$

The standard deviation offers one measure of variation and is given by

$$\sigma(\theta \mid X) = \left[E(\theta^{2} \mid X) - E^{2}(\theta \mid X)\right]^{1/2} \qquad (9)$$

where $\theta$ represents $p$ or $\lambda$. The second moments are

$$E(p^{2} \mid X) = \frac{(X+\alpha)(X+\alpha+1)}{(X+\alpha+\beta)(X+\alpha+\beta+1)}\cdot\frac{F(X+v,\,X+\alpha+2;\,X+\alpha+\beta+2;\,-A/a)}{F(X+v,\,X+\alpha;\,X+\alpha+\beta;\,-A/a)} \qquad (10)$$

and

$$E(\lambda^{2} \mid X) = \frac{(X+v)(X+v+1)}{(A+a)^{2}}\cdot\frac{F(X+v+2,\,\beta;\,X+\alpha+\beta;\,A/(A+a))}{F(X+v,\,\beta;\,X+\alpha+\beta;\,A/(A+a))} \qquad (11)$$
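The moments (7), (8), (10), and (11) are straightforward to program. The sketch below is our own illustration (the parameter values are hypothetical, taken from neither example); it evaluates the posterior means and, in the spirit of (9), the posterior standard deviations. It can be checked against brute-force quadrature over $p$, since given $p$ and $X$ the density $\lambda$ has a gamma posterior with shape $X+v$ and rate $a+Ap$.

```python
from math import sqrt

def hyp_F(t, a, c, z, tol=1e-13, kmax=100000):
    # F(t,a;c;z) via the term recursion; Pfaff transformation for z < 0.
    if z < 0:
        return (1.0 - z) ** (-t) * hyp_F(t, c - a, c, z / (z - 1.0), tol, kmax)
    B, s = 1.0, 1.0
    for k in range(1, kmax):
        B *= (t + k - 1.0) * (a + k - 1.0) / (c + k - 1.0) * z / k
        s += B
        if abs(B) <= tol * abs(s):
            break
    return s

def posterior_summaries(X, A, a, al, be, v):
    """Posterior means and standard deviations of lambda and p, displays (7)-(11)."""
    z = A / (A + a)
    c = X + al + be
    F0 = hyp_F(X + v, be, c, z)
    m_lam = (X + v) / (A + a) * hyp_F(X + v + 1, be, c, z) / F0            # (7)
    m_p = (X + al) / c * hyp_F(X + v, be, c + 1, z) / F0                   # (8)
    m2_lam = (X + v) * (X + v + 1) / (A + a) ** 2 \
        * hyp_F(X + v + 2, be, c, z) / F0                                  # (11)
    G0 = hyp_F(X + v, X + al, c, -A / a)
    m2_p = (X + al) * (X + al + 1) / (c * (c + 1)) \
        * hyp_F(X + v, X + al + 2, c + 2, -A / a) / G0                     # (10)
    return m_lam, sqrt(m2_lam - m_lam ** 2), m_p, sqrt(m2_p - m_p ** 2)

# Hypothetical inputs, for illustration only.
mean_lam, sd_lam, mean_p, sd_p = posterior_summaries(
    X=12, A=5.0, a=2.0, al=3.0, be=4.0, v=10.0)
```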

Further properties of densities (5) and (6) are discussed in Kutran (1975) along with interesting related distributions.

A GALLINULE EXAMPLE

Gallinules (Porphyrula martinica and Gallinula chloropus) nest in clumps of emergent vegetation on the edges of canals in the Lacassine Wildlife Refuge. Using an airboat, we made a cursory search of 500 linear feet, or 0.5 (1000 feet), of canal edge and found four nests. An exhaustive search of the same region revealed that seven nests had been overlooked in the cursory search. In the absence of other prior information we set $\alpha = 5$, $\beta = 8$, $v = 12$, and $a = 0.5$ (1000 feet). Then we made a cursory search of an additional $A = 4.3$ (1000) linear feet and discovered $X = 21$ nests. With the assumptions outlined above, $\lambda$ is estimated by (7) as

$$E(\lambda \mid X) = \frac{33}{4.8}\cdot\frac{F(34,\,8;\,34;\,4.3/4.8)}{F(33,\,8;\,34;\,4.3/4.8)} = 16.05.$$

That is, we estimate 16.05 nests per 1000 linear feet. That this estimate is based on little information is reflected in the a posteriori standard error $\sigma(\lambda \mid X) = 4.67$, about 29% of the mean. Similarly, the estimate $E(p \mid X) = 0.36$ is hedged by a standard error $\sigma(p \mid X)$ which is some 40% of $E(p \mid X)$. The a posteriori distribution of $\lambda$ and $p$ is thus rather diffuse, as it should be. This exercise can be compared with the similar study below, unfortunately not concerned with gallinule nests, in which more data were collected.

A BICYCLE EXAMPLE

An attempt was made to estimate $\lambda$, the average number of (parked) bicycles per house in a certain area of Lafayette, Louisiana, and $p$, the probability that a bicycle in this area is visible from the street.

The "training sample" consisted of three streets (53 houses) on which a door-to-door census was made to determine the exact number (94) of bicycles. In a subsample (2 streets) a cursory visual search was also made: 35 bicycles were sighted, while 19 bicycles that were present, as ascertained by the door-to-door survey, were not.

We considered that our model fit at least crudely and set $a = 53$ (houses), $v = 94$, $\alpha = 36$, and $\beta = 20$. Then 5 more streets ($A = 119$ houses) were subjected to cursory search. On the basis of the training-sample data, one would predict that $E(X) = 137$ bicycles, more or less, would be sighted in this main sample area. As it happened, the number $X = 107$ turned out to be about 1.5 standard deviations below the expectation. Hence, the new data had a considerable impact on our estimation of $\lambda$. For example, the posterior mean of $\lambda$, namely $E(\lambda \mid X) = 1.64$ bicycles per house, was smaller than the prior mean $E(\lambda) = 1.77$. Similarly, the estimate of the visibility $p$ was lowered from $E(p) = 0.64$ to $E(p \mid X) = 0.59$.

Reflecting the greater amount of information on bicycles as opposed to gallinule nests, the a posteriori coefficient of variation of $\lambda$ was only $\sigma(\lambda \mid X)/E(\lambda \mid X) = 11\%$. The most striking difference between this example and the other is that here we have much less uncertainty, either before or after the main sampling effort, about the value of $p$: in fact, $\sigma(p)$ is only 0.02 and $\sigma(p \mid X)$ is down to 0.003.

CONCLUSIONS

It seems, on the basis of the examples we have considered, that our formally Bayesian approach to the analysis of data on density and visibility can give very sensible and appropriately hedged estimates, but we think the conclusions should be interpreted cautiously. In fact, the Bayesian data analysis seems to have some built-in warning signs. For instance, in the bicycle example the a posteriori correlation between $\lambda$ and $p$ was about 35%, a value which expresses in a very precise sense the lack of independence between our information about density and our information about visibility.

LITERATURE CITED

Cook, R. D., and F. B. Martin. 1974. A model for quadrat sampling with visibility bias. J. Amer. Statistical Assoc., 69:345-349.

Johnson, N. L., and S. Kotz. 1969. Discrete Distributions. Houghton Mifflin Co., Boston, 328 pp.

Kutran, K. 1975. Bayesian corrections for visibility bias in population estimations. Unpublished Ph.D. dissertation, The University of Southwestern Louisiana, Lafayette, 79 pp.

CHARLES ANDERSON, THOMAS BRATCHER, AND KHALIL KUTRAN

University of Southwestern Louisiana, Lafayette, Louisiana 70504, Baylor University, Waco, Texas 76798, and University of Petroleum and Minerals, Dhahran, Saudi Arabia