# Improvement of the Wald method applied to the evaluation of zero-inflated binomial linear functions

Introduction

The Wald method is among the most widely used procedures in the literature for inference on binomial proportions. The method, commonly used to compare two binomial proportions, is essentially asymptotic: the distribution of the estimator is approximately normal. Because of this approximation, numerous studies have shown that the method has shortcomings in coverage probability, particularly in small samples, and alternative methods have been proposed to correct this deficiency. An improvement to the Wald method, proposed by Agresti and Coull (1998), consists of adding four pseudo-observations, two successes and two failures, to the proportion estimator. This procedure is known as the 'add-4 method'. However, the more general problem of interval estimation for a linear function of binomial proportions, mentioned by Price and Bonett (2004), includes pairwise comparisons, complex contrasts, interaction effects and simple main effects (BONETT; WOODWARD, 1987), all of which influence the estimated coverage probability.

Studying the Wald method and comparing it with other methods using the bootstrap approach, Carari et al. (2010) concluded that the Wald method yields coverage probabilities below the nominal confidence level, which jeopardizes its practical application to small samples. With regard to the add-4 method, the study showed that it stood out by producing adequate coverage probabilities and shorter intervals.

The Wald method has also been used for linear functions of binomial proportions, also known as binomial families. Price and Bonett (2004) state a generalization of the method as a confidence interval for the parameter $F = \sum_{i=1}^{q} \delta_i \pi_i$. The approximate $100(1-\alpha)\%$ Wald confidence interval for $F$ is given by (1)

$$\sum_{i=1}^{q} \delta_i \hat{\pi}_i \pm z_{\alpha/2} \sqrt{\sum_{i=1}^{q} \frac{\delta_i^2 \hat{\pi}_i (1 - \hat{\pi}_i)}{n_i}} \quad (1)$$

where:

$n_i$ is the sample size for the i-th binomial population;

$\hat{\pi}_i = Y_i/n_i$;

$\delta_i$ is a known coefficient specified by the researcher;

$z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution;

$q$ is the number of coefficients involved in the function.

Even with the above generalization, the Wald method still presents the flaws mentioned, and alternative methods have emerged in this context. More details may be found in Price and Bonett (2004), Tebbs and Roths (2008) and Cirillo et al. (2009).
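As a minimal sketch of how such an interval could be computed, the R function below assumes that expression (1) has the usual Wald form, that is, the point estimate $\sum \delta_i \hat{\pi}_i$ plus or minus a normal quantile times $\sqrt{\sum \delta_i^2 \hat{\pi}_i (1 - \hat{\pi}_i)/n_i}$. The function name `wald_linear_ci` and the data in the example are hypothetical and are not taken from the study.

```r
# Wald interval for F = sum(delta_i * pi_i) (illustrative sketch).
# y: successes per population; n: trials per population;
# delta: known coefficients; conf: nominal confidence level.
wald_linear_ci <- function(y, n, delta, conf = 0.95) {
  stopifnot(length(y) == length(n), length(n) == length(delta))
  p_hat <- y / n                                    # pi-hat_i = Y_i / n_i
  est   <- sum(delta * p_hat)                       # point estimate of F
  se    <- sqrt(sum(delta^2 * p_hat * (1 - p_hat) / n))
  z     <- qnorm(1 - (1 - conf) / 2)                # upper alpha/2 normal quantile
  c(estimate = est, lower = est - z * se, upper = est + z * se)
}

# Hypothetical data: three populations and the contrast (2, -1, -1)
ci <- wald_linear_ci(y = c(18, 12, 9), n = c(30, 30, 30), delta = c(2, -1, -1))
print(ci)
```

With these hypothetical counts the point estimate is 2(0.6) - 0.4 - 0.3 = 0.5; the interval limits depend only on the plug-in standard error, which is why small samples or proportions near 0 or 1 can degrade coverage.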

It is worth mentioning that the Wald method applied to the comparison of two binomial proportions or generalized for binomial linear functions put forth in the literature does not consider zero-inflated binomial (ZIB) samples. In this case, the use of these methods would certainly exacerbate the deficiencies mentioned previously with regard to coverage probability and applications to small samples. Silva and Cirillo (2010) warn that, even assuming the adequacy of the model, some zeros may be considered outliers, and thus different methods of assessment are sensitive to this anomaly.

Consequently, robust assessment methods are needed that account for the presence of divergent data and provide a coherent estimate of the required parameter. Methods that deal with the effect of outliers on estimates are therefore still a focus of research. Andrade et al. (2014) proposed a bootstrap algorithm that accounts for the effect of divergent and/or influential observations on estimates in nonlinear regression models.

Still with a focus on count data, Silva et al. (2012) studied the effect of zero inflation on a Poisson model as a function of sample size and different parametric values, based on a zero-inflated Poisson (ZIP) model. The authors concluded that discriminating between the ZIP and Poisson models through a score test is recommended for sample sizes greater than n = 40 in samples with a high proportion of zeros.

Wood et al. (2005) proposed two alternatives for estimating the probability of success in binomial samples contaminated with divergent observations. The two estimators differ in using the arithmetic average or the rationalized mean of the observed proportions.

After comparing the variances of the estimators, the authors concluded that which estimator to recommend depends on the situation, as characterized by the distribution of the proportions and the number of trials (n) performed.

In view of the scarcity of methods robust to zero inflation for estimating binomial linear functions, the current research proposes an improvement of the Wald method applied to the interval estimation of binomial linear functions. The improvement makes the method robust to zero-inflated binomial samples by replacing the maximum likelihood estimates with robust estimates. Several scenarios with different parametric configurations are assessed via Monte Carlo simulation to validate the method.

Material and methods

Following the objectives proposed, the method was performed in two steps, specified in sections 2.1 and 2.2, with details below.

Simulation of zero-inflated binomial samples

Using Monte Carlo simulation, the zero-inflated binomial samples were generated from the ZIB model (RUCKSTUHL; WELSH, 2001), a mixture of two components: one component places mass on zero with probability $\gamma$, while the other is a binomial distribution taken with probability $(1-\gamma)$. The ZIB model is thus defined by expression (2)

$$P(Y = y) = \begin{cases} \gamma + (1-\gamma)(1-\pi)^m, & y = 0 \\ (1-\gamma)\binom{m}{y}\pi^y(1-\pi)^{m-y}, & y = 1, \ldots, m \end{cases} \quad (2)$$

with the expectation that

$E(Y) = (1-\gamma)\, m \pi$

and the variance defined as

$\mathrm{Var}(Y) = (1-\gamma)\, m \pi \left[1 - \pi (1 - \gamma m)\right]$

where $\gamma$ is the probability of an excess zero and $m$ is the number of Bernoulli trials. Using the model given in (2), with $m = 100$ Bernoulli trials fixed and samples of size $n$, the parametric values assumed in the Monte Carlo simulation are described in Table 1.
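The generation step can be sketched in R as follows. This is only an illustration of the two-component mixture described above, not the routine used in the study; the function name `rzib`, the seed and the sample size of 5000 are assumptions chosen so that the sample moments are visibly close to the values implied by the mixture, namely a mean of $(1-\gamma) m \pi$ and a variance of $(1-\gamma) m \pi [1 - \pi(1 - \gamma m)]$.

```r
# Draw n observations from the zero-inflated binomial mixture:
# an excess zero with probability gamma, otherwise a Binomial(m, p) count.
rzib <- function(n, m, p, gamma) {
  excess_zero <- rbinom(n, size = 1, prob = gamma) == 1
  ifelse(excess_zero, 0L, rbinom(n, size = m, prob = p))
}

set.seed(1)                        # arbitrary seed, for reproducibility only
y <- rzib(n = 5000, m = 100, p = 0.5, gamma = 0.2)

mean(y)        # expected to be near (1 - 0.2) * 100 * 0.5 = 40
var(y)         # expected to be near 40 * (1 - 0.5 * (1 - 20)) = 420
mean(y == 0)   # expected to be near gamma = 0.2 (binomial zeros are negligible here)
```

Dividing the sample mean by $m$ gives roughly $(1-\gamma)\pi$ rather than $\pi$, which is the source of the bias of the maximum likelihood estimator discussed in the results.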

Keeping these parametric configurations, the estimator of the binomial proportion robust to zero inflation is denoted by $\hat{\pi}_{zib}$. This estimator was obtained as a combination of the estimators found in Ruckstuhl and Welsh (2001) and Silva and Cirillo (2010).

$$\hat{\pi}_{zib} = \sum_{y=0}^{m} \rho_s(x)\, \hat{\pi}_{mle} \quad (3)$$

where $\hat{\pi}_{mle}$ is the maximum likelihood estimator of $\pi$ given in (4)

$$\hat{\pi}_{mle} = \frac{1}{m} \sum_{i=1}^{n} y_i\, f_n(y) \quad (4)$$

where:

$$f_n(y) = \frac{1}{n} \sum_{i=1}^{n} I(Y = y_i) \quad (5)$$
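A compact way to evaluate the empirical probability function in (5) and the resulting maximum likelihood estimate is sketched below. It assumes that the sum in (4) is taken over the support values $y = 0, 1, \ldots, m$ weighted by $f_n(y)$, so that the estimate reduces to the sample mean divided by $m$; the function name `empirical_pmf` and the small sample are hypothetical.

```r
# Empirical probability function f_n(y) on the support 0, 1, ..., m
empirical_pmf <- function(y, m) {
  # tabulate() counts the integers 1..nbins, so the counts are shifted by one
  tabulate(y + 1L, nbins = m + 1L) / length(y)
}

# Hypothetical sample with m = 4 Bernoulli trials per observation
y <- c(0, 0, 2, 3, 3, 4)
m <- 4

f_n    <- empirical_pmf(y, m)          # relative frequencies of 0, 1, 2, 3, 4
pi_mle <- sum((0:m) * f_n) / m         # same value as mean(y) / m
```

Here `f_n` is (2, 0, 1, 2, 1)/6 and `pi_mle` is 0.5, matching `mean(y) / m`.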

The expression presented in (3) is based on the likelihood disparity of E-estimators (RUCKSTUHL; WELSH, 2001) and $\rho_s(x)$ represents a function that minimizes the disparity.

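$$\rho_s(x) = \begin{cases} \left[c_1^{1-u}\log c_1 + \dfrac{\left((1-u)\log c_1 + 1\right)c_1^{1-u}}{u}\right]x^u - \dfrac{\left((1-u)\log c_1 + 1\right)c_1}{u}, & x < c_1 \\ x\log x, & c_1 \le x \le c_2 \\ \left[c_2^{1-u}\log c_2 + \dfrac{\left((1-u)\log c_2 + 1\right)c_2^{1-u}}{u}\right]x^u - \dfrac{\left((1-u)\log c_2 + 1\right)c_2}{u}, & x > c_2 \end{cases} \quad (6)$$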

where [c.sub.1] and [c.sub.2] are affinity constants.
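The piecewise function can be sketched in vectorized R as below. The three branches are transcribed from the routine in Appendix 2: $x \log x$ on $[c_1, c_2]$ and a power-type branch of order $u$ outside that range. The function name `rho_s`, the helper `tail_branch` and the example values of `x` are assumptions made for illustration.

```r
# Piecewise rho function: x*log(x) between c1 and c2, power-type branches outside.
# u = 1 corresponds to rho_1; any other u corresponds to rho_2.
rho_s <- function(x, c1, c2 = 1, u = 1) {
  # Branch anchored at the cutoff cc; it equals cc*log(cc) at x = cc
  tail_branch <- function(x, cc) {
    k <- (1 - u) * log(cc) + 1
    (cc^(1 - u) * log(cc) + k * cc^(1 - u) / u) * x^u - k * cc / u
  }
  out   <- numeric(length(x))
  below <- x < c1
  above <- x > c2
  inner <- !below & !above
  out[below] <- tail_branch(x[below], c1)
  out[above] <- tail_branch(x[above], c2)
  out[inner] <- x[inner] * log(x[inner])
  out
}

x <- c(0, 0.05, 0.5, 1, 3)            # hypothetical ratios f_n(y) / p_n(y)
rho_s(x, c1 = 0.1, u = 1)             # rho_1: linear outside [c1, c2]
rho_s(x, c1 = 0.1, u = 0.13)          # rho_2: slower growth for large x
```

With $u = 1$ the outer branches reduce to $(\log c + 1)x - c$, so for $c_2 = 1$ the value at $x = 2$ is 1; with $u < 1$ the upper branch grows like $x^u$, which is the reduction in growth for large $x$ that motivates $\rho_2$.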

The function argument is $x = f_n(y)/p_n(y)$, where $p_n(y)$ is the binomial probability evaluated at the estimate of $\pi$ given by (4). The values of $s$ are set to 1 and 2, defining the estimator $\hat{\pi}_{zib}$ under two approaches, referred to in the current research as the incorporation of the $\rho_1$ and $\rho_2$ components.

We emphasize that $\rho_1$ and $\rho_2$ are treated in the estimation process as a systematic component, in the sense that the researcher may choose which function to adopt. Note that when $u = 1$, $\rho_2 = \rho_1$, which suggests that $\rho_2$ is a generalization of $\rho_1$, differing only in its asymptotic properties.

In this context, the values of the affinity constants $c_1$ and $c_2$ are defined according to the component: for $\rho_1$, the coefficients $u = c_2 = 1$ are fixed and a value $c_1 < c_2 = 1$ is investigated. Thus, $\rho_1(x)$ is prone to a greater increase as $x \to \infty$.

According to Ruckstuhl and Welsh (2001), under the inequality $c_1 < c_2 = 1$ the maximum likelihood estimates tend to be more robust. For $\rho_2$, $c_1 = 0.1$ is assumed, keeping the restriction $c_1 < c_2 = 1$, whereas the value of $u$ is examined so as to reduce the increase of $\rho_2(x)$ as $x \to \infty$.

It is worth underscoring that the accuracy and precision of the estimator (3) depend on the values of the affinity constants $c_1$ and $c_2$, which make it robust to the expected number of zeros. Consequently, the search for these constants was carried out by a computer routine.

Silva and Cirillo (2010) set out to produce tables of values of $u$ and $c_1$ for the two forms of $\rho_s(x)$ described in (6). The researcher may thus use the estimator $\hat{\pi}_{zib}$ for statistical inference, starting from the maximum likelihood estimate of $\pi$ in a zero-inflated sample and observing the deviation between the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimates, given by $|\hat{\pi}_{zib} - \hat{\pi}_{mle}| < k$, where $k$ indicates a tolerable value for this difference. The first step is to evaluate the maximum likelihood estimate $\hat{\pi}_{mle}$ according to expression (4), followed by the evaluation of $p_n(y)$ for use in $\rho_s(x)$ and then of $\hat{\pi}_{zib}$ according to expression (3). In this setting, the tables presented by Silva and Cirillo (2010) may be helpful for the choice of $u$ and $c_1$.

So that the zero-inflated binomial estimates could be compared and validated, the relative biases of the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimates were evaluated according to expression (7)

$$v_{mle} = \frac{\hat{\pi}_{mle} - \pi}{\pi} \quad \text{and} \quad v_{zib} = \frac{\hat{\pi}_{zib} - \pi}{\pi} \quad (7)$$
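As a small worked example of the relative bias, the sketch below applies the definition to the estimates reported in the first row of Table 3 ($n = 30$, $\gamma = 0.2$, $\pi = 0.5$). The function name `relative_bias` is hypothetical. Under the mixture model the expected value of $Y/m$ is $(1-\gamma)\pi$, so the relative bias of the maximum likelihood estimate is expected to be close to $-\gamma$.

```r
# Relative bias of an estimate with respect to the true proportion
relative_bias <- function(estimate, truth) (estimate - truth) / truth

# Estimates reported in the first row of Table 3 (n = 30, gamma = 0.2, pi = 0.5)
v_mle <- relative_bias(0.3995, 0.5)   # about -0.201, close to -gamma
v_zib <- relative_bias(0.5000, 0.5)   # 0 at the four decimals reported
```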

Definition and estimation of linear functions of binomial proportions taking into consideration the Wald method

After generating the binomial samples, the structure of the binomial linear functions was represented by the parameter $F$, as shown in (8)

$$F = \sum_{i=1}^{q} \delta_i \pi_i, \quad (8)$$

where $q$ is the total number of binomial populations and $\delta_i$ is the coefficient associated with the success proportion of the i-th binomial population, following the specifications shown in Table 2.

For each linear function $F$ representing a binomial family, interval estimates of $F$ were computed from Wald's confidence interval according to expression (1), with the maximum likelihood estimates replaced by $\hat{\pi}_{zib}$ estimates obtained with the systematic $\rho_1$ and $\rho_2$ components.

Finally, for each assessment scenario (Table 1), the intervals adapted for robust zero-inflated proportions were compared through the exact coverage probability of a $100(1-\alpha)\%$ interval for a fixed value of $F$ in (8), defined by (9)

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], (9)

where $I(y_1, \ldots, y_q)$ equals 1 if the interval contains $F$ in (8) when $Y_1 = y_1, \ldots, Y_q = y_q$, and equals zero if the interval does not contain $F$. An approximation is obtained from 2000 Monte Carlo simulations as the percentage of estimated intervals that include the parameter $F$, calculated with a program developed in R 3.00 (R DEVELOPMENT CORE TEAM, 2011).
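The Monte Carlo approximation of the coverage probability can be sketched as follows. This is an illustration under simplifying assumptions and not the program used in the study: the counts are drawn from plain binomial distributions (no zero inflation), the interval is the ordinary Wald interval with maximum likelihood estimates, and the function name `coverage_wald`, the seed and the scenario are hypothetical.

```r
# Proportion of simulated Wald intervals that contain the true F
coverage_wald <- function(p, n, delta, reps = 2000, conf = 0.95) {
  F_true <- sum(delta * p)                       # fixed value of F
  z      <- qnorm(1 - (1 - conf) / 2)
  hits <- replicate(reps, {
    p_hat <- rbinom(length(p), size = n, prob = p) / n
    est   <- sum(delta * p_hat)
    se    <- sqrt(sum(delta^2 * p_hat * (1 - p_hat) / n))
    abs(est - F_true) <= z * se                  # indicator I(y_1, ..., y_q)
  })
  mean(hits)
}

set.seed(123)                                    # arbitrary seed
cov_hat <- coverage_wald(p = rep(0.5, 3), n = rep(30, 3), delta = c(2, -1, -1))
cov_hat                                          # compare with the nominal 0.95
```

To mimic the study more closely, the binomial draws would be replaced by zero-inflated samples and `p_hat` by the robust estimates; those steps are left out here to keep the sketch short.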

Results and discussion

Taking into consideration the evaluation scenarios described in the methodology (Section 2.1), the number of Bernoulli trials was fixed at $m = 100$ in this first step when generating the study samples for the methods under evaluation.

With this specification, maximum likelihood estimates $\hat{\pi}_{mle}$ and zero-inflated robust estimates $\hat{\pi}_{zib}$ were obtained from binomial samples generated via Monte Carlo, with percentages of zero observations near 20 and 30%, as specified by the mixture probabilities $\gamma = 0.2$ and $0.3$. Results are shown in Tables 3-6.

In short, the results made it clear that, in zero-inflated binomial samples, the maximum likelihood estimates were not accurate. This is confirmed by the bias results, including those for the larger sample sizes. However, for the $\hat{\pi}_{zib}$ estimates, for almost all sample sizes and values of $\gamma$ the relative biases were, on average, less than 0.01 in absolute value, with small fluctuations due to Monte Carlo error, for $\pi = 0.5$ (Tables 3 and 4) and $\pi = 0.7$ (Tables 5 and 6).

Based on the accuracy of the $\hat{\pi}_{zib}$ estimates, the binomial linear functions were composed for the Wald method and the coverage probabilities were calculated. For comparison purposes, a 95% nominal confidence level was used. Each binomial family was represented by $F_1$, $F_2$, $F_3$ and $F_4$, corresponding respectively to the coefficient vectors $\boldsymbol{\delta}_1$, $\boldsymbol{\delta}_2$, $\boldsymbol{\delta}_3$ and $\boldsymbol{\delta}_4$ described in Table 2. The plots of the estimated coverage probabilities are shown in Figures 1-8:

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

[FIGURE 3 OMITTED]

[FIGURE 4 OMITTED]

Keeping the mean proportion of zeros around 20% ($\gamma = 0.20$) of the sampled observations, the results shown in Figures 1-4 made it clear that an increase in sample size resulted in a decrease in coverage probability, with values much lower than the nominal confidence level.

This was observed when composing the binomial families from $\hat{\pi}_{zib}$ estimates with both the $\rho_1$ and $\rho_2$ components. However, when the proportion of zero observations was increased to about 30% of the sample units ($\gamma = 0.30$), for the parametric value that maximizes the variance of a binomial proportion, that is, $\pi = 0.5$, the binomial families whose zero-inflated proportions were estimated with the $\rho_1$ component showed coverage probabilities above the nominal confidence level (Figure 5). The same result was observed for all sample sizes when the parametric value increased, with estimates obtained using the $\rho_1$ and $\rho_2$ systematic components (Figures 7-8).

[FIGURE 5 OMITTED]

[FIGURE 6 OMITTED]

[FIGURE 7 OMITTED]

[FIGURE 8 OMITTED]

It is worth mentioning that the Wald method, in the context of estimating binomial families, was assessed by Cirillo et al. (2009) using the infinite bootstrap algorithm recommended by Conlon and Thomas (1990). Within this approach, and across different assessment scenarios, the authors also concluded that the coverage probabilities were inconsistent with the nominal confidence level.

Silva and Cirillo (2010) studied a robust estimator used for inference in a binomial model contaminated by a mixture of binomial populations, with samples obtained through Monte Carlo simulation. That study used an estimator belonging to the class of E-estimators (RUCKSTUHL; WELSH, 2001), incorporating the $\rho_1(x)$ component given in (6), which modifies the E-estimator. Several values of the affinity constant $c_1$ were considered, in the range $0.1 \le c_1 \le 0.9$, with sample sizes equal to 10, 50 and 80 and mixture rates equal to 0.20 and 0.40. The main conclusion was the recommendation to assume $c_1 = 0.1$ for samples larger than $n = 50$.

The results listed in this work confirm those already described regarding the flaws noticed in the Wald method and the choice of the $c$ constants based on sample size and degree of contamination.

The Wald method, when using zero-inflated proportion estimates obtained by the $\hat{\pi}_{zib}$ estimator with the systematic $\rho_2$ component, may be recommended in situations with proportions close to $\pi = 0.7$, since, for this parametric value, the scenarios evaluated led to coverage probabilities greater than 95%.

Conclusion

The Wald method combined with zero-inflated binomial proportion estimates obtained with the $\rho_2$ component showed results in line with the nominal confidence level. In practical terms, this method is recommended for samples in which the success proportions are close to 0.7 and the proportions of zeros are close to 0.3.

Doi: 10.4025/actascitechnol.v37i1.21250

References

ANDRADE, L. R.; CIRILLO, M. A.; BEIJO, L. A. Proposal of a bootstrap procedure using measures of influence in nonlinear regression models with outliers. Acta Scientiarum. Technology, v. 36, n. 1, p. 93-99, 2014.

AGRESTI, A.; COULL, B. A. Approximate is better than 'exact' for interval estimation of binomial proportions. The American Statistician, v. 52, n. 2, p. 119-126, 1998.

BONETT, D. G.; WOODWARD, J. A. Application of the Kronecker product and Wald test in log-linear models. Computational Statistics Quarterly, v. 3, n. 1, p. 235-243, 1987.

CARARI, M. L.; LIMA, P. C.; FERREIRA, D. F.; CIRILLO, M. A. Estimacao da diferenca entre duas proporcoes binomiais via bootstrap. Revista Brasileira de Biometria, v. 28, n. 3, p. 112-134, 2010.

CIRILLO, M. A.; FERREIRA, D. F.; SAFADI, T. S. Avaliacao de metodos de estimacao intervalar para funcoes lineares binomiais via bootstrap infinito. Ciencia e Agrotecnologia, v. 33, n. esp., p. 1741-1746, 2009.

CONLON, M.; THOMAS, R. G. A new confidence interval for the difference of two binomial proportions. Computational Statistics and Data Analysis, v. 9, n. 2, p. 237-241, 1990.

PRICE, M. R.; BONETT, D. G. An improved confidence interval for a linear function of binomial proportions. Computational Statistics and Data Analysis, v. 45, n. 3, p. 449-456, 2004.

RUCKSTUHL, A. F.; WELSH, A. H. Robust fitting of the binomial model. The Annals of Statistics, v. 29, n. 4, p. 1117-1136, 2001.

SILVA, A. M.; CIRILLO, M. A. C. Estudo por simulacao Monte Carlo de um estimador robusto utilizado na inferencia de um modelo binomial contaminado. Acta Scientiarum. Technology, v. 32, n. 3, p. 303-307, 2010.

SILVA, V. S. P.; CIRILLO, M. A.; CESPEDES, J. G. A study of the score test in discrimination Poisson and zero-inflated Poisson models. Acta Scientiarum. Technology, v. 35, n. 2, p. 333-337, 2012.

TEBBS, J. M.; ROTHS, S. A. New large-sample confidence intervals for a linear combination of binomial proportions. Journal of Statistical Planning and Inference, v. 138, n. 6, p. 1884-1893, 2008.

R DEVELOPMENT CORE TEAM. R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing, 2011.

WOOD, G. R.; LAI, C. D.; QIAO, C.G. Estimation of a proportion using several independent samples of binomial mixtures. The Australian and New Zealand Journal of Statistics, v. 47, n. 4, p. 441-448, 2005.

Received on August 16, 2012.

Accepted on November 19, 2012.

Cleide Silveira Brasil Peixoto (1) *, Marcelo Angelo Cirillo and Augusto Maciel da Silva

(1) Programa de Pos-graduacao em Estatistica e Experimentacao Agropecuaria, Departamento de Ciencias Exatas, Universidade Federal de Lavras, Cx. Postal 37, 37200-000, Lavras, Minas Gerais, Brazil. (2) Departamento de Ciencias Exatas, Universidade Federal de Lavras, Lavras, Minas Gerais, Brazil. * Author for correspondence. E-mail: cleidebpeixoto@gmail.com

Table 1. Parametric values used to generate zero-inflated binomial samples.

| n | $\pi$ ($\gamma = 0.2$) | $\pi$ ($\gamma = 0.3$) |
|---|---|---|
| 30 | 0.5 and 0.7 | 0.5 and 0.7 |
| 40 | 0.5 and 0.7 | 0.5 and 0.7 |
| 50 | 0.5 and 0.7 | 0.5 and 0.7 |
| 60 | 0.5 and 0.7 | 0.5 and 0.7 |
| 70 | 0.5 and 0.7 | 0.5 and 0.7 |
| 80 | 0.5 and 0.7 | 0.5 and 0.7 |
| 90 | 0.5 and 0.7 | 0.5 and 0.7 |

Table 2. Coefficients used for linear function specifications.

| Family | q | Coefficient vector used in composition of F |
|---|---|---|
| F1 | 3 | $\boldsymbol{\delta}_1 = (2, -1, -1)$ |
| F2 | 5 | $\boldsymbol{\delta}_2 = (4, -1, -1, -1, -1)$ |
| F3 | 7 | $\boldsymbol{\delta}_3 = (6, -1, -1, -1, -1, -1, -1)$ |
| F4 | 10 | $\boldsymbol{\delta}_4 = (9, -1, -1, -1, -1, -1, -1, -1, -1, -1)$ |

Table 3. Comparative results of the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimators for the parametric value $\pi = 0.5$ under the restriction $c_2 = u = 1$, which characterizes the systematic $\rho_1$ component.

| n | $\gamma$ | $c_1$ | $\hat{\pi}_{mle}$ | Bias ($\hat{\pi}_{mle}$) | $\hat{\pi}_{zib}$ | Bias ($\hat{\pi}_{zib}$) |
|---|---|---|---|---|---|---|
| 30 | 0.2 | 0.2900 | 0.3995 | -0.2010 | 0.5000 | -0.0001 |
| 30 | 0.3 | 0.4300 | 0.3504 | -0.2992 | 0.5018 | 0.0036 |
| 40 | 0.2 | 0.2600 | 0.4006 | -0.1988 | 0.5007 | 0.0014 |
| 40 | 0.3 | 0.4300 | 0.3499 | -0.3002 | 0.5001 | 0.0003 |
| 50 | 0.2 | 0.2600 | 0.4002 | -0.1996 | 0.4941 | -0.0119 |
| 50 | 0.3 | 0.4300 | 0.3508 | -0.2984 | 0.4986 | -0.0027 |
| 60 | 0.2 | 0.2500 | 0.3985 | -0.2030 | 0.4983 | -0.0034 |
| 60 | 0.3 | 0.4300 | 0.3506 | -0.2988 | 0.4983 | -0.0034 |
| 70 | 0.2 | 0.2500 | 0.3998 | -0.2024 | 0.4913 | -0.0173 |
| 70 | 0.3 | 0.4300 | 0.3504 | -0.2992 | 0.4974 | -0.0052 |
| 80 | 0.2 | 0.2400 | 0.3991 | -0.2018 | 0.4962 | -0.0076 |
| 80 | 0.3 | 0.4300 | 0.3497 | -0.3006 | 0.4982 | -0.0037 |
| 90 | 0.2 | 0.2400 | 0.4000 | -0.2000 | 0.4914 | -0.0171 |
| 90 | 0.3 | 0.4300 | 0.3499 | -0.3002 | 0.4975 | -0.0049 |

Table 4. Comparative results of the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimators for the parametric value $\pi = 0.5$ under the restriction $c_1 = 0.1$ and $c_2 = 1$, which characterizes the systematic $\rho_2$ component.

| n | $\gamma$ | $u$ | $\hat{\pi}_{mle}$ | Bias ($\hat{\pi}_{mle}$) | $\hat{\pi}_{zib}$ | Bias ($\hat{\pi}_{zib}$) |
|---|---|---|---|---|---|---|
| 30 | 0.2 | 0.1540 | 0.3995 | -0.2010 | 0.5102 | 0.0203 |
| 30 | 0.3 | 0.1800 | 0.3514 | -0.2972 | 0.4930 | -0.0141 |
| 40 | 0.2 | 0.1400 | 0.4000 | -0.2000 | 0.5026 | 0.0052 |
| 40 | 0.3 | 0.1730 | 0.3499 | -0.3002 | 0.5020 | 0.0041 |
| 50 | 0.2 | 0.1310 | 0.4003 | -0.1994 | 0.4997 | -0.0006 |
| 50 | 0.3 | 0.1700 | 0.4002 | -0.1995 | 0.4941 | -0.0119 |
| 60 | 0.2 | 0.1240 | 0.4002 | -0.1995 | 0.4952 | -0.0096 |
| 60 | 0.3 | 0.1660 | 0.3507 | -0.2986 | 0.4996 | -0.0008 |
| 70 | 0.2 | 0.1170 | 0.3993 | -0.2014 | 0.5039 | 0.0078 |
| 70 | 0.3 | 0.1640 | 0.3509 | -0.2982 | 0.4970 | -0.0061 |
| 80 | 0.2 | 0.1130 | 0.4003 | -0.1994 | 0.4900 | -0.0201 |
| 80 | 0.3 | 0.1610 | 0.3500 | -0.3000 | 0.4982 | -0.0037 |
| 90 | 0.2 | 0.1060 | 0.3999 | -0.2002 | 0.5102 | -0.0204 |
| 90 | 0.3 | 0.1590 | 0.3491 | -0.3018 | 0.5000 | -0.0010 |

Table 5. Comparative results of the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimators for the parametric value $\pi = 0.7$ under the restriction $c_2 = u = 1$, which characterizes the systematic $\rho_1$ component.

| n | $\gamma$ | $c_1$ | $\hat{\pi}_{mle}$ | Bias ($\hat{\pi}_{mle}$) | $\hat{\pi}_{zib}$ | Bias ($\hat{\pi}_{zib}$) |
|---|---|---|---|---|---|---|
| 30 | 0.2 | 0.0001 | 0.5580 | -0.2029 | 0.8322 | 0.0188 |
| 30 | 0.3 | 0.2700 | 0.4889 | -0.3016 | 0.7011 | 0.0015 |
| 40 | 0.2 | 0.1500 | 0.5611 | -0.1984 | 0.6977 | -0.0033 |
| 40 | 0.3 | 0.2700 | 0.4924 | -0.2966 | 0.6994 | -0.0008 |
| 50 | 0.2 | 0.1500 | 0.5610 | -0.1996 | 0.6961 | -0.0056 |
| 50 | 0.3 | 0.2700 | 0.4907 | -0.1986 | 0.7051 | 0.0073 |
| 60 | 0.2 | 0.1500 | 0.5591 | -0.2990 | 0.6979 | -0.0030 |
| 60 | 0.3 | 0.2700 | 0.4899 | -0.3001 | 0.7050 | 0.0072 |
| 70 | 0.2 | 0.1500 | 0.5597 | -0.2004 | 0.6961 | -0.0056 |
| 70 | 0.3 | 0.2800 | 0.4902 | -0.2997 | 0.6951 | -0.0069 |
| 80 | 0.2 | 0.1500 | 0.5605 | -0.1990 | 0.6944 | -0.0080 |
| 80 | 0.3 | 0.2800 | 0.4904 | -0.2994 | 0.6978 | -0.0012 |
| 90 | 0.2 | 0.1500 | 0.5597 | -0.2004 | 0.6976 | -0.0035 |
| 90 | 0.3 | 0.2800 | 0.4921 | -0.2970 | 0.6949 | -0.0073 |

Table 6. Comparative results of the $\hat{\pi}_{mle}$ and $\hat{\pi}_{zib}$ estimators for the parametric value $\pi = 0.7$ under the restriction $c_1 = 0.1$ and $c_2 = 1$, which characterizes the systematic $\rho_2$ component.

| n | $\gamma$ | $u$ | $\hat{\pi}_{mle}$ | Bias ($\hat{\pi}_{mle}$) | $\hat{\pi}_{zib}$ | Bias ($\hat{\pi}_{zib}$) |
|---|---|---|---|---|---|---|
| 30 | 0.2 | 0.1300 | 0.5589 | -0.2820 | 0.7078 | 0.0134 |
| 30 | 0.3 | 0.1400 | 0.4904 | -0.2994 | 0.7025 | 0.0040 |
| 40 | 0.2 | 0.1270 | 0.5582 | -0.2026 | 0.7020 | -0.0019 |
| 40 | 0.3 | 0.1400 | 0.4920 | -0.2971 | 0.6979 | -0.0017 |
| 50 | 0.2 | 0.1240 | 0.5604 | -0.1986 | 0.7001 | 0.0044 |
| 50 | 0.3 | 0.1400 | 0.4908 | -0.2990 | 0.6972 | -0.0014 |
| 60 | 0.2 | 0.1220 | 0.5605 | -0.2013 | 0.6967 | 0.0017 |
| 60 | 0.3 | 0.1390 | 0.4895 | -0.2986 | 0.7025 | 0.0028 |
| 70 | 0.2 | 0.1190 | 0.5604 | -0.3001 | 0.7008 | -0.0072 |
| 70 | 0.3 | 0.1390 | 0.4884 | -0.2997 | 0.7015 | 0.0030 |
| 80 | 0.2 | 0.1180 | 0.5590 | -0.1993 | 0.7002 | -0.0008 |
| 80 | 0.3 | 0.1390 | 0.4899 | -0.2994 | 0.6953 | -0.0068 |
| 90 | 0.2 | 0.1160 | 0.5592 | -0.2004 | 0.7048 | -0.0038 |
| 90 | 0.3 | 0.1390 | 0.4897 | -0.2970 | 0.6983 | -0.0052 |

APPENDIX 1

R function for evaluation of $f_n$, described in (5)

    # m: number of Bernoulli trials
    # n: sample size
    # data: binomial sample inflated by zeros
    # vet: vector formed by 0, 1, ..., m
    fny = function(m, n, data, vet) {
      v_cont = numeric(length(vet))        # one relative frequency per support value
      for (a in seq_along(vet)) {          # runs over all m + 1 values in vet
        prop = 0
        aux = vet[a]
        for (b in 1:n) {
          if (aux == data[b]) prop = prop + 1
        }
        v_cont[a] = prop / n
      }
      return(v_cont)
    }

Obtaining the MLE once the function arguments are specified

    rfny = fny(m, n, data, vet)
    est_MLE = sum(vet * rfny) / m          # sample mean divided by m
    d = dbinom(vet, m, est_MLE)            # input value for the estimaPzib function
    x = rfny / d                           # input value for the estimaPzib function

APPENDIX 2

Function to estimate the robust binomial proportion inflated by zeros

    # c1 and c2: constants to be used in rho_1
    # u: constant to be used in rho_2
    # x: value to be plugged into rho_1 or rho_2
    # d: densities given the maximum likelihood estimate
    estimaPzib = function(x, d, c1, c2, u) {
      rho = numeric(length(x))
      estPzib = 0
      for (b in seq_along(x)) {
        if (x[b] >= c1 && x[b] <= c2)
          rho[b] = x[b] * log(x[b])
        if (x[b] < c1)
          rho[b] = ((c1^(1 - u) * log(c1) + ((1 - u) * log(c1) + 1) * (c1^(1 - u) / u)) * x[b]^u) -
            (((1 - u) * log(c1) + 1) * c1 / u)
        if (x[b] > c2)
          rho[b] = ((c2^(1 - u) * log(c2) + ((1 - u) * log(c2) + 1) * (c2^(1 - u) / u)) * x[b]^u) -
            (((1 - u) * log(c2) + 1) * c2 / u)
        auxPzib = rho[b] * d[b]
        estPzib = auxPzib + estPzib
      }
      return(estPzib)
    }

Robust estimate under zero excess

    # The researcher may choose between rho_1 and rho_2:
    # u = 1 selects rho_1; any value of u different from 1 selects rho_2
    u = 0.13
    # Assume any value for c1, keeping the restriction c1 < c2 = 1 in rho_1 or rho_2;
    # c1 is searched for, and c1 = 0.1 is assumed here as an example
    c1 = 0.1; c2 = 1
    # Evaluation of the Pzib estimator with the function arguments specified
    Pzib = estimaPzib(x, d, c1, c2, u)

Author: Peixoto, Cleide Silveira Brasil; Cirillo, Marcelo Angelo; da Silva, Augusto Maciel

Publication: Acta Scientiarum. Technology (UEM)

Date: Jan 1, 2015
