# A new class of gamma distribution

Introduction

The gamma distribution is used in a variety of applications, including queueing, financial and weather models. It arises naturally as the distribution of the waiting time between events that occur according to a Poisson process.

It is a two-parameter distribution, whose density is given by:

$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}, \quad x > 0,$$

where:

$\alpha > 0$ is a shape parameter and $\beta > 0$ is the reciprocal of a scale parameter.

Owing to the importance of this distribution, several new distributions and families of probability distributions based on generalizations of the gamma distribution have recently been proposed. Given a continuous distribution function G(x), its exponentiated form is obtained as $F(x) = G^{a}(x)$, with $a > 0$ (power parameter). Gupta, Gupta, and Gupta (1998) proposed the exponentiated gamma distribution and studied some of its properties.

Cordeiro, Ortega, and Silva (2011) extended the exponentiated gamma distribution defining a new distribution called Exponentiated Generalized gamma Distribution with four parameters, which is capable of modeling bathtub shaped failure rate phenomena.

Zografos and Balakrishnan (2009) defined a family of probability distributions based on the integration of a gamma distribution as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where:

G(x) is an arbitrary distribution function. When $\alpha = n + 1$, this distribution coincides with the distribution of the nth upper record value (Alzaatreh, Famoye, & Lee, 2014).

Alternatively, Ristic and Balakrishnan (2012) have proposed a new family of probability distributions, which is also based on the integration of the gamma distribution. They defined this new family as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where:

G(x) is an arbitrary distribution function. Similarly, when $\alpha = n + 1$, this distribution coincides with the distribution of the nth lower record value (Alzaatreh et al., 2014).

Following the line of work of Zografos and Balakrishnan (2009) and Ristic and Balakrishnan (2012), our goal in this work is to propose a new family of distributions based on the gamma distribution. The proposed family is the following:

$$H_G(x) = \int_{\frac{1-G(x)}{G(x)}}^{\infty} \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, t^{\alpha-1} e^{-\beta t}\, dt, \tag{1}$$

where:

G(x) is an arbitrary distribution function and $H_G(x)$ has the same support as the distribution G(x). This new family will be called the gamma-[(1-G)/G] class. The statistical properties of this new class, such as the mean, variance, standard deviation, mean deviation, kurtosis, skewness, moment generating function and characteristic function, are derived, together with a graphical analysis.

Then, to illustrate the applicability of the proposed family, we consider the particular case in which G(x) is the distribution function of an exponential random variable. We derive the statistical properties of this new distribution from the mathematical structure of the gamma-[(1 - G)/G] class and, to illustrate its potential, apply it to a set of real data. For this purpose, we use the data set presented by Choulakian and Stephens (2001) to check how well the models fit the data. As criteria for comparing the fit of the models, we consider the Akaike information criterion (AIC) and the Cramer-von Mises and Anderson-Darling tests. Both tests are discussed in detail by Chen and Balakrishnan (1995) and belong to the class of quadratic statistics based on the empirical distribution function, since they work with the squared differences between the empirical and the hypothesized distribution functions.
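The two quadratic statistics can be sketched in code. The snippet below is a minimal illustration of the procedure as it is commonly described for the Chen and Balakrishnan (1995) test: the fitted cdf values are mapped to normal scores, standardized, mapped back to (0, 1), and then plugged into the Cramer-von Mises and Anderson-Darling formulas with small-sample corrections. The function name `chen_balakrishnan_stats` and the demonstration values are illustrative assumptions, not part of the original analysis.

```python
import numpy as np
from scipy import stats


def chen_balakrishnan_stats(cdf_values):
    """Return (A*, W*) from fitted cdf values F(x_i; theta_hat) strictly inside (0, 1)."""
    v = np.sort(np.asarray(cdf_values, dtype=float))
    n = v.size
    y = stats.norm.ppf(v)                                 # normal scores
    u = stats.norm.cdf((y - y.mean()) / y.std(ddof=1))    # standardized, back to (0, 1)
    i = np.arange(1, n + 1)
    # Cramer-von Mises statistic: squared distances to the plotting positions.
    w2 = np.sum((u - (2 * i - 1) / (2 * n)) ** 2) + 1.0 / (12 * n)
    # Anderson-Darling statistic: u[::-1] supplies u_(n+1-i).
    a2 = -n - np.mean((2 * i - 1) * (np.log(u) + np.log1p(-u[::-1])))
    # Small-sample corrections (assumed form of the modified statistics).
    w_star = w2 * (1 + 0.5 / n)
    a_star = a2 * (1 + 0.75 / n + 2.25 / n ** 2)
    return a_star, w_star


# Idealised input: fitted cdf values spread evenly over (0, 1).
v_demo = (np.arange(1, 51) - 0.5) / 50
a_star, w_star = chen_balakrishnan_stats(v_demo)
```

Smaller values of both statistics indicate a closer agreement between the fitted model and the data.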

Material and methods

Obtaining a class of probability distributions

The gamma-[(1 - G)/G] class is defined by the cumulative distribution function (cdf) (1) (for x > 0) which is equivalent to

$$H_G(x) = Q\left(\alpha,\ \beta\,\frac{1-G(x)}{G(x)}\right), \tag{2}$$

where:

$Q(a,z) = \Gamma(a,z)/\Gamma(a)$ is the regularized incomplete gamma function, $\Gamma(a,z) = \int_{z}^{\infty} t^{a-1} e^{-t}\,dt$ is the (upper) incomplete gamma function, and $\Gamma(a)$ is Euler's gamma function. If the distribution G(x) has density g(x), the class has a probability density function (pdf) given by

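$$h_G(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, \frac{g(x)\,[1-G(x)]^{\alpha-1}}{[G(x)]^{\alpha+1}}\, \exp\left\{-\beta\,\frac{1-G(x)}{G(x)}\right\}. \tag{3}$$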
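As a numerical illustration, the sketch below evaluates the cdf and pdf of the class for an arbitrary baseline. It assumes the cdf has the form Q(alpha, beta (1 - G(x)) / G(x)) and that the pdf is its derivative, beta^alpha / Gamma(alpha) * g(x) (1 - G(x))^(alpha - 1) / G(x)^(alpha + 1) * exp(-beta (1 - G(x)) / G(x)). The function names and the parameter values are hypothetical; `scipy.special.gammaincc` is the regularized upper incomplete gamma function.

```python
import numpy as np
from scipy import integrate, special, stats


def class_cdf(x, alpha, beta, G):
    """cdf of the class: Q(alpha, beta * (1 - G(x)) / G(x))."""
    Gx = G(x)
    return special.gammaincc(alpha, beta * (1.0 - Gx) / Gx)


def class_pdf(x, alpha, beta, G, g):
    """pdf of the class, computed on the log scale for numerical stability."""
    Gx = G(x)
    z = beta * (1.0 - Gx) / Gx
    log_pdf = (alpha * np.log(beta) - special.gammaln(alpha) + np.log(g(x))
               + (alpha - 1.0) * np.log1p(-Gx) - (alpha + 1.0) * np.log(Gx) - z)
    return np.exp(log_pdf)


# Illustrative baseline: exponential with rate lam (hypothetical values).
lam, alpha, beta = 0.5, 2.0, 1.5
G = lambda x: stats.expon.cdf(x, scale=1.0 / lam)
g = lambda x: stats.expon.pdf(x, scale=1.0 / lam)

# Integrating the pdf over (0, 3) should reproduce the cdf at x = 3.
area, _ = integrate.quad(class_pdf, 0.0, 3.0, args=(alpha, beta, G, g))
cdf_at_3 = class_cdf(3.0, alpha, beta, G)
```

Agreement between `area` and `cdf_at_3` is a quick consistency check between the two expressions.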

Equations (2) and (3) can be rewritten as sums of exponentiated distributions. These distributions have been studied by several authors in recent years, for example, Mudholkar and Srivastava (1993) for the exponentiated Weibull and Gupta and Kundu (1999) for the exponentiated exponential, among others.

Using the power series expansion of the exponential function, we rewrite (3) as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.]

Furthermore, as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

it follows that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (4)

Since [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] we can rewrite the distribution function as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.]

Therefore,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.] (5)

Next, we present an expansion for the gamma-[(1-G)/G] class when G is discrete. If the distribution G(x) is discrete, $H_G(x)$ is also discrete and $P(X = x_l) = F(x_l) - F(x_{l-1})$, where $F = H_G$. Therefore, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.]
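The discrete case can be illustrated by differencing the cdf. The sketch below assumes the cdf of the class is Q(alpha, beta (1 - G) / G) and takes a Poisson baseline purely as an example; the function name `discrete_class_pmf` and all parameter values are hypothetical.

```python
import numpy as np
from scipy import special, stats


def discrete_class_pmf(k, alpha, beta, G):
    """P(X = k) = H_G(k) - H_G(k - 1) for an integer-valued baseline cdf G."""
    def H(t):
        Gt = np.asarray(G(t), dtype=float)
        safe_G = np.where(Gt > 0, Gt, 1.0)            # avoid dividing by zero
        value = special.gammaincc(alpha, beta * (1.0 - safe_G) / safe_G)
        return np.where(Gt > 0, value, 0.0)           # H_G = 0 wherever G = 0

    k = np.asarray(k)
    return H(k) - H(k - 1)


mu, alpha, beta = 3.0, 2.0, 1.5
support = np.arange(0, 61)
pmf = discrete_class_pmf(support, alpha, beta, lambda t: stats.poisson.cdf(t, mu))
```

Because the differences telescope, the probabilities over a sufficiently wide support add up to one.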

In addition, we can obtain the risk function of the new gamma-[(1-G)/G] class as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII.]

By inverting $H_G(x) = u$ (with $0 < u < 1$), we obtain an explicit expression for the $u$th quantile, $H_G^{-1}(u) = G^{-1}\{\beta/[Q^{-1}(\alpha,u) + \beta]\}$, where $Q^{-1}(\alpha, u)$ is the inverse of the regularized incomplete gamma function.
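The quantile expression lends itself to simulation by inversion. The following sketch uses `scipy.special.gammainccinv` for the inverse of Q and an exponential baseline as an example; the function name, the seed and the parameter values are illustrative assumptions.

```python
import numpy as np
from scipy import special


def class_quantile(u, alpha, beta, G_inv):
    """u-th quantile: G^{-1}( beta / (Q^{-1}(alpha, u) + beta) )."""
    z = special.gammainccinv(alpha, u)       # solves Q(alpha, z) = u
    return G_inv(beta / (z + beta))


lam, alpha, beta = 0.5, 2.0, 1.5
G = lambda x: 1.0 - np.exp(-lam * x)         # exponential baseline cdf
G_inv = lambda p: -np.log1p(-p) / lam        # and its inverse

rng = np.random.default_rng(0)
u = rng.uniform(size=5)
x = class_quantile(u, alpha, beta, G_inv)    # draws from the class by inversion

# Round trip: plugging the draws back into the cdf should return u.
u_back = special.gammaincc(alpha, beta * (1.0 - G(x)) / G(x))
```

Feeding uniform random numbers through `class_quantile` yields random variates from the class for any baseline whose inverse cdf is available.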

Using the density and distribution function expansions, it is possible to get the statistical properties of the new class, as discussed below. Equations (4) and (5) are the main results of this subsection.

Moments and moment generating function

Several important characteristics of a probability model, such as central tendency, dispersion, skewness and kurtosis, can be obtained from its moments. The following equations develop the expansion of the moments of order m for the gamma-[(1-G)/G] class. The mth moment of a random variable having cdf (2) can be obtained from Equation (4). Hence, we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Therefore,

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (6)

The expression (6) is important since it generalizes the well-established probability weighted moments.

In particular, we have the following expansion of the mean for the gamma-[(1-G)/G] class

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The following is the development of the expansion calculations for the moment generating function for the gamma-[(1-G)/G] class. We have from Equation (4),

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Using the fact that $e^{tx} = \sum_{m=0}^{\infty} t^{m} x^{m}/m!$, we can rewrite [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Therefore, using (6), the last equation can be expressed as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Similarly, one can establish the following expansion for the characteristic function for the gamma- [(1-G)/G] class

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Central moments and general coefficient

We now develop the expansion of the central moments of order m for the gamma-[(1-G)/G] class. This measure can be calculated as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

or equivalently

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Since

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

it follows that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (7)

In particular, the expansion of the variance for the gamma-[(1 - G)/G] class is:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (8)

A new generalization called general coefficient, which extends the skewness and kurtosis, is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (9)

Substituting (7) and (8) in Equation (9), we obtain

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Note that, in particular, taking m = 3 and m = 4 in $C_g(m)$ gives expansions for the skewness and kurtosis measures, respectively.
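The relation between ordinary and central moments can be sketched as follows. The code assumes the usual binomial identity for central moments and, as a further assumption, that the general coefficient has the form C_g(m) = mu_m / sigma^m, which reduces to the standard skewness and kurtosis for m = 3 and m = 4. The function names are hypothetical.

```python
from math import comb


def central_moment(raw, m):
    """m-th central moment from raw moments raw[k] = E[X^k], with raw[0] = 1."""
    mean = raw[1]
    return sum(comb(m, r) * (-mean) ** r * raw[m - r] for r in range(m + 1))


def general_coefficient(raw, m):
    """Assumed form C_g(m) = mu_m / sigma^m (m = 3: skewness, m = 4: kurtosis)."""
    variance = central_moment(raw, 2)
    return central_moment(raw, m) / variance ** (m / 2)


# Check on a unit-rate exponential variable, whose raw moments are E[X^k] = k!.
raw_exp = [1, 1, 2, 6, 24]
skewness = general_coefficient(raw_exp, 3)
kurtosis = general_coefficient(raw_exp, 4)
```

For the unit exponential the two coefficients equal 2 and 9, the known skewness and kurtosis of that distribution.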

Maximum likelihood estimation and Renyi entropy

Under standard regularity conditions, the maximum likelihood estimates (MLEs) can be obtained by equating the derivatives of the log-likelihood function with respect to each parameter to zero. We determine the MLEs of the parameters of the gamma-[(1-G)/G] class from complete samples only. Let $x_1, \ldots, x_n$ be a random sample of size n from the new class, where $\bar{\theta}$ is a vector of unknown parameters in the parent distribution $G(x;\bar{\theta})$. In this section we write $g(x) = g(x;\bar{\theta})$ and $G(x) = G(x;\bar{\theta})$ to emphasize the parameter vector. The log-likelihood function for the vector of parameters $\theta = (\alpha, \beta, \bar{\theta}^{T})^{T}$ can be obtained as

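$$\ell(\theta) = n\alpha\log\beta - n\log\Gamma(\alpha) + \sum_{i=1}^{n}\log g(x_i;\bar{\theta}) + (\alpha-1)\sum_{i=1}^{n}\log[1-G(x_i;\bar{\theta})] - (\alpha+1)\sum_{i=1}^{n}\log G(x_i;\bar{\theta}) - \beta\sum_{i=1}^{n}\frac{1-G(x_i;\bar{\theta})}{G(x_i;\bar{\theta})}. \tag{10}$$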

The log-likelihood can be maximized either directly, for example by using SAS (PROC NLMIXED), or by solving the nonlinear likelihood equations obtained by differentiating (10). The components of the score vector $U(\theta)$ are given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where:

$\psi(\alpha) = d\log\Gamma(\alpha)/d\alpha$ is the digamma function.
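The score components for the two gamma parameters can be checked numerically. The sketch below assumes the log-likelihood implied by the pdf of the class, namely n alpha log(beta) - n log Gamma(alpha) + sum log g + (alpha - 1) sum log(1 - G) - (alpha + 1) sum log G - beta sum (1 - G)/G, and compares its analytic derivatives in alpha and beta with central finite differences. The function names, the sample and the parameter values are hypothetical.

```python
import numpy as np
from scipy import special


def log_likelihood(alpha, beta, x, G, g):
    Gx = G(x)
    n = x.size
    return (n * alpha * np.log(beta) - n * special.gammaln(alpha)
            + np.sum(np.log(g(x))) + (alpha - 1.0) * np.sum(np.log1p(-Gx))
            - (alpha + 1.0) * np.sum(np.log(Gx)) - beta * np.sum((1.0 - Gx) / Gx))


def score_alpha_beta(alpha, beta, x, G):
    """Derivatives of the log-likelihood with respect to alpha and beta."""
    ratio = (1.0 - G(x)) / G(x)
    n = x.size
    u_alpha = n * np.log(beta) - n * special.digamma(alpha) + np.sum(np.log(ratio))
    u_beta = n * alpha / beta - np.sum(ratio)
    return u_alpha, u_beta


# Hypothetical sample and exponential baseline with a fixed rate.
x_demo = np.array([0.4, 1.1, 2.5, 3.0, 7.3])
lam = 0.3
G = lambda x: -np.expm1(-lam * x)
g = lambda x: lam * np.exp(-lam * x)

alpha0, beta0, eps = 1.4, 0.7, 1e-6
num_alpha = (log_likelihood(alpha0 + eps, beta0, x_demo, G, g)
             - log_likelihood(alpha0 - eps, beta0, x_demo, G, g)) / (2 * eps)
num_beta = (log_likelihood(alpha0, beta0 + eps, x_demo, G, g)
            - log_likelihood(alpha0, beta0 - eps, x_demo, G, g)) / (2 * eps)
u_alpha, u_beta = score_alpha_beta(alpha0, beta0, x_demo, G)
```

The derivatives with respect to the baseline parameters depend on the chosen G and are not covered by this sketch.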

Entropy is a measure of uncertainty: the higher the entropy, the less the information and the greater the uncertainty, randomness or disorder. The following is the expansion of the entropy for the gamma-[(1 - G)/G] class, using the Renyi entropy, which is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Substituting the expressions of density and cumulative distribution function given by Equations (3) and (2), respectively, we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

By expanding the exponential function in Taylor series as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

we have

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Now, using the following binomial expansion

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

it follows that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Thus, an explicit expression for Renyi entropy can be written

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

which, in turn, implies that (using Equation (6))

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
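As a complement to the series expression, the Renyi entropy can also be evaluated by direct numerical integration. The sketch below assumes the usual definition, log(integral of f^gamma) / (1 - gamma) with gamma > 0 and gamma different from 1, and works for any density passed as a function, including the pdf of the class. The function name `renyi_entropy` is hypothetical.

```python
import numpy as np
from scipy import integrate


def renyi_entropy(pdf, gamma, lower, upper):
    """Renyi entropy of order gamma for a density supported on (lower, upper)."""
    if gamma <= 0 or gamma == 1:
        raise ValueError("gamma must be positive and different from 1")
    integral, _ = integrate.quad(lambda x: pdf(x) ** gamma, lower, upper)
    return np.log(integral) / (1.0 - gamma)


# Sanity check on a unit-rate exponential density, where the integral of
# f^2 over (0, inf) is 1/2 and the entropy of order 2 is therefore log(2).
value = renyi_entropy(lambda x: np.exp(-x), 2.0, 0.0, np.inf)
```

Such a direct evaluation offers a simple way to check a truncated version of the series numerically.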

Results and discussion

Special model

This section examines a particular distribution of the gamma-[(1-G)/G] class proposed here. We consider the particular case in which $G(x) = 1 - e^{-\lambda x}$, $x > 0$, which is called the gamma-[(1 - Exp)/Exp] distribution.

The gamma-[(1 - Exp)/Exp] distribution

Considering G (x) the cdf of the exponential distribution with parameter [lambda] in Equation (2), we have the gamma-[(1 - Exp)/Exp] distribution:

$$H(x) = Q\left(\alpha,\ \frac{\beta e^{-\lambda x}}{1 - e^{-\lambda x}}\right), \quad x > 0.$$

Differentiating H (x), we get the density function of the gamma-[(1- Exp)/Exp] distribution:

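$$h(x) = \frac{\lambda\beta^{\alpha}}{\Gamma(\alpha)}\, \frac{e^{-\lambda\alpha x}}{(1 - e^{-\lambda x})^{\alpha+1}}\, \exp\left\{-\frac{\beta e^{-\lambda x}}{1 - e^{-\lambda x}}\right\}, \quad x > 0.$$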

Figure 1 shows the probability density and cumulative distribution functions of the gamma-[(1 - Exp)/Exp] distribution for some values of the parameters.

We can also obtain the risk function using the gamma- [(1 - Exp)/Exp] distribution as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
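The risk function can be evaluated numerically as the ratio of the pdf to the survival function. The sketch below assumes the cdf Q(alpha, beta e^(-lam x) / (1 - e^(-lam x))) for the exponential baseline, so that the survival function is the regularized lower incomplete gamma function (`scipy.special.gammainc`) at the same argument. The function names and parameter values are hypothetical.

```python
import numpy as np
from scipy import special


def gexp_z(x, beta, lam):
    """beta * (1 - G(x)) / G(x) for G(x) = 1 - exp(-lam * x)."""
    return beta * np.exp(-lam * x) / (-np.expm1(-lam * x))


def gexp_pdf(x, alpha, beta, lam):
    log_pdf = (alpha * np.log(beta) - special.gammaln(alpha) + np.log(lam)
               - lam * alpha * x - (alpha + 1.0) * np.log(-np.expm1(-lam * x))
               - gexp_z(x, beta, lam))
    return np.exp(log_pdf)


def gexp_survival(x, alpha, beta, lam):
    return special.gammainc(alpha, gexp_z(x, beta, lam))   # 1 - Q = P


def gexp_hazard(x, alpha, beta, lam):
    return gexp_pdf(x, alpha, beta, lam) / gexp_survival(x, alpha, beta, lam)


# The hazard should equal minus the derivative of the log survival function.
alpha, beta, lam, x0, eps = 1.5, 0.8, 0.6, 2.0, 1e-6
hazard_at_x0 = gexp_hazard(x0, alpha, beta, lam)
numeric_hazard = -(np.log(gexp_survival(x0 + eps, alpha, beta, lam))
                   - np.log(gexp_survival(x0 - eps, alpha, beta, lam))) / (2 * eps)
```

Evaluating `gexp_hazard` on a grid of x values gives curves of the kind usually plotted for a risk function.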

Figure 2 shows the risk function of the gamma-[(1 - Exp)/Exp] distribution for some values of the parameters.

Using a procedure similar to that used for the pdf and cdf expansions, the pdf and cdf of the gamma-[(1 - Exp)/Exp] distribution can be rewritten as sums of exponentiated exponentials, as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (11)

and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. (12)

Various properties of the exponentiated exponential can be obtained from Gupta and Kundu (1999). Using expansions (11) and (12), it is possible to obtain mathematical quantities of the special model, such as ordinary and central moments, moment generating and characteristic functions, general coefficient, Renyi entropy and others, from the corresponding quantities of the exponentiated exponential distribution. For reasons of space, we consider only the moments. The mth ordinary moment of the special model can be expressed as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

In particular, the mean of the gamma-[(1 - Exp)/Exp] distribution is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].
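The mean can be cross-checked numerically in two independent ways: by integrating the survival function over (0, infinity) and by integrating the quantile function over (0, 1). The sketch assumes the cdf Q(alpha, beta e^(-lam x) / (1 - e^(-lam x))), which gives the quantile log(1 + beta / Q^{-1}(alpha, u)) / lam. The function names and parameter values are hypothetical.

```python
import numpy as np
from scipy import integrate, special


def gexp_survival(x, alpha, beta, lam):
    z = beta * np.exp(-lam * x) / (-np.expm1(-lam * x))
    return special.gammainc(alpha, z)            # 1 - H(x)


def gexp_quantile(u, alpha, beta, lam):
    z = special.gammainccinv(alpha, u)           # solves Q(alpha, z) = u
    return np.log1p(beta / z) / lam


alpha, beta, lam = 2.0, 1.5, 0.5

# E[X] as the integral of the survival function (X is positive).
mean_from_survival, _ = integrate.quad(
    gexp_survival, 0.0, np.inf, args=(alpha, beta, lam))

# E[X] as the integral of the quantile function over (0, 1).
mean_from_quantile, _ = integrate.quad(
    gexp_quantile, 0.0, 1.0, args=(alpha, beta, lam))
```

Close agreement between the two values supports the assumed forms; higher moments can be obtained in the same way by integrating powers of the quantile function.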

Let $x_1, \ldots, x_n$ be a sample of size n from X ~ gamma-[(1 - Exp)/Exp]($\alpha, \beta, \lambda$). The log-likelihood function for the vector of parameters $\theta = (\alpha, \beta, \lambda)^{T}$ can be obtained as

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The components of the score vector $U(\theta)$ are given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

where:

$\psi(\alpha) = d\log\Gamma(\alpha)/d\alpha$ is the digamma function.

Application

In this section, we present an application of the proposed distribution to real data. The data are the excesses of flood peaks (in m^3 s^-1) of the Wheaton River near Carcross in the Yukon Territory, Canada. Seventy-two exceedances for the years 1958 to 1984 were recorded, rounded to one decimal place. These data were analyzed by Choulakian and Stephens (2001), and are presented in Table 1.

It is worth mentioning that this data set has also been analyzed by means of the Pareto, three-parameter Weibull, generalized Pareto and beta-Pareto distributions (Akinsete, Famoye & Lee, 2008).

Table 2 shows the maximum likelihood estimates of the parameters, obtained by the Newton-Raphson method implemented in the SAS 9.1 statistical software, together with their standard errors, the Akaike information criterion (AIC) and the Anderson-Darling (A*) and Cramer-von Mises (W*) statistics for the gamma-[-log(1-Exp)] distribution (M1), the gamma-[(1-Exp)/Exp] distribution (proposed model, M2), and the exponentiated Weibull (M3), modified Weibull (M4), beta-Pareto (M5) and Weibull (M6) distributions. Their densities are given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where:

$B(\alpha, \beta)$ denotes the beta function and all the parameters above are positive real numbers.

For the six distributions fitted to the Wheaton River flood data in Table 2, the beta-Pareto model (M5), which Akinsete et al. (2008) described as the best-fitting model, performed worse in our study, with AIC = 524.398, A* = 2.0412 and W* = 0.3516, than the proposed gamma-[(1 - Exp)/Exp] model (M2), which obtained AIC = 505.030, A* = 0.4516 and W* = 0.0757. Also according to Table 2, the proposed model M2 is the best of those tested, since it has the lowest values of AIC, A* and W*.
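A fit of this kind can be sketched with general-purpose tools. The code below is an illustration only: it assumes the gamma-[(1 - Exp)/Exp] density lam beta^alpha / Gamma(alpha) * e^(-lam alpha x) / (1 - e^(-lam x))^(alpha + 1) * exp(-beta e^(-lam x) / (1 - e^(-lam x))), maximizes the log-likelihood with a derivative-free optimizer instead of the Newton-Raphson routine in SAS, and uses an arbitrary starting point. The resulting estimates and AIC need not coincide with the values in Table 2.

```python
import numpy as np
from scipy import optimize, special

# Exceedances of flood peaks of the Wheaton River (Table 1).
wheaton = np.array([
    1.7, 2.2, 14.4, 1.1, 0.4, 20.6, 5.3, 0.7, 1.9, 13.0, 12.0, 9.3,
    1.4, 18.7, 8.5, 25.5, 11.6, 14.1, 22.1, 1.1, 2.5, 14.4, 1.7, 37.6,
    0.6, 2.2, 39.0, 0.3, 15.0, 11.0, 7.3, 22.9, 1.7, 0.1, 1.1, 0.6,
    9.0, 1.7, 7.0, 20.1, 0.4, 2.8, 14.1, 9.9, 10.4, 10.7, 30.0, 3.6,
    5.6, 30.8, 13.3, 4.2, 25.5, 3.4, 11.9, 21.5, 27.6, 36.4, 2.7, 64.0,
    1.5, 2.5, 27.4, 1.0, 27.1, 20.2, 16.8, 5.3, 9.7, 27.5, 2.5, 27.0,
])


def neg_log_lik(log_params, x):
    """Negative log-likelihood; parameters are passed on the log scale."""
    alpha, beta, lam = np.exp(log_params)    # keeps all parameters positive
    n = x.size
    with np.errstate(all="ignore"):
        G = -np.expm1(-lam * x)
        ratio = np.exp(-lam * x) / G
        ll = (n * alpha * np.log(beta) - n * special.gammaln(alpha)
              + n * np.log(lam) - lam * alpha * np.sum(x)
              - (alpha + 1.0) * np.sum(np.log(G)) - beta * np.sum(ratio))
    return -ll if np.isfinite(ll) else 1e10  # penalize numerically invalid points


start = np.log([1.0, 1.0, 0.1])              # arbitrary starting values
fit = optimize.minimize(neg_log_lik, start, args=(wheaton,), method="Nelder-Mead")
alpha_hat, beta_hat, lam_hat = np.exp(fit.x)
aic = 2 * 3 + 2 * fit.fun                    # AIC = 2k - 2 * log-likelihood, k = 3
```

Since the likelihood surface may have several local optima, it is prudent to repeat the optimization from different starting values before comparing AIC across models.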

The plots of the fitted gamma-[(1 - Exp)/Exp] pdf and of the two next best-fitting pdfs are displayed in Figure 3. The graph shows that the gamma-[(1 - Exp)/Exp] model behaves similarly to the other distributions and is very competitive in the analysis of these data.

Conclusion

As concluding remarks, we note that the class of gamma-[(1 - G)/G] probability distributions developed in this work is a novel way of generalizing the gamma distribution and can be applied in different areas depending on the choice of the distribution G. In future research, we intend to carry out more detailed comparisons between the family proposed in this paper and the family of distributions investigated by Zografos and Balakrishnan (2009), which is also based on the integration of the gamma distribution.

In this paper, we study in detail only one distribution of the gamma-[(1-G)/G] class, namely the gamma-[(1 - Exp)/Exp] distribution. Some properties of this distribution were derived and the distribution was applied to a set of real data, obtaining a better fit than that obtained in a previous study by Akinsete et al. (2008). We intend to study new distributions within this class as future work.

We note that adding parameters to a model can make it fit a particular phenomenon better because of its greater flexibility. On the other hand, one should not forget that additional parameters may cause both computational and identifiability problems in parameter estimation. Thus, the ideal is to choose a model that reflects the phenomenon or experiment well with the minimum number of parameters. In the class proposed in this research, only two additional parameters are added to the set of parameters of the G distribution.

Doi: 10.4025/actascitechnol.v39i1.29890

References

Akinsete, A., Famoye, F., & Lee, C. (2008). The beta-Pareto distribution. Statistics, 42(6), 547-563.

Alzaatreh, A., Famoye, F., & Lee, C. (2014). The gamma-normal distribution: properties and applications. Computational Statistics and Data Analysis, 69(1), 67-80.

Chen, G., & Balakrishnan, N. (1995). A general purpose approximate goodness-of-fit test. Journal of Quality Technology, 27(2), 154-161.

Choulakian, V., & Stephens, M. A. (2001). Goodness-of-fit for the generalized Pareto distribution. Technometrics, 43(4), 478-484.

Cordeiro, G. M., Ortega, E. M. M., & Silva, G. O. (2011). The exponentiated generalized gamma distribution with application to lifetime data. Journal of Statistical Computation and Simulation, 81(7), 827-842.

Gupta, R. C., Gupta, P. L., & Gupta, R. D. (1998). Modeling failure time data by Lehman alternatives. Communications in Statistics - Theory and Methods, 27(4), 877-904.

Gupta, R. D., & Kundu, D. (1999). Theory & Methods: Generalized exponential distributions. Australian and New Zealand Journal of Statistics, 41(2), 173-188.

Mudholkar, G. S., & Srivastava, D. K. (1993). Exponentiated Weibull family for analyzing bathtub failure-rate data. IEEE Transactions on Reliability, 42(2), 299-302.

Ristic, M. M., & Balakrishnan, N. (2012). The gamma exponentiated exponential distribution. Journal of Statistical Computation and Simulation, 82(8), 1191-1206.

Zografos, K., & Balakrishnan, N. (2009). On families of beta- and generalized gamma-generated distributions and associated inference. Statistical Methodology, 6(4), 344-362.

Received on November 20, 2015.

Accepted on May 4, 2016.

License information: This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Cicero Carlos Brito (1), Frank Gomes-Silva (1) *, Leandro Chaves Rego (2) and Wilson Rosa de Oliveira (1)

(1) Departamento de Estatistica e Informatica, Universidade Federal Rural de Pernambuco, Rua Dom Manoel de Medeiros, s/n, 52171-900, Campus Dois Irmaos, Recife, Pernambuco, Brazil. (2) Departamento de Estatistica e Matematica Aplicada, Universidade Federal do Ceara, Ceara, Brazil. * Author for correspondence. E-mail: franksinatrags@gmail.com

Caption: Figure 1. The pdf (right) and cdf (left) of the gamma-[(1 - Exp)/Exp] distribution for some values of $\lambda$.

Caption: Figure 2. Plots of the risk function for some parameter values.

Caption: Figure 3. Fitted distributions to the mass data of flood peaks in Wheaton river.

Table 1. Exceedances of flood peaks (in m^3 s^-1) of the Wheaton River.

1.7 2.2 14.4 1.1 0.4 20.6 5.3 0.7 1.9 13.0 12.0 9.3
1.4 18.7 8.5 25.5 11.6 14.1 22.1 1.1 2.5 14.4 1.7 37.6
0.6 2.2 39.0 0.3 15.0 11.0 7.3 22.9 1.7 0.1 1.1 0.6
9.0 1.7 7.0 20.1 0.4 2.8 14.1 9.9 10.4 10.7 30.0 3.6
5.6 30.8 13.3 4.2 25.5 3.4 11.9 21.5 27.6 36.4 2.7 64.0
1.5 2.5 27.4 1.0 27.1 20.2 16.8 5.3 9.7 27.5 2.5 27.0

Table 2. Maximum likelihood estimates of the parameters (standard errors in parentheses) and the AIC, A* and W* statistics for the M1 to M6 distributions.

| Model | [??] | [??] | [??] | [??] | AIC | A* | W* |
|---|---|---|---|---|---|---|---|
| M1 | 0.838 (0.121) | 0.035 (0.007) | 1.960 (<E-3) | -- | 508.689 | 0.752 | 0.131 |
| M2 | 0.131 (0.053) | 0.179 (0.070) | 0.539 (0.251) | -- | 505.030 | 0.452 | 0.076 |
| M3 | 1.387 (0.590) | 0.519 (0.312) | 0.050 (0.021) | -- | 508.050 | 1.414 | 0.253 |
| M4 | 0.776 (0.124) | 0.124 (0.035) | 0.010 (0.008) | -- | 507.343 | 0.594 | 0.098 |
| M5 | 84.682 (<E-3) | 65.574 (<E-3) | 0.063 (0.005) | 0.010 (<E-3) | 524.398 | 2.041 | 0.352 |
| M6 | 0.901 (0.086) | 0.086 (0.012) | -- | -- | 506.997 | 0.785 | 0.138 |