ON SOME APPLICATIONS AND SIMULATIONS OF COUNTING PROCESSES.

1. Introduction

Stochastic processes are random variables that change and evolve in time. As such, they play an important role in probability theory and related fields, providing good models for many real-life phenomena. Among them, counting processes deal with the number of occurrences of some events in time. As the name says, they count events, such as job arrivals (in a queuing system), message transmissions, customers served, completed tasks, holding times, times at which various events occur, etc. They have many applications in renewal processes, survival analysis, seismology, software reliability and many other fields (see [1, 3, 4]).

In this paper, we discuss Binomial and Poisson counting processes, presenting their main characteristics, some applications and simulations.

1.1 Random Variables

We start with a brief review of basic notions from Probability Theory. Let S denote the sample space (the set of all possible outcomes) of some experiment and P a probability function (see e.g. [2, 8]).

Definition 1.1. A random variable is a function X : S → R for which P(X ≤ x) exists, for all x ∈ R. If the range X(S) is a countable subset of R, then X is called a discrete random variable; otherwise, it is a continuous random variable.

A discrete random variable X is described by its probability distribution function (pdf) or probability mass function (pmf), an array that contains all the values taken by it, x_i, and the corresponding probabilities p_i = P(X = x_i),

$X \sim \begin{pmatrix} x_1 & x_2 & \cdots \\ p_1 & p_2 & \cdots \end{pmatrix}, \quad p_i \ge 0, \ \sum_i p_i = 1.$  (1.1)

If X is a continuous random variable, we give its probability density function (pdf), i.e. the function f : R → R, f(x) = F'(x), where F(x) = P(X ≤ x) denotes its cumulative distribution function (cdf).

We recall some discrete probability laws, which will be used later on.

Bernoulli distribution Bern(p), with parameter p ∈ (0, 1). This is one of the simplest discrete distributions, with pdf

$X \sim \begin{pmatrix} 0 & 1 \\ q & p \end{pmatrix}, \quad q = 1 - p.$  (1.2)

This distribution best models a Bernoulli trial, i.e. the occurrence of "success/failure" in a trial.

Binomial distribution B(n, p), with parameters n ∈ N, p ∈ (0, 1). In a series of n Bernoulli trials with success probability p in each trial (q = 1 - p), we let X denote the number of successes that occur. Then X has a Binomial distribution, with pdf

$P(X = k) = C_n^k \, p^k q^{n-k}, \quad k = 0, 1, \dots, n.$  (1.3)

It is easy to notice that a Binomial B(n, p) variable is the sum of n independent Bern(p) variables and that Bern(p) = B(1, p).
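This relation is easy to check empirically. Below is a minimal MATLAB sketch (our addition; the parameter values are arbitrary) that builds B(n, p) samples as sums of n Bernoulli trials and compares the sample mean and variance with np and npq:

% A minimal sketch: a B(n,p) variable as a sum of n independent Bern(p)
% variables. Parameter values below are illustrative, not from the paper.
n = 10; p = 0.3; Nrep = 1e5;
X = sum(rand(n, Nrep) < p, 1);  % each column holds n Bernoulli trials
[mean(X), n*p]                  % sample mean vs. theoretical mean np
[var(X), n*p*(1-p)]             % sample variance vs. npq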

Geometric distribution Geo(p), with parameter p ∈ (0, 1). In an infinite sequence of Bernoulli trials with success probability p in each trial (q = 1 - p), let X denote the number of failures that occur before the first success. Then X has a Geometric distribution, with pdf

$P(X = k) = q^k p, \quad k = 0, 1, 2, \dots$  (1.4)

Shifted Geometric distribution SGeo(p), with parameter p ∈ (0, 1). In the context described above, this is the number of trials needed to get the first success (as opposed to the number of failures in the previous pdf). If X has a Geo(p) distribution, then X + 1 has a Shifted Geometric distribution, hence the name; its pdf is $P(X = k) = q^{k-1} p$, k = 1, 2, ...

Poisson distribution Poiss(λ), with parameter λ > 0, with pdf

$P(X = k) = e^{-\lambda} \frac{\lambda^k}{k!}, \quad k = 0, 1, 2, \dots$  (1.5)

A Poisson variable X denotes the number of "rare" events (discrete occurrences of infrequently observed events) that occur in a given interval of time. The parameter λ represents the average number of such rare events per time unit. The Poisson distribution is used to model the number of occurrences of discrete events in an interval of time, such as arrivals of jobs (tasks, messages, signals, phone calls, etc.), accidents or earthquakes that happen in a given area, errors found in software, etc.

In what follows, we recall some important continuous distributions. Note that we only give the expression of the pdf on the region where it is non-zero (it is understood to be equal to 0 elsewhere).

Uniform distribution Unif(a, b), with parameters a, b ∈ R, a < b, has pdf

f(x) = 1/(b - a), x ∈ (a, b).  (1.6)

It is used when a variable can take any value randomly from an interval, so that all values of that interval are equally likely to be taken by the random variable.

Exponential distribution Exp(λ), with parameter λ > 0, has pdf

f(x) = λe^(-λx), x > 0.  (1.7)

An Exponential variable models time: interarrival time (time between arrivals of jobs), halftime, failure time, time between rare events, etc. The parameter λ represents the frequency of rare events, measured in time^(-1).

It is worth mentioning that in a sequence of rare events, where the number of rare events occurring in an interval of time of length t has a Poiss(λt) distribution, the time between rare events is modeled by an Exp(λ) distribution.
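This connection can also be observed numerically. The following MATLAB sketch (our addition, with illustrative parameter values) generates arrivals from Exp(λ) interarrival times and checks that the average number of arrivals in [0, t] is close to λt, the Poiss(λt) mean:

% Sketch: arrivals with Exp(lambda) interarrival times, counted over [0, t].
lambda = 2; t = 5; Nrep = 1e4;      % illustrative values
counts = zeros(1, Nrep);
for r = 1 : Nrep
    s = -log(rand)/lambda;          % first arrival time
    while s <= t
        counts(r) = counts(r) + 1;
        s = s - log(rand)/lambda;   % add the next interarrival time
    end
end
[mean(counts), lambda*t]            % sample mean vs. Poiss(lambda*t) mean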

1.2 Stochastic processes and Markov chains

Many random variables are not static; they change and develop in time.

Definition 1.3. A stochastic process is a random variable that depends on time. It is denoted by X(t, e) or X_t(e), where t ∈ T is time and e ∈ S is an outcome. The values of X(t, e) are called states.

If t ∈ T is fixed, then X_t is a random variable, while if we fix e ∈ S, X(·, e) is a function of time, called a realization or sample path of the process X(t, e).

Definition 1.4. A stochastic process is called discrete-state if X_t(e) is a discrete random variable for all t ∈ T, and continuous-state if X_t(e) is a continuous random variable for all t ∈ T.

Similarly, a stochastic process is discrete-time if the set T is discrete, and continuous-time if the set of times T is an interval (bounded or unbounded) in R. In what follows, we omit writing e as an argument of a stochastic process (as is usually done for random variables).

Definition 1.5. A stochastic process X_t is Markov if for any times t_1 < t_2 < ⋯ < t_n < t and any sets A_1, A_2, ..., A_n, A,

$P(X_t \in A \mid X_{t_n} \in A_n, \dots, X_{t_1} \in A_1) = P(X_t \in A \mid X_{t_n} \in A_n).$  (1.8)

Definition 1.6. A Markov chain is a discrete-state, discrete-time, Markov stochastic process.

To simplify the notation, we proceed as follows. A Markov chain is a discrete-time process, so we can see it as a sequence of random variables {X_0, X_1, ...}, where X_k describes the situation at time t = k. Since it is also a discrete-state process, we can denote the states by 1, 2, ..., n (they may start at another value, and n may not be finite). Then the random variable X_k has the pdf

$X_k \sim \begin{pmatrix} 1 & 2 & \cdots & n \\ P_k(1) & P_k(2) & \cdots & P_k(n) \end{pmatrix},$  (1.9)

where P_k(1) = P(X_k = 1), ..., P_k(n) = P(X_k = n). Then the Markov property (1.8) can be written as follows:

$P(X_{t+1} = j \mid X_t = i, X_{t-1} = i_{t-1}, \dots) = P(X_{t+1} = j \mid X_t = i), \ \text{for all } t \in T.$  (1.10)

All this information is summarized in a matrix.

Definition 1.7.

- The probabilities

$p_{ij}(t) = P(X_{t+1} = j \mid X_t = i), \quad p_{ij}^{(h)}(t) = P(X_{t+h} = j \mid X_t = i)$  (1.11)

are called the transition and h-step transition probability, respectively.

- The matrices

$P(t) = \big(p_{ij}(t)\big)_{i,j=1,\dots,n}, \quad P^{(h)}(t) = \big(p_{ij}^{(h)}(t)\big)_{i,j=1,\dots,n}$  (1.12)

are called the transition and h-step transition probability matrix, respectively, at time t.

Definition 1.8. A Markov chain is said to be homogeneous (stationary) if all transition probabilities are independent of time.

Proposition 1.9. Let {X_0, X_1, ...} be a homogeneous Markov chain with transition probability matrix P, and let P_k denote the (row-vector) pdf of X_k. Then the following hold:

$P^{(h)} = P^h, \ \text{for all } h = 1, 2, \dots; \qquad P_k = P_0 \cdot P^k, \ \text{for all } k = 0, 1, \dots$  (1.13)
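As an illustration, (1.13) can be verified directly in MATLAB; the sketch below (our addition) uses an arbitrarily chosen two-state homogeneous chain:

% Sketch of Proposition 1.9 for an illustrative 2-state homogeneous chain.
P  = [0.9 0.1; 0.4 0.6];  % transition probability matrix (our choice)
P0 = [1 0];               % initial distribution: start in state 1
h  = 5;
Ph = P^h                  % h-step transition probability matrix P^(h)
Ph_dist = P0 * P^h        % pdf of X_h, by the second relation in (1.13)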

Definition 1.10. Let X be a Markov chain. The vector π = [π_1, ..., π_n], where π_k = lim_{h→∞} P_h(k), k = 1, ..., n (if it exists), is called a steady-state distribution of X.

Proposition 1.11. The steady-state distribution π = [π_1, ..., π_n] of a homogeneous Markov chain X, if it exists, is unique and is the solution of the (n + 1) × n linear system

$\pi P = \pi, \qquad \sum_{k=1}^{n} \pi_k = 1.$  (1.14)
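In MATLAB, system (1.14) can be solved directly. The sketch below (our addition) uses the same illustrative two-state chain as above; its exact steady-state distribution is [0.8, 0.2]:

% Sketch: solving pi*P = pi, sum(pi) = 1 as an (n+1) x n linear system.
P = [0.9 0.1; 0.4 0.6];         % illustrative transition matrix
n = size(P, 1);
A = [P' - eye(n); ones(1, n)];  % n equations pi*(P - I) = 0, plus sum = 1
b = [zeros(n, 1); 1];
piVec = (A \ b)'                % steady-state distribution, here [0.8 0.2]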

Definition 1.12. A Markov chain is called regular if there exists h ≥ 0, such that

$p_{ij}^{(h)} > 0,$

for all i, j = 1, ..., n.

Proposition 1.13. Any regular Markov chain has a steady-state distribution.

For more considerations on stochastic processes and Markov chains, see e.g. [2, 5].

2. Counting Processes

A special class of stochastic processes consists of those where one needs to count the occurrences of some types of events over time. These are described by counting processes.

Definition 2.1. A counting process X(t), t ≥ 0, is a stochastic process that represents the number of items counted by time t.

Counting processes deal with the number of occurrences of something over time, such as customers arriving at a supermarket, deleted errors, transmitted messages, number of job arrivals to a queue, holding times (in renewal processes), etc. In general, we refer to the occurrence of each event that is being counted as an "arrival". Since their sample paths (values) are always non-decreasing, non-negative integers, all counting processes are discrete-state stochastic processes. They can be discrete-time or continuous-time. Next, we will consider the most widely used examples, Binomial (discrete-time) and Poisson (continuous-time) counting processes.

2.1. Binomial counting process

Consider a sequence of Bernoulli trials with probability of success p and count the number of "successes".

Definition 2.2. A Binomial counting process X(n) is the number of successes in n Bernoulli trials, n = 0, 1, ....

Remark 2.3.

1. Obviously, a Binomial process X(n) is a discrete-state, discrete-time stochastic process.

2. The pdf of X(n) is B(n, p) at any time n. Recall that E(X(n)) = np.

3. The number of trials between two consecutive successes, Y, is the number of trials needed to get the next (first) success, so it has a SGeo(p) pdf. Recall that E(Y) = 1/p, V(Y) = q/p².

It is important to make the distinction between real time and the "time" variable n ("time" in the sense of a stochastic process). The variable n is not measured in time units; it counts the number of trials. Let us assume that Bernoulli trials occur at equal time intervals, say every Δ seconds (or other time measurement units). That means that n trials occur during time t = nΔ. The value of the process at time t has a Binomial pdf with parameters n = t/Δ and p. Then the expected number of successes during t seconds is

E(X(n)) = E(X(t/Δ)) = np = (t/Δ)p,

so the expected number of successes per second is

λ = p/Δ.

Definition 2.4. The quantity λ = p/Δ is called the arrival rate, i.e. the average number of successes per unit of time. The quantity Δ is called a frame, i.e. the time interval of each Bernoulli trial. The interarrival time is the time between successes.

We can now rephrase: p is the probability of an arrival (success) during one frame (trial), n = t/Δ is the number of frames during time t, and X(t/Δ) is the number of arrivals by time t.

The concepts of arrival rate and interarrival time describe the arrival of jobs in discrete-time queuing systems. In such models it is assumed that no more than one arrival can occur during each Δ-second frame (otherwise, a smaller Δ should be considered); thus, each frame is a Bernoulli trial.

The interarrival time Y, measured in number of frames, has a SGeo(p) pdf (as mentioned earlier). Since each frame takes Δ seconds, the interarrival time measured in time units is T = ΔY, a rescaled SGeo(p) variable, whose expected value and variance are given by

E(T) = ΔE(Y) = Δ/p = 1/λ, V(T) = Δ²V(Y) = Δ²q/p² = q/λ².  (2.1)
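Formulas (2.1) can be checked by simulation. In the MATLAB sketch below (our addition, with illustrative values for p and Δ), SGeo(p) variables are generated by the inverse transform Y = ⌈ln U / ln q⌉:

% Sketch: empirical check of (2.1). Y is SGeo(p), generated by the
% inverse transform Y = ceil(log(U)/log(q)); T = Del*Y.
p = 0.1; Del = 1; lambda = p/Del; q = 1 - p;  % illustrative values
Nrep = 1e5;
Y = ceil(log(rand(1, Nrep)) / log(q));  % SGeo(p) sample
T = Del * Y;
[mean(T), 1/lambda]                     % sample mean vs. E(T) = 1/lambda
[var(T), q/lambda^2]                    % sample variance vs. V(T) = q/lambda^2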

Markov property of Binomial counting processes

Obviously, the number of successes in n trials depends only on the number of successes in n - 1 trials (not on earlier values, at n - 2, n - 3, ...), so a Binomial process has the Markov property and, thus, is a Markov chain. Let us find the transition probability matrix. At each trial (i.e. during each frame), the number of successes X(n) either increases by 1 (in case of success) or stays the same (in case of failure). Then,

$p_{ij} = \begin{cases} q, & j = i \\ p, & j = i + 1 \\ 0, & \text{otherwise.} \end{cases}$  (2.2)

Obviously, transition probabilities are constant over time. Hence, X(n) is a stationary Markov chain with transition probability matrix given by

$P = \begin{pmatrix} q & p & 0 & 0 & \cdots \\ 0 & q & p & 0 & \cdots \\ 0 & 0 & q & p & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$  (2.3)

Notice that it is not a regular Markov chain: since X(n) is non-decreasing, $p_{i,i-1}^{(h)} = 0$ for all h ≥ 0 (once we have a number of successes, that number can never decrease). Hence, a Binomial counting process does not have a steady-state distribution.
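The lack of regularity can be seen on a truncated version of (2.3). In the MATLAB sketch below (our addition), the chain is truncated to 5 states and the last state is made absorbing, so that the matrix remains stochastic; no power of it has positive entries below the diagonal:

% Sketch: a truncation of (2.3) to states 0..4; the last state is made
% absorbing so that rows still sum to 1. No power of P has all entries
% positive: the part below the diagonal stays 0.
p = 0.1; q = 1 - p;
P = q*eye(5) + p*diag(ones(4, 1), 1);  % q on the diagonal, p above it
P(end, end) = 1;                       % absorbing boundary state
P10 = P^10                             % still upper triangular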

Simulation of a Binomial counting process

A Binomial counting process is a sequence of Bernoulli trials. We simulate those (see e.g. [7]) and count the number of successes.

2.2 Poisson counting process

Next, we want to obtain a continuous-time counting process, where the time t runs continuously through an interval and, thus, X(t) can change at infinitely many moments. This can be obtained as the limit of a discrete-time process whose frame size (time between trials) Δ approaches 0 (thus allowing more frames during any fixed period of time). We will let

Δ → 0 (so that n → ∞),

while keeping the arrival rate λ constant.

So, let us take the limiting case of a Binomial counting process as Δ → 0. Consider a Binomial counting process with arrival rate λ per unit of time, and let X(t) denote the number of arrivals by time t.

The number of frames during time t, n = t/Δ → ∞, as Δ → 0.

The probability of an arrival during a frame, p = λΔ → 0, as Δ → 0.

Thus, one parameter of the Binomial pdf approaches ∞ while the other approaches 0, yet λ remains constant.

X(t) has a B(n, p) distribution with pdf $P(X(t) = k) = C_n^k \, p^k (1-p)^{n-k}$, k = 0, 1, ..., n. Let us see what this becomes. With n = t/Δ and p = λΔ = λt/n,

$C_n^k \, p^k (1-p)^{n-k} = \frac{n(n-1)\cdots(n-k+1)}{k!} \left(\frac{\lambda t}{n}\right)^{k} \left(1 - \frac{\lambda t}{n}\right)^{n-k} \xrightarrow{n \to \infty} \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}.$

So, the limiting pdf is

$P(X(t) = k) = e^{-\lambda t}\, \frac{(\lambda t)^k}{k!}, \quad k = 0, 1, 2, \dots,$

which means X(t) has a Poiss(λt) distribution. This is a Poisson counting process.
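This convergence can be observed numerically. The MATLAB sketch below (our addition) fixes λ, t and k and lets the number of frames n grow:

% Sketch: B(n, lambda*t/n) probabilities at k approach the Poiss(lambda*t)
% probability as n grows (frame size Del = t/n shrinks).
lambda = 6; t = 1; k = 3;       % illustrative values
for n = [10 100 1000 10000]
    p = lambda*t/n;
    fprintf('n = %5d : %.6f\n', n, nchoosek(n, k) * p^k * (1-p)^(n-k));
end
exp(-lambda*t) * (lambda*t)^k / factorial(k)  % limiting Poisson value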

Let us analyze what happens to the other characteristics. Recall that the interarrival time T = ΔY, where Y has a SGeo(p) pdf. For its cdf, we have

$F_T(t) = P(T \le t) = P\left(Y \le \frac{t}{\Delta}\right) = 1 - (1-p)^{t/\Delta} = 1 - (1 - \lambda\Delta)^{t/\Delta} \xrightarrow{\Delta \to 0} 1 - e^{-\lambda t}.$

Then its pdf is

$f_T(t) = F_T'(t) = \lambda e^{-\lambda t}, \quad t > 0,$

so T has an Exp(λ) pdf.

Simulation of a Poisson counting process

A Poisson counting process can be simulated by a special method, using the fact that each interarrival time (and the first arrival time) has an Exp(λ) pdf, which can easily be generated by the Inverse Transform Method as -(1/λ) ln U, where U has a Unif(0, 1) pdf (see [6]).
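In MATLAB, this amounts to one line per interarrival time; a minimal sketch (our addition):

% Sketch: one Exp(lambda) interarrival time by the Inverse Transform Method.
lambda = 6;
T = -log(rand)/lambda;  % rand is Unif(0,1), so -log(U)/lambda is Exp(lambda)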

3. Applications

Example 3.1. Messages arrive at a communication center at the rate of 6 messages per minute. Assume arrivals of messages are modeled by a Binomial counting process.

1. What frame size should be used to guarantee that the probability of a message arriving during each frame is 0.1?

We have λ = 6/min and p = 0.1. Thus,

Δ = p/λ = 0.1/6 min = 1/60 min = 1 sec.

2. Using the chosen frame size, find the probability of no messages arriving during the next 1 minute;

So Δ = 1 sec. In t = 1 minute = 60 seconds, there are n = t/Δ = 60 frames. The number of messages arriving during 60 frames, X(60), has a Binomial distribution with parameters n = 60 and p = 0.1. So the desired probability is

P(X(60) = 0) = pdf_{X(60)}(0) = (1 - 0.1)^60 = 0.0018.

3. Using the chosen frame size, find the probability of more than 350 messages arriving during the next hour;

Similarly, in t = 1 hour = 3600 seconds, there are n = t/Δ = 3600 frames. Thus, the number of messages arriving during one hour, X(3600), has a Binomial distribution with parameters n = 3600 and p = 0.1. Then, the probability of more than 350 messages arriving during the next hour is

P(X(3600) > 350) = 1 - P(X(3600) ≤ 350) = 1 - cdf_{X(3600)}(350) = 0.6993.

4. Find the average interarrival time and its standard deviation.

By (2.1) we have

E(T) = 1/λ = 1/6 minutes = 10 seconds, Std(T) = √V(T) = √((1 - p)/λ²) = √0.0250 minutes ≈ 9.5 seconds.

5. Find the probability that the next message does not arrive during the next 20 seconds.

Recall that the interarrival time T = ΔY, where Y has a SGeo(p) distribution and, hence, Y - 1 has a Geo(p) pdf. The next message does not arrive during the next 20 seconds if T > 20. So,

P(T > 20) = P(ΔY > 20) = P(Y > 20) = 1 - P(Y ≤ 20) = 1 - P(Y - 1 ≤ 19) = 1 - cdf_{Y-1}(19) = 0.1216.

Note that this is also the probability of 0 arrivals during n = t/Δ = 20 frames. The number of messages arriving during the next 20 seconds, X(20), has a Binomial distribution with parameters n = 20 and p = 0.1. Thus, the probability that no messages arrive during the next 20 seconds is

P(X(20) = 0) = pdf_{X(20)}(0) = 0.1216.
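For reference, the probabilities in this example can be reproduced with functions from MATLAB's Statistics and Machine Learning Toolbox, assuming it is available:

% Reproducing the probabilities of Example 3.1 (Statistics Toolbox assumed).
p = 0.1;
binopdf(0, 60, p)          % item 2: P(X(60) = 0)     = 0.0018
1 - binocdf(350, 3600, p)  % item 3: P(X(3600) > 350) = 0.6993
1 - geocdf(19, p)          % item 5: P(Y - 1 > 19)    = 0.1216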

6. The following MATLAB code gives a simulation and illustration for the number of messages arriving in one minute:
% Simulation of a Binomial counting process with frame size Del.
clear all
N = input('size of sample path = ');
p = input('prob. of success (arrival) = ');
Del = input('frame size (in seconds) = ');
X = zeros(1, N);
X(1) = (rand < p);                       % X denotes the nr. of successes
for t = 2 : N
    X(t) = X(t-1) + (rand < p);          % count the nr. of successes
end
clf
% illustration
axis([0 N 0 max(X)]);                    % allot the box for the simulated segment
hold on
title('Binomial process of arrivals');
xlabel('time'); ylabel('nr. of arrivals');
for t = 1 : N
    plot(t, X(t), '*', 'MarkerSize', 8)  % plot each point with a '*'
    pause(Del)                           % wait one frame, to animate in real time
end


The program returns a sequence of the number of arrivals (messages) each second. The illustration is shown in Figure 1.
size of sample path = 60
prob. of success (arrival) = 0.1
frame size (in seconds) = 1
X = 001111222222222222222222233333344444444444445555566667888888


Example 3.2. Let us revisit the previous example and model the arrival of messages using a Poisson counting process, keeping the same arrival rate of 6 messages per minute.

1. Find the probability of no messages arriving during the next 1 minute.

We have t = 1 minute and λ = 6/minute. The number of messages arriving during 1 minute, X(1), has a Poisson distribution with parameter λt = 6. So the desired probability is

P(X(1) = 0) = pdf_{X(1)}(0) = 0.0025.

2. Find the probability of more than 350 messages arriving during the next hour.

Similarly, in 1 hour = 60 minutes, the number of arriving messages, X(60), has a Poisson distribution with parameter λt = 360. Then the probability of more than 350 messages arriving during the next hour is

P(X(60) > 350) = 1 - P(X(60) ≤ 350) = 1 - cdf_{X(60)}(350) = 0.6894.

3. What is the average interarrival time and its standard deviation?

The interarrival time, T, now has an Exp(λ) = Exp(6) distribution, so

E(T) = 1/λ = 1/6 minutes = 10 seconds, Std(T) = √V(T) = √(1/λ²) = 1/6 minutes = 10 seconds.

Notice that the average interarrival time has not changed. This is to be expected, since jobs (messages) arrive at the same rate, [lambda], regardless of whether their arrivals are modeled by a Binomial or a Poisson process.

However, the standard deviation is slightly increased. That is because a Binomial process has a restriction on the number of arrivals during each frame, thus reducing variability.

4. Find the probability that the next message does not arrive during the next 20 seconds.

Either we work in seconds (so λ = 1/10 per second) and compute the probability P(T > 20), where T has an Exp(1/10) distribution, or in minutes (λ = 6 per minute, 20 seconds = 1/3 minutes) and compute the probability P(T > 1/3), where T has an Exp(6) distribution. Either way, we have

P(T > 20) = 1 - P(T ≤ 20) = 1 - cdf_T(20) = 0.1353.

Again, this is the same as the probability of 0 arrivals in 1/3 minutes, where the number of arriving messages, X(1/3), has a Poisson distribution with parameter λt = 2:

P(X(1/3) = 0) = pdf_{X(1/3)}(0) = 0.1353.
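As before, these values can be reproduced with Statistics and Machine Learning Toolbox functions, if available; note that expcdf is parameterized by the mean 1/λ, not by the rate λ:

% Reproducing the probabilities of Example 3.2 (Statistics Toolbox assumed).
poisspdf(0, 6)          % item 1: P(X(1) = 0)    = 0.0025
1 - poisscdf(350, 360)  % item 2: P(X(60) > 350) = 0.6894
1 - expcdf(1/3, 1/6)    % item 4: P(T > 20 sec)  = 0.1353 (mean 1/6 min)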

5. The following MATLAB code gives a simulation and illustration for the arrival times and the number of messages arriving in one minute:
% Simulation of a Poisson process on the time interval [0, Tmax].
clear all
lambda = input('frequency lambda = ');            % given arrival rate
Tmax = input('time frame (in minutes) Tmax = ');  % given time period
arr_times = -1/lambda * log(rand);    % array of arrival times; each
last_arrival = arr_times;             % interarrival time is Exp(lambda)
while last_arrival <= Tmax
    last_arrival = last_arrival - 1/lambda * log(rand);
    arr_times = [arr_times, last_arrival];
end
arr_times = arr_times(1:end-1)        % arrival times during [0, Tmax];
                                      % the last arrival exceeds Tmax, so it is dropped
% illustration
step = 0.01;                          % small step size, to simulate continuity
t = 0 : step : Tmax;                  % time variable
Nsteps = length(t);
X = zeros(1, Nsteps);                 % Poisson process X(t)
for s = 1 : Nsteps
    X(s) = sum(arr_times <= t(s));    % X(s) is the number of arrivals by time t(s)
end
axis([0 max(t) 0 max(X)]); hold on
title('Poisson process of arrivals');
xlabel('time'); ylabel('number of arrivals');
plot(t, X, 'r')


The program returns the arrival times and a sequence for the number of messages arriving every 0.01 of a minute (i.e. every 0.6 of a second). The illustration is shown in Figure 2.
frequency lambda = 6
time frame Tmax = 1
arr_times = 0.0912 0.3310 0.4608 0.4671 0.5677 0.6763 0.9201
X = 00000000001111111111111111111111112222222222222444
    444444455555555555666666666666666666666666677777777


Note. All computations of pdf's and cdf's were done in MATLAB.

4. Conclusions

This paper presents the main properties of counting processes. We give theoretical considerations and computational formulas, and discuss properties of Binomial and Poisson counting processes. We also describe how each counting process can be simulated on the computer. As applications, we model a problem (message arrivals) by both a Binomial and a Poisson counting process, computing and discussing several quantities related to the problem. We also illustrate the arrivals of messages in both cases using computer simulations.

References

[1] F. Ball, R. K. Milne, Simple Derivations of Properties of Counting Processes Associated with Markov Renewal Processes, J. of Appl. Probab., Vol. 42(4), 2005, 1031-1043.

[2] M. Baron, Probability and Statistics for Computer Scientists, 2nd Edition, CRC Press, Taylor & Francis, Boca Raton, FL, USA, 2014.

[3] C. Breto, E. L. Ionides, Compound Markov counting processes and their applications to modeling infinitesimally over-dispersed systems, Stoch. Proc. Appl., Vol. 121(11), 2011, 2571-2591.

[4] G. Conforti, Bridges of Markov counting processes: quantitative estimates, Electron. Commun. Probab., Vol. 21, 2016, paper no. 19, 13 pp.

[5] S. Micula, R. Sobolu, Applications and Computer Simulations of Markov Chains, J. of Information Systems and Operations Management, Vol. 11(2), 2017, 243-253.

[6] S. Micula, I. D. Pop, Simulations of Continuous Random Variables and Monte Carlo Methods, J. of Information Systems and Operations Management, Vol. 10(2), 2016, 435-447.

[7] S. Micula, Statistical Computer Simulations and Monte Carlo Methods, J. of Information Systems and Operations Management, Vol. 9(2), 2015, 384-394.

[8] J. S. Milton, J. C. Arnold, Introduction to Probability and Statistics: Principles and Applications for Engineering and the Computing Sciences, 3rd Edition, McGraw-Hill, New York, 1995.

[9] http://www.mathworks.com/help/matlab/, 2019.

Sanda MICULA (1)

Rodica SOBOLU (2)

(1) Assoc. Prof., PhD. Habil., Department of Mathematics and Computer Science, Babeș-Bolyai University, Cluj-Napoca, Romania, email: smicula@math.ubbcluj.ro

(2) Corresponding author, Assoc. Prof., PhD., Department of Land Measurements and Exact Sciences, University of Agricultural Sciences and Veterinary Medicine, Cluj-Napoca, Romania, email: rodica.sobolu@usamvcluj.ro