# AE entropy for the condition monitoring of CFRP subjected to cyclic fatigue

Introduction

During the progressive degradation of a cyclically loaded CFRP composite, various damage mechanisms are introduced in the material [1, 2]. In each cycle, AE is emitted both from damage progression and from cumulated damage, e.g. the rubbing of delaminated surfaces. As a result, multiple AE transients, with varying amplitudes, durations, and frequencies, can be emitted simultaneously from the numerous AE sources within the material. Depending on both the damage mechanisms and the loading, AE from cumulated damage consists of either separable transients or inseparable ones, e.g. due to a high degree of overlapping. Throughout the cyclic life, the number of AE signals increases with increasing cumulated damage. The energy of the AE also increases, and for some composites the energy also varies within a cycle.

Features extracted from the AE signal have been used for the detection of damage, e.g. delamination, matrix cracking, debonding, fiber cracking, and fiber pull-out [3-7]. AE from cumulated damage has mainly been regarded as unwanted, and many attempts have been made to filter it out, e.g. by thresholding the AE features [8], coupling the AE to the load [5], limiting the analysis to a part of the loading cycle [9, 10], and frequency analysis [11]. The AE waves in the material will reflect and undergo attenuation before being picked up by the AE sensor. The attenuation is due to geometric spreading, dispersion, internal friction, and scattering. The values of AE features from cumulated damage usually fall in the same range as those from damage growth [5, 13], and it can be very difficult to distinguish between the two types. Because important AE events can get buried in the AE signals generated by friction and the rubbing of crack surfaces [11, 12], attempts have been made to filter out AE events from these damage mechanisms [5, 10, 13]. Awerbuch and Ghaffari concluded that frictional AE should not be eliminated, as it may provide important information about the condition of a composite [10]. They argued that damage detection could be made easier by using frictional AE, since damage growth, i.e. a material change, produces AE only once, whereas the resulting rubbing of damage surfaces generates AE many times.

It is reasonable to assume that both environmental and measurement noise can be kept relatively constant during monitoring. Hence, an increase in the randomness of the AE measurements will mainly be due to increased AE activity, from either cumulated damage or damage growth. In this study, it is investigated whether this randomness can be used for condition monitoring, i.e. whether it can provide early failure warning. To estimate the randomness of the AE measurements, a fundamental concept from information theory is used: the entropy.

Entropy and its Estimation

The information entropy was introduced by C. E. Shannon in 1948 [14]. In his paper, Shannon developed a method of measuring the randomness, or uncertainty, of a signal. The randomness is information encoded in the signal, and the entropy increases with the amount of information. Shannon recognized that the form of the measure was the same as that of the entropy in statistical mechanics and, for this reason, called his measure 'entropy'. Shannon's formula for the entropy is

$$H_{\text{Shannon}}(X) = -\sum_{\lambda} \Pr(X = \lambda)\,\log\big(\Pr(X = \lambda)\big). \qquad (1)$$

The signal's values are modeled as realizations of a discrete random variable $X$. The possible signal values are denoted by $\lambda$, and $\Pr(X = \lambda)$ is the probability mass function (PMF) of $X$. Consequently, the entropy is a function of the signal's probability mass function, but not of the values themselves. Without any constraints, the maximum entropy is attained when all values are equally probable, i.e. when the signal is white noise. The entropy is the minimum weighted average number of units, per value, required to encode the signal. The unit of measurement depends on the choice of logarithm base: with 2, 10, or $e$ as the base, the units are bits, hartleys, or nats, respectively. The base can be changed using the change-of-base rule for logarithms, i.e.

$$\log_a(x) = \log_a(b)\,\log_b(x). \qquad (2)$$
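As a minimal sketch, Eqs. 1 and 2 can be expressed in a few lines of Python using NumPy (the function name and the eight-sided-die PMF below are illustrative, not from the paper):

```python
import numpy as np

def shannon_entropy(pmf, base=np.e):
    """Entropy of a probability mass function (Eq. 1), in units set by `base`."""
    p = np.asarray(pmf, dtype=float)
    p = p[p > 0]  # the term 0*log(0) is taken as 0
    # Eq. 2: divide by log(base) to change from nats to the requested unit
    return -np.sum(p * np.log(p)) / np.log(base)

# A fair 8-sided die attains the unconstrained maximum entropy log2(8) = 3 bits
pmf = np.full(8, 1.0 / 8.0)
h_bits = shannon_entropy(pmf, base=2)     # 3.0 bits
h_nats = shannon_entropy(pmf)             # = 3 * log(2) nats, per Eq. 2
```

Note that only the PMF enters the computation, not the signal values themselves, in line with the discussion above.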

In practice, computing the entropy can be challenging because the underlying distribution is often unknown, as is the case for the AE signal measured during cyclic testing of CFRP. Consequently, the entropy needs to be estimated. This can be done by estimating the probability mass function using statistical methods, or by estimating the entropy directly using data compression [15, 16]. A normalized histogram of the random variable can be used to estimate the probability mass function. By using a histogram to estimate the probabilities, the entropy is estimated with respect to a model that assumes that the frequencies of the signal's values are constant within the signal segment. The histogram is normalized to sum to one by

$$n_i = \frac{m_i}{\sum_{j=1}^{k} m_j} \quad \text{for } i = 1, \ldots, k \qquad (3)$$

where k is the number of bins and [m.sub.i] is the number of observed signal values that fall in bin i. The normalized values of the histograms represent the proportion, or probability, of the corresponding signal's values. In the frequency domain, the frequency can be considered to be the random variable and the normalized spectrum to be the probability mass function. The spectrum is normalized by

$$x_i = \frac{X_i}{\sum_{j=1}^{N} X_j} \quad \text{for } i = 1, \ldots, N \qquad (4)$$

where $X_i$ is the magnitude of the $i$th frequency component of the spectrum, e.g. the amplitude if an amplitude spectrum is used. Based on this probability mass function, the entropy can be computed using Shannon's formula. By considering the spectral amplitude to be the random variable, a different entropy can also be defined and computed using Shannon's formula; the probability mass function of the amplitude intensities can be estimated using a histogram. When the probabilities are based on the discrete Fourier transform, or on histograms, the entropy is estimated with respect to a static model. These two entropies will be referred to as the frequency entropy and the spectrum entropy, respectively.
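The frequency entropy can be sketched as follows, using NumPy's FFT and the normalization of Eq. 4 (the function name and the test signals are illustrative assumptions):

```python
import numpy as np

def frequency_entropy(x, n_fft=None):
    """Frequency entropy: Shannon entropy (in nats) of the one-sided
    amplitude spectrum, normalized to sum to one (Eqs. 1 and 4)."""
    spectrum = np.abs(np.fft.rfft(x, n=n_fft))  # one-sided amplitude spectrum
    p = spectrum / spectrum.sum()               # Eq. 4: normalize to a PMF
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# White noise has a nearly flat spectrum, so its frequency entropy is close
# to the maximum; a pure tone concentrates the spectrum in one bin.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
tone = np.sin(2 * np.pi * 50 * np.arange(4096) / 4096)
assert frequency_entropy(tone) < frequency_entropy(noise)
```

This illustrates the later remark that the frequency entropy measures the flatness of the spectrum.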

The Shannon entropy is properly defined as the minimum entropy over all possible models, i.e. it is an entropy computed using Shannon's formula and the correct probability mass function. In other words, it is the theoretical upper limit on lossless data compression that can be achieved for a given signal [15, 16]. Consequently, the entropy can be used to evaluate compression algorithms to determine whether there is room for improvement. Conversely, compression algorithms can be used to estimate the entropy of data. The compressed data can be written to a file and the file size then converted from bytes to nats using:

$$H_{\text{compression}} = \frac{8\,\log_e(2)\;\text{File\_Size}}{\text{length\_of\_signal}} \qquad (5)$$

where File_Size, the size of the compressed file in bytes, is multiplied by 8 to convert to bits. Equation 2 is then used to change from bits (base-2 logarithm) to nats (base-$e$ logarithm). The result is averaged over all values (samples) by dividing by length_of_signal. If the header of the compressed file is included, the entropy estimate will be higher; however, if the AE signal length is kept constant, the error due to the header will be approximately the same for all computations.
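Equation 5 can be sketched with a general-purpose compressor from the Python standard library. PPMd itself is not in the standard library, so `lzma` is used here as a stand-in compressor; the function name and the signal lengths are illustrative:

```python
import lzma

import numpy as np

def compression_entropy(signal):
    """Entropy estimate in nats per sample via Eq. 5. The paper uses 7-Zip's
    PPMdH; stdlib LZMA is used here as a stand-in compressor."""
    # Write the signed 16-bit integers in ASCII form with no separators,
    # as described later in the Results section.
    text = "".join(str(int(v)) for v in signal).encode("ascii")
    n_bytes = len(lzma.compress(text))    # includes a small header
    bits = 8 * n_bytes
    return np.log(2) * bits / len(signal)  # bits -> nats, averaged per sample

# A constant signal compresses far better than random 16-bit values,
# so its entropy estimate is much lower.
rng = np.random.default_rng(0)
flat = np.zeros(10_000, dtype=np.int16)
noisy = rng.integers(-2**15, 2**15, 10_000, dtype=np.int16)
assert compression_entropy(flat) < compression_entropy(noisy)
```

The residual header overhead mentioned above is visible here: even the constant signal yields a small nonzero estimate.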

Among the best lossless compression approaches are those based on a scheme known as prediction by partial matching (PPM). The PPM compression scheme is divided into two steps: modeling, from which the scheme takes its name, and coding. Arithmetic coding is used to code the output of the modeler; it is a highly effective technique, which can code data close to its entropy with respect to the model [16]. The PPM modeler works through the data symbol by symbol, and its output is a set of conditional probabilities for the symbols. The probabilities of the symbols are estimated adaptively and used to predict the next unseen symbol. For prediction, the modeler uses finite-context models of the $k$ symbols that immediately precede the symbol to be predicted. The number $k$ is also referred to as the model order and is specified by the user before the compression is initiated.

During the modeling of each symbol, the modeler begins by looking up how many times the current context of length $l_c = k$ has occurred before. If the context has been observed before, followed by the symbol, the symbol can be coded using a probability of $n_c/n$, where $n_c$ is the number of times the context has been observed followed by the symbol and $n$ is the number of times the context has been observed. If the context has not been encountered before, or has only been followed by different symbols, an escape character is passed to the modeler. When the modeler receives the escape character, it switches to a context that is one symbol shorter, i.e. to a context of length $l_c = k - 1$. Again, if the current context has not been observed before, or has only been followed by different symbols, another escape character is passed to the modeler and it starts to look for contexts that are one symbol shorter.

This can be repeated until the context length becomes $l_c = -1$ symbols. When this occurs, all symbols of the alphabet are considered equally probable. Equiprobability is undesirable since it does not provide an accurate model; however, it poses no problem for correct coding. The arithmetic coder can proceed even when the model is inaccurate, although more bits may then be required to encode the data. Intuitively, better compression is achieved with more accurate modeling. Fortunately, the context of $l_c = -1$ symbols is considered at most once for each symbol, and as the modeling proceeds the data statistics improve and lower values of $l_c$ become less and less frequent. Every time an escape character is sent (i.e. whenever the modeler is unable to code a symbol), the probability of observing a novel symbol, given the current context, is updated. By assigning a probability to the escape character, the modeling can be improved.
For a detailed description of the PPM scheme and examples, the reader is referred to Text Compression [15] and Managing Gigabytes: Compressing and Indexing Documents and Images [16].

Different variants of the PPM compression scheme have been introduced in order to improve the compression and to speed up the calculations. One variant, suggested by Howard [17], is referred to as method D, or PPMD. It estimates the conditional probability of observing a particular symbol, given a specific context, as $(2n_c - 1)/(2n)$, where $n_c$ is the number of times the modeler has seen the symbol preceded by the context and $n$ is the total number of symbols preceded by the current context. The escape probabilities are estimated by $n_u/(2n)$, where $n_u$ is the number of unique symbols preceded by the current context and $n$ has the same meaning as before.
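The PPMD estimates can be illustrated with a toy context (the function and the example counts are illustrative, not the authors' implementation):

```python
from collections import Counter

def ppmd_probabilities(followers):
    """PPMD (method D, Howard [17]) estimates for one context.
    `followers` is the sequence of symbols seen after this context."""
    counts = Counter(followers)
    n = len(followers)        # total symbols preceded by this context
    n_u = len(counts)         # unique symbols preceded by this context
    sym_prob = {s: (2 * c - 1) / (2 * n) for s, c in counts.items()}
    escape_prob = n_u / (2 * n)
    return sym_prob, escape_prob

# Context seen 4 times, followed by 'a' three times and 'b' once:
probs, esc = ppmd_probabilities("aaab")
# 'a': (2*3-1)/8 = 5/8, 'b': (2*1-1)/8 = 1/8, escape: 2/8; total = 1
```

Note that the symbol and escape probabilities sum to one, so the estimates form a valid coding distribution for the arithmetic coder.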

Another variant was introduced by Dmitry Shkarin in Improving the Efficiency of the PPM Algorithm [18] (in Russian) under the name PPM with information inheritance, or PPMII. Shkarin presented PPMII a year later in English [19]. PPMII uses an additional model in order to obtain better estimates of the escape probabilities. To overcome the lack of statistical information when estimating the escape probabilities of long contexts, PPMII allows longer contexts to inherit statistics from shorter ones. The inheritance reduces the computational cost compared with other approaches to this problem [19]. Starting from code by Nelson [20], Shkarin implemented his improvements, and several other variations, and made them available in the public domain under the name PPM by Dmitry (PPMd).

Experimental Procedure

The test specimens used in this study are nominally identical prosthetic feet called Vari-Flex, made by Ossur hf. A total of 75 Vari-Flex feet were tested. The cyclic testing was performed in an ISO 10328 foot/limb test machine at Ossur's testing facilities. In the test, a foot is placed in the machine and two actuators apply a 90° phase-shifted sinusoidal constant-amplitude loading at 1.0 Hz. One actuator loads the forefoot and the other loads the heel. The tests were accelerated by increasing the maximum loading by 50% and placing a 2° plastic wedge between the heel and the toe components. The wedge is used by amputees in order to stiffen the foot. The increased load and the use of the wedge result in considerably shorter fatigue tests. Damage mechanisms frequently change with the stress level; however, both preparatory tests and the fatigue test results show that the damage mechanisms leading to final failure are the same as those observed under normal fatigue testing conditions.

All feet were tested until failure. Failure was defined by a 10% displacement criterion, a heuristic criterion used in-house at Ossur: a failure is declared when a 10% change in the displacement of either actuator, with respect to its initial value, is observed. Throughout each test, the AE data was acquired for one full fatigue cycle every 5 minutes. For a more detailed description of the experimental procedure, the reader is referred to references [21-23].

Results

Two entropies are estimated in the time domain, one using Shannon's formula and the other using data compression. In order to apply Shannon's formula, the probability mass function of the signal's values from each measurement is estimated from a normalized histogram using $2^{16}$ bins. In other words, the number of bins is set equal to the number of quantized discrete values from the 16-bit A/D converter used for AE acquisition. In order to estimate the entropy using data compression, the signed 16-bit integer data is written to a file in ASCII form with no spaces in between. The file is then compressed using variant H of the PPMd algorithm (PPMdH), as implemented in the open-source program 7-Zip. The resulting file size is then converted from bytes to nats (see Eq. 5). The model order is determined manually. The best compression results are obtained using model order $k = 5$, i.e. when the maximum context length is equal to the maximum number of digits used (omitting the sign). In the remainder of this paper, the entropy computed using Shannon's formula in the time domain will be referred to as the signal's entropy, and the entropy estimated using the PPMdH compression scheme will be referred to as the PPMdH entropy.

In the frequency domain, Shannon's formula is used to estimate both the frequency and the spectrum entropies. In order to compute the frequency entropy, the probability mass function of the frequencies is estimated by first transforming the signed 16-bit integer data to the frequency domain and then normalizing the one-sided amplitude spectrum to sum to one. Because of the normalization, there is no need to convert the data values to volts. The transformation is made by applying a $2^{21}$-point discrete Fourier transform (DFT): the number of DFT points is set to the next power of two above the length of the data series, for faster computation, by zero-padding the data. The computations can be made even faster by using fewer points, but then the resolution of the spectrum decreases and less information is provided; consequently, the entropy also decreases. If more points are used, which requires additional zero-padding, the number of frequency bins increases, as does the entropy. However, zero-padding the data before applying the DFT results in a frequency interpolation, which means that the added frequency bins do not contain any new information. As a result, the entropy increase is only a function of the number of added bins and is the same for all measurements.

In order to compute the spectrum entropy, the probability mass function of the spectral amplitudes is estimated from a histogram of the amplitude intensities. The amplitudes are quantized with 16 bits, relative to the highest amplitude found in the one-sided amplitude spectra of all measurements. A $2^{16}$-bin histogram is then used to estimate the probability mass function of the quantized amplitudes.
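A sketch of this quantize-then-histogram procedure (the function name and the synthetic test signals are illustrative; `global_max` stands for the highest amplitude over all measurements):

```python
import numpy as np

def spectrum_entropy(x, global_max):
    """Spectrum entropy: Shannon entropy (nats) of the PMF of spectral
    amplitudes, quantized to 16 bits relative to `global_max`."""
    amp = np.abs(np.fft.rfft(x))               # one-sided amplitude spectrum
    q = (amp / global_max * (2**16 - 1)).astype(np.int64)
    q = np.minimum(q, 2**16 - 1)               # clip to the top bin
    counts = np.bincount(q, minlength=2**16)   # 2^16-bin histogram
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

# A tone's spectrum has one dominant amplitude, so its amplitude histogram is
# concentrated; noise spreads the amplitudes over many bins.
rng = np.random.default_rng(0)
noise = rng.standard_normal(4096)
tone = np.sin(2 * np.pi * 50 * np.arange(4096) / 4096)
gmax = max(np.abs(np.fft.rfft(noise)).max(), np.abs(np.fft.rfft(tone)).max())
assert spectrum_entropy(tone, gmax) < spectrum_entropy(noise, gmax)
```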

Entropy as a Condition Signature

To evaluate whether the entropies can be used to provide early failure warnings, and to compare them against other AE features, the approach of reference [24] is used. The probability distribution of the entropy values at each percentage point is estimated by first computing the entropy from all measurements, for each foot tested, and then generating a histogram of the entropy values at the given percentage point of the lifetime. The lifetime of each foot is normalized to 100% according to the 10% displacement failure criterion. All figures in this section have a grey area that represents all values lying within one standard deviation of the mean. Also superimposed on the figures are histograms that show the distributions of the corresponding entropy at 50% and 95% of the normalized fatigue life. Figures 1a and 1b show, respectively, the average evolution of the AE hit count and the signal's energy for all feet tested. These two figures were presented in reference [24].

[FIGURE 1 OMITTED]

The average evolution of the four entropies for all the feet is shown in Figs. 1c-1f. As one can observe, the curves are relatively flat from approximately 20% to 95% of the normalized lifetime, and the standard deviation is high. The curves are therefore not suitable for issuing early failure alerts. This is verified by the almost perfect overlap of the histograms of the entropy values at 50% and 95% of the lifetime.

It is interesting to note that, of the two entropies computed in the time domain, the PPMdH entropy (shown in Fig. 1d) is lower than the signal's entropy (shown in Fig. 1c). In order to apply Shannon's formula, the probability mass function of the signal's values is estimated using a histogram of the signal's values. This estimates the entropy of the signal with respect to a static model, which is by no means the best model for the data; as can be observed, the PPMdH compression scheme produces a better model. Nonetheless, the evolution curve for the PPMdH entropy reveals no additional information.

Table 1 presents the median Pearson and Spearman correlation coefficients estimated between the AE hit count, the energy and the four entropies. Each coefficient is estimated by first computing the corresponding coefficient between the curves from each fatigue test and then computing the median of the coefficients obtained from all feet.
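The per-test, then median, procedure behind Table 1 can be sketched as follows (a minimal sketch with synthetic curves; Spearman is computed here as Pearson on ranks, ignoring ties):

```python
import numpy as np

def pearson(a, b):
    """Pearson correlation coefficient between two curves."""
    return np.corrcoef(a, b)[0, 1]

def spearman(a, b):
    """Spearman coefficient = Pearson coefficient of the ranks
    (ties broken arbitrarily in this sketch)."""
    rank = lambda v: np.argsort(np.argsort(v)).astype(float)
    return pearson(rank(a), rank(b))

def median_correlation(curve_pairs, coef=pearson):
    """Median over feet of the per-test coefficient, as in Table 1.
    `curve_pairs` holds one (feature_curve_a, feature_curve_b) pair per foot."""
    return np.median([coef(a, b) for a, b in curve_pairs])
```

For example, two linearly related curves give a Pearson coefficient of 1.0, and a monotone nonlinear relation still gives a Spearman coefficient of 1.0.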

The results presented in the table show that the evolutions of the two entropies estimated in the time domain, i.e. the signal's entropy and the PPMdH entropy, correlate well with the evolution of the AE hit count. Furthermore, the calculations also show that there is a significant correlation between the energy and the spectrum entropy. Since the energy requires less computation and its interpretation is more intuitive, the spectrum entropy does not seem to offer any advantage. The frequency entropy measures the flatness of the spectrum. It can be used to detect the presence of broadband events that may not be detected from the evolution of the energy. Consequently, the frequency entropy can possibly be used to supplement the energy feature.

Summary

In this study, the AE hit count, the energy, and the entropies were extracted from the AE signals without applying any special filtering to remove AE, e.g. AE from cumulated damage. The purpose of the study was to investigate whether the entropies could be used for condition monitoring of CFRP subjected to multi-axial cyclic loading, i.e. for providing early failure warning.

The results show that the entropies studied here cannot be used with the proposed approach for issuing early failure alerts. This is because they have relatively flat evolution curves from 20% to 95% of the normalized lifetime.

The results also showed that the trending results obtained using both the signal's entropy and the PPMdH entropy (file compression) are nearly the same as those obtained using STFT-based AE hit counting. This suggests that the signal's entropy might be a better alternative to the AE hit count for monitoring AE activity, because it requires less computational effort, no filtering (the number of hits is reduced by thresholding), and the only tuning required is the choice of the histogram's bin size. These results suggest that further work should be done to compare the signal's entropy against AE hit counts based on conventional thresholding.

Acknowledgements

The work of the first author was supported by grants from the University of Iceland Research Fund, the Icelandic Research Council (Rannis) Research Fund, and the Icelandic Research Council Graduate Research Fund. Furthermore, the authors wish to acknowledge Ossur for both providing prosthetic feet for testing and for providing access to their testing facilities.

References

[1.] G. Sims, Fatigue in Composites, Woodhead Publishing Ltd., Cambridge, 2003, 36-62.

[2.] M. Knops and C. Bogle, Composites Science and Technology, 66, 2006, 616-625.

[3.] M. Giordano, A. Calabro, C. Esposito, A. D'Amore, and L. Nicolais, Composites Science and Technology, 58, 1998, 1923-1928.

[4.] M. Wevers, NDT and E International, 30, 1997, 99-106.

[5.] D. Tsamtsakis, M. Wevers, P. De Meester, J. of Reinforced Plastics and Composites, 17, 1998, 1185-1201.

[6.] H. Nayeb-Hashemi, P. Kasomino, and N. Saniei, J. of Nondestructive Evaluation, 18, 1999, 127-137.

[7.] E. R. Green, J. of Nondestructive Evaluation, 17, 1998, 117-127.

[8.] N. Ativitavas, T. J. Fowler, T. Pothisiri, J. of Nondestructive Evaluation, 23, 2004, 21-36.

[9.] Y. A. Dzenis and J. Qian, International J. of Solids and Structures, 38, 2001, 1831-1854.

[10.] J. Awerbuch, S. Ghaffari, J. of Reinforced Plastics and Composites, 7, 1988, 245-64.

[11.] G. Kamala, J. Hashemi, A. A. Barhorst, J. of Reinf. Plastics and Comp. 20, 2001, 222-238.

[12.] A. P. Mouritz, Fatigue in Composites, Woodhead Publ. Ltd., Cambridge, 2003, 242-266.

[13.] Y. A. Dzenis, International J. of Fatigue, 25, 2003, 499-510.

[14.] C. E. Shannon, Bell System Technical Journal, 27, 1948, 379-423 and 623-656.

[15.] T. Bell, J. Cleary, I. Witten, Text Compression, Prentice Hall, 1990.

[16.] I. Witten, A. Moffat, T. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann, 1999.

[17.] P. G. Howard, Ph.D. thesis, Brown University, 1993.

[18.] D. A. Shkarin, Problems of Information Transmission, 37, 2001, 226-235.

[19.] D. Shkarin, DCC 2002, IEEE Computer Society, April, Snowbird, Utah, 2002, 202-211.

[20.] M. Nelson, Arithmetic coding and statistical modeling, Dr. Dobb's Journal.

[21.] R. Unnthorsson, T. P. Runarsson, and M. T. Jonsson, International J. of Fatigue, doi:10.1016/j.ijfatigue.2007.02.024, 2007.

[22.] R. Unnthorsson, T. P. Runarsson, M. T. Jonsson, J. of Acoustic Emission, 25, 2007, 253-259.

[23.] R. Unnthorsson, Ph.D. Thesis, University of Iceland, 2008.

[24.] R. Unnthorsson, T. P. Runarsson, M. T. Jonsson, J. of Acoustic Emission, 26, 2008, 229-239.

RUNAR UNNTHORSSON, THOMAS P. RUNARSSON and MAGNUS T. JONSSON

Department of Mechanical & Industrial Engineering, University of Iceland, Reykjavik, Iceland

During the progressive degradation process of a cyclically loaded CFRP composite material various damage mechanisms are introduced in the material [1, 2]. In each cycle, AE is emitted from both damage progression and from cumulated damage, i.e. rubbing of delaminated surfaces. As a result multiple AE transients, with varying amplitude, duration, and frequencies, can be simultaneously emitted from the numerous AE sources within the material. Depending on both the damage mechanisms and loading, AE from cumulated damage is either separable transients or inseparable, e.g. due to high degree of overlapping. Throughout the cyclic life the number of AE signals increases with increasing cumulated damage. The energy of the AE also increases and for some composites the energy also varies within a cycle.

Features extracted from the AE signal have been used for detection of damage, i.e. delamination, matrix cracking, debonding, fiber cracking, and fiber pull-out [3-7]. AE from cumulated damage has mainly been regarded as unwanted and many attempts have been made to filter it out, e.g. by thresholding the AE features [8], coupling the AE to the load [5], limiting the analysis to a part of the loading cycle [9, 10], and frequency analysis [11]. The AE waves in the material will reflect and undergo attenuation before being picked up by the AE sensor. The attenuation is due to geometric spreading, dispersion, internal friction and scattering. The values of AE features from cumulated damage usually fall in the same range as the ones from damage growth [5, 13] and it can be very difficult to distinguish between the two types. Because important AE events can get buried in the AE signals generated by friction and the rubbing of crack surfaces [11, 12] attempts have been made to filter out AE events from these damage mechanisms [5, 10, 13]. Awerbuch and Ghaffari concluded that frictional AE should not be eliminated as it may provide important information about the condition of a composite [10]. They argued that damage detection could be made easier by using frictional AE, since damage growth, i.e. a material change, produces AE only once, but the resulting rubbing of damage surfaces generates AE many times.

It is reasonable to assume that both environmental and measurement noise can be kept relatively constant during monitoring. Hence, an increase in the randomness of the AE measurements will mainly be due to increased AE activity, from either cumulated damage or damage growth. In this study, it is investigated whether the randomness can be used for condition monitoring, i.e. if it can be used to provide early failure warning. For estimating the randomness of the AE measurements a fundamental concept in Information Theory is used; that is, the entropy.

Entropy and its Estimation

The information entropy was introduced by C. E. Shannon in 1948 [14]. In his paper Shannon developed a method of measuring the randomness of a signal, or its uncertainty. The randomness is information encoded in the signal and the entropy increases with more information. Shannon recognized that the form of the measure was the same as the entropy in statistical mechanics and, for this reason, he called his measure 'entropy'. Shannon's formula for the entropy is

[H.sub.SHANNON](X) = - [summation over (lambda)]Pr([chi] = [lambda])log(PR{[chi] = [lambda])). (1)

The signal's values are denoted by x and are considered to be discrete random variables. The possible signal values are denoted by [lambda], and Pr(x = l) is the probability mass function (PMF) of X. Consequently, the entropy is a function of the signal's probability mass function, but not the values themselves. Without any constraints, the maximum entropy is attained when all values are equally probable, i.e. when the signal is white noise. The entropy is the minimum weighted average number of units, per value, required to encode the signal. The unit of measurement depends on the choice of the logarithm base, i.e. by choosing 2, 10, or e as the base the units will be bits, hartleys, or nats, respectively. The base can be changed by using the law of logarithm, i.e.

[log.sub.a](X)=[log.sub.a](b)[log.sub.b](X). (2)

In practice, computing the entropy can be challenging because the underlying distribution is often unknown, for instance the AE signal measured during cyclic testing of CFRP. Consequently, the entropy needs to be estimated. This can be done by estimating the probability mass function using statistical methods or by estimating the entropy directly using data compression [15,16]. A normalized histogram of the random variable can be used to estimate the probability mass function. By using a histogram to estimate probabilities, the entropy is estimated with respect to a model that assumes that the frequencies of the signal's values are constant within the signal segment. The histogram can be normalized to sum to one by

[n.sub.i] = [m.sub.i]/[[summation over].sup.k.sub.i=1] for i = 1, ..., k (3)

where k is the number of bins and [m.sub.i] is the number of observed signal values that fall in bin i. The normalized values of the histograms represent the proportion, or probability, of the corresponding signal's values. In the frequency domain, the frequency can be considered to be the random variable and the normalized spectrum to be the probability mass function. The spectrum is normalized by

[x.sub.i] = [X.sub.i]/[[summation over].sup.N.sub.i=1] for i = 1, ..., N (3)

where [X.sub.i] is the magnitude of the ith frequency component of the spectrum, e.g. the amplitude if an amplitude spectrum is used. Based on this probability mass function, entropy can be computed using Shannon's formula. By considering the spectral amplitude to be the random variable, a different entropy can also be defined and computed using Shannon's formula; the probability mass function of the amplitude intensities can be estimated using a histogram. When the probabilities are based on discrete Fourier transform, or histograms, the entropy is estimated with respect to a static model. These two entropies will be referred to as the frequency entropy and the spectrum entropy, respectively.

The Shannon entropy is properly defined as the minimum entropy over all possible models, i.e. it is an entropy computed using Shannon's formula and the correct probability mass function. In other words, it is the theoretical upper limit on lossless data compression that can be achieved for a given signal [15, 16]. Consequently, the entropy can be used to evaluate compression algorithms to determine whether there is room for improvement. Conversely, compression algorithms can be used to estimate the entropy of data. The compressed data can be written to a file and the file size then converted from bytes to nats using:

[H.sub.COMPRESSION] = 8 loge(2)File_Size/length_of_signal (5)

Where File_Size, the size of the compressed file in bytes, is multiplied by 8 to convert to bits. Equation 2 is used to change from bits (base 2 logarithm) to nats (base e logarithm). The results are then averaged over all values (samples) by dividing the results with length_of_signal. If the header of the compressed file is included in the file, then the entropy estimate will be higher. If the AE signal length is kept constant then the error due to the header will be approximately same for all computations.

Among the best lossless compression approaches are those based on a scheme known as prediction by partial matching (PPM). The PPM compression scheme is divided into two steps: modeling, from which the scheme takes its name, and coding. Arithmetic coding is used to code the output of the modeler. Arithmetic coding is a highly effective technique, which can code data close to its entropy with respect to the model [16]. The PPM modeler works with the data in a symbol-wise manner and its output is a set of conditional probabilities for the symbols. The probabilities of the symbols are estimated adaptively and used to predict the next unseen symbol. For predicting the modeler uses finite context models of k symbols, which immediately precede the one to be predicted. The number k is also referred to as the model order and is specified by the user before the data compression is initiated.

During the modeling of each symbol, the modeler begins by looking up how many times the current context of length [l.sub.c] = k has occurred before. If the context has been observed before, followed by the symbol, the symbol can be coded using a probability of [n.sub.c]/n, where [n.sub.c] is the number of times the context has been observed followed by the symbol and n is the number of times the context has been observed. If the context has not been encountered before, or has only been followed by different symbols, an escape character is passed to the modeler, which then switches to a context that is one symbol shorter, i.e. a context of length [l.sub.c] = k - 1. If this shorter context has also not been observed before, or has only been followed by different symbols, another escape character is passed and the modeler continues with contexts that are one symbol shorter still. This can be repeated until the context length becomes [l.sub.c] = -1 symbols, at which point all symbols of the alphabet are considered equally probable. Equi-probability is undesirable since it does not provide an accurate model; however, it poses no problem for accurate coding. The arithmetic coder is able to proceed even though the model is inaccurate, at the cost of a higher number of bits to encode the data. Intuitively, better compression is achieved with more accurate modeling. Fortunately, the context of [l.sub.c] = -1 symbols is considered at most once for each symbol, and as the modeling proceeds the data statistics improve and low values of [l.sub.c] become less and less frequent. Every time an escape character is sent (i.e. whenever the modeler is unable to code a symbol) the probability of observing a novel symbol in the current context is updated; assigning a probability to the escape character in this way improves the model.
For a detailed description of the PPM scheme and examples, the reader is referred to Text Compression [15] and Managing Gigabytes: Compressing and Indexing Documents and Images [16].
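The escape cascade described above can be sketched as a toy model. The following illustration (assuming Python) uses the simple [n.sub.c]/n symbol probabilities and omits PPM's refinements, such as exclusions and escape-probability estimation; the class and method names are illustrative:

```python
from collections import defaultdict

class ToyPPM:
    """Minimal order-k context model illustrating PPM's escape cascade."""

    def __init__(self, order, alphabet):
        self.order = order
        self.alphabet = alphabet
        # counts[context][symbol]: times `context` was followed by `symbol`
        self.counts = defaultdict(lambda: defaultdict(int))

    @staticmethod
    def _ctx(history, k):
        return "" if k == 0 else history[-k:]

    def predict(self, history, symbol):
        """Return (context length used, probability) for the next symbol."""
        for k in range(min(self.order, len(history)), -1, -1):
            seen = self.counts[self._ctx(history, k)]
            n = sum(seen.values())
            if n and seen[symbol]:
                return k, seen[symbol] / n  # probability n_c / n
            # otherwise: escape to a context one symbol shorter
        # context length -1: every symbol of the alphabet is equally probable
        return -1, 1.0 / len(self.alphabet)

    def update(self, history, symbol):
        """Record `symbol` after every context length from 0 up to the order."""
        for k in range(0, min(self.order, len(history)) + 1):
            self.counts[self._ctx(history, k)][symbol] += 1
```

After training on "ababab", the context "ab" predicts 'a' at the full order k = 2, while an unseen continuation such as 'b' escapes down to the order-0 context.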

Different variants of the PPM compression scheme have been introduced in order to improve the compression and to speed up the calculations. One variant, suggested by Howard [17], is referred to as method D, or PPMD. It estimates the conditional probability of observing a particular symbol given a specific context as (2[n.sub.c] - 1)/(2n), where [n.sub.c] is the number of times the modeler has seen the symbol preceded by the context and n is the total number of symbols preceded by the current context. The escape probabilities are estimated by [n.sub.u]/(2n), where [n.sub.u] is the number of unique symbols preceded by the current context and n has the same meaning as before.
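Method D's estimators can be written down directly. The following sketch (assuming Python; the counts are hypothetical) also checks that the probabilities of the seen symbols plus the escape probability sum to exactly one:

```python
def ppmd_symbol_prob(n_c, n):
    """Method D: P(symbol | context) = (2*n_c - 1) / (2*n)."""
    return (2 * n_c - 1) / (2 * n)

def ppmd_escape_prob(n_u, n):
    """Method D: P(escape | context) = n_u / (2*n)."""
    return n_u / (2 * n)

# Hypothetical counts: after some context, 'a' was seen 3 times and 'b' once.
counts = {"a": 3, "b": 1}
n = sum(counts.values())   # total symbols preceded by the context (4)
n_u = len(counts)          # unique symbols preceded by the context (2)
total = sum(ppmd_symbol_prob(c, n) for c in counts.values())
total += ppmd_escape_prob(n_u, n)  # (5 + 1 + 2) / 8 = 1
```

The sum telescopes: over the [n.sub.u] seen symbols, the numerators (2[n.sub.c] - 1) add up to 2n - [n.sub.u], so adding the escape mass [n.sub.u]/(2n) always gives one.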

Another variant was introduced by Dmitry Shkarin in Improving the Efficiency of the PPM Algorithm [18] (in Russian) under the name PPM with information inheritance, or PPMII; Shkarin presented the PPMII in English a year later [19]. The PPMII uses an additional model in order to obtain better estimates of the escape probabilities. To overcome the lack of statistical information when estimating the escape probabilities of long contexts, the PPMII allows the longer contexts to inherit statistics from shorter contexts. The inheritance reduces the computational cost compared with other approaches to this problem [19]. Starting with code from Nelson [20], Shkarin implemented his improvements, and several other variations, and made them available in the public domain under the name PPM by Dmitry (PPMd).

Experimental Procedure

The test specimens used in this study are nominally identical prosthetic feet called Vari-Flex, made by Ossur hf. A total of 75 Vari-Flex feet were tested. The cyclic testing was performed in an ISO 10328 Foot/Limb test machine at Ossur's testing facilities. In the test, a foot is placed in the machine and two actuators apply a 90[degrees]-phased sinusoidal constant-amplitude loading at 1.0 Hz; one actuator loads the forefoot and the other loads the heel. The tests were accelerated by increasing the maximum loading by 50% and placing a 2[degrees] plastic wedge between the heel and the toe components. The wedge is used by amputees in order to stiffen the foot. The increased load and the use of the wedge result in considerably shorter fatigue tests. Damage mechanisms frequently change with the stress level; however, both preparatory tests and the fatigue test results show that the damage mechanisms leading to final failure are the same as those observed under normal fatigue testing conditions.

All feet were tested until failure, defined by a 10% displacement criterion. This is a heuristic criterion used in-house at Ossur: a failure is declared when a 10% change in the displacement of either actuator, with respect to its initial value, is observed. Throughout each test, the AE data was acquired for one full fatigue cycle every 5 minutes. For a more detailed description of the experimental procedure the reader is referred to references [21-23].

Results

Two entropies are estimated in the time domain, one using Shannon's formula and the other using data compression. In order to apply Shannon's formula, the probability mass function of the signal's values from each measurement is estimated from a normalized histogram with [2.sup.16] bins; in other words, the number of bins is set equal to the number of quantized discrete values of the 16-bit A/D converter used for AE acquisition. In order to estimate the entropy using data compression, the signed 16-bit integer data is written to a file in ASCII form with no spaces in between. The file is then compressed using variant H of the PPMd algorithm (PPMdH), as implemented in the open source program 7-Zip, and the resulting file size is converted from bytes to nats (see Eq. 5). The model order is determined manually; the best compression results are obtained with model order k = 5, i.e. when the maximum context length equals the maximum number of digits used (omitting the sign). In the remaining part of this paper, the entropy computed using Shannon's formula in the time domain will be referred to as the signal's entropy, and the entropy estimated using the PPMdH compression scheme will be referred to as the PPMdH entropy.
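The time-domain Shannon estimate can be sketched as follows (assuming Python with NumPy; the function name is illustrative). The histogram has one bin per quantization level of the 16-bit converter:

```python
import numpy as np

def signal_entropy_nats(samples):
    """Shannon entropy in nats of a 16-bit signal, from a 2**16-bin histogram."""
    # Shift int16 values into the range 0..65535 and count occurrences.
    hist = np.bincount(np.asarray(samples, dtype=np.int64) + 2**15,
                       minlength=2**16)
    p = hist / hist.sum()   # normalized histogram = estimated pmf
    p = p[p > 0]            # 0 * log(0) is taken as 0
    return float(-np.sum(p * np.log(p)))
```

A constant signal gives zero entropy, while a signal visiting every quantization level equally often gives the maximum, ln([2.sup.16]) = 16 ln(2), approximately 11.09 nats.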

In the frequency domain, Shannon's formula is used to estimate both the frequency and the spectrum entropies. In order to compute the frequency entropy, the probability mass function of the frequencies is estimated by first transforming the signed 16-bit integer data to the frequency domain and then normalizing the one-sided amplitude spectrum to sum to one. Because of the normalization there is no need to convert the data values to volts. The transformation is made by applying a [2.sup.21]-point discrete Fourier transform (DFT); the number of DFT points is set, by zero-padding the data, to the next power of two above the length of the data series, for faster computation. Using fewer points would speed up the computation further, but would reduce the resolution of the spectrum, provide less information, and consequently decrease the entropy. Using more points requires additional zero-padding and increases the number of frequency bins, and with it the entropy. However, zero-padding the data before applying the DFT amounts to a frequency interpolation, so the added frequency bins contain no new information; the resulting entropy increase is only a function of the number of added bins and is the same for all measurements.
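The frequency entropy computation can be sketched as follows (assuming Python with NumPy; the study pads to a fixed [2.sup.21] points, whereas this illustration pads only to the next power of two above the signal length):

```python
import numpy as np

def frequency_entropy_nats(samples):
    """Shannon entropy in nats of the normalized one-sided amplitude spectrum."""
    # Zero-pad to the next power of two for a fast transform; the padding
    # only interpolates the spectrum and adds no new information.
    n_fft = 1 << int(np.ceil(np.log2(len(samples))))
    amp = np.abs(np.fft.rfft(samples, n=n_fft))  # one-sided amplitude spectrum
    p = amp / amp.sum()                          # normalize to sum to one
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))
```

A narrowband signal concentrates the spectrum in a few bins and yields a low entropy, while broadband noise spreads it across many bins and yields a high entropy; this is why the frequency entropy measures spectral flatness.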

In order to compute the spectrum entropy, the probability mass function of the spectral amplitudes is estimated from a histogram of the amplitude intensities. The amplitudes are quantized with 16 bits, the quantization range being set by the highest amplitude found in the one-sided amplitude spectra of all measurements. A [2.sup.16]-bin histogram is then used to estimate the probability mass function of the quantized amplitudes.
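The spectrum entropy can be sketched similarly (assuming Python with NumPy; `global_max`, the highest amplitude over all measurements, is an illustrative name for the shared quantization range):

```python
import numpy as np

def spectrum_entropy_nats(amp, global_max):
    """Shannon entropy in nats of 16-bit-quantized spectral amplitudes.

    `global_max` is the highest amplitude seen across all measurements, so
    every measurement is quantized on the same grid.
    """
    q = np.clip((amp / global_max) * (2**16 - 1), 0, 2**16 - 1).astype(np.int64)
    hist = np.bincount(q, minlength=2**16)  # one bin per quantization level
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))
```

Unlike the frequency entropy, which asks where the spectral mass sits on the frequency axis, this measure asks how the amplitude values themselves are distributed.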

Entropy as a Condition Signature

To evaluate whether the entropies can be used to provide early failure warnings, and to compare them against other AE features, the approach of reference [24] is used. The probability distribution of the entropy values at each percentage point is estimated by first computing the entropy from all measurements, for each foot tested, and then generating a histogram of the entropy values at the given percentage point of lifetime. The lifetime of each foot is normalized to 100% according to the 10% displacement failure criterion. All figures in this section have a grey area that represents all values lying within one standard deviation of the mean. Superimposed on the figures are histograms that show the distributions of the corresponding entropy at 50% and 95% of the normalized fatigue life. Figures 1a and 1b show, respectively, the average evolution of the AE hit count and of the signal's energy for all feet tested; these two figures were presented in reference [24].
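The lifetime normalization behind these curves can be sketched as follows (assuming Python with NumPy; the function and argument names are illustrative, not the study's):

```python
import numpy as np

def entropy_on_lifetime_grid(times, values, n_points=101):
    """Resample one foot's entropy curve onto a 0..100% lifetime grid.

    `times` are the measurement times, with times[-1] taken as the failure
    time under the 10% displacement criterion.
    """
    pct = 100.0 * np.asarray(times, dtype=float) / float(times[-1])
    grid = np.linspace(0.0, 100.0, n_points)  # 0%, 1%, ..., 100%
    return np.interp(grid, pct, values)
```

Stacking the resampled curves from all feet then gives, at each percentage point, the sample from which the mean, the one-standard-deviation band, and the histograms at 50% and 95% of lifetime are computed.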

[FIGURE 1 OMITTED]

The average evolution of the four entropies for all the feet is shown in Figs. 1c-1f. As one can observe, the curves are relatively flat from approximately 20% to 95% of the normalized lifetime, and the standard deviation is high. The curves are therefore not suitable for issuing early failure alerts. This is verified by the almost perfect overlap of the histograms of the entropy values at 50% and 95% of the lifetime.

It is interesting to note that, of the two entropies computed in the time domain, the PPMdH entropy (shown in Fig. 1d) is lower than the signal's entropy (shown in Fig. 1c). In order to apply Shannon's formula, the probability mass function of the signal's values is estimated using a histogram of the signal's values, which estimates the entropy of the signal with respect to a static model. This is by no means the best model for the data; as can be observed, the PPMdH compression scheme produces a better model. Nonetheless, the evolution curve for the PPMdH entropy provides no additional information.

Table 1 presents the median Pearson and Spearman correlation coefficients estimated between the AE hit count, the energy and the four entropies. Each coefficient is estimated by first computing the corresponding coefficient between the curves from each fatigue test and then computing the median of the coefficients obtained from all feet.
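The coefficients in the table can be computed with a sketch like the following (assuming Python with NumPy only; Spearman is taken as the Pearson correlation of the ranks, which is valid when there are no ties):

```python
import numpy as np

def median_correlations(curve_pairs):
    """Median Pearson and Spearman coefficients over per-foot feature curves.

    `curve_pairs` is a list of (x, y) array pairs, one pair per fatigue test.
    """
    def ranks(a):
        # Rank of each element (0-based); assumes no ties.
        return np.argsort(np.argsort(a)).astype(float)

    pearson = [np.corrcoef(x, y)[0, 1] for x, y in curve_pairs]
    spearman = [np.corrcoef(ranks(x), ranks(y))[0, 1] for x, y in curve_pairs]
    return np.median(pearson), np.median(spearman)
```

Because Spearman works on ranks, a monotone but nonlinear relationship (such as energy versus entropy on different scales) still scores 1.0, whereas Pearson only does so for a linear one.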

The results presented in the table show that the evolution of the two entropies estimated in the time domain, i.e. the signal's entropy and the PPMdH entropy, correlates well with the evolution of the AE hit count. Furthermore, the calculations show that there is a significant correlation between the energy and the spectrum entropy. Since the energy requires less computation and its interpretation is more intuitive, the spectrum entropy does not seem to offer any advantage. The frequency entropy measures the flatness of the spectrum and can be used to detect the presence of broadband events that may not be detected from the evolution of the energy. Consequently, the frequency entropy can possibly be used to supplement the energy feature.

Summary

In this study the AE hit count, the energy, and the entropies were extracted from the AE signals without applying any special filters to remove any AE, e.g. from cumulated damage. The purpose of the study was to investigate whether the entropies could be used for condition monitoring of CFRP subjected to multi-axial cyclic loading, i.e. for providing early failure warnings.

The results show that the entropies studied here cannot be used with the proposed approach for issuing early failure alerts. This is because they have relatively flat evolution curves from 20% to 95% of the normalized lifetime.

The results also showed that the trending results obtained using both the signal's entropy and the PPMdH entropy (file compression) are nearly the same as those obtained using an STFT-based AE hit count. This suggests that the signal's entropy might be a better alternative to the AE hit count for monitoring AE activity, because it requires less computational effort, needs no filtering (hit counting reduces the number of hits by thresholding), and the only tuning required is the choice of the histogram's bin size. These results suggest that further work should be done to compare the signal's entropy against AE hit counts that are based on conventional thresholding.

Acknowledgements

The work of the first author was supported by grants from the University of Iceland Research Fund, the Icelandic Research Council (Rannis) Research Fund, and the Icelandic Research Council Graduate Research Fund. Furthermore, the authors wish to acknowledge Ossur for both providing prosthetic feet for testing and for providing access to their testing facilities.

References

[1.] G. Sims, Fatigue in Composites, Woodhead Publishing Ltd., Cambridge, 2003, 36-62.

[2.] M. Knops and C. Bogle, Composites Science and Technology, 66, 2006, 616-625.

[3.] M. Giordano, A. Calabro, C. Esposito, A. D'Amore, and L. Nicolais, Composites Science and Technology, 58, 1998, 1923-1928.

[4.] M. Wevers, NDT and E International, 30, 1997, 99-106.

[5.] D. Tsamtsakis, M. Wevers, P. De Meester, J. of Reinforced Plastics and Composites, 17, 1998, 1185-1201.

[6.] H. Nayeb-Hashemi, P. Kasomino, and N. Saniei, J. of Nond. valuation, 18, 1999, 127-137.

[7.] E. R. Green, J. of Nondestructive Evaluation, 17, 1998, 117-127.

[8.] N. Ativitavas, T. J. Fowler, T. Pothisiri, J. of Nondestructive Evaluation, 23, 2004, 21-36.

[9.] Y. A. Dzenis and J. Qian, International J. of Solids and Structures, 38, 2001, 1831-1854.

[10.] J. Awerbuch, S. Ghaffari, J. of Reinforced Plastics and Composites, 7, 1988, 245-64.

[11.] G. Kamala, J. Hashemi, A. A. Barhorst, J. of Reinf. Plastics and Comp. 20, 2001, 222-238.

[12.] A. P. Mouritz, Fatigue in Composites, Woodhead Publ. Ltd., Cambridge, 2003, 242-266.

[13.] Y. A. Dzenis, International J. of Fatigue, 25, 2003, 499-510.

[14.] C. E. Shannon, Bell System Technical Journal, 27, 1948, 379-423 and 623-656.

[15.] T. Bell, J. Cleary, I. Witten, Text Compression, Prentice Hall, 1990.

[16.] I. Witten, A. Moffat, T. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann, 1999.

[17.] P. G. Howard, Ph.D. thesis, Brown University, 1993.

[18.] D. A. Shkarin, Problems of Information Transmission, 37, 2001, 226-235.

[19.] D. Shkarin, DCC 2002, IEEE Computer Society, April, Snowbird, Utah, 2002, 202-211.

[20.] M. Nelson, Arithmetic coding and statistical modeling, Dr. Dobb's Journal.

[21.] R. Unnthorsson, T. P. Runarsson, and M. T. Jonsson, International J. of Fatigue, doi:10.1016/j.ijfatigue.2007.02.024, 2007.

[22.] R. Unnthorsson, T. P. Runarsson, M. T. Jonsson, J. of Acoustic Emission, 25, 2007, 253-259.

[23.] R. Unnthorsson, Ph.D. thesis, University of Iceland, 2008.

[24.] R. Unnthorsson, T. P. Runarsson, M. T. Jonsson, J. of Acoustic Emission, 26, 2008, 229-239.

RUNAR UNNTHORSSON, THOMAS P. RUNARSSON and MAGNUS T. JONSSON

Department of Mechanical & Industrial Engineering, University of Iceland, Reykjavik, Iceland

Table 1. The median Pearson (left) and Spearman (right) correlation coefficients between the AE hit count, the energy, and the four entropies.

                  AE hit count  Energy [dB]  Signal's entropy  PPMdH entropy  Freq. entropy  Spectrum entropy
AE hit count      1             0.63 / 0.58  0.93 / 0.89       0.92 / 0.89    -0.24 / -0.19  0.71 / 0.65
Energy [dB]                     1            0.74 / 0.72       0.73 / 0.70    -0.44 / -0.52  0.92 / 0.91
Signal's entropy                             1                 0.99 / 0.99    -0.28 / -0.31  0.79 / 0.79
PPMdH entropy                                                  1              -0.24 / -0.29  0.78 / 0.79
Freq. entropy                                                                 1              -0.23 / -0.20
Spectrum entropy                                                                             1

Journal of Acoustic Emission, Jan 1, 2008.