# EGG signals recognition based on LMD and relevance vector machine

1. Introduction

The electrogastrogram (EGG) is the gastric electrical signal recorded by surface electrodes placed on the body surface over the stomach. The EGG reflects the state of gastric electrical activity and therefore has clinical diagnostic value (Rosana, 2001). Because the signal originates in an internal organ, it is highly susceptible to interference such as slight movement during breathing and measurement and the electrical activity of the small intestine or the heart. Its low signal-to-noise ratio means the waveform cannot be used directly without processing. For non-stationary signals of this kind, time-frequency methods such as the wavelet transform or the Hilbert-Huang transform (HHT) are used to process the original signal. Reconstructing the signal from the decomposition yields data with less noise that represent the gastric electrical activity better. Intelligent algorithms can then classify the processed EGG signals by computer, which facilitates clinical diagnosis by doctors.

2. Denoising and reconstruction of EGG signals

The raw, unprocessed signal is shown in figure 1. Because of its low signal-to-noise ratio, the waveform cannot be used directly. A common practice is to process the signal with the wavelet transform or the HHT method and then reconstruct it. Given the deficiencies of wavelet-based reconstruction reported in (Norden, 2003), empirical mode decomposition (EMD) is adopted to reconstruct the signal in this algorithm.

2.1. Introduction on EGG signals

2.1.1. Category of EGG signals

The dominant frequency of the gastric electrical signal in a healthy person is relatively stable. A gastric electrical record may consist of signals at the normal frequency or of dysrhythmic signals. Patients with dysrhythmia often suffer from gastric motility disorders. Gastric dysrhythmia includes tachygastria, bradygastria and arrhythmia (shown in table 1), which have been linked to gastric motility disorders and the corresponding clinical symptoms (Shah and Rodriguez, 1999).
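The frequency bands of Table 1 can be turned into a simple rule for labelling a record from its dominant frequency. The sketch below is illustrative only: the function name `classify_dominant_frequency` and the handling of the band edges are assumptions, and arrhythmia, which Table 1 does not tie to a frequency band, cannot be detected by this rule.

```python
def classify_dominant_frequency(freq_cpm):
    """Map a dominant EGG frequency (cycles per minute) to a rhythm label.

    Band limits follow Table 1. Assigning the shared edges 2.0 and 4.0 cpm
    to the normal band is an assumption made for this example.
    """
    if 0.5 <= freq_cpm < 2.0:
        return "bradygastria"
    if 2.0 <= freq_cpm <= 4.0:
        return "normal"
    if 4.0 < freq_cpm <= 9.0:
        return "tachygastria"
    # Table 1 gives no band for arrhythmia, so values outside 0.5-9.0 cpm
    # are only flagged rather than labelled.
    return "out of range"


labels = [classify_dominant_frequency(f) for f in (1.2, 3.0, 6.5, 15.0)]
```

A rule of this kind only reads off one frequency; the method described in this paper classifies from fractal-dimension features instead.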

2.1.2. Noise sources of EGG signals

In theory, EGG signals can be captured and interpreted by an expert and are useful for diagnosing patients. In practice, they are disturbed by other signals during acquisition. Because EGG signals are weak, separating them from the noise is a challenging problem. The main interference sources are respiratory noise, friction noise between the electrode and the skin, mains (power-line) noise and electrical activity from other organs. The recording instruments filter out some noise during acquisition, but most EGG components are low-frequency, with weak energy and poor quality. Without processing by powerful signal analysis tools, EGG signals cannot be used directly for clinical diagnosis. The main noise sources are listed in Table 2.
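To illustrate why the respiration band of Table 2 can in principle be separated from the gastric band, the following sketch measures the spectral energy of a synthetic record in the two bands with a direct discrete Fourier transform. The 1 Hz sampling rate, the 10-minute length, the sine amplitudes and the helper name `band_energy` are all assumptions chosen for the example, not properties of the recordings used in this paper.

```python
import cmath
import math


def band_energy(signal, fs_hz, lo_cpm, hi_cpm):
    """Sum of squared DFT magnitudes over bins lying in [lo_cpm, hi_cpm]."""
    n = len(signal)
    energy = 0.0
    for k in range(1, n // 2 + 1):
        f_cpm = k * fs_hz / n * 60.0  # bin frequency in cycles per minute
        if lo_cpm <= f_cpm <= hi_cpm:
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(signal))
            energy += abs(coeff) ** 2
    return energy


# Synthetic record: a 3 cpm "gastric" sine plus a weaker 18 cpm "respiration" sine.
fs = 1.0   # assumed sampling rate in Hz
n = 600    # assumed length: 10 minutes at 1 Hz
record = [math.sin(2 * math.pi * (3.0 / 60.0) * i / fs)
          + 0.5 * math.sin(2 * math.pi * (18.0 / 60.0) * i / fs)
          for i in range(n)]

signal_band = band_energy(record, fs, 0.5, 9.0)    # gastric band from Table 2
resp_band = band_energy(record, fs, 12.0, 24.0)    # respiration band from Table 2
```

In this idealized case the two components fall in disjoint bands. Motion artifacts, which Table 2 lists as covering the whole range, cannot be separated this way, which is one reason a decomposition-based reconstruction is used.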

3. Feature extraction based on LMD and fractal dimension for EGG signals

Local mean decomposition (LMD) is a decomposition algorithm that evolved from EMD. LMD has the advantage in decomposition, while EMD is stronger in signal reconstruction. The algorithm proposed in this paper therefore uses the HHT method for decomposition and reconstruction, but applies the LMD algorithm when extracting features (Wang et al., 2009).

3.1. Brief introduction about LMD

LMD, proposed by Smith (Jonathan, 2005), has produced good results in processing EEG signals. It evolved from EMD. Wang Yanxue made a thorough comparative study of the two methods in terms of local mean computation, instantaneous frequency calculation, component decomposition and filtering performance. LMD addresses the overshoot and undershoot problems encountered in EMD, and it obtains the instantaneous frequency and the signal decomposition simultaneously (Norden, 1998; Goncalves et al., 2015). The flow chart of LMD is shown in figure 3.
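As a rough illustration of how LMD starts, the sketch below computes the local mean and local envelope between successive extrema, following the definitions in Smith's formulation: the mean of two neighbouring extrema and half their absolute difference. It covers only this first step. The moving-average smoothing, the division by the envelope and the iteration that produce a product function are omitted, and the function names are hypothetical.

```python
def local_extrema(x):
    """Indices of strict local maxima and minima (plateaus are ignored)."""
    return [i for i in range(1, len(x) - 1)
            if (x[i] - x[i - 1]) * (x[i + 1] - x[i]) < 0]


def local_mean_and_envelope(x):
    """Return (start, end, local_mean, local_envelope) per pair of extrema.

    For successive extrema n_i and n_{i+1}:
        local mean     = (n_i + n_{i+1}) / 2
        local envelope = |n_i - n_{i+1}| / 2
    """
    idx = local_extrema(x)
    segments = []
    for a, b in zip(idx, idx[1:]):
        local_mean = (x[a] + x[b]) / 2.0
        local_env = abs(x[a] - x[b]) / 2.0
        segments.append((a, b, local_mean, local_env))
    return segments


segments = local_mean_and_envelope([0, 1, 0, -1, 0, 3, 0])
```

Each tuple describes the piecewise-constant values that a full LMD implementation would subsequently smooth before subtracting the local mean from the signal.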

3.2. Decomposition of EGG signals based on LMD

Once the denoised EGG signals have been obtained, they can be analyzed to determine which category each signal belongs to. As figure 2 shows, the resulting waveform contains a large amount of raw data, so it must be reduced before further calculation. Unlike general algorithms that reduce dimensionality with a combination of PCA and LDA, this section uses LMD to reduce the dimensionality according to the characteristics of the waveform data. The signal after LMD processing is shown in figure 4:

3.3. Feature representation of EGG signals based on fractal dimension

In order to describe the decomposed waveforms further and to classify the EGG signals easily, this algorithm applies the fractal dimension algorithm proposed by Katz (Kartz, 1988) to the waveforms obtained by decomposing the EGG signals, yielding the fractal dimension of each sub-waveform. The fractal dimension is defined as follows:

$$D = \frac{\log_{10} n}{\log_{10} n + \log_{10}(d/L)}$$

where $L$ is the total length of the waveform curve, namely the sum of the distances between successive points; $d$ is the distance between the first point and the point of the sequence farthest from it, namely $d = \max_i \operatorname{dist}(s_1, s_i)$; $n = L/a$ is the number of steps in the waveform; and $a$ is the mean distance between two successive points.

After computing the fractal dimension of each of the above sub-waveforms, five feature values are obtained from every EGG signal sequence. Adding the fractal dimension of the raw signal gives a 6-dimensional feature vector X = [x1, x2, ..., x6].
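The feature construction described above can be sketched as follows. The code uses Katz's definition D = log10(n) / (log10(n) + log10(d / L)) with n = L / a. Measuring distances as Euclidean distances in the (sample index, amplitude) plane with unit sample spacing, the order of the features, and the names `katz_fd` and `feature_vector` are assumptions made for this example.

```python
import math


def katz_fd(signal, dt=1.0):
    """Katz fractal dimension of a sampled waveform."""
    n_steps = len(signal) - 1
    if n_steps < 2:
        raise ValueError("need at least three samples")
    # L: total curve length, the sum of distances between successive points.
    L = sum(math.hypot(dt, signal[i + 1] - signal[i]) for i in range(n_steps))
    # d: largest distance from the first point to any other point.
    d = max(math.hypot(i * dt, signal[i] - signal[0])
            for i in range(1, len(signal)))
    # n = L / a, where a = L / n_steps is the mean step, so n equals n_steps.
    n = float(n_steps)
    return math.log10(n) / (math.log10(n) + math.log10(d / L))


def feature_vector(raw, sub_waveforms):
    """Fractal dimensions of the sub-waveforms followed by that of the raw signal."""
    return [katz_fd(w) for w in sub_waveforms] + [katz_fd(raw)]


zigzag = [0, 1, 0, 1, 0, 1, 0]
line = [0, 1, 2]
x = feature_vector(zigzag, [line] * 5)  # five sub-waveforms -> 6 features
```

A straight line gives a dimension of 1, while a more convoluted waveform such as the zigzag gives a larger value, which is what makes the measure usable as a compact shape feature.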

4. Classification of EGG signals based on RVM

Common classification algorithms include ANN, KNN and SVM. SVM is a well-known algorithm suited to small sample sets. The relevance vector machine (RVM), proposed by Tipping, overcomes the following shortcomings of SVM (Branco et al., 2016):

1. RVM produces a probabilistic prediction. It is often desirable to obtain a probability distribution over the possible values rather than a single classification result.

2. RVM does not require the error (penalty) parameter C. In SVM, C strongly influences the outcome, and a range of values must be tried to achieve a good result.

3. RVM is faster than SVM at prediction time on large-scale data, because its model is sparser.

Several domestic studies have also compared SVM and RVM in terms of classification accuracy and time consumption. RVM yields a sparser model than SVM, which makes it faster at comparable accuracy. This makes it especially suitable for on-line learning and rapid diagnosis, where it improves equipment efficiency and allows the classifier to be retrained quickly when new samples arrive. The theoretical model of RVM, which is built on a Bayesian framework, is as follows:

$\{x_i\}_{i=1}^{N}$ are the feature vectors in the training set and $t = [t_1, t_2, \ldots, t_N]^T$ are the desired values. Assume each $t_i$ follows a Gaussian distribution with mean $y(x_i; \omega)$ and variance $\sigma^2$:

$$p(t_i) = \mathcal{N}(t_i \mid y(x_i; \omega), \sigma^2), \qquad y(x; \omega) = \sum_{i=1}^{N} \omega_i K(x, x_i) + \omega_0 \tag{1}$$

where $K(x, x_i)$ is the kernel function and $\omega_i$ is a weight. To obtain a sparse solution, each $\omega_i$ is given a zero-mean Gaussian prior, namely:

$$p(\omega_i \mid \alpha_i) = \mathcal{N}(\omega_i \mid 0, \alpha_i^{-1}) \tag{2}$$

The likelihood function of the training sample set is:

$$p(t \mid \omega, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left\{-\frac{1}{2\sigma^2} \lVert t - \Phi\omega \rVert^2\right\} \tag{3}$$

where $t = (t_1, \ldots, t_N)^T$ and $\omega = (\omega_0, \ldots, \omega_N)^T$. $\Phi$ is the $N \times (N+1)$ matrix whose $i$-th row holds the responses of all the kernel functions to the input $x_i$, that is $[1, K(x_i, x_1), \ldots, K(x_i, x_N)]$.
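The matrix of kernel responses can be built as sketched below. The paper does not state which kernel was used, so the Gaussian kernel and its `width` parameter are assumptions, as are the function names. The leading column of ones corresponds to the bias weight in (1).

```python
import math


def gaussian_kernel(x, z, width=1.0):
    """Gaussian (RBF) kernel; the kernel choice is an assumption."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def design_matrix(X, kernel=gaussian_kernel):
    """N x (N+1) matrix: row i is [1, K(x_i, x_1), ..., K(x_i, x_N)]."""
    return [[1.0] + [kernel(xi, xj) for xj in X] for xi in X]


X_train = [[0.0], [1.0]]       # two toy one-dimensional feature vectors
Phi = design_matrix(X_train)   # 2 rows, 3 columns
```

With the 6-dimensional fractal features of this paper, each `x` would be a list of six values and `Phi` would have one row per training record.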

From the prior and the likelihood, the posterior distribution of the weights follows from Bayes' formula:

$$p(\omega \mid t, \alpha, \sigma^2) = \frac{p(t \mid \omega, \sigma^2)\, p(\omega \mid \alpha)}{p(t \mid \alpha, \sigma^2)} \tag{4}$$

This posterior is a multivariate Gaussian distribution:

$$p(\omega \mid t, \alpha, \sigma^2) = \mathcal{N}(\omega \mid \mu, \Sigma) \tag{5}$$

where $\Sigma = (\sigma^{-2}\Phi^T\Phi + A)^{-1}$ is the covariance, $A = \operatorname{diag}(\alpha_0, \ldots, \alpha_N)$ is the diagonal matrix of hyper-parameters, and $\mu = \sigma^{-2}\Sigma\Phi^T t$ is the mean.
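The Gaussian form of the posterior can be checked by completing the square in the exponent of the product of the Gaussian likelihood and the Gaussian prior. The short derivation below is the standard one for this model and is added here only as a reasoning aid:

$$-\frac{1}{2}\left[\sigma^{-2}(t - \Phi\omega)^T(t - \Phi\omega) + \omega^T A\, \omega\right] = -\frac{1}{2}\left[\omega^T\left(\sigma^{-2}\Phi^T\Phi + A\right)\omega - 2\sigma^{-2}\omega^T\Phi^T t\right] + \text{const}$$

$$= -\frac{1}{2}(\omega - \mu)^T \Sigma^{-1} (\omega - \mu) + \text{const}, \qquad \Sigma^{-1} = \sigma^{-2}\Phi^T\Phi + A, \quad \mu = \sigma^{-2}\Sigma\Phi^T t$$

Matching the quadratic term gives the covariance, and matching the linear term gives the mean.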

The marginal likelihood of the desired values is obtained by integrating out the weight variables:

$$p(t \mid \alpha, \sigma^2) = \int p(t \mid \omega, \sigma^2)\, p(\omega \mid \alpha)\, d\omega \tag{6}$$

As a function of the hyper-parameters, this marginal likelihood is:

$$p(t \mid \alpha, \sigma^2) = \mathcal{N}(t \mid 0, C) \tag{7}$$

where

$$C = \sigma^2 I + \Phi A^{-1} \Phi^T \tag{8}$$
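In practice the marginal likelihood is maximized in its logarithmic form. The paper does not say which optimization scheme was used; the re-estimation rules below are those of Tipping's original RVM formulation and are shown as one common choice, not as the authors' confirmed procedure:

$$\ln p(t \mid \alpha, \sigma^2) = -\frac{1}{2}\left[N \ln 2\pi + \ln\lvert C\rvert + t^T C^{-1} t\right]$$

$$\gamma_i = 1 - \alpha_i \Sigma_{ii}, \qquad \alpha_i^{\text{new}} = \frac{\gamma_i}{\mu_i^2}, \qquad (\sigma^2)^{\text{new}} = \frac{\lVert t - \Phi\mu \rVert^2}{N - \sum_i \gamma_i}$$

Here $\mu$ and $\Sigma$ are the posterior mean and covariance of the weights computed with the current hyper-parameters, and $\Sigma_{ii}$ is the $i$-th diagonal element of $\Sigma$. During the iteration many $\alpha_i$ grow very large, the corresponding weights are driven to zero, and the associated kernel functions are pruned, which is the source of the sparsity of RVM.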

The maximum a posteriori estimate of the weights is determined by the hyper-parameters $\alpha$ and the noise variance $\sigma^2$, whose estimates $\alpha_{MP}$ and $\sigma^2_{MP}$ are obtained by maximizing the marginal likelihood. The posterior distribution expresses the uncertainty of the optimal weights, and hence of the prediction model. For a given input value $x^*$, the corresponding predictive distribution is:

$$p(t^* \mid t, \alpha_{MP}, \sigma^2_{MP}) = \int p(t^* \mid \omega, \sigma^2_{MP})\, p(\omega \mid t, \alpha_{MP}, \sigma^2_{MP})\, d\omega \tag{9}$$

This distribution is Gaussian:

$$p(t^* \mid t, \alpha_{MP}, \sigma^2_{MP}) = \mathcal{N}(t^* \mid y^*, \sigma^{*2}) \tag{10}$$

The predicted mean and variance are calculated by the following formulas, where $\phi(x^*) = [1, K(x^*, x_1), \ldots, K(x^*, x_N)]^T$:

$$y^* = \mu^T \phi(x^*), \qquad \sigma^{*2} = \sigma^2_{MP} + \phi(x^*)^T \Sigma\, \phi(x^*) \tag{11}$$
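Given the posterior mean and covariance of the weights, the predictive mean and variance take only a few lines to compute. The sketch below uses the standard RVM predictive equations, mean = mu^T phi and variance = noise_var + phi^T Sigma phi. The function name `rvm_predict` and the toy numbers are assumptions for illustration.

```python
def rvm_predict(phi_star, mu, Sigma, noise_var):
    """Predictive mean and variance for one input.

    phi_star  : kernel responses of the new input (including the bias entry)
    mu, Sigma : posterior mean vector and covariance matrix of the weights
    noise_var : estimated noise variance
    """
    mean = sum(m * p for m, p in zip(mu, phi_star))
    # Sigma @ phi_star, written out with plain lists
    sigma_phi = [sum(s * p for s, p in zip(row, phi_star)) for row in Sigma]
    variance = noise_var + sum(p * sp for p, sp in zip(phi_star, sigma_phi))
    return mean, variance


# Toy values, not taken from the experiments.
mean, variance = rvm_predict(
    phi_star=[1.0, 2.0],
    mu=[1.0, 2.0],
    Sigma=[[0.5, 0.0], [0.0, 0.25]],
    noise_var=0.1,
)
```

Because only the weights that survive pruning have non-zero entries, `phi_star` needs the kernel responses for the retained relevance vectors only, which keeps prediction cheap.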

5. Experimental results

The EGG signals were collected and then labelled by more than three doctors, and a record was kept only when they agreed on the result. This gave 800 records from 20 persons in 4 classes (slow, fast, disorder, normal), with 10 waveforms of 20 minutes each per person and class. For every person, seven EGG records of each class were taken as the training set and the remaining three as the testing set, giving 560 training records and 240 testing records. The following three methods were used to extract features:

1. The Morlet wavelet is used to decompose the original EGG signals. A 16-dimensional feature vector is formed from the fractal dimensions of the decomposed wavelet waveforms.

2. The original EGG signal is decomposed with EMD, and an 8-dimensional feature vector is formed from the fractal dimensions of the decomposed components.

3. The EMD components are processed with the Hilbert transform, and the method put forward in section 3 is used to obtain a 6-dimensional feature vector.

RVM is trained and tested on each of the three kinds of feature data. The three resulting algorithms are Morlet+RVM, EMD+RVM and EMD+LMD+RVM.

The final recognition rates of the three algorithms are shown in figure 5:

Figure 5 shows that feature extraction with the Morlet wavelet gives the lowest recognition rate, EMD alone gives an intermediate rate, and the combination of EMD and LMD gives the highest. The results indicate that EMD is more effective at denoising than the wavelet transform. Because LMD produces less feature data than Morlet or EMD, it is also faster. The proposed scheme, which reconstructs the original signal with EMD to remove noise and extracts features with LMD, uses the smallest feature vector (6 dimensions) yet achieves the highest recognition rate. These results show that the proposed method is feasible and effective.

6. Conclusion

The purpose of this paper is to realize the automatic feature extraction and recognition of EGG signals. In the proposed algorithm, EMD is first used to reconstruct the signal and obtain denoised EGG signals; LMD and a kernel discriminant method are then used to extract features and reduce the dimensionality, respectively; finally, RVM processes the feature data to complete the automatic analysis of the EGG signals. The paper introduces the reconstruction of gastric electrical signals with EMD and uses LMD for further feature extraction tailored to the modal characteristics of the waveform. Combining this with kernel discriminant dimensionality reduction greatly reduces the amount of feature data. In the last phase, RVM analyzes the processed feature data to achieve automatic recognition. Comparisons with other methods show that the algorithm maintains high recognition accuracy with less time consumption when dealing with small samples of EGG signals.

The signals reconstructed from the EMD decomposition suppress noise better and solve the problem of baseline drift. The LMD algorithm avoids the overshoot that arises when decomposed signals are used for classification. The RVM classifier achieves high recognition precision and improves response speed, which contributes to real-time on-line monitoring.

Recebido/Submission: 06/04/2016

Aceitacao/Acceptance: 20/07/2016

References

Branco, F., Martins, J., Goncalves, R. (2016). Das Tecnologias e Sistemas de Informacao a Proposta Tecnologica de um Sistema de Informacao Para a Agroindustria: O Grupo Sousacamp, RISTI--Revista Iberica de Sistemas e Tecnologias de Informacao, (18), 18-32

Brown C.A., Scharner J., Felice K., Meriggioli M.N., Tarnopolsky M. (2011). Novel and recurrent EMD mutations in patients with Emery-Dreifuss muscular dystrophy, identify exon 2 as a mutation hot spot [J]. Journal of Human Genetics, 56(8), 589-94.

Goncalves, J., Faria, B. M., Reis, L. P., Carvalho, V., & Rocha, A. (2015). Data mining and electronic devices applied to quality of life related to health data. In 2015 10th Iberian Conference on Information Systems and Technologies (CISTI) (pp. 1-4). IEEE.

Jonathan S.S. (2005). The local mean decomposition and its application to EEG perception data [J]. Journal of the Royal Society Interface, 2(5): 443-454.

Kartz M. (1988). Fractals and the analysis of waveforms [J]. Comput. Biol. Med., 18(3): 145-156.

Kim S.G., Ryu S.I. (2013). Enamel matrix derivative for replanted teeth in animal models: a systematic review and meta-analysis [J]. Restorative Dentistry & Endodontics, 38(4), 194-203.

Norden E.H., Manli C.W., Steven R.L., Samuel S.P.S., Qu W.D., Gloersen P., Kuang L.F. (2003). A Confidence Limit for the Empirical Mode Decomposition and Hilbert Spectrum Analysis, Proc. R. Soc. Lond. A, 459, 2317-2345.

Norden E.H., Zheng S., Steven R.L., Manli C.W., Hsing H.S., Zheng Q.A., Yen N.C., Chi C.T., Henry H.L. (1998). The Empirical Mode Decomposition and the Hilbert Spectrum for Nonlinear and Nonstationary Time Series Analysis. Proc. R. Soc. Lond. A, 454, 903-995.

Rosana Esteller, George Vachtsevanos, Javier Echauz, et al. (2001). A comparison of waveform fractal dimension algorithms [J]. IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, 48(2): 177-183.

Shah N., Rodriguez M., Louis S., Lindley D., Milla K., Milla P.J. (1999). Feeding difficulties and foregut dysmotility in Noonan's syndrome. Archives of Disease in Childhood, 81(1), 28-31.

Tello R.M., Muller S.M., Bastosfilho T.F., Ferreira A. (2014). Comparison of new techniques based on EMD for control of a SSVEP-BCI [C]. Industrial Electronics (ISIE), 2014 IEEE 23rd International Symposium on Industrial Electronics Isie, 992-997.

Wang Y.X., He Z.J., Zi Y.Y. (2009). A demodulation method based on improved local mean decomposition and its application in rub-impact fault diagnosis [J]. Measurement Science and Technology, 20(2), 28-28.

Ruijin Ma, Huisheng Zhang

1561000382@qq.com, 11192897@qq.com

Department of Electronic and Information, Northwestern Polytechnical University, 710072, Xi'an, China

Table 1--Frequency zone and its clinical value

| Components | Freq (cpm) | Clinical value |
|---|---|---|
| Normal | 2.0~4.0 | Max. freq. of contraction |
| Tachygastria | 4.0~9.0 | Absence of contractions |
| Bradygastria | 0.5~2.0 | Controversial |
| Arrhythmias | -- | Motility disorder |

Table 2--Main noise sources of EGG signals

| Components | Freq (cpm) |
|---|---|
| Signal | 2.0~4.0, 0.5~9.0 |
| Noise: Respiration | 12~24 |
| Noise: Small Bowel | 9~12 |
| Noise: Motion Artifacts | Whole Range |


Author: Ma, Ruijin; Zhang, Huisheng

Publication: RISTI (Revista Iberica de Sistemas e Tecnologias de Informacao)

Date: Oct 15, 2016
