# An Automated Approach for Epilepsy Detection Based on Tunable Q-Wavelet and Firefly Feature Selection Algorithm.

1. Introduction

Epilepsy is a chronic brain disease that affects people of all ages. According to the World Health Organization (WHO), approximately 65 million people suffer from this disorder [1], the majority of whom reside in developing countries and cannot obtain adequate medical treatment. Epilepsy doubles or triples the probability of sudden death compared with healthy people [2]. Moreover, epileptic patients suffer from social stigma and discrimination in their communities, which has a negative impact on the quality of life of patients and their families. Therefore, research on epilepsy detection techniques and antiepileptic drugs could improve the chances that people coping with this disease live healthily and free of social stigma.

Epilepsy is usually characterized by two or more unprovoked seizures, which can affect the patient at any time. An epileptic seizure is defined as an excessive electrical discharge in an arbitrary portion of the brain. This rapid discharge causes a disturbance and abnormal behavior in the nervous system. EEG signal analysis is an adequate clinical tool for recognizing epileptic seizures, as it measures the electrophysiological signals of the brain in real time and captures brain conditions efficiently [3]. However, EEG signal analysis has some limitations in detecting epileptic seizures because of the behavior of epilepsy, such as the following:

(1) Seizures are not always caused by the epilepsy disorder, as approximately 10% of healthy people may suffer from one seizure in their lifetime. These nonepileptic seizures are similar to epileptic seizures, but they are not related to epilepsy [2]. Hence, distinguishing epileptic from nonepileptic seizures is also important.

(2) Although qualified professional neurologists can visually detect epileptic seizures from an EEG data sheet, it is still considered a time-consuming process.

The diagnosis of epilepsy is usually performed by manual inspection of the EEG signals, which is not an easy task and requires a highly skilled neurophysiologist. Also, the manual inspection of a long recording is a tedious and time-consuming process. Therefore, an intelligent clinical computer-aided design (CAD) tool that analyzes the EEG signal and detects epileptic seizures is required.

Various case studies have reported the advantages of using automated methods to recognize epileptic seizures from EEG signals, and many techniques are commonly employed for automated EEG analysis and epilepsy detection. Most of these techniques consist of two stages: the first extracts features from the raw EEG signal; the second classifies those features [2]. Feature extraction obtains significant information from the raw EEG data and can be implemented in the time, frequency, or time-frequency domain. The time domain and frequency domain are used for signal processing when the EEG is assumed to be a stationary signal. On the other hand, when the EEG signal is considered nonstationary [4, 5], the time-frequency domain is employed. Case studies demonstrated that the time-frequency domain is more suitable for EEG signal analysis and could obtain significant results [2]. Many algorithms have been proposed for epileptic seizure detection within the time-frequency domain, such as empirical mode decomposition (EMD) [6, 7] and wavelet transformation [8-10]. The EMD methods provided a leading trend for detecting epileptic seizures from the EEG signal. The EMD has been combined with 2D and 3D phase space representation (PSR) features to identify epileptic seizures, with a least-squares support vector machine (LS-SVM) then performing the classification [11]. A combination of different intrinsic mode functions (IMFs) has been constructed as a set of features for the classification problem [12]. The EMD has also been used to decompose an EEG signal into a collection of symmetric and band-limited signals. Then, a second-order difference plot (SODP) is applied to obtain an elliptical area. The 95% confidence area of this ellipse is used as a feature fed to an artificial neural network (ANN) to distinguish seizure from seizure-free signals [6].
Although the EMD methods proved their effectiveness, they suffer from the mode-mixing problem, which produces intermediate signals and noise. Local binary pattern (LBP) based methods represent a different approach to epilepsy detection. The work presented in [13] suggested a feature extraction based on one-dimensional LBP to classify the epileptic seizure, seizure-free, and healthy classes from the EEG signal. In [14], the researchers implemented a technique based on the combination of the LBP and the Gabor filter of the EEG signals. Then, the k-nearest neighbor classifier was used for the classification of epileptic seizures and seizure-free signals. The wavelet transformation is usually employed with nonlinear measures to distinguish seizure from seizure-free recordings in raw EEG signals. An automatic epilepsy detection approach proposed in [15] used the discrete wavelet transformation (DWT) for signal decomposition and generated a feature set using improved correlation-based feature selection (ICFS). Then, the random forest classifier was applied for classification. The DWT has been used with many nonlinear features, and the effectiveness of this approach has been proved [16-23]. Although wavelet transformation is an effective method for EEG signal analysis, it has some limitations [24]; in particular, the selection of an appropriate wavelet basis is vital in time-frequency signal analysis.

A flexible wavelet transformation proposed in [25], namely, the tunable Q-wavelet transformation (TQWT), controls the transformation of a discrete-time signal through an easily tunable variable called the Q-factor. The TQWT addressed the primary limitations of wavelet filter banks by providing a tunable Q-factor that controls the number of oscillations of the wavelet. Moreover, the TQWT reduced the search space of filter banks by exposing only three variables for adjustment. Many researchers have applied the TQWT to physiological signal analysis and proved its effectiveness [21, 22, 26, 27]. However, the aforementioned methods provided a static set of features (e.g., statistical, nonlinear, and spectral) and did not discuss the adaptive behavior of these features as a dynamical system.

In this paper, an intelligent computer-aided design (CAD) tool is proposed that analyses the EEG signal and classifies the input EEG as epileptic seizure or seizure-free. This provides an aid to the neurophysiologist in interpreting the EEG and reduces the diagnosis time. The proposed method is based on data fusion of a single-channel EEG signal and an image processing approach. In the single-channel EEG signal, the EEG data are processed as a time-frequency time series. The signal is divided into smaller segments of data using the tunable Q-wavelet. Some statistical features are extracted from this time series in the time domain and frequency domain. On the other hand, an image processing technique extracts the significant texture from the medical image. Thus, the gray-level co-occurrence matrix is applied to the image, and the contrast, correlation, energy, and homogeneity are extracted. The data fusion approach is used to combine these features of the input EEG signal and construct a large dataset for each patient. Because of the large number of extracted features, a feature reduction algorithm is needed to reduce the processing time by obtaining a compact subset of features instead of the original one. Moreover, the feature reduction algorithm selects the relevant features, removes redundant features, and discovers the dependency among these features. Therefore, the firefly algorithm is used to find the optimal subset of features. Subsequently, bootstraps are obtained by resampling the compact subset to train the random forest classifier. The final decision is obtained by a vote over the decision trees of the forest. Hence, the classification of seizure and seizure-free is obtained. A real-world dataset from the University of Bonn is used for benchmarking and validation of the proposed method.
A numerical experiment has been implemented, and a comparative study presented a promising efficiency of the proposed system regarding the overall accuracy, sensitivity, and specificity.

The remainder of this manuscript is organized as follows: the preliminary concepts are introduced in Section 2. Section 3 introduces the combinational hybrid system for epilepsy detection. The experiment and discussion are presented in Section 4. Lastly, the paper is concluded in Section 5.

2. Preliminary Knowledge

2.1. Tunable Q-Wavelet Transformation (TQWT). The tunability of the Q-factor provides a proficient way to adapt the wavelet transformation [25]. The TQWT has three inputs: the Q-factor, denoted by $Q$, which determines the number of oscillations of the wavelet; the oversampling rate, denoted by $r$, which determines the amount of overlap between the frequency responses; and the number of decomposition stages, denoted by $J$. For each decomposition stage, the target signal $s[n]$ with a sample rate of $f_s$ could be represented by low-pass and high-pass subbands with sampling frequencies of $\alpha f_s$ and $\beta f_s$, respectively, where $\alpha$ and $\beta$ are the signal scaling parameters. The low-pass subband is produced by the low-pass filter $H_0(\omega)$ and low-pass scaling LPS($\alpha$). Similarly, the high-pass subband $\omega_1$ is produced by $H_1(\omega)$ and HPS($\beta$). The low-pass and high-pass subband filters are formulated as follows:

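$$H_0(\omega) = \begin{cases} 1, & |\omega| \le (1-\beta)\pi, \\ \theta\!\left(\dfrac{\omega + (\beta-1)\pi}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\ 0, & \alpha\pi \le |\omega| \le \pi, \end{cases} \tag{1}$$

$$H_1(\omega) = \begin{cases} 0, & |\omega| \le (1-\beta)\pi, \\ \theta\!\left(\dfrac{\alpha\pi - \omega}{\alpha+\beta-1}\right), & (1-\beta)\pi < |\omega| < \alpha\pi, \\ 1, & \alpha\pi \le |\omega| \le \pi, \end{cases} \tag{2}$$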

where $\theta(\omega)$ is defined as follows:

$$\theta(\omega) = 0.5\,(1 + \cos\omega)\sqrt{2 - \cos\omega}, \quad |\omega| \le \pi. \tag{3}$$

Both $r$ and $Q$ can be expressed in terms of the filter-bank variables $\alpha$ and $\beta$ as follows:

$$r = \frac{\beta}{1-\alpha}, \qquad Q = \frac{2-\beta}{\beta}. \tag{4}$$
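To make the role of the parameters concrete, the following minimal Python sketch inverts Eq. (4) to recover $\alpha$ and $\beta$ from a chosen $Q$ and $r$, and evaluates the transition function of Eq. (3). The piecewise filter responses in `low_pass` and `high_pass` follow the usual TQWT construction of [25] and should be read as an assumption here; all function names are illustrative and are not taken from the TQWT toolbox.

```python
import math


def tqwt_scaling(q_factor: float, redundancy: float) -> tuple[float, float]:
    """Invert Eq. (4): return (alpha, beta) for a given Q and r."""
    beta = 2.0 / (q_factor + 1.0)    # from Q = (2 - beta) / beta
    alpha = 1.0 - beta / redundancy  # from r = beta / (1 - alpha)
    return alpha, beta


def theta(omega: float) -> float:
    """Transition function of Eq. (3), defined for |omega| <= pi."""
    return 0.5 * (1.0 + math.cos(omega)) * math.sqrt(2.0 - math.cos(omega))


def low_pass(omega: float, alpha: float, beta: float) -> float:
    """Assumed low-pass response H0 (standard TQWT form), |omega| <= pi."""
    w = abs(omega)
    if w <= (1.0 - beta) * math.pi:
        return 1.0
    if w >= alpha * math.pi:
        return 0.0
    return theta((w + (beta - 1.0) * math.pi) / (alpha + beta - 1.0))


def high_pass(omega: float, alpha: float, beta: float) -> float:
    """Assumed high-pass response H1 (standard TQWT form), |omega| <= pi."""
    w = abs(omega)
    if w <= (1.0 - beta) * math.pi:
        return 0.0
    if w >= alpha * math.pi:
        return 1.0
    return theta((alpha * math.pi - w) / (alpha + beta - 1.0))


# Parameters used later in the experiments: Q = 1, r = 3.
alpha, beta = tqwt_scaling(1.0, 3.0)
```

With $Q = 1$ and $r = 3$ this gives $\beta = 1$ and $\alpha = 2/3$. Under the assumed responses the two filters are power complementary, that is, $H_0^2(\omega) + H_1^2(\omega) = 1$ at every frequency, which is the property that allows perfect reconstruction.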

2.2. Feature Sets. The features used in this research fall into four main groups: statistical, power spectrum, chaotic, and gray-level co-occurrence matrix (GLCM) features. The first group contains five features calculated from the time domain of the input signal: mean ($\mu$), standard deviation (STD), variance (var), Shannon entropy (H), and approximate entropy (ApEn). The mathematical formulation of each feature is as follows [28-31]:

$$\mu(x) = \frac{1}{N}\sum_{i=1}^{N} x_i \tag{5}$$

$$\mathrm{STD}(x) = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N} (x_i - \mu)^2} \tag{6}$$

$$\mathrm{var}(x) = \frac{1}{N-1}\sum_{i=1}^{N} |x_i - \mu|^2 \tag{7}$$

$$H(X) = -\sum_{i=1}^{N} P(x_i)\log P(x_i) \tag{8}$$

$$\mathrm{ApEn}(x) = \phi^{m}(r) - \phi^{m+1}(r). \tag{9}$$
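The time-domain features of Eqs. (5) to (9) can be sketched in a few lines of standard-library Python. This is an illustration only: the histogram used to estimate $P(x_i)$, the bin count, the natural logarithm, the conventional leading minus sign on the entropy, and the default ApEn settings `m = 2` and absolute tolerance `r = 0.2` are all assumptions, since the text does not state them.

```python
import math
from collections import Counter


def mean(x):
    return sum(x) / len(x)


def variance(x):
    """Sample variance with the N - 1 denominator, as in Eq. (7)."""
    mu = mean(x)
    return sum((v - mu) ** 2 for v in x) / (len(x) - 1)


def std(x):
    return math.sqrt(variance(x))


def shannon_entropy(x, bins=10):
    """Entropy of a histogram estimate of P(x_i); the binning is an assumption."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / bins or 1.0  # avoid zero width for constant signals
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in x)
    n = len(x)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def approximate_entropy(x, m=2, r=0.2):
    """ApEn = phi^m(r) - phi^(m+1)(r) with Chebyshev distance between templates."""
    def phi(dim):
        n = len(x) - dim + 1
        templates = [x[i:i + dim] for i in range(n)]
        total = 0.0
        for a in templates:
            matches = sum(
                1 for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / n)  # self-match keeps matches >= 1
        return total / n
    return phi(m) - phi(m + 1)
```

In practice the tolerance `r` is often chosen relative to the standard deviation of the segment; the fixed default above is only a placeholder.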

The second set of features describes the power spectrum of the input signal based on frequency domain analysis. This feature set contains the spectral centroid (SC), spectral speed (SS), spectral flatness (SF), spectral slope (SSI), and spectral entropy (PSE), where $Y(q)$ denotes the discrete Fourier transformation of the input signal $f(n)$. The mathematical formulation of each feature is as follows [32]:

$$\mathrm{SC} = \frac{\sum_{q=0}^{M-1} q\,|Y(q)|}{\sum_{q=0}^{M-1} |Y(q)|} \tag{10}$$

$$\mathrm{SS} = \frac{\sum_{q=0}^{M-1} (q - \mathrm{SC})^2\, q\,|Y(q)|}{\sum_{q=0}^{M-1} |Y(q)|} \tag{11}$$

$$\mathrm{SF} = \frac{\prod_{q=0}^{M-1} |Y(q)|^{1/M}}{\frac{1}{M}\sum_{q=0}^{M-1} |Y(q)|} \tag{12}$$

[mathematical expression not reproducible] (13)

$$\mathrm{PSE} = -\sum_{q=1}^{n} \frac{Y(q)}{\sum_{k=1}^{n} Y(k)} \ln\!\left(\frac{Y(q)}{\sum_{k=1}^{n} Y(k)}\right) \tag{14}$$
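As an illustration of the spectral features, the sketch below computes the centroid, flatness, and entropy from DFT magnitudes using only the Python standard library. The naive $O(M^2)$ DFT, the use of magnitudes $|Y(q)|$ as the entropy weights, and the conventional leading minus sign on the entropy are assumptions made for this example; the spectral speed and slope are omitted.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes |Y(q)| of a naive DFT (fine for short illustrative signals)."""
    m = len(signal)
    return [
        abs(sum(signal[n] * cmath.exp(-2j * math.pi * q * n / m) for n in range(m)))
        for q in range(m)
    ]


def spectral_centroid(mags):
    """Magnitude-weighted mean frequency bin, Eq. (10)."""
    return sum(q * a for q, a in enumerate(mags)) / sum(mags)


def spectral_flatness(mags):
    """Geometric mean over arithmetic mean of the magnitudes, Eq. (12)."""
    if min(mags) <= 0.0:
        return 0.0  # any zero bin drives the geometric mean to zero
    geometric = math.exp(sum(math.log(a) for a in mags) / len(mags))
    arithmetic = sum(mags) / len(mags)
    return geometric / arithmetic


def spectral_entropy(mags):
    """Entropy of the normalized spectrum; zero bins contribute nothing."""
    total = sum(mags)
    return -sum((a / total) * math.log(a / total) for a in mags if a > 0.0)
```

A unit impulse has a perfectly flat spectrum, so its flatness is 1 and its entropy reaches the maximum $\ln M$; a pure tone concentrates the spectrum and drives both values down.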

The third set of features contains chaotic measures to obtain the dynamic behavior of the EEG signal. This set includes Higuchi's fractal dimension (HFD), Hurst exponent (Hr), and Katz fractal exponent (KATZ). These features are formulated as follows [33-36]:

$$\mathrm{HFD} = \ln\!\left(\frac{k}{L(k)}\right) \tag{15}$$

[mathematical expression not reproducible] (16)

$$\mathrm{KATZ} = \frac{\log(n)}{\log(n) + \log(d/L)} \tag{17}$$
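Equation (17) can be evaluated directly once $n$, $d$, and $L$ are fixed. The sketch below assumes the common one-dimensional reading in which $n$ is the number of steps in the waveform, $L$ is the sum of absolute differences between successive samples, and $d$ is the largest absolute deviation from the first sample; the text does not state which convention was used.

```python
import math


def katz_fd(x):
    """Katz fractal exponent of Eq. (17) under the 1-D amplitude convention."""
    steps = len(x) - 1                                        # n
    length = sum(abs(x[i + 1] - x[i]) for i in range(steps))  # L
    diameter = max(abs(v - x[0]) for v in x[1:])              # d
    return math.log(steps) / (math.log(steps) + math.log(diameter / length))
```

A monotone ramp has $d = L$ and therefore an exponent of exactly 1, while a waveform that doubles back on itself has $d < L$ and a larger exponent. The function is undefined for a constant signal ($L = 0$) and when $d/L = 1/n$, so a production version would need to guard those cases.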

The final set of features consists of statistical measures of an image, computed from the gray-level co-occurrence matrix (GLCM), where $C(i, j)$ represents an entry in the co-occurrence matrix, $i, j = 0, 1, 2, \ldots, L - 1$, and $L$ is the number of gray levels in the image. These matrices represent the spatial dependencies between the gray levels of the image and reflect the structure of the underlying texture. After the normalization of these matrices, the contrast, correlation, energy, and homogeneity are computed as follows:

$$\mathrm{Energy} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} C(i, j)^2 \tag{18}$$

$$\mathrm{Contrast} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} (i - j)^2\, C(i, j) \tag{19}$$

$$\mathrm{Correlation} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} \frac{(i - \mu_i)(j - \mu_j)\, C(i, j)}{\sigma_i \sigma_j} \tag{20}$$

$$\mathrm{Local\ homogeneity} = \sum_{i=0}^{L-1}\sum_{j=0}^{L-1} \frac{C(i, j)}{1 + (i - j)^2} \tag{21}$$
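The four texture measures of Eqs. (18) to (21) can be sketched as follows. The construction of the co-occurrence matrix here (a single horizontal offset of one pixel, non-symmetric counting, integer gray levels starting at 0) is an assumption chosen for brevity; the text does not specify the offsets or the quantization that were used.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix C(i, j) for the pixel offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1.0
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def glcm_features(C):
    """Energy, contrast, correlation, and local homogeneity of a normalized GLCM."""
    idx = range(len(C))
    cells = [(i, j, C[i][j]) for i in idx for j in idx]
    mu_i = sum(i * p for i, _, p in cells)
    mu_j = sum(j * p for _, j, p in cells)
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p for i, _, p in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p for _, j, p in cells))
    denom = sd_i * sd_j
    covariance = sum((i - mu_i) * (j - mu_j) * p for i, j, p in cells)
    return {
        "energy": sum(p * p for _, _, p in cells),
        "contrast": sum((i - j) ** 2 * p for i, j, p in cells),
        # Correlation is undefined when either marginal has zero variance.
        "correlation": covariance / denom if denom > 0 else float("nan"),
        "homogeneity": sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
    }


features = glcm_features(glcm([[0, 0], [1, 1]], levels=2))
```

For the two-row example every horizontal neighbor pair has equal gray levels, so the contrast is 0 and both the correlation and the homogeneity are 1.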

2.3. Firefly Optimization Algorithm. The firefly algorithm is a swarm-based stochastic search technique [37]. It consists of a set of members called fireflies; each firefly represents a candidate solution. The most attractive firefly is considered to be the leader firefly that leads the other candidates to the best region. The attractiveness is calculated based on the light intensity, which is usually determined by the objective fitness function. The attractiveness between two fireflies $X_i$ and $X_j$ is determined as follows:

[mathematical expression not reproducible] (22)

$$r_{ij} = \sum_{d=1}^{D} (x_{id} - x_{jd})^2 \tag{23}$$

where $D$ denotes the problem dimension, with $d \in \{1, 2, \ldots, D\}$, and $r_{ij}$ denotes the distance between $X_i$ and $X_j$. Parameter $\beta_0$ denotes the initial attractiveness at $r = 0$ and $\gamma$ denotes the light absorption factor such that $\gamma \in [0, 1]$. Each firefly $X_i$ is compared with the other fireflies $X_j$, where $j \in \{1, 2, \ldots, N\}$ such that $i \ne j$ and $N$ denotes the number of fireflies. If firefly $X_i$ is better (brighter) than $X_j$, then firefly $X_j$ moves towards $X_i$ with a step movement formulated as follows:

[mathematical expression not reproducible] (24)

where $\epsilon_i$ represents a uniformly distributed random variable such that $\epsilon_i \in [-0.5, 0.5]$ and $\alpha$ denotes the movement step such that $\alpha \in [0, 1]$.
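For orientation, the following sketch implements the textbook firefly update rather than the exact expressions of this paper: it assumes an attractiveness of the form $\beta_0 e^{-\gamma r^2}$ with $r$ taken as the Euclidean distance, and a move of the dimmer firefly towards the brighter one plus a random step $\alpha\,\epsilon$ with $\epsilon \in [-0.5, 0.5]$. The function names and default parameter values are illustrative.

```python
import math
import random


def euclidean(xi, xj):
    """Euclidean distance between two fireflies (an assumption for this sketch)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))


def attractiveness(r, beta0=1.0, gamma=0.1):
    """Assumed textbook form: beta0 at r = 0, decaying with light absorption gamma."""
    return beta0 * math.exp(-gamma * r * r)


def move_towards(xj, xi, beta0=1.0, gamma=0.1, alpha=0.2, rng=random):
    """Move the dimmer firefly xj towards the brighter firefly xi."""
    beta = attractiveness(euclidean(xi, xj), beta0, gamma)
    return [
        b + beta * (a - b) + alpha * rng.uniform(-0.5, 0.5)
        for a, b in zip(xi, xj)
    ]


# With alpha = 0 the move is deterministic: xj slides along the line to xi.
moved = move_towards([0.0, 0.0], [1.0, 1.0], alpha=0.0)
```

A larger $\gamma$ makes distant fireflies nearly invisible to each other, so the swarm splits into local groups; $\gamma = 0$ makes every firefly equally attractive regardless of distance.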

3. The Combinational Hybrid System of Epilepsy Detection from EEG Signal

In this research, a hybrid system was proposed to detect both seizures and seizure-free conditions from a raw EEG signal. Although some investigations focused on the feature extraction level, the proposed system was established based on four main levels. This system combined the data fusion approach with firefly optimization and random forest. The TQWT was applied for EEG signal decomposition; then the features were constructed using a data fusion technique. Due to the large number of features obtained for each subband (featuresCount × J × Q), a feature reduction was applied to obtain a compact set of features instead of the original one. The compact set of features was fed to a random forest algorithm to obtain the classification rules and hence used for training. After training, the classifier should be able to classify and estimate the preictal phase. The proposed system is divided into the following four levels of processing, which are shown in Figure 1 and described in detail below.

(i) EEG decomposition using TQWT

(ii) Feature extraction using data fusion based on single-channel EEG signal and co-occurrence matrix

(iii) Feature reduction using firefly optimization algorithm

(iv) Training of random forest classifier to detect the seizures and seizure-free EEGs

3.1. EEG Decomposition Using TQWT. The preprocessing level applies the TQWT decomposition to the input EEG signal. The TQWT converts the continuous EEG signal into discrete portions of data that can be handled more effectively. This wavelet transformation is used because of its effectiveness in signal decomposition and its tunability. The subbands obtained using the TQWT show a significant difference between the seizure-free and the epileptic seizure EEG signals, as shown in Figure 2. Subfigures (a), (b), (c), and (d) visualize the histograms of the first, second, third, and fourth subbands of the seizure-free class. The remaining subfigures (e), (f), (g), and (h) show the histograms of the epileptic seizure class for the same subbands. The values of the extracted subbands of the second class are about ten times larger than those of the first class, which demonstrates the efficiency of this decomposition.

3.2. Feature Extraction Using Data Fusion Based on a Single-Channel EEG and a Co-Occurrence Matrix. In the first perspective, the EEG signal was described as a nonstationary time series. After the decomposition of the EEG signal using the TQWT, a feature extraction process was performed to obtain significant characteristics from each TQWT subband. The extracted features were categorized into three main groups. The first group captures the statistical characteristics of the TQWT subband in the time domain: the mean, standard deviation, variance, Shannon entropy, and approximate entropy were calculated as formulated in (5) to (9). The second group performs a power spectrum analysis of the obtained subbands. The discrete Fourier transformation (DFT) was applied to convert these subbands into the frequency domain, and then the power spectrum features were extracted. The second group consists of the spectral centroid, spectral speed, spectral flatness, spectral slope, and spectral entropy as formulated in (10) to (14). The power spectrum analysis is an effective method to study the frequency behavior of the signal. The last group of features performs a chaotic analysis of each subband. Because of the nonlinearity of the EEG signals, a nonlinear analysis is required, and chaotic analysis is one of the most suitable options. In this analysis, the Higuchi fractal dimension (HFD), Hurst exponent, and Katz fractal exponent were computed for each subband as formulated in (15) to (17).

In the second perspective, the input EEG signal was converted to a gray image. Then a co-occurrence matrix was computed to capture the gray-level dependencies of the image. The textures of contrast, correlation, energy, and homogeneity were then calculated from this matrix to represent the statistical measures of the image as formulated in (18) to (21). Finally, after computing the feature space, a data fusion was applied to merge all of these features and create a single dataset.

3.3. Feature Reduction Using Firefly Optimization Algorithm. In this section, a feature reduction algorithm based on the firefly algorithm is proposed [37, 38]. This algorithm implements a chaotic movement, simulated annealing (SA) to produce efficient offspring candidates, and memory awareness of the best and worst solutions to improve the search diversity and avoid local optima. The firefly population is randomly initialized using a chaotic logistic map to ensure the diversity of the candidates and the randomness of each firefly. Afterwards, the fitness function is computed for each candidate, and the best and worst solutions are identified as $g_{best}$ and $g_{worst}$, respectively. In each iteration, an alternative candidate with a competitive fitness but located in a different region is declared as $S_{best}$. Both the best and the alternative candidates are used to lead weak solutions towards the optimal region and to avoid the local optima problem. The mean of the leader firefly and the alternative one is enhanced using the SA algorithm to obtain a better solution $g'_{best}$. The improved local and global solutions are used to guide the fireflies with low light intensity towards those with stronger light intensity. The algorithm is repeated until the maximum number of iterations is reached or a termination criterion is achieved. The behavior of the proposed algorithm is determined by three properties, namely, the objective function, the attractiveness movement step, and population diversity. The proposed algorithm is given in Algorithm 1.

ALGORITHM 1: The pseudocode of the proposed feature reduction algorithm based on firefly optimization and SA.
(1) Input: The features matrix Fm
(2) Output: The optimal solution $g_{best}$
(3) Initialize the firefly swarm using a logistic chaotic map.
(4) Evaluate each firefly $g_i$ using the fitness function $f(x)$
(5) Select the best firefly $g_{best}$ and the worst one $g_{worst}$
(6) while termination conditions are not reached do
(7)     Declare an alternative leader firefly as $S_{best}$, with a competitive fitness and located in a different region.
(8)     Obtain an offspring solution $g'_{best}$ using SA
(9)     for all (firefly i and $f_i \ne g_{worst}$) do
(10)        for all (firefly j and $f_j \ne g_{worst}$) do
(11)            if ($I_j > I_i$) then
(12)                Enhance firefly j using Equation (27) to obtain a better offspring candidate firefly denoted $x'_j$
(13)                Replace firefly j with the offspring $x'_j$
(14)                Move firefly i towards the neighboring and global optimal solutions using Equation (26)
(15)            end if
(16)        end for
(17)    end for
(18)    Update the worst solution $g_{worst}$
(19)    if $f(g'_{best}) > f(g_{best})$ then
(20)        $g_{best} \leftarrow g'_{best}$
(21)    end if
(22)    Rank all fireflies and update the best $g_{best}$ and worst $g_{worst}$ solutions.
(23) end while
(24) return $g_{best}$

The Objective Function. This function is used to evaluate each candidate in the algorithm and is defined as follows:

$$f(x) = w_1 \times \mathrm{accuracy}(x) + \frac{w_2}{\mathrm{number\_of\_features}} \tag{25}$$

where $w_1$ and $w_2$ represent the weights of the classification accuracy and the number of selected features, respectively. The values of $w_1$ and $w_2$ are set to 0.9 and 0.1, respectively, as recommended by [38].
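The trade-off encoded by the objective function can be made concrete with a short sketch. Here a candidate is represented as a binary mask over the feature columns, and `accuracy_of` is a hypothetical callback that would train and score a classifier on the selected columns; both the mask representation and the callback are assumptions for illustration, and returning 0 for an empty mask is a design choice of this example.

```python
def fitness(mask, accuracy_of, w1=0.9, w2=0.1):
    """Eq. (25): weighted accuracy plus a reward for using fewer features."""
    selected = [i for i, bit in enumerate(mask) if bit]
    if not selected:
        return 0.0  # an empty subset cannot be classified
    return w1 * accuracy_of(selected) + w2 / len(selected)


def constant_accuracy(columns):
    """Stand-in scorer so the example runs without a dataset."""
    return 0.98


compact = fitness([1, 0, 1, 1, 0, 0], constant_accuracy)  # 3 features
full = fitness([1, 1, 1, 1, 1, 1], constant_accuracy)     # 6 features
```

At equal accuracy the three-feature mask scores higher than the six-feature one, because the second term $w_2 / \mathrm{number\_of\_features}$ shrinks as more features are kept.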

The Attractiveness Movement Step. The proposed algorithm uses a chaotic logistic map to initialize the firefly population, which increases the diversity and helps avoid local optima. After obtaining the global best solution $g_{best}$, an alternative leader firefly $S_{best}$ is declared with a competitive fitness but located in a different region. Since the two leaders are likely to explore distinct search regions, this strategy reduces the probability of being trapped in local optima. In addition, the optimal offspring of the mean positions of the two leaders and the neighboring brighter candidates are used to lead the search process and guide the solutions with lower light intensity towards the optimal region.

$$x_i = x_i + \beta_0 C_k (x'_j - x_i) + C_k\,\epsilon\,(g'_{best} - x_i) + \alpha' \times \operatorname{sign}[\mathrm{rand} - 0.5] \tag{26}$$

$$x'_j = x_j + \sigma_1 \tag{27}$$

$$g'_{best} = \operatorname{mean}(g_{best} + S_{best})\,\sigma_2 \tag{28}$$

where $x'_j$ denotes the offspring candidate of a brighter neighboring solution $x_j$, defined by the SA as shown in (27), and $g'_{best}$ represents the fitter offspring solution of the mean of the leader firefly and the alternative one as formulated in (28). It is worth mentioning that $\sigma_1$ and $\sigma_2$ are two random variables drawn from the Gaussian distribution. The movement step of the firefly is determined as shown in (26), where $C_k$ represents the chaotic map variable in the movement step and $\epsilon$ denotes the randomized vector defined in the traditional firefly algorithm. Parameter $\alpha'$ denotes an adaptive step initialized to 0.5 to control the diversity of the search process.

ALGORITHM 2: The pseudocode of the generation of offspring solutions using SA.
(1) Input: $iterations_{max}$, $g_{mean}$, $T_{max}$
(2) Output: The optimal offspring $S_{best}$
(3) $S_c$ = CreateInitSolutions($g_{mean}$)
(4) $S_{best}$ = $S_c$
(5) while ($i \le iterations_{max}$) do
(6)     $S_i$ = CreateNeighborSolution($S_c$)
(7)     $T_c$ = CalculateTemperature($i$, $T_{max}$)
(8)     if (cost($S_i$) $\le$ cost($S_c$)) then
(9)         $S_c$ = $S_i$
(10)        if (cost($S_i$) $\le$ cost($S_{best}$)) then
(11)            $S_{best}$ = $S_i$
(12)        end if
(13)    else if (exp((cost($S_{best}$) - cost($S_i$)) / $T_c$) $\ge$ $\sigma$) then
(14)        $S_c$ = $S_i$
(15)    end if
(16)    i = i + 1
(17) end while
(18) return $S_{best}$

Offspring Generation Using SA. The proposed algorithm uses SA to generate better candidates and enhance the search process, as shown in Algorithm 2. The SA accepts both the best solution $g_{best}$ and the alternative solution $S_{best}$ as its main inputs; then the traditional SA is applied. A better solution is always accepted, according to the SA heuristics. A weaker solution, on the other hand, is accepted with the probability shown in (29), where $\Delta f$ denotes the difference in fitness (energy) between the two candidates and $T_c$ denotes the current temperature. A simple linear cooling mechanism is used to control the value of the temperature.

$$P(x_j) = \exp\!\left(\frac{\Delta f}{T_c}\right) \tag{29}$$
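The acceptance rule can be sketched as a small annealing loop that mirrors the structure of Algorithm 2. The linear schedule $T_c = T_{max}(1 - i / iterations_{max})$ is one plausible reading of "simple linear cooling" and is an assumption, as are the toy cost function and neighbor move in the usage lines; the paper's `CreateNeighborSolution` is replaced here by a user-supplied `neighbor` callback.

```python
import math
import random


def linear_temperature(i, iterations_max, t_max):
    """Assumed linear cooling: t_max at i = 0, falling to 0 at the last iteration."""
    return t_max * (1.0 - i / iterations_max)


def anneal(start, cost, neighbor, iterations_max=100, t_max=1.0, rng=None):
    """Minimize cost() following the accept/reject structure of Algorithm 2."""
    rng = rng or random.Random()
    current = best = start
    for i in range(1, iterations_max + 1):
        candidate = neighbor(current, rng)
        t_c = linear_temperature(i, iterations_max, t_max)
        if cost(candidate) <= cost(current):
            current = candidate  # better candidates are always accepted
            if cost(candidate) <= cost(best):
                best = candidate
        elif t_c > 0 and math.exp((cost(best) - cost(candidate)) / t_c) >= rng.random():
            current = candidate  # worse candidates are accepted with Eq. (29)
    return best


# Toy usage: minimize (x - 3)^2 over the integers with +/-1 moves.
toy_cost = lambda x: (x - 3) ** 2
toy_neighbor = lambda x, rng: x + rng.choice((-1, 1))
result = anneal(0, toy_cost, toy_neighbor, iterations_max=200, rng=random.Random(0))
```

Because the exponent is never positive, the acceptance probability for a worse candidate lies in (0, 1] and shrinks as the temperature falls, so the search becomes greedier towards the end of the run.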

Population Diversity. In each iteration, the worst solution is detected after the ranking process, as shown in Algorithm 1. The remaining solutions are guided by the average position obtained from (30), where $\alpha$ denotes a random value obtained by a Gaussian map.

$$x_j^{worst} = \frac{g_{best} + S_{best}}{2} + \alpha\left(x_j^{worst} - \frac{g_{best} + S_{best}}{2}\right) \tag{30}$$

3.4. Classification and Learning. The random forest (RF) is a successful ensemble approach used in supervised machine learning to solve classification or regression problems [39]. It consists of a collection of decision trees that act together as a single classifier. A different subset of the training data is supplied to each tree, which stabilizes the ensemble and improves the generalization of the classifier. The original dataset is divided into two parts. The first part is used to train each tree with the bootstrapping technique. The other part is used to evaluate the accuracy of the classification. Each tree is allowed to reach its maximum depth without pruning, which yields high-variance individual trees. Splitting continues until each leaf node contains instances of a single class or a predefined termination condition is met. Once the forest is established, the number of subsets remains constant. A new or unlabeled instance is classified by traversing each tree from the root node to a leaf node. The final decision for a new instance is the class that receives the most votes across all decision trees. The random forest performs slightly better than other classifiers such as discriminant analysis, SVM, and artificial neural network (ANN) [15, 39].
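The two ensemble ingredients described above, bootstrap resampling and majority voting, can be illustrated independently of any tree-learning code. In this sketch each "tree" is simply a callable that maps a feature vector to a class label; the three hand-written stumps are hypothetical stand-ins for trained decision trees.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement (one bootstrap per tree)."""
    return [rng.choice(rows) for _ in rows]


def forest_predict(trees, sample):
    """Return the class with the most votes across all trees."""
    votes = Counter(tree(sample) for tree in trees)
    return votes.most_common(1)[0][0]


# Hypothetical stumps standing in for trained trees (1 = seizure, 0 = seizure-free).
trees = [
    lambda s: int(s[0] > 0.5),
    lambda s: int(s[1] > 0.5),
    lambda s: 1,
]
label = forest_predict(trees, (0.9, 0.1))
```

For the sample `(0.9, 0.1)` the stumps vote 1, 0, and 1, so the forest returns class 1 even though one tree disagrees; this tolerance to individual errors is what offsets the high variance of unpruned trees.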

3.5. Performance Evaluation. Various performance formulas were used to evaluate the effectiveness of a classifier. The sensitivity (SEN) or recall, specificity (SPEC), accuracy (ACC), F-measure, Matthew's correlation coefficient (MCC), and receiver operating characteristics (ROC) are used to evaluate the efficiency of the random forest classifier [40-42]. These parameters are defined as follows:

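$$\mathrm{SEN} = \frac{TP}{TP + FN} \times 100 \tag{31}$$

$$\mathrm{SPEC} = \frac{TN}{TN + FP} \times 100 \tag{32}$$

$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN} \times 100 \tag{33}$$

$$F\text{-measure} = \frac{2TP}{2TP + FP + FN} \times 100 \tag{34}$$

$$\mathrm{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FN)(TP + FP)(TN + FN)(TN + FP)}} \times 100 \tag{35}$$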

TP and TN represent the total number of epileptic seizure and seizure-free signals classified correctly, respectively. FP is the number of seizure-free signals incorrectly classified as seizures, and FN is the number of epileptic seizure signals incorrectly classified as seizure-free. Cross-validation has also been used to ensure the reliability and effectiveness of the classifier. The original dataset is split into k folds (subsets) for training and testing. In this strategy, k - 1 folds are selected randomly to train the classifier, and the remaining fold is used for testing. The overall performance is calculated as the average over the folds. In this work, the experiment is repeated 10 times with tenfold cross-validation.
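The measures in (31) to (35) follow directly from the four confusion-matrix counts, as the short sketch below shows. The counts in the usage line are invented for illustration and are not results from this study.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, accuracy, F-measure, and MCC, all in percent."""
    mcc_denominator = math.sqrt((tp + fn) * (tp + fp) * (tn + fn) * (tn + fp))
    return {
        "SEN": 100.0 * tp / (tp + fn),
        "SPEC": 100.0 * tn / (tn + fp),
        "ACC": 100.0 * (tp + tn) / (tp + tn + fp + fn),
        "F": 100.0 * 2 * tp / (2 * tp + fp + fn),
        "MCC": 100.0 * (tp * tn - fp * fn) / mcc_denominator,
    }


# Hypothetical counts: 100 seizure and 100 seizure-free test signals.
scores = classification_metrics(tp=98, tn=97, fp=3, fn=2)
```

Unlike accuracy, the MCC uses all four counts symmetrically, so it stays informative when the seizure and seizure-free classes are of unequal size, as they are in the dataset used here.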

4. Results and Discussion

The benchmark dataset used in this investigation was acquired by the University of Bonn [28]. The dataset contains three different categories, i.e., preictal, healthy, and ictal, each recorded using a single channel for a 23.6 s duration. The normal and preictal conditions each comprise 200 cases, and the ictal state comprises 100 cases. The normal condition was acquired from five healthy volunteers using the international 10-20 system standard, with each volunteer in a relaxed-awake state with eyes open and closed (100 cases per set) [28]. The ictal data were collected from five patients during their epileptic seizures. The preictal data represent the EEG collected from the same five patients with no seizures. It is worth mentioning that all EEG signals were acquired using a 128-channel amplifier with a sampling rate of 173.61 Hz [28]. Finally, a bandpass filter of 0.53-40 Hz (12 dB/octave) was applied.

The proposed approach for automatic detection of epileptic seizures and seizure-free patients was implemented using MATLAB. A TQWT comparison between the seizure-free and the epileptic seizure patients is shown in Figure 3 with Q = 1, r = 3, and J = 3. It can be observed that both the amplitude and the frequency of the epileptic seizure signal are much higher than those of the healthy one. Moreover, the oscillatory behavior of the epileptic patient is more pronounced than that of the healthy one. The value of the parameter r is set to three to prevent excessive ringing of the wavelet, as suggested by [22]. The TQWT toolbox for MATLAB is publicly available at http://eeweb.poly.edu/iselesni/TQWT/.

After the construction of the dataset, the feature reduction was applied using the firefly algorithm to remove the redundant and irrelevant features. The number of data segments was determined by the parameters Q, r, and J. By tuning these parameters, the number of data segments was varied, and thus the training process was adapted. A trial and error approach was used to set the values of these parameters. The experiment was run with various values of J to obtain the best level of decomposition, and the performance measures were calculated for each level. As shown in Figure 4, the best value of the variable J was from two to three. Moreover, the best value of the parameter Q was found to be one, as shown in Figure 5. All other variables were held constant while changing the Q-factor to obtain the best value.

Then, the firefly algorithm with a population size of 20 fireflies, a mutation probability of 0.01, and a light absorption coefficient of 0.1 was applied to reduce the feature set. The compact set of features was obtained after 100 iterations. Figure 6 shows the number of features obtained by the firefly algorithm and the corresponding accuracy, precision, specificity, and recall. In the last step, the feature set was fed to a random forest classifier to distinguish the seizure-free and seizure conditions. The random forest classifier obtained 98% accuracy, 97% precision, 97% specificity, 98% recall, 98% F-measure, and 95% MCC at the third level of decomposition with a Q-factor of 1.
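A minimal sketch of firefly-based feature selection is given below. It is one possible reading of the procedure and not the authors' implementation. The population size, iteration count, light absorption, and mutation probability follow the values stated above. The base attractiveness `beta0`, the random step size `step`, the 0.5 threshold that turns a position into a feature mask, and the toy objective are assumptions made for illustration. In practice the fitness would be a classifier score on the selected features.

```python
import math
import random


def to_mask(position):
    """Threshold a real-valued position in [0, 1]^d into a boolean feature mask."""
    return tuple(v > 0.5 for v in position)


def firefly_select(fitness, n_features, n_fireflies=20, n_iter=100,
                   gamma=0.1, p_mutation=0.01, beta0=1.0, step=0.2, seed=0):
    """Search for a feature mask that maximizes `fitness` (higher is brighter)."""
    rng = random.Random(seed)
    swarm = [[rng.random() for _ in range(n_features)] for _ in range(n_fireflies)]
    light = [fitness(to_mask(x)) for x in swarm]
    best = max(range(n_fireflies), key=lambda i: light[i])
    best_mask, best_light = to_mask(swarm[best]), light[best]

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if light[j] <= light[i]:
                    continue  # only move toward strictly brighter fireflies
                xi, xj = swarm[i], swarm[j]
                dist2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
                attract = beta0 * math.exp(-gamma * dist2)  # fades with distance
                for k in range(n_features):
                    v = xi[k] + attract * (xj[k] - xi[k]) + step * (rng.random() - 0.5)
                    if rng.random() < p_mutation:
                        v = 1.0 - v  # mutation: reflect the coordinate
                    xi[k] = min(1.0, max(0.0, v))
                light[i] = fitness(to_mask(xi))
                if light[i] > best_light:
                    best_mask, best_light = to_mask(xi), light[i]
    return best_mask, best_light


def toy_fitness(mask, relevant=(0, 2, 5)):
    """Hypothetical objective: reward relevant features, penalize the rest."""
    hits = sum(1 for k in relevant if mask[k])
    extras = sum(mask) - hits
    return hits - 0.5 * extras


mask, score = firefly_select(toy_fitness, n_features=8, n_iter=20)
print(mask, score)
```

The penalty term in the toy objective plays the role of discouraging redundant features; a real objective would typically trade classification performance against the number of selected features.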

The firefly algorithm reduced the search space to three features. These features can replace the original dataset, which minimizes the processing time. The compact feature set contains the STD, ApEn, and KATZ features; the classification rules are based on these features. The classification rules obtained by the proposed system are shown in Figure 7, where the decision tree consists of 4 leaves and 7 nodes, and the final decision is either seizure-free or epileptic seizure, denoted by 0 and 1, respectively.
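For readers who want to experiment with the three selected features, the sketch below computes them in plain Python. It assumes that KATZ denotes the Katz fractal dimension in its common amplitude-only form, that STD is the population standard deviation, and that ApEn uses an embedding dimension of m = 2 with a tolerance of 0.2 times the standard deviation. None of these settings is stated in this section, so treat them as placeholders.

```python
import math
import statistics


def std(signal):
    """Population standard deviation (assumed definition of STD)."""
    return statistics.pstdev(signal)


def katz_fd(signal):
    """Katz fractal dimension: log10(n) / (log10(n) + log10(d / L))."""
    if len(signal) < 3:
        raise ValueError("need at least three samples")
    steps = [abs(b - a) for a, b in zip(signal, signal[1:])]
    length = sum(steps)                                  # total curve length L
    extent = max(abs(v - signal[0]) for v in signal[1:])  # max distance d from start
    if length == 0:
        raise ValueError("flat signal has no defined Katz dimension")
    n = len(steps)
    return math.log10(n) / (math.log10(n) + math.log10(extent / length))


def approximate_entropy(signal, m=2, r_factor=0.2):
    """Approximate entropy (self-matches counted); quadratic time in len(signal)."""
    n = len(signal)
    r = r_factor * statistics.pstdev(signal)

    def phi(dim):
        count = n - dim + 1
        templates = [signal[i:i + dim] for i in range(count)]
        total = 0.0
        for a in templates:
            matches = sum(
                1 for b in templates
                if max(abs(p - q) for p, q in zip(a, b)) <= r
            )
            total += math.log(matches / count)
        return total / count

    return phi(m) - phi(m + 1)


# Synthetic signal for illustration only; not EEG data.
signal = [math.sin(0.3 * i) for i in range(200)]
print(std(signal), katz_fd(signal), approximate_entropy(signal))
```

Because the approximate entropy loop is quadratic, a full 4097-sample segment is slow in pure Python; a vectorized implementation would be preferable for real data.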

A comparative study of the proposed hybrid epilepsy detection approach and other existing classification systems was performed in terms of total accuracy. A method based on EMD was proposed to detect epileptic seizures; it used the Hilbert transform of the IMFs obtained by the EMD process, which provides an analytic signal representation of the IMFs [12]. The classification rules obtained by this method achieved an accuracy of 90%. Frequency domain features with Burg's method obtained 93.11% accuracy with an SVM classifier [43]. Nonlinear features used with a Gaussian mixture model classifier achieved 95% accuracy [44]. A decision tree classifier used with energy, fractal dimension, and sample entropy provided 95.7% [45]. A combination of ApEn and the Hurst exponent was used for the diagnosis of epilepsy and produced 96.5% accuracy with SVM and ANN classifiers [46]. In [47], an eigensystem-based method was combined with a Multiple Layer Perceptron to classify epileptic seizure, healthy, and seizure-free signals; this approach provided an average accuracy of 97.5%. The EMD-based method for epilepsy detection in [6] achieved 97.75% accuracy. The Kraskov entropy combined with an SVM classifier also provided 97.75% [22]. An automated diagnostic system based on a set of entropies and a fuzzy Sugeno classifier (FSC) achieved an accuracy of up to 98% [19]. The work presented in [48] developed a method for epilepsy detection using EMD. The IMFs generated by the EMD were represented as a set of amplitude and frequency modulated (AM-FM) signals. Two bandwidths, namely, the amplitude modulation bandwidth and the frequency modulation bandwidth, calculated from the analytic IMFs, were fed to an LS-SVM to classify seizure and nonseizure EEG signals. This method achieved 98.18% average accuracy.
LBP-based methods have been combined with a Gabor filter for texture extraction from the EEG signal; a k-nearest neighbor classifier was applied and obtained an accuracy of 98.3% [14]. As demonstrated in Figure 8, the proposed method achieved a higher total accuracy than the other existing systems.

Once the system detects the preictal phase, the clinic receives a notification about that patient. From the clinical point of view, the main advantages of the hybrid system can be summarized as follows:

(i) Automatic classification and detection of epileptic seizure and seizure-free signals from the EEG signal

(ii) Detection of the preictal phase by studying the healthy and ictal phases of each patient

(iii) The ability, once the preictal phase is detected, to send a warning alert to physicians so that they can prepare the medical assessment for the patient

(iv) Robustness and reliability, as the performance was benchmarked using 10-fold cross-validation

(v) The dynamic behavior of the firefly algorithm, which makes the system adaptive to many features according to each case study

(vi) A small set of parameters required to analyze the EEG signal, typically three variables (Q, r, J)
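The 10-fold cross-validation mentioned in item (iv) can be sketched as follows. This is a plain shuffled split written for illustration; whether the original experiments used stratified folds is not stated, and the function name and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k shuffled, non-overlapping folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # fold sizes differ by at most one
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# 500 segments in total (200 + 200 + 100, per the dataset description).
for fold, (train, test) in enumerate(k_fold_indices(500), start=1):
    print(f"fold {fold}: {len(train)} training / {len(test)} test segments")
```

Every segment appears in exactly one test fold, so each reported measure is an average over ten disjoint held-out sets.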

The limitations of this research can be summarized as follows:

(i) The limited number of studied subjects (typically 100 per class)

(ii) The diagnosis process may be hindered by its dependence on additional software prerequisites.

5. Conclusion

In this paper, an automated intelligent CAD tool was proposed to classify and detect epileptic seizure and seizure-free EEG signals. The method analyzes the EEG signal using a hybrid data fusion approach that combines features collected from two different perspectives. In the first perspective, the EEG signal was treated as an image: the image was converted to grayscale, and the GLCM was computed to extract texture features such as contrast, correlation, power, and homogeneity. In the second perspective, the EEG signal was divided into smaller segments using the TQWT to extract time and frequency features, and a set of statistical, nonlinear (chaotic), and power spectrum features was obtained from each segment. After the dataset was constructed, a feature reduction algorithm based on firefly optimization was used to remove irrelevant and redundant features. Then, an RF was trained to classify and predict epileptic seizure and seizure-free EEG signals from the dataset. The experimental results showed that the proposed method achieved 99% accuracy, 97% precision, 97% specificity, 98% recall, 98% F-measure, and 95% MCC at the third level of decomposition.

Data Availability

The dataset used to support the findings of this research is cited in the article as "R.G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, C.E. Elger, Indications of Nonlinear Deterministic and Finite-Dimensional Structures in Time Series of Brain Electrical Activity: Dependence on Recording Region and Brain State, Physical Review E, 64 (2001) 061907".

https://doi.org/10.1155/2018/5812872

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] Epilepsy Fact Sheet, 2018, http://www.who.int/mediacentre/factsheets/fs999/en/.

[2] U. R. Acharya, S. V. Sree, G. Swapna, R. J. Martis, and J. S. Suri, "Automated EEG analysis of epilepsy: a review," Knowledge-Based Systems, vol. 45, pp. 147-165, 2013.

[3] A. Puce and M. S. Hamalainen, "A review of issues related to data acquisition and analysis in EEG/MEG studies," Brain Sciences, vol. 7, no. 6, 2017.

[4] V. Joshi, R. B. Pachori, and A. Vijesh, "Classification of ictal and seizure-free EEG signals using fractional linear prediction," Biomedical Signal Processing and Control, vol. 9, pp. 1-5, 2014.

[5] P. Ghaderyan, A. Abbasi, and M. H. Sedaaghi, "An efficient seizure prediction method using KNN-based undersampling and linear frequency measures," Journal of Neuroscience Methods, vol. 232, pp. 134-142, 2014.

[6] R. B. Pachori and S. Patidar, "Epileptic seizure classification in EEG signals using second-order difference plot of intrinsic mode functions," Computer Methods and Programs in Biomedicine, vol. 113, no. 2, pp. 494-502, 2014.

[7] A. R. Hassan and A. Subasi, "Automatic identification of epileptic seizures from EEG signals using linear programming boosting," Computer Methods and Programs in Biomedicine, vol. 136, pp. 65-77, 2016.

[8] M. Sharma, A. Dhere, R. B. Pachori, and U. R. Acharya, "An automatic detection of focal EEG signals using new class of time-frequency localized orthogonal wavelet filter banks," Knowledge-Based Systems, vol. 118, pp. 217-227, 2017.

[9] O. Faust, U. R. Acharya, H. Adeli, and A. Adeli, "Wavelet-based EEG processing for computer-aided seizure detection and epilepsy diagnosis," Seizure, vol. 26, pp. 56-64, 2015.

[10] Y. Kumar, M. L. Dewal, and R. S. Anand, "Relative wavelet energy and wavelet entropy based epileptic brain signals classification," Biomedical Engineering Letters, vol. 2, no. 3, pp. 147-157, 2012.

[11] R. Sharma and R. B. Pachori, "Classification of epileptic seizures in EEG signals based on phase space representation of intrinsic mode functions," Expert Systems with Applications, vol. 42, no. 3, pp. 1106-1117, 2015.

[12] V. Bajaj and R. B. Pachori, "Epileptic seizure detection based on the instantaneous area of analytic intrinsic mode functions of EEG signals," Biomedical Engineering Letters, vol. 3, no. 1, pp. 17-21, 2013.

[13] Y. Kaya, M. Uyar, R. Tekin, and S. Yildirim, "1D-local binary pattern based feature extraction for classification of epileptic EEG signals," Applied Mathematics and Computation, vol. 243, pp. 209-219, 2014.

[14] T. S. Kumar, V. Kanhangad, and R. B. Pachori, "Classification of seizure and seizure-free EEG signals using local binary patterns," Biomedical Signal Processing and Control, vol. 15, pp. 33-40, 2015.

[15] M. Mursalin, Y. Zhang, Y. Chen, and N. V. Chawla, "Automated epileptic seizure detection using improved correlation-based feature selection with random forest classifier," Neurocomputing, vol. 241, pp. 204-214, 2017.

[16] L. Wang, W. Xue, Y. Li et al., "Automatic epileptic seizure detection in EEG signals using multi-domain feature extraction and nonlinear analysis," Entropy, vol. 19, no. 6, 2017.

[17] Q. Yuan, W. Zhou, L. Zhang et al., "Epileptic seizure detection based on imbalanced classification and wavelet packet transform," Seizure, vol. 50, pp. 99-108, 2017.

[18] S. Lahmiri, "Generalized Hurst exponent estimates differentiate EEG signals of healthy and epileptic patients," Physica A: Statistical Mechanics and its Applications, vol. 490, pp. 378-385, 2018.

[19] U. R. Acharya, F. Molinari, S. V. Sree, S. Chattopadhyay, K. H. Ng, and J. S. Suri, "Automated diagnosis of epileptic EEG using entropies," Biomedical Signal Processing and Control, vol. 7, no. 4, pp. 401-408, 2012.

[20] R. Dhiman, J. S. Saini, and Priyanka, "Genetic algorithms tuned expert model for detection of epileptic seizures from EEG signatures," Applied Soft Computing, vol. 19, pp. 8-17, 2014.

[21] A. R. Hassan, S. Siuly, and Y. Zhang, "Epileptic seizure detection in EEG signals using tunable-Q factor wavelet transform and bootstrap aggregating," Computer Methods and Programs in Biomedicine, vol. 137, pp. 247-259, 2016.

[22] S. Patidar and T. Panigrahi, "Detection of epileptic seizure using Kraskov entropy applied on tunable-Q wavelet transform of EEG signals," Biomedical Signal Processing and Control, vol. 34, pp. 74-80, 2017.

[23] J. Jia, B. Goparaju, J. Song, R. Zhang, and M. B. Westover, "Automated identification of epileptic seizures in EEG signals based on phase space representation and statistical features in the CEEMD domain," Biomedical Signal Processing and Control, vol. 38, pp. 148-157, 2017.

[24] T. Gandhi, B. K. Panigrahi, and S. Anand, "A comparative study of wavelet families for EEG signal classification," Neurocomputing, vol. 74, no. 17, pp. 3051-3057, 2011.

[25] I. W. Selesnick, "Wavelet transform with tunable Q-factor," IEEE Transactions on Signal Processing, vol. 59, no. 8, pp. 3560-3575, 2011.

[26] S. Patidar and R. B. Pachori, "Classification of cardiac sound signals using constrained tunable-Q wavelet transform," Expert Systems with Applications, vol. 41, no. 16, pp. 7161-7170, 2014.

[27] S. Patidar, R. B. Pachori, A. Upadhyay, and U. Rajendra Acharya, "An integrated alcoholic index using tunable-Q wavelet transform based features extracted from EEG signals for diagnosis of alcoholism," Applied Soft Computing, vol. 50, pp. 71-78, 2017.

[28] R. G. Andrzejak, K. Lehnertz, F. Mormann, C. Rieke, P. David, and C. E. Elger, "Indications of nonlinear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state," Physical Review E: Statistical, Nonlinear, and Soft Matter Physics, vol. 64, no. 6, 2001.

[29] P. Swami, A. K. Godiyal, J. Santhosh, B. K. Panigrahi, M. Bhatia, and S. Anand, "Robust expert system design for automated detection of epileptic seizures using SVM classifier," in Proceedings of the 2014 3rd IEEE International Conference on Parallel, Distributed and Grid Computing, PDGC 2014, pp. 219-222, India, December 2014.

[30] P. Swami, T. K. Gandhi, B. K. Panigrahi, M. Tripathi, and S. Anand, "A novel robust diagnostic model to detect seizures in electroencephalography," Expert Systems with Applications, vol. 56, pp. 116-130, 2016.

[31] H. Ocak, "Automatic detection of epileptic seizures in EEG using discrete wavelet transform and approximate entropy," Expert Systems with Applications, vol. 36, no. 2, pp. 2027-2036, 2009.

[32] J.-H. Kang, Y. G. Chung, and S.-P. Kim, "An efficient detection of epileptic seizure by differentiation and spectral analysis of electroencephalograms," Computers in Biology and Medicine, vol. 66, pp. 352-356, 2015.

[33] S. Barua, M. U. Ahmed, C. Ahlstrom, S. Begum, and P. Funk, "Automated EEG Artifact Handling with Application in Driver Monitoring," IEEE Journal of Biomedical and Health Informatics, 2017.

[34] G. E. Polychronaki, P. Y. Ktonas, S. Gatzonis et al., "Comparison of fractal dimension estimation algorithms for epileptic seizure onset detection," Journal of Neural Engineering, vol. 7, no. 4, 2010.

[35] J. Gorecka, "Detection of ocular artifacts in EEG data using the Hurst exponent," in Proceedings of the 20th International Conference on Methods and Models in Automation and Robotics, MMAR 2015, pp. 931-933, August 2015.

[36] F. Parastesh Karegar, A. Fallah, and S. Rashidi, "ECG based human authentication with using Generalized Hurst Exponent," in Proceedings of the 25th Iranian Conference on Electrical Engineering, ICEE 2017, pp. 34-38, May 2017.

[37] I. Fister, I. Fister Jr., X.-S. Yang, and J. Brest, "A comprehensive review of firefly algorithms," Swarm and Evolutionary Computation, vol. 13, no. 1, pp. 34-46, 2013.

[38] L. Zhang, K. Mistry, C. P. Lim, and S. C. Neoh, "Feature selection using firefly optimization for classification and regression models," Decision Support Systems, vol. 106, pp. 64-85, 2018.

[39] L. Breiman, "Random forests," Machine Learning, vol. 45, no. 1, pp. 5-32, 2001.

[40] T. Fawcett, "An introduction to ROC analysis," Pattern Recognition Letters, vol. 27, no. 8, pp. 861-874, 2006.

[41] J. P. Kandhasamy and S. Balamurali, "Performance analysis of classifier models to predict diabetes mellitus," Procedia Computer Science, vol. 47, pp. 45-51, 2015.

[42] G. Varoquaux, "Cross-validation failure: Small sample sizes lead to large error bars," NeuroImage, 2017.

[43] O. Faust, U. R. Acharya, L. C. Min, and B. H. C. Sputh, "Automatic identification of epileptic and background eeg signals using frequency domain parameters," International Journal of Neural Systems, vol. 20, no. 2, pp. 159-176, 2010.

[44] U. R. Acharya, C. K. Chua, T.-C. Lim, Dorithy, and J. S. Suri, "Automatic identification of epileptic EEG signals using nonlinear parameters," Journal of Mechanics in Medicine and Biology, vol. 9, no. 4, pp. 539-553, 2009.

[45] R. J. Martis, U. R. Acharya, J. H. Tan et al., "Application of intrinsic time-scale decomposition (ITD) to EEG signals for automated seizure prediction," International Journal of Neural Systems, vol. 23, no. 5, pp. 1557-1565, 2013.

[46] Q. Yuan, W. Zhou, S. Li, and D. Cai, "Epileptic EEG classification based on extreme learning machine and nonlinear features," Epilepsy Research, vol. 96, no. 1-2, pp. 29-38, 2011.

[47] A. R. Naghsh-Nilchi and M. Aghashahi, "Epilepsy seizure detection using eigen-system spectral estimation and Multiple Layer Perceptron neural network," Biomedical Signal Processing and Control, vol. 5, no. 2, pp. 147-157, 2010.

[48] V. Bajaj and R. B. Pachori, "Classification of seizure and nonseizure EEG signals using empirical mode decomposition," IEEE Transactions on Information Technology in Biomedicine, vol. 16, no. 6, pp. 1135-1142, 2012.

Ahmed I. Sharaf, (1,2) Mohamed Abu El-Soud, (1,3) and Ibrahim M. El-Henawy (4)

(1) Department of Computer Science, Faculty of Computers and Information, El-Mansoura University, Egypt

(2) Deanship of Scientific Research, Umm Al-Qura University, Mecca, Saudi Arabia

(3) Department of Computer Science, University College of Umluj, Tabuk University, Saudi Arabia

(4) Department of Computer Science, Faculty of Computers and Information, El-Zagazig University, Egypt

Correspondence should be addressed to Ahmed I. Sharaf; ahmed.sharaf.84@gmail.com

Received 20 April 2018; Revised 18 August 2018; Accepted 27 August 2018; Published 10 September 2018

Academic Editor: A. K. Louis

Caption: FIGURE 1: Block diagram of the combinational hybrid system of epilepsy detection from EEG signal with 4 levels of processing.

Caption: FIGURE 2: Illustration of seizure-free and epileptic seizures of subbands obtained from TQWT with Q = 1, r = 3, J = 3. Figures (a), (b), (c), and (d) represent the first, second, third, and fourth subbands obtained from seizure-free signals. Figures (e), (f), (g), and (h) represent the first, second, third, and fourth subbands obtained from the epileptic seizures.

Caption: FIGURE 3: TQWT decomposition of seizure-free and seizure EEG signals with Q = 1, r = 3, J = 3.

Caption: FIGURE 4: Performance measures of the proposed approach with varied levels of decomposition.

Caption: FIGURE 5: Performance measures of the proposed approach with varied levels of decomposition.

Caption: FIGURE 6: Performance measures of the original and compact features set.

Caption: FIGURE 7: The classification rules obtained from the proposed system to classify the seizure-free (0) and the epileptic seizure (1) signals.

Caption: FIGURE 8: The total accuracy (%) of various methods employed to detect the disorder of the epilepsy.

| Systems used for the comparison | Accuracy (%) |
|---|---|
| Bajaj et al. 2013 | 90 |
| Faust et al. 2010 | 93.11 |
| Acharya et al. 2009 | 95 |
| Martis et al. 2013 | 95.7 |
| Yuan et al. 2011 | 96.5 |
| Naghsh-Nilchi et al. 2010 | 97.5 |
| Pachori et al. 2014 | 97.75 |
| Patidar et al. 2017 | 97.75 |
| Acharya et al. 2012 | 98 |
| Bajaj et al. 2012 | 98.13 |
| Kumar et al. 2015 | 98.3 |
| The proposed method | 99 |

Note: Table made from bar graph.

Title Annotation: Research Article

Author: Sharaf, Ahmed I.; Soud, Mohamed Abu El-; Henawy, Ibrahim M. El-

Publication: International Journal of Biomedical Imaging

Date: Jan 1, 2018
