# Condition Monitoring of Sensors in a NPP Using Optimized PCA.

1. Introduction

In nuclear power plants (NPPs), which are safety-critical systems, safety is of prime importance. At the same time, there is an increasing demand for NPPs to operate more cost-effectively [1]. Advanced technologies for performance diagnosis and control are therefore incorporated into engineering designs, with the aim of guaranteeing safety and improving the economy of the whole NPP simultaneously. With the wide application of digital I&C systems in NPPs, more sensors are also used to obtain the operating information of the plant. On the one hand, the additional sensors support advanced diagnosis and control technologies, which require large numbers of sensors to deliver data about the key indicators of system status and performance; on the other hand, they also increase the probability of sensor faults in NPPs [2]. If an abrupt or incipient failure occurs on a sensor, the sensor's characteristic properties deviate beyond the permitted range. As a result, inaccurate measurements are delivered to related systems, which may cause the plant to deviate from the optimal operating condition and result in process shutdown or even severe accidents [3]. Thus, it is necessary to implement condition monitoring for sensors in NPPs.

Confirmed sensor measurements, in addition to conveying the operating information effectively to where it is required to ensure the safety and economy of the NPP, are also beneficial to the condition-based maintenance (CBM) strategy in NPPs. At present, a preventive maintenance strategy is mainly adopted in sensor calibrations during the regular refueling of a NPP. This not only presents a significant cost in time but also leads to component degradation due to repetitive manipulations compared with the CBM strategy [4, 5].

A traditional approach to sensor condition monitoring is based on hardware redundancy [6]. The major problem with hardware redundancy is cost (including both sensor cost and maintenance cost). In this context, approaches based on analytical redundancy have been proposed in the literature, including artificial neural networks (ANN) [7-9], independent component analysis (ICA) [10, 11], support vector machine (SVM) [12, 13], fuzzy logic [14-16], partial least-squares regression (PLSR) [17], and principal component analysis (PCA) [18-24]. A study conducted by Hines and Seibert concluded that the simplicity of analytical redundancy techniques and the tractability of their uncertainty calculations could favor them for acceptance by regulatory bodies [25]. Hence, PCA is adopted for sensor condition monitoring in this paper due to its simplicity and individual strong points.

In the literature, PCA has been used for sensor condition monitoring in many cases. Rosani and Hines applied PCA to monitor 5 temperature sensors in a research reactor [20]. Water-cooled chiller sensors were analyzed with the PCA technique by Hu [22]. Jamil et al. implemented fault diagnosis on the Pakistan Research Reactor-2 with PCA and Fisher discriminant analysis (FDA) [18]. Magan-Carrion et al. introduced a PCA-based method to carry out fault detection in wireless sensor networks (WSNs) [26]. Liu et al. and Delimargas et al. each used the PCA method to address calibration sensitivity [27, 28].

However, the previous research is mainly focused on the design of the PCA model and the implementation of the PCA method in various industries. Several problems remain in the common PCA method. Firstly, there is usually an implicit assumption that all the data are prepared in advance; in practice, however, data from a real NPP are usually contaminated by random noise or unknown factors. Secondly, since thousands of sensors are used in a NPP, it is impossible to put all the sensors into a single PCA model, and how to separate sensors into various PCA models is not considered in previous research. Finally, false alarms are inevitable in practice due to external and internal influences, and how to reduce the false alarms to guarantee the reliability of the PCA model has received little attention.

The contribution of this paper is as follows: various optimization techniques are proposed to deal with the foregoing problems in the common PCA method. The optimizations cover different modeling procedures of the common PCA method, including data preprocessing, modeling parameter selection, and fault detection and isolation.

The paper is organized as follows. Section 1 describes the necessity of sensor condition monitoring. Based on the previous research, an optimized PCA framework is proposed. Section 2 outlines the common PCA method. Section 3 details the PCA optimization framework. The effectiveness of the optimized PCA method is tested and evaluated with sensor measurements from a real NPP in Section 4. Conclusions and future work are given in the last section.

2. PCA Methodology

The basic concepts and formulas involved in the PCA method will be briefly explained in this section. For detailed mathematical derivation processes, refer to Li, He, or Jose [29-31].

2.1. Basic Theories of PCA. PCA transforms a set of correlated variables into a set of new uncorrelated variables and meanwhile retains most information of the original data. Then, the principal components (PCs) are derived from the uncorrelated variables to detect and isolate process abnormalities in a robust way [32].

The original data matrix $X$ ($n$ samples, $m$ variables) is decomposed as the sum of an estimation matrix $\hat{X}$ and a residual matrix $E$:

$$X = \hat{X} + E = T_k P_k^T + E = t_1 p_1^T + t_2 p_2^T + \cdots + t_k p_k^T + E. \tag{1}$$

$T$ and $P$ are the score and loading matrices of $X$, respectively. The vectors $p_i$ are orthonormal, and the vectors $t_i$ are mutually orthogonal. Each $t_i$ is a linear combination of the columns of $X$:

$$t_i = X p_i. \tag{2}$$

The vector $t_i$ describes how the samples are related to each other, while the vector $p_i$ describes how the variables are related to each other.

The next step is to select the PCs in a PCA model. There are various criteria for determining the number of PCs [33]. The eigenvalue $\lambda_i$ corresponding to each eigenvector describes how much information that PC contains. The cumulative percent variance (CPV) is the percentage of the total variation of $X$ accounted for by the first $k$ selected PCs, and it is adopted here to determine the number of PCs. It is defined as

$$\mathrm{CPV} = \frac{\sum_{i=1}^{k} \lambda_i}{\sum_{i=1}^{m} \lambda_i} \times 100\%. \tag{3}$$

In the foregoing steps, PCA thus divides $X$ into two parts: the model estimation matrix $\hat{X}$ and the residual matrix $E$.
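The decomposition in (1)-(3) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `fit_pca`, the autoscaling step, the 95% CPV threshold, and the synthetic data are all assumptions introduced here.

```python
import numpy as np

def fit_pca(X, cpv_threshold=0.95):
    """Fit a PCA model on X (n samples x m variables); choose k by CPV.

    Assumption: each variable is autoscaled (zero mean, unit variance)
    before the eigendecomposition of the sample covariance matrix.
    """
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Xs = (X - mean) / std
    cov = np.cov(Xs, rowvar=False)            # m x m covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]         # re-sort to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cpv = np.cumsum(eigvals) / eigvals.sum()  # eq. (3) as a fraction
    # Smallest k whose CPV reaches the threshold, capped at m.
    k = min(int(np.searchsorted(cpv, cpv_threshold)) + 1, len(eigvals))
    return {"mean": mean, "std": std, "P": eigvecs[:, :k],
            "eigvals": eigvals[:k], "cpv": cpv, "k": k}

# Hypothetical data: 5 "sensors" driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
W = np.array([[1.0, 0.5, -1.0, 2.0, 0.3],
              [0.2, -1.0, 1.0, 0.5, 1.5]])
X = latent @ W + 0.01 * rng.normal(size=(200, 5))

model = fit_pca(X)
Xs = (X - model["mean"]) / model["std"]
T = Xs @ model["P"]            # scores, eq. (2)
X_hat = T @ model["P"].T       # estimation matrix
E = Xs - X_hat                 # residual matrix, eq. (1)
```

Because the example data have only two latent factors, the CPV criterion is expected to retain at most two PCs here; with real plant data the retained number depends on the chosen threshold.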

2.2. Fault Detection of PCA. Two statistics are commonly used to carry out this task: Hotelling's $T^2$ statistic and the Q statistic. They measure the variation in the matrices $\hat{X}$ and $E$, respectively. If a new testing vector exceeds the effective region in $\hat{X}$ or a significant residual is observed in $E$, a special event, caused either by disturbance changes or by changes in the relationship between variables, can be detected [2].

The Q statistic quantifies the lack of fit between the testing vectors and the model; it indicates the distance by which a testing vector falls from the PC model. The Hotelling $T^2$ statistic measures the variation within the PCA model. They are calculated as

$$Q_i = e_i e_i^T = x (I - P P^T) x^T \le Q_\alpha,$$

$$T_i^2 = t_i \Lambda^{-1} t_i^T = x P \Lambda^{-1} P^T x^T \le T_\alpha^2. \tag{4}$$

$Q_\alpha$ and $T_\alpha^2$ are the confidence limits for the Q and $T^2$ statistics, respectively. For the calculation of $Q_\alpha$ and $T_\alpha^2$, refer to the doctoral thesis by Li [34].
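As a small numerical illustration of (4), the sketch below evaluates both statistics for one testing vector. It assumes that $\Lambda$ is the diagonal matrix of the retained eigenvalues and that the vector is already scaled; the function name `q_and_t2` and the toy loading matrix are hypothetical. The confidence limits are not computed, since the text defers their derivation to [34].

```python
import numpy as np

def q_and_t2(x, P, eigvals):
    """Return (Q, T^2) for one scaled testing vector x of length m.

    P is the m x k loading matrix; eigvals holds the k retained
    eigenvalues (assumed to be the diagonal of Lambda in eq. (4)).
    """
    t = x @ P                       # scores of x in the principal subspace
    e = x - t @ P.T                 # residual, e = x (I - P P^T)
    q = float(e @ e)                # squared length of the residual
    t2 = float(np.sum(t**2 / eigvals))
    return q, t2

# Toy model: the first two coordinate axes act as the loading vectors.
P = np.eye(4)[:, :2]
eigvals = np.array([4.0, 1.0])
x = np.array([2.0, 1.0, 3.0, 0.0])

q, t2 = q_and_t2(x, P, eigvals)
print(q, t2)   # 9.0 2.0
```

In this toy case the third coordinate lies entirely outside the model, so it shows up only in Q, while the first two coordinates contribute only to $T^2$.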

3. Optimized Framework for Sensor Condition Monitoring Based on Common PCA

All the optimizations based on the common PCA method are summarized in Figure 1. First, the original data are preprocessed with statistical analysis and the sliding window method. The preprocessed data are then used to train the PCA model. At the PCA modeling stage, three modeling parameter selection criteria are proposed as alternatives to the common random selection criterion: the variance of sensor measurements, the correlation of sensor measurements, and the type of sensors. The variance criterion comprises two different measures, namely, the standard deviation and the volatility degree of the sensor measurements. Next, a false alarm reducing method is applied to reduce the false alarms of the Q and $T^2$ statistics in the fault detection stage. Finally, in the isolation stage, the detected abnormal behavior is analyzed in the principal and residual spaces simultaneously to locate the faulty sensor more accurately. In this way, the foregoing optimizations yield more credible and reliable monitoring results than the common method.

3.1. Data Preprocessing Stage. Since sensors in a NPP usually work in environments with high temperature, high pressure, high radiation, high humidity, or high corrosion, singular points or noise-like fluctuations are inevitable in the original measurements [35]. If these data are used directly to develop the PCA model (nine coolant outlet temperature sensors are selected as an example), the monitoring results with 1000 testing samples are as shown in Figure 2. It is evident from Figure 2 that the results are unsatisfactory: both the Q and $T^2$ statistics present quite a few alarms under normal operating conditions. Thus, data preprocessing is necessary for data from a real environment.

The abnormal fluctuations in the original data are further classified into singular points and random fluctuations, and they are preprocessed with various methods in this paper.

To eliminate the singular points in the original data, a statistics-based analysis method is applied, which is characterized by its simple structure, low computational cost, and fast speed [36]. These advantages make it well suited to the monitoring of sensors in a NPP, where a large number of sensors are installed. The theory of this statistics-based method is explained as follows.

Most random errors obey a normal distribution under normal operating conditions, so there is only a very small probability that a random error is greater than 3 standard deviations of the sensor measurements [37]. Whether $x_i$ is a singular point in $x$ can be determined by

$$\left| x_i - \bar{x} \right| > 3\sigma \quad (i = 1, 2, \ldots, n), \tag{5}$$

where $\bar{x}$ is the arithmetic average and $\sigma$ is the estimated standard deviation of the $n$ equal-precision measurements of sensor $x$. If $x_i$ satisfies (5), it is treated as a singular point and eliminated from the original data directly.
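The rule in (5) could be applied to a single sensor series roughly as follows. This is a sketch under stated assumptions: the function name `remove_singular_points` and the example series are hypothetical, the sample standard deviation (with $n - 1$ in the denominator) is used as the estimate of $\sigma$, and the rule is applied in a single pass.

```python
import numpy as np

def remove_singular_points(x):
    """Drop measurements farther than 3 sigma from the mean, as in eq. (5).

    Returns the cleaned series and the boolean mask of kept samples.
    """
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    sigma = x.std(ddof=1)                  # sample standard deviation
    keep = np.abs(x - mean) <= 3.0 * sigma
    return x[keep], keep

# Hypothetical series: 20 readings near 10 and one spike at 50.
x = np.array([9.9, 10.1] * 10 + [50.0])
cleaned, keep = remove_singular_points(x)
print(len(x), len(cleaned))   # 21 20
```

Note that the spike itself inflates both the mean and $\sigma$, so in very short series a single outlier may not exceed the 3$\sigma$ bound; the rule works best on windows with many samples.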

The measurements of three feedwater flow sensors are selected as an example to show the effectiveness of the singular point elimination method, and the results are given in Figure 3. Based on the foregoing analysis, singular points are found in the measurements of all three feedwater sensors (1#, 2#, and 3#).

After singular points are eliminated according to (5), random fluctuations in the measurements are further reduced. Median filtering, arithmetic average filtering, weighted recursive filtering, and wavelet analysis are the most commonly used methods for reducing random fluctuations [38]. The choice of method usually depends mainly on the characteristics of the measurements. Considering the type of sensors applied during modeling in this paper, the sliding window average method is used as the denoising method for the sensor measurements from a real NPP [39]. It is a time-domain denoising method that repeatedly takes $m$ contiguous measurements of sensor $x$ and calculates their arithmetic average, where $m$ is the length of the sliding window. The average value in the sliding window is then regarded as the estimated value at moment $k$. That is,

$$x_{k,\mathrm{est}} = \bar{x} = \frac{1}{m} \sum_{i=k-m+1}^{k} x_i. \tag{6}$$

Random fluctuations are filtered based on (6). Then, the data present a smoother changing trend after singular points and random fluctuations are reduced from the original. The measurements in Figure 2 are used again to show the effectiveness of data preprocessing, and the results in this case are shown in Figure 4.
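A sliding window average of this kind can be written as a convolution with a uniform kernel. The sketch below is illustrative only; the function name `sliding_window_average` is hypothetical, and the choice to return estimates only for moments with a full window of $m$ past samples is an assumption.

```python
import numpy as np

def sliding_window_average(x, m):
    """Average of the last m samples at each moment k, as in eq. (6).

    Output element j is the mean of x[j : j + m], i.e. the estimate for
    moment k = j + m - 1. Moments without a full window are omitted.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(m) / m
    return np.convolve(x, kernel, mode="valid")

smoothed = sliding_window_average([1, 2, 3, 4, 5], m=3)
print(smoothed)   # [2. 3. 4.]
```

A longer window suppresses more noise but also delays and flattens genuine changes in the signal, so the window length is a trade-off that depends on the sensor dynamics.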

Compared with Figure 2, the false alarms of the Q and $T^2$ statistics are greatly reduced. It can be concluded that data preprocessing is effective in improving the accuracy of the PCA model and that preprocessing is necessary for data from a real operating environment.

3.2. Modeling Parameter Selection Stage. After the original data are preprocessed, the next step is to develop the PCA model with the preprocessed measurements. Obviously, it is unrealistic and unreasonable to put all sensors in a NPP into a single PCA model; thus, a distributed framework is proposed in this paper, that is, multiple PCA models running in parallel to implement condition monitoring for all the monitored sensors in a NPP. Hence, how to best group various sensors into various PCA models to get optimal performance is very important [35]. In this context, the following criteria are proposed, which are compared with random modeling parameter selection criterion.

(1) Variance. Two different criteria are included in variance, which are standard deviation and volatility degree of the sensor measurements. They are described as follows.

(a) Standard Deviation. It refers to the standard deviation of the sensor measurements in the usual statistical sense. Since similar standard deviations among the sensor measurements in a PCA model may be beneficial to the detection of small failures, it is adopted as a criterion in this paper and defined as

$$S = \sqrt{\frac{\sum_{i=1}^{n} \left( x(i) - \bar{x} \right)^2}{n - 1}}, \tag{7}$$

where $\bar{x}$ is the arithmetic average of the $n$ measurements $x(i)$.

(b) Volatility Degree. It refers to the volatility degree of the sensor measurements, which differs slightly from the "standard deviation" defined in statistical terminology. The volatility degree of a sensor measurement is described as

$$V = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{x(i)}{\bar{x}} - \frac{\bar{x}}{\bar{x}} \right)^2, \tag{8}$$

where $\bar{x}$ is the arithmetic average of the measurements, so that each deviation is expressed relative to the mean.

Compared with the criterion of standard deviation, the criterion of volatility degree may be more reasonable. Since the sensor measurements cover different orders of magnitude, the standard deviation may be unable to describe the variation in the measurements accurately. Two vectors s1 and s2 are taken as an example for explanation. Suppose that

s1 = [0.5 0.2 0.3 0.1 0.4],

s2 = [5 2 3 1 4]. (9)

Obviously, we can see that the changing trends, namely, the volatility degrees of s1 and s2, are equal. Then, the S and V values of s1 and s2 can be calculated as follows:

[mathematical expression not reproducible]. (10)

Based on (10), the foregoing inference is proved to be right; that is, the same volatility degree of s1 and s2 is obtained; however, the standard deviation of s1 and s2 is different. Thus, the volatility degree-based criterion is proposed as the supplement of the standard deviation-based criterion in this paper. This way, sensor measurements with similar changing trends (namely, with similar volatility degree) rather than with similar standard deviation can be grouped together to train a PCA model. Then, the PCA model should be more sensitive to glitches in the monitored sensors. And the fault detection sensitivity with these two different criteria will be evaluated in the simulation section.
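The comparison of s1 and s2 can be checked numerically. The sketch below evaluates (7) and (8) for the two vectors in (9); it assumes that the barred term in both formulas is the arithmetic mean of the series, and the function names are hypothetical.

```python
import numpy as np

def standard_deviation(x):
    """Sample standard deviation, eq. (7)."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.sum((x - x.mean())**2) / (len(x) - 1)))

def volatility_degree(x):
    """Mean squared deviation relative to the mean, eq. (8)."""
    x = np.asarray(x, dtype=float)
    return float(np.sum((x / x.mean() - 1.0)**2) / len(x))

s1 = [0.5, 0.2, 0.3, 0.1, 0.4]
s2 = [5, 2, 3, 1, 4]

for name, s in (("s1", s1), ("s2", s2)):
    print(name, round(standard_deviation(s), 4), round(volatility_degree(s), 4))
# s1 0.1581 0.2222
# s2 1.5811 0.2222
```

Under this reading, the standard deviation scales with the magnitude of the signal (a factor of 10 between s1 and s2), while the volatility degree is unchanged, which is the behavior the text describes.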

(2) Correlation. It refers to the correlation coefficient between sensors x and y, which is calculated as in (11). A higher R value usually means a more significant linear correlation between x and y. Since PCA is a linear analysis method, it is naturally advantageous to group linearly dependent sensors into a single set to develop the PCA model. Thus, this criterion is proposed.

[mathematical expression not reproducible]. (11)

Then, the sensor measurements with higher correlation coefficients are grouped into the same PCA model. That is, sensors in each PCA model present a higher linear correlation than those in a randomly grouped PCA model.
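One way to realize correlation-based grouping is to rank all sensors by their correlation coefficient with a reference sensor and keep the top few. The sketch below assumes that (11) is the usual Pearson correlation coefficient, which is what `np.corrcoef` computes; the function name `most_correlated` and the synthetic data are hypothetical.

```python
import numpy as np

def most_correlated(X, ref, count):
    """Indices of the `count` columns of X most correlated with column `ref`.

    X is n samples x m sensors. The signed coefficient is used, so
    strongly negative correlations rank last.
    """
    R = np.corrcoef(X, rowvar=False)   # m x m correlation matrix
    r = R[ref].copy()
    r[ref] = -np.inf                   # exclude the reference sensor itself
    return np.argsort(r)[::-1][:count]

# Hypothetical data: columns 1 and 3 follow column 0, column 2 does not.
rng = np.random.default_rng(1)
a = rng.normal(size=500)
X = np.column_stack([
    a,
    a + 0.01 * rng.normal(size=500),   # almost identical to column 0
    rng.normal(size=500),              # unrelated
    a + 0.5 * rng.normal(size=500),    # noisier copy of column 0
])

selected = most_correlated(X, ref=0, count=2)
print(selected)   # [1 3]
```

Whether to rank by the signed coefficient or by its absolute value is a design choice; a strongly negative linear relation is also informative for PCA, so the absolute value may be preferable in some settings.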

(3) Type. It refers to the types of sensors that are used to measure various parameters in a NPP. Different parameters are usually measured with different types of sensors, and different types of sensors usually have different measurement precisions, work in different environments, suffer from different external disturbances, and so on. Considering all these factors, a type-based modeling parameter selection criterion is proposed, under which sensors of the same type are grouped together to train a PCA model. As a result, the influence of the foregoing factors can be minimized.

All the proposed criteria are tested and evaluated in Section 4 to get an optimal modeling parameter selection criterion.

3.3. Fault Detection and Isolation Stage. Based on data preprocessing and modeling parameter selection, a false alarm reducing method is further applied to improve the accuracy and reliability of the PCA model in the fault detection stage. Meanwhile, the detected abnormal behavior is analyzed in principal and residual space simultaneously in order to locate the faulty sensor more accurately in the fault isolation stage.

The false alarm reducing method defines another confidence limit to further reduce the false alarms of the $T^2$ and Q statistics. If $Q_\alpha$ or $T_\alpha^2$ is called the first confidence limit, this new confidence limit is called the second confidence limit for the $T^2$ and Q statistics.

Suppose that the false alarm probability for the $T^2$ or Q statistic is $\alpha$, which is usually set between 0 and 0.05 according to experience in process industries [40]. Taking $n$ as the length of a basic observation window, the allowable maximum $m$, namely, the second confidence limit, can be derived from

$$F(m; \alpha, n) = \sum_{j=0}^{m} P(j; \alpha, n) = \sum_{j=0}^{m} C_n^j \alpha^j (1 - \alpha)^{n-j} < \beta, \tag{12}$$

where $\beta$ is also an empirical value determined by the model precision. Usually, it is set between 0.98 and 1 according to experience in process industries [40]. If the number of alarms of the $T^2$ or Q statistic exceeds $m$ in the observation window before $x_i$, then $x_i$ is declared a true faulty state.
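The second confidence limit can be sketched as a search over the binomial cumulative distribution in (12). The code below takes "allowable maximum $m$" to mean the largest $m$ for which $F(m; \alpha, n) < \beta$ still holds; that reading, the function names, and the example parameter values are assumptions made for illustration.

```python
from math import comb

def second_confidence_limit(alpha, n, beta):
    """Largest m with binomial CDF F(m; alpha, n) < beta, as in eq. (12).

    Returns -1 if even F(0; alpha, n) is not below beta.
    """
    cdf = 0.0
    limit = -1
    for j in range(n + 1):
        cdf += comb(n, j) * alpha**j * (1 - alpha)**(n - j)
        if cdf >= beta:
            break
        limit = j
    return limit

def is_true_fault(alarm_flags, m):
    """True if the alarms in the observation window exceed the limit m."""
    return sum(alarm_flags) > m

# Hypothetical setting: window of 10 samples, alpha = 0.1, beta = 0.99.
m = second_confidence_limit(alpha=0.1, n=10, beta=0.99)
print(m)                               # 3
print(is_true_fault([1, 0, 0, 1, 0, 0, 0, 0, 0, 0], m))   # False
print(is_true_fault([1, 1, 0, 1, 1, 0, 1, 0, 0, 0], m))   # True
```

The effect is that isolated violations of the first limit are tolerated, and a fault is declared only when alarms cluster within the window more densely than chance alone would plausibly produce.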

After the Q or $T^2$ statistic exceeds the second confidence limit, an abnormality is detected. The abnormal behavior is then analyzed in the principal and residual spaces simultaneously to locate the faulty sensor more accurately in the fault isolation stage. Since the $T^2$ and Q statistics represent the total variation in the principal and residual spaces, respectively, the contributions of sensors to both statistics are used together to identify the faulty sensor [30].

Suppose that a testing vector $x$ is expressed as $x = [x_1, x_2, \ldots, x_m]$, where $m$ is the number of sensors in $x$. The contribution of sensor $x_i$ to the total variation in the residual subspace (represented by the Q statistic) is defined as

[mathematical expression not reproducible]. (13)

The contribution of sensor $x_i$ to the total variation in the principal subspace (represented by the $T^2$ statistic) can be calculated in the following steps.

(1) Calculate the contribution of $x_i$ to the score vector $t_j$:

[mathematical expression not reproducible], (14)

where $p_{j,i}$ is the $i$th element of vector $p_j$.

(2) Calculate the contribution of $x_i$ to the $T^2$ statistic:

[mathematical expression not reproducible]. (15)

When a NPP is operating under normal conditions, the $T^2$ and Q statistics should be within the confidence limits, and the contributions of each sensor to the $T^2$ and Q statistics should theoretically be almost equal. If a fault occurs on the monitored sensors, the $T^2$ and/or Q statistics will exceed their confidence limits, and then [mathematical expression not reproducible] and [mathematical expression not reproducible] can be used directly to locate the faulty sensor. Furthermore, if the fault is just a small glitch, such as a small drift that may not be detected by the $T^2$ and Q statistics, these two fault isolation indexes are also beneficial in both the detection and the isolation of this small fault: although the $T^2$ and Q statistics may be incapable of detecting small drifts on sensors, an evident increasing trend can still be seen in [mathematical expression not reproducible] and/or [mathematical expression not reproducible] for the drifting sensor.
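To make the idea of a per-sensor contribution concrete, the sketch below uses one common definition from the PCA monitoring literature: the share of each sensor's squared residual in the Q statistic. Since equations (13)-(15) are not reproduced in the text, this definition is an assumption and may differ from the one used in the paper; the function name `q_contributions` and the toy loading matrix are hypothetical.

```python
import numpy as np

def q_contributions(x, P):
    """Fraction of the Q statistic attributable to each sensor.

    Assumed definition: e_i^2 / sum(e^2), where e = x (I - P P^T) is the
    residual of the scaled testing vector x. Returns zeros if Q is zero.
    """
    e = x - (x @ P) @ P.T
    squared = e**2
    total = squared.sum()
    if total == 0.0:
        return np.zeros_like(squared)
    return squared / total

# Toy model: the first two coordinate axes act as the loading vectors.
P = np.eye(4)[:, :2]
x = np.array([2.0, 1.0, 3.0, 4.0])

print(q_contributions(x, P))   # [0.   0.   0.36 0.64]
```

With this normalization the contributions sum to 1, so they can be read as percentages and compared across sample points, which is how the contribution plots are interpreted in the simulation section.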

Small drifts on sensors may not result in severe accidents, but if the drifting sensor participates in important control processes in the NPP, the drift may cause operation to deviate from the optimal condition, with a potential decline in plant economy as a consequence. Even if small drifts appear on sensors that do not participate in important control processes and serve only monitoring purposes, these two fault isolation indexes can still contribute to the CBM strategy in a NPP. Since a higher index value usually indicates unknown degradation of the sensor, sensors can be calibrated, maintained, or repaired as required, and excessive calibration and maintenance manipulations can be avoided.

4. Simulation Tests and Results

To test the functionality of the optimized PCA method, sensor measurements acquired from a real NPP under normal operating conditions at full power are used to carry out the simulations. Since the database of a NPP includes a large number of sensors, the sensors are numbered with Arabic numerals so that the simulation results can be presented more conveniently. To verify the performance of PCA models with various modeling parameter selection criteria, five PCA models are built based on the proposed criteria, as described in the following. Meanwhile, to verify the fault detection and isolation performance of the optimized PCA model, failures of different degrees are imposed sequentially on the measurements of a coolant outlet temperature sensor (marked as 1# sensor in the database). Failures are introduced on this sensor because 1# sensor is included in all five PCA models.

The five proposed PCA models are determined as follows.

(1) PCA Model with Modeling Parameter Selection Criterion of Type. Since 1# sensor must be contained in all five PCA models, sensors of the same type as 1# sensor are selected to train this PCA model. Based on the modeling parameter selection criterion of type, the following sensors in the database are selected: [1 2 3 4 5 6 7 8 9]. The Arabic numerals represent the positions of the selected sensors in the database.

(2) PCA Model with Modeling Parameter Selection Criterion of Standard Deviation. Similarly, 1# sensor is also included in this PCA model. Firstly, the standard deviation of all sensors in the database is calculated based on (7). Then, based on the modeling parameter selection criterion of standard deviation, sensors in the database with the most similar standard deviation to 1# sensor are selected out to train this PCA model. This way, the PCA model with modeling parameter selection criterion of standard deviation is determined. And the positions of the selected sensors in this PCA model are [1 21 13 43 130 132 146 24 45], which are ordered by the similarity of standard deviation to 1# sensor from large to small. Likewise, the Arabic numerals represent the positions of selected sensors in the database.

(3) PCA Model with Modeling Parameter Selection Criterion of Volatility Degree. In the same way, the volatility degree of sensors in the database is calculated firstly based on (8), and then sensors with the most similar volatility degree to 1# sensor are selected as the modeling parameters in this PCA model. Thus, the PCA model with modeling parameter criterion of volatility degree is determined. The selected sensors in this PCA model are with the following positions in the database: [1 47 55 61 80 130 149 102 112], which are ordered by the similarity of volatility degree to 1# sensor from large to small.

(4) PCA Model with Modeling Parameter Selection Criterion of Correlation Coefficients. In order to determine this PCA model, correlation coefficients between 1# sensor and all the other sensors in the database are calculated first based on (11). And then the first eight sensors with the largest correlation coefficients to 1# sensor are selected as the modeling parameters of this PCA model. The positions of the selected sensors in the database are [1 47 55 61 80 130 149 102 112], which are ordered by the correlation coefficients to 1# sensor from large to small. This way, the PCA model with modeling parameter selection criterion of correlation is determined.

(5) PCA Model with Modeling Parameter Selection Criterion of Random. For comparison, this PCA model is developed in this paper. The selected modeling parameters in the model are [1 47 55 61 80 130 149 102 112], which cover different types and different orders of magnitude on standard deviation, volatility degree, and correlation coefficients of sensors.

The 1# sensor is thus common to all five PCA models, and each PCA model contains nine sensors. Failures can therefore be imposed on the shared 1# sensor measurements for every PCA model, and the model performances under different modeling parameter selection criteria can be evaluated on a comparable basis.

4.1. Simulations with Normal Measurements. 1000 original samples are used to train the five PCA models, and another 1000 original samples are selected as the testing data to carry out the simulation tests. The results of the $T^2$ and Q statistics in the five PCA models are shown in Figures 5 and 6, respectively. The red dotted lines in the figures are the confidence limits for the $T^2$ and Q statistics. It can be seen that the Q statistics present false alarms in all five PCA models under normal operating conditions. The $T^2$ statistics perform relatively better: false alarms occur only in the PCA models with the random and standard deviation parameter selection criteria.

The original samples are then preprocessed with the methods proposed in this paper, and the preprocessed data are used to train the five PCA models. The resulting $T^2$ and Q statistics in the five PCA models are shown in Figures 7 and 8. Since singular points and random fluctuations in the original samples are eliminated by the statistical and sliding window methods, the false alarms of the $T^2$ and Q statistics are reduced to some extent.

On the basis of the data preprocessing, the second confidence limit for the $T^2$ and Q statistics is then applied to further reduce their false alarms. With the application of the second confidence limit, the detailed false alarm probabilities of the $T^2$ and Q statistics in the five PCA models are summarized in Table 1. The false alarms of the $T^2$ and Q statistics in all five PCA models are reduced to lower levels by the false alarm reducing method. Thus, both the data preprocessing applied to the original data and the false alarm reducing method applied to the $T^2$ and Q statistics contribute to reducing false alarms under normal operating conditions, and the model performance is improved accordingly.

From Table 1, it can be seen that the PCA model with the correlation parameter selection criterion shows the best performance on sensor fault detection among the five PCA models. The false alarm rates of the $T^2$ and Q statistics are reduced to 0 and 0.2%, respectively, in this PCA model, which are lower than those in the other four PCA models.

Due to the influence of model precision and external environments, the contributions of sensors to the $T^2$ and Q statistics in a PCA model are not equal under normal operating conditions, as shown in Figure 9. Two samples (the 600th and the 1000th) are therefore selected from the 1000 samples as a contrast to show the condition monitoring results. The contributions of sensors to the $T^2$ and Q statistics in the five PCA models are calculated at the 600th and 1000th sample points and illustrated in Figures 9(a), 9(b), and 9(c). The $T^2$ statistics in the PCA model with the random parameter selection criterion in Figure 9(a) are taken as an example for explanation. At the 600th sample point, the contribution of 1# sensor to the $T^2$ statistic is about 14%, while the contribution of 130# sensor is about 7%. There is clearly a large contribution difference between these two sensors, which in theory should indicate unknown failures in the monitored sensors. However, at the 1000th sample point, the contribution of 1# sensor to the $T^2$ statistic in this PCA model is still around 14%, and that of 130# sensor is still around 7%. Similar results can be seen for the other sensors in this PCA model. That is, the contributions of the sensors in a PCA model to the $T^2$ or Q statistic are not equal at a single sample point, but the contribution of each sensor remains almost unchanged across sample points. It can therefore be inferred that no failures occur in the monitored sensors; the contribution differences among sensors may result from unknown uncertainty factors in the PCA model, not from sensor failures. Similar results are obtained in the other four PCA models.

The contribution figures also show that the PCA model with the correlation parameter selection criterion performs better on fault isolation under normal operating conditions. The contributions of sensors to the $T^2$ statistic are almost equal, which accords best with the theoretical analysis. The contributions of sensors to the Q statistic in this PCA model also agree better with the theoretical analysis than those in the other four PCA models. On the other hand, Figure 9 shows that the PCA model with the random parameter selection criterion presents the worst performance in this respect: for both the $T^2$ and Q statistics, the contributions of sensors differ considerably in this case.

4.2. Simulations with Abnormal Measurements. To verify the fault detection and isolation ability of the proposed PCA model, two artificial drifts (ramps) are imposed on the coolant outlet temperature sensor (namely, 1# sensor in the database) at the 400th sample point. The first drift simulates a common problem that affects process sensors and may result from aging. It is a ramp that grows to 0.45°C in the 1# sensor measurements. This small drift corresponds to a maximum change of 0.15% in the measurements, which is imperceptible in the time profile. The second drift is larger and represents a common issue that may result from mechanical failures. It is also a ramp, growing to 3.5°C in the 1# sensor measurements, which is equivalent to a maximum change of 1.15% and can be seen in the time profile.
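An artificial ramp drift of this kind could be imposed on a measurement series as sketched below. The function name `add_ramp_drift`, the linear shape that reaches its final value at the last sample, the zero-based start index 399 for "the 400th sample point", and the constant 300°C baseline are all assumptions chosen for illustration (300°C is picked only because 0.45°C is then 0.15% of the reading).

```python
import numpy as np

def add_ramp_drift(x, start, final_offset):
    """Add a linear ramp that starts at index `start` with zero offset
    and reaches `final_offset` at the last sample."""
    x = np.asarray(x, dtype=float).copy()
    n_drift = len(x) - start
    x[start:] += np.linspace(0.0, final_offset, n_drift)
    return x

# Hypothetical 1000-sample temperature series at a constant 300 degrees C.
baseline = np.full(1000, 300.0)
small_drift = add_ramp_drift(baseline, start=399, final_offset=0.45)
large_drift = add_ramp_drift(baseline, start=399, final_offset=3.5)

print(small_drift[398], small_drift[-1])   # 300.0 300.45
```

Feeding such drifted series through a trained model and tracking the statistics and contributions over time reproduces the kind of test described in this section, although the results depend on the actual plant data.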

As shown in Figure 10, the $T^2$ statistics in all five PCA models cannot detect the small drift on 1# sensor. In Figure 11, increasing trends of the Q statistics can be seen in the last period of the tests; however, the trends are not significant and show high volatility, so the results are inconclusive. The contributions of sensors, illustrated in Figure 12, are therefore needed to help detect the small failure on 1# sensor. For explanation, the PCA model with random parameter selection in Figure 12(a) is taken as an example.

In Figure 12(a), the contribution of 1# sensor to the Q statistics is about 22% at the 600th sample point and almost reaches 30% at the 1000th sample point. This marked increase differs from the behavior under normal conditions, where the contributions remain unchanged between the 600th and 1000th sample points. In contrast, the contribution of 80# sensor to the Q statistics is about 20% at the 600th sample point and falls to 18% at the 1000th sample point. The other sensors in this PCA model (47#, 55#, 61#, 130#, 149#, 102#, and 112#) behave similarly: as the drift develops on 1# sensor, their contributions remain almost unchanged or decrease slightly. However, no evident change in the contributions to the [T.sup.2] statistics appears on any sensor in this PCA model between the 600th and 1000th sample points. This is consistent with Figure 10, where the [T.sup.2] statistics show no obvious change during the test either.
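A per-sensor contribution to the Q statistics can be sketched as below. The passage does not restate how contributions are computed, so this example assumes a commonly used definition: the squared residual of each sensor divided by the total squared residual. The data are synthetic (five correlated sensors, one retained component), and the names `fit_pca` and `q_contributions` are hypothetical.

```python
import numpy as np

def fit_pca(X_train, n_components):
    """Fit PCA on standardized training data; return mean, std, and loadings P."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)
    Z = (X_train - mean) / std
    # Rows of Vt are the principal directions of the standardized data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, std, Vt[:n_components].T

def q_contributions(x, mean, std, P):
    """Fractional contribution of each sensor to the Q statistic of sample x."""
    z = (x - mean) / std
    residual = z - P @ (P.T @ z)  # part of z not explained by the PCA model
    squared = residual ** 2       # Q is the sum of these terms
    return squared / squared.sum()

# Synthetic stand-in data: five strongly correlated sensors (not plant data).
rng = np.random.default_rng(0)
common = rng.normal(size=(500, 1))
X_train = 300.0 + common + 0.05 * rng.normal(size=(500, 5))
mean, std, P = fit_pca(X_train, n_components=1)

# A new sample with a hypothetical 0.5-unit offset on sensor index 0.
x_new = 300.0 + 0.3 + 0.05 * rng.normal(size=5)
x_new[0] += 0.5
contrib = q_contributions(x_new, mean, std, P)
suspect = int(np.argmax(contrib))  # sensor with the largest share of Q
```

Tracking `contrib` at two sample points, as the text does for the 600th and 1000th points, shows whether one sensor's share grows while the others stay flat or shrink.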

Based on the Q statistics and the contributions to them, it can be inferred that 1# sensor behaves abnormally. That is, the PCA models are capable of detecting and isolating a sensor with this level of drift. Figure 11 also shows that the PCA model with correlation parameter selection is more sensitive in fault detection than the other four PCA models, since it detects the small drift on 1# sensor more quickly. The PCA models with the standard deviation and volatility degree criteria rank second and third, and the PCA model with random parameter selection shows the worst fault detection performance in this case.

Figure 12 also indicates that the PCA model with correlation parameter selection performs better in isolating the small fault. The contributions of 1# sensor to the Q statistics at the 1000th sample point in the five PCA models are taken as an example to demonstrate this conclusion.

Since the failure imposed on 1# sensor is a ramp function, it develops over time, and the contribution of 1# sensor to the Q statistics grows with it. At the 1000th sample point, the contributions of 1# sensor to the Q statistics have reached about 30%, 30%, 35%, 40%, and 60% in the PCA models with the random, standard deviation, volatility degree, type, and correlation criteria, respectively. The contribution in the PCA model with the correlation criterion is significantly larger than in the other four PCA models, which makes the drift on 1# sensor much easier to isolate among the monitored sensors. Thus, compared with the other four PCA models, the PCA model with the correlation criterion shows the best performance in isolating sensor faults with small drifts.

The condition monitoring results with the larger drift on 1# sensor are shown in Figures 13 and 14. Both the [T.sup.2] and the Q statistics detect the failure during the test in all five PCA models. That is, the PCA method is sufficiently sensitive to failures of this magnitude on the monitored sensors.
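The different behavior of the two statistics for small and large offsets can be reproduced on a toy model. The sketch below is an illustration under stated assumptions, not the authors' procedure: the data are synthetic, one principal component is retained, the control limits are taken as the empirical 99th percentile of the training statistics (the paper's own limits may be derived differently), and the test sample sits at the training mean with an offset on sensor index 0.

```python
import numpy as np

def fit_pca(X, k):
    """Return mean, std, loadings P (first k components), and their eigenvalues."""
    mean, std = X.mean(axis=0), X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    eigvals = s ** 2 / (len(X) - 1)  # variances of the principal component scores
    return mean, std, Vt[:k].T, eigvals[:k]

def t2_and_q(X, mean, std, P, eigvals):
    """Hotelling T^2 (principal space) and Q (residual space) for each row of X."""
    Z = (np.atleast_2d(X) - mean) / std
    scores = Z @ P
    t2 = np.sum(scores ** 2 / eigvals, axis=1)
    residual = Z - scores @ P.T
    q = np.sum(residual ** 2, axis=1)
    return t2, q

# Synthetic stand-in data: five sensors sharing one common variation (not plant data).
rng = np.random.default_rng(1)
common = 0.1 * rng.normal(size=(2000, 1))
X_train = 300.0 + common + 0.05 * rng.normal(size=(2000, 5))
model = fit_pca(X_train, k=1)

# Empirical control limits from fault-free training data (an assumption).
t2_train, q_train = t2_and_q(X_train, *model)
t2_lim, q_lim = np.percentile(t2_train, 99), np.percentile(q_train, 99)

def alarms(offset):
    """Apply `offset` to sensor 0 of the mean sample; return (T2 alarm, Q alarm)."""
    x = model[0].copy()
    x[0] += offset
    t2, q = t2_and_q(x, *model)
    return bool(t2[0] > t2_lim), bool(q[0] > q_lim)

small_alarm = alarms(0.45)  # small offset
large_alarm = alarms(3.5)   # large offset
```

In this toy setup a single-sensor offset mostly falls outside the retained component, so it shows up in Q well before it moves the [T.sup.2] statistics; only a large offset shifts the score enough to exceed the [T.sup.2] limit.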

In this case, the contributions of the sensors to the [T.sup.2] and Q statistics in the five PCA models are shown in Figure 15. In each PCA model, the contribution of 1# sensor to the [T.sup.2] or Q statistics at the 1000th sample point is significantly larger than at the 600th sample point, which corresponds to the theoretical analysis. Because of the larger drift on 1# sensor, its contributions are also significantly greater than those in Figure 12. As a result, the failure on 1# sensor is located from the distribution of the sensor contributions.

Figure 15 also shows that the PCA model with random parameter selection performs worst among the five PCA models. It is the only model in which the contribution of 1# sensor to the Q statistics is below 50% at both the 600th and the 1000th sample points. In the other four PCA models, the contributions are well above 50% at both sample points, which indicates more effective fault detection and isolation during the test. Thus, it can be concluded that the PCA models with the standard deviation, volatility degree, type, and correlation criteria all perform well in isolating sensor faults with larger failures.

Based on the foregoing simulations, the following conclusions can be drawn:

(1) The proposed data preprocessing and false alarm reducing methods prove effective in reducing the false alarms of the [T.sup.2] and Q statistics in a PCA model, which amounts to an improvement in model performance.

(2) Simulations under normal and abnormal conditions show that the PCA model with the correlation criterion for modeling parameter selection performs better in both fault detection and fault isolation than the other four PCA models.
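The false alarm probability referred to in conclusion (1) can be read as the fraction of fault-free samples whose statistic exceeds its control limit. A minimal sketch under that assumption follows; the function name `false_alarm_rate`, the sample values, and the limit of 2.0 are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def false_alarm_rate(stat, limit):
    """Fraction of fault-free samples whose statistic exceeds the control limit."""
    stat = np.asarray(stat, dtype=float)
    return float(np.mean(stat > limit))

# Hypothetical Q values recorded under normal operation (illustrative numbers only).
q_normal = np.array([0.2, 0.5, 1.4, 0.3, 0.1, 0.9, 2.3, 0.4, 0.6, 0.2])
rate = false_alarm_rate(q_normal, limit=2.0)  # one of ten samples exceeds the limit
```

Evaluating the same function before and after preprocessing, on the same fault-free record, gives a like-for-like comparison of the kind summarized in Table 1.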

5. Conclusions and Perspectives

An optimized PCA framework for sensor condition monitoring is proposed in this paper. The proposed optimizations address several modeling procedures of the common PCA method: the data preprocessing stage, the modeling parameter selection stage, and the fault detection and isolation stage. In the data preprocessing stage, singular points and random fluctuations in the original data are eliminated with various techniques. In the modeling parameter selection stage, several parameter selection criteria are proposed to obtain optimal performance from the PCA model. In the fault detection and isolation stage, a statistics-based method is applied, on top of the data preprocessing, to further reduce the false alarms of the [T.sup.2] and Q statistics. In addition, a confirmed faulty state is analyzed in the principal and residual spaces simultaneously to locate the faulty sensor more precisely.

Data from a real NPP are used to test the optimized PCA method in this paper. According to the simulation results under normal conditions, the false alarms of the [T.sup.2] and Q statistics are greatly reduced by the data preprocessing and false alarm reducing methods. Based on the simulations with faulty data, the optimized PCA method proves effective in sensor fault detection and isolation for both small and major failures. It can also be concluded that the PCA model with the correlation criterion for parameter selection performs better under both normal and abnormal operating conditions.

Although valuable improvements have been made in this paper, much work remains. How to further process the remaining false alarms and how best to reconstruct the faulty data will be analyzed on the basis of the work done in this paper.

https://doi.org/10.1155/2018/7689305

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Acknowledgments

The authors acknowledge the financial support of the national project of "Research on Online Monitoring and Operation Support Techniques in a Nuclear Power Plant" to the present research.

References

[1] J. P. Ma and J. Jiang, "Applications of fault detection and diagnosis methods in nuclear power plants: a review," Progress in Nuclear Energy, vol. 53, no. 3, pp. 255-266, 2011.

[2] L. Jiang, Sensor fault detection and isolation using system dynamic identification techniques [Ph.D. thesis], The University of Michigan, 2011.

[3] G. Betta and A. Pietrosanto, "Instrument fault detection and isolation: State of the art and new research trends," IEEE Transactions on Instrumentation and Measurement, vol. 49, no. 1, pp. 100-107, 2000.

[4] R. Dorr, F. Kratz, J. Ragot, F. Loisy, and J.-L. Germain, "Detection, isolation, and identification of sensor faults in nuclear power plants," IEEE Transactions on Control Systems Technology, vol. 5, no. 1, pp. 42-60, 1997.

[5] J. W. Hines and D. R. Garvey, "Development and application of fault detectability performance metrics for instrument calibration verification and anomaly detection," Journal of Pattern Recognition Research, vol. 1, no. 1, pp. 2-15, 2006.

[6] H. M. Hashemian, "On-line monitoring applications in nuclear power plants," Progress in Nuclear Energy, vol. 53, no. 2, pp. 167-181, 2011.

[7] A. Messai, A. Mellit, I. Abdellani, and A. Massi Pavan, "On-line fault detection of a fuel rod temperature measurement sensor in a nuclear reactor core using ANNs," Progress in Nuclear Energy, vol. 79, pp. 8-21, 2015.

[8] P. Deshpande, N. Warke, P. Khandare, and V. Deshpande, "Thermal power plant analysis using artificial neural network," in Proceedings of the 3rd Nirma University International Conference on Engineering, NUiCONE 2012, India, December 2012.

[9] P. F. Fantoni, "Experiences and applications of PEANO for online monitoring in power plants," Progress in Nuclear Energy, vol. 46, no. 3-4, pp. 206-225, 2005.

[10] A. Ajami and M. Daneshvar, "Data driven approach for fault detection and diagnosis of turbine in thermal power plant using Independent Component Analysis (ICA)," International Journal of Electrical Power & Energy Systems, vol. 43, no. 1, pp. 728-735, 2012.

[11] J. Ding, W. Hines, and B. Rasmussen, Independent component analysis for redundant sensor validation, Nuclear Engineering Department. The University of Tennessee, 2003.

[12] K. Salahshoor, M. Kordestani, and M. S. Khoshro, "Fault detection and diagnosis of an industrial steam turbine using fusion of SVM (support vector machine) and ANFIS (adaptive neuro-fuzzy inference system) classifiers," Energy, vol. 35, no. 12, pp. 5472-5482, 2010.

[13] K.-Y. Chen, L.-S. Chen, M.-C. Chen, and C.-L. Lee, "Using SVM based method for equipment fault detection in a thermal power plant," Computers in Industry, vol. 62, no. 1, pp. 42-50, 2011.

[14] H. Eliasi, H. Davilu, and M. B. Menhaj, "Adaptive fuzzy model based predictive control of nuclear steam generators," Nuclear Engineering and Design, vol. 237, no. 6, pp. 668-676, 2007.

[15] L. Su and Z. Zhao, "Performance diagnosis of power plant boiler based on fuzzy comprehensive evaluation," in Proceedings of the 2nd Annual Conference on Electrical and Control Engineering, ICECE 2011, pp. 3076-3079, China, September 2011.

[16] Y. K. Kang, H. Kim, G. Heo, and S. Y. Song, "Diagnosis of feed-water heater performance degradation using fuzzy inference system," Expert Systems with Applications, vol. 69, pp. 239-246, 2017.

[17] J. Chen, H. Li, D. Sheng, and W. Li, "A hybrid data-driven modeling method on sensor condition monitoring and fault diagnosis for power plants," International Journal of Electrical Power & Energy Systems, vol. 71, pp. 274-284, 2015.

[18] F. Jamil, M. Abid, I. Haq, A. Q. Khan, and M. Iqbal, "Fault diagnosis of Pakistan Research Reactor-2 with data-driven techniques," Annals of Nuclear Energy, vol. 90, pp. 433-440, 2016.

[19] M. Daneshvar and F. Rad B, "Data driven approach for fault detection and diagnosis of boiler system in coal fired power plant using principal component analysis," International Review of Automatic Control, vol. 3, no. 2, pp. 198-208, 2010.

[20] M. L. Rosani and H. J. W. Penha, "Using principal component analysis modeling to monitor temperature sensors in a nuclear research reactor," http://www.engr.utk.edu.

[21] K. Fu, Structure optimized PCA and its application [Ph.D. thesis], Zhejiang University, 2007.

[22] Y. Hu, Study on the PCA-based sensor fault detection efficiency of the water-cooled chiller [Ph.D. thesis], Huazhong University of Science and Technology, 2013.

[23] S. Li and J. Wen, "A model-based fault detection and diagnostic methodology based on PCA method and wavelet transform," Energy and Buildings, vol. 68, pp. 63-71, 2014.

[24] Y. Hu, G. Li, H. Chen, H. Li, and J. Liu, "Sensitivity analysis for PCA-based chiller sensor fault detection," International Journal of Refrigeration, vol. 63, pp. 133-143, 2016.

[25] J. W. Hines and R. Seibert, "Technical review of on-line monitoring techniques for performance assessment: state-of-the-Art," Nuclear Regulatory Commission, NUREG/CR-6895, 2006.

[26] R. Magan-Carrion, J. Camacho, and P. Garcia-Teodoro, "Multivariate statistical approach for anomaly detection and lost data recovery in wireless sensor networks," International Journal of Distributed Sensor Networks, vol. 123, pp. 1-20, 2015.

[27] D. Liu, C.-H. Lung, N. Seddigh, and B. Nandy, "Entropy-based robust PCA for communication network anomaly detection," in Proceedings of the 2014 IEEE/CIC International Conference on Communications in China, ICCC 2014, pp. 171-175, China, October 2014.

[28] A. Delimargas, E. Skevakis, H. Halabian et al., "Evaluating a modified PCA approach on network anomaly detection," in Proceedings of the 2014 5th International Conference on Next Generation Networks and Services, NGNS 2014, pp. 124-131, Morocco, May 2014.

[29] R. Li, Research on statistical process monitoring based on PCA [Ph.D. thesis], Zhejiang University, 2007.

[30] X. He, Research of principal component analysis and its application in data reconstruction of fault sensor [Master's thesis], Northeastern University, 2010.

[31] J. Camacho, A. Perez-Villegas, P. Garcia-Teodoro, and G. Macia-Fernandez, "PCA-based multivariate statistical network monitoring for anomaly detection," Computers & Security, vol. 59, pp. 118-137, 2016.

[32] F. Li, B. R. Upadhyaya, and S. R. P. Perillo, "Fault diagnosis of helical coil steam generator systems of an integral pressurized water reactor using optimal sensor selection," IEEE Transactions on Nuclear Science, vol. 59, no. 2, pp. 403-410, 2012.

[33] S. Valle, W. Li, and S. J. Qin, "Selection of the number of principal components: the variance of the reconstruction error criterion with a comparison to other methods," Industrial & Engineering Chemistry Research, vol. 38, no. 11, pp. 4389-4401, 1999.

[34] F. Li, Dynamic modeling, sensor placement design, and fault diagnosis of nuclear desalination systems [Ph.D. thesis], The University of Tennessee, 2011.

[35] W. Li, M. Peng, Y. Liu, N. Jiang, H. Wang, and Z. Duan, "Fault detection, identification and reconstruction of sensors in nuclear power plant with optimized PCA method," Annals of Nuclear Energy, vol. 113, pp. 105-117, 2018.

[36] W. Chunli, Z. Chunlei, and Z. Pengtu, "Denoising algorithm based on wavelet adaptive threshold," Physics Procedia, vol. 24, pp. 678-685, 2012.

[37] R. E. Walpole, Probability and Statistics for Engineers and Scientists, Prentice Hall, 9th edition, 2012.

[38] M. Sun, "Vibration signal smoothing method based on MATLAB," Measurement Science and Technology, vol. 30, no. 6, pp. 55-57, 2007.

[39] X. Chen, Research on data preprocess method for thermal parameters [Master's thesis], North China Electric Power University, 2013.

[40] T. Chen, E. Martin, and G. Montague, "Robust probabilistic PCA with missing data and contribution analysis for outlier detection," Computational Statistics and Data Analysis, vol. 53, no. 10, pp. 3706-3716, 2009.

Wei Li, Minjun Peng, Yongkuo Liu, Shouyu Cheng, Nan Jiang, and Hang Wang

Fundamental Science on Nuclear Safety and Simulation Technology Laboratory, Harbin Engineering University, Harbin 150001, China

Correspondence should be addressed to Wei Li; success870323@126.com and Minjun Peng; heupmj@163.com

Received 15 August 2017; Accepted 13 December 2017; Published 8 January 2018

Academic Editor: Michael I. Ojovan

Caption: Figure 1: Optimization framework of PCA in this paper.

Caption: Figure 2: Q and [T.sup.2] statistics with original data.

Caption: Figure 3: Measurements with and without singular points.

Caption: Figure 4: Q and [T.sup.2] statistics with preprocessed data.

Caption: Figure 5: Q statistics with original data in 5 PCA models.

Caption: Figure 6: [T.sup.2] statistics with original data in 5 PCA models.

Caption: Figure 7: Q statistics with preprocessed data in 5 PCA models.

Caption: Figure 8: [T.sup.2] statistics with preprocessed data in 5 PCA models.

Caption: Figure 9: Contributions of sensors to Q and [T.sup.2] statistics in 5 PCA models under normal condition.

Caption: Figure 10: [T.sup.2] statistics with a small drift in 5 PCA models.

Caption: Figure 11: Q statistics with a small drift in 5 PCA models.

Caption: Figure 12: Contributions of variables to Q and [T.sup.2] statistics in 5 PCA models with a small drift on 1# sensor.

Caption: Figure 13: [T.sup.2] statistics with a larger drift in 5 PCA models.

Caption: Figure 14: Q statistics with a larger drift in 5 PCA models.

Caption: Figure 15: Contributions of sensors to Q and [T.sup.2] statistics in 5 PCA models with a larger fault on 1# sensor.

Table 1: False alarm probability of [T.sup.2] and Q statistics in 5 PCA models.

| Statistic | Random | Type | Standard deviation | Volatility degree | Correlation |
|---|---|---|---|---|---|
| [T.sup.2] | 0.7% | 0 | 2.6% | 0 | 0 |
| [T.sup.2] (preprocess) | 0.4% | 0 | 0 | 0 | 0 |
| [T.sup.2] (second confidence) | 0 | 0 | 0 | 0 | 0 |
| Q | 4.0% | 5.2% | 5.1% | 4.2% | 4.7% |
| Q (preprocess) | 2.7% | 2.6% | 2.9% | 0.7% | 3.8% |
| Q (second confidence) | 0.4% | 0.3% | 0.4% | 0.3% | 0.2% |

Title Annotation: Research Article

Author: Li, Wei; Peng, Minjun; Liu, Yongkuo; Cheng, Shouyu; Jiang, Nan; Wang, Hang

Publication: Science and Technology of Nuclear Installations

Date: Jan 1, 2018