The use of multilayer perceptron artificial neural networks for the classification of ethanol samples by commercialization region
Advances in research on the production of new fuels from renewable sources, aimed at reducing toxic gas emission levels into the atmosphere, have underscored ethanol as the main biofuel in Brazil. Ethanol is environmentally important because it reduces net carbon dioxide emissions: part of the gas released by ethanol combustion is absorbed by the large plantations that supply the raw material (Silva, Damasceno, Silva, Madruga, & Santana, 2008; Spacino et al., 2013).
Although hydrated ethanol is produced mainly from sugar cane, many other raw materials, such as corn and beet, may be used. Depending on the region, or even the municipality, ethanol production may show distinct physicochemical parameters, such as conductivity and pH (Spacino et al., 2013). These characteristics can be studied from different standpoints, including distillation results, municipality of origin, separation of different sugar cane crops, and comparison of yields and production years (Silva et al., 2008; Spacino et al., 2013).
According to Bona, Silva, Borsato, and Bassoli (2012), Artificial Neural Networks (ANNs) are among the study tools that have gained great importance and have been successful in classifying samples. There are several types of neural network, such as the Multilayer Perceptron (MLP), radial basis function networks and Self-Organizing Maps (SOM), among others (Haykin, 2001; Bishop, 2007).
ANNs reproduce the logical operations that the brain performs in various tasks. In recent decades, the technique has stood out in pattern recognition and classification (Borsato et al., 2009; Nobrega, Bona, & Yamashita, 2013; Link, Lemes, Marquetti, Scholz, & Bona, 2014). It has also been applied in several areas, such as health (Read et al., 2010; Karelina et al., 2015), food (Debska & Guzowska-Swider, 2011) and engineering (Haykin, 2001; Vukovic & Miljkovic, 2015; Kosic, 2015).
Supervised training is carried out with the backpropagation algorithm, which is based on learning by error correction (Haykin, 2001; Bishop, 2007).
Basically, MLP learning by backpropagation occurs when each input-output pair performs two passes through the layers of the network: a forward pass called propagation and a backward pass called backpropagation. Specifically, the actual response of the network is subtracted from the expected response to produce an error signal. This error signal is propagated backward through the network while the synaptic weights are adjusted, moving the actual response closer to the expected one and minimizing the error (Bishop, 2007). When the response still does not match the expected output, the procedure is repeated until the input-output set is reproduced with acceptable accuracy (Haykin, 2001; Borsato et al., 2009).
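The propagate/adjust cycle described above can be sketched with a minimal one-hidden-layer network. This is an illustrative toy, not the authors' implementation: the layer sizes, sigmoid activation and training loop are our assumptions, with only the 0.05 learning rate taken from this study.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy network: 4 inputs -> 3 hidden -> 1 output (sizes are illustrative).
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 1)), np.zeros(1)
eta = 0.05  # learning rate, the value also used in this study

def train_step(x, y):
    global W1, b1, W2, b2
    # Forward pass (propagation).
    h = sigmoid(x @ W1 + b1)
    y_hat = sigmoid(h @ W2 + b2)
    # Error signal: actual response minus expected response.
    err = y_hat - y
    # Backward pass: propagate the error and adjust the synaptic weights.
    delta2 = err * y_hat * (1.0 - y_hat)
    delta1 = (delta2 @ W2.T) * h * (1.0 - h)
    W2 -= eta * np.outer(h, delta2); b2 -= eta * delta2
    W1 -= eta * np.outer(x, delta1); b1 -= eta * delta1
    return float((err ** 2).sum())

# Repeating the cycle drives the squared error toward zero.
x, y = np.array([0.8, 0.9, 0.7, 0.1]), np.array([1.0])
errors = [train_step(x, y) for _ in range(500)]
print(f"first error {errors[0]:.4f}, last error {errors[-1]:.4f}")
```

Each call performs one propagation/backpropagation cycle; repeating it is the iterative error-correction learning described above.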
Considering the commercial importance of ethanol and the effectiveness of the ANN technique, this study applied MLP-type ANNs to classify ethanol samples commercialized in two regions of the state of Parana, Brazil.
Material and methods
Samples of hydrated ethanol
The 204 samples of commercial hydrated ethanol (112 from the eastern region and 92 from the northern region of the state of Parana) used for the application of the MLP underwent alcohol content, density, pH and electrical conductivity tests.
Alcohol content and Density
Density and alcohol content were determined following the ASTM D4052-11 (American Society for Testing and Materials [ASTM], 2011) and NBR 5992 (Associacao Brasileira de Normas Tecnicas [ABNT], 2008) standards.
The pH of the ethanol fuel was determined with Digimed MD-20 equipment, according to NBR 10891 (Associacao Brasileira de Normas Tecnicas [ABNT], 2013).
Electrical conductivity was determined with Digimed DM-31 equipment, according to ASTM D1125-14 (American Society for Testing and Materials [ASTM], 2014).
The classification module of Statistica 9.0, with automatic network search, was employed for the MLP-type neural networks. Networks were trained with 70% of the samples assigned to the training group and 15% each to the test and validation groups. Samples were assigned to the groups randomly and the learning rate was kept at 0.05. Region was chosen as the categorical variable, whilst alcohol content, density, electrical conductivity and pH were selected as continuous variables.
A single hidden layer was used, with 1 to 10 neurons and a maximum of 200 epochs. The available error functions were SOS (Sum of Squares) and Cross Entropy; the hidden-layer activation functions were Identity, Logistic, Hyperbolic Tangent and Exponential; the output-layer activation functions were Identity, Logistic, Hyperbolic Tangent, Exponential and Softmax. The backpropagation training algorithm employed was BFGS; 20 networks were trained, and the 5 best-performing ones were selected by the software.
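A comparable configuration can be reproduced outside Statistica, for instance with scikit-learn's MLPClassifier. This is a sketch only: the data below are synthetic stand-ins for the 204 samples (values loosely based on the ranges in Table 1), the region label is artificial, and the lbfgs solver is used as a quasi-Newton analogue of the BFGS training reported here.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in for the 204 samples: 4 continuous inputs roughly in the
# ranges reported in Table 1, plus a toy 2-class "region" label.
n = 204
X = np.column_stack([
    rng.normal(93.2, 0.2, n),    # alcohol content (g 100 g^-1)
    rng.normal(809.4, 0.6, n),   # density (kg m^-3)
    rng.normal(130.0, 42.0, n),  # electrical conductivity (uS m^-1)
    rng.normal(7.2, 0.44, n),    # pH
])
y = (X[:, 0] + 0.05 * rng.normal(size=n) > 93.2).astype(int)

# 70% for training; the paper splits the remaining 30% equally into
# test and validation groups.
X_tr, X_ho, y_tr, y_ho = train_test_split(X, y, train_size=0.70, random_state=0)

scaler = StandardScaler().fit(X_tr)
clf = MLPClassifier(hidden_layer_sizes=(8,),  # one hidden layer, 8 neurons (MLP 4-8-2)
                    activation='logistic',
                    solver='lbfgs',           # quasi-Newton, analogous to BFGS
                    max_iter=200,             # maximum number of epochs
                    random_state=0)
clf.fit(scaler.transform(X_tr), y_tr)
acc = clf.score(scaler.transform(X_ho), y_ho)
print(f"held-out accuracy: {acc:.2f}")
```

Standardizing the inputs matters here because conductivity spans a much larger numeric range than the other three variables.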
Results and discussion
Figure 1 shows the density (kg m⁻³), alcohol content (g 100 g⁻¹), pH and electrical conductivity (μS m⁻¹) of the 204 samples analyzed, separated by commercialization region. Horizontal lines indicate the boundaries of the compliance parameters. All samples were within the parameters set for commercialization.
Table 1 shows the minimum, maximum, mean and standard deviation of each parameter used for training, test and validation of the best-performing network (MLP 4-8-2). Electrical conductivity and alcohol content were the compliance parameters with, respectively, the highest and the lowest standard deviation.
[FIGURE 1 OMITTED]
Figure 2 relates each compliance parameter to the other input variables. When density was plotted against alcohol content, the behavior was close to linear, albeit with R² = 0.73. The remaining pairs of parameters showed typically nonlinear behavior with few scattered samples. This nonlinear relationship among the parameters indicates that ANNs may be applied to the case in question, since they are inherently nonlinear (Ritter, 1995).
The 204 samples were classified by an MLP-type neural network. The chosen networks comprised an input layer with one neuron per variable, a hidden layer responsible for pattern separation, and an output layer carrying the decision taken by the hidden neurons.
[FIGURE 2 OMITTED]
In the training of the samples, the automatic module of Statistica 9.0 used identity, logistic, hyperbolic tangent and exponential activation functions for the neurons of the hidden and output layers. Twenty networks were tested, and the 5 best performers were retained at every run of the program.
Since they characteristically act as feature detectors, the hidden neurons play an important role in the operation of a Perceptron network learning by backpropagation. As the learning process advances, the hidden neurons gradually begin to discover the peculiarities that characterize the training data (Haykin, 2001; Bishop, 2007).
The number of epochs and of neurons in the hidden layer cannot be too high because, when a neural network learns too many input-output examples, it may end up memorizing the training data. This phenomenon, known as overfitting or overtraining, makes the network lose its generalization capacity (Haykin, 2001; Bishop, 2007). Furthermore, according to Haykin (2001), the lower the learning-rate parameter, the smaller the variations in the synaptic weights from one iteration to the next, and the smoother the trajectory in weight space.
On the other hand, if the learning rate is very high, the resulting large modifications in the synaptic weights may render the network unstable. Therefore, in this case, a learning rate of 0.05, a maximum of 200 epochs, and between 1 and 10 neurons in the hidden layer were chosen for training the MLP.
The samples and analyzed parameters presented to the network were subdivided randomly into three groups: the training set, comprising 70% of the samples; the test set, comprising 15%; and the validation set, also comprising 15%. The second and third groups, which were not seen during training, were used to validate and verify the generalization capacity of the trained network (Haykin, 2001; Borsato et al., 2009; Link et al., 2014).
The error was determined at each epoch of training, and this information was used to adjust the weights until the error stabilized. Figure 3 shows the number of epochs used for training the best-performing network, revealing that it needed only 60 epochs to achieve stability.
[FIGURE 3 OMITTED]
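Per-epoch error tracking of the kind plotted in Figure 3 is exposed by most toolkits; with scikit-learn's stochastic-gradient MLP, for example, the per-epoch training loss is available as loss_curve_. This is a generic illustration on random placeholder data, not the software or samples used in this study.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(7)

# Random toy data: 4 inputs, binary label (placeholders for the real samples).
X = rng.normal(size=(200, 4))
y = (X[:, 0] > 0).astype(int)

# Stochastic-gradient training records one loss value per epoch; training
# stops early once the improvement stays below tol for several epochs.
clf = MLPClassifier(hidden_layer_sizes=(8,), solver='sgd',
                    learning_rate_init=0.05, max_iter=200,
                    tol=1e-4, random_state=0)
clf.fit(X, y)

losses = clf.loss_curve_
print(f"stopped after {len(losses)} epochs; final loss {losses[-1]:.3f}")
```

Plotting `losses` against the epoch index reproduces the kind of error-stabilization curve described for Figure 3.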
As the two main sources of variability in the networks were initialization and sampling, three initializations were used, totaling 60 networks. The 15 best-performing networks, chosen by the software, presented accuracy percentages ranging from 88 to 95% for training, 83 to 96% for test, and 93 to 100% for validation, higher than those obtained by Anderson and Smith (2002), who applied ANNs to differentiate coffee samples by geographical origin.
In each initialization, one network stood out when the accuracy rates were compared. Table 2 presents the three selected networks with their accuracy percentages for training (Tr), test (T) and validation (Val), as well as the characteristics of each one: the training algorithm (Al), the error function (EF), and the activation functions of the hidden (HA) and output (OA) layers. Two of the selected networks had 8 neurons in the hidden layer. Since each initialization makes random choices, different training algorithms, error functions and hidden- and output-layer activations alter the hit rates.
According to Table 2, the best performer among the networks had 8 neurons in the hidden layer: 2MLP 4-8-2, with hit rates of 95% for training, 96% for test and 96% for validation. Network 1MLP 4-8-2 presented 94% for training, 90% for test and 100% for validation. The third selected network had 7 neurons in the hidden layer, with hit rates of 92% for training, 90% for test and 96% for validation. Tukey's test applied to the mean training, test and validation percentages of the 15 selected networks revealed a significant difference (p_max = 0.009) when compared with the rates of the selected network (1MLP 4-8-2). The percentage of correct classifications of the samples by the MLP network, according to commercialization region, was significantly higher than that reported by Anderson and Smith (2002).
In the case of 1MLP 4-8-2, only 11 of the 204 samples used across training, test and validation were misclassified, against 9 and 15 misclassifications for 2MLP 4-8-2 and 3MLP 4-7-2, respectively. Network 2MLP 4-8-2 therefore had the highest number of correct responses and the best performance in the classification of the ethanol samples.
Among the algorithms available for training, the software's automatic module selected BFGS (Broyden, 1970; Fletcher, 1970; Goldfarb, 1970; Shanno, 1970), a second-order quasi-Newton method that is very efficient and converges very quickly, although substantial memory is required to store the Hessian matrix (Statistica, 2009; Borsato, Pina, Spacino, Scholz, & Androcioli, 2011).
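BFGS itself is available in standard numerical libraries; scipy.optimize.minimize, for example, applies it to any differentiable objective. This is a generic illustration on a simple quadratic, unrelated to the network training above.

```python
import numpy as np
from scipy.optimize import minimize

# Minimize a simple convex function with the quasi-Newton BFGS method.
# BFGS builds an approximation to the inverse Hessian from successive
# gradient differences, which makes it fast to converge but memory-hungry
# in high dimension (the Hessian approximation is an n x n matrix).
def f(w):
    return (w[0] - 3.0) ** 2 + (w[1] + 1.0) ** 2

res = minimize(f, x0=np.zeros(2), method="BFGS")
print(res.x)  # converges to approximately [3, -1]
```

In network training, `f` would be the SOS or cross-entropy error as a function of the synaptic weights.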
Figure 4 represents network 2MLP 4-8-2, with 4 input parameters, 8 neurons in the hidden layer and 2 output regions; network 3MLP 4-7-2, in turn, had 4 input parameters, 7 neurons in the hidden layer and 2 output regions. In the hidden layer, neurons drawn in more intense tones were more strongly activated by the network in search of the target response.
From the selected networks, an order of importance could be stipulated for the input parameters. In the case of network 1MLP 4-8-2, the alcohol content compliance parameter was the most important, followed by pH, density and electrical conductivity, in that order. For network 2MLP 4-8-2, the most important variable was alcohol content, followed by density, pH and electrical conductivity. Finally, in network 3MLP 4-7-2, which had the lowest percentage of correct responses (Table 2), the order of importance was density, alcohol content, electrical conductivity and pH.
[FIGURE 4 OMITTED]
This order of importance was given by the network sensitivity analysis, in which the program computes the sum of squared residuals, or the classification error rate, of each network when one of the compliance parameters is removed. Ratios are then established between the error of the model with the parameter deleted and that of the full model, which includes it. From these ratios an order of relevance can be established: the parameters with the highest ratios are the most important for the network's classification. Table 3 shows the sensitivity analysis of the selected networks (Statistica, 2009).
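This ratio-based sensitivity analysis can be approximated generically: evaluate the fitted model with one input neutralized and divide the resulting sum of squared residuals by that of the full model. Here neutralization is done by permuting the column, an assumption standing in for Statistica's exact deletion procedure, and the classifier and data are placeholders.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)

# Placeholder data: 4 inputs, binary class label driven mostly by column 0.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + 0.3 * X[:, 1] + 0.1 * rng.normal(size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

def sse(m, X, y):
    """Sum of squared residuals between predicted class probability and label."""
    p = m.predict_proba(X)[:, 1]
    return float(((p - y) ** 2).sum())

full_error = sse(model, X, y)
ratios = {}
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])  # neutralize one input
    ratios[f"var{j}"] = sse(model, Xp, y) / full_error

# Inputs with ratios well above 1 matter most for the classification,
# mirroring the ordering read off Table 3.
for name, r in ratios.items():
    print(f"{name}: sensitivity ratio = {r:.2f}")
```

Because column 0 drives the label, its ratio comes out largest, just as alcohol content dominates in networks 1MLP 4-8-2 and 2MLP 4-8-2.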
The MLP-type ANN proved to be a suitable tool for the classification of ethanol samples according to their commercialization region. The sensitivity analysis of the input variables revealed that alcohol content, followed by pH and density, was instrumental for the identification of the samples, establishing the order of importance of the compliance parameters for segmentation. The features used for training the network, namely the learning rate, the number of epochs, the training algorithm and the number of hidden neurons, were also effective for differentiating the samples.
American Society for Testing and Materials. (2011). ASTM D4052-11: Standard test method for density, relative density, and API gravity of liquids by digital density meter. West Conshohocken, PA: ASTM International.
American Society for Testing and Materials. (2014). ASTM D1125-14: Standard test methods for electrical conductivity and resistivity of water. West Conshohocken, PA: ASTM International.
Anderson, K. A., & Smith, B. W. (2002). Chemical profiling to differentiate geographic growing origins of coffee. Journal of Agricultural and Food Chemistry, 50(7), 2068-2075.
Associacao Brasileira de Normas Tecnicas. (2008). NBR 5992: Alcool etilico e suas misturas com agua-Determinacao da massa especifica e do teor alcoolico-Metodo do densimetro de vidro. Rio de Janeiro, RJ: ABNT.
Associacao Brasileira de Normas Tecnicas. (2013). NBR 10891: Etanol hidratado combustivel-Determinacao do pH-Metodo potenciometrico. Rio de Janeiro, RJ: ABNT.
Bishop, C. M. (2007). Neural networks for pattern recognition. New York, NY: Oxford University Press.
Bona, E., Silva, R. S. S. F., Borsato, D., & Bassoli, D. G. (2012). Self-organizing maps as a chemometric tool for aromatic pattern recognition of soluble coffee. Acta Scientiarum. Technology, 34(1), 111-119.
Borsato, D., Moreira, I., Nobrega, M. M., Moreira, M. B., Dias, G. H., Silva, R. S. S. F., & Bona, E. (2009). Aplicacao de redes neurais artificiais na identificacao de gasolinas adulteradas comercializadas na regiao de Londrina-Parana. Quimica Nova, 32(9), 2328-2332.
Borsato, D., Pina, M. V. R., Spacino, K. R., Scholz, M. B. S., & Androcioli, A. (2011). Application of artificial neural networks in the geographical identification of coffee samples. European Food Research & Technology, 233(3), 533-543.
Broyden, C. G. (1970). The convergence of a class of double-rank minimization algorithms: 2. The new algorithm. Journal of Applied Mathematics, 6(3), 222-231.
Debska, B., & Guzowska-Swider, B. (2011). Application of artificial neural network in food classification. Analytica Chimica Acta, 705(1-2), 283-291.
Fletcher, R. (1970). A new approach to variable metric algorithms. The Computer Journal, 13(3), 317-322.
Goldfarb, D. (1970). A family of variable-metric methods derived by variational means. Mathematics of Computation, 24(109), 23-26.
Haykin, S. (2001). Redes neurais: principios e praticas. Porto Alegre, RS: Bookman.
Karelina, K., Liu, Y., Alzate-Correa, D., Wheaton, K. L., Hoyt, K. R., Arthur, J. S. C., & Obrietan, K. (2015). Mitogen and stress-activated kinases 1/2 regulate ischemia-induced hippocampal progenitor cell proliferation and neurogenesis. Neuroscience, 285(1), 292-302.
Kosic, D. (2015). Fast clustered radial basis function network as an adaptive predictive controller. Neural Networks, 63(1), 79-86.
Link, J. V., Lemes, A. L. G., Marquetti, I., Scholz, M. B. S., & Bona, E. (2014). Geographical and genotypic segmentation of Arabica coffee using self-organizing maps. Food Research International, 59(1), 1-7.
Nobrega, M. M., Bona, E., & Yamashita, F. (2013). An artificial neural network model for the prediction of mechanical and barrier properties of biodegradable films. Materials Science & Engineering C, Biomimetic Materials, Sensors and Systems, 33(7), 4331-4336.
Read, S. J., Monroe, B. M., Brownstein, A. L., Yang, Y., Chopra, G., & Miller, L. C. (2010). A neural network model of the structure and dynamics of human personality. Psychological Review, 117(1), 61-92.
Ritter, H. (1995). Self-organizing feature maps: Kohonen maps. In M. A. Arbib (Ed.), The handbook of brain theory and neural networks (p. 846-851). Cambridge, MA: MIT Press.
Shanno, D. F. (1970). Conditioning of quasi-Newton methods for function minimization. Mathematics of Computation, 24(111), 647-656.
Silva, J. A., Damasceno, B. P. G. L., Silva, F. L. H., Madruga, M. S., & Santana, D. P. (2008). Aplicacao da metodologia de planejamento fatorial e analise de superficies de resposta para otimizacao da fermentacao alcoolica. Quimica Nova, 31(5), 1073-1077.
Spacino, K. R., Silva, H. C., Angilelli, K. G., Silva, E. T., Moreira, I., Mesquita, M. V., & Borsato, D. (2013). Using self-organizing maps as a chemometric tool for alcohol classification by distillery. International Journal of Environment and Bioenergy, #(1), 1-11.
Statistica. (2009). Graphics software. Statistica for Windows, version 9. Tulsa, OK: Statistica.
Vukovic, N., & Miljkovic, Z. (2015). Robust sequential learning of feedforward neural networks in the presence of heavy-tailed noise. Neural Networks, 63(1), 31-47.
Received on April 29, 2015.
Accepted on October 19, 2015.
Erica Signori Romagnoli, Livia Ramazzoti Chanan Silva, Karina Gomes Angilelli, Bruna Aparecida Denobi Ferreira, Aline Regina Walkoff and Dionisio Borsato *
Departamento de Quimica, Universidade Estadual de Londrina, Rodovia Celso Garcia Cid, Pr-445 Km 380, Cx. Postal 10011, 86057-970, Londrina, Parana, Brazil. * Author for correspondence. E-mail: email@example.com
Table 1. Statistics of the parameters used for training, test and validation of the network employed.

Group        Statistic   Density    Alcohol content   pH         Electr. conductivity
                         (kg m⁻³)   (g 100 g⁻¹)                  (μS m⁻¹)
Training     Minimum     808.2000   93.90000          6.250000    57.0000
             Maximum     811.0000   95.50000          8.000000   351.0000
             Mean        809.4160   93.18194          7.200556   131.7236
             Stand. D.     0.5153    0.20162          0.444275   131.7236
Test         Minimum     806.6000   92.80000          6.590000    57.0000
             Maximum     810.5000   93.50000          8.930000   170.000
             Mean        809.2733   93.18000          7.173667   119.000
             Stand. D.     0.6817    0.16274          0.488301    32.2707
Validation   Minimum     807.9000   92.70000          6.640000    68.0000
             Maximum     810.8000   93.70000          7.710000   275.0000
             Mean        809.5100   93.13667          7.260667   136.0500
             Stand. D.     0.5708    0.24280          0.449225    22.1685
General      Minimum     806.6000   92.50000          6.250000    57.0000
             Maximum     811.0000   93.90000          8.930000   351.0000
             Mean        809.4088   93.17500          7.205441   130.6309
             Stand. D.     0.5588    0.19805          0.440123    41.9046

Table 2. Accuracy percentages of the selected MLPs and their characteristics.

Network      Tr (%)   T (%)   Val (%)   Al     EF        HA            OA
1MLP 4-8-2   94       90      100       BFGS   SOS       Tanh          Identity
2MLP 4-8-2   95       96      96        BFGS   Entropy   Logistic      Softmax
3MLP 4-7-2   92       90      96        BFGS   Entropy   Exponential   Softmax

Table 3. Sensitivity analysis of the selected MLPs.

Network      pH      Alcohol content   Density   Electr. conductivity
1MLP 4-8-2   4.887   8.584             2.991     1.218
2MLP 4-8-2   3.352   4.694             3.494     1.469
3MLP 4-7-2   0.323   2.618             3.633     2.080
Publication: Acta Scientiarum. Technology (UEM), April 1, 2016.