Prediction of Lard in Palm Olein Oil Using Simple Linear Regression (SLR), Multiple Linear Regression (MLR), and Partial Least Squares Regression (PLSR) Based on Fourier-Transform Infrared (FTIR).
1. Introduction
Adulteration of oils is a persistent issue in the market. In 2013, a company in Taiwan was found to be marketing cheaper oils as premium-class oils. This was followed by an incident in which lard-based cooking oil was adulterated with gutter oil, affecting more than 1,300 food products [2, 3]. Consumer Voice further reported that 47.09% of 1,015 edible oil samples tested from 14 states in India did not comply with the Food Safety and Standards Regulations.
Lard is considered one of the cheaper oils in the food industry. It can be blended effectively with other oils to reduce the production cost. The presence of lard in cooking oil is a concern from two perspectives: economic considerations and religious restrictions. Religions such as Islam and Judaism forbid the consumption of swine and any of its derivatives [1, 5]; hence, lard should not be present in halal-labelled products. From the economic perspective, the credibility of Malaysia as a major producer and exporter of palm oil would be at risk should its products be found adulterated. A company in Malaysia was allegedly charged with intending to export palm oil adulterated with fatty acid to Sri Lanka.
Various methods have been developed to identify the adulteration of cooking oil; these include Gas Chromatography-Mass Spectrometry (GC-MS), High-Performance Liquid Chromatography-Mass Spectrometry (HPLC-MS), Fourier-Transform Infrared (FTIR) spectroscopy, and Nuclear Magnetic Resonance (NMR) spectroscopy, among others. The advantages and disadvantages of these analytical methods for adulterant analysis are summarized in Table 1.
Most of these techniques are costly and time-consuming. FTIR offers rapid, inexpensive analysis with minimal sample preparation. This technique, integrated with statistical approaches, particularly partial least squares (PLS), has demonstrated promising sensitivity for adulterant analysis [12, 13, 14]. FTIR coupled with PLS has been used for detection of adulterants in edible oils including avocado oil, sunflower oil, and palm oil with a detection limit as low as 2-3% [15, 16, 17]. In some cases, the detection level may be much higher; for example, the quantification of hazelnut oil in virgin olive oil is reported at 25% or higher. PLS regression has commonly been coupled with FTIR for prediction of adulterants; there are, however, limited studies on other regression strategies. Hence, in this paper, we apply simple linear regression (SLR), multiple linear regression (MLR), and partial least squares regression (PLSR) for prediction of lard in palm olein oil using FTIR. This provides fundamental knowledge on the performance of different regression models for adulterant analysis, contributing toward quality control.
2. Materials and Methods
2.1. Sample Preparation. Readily available palm olein cooking oil was purchased from the local market. Pure lard was extracted from adipose tissues of swine purchased from the local market. The adipose tissues were cut into small pieces and heated in an oven at 90°C for 2 hours. The liquid fat was ladled into a glass jar. It was left to cool to room temperature before storage. Prior to use, lard was preheated with a block heater (Stuart SBH200D) at 50°C for 1 hour, until the solidified lard turned into liquid.
The lard, pure palm olein oil, and the adulterated olein oil at 20% and 50% were analysed. The samples were agitated with a vortex mixer (VELP Scientifica Model ZX4) for 1 minute to ensure homogeneity [1, 17, 18].
2.2. FTIR Spectra Measurement. The samples were scanned with a Fourier-transform infrared spectrophotometer (Thermo Scientific Nicolet iS10) equipped with a diamond crystal attenuated total reflectance (ATR) accessory. The spectra were acquired at a resolution of 4 cm⁻¹ with 64 scans in the range of 4000-525 cm⁻¹. The spectrum was ratioed against a fresh background spectrum recorded from the bare ATR plate.
Prior to collection of each background spectrum, the ATR plate was cleaned with pure ethanol. At each concentration level, a total of 20 replicates were scanned yielding 80 spectra. The spectra were saved in csv format for further analysis using Matlab R2013a.
2.3. Spectra Processing. The spectra were baseline corrected and subjected to peak detection according to the first derivative approach. The peaks detected were then matched across samples to produce a peak table with rows and columns representing samples and variables (in wavenumber), respectively. The algorithm is referred to  for brevity. The resultant peak table was analysed to deduce the marker bands differentiating pure and adulterated samples.
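The processing steps above can be illustrated with a minimal Python sketch. It is not the authors' algorithm (which is given in the cited reference): the baseline correction is assumed here to be a simple minimum subtraction, a peak is taken as a point where the finite-difference first derivative changes from positive to non-positive, and the function names `baseline_correct` and `detect_peaks` as well as the toy spectrum are hypothetical.

```python
def baseline_correct(signal):
    """Subtract the minimum value so the baseline sits at zero (an assumed, very simple correction)."""
    floor = min(signal)
    return [value - floor for value in signal]


def detect_peaks(wavenumbers, signal, min_height=0.0):
    """Return (wavenumber, height) pairs where the first derivative turns from positive to non-positive."""
    # Finite-difference approximation of the first derivative.
    derivative = [signal[i + 1] - signal[i] for i in range(len(signal) - 1)]
    peaks = []
    for i in range(1, len(derivative)):
        rising_before = derivative[i - 1] > 0
        falling_after = derivative[i] <= 0
        if rising_before and falling_after and signal[i] >= min_height:
            peaks.append((wavenumbers[i], signal[i]))
    return peaks


# Toy spectrum with two bumps (illustrative values, not measured data).
wavenumbers = [3010, 3008, 3006, 3004, 3002, 2856, 2854, 2852, 2850, 2848]
absorbance = [0.11, 0.14, 0.20, 0.15, 0.11, 0.12, 0.30, 0.45, 0.28, 0.12]
corrected = baseline_correct(absorbance)
peaks = detect_peaks(wavenumbers, corrected, min_height=0.05)
print([wavenumber for wavenumber, _ in peaks])
```

In practice, the peaks found in each spectrum would then be matched across samples to build the peak table; that matching step is not sketched here.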
2.4. Variable Selection. Fisher Weights, a multiclass variable selection method, was employed to determine the variable(s) with discriminatory ability. The weight, $w_m$, for each variable, $m$, across the classes ($c = 1, \ldots, C$) was calculated based on the following equation. A variable with a higher weight has greater discriminatory ability. Such variables are called the marker bands and are used for prediction of lard adulteration in palm olein oil using SLR, MLR, and PLSR.
[mathematical expression not reproducible], (1)
where $\bar{x}_{mc}$ and $\bar{x}_m$ are the mean of the variable in class $c$ and the overall mean of the variable, respectively, $S_m$ is the pooled standard deviation, and $N_c$ is the number of members in class $c$.
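As an illustration of how such a weight can be computed, the sketch below assumes one commonly used multiclass form, $w_m = \sum_{c=1}^{C} N_c (\bar{x}_{mc} - \bar{x}_m)^2 / ((C - 1) S_m^2)$, which is consistent with the symbols defined above. The exact normalisation of equation (1) is an assumption here, and the function name `fisher_weight` and the toy values are hypothetical.

```python
def fisher_weight(values, labels):
    """Fisher weight for one variable (assumed form: between-class over pooled within-class variance)."""
    classes = sorted(set(labels))
    overall_mean = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for c in classes:
        members = [v for v, lab in zip(values, labels) if lab == c]
        class_mean = sum(members) / len(members)
        between += len(members) * (class_mean - overall_mean) ** 2
        within += sum((v - class_mean) ** 2 for v in members)
    # Pooled variance S_m^2 over all classes.
    pooled_variance = within / (len(values) - len(classes))
    return between / ((len(classes) - 1) * pooled_variance)


# Toy peak areas for three classes (illustrative values only).
labels = [0, 0, 0, 20, 20, 20, 50, 50, 50]
band_a = [1.0, 1.1, 0.9, 2.0, 2.1, 1.9, 3.0, 3.1, 2.9]  # changes with class
band_b = [1.0, 2.0, 3.0, 1.1, 2.1, 2.9, 0.9, 1.9, 3.1]  # does not change with class
weight_a = fisher_weight(band_a, labels)
weight_b = fisher_weight(band_b, labels)
print(weight_a > weight_b)
```

A band whose class means are well separated relative to its within-class scatter receives the larger weight, which is the sense in which the marker bands are ranked.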
2.5. Simple Linear Regression (SLR). The peak area of a marker band was calculated as the sum of the signal from peak start to peak end. The vector of peak areas, $X$, is assumed to have a linear relationship with the corresponding lard concentration, $C$. The regression is expressed as $\hat{C} = b_1 X + b_0$, where $b_1$ and $b_0$ are the regression coefficients and $\hat{C}$ is the predicted concentration.
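A minimal sketch of this step is given below, assuming ordinary least squares is used to estimate $b_1$ and $b_0$. The function names `peak_area`, `fit_slr` and the toy numbers are hypothetical and are not taken from the study.

```python
def peak_area(signal, start, end):
    """Sum the signal between two indices (inclusive), from peak start to peak end."""
    return sum(signal[start:end + 1])


def fit_slr(x, c):
    """Ordinary least-squares estimates of the slope b1 and intercept b0 for c = b1*x + b0."""
    n = len(x)
    mean_x = sum(x) / n
    mean_c = sum(c) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxc = sum((xi - mean_x) * (ci - mean_c) for xi, ci in zip(x, c))
    b1 = sxc / sxx
    b0 = mean_c - b1 * mean_x
    return b1, b0


# Toy calibration: peak areas versus lard concentration in % (illustrative values).
areas = [2.0, 3.0, 4.5, 7.0]
concentrations = [0.0, 20.0, 50.0, 100.0]
b1, b0 = fit_slr(areas, concentrations)
predicted = b1 * 5.0 + b0  # predicted concentration for a new peak area of 5.0
print(b1, b0, predicted)
```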
2.6. Multiple Linear Regression (MLR). The calibration model was built using the spectral data, $X$ (a matrix), with its corresponding lard concentrations, $C$, in which $C = X B$ and $B = (X' X)^{-1} X' C$. The regression equation can then be written as $\hat{C} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_n x_n$, considering only the linear terms.
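The normal-equation solution can be sketched as follows. This is an illustrative implementation only: a column of ones is added for the intercept $b_0$, and the system $(X'X) B = X'C$ is solved by Gaussian elimination rather than by forming the inverse explicitly, which gives the same $B$ when $X'X$ is invertible. The function names and the toy data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def fit_mlr(x_rows, c):
    """Return [b0, b1, ..., bn] from the normal equations (X'X) B = X'C."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 gives the intercept b0
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xtc = [sum(r[i] * ci for r, ci in zip(design, c)) for i in range(p)]
    return solve_linear_system(xtx, xtc)


def predict_mlr(coefficients, row):
    """Apply the fitted linear equation to one sample."""
    return coefficients[0] + sum(b * x for b, x in zip(coefficients[1:], row))


# Toy data generated from c = 10 + 2*x1 - 3*x2 (illustrative only).
rows = [[1, 0], [0, 1], [2, 1], [3, 2], [1, 3]]
response = [12, 7, 11, 10, 3]
coefficients = fit_mlr(rows, response)
print(coefficients, predict_mlr(coefficients, [2, 2]))
```

When the number of variables exceeds the number of samples, or the variables are strongly collinear, $X'X$ becomes singular or ill-conditioned and this solution breaks down.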
2.7. Partial Least Squares Regression (PLSR). The PLS calibration model was developed using the spectral data, $X$, and its corresponding lard concentration, $C$, based on two principal components. The PLS algorithm assumes a linear relationship between $X$ and $C$. They are decomposed into the models $X = T P + E$ and $C = T q + f$, where $E$ and $f$ are the noise, $T$ is the scores matrix common to $X$ and $C$, and $P$ and $q$ are the loadings. The PLS algorithm involves the projection of $X$ onto the weight vector to get a scores vector, $t$. $X$ is then projected onto the scores to get the loadings, $p$. After every PLS component, the $X$ matrix is deflated by subtracting $t p$ from $X$. The algorithm of PLS according to NIPALS (non-linear iterative partial least squares) is explained in detail in .
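The projection and deflation steps described above can be sketched for a single response (PLS1) as follows. This is a simplified illustration under stated assumptions, not the routine used in the study: the data are only mean-centred here (the study additionally standardized $X$), and the function names `fit_pls1` and `predict_pls1` and the toy data are hypothetical.

```python
import math


def column_means(rows):
    n = len(rows)
    return [sum(row[j] for row in rows) / n for j in range(len(rows[0]))]


def fit_pls1(x_rows, c, n_components=2):
    """NIPALS-style PLS1 on mean-centred data; returns what is needed for prediction."""
    x_mean = column_means(x_rows)
    c_mean = sum(c) / len(c)
    x = [[v - m for v, m in zip(row, x_mean)] for row in x_rows]
    y = [ci - c_mean for ci in c]
    n_vars = len(x_mean)
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the concentration.
        w = [sum(x[i][j] * y[i] for i in range(len(x))) for j in range(n_vars)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        w = [wj / norm for wj in w]
        # Scores: projection of X onto the weight vector.
        t = [sum(xij * wj for xij, wj in zip(row, w)) for row in x]
        tt = sum(ti * ti for ti in t)
        # Loadings: projection of X (and of C) onto the scores.
        p = [sum(x[i][j] * t[i] for i in range(len(x))) / tt for j in range(n_vars)]
        q = sum(yi * ti for yi, ti in zip(y, t)) / tt
        # Deflation: remove the part of X and C explained by this component.
        x = [[xij - ti * pj for xij, pj in zip(row, p)] for row, ti in zip(x, t)]
        y = [yi - q * ti for yi, ti in zip(y, t)]
        components.append((w, p, q))
    return x_mean, c_mean, components


def predict_pls1(model, row):
    """Predict the concentration of one sample by repeating the projection and deflation steps."""
    x_mean, c_mean, components = model
    residual = [v - m for v, m in zip(row, x_mean)]
    prediction = c_mean
    for w, p, q in components:
        t = sum(r * wj for r, wj in zip(residual, w))
        prediction += q * t
        residual = [r - t * pj for r, pj in zip(residual, p)]
    return prediction


# Toy data generated from c = 10 + 2*x1 - 3*x2 (illustrative only).
rows = [[1, 0], [0, 1], [2, 1], [3, 2], [1, 3]]
response = [12, 7, 11, 10, 3]
model = fit_pls1(rows, response, n_components=2)
print(predict_pls1(model, [2, 2]))
```

With only two variables, two components exhaust the variation in $X$, so the toy model reproduces the least-squares fit; with real spectra the number of components is far smaller than the number of variables.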
2.8. Model Evaluation. The models were built using the training samples and validated with the test samples. Two-thirds of the 80 spectra were used as the training samples, with an equal number from each class, whilst the remainder served as the test samples. The samples were split randomly for 100 iterations, and these 100 training/test sets were subjected to SLR, MLR, and PLSR according to the selected spectral regions for prediction of lard. For PLSR, the matrix of training samples was in addition standardized, and the corresponding concentration, $C$, was mean-centred; the test set was standardized using the mean and standard deviation of the training samples. The prediction performance was evaluated based on the percentage root-mean-squares error (%RMSE), in which
$$\mathrm{RMSE} = \sqrt{\frac{\sum_{n=1}^{N} (C_n - \hat{C}_n)^2}{N}}, \qquad \%\mathrm{RMSE} = \frac{\mathrm{RMSE}}{\sum_{n=1}^{N} C_n / N} \times 100. \quad (2)$$
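For illustration, the error measure can be computed as in the short sketch below, which assumes that %RMSE is the RMSE expressed as a percentage of the mean actual concentration. The function name `percent_rmse` and the numbers are hypothetical.

```python
import math


def percent_rmse(actual, predicted):
    """RMSE divided by the mean actual concentration, expressed as a percentage."""
    n = len(actual)
    rmse = math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)
    mean_actual = sum(actual) / n
    return rmse / mean_actual * 100


# Illustrative values: every prediction is off by 5 percentage points.
actual = [0.0, 20.0, 50.0, 100.0]
predicted = [5.0, 15.0, 55.0, 95.0]
error = percent_rmse(actual, predicted)
print(round(error, 2))
```

Because the error is scaled by the mean concentration, a model with very poor predictions can give a %RMSE well above 100.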
A lower %RMSE signifies better prediction. Typically, the training samples yield better prediction than the test samples. However, if a model predicts exceptionally well for the training samples but not for the test samples, the model is overfitted. Figure 1 illustrates the flow chart of the training/test set splitting for regression analysis. The process was programmed as a routine, and all analyses were performed in Matlab R2013a.
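The splitting and scaling scheme can be sketched in Python as below (the study itself used Matlab). The sketch assumes 20 spectra at each of four lard levels (0%, 20%, 50%, and 100%) and 13 training spectra per level, which gives the 52 training and 28 test samples reported in Section 3. The function names and the fixed random seed are hypothetical.

```python
import random
import statistics


def stratified_split(labels, n_train_per_class, rng):
    """Return (train_idx, test_idx) with the same number of training samples from each class."""
    train, test = [], []
    for c in sorted(set(labels)):
        members = [i for i, lab in enumerate(labels) if lab == c]
        rng.shuffle(members)
        train += members[:n_train_per_class]
        test += members[n_train_per_class:]
    return train, test


def standardize(train_values, test_values):
    """Scale both sets with the mean and standard deviation of the training set only."""
    mean = statistics.mean(train_values)
    sd = statistics.stdev(train_values)

    def scale(values):
        return [(v - mean) / sd for v in values]

    return scale(train_values), scale(test_values)


# 20 replicate spectra at each concentration level (% lard).
labels = [0] * 20 + [20] * 20 + [50] * 20 + [100] * 20
rng = random.Random(0)  # fixed seed so the sketch is repeatable
splits = [stratified_split(labels, 13, rng) for _ in range(100)]
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

Scaling the test set with the training statistics keeps the test samples unseen by the model during calibration.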
Analysis of Variance (ANOVA) with Tukey's test was performed on the %RMSE attained from the different spectral regions over the 100 training/test splits to determine whether there is a significant difference at the 95% confidence level.
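As a rough illustration of the first part of this comparison, the one-way ANOVA F statistic can be computed as in the sketch below. Tukey's post hoc test is not implemented here, since it requires the studentized range distribution, and the group values are made-up %RMSE numbers rather than results from this study.

```python
def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical %RMSE values from three spectral regions, three splits each.
region_errors = [
    [12.0, 13.0, 14.0],
    [18.0, 20.0, 22.0],
    [30.0, 33.0, 36.0],
]
f_statistic = one_way_anova_f(region_errors)
print(round(f_statistic, 2))
```

The F statistic would then be compared with the critical value of the F distribution for the corresponding degrees of freedom at the chosen confidence level.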
3. Results and Discussion
The spectral patterns of pure and adulterated oil are shown in Figure 2; they are considerably similar, with several major absorption peaks identified in the regions of 3000-2800 cm⁻¹, 1700-1600 cm⁻¹, and 1500-900 cm⁻¹. These characteristic peaks are likewise reported by  with some discrepancies; the peak at 2954 cm⁻¹ is shifted to 2922 cm⁻¹ and that at 914 cm⁻¹ is inconsistently detected.
Based on Fisher Weights, five peaks at 3006 cm⁻¹, 2852 cm⁻¹, 1117 cm⁻¹, 1236 cm⁻¹, and 1159 cm⁻¹ were identified as the variables with the most significant discriminatory ability, agreeing with . These peaks were reported to reduce in intensity with increasing concentration of lard; nevertheless, this observation is not entirely evidenced in the present study. The peak at 3006 cm⁻¹ was seen to increase with lard concentration, opposing the findings of . For the other marker bands, an inverse relationship between the peak intensity and the concentration of lard is demonstrated, as reported. Figure 3 illustrates the spectral regions of the five variables with the most significant discriminating ability.
The peak at 3006 cm⁻¹ is attributed to the stretching of the cis C=CH bond in unsaturated fatty acids, whereby the more abundant the bond is, the higher the peak intensity. As stated on the label of the palm olein oil used in this study, the product contains 43% saturated, 43% monounsaturated, and 14% polyunsaturated fats. In comparison, lard contains 48% mono- and 11% polyunsaturated fats, as reported by , and is therefore expected to be richer in cis C=CH bonds. This offers an explanation for the positive correlation between the peak intensity and lard concentration. The peak at 2852 cm⁻¹ is characteristic of C-H stretching, where the intensity is governed by the abundance of long-chain saturated fatty acids. Typically, lard contains higher amounts of stearic acid (18:0); nevertheless, its total saturated fatty acid content (42%) is lower than that of palm olein oil (45.8%), supporting the reduced intensity at 2852 cm⁻¹ as the lard concentration increases. The peak at 1117 cm⁻¹, on the contrary, is attributed to out-of-plane CH bending; according to , a higher abundance of oleic acyl groups (18:1) in oil would result in a reduction in the peak intensity. Lard typically contains 42% oleic acid whilst palm olein oil comprises 38%; this suggests the inverse relationship between the peak intensity and lard concentration. The other peaks at 1236 cm⁻¹ and 1159 cm⁻¹ are linked to the stretching of the C-O group in esters. According to , the fingerprint region at 1500-1000 cm⁻¹ is the most suitable for discrimination of pure oil from the admixture of lard.
Two-thirds of the 80 spectra were randomly assigned as the training samples (n = 52) to develop the calibration model whilst the remaining 28 samples were used to test the model. Note that, for the training set, each level of concentration has an equal number of samples. A total of 100 training and test sets were used to ensure the model is consistent and reliable for prediction. These 100 training/test sets were subjected to SLR, MLR, and PLSR according to the spectral regions of 3006 cm⁻¹, 2852 cm⁻¹, 1117 cm⁻¹, 1236 cm⁻¹, and 1159 cm⁻¹.
Table 2 summarizes the %RMSE of prediction according to spectral regions and training/test sets using the various regression models. Evidently, the spectral regions with better predictive ability are those at 1130-1100 cm⁻¹ and 3020-2990 cm⁻¹, where the peak maximum is recorded at 1117 and 3006 cm⁻¹, respectively. This is demonstrated by PLSR and MLR, with the former outperforming the latter, whilst SLR exhibits exceptionally poor prediction across all regions and presumably has no predictive ability. According to PLSR, the %RMSE for the regions at 1159, 1236, and 2852 cm⁻¹ increases in that order, indicative of diminishing predictive ability. An extensive review on infrared spectroscopic techniques for adulteration of food lipids corroborated the aforementioned effective regions at 3020-2990 cm⁻¹ and 1130-1100 cm⁻¹ for prediction of lard [1, 13, 27-31].
Among the three regression models, PLSR demonstrates more reliable and consistent prediction; this approach has been widely used for prediction of adulterants, exhibiting superior accuracy over other strategies such as principal component regression, ordinary least squares, and ridge regression [32, 33]. MLR is a linear approach that models the relationship between a dependent variable and more than one explanatory (independent) variable. This approach falls short when the number of independent variables exceeds the number of samples, as with spectral data, and when the variables are not independent. Besides, if the variables are characterized by profound noise, the prediction may be very susceptible to changes. SLR, on the other hand, is very sensitive to outliers and tends to be overfitted. Figure 4 illustrates the predicted concentration versus the expected concentration of test samples based on the three different models (SLR, MLR, and PLSR) with specific reference to the spectral regions of 3006 and 1117 cm⁻¹.
4. Conclusion
In this paper, we compared three different regression models (SLR, MLR, and PLSR) for prediction of lard in palm olein oil. The marker bands for differentiation of lard and palm olein oil were identified at 3006 cm⁻¹, 2852 cm⁻¹, 1117 cm⁻¹, 1236 cm⁻¹, and 1159 cm⁻¹. The regions with promising predictive ability were confirmed at 3006 and 1117 cm⁻¹, with PLSR demonstrating better accuracy.
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
The authors would like to thank Universiti Malaysia Sarawak for funding this study under the budget of the Faculty of Resource Science and Technology.
References
[1] A. Rohman, Y. B. Che Man, P. Hashim, and A. Ismail, "FTIR spectroscopy combined with chemometrics for analysis of lard adulteration in some vegetable oils," CyTA-Journal of Food, vol. 9, no. 2, pp. 95-101, 2011.
[2] Focus Taiwan, Ting Hsin Former Chairman Jailed in Oil Scandal, The Central News Agency, Taiwan, 2017, http://focustarwan.tw/news/asoc/201707280011.aspx.
[3] The New York Times, Taiwan's 'Gutter Oil' Scandal, The New York Times, NY, USA, 2014, https://www.nytimes.com/2014/09/19/opinion/taiwans-gutter-oil-scandal.html.
[4] Consumer Voice, Loose Edible Oils: Adulteration in Most Samples From 14 states, 2016, http://www.consumer-voice.org/loose_edible_cooking_oil.
[5] A. Rohman, T. Kuwat, S. Retno, Sismindari, E. Yuny, and W. Tridjoko, "Fourier Transform Infrared Spectroscopy applied for rapid analysis of lard in palm oil," International Food Research Journal, vol. 19, no. 3, pp. 1161-1165, 2012.
[6] The Island, Customs Wins Case Against Adulterated Palm Oil Imports, Upali Newspapers, Colombo, Sri Lanka, 2017, http://www.island.lk/index.php?page_cat=articledetailsandpage=article-detailsandcode_title=160763.
[7] J. Gromadzka and W. Wardencki, "Trends in edible vegetable oils analysis. Part B. Application of different analytical techniques," Polish Journal of Food and Nutritional Science, vol. 61, no. 2, pp. 89-99, 2011.
[8] B. C. Smith, Fundamentals of Fourier Transform Infrared spectroscopy, Taylor and Francis Group, Boca Raton, FL, USA, 2nd edition, 2011.
[9] D. Karthik, K. VijayaRekha, and K. Manjula, "Multivariate analysis for detecting adulteration in edible oil: a review," in Proceedings of International Conference on Advances in Engineering, Science and Management IEEE, New York, NY, USA, March 2012.
[10] P. Dais, "Nuclear magnetic resonance: methodologies and applications," in Handbook of Olive Oil: Analysis and Properties, R. Apraricio and J. Harwood, Eds., Springer, New York, NY, USA, 2013.
[11] E. D. Stein, M. C. Martinez, S. Stiles, P. E. Miller, and E. V. Zakharov, "Is DNA barcoding actually cheaper and faster than traditional morphological methods: results from a survey of freshwater bio assessment efforts in the United States?," PLoS ONE, vol. 9, no. 4, Article ID e95525, 2014.
[12] M. A. Ahmed, A. A. Abou-Arab, and K. Hayat, "Identification of lard in vegetable oil binary mixtures and commercial food products by FTIR," Quality Assurance and Safety of Crops and Foods, vol. 9, no. 1, pp. 11-22, 2017.
[13] Y. B. Che Man, Z. A. Syahariza, M. E. S. Mirghani, S. Jinap, and J. Bakar, "Detection of lard adulteration in cake formulation by Fourier transform infrared (FTIR) spectroscopy," Food Chemistry, vol. 92, no. 2, pp. 365-371, 2005.
[14] A. Rohman, D. L. Setyaningrum, and S. Riyanto, "FTIR spectroscopy combined with partial least squares for analysis of red fruit oil in ternary mixture system," International Journal of Spectroscopy, vol. 2014, Article ID 785914, 5 pages, 2014.
[15] L. Cuibus, R. Maggio, V. Muresan, Z. Diaconeasa, O. L. Pop, and C. Sociaciu, "Preliminary discrimination of butter adulteration by ATR-FTIR spectroscopy," Bulletin UASVM Food Science and Technology, vol. 72, no. 1, pp. 70-76, 2015.
[16] B. F. Ozen and L. J. Mauer, "Detection of hazelnut oil adulteration using FT-IR spectroscopy," Journal of Agricultural and Food Chemistry, vol. 50, no. 14, pp. 3898-3901, 2002.
[17] N. Quinones-Islas, O. G. Meza-Marquez, G. Osorio-Revilla, and T. Gallardo-Velazquez, "Detection of adulterants in avocado oil by Mid-FTIR spectroscopy and multivariate analysis," Food Research International, vol. 51, no. 1, pp. 148-154, 2013.
[18] D. Waskitho, E. Lukitaningsih, Sudaji, and A. Rohman, "Analysis of lard in lipstick formulation using FTIR spectroscopy and multivariate calibration: a comparison of three extraction methods," Journal of Oleo Science, vol. 65, no. 10, pp. 815-824, 2016.
[19] S. F. Sim and W. Ting, "An automated approach for analysis of Fourier Transform Infrared (FTIR) spectra of edible oils," Talanta, vol. 88, pp. 537-543, 2012.
[20] Z. Huo, S. Tang, Y. Park, and G. Tseng, "P-value evaluation, variability index and biomarker categorization for adaptively weighted Fisher's meta-analysis method in omics applications," Annals of Applied Statistics, 2017.
[21] R. G. Brereton, Chemometrics: Data Analysis for the Laboratory and Chemical Plant, John Wiley & Sons, Chichester, UK, 2003.
[22] Suparman, W. S. Rahayu, E. Sundhani, and S. D. Saputri, "The use of Fourier Transform Infrared Spectroscopy (FTIR) and gas chromatography mass spectroscopy (GCMS) for halal authentication in imported chocolate with various variants," Journal of Food and Pharmaceutical Sciences, vol. 2, pp. 6-11, 2015.
[23] A. Rohman and Y. B. Che Man, "Application of FTIR spectroscopy for monitoring the stabilities of selected vegetable oils during thermal oxidation," International Journal of Food Properties, vol. 16, no. 7, pp. 1594-1603, 2013.
[24] D. Ami, P. Mereghetti, A. Natalello et al., "FTIR spectral signatures of mouse antral oocytes: molecular markers of oocyte maturation and developmental competence," Biochimica et Biophysica Acta (BBA)-Molecular Cell Research, vol. 1813, no. 6, pp. 1220-1229, 2011.
[25] Y. B. Che Man, A. M. Marina, A. Rohman, H. A. Al-Kahtani, and O. Norazura, "A Fourier Transform Infrared Spectroscopy method for analysis of palm oil adulterated with lard in pre-dried French fries," International Journal of Food Properties, vol. 17, pp. 354-362, 2014.
[26] J. M. N. Marikkar, M. E. S. Mirghani, and I. Jaswir, "Application of chromatographic and infrared spectroscopic techniques for detection of adulteration in food lipids: a review," Journal of Food Chemistry and Nanotechnology, vol. 2, no. 1, pp. 32-41, 2016.
[27] Y. B. Che Man, Z. A. Syahariza, and A. Rohman, "Discriminant analysis of selected edible fats and oils and those in biscuit formulation using (FTIR) spectroscopy," Food Analytical Method, vol. 4, no. 3, pp. 404-409, 2011.
[28] A. Dominguez-vidal, J. P. Rosa, L. C. Cuadros-Rodriguez, and M. J. A. Canda, "Authentication of canned fish packing oils by means of Fourier transform infrared spectroscopy," Food Chemistry, vol. 190, pp. 122-127, 2016.
[29] Y. W. Lai, E. K. Kemsley, and R. H. Wilson, "Quantitative Analysis of potential adulterants of extra virgin olive oil using infrared spectroscopy," Food Chemistry, vol. 53, no. 1, pp. 95-98, 1995.
[30] A. Rohman and Y. B. Che Man, "Analysis of cod-liver oil adulteration using fourier transform infrared (FTIR) spectroscopy," Journal of American Oil Chemistry Society, vol. 86, pp. 1149-1153, 2009.
[31] M. Vasconcelos, L. Coelho, A. Barros, and J. M. M. Martins de Almeida, "Study of adulteration of extra virgin olive oil with peanut oil using FTIR spectroscopy and chemometrics," Cogent Food and Agriculture, vol. 1, pp. 1-13, 2015.
[32] S. Mahesh, D. S. Jayas, J. Paliwal, and N. D. G. White, "Comparison of partial least squares regress (PLSR) and Principal Components Regression (PCR) methods for protein and hardness predictions using the near infrared (NIR) hyperspectral images of bulk samples of Canadian wheat," Food and Bioprocess Technology, vol. 8, no. 1, pp. 31-40, 2015.
[33] Y. Ozgur and A. Goktas, "A comparison of partial least squares regression with other prediction method," Hacettepe Journal of Mathematics and Statistics, vol. 31, pp. 99-111, 2002.
[34] B. Wise, Properties of Partial Least Squares (PLS) Regression and Differences between Algorithms, Eigenvector Research, Inc, Manson, WA, USA, 2017, http://www.eigenvector.com/Docs/Wise_pls_properties.pdf.
Siong Fong Sim, Min Xuan Laura Chai, and Amelia Laccy Jeffrey Kimura
Faculty of Resource Science and Technology, Universiti Malaysia Sarawak, 94300 Kota Samarahan, Malaysia
Correspondence should be addressed to Siong Fong Sim; email@example.com
Received 20 July 2018; Accepted 18 October 2018; Published 8 November 2018
Academic Editor: Beatriz P. P. Oliveira
Caption: Figure 1: Flow chart of the training/test sample splitting for regression analysis.
Caption: Figure 2: The infrared spectra profile of pure and adulterated oil.
Caption: Figure 3: Spectral region of five variables with the most significant discriminatory ability.
Caption: Figure 4: Predicted concentration versus the expected concentration of test samples based on three different models (SLR, MLR, and PLSR) with specific reference to the spectral regions of 3006 and 1117 cm⁻¹.
Table 1: The advantages and disadvantages of some common analytical methods for adulterant analysis in edible oils.

Method | Advantage | Disadvantage
HPLC-MS/GC-MS | (a) Easily identifies specific components in sample; low detection limit | (a) Tedious sample preparation for specific compound groups (e.g., extraction); (b) high instrument cost; long analysis time
FTIR | (b) Can analyse most samples; relatively inexpensive; fast and simple analysis | (b) Cannot analyse molecules that do not vibrate; difficult for differentiation of compositions in a mixture
NMR spectroscopy | (a) Simultaneous analysis of components; minimal or no sample preparation | (d) Low sensitivity; (b) high instrument cost
Calorimetric method | (a) Sensitive to presence of other compounds; easy and inexpensive | (a) Some analyses are time-consuming; destructive sampling
DNA barcoding | (e) Reliable identification as DNA is not affected by external factors | (e) Requires high-quality DNA for sequencing and comparison with known data; long analysis time and high cost

(a) , (b) , (c) , (d) , and (e) .

Table 2: %RMSE of prediction according to spectral regions and training/test sets using various regression models.

Spectral region (peak max) in wavenumber (cm⁻¹) | PLS training | PLS test | SLR training | SLR test | MLR training | MLR test
1130-1100 (1117) | 12.19 | 13.26 | 167.69 | 174.86 | 12.05 | 14.84
1190-1130 (1159) | 18.28 | 20.09 | 225.32 | 248.25 | 839.59 | 863.52
1260-1210 (1236) | 30.52 | 33.91 | 113.88 | 119.19 | 39.75 | 46.43
2870-2820 (2852) | 43.28 | 51.05 | 110.33 | 119.28 | 117.96 | 128.69
3020-2990 (3006) | 15.71 | 16.03 | 126.63 | 148.74 | 12.51 | 21.07
Title Annotation: Research Article
Author: Sim, Siong Fong; Chai, Min Xuan Laura; Kimura, Amelia Laccy Jeffrey
Publication: Journal of Chemistry
Date: Jan 1, 2018