Developing the Australian mid-infrared spectroscopic database using data from the Australian Soil Resource Information System.
Consistent and reliable information on soil attributes is needed to understand, manage and conserve the soil. It is needed for environmental, hydrological and crop modelling, to study the effects of climate change on the soil and to improve our ability to produce enough food to support growing populations under a changing climate. Soil visible and infrared spectroscopy provides one source for such information and the mid infrared (mid-IR) is the range that we report on here. The Australian National Soil Archive is a collection of over 51 000 soil specimens from 9500 sites across Australia. The NatSoil database is an accompanying repository of associated data, which includes analytical data on soil physical and chemical attributes. The Archive's data can be accessed via the Australian Soil Resource Information System (ASRIS) (Johnston et al. 2003), where site reports containing the available soil morphological, chemical and physical data can be viewed or downloaded.
Viscarra Rossel et al. (2008) demonstrated that the bias and precision of spectroscopic calibrations could be related to the precision of the laboratory method. Because the laboratory data in the NatSoil database span several decades and come from different laboratories around the country, many of which no longer exist, it is impossible to assemble information on the precision and accuracy of the analytical data contained in the database. It is also difficult to compile such data from the literature because the laboratory implementation of an otherwise 'universal' method is often country specific. In Australia, prior to the harmonisation of analytical methods in the Australian Collaborative Land Evaluation Program (ACLEP) and the publication of a national handbook of standard methods (Rayment and Higginson 1992), the procedure even for a straightforward test such as the determination of pH in a 1:5 soil-water suspension varied between institutions and laboratories as to whether the suspension should be settled or stirred. Methods in routine use can also be many decades old; consequently, data published in the literature when the tests were developed are no longer available or are lost in institutional reports. Further, the precision and accuracy of a test as employed in a routine analytical laboratory, with tests spanning many years and many soil samples, are likely to differ from those reported for the more limited suite of soils used in the original method development.
In Australia, data on the analytical method and laboratory performance are available through the Australian Soil and Plant Analysis Council's (ASPAC) inter-laboratory proficiency program (ILPP) for soil chemical tests. Laboratories participating in this program use standardised laboratory methods (Rayment and Higginson 1992). These methods and accompanying codes are the same as those used to identify the methods attached to the parameter values in the NatSoil database.
The aims of this work are to describe the initial development of the Australian mid infrared (mid-IR) spectroscopic database using historical soil data from the Australian Soil Resource Information System for prediction of physical and chemical properties of the soil, and to compare the precision and accuracy of the spectroscopic models with that of laboratory methods from data used in national proficiency testing.
Materials and methods
Selection of soil samples
The soil samples used to start the development of the mid-IR spectroscopic database and models were sourced from the Australian National Soil Archive. We selected a diverse set of archived soil profiles for our analysis on the basis of their visible-near infrared (vis-NIR) spectra and their locations. Viscarra Rossel and Webster (2011, 2012) described the development of the vis-NIR spectroscopic database. The selection was made by conditioned Latin hypercube sampling (Minasny and McBratney 2006) of the principal component analysis (PCA) scores of the vis-NIR spectra and the spatial coordinates of the samples. The approach is similar to that used by Viscarra Rossel et al. (2008). In this way, we selected 639 profiles made up of 4321 samples as a starting point in the development of the Australian mid-IR spectroscopic database. Their locations are shown in Fig. 1.
Soil samples without soil attribute information
We sourced soil samples collected for the National Geochemical Survey of Australia (NGSA) (de Caritat et al. 2008), which hold no information on soil attributes, and measured their mid-IR spectra. The samples were collected from across Australia after a first dissection of the continent into drainage catchments and then selection of sampling sites at low points in the catchments but well above the water table in the lowest positions. At each site, samples were collected and bulked to produce two specimens from within two depth layers, 0-10 cm and 60-80 cm (de Caritat et al. 2008). We analysed 1204 samples from the 0-10 cm layer and 1271 samples from the 60-80 cm layer. The locations of the NGSA samples that we used are shown in Fig. 1.
Mid-IR sample processing and spectroscopic measurements
The soil samples from the archive and the NGSA were subsampled and ground by hand in an agate mortar and pestle to pass through a 0.5 mm sieve, following the recommendations of Le Guillou et al. (2015, this issue). Details of the sample grinding protocol can be found in Appendix 1. Samples were loaded in quadruplicate onto modified 48-well sample plates (Fig. 2a) and their diffuse reflectance spectra were recorded using the Bruker FT-IR Vertex70 spectrometer with a high throughput screening (HTS) attachment and robotic automation to load and unload sample plates (Bruker, Germany) (Fig. 2b). The instrument is fitted with a mercury cadmium telluride (MCT) detector that is cooled with liquid N2 to improve the signal-to-noise ratio.
The instrument has a spectral range of 7500-600 cm$^{-1}$. We used a resolution of 4 cm$^{-1}$ and we recorded spectra using 64 measurements in 32 s, which are averaged to produce a spectrum for a particular soil sample. A background measurement using a gold standard was performed every 44 measurements to reduce errors due to instrument drift. For the analysis, we transformed the reflectance ($R$) spectra to apparent absorbance ($A = \log(1/R)$) and only used spectra in the range 4000-600 cm$^{-1}$ (2500-16 700 nm).
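The transformation and range restriction described above can be sketched in a few lines of Python. This is a minimal illustration, not the OPUS implementation: it assumes reflectance is stored as a fraction in (0, 1], that the logarithm is base 10 (the text does not state the base), and all function names are hypothetical.

```python
import math

def reflectance_to_absorbance(reflectance):
    """Apparent absorbance A = log10(1/R) for each reflectance value R in (0, 1]."""
    return [math.log10(1.0 / r) for r in reflectance]

def wavenumber_to_wavelength_nm(wavenumber_cm):
    """Convert a wavenumber in cm^-1 to a wavelength in nm (1 cm = 1e7 nm)."""
    return 1.0e7 / wavenumber_cm

def clip_to_range(wavenumbers, values, low=600.0, high=4000.0):
    """Keep only the (wavenumber, value) pairs with low <= wavenumber <= high."""
    return [(w, v) for w, v in zip(wavenumbers, values) if low <= w <= high]
```

With these definitions, 4000 cm$^{-1}$ corresponds to 2500 nm and 600 cm$^{-1}$ to about 16 667 nm, which the text rounds to 16 700 nm.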
Development of the mid-IR spectroscopic database
We used the workflow shown in Fig. 3 to commence the development of the mid-IR spectroscopic database from historical soil samples in the National Soil Archive and corresponding attribute data in the NatSoil database.
Data retrieval from the ASRIS NatSoil database
For each measured sample, we retrieved from the NatSoil database all available data on the soil chemical and physical attributes. Where multiple laboratory methods existed for an attribute, the method was used to initially determine whether data could be combined. For example, the determination of exchangeable cations has methods that use 1 mol L$^{-1}$ ammonium chloride or acetate as extractant at pH 7. In this instance it is likely that data could be combined, whereas determinations of organic carbon that use an incomplete chemical oxidation in the Walkley-Black method (Walkley and Black 1934) should not be combined with determinations that use combustion in a furnace.
The QUANT 2 module in the Bruker OPUS software v7.2 (Bruker, Germany) was used to construct the partial least squares regressions (PLSR) for each soil attribute and each quadruplicate sample. We used QUANT 2 because the resulting spectroscopic models can then easily be used to make online predictions of unknowns. QUANT 2 performs the PLSR on each of the replicates and provides, for each unknown sample, the average prediction and its standard deviation.
For each attribute, we found the best pre-processing of the spectra using the 'optimise' function in QUANT 2. This tests the following spectral pre-processing algorithms: linear offset subtraction, min-max normalisation, multiplicative scatter correction (MSC), first and second derivative, individually and in binary combinations to find the best pre-processing for the spectra. Selection of the best preprocessing is based on the root mean square error of cross validation (RMSECV) and the number of PLSR factors in the model (Bruker, 2011).
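To illustrate three of the pre-processing options listed above, the following minimal Python sketch implements min-max normalisation, a first derivative and MSC on spectra held as plain lists. It is not the OPUS implementation: the derivative here is a simple first difference without smoothing, the MSC reference is assumed to be the mean spectrum, and all function names are hypothetical.

```python
def min_max_normalise(spectrum):
    """Rescale a spectrum linearly so that its minimum is 0 and its maximum is 1."""
    lo, hi = min(spectrum), max(spectrum)
    span = hi - lo
    if span == 0:
        return [0.0] * len(spectrum)
    return [(v - lo) / span for v in spectrum]

def first_derivative(spectrum):
    """First difference between adjacent points (no smoothing; an assumption)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def multiplicative_scatter_correction(spectra):
    """MSC: regress each spectrum on the mean spectrum, then remove the fitted
    intercept and divide by the fitted slope. Assumes a non-constant mean
    spectrum and non-zero slopes."""
    n_points = len(spectra[0])
    ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_points)]
    ref_mean = sum(ref) / n_points
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_points
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected
```

Two spectra that differ only by an additive offset and a multiplicative scale collapse onto the same corrected spectrum under MSC, which is the behaviour that makes it useful for removing scatter effects before PLSR.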
For the spectroscopic modelling, outliers can occur in the soil attributes and/or in the spectra used for the modelling. Similarly, there can be outliers in the spectra of the unknown samples and in the predictions of the soil attributes. In the QUANT 2 software, outliers in the modelling are detected using F-ratios (F) of the soil attribute values or F of the spectra. Spectral outliers, in both the calibration and unknown samples, that fall outside of the spectral space of the calibration samples, were identified using the Mahalanobis distance (Maesschalck et al. 2000) and F is computed by (Bruker 2011; Haaland and Thomas 1988):
$$F_i = \frac{(M - 1)\,\varepsilon_i^2}{\sum_{j \neq i} \varepsilon_j^2} \qquad (1)$$
where $M$ is the number of calibration samples, $M - 1$ are the degrees of freedom and $\varepsilon_i$ is the spectral residual for the $i$th sample in the calibration set, which is compared to the spectral residuals $\varepsilon_j$ of all other samples $j$. These residuals are calculated for each $i$th sample by (Bruker 2011):
$$\varepsilon_i = \sqrt{\sum (x_i - s_i)^2} \qquad (2)$$
where the sum runs over the spectral data points, $x_i$ is the spectrum after pre-processing and $s_i$ is the spectrum reconstructed from the PLSR vectors $p_r$ and the score coefficients $t_{i,r}$, such that:
$$s_i = \sum_r t_{i,r}\, p_r \qquad (3)$$
Spectra poorly represented by the PLSR vectors have a large F-ratio. Using the F-ratio and the $M - 1$ degrees of freedom, a probability ($F_{\mathrm{prob}}$) can be calculated (Bruker 2011):
$$F_{\mathrm{prob}}(F, 1, M - 1) > 0.99 \qquad (4)$$
$F_{\mathrm{prob}}$ indicates the probability that a spectrum in the calibration set is a spectral outlier. The limit for the automatic outlier detection is set in the software to 99%. If the $F_{\mathrm{prob}}$ value lies above the limit, the corresponding spectrum is indicated in the report.
For unknown samples, the spectral F-ratios are calculated similarly but using the final calibration of all M calibration samples. The F-ratios can be used as a guide in the calibration and prediction steps to flag possible outliers, which is especially important for routine analysis and for quality control applications where the individual spectral residuals may not be examined for each sample.
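The spectral residuals and F-ratios of Eqns 1-3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the QUANT 2 code: spectra, scores and PLSR vectors are plain Python lists, the function names are hypothetical, and the conversion of the F-ratio to a probability (Eqn 4) is omitted because it requires the cumulative F distribution.

```python
import math

def spectral_residual(x, scores, loadings):
    """Eqns 2-3: residual between a pre-processed spectrum x and its
    reconstruction s = sum_r t_r * p_r from the scores t_r and vectors p_r."""
    recon = [sum(t * p[k] for t, p in zip(scores, loadings)) for k in range(len(x))]
    return math.sqrt(sum((xk - sk) ** 2 for xk, sk in zip(x, recon)))

def f_ratios(residuals):
    """Eqn 1: F_i = (M - 1) * e_i^2 / sum over j != i of e_j^2.
    Assumes at least one other residual is non-zero for every sample."""
    m = len(residuals)
    squares = [e * e for e in residuals]
    total = sum(squares)
    return [(m - 1) * sq / (total - sq) for sq in squares]
```

When all residuals are equal, every F-ratio is 1; a sample whose residual is much larger than the rest receives a correspondingly large F-ratio and would be a candidate for flagging.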
Model validation and assessment
Models were validated using both leave-one-out cross validation and an independent test set. For this purpose, each soil attribute's dataset was split at random into a training set consisting of 70% of the observations and a test set made up of the remaining 30%. The spectroscopic models were derived using the training set, and the optimal number of PLSR factors was determined using cross validation. The selected models were then applied to the test set and statistics were calculated to assess their predictive performance. The QUANT 2 software provides several statistics for the assessment, including the coefficient of determination ($R^2$), the mean error to assess the bias of the predictions, the standard error of prediction (SEP) to assess their imprecision, and the root mean square error (RMSE) to assess their overall inaccuracy. The ratio of performance to deviation (RPD) is also given to enable comparisons between models. Models with RPD > 2 predict well, models with 1.5 < RPD < 2 predict fairly well, and models with RPD < 1.5 predict poorly. Our categories of RPD are similar to those suggested by Chang et al. (2001). Like Chang et al. (2001), we chose our categories for soil spectroscopy only from experience. RPD categories can be variable, and they should be evaluated in conjunction with the measures of bias and imprecision, or inaccuracy.
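The random split and the assessment statistics named above can be sketched in Python as follows. The formulas are common textbook definitions and are assumptions here, since the exact QUANT 2 definitions are not reproduced in the text: SEP is taken as the bias-corrected standard deviation of the errors, RPD as the standard deviation of the observed values divided by SEP, and the category boundaries at exactly 1.5 and 2 are assigned arbitrarily. All function names are hypothetical.

```python
import math
import random

def split_train_test(n_samples, train_fraction=0.7, seed=0):
    """Randomly split sample indices into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    cut = round(train_fraction * n_samples)
    return indices[:cut], indices[cut:]

def assessment_statistics(observed, predicted):
    """Return bias (mean error), SEP, RMSE, R2 and RPD for paired values."""
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    bias = sum(errors) / n
    sse = sum(e * e for e in errors)
    rmse = math.sqrt(sse / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    sd_obs = math.sqrt(ss_tot / (n - 1))
    r2 = 1.0 - sse / ss_tot
    rpd = sd_obs / sep if sep > 0 else math.inf
    return {"bias": bias, "sep": sep, "rmse": rmse, "r2": r2, "rpd": rpd}

def rpd_category(rpd):
    """Map an RPD value to the three categories used in the text."""
    if rpd > 2.0:
        return "predicts well"
    if rpd >= 1.5:
        return "predicts fairly well"
    return "predicts poorly"
```

Reporting bias, SEP and RMSE together with RPD follows the recommendation above that RPD categories be read alongside the measures of bias and imprecision rather than on their own.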
Accuracy of reference laboratory methods
An estimate of the precision and accuracy for each laboratory method was made using data from the 2007-08 ASPAC ILPP reported by Lyons et al. (2011). The mean robust coefficient of variation (CV) for each laboratory method was calculated by averaging the individual robust CVs for the six soil samples used in the 2007-08 round of the ILPP. We compared these robust CVs to the precision and bias obtained from the spectroscopic modelling.
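The averaging step can be sketched as below. The definition of the robust CV used here (100 x 0.7413 x interquartile range / median) is a commonly used one in proficiency testing and is an assumption; the text does not state how the ILPP report computes it. The function names are hypothetical.

```python
import statistics

def robust_cv(results):
    """Robust CV (%) of inter-laboratory results for one soil sample, assumed
    here to be 100 * 0.7413 * IQR / median."""
    q1, _, q3 = statistics.quantiles(results, n=4, method="inclusive")
    return 100.0 * 0.7413 * (q3 - q1) / statistics.median(results)

def mean_robust_cv(cvs_per_sample):
    """Average the robust CVs of the individual proficiency samples."""
    return statistics.fmean(cvs_per_sample)
```

For a given laboratory method, `mean_robust_cv` would be applied to the six per-sample robust CVs from the proficiency round.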
Spectroscopic predictions for unknown samples
The estimates for the unknown NGSA samples were made using the 'evaluate' function in the QUANT 2 module. The software provides batch processing of the unknown quadruplicate spectra and, for a particular sample and soil attribute, it provides the mean prediction and its standard deviation. As described above, the spectra of unknown samples that are outside the range of the calibration set are marked as outliers, and if the predicted soil attribute values are outside of the range of the soil attributes used in the calibrations, they too are marked as outliers.
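The summary of replicate predictions and the range check on predicted values can be sketched as follows. This is an illustration only: the use of the sample standard deviation (n - 1 denominator) is an assumption, the function names are hypothetical, and the spectral outlier test is not included.

```python
import statistics

def summarise_replicates(replicate_predictions):
    """Mean and sample standard deviation of the replicate predictions."""
    mean = statistics.fmean(replicate_predictions)
    sd = statistics.stdev(replicate_predictions)
    return mean, sd

def is_concentration_outlier(prediction, calibration_min, calibration_max):
    """True if a prediction lies outside the range of the calibration values."""
    return not (calibration_min <= prediction <= calibration_max)
```

A sample would be reported with its mean prediction and standard deviation, and flagged when the mean falls outside the minimum and maximum attribute values of the calibration set.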
Selection of datasets, methods from the NatSoil database
The soil profiles in the spectroscopic database represent 13 Australian Soil Classification (ASC) soil orders (Isbell 2002) and are from all states and territories except the Australian Capital Territory (ACT). The data we used from the NatSoil database originate from 11 different CSIRO and state agency soil survey laboratories. The descriptive statistics of the data are given in Table 1.
Model performance and laboratory method
The assessment statistics for the spectroscopic models and the calibration datasets used for the individual attributes are listed in Table 2, where the laboratory methods are indicated by the NatSoil method codes. Method details can be found in Rayment and Higginson (1992) or Rayment and Lyons (2011). Models for all but the ion exchange properties used calibration data from a single laboratory method or, in the case of soil pH measured in water, included values where the method was not recorded (NR). In the case of ion exchange properties there are different methods and variations of them, in which various exchangers and extractants are used, with and without the prior removal of soluble salts. For these attributes, we combined data for methods that use ammonium chloride or acetate at pH 7. These methods use the ammonium ion to replace the cations and then strip the ammonium with a 1 M salt solution, and they are functionally equivalent in the chemical processes of replacement and removal.
In contrast, the oxidation processes used to measure organic carbon are different. Estimation by the Walkley-Black method uses wet chemical oxidation with an empirically determined factor (Allison 1960) that is required if an estimate of the total organic carbon is desired. The factor is needed because the oxidation is known to be incomplete (for example, char carbon present as carbon black would not be oxidised), whereas organic carbon measured using a high-frequency induction furnace achieves complete oxidation by combustion. In this case the measurements are different and the datasets might need to be modelled separately. In the case of bicarbonate-extractable phosphorus, the Colwell and Olsen P methods use the same extractant but different soil-to-extractant ratios and extraction times. The two methods represented different populations and it was impossible to model them in combination. Only the Colwell bicarbonate P (Colwell 1963) was modelled, as we did not have many samples in our set with Olsen P data.
Models for 16 of the 21 soil attributes tested had values of $R^2$ > 0.7 and the remaining five had $R^2$ between 0.5 and 0.7 (Table 2). Using RPD as a measure of model performance, 15 of the 21 soil attribute models predicted the independent test set well, with RPD values > 2, while six predicted fairly well, with 1.5 < RPD < 2. The spectroscopic models for soil water measured at 15 bar and for exchangeable acidity had $R^2$ < 0.5 and RPD < 1.5, so we judged them not useful for quantitative predictions of unknown samples (data not shown).
Particle size distribution was predicted very well (Table 2). The models for sand and clay contents, which can be more directly attributed to absorptions of minerals in the mid-IR than can silt, produced $R^2$ > 0.85 and RPD values > 2.5 for the independent test set validation. The model for silt produced an $R^2$ of 0.81 and an RPD of 2.3. Soil water was not predicted as well as we expected (Table 2). The model for water measured at 0.1 bar produced $R^2$ values of around 0.7 and RPD values of 1.6 and 2.1 in the cross and test set validations (Table 2). The measurement at 15 bar (which approximates wilting point and is therefore agronomically useful) was not predicted well and so it is not presented. There were few data with which to develop these models, and some of the available data may have been derived using pedotransfer functions.
The soil organic carbon models, calibrated separately to the Walkley-Black and the furnace methods, predicted the test sets very well with $R^2$ > 0.85 and RPD > 2.5 (Table 2). As with the vis-NIR modelling (Viscarra Rossel and Webster 2012), combining these data and modelling them together gave slightly better validation statistics than the individual models, with an $R^2$ of 0.9 and an RPD of 3 for the test set validation. These findings do not conform with the expectation that the two organic carbon methodologies would produce significantly different populations, as is the case in the ILPP (Lyons et al. 2011).
The model for pH also predicted well, with an $R^2$ of 0.8 and an RPD of 2 (Table 2). The models to predict soil nutrient status (total nitrogen, phosphorus and potassium) performed reasonably well, with $R^2$ > 0.8 and RPD > 2. The model to predict available P (measured with the Colwell bicarbonate method) did not perform well; it produced an $R^2$ value of 0.57 and an RPD of 1.6 (Table 2). The models for soil exchange properties, which are affected by chemistry that can be directly related to mid-IR absorptions (e.g. carbonate and clay minerals), predicted well, with calcium, magnesium, cation exchange capacity, sum of exchangeable cations and the sum of bases having $R^2$ > 0.8 and RPD > 2. The models for exchangeable sodium and potassium and for base saturation predicted less well, with $R^2$ values around 0.6 and RPD values of 1.6-1.8 (Table 2).
Values for the ILPP robust CV (Table 3) were available for 10 attributes: pH$_{\mathrm{water}}$, organic C measured with the Walkley-Black method, organic C measured with the furnace method, total N, exchangeable Ca, Mg, K and Na, total P measured with an X-ray fluorescence (XRF) method, and bicarbonate P measured with the Colwell method.
For all attributes together, the ILPP robust CV correlated strongly with the cross-validation SEP (Pearson correlation coefficient, $\rho$ = 0.78), which measures the precision of the spectroscopic models (Fig. 4a). Once measurements of total P with XRF were removed, the ILPP robust CV also correlated well with the SEP of the test set predictions ($\rho$ = 0.64) (Fig. 4c). This suggests that spectroscopic models for total P measured with XRF do not generalise well and are likely to be imprecise for predicting unknowns.
The cross-validation bias for the predictions of organic C using the Walkley-Black method was much larger than for other attributes (Fig. 4b), suggesting that these spectroscopic models might produce somewhat biased results. Without the data for organic C measured using the Walkley-Black method, the correlation between the ILPP robust CV and the (absolute relative) bias of the spectroscopic models' cross validations increased ($\rho$ = 0.36, Fig. 4b). Interestingly, however, the correlation with the (absolute relative) bias of the test set predictions was stronger ($\rho$ = 0.45), and the absolute relative bias of the organic C data measured with the furnace method was larger than that measured using Walkley-Black (Fig. 4d).
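For reference, the Pearson correlation coefficient used in these comparisons can be computed as in the short Python sketch below. The function name is hypothetical and the inputs are assumed to be two equal-length lists, for example the ILPP robust CVs and the corresponding SEP values for each attribute.

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Because the coefficient is sensitive to single extreme points when only about 10 attributes are compared, removing one attribute (as done above for total P and for Walkley-Black organic C) can change it appreciably.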
Prediction of unknowns
We used those models that we deemed to be adequate for quantitative predictions ($R^2$ > 0.5 and RPD > 1.5) to make predictions on the 2475 unknown samples from the NGSA (1204 from the 0-10 cm layer and 1271 from the 60-80 cm layer). In Table 2 we show data on the number of concentration ($O_a$) and spectroscopic ($O_s$) outliers produced in the predictions of the unknown samples. By showing these outliers we provide an indication of how well the spectroscopic database represents the unknown spectra and, as such, the values indicate the predictability and usefulness of the spectroscopic soil attribute models. For predictions of the sum of bases and the C:N ratio, 5 and 8% of the spectra were outliers, respectively. For all other attributes there were <5% spectral outliers. Predictions of bicarbonate P, exchangeable Na, organic C (Walkley-Black) and total N produced 20, 17, 13 and 11% concentration outliers, respectively. All other predictions produced less than 10% concentration outliers (Table 2). Summary statistics of the predictions on the unknown samples are shown in Table 4.
The geographic coverage of the samples in the spectroscopic database and of the unknowns is shown in Fig. 1. While the unknown samples are more widely distributed across Australia, the small number of spectral outliers for the unknowns indicates that the database is adequate for predicting the soil properties of unknown samples from many regions across the continent.
Discussion and conclusions
Data in the NatSoil database have been sourced from many laboratories, eleven in our case, and cover several decades of sample collection and laboratory analysis. Many soil attributes were measured using various laboratory methods, with varying degrees of similarity in their conceptual chemical or physical basis. Much of the data from CSIRO laboratories was transferred from hard copy into a database prior to the publication of the current standard method codes by Rayment and Higginson (1992). These data were subsequently exported through two relational databases before ending up in the current Microsoft Access NatSoil database.
During the matching of data to spectra, we found that the NatSoil database contained data to which incorrect method codes had been allocated. One of us (the lead author here) was responsible for much of the original digitisation of paper records some 30 years ago, as well as for managing one of the laboratories that performed the analyses, and so we could correct most of the misallocations. Soil analytical and chemical knowledge is useful for the development of spectroscopic databases using historical soil data. For some attributes, the inaccuracies that we found, combined with the limited analytical data for many of the attributes, left significantly less useful data for the modelling (Table 2). For these reasons, we followed an iterative process for developing the spectral database (Fig. 3).
Accuracy and precision of analytical and spectroscopic methods
We showed that there is a correlation between the imprecision of the conventional laboratory method, measured by the robust CV from an ILPP (Lyons et al. 2011), and the imprecision of the spectroscopic models constructed using spatially and temporally diverse historical soil attribute data measured in different laboratories. Methods that perform poorly in ILPPs are generally those that have more inaccurate spectroscopic predictions (Fig. 4). The analysis of these properties requires the most rigorous control of analytical conditions and also more careful and judicious spectroscopic modelling. For example, to analyse bicarbonate-extractable P, the pH of the reagent has to be adjusted immediately before use, the temperature of the extraction controlled to ±1 °C, the energy of the shaking standardised and the extraction time controlled to within a few tens of minutes. Any error introduced in the conventional laboratory analysis will propagate through to the spectroscopic modelling, and if the correlations of the spectra to the particular soil attribute are mostly of second or higher order, as is the case for P, then it is unlikely that the spectroscopic models will predict accurately. The relative difficulties in maintaining standard conditions between and within laboratories are similar for the performance of the ILPP method and for the spectroscopic measurement and modelling.
Spectroscopic models used in routine operation
The assessment statistics for the spectroscopic modelling showed that we could derive useful models to predict the attributes of an independent set of soil samples using only their spectra (Table 2). Samples and data sourced from the Australian National Soil Archive and NatSoil database could be used to build accurate and useful models for a number of soil attributes, including total N, organic C and total P. The model for bicarbonate-extractable, plant-available P was the least accurate ($R^2$ = 0.57). Because there are often multiple methods and variations of methods recorded in the NatSoil database for what is notionally the same soil attribute, selection and winnowing of data require the application of soil chemistry knowledge to make better use of the data in the spectroscopic modelling. Both the precision and bias of the laboratory data and the integrity of the database are important for the spectroscopic modelling. The former directly influence the accuracy of a model and the latter can complicate the use of historical soil databases.
We thank the CSIRO, the Terrestrial Ecosystem Research Network (TERN), the Australian Government through the National Collaborative Research Infrastructure Strategy and the Super Science Initiative, for funding the project. We also thank Rebecca Edwards, Mark Glover and Gordon McLachlan for preparing the soil samples that we analysed. Peter Leppert, Marie Virueda, Gabrielle Navarrette, and Linda Karrises are thanked for helping to retrieve the soil samples from the National Soil Archive, and David Jacquier and Peter Wilson for their help with the NatSoil database.
Allison L (1960) Wet-combustion apparatus and procedure for organic and inorganic carbon in soil. Soil Science Society of America Proceedings 24, 36-40.
Bruker (2011) Opus spectroscopy software version 7, Quant user manual. Bruker Optik Ettlingen, Germany.
Chang CW, Laird DA, Mausbach MJ, Hurburgh CR (2001) Near-infrared reflectance spectroscopy-principal components regression analyses of soil properties. Soil Science Society of America Journal 65, 480-490. doi:10.2136/sssaj2001.652480x
Colwell J (1963) The estimation of phosphorus fertilizer requirements for wheat in southern New South Wales by soil analysis. Australian Journal of Experimental Agriculture and Animal Husbandry 3, 190-197.
de Caritat P, Lech ME, McPherson AA (2008) Geochemical mapping 'down under': selected results from pilot projects and strategy outline for the National Geochemical Survey of Australia. Geoscience Australia. www.ga.gov.au/metadata-gateway/metadata/record/71113/
Haaland DM, Thomas EV (1988) Partial least-squares methods for spectral analyses. 1. relation to other quantitative calibration methods and the extraction of qualitative information. Analytical Chemistry 60, 1193-1202.
Isbell R (2002) 'The Australian Soil Classification.' revised edn. (CSIRO Publishing: Melbourne)
IUPAC (2006a) 'Compendium of chemical terminology.' 2nd edn. (the "Gold Book"). (Compiled by AD McNaught and A Wilkinson) (Blackwell Scientific Publications: Oxford) (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M Nic, J Jirat, B Kosata; updates compiled by A Jenkins. doi:10.1351/goldbook.
IUPAC (2006b) 'Compendium of chemical terminology.' 2nd edn. (the "Gold Book"). (Compiled by AD McNaught and A Wilkinson) (Blackwell Scientific Publications: Oxford) (1997). XML on-line corrected version: http://goldbook.iupac.org (2006-) created by M Nic, J Jirat, B Kosata; updates compiled by A Jenkins. doi:10.1351/goldbook.
Johnston RM, Barry SJ, Bleys E, Bui EN, Moran CJ, Simon DAP, Carlile P, McKenzie NJ, Henderson BL, Chapman C, Imhoff M, Maschmedt D, Howe D, Grose C, Schoknecht N, Powell B, Grundy M (2003) ASRIS: the database. Soil Research 41, 1021-1036. doi:10.1071/SR02033
Le Guillou F, Wetterlind J, Viscarra Rossel RA, Hicks W, Grundy M, Tuomi S (2015) How does grinding affect the mid-infrared spectra of soil and their multivariate calibrations to texture and organic carbon? Soil Research 53, 913-921.
Lyons D, Rayment G, Hill R, Daly B, Marsh J, Ingram C (2011) ASPAC soil proficiency testing program report 2007-08. Tech. Report, ASPAC, Melbourne, Victoria. www.aspac-australasia.com/index.php/documents/upload-documents/doc_download/232-annual-review-soil-07-08
Maesschalck RD, Jouan-Rimbaud D, Massart D (2000) The Mahalanobis distance. Chemometrics and Intelligent Laboratory Systems 50, 1-18.
Minasny B, McBratney AB (2006) A conditioned Latin hypercube method for sampling in the presence of ancillary information. Computers & Geosciences 32, 1378-1388.
Rayment G, Higginson F (1992) Australian Laboratory Handbook of Soil and Water Chemical Methods. Australian Soil and Land Survey Handbooks Series. Inkata Press, Melbourne, Australia.
Rayment G, Lyons D (2011) 'Soil chemical methods-Australasia.' (CSIRO Publishing: Melbourne)
Viscarra Rossel R, Webster R (2011) Discrimination of Australian soil horizons and classes from their visible-near infrared spectra. European Journal of Soil Science 62, 637-647.
Viscarra Rossel RA, Jeon YS, Odeh IOA, McBratney AB (2008) Using a legacy soil sample to develop a mid-IR spectral library. Soil Research 46, 1-16. doi:10.1071/SR07099
Viscarra Rossel RA, Webster R (2012) Predicting soil properties from the Australian soil visible-near infrared spectroscopic database. European Journal of Soil Science 63, 848-860.
Walkley A, Black I (1934) An examination of the Degtjareff method for determining organic carbon in soils: Effect of variations in digestion conditions and of inorganic soil constituents. Soil Science 63, 251-263.
Appendix 1. Soil grinding protocol for diffuse reflectance infrared Fourier transform (DRIFT) spectroscopy
This soil sample preparation (grinding) protocol is designed to produce soil samples with a consistent particle size of 500 µm (0.5 mm) or finer for measurements with a diffuse reflectance infrared Fourier transform (DRIFT) mid-IR spectrometer. The effects of soil sample grinding on mid-IR spectra are reported by Le Guillou et al. (2015, this issue).
During transport and storage, soil samples can settle and their particles can become sorted in the container. To obtain a subsample for grinding that is representative of the bulk composition of the soil in the container, the soil in the container must be remixed (IUPAC 2006a). Mixing should be done by gently rotating the container in opposite directions along both of its axes as shown in Figure A1. There must be enough headspace to allow the soil to move and mix in the container, or the procedure will be ineffective. If this is not the case, consider transferring the entire contents of the container to a larger one before remixing, or use another technique to obtain the subsample, such as a riffle sample divider or coning and quartering (IUPAC 2006b).
Soil subsample grinding
1. Take approximately 4 g (or approximately half a teaspoonful) of soil from the sample container.
2. Place in the agate mortar and grind the subsample for 30 s in a circular motion by applying even pressure using only the agate pestle.
3. Pass the ground sample through a stainless steel sieve for the desired particle size. Based on recommendations by Le Guillou et al. (2015, this issue), we use the 500 µm sieve.
4. Pour the sample that does not go through the sieve back into the agate mortar and grind as in step 2 above. Do not force soil through the sieve because doing so will damage the sieve.
5. Repeat steps 2-4 until the entire subsample has been ground and passes through the sieve. Note that the entire subsample should be ground and passed through the sieve so as not to selectively exclude soil components. Take particular care with sandy samples as sand grains can fly out of the mortar.
6. Transfer the sieved soil subsample to your storage container using a small funnel to avoid sample loss.
7. Between samples, clean the mortar, pestle, teaspoon, funnel, and any other equipment used with a brush and/or a tissue with 50% ethanol-water to prevent cross contamination. Dry the equipment well before reuse.
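The control flow of steps 2-5 can be sketched in code. This is a toy illustration only, not part of the protocol: the particle diameters, the `grind` model (each 30 s grind is assumed to halve every particle diameter) and the function names are all hypothetical. The point it makes is that the oversize fraction is always returned to the mortar and never discarded, so the whole subsample ends up below the sieve aperture.

```python
def grind(particles_um, factor=0.5):
    """Hypothetical grinding model: one 30 s grind scales every diameter by `factor`."""
    return [d * factor for d in particles_um]


def sieve(particles_um, aperture_um=500.0):
    """Split particles into those that pass the sieve and those retained on it."""
    passed = [d for d in particles_um if d <= aperture_um]
    retained = [d for d in particles_um if d > aperture_um]
    return passed, retained


def prepare_subsample(particles_um, aperture_um=500.0, max_cycles=50):
    """Repeat grind (step 2) and sieve (step 3) until nothing is retained (steps 4-5)."""
    collected = []
    retained = list(particles_um)
    cycles = 0
    while retained:
        if cycles >= max_cycles:
            raise RuntimeError("subsample did not pass the sieve; check the grinding model")
        retained = grind(retained)                       # step 2: grind what is in the mortar
        passed, retained = sieve(retained, aperture_um)  # step 3: sieve the ground material
        collected.extend(passed)                         # step 4: oversize goes back to the mortar
        cycles += 1
    return collected, cycles


# Made-up starting diameters in micrometres, for illustration only.
sample, n_cycles = prepare_subsample([2000.0, 900.0, 400.0, 120.0])
print(len(sample), n_cycles, max(sample))
```

Because every particle that fails the sieve re-enters the loop, the number of particles collected equals the number started with, which mirrors the requirement in step 5 not to selectively exclude soil components.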
W. Hicks (A), R. A. Viscarra Rossel (A,B), and S. Tuomi (A)
(A) CSIRO Land & Water, PO Box 1666, Canberra, ACT 2601, Australia.
(B) Corresponding author. Email: firstname.lastname@example.org
Table 1. Statistical summary of the soil samples used in the experiments

| Soil attribute | Units | N | Mean | St. Dev. | Min. | Med. | Max. | Skew. |
|---|---|---|---|---|---|---|---|---|
| Total sand | % | 2411 | 53.5 | 25.9 | 2.0 | 55.0 | 100.0 | -0.1 |
| Silt | % | 2949 | 12.5 | 9.8 | 0.0 | 10.8 | 85.0 | 1.3 |
| Clay | % | 2957 | 32.9 | 20.5 | 0.0 | 32.0 | 93.0 | 0.3 |
| 0.1 Bar water | % | 151 | 0.3 | 0.1 | 0.1 | 0.3 | 0.5 | -0.4 |
| pH (water) | | 3880 | 6.8 | 1.5 | 2.9 | 6.5 | 10.3 | 0.3 |
| Organic C (Walkley-Black) | % | 498 | 0.8 | 1.4 | 0.0 | 0.4 | 20.9 | 7.0 |
| Total C (Furnace) | % | 935 | 1.3 | 3.0 | 0.0 | 0.5 | 37.1 | 7.4 |
| Total N | % | 695 | 0.1 | 0.2 | 0.0 | 0.1 | 1.8 | 3.8 |
| C:N | | 272 | 19.8 | 10.1 | 0.0 | 18.5 | 57.0 | 1.0 |
| Total P (XRF) | % | 530 | 0.0 | 0.1 | 0.0 | 0.0 | 0.5 | 2.8 |
| Bicarb P | mg kg⁻¹ | 472 | 14.6 | 20.7 | 1.0 | 7.4 | 227.7 | 3.8 |
| Exch. Ca | cmol_c kg⁻¹ | 1759 | 10.1 | 12.2 | 0 | 5.7 | 167 | 3.48 |
| Exch. Mg | cmol_c kg⁻¹ | 1759 | 6.91 | 7.28 | 0.01 | 4.3 | 49 | 1.53 |
| Exch. Na | cmol_c kg⁻¹ | 1730 | 2.27 | 3.61 | 0.01 | 0.485 | 37 | 2.44 |
| Exch. K | cmol_c kg⁻¹ | 1047 | 0.574 | 0.569 | 0.01 | 0.4 | 3.63 | 1.55 |
| CEC | cmol_c kg⁻¹ | 658 | 13.3 | 16.4 | 0.1 | 6 | 96.7 | 1.95 |
| ECEC | cmol_c kg⁻¹ | 181 | 11 | 13 | 0.81 | 4.72 | 70.23 | 1.98 |
| Sum bases | cmol_c kg⁻¹ | 181 | 6.59 | 8.8 | 0.19 | 2.19 | 35.1 | 1.75 |
| Base saturation | % | 237 | 0.674 | 0.31 | 0.01 | 0.74 | 1.18 | -0.465 |
| Total Fe | % | 317 | 4.9 | 5.85 | 0.08 | 2.74 | 64 | 3.96 |
| Total K | % | 779 | 0.668 | 0.86 | 0.004 | 0.32 | 5.28 | 2.12 |

Table 2. Assessment statistics for the mid-IR models used to predict soil attributes

Units for the soil attributes are given in Table 1. A method code of 'NR' indicates the exact method was unavailable and therefore not recorded in the database. M is the number of samples used in the modelling, O_{N-M} is the number of outliers and NF the number of PLSR factors used. The assessment statistics are reported for both the training (T) and testing (V) of the models. They are the coefficient of determination R², the root mean squared error (RMSE), the mean error (ME), the standard deviation of the error (SDE) and the ratio of performance to deviation (RPD). For the unknown prediction samples (P), we show the soil attribute concentration outliers (O_a) and spectral outliers (O_s).

| Soil attribute | Method code | M | O_{N-M} | NF | T (70%) R² | T RMSE | T SDE | T ME | T RPD |
|---|---|---|---|---|---|---|---|---|---|
| Sand | | 2376 | 35 | 10 | 0.90 | 7.77 | 7.77 | -0.08 | 2.95 |
| Silt | P10 | 2848 | 101 | 10 | 0.78 | 4.33 | 4.33 | 0.07 | 2.11 |
| Clay | | 2898 | 59 | 10 | 0.86 | 7.59 | 7.59 | -0.02 | 2.66 |
| 0.1 Bar water | P3B6VL | 149 | 2 | 6 | 0.71 | 0.05 | 0.05 | 0.01 | 1.59 |
| pH (water) | 4A1, N | 3768 | 112 | 10 | 0.81 | 0.62 | 0.62 | 0.06 | 2.32 |
| Organic C | 6A1 | 494 | 4 | 9 | 0.85 | 0.41 | 0.34 | -0.23 | 2.33 |
| Total organic C | 6B2, 3 | 897 | 38 | 8 | 0.86 | 0.44 | 0.44 | 0.01 | 2.66 |
| Total N | 7A1 | 679 | 16 | 7 | 0.83 | 0.06 | 0.06 | 0.00 | 2.43 |
| C:N | 8 | 267 | 5 | 6 | 0.68 | 5.40 | 5.40 | -0.02 | 1.76 |
| Total P | 9A1 | 404 | 126 | 3 | 0.79 | 0.01 | 0.01 | 0.00 | 2.18 |
| Bicarb P | 9B | 444 | 28 | 7 | 0.57 | 8.03 | 7.99 | -0.85 | 1.52 |
| Exch. Ca | | 1747 | 12 | 10 | 0.86 | 3.98 | 3.98 | -0.02 | 2.67 |
| Exch. Mg | | 1031 | 728 | 10 | 0.85 | 2.37 | 2.36 | -0.17 | 2.60 |
| Exch. Na | | 1651 | 79 | 8 | 0.65 | 1.63 | 1.63 | -0.05 | 1.70 |
| Exch. K | | 967 | 80 | 10 | 0.59 | 0.29 | 0.28 | -0.02 | 1.56 |
| CEC | 15ABCD, NR | 638 | 20 | 8 | 0.83 | 6.12 | 6.10 | -0.49 | 2.42 |
| ECEC | | 172 | 9 | 9 | 0.82 | 4.11 | 4.11 | -0.11 | 2.16 |
| Sum bases | | 177 | 4 | 10 | 0.92 | 2.08 | 2.02 | 0.49 | 3.49 |
| Base saturation | | 232 | 5 | 9 | 0.59 | 0.20 | 0.20 | -0.01 | 1.53 |
| Total Fe | 12A | 301 | 16 | 6 | 0.81 | 1.83 | 1.83 | -0.05 | 2.31 |
| Total K | 17A | 768 | 11 | 9 | 0.78 | 0.37 | 0.37 | -0.02 | 2.15 |

| Soil attribute | V (30%) R² | V RMSE | V SDE | V ME | V RPD | P (n = 2475) O_a | P O_s |
|---|---|---|---|---|---|---|---|
| Sand | 0.90 | 7.84 | 7.84 | 0.07 | 3.20 | 13 | 8 |
| Silt | 0.81 | 4.37 | 4.37 | 0.17 | 2.30 | 92 | 15 |
| Clay | 0.87 | 7.04 | 6.99 | -0.82 | 2.80 | 31 | 12 |
| 0.1 Bar water | 0.77 | 0.05 | 0.05 | 0.00 | 2.10 | 144 | 33 |
| pH (water) | 0.80 | 0.61 | 0.59 | 0.17 | 2.20 | 0 | 11 |
| Organic C | 0.88 | 0.40 | 0.30 | -0.26 | 2.80 | 157 | 2 |
| Total organic C | 0.85 | 0.53 | 0.44 | 0.29 | 2.60 | 312 | 0 |
| Total N | 0.86 | 0.07 | 0.05 | -0.05 | 2.70 | 260 | 26 |
| C:N | 0.62 | 6.36 | 5.76 | 2.70 | 1.70 | 102 | 209 |
| Total P | 0.80 | 0.01 | 0.01 | 0.00 | 2.30 | 2 | 11 |
| Bicarb P | 0.57 | 8.52 | 6.12 | 5.93 | 1.60 | 496 | 62 |
| Exch. Ca | 0.86 | 3.98 | 3.98 | -0.02 | 2.70 | 228 | 38 |
| Exch. Mg | 0.86 | 2.58 | 2.33 | 1.11 | 2.70 | 76 | 28 |
| Exch. Na | 0.66 | 1.82 | 0.81 | 1.63 | 1.80 | 414 | 36 |
| Exch. K | 0.63 | 0.29 | 0.29 | 0.06 | 1.70 | 23 | 26 |
| CEC | 0.85 | 6.18 | 5.62 | 2.56 | 2.60 | 25 | 21 |
| ECEC | 0.84 | 4.03 | 4.03 | 0.09 | 2.50 | 91 | 67 |
| Sum bases | 0.93 | 2.06 | 2.06 | -0.03 | 3.80 | 153 | 124 |
| Base saturation | 0.61 | 0.19 | 0.18 | 0.06 | 1.60 | 139 | 37 |
| Total Fe | 0.81 | 1.98 | 1.49 | 1.30 | 2.40 | 0 | 0 |
| Total K | 0.83 | 0.39 | 0.28 | 0.27 | 2.40 | 31 | 25 |

Table 3. Robust coefficient of variation (CV) values for the ASPAC 2007-2008 soil proficiency testing program after removal of outliers (from Lyons et al. 2011)

The first numeral in the method code indicates the parameter, the letter the method and the end numeral the variation of the method, for example whether the instrumental measurement was manual or automated, or whether soluble salts were removed. If not specified, all variants were included in the model. Units for the soil attributes are given in Table 1.

| Attribute | Method code | Mean robust CV |
|---|---|---|
| pH water | 4A1 | 0.02 |
| Organic C (Walkley-Black) | 6A1 | 0.10 |
| Total organic C (Furnace) | 6B2 | 0.07 |
| Total N | 7A1 | 0.17 |
| Ca | | 0.11 |
| Mg | 15A1 | 0.08 |
| K | | 0.18 |
| Na | | 0.34 |
| Total P (XRF) | 9A1 | 0.12 |
| Bicarb P (Colwell) | 9B | 0.11 |

Table 4. Summary statistics of the spectroscopic prediction on the n unknown soil samples, without outliers

Units for the soil attributes are given in Table 1.

| Soil attribute | n | Mean | St. Dev. | Minimum | Maximum |
|---|---|---|---|---|---|
| Sand | 2454 | 64.90 | 22.67 | 6.00 | 101.00 |
| Silt | 2368 | 9.00 | 7.48 | 0.00 | 42.30 |
| Clay | 2432 | 22.50 | 16.03 | 0.30 | 77.20 |
| 0.1 Bar water | 2298 | 0.23 | 0.10 | 0.07 | 0.55 |
| pH (water) | 2464 | 7.30 | 1.00 | 4.30 | 10.20 |
| Organic C | 2316 | 0.65 | 0.91 | 0.02 | 10.60 |
| Total organic C | 2163 | 0.67 | 0.93 | 0.01 | 15.10 |
| Total N | 2189 | 0.06 | 0.08 | 0.00 | 0.84 |
| C:N | 2164 | 18.50 | 4.20 | 0.10 | 42.80 |
| Total P | 2462 | 0.00 | 0.01 | 0.00 | 0.10 |
| Bicarb P | 1917 | 6.60 | 9.20 | 0.10 | 52.80 |
| Exch. Ca | 2209 | 7.20 | 6.30 | 0.00 | 37.20 |
| Exch. Mg | 2371 | 4.30 | 3.70 | 0.02 | 25.50 |
| Exch. K | 2426 | 0.60 | 0.30 | 0.01 | 2.10 |
| Exch. Na | 2025 | 1.40 | 1.60 | 0.01 | 8.20 |
| CEC | 2429 | 17.20 | 10.70 | 0.14 | 62.10 |
| ECEC | 2317 | 20.10 | 10.80 | 0.88 | 63.40 |
| Sum bases | 2198 | 15.70 | 10.00 | 0.21 | 59.20 |
| Base saturation | 2299 | 0.90 | 0.20 | 0.05 | 2.10 |
| Total Fe | 2475 | 4.69 | 2.02 | 0.07 | 11.60 |
| Total K | 2419 | 0.72 | 0.54 | 0.01 | 3.42 |
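The assessment statistics named in the caption of Table 2 (R², RMSE, ME, SDE and RPD) can be computed from paired observed and predicted values. The following is a minimal sketch under stated assumptions, not the authors' code: the error is taken as predicted minus observed, R² is taken as 1 minus the ratio of residual to total sum of squares, SDE uses the population form so that RMSE² = ME² + SDE² holds exactly, and RPD is taken as the sample standard deviation of the observed values divided by the RMSE. The function name and the example data are hypothetical.

```python
import math


def assessment_statistics(observed, predicted):
    """Return R2, RMSE, ME, SDE and RPD for paired observed/predicted values.

    Assumed definitions (not confirmed by the source):
      error = predicted - observed
      R2    = 1 - SS_residual / SS_total
      SDE   = population standard deviation of the errors
      RPD   = sample standard deviation of observed / RMSE
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]

    me = sum(errors) / n                                # mean error (bias)
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    rmse = math.sqrt(ss_res / n)                        # root mean squared error
    sde = math.sqrt(sum((e - me) ** 2 for e in errors) / n)

    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    r2 = 1.0 - ss_res / ss_tot
    rpd = math.sqrt(ss_tot / (n - 1)) / rmse

    return {"R2": r2, "RMSE": rmse, "ME": me, "SDE": sde, "RPD": rpd}


# Made-up values for illustration only.
stats = assessment_statistics([1.0, 2.0, 3.0, 4.0, 5.0], [1.1, 1.9, 3.2, 3.8, 5.3])
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Under these definitions the RMSE splits into a bias term and a spread term, RMSE² = ME² + SDE². The values in Table 2 are consistent with this decomposition to rounding: for the Organic C training row, 0.34² + 0.23² ≈ 0.41².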
Author: Hicks, W.; Rossel, R.A. Viscarra; Tuomi, S.
Date: Nov 1, 2015