Digital soil class mapping using legacy soil profile data: a comparison of a genetic algorithm and classification tree approach.

Introduction

There is growing demand for soil information at varying spatial extents, from global to regional and national scales. Effective soil management, environmental modelling, and monitoring all require knowledge of the distribution of soils within the landscape. In Australia, a move towards catchment-based management, under the auspices of various Catchment Management Authorities (CMAs), has led to increased demand for catchment-scale information (NCMA 2005). To meet this demand, some CMAs have initiated projects to integrate disparate soil data into soil spatial information systems, or, more broadly, land resource information systems useful for natural resource management. Where existing soil data are absent, new soil survey projects may be required to fill the gaps. However, traditional soil surveys are deemed to be expensive and time-consuming (Bui 2007) as well as overly qualitative (McBratney et al. 2000). As such, digital soil mapping techniques, as a conduit to creating soil spatial information systems, have been developed to utilise increased computing power, existing soil databases, increasingly rich ancillary data (McBratney et al. 2003; Behrens and Scholten 2007), and existing soil maps (Bui and Moran 2003). Existing soil databases (termed here legacy data), which are predominantly soil profile descriptions and soil classifications, cover much of the agriculturally intensive regions of Australia and provide an underutilised resource for digital mapping of soil classes.

Soil class mapping aims to delineate the patterns of soil types within a landscape or area (McBratney et al. 2003). In most Australian and overseas soil survey programs, the default has been to derive a local classification based on the observations within the survey area, with correlation across the whole program, and in addition to apply the national classification for broader communication purposes--in Australia, the Australian Soil Classification (Isbell 1996). The classification of soil profiles or soil cores, based on field observations, provides useful information for land management and environmental monitoring at various scales (Hengl et al. 2007). Soil class mapping generally involves a set of known soil profile descriptions and other observations classified according to the chosen classification scheme, together with the observable variation in ancillary information or environmental variables in the field. This traditional approach is often adopted for several reasons: (1) financial constraints, i.e. field observations are generally less expensive than laboratory analysis; (2) given spatial variability in soil, it is better to obtain soil information from more sites within a survey area rather than more detailed information from fewer sites; (3) legacy data, while based on different classification schemes, provide a valuable data source; (4) a soil classification scheme may implicitly encode many soil properties, and thus a soil class can be used to estimate or infer other soil properties or behaviour (Minasny and McBratney 2007).

Recent advances in technology have led to new vistas in digital soil mapping based on quantitative models derived from correlation of soil classes with easy-to-measure ancillary variables (McKenzie et al. 2000; McBratney et al. 2003). The scorpan-sspf model (McBratney et al. 2003) provides such a framework, where the ancillary information represents various soil-forming factors. Quantitative models require a predictive function (soil spatial prediction function, sspf), which has to date been drawn from data mining applications. Predictive functions that have been widely used for developing such predictive models include logistic regression (e.g. Bailey et al. 2003; Giasson et al. 2006), classification tree (CT) analysis (e.g. Lagacherie and Holmes 1997; McBratney et al. 2000; Moran and Bui 2002; Bui and Moran 2003), neural networks (e.g. Behrens et al. 2005; Behrens and Scholten 2007), support vector machines and learning vector quantisation (e.g. Behrens and Scholten 2007), and discriminant analysis. Most predictive functions and predictive models do not address the problematic 'unknown site selection probability' (Bui and Moran 2003; McBratney et al. 2003) apparent in legacy data. Thus, there is a need to explore alternative models such as those based on genetic algorithms. An example is the genetic algorithm for rule-set production (GARP) (Stockwell and Noble 1992). GARP was developed by Stockwell and Noble (1992), Stockwell (1999), and Stockwell and Peters (1999) to model the habitat distribution of plant and animal species using locations of known species' presence and environmental variables. The algorithm has stochastic elements which produce a population of rules before iterative use of the best rules in a given generation to develop the next generation until some convergence criterion is met and a solution given (Stockwell 1999; Stockwell and Peters 1999).

GARP is noteworthy in that it was developed specifically to utilise legacy data to analyse collections of locations (indicating the presence of a plant or animal species), such as those collated in museums, addressing the 'unknown site selection probability' by randomly redistributing the observations into training and testing datasets for each generation. While GARP has been widely used in ecology to map the distribution of plant species (e.g. Peterson et al. 2003; Raimundo et al. 2007), bird species (e.g. Peterson and Cohoon 1999; Chen and Peterson 2002), and rodent species (e.g. Anderson et al. 2002a, 2002b), its potential for mapping soil classes has not been explored.

Advances in digital soil mapping through spatial prediction of soil classes, and indeed of soil attributes, have been due to the increasing availability of ancillary data at fine resolution. A notable source of such ancillary data is gamma radiometric information, which is recognised as a potential source of soil surrogates for regolith or parent material, especially clay mineralogy (Cook et al. 1996; Wilford et al. 1997; Wilford and Minty 2007). In Australia, gamma radiometric data are almost freely available, sourced from private and public agencies. It is therefore important to assess whether this remotely sensed information, in addition to the traditional, readily available ancillary data (such as Landsat and other space-borne remote sensing data, DEM, etc.), can improve the accuracy of digital soil class mapping.

The major aims of this paper were: (1) to develop a scheme for implementing GARP in digital soil class mapping; (2) to compare the performance of GARP with a commonly used method, CT; and (3) to evaluate the usefulness of radiometrics as a predictor in creating digital soil class maps from legacy soil data. In testing these aims we created a 200-m resolution digital soil class map of the Namoi catchment in north-west New South Wales, Australia, based on the abridged Australian Soil Classification Suborders (Isbell 1996) using an existing database of soil profiles collated from various sources. We developed a GARP approach to produce the digital soil class map based on the scorpan-sspfe model (McBratney et al. 2003), which takes advantage of the near omnipresence of ancillary information. We then compared the performance of GARP with the best of CT using several classification accuracy measures and map uncertainty. We also examined the effect of including/excluding radiometric data in the prediction model.

Materials and methods

Study area

The study area is the Namoi Catchment, an area of ~42 000 km² in north-western NSW (Fig. 1; NCMB 2003). The catchment forms part of the Murray-Darling Basin. The most recently produced soil class map for the catchment was developed by Bui and Moran (2003) as part of the Murray Darling Basin Information System (MDBC 1999), consisting of mapping units classified in accordance with Northcote (1979). There is no catchment-scale digital soil class map based on the current and widely used Australian Soil Classification (Isbell 1996).

The geology of the eastern section of the catchment is dominated by Tertiary volcanics, with basalt rocks forming the south, south-western, and north-eastern boundaries (Donaldson and Heath 1997). The central section of the catchment is dominated by sedimentary geology, including shales, sandstones, and conglomerate rocks, with the alluvial plains consisting of Quaternary sediments (Zhang et al. 1999). A large alluvial plain stretching from Narrabri west to Walgett is dominated by Quaternary sediments (Zhang et al. 1999).

The soils on the alluvial plains, especially where sediment is predominantly sourced from the basaltic ranges, are generally moderately fertile, deep cracking clays (Donaldson and Heath 1997; Young et al. 2002) or Vertosols (Isbell 1996). In some places, these Vertosols are found in association with duplex soils termed Chromosols, Sodosols, and Kurosols, and with Dermosols and Ferrosols (Isbell 1996; Donaldson and Heath 1997). In the eastern part, which is characterised by rough or steep terrain, the soil associations are predominantly made up of duplex soils as well as Kandosols, Tenosols, and Dermosols (Isbell 1996; Donaldson and Heath 1997). Soils formed on Pilliga sandstone, which are located in the southern central section of the catchment, are coarse-textured Kandosols, Tenosols, and some Sodosols (Donaldson and Heath 1997).

Data collation

Legacy soil data

Spatially referenced soil profile observations were obtained from the former NSW Department of Natural Resources (DNR) and the Cotton CRC Soil Database, developed at the University of Sydney (Odeh et al. 2004). The soil data consisting of soil profile information at all locations, totalling 3875 soil profiles (Fig. 2), were classified to the Suborder level of the Australian Soil Classification (Isbell 1996) and then used for this study. Where few observations were present at the Suborder level, selected similar Suborders within Orders were combined to form abridged Suborder classes; in total, there are 26 Suborders or abridged Suborders (see Table 1, Fig. 3).

[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

Ancillary data

Data on several environmental covariates were used in this study (Table 2). Elevation and terrain attributes, including slope, aspect, plan curvature, profile curvature, curvature, topographic wetness index (TWI), multi-resolution valley bottom flatness index (MRVBF, Gallant and Dowling 2003), and Hammond Landform class (Hammond 1964; Dikau et al. 1991), were used as relief factors. Airborne magnetic survey data, gamma radiometric data [potassium (%), thorium (ppm), uranium (ppm)] as surrogates for parent material, NDVI derived from MODIS imagery captured in May 2003, and land-use representing vegetation were the other environmental covariates used. The gamma radiometric data are only available for approximately three-quarters of the catchment.

Modelling approach

Many digital soil class mapping techniques are designed to relate soil classes or soil types with the ancillary variables or covariates. McBratney et al. (2003) provide a prototype model for digital soil class mapping which they termed as scorpan-sspf. It is expressed as:

$$S_c = f(s, c, o, r, p, a, n) \qquad (1)$$

where $S_c$ is a soil class, s is the soil information such as obtained from a prior soil map or expert knowledge, c refers to climate, o is organisms such as human activity, r is relief, p is parent material, a is age, and n refers to neighbourhood or spatial position. Each of the scorpan factors on the right-hand side of Eqn 1 may be represented by a set of interval or ratio data, such as elevation and terrain attributes for r and magnetic survey data for p. The scorpan model is also flexible enough to incorporate categorical or nominal variables as the covariates. The spatial prediction function sspf, f, could take the form of a supervised classification algorithm or logistic model. In using the supervised classification method, the known soil observations are used as the training data to fit the supervised classification algorithm and then predict the soil classes onto a fine grid of locations where there is ancillary information. Table 2 shows the covariates used in this study, and the scorpan factor each represents.

The GARP modelling system

This overview of GARP is based on the work of Stockwell (1999) and Stockwell and Peters (1999). Genetic algorithms are based on the concept of evolution by natural selection as solutions are evolved in a stochastic, iterative manner. The GARP modelling procedure consists of the following steps:

1. Start at initial time (t = 0);

2. Initialise a population of individuals (rules) P(t);

3. Evaluate the fitness of each individual by evaluating how well a rule predicts the distribution using a random subset of observations (training dataset) and save the best individuals in a rule archive;

4. Test against fitness criterion and terminate this rule archive if the criterion is met; otherwise

5. Increase time counter;

6. Create a new set of individuals using the rule archive and random generators;

7. Apply heuristic operators to population;

8. Go to 3.
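The loop above can be illustrated with a deliberately small sketch. It is not the DesktopGarp implementation: it evolves only envelope-style range rules on a single covariate, uses plain presence/absence accuracy as the fitness, and stops on a fitness target rather than a convergence threshold. The function names (`fitness`, `mutate`, `crossover`, `evolve`), the parameter values, and the toy elevation data are all assumptions made for illustration.

```python
import random

def fitness(rule, sample):
    """Share of points whose presence/absence a (low, high) range rule predicts correctly."""
    low, high = rule
    hits = sum((low <= x <= high) == present for x, present in sample)
    return hits / len(sample)

def mutate(rule, rng, step):
    """Heuristic operator: jitter both bounds of the range."""
    a = rule[0] + rng.uniform(-step, step)
    b = rule[1] + rng.uniform(-step, step)
    return (min(a, b), max(a, b))

def crossover(rule_a, rule_b):
    """Heuristic operator: lower bound from one parent, upper bound from the other."""
    a, b = rule_a[0], rule_b[1]
    return (min(a, b), max(a, b))

def evolve(data, pop_size=30, max_generations=200, target=0.95, step=50.0, seed=1):
    rng = random.Random(seed)
    values = [x for x, _ in data]
    x_min, x_max = min(values), max(values)
    # Steps 1-2: t = 0 and a random initial population of range rules.
    population = [
        tuple(sorted((rng.uniform(x_min, x_max), rng.uniform(x_min, x_max))))
        for _ in range(pop_size)
    ]
    archive = []
    for generation in range(max_generations):
        # Step 3: score rules on a random half of the observations; archive the best.
        training = rng.sample(data, len(data) // 2)
        ranked = sorted(archive + population,
                        key=lambda r: fitness(r, training), reverse=True)
        archive = ranked[: max(2, pop_size // 5)]
        # Step 4: stop once the best archived rule meets the fitness criterion.
        if fitness(archive[0], training) >= target:
            break
        # Steps 5-7: build the next generation from the archive via crossover and mutation.
        population = [
            mutate(crossover(rng.choice(archive), rng.choice(archive)), rng, step)
            for _ in range(pop_size)
        ]
    return archive[0], generation + 1

# Toy data (hypothetical): the class is present where elevation is 600-800 m.
data = [(float(e), 600 <= e <= 800) for e in range(0, 1001, 10)]
rule, generations = evolve(data)
print(rule, generations, round(fitness(rule, data), 3))
```

Because each generation is scored on a fresh random half of the observations, the archive favours rules that hold up across different subsets of the data.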

GARP uses 4 classes of rules: (1) envelope, a conjunction of ranges of all ancillary information where a soil class may be present, e.g. if elevation is 600-800 m, slope <0.2%, MRVBF >5, then a Vertosol may be present; (2) subset envelope, envelope rules where 1 or more variables are excluded from the rule; (3) atomic, a conjunction of single values for continuous or categorical variables, e.g. if landform class = open moderate hills, MRVBF = 2, then Kandosol; and (4) logit, adaptations of a logistic regression model to rules. Rule fitness is evaluated using the following set of criteria (Eqn 2):

$$\text{Coverage} = \frac{n}{n_o}, \qquad \text{Prior probability} = \frac{pY_S}{n}, \qquad \text{Posterior probability} = \frac{pXY_S}{n} \qquad (2)$$

$$\text{Significance} = \frac{pXY_S - n_o\,pY_S/n}{\sqrt{n_o\,pY_S\,(1 - pY_S/n)/n}}$$

where $n$ is the number of data points in the random sample training set, $n_o$ is the number of points to which the rule applies, $pY_S$ is the number of data with the same conclusion as the rule, and $pXY_S$ is the number of data the rule predicts correctly. Variation is introduced using heuristic operators, crossover, and mutation. Termination occurs when the maximum number of generations is reached or the generational improvement falls below a predetermined threshold.
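As a worked illustration of the significance term in Eqn 2, the sketch below computes it from the four counts defined above. Algebraically the expression can be read as the number of correct predictions minus the number expected from the prior probability, divided by the corresponding binomial standard deviation. The function name `rule_significance` and the example counts are hypothetical.

```python
import math

def rule_significance(n, n_o, p_ys, p_xys):
    """Significance term of Eqn 2.

    n     -- data points in the random training sample
    n_o   -- points to which the rule applies
    p_ys  -- data with the same conclusion as the rule
    p_xys -- data the rule predicts correctly
    """
    prior = p_ys / n                      # prior probability from Eqn 2
    expected = n_o * prior                # correct predictions expected by chance
    variance = n_o * prior * (1 - prior)  # equals n_o * pY_S * (1 - pY_S / n) / n
    return (p_xys - expected) / math.sqrt(variance)

# Hypothetical counts: 1000 training points, a rule applying to 200 of them,
# 100 points of the target class, 60 of which the rule predicts correctly.
z = rule_significance(n=1000, n_o=200, p_ys=100, p_xys=60)
print(round(z, 2))
```

A rule that performs no better than the prior gives a value of zero, so larger positive values indicate rules that are more informative than chance.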

As GARP was originally developed for mapping plant and animal species distributions, it has the capacity to map only individual species (Stockwell 1999; Stockwell and Peters 1999). Each of the 25 abridged Suborders was modelled individually. Previous studies (e.g. Stockwell et al. 2006) have used GARP to map biodiversity and species richness, but not mutually exclusive classes of plant or animal communities. Thus, a novel method to combine these individual Suborder models to obtain a prediction of a soil class at a point is required, and is discussed below.

In this study, GARP modelling was performed using DesktopGarp (Pereira 2005, available at: www.nhm.ku.edu/desktopgarp). The modelling involved 200 runs for each of the 25 abridged Suborders. As the gamma radiometric data cover only three-quarters of the catchment, 2 sets of scorpan-sspfe (Eqn 1) analyses were carried out: (1) analysis on the whole catchment without the gamma radiometric ancillary data; (2) analysis that covers three-quarters of the catchment with the gamma radiometric data. For each of the abridged Suborders, the testing and training datasets were randomly sampled at the ratio of 50:50 of the observations. In each run, a combination of all rule types was used, including all of the ancillary factors as predictors. The maximum number of iterations was set to 1000, with a minimum convergence of 0.01. While the GARP outputs reported by Stockwell (1999) and Stockwell and Peters (1999) take the form of the probability that a species is present at each point within an area of interest, DesktopGarp output is binary (Pereira 2005), with 1 indicating presence and 0 absence of a given Suborder at each of the grid points.

Combining the results of GARP model runs into single digital soil class maps

As the GARP model incorporates a stochastic element, each run can be considered equiprobable. Therefore, by summing several runs for a given soil class, the proportion of times GARP predicted a soil class as present at each location on the base grid is obtained. This proportion can be considered as analogous to the probability of the soil class occurring at that point. By assuming that the a priori probability of obtaining any soil class at any point is equal, each grid location is allocated to the soil class with the highest proportion of the model runs predicting presence of a given class at that location. Where this solution is non-unique, the average accuracy for the soil class in question, calculated from the results, is used so that the point is allocated the most accurately modelled class out of those with the equal highest proportion of model runs that were predicted as present.

For the 2 map extents, i.e. whole catchment and radiometric coverage extent, 200 binary maps for each of the 25 soil classes were summed and converted into proportions of the 200 runs. A series of conditional statements was then used to combine the 25 individual soil class proportion maps as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (3)

where $P^{i}_{SC}$ is the proportion of 200 runs of soil class $i$ allocated as present, $n$ is the total number of soil classes, $m$ is the number of soil classes allocated at maximum proportion at a given point, $SoilClass_k$ is one of the abridged Suborders, $SC_{acc}$ is the average GARP model accuracy for a given soil class, $Tr_i$ is the training accuracy for a given model run, and $Te_i$ is the testing accuracy for a given run as obtained from the GARP results.
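The allocation rule described in the text can be sketched as follows. This is an illustration under stated assumptions and not a reconstruction of Eqn 3: the function name `combine_runs`, the data layout (one list of binary grids per class), and the example values are hypothetical, and the per-class average accuracy is taken as a given input because its exact formula is not reproduced here.

```python
def combine_runs(presence_runs, class_accuracy):
    """Allocate each grid cell to one soil class from binary GARP-style runs.

    presence_runs  -- {class name: list of runs, each a list of 0/1 per grid cell}
    class_accuracy -- {class name: average model accuracy}, used only to break ties
    Returns one (allocated class, maximum proportion, number of tied classes) per cell.
    """
    classes = list(presence_runs)
    n_cells = len(presence_runs[classes[0]][0])
    # Proportion of runs predicting presence, per class and cell.
    proportions = {
        c: [sum(run[cell] for run in runs) / len(runs) for cell in range(n_cells)]
        for c, runs in presence_runs.items()
    }
    allocated = []
    for cell in range(n_cells):
        max_p = max(proportions[c][cell] for c in classes)
        tied = [c for c in classes if proportions[c][cell] == max_p]
        # Non-unique maximum: choose the most accurately modelled class.
        winner = max(tied, key=lambda c: class_accuracy[c])
        allocated.append((winner, max_p, len(tied)))
    return allocated

# Hypothetical example: 3 classes, 4 runs each, 3 grid cells.
runs = {
    "Grey Vertosol": [[1, 0, 1], [1, 0, 1], [1, 1, 0], [0, 0, 1]],
    "Red Vertosol": [[0, 1, 1], [0, 1, 1], [0, 1, 1], [1, 0, 0]],
    "Kurosol": [[0, 0, 0], [0, 1, 0], [0, 0, 0], [0, 0, 0]],
}
mean_accuracy = {"Grey Vertosol": 0.7, "Red Vertosol": 0.8, "Kurosol": 0.6}
for cell in combine_runs(runs, mean_accuracy):
    print(cell)
```

In the third cell two classes share the maximum proportion of 0.75, so the class with the higher average accuracy is allocated. The maximum proportion and the number of tied classes are returned alongside the class because the text reports comparable quantities as measures of prediction uncertainty.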

This method of combination provides several measures of prediction accuracy for the GARP mapping:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (4)

Other measures of prediction accuracy, including the training accuracy, testing accuracy, and overall accuracy, were obtained from individual GARP model runs.

Classification tree analysis (CT)

In order to compare the performance of GARP, we also performed CT, a widely used predictive function in the scorpan-sspfe model (e.g. Lagacherie and Holmes 1997; McBratney et al. 2000; Moran and Bui 2002; Bui and Moran 2003). We used the R programming language (R Development Core Team 2006) for CT, in which the tree package (Ripley 2006) builds a tree, or a series of IF(...), THEN(...) statements, by binary recursive partitioning of the training data; at each node it splits the data such that the reduction in deviance (Eqn 5) is maximised until some minimum reduction or subset number is reached (Lagacherie and Holmes 1997; Ripley 2006). Deviance, at each node i, is defined as:

$$D_i = -2 \sum_{k} n_{ik} \log p_{ik} \qquad (5)$$

where $n_{ik}$ is the number of training observations of class $k$ partitioned to node $i$ and $p_{ik}$ is the probability of obtaining class $k$ at that node (Bui et al. 1999; Ripley 2006). Given a set of conditions, observations are allocated to the most likely class at the terminal node, with the classification rate defined as:

$$c(T_k) = pr_k \qquad (6)$$

where $c(T_k)$ is the classification rate at terminal node $T_k$ and $pr_k$ is the proportion of individuals allocated to the dominant class at the terminal node $T_k$. Deviance pruning, as performed by Bui et al. (1999), was used to simplify the tree model. The pruned trees were used to predict onto the base grids for the full extent and subset area with radiometric coverage. The most likely soil class was predicted at each location on the base grid. A measure of prediction uncertainty was derived using the appropriate terminal node classification rate at each point on the base grid (Eqn 6).
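To make Eqns 5 and 6 concrete, the following sketch computes the node deviance, the classification rate of a node, and the single-covariate threshold that maximises the reduction in deviance. It is a minimal illustration and not the R tree package: the class probabilities are assumed to be estimated by the observed class proportions at the node, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter

def node_deviance(labels):
    """Eqn 5 with p_ik estimated as n_ik / n_i (an assumption of this sketch)."""
    total = len(labels)
    return -2.0 * sum(n * math.log(n / total) for n in Counter(labels).values())

def classification_rate(labels):
    """Eqn 6: dominant class at a node and the proportion of observations in it."""
    dominant, count = Counter(labels).most_common(1)[0]
    return dominant, count / len(labels)

def best_split(values, labels):
    """Threshold on one covariate giving the largest reduction in deviance."""
    parent = node_deviance(labels)
    best_threshold, best_gain = None, 0.0
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        gain = parent - node_deviance(left) - node_deviance(right)
        if gain > best_gain:
            best_threshold, best_gain = threshold, gain
    return best_threshold, best_gain

# Hypothetical covariate values (e.g. an MRVBF-like index) and observed classes.
covariate = [1, 2, 3, 4, 5, 6]
soil_class = ["Tenosol", "Tenosol", "Tenosol", "Vertosol", "Vertosol", "Vertosol"]
print(best_split(covariate, soil_class))
print(classification_rate(["Vertosol", "Vertosol", "Tenosol"]))
```

A full tree repeats `best_split` recursively over all covariates within each resulting subset; pruning then removes splits whose deviance reduction is small.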

Data preparation for digital soil class mapping

As stated above, the gamma radiometric data do not cover the whole catchment. Therefore, 2 sets of catchment-scale digital soil class mapping were carried out: (1) using all covariates except the radiometric data for mapping the whole catchment; and (2) using all covariates, including radiometric data, for mapping a subset area of the catchment with radiometric data coverage. In the latter case, a subset of soil profile observations within the area of radiometric coverage was selected, comprising 2883 soil class observations.

Classification accuracy assessment and map uncertainty

For all digital soil class maps, 'displayed' classification accuracy was produced from 5 by 5 cell neighbourhood majority filtered maps; the final maps were sampled at all locations of the soil observations. We then used a modified jack-knife (Good 1999) as applied by Odeh et al. (2003), randomly resampling the known soil observations 50 times for the purpose of validation. The sampling rate was kept at one-sixth of the available data, i.e. 480/2883 for the radiometric coverage and 646/3875 for the full catchment extent. The mean classification accuracy from the jack-knifing technique was then calculated as the number of profiles correctly predicted as a proportion of the total number of observations in the sample (Eqn 7):

$$\mathrm{Acc} = \frac{N}{N_0} \qquad (7)$$

where $N$ is the number of soil observations correctly predicted and $N_0$ is the total number of soil class observations in the sample. Classification accuracy at Australian Soil Classification Order level was also calculated for each jack-knife sample, to recognise the hierarchical nature of the Australian Soil Classification, and investigate the misclassifications. Cohen's Kappa coefficient (Cohen 1960) was calculated at both the abridged Suborder and Order levels using the jack-knife samples.
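The accuracy measures can be sketched as below: Eqn 7, the standard form of Cohen's Kappa (observed agreement corrected for the agreement expected by chance), and a repeated random subsampling of the observations. The function names are hypothetical, and drawing each subsample without replacement is an assumption, since the details of the modified jack-knife are not given here.

```python
import random
from collections import Counter

def accuracy(observed, predicted):
    """Eqn 7: correctly predicted observations over all observations in the sample."""
    return sum(o == p for o, p in zip(observed, predicted)) / len(observed)

def cohen_kappa(observed, predicted):
    """Cohen's Kappa: (p_o - p_e) / (1 - p_e); undefined when p_e equals 1."""
    n = len(observed)
    p_o = accuracy(observed, predicted)
    obs_counts, pred_counts = Counter(observed), Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)
    return (p_o - p_e) / (1 - p_e)

def resampled_accuracy(observed, predicted, repeats=50, fraction=1 / 6, seed=0):
    """Mean accuracy over repeated random subsamples of the observations."""
    rng = random.Random(seed)
    n = len(observed)
    size = max(1, round(n * fraction))
    scores = []
    for _ in range(repeats):
        idx = rng.sample(range(n), size)  # assumed: without replacement within a draw
        scores.append(accuracy([observed[i] for i in idx],
                               [predicted[i] for i in idx]))
    return sum(scores) / repeats

# Hypothetical Order-level labels at six observation sites.
obs = ["Vertosol", "Vertosol", "Vertosol", "Tenosol", "Tenosol", "Kandosol"]
pred = ["Vertosol", "Vertosol", "Tenosol", "Tenosol", "Tenosol", "Vertosol"]
print(accuracy(obs, pred), cohen_kappa(obs, pred))
print(resampled_accuracy(obs, pred))
```

With these six sites, four predictions are correct, giving an accuracy of 4/6. The chance agreement is 15/36, so Kappa is 3/7.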

Measuring crisp classification accuracy for 25 classes over an area of 42 000 km² can be an extremely harsh measure of map and model accuracy. Moreover, as the Australian Soil Classification is a hierarchical classification scheme, misclassifications at Suborder level within the same Order can be seen as of less importance than misclassifications to different Orders. Hence, we explored the Order level classification accuracies and Kappa coefficients using the same jack-knife samples as described above. We also computed the confusion matrices for the Order level classification.

A 2-way factorial ANOVA was performed on the jack-knifed classification accuracies (Eqn 7) and Kappa coefficients. The 2 ANOVA factors were the predictive models (CT or GARP), and inclusion of radiometric data (yes or no). We thus tested the hypothesis that the predictive model or the inclusion of radiometric data could affect the results significantly. Another measure of accuracy used is the Euclidean distance from each prediction grid location to the closest soil profile observation which has the same abridged Suborder (e.g. Bui and Moran 2003).
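The distance-based measure can be sketched with a brute-force search, which is adequate for illustration although a spatial index would be preferable on a full 200-m grid. The function name, the tuple layout, and the coordinates (assumed to be in metres in a projected coordinate system) are hypothetical.

```python
import math

def nearest_same_class_distance(grid_cells, observations):
    """Euclidean distance from each grid cell to the closest observation of its predicted class.

    grid_cells   -- list of (x, y, predicted class)
    observations -- list of (x, y, observed class)
    Returns None for a cell whose predicted class was never observed.
    """
    distances = []
    for gx, gy, predicted in grid_cells:
        same_class = [math.hypot(gx - ox, gy - oy)
                      for ox, oy, observed in observations if observed == predicted]
        distances.append(min(same_class) if same_class else None)
    return distances

# Hypothetical coordinates in metres.
profiles = [(0, 0, "Vertosol"), (600, 800, "Tenosol")]
cells = [(0, 200, "Vertosol"), (0, 0, "Tenosol"), (200, 200, "Kandosol")]
print(nearest_same_class_distance(cells, profiles))
```

Smaller distances indicate that a prediction is supported by a nearby observation of the same class, while large distances flag areas where the predicted class is poorly supported by the legacy data.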

Results and discussion

Comparing digital soil class maps produced using GARP and CT

Whole catchment extent

The abridged soil Suborder classes as predicted by GARP and CT across the whole catchment are shown as smoothed (filtered by 5 by 5 cell majority) digital maps in Fig. 4. Both maps indicate Vertosol Suborders as the dominant soil classes on the lower Namoi floodplains between Narrabri and Walgett, and on the alluvial plains north-west and south-east of Gunnedah. A major difference between the maps is that CT predicted patches of Dermosol Suborders that are interspersed with Vertosol Suborders in the lower Namoi floodplains (Fig. 4a). Moving to the upper part of the catchment, the dominant Suborders predicted by the CT are the abridged Tenosols (Fig. 4a), which are poorly developed soil types, in contrast to the well-developed Vertosols predominant on the lower Namoi floodplains. The eastern section of the catchment is characterised by rough and steep terrain, which tends to produce shallow, stony Rudosols and Tenosols. The GARP model is rather a poor predictor (as will be confirmed later), predicting large patches of Red Vertosols in much of the upper Namoi catchment (Fig. 4b), an unlikely soil on terrain underlain by sandstone. This poor performance by GARP is also reflected by the dominance of Kurosol Suborders in the north-eastern section of the catchment, and less so in the south-east.

[FIGURE 4 OMITTED]

The pedosequence moving north-west through Pilliga onto the lower Namoi floodplains provides another distinct difference between the results from the 2 prediction models. While the sequence predicted by the CT is in the order Tenosol, Chromosol, Sodosol, and Vertosol Suborders, the GARP model predicted a sequence of Vertosol, Sodosol, and Vertosol Suborders (Fig. 4b). Generally, as Fig. 4 shows, there is lower diversity of abridged Suborders in the digital soil class map produced by GARP than that produced by CT.

[FIGURE 5 OMITTED]

Radiometric coverage

The results of the predictions by the 2 models (GARP and CT) incorporating radiometric data (hereafter referred to as radiometric predictions) are shown as digital soil class maps in Fig. 5. These maps were also smoothed by 5 by 5 cell majority filters. There is a clear similarity between the map produced for the whole catchment using CT (Fig. 4a) and the subset area of the catchment covered by radiometric prediction (Fig. 5a). The dominance of Grey Vertosols in the lower Namoi floodplains is retained by both prediction methods, as is the case with the whole catchment prediction. There is a greater diversity of soil types predicted in this lower floodplain area; CT predicts other Vertosol and Dermosol Suborders, while GARP predicts Dermosol, Rudosol, and Calcarosol Suborders. In the steeper or rougher terrain of the eastern section, the radiometric prediction using the CT shows even greater diversity of soil classes than its equivalent for the whole catchment, while GARP predicts mainly 2 abridged orders: Kandosol and Tenosol. The pedosequence produced by the CT for the whole catchment without radiometric ancillary data is largely retained in the radiometric prediction maps.

The inclusion of radiometric data as predictor variables has increased the diversity of Suborders predicted by all of the methods (Fig. 5). This is more clearly demonstrated by the GARP model than the CT, as shown by the whole-catchment digital soil class maps (Fig. 4b) and those of the radiometric subset area (Fig. 5b). The improved performance of the GARP model by the inclusion of radiometric ancillary variables in the model is well illustrated in Fig. 5b.

Maps of predictive uncertainty

The GARP prediction accuracy based on maxP (Eqn 4) shows similar results regardless of whether or not radiometric data were used (Fig. 6). MaxP values are lowest in the western extremities of the catchment, generally increasing towards the east. This is different from the predictive accuracy criterion used for CT (Eqn 6, Fig. 7). A notable discrepancy in the whole catchment modelling is that mdsP values (Eqn 4, Fig. 8) are reduced when radiometric data are included in the model. This is particularly so in the lower Namoi floodplain. This measure of uncertainty (Fig. 8, Eqn 4) is generally higher (the model prediction is more accurate) in the lower Namoi floodplain region. This is similar to the uncertainty results using CT (Fig. 7).

The number of points where more than one soil class is predicted under the GARP model (Mpred, Eqn 4) is higher when radiometric data are included in the model (Fig. 9). In both soil class maps, multiple allocations of classes generally occur in the rougher terrains to the east, away from the floodplains. The highest number of soil classes allocated at the maximum GARP proportion at a single point is 15 for the radiometric prediction, and 13 for the whole catchment model.

Comparison of classification accuracy between methods

Accuracy assessment by modified jack-knife resampling

The results of the ANOVA performed on the classification accuracies and Kappa coefficients are presented in Tables 3 and 4, respectively. The inclusion of radiometric data in the predictive model significantly improved the displayed classification accuracy (abridged Suborder increase = 0.11, P<0.001; Order level increase = 0.09, P<0.001) and the Kappa coefficient (abridged Suborder increase = 0.09, P<0.001; Order level increase = 0.09, P<0.001). The CT maps have higher classification accuracies and larger Kappa coefficients than the GARP maps (Suborder increase in accuracy = 0.18, P<0.001; Kappa coefficients = 0.17, P<0.001; Order level increase in accuracy = 0.16, P<0.001; Kappa coefficients = 0.17, P<0.001). The mean Kappa coefficients (Table 4) suggest a fair agreement for the CT maps and only slight agreement for the GARP maps (Landis and Koch 1977).

Classification accuracy is generally 0.12-0.14 higher at the Order level than at the Suborder level (Table 3); that is, 15-20% of the Suborder misclassifications are within-Order errors. The high misclassification rates at the Suborder level reflect both the difficulty in predicting 25 crisp classes, and the resultant harsh nature of such a measure.

The Kappa coefficients for CT are similar to results presented by Bui and Moran (2003). The classification accuracies are generally lower, except for the CT prediction at Order level incorporating radiometric data.

Confusion matrices

The confusion matrices for the Order level CT maps (Tables 5, 6) show similar results for both radiometric prediction and whole catchment. A large proportion of outside-Order classifications were misclassified to Vertosol. The full catchment map resulting from CT also shows a high level of misclassification to Tenosol Order (Table 6). The GARP radiometric prediction shows a similar trend, with misclassification to Kandosol higher than for CT (Table 7). In the whole catchment GARP map, Vertosol, Chromosol, Kandosol, and Rudosol Orders all have high numbers of false positives (Table 8).

Distance from grid location to nearest soil observation with same class

The distance to the nearest soil class observation for all grid points on the 200-m resolution grid maps is a measure of neighbourhood accuracy. This measure is based on the premise that the nearer a grid location is to a soil observation location with the same soil Suborder class as that predicted at the grid location, the more accurate the prediction. The results are illustrated in Fig. 10b-e. All of the maps produced by CT exhibit similar patterns of distance to nearest similar soil profile observations (Fig. 10b, d). The area in the north-east section of the catchment, characterised by large distance to nearest soil profile with the same Suborder classification (Fig. 10), is in part due to scarcity of soil profile observations.

The distances to the nearest soil observations with the same soil classes are generally larger for the GARP model without radiometric data than those computed for the other models (Fig. 10c). Both GARP maps have wide sections with large distances to nearest soil observations at the western extremity of the catchment (Fig. 10c, e), again indicating problems at edges and narrow sections when using rule-sets consisting of conjunctions of ranges. By comparison, this is not the case for any of the CT maps.

[FIGURE 6 OMITTED]

General discussion and conclusions

Inclusion of gamma radiometric data in prediction model

The inclusion of radiometric data as a factor in the scorpan-sspf model significantly improved the classification accuracy and ability to differentiate between soil classes using all cases of the GARP and CT models (Tables 3-8, Figs 4, 5). The confusion matrices (Tables 5-8) show that the inclusion of radiometric data minimises the misclassification to the light-textured Orders such as Tenosols and Rudosols. These results are to be expected, as without radiometric counts, data on magnetic anomaly provide the only covariate for geology, regolith, or soil parent material. While magnetic anomaly data may be useful for geological mapping, more accurate representation of geology is enhanced, especially in terms of clay mineralogy, by the inclusion of radiometric information (Jaques et al. 1997). However, vegetation and soil moisture can cause attenuation of the gamma radiation (Wilford et al. 1997; Pickup and Marks 2000; Wilford and Minty 2007). Therefore, in the heavily vegetated areas of the catchment, such as forests and native vegetation areas north-west of Gunnedah, gamma radiometric imagery may reflect more of the variation in vegetation cover than the variation in mineralogical composition or parent material. Nevertheless, this representation of vegetation variation was probably as useful in all of the models.

[FIGURE 7 OMITTED]

[FIGURE 8 OMITTED]

GARP performance

To our knowledge, this is the first application of GARP to spatial prediction of mutually exclusive soil classes. Therefore, comparisons cannot be made against previously reported instances of using GARP. However, the range of mean accuracies for individual soil classes determined in this study (0.5-0.8) is similar to results reported by Anderson et al. (2002a) and Stockwell and Peterson (2002) in studies predicting the distribution of plant and animal species. The least represented soils in the catchment, Organosols and Anthroposols (Fig. 3), have the lowest accuracies. These accuracy values do not reflect the final map accuracy; instead, they are used in the development of the final map.

The choice of ancillary variables greatly affects the ability of GARP to differentiate between soil classes. The inclusion of radiometric data improved the performance of GARP sufficiently to produce a digital soil class map of similar quality to the map produced by the CT model for the whole catchment (Table 3).

[FIGURE 9 OMITTED]

The large areas of Red Vertosol and Kurosol Suborders predicted by GARP in the whole catchment, especially in the eastern or upper catchment area (Fig. 4b), appear to be artefacts of the individual soil class map combination method. The artefact areas of Red Vertosol and Kurosols correspond to the large swathes of multiple allocations present in Fig. 9a, with the mean model accuracy of the Red Vertosol and Kurosol Suborders among the highest for the whole catchment GARP models.

There is evidence of some spatial trend in the maximum proportions of predicted classes, as the maximum increases from the west to the east of the catchment for both GARP maps (Fig. 6). This result is confirmed by the distances to closest similar soil observations (Fig. 10d, e). Figure 8 is also suggestive of some edge effect in the GARP predictions close to the catchment boundary and in the narrow section of the catchment, west of Walgett, due to the use of conjunctions of ancillary variable value ranges in the rule-sets.

The coherency of the resulting digital soil class maps

All of the models predicted Grey Vertosols as the predominant soil types in the alluvial plains west of Narrabri (Figs 4, 5), which corresponds with the preponderance of deep cracking clays in the floodplains as reported by Donaldson and Heath (1997), Young et al. (2002), and Bui and Moran (2003). The soil class maps predicted by CT and GARP, both without radiometric data as covariates, exhibit more coherent and contiguous polygons of soil classes, although these are less structured in the rougher and complex terrains of the upper part of the catchment. The large area of Red Vertosol predicted in the Pilliga area by the full catchment GARP (Fig. 4b) is highly improbable as the underlying geology is Pilliga sandstone. Colluvial and alluvial movement of coarse-grained sediments from the coarse-textured or poorly structured soil classes, developed on the Pilliga sandstones (Figs 4, 5), explains the pedosequence of soil classes as predicted by some models (Young et al. 2002).

Can GARP be further adapted for soil class mapping?

The heavy reliance on covariate choice, shown by the poor performance of GARP for the whole catchment, highlights several challenges in adapting this algorithm for soil class mapping. These problems include: (i) the individual soil class combination method; (ii) inherent limitations of DesktopGarp; (iii) issues with displaying the resulting digital soil class maps.

Increasing the number of runs for the individual soil classes may better differentiate between classes at each grid location. Moreover, a best subset procedure, such as outlined by Anderson et al. (2003), could be implemented. However, the problem with these 2 approaches is the computing power and time required to run such large numbers of model simulations for each soil class. A probabilistic output for each GARP run may help alleviate confusion in allocating soil classes to individual pixels.

The combination technique relies on the overall average accuracy of individual soil class model simulations to resolve the problems of multiple allocations at each pixel. Several other measures of accuracy are produced by DesktopGarp, including measures of commission and omission (Pereira 2005), which could be used instead of, or in conjunction with, the mean individual soil class model accuracy. Excluding areas where a given soil class may not occur would probably improve the recombination, which could be done by masking certain areas from prediction (Pereira 2005).
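The effect of resolving multiple allocations by mean model accuracy can be illustrated with a minimal Python sketch. It assumes one binary presence map per soil class and a single mean accuracy per class. The function name, class codes, and accuracy values are hypothetical and are not taken from the models reported here.

```python
def combine_binary_maps(presence, mean_accuracy, unallocated=None):
    """Combine per-class binary maps into a single class map.

    presence:      dict mapping class -> list of 0/1 values, one per pixel
    mean_accuracy: dict mapping class -> mean accuracy of that class model
    Where several classes are predicted at a pixel, the class whose model
    has the highest mean accuracy is allocated.
    """
    n_pixels = len(next(iter(presence.values())))
    combined = []
    for i in range(n_pixels):
        candidates = [cls for cls, cells in presence.items() if cells[i] == 1]
        if not candidates:
            combined.append(unallocated)
        else:
            combined.append(max(candidates, key=lambda cls: mean_accuracy[cls]))
    return combined

# Hypothetical presence maps for four pixels and hypothetical accuracies.
presence = {
    "Gr Ve": [1, 1, 0, 0],
    "R Ve":  [1, 0, 0, 1],
    "Ot Ku": [0, 1, 0, 1],
}
mean_accuracy = {"Gr Ve": 0.70, "R Ve": 0.78, "Ot Ku": 0.75}
combined = combine_binary_maps(presence, mean_accuracy)
```

In this toy case the class with the highest mean accuracy wins every pixel where it overlaps another class. A rule of this kind would therefore tend to enlarge the mapped area of classes that have high accuracy and many overlapping allocations.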

GARP has been noted to consistently over-predict the size of animal species habitats (Barker et al. 2006). It has also been used to model species invasion. Similar to our GARP approach for predicting soil classes, it has been used to model not only the existing habitats but also potential habitats of plant and animal species (e.g. Oberhauser and Peterson 2003; Peterson et al. 2003, 2004, 2006; Roura-Pascual et al. 2004; Wang and Wang 2006). While the latter approach may be useful in mapping soil information such as temporal change of salt-affected areas, it confuses the allocation of a single soil class to a pixel when mapping multiple soil classes. Soil classification, as in the Australian Soil Classification, requires the allocation of a soil into one of several classes--a multinomial classification and not binary as is the case with classical GARP applications. Implementing the GARP modelling to allow the concurrent modelling of multi-category responses would likely resolve the problems associated with combining multiple maps for individual, binary response soil classes as outlined above. This would be similar to the extension of binary logit models to multinomial logit models or from binary to multicategory classification trees. The multinomial implementation could then be run a set number of times and the modal prediction at each grid location used to determine the final digital soil class map.
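The final step suggested above, taking the modal prediction over repeated runs, could look like the following Python sketch. It assumes that each run returns one class label per grid location. The function name and the alphabetical tie-break are arbitrary choices made for the example.

```python
from collections import Counter

def modal_class_map(runs):
    """Return the most frequently predicted class at each grid location.

    runs: list of runs, each a list of class labels (one per grid location).
    Ties are broken alphabetically, which is an arbitrary choice made here.
    """
    modal = []
    for labels in zip(*runs):
        counts = Counter(labels)
        top = max(counts.values())
        modal.append(min(cls for cls, n in counts.items() if n == top))
    return modal

# Three hypothetical runs over three grid locations.
runs = [
    ["VE", "CH", "SO"],
    ["VE", "DE", "SO"],
    ["KA", "CH", "TE"],
]
modal = modal_class_map(runs)
```

The same counts, divided by the number of runs, would give the proportion of runs predicting each class at a location.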

[FIGURE 10 OMITTED]

Some general comments on digital soil class mapping

While this paper has considered the important task of mapping soil classes, mapping soil functional attributes is also an area of current research (e.g. Carre et al. 2007). Although this multinomial implementation of GARP has not performed as well as we would have liked, the use of GARP to map binary soil functional attributes, such as areas with salinity or sodicity above a certain threshold, is worth exploring. This approach would be unencumbered by the problems associated with multinomial classification.

Not all misclassifications are equal. In this paper we have attempted to accommodate this fact by measuring the classification accuracy and Kappa coefficients at the Order level in the hierarchical Australian Soil Classification. This, however, does not account for similarities between classes at the Order level. Minasny and McBratney (2007) incorporated a taxonomic distance into CT for creation of a digital soil class map using Australian Soil Classification Orders, while Hengl et al. (2007) produced digital maps of overlapping soil classes using supervised classification, multinomial logistic regression, multiple indicator kriging, and classification of taxonomic distances, but did not combine individual soil class membership maps into a single digital soil class map. The ability to quantify and determine the taxonomic distances between soil classes requires understanding of the classification scheme along with a database of soil profiles--with associated soil property information--for each class such that a modal profile can be developed (Minasny and McBratney 2007).
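For reference, overall classification accuracy and the Kappa coefficient (Cohen 1960) can both be derived from a confusion matrix such as those in Tables 5-8. The Python sketch below uses a hypothetical two-class matrix, and the function name is an assumption made for the example.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's Kappa from a square confusion matrix.

    matrix[i][j] is the number of cases predicted as class i and
    observed as class j.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: proportion of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement from the row and column marginal totals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix, not data from this study.
matrix = [[20, 5],
          [10, 15]]
accuracy, kappa = accuracy_and_kappa(matrix)
```

Both measures treat every off-diagonal cell as an equal error, which is the limitation discussed above.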

Another important issue raised by this paper is how to quantify and best display uncertainty and variation in digital soil class maps. This is a growing area of research (Bishop et al. 2001, 2006; Hengl et al. 2007). Recognising classification schemes as inherently fuzzy through the use of membership functions (e.g. Hengl et al. 2007; Minasny and McBratney 2007) can be considered similar to GARP proportion of each soil class, and the probability of occurrence at each terminal node in the CT. The challenge is to display this membership information in a coherent digital soil class map without creating a complex, confusing legend.

To produce good-quality digital soil class maps, more data (in terms of more field sampling and fieldwork) may be needed to fill the current gaps in data. However, combining soil data from different sources and merging them into a common geospatial database for digital soil mapping is the first step and, although challenging, can only assist in the development of more accurate digital soil class maps. Although the GARP approach has been specifically designed to utilise legacy data, and thus should be more suitable for such disparate and multi-sourced soil profile description and soil class data as used in this study, CT out-performed GARP as implemented here. We have shown that digital soil maps, produced using legacy soil profile data, can provide accurate digital soil class maps with quantifiable certainty.

Acknowledgements

The authors thank the Cotton Catchment Communities Cooperative Research Centre for financial support through its Summer Scholarship program for the first author. We also thank Dr Budiman Minasny of the University of Sydney for his suggestions on some of the techniques used in this paper.

Manuscript received 1 October 2008, accepted 21 May 2009

References

Anderson RP, Gomez-Laverde M, Peterson AT (2002a) Geographical distributions of spiny pocket mice in South America: insights from predictive models. Global Ecology and Biogeography 11, 131-141. doi: 10.1046/j.1466-822X.2002.00275.x

Anderson RP, Lew D, Peterson AT (2003) Evaluating predictive models of species' distributions: criteria for selecting optimal models. Ecological Modelling 162, 211-232. doi: 10.1016/S0304-3800(02)00349-6

Anderson RP, Peterson AT, Gomez-Laverde M (2002b) Using niche-based GIS modeling to test geographic predictions of competitive exclusion and competitive release in South American pocket mice. Oikos 98, 3-16. doi: 10.1034/j.1600-0706.2002.t01-1-980116.x

Bailey N, Clements T, Lee JT, Thompson S (2003) Modelling soil series data to facilitate targeted habitat restoration: a polytomous logistic regression approach. Journal of Environmental Management 67, 395-407. doi: 10.1016/S0301-4797(02)00227-X

Barker S, Benitez S, Baldy J, Cisneros-Heredia D, Colorado G, et al. (2006) Modeling the South American Range of the Cerulean Warbler. In 'ESRI International User Conference'. (ESRI: Redlands, CA)

Behrens T, Forster H, Scholten T, Steinrucken U, Spies ED, Goldschmitt M (2005) Digital soil mapping using artificial neural networks. Journal of Plant Nutrition and Soil Science-Zeitschrift Fur Pflanzenernahrung Und Bodenkunde 168, 21-33. doi: 10.1002/jpln.200421414

Behrens T, Scholten T (2007) A comparison of data mining techniques in predictive soil mapping. In 'Digital soil mapping: an introductory perspective'. (Eds P Lagacherie, AB McBratney, M Voltz) pp. 353-364. (Elsevier: Amsterdam)

Bishop TFA, McBratney AB, Whelan BM (2001) Measuring the quality of digital soil maps using information criteria. Geoderma 103, 95-111. doi: 10.1016/S0016-7061(01)00071-4

Bishop TFA, Minasny B, McBratney AB (2006) Uncertainty analysis for soil-terrain models. International Journal of Geographical Information Science 20, 117-134. doi: 10.1080/13658810500287073

Bui E (2007) A review of digital soil mapping in Australia. In 'Digital soil mapping: an introductory perspective'. (Eds P Lagacherie, AB McBratney, M Voltz) pp. 25-39. (Elsevier: Amsterdam)

Bui EN, Loughhead A, Corner R (1999) Extracting soil-landscape rules from previous soil surveys. Australian Journal of Soil Research 37, 495-508. doi: 10.1071/S98047

Bui EN, Moran CJ (2003) A strategy to fill gaps in soil survey over large spatial extents: an example from the Murray-Darling basin of Australia. Geoderma 111, 21-44. doi: 10.1016/S0016-7061(02)00238-0

Carre F, McBratney AB, Mayr T, Montanarella L (2007) Digital soil assessments: Beyond DSM. Geoderma 142, 69-79. doi: 10.1016/j.geoderma.2007.08.015

Chen GJ, Peterson AT (2002) Prioritization of areas in China for the conservation of endangered birds using modelled geographic distributions. Bird Conservation International 12, 197-209. doi: 10.1017/S0959270902002125

Cohen J (1960) A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20, 37-46. doi: 10.1177/001316446002000104

Cook SE, Corner RJ, Groves PR, Grealish GJ (1996) Use of airborne gamma radiometric data for soil mapping. Australian Journal of Soil Research 34, 183-194. doi: 10.1071/SR9960183

Dikau R, Brabb E, Mark R (1991) 'Landform classification of New Mexico by Computer.' (U.S. Geological Survey: Denver, CO)

Donaldson S, Heath T (1997) Namoi river catchment report on land degradation and proposals for integrated management for its treatment and prevention. NSW Department of Land and Water Conservation.

Gallant JC, Dowling TI (2003) A multiresolution index of valley bottom flatness for mapping depositional areas. Water Resources Research 39(12), 1347. doi: 10.1029/2002WR001426

Giasson E, Clarke RT, Inda AV, Merten GH, Tornquist CG (2006) Digital soil mapping using multiple logistic regression on terrain parameters in Southern Brazil. Scientia Agricola 63, 262-268. doi: 10.1590/S0103-90162006000300008

Good PI (1999) 'Resampling methods: a practical guide to data analysis.' (Birkhauser: Boston)

Hammond EH (1964) Analysis of properties in land form geography: an application to broad-scale land form mapping. Annals of the Association of American Geographers 54, 11-19. doi: 10.1111/j.1467-8306.1964.tb00470.x

Hengl T, Toomanian N, Reuter HI, Malakouti MJ (2007) Methods to interpolate soil categorical variables from profile observations: lessons from Iran. Geoderma 140, 417-427. doi: 10.1016/j.geoderma.2007.04.022

Isbell RF (1996) 'The Australian Soil Classification.' (CSIRO Publishing: Collingwood, Vic.)

Jaques AL, Wellman P, Whitaker A, Wyborn D (1997) High-resolution geophysics in modern geological mapping. AGSO Journal of Australian Geology & Geophysics 17, 159-173.

Lagacherie P, Holmes S (1997) Addressing geographical data errors in a classification tree for soil unit prediction. International Journal of Geographical Information Science 11, 183-198. doi: 10.1080/136588197242455

Landis JR, Koch GG (1977) The measurement of observer agreement for categorical data. Biometrics 33, 159-174. doi: 10.2307/2529310

McBratney AB, Odeh IOA, Bishop TFA, Dunbar MS, Shatar TM (2000) An overview of pedometric techniques for use in soil survey. Geoderma 97, 293-327. doi: 10.1016/S0016-7061(00)00043-4

McBratney AB, Santos MLM, Minasny B (2003) On digital soil mapping. Geoderma 117, 3-52. doi: 10.1016/S0016-7061(03)00223-4

McKenzie NJ, Gessler PE, Ryan PJ, O'Connell D (2000) The roles of terrain analysis in soil mapping. In 'Terrain analysis: Principles and applications'. (Eds JP Wilson, J Gallant) (John Wiley & Sons, Inc.: New York)

MDBC (1999) Inventory of GIS datasets held by the Murray-Darling Basin Commission. Murray-Darling Basin Commission.

Minasny B, McBratney AB (2007) Incorporating taxonomic distance into spatial prediction and digital mapping of soil classes. Geoderma 142, 285-293. doi: 10.1016/j.geoderma.2009.08.001

Moran CJ, Bui EN (2002) Spatial data mining for enhanced soil map modelling. International Journal of Geographical Information Science 16, 533-549. doi: 10.1080/13658810210138715

NCMA (2005) Namoi Catchment Management Authority: Three year investment strategy 2004-2007. Namoi Catchment Management Authority.

NCMB (2003) Namoi Catchment: A Blueprint for the Future. NSW Department of Land and Water Conservation, Sydney.

Northcote KH (1979) 'A factual key for the recognition of Australian soil.' (Rellim Technical Publications: Glenside, S. Aust.)

Oberhauser K, Peterson AT (2003) Modeling current and future potential wintering distributions of eastern North American monarch butterflies. Proceedings of the National Academy of Sciences of the United States of America 100, 14063-14068. doi: 10.1073/pnas.2331584100

Odeh IOA, Cattle S, Triantafilis J, McBratney AB (2004) The Australian Cotton Soil Database and Geographic Information System. In 'Quality cotton a living industry. Proceedings of the 12 Australian Cotton Conference'. Gold Coast Convention and Exhibition Centre. pp. 493-507. (ACGRA: Orange, NSW)

Odeh IOA, Todd AJ, Triantafilis J (2003) Spatial prediction of soil particle-size fractions as compositional data. Soil Science 168, 501-515. doi: 10.1097/00010694-200307000-00005

Pereira RS (2005) DesktopGarp 1.1.6. University of Kansas Biodiversity Research Center.

Peterson AT, Cohoon KP (1999) Sensitivity of distributional prediction algorithms to geographic data completeness. Ecological Modelling 117, 159-164. doi: 10.1016/S0304-3800(99)00023-X

Peterson AT, Papes M, Kluza DA (2003) Predicting the potential invasive distributions of four alien plant species in North America. Weed Science 51, 863-868. doi: 10.1614/P2002-081

Peterson AT, Papes M, Reynolds MG, Perry ND, Hanson B, Regnery RL, Hutson CL, Muizniek B, Damon IK, Carroll DS (2006) Native-range ecology and invasive potential of Cricetomys in North America. Journal of Mammalogy 87, 427-432. doi: 10.1644/05-MAMM-A133R3.1

Peterson AT, Scachetti-Pereira R, Hargrove WW (2004) Potential geographic distribution of Anoplophora glabripennis (Coleoptera: Cerambycidae) in North America. American Midland Naturalist 151, 170-178. doi: 10.1674/0003-0031(2004)151[0170:PGDOAG]2.0.CO;2

Pickup G, Marks A (2000) Identifying large-scale erosion and deposition processes from airborne gamma radiometrics and digital elevation models in a weathered landscape. Earth Surface Processes and Landforms 25, 535-557. doi: 10.1002/(SICI)1096-9837(200005)25:5<535::AID-ESP91>3.0.CO;2-N

R Development Core Team (2006) 'R: A language and environment for statistical computing.' (R Foundation for Statistical Computing: Vienna)

Raimundo RLG, Fonseca RL, Scachetti-Pereira R, Townsend Peterson A (2007) Native and exotic distributions of siamweed modeled using the genetic algorithm for rule-set production. Weed Science 55, 41-48. doi: 10.1614/WS-06-083.1

Ripley B (2006) 'Tree: Classification and regression trees.' R Package Version 1.0-24.

Roura-Pascual N, Suarez AV, Gomez C, Pons P, Touyama Y, Wild AL, Peterson AT (2004) Geographical potential of Argentine ants (Linepithema humile Mayr) in the face of global climate change. Proceedings of the Royal Society of London. Series B. Biological Sciences 271, 2527-2534. doi: 10.1098/rspb.2004.2898

Stockwell D (1999) Genetic algorithms II: Species distribution modelling. In 'Machine learning methods for ecological applications'. (Ed. A Fielding) (Kluwer Academic Publishers: Boston)

Stockwell D, Beach JH, Stewart A, Vorontsov G, Vieglais D, Pereira RS (2006) The use of the GARP genetic algorithm and internet grid computing in the Lifemapper world atlas of species biodiversity. Ecological Modelling 195, 139-145. doi: 10.1016/j.ecolmodel.2005.11.016

Stockwell D, Peters D (1999) The GARP modelling system: problems and solutions to automated spatial prediction. International Journal of Geographical Information Science 13, 143-158. doi: 10.1080/136588199241391

Stockwell DRB, Noble IR (1992) Induction of sets of rules from animal distribution data: a robust and informative method of data analysis. Mathematics and Computers in Simulation 33, 385-390. doi: 10.1016/0378-4754(92)90126-2

Stockwell DRB, Peterson AT (2002) Effects of sample size on accuracy of species distribution models. Ecological Modelling 148, 1-13. doi: 10.1016/S0304-3800(01)00388-X

Wang R, Wang YZ (2006) Invasion dynamics and potential spread of the invasive alien plant species Ageratina adenophora (Asteraceae) in China. Diversity & Distributions 12, 397-408. doi: 10.1111/j.1366-9516.2006.00250.x

Wilford J, Minty B (2007) The use of airborne gamma-ray imagery for mapping soils and understanding landscape processes. In 'Digital soil mapping: an introductory perspective'. (Eds P Lagacherie, AB McBratney, M Voltz) pp. 207-218. (Elsevier: Amsterdam)

Wilford JR, Bierwith PN, Craig MA (1997) Application of airborne gamma-ray spectrometry in soil/regolith mapping and applied geomorphology. AGSO Journal of Australian Geology & Geophysics 17, 201-216.

Young RW, Young ARM, Price DM, Wray RAL (2002) Geomorphology of the Namoi alluvial plain, northwestern New South Wales. Australian Journal of Earth Sciences 49, 509-523. doi: 10.1046/j.1440-0952.2002.00934.x

Zhang L, Beavis SG, Gray SD (1999) Development of a spatial database for large-scale catchment management: geology, soils and landuse in the Namoi Basin, Australia. Environment International 25, 853-860. doi: 10.1016/S0160-4120(99)00057-4

M. A. Nelson (A,B) and I. O. A. Odeh (A)

(A) Faculty of Agriculture, Food and Natural Resources, The University of Sydney, NSW, Australia. (B) Corresponding author. Email: michael.n@usyd.edu.au

Table 1. Abridged Australian Soil Classification (ASC) Suborder definitions and codes

See Isbell (1996); XX, all Suborders within the ASC Order; ZZ, all other Suborders within ASC Order

ASC code         Abridged Suborder code   Description
AN XX            ANTH                     Anthroposols
OR XX            ORG                      Organosols
HY XX            HYD                      Hydrosols
PO XX            POD                      Podosols
CA XX            CALC                     Calcarosols
VE AA            R Ve                     Red Vertosols
VE AB            Br Ve                    Brown Vertosols
VE AD            Gr Ve                    Grey Vertosols
VE AE            Bl Ve                    Black Vertosols
VE ZZ            Ot Ve                    Other Vertosols
KU XX            Ot Ku                    Kurosols
SO AB            Br So                    Brown Sodosols
SO ZZ            Ot So                    Other Sodosols
CH AB            R Ch                     Brown Chromosols
CH ZZ            Ot Ch                    Other Chromosols
FE XX            Fe                       Ferrosols
DE AB, AE        BlBr De                  Black/Brown Dermosols
DE ZZ            Ot De                    Other Dermosols
KA AA, AB        BrR Ka                   Brown/Red Kandosols
KA ZZ            Ot Ka                    Other Kandosols
RU CY            Le Ru                    Leptic Rudosols
RU ER            St Ru                    Stratic Rudosols
RU ZZ            Ot Ru                    Other Rudosols
TE CY, AW, GZ    BLO Te                   Bleached/Leptic Orthic Tenosols
TE ZZ            Ot Te                    Other Tenosols

Table 2. Ancillary information used in digital soil class mapping

Ancillary information            Scorpan factor
DEM                              r
Curvature                        r
Profile Curvature                r
Plan Curvature                   r
Aspect                           r
Slope                            r
MRVBF (A)                        r
TWI (B)                          r
Hammond Landform Class (C)       r
Magnetic Anomaly                 p
Gamma Radiometrics (U, Th, K)    p
Land Use                         o
NDVI                             o

(A) Multi-resolution Valley Bottom Flatness Index (Gallant and Dowling 2003).

(B) Topographic Wetness Index (McKenzie et al. 2000).

(C) See Dikau et al. (1991) and Hammond (1964).

Table 3. Results of ANOVA comparing the displayed classification accuracy for digital soil class map produced by GARP and CT for both radiometric prediction and full catchment extents

Separate analyses were performed for abridged Suborders and ASC Order data. Values followed by the same letters are not significantly different at [alpha] = 0.05

Predictive      Displayed classification accuracy at:
function        Abridged Suborder level     ASC Order level
Radiometric:    Included    Excluded        Included    Excluded
GARP            0.23a       0.10b           0.37a       0.25b
CT              0.41c       0.28d           0.53c       0.41d

Table 4. Results of ANOVA comparing the Kappa coefficients for digital soil class map produced by GARP and CT for both radiometric prediction and full catchment extents

Separate analyses were performed for abridged Suborders and ASC Order data. Values followed by the same letters are not significantly different at [alpha] = 0.05

Predictive      Displayed classification accuracy at:
function        Abridged Suborder level     ASC Order level
Radiometric:    Included    Excluded        Included    Excluded
GARP            0.15a       0.07b           0.19a       0.07b
CT              0.36c       0.22d           0.37c       0.25d

Table 5. Confusion matrix for Order level classifications: CT, radiometric prediction

Predicted        Observed classification
classification   AN   CA   CH   DE   FE   HY   KA   KU   OR   RU   SO   TE   VE
AN                0    0    1    1    0    0    0    0    0    0    0    0    0
CA                0    0    0    0    0    0    0    0    0    0    0    0    0
CH                0    0  136   25    1    0   19    3    4    6   21   20  108
DE                0    1   42   83    2    0    7    4    3   10   19   16   58
FE                0    0    2    0    4    0    2    0    0    0    0    0    1
HY                0    0    0    0    0    0    0    0    0    0    0    0    2
KA                0    1    9   10    0    0   52    0    4   13   17    4   18
KU                0    0    7    3    0    0    0   24    0    1    5    1    9
OR                1    2    4    5    0    0    5    2   40    3   11    4   31
RU                0    0   17   10    2    0    7    1   10   66   13   18   22
SO                0    0   23   20    3    0    6    4    5   10   96   14   53
TE                0    0   20   15    0    0   12    9    7   13   23   81   54
VE                1    5  126   87    2    6   26    7   26   27   52   38  958

Table 6. Confusion matrix for Order level classifications: CT, full catchment

Predicted        Observed classification
classification   AN   CA   CH   DE   FE   HY   KA   KU   OR   RU   SO   TE   VE
AN                0    0    0    0    0    0    0    0    0    0    0    0    0
CA                0    0    0    0    0    0    0    0    0    0    0    0    3
CH                3    0  133   39    0    0   19   13   12   21   37   26  130
DE                0    0   40   99    4    2   12    4    4    8    8   15   76
FE                0    0    3    2    8    1    0    1    0    2    4    2    8
HY                0    0    0    0    0    0    0    0    0    0    0    0    0
KA                0    0   13   11    2    0   38    1    0    4   11   12   23
KU                1    0   11    0    0    0    4   22    2    0    5    4   12
OR                0    0    7    6    2    2    5    2   24    0    5    2   20
RU                0    0   13    9    2    0    8    4    4   35   17   16   28
SO                0    0   27   11    4    0   12    5    4   18   85   18   65
TE                2    9  188  159    6    0   68   13   36   79   94  132  219
VE                1    8  164  101    2    4   29   12   33   32   57   30 1018

Table 7. Confusion matrix for Order level classifications: GARP, radiometric prediction

Predicted        Observed classification
classification   AN   CA   CH   DE   FE   HY   KA   KU   OR   RU   SO   TE   VE
AN                0    0    0    0    0    0    0    0    0    0    0    0    0
CA                0    0    0    0    0    0    0    0    0    0    0    0    1
CH                0    4   59   30    1    4   10    7   10   16   24   12  199
DE                1    3   76   76    5    0    4    5   17   16   32   23   96
FE                0    0    0    0    0    0    0    0    0    0    0    0    1
HY                0    0    0    0    0    0    0    0    0    0    0    0    0
KA                0    0   97   51    0    0   53    5    6   24   44   48  136
KU                0    0    5    0    0    0   12   12    8    4    4   10   15
OR                1    0   16    8    6    0    7    7   29   11   21   16   69
RU                0    0   20   18    0    0   23    6   13   41   56   34   54
SO                0    0    0    0    0    0    0    0    0    0    0    0    0
TE                0    0   19   17    0    0   15    8    5   14   25   38   49
VE                0    2   95   59    2    2   12    4   11   23   51   15  694

Table 8. Confusion matrix for Order level classifications: GARP, full catchment

Predicted        Observed classification
classification   AN   CA   CH   DE   FE   HY   KA   KU   OR   RU   SO   TE   VE
AN                0    0    0    0    0    0    0    0    0    0    0    0    3
CA                0    4   18   10    2    0    4    0    2    2   18    5   34
CH                2    4  262  129    7    2   69   20   32   49  102   68  649
DE                0    0    7   37    2    1    4    7    1    2    8    5    6
FE                1    0   10    4    4    2    0    0    1    5    6    4    9
HY                0    0    0    3    0    0    0    0    0    0    0    0    2
KA                0    1   90   91   10    0   63    4   19   19   29   46   82
KU                0    0   25   16    0    2    2   21    9    6    9   11   30
OR                2    2   14   10    0    2    8    3   24    2   13   13   57
RU                0    2   67   37    2    0   21   12   13   59   50   43  120
SO                0    0   16   16    0    0    6    0    2    2   17    8   15
TE                0    0    5    6    0    0    2    0    2    0    2    3   23
VE                2    4   85   78    3    0   16   10   14   53   69   51  572
COPYRIGHT 2009 CSIRO Publishing
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2009 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Nelson, M.A.; Odeh, I.O.A.
Publication:Australian Journal of Soil Research
Article Type:Report
Geographic Code:8AUST
Date:Sep 1, 2009
Words:9802