Printer Friendly

Hierarchical clustering of multiobjective optimization results to inform land-use decision making.


Planning problems often have many potential solutions and multiple competing objectives. These types of problems are well addressed by multiobjective optimization methods. Multiobjective optimization is applied to problems in a variety of fields where multiple conflicting objectives must be considered. The result is a nondominated set of potential solutions. A solution is nondominated if no feasible solution exists that is better on all objectives.

Balling (2004) used a multiobjective optimization algorithm to consider city and regional land-use and transportation planning. Like this paper and that of Roberts (2003), the goal of using multiobjective optimization was to improve on traditional planning methods. In most planning decisions, the alternative plans are formulated based on the experience and preferences of planners then presented to the public and the decision makers. This small set of plans cannot adequately capture the complexity of the planning problem and is inherently subjective (Balling 2004).

To evaluate their approach, Balling (2004) presented the results of their analysis to local city planners, state planners, and environmental planners, as well as local politicians. All persons consulted approved of this approach and encouraged continued work although a final plan was not chosen from the 100 resulting plans. Motivating this work, Balling (2004) believes that one reason that a plan was not chosen from the optimization results is the difficulty of considering such a large number of plans. Even with a large number of plans for consideration, planners uncovered key aspects of the problem that were used in the selection of a final plan. With current computing power, it is possible to consider multiobjective problems with very large nondominated solution sets. For example, the land-use problem in this paper has 6,561 nondominated solutions. According to Balling (2004), the number of plans to be considered must be objectively reduced to a set of plans representing "distinct conceptual ideas." In other words, decision makers need a sample of the nondominated solutions that is sufficiently representative of the possibilities and trade-offs but small enough for tractable consideration.

The multiobjective optimization literature acknowledges the need for a method of reducing or organizing the nondominated set (Benson and Sayin 1997). Several researchers (Rosenman and Gero 1985, Morse 1980, Taboada et al. 2007) have dealt with this issue using cluster analysis or filtering. This paper differs from their work for it aims not only to make the nondominated set tractable but to do so without removing any elements of the nondominated set before presenting the solutions to the decision makers. A set of potential land-use plans is organized into nested groups of similarly performing plans without filtering out any plans.

Cluster analysis can be applied to the results of a multiobjective optimization algorithm to organize or partition solutions based on their objective function values. In this paper, clustering is used to take a large set of land-use plans and organize them based on proportions of urban, natural, and agricultural land-use as well as landscape-ecology measures. The goal of clustering is to create an "efficient representation that characterizes the population being sampled" (Jain and Dubes 1988, p. 55). Such a representation allows a decision maker to further understand the decision by making available the attainable limits for each objective, key decisions and their consequences, and the most relevant variables; this presentation is an improvement on a list of potential solutions and their associated objective function values.

This paper details a hierarchical cluster analysis approach to organize Pareto optimization results into a hierarchical representation. The methodology is applied to a greenlands system design problem in an urban fringe area in southern Ontario, Canada. In this study, 171 unique potential land-use configurations were generated using the Nondominated Sorting Genetic Algorithm II (NSGA-II) (Deb et al. 2002) described in the following section with eight landscape ecology-based criteria. A hierarchical algorithm is desirable for it presents a nested organization of the land-use configurations that can be used for decision making as demonstrated in the example decision.


A multiobjective framework is used for optimization problems that cannot be satisfactorily converted into a single objective problem. Within this framework, trade-offs are explicitly considered. A multiobjective optimization problem is composed of a set of decision variables, a set of objective functions of those variables to be maximized or minimized, and a set of constraints on the values of those variables.

Without loss of generality, assume that all objective functions are to be maximized. Mathematically, a multiobjective problem can be written as shown in problem 1.

Maximize f(x) = ([f.sub.1](x), [f.sub.2](x), ... [f.sub.m](x)) Subject to x = ([x.sub.1], [x.sub.2], ..., [x.sub.n]) [epsilon] [member of] (1)

where X is the set of feasible solutions, often defined by a set of constraints.

It is unlikely that a single solution x in X maximizes all the objective functions simultaneously because the objective functions typically conflict. The efficient set, E, is the set of feasible solutions x in X for which no other feasible solution is as good as x with respect to all objective functions and strictly better than x in at least one objective function. For I = {1, ..., m}, the efficient set is formally defined as in equation 2.

E = { X [member of] X: [f.sub.i] (x) [greater than or equal to] [f.sub.i] (y) [for all] y [member of] X, i [member of] I, and [f.sub.i](x) > [f.sub.i](y) for some i [member of] I (2)

The solutions in E are said to be Pareto optimal or globally nondominated. The Pareto front is the mapping of the efficient set to the space defined bythe objective functions, i.e., {f (x): X [member of] E} . A nondominated set is a set that is efficient with respect to its own elements, i.e., satisfying equation 2 with E = X. No solution in a nondominated set dominates or is dominated by any other solution in the set. In Figure 1, solutions A and B are nondominated for no solutions are better on both objectives [f.sub.1](x) and [f.sub.2](x). Solution C is dominated by solution A but not by solution B, because C is better on objective [f.sub.2](x) but B is better on objective [f.sub.1](x). The efficient or nondominated set consists of solutions A and B.


Three approaches can be taken to find a solution to multiobjective problems (Benson and Sayin 1997). The first approach entails reformulating the problem as a single objective problem. To do so, additional information is required from the decision makers, such as the relative importance or weights of the objectives, goal levels for the objectives, value functions, etc. The second approach requires that the decision makers interact with the optimization procedure typically by specifying preferences between pairs of presented solutions. The third approach, Pareto optimization, finds a representative set of nondominated solutions approximating the Pareto front. Pareto optimization methods, such as evolutionary multiobjective optimization algorithms, allow decision makers to investigate the potential solutions without a priori judgments regarding the relative importance of objective functions. Post-Pareto analysis is necessary to select a single solution for implementation (Benson and Sayin 1997).


All three approaches to solving multiobjective optimization problems have shortcomings. The solution returned by the single objective approach can highly depend on the weights and, in nonconvex problems, the responses to changes in weights or goals may be unpredictable. With conflicting and noncommensurate criteria as well, it can be difficult to make value judgments such as choosing weights or goals for the criteria. Given decision maker input, the first approach returns a single solution. Interactive approaches consider only a small set of nondominated solutions because of the effort required (Benson and Sayin 1997). Pareto optimization approaches return a potentially large number of solutions for consideration. Selecting a single solution from a large nondominated set is likely to be difficult for any decision maker. Benson and Sayin (1997) proposed that an ideal solution procedure for multiobjective optimization is to provide the decision makers with a globally representative subset of the nondominated set that is sufficiently small so it is tractable. This work aims to approach this ideal procedure by accepting the computational effort required to generate a large nondominated set and subsequently organizing it based on its structure in such a way that decision makers can tractably consider interesting subsets without a priori removal of any solutions from consideration.


Any Pareto optimization method could be employed in our methodology. NSGA-II is used for it is known to perform well with nonconvex, disconnected, and nonuniform Pareto fronts (Deb et al. 2002). The use of this heuristic algorithm allows for efficient searching through many potential solutions while considering several objectives that may be nonconvex and discontinuous.

NSGA-II is an evolutionary multiobjective algorithm, specifically a genetic algorithm. Evolutionary multiobjective algorithms apply biologically inspired evolutionary processes as heuristics to generate nondominated sets of solutions. It should be noted that the solutions returned by evolutionary multiobjective algorithms may not be Pareto optimal, that is, globally nondominated, but the algorithms are designed to evolve solutions that approach the Pareto front and spread out to capture the diversity existing on the Pareto front to obtain a good approximation of the Pareto front.

Genetic algorithms operate on a population of solutions and employ selection, crossover, and mutation operators, among others, to generate successive improved populations based on a fitness function. At each generation, a set of potential parents is generated, subsets of the parents are combined to create offspring, and the fittest offspring are included in the next generation (Falkenauer 1998).

NSGA-II differs from single objective genetic algorithms in two respects: It aims to maintain diversity in the population instead of converging to a single solution and it uses nondomination to assess the fitness of individuals. NSGA-II also is elitist; the members of the next generation can be drawn from either the offspring or the parents of the current generation, thus potentially preserving good solutions from the previous generations that may have been destroyed by crossover or mutation.

The procedure for applying NSGA-II and an example follow:

1. Randomly generate the first generation.

2. Partition the parents and offspring of the current generation into k fronts [F.sub.1], [F.sub.2], .... [F.sub.k], so that the members of each front are dominated by all members of better fronts and by no members of worse fronts.

3. Calculate the crowding distance for each potential solution: For each front, sort the members of the front according to each objective function from the lowest value to the highest value. Compute the difference between the solution and the closest lesser solution and the closest greater solution, ignoring the other objective functions. Repeat for each objective function. The crowding distance for that solution is the average difference over all the objective functions. Note that the extreme solutions for each objective function are always included.

4. Select the next generation based on nondomination and diversity:

* Add the best fronts [F.sub.1], [F.sub.2], ..., [F.sub.j] to the next generation until it reaches the target population size.

* If part of a front, [F.sub.j+1]. must be added to the generation to reach the target population size, sort the members from the least crowded to most crowded. Then add as many members as are necessary, starting with the least crowded.

5. Generate offspring:

* Apply binary tournament selection by randomly choosing two solutions and including the higher-ranked solution with a fixed probability, typically between 0.5 and 1 (Goldberg and Deb 1991).

* Apply single-point crossover and the sitewise mutation to generate offspring.

6. Repeat steps 2 through 5 for the desired number of generations.


An example of the Pareto ranking and crowding distance calculations for a two-objective function maximization problem is shown in Figure 2. In this example, the population size is six so there are 12 solutions (parents and offspring) considered. Six of these solutions are included in the next generation. First, the Pareto fronts are identified: The first front contains the nondominated solutions, the second front contains the solutions dominated only by the first front, and the third front contains the solutions dominated by the first and second fronts. The next generation is formed by taking the three solutions in the first front and the three least-crowded solutions in the second front. These solutions then would be used to generate offspring and the process would be repeated.


Post-Pareto analysis aids decision makers in choosing a single solution from the potentially large set of Pareto optimization results. Several researchers have applied clustering methods in different ways to nondominated sets to aid decision makers. Most of these methods use the similarity of elements in the nondominated set based on their objective function values and remove elements that are too similar to other elements. We construct a tree data structure for the nondominated set, allowing decision makers to consider subsets of the nondominated solutions prior to removal of any solutions from further consideration.

Mattson et al. (2004) detailed a "smart Pareto filter" to obtain a representative subset of a nondominated set by defining regions of "practically insignificant trade-offs" around solutions. Each solution is considered successively and all solutions with "practically insignificant trade-offs" are removed for they are not sufficiently distinguishable from the solution under consideration. To create a representative set, more elements of the nondominated set are retained in areas with steeper trade-offs, commonly known as "knees." The dimensions of the regions of "practically insignificant trade-offs" for each objective function must be specified a priori (Mattson et al. 2004).

Morse (1980) detailed one of the first applications of cluster analysis to a nondominated set. The multiobjective programs considered were linear programs. A solution was removed from the nondominated set if it was indistinguishable from another solution based on decision maker-defined thresholds. Morse (1980) evaluated seven hierarchical clustering methods. Ward's method, the group average method, and the centroid method performed very well. Ward's method was preferred because the clusters at the same level of the hierarchy were similar in size and shape, although it performed only slightly better than the centroid and group average methods (Rosenman and Gero 1985). Ward's method creates clusters by minimizing the variance within each of the two clusters at each level of the hierarchy.

Rosenman and Gero (1985) applied complete linkage hierarchical clustering to "reduce the size of the Pareto optimal set whilst retaining its shape" (p. 189). This method allowed control of the diameter of the resulting clusters. They noted that solutions whose vectors of objective function values are similar may have decision variable vectors that are similar or very different but this idea was not further explored. The objective functions were considered successively to avoid the implicit aggregation in applying proximity measures. First, elements of the nondominated set were clustered using a single criterion. If a solution within a cluster dominated another solution in the cluster on all criteria except the clustering criterion, then the dominated solution was eliminated from consideration. The process was repeated for each criterion until the nondominated set was sufficiently small.

Taboada et al. (2007) used partitional (k-means) clustering for combinatorial multiobjective problems. Either the most interesting cluster, i.e., the "knee" cluster, was considered in detail by discarding the solutions in other clusters, or one solution from each of the k clusters was considered to form a representative subset of the nondominated set.

This paper differs from the previously mentioned work by considering hierarchical clustering and not reducing the size of the nondominated set under consideration. The hierarchical tree structure for the solutions allows the decision makers to tractably consider the solutions using a sequence of decisions to reduce the set of solutions under consideration.


Cluster analysis involves using algorithms and techniques to examine the internal organization in a data set in an objective way; it can be used to describe the data concisely and to uncover patterns and relationships that may not be readily apparent (Dubes 1993). The aim is to group objects that are similar in some way. A cluster analysis requires several steps: data scaling, proximity calculation, selection of a clustering algorithm, application of the clustering algorithm, and validation and consideration of results.

The input data is represented as a matrix, X, containing the p criteria, e.g., the objective function values, of the n elements to be clustered, e.g., the Pareto optimization results. Without data normalization or scaling, the relative values of the objective functions may act as implicit weightings. This weighting is undesirable because Pareto optimization is used to generate an unbiased set of optimal trade-off solutions without considering the relative importance of the objective functions. Milligan and Cooper (1988) and Gnanadesikan (1995) found range scaling better for recovering known cluster structures than raw data or normalization to unit variance and zero mean. Range scaling was used in our proposed hierarchical clustering methodology.

Clustering methods can be separated into two categories: partitional methods that provide a single partition of the solutions and hierarchical methods that provide a series of nested partitions. With a hierarchical clustering method, the number of clusters need not be known a priori (Ward 1963).

The tree structure of a hierarchical clustering algorithm can be useful for guiding decision processes when many alternatives must be considered. The tree of the cluster hierarchy often is represented in a dendrogram, where the top element in the tree, the root, is a cluster containing all the elements and the bottom elements, i.e., leaves, represent individual elements. The dendrogram displays the merging (or dividing) of clusters from the leaves to the root (or the root to the leaves) and the distance or dissimilarity between the merged (or split) clusters. The dendrogram is an objective structure that can be used by decision makers to discuss and consider the clustered elements.

The group average linkage method with Euclidean distance was used for this analysis. The Euclidean distance is easily interpretable and invariant to rotations and translations (Dubes and Jain 1976). The group average linkage computes the distance between clusters as the mean distance between all pairs with one element in the first cluster and one element in the second cluster including duplicate elements. Hierarchical clustering linkage methods, like all clustering methods, make assumptions about the sizes and shapes of clusters (Jain et al. 1999). Different linkage methods tend to identify clusters with different characteristics. The group average linkage allows clusters to vary in size and shape.

Validation, beginning with establishing a clustering tendency in the input data, is important because clustering methods will find clusters even in random data (Dubes and Jain 1979). The Cluster Validation and Analysis section that follows includes the process of validating the clustering results.


The proposed hierarchical clustering methodology is summarized in this section. The steps of this analysis are as follows:

1. Define decision variables, feasible set, and objective functions.

2. Choose and apply a Pareto optimization algorithm, e.g., NSGA-II.

3. Cluster analysis:

* Clustering tendency: By visual inspection or data projections verify that a hierarchical cluster structure is a reasonable model for the data.

* Data scaling: Remove implicit variable weightings due to relative scales using range scaling.

* Proximity: Select and apply an appropriate similarity measure for the data, here, Euclidean distance.

* Choice of algorithm(s): Consider the assumptions and characteristics of clustering algorithms and select the most suitable algorithm for the application, here, group average linkage.

* Application of algorithm: Apply the selected algorithm and obtain a dendrogram.

* Validation: Examine the results based on application subject matter knowledge, assess the fit to the input data and stability of the cluster structure, and compare the results of multiple algorithms, if used.

4. Represent and use the clusters and structure: If the clustering is reasonable and valid, examine the divisions in the hierarchy for trade-offs and other information to aid decision making.

The greenlands design problem is described in the following section, followed by the results of the cluster analysis and an example decision.


The greenlands design problem detailed by Roberts (2003) is applied to an urban fringe area near Toronto, Ontario, Canada. In this region, single-family residential housing and aggregate extraction (hereafter referred to collectively as urban), agriculture, and natural areas coexist. The analysis aims to inform land-use decision making concerning the effects of land-use, in particular potential habitat loss and fragmentation represented by reduction in the area and connectedness of natural land. The model takes into account the existing landscape features and land- use (see Figure 3). Currently abandoned fields could potentially be used to allow for urban growth, reseeded or allowed to regenerate as natural areas, or restored as agricultural land; these fields are the candidate sites for land-use change. The configuration of the landscape features is important for the function of the landscape for natural systems and human use.

The problem is formulated as a configuration optimization problem. Configuration optimization is a class of combinatorial optimization that manipulates geometric and topological properties of a system to optimize the system performance (Roberts 2003). Categorical variables represent the land-use classes. Twenty-eight total land-use classes in the source data were aggregated into four classes denoted as {1, 2, 3, 4}, representing "unchanged," "natural," "agriculture," and "urban." There are eight candidate sites available for land-use change as shown in white on the study area plot in Figure 3.


Eight objective functions were formulated based on landscape-ecology metrics (Roberts 2003). A nonmathematical description of each objective function follows. Many of the objective function evaluations depend on graph data structures representing the topology of the study area. Each objective function is formulated for maximization.

GA1a Area of Natural Features: More natural area is better. This objective is implemented as the ratio of the area of the candidate sites coded "natural" and the total area of the candidate sites.

GA1 Area Weighted Mean Shape: Compact natural areas are more desirable than elongated natural areas. This principle is modeled by maximizing the mean area to the perimeter ratio of the n largest sets of connected natural polygons. In this study, n is five, although n only reaches four in this data set.

GA2 Natural Feature Connectivity: Connected natural sites are preferable to the same natural sites scattered across the landscape. This objective maximizes the mean number of connected natural sites in the n largest connected sets of natural sites.

GA3 Stepping-stones of Natural Features on Shortest Paths: Paths of natural sites through the landscape allow for flora and fauna mobility. The number of natural sites along n(n-1)/2 "stepping-stone" shortest paths between the n largest natural areas is maximized.

GA4 Patches of Natural Features within Urban Areas: Patches of natural area within urban areas are desirable. This objective maximized the number of links between urban sites and natural sites within urban areas based on spatial autocorrelation join counts.

GA5 Agricultural Area: The area of the candidate areas assigned to agriculture is maximized. This objective is implemented as the ratio of the area of the candidate sites coded "agricultural" and the total area of the candidate sites.

GA6 Clustered Development: More compact urban areas are more desirable. Similar to objective GA4, this objective maximizes the number of urban to urban adjacencies and is implemented based on join counts.

GA7 Urban Area: Similar to objectives GA1a and GA5, this objective competes for land-use. It is implemented as the ratio of the area of the candidate sites coded "urban" and the total area of the candidate sites.

Because the decision variables are categorical, the true Pareto front is a discrete set of solutions. Given this discreteness and the nonlinearity of some of the objective functions, the density of the solutions may not be homogeneous across the front. Because of this variation in solution density, a hierarchical clustering structure may exist.


This section describes the application of the hierarchical clustering methodology presented previously to the greenlands design problem using NSGA-II. To assess the quality of the NSGA-II results, a full enumeration of the Pareto front was obtained. This full enumeration was possible because of the small sample area considered.

The NSGA-II results contained 171 unique solutions from the four to the eight (48 or 65,536) different possible land-use configurations from eight candidate sites and four potential land-use categories. Figure 4(a) shows the NSGA-II results and Figure 4(b) shows the objective function values for the 6,561 solutions on the fully enumerated Pareto front. The fully enumerated Pareto front was obtained by removing the dominated solutions from the potential solutions. Ideally, the NSGA-II results would have the same range and distribution as the values in the true Pareto front. As seen in Figures 4(a) and 4(b), the ranges and distributions of the approximation and the true values are very similar. The differences between the ranges and distributions of these values are discussed below. Note that objective GA4, patches of natural features within urban areas, takes only a single value in these results because of the small problem size and the existing land-use and is excluded from further analysis.

NSGA-II is a heuristic method to approximate the full Pareto front. It trades off time and the number of solutions, allowing the consideration of larger problems, with precision. The NSGA-II approximation of the Pareto front contains approximately 2 percent as many solutions as the true Pareto front (171 versus 6,561).

There are several discrepancies between the approximation of the Pareto front shown in Figure 4(a) and the true Pareto front shown in Figure 4(b). The true Pareto front must contain the solutions that would result from optimizing each of the objective functions separately while ignoring the other objectives. In this problem, the percentage land-use area objective functions, GA1a, GA5, and GA7, would each attain a value of one. These solutions were not included in the NSGA-II results. However, it is unlikely that these extreme solutions are politically acceptable, as per the example decision that follows There also were proportionally fewer solutions in the upper portions of the ranges for the natural and agricultural land-use objective functions, GA1a and GA5, respectively. The mean value for the urban land-use objective function, GA7, was higher in the NSGA-II results than in the enumeration of the true Pareto front. NSGA-II explored the span of potential values for the land-use objective functions, but there is a bias toward urban land-use.


NSGA-II can generate a good approximation of the Pareto front with less time and computational effort. It provides a small sample covering the large set of potential solutions. This ability allows considering larger problems than would otherwise be feasible.


The first step of the cluster analysis is establishing the existence of a clustering tendency. In Figure 5, each objective function is plotted against each other objective function. Evidence of clusters is clear in several of the plots. For example, considering GA1a plotted against GA5, three large clusters are apparent: one cluster with low values of GA1a and GA5, one cluster with high values of GA1a and low values of GA5, and one cluster with low values of GA1 a and high value of GA1a. These same three maj or clusters also can be seen in the plots of GA1a against GA7 and of GA5 against GA7. The attribution of land to the differing land-uses is an important characteristic of this decision and the presence of these major clusters should be detected by any successful clustering algorithm. Several subclusters can be seen within each of the major clusters, confirming the expected hierarchical cluster structure. For example, in the cluster where both GA1a and GA5 take low values, there are five well-separated dense regions. At this point in the methodology, an indication of a clustering tendency is established. Without an indication of a clustering tendency, the results of applying a clustering algorithm may not reflect any pattern in the data. The true structure may not correspond directly to the obvious clusters in these plots because of relationships between multiple objective functions that are not easily visualized. For example, all three of the land-use percentage objective functions must be considered to understand the trade-off for the percentage of land allocated to each land-use. The proposed clustering methodology considers these simultaneous interactions between multiple objective functions.

Next, the scale of the data was addressed. All but one of the objective functions range from 0 to 1. Only objective function GA6, measuring clustered development using spatial autocorrelation, could take values beyond this range. GA6 was rescaled to lie in the range 0 to 1 by linearly mapping the values in the NSGA-II results to the range 0 to 1.

The Euclidean distance was used as the measure of similarity between vectors of objective functions. For this application, the weighted group average method was found to be the most suitable clustering method. This method was applied to the NSGA-II results, giving the dendrogram shown in Figure 6. Beginning at the root, each split of the dendrogram into two subclusters can be qualified in terms of the differences between the subclusters, for example, C(2,1,1) for a cluster derived by choosing the second cluster at the first branching, the first cluster at the second branching, and the first cluster at the third branching.

The cophenetic correlation coefficient compares the distance between solutions and their relative position in the dendrogram. In this case, its value was 0.9247, indicating a good fit of the data to the dendrogram. Stability testing further validated the resulting clustering structure. Three types of stability tests were applied:

Adding random error terms of up to 25 percent, Omitting up to 25 percent of the solutions, and Randomly splitting the data into two sets equal-sized before clustering.

The first and second branchings shown in the example decision were robust to these perturbations. Although in some of the more extreme perturbations, the order of these branchings was reversed with a split between natural and agricultural land occurring before the split between natural and urban land.

Three features are important to the success of this cluster analysis: The cluster structure must be a valid representation of the data, obvious clusters must be detected, and, where no obvious clustered exist, it must segment the clusters by reflecting the structure of the data.

This methodology succeeded in identifying the obvious clusters. Cluster C(1) contained the solutions with high values of objective function GA7, urban land-use area, which only take low values of GA1a, natural land-use area, and low to moderate values of GA5, agricultural land-use area. Branching cluster C(2) results in a trade-off between GA1a and GA5, the natural and agricultural land-use area objective functions., C(2,1) had low values of GA1a and high values of GA7 while C(2,2) had high values of GA1a and low values of GA7. The three major clusters were identified in the first two branchings as C(1), C(2,1), and C(2,2). In cluster C(2,2), there was no obvious branching into two subclusters. The clustering algorithm branched the cluster into two subclusters so that the solutions in cluster C(2,2,1) were preferable to those in cluster C(2,2,2) on objective function GA6, clustered development on which they all attained the maximal value. As well, no solution in cluster C(2,2,1) took the minimal value for objective function GA5, agricultural land-use area. The solutions in cluster C(2,2,2) attained similar or better values of objective function GA1a, natural land-use area, than the solutions in cluster C(2,2,1), and similar or worse values of objective function GA5, agricultural land-use area. None of the solutions in cluster C(2,2,2) took a value of zero for objective function GA7, urban land-use area, and no solution in cluster C(2,2,1) included any new urban land-use area. These subclusters were clearly different and reflected trade-offs between the objective functions.



Internal, external, and relative validity were considered. The internal validity of the weighted group average linkage method results was satisfactory. The cophenetic correlation coefficient of 0.9247 was sufficiently large to indicate a good fit of the dendrogram to the data. Error perturbation, data deletion, and data split stability tests indicated that the three major clusters that were detected are a valid structure. The order of the first two branchings giving these three clusters was less robust as reflected in the similar heights in the original dendrogram. The clusters caused by branchings resulting lower in the dendrogram were less robust than the clusters from higher branchings. This reduced robustness reflected that those clusters are less differentiated and in many cases they were not the result of an obvious cluster structure.

The external validity was established by considering the clusters in relation to knowledge about the landscape configuration problem. The cluster analysis results reflected the real-world aspects of the landscape configuration problem, indicating good external validity. The clusters of landscape designs were differentiated in terms of the land-use codings for the candidate sites. An example of this assessment can be found in the example decision that follows.

The relative validity was assessed by comparing the hierarchical weighted average linkage clustering results to a similar method, hierarchical complete linkage clustering. The weighted group average linkage method results tended to agree with the complete linkage method, indicating good relative validity. The discrepancies in the results from these methods can be attributed to known assumptions of the complete linkage method, in particular similarly sized clusters that did not agree with the visual inspection of the cluster structure.

These assessments confirmed that there were three major clusters in the data and that the general dendrogram structure was valid although the clusters lower in the dendrogram were less robust. The remainder of this section details an example decision process based on the results of the cluster analysis.


This section provides an example of the use of the hierarchical clustering structure in the land-use decision problem. Te local human population requires land to work, live, and grow food, i.e., urban and agricultural land. Land at the fringes of the currently developed areas is desirable for these purposes, but this land currently serves natural functions. For example, water recharge areas and animal habitats may exist within the natural area. In this paper and previous work (Roberts 2003), the natural functions that require specific land parcels, such as water recharge, were dealt with by preprocessing, and the natural functions that may not be specific to particular land parcels, such as some animal habitats, are dealt with using the multiobjective optimization model for the landscape configuration problem. Known specific desired habitat areas also could be preprocessed as natural in this methodology. In this scenario, there are several current candidate sites whose land-use can be changed and the most significant concern of the decision makers is the loss of natural land related to the first four objective functions and GA6, which measures the clustering of urban development. Te clustering of urban development is also desirable for human use of the land, for example, services such as public transit and waste collection can be implemented more efficiently in compact urban areas.

The example decision begins by considering the branching into clusters C(1) and C(2), and then proceeds to the preferred cluster and considers that branching. This process is repeated until there is a small set of landscape designs amenable for further, more detailed consideration.



Observations: Figure 7 shows the objective function values of the solutions in the two clusters resulting from the first branching at the root of the dendrogram. The trade-off in land area for the different land-uses is evident: Cluster C(1) contains the solutions with high values of objective function GA7, urban land-use area, which only coincide with solutions with low values of GA1a, natural land-use area, and low to moderate values of GA5, agricultural land-use area. Te solutions in C(1) achieve a wide spread of values for objective function GA6, ranging from approximately 0 to 0.8. Cluster C(2) contains the solutions with low values of GA7, urban land-use area. Cluster C(2) does not restrict the values of objective functions GA1a, natural land-use area, and GA5, agricultural land-use area. Similar to cluster C(1), the solutions in cluster C(2) take a wide range of values for GA6, clustered development, for more configurations are available with more sites coded urban, but in cluster C(2) the values for GA6 range from approximately 0.2 to 1. Note that in this particular study area, none of the candidate sites are adjacent to existing urban areas (see Figure 3) so any new urban area will be less compact.

Decision: The algorithm extracted the expected conflict between natural and urban land-use. C(1) is dominated by urban development while C(2) contains solutions with various levels of agricultural and natural land. The consequences of favoring urban or nonurban land can be seen in the differences between the two clusters on the other objective functions. In this particular problem, favoring natural or urban land-use has little impact on the natural landscape-ecology objectives (GA1, GA2, and GA3). Because the more natural plans on cluster C(2) may result in highly clustered urban areas, in this example we further consider cluster C(2). The cluster or clusters for further consideration can be chosen with the knowledge of the relationships between the objectives for this particular land-use problem. It should be noted that for other problems, the natural-urban division may not occur at the first branching; For example, the first branching could divide clusters on GA1 if candidate sites could connect the existing natural areas.. In this problem it implies that this conflict is the dominant effect in this decision.



Observations: The result is a trade-off between GA1a and GA5, the natural and agricultural land-use area objective functions, respectively (see Figure 8). C(2,1) has low values of GA1a and higher values of GA5, while C(2,2) has high values of GA1a and lower values of GA5.

Decision: Again this example problem demonstrates expected conflicts: in this case, agricultural versus natural land-use. One might thus choose cluster C(2,2) for further consideration for the natural land-use is the highest priority in this decision scenario. In contrast to selecting weights for the objectives or singularly emphasizing new natural area, there are a number of plans to consider for which the trade-offs between objectives have been made explicit. The effects of emphasizing natural area on urban clustering and the natural landscape functions were available for consideration before the decision makers needed to narrow the set of plans under consideration.


Observations: Figure 9 shows the clusters resulting from branching cluster C(2,2). The solutions in cluster C(2,2,1) are preferable on objective function GA6, clustered development, on which they all attain the maximal value. Also, no solution in cluster C(2,2,1) takes the minimal value for objective function GA5, agricultural land-use area. The solutions in cluster C(2,2,2) attain equivalent or better values of objective function GA1a, natural land-use area, and equivalent or worse values of objective function GA5, agricultural land-use area. Some of the solutions in cluster C(2,2,2) do not take a value of zero for objective function GA7, urban land-use area, but no solution in cluster C(2,2,1) includes any new urban land-use area.


Decision: Further consider cluster C(2,2,1) in this scenario with the understanding that none of the new land is allocated for urban use. Consider also choosing C(2,2,2) for further consideration for it has a small quantity of new urban land while noting that the amount of new agricultural land may be reduced and that allowing any new urban land will degrade the clustering of the urban development.

At this point, the solutions may be deemed sufficiently similar in performance that other factors that were unmodeled by this methodology, for instance land costs and availability, aesthetics of viewsheds, etc., must be considered. The insight from considering the trade-offs in the objective functions can be included in the decision, but the details of the landscape designs should be considered. Cluster C(2,2,1) contains only two landscape configurations and cluster C(2,2,2) contains ten landscape configurations. Figures 10 and 11 show the plots for the solutions in clusters C(2,2,1) and C(2,2,2).

In agreement with the emphasis on natural land-use, the largest candidate site, site four, is natural in all of these plans. Within cluster C(2,2,1), sites three and six also are always natural and site five is always agricultural. Within cluster C(2,2,2), site one is unchanged or agricultural and at least one of the small sites is urban. While the solutions in both of these clusters are very similar, the superior performance of cluster C(2,2,1) on the clustered development objective function corresponds to the lack of new urban land. Within the clusters, the land-use of the larger sites is consistent and the plans are mostly differentiated on the land-use of the smaller sites. For objective function GA1, the area weighted shape of natural area, none of the solutions in clusters C(2,2,1) or C(2,2,2) take the lowest values attained for this objective function; in all of these solutions having site four as natural land improves the shape of the largest natural area. Within cluster C(2,2,1) and for solutions A, C, F, H, and I in cluster C(2,2,2), the smaller natural area above the center of the study area has an improved area weighted shape because of the natural land-use of site six. In plan F in cluster C(2,2,2), the natural area weighted shape for the largest natural area is improved by having site five as natural. Within clusters C(2,2,1) and C(2,2,2), the natural area stepping-stone shortest paths measured by objective function GA3 always outperforms the worst attainable value. Like the natural area weighted shape, this improvement is because of the additional natural areas.


In the first few branchings of the dendrogram, the clusters correspond to those noted in the visual inspection for clustering tendency. The branchings lower in the dendrogram do not correspond to visually obvious clusters for no obvious clusters exist. These branchings segment the obvious clusters into subclusters that are differentiated but not significantly separated. The use of a hierarchical linkage clustering algorithm allows the method to deal with these branchings where there may be no cluster structure and return usable results.


There are three primary aims for the proposed methodology. First, it should create a tractable presentation of the NSGA-II results for the landscape configuration problem; visually obvious clusters should be detected and a useful structure should be provided even where no obvious structure exists. he validity of the resulting structure also is important. his aim was discussed in the previous section. Second, it should be adaptable to other problems suited to a multiobjective optimization framework using Pareto front enumeration or approximation methods. This second requirement includes being extendable to include other model aspects such as constraints, preferences, and weights. Third, it should be simple enough to be understood by decision makers, including the general public, and to be potentially included in future decision support systems without extensive training. If the system is a black box, it will not be accepted and used in the expected decision making contexts.


Using the binary branching structure in the dendrogram, the solutions can be considered based on their objective function values. Potentially interesting subsets of solutions for further consideration can be found by reducing the set under consideration by descending in the tree from the root until a sufficiently small set of solutions with sufficiently similar objective function values remains. Using the dendrogram resulting from the weighted group average linkage method, the set under consideration can be made arbitrarily small. Because the tree is not balanced, the decrease in the number of solutions under consideration resulting from each branching is not predictable and many branchings may need to be taken to obtain a sufficiently small set. If desired, a different linkage method, such as the complete linkage method, could be employed to return a more balanced dendrogram that is less indicative of the trade-off surface structure. In the landscape configuration problem considered here, three branchings typically are sufficient to reduce the solutions under consideration to a small compact set. At that point, unmodeled aspects of the decision, such as the suitability of individual candidate sites for more specific uses, should be considered.

The proposed methodology provides a tractable representation of the multiobjective optimization results. While the effects of the relative ranges of the objective functions may complicate the use of this methodology or implicitly convey additional importance to a particular objective function, currently there is no available method for considering multiple objective functions simultaneously without some consideration of the relative importance of the objective functions. Here the emphasis is on those objective functions that clearly differentiate the clusters occurring higher in the dendrogram.


The characteristics of the input data are important to the use and success of this methodology. he proposed methodology is most easily applied where the input data is a Pareto front derived from an approximation algorithm or enumeration and a clustering tendency is seen in two-dimensional visualizations. If the decision variables in the multiobjective optimization problem are continuous, the input to the clustering algorithm must be a discrete approximation of the Pareto front.

If the problem has only two or three objective functions, and in particular if those functions are well behaved, i.e., convex and continuous, there is little benefit to using the proposed methodology. If a simple two-dimensional or three-dimensional visualization of the Pareto front or a good approximation thereof is available, the proposed methodology cannot lead to additional insight. The proposed methodology is particularly useful where there are more than three objective functions but not so many objective functions that it becomes difficult to select one of the clusters at a branching.

If the decision variables are continuous or discrete with a constant density and a good approximation of the Pareto front is easily obtained, then the approach taken by Mattson et al. (2004) to find "interesting" regions of the Pareto front may be more suitable. Applying the methodology proposed in this paper still may yield insight for these problems, particularly if it is difficult to obtain a good approximation of the Pareto front. If no clustering tendency exists, then any structure resulting from the application of a clustering algorithm will be an artifact of that clustering algorithm. Nonetheless, a clustering algorithm could be used to construct a tractable representation using a dendrogram.



Once a hierarchical clustering structure is obtained, the dendrogram can be used in decision support systems. he dendrogram may be enhanced by simultaneous display with other visualizations to aid in enabling insight into the problem of interest. Seo and Shneiderman (2002) present uses for dendrograms in exploring high-dimensional hierarchical cluster structures in the context of genomic microarray analysis. One visualization using a dendrogram is to display the dendrogram and use columns of color blocks below each leaf to display information relevant to that leaf. In the landscape configuration example, this visualization could be used to display the land-use codings for each solution from NSGA-II. Figure 12 shows the solutions in cluster C(2,2,2), employing this visualization using colors yellow for agricultural, green for natural, gray for urban, and white for unchanged. he dendrogram provides an order for the solutions that allows the differences and similarities in the land-use codings to be seen relative to similarities in the objective functions represented by the dendrogram. Using this enhanced dendrogram would give insight into key sites contributing to objective functions and allow the user to verify that aspects of the problem are properly modeled. For example, in the landscape configuration problem, the connectivity of the core natural areas is important. A small number of sites may determine this feature of the landscape design. Using a dendrogram enhanced with a color block view of the candidate site land-use codings would allow users to see whether particular sites tend to be similar within clusters and different between clusters. For deployment in a decision support system, further work would be required to design appropriate visualizations.

A possible different use of the dendrogram for visualization in the landscape configuration problem is to use it as an input interface to allow users to display the full maps of the study area. Choosing a cluster would overlay the land-use codings in the NSGA-II solution, allowing the user to see the solutions of interest as a whole landscape design.

This methodology is expected to be used in an iterative decision process, where the problem is reformulated based on the output of earlier iterations. Objective functions and constraints on the decision variables can be added, removed, or changed and the analysis repeated. This iterative process ensures that the model accurately represents the problem and explores the problem to obtain additional insight. An iterative process can be used to allocate limited resources to investigating potential solutions. At the first iteration, the proposed methodology is applied to a small sample of the feasible solutions. The proposed methodology can be applied to each of these interesting regions in turn by constraining the decision variables or placing limits on the objective function values. Computational effort need not be expended on exploring the entire solution space in detail for the shape of the trade-off surface can be considered. While this methodology could be implemented using off-the-shelf tools such as SPSS, R, and S, the effort to do so could not be easily justified unless the methodology were key to the decision process. Commercial development could allow for considering larger problems as well as improving usability through a modern graphical interface, including real-time interactive visualizations of solutions. If real-time interaction is not necessary, this methodology can be scaled up to include 20,000 or more partitions (Roberts 2003). The future work detailed in the following section discusses issues to be addressed prior to easily generalizing the methodology for other objectives and problems.


Pareto optimization methods allow using multiobjective optimization models without a priori decision maker preferences. The decision makers can consider the possibilities and trade-offs between objectives before selecting a solution for implementation. These methods suffer from the shortcoming of requiring the decision makers to consider many possible solutions resulting from the optimization procedure. his paper developed and evaluated a cluster analysis methodology to address this issue. A land-use planning problem was used as motivation and to evaluate the proposed methodology.

Previous work in multiobjective optimization in land-use planning called for a method to objectively determine a set of plans representing "distinct conceptual ideas" (Balling 2004). Previous methods involved eliminating some of the Pareto optimal solutions before presenting them to the decision makers. The proposed methodology allows the entire nondominated set to be presented to the decision makers by providing a tractable structure for the results. This methodology will continue to be applicable as computational power increases and Pareto optimization algorithms improve, allowing the consideration of larger nondominated sets.

This approach is applicable to multiobjective problems with discrete decision variables or hierarchically clustered nondominated sets. Multiobjective configuration optimization problems and the more general class of combinatorial multiobjective optimization problems have discrete Pareto fronts. It also may be applicable to problems containing highly discontinuous Pareto fronts. If a hierarchical structure is not suspected in the data or if the structure is not to be used in the decision process but a clustering tendency exists in the data, then the methodology presented by Taboada et al. (2007) involving k-means clustering to create a partitioning of the solutions into a predefined number of clusters may be more suitable.

The proposed methodology is particularly useful if similarly performing solutions based on the objective function values may be distinguishable to the decision makers based on the importance of the decision variable values or unmodeled aspects of the problem. Previous approaches to this issue would have eliminated similarly performing solutions from consideration.

Future work will revisit the issues in cluster analysis, including scaling, proximity measures, selection of algorithms, and validation, as well as improved visualizations. The limitations of the small study area and the structure resulting from the particular landscape ecology metrics used also require further consideration. This work could be extended to consider the proximity of the solutions based on their decision variable values, e.g., in the land-use application the similarity of the landscape configurations. It may be desirable in some applications to highlight clusters containing similarly performing solutions with very different decision variable values; these solutions could denote unmodeled aspects of the problem or possible freedom in the decision. Shape space measures (Small 1996) may be a suitable approach to comparing the morphology of the landscape configurations.


Support for this work was provided by the Ontario Graduate Scholarship and the Department of Systems Design Engineering at the University of Waterloo. The second author acknowledges GEOIDE grant #HSS-DSS-17 for partial support of this work.


Balling, R. 2004. Applications of multi-objective evolutionary algorithms: advances in natural computation. Volume 1. City and regional planning via a MOEA: lessons learned. Singapore: World Scientific, 227-45.

Benson, H. P., and S. Sayin. 1997. Towards finding global representations of the efficient set in multiple objective mathematical programming. Naval Research Logistics 44: 47-67.

Cormack, R. M. 1971. A review of classification. Journal of the Royal Statistical Society A 134(3): 321-67.

Deb, K., A. Pratap, S. Agarak, and T. Meyarivan. 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Transactions on Evolutionary Computation 6(2): 182-97.

Dubes, R., and A. K. Jain. 1976. Clustering techniques: the user's dilemma. Pattern Recognition 8: 247-60.

Dubes, R., and A. K. Jain. 1979. Validity studies in clustering methodologies. Pattern Recognition 11: 235-54

Dubes, R. C. 1993. Handbook of pattern recognition and computer vision, cluster analysis and related issues. Salem, MA: World Scientific Publishing Company, 3-32.

Falkenauer, E. 1998. Genetic algorithms and grouping problems. Chichester, West Sussex, England: John Wiley and Sons.

Gnanadesikan, R. 1995. Weighting and selection of variables for cluster analysis. Journal of Classification 12: 113-36.

Goldberg, D. E., and K. Deb. 1991. Foundations of genetic algorithms. Chapter A, Comparative analysis of selection schemes used in genetic algorithms. San Mateo, CA: Morgan Kaufmann, 69-93.

Jain, A. K., M. N. Murty, and P. J. Flynn. 1999. Data clustering: A review. ACM Computing Surveys 31(3).

Jain, Anil K., and Richard C. Dubes. 1988. Algorithms for clustering data. Englewood Cliffs, NJ: Prentice Hall.

Mattson, C. A., A. A. Mullur, and A. Messac. 2004. Smart pareto filter: obtaining a minimal representation of multiobjective design space. Engineering Optimization: 36(6): 721-40.

Milligan, G. W., and M. C. Cooper. 1988. A study of standardization of variables in cluster analysis. Journal of Classification 5: 181-204.

Morse, J. N. 1980. Reducing the size of the nondominated set: pruning by clustering. Computers and Operations Research 7: 55-66.

Roberts, S. A. 2003. Configuration optimization in socio-ecological systems. Ph.D. thesis, Department of Systems Design Engineering, University of Waterloo, Waterloo, Ontario, Canada, ?did=7650422061 &sid= 3&Fmt=2&clientId=278 50&RQT=309&VName=PQD &cfc=1.

Rosenman, M. A., and J. S. Gero. 1985. Reducing the pareto optimal set in multicriteria optimization (with applications to pareto optimal dynamic programming). Engineering Optimization 8: 189-206.

Seo, J. ,and B. Shneiderman. 2002. Interactively exploring hierarchical clustering results. Computer 35(7): 80-86.

Small, C. G. 1996. The statistical theory of shape. New York: Springer.

Taboada, H., F. Baheranwala, D. Coit, and N. Wattanapongsakorn. 2007. Practical solutions for multi-objective optimization: an application to system reliability design problems. Reliability Engineering and System Safety 92(3): 314-22.

Ward, J. H., Jr. 1963. Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58(301): 236-44.

C. M. Moulton received her Master's degree in Applied Science in Systems Design Engineering at the University of Waterloo. Her research interests include geomatics, data mining, decision support, and multiobjective optimization.

Steven Roberts is an associate professor in the Department of Geography and Environmental Studies at Wilfrid Laurier University where he teaches geomatics and spatial analysis courses at the graduate and undergraduate level. His Ph.D. is in Systems Design Engineering from the University of Waterloo and he has undergraduate degrees in Urban and Regional Planning (Waterloo) and Mathematics (McMaster). His current research interests include spatial data models and data structures, combinatorial optimization, genetic algorithms and programming (Evolutionary Multicriterion Optimization), applied graph theory and category theory, landscape ecology, and parallel computing.

P. H. Calamai is a professor in the Department of Systems Design Engineering at the University of Waterloo. He received his Ph.D. in Systems Design Engineering from the University of Waterloo. His research interests include facility location and resource allocation as well as multidisciplinary design optimization and decision support systems.

Corresponding Address:

Wilfrid Laurier University

Department of Geography and Environmental Studies

Room 3C11, Arts Building

Waterloo ON Canada N2L 3C5

Phone: (519) 884-1970, Ex. 2470
COPYRIGHT 2009 Urban and Regional Information Systems Association (URISA)
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2009 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Author:Moulton, C.M.; Roberts, S.A.; Calamai, P.H.
Publication:URISA Journal
Article Type:Report
Geographic Code:1CANA
Date:Jul 1, 2009
Previous Article:NSDI building blocks: regional GIS in the United States.
Next Article:The potential of integrating e-participation in planning support systems.

Terms of use | Privacy policy | Copyright © 2020 Farlex, Inc. | Feedback | For webmasters