Functional measurement in consumer evaluation of market products.
Within the framework of multiple-criteria decision making, product perception has often been characterized as a multi-attribute choice process in which consumer evaluation can be affected by several factors such as price, variety, habit, quality, design, brand name, and even country of origin (Anderson & Cunningham, 1972; Olson, 1977; Hastak & Hong, 1991; Hilgenkamp & Shanteau, 2010). Multi-attribute models are generally characterized as following three steps: (1) evaluation of the attributes, (2) integration of the obtained subjective dimensions, and finally (3) transformation of the results of the evaluation process into a rank order, a set of pairwise preferences, or a rating over some real interval (see, e.g., Lynch, 1985; Oral & Kettani, 1989). The principal difficulty then lies in detecting and identifying those factors that strongly influence the process. Several multi-attribute utility models and theories have been developed with the aim of defining which product attributes move the consumer towards a precise purchase choice, such as conjoint analysis (Green & Rao, 1971; Green & Srinivasan, 1978), the Analytic Hierarchy Process (Saaty, 1988), simple multi-attribute rating techniques (see, e.g., Keeney & Raiffa, 1976), and functional measurement (Anderson, 1981, 1982; Lynch, 1985). In particular, two of these methodologies are based on important theoretical paradigms: conjoint analysis can be regarded as an empirical application of the theory of conjoint measurement (Luce & Tukey, 1964), while functional measurement is the methodology of information integration theory (Anderson, 1981, 1982). Other examples of multiple-criteria decision making techniques, which however are not aimed at finding the optimal choice and are not based on an axiomatic foundation, are ELECTRE (Roy, 1990) and PROMETHEE (Brans & Mareschal, 2005).
In the present work, the focus is on the application of functional measurement as a way of estimating consumer preferences. Functional measurement focuses on testing the cognitive rules that underlie the integration process by means of specific families of multi-attribute models, which generally show an additive, multiplicative, or weighted structure. An important subset of these models, and a widely used cognitive integration rule (Anderson, 1981, 1982), is represented by an averaging process with a dual representation in which every level of each attribute has its own weight and scale value parameters: weights are measured on ratio scales, whereas scale values are measured on equal-interval scales (Zalinski & Anderson, 1989). Although such a dual representation raises some difficulties concerning the uniqueness, bias, convergence, reliability, and goodness of fit of the parameters (Zalinski & Anderson, 1991), the method of sub-designs (Norman, 1976; Anderson, 1982) allows for complete identifiability by adjoining selected sub-designs to the full factorial design. A full three-way design (A x B x C) is thus supplemented with three two-way sub-designs (A x B, A x C, and B x C) and with three one-way sub-designs.
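For reference, the averaging rule for a three-way design can be written as below. This is only a sketch in a commonly used notation that is assumed here and is not taken from the text above: $s_{A_i}$ and $w_{A_i}$ denote the scale value and the weight of level $i$ of factor A (and likewise for B and C), while $s_0$ and $w_0$ denote the scale value and the weight of an initial state.

```latex
% Full design A x B x C: response to the profile (i, j, k)
R_{ijk} = \frac{w_0 s_0 + w_{A_i} s_{A_i} + w_{B_j} s_{B_j} + w_{C_k} s_{C_k}}
               {w_0 + w_{A_i} + w_{B_j} + w_{C_k}}

% Two-way sub-design A x B: the terms of factor C drop out of
% both the numerator and the denominator
R_{ij} = \frac{w_0 s_0 + w_{A_i} s_{A_i} + w_{B_j} s_{B_j}}
              {w_0 + w_{A_i} + w_{B_j}}
```

Because an omitted factor disappears from the denominator as well as from the numerator, the sub-designs carry information about the weights that the full design alone does not provide, which is the intuition behind the identifiability argument above.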
An analysis of the footwear market was carried out in collaboration with an Italian manufacturer (Brand C in the application that follows). Potential buyers of a target product were presented with different profiles created by simultaneously manipulating design, brand, and price in a 3 x 3 x 3 factorial design. Their preferences were recorded and analyzed. Functional measurement and cluster analysis techniques were then employed to identify patterns in cognitive behavior. Parameters were estimated using the open source software R (R Development Core Team, 2013) and the library R-Average (Vidotto & Vicentini, 2007; Vidotto, Massidda & Noventa, 2010), which performs a model selection based on both the Akaike (1974) and the Schwarz (1978) criteria. Using these information criteria (IC), the procedure estimates and tests several combinations of weights, ranging from the Equal-weight Averaging Model (EAM) to the Differential-weight Averaging Model (DAM). The Simple-weight Averaging Model (SAM) is treated as a special case of the EAM in which even the weights of the different design factors are the same.
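To make the role of the information criteria concrete, the following is a minimal sketch of how the Akaike and Schwarz criteria penalize model complexity for a least-squares fit with Gaussian errors. It is not the R-Average procedure: the function name, the residual sums of squares, the parameter counts, and the number of observations are all hypothetical values chosen for illustration.

```python
import math


def information_criteria(rss, n_obs, n_params):
    """Return (AIC, BIC) for a least-squares fit, up to an additive constant.

    Assumes independent Gaussian errors, so that the fit term is
    n * ln(RSS / n).
    """
    if rss <= 0 or n_obs <= 0:
        raise ValueError("rss and n_obs must be positive")
    fit = n_obs * math.log(rss / n_obs)
    aic = fit + 2 * n_params                # Akaike (1974)
    bic = fit + n_params * math.log(n_obs)  # Schwarz (1978)
    return aic, bic


# Hypothetical candidates: (residual sum of squares, number of parameters).
candidates = {"EAM": (900.0, 12), "DAM": (860.0, 18)}
n_obs = 108  # hypothetical number of ratings

scores = {
    name: information_criteria(rss, n_obs, k)
    for name, (rss, k) in candidates.items()
}
best_aic = min(scores, key=lambda name: scores[name][0])
best_bic = min(scores, key=lambda name: scores[name][1])
print(best_aic, best_bic)
```

In this made-up case the richer model reduces the residual error only slightly, so both criteria prefer the more parsimonious one.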
The specific target of the research was female consumers between 35 and 40 years of age. The sample under investigation consisted of 26 women (mean age 38 ± 2.2; 96% with a high-school diploma or higher; 61% employed, 15.4% self-employed).
By means of a computerized procedure, participants were presented with different combinations of 3 Italian shoe brands (A, B, and C), 3 prices (€99.99, €112.00, and €115.00), and 3 shoe designs (A, B, and C, each associated with the brand bearing the same label). The three shoe designs were deliberately chosen to be highly similar, since the target product was designed for the specific market segment under investigation.
With respect to the brands considered in our study, Brand A is a very famous brand, known by 100% of the sample and used at least once by 70%; Brand B is a famous brand, known by 96% of the sample and used at least once by 50%; and Brand C is a fairly well-known brand, known by 77% of the sample and used at least once by 23%. The two other dimensions manipulated in our study (i.e., Price and Design) were chosen on the basis of a preliminary questionnaire that was presented to a large sample of 292 participants (112 males, 180 females). This preliminary study aimed at investigating preferences in the footwear market. Design and Price emerged as two of the most important qualities for footwear, together with Comfort and Lifespan. Interestingly, the factor Brand was one of the least important according to the participants.
The participants were tested individually in a laboratory. Each participant was presented with 54 different footwear profiles (27 stimulus presentations for the full-factorial design plus 27 stimulus presentations for the two-way sub-designs), and was asked whether she would buy the displayed shoe, answering on a 21-point scale from "Absolutely not" (0) to "Absolutely yes" (20). Each profile was presented twice, for a total of 108 evaluations. The experiment lasted about 15-20 minutes. An example of a full profile is depicted in Figure 1.
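The stimulus counts reported above can be checked with a short enumeration. This is a minimal sketch: the dictionary layout and variable names are illustrative assumptions, while the factor levels are those given in the text.

```python
from itertools import combinations, product

# Factor levels as described in the text.
factors = {
    "Brand": ["A", "B", "C"],
    "Price": [99.99, 112.00, 115.00],
    "Design": ["A", "B", "C"],
}


def profiles(names):
    """All combinations of levels for the given subset of factors."""
    levels = (factors[name] for name in names)
    return [dict(zip(names, combo)) for combo in product(*levels)]


# Full 3 x 3 x 3 factorial design.
full_design = profiles(list(factors))

# The three two-way sub-designs, each 3 x 3.
two_way = [
    profile
    for pair in combinations(factors, 2)
    for profile in profiles(list(pair))
]

all_profiles = full_design + two_way
n_evaluations = 2 * len(all_profiles)  # each profile is presented twice
print(len(full_design), len(two_way), len(all_profiles), n_evaluations)
```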
The resulting data were analyzed by means of the library R-Average. Pre-trials suggested the averaging rule as the most suitable and coherent model for the type of data at hand. This assessment was made by considering an extension of the methods suggested by Anderson (1981, see pages 58-65) and applied to market research (see, e.g., Troutman & Shanteau, 1976; Massidda, Polezzi & Vidotto, 2011). The responses of the participants to the different two-way designs were compared to the responses given in the corresponding full design (with the levels of the third factor kept constant). While an additive rule would imply a series of simple shifts of the two-way graphs, an averaging rule would imply more complex patterns (steeper shapes, intersections) and, quite often, the presence of interaction effects.
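The contrast between the two rules can be illustrated numerically. The sketch below is not the authors' analysis: it assumes equal weights of 1, no initial state, and hypothetical scale values, and it compares the effect of one factor in a two-way sub-design with its effect in the full design when the third factor is held constant.

```python
def adding(values, weights):
    """Additive rule: weighted sum of the scale values."""
    return sum(w * s for s, w in zip(values, weights))


def averaging(values, weights):
    """Averaging rule: weighted sum divided by the sum of the weights."""
    return adding(values, weights) / sum(weights)


def effect_range(rule, a_levels, others):
    """Range of the responses across the levels of A, other factors fixed."""
    weights = [1.0] * (1 + len(others))  # equal weights (an assumption)
    responses = [rule([a] + others, weights) for a in a_levels]
    return max(responses) - min(responses)


a_levels = [2.0, 10.0, 18.0]   # hypothetical scale values of factor A
b_fixed, c_fixed = 10.0, 10.0  # hypothetical fixed levels of B and C

# Two-way sub-design (A and B) versus full design (A, B, and C).
add_sub = effect_range(adding, a_levels, [b_fixed])
add_full = effect_range(adding, a_levels, [b_fixed, c_fixed])
avg_sub = effect_range(averaging, a_levels, [b_fixed])
avg_full = effect_range(averaging, a_levels, [b_fixed, c_fixed])
print(add_sub, add_full, avg_sub, avg_full)
```

Under the additive rule the third factor only shifts the curve, so the effect of A is the same in both designs. Under the averaging rule the third factor dilutes the effect of A, so the sub-design curve is steeper and, with these values, crosses the full-design curve.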
As a first step, 4 participants were removed since they showed response set bias, that is, they gave constant responses to almost all stimuli. The remaining 22 participants were analyzed at the individual level.
Hence, for every participant, both scale values and weight parameters were estimated using the library R-Average. Four participants showed a SAM, 8 an EAM, and 10 followed a DAM. Adjusted R² for most of the participants ranged between 0.73 and 0.98 (M = 0.85, SD = 0.07). The only exceptions were participants 3, 7, 33, and 34, who showed an adjusted R² of 0.48, 0.39, 0.29, and 0.52, respectively. It should be stressed, however, that these participants showed patterns of either midpoint response bias or extreme response bias.
Finally, a visual inspection of each participant's data allowed us to check that the best model selected by the R-Average procedure was also the most plausible among all the models estimated by the procedure, and not only the one with the best goodness-of-fit.
The weight parameters were estimated by means of their corresponding t-parameters, that is, their logarithms: each weight is expressed as w = exp(t), so that t = ln(w). Such an additive representation is useful from both a computational and an interpretative point of view (Vidotto, 2013). Negative t-parameters correspond to weights lower than 1, and hence to a reduction in the importance of an attribute, while positive values correspond to weights greater than 1 and to an enhancement in importance. In the following, the terms weight parameters and t-parameters are used interchangeably, since they differ only by a monotonic change of scale.
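The relation between t-parameters and weights can be sketched as follows. The sketch assumes the parameterization w = exp(t), which is consistent with negative t-parameters implying weights below 1. The function names and the t-values are hypothetical, and any initial-state weight is ignored.

```python
import math


def weights_from_t(t_params):
    """Map t-parameters to weights, assuming w = exp(t)."""
    return {name: math.exp(t) for name, t in t_params.items()}


def relative_weights(weights):
    """Weights normalized to sum to 1 (initial state ignored)."""
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical t-parameters for the three factors.
t_params = {"Brand": -0.7, "Price": 0.0, "Design": 0.9}
weights = weights_from_t(t_params)

# Adding the same constant to every t multiplies every weight by the
# same factor, which leaves the relative weights unchanged.
shifted = weights_from_t({name: t + 1.5 for name, t in t_params.items()})

print(weights)
print(relative_weights(weights))
print(relative_weights(shifted))
```

A t-parameter of 0 gives a weight of exactly 1, so the sign of t directly indicates whether an attribute is weighted below or above that reference.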
The estimated scale values and t-parameters were then analyzed by means of a cluster analysis (see, e.g., Hofmans & Mullet, 2013) using the R package cluster (Kaufman & Rousseeuw, 1990; Maechler, Rousseeuw, Struyf, & Hubert, 2005). The cluster analysis was performed using agglomerative hierarchical nesting (Agnes), which starts from a step 0 in which every single object is considered a cluster by itself and sequentially merges elements, forming bigger clusters of minimally dissimilar elements. In particular, the complete method (complete-linkage clustering) was used, which is based on the farthest neighbor (i.e., the distance between two clusters is equal to the largest distance from any member of one cluster to any member of the other cluster). Two cluster analyses were run, one on the scale values and the other on the weight parameters. The dendrograms resulting from the two analyses are displayed in Figures 2 and 4, respectively.
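The merging scheme described above can be sketched in a few lines. This is a naive illustration of complete-linkage agglomerative clustering, not the implementation of the R package cluster: the Euclidean distance, the function name, and the data points are assumptions made for the example.

```python
import math


def complete_linkage(points):
    """Naive agglomerative clustering with complete linkage.

    Returns the merges in order, each as
    (members of cluster 1, members of cluster 2, merge height).
    """
    clusters = [[i] for i in range(len(points))]  # step 0: singletons
    merges = []

    def linkage(a, b):
        # Farthest neighbor: largest distance between members of a and b.
        return max(math.dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        merges.append((sorted(clusters[x]), sorted(clusters[y]), d))
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return merges


# Hypothetical parameter vectors for five participants.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 6)]
merges = complete_linkage(points)
for left, right, height in merges:
    print(left, right, round(height, 2))
```

The merge heights returned by the function are the heights at which the branches of a dendrogram would join.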
Figure 2 shows the results of the cluster analysis that was run on the scale value parameters. As can be seen in the figure, the scale value parameters do not show a strong clustering structure: they merge at very different heights and exhibit a medium agglomerative coefficient (0.66). By cutting the dendrogram at different heights, different numbers of clusters can be obtained. Cutting between 40 and 50 would give two clusters, cutting between 30 and 40 would give three clusters, and cutting just below 30 would yield seven clusters, one of which consists of a single subject. There is no general rule for choosing the height at which to cut, but there are indices and criteria to assess the validity of a clustering structure, such as Dunn's index, the Davies-Bouldin index, and Hubert's Gamma (see, e.g., Halkidi, Batistakis, & Vazirgiannis, 2001). In the present work, the agglomerative coefficient (Kaufman & Rousseeuw, 1990) was used as a measure of the strength of the clustering structure that was found. Moreover, a qualitative criterion was applied to select a height that fits the ideas of Actionability and Parsimony for the qualification of market segmentations (Tonks, 2009), so that an organization can reach the segments of interest without wasting resources. A visual inspection suggested that three clusters were a sufficient number of segments from a parsimony perspective; hence, the dendrogram was cut at a height between 30 and 40. Figure 3 depicts the most salient differences between the participants who fall into the three clusters. On the whole, the participants of the cluster on the extreme right (Cluster 3, N = 5) perceived the products as too expensive, as they rated the factor Price around 2. At the same time, they attributed moderate importance to the factor. These participants were more focused on the design of the shoes.
On the whole, the participants of the central cluster (Cluster 2, N = 10) appreciated the design of the shoes more than the participants of the other two clusters, as they rated it above 15. Nevertheless, the factor Design was less important in composing the judgments compared to the other clusters. Finally, the participants of the cluster on the extreme left (Cluster 1, N = 7) showed a more balanced response in general and they preferred, in order of importance, Design, Price and then Brand. It is worth noticing that about half of the sample (Clusters 1 and 3) gave low ratings to the design of the products, but perceived the factor as relevant.
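The relation between the cutting height and the number of clusters discussed for Figure 2 can be sketched as follows. The merge heights below are made-up values for a toy dendrogram with eight objects and are not the heights behind Figure 2. The function name is hypothetical, and the counting rule assumes a monotone dendrogram such as the one produced by complete linkage.

```python
def n_clusters_at(cut, merge_heights):
    """Number of clusters obtained by cutting a dendrogram at a height.

    Every merge at or below the cut joins two clusters into one, so the
    count is the number of objects minus the number of such merges.
    """
    n_objects = len(merge_heights) + 1
    merges_below = sum(1 for height in merge_heights if height <= cut)
    return n_objects - merges_below


# Hypothetical merge heights of a dendrogram with eight objects.
merge_heights = [3.0, 4.0, 6.0, 7.0, 12.0, 25.0, 41.0]

for cut in (45.0, 30.0, 20.0, 10.0, 5.0):
    print(cut, n_clusters_at(cut, merge_heights))
```

Lowering the cut can only keep or increase the number of clusters, which is why the choice of height is a trade-off between detail and parsimony.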
The results of the cluster analysis that was run on the weight parameters are more interesting. As can be seen in Figure 4, the t-parameters show a fairly good agglomerative coefficient (0.81), and there seems to be a homogeneous subdivision into two clusters (although it might be questioned whether participants 8, 14, and 19 should be included, they were retained in our analysis in order to avoid clusters made of singletons or pairs). The most interesting difference between the participants falling into the two clusters is that, on average, the participants of the right cluster (Cluster 1, N = 12) gave higher ratings and followed a differential averaging model (Figure 5), whereas those of the left cluster (Cluster 2, N = 10) in general gave lower ratings and followed an equal averaging model (Figure 6) in which Design dominates the attributes Price and Brand.
Several observations are worth making. The first one concerns the factor Brand, which received little consideration during the profile evaluation in both clusters. This appears to be consistent with the preliminary results obtained with the questionnaire, in which the brand was one of the less influential factors during the evaluation of a target product. Although the scale parameters for the different brands generally highlighted preferences for the two most famous (Brands A and B), their t-parameters were generally negative, thus implying weight parameters lower than 1 and a reduced impact on the general response. The second interesting observation is that the participants who were attentive to the factor Price generally gave a better rating to the lowest price and considered the two higher prices to be equivalent. This suggests that a slight rise in the added value might be considered acceptable by the consumers. The third observation concerns, instead, the general importance of the factor Design, which, for some buyers, appeared to overwhelm any other consideration in the evaluation phase. This might be a consequence of the fact that, on average, the three presented designs were not considered particularly appealing by many consumers: indeed, the participants of Cluster 2 expressed a particularly severe judgment on the Design and showed an equal averaging model, whereas the participants of Cluster 1 expressed a more moderate judgment and showed, on average, a differential averaging model and more integration of information. This appears to be confirmed by a direct observation of the responses given by the participants to the presented profiles. As can be seen in Figure 7, the participants in Cluster 1 did not alter their preferences when Design was added since, on average, they kept integrating information with a differential model. By contrast, the participants in Cluster 2 showed lower preferences as soon as the Design entered the equation.
This result might be expected in an equal averaging model in which Design had the strongest weight (notice that, in Figure 7, the marginal responses have been averaged over the factor Brand).
A final note concerns the composition of Clusters 1 and 2 in the cluster analysis that was run on the weight parameters (see Figures 4 to 6). Almost all the participants in Cluster 2 followed an EAM integration rule at the individual level. Cluster 1, instead, includes both participants showing a DAM and participants showing a SAM (i.e., participants 3, 9, 33, and 35, who are clustered together). Although this might seem a coarse approximation, it is understandable from an analytical point of view, since simple averaging models (corresponding to t-parameters equal to 0) are on average closer to differential models (whose t-parameters are sometimes above and sometimes below 0) than to equal averaging models (whose t-parameters are consistently above or below 0). At the same time, it is understandable from a parsimony perspective, since the necessity of partitioning the market into a feasible and actionable number of segments should follow an economically based rationale and not simply a mathematical one.
CONCLUSIONS AND DISCUSSION
The perception of product quality is the result of a prior integration of several factors. In the present work, functional measurement methodology was applied to analyze and decompose the subjective evaluation of the attributes of a target product. Factors like Brand, Price and Design of a shoe were combined into different profiles and evaluated by a sample of selected consumers. The final goal was to evaluate whether a particular product, specifically designed for a given market niche, was actually appreciated by potential buyers. Individual data were analyzed using the R-Average library to estimate scale values and weight parameters. The resulting parameters were analyzed by means of agglomerative hierarchical clustering to explore the feasibility of such an approach.
The most interesting result was achieved by considering a clustering of individuals based on the similarity of their weight parameters. Two main clusters of consumers emerged. The consumers in the first cluster were less severe in their ratings and integrated information on the basis of a differential averaging model. The consumers in the second cluster were more severe in their judgment, and their cognitive algebra was based on an equal averaging model in which the design of the product had the strongest weight. Hence, the negative perception of the product design that was observed in the consumers of the second cluster affected their general perception of the product. Furthermore, although higher preferences were generally expressed in favor of the two more famous brands, as soon as the design of the shoes was displayed, the brand was no longer particularly important in evaluating the product. Finally, Price was also an important factor, with a slight preference for the lowest price. The two higher prices were perceived as equivalent, as could have been expected given the negligible difference between them.
Several considerations about the current study are in order. First, an approach to clustering has been used that is just one out of several different possibilities. Clustering in Functional Measurement methodology can be either performed on raw data or at different stages of the information integration process (Hofmans & Mullet, 2013). In the present work, with the aim of exploring some possibilities, cluster analysis was tentatively applied to scale values and weight parameters independently. A more refined approach might be, for instance, clustering first on the weights, since they express the importance of an attribute, and then further clustering on the scale values parameters. Such an approach was not pursued in the present work to avoid an excessive segmentation of a relatively small sample of subjects.
Second, market segmentation is a pivotal strategic marketing concept, and very sophisticated clustering and validation procedures have long been available (see, e.g., Punj & Steward, 1983; Halkidi et al., 2001; Liu et al., 2012). Functional Measurement methodology, however, allows cluster analysis to be performed during different phases of the process, such as valuation, integration, and response, thus generating new possibilities for interpreting consumers' judgments. In particular, a perspective of market segmentation based on such a rationale might define segments not only on the basis of the preferences attached by consumers to quality goods, but also on the very definition of the cognitive rules that underlie their judgment strategies. Such an approach might be of some help, on the one hand, in targeting consumers' interests and tastes more efficiently and, on the other hand, in analyzing their behavior.
The use of different cognitive rules during the information integration process, such as EAM or DAM, might be of some help in market strategy and product design. Customized commercials and product specifications for a given niche might be created to have a greater impact on certain consumers than on others. As a naive example, if a given segment of interest follows an EAM rule in which design is the most important factor, commercials might be plain, simple, and strongly based on such visual (or tactile) information. Conversely, if the interest is in pursuing a segment of consumers that integrates information in a more complex way, commercials might be more structured, focusing on the relations between the attributes that the consumers themselves emphasize in their judgments. This might be helpful not only in the approach of classic segmentation and marketing, but also in the recent developments of sensory marketing (see, e.g., Krishna, 2012). Finally, it is also interesting to notice that, as previously investigated by Troutman and Shanteau (1976), averaging appears to be a widely followed rule in integrating information about product specifications.
To conclude, in this work, functional measurement has been applied to consumer evaluation in the footwear market. It should be stressed, however, that the presented analysis procedure has strong potential for the analysis of many different markets. Functional measurement can offer several hints to industries and companies, concerning both the design of products that meet the needs and expectations of consumers and the planning of effective marketing strategies.
REFERENCES
Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716-723.
Anderson, N. H. (1981). Foundations of Information Integration Theory. New York: Academic Press.
Anderson, N. H. (1982). Methods of Information Integration Theory. New York: Academic Press.
Anderson, W. T., & Cunningham, W. H. (1972). Gauging foreign product promotion. Journal of Advertising Research, 12, 29-34.
Brans, J. P., & Mareschal, B. (2005). PROMETHEE methods. In J. Figueira, S. Greco, & M. Ehrgott (Eds.), Multiple criteria decision analysis: State of the art surveys (pp. 163-195). Berlin: Springer.
Dalgic, T., & Leeuw, M. (1994). Niche marketing revisited: Concepts, applications, and some european cases. European Journal of Marketing, 28(4), 39-55.
Green, P. E., & Rao, V. R. (1971). Conjoint measurement for quantifying judgmental data. Journal of Marketing Research, 8, 355-363.
Green, P. E., & Srinivasan, V. (1978). Conjoint analysis in consumer research: Issues and outlook. Journal of Consumer Research, 5, 103-123.
Halkidi, M., Batistakis, Y., & Vazirgiannis, M. (2001). On cluster validation techniques. Journal of Intelligent Information Systems, 17(2/3), 107-145.
Hastak, M., & Hong, S. T. (1991). Country-of-origin effects on product quality judgments: An information integration perspective. Psychology and Marketing, 8(2), 129-143.
Hilgenkamp, H., & Shanteau, J. (2010). Functional measurement analysis of brand equity: Does brand name affect perceptions of quality? Psicologica, 31, 561-575.
Hofmans, J., & Mullet, E. (2013). Towards unveiling individual differences in different stages of information processing: A clustering-based approach. Quality & Quantity, 47, 455-464.
Kaufman, L., & Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis. Wiley, New York.
Keeney, R. L., & Raiffa, H. (1976). Decisions with multiple objectives: Preferences and value tradeoffs. New York: Wiley.
Krishna, A. (2012). An integrative review of sensory marketing: Engaging the senses to affect perception, judgment and behavior. Journal of Consumer Psychology, 22, 332-351.
Liu, Y., Kiang, M., & Brusco, M. (2012). A unified framework for market segmentation and its applications. Expert Systems with Applications, 39, 10292-10302.
Luce, R. D., & Tukey, J. W. (1964). Simultaneous conjoint measurement: A new type of fundamental measurement. Journal of Mathematical Psychology, 1(1), 1-27.
Lynch, J. G., Jr. (1985). Uniqueness issues in the decompositional modeling of multiattribute overall evaluations: An information integration perspective. Journal of Marketing Research, 22(1), 1-19.
Maechler, M., Rousseeuw, P., Struyf, A., & Hubert, M. (2005). Cluster: Cluster analysis basics and extensions. R package version 1.14.2, URL http://cran.r-project.org/web/packages/cluster/citation.html.
Massidda, D., Polezzi, D., & Vidotto, G. (2011). A Functional Measurement approach to cope the non-linearity of judgments in marketing research. In Proceedings of the 10th European Conference on Research Methodology for Business and Management Studies, pp. 348-354. Normandy Business School, Caen, France.
Norman, K. L. (1976). A solution for weights and scale values in functional measurement. Psychological Review, 83(1), 80-84.
Olson, J. C. (1977). Price as an informational cue: effects on product evaluation. In A. G. Woodside, J. N. Sheth, & P. D. Bennet (Eds.), Consumer and industrial buying behaviour (pp. 267-286). Amsterdam: North Holland.
Oral, M., & Kettani, O. (1989). Modelling the process of multiattribute choice. The Journal of the Operational Research Society, 40(3), 281-291.
Parrish, E. D., Cassill, N. L., & Oxenham, W. (2006). Niche Market Strategy for a mature marketplace. Marketing Intelligence & Planning, 24(7), 694-707.
Punj, G., & Steward, D. W. (1983). Cluster analysis in marketing research: Review and suggestions for applications. Journal of Marketing Research, 20(2), 134-138.
R Development Core Team. (2013). R: A language and environment for statistical computing. Vienna, Austria.
Roy, B. (1990). The outranking approach and the foundations of ELECTRE methods. In C. A. Banae (Ed.), Readings in Multiple Criteria Decision Aid (pp. 49-73). New York: Springer-Verlag.
Saaty, T. L. (1998). Multicriteria decision making - the analytic hierarchy process. Planning, priority setting, resource allocation. Pittsburgh: RWS Publishing.
Schwarz, G. (1978). Estimating the Dimension of a Model. The Annals of Statistics, 6(2), 461-464.
Tonks, D. G. (2009). Validity and the design of market segments. Journal of Marketing Management, 25(3-4), 341-356.
Troutman, M., & Shanteau, J. (1976). Do consumers evaluate products by adding or averaging attribute information? Journal of Consumer Research, 3(2), 101-106.
Vidotto, G., & Vicentini, M. (2007). A general method for parameter estimation of averaging models. Teorie e modelli, 12(1-2), 211-221.
Vidotto, G., Massidda, D., & Noventa, S. (2010). Averaging models: Parameters estimation with the R-Average procedure. Psicologica, 31(3), 461-475.
Vidotto, G. (2013). Note on differential weight averaging models in functional measurement. Quality and Quantity, 47(2), 811-816.
Zalinski, J., & Anderson, N. H. (1989). Measurement of importance in multi-attribute models. In J. B. Sidowski (Ed.), Conditioning, cognition and methodology. Contemporary issues in experimental psychology (pp. 177-215). Lanham, MD: University Press of America.
Zalinski, J., & Anderson, N. H. (1991). Parameter estimation for averaging theory. In N. H. Anderson (Ed.), Contributions to Information Integration Theory (Vol. 1: Cognition, pp. 353-394). Hillsdale, NJ: Lawrence Erlbaum Associates.
(Manuscript received: 8 October 2013; accepted: 21 February 2014)
Noventa S. (1), Anselmi P. (2), Tagliabue M. (2) and Vidotto G. (2)
(1) University of Verona, Italy; (2) University of Padova, Italy
(1) We would like to thank the anonymous reviewers of Psicologica for their insight into their work and their helpful suggestions to improve the manuscript. Email: email@example.com
|Author:||S., Noventa; P., Anselmi; M., Tagliabue; G., Vidotto|
|Date:||Dec 1, 2014|