Overview of Perceptual Mapping
William D. Neal
The Sawtooth Software Conference on:
Conjoint Analysis, Perceptual Mapping, and Computer Interviewing
April 6-8, 1988
Perceptual mapping is one of the few marketing research techniques that provides direct input into the strategic marketing planning process. It allows senior marketing planners to take a broad view of the strengths and weaknesses of their product or service offerings relative to the strengths and weaknesses of their competition. It allows the marketing planner to view the customer and the competitor simultaneously in the same realm.
Perceptual mapping and preference mapping techniques have been basic tools of the applied marketing research profession for over twenty years now. They are among the few advanced multivariate techniques that have not suffered very much from alternating waves of popularity and disfavor. Although I personally observed a minor waning in the use of these techniques in the early 1980's, they are now as popular as ever.
And although these techniques have been used extensively over a large number of applied research studies, and for a very wide variety of product and service categories, and have been subjected to extensive validations, there still remain some very basic issues as to the procedure's applicability and usefulness.
In addition, there remain many outstanding issues concerning the proper procedures and algorithms that should be used for perceptual mapping.
So, I see that my main task at this conference is to raise the issues, as I see them. I am taking a rather naive approach. That is, I will approach these issues from the research manager's point of view, and not the statistician's. These issues represent the kinds of questions that my clients ask me and my staff. Obviously, I have some answers, and some biases, but I will try to minimize those, and concentrate on the issues.
I know that many of these issues will be addressed at this conference, both in formal presentations and in informal discussions. I am taking this route in the hopes that this introduction will encourage greater investigation, increase validation activities, encourage additional, practitioner-oriented publishing activities, and provide fuel for additional conferences of this type.
WHAT'S IN A NAME?
So, let's start with the first issue. Just what is perceptual mapping? Or, is it preference mapping? Or, is it structural segmentation? Or what? Here is a list of some of the names that I have seen this procedure called:
- Perceptual Mapping
- MDS Mapping
- Preference Mapping
- Market Mapping
- Structural Segmentation
- Product Mapping
- Brand Mapping
- Goal Mapping
- Behavioral Mapping
- Image Mapping
- Strategic Product Positioning
- Semantic Mapping
Well, if the only difference between these various names is the selection of a particular attribute set, then I suggest that we rename the technique to just plain old Multivariate Mapping. If one wishes to distinguish algorithms, then the proper descriptive prefix can be used, such as discriminant analysis-based multivariate mapping. Or, if one wishes to distinguish the types of attributes used, then a suffix like multivariate mapping of consumer product preferences would be more appropriate. Either, or both, would be far more descriptive and would certainly reduce confusion.
If there are true differences between these various names and the idea of generic multivariate mapping, then we are obligated to make those distinctions and perpetuate that nomenclature throughout the profession. As it stands now, the name perceptual mapping is confusing to both marketing managers and many research professionals. Currently, most marketing managers assume that there is a fundamental difference between perceptual mapping and, say, preference mapping. Is there really?
ISSUES AND PROBLEMS WITH CURRENT ALGORITHMS IN GENERAL USE
Following are the three major classes of algorithms that are generally in use for perceptual mapping in the applied marketing research arena. Included is a brief discussion of their strengths and weaknesses, and some outstanding questions, from a user's viewpoint.
- Discriminant analysis is still the most popular algorithm in use today for applied multivariate mapping. The procedure is widely available. The algorithm is robust in that the assumptions concerning the continuity of the data, and the data distributions can be relaxed to a considerable extent.
The inputs to discriminant analysis consist of individual respondent ratings of products across attributes. The basic assumptions are that the rating scales are continuous and normally distributed. However, in using the technique for mapping purposes, these assumptions can be relaxed to the point that products simply rank-ordered on attributes will provide sufficient information for mapping purposes.
Discriminant analysis is much like regression analysis in that it uses a least-squares approach in an attempt to fit linear models to the data. However, the dependent variable is nominal. That is, for mapping purposes, the dependent variable is the product being rated. Thus, each product rated by each respondent is an input record, so if a respondent rated five products, that generates five input records.
Discriminant analysis then calculates the coefficients of a set of standardized linear equations, called discriminant equations, that explain the differences between the product ratings or, said a different way, explain the variance between product ratings.
The formation of the linear equations follows an order, such that the first equation explains the most variance, the second explains the most variance remaining after accounting for the variance explained by the first, and so on until you reach a limit of one less than the number of products being rated, or the number of variables, whichever is less.
These linear equations are further constrained so that each one is uncorrelated to the other. That is, they are orthogonal.
These two properties, the successive optimization of the variance explained and the orthogonality of the equations, form the basis for mapping, because one is assured that the first linear equation, which defines the X axis of a map, explains the most variation between products, and the second linear equation, or Y axis, explains the most variance between products after accounting for the variance explained by the X axis (given the limitations of the least-squares procedure). And the X and Y axes are orthogonal.
In most cases, the first two equations define the majority of the variance between product ratings, and are the only significant dimensions. Later, we will discuss significant dimensions beyond two.
Assuming for the moment that there are only two significant dimensions, the calculated coefficients of each variable in each equation define the X and Y coordinates of the attribute on the map.
The X and Y coordinates of each product are calculated by substituting the mean attribute ratings of each product into the two discriminant equations, and calculating the results.
The linear discriminant equations allow the researcher to easily plot additional products, or concepts into the derived space. These equations also allow the researcher to explore the distributions of specific customer groups in the derived space.
Most widely available discriminant analysis algorithms provide a variety of useful statistics to the researcher, such as eigenvalues to show you the variance explained by each equation, tests of significance for each equation, multivariate F statistics to show the significance of the group differences, and correlations between the discriminant functions and each attribute variable.
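The mapping steps described above can be sketched in a few lines of code. What follows is only a minimal illustration, assuming NumPy and a textbook formulation (eigenvectors of the within-products scatter matrix inverted against the between-products scatter matrix). The function names `discriminant_map` and `project`, the simulated ratings, and the use of unstandardized coefficients are all assumptions made for the example, not the routine of any particular package.

```python
import numpy as np

def project(mean_ratings, grand_mean, coefs):
    """Substitute a vector of mean attribute ratings into the discriminant equations."""
    return (np.asarray(mean_ratings, dtype=float) - grand_mean) @ coefs

def discriminant_map(ratings, labels, n_dims=2):
    """ratings: records x attributes, one row per product rated by a respondent.
    labels: the product rated in each record (the nominal dependent variable)."""
    X = np.asarray(ratings, dtype=float)
    labels = np.asarray(labels)
    products = sorted(set(labels.tolist()))
    grand_mean = X.mean(axis=0)
    n_attrs = X.shape[1]
    within = np.zeros((n_attrs, n_attrs))
    between = np.zeros((n_attrs, n_attrs))
    means = {}
    for product in products:
        rows = X[labels == product]
        means[product] = rows.mean(axis=0)
        centered = rows - means[product]
        within += centered.T @ centered            # scatter within each product
        gap = (means[product] - grand_mean).reshape(-1, 1)
        between += len(rows) * (gap @ gap.T)       # scatter between product means
    # Eigenvectors of within^-1 * between give the (unstandardized) coefficients.
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(within, between))
    eigvals, eigvecs = eigvals.real, eigvecs.real
    order = np.argsort(eigvals)[::-1][:n_dims]     # largest eigenvalue first
    coefs = eigvecs[:, order]                      # one column per map axis
    share = eigvals[order] / eigvals.sum()         # share of discriminating variance
    coords = {p: project(means[p], grand_mean, coefs) for p in products}
    return coefs, share, coords, grand_mean

# Simulated data: 3 hypothetical products, 4 attributes, 30 ratings each.
rng = np.random.default_rng(0)
centers = np.array([[5, 3, 4, 2], [3, 5, 4, 2], [4, 4, 2, 5]], dtype=float)
ratings = np.vstack([c + rng.normal(0, 1, size=(30, 4)) for c in centers])
labels = np.repeat(["A", "B", "C"], 30)

coefs, share, coords, grand_mean = discriminant_map(ratings, labels)
# A new concept is placed in the existing space without re-deriving the axes.
concept_xy = project([4.5, 4.0, 3.0, 3.0], grand_mean, coefs)
```

With three products there can be at most two discriminant equations, so in this example the two axes account for essentially all of the between-product variance. The `project` helper shows how an additional product or concept is placed in the derived space by substituting its mean ratings into the equations.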
The procedure also has a few drawbacks.
Obviously, it requires individual ratings of individual products (or services, or firms) on each of a selected set of attributes. Consequently, there is a perpetual problem with what to do with missing data points. Although I have read a dozen papers on handling missing data in discriminant analysis, there seems to be no consensus short of case-wise deletion. Yet, the realities of today's marketing research industry often make this an unacceptable solution. Is mean substitution an appropriate solution? How does mean substitution affect the calculation of the discriminant functions?
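The trade-off between the two treatments of missing data can be seen in a small sketch. This is a toy illustration using only the Python standard library. The records and the helper names `casewise_delete` and `mean_substitute` are hypothetical, and the sketch does not settle which treatment is appropriate.

```python
from statistics import mean, pvariance

# Each record is one respondent's ratings of one product on two attributes;
# None marks a missing rating. The values are made up for illustration.
records = [(2, 7), (None, 5), (6, 6), (None, 4), (4, None), (8, 3)]

def casewise_delete(rows):
    """Keep only records with no missing ratings."""
    return [r for r in rows if None not in r]

def mean_substitute(rows):
    """Replace each missing rating with the mean of the observed ratings on that attribute."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [tuple(col_means[j] if v is None else v for j, v in enumerate(r))
            for r in rows]

complete = casewise_delete(records)   # half of the records are lost
filled = mean_substitute(records)     # all records kept

observed_first = [r[0] for r in records if r[0] is not None]
filled_first = [r[0] for r in filled]
var_observed = pvariance(observed_first)
var_filled = pvariance(filled_first)  # smaller: substituted values sit at the mean
```

Case-wise deletion discards half of these records, while mean substitution keeps them all but shrinks the variance of the affected attribute, since every substituted value sits exactly at the mean. The discriminant functions are computed from those variances and covariances, so this shrinkage is one way mean substitution can change them.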
The procedure is dependent on the selection of the appropriate attribute set. The omission of important discriminating attributes may lead to false conclusions concerning the dimensionality of consumer ratings of product differences.
Also, the procedure highlights those variables that discriminate between products, and will not display on the map attributes that may be extremely important, even dominating product choice, but do not differentiate between products. Alternatively, situations often develop where a particular variable discriminates between products, but is not important in product choice.
Often, the selected set of attribute variables is highly correlated. Consequently, there is no control over the number of attribute variables, or which attribute variables, enter the discriminant solution and define the relevant space. To overcome this situation, multiple passes, forcing in variables in which there is a high interest, are often required. This can be costly.
The inclusion or exclusion of one of the products or firms being rated often changes the dimensionality of the space, especially when the set of firms or products under consideration is small or radically different from other products. It is often difficult to convey this situation to research managers and senior marketing management. A radically changing product space detracts from the confidence that senior marketing managers have in the procedure. Is there some way of overcoming this, short of adding more products simply to stabilize the space? That solution is often not viable in researching industrial products.
- R-Type Factor Analysis is seldom used as a mapping procedure in today's applied marketing research field, although in the 1970's it was the preferred mapping procedure among many applied researchers. And, there are a few empirical studies that show it is superior to discriminant analysis. Although you have the same problems with what to do about missing data and selecting the relevant set of variables as you have with discriminant analysis, this procedure overcomes two of the problems with discriminant analysis. All variables are shown on the map, and the inclusion or exclusion of products has no effect on the extracted dimensions.
The inputs to factor analysis are very similar to those for discriminant analysis, product ratings across attributes. However, an additional ingredient is required. You must also collect an importance rating from each respondent for each attribute. These importance ratings are the basis for developing the mapping space. The basic assumptions concerning the distribution and continuity of the rating scales should not be relaxed.
At this point the two procedures part ways. Unlike discriminant analysis, where the variance between product ratings is addressed, factor analysis attempts to explain the correlation between importance ratings of the variables. That is, the first factor equation is that linear equation that explains the maximum amount of correlation between the variables, and the second extracted equation explains the most of the remaining correlation, and so on, until 100% of the correlation is explained with a number of factors equal to one less than the number of variables. The extracted factors are linear equations which have a coefficient for each variable. These coefficients are commonly referred to as factor loadings.
The output of factor analysis does meet the basic criteria for developing a map. The first two dimensions explain the maximum amount of variance (i.e. correlation) between the importance ratings of the variables (not the ratings of the products), and they are orthogonal. Thus, to define a variable location on the map is a simple case of using that variable's loading on the first factor as the X coordinate, and its loading on the second factor as the Y coordinate.
Factor analysis is an interdependence procedure, thus the various differences in product ratings are ignored until after the factor equations are derived. Product locations in the derived space are calculated by averaging the first two factor scores of that product's ratings to define the X and Y coordinates or, alternatively, by plugging the average product scores on each attribute into the two factor equations and calculating the X and Y coordinates.
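As a rough illustration of the factor-based map, the sketch below extracts the first two factors from the correlation matrix of importance ratings and uses the loadings as attribute coordinates. It assumes NumPy and principal-component extraction, which is only one of several extraction methods. The function names, the simulated data, and the shortcut of multiplying centered product means by the loadings (rather than by factor score coefficients) are assumptions for the example.

```python
import numpy as np

def factor_map(importance, n_dims=2):
    """importance: respondents x attributes matrix of importance ratings."""
    R = np.corrcoef(np.asarray(importance, dtype=float), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)           # R is symmetric
    order = np.argsort(eigvals)[::-1][:n_dims]     # largest eigenvalue first
    # Loading = eigenvector scaled by the square root of its eigenvalue.
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    explained = eigvals[order] / eigvals.sum()
    return loadings, explained

def place_products(product_means, loadings):
    """Simplified placement: centered mean ratings multiplied by the loadings."""
    M = np.asarray(product_means, dtype=float)
    return (M - M.mean(axis=0)) @ loadings

# Simulated importance ratings driven by two underlying themes.
rng = np.random.default_rng(1)
latent = rng.normal(size=(200, 2))
weights = np.array([[1.0, 0.9, 0.8, 0.1, 0.0],
                    [0.0, 0.1, 0.2, 0.9, 1.0]])
importance = latent @ weights + rng.normal(scale=0.5, size=(200, 5))

loadings, explained = factor_map(importance)       # row i = (X, Y) of attribute i
product_means = np.array([[5, 4, 4, 2, 3],
                          [3, 3, 4, 5, 4],
                          [4, 5, 3, 3, 2]], dtype=float)
product_xy = place_products(product_means, loadings)
```

Note that the product ratings play no part in deriving the axes here. They enter only in the final placement step, which is the interdependence property described above.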
The extraction of factors is highly sensitive to the number of correlated attributes. The addition or deletion of an attribute may dramatically alter the dimensionality of the derived space. In addition, extraction of factors is dependent on the intercorrelations between variables, and does not necessarily optimize the separation between products, like discriminant analysis. Furthermore, a single variable that may be considered extremely important and dominating the selection of products, like safety, may not show up as a dimension on a map, simply because it is not correlated to any of the other measures.
Myers and Tauber (Market Structure Analysis, AMA, 1977) recommended overcoming this problem through the use of a "weighted covariance approach", where the input to the factoring program is a matrix of product covariances, weighted by regression scores derived from regressing the importance ratings against product choice. But this has proved to be a bulky and difficult procedure to implement, and there has been little empirical validation.
- Non-metric scaling procedures are still used quite often for multivariate mapping. However, I am only going to concentrate on one of those, and briefly describe the others.
- Correspondence Analysis or Dual Scaling techniques are gaining in popularity, mainly because there has been a considerable amount written on the technique over the last few years, it is an extremely robust technique, it has simple data collection requirements, and the algorithms are becoming widely available.
Correspondence analysis is often used as a post-hoc mapping procedure for studies that did not originally contemplate multivariate mapping, because of its ability to use summary distributions of nominal data. The procedure puts no significant demands on the distribution of the data. In addition, the procedure does not require the standard attributes-by-products data format required by other procedures. A matrix of products-by-attributes works just as well, and will produce an identical map.
In addition, the point-point maps produced from correspondence analysis are directly generated by most of the programs, and they are much easier for general marketing managers and creative promotional personnel to understand.
Inputs to correspondence analysis can be as simple as a summary table of respondent checks as to whether a product has a certain characteristic or not. Almost any data collection procedure imaginable can be transformed, and used to provide inputs to correspondence analysis. Respondents can be asked to name a single brand most associated with an attribute, or occasion, or store. Even open ended questions can be used by asking respondents to name the qualities most associated with a brand, or store, or personality. There are no restrictions as to how many, or how few items a respondent associates with a product, or whatever.
The data input to the program is a matrix of counts of how many times a product, service, firm, or whatever, is associated with an attribute, usage occasion, need, or whatever.
Consequently, the data collection process is highly simplified. This has considerable appeal in light of the industry's intense interest in "respondent abuse" and declining response rates.
Correspondence analysis has a unique ability to integrate a large amount of data from divergent perspectives on a single map. For example, brands, product attributes, needs fulfillment, and usage occasions can all be shown on the same map.
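To make the mechanics concrete, here is a minimal sketch of simple correspondence analysis applied to a table of association counts. It assumes NumPy and the usual singular value decomposition of the standardized residuals. The brand-by-attribute counts and the name `correspondence_map` are invented for the example and do not reproduce any specific program.

```python
import numpy as np

def correspondence_map(counts, n_dims=2):
    """counts: rows (e.g. brands) x columns (e.g. attributes) table of association counts."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    # Standardized residuals from the independence model.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates for rows and columns on the first n_dims axes.
    row_xy = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]
    col_xy = (Vt.T[:, :n_dims] * sv[:n_dims]) / np.sqrt(c)[:, None]
    inertia = sv ** 2
    explained = inertia[:n_dims] / inertia.sum()
    return row_xy, col_xy, explained

# Hypothetical counts: 4 brands (rows) by 5 attributes (columns).
counts = [[40, 10, 25, 5, 20],
          [15, 35, 10, 30, 10],
          [20, 20, 30, 10, 25],
          [5, 30, 15, 40, 10]]

brand_xy, attr_xy, explained = correspondence_map(counts)

# Transposing the table swaps the two point sets; the map is unchanged
# apart from a possible reflection of an axis.
attr_xy_t, brand_xy_t, explained_t = correspondence_map(np.transpose(counts))
```

Running the function on the transposed table returns the same coordinates, up to the sign of each axis, which is why the orientation of the input table does not matter.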
The two main drawbacks of the technique are that it uses only summarized distributions of nominal data for most of the algorithms that are currently available. Thus, a considerable amount of the variance associated with a database of individual responses is sacrificed. And metric data distributions must be "nominalized" to be used in the procedure.
The exception is Benzécri's SPAD program, which few researchers have access to. SPAD allows you to input either the individual observations, or ratings, or the summarized data. Interestingly, you will often get differing amounts of explained variance, and/or different product and attribute locations on the map, depending on whether you use the individual observations or the summarized data. Frankly, I'm not sure why this happens.
If there are a number of metric distributions that must be converted to nominal variables, the selection of the appropriate break-points is critical, and has a considerable effect on the amount of explained variance and the extracted dimensions of the correspondence map. We need a solution to this situation, and guidelines on proper procedures for nominalizing metric data.
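The sensitivity to break-points is easy to demonstrate. The sketch below, in plain Python, converts ratings on a hypothetical 1-to-10 scale into nominal categories and builds the count table that a correspondence analysis program would take as input. The records, the category labels, and the two sets of break-points are all made up for illustration.

```python
from bisect import bisect_right
from collections import Counter

def nominalize(rating, break_points):
    """Return the index of the category a rating falls into.
    A rating equal to a break-point goes into the higher category."""
    return bisect_right(break_points, rating)

def count_table(records, break_points, labels):
    """records: (product, rating) pairs. Returns product -> Counter of category counts."""
    table = {}
    for product, rating in records:
        category = labels[nominalize(rating, break_points)]
        table.setdefault(product, Counter())[category] += 1
    return table

records = [("A", 9), ("A", 7), ("A", 3), ("B", 8), ("B", 5), ("B", 5)]
labels = ["low", "mid", "high"]

table_1 = count_table(records, [4, 8], labels)   # low < 4, mid 4 to 7, high 8 and up
table_2 = count_table(records, [6, 9], labels)   # low < 6, mid 6 to 8, high 9 and up
```

The same six ratings give product B one "high" and two "mid" counts under the first set of break-points, but one "mid" and two "low" counts under the second. Since the count table is the only thing the mapping algorithm sees, the choice of break-points carries straight through to the map.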
- KYST, PROFIT, INDSCAL, TORSCA, PREFMAP, PROXIMITY, ALSCAL, SSA-1 thru SSA-4, MRSCAL, MINISSA, MINITRI, PARAFAC, and MDSCALE (to name a few) all fall into a class of mapping procedures called non-metric multidimensional scaling procedures. However, in actuality, some of these algorithms are more metric in nature than non-metric. Although conceptually different from correspondence analysis, for the most part they have been replaced with correspondence analysis because the data collection procedure is as easy for one as the other.
These methods release the researcher from having to specify the appropriate attribute set, and instead rely on how consumers judge the products in question to be similar, or dissimilar. The data collection process is often an unstructured sorting task, where respondents are asked to sort products into piles that are similar, or simply to rank order products based on their similarity.
Orthogonal scales are then derived to explain the consumers' perceived differences between the complete set of products. The derivations are based on minimizing stress in the fewest dimensions possible, while preserving respondents' order of similarity. The nature of the dimensions is determined by inspecting the manner in which each product is aligned with each dimension.
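As an illustration of the data preparation step, the sketch below turns pile-sort responses into a product-by-product dissimilarity table of the kind a scaling program takes as input. It is plain Python. Defining dissimilarity as one minus the proportion of respondents who put two products in the same pile is an assumed convention, not something prescribed by any of the programs named above, and the sorts themselves are invented.

```python
from itertools import combinations

def pile_sort_dissimilarity(sorts, products):
    """sorts: one entry per respondent, each a list of piles (lists of product names).
    Returns {(product_i, product_j): dissimilarity} for every pair of products."""
    together = {pair: 0 for pair in combinations(sorted(products), 2)}
    for piles in sorts:
        for pile in piles:
            for pair in combinations(sorted(pile), 2):
                together[pair] += 1        # this respondent judged the pair similar
    n = len(sorts)
    return {pair: 1 - count / n for pair, count in together.items()}

# Three hypothetical respondents sorting products A, B and C.
sorts = [
    [["A", "B"], ["C"]],
    [["A", "B", "C"]],
    [["A"], ["B", "C"]],
]
dissim = pile_sort_dissimilarity(sorts, ["A", "B", "C"])
```

Here A and C come out as the least similar pair, because only one of the three respondents placed them together. A scaling routine would then look for a low-dimensional configuration whose inter-point distances preserve that ordering.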
Explanatory variables can be depicted on the map by asking consumers to correlate the similarity of a given attribute, or usage occasion, to the products.
The procedures, for the most part, are quite sensitive to the number of products in the data set. The addition or deletion of one product will often change the dimensionality of the space.
In addition, several of these algorithms require complicated, and often conceptually difficult, data transformations to work correctly and they are quite sensitive to the types of transformations undertaken. (see "Multidimensional Scaling", by Kruskal and Wish, Sage University Press, 1978.)
CURRENT ISSUES IN PERCEPTUAL MAPPING
- Defining and limiting the relevant space
How is the relevant space limited? There are three types of limitations that must be placed on the relevant multivariate space that will be analyzed and mapped. They are:
- Limits on the relevant set of variables that will be used to define the perceptual space. In my opinion, this is the most critical area for setting limitations, except for those using the scaling methods based on overall product similarities. The major question to the applied researcher is what variables are to be used to orient the perceptual positioning of the various competitors. There is a nearly unlimited set of variables available.
The selection of the relevant variable set determines the type of map that will be produced. That is, will the map be based on such things as purchase behavior, organizational images, product usage behaviors, product attribute characteristics, brand images, consumer goals, consumer needs, convenience issues, or some combination of these.
This is a critical decision, and requires the agreement of senior marketing management to concur with the appropriate attribute set. Determination of the relevant set requires the professional marketing researcher to critically examine previous research in the category, conduct qualitative research, and creatively select those variables that will provide senior marketing managers with the insight necessary to form marketing strategy.
The problem is that we all have seen empirical evidence that the relevant set of attributes changes dramatically from product category to product category, and even among sub-categories. Yet, there is no substantial body of knowledge to tell us what is the relevant set of variables that should be used in any one category. We are left to re-inventing the wheel every time we approach a new product category with multivariate mapping. This severely detracts from the general adoption of multivariate mapping procedures at the strategic marketing planning level.
- Limits on the relevant set of products, services, or firms that will be mapped into the multivariate space are also a major issue. Although I don't believe that this is as critical an issue as the selection of the relevant variable set, it is still a serious one. A balance is required.
In this era of market fragmentation, and the rapid emergence of new product categories, and sub-categories, brought on by an acceleration of differentiated products flooding the market place, the selection of the relevant competitive set of products or services is ever-changing.
If the relevant set of products, services or firms is too broad, we may fail to uncover those truly discriminating variables that may reveal an opportunity for a competitive advantage. That is, some non-competitive products may so skew the spatial dimensions of the map that differences between the true set of competitors may be hidden or overlooked.
On the other hand, the selection of too narrow of a competitive set may destine the marketing planner to focus on the wrong competitors and wrong dimensions. As an example, department stores for years focused on competing department stores as the relevant set, ignoring the single merchandise line specialty stores and the deep discounters until the department stores' bottom lines started gushing red ink.
Given the rapid nature of change in the competitive set for most product and service lines, we cannot rely on a body of literature to solve this problem. What is needed is a set of generally accepted procedures for determining the relevant competitive set at any point in time.
Permit me to continue the discussion of issues in multivariate mapping in a more abbreviated manner. I will limit my remarks from here on to discriminant analysis-based multivariate mapping, since that is what most of us are using.
- Are there particular product categories or merchandise lines or firm-types where discriminant analysis-based mapping works better? If so, then what are the characteristics of those product categories or industries?
- Is "high-involvement" in the respondent rating process a necessary prerequisite for multivariate mapping? What level of familiarity is necessary and sufficient to include a set of ratings into the definition of the relevant multivariate space?
- Extracting the dimensions.
- What are some good rules of thumb for determining how many dimensions to use? How much variance needs to be explained to be comfortable? How should we handle dimensions that explain little variance, but test as significant?
- How do you display more than two dimensions? What procedures and graphics algorithms are available? What graphics procedures best convey the information in the multivariate space to managers and creative professionals?
- If you are forced to use a two-dimensional map, but have three or more significant dimensions, how do you adequately show those attributes that are heavily loaded on the third dimension? Or, do you eliminate those from the display? If you do eliminate them, what criteria should you use?
- What actions should you take when the first extracted dimension explains much more variance than the second dimension? Is it appropriate to display those two dimensions as equal axes in the map?
- Plotting the variables in the derived space raises some interesting questions.
- Should variable coordinate weighting be used to show differences in the amount of variance explained by each axis?
- If so, what should be used as the appropriate weights: percent of variance explained by each axis, eigenvalues, or something else?
- Plotting the firms/products in the perceptual space
- How should we show which products or firms are significantly different from others on the map?
- Does anyone attempt to draw confidence limits around the mapped points anymore?
- What about "ideal" points?
- Should "ideal" points be used at all?
- If so, what is the best way of doing that?
- Use importance ratings and treat these as another product rating? In other words, do we permit importance ratings to assist in the definition of the relevant space?
- Or, should we calculate standardized mean importance ratings and plug those values into the previously extracted dimensional linear equations to calculate the coordinates of the ideal point?
- Should we use respondents' highest rating of any firm/product on each attribute and use that as a set of ratings for the "ideal" product?
- What about using respondents' preferred firm/product and simply duplicating that rating as the set of "ideal" ratings, under the assumption that the respondent will purchase or use that product closest to their ideal?
- Is it appropriate to map a "generalized" space, then segment the sample on importance ratings or product preferences, then impose the mean ratings of those segments as multiple "ideal" points on the map?
- What other methodologies are there for generating "ideal points"?
- What do you do when any one of these procedures dramatically skews the map?
- Is longitudinal mapping a valid concept? What are the critical issues in overlaying maps? What are the best methods for doing this?
- Line up "index" points from successive time periods so as to minimize the variance between them? Should the index points be the vector of importance ratings, or some other measure?
- Select a very stable vector that consistently discriminates between at least two of the products or firms, and minimize the variance between their positions over successive time periods?
- Use both of these methods in combination?
- Re-generate the dimensions with each attribute from each time period representing a separate attribute, and each product from each time period representing a separate product?
- Always use the original space, and simply plug in the standardized means for each product from successive time periods into the linear dimensional equations and calculate the new coordinates?
- What other procedures are being used?
- How can you incorporate volumetric data into multivariate mapping? In other words, how can you show the marketing manager where the greatest demand exists on the map? Or, where the opportunities are?
- Are scatter plots of grouped respondent locations the only thing available?
- Or, can we develop a surface-plot over the mapped space that will depict such things as dollars spent, or number of items bought, or even number of times visited? What methods are being used now? What could be done with the new graphics packages combined with multivariate "smoothing" routines to superimpose surface plots over the derived space?
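One possible reading of the surface-plot idea in the last question is sketched below. Each respondent, or respondent group, is given a location on the map and a volume such as dollars spent, and a Gaussian kernel spreads that volume over a grid laid across the map. This is plain Python and purely illustrative. The kernel, the bandwidth, the grid, and the name `demand_surface` are all assumptions, and the question of which smoothing method is appropriate remains open.

```python
import math

def demand_surface(points, grid_x, grid_y, bandwidth=0.5):
    """points: (x, y, volume) triples located in the derived map space.
    Returns a grid (list of rows, one per grid_y value) of smoothed volume."""
    surface = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            total = 0.0
            for x, y, volume in points:
                dist_sq = (gx - x) ** 2 + (gy - y) ** 2
                # Nearby respondents contribute most; the weight decays with distance.
                total += volume * math.exp(-dist_sq / (2 * bandwidth ** 2))
            row.append(total)
        surface.append(row)
    return surface

# Hypothetical respondent groups: map coordinates and dollars spent.
points = [(0.0, 0.0, 100.0), (2.0, 0.0, 40.0)]
grid_x = [0.0, 1.0, 2.0]
grid_y = [0.0, 1.0]
surface = demand_surface(points, grid_x, grid_y)
```

The resulting grid could be handed to any contouring or surface-plotting routine and overlaid on the product and attribute positions, so that peaks mark regions of the map where spending is concentrated.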
Needless to say, there are still many outstanding issues and further development opportunities with multivariate mapping procedures. I'm sure that there are others besides these. I would like to challenge this audience to address these issues, share them with your peers, publicize solutions to them, freely subject them to validations, and give us more specificity in executing this most powerful and useful marketing research procedure.