Usage Notes
The routines described in this chapter all involve multidimensional scaling. Routine MSIDV performs computations for the individual differences metric scaling models. The utility routines are useful for associated computations as well as for programming other methods of multidimensional scaling.
The following is a brief introduction to multidimensional scaling meant to acquaint the user with the purposes of the routines described in this chapter. Also of interest is the table at the end of this section giving the notation used. A more complete description of procedures in multidimensional scaling may be found in the references, as well as in the algorithm sections for the routines.
Multidimensional Scaling Data Types
A “dissimilarity” is a subject’s measure of the “distance” between two objects. For example, a subject’s estimate of the distance between two cities is a dissimilarity measure that may, or may not, be the actual distance between the cities (depending upon the subject’s familiarity with the two cities). More often, however, dissimilarities bear a much looser relationship to physical distance: the subject may estimate, on a given scale, the difference between two smells, two tastes, two colors, two shapes, etc. As a concrete example, a subject asked to compare two wines might indicate that they have very similar tastes (scale value 0), very different tastes (scale value 10), or something in between. In this case, no objective measure of “distance” is available, yet the dissimilarity may be measured. In all cases, however, the larger the difference between the objects, the larger the dissimilarity measure.
If instead the measure increases as the objects become more similar, then a “similarity” measure rather than a “dissimilarity” measure is obtained. Most routines in this chapter require dissimilarities as input, so similarities must first be converted to dissimilarities. Routine MSSTN provides two common methods for performing this conversion.
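For illustration, the following sketch performs one plausible conversion, subtracting each similarity from the largest similarity observed. This is only one common convention; the two conversions actually provided by MSSTN are documented with that routine.

    ! Convert similarities S to dissimilarities DISS by subtracting each
    ! similarity from the largest observed similarity.  This is one common
    ! convention only; see MSSTN for the conversions actually provided.
    subroutine sim_to_diss (n, s, diss)
       integer, intent(in) :: n
       real, intent(in)    :: s(n)
       real, intent(out)   :: diss(n)
       diss = maxval(s) - s
    end subroutine sim_to_diss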
In general, dissimilarities between all objects in a set are measured (yielding a matrix of dissimilarities), and the multidimensional scaling problem is to locate the objects in a Euclidean (or other) space of known dimension given the matrix of dissimilarities. The estimates of object locations should yield predicted distances between the objects that “closely approximate” the observed dissimilarities. In many multidimensional scaling methods, “closely approximates” means that a predefined measure of the discrepancy (the “stress”) is minimized. The simplest stress measure is the sum of the squared differences between the observed dissimilarities and the distances predicted by the estimated object locations. This stress measure, as well as all other stress measures used in this chapter, is discussed more fully in the manual document for routine MSTRS.
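As an illustration of this simplest stress measure, the sketch below accumulates the sum of squared differences between the disparities (here simply the observed dissimilarities) and the predicted distances, skipping pairs coded as missing by negative dissimilarities. The stress formulas actually used by the routines, including normalization and weighting, are documented with MSTRS.

    ! Raw stress: sum of squared differences between disparities and
    ! predicted distances.  Missing values (coded here as negative
    ! dissimilarities) are skipped.  See MSTRS for the stress formulas
    ! actually used by the routines in this chapter.
    function raw_stress (n, disp, dist) result (strss)
       integer, intent(in) :: n
       real, intent(in)    :: disp(n), dist(n)
       real                :: strss
       integer             :: i
       strss = 0.0
       do i = 1, n
          if (disp(i) >= 0.0) strss = strss + (disp(i) - dist(i))**2
       end do
    end function raw_stress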
Note that the predicted distances between objects may not be Euclidean distance. Indeed, in one of the more popular multidimensional scaling models, the individual differences model, weighted Euclidean distance is used. Let λ1k and λ2k, k = 1, …, d, be the location estimates of two objects (stimuli) in a d-dimensional space. Then, the weighted Euclidean distance used in the individual differences model is given by

δ12l = [Σ(k=1,…,d) wlk(λ1k − λ2k)²]^(1/2)

where wlk is the weight given by subject l to the k-th dimension.
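In code, this distance might be computed as follows. The array names CFL (the configuration Λ) and W (the subject weights) follow the notation table at the end of this section, but the routine itself is only an illustrative sketch.

    ! Weighted Euclidean distance between stimuli i and j for subject l
    ! under the individual differences model.  CFL(q,d) holds the
    ! configuration and W(nsub,d) the subject weights.
    function wdist (i, j, l, q, d, nsub, cfl, w) result (dist)
       integer, intent(in) :: i, j, l, q, d, nsub
       real, intent(in)    :: cfl(q,d), w(nsub,d)
       real                :: dist
       dist = sqrt(sum(w(l,:) * (cfl(i,:) - cfl(j,:))**2))
    end function wdist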
Many other distance models are possible. The models used in this chapter are discussed in the manual document for routine MSDST.
A dissimilarity is a subject’s estimate of the difference (“distance”) between two objects. From the observed dissimilarities, a predicted distance between the objects is obtained by estimating the location of the objects in a Euclidean space of given dimension. In metric scaling, the dissimilarity may be a ratio measure (in which case a dissimilarity of zero means that the objects are in the same location) or an interval measure (in which case “distance” plus a constant is observed). When an interval measure is observed, the interval constant, c, must also be estimated in order to relate the dissimilarity to the predicted distance. For ratio measures, c is not required. A couple of methods for estimating c are used by the routines in this chapter. These methods are explained in the routines that use them.
In nonmetric scaling, the dissimilarity is an ordinal (rank) or categorical measure. In this case, the stress function need only assure that the predicted distances satisfy, as closely as possible, the ordinal or categorical relationships observed in the data. Thus, for ordinal data the stress should be zero if the predicted distances maintain the observed rankings of the dissimilarities. The meaning of stress for categorical data is less direct and is discussed further below.
In ordinal data, the stress function is computed as follows: First, the dissimilarities are transformed so that they correspond as closely as possible to the predicted distances, but such that the observed ordinal relationships are maintained. The transformed dissimilarities are called “disparities”, and the stress function is computed from the disparities and the predicted distances. (In ratio and interval data, disparities may be taken as the dissimilarities.) Thus, if the predicted distances preserve the observed ordinal relationships, a stress of zero will be computed. If the predicted distances do not preserve these relationships, then new estimates for the distances based upon the disparities can be computed. These can be followed by new estimates of the disparities. When the new estimates do not lead to a lower stress, convergence of the algorithm is assumed.
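The transformation to disparities for ordinal data is commonly computed by a pool-adjacent-violators pass over the predicted distances sorted into the rank order of the observed dissimilarities. The following sketch illustrates the idea only; the disparity computations actually used by the routines in this chapter are selected through the IDISP option and may differ in detail.

    ! Pool-adjacent-violators sketch for ordinal disparities.  DIST holds
    ! the predicted distances already sorted into the rank order of the
    ! observed dissimilarities; DISP returns a nondecreasing fit to DIST.
    subroutine pava (n, dist, disp)
       integer, intent(in) :: n
       real, intent(in)    :: dist(n)
       real, intent(out)   :: disp(n)
       integer :: i, j
       disp = dist
       do i = 2, n
          if (disp(i) < disp(i-1)) then
             ! Pool backwards until the sequence up to i is nondecreasing;
             ! averaging preserves sums within the pooled block.
             j = i
             do while (j > 1)
                if (disp(j) >= disp(j-1)) exit
                disp(j-1:i) = sum(disp(j-1:i)) / real(i - j + 2)
                j = j - 1
             end do
          end if
       end do
    end subroutine pava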
In categorical data, all that is observed is a category for the “distance” between the objects, and there are no known relationships between the categories. In categorical data, the disparities are such that the categories are preserved. A score minimizing the stress is found for each category. As with ordinal data, new distances are computed from this score, followed by new scores for the categories, etc., with convergence occurring when the stress cannot be lowered further. In categorical data, a stress of zero should be relatively uncommon.
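Under a simple squared-error stress, the score minimizing the stress for a category is the mean of the predicted distances falling in that category. The sketch below assumes integer category codes 1 through ncat; it illustrates the idea only and does not reproduce the routines’ actual computations.

    ! Category scores (disparities) for categorical data under a simple
    ! squared-error stress: the optimal score for each category is the
    ! mean of the predicted distances falling in that category.
    subroutine category_scores (n, ncat, cat, dist, score)
       integer, intent(in) :: n, ncat
       integer, intent(in) :: cat(n)       ! category code (1..ncat) per pair
       real, intent(in)    :: dist(n)      ! predicted distances
       real, intent(out)   :: score(ncat)  ! score for each category
       integer :: i, cnt(ncat)
       score = 0.0
       cnt = 0
       do i = 1, n
          score(cat(i)) = score(cat(i)) + dist(i)
          cnt(cat(i)) = cnt(cat(i)) + 1
       end do
       where (cnt > 0) score = score / real(cnt)
    end subroutine category_scores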
The individual differences model assumes that the squared distance between stimuli i and j for subject l, δ²ijl, is given as

δ²ijl = Σ(k=1,…,d) wlk(λik − λjk)²

where d is the number of dimensions (always assumed to be known), λik is the location of the i‑th stimulus in the k‑th dimension, and wlk is the weight given by subject l to the k‑th dimension. Let

δ̄²i·l = (1/q) Σ(j=1,…,q) δ²ijl

denote the average of the squared distances in the i‑th row of the dissimilarity matrix for the l‑th subject, let δ̄²·jl be similarly defined for the j‑th column, and let δ̄²··l denote the average of all squared distances for the l‑th subject. Then, the product moment (double centering) transformation is given by

pijl = −(δ²ijl − δ̄²i·l − δ̄²·jl + δ̄²··l)/2
The advantage of the product‑moment transformations is that the “product‑moment” (double centered) matrices Pl = (pijl) can be expressed as
Pl = Λ[diag(Wl)]Λᵀ
where Λ = (λik) is the configuration matrix, and where diag(Wl) is a diagonal matrix with the subject weights for subject l, wlk, along the diagonal. If one assumes that the dissimilarities are measured without error, then the dissimilarities can be used in place of the distances, and the above relationship allows one to compute both diag(Wl) and Λ directly from the product‑moment matrices so obtained. If error is present but small, then very good estimates of Λ and diag(Wl) can still be obtained (see De Leeuw and Pruzansky 1978). Routine MSDBL computes the product‑moment matrices, while MSINI computes the above estimates for Λ and diag(Wl).
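The double-centering computation itself is straightforward; the following sketch illustrates it for one subject’s q × q matrix of squared dissimilarities (MSDBL performs this computation, with the conventions documented for that routine).

    ! Product-moment (double centering) transformation of a q x q matrix
    ! of squared dissimilarities DSQ for one subject (cf. routine MSDBL).
    subroutine double_center (q, dsq, p)
       integer, intent(in) :: q
       real, intent(in)    :: dsq(q,q)
       real, intent(out)   :: p(q,q)
       real    :: rowm(q), colm(q), grand
       integer :: i, j
       rowm  = sum(dsq, dim=2) / real(q)   ! row means
       colm  = sum(dsq, dim=1) / real(q)   ! column means
       grand = sum(dsq) / real(q*q)        ! grand mean
       do j = 1, q
          do i = 1, q
             p(i,j) = -0.5 * (dsq(i,j) - rowm(i) - colm(j) + grand)
          end do
       end do
    end subroutine double_center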
Data Structures
The data input to a multidimensional scaling routine is, conceptually, one or more dissimilarity (or similarity) matrices where a dissimilarity matrix contains the dissimilarity measure between the
i‑th and j‑th stimuli (objects) in position (i, j) of the matrix. In multidimensional scaling, the dissimilarity matrix need not be symmetric (asymmetric distances can also be modelled, see routine MSDST) but if it is, only elements above the diagonal need to be observed. Moreover, in the multidimensional “unfolding” models, the distances between all pairs of objects are not observed. Rather, all (or at least many) of the dissimilarities between one set of objects and a second set are measured. When these types of input are combined with the fact that missing values are also allowed in many multidimensional scaling routines, it is easy to see that data structures required in multidimensional scaling can be quite complicated. Three types of structures are allowed for the routines described in this chapter. These are discussed below.
Let X denote a matrix containing the input dissimilarities. The columns of X correspond to the different subjects, and a subject’s dissimilarity matrix is contained within a column. Thus, X is a matrix containing a set of dissimilarity matrices, one dissimilarity matrix within each column. For any one problem, the form (structure) of all dissimilarity matrices input in X must be consistent over all subjects; the form can vary from problem to problem, however. In the following, X contains only one column, and the subject index is omitted to simplify the notation. The three storage forms used by the routines described in this chapter are
1. Square symmetric: For this form, each column of X contains the upper triangular part of the dissimilarity matrix, excluding the diagonal elements (which should be zero anyway). Specifically, X(1) contains the (1, 2) element of the dissimilarity matrix, X(2) contains the (1, 3) element, X(3) contains the (2, 3) element, etc. Let q denote the number of stimuli in the matrix; all q(q − 1)/2 off‑diagonal elements are stored (see the sketch following this list).
2. Square asymmetric: X contains all elements of each square dissimilarity matrix, stored as if X were dimensioned q × q. The diagonal elements are present but are not used.
3. Rectangular: This corresponds to the “unfolding models” in which not all of the dissimilarities in each matrix are observed. In this storage mode, the row stimuli do not correspond to the column stimuli. Because of the form of the data, no diagonal elements are present, and the data are stored in X as if X were dimensioned r × s where r is the number of row stimuli and s is the number of column stimuli.
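For illustration, the following sketch expands the square symmetric storage form into a full q × q matrix; the index mapping is the one described in item 1 above.

    ! Expand the square symmetric storage form (upper triangle stored
    ! columnwise, diagonal omitted) into a full q x q dissimilarity matrix.
    subroutine expand_symmetric (q, x, diss)
       integer, intent(in) :: q
       real, intent(in)    :: x(q*(q-1)/2)
       real, intent(out)   :: diss(q,q)
       integer :: i, j, k
       diss = 0.0
       k = 0
       do j = 2, q
          do i = 1, j - 1
             k = k + 1                 ! X(k) holds the (i, j) element
             diss(i,j) = x(k)
             diss(j,i) = x(k)          ! symmetric storage implies symmetry
          end do
       end do
    end subroutine expand_symmetric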
Missing values are also allowed. They are indicated in X in either of two ways: 1) The standard IMSL missing value indicator NaN (not a number) may be used to indicate missing values, or 2) negative elements of X are taken to be missing dissimilarities.
Table 14.1 gives the notation commonly used in this chapter. In general, an element of a matrix is denoted by the lowercase matrix name with subscripts. The notation is generally consistent, with minor variations where appropriate.
Table 14.1 – Commonly Used Notation

Symbol      Fortran   Meaning
            DIST      Distance between objects i and j for subject l.
            DISP      Disparity for objects i and j for subject l.
X           X         The input array of dissimilarities.
d           NDIM      The number of dimensions in the solution.
W           W         The matrix of subject weights.
diag(Wl)              The diagonal matrix of subject weights for subject l.
π           WS        The matrix of stimulus weights.
Λ           CFL       The configuration matrix.
αh          A         The intercept for stratum h.
βh          B         The slope for stratum h.
νh          WT        The stratum weight for stratum h.
Nh          NCOM      The number of nonmissing dissimilarities in stratum h.
Pl          P         The product‑moment matrix for subject l.
φ           STRSS     The stress criterion (over all strata).
φl          STRS      The stress within stratum l.
p           POWER     The power to use in the stress criterion.
q           NSTIM     The total number of stimuli.
η           NSUB      The number of matrices input.
Γ                     Normalized eigenvectors.
            IFORM     Option giving the form of the dissimilarity input.
            ICNVT     Option giving the method for converting to dissimilarities.
            MODEL     Vector giving the parameters in the distance model.
            ISTRS     Option giving the stress formula to use.
            ITRANS    Option giving the transformation to use.
            IDISP     The method to be used in estimating disparities.
            EPS       Convergence tolerance.