MSIDV
Performs individual‑differences multidimensional scaling for metric data using alternating least squares.
Required Arguments
NSTIM — Number of stimuli in each similarity/dissimilarity matrix. (Input)
X — NSUB similarity or dissimilarity matrices in symmetric storage mode. (Input)
Each matrix must occupy consecutive memory positions, and must be stored as a column in X. X must be dimensioned as
DIMENSION X(NC2,NSUB)
where NC2 = NSTIM * (NSTIM ‑ 1)/2. Each matrix is stored without the diagonal elements by column as upper triangular matrices. For example, a 3 by 3 matrix would be stored with the (1, 2), (1, 3), (2, 3) elements as the first three elements of the first column of X.
NDIM — Number of dimensions desired in the solution. (Input)
DIST — Vector of length NSUB * NC2, where NC2 = NSTIM * (NSTIM ‑ 1)/2, containing the predicted distances. (Output)
DIST contains the distances as predicted by the estimated parameters in the model. DIST has the same storage mode as X and may be treated as a series of NSUB matrices in symmetric storage mode but without the diagonal elements.
CFL — Matrix of size NSTIM by NDIM containing the configuration of points obtained from the multidimensional scaling. (Output)
A — Vector of length NSUB containing the intercepts for each subject. (Output)
B — Vector of length NSUB containing the slopes for each subject. (Output)
WT — Vector of length NSUB containing the criterion function weights for each subject. (Output)
STRS — Vector of length NSUB containing the value of the weighted optimized criterion within each subject. (Output)
STRSS — Value of the weighted optimized criterion function (summed over subjects). (Output)
RESID — NSUB * NC2 vector containing the observation residuals. (Output)
Here, NC2 = NSTIM (NSTIM ‑ 1)/2.
Optional Arguments
NSUB — Number of matrices to be used in the analysis. (Input)
Default: NSUB = size (X,2).
ICNVT — Option for converting from similarity to dissimilarity data. (Input)
If ICNVT = 0, the input data contains dissimilarities and no conversion is performed. If ICNVT = 1, the data are converted from similarity to dissimilarity data by subtracting each similarity from the largest similarity for the subject. If ICNVT = 2, the data are converted to dissimilarities by reciprocating each similarity.
Default: ICNVT = 0.
MODEL — Model option parameter. (Input)
MODEL = 0 means the Euclidean model is used, otherwise, the individual differences model is used.
Default: MODEL = 0.
ISTRS — Option giving the stress formula to be used. (Input)
Default: ISTRS = 0.
Stress formulas differ in the weighting given to each subject. The valid values of ISTRS are:
ISTRS | Weighting |
---|
0 | Inverse of within‑subject variance of observed dissimilarities about the predicted distances |
1 | Inverse of within‑subject sum of squared dissimilarities |
2 | Inverse of within‑subject variance of dissimilarities about the subject mean |
See the
Description section for further discussion of the stress formula weights.
ITRANS — Option giving the transformation to be used on the observed and predicted dissimilarities when computing the criterion function. (Input)
Default: ITRANS = 0.
ITRANS | Transformation |
---|
0 | Squared distances |
1 | Distances (that is, no transformation is performed) |
2 | Log of the distances |
See the
Description section for further discussion of stress formula transformations.
IPRINT — Printing option. (Input)
Default: IPRINT = 0.
IPRINT | Action |
---|
0 | No printing is performed. |
1 | Printing is performed, but the output is abbreviated. |
2 | All printing is performed. |
LDCFL — Leading dimension of CFL exactly as specified in the dimension statement in the calling program. (Input)
Default: LDCFL = size (CFL,1).
W — NSUB by NDIM matrix containing the subject weights. (Output when MODEL is not zero, not referenced otherwise)
W is not used and may be a 1x1 array if MODEL = 0.
LDW — Leading dimension of W exactly as specified in the dimension statement in the calling program. (Input)
Default: LDW = size (W,1).
FORTRAN 90 Interface
Generic: CALL MSIDV (NSTIM, X, NDIM, DIST, CFL, A, B, WT, STRS, STRSS,
RESID [, …])
Specific: The specific interface names are S_MSIDV and D_MSIDV.
FORTRAN 77 Interface
Single: CALL MSIDV (NSTIM, NSUB, X, ICNVT, MODEL, ISTRS, ITRANS, NDIM, IPRINT, DIST, CFL, LDCFL, W, LDW, A, B, WT, STRS, STRSS, RESID)
Double: The double precision name is DMSIDV.
Description
Routine MSIDV performs multidimensional scaling analysis according to an alternating optimization algorithm. Input to MSIDV consists of symmetric dissimilarity matrices measuring distances between the row and column objects. Optionally, similarities can be input, and these can be converted to dissimilarities by use of the ICNVT option. In MSIDV, the row and column objects (stimuli) must be identical. Dissimilarities in multidimensional scaling are used to position the objects within a d = NDIM dimensional space, where d is specified by the user. Optionally, in the individual differences scaling model (MODEL ≠ 0), the weight assigned to each dimension for each subject may be changed.
The Input Data
The data input in X must be in a special symmetric storage form. For this storage mode, the input array X contains only the upper triangular part of each dissimilarity matrix and does not contain the diagonal elements (which should all be zero anyway). Storage of symmetric data in X is as follows: X(1) corresponds to the (1, 2) element in the first matrix (which is a measure of the distance between objects 1 and 2), X(2) corresponds to the (1, 3) element, X(3) corresponds to the (2, 3) element, etc., until all t = q(q ‑ 1)/2 off‑diagonal elements in the first matrix are stored, where q = NSTIM. The t + 1 element in X contains the (1, 2) element in the second matrix, and so on.
Missing values are indicated in either of two ways: 1) The standard missing value indicator NaN (not a number), specified via routine
AMACH(6) (
Reference Material) may be used to indicate missing values, or 2) Negative elements of
X may be used to indicate missing observations. In either case, missing values are estimated as the mean dissimilarity for the subject and used as such when computing initial estimates, and they are omitted from the criterion function when optimal estimates are computed.
Routine MSIDV assumes a metric scaling model. When no transformation is specified (ITRANS = 1), then each datum (after transforming to dissimilarities) is a measure of distance plus a constant, αm. In this case, the constant (which is always called the “intercept”) is assumed to vary with subject and must first be added to the observed dissimilarities in order to obtain a metric. When a transformation is specified (ITRANS ≠ 1), the meaning of αm changes (with respect to metrics). Thus, when ITRANS = 1, the data is assumed to be interval (see the chapter introduction) while when ITRANS ≠ 1 ratio data is assumed. A scaling factor, the “slope”, is also always estimated for each subject.
The Criterion Function
When ISTRS = 1 or 2, the criterion function in MSIDV is given as
where δijm denotes the predicted distance between objects i and j on subject m,
denotes the corresponding dissimilarity (the observed distance), νm is the subject weight, f is one of the transformations f(x) = x2, f(x) = x, or f(x) = ln(x) specified by parameter ITRANS, αm is the intercept added to the transformed observation within each subject, and βm is the slope for the subject. For ISTRS = 0, the criterion function is given as
where nm is the number of nonmissing observations on the m‑th subject. Assuming fixed weights, the first derivatives of the criterion for ISTRS = 0 are identical to the first derivatives of the criterion when ISTRS = 1 or 2, but with weights
ISTRS can, thus, be thought of as changing the weighting to be used in the criterion function.
The transformation f(x) specified by parameter ITRANS is used to obtain constant within‑subject variance of the subject dissimilarities. If the variance of the log of the observed dissimilarities (about the predicted dissimilarities) is constant within subject, then the log transformation should be used. In this case, the variance of a dissimilarity should be proportional to its magnitude. Alternatively, the within‑subject variance may be constant when distances (or squared distances) are used.
The Distance Models
The distance models for δijm available in MSIDV are given by:
The Euclidian Model
The individual-differences model:
where Λ denotes the configuration (CFL) so that λik is the location of the i‑th stimulus in the k‑th dimension, where d is the number of dimensions, and where wmk is the weight assigned by the m‑th subject to the k‑th dimension (W).
The Subject Weights
Weights that are inversely proportional to the estimated variance of the dissimilarities (about their predicted values) within each subject may be preferred because such weights lead to normal distribution theory maximum likelihood estimates (when it is assumed that the dissimilarities are independently normally distributed with constant residual variance). The estimated (conditional) variance used as the inverse of the weight νm for the m‑th subject in MSIDV (when ISTRS = 0) is computed as
where the sum is over the observations for the subject, and where nm is the number of observed nonmissing dissimilarities for the subject. These weights are used in the first derivatives of the criterion function.
When ISTRS = 1, the within‑subject average sum of squared dissimilarities are used for the weights. They are computed as
Finally, when ISTRS = 2, the within‑subject variance of the dissimilarities is used for the weights. These are computed as follows
where
denotes the average of the transformed dissimilarities in the stratum.
The Optimization Procedure
Initial estimates of all parameters are obtained through methods discussed in routine
MSINI. After obtaining initial estimates, a modified Gauss‑Newton algorithm is used to obtain estimates for the parameters that optimize the criterion function. The parameters are optimized sequentially as follows:
1. Optimize the configuration estimates, Λ = CFL.
2. If required, estimate the optimal subject weights, wmk = W(m, k), one subject at a time.
3. Optimize the parameters αm = A(m) and βm = B(m), one subject at a time.
4. If convergence has not been reached, continue at Step 1.
An iteration is defined to be all of the Steps 1, 2, and 3. Convergence is assumed when the maximum absolute change in any parameter during an iteration is less than 10−4 or if there is no change in the criterion function during an iteration.
The Lp Gauss-Newton Description
A modified Gauss‑Newton algorithm is used in the estimation of all parameters. This algorithm, which is discussed in detail by Merle and Spath (1974), uses iteratively reweighted least squares on a Taylor series linearization of the parameters in δijm. During each iteration, the subject weights, which may depend upon the parameters in the model, are assumed to be fixed.
Standardization
All models available are overparameterized so the resulting parameter estimates are not uniquely defined. For example, in the Euclidean model, the columns of X can be translated or “rotated” (multiplied by an orthonormal matrix), and the resulting stress will not be changed. To eliminate lack of uniqueness due to translation, model estimates for the configuration are centered in all models. No attempt at eliminating the rotation problem is made, but note that rotation invariance is not a problem in many of the models given. With more general models than the Euclidean model, other kinds of overparameterization occur. Further restrictions on the parameters to eliminate this overparameterization are given below by the model transformation type specified by ITRANS. In the following, wlk ∈ W, where W is the matrix of subject weights. The restrictions to be applied by model transformation type are
1. For all models:
where q = NSTIM. i.e., center the columns of X.
(b) If W is in the model, scale the columns of W so that
2. For f(x) = x and f(x) = x2:
(a) Set bh = 1 if the data are matrix conditional and W is in the model or if the data are unconditional. (Matrix conditional with one matrix is considered to be unconditional data.)
(b) If W is not in the model, scale all elements in X so that
where η = NSUB is the number of matrices observed.
3. For f(x) = ln(x), substitute ah for bh (but set ah to 0 instead of 1) in all restrictions in Item 2.
Comments
1. Workspace may be explicitly provided, if desired, by use of M2IDV/DM2IDV. The reference is:
CALL M2IDV (NSTIM, NSUB, X, ICNVT, MODEL, ISTRS, ITRANS, NDIM, IPRINT, DIST, CFL, LDCFL, W, LDW, A, B, WT, STRS, STRSS, RESID, WK1, WK2, WK3, WK4, WK5, WK6, WK7, IWK8, WK10, WK11, WK12, WK13, ID, WKDER, DWKHES, DWKGRA, WKDDP, NCOM, DISP)
The additional arguments are as follows:
WK1 — Work vector of length equal to max(NSUB, NDIM * NSTIM, ND + 1)
WK2 — Work vector of length equal to NDIM * NDIM
WK3 — Work vector of length equal to NSTIM * NSTIM
WK4 — Work vector of length equal to NSTIM * NSTIM
WK5 — Work vector of length equal to NDSS * NDSS
WK6 — Work vector of length equal to 3 * NDSS
WK7 — Work vector of length equal to 5 * NDSS
IWK8 — Integer work vector of length equal to NDSS
WK10 — Work vector of length equal to NDIM * NDIM
WK11 — Work vector of length equal to NSUB * NSUB
WK12 — Work vector of length equal to NDIM * NDIM * max(NSUB, NSTIM)
WK13 — Work vector of length equal to NSTIM * NDIM
ID — Integer work vector of length equal to 4 * NDIM + 2
WKDER — Work vector of length equal to NPAR
DWKHES — Double precision work vector of length equal to NDIM * NDIM * NSTIM * NSTIM
DWKGRA — Double precision work vector of length equal to NPAR
WKDDP — Work vector of length equal to NC2
NCOM — Work vector of length equal to NSUB
DISP — Work vector of length equal to NSUB * NC2
where ND = NDIM * (NDIM + 1)/2, NC2 = NSTIM * (NSTIM ‑ 1)/2, NDSS = max(NDIM, NSTIM, NSUB), and where NPAR = NDIM * NSTIM + 2 * NSUB when MODEL = 0; otherwise NPAR = NDIM * NSTIM + (NDIM + 2) * NSUB.
2. Informational errors
Type | Code | Description |
---|
3 | 1 | At some point during the iterations there were too many step halvings. This is usually not serious. |
4 | 1 | The program cannot continue because a Hessian matrix is ill defined. A different model may be required. This error should only occur when there is serious numerical imprecision. |
4 | 2 | A dissimilarity matrix has every element missing. |
Examples
Example 1
The following example concerns some intercity distance rankings. The data are described by Young and Lewyckyj (1979, page 83). The driving mileages between various cities in the United States are ranked, yielding a symmetric ordinal dissimilarity matrix. These rankings are used as input to MSIDV. A Euclidean model is fit. The resulting two‑dimensional scaling yields results closely resembling the locations of the major cites in the U.S. Note that MSIDV assumes continuous, not ranked, data.
The original rankings are given as:
USE PGOPT_INT
USE MSIDV_INT
IMPLICIT NONE
INTEGER IPRINT, ISTRS, LDCFL, LDW, LNX, NDIM, NSTIM, NSUB, IPAGE
PARAMETER (IPRINT=2, ISTRS=1, LDCFL=10, LNX=45, NDIM=2, &
NSTIM=10, NSUB=1)
!
REAL A(1), B(1), CFL(LDCFL,NDIM), DIST(45), RESID(LNX), &
STRS(1), STRSS, WT(1), X(45,1)
!
DATA X/4, 22, 13, 8, 15, 12, 34, 31, 11, 24, 6, 21, 29, 18, 39, &
10, 9, 27, 25, 42, 20, 35, 32, 16, 28, 2, 44, 43, 36, 30, &
19, 33, 17, 45, 40, 7, 3, 5, 26, 23, 37, 14, 1, 41, 38/
! Call PGOPT to set page length for
! the plotting
IPAGE =50
CALL PGOPT (-2, IPAGE)
!
CALL MSIDV (NSTIM, X, NDIM, DIST, CFL, A, B, WT, STRS, &
STRSS, RESID, ISTRS=ISTRS, IPRINT=IPRINT)
!
END
Output
Initial parameter estimates.
CFL
1 2
1 -0.762 0.124
2 -0.451 -0.349
3 0.496 0.073
4 -0.151 0.651
5 1.237 0.392
6 -1.114 0.588
7 -1.077 -0.566
8 1.461 0.034
9 1.321 -0.614
10 -0.961 -0.333
Iteration history
Iter Source Stress Stress change Maximum Change
1 INIT STRSS 0.3755E-02
1 CONFIG STRSS 0.3399E-02 0.3559E-03 0.8062E-03
1 LINES STRSS 0.3142E-02 0.2564E-03 0.8062E-03
2 CONFIG STRSS 0.3068E-02 0.7382E-04 0.1022E-04
2 LINES STRSS 0.3047E-02 0.2156E-04 0.1022E-04
Plot(s) of the configuration matrix (CFL)
:::::::+::::::::::::X::+:::::::::::::::+:::::::::::::::+
. I .
0.600 X I +
. I .
. I .
. I .
0.450 + I +
. I X .
. I .
. I .
0.300 + I +
. I .
. I .
D . I .
i 0.150 + I +
m . X I .
e . I X .
n . I X .
s 0.000 +------------------------------------------------------+
i . I .
o . I .
n . I .
-0.150 + I +
2 . I .
. I .
. I .
-0.300 + I +
. X X I .
. I .
. I .
-0.450 + I +
. I .
. I .
. X I .
-0.600 + I +
:::::::+:::::::::::::::+:::::::::::::::+::::::::::X::::+
-0.80 0.00 0.80 1.60
Dimension 1
Final parameter estimates.
NCOM
45
CFL
1 2
1 -0.738 0.095
2 -0.447 -0.337
3 0.497 0.077
4 -0.153 0.661
5 1.237 0.399
6 -1.132 0.609
7 -1.074 -0.571
8 1.445 0.035
9 1.325 -0.624
10 -0.960 -0.343
A
-0.04255
B
0.4019
WT
0.01248
STRS
0.003047
STRESS = 3.04681E-03
Residuals
Subject Row Stimulus Column Stimulus Residual
1 2 1 -0.0436
1 3 1 0.1230
1 3 2 -0.1422
1 4 1 -0.1318
1 4 2 -.0697
1 4 3 -0.0581
1 5 1 0.0950
1 5 2 0.0631
1 5 3 -0.0456
1 5 4 0.0639
1 6 1 -0.0742
1 6 2 0.1268
1 6 3 0.0681
1 6 4 0.1212
1 6 5 -0.0495
1 7 1 -0.0376
1 7 2 -0.0216
1 7 3 -0.0736
1 7 4 -0.0119
1 7 5 0.0464
1 7 6 0.0558
1 8 1 -0.1177
1 8 2 0.0169
1 8 3 0.0480
1 8 4 -0.0173
1 8 5 -0.0223
1 8 6 0.0178
1 8 7 -0.0047
1 9 1 -0.0185
1 9 2 0.0373
1 9 3 0.0872
1 9 4 0.0618
1 9 5 0.0335
1 9 6 -0.0913
1 9 7 0.0202
1 9 8 -0.0671
1 10 1 -0.0415
1 10 2 -0.0276
1 10 3 0.0869
1 10 4 0.1342
1 10 5 -0.1565
1 10 6 -0.0522
1 10 7 0.0179
1 10 8 0.0701
1 10 9 -0.0191
Residual Plot
0.160 :::::::+::::::::::::::+::::::::::::::+::::::::::::::+::::
. .
. .
. X .
. X .
0.120 + X X +
. .
. .
. X .
. X X .
0.080 + +
. X X .
. X X X .
. X .
. X X .
0.040 + X +
R . X .
e . X .
s X X X .
i . .
d 0.000 +-------------------------------------------------------+
u . X X .
a . X X X .
l .X X .
s . .
-0.040 +XX X +
. X X .
. X X .
. X .
. X X X .
-0.080 + +
. X
. .
. .
. .
-0.120 + X +
. X .
. .
. X .
. .
-0.160 :::::::+::::::::::::::+::::::::::::::+::X:::::::::::+::::
1 3 5 7
Predicted Distances
Example 2
The second example involves three subjects’ assessment of the dissimilarity between rectangles that vary in height and width. An analysis is performed in k = 2 dimensions using the individual‑differences scaling model. The estimated subject weights, wmk, indicate how each subject weight the dimensions. The raw data are given as follows:
USE MSIDV_INT
IMPLICIT NONE
INTEGER ICNVT, IPRINT, ISTRS, ITRANS, LDCFL, LDW, LNX, MODEL, &
NDIM, NSTIM, NSUB
PARAMETER (IPRINT=1, LDCFL=9, LDW=3, LNX=108, MODEL=1, NDIM=2, &
NSTIM=9, NSUB=3)
!
REAL A(NSUB), B(NSUB), CFL(LDCFL,NDIM), DIST(LNX), &
RESID(LNX), STRS(NSUB), STRSS, W(LDW,NDIM), WT(NSUB), &
X(36,NSUB)
!
DATA X/1.00, 1.41, 1.00, 2.24, 2.00, 1.00, 2.00, 2.24, 1.41, &
1.00, 2.24, 2.83, 2.24, 2.00, 1.00, 1.41, 2.24, 2.00, 2.24, &
1.41, 1.00, 1.00, 2.00, 2.24, 2.83, 2.24, 2.00, 1.00, 1.00, &
1.41, 1.00, 1.41, 1.00, 1.41, 1.00, 1.41, 1.50, 1.68, 0.75, &
2.12, 1.50, 0.75, 1.50, 2.12, 1.68, 1.50, 2.12, 3.35, 3.09, &
3.00, 1.50, 1.68, 3.09, 3.00, 3.09, 1.68, 0.75, 1.50, 3.00, &
3.09, 3.35, 2.12, 1.50, 0.75, 0.75, 1.68, 1.50, 1.68, 0.75, &
1.68, 1.50, 1.68, 0.50, 2.06, 2.00, 4.03, 4.00, 2.00, 4.00, &
4.03, 2.06, 0.50, 4.03, 4.12, 2.24, 1.00, 0.50, 2.06, 2.24, &
1.00, 2.24, 2.06, 2.00, 0.50, 1.00, 2.24, 4.12, 4.03, 4.00, &
2.00, 2.00, 2.06, 0.50, 2.06, 2.00, 2.06, 0.50, 2.06/
!
CALL MSIDV (NSTIM, X, NDIM, DIST, CFL, A, B, WT, STRS, &
STRSS, RESID, MODEL=MODEL, IPRINT=IPRINT, W=W)
!
END
Output
Iteration history
Iter Source Stress Stress change Maximum Change
1 INIT STRSS -0.3590E+03
1 CONFIG STRSS -0.3590E+03 0.0000E+00 0.5708E-03
1 SUB WT STRSS -0.3590E+03 0.0000E+00 0.1581E-02
1 LINES STRSS -0.3590E+03 0.0000E+00 0.2727E-02
2 CONFIG STRSS -0.3590E+03 0.0000E+00 0.1442E-06
2 SUB WT STRSS -0.3590E+03 0.0000E+00 0.7165E-04
2 LINES STRSS -0.3590E+03 0.0000E+00 0.7165E-04
Final parameter estimates.
NCOM
1 2 3
36 36 36
CFL
1 2
1 1.225 0.000
2 1.225 -1.225
3 0.000 -1.225
4 -1.225 -1.225
5 -1.225 0.000
6 -1.225 1.225
7 0.000 1.225
8 1.225 1.225
9 0.000 0.000
W
1 2
1 1.000 1.000
2 0.342 1.372
3 1.411 0.089
A
1 2 3
-0.002773 0.001941 0.000055
B
1 2 3
0.2229 0.2587 0.2963
WT
1 2 3
1000.0 1000.0 1000.0
STRS
1 2 3
-119.7 -119.7 -119.7
STRESS = -359.018