Title: | A General Framework for Multivariate Analysis with Optimal Scaling |
---|---|
Description: | Contains various functions for optimal scaling. One function performs optimal scaling by maximizing an aspect (i.e. a target function such as the sum of eigenvalues, sum of squared correlations, squared multiple correlations, etc.) of the corresponding correlation matrix. Another function implements the LINEALS approach for optimal scaling by minimization of an aspect based on pairwise correlations and correlation ratios. The resulting correlation matrix and category scores can be used for further multivariate methods such as structural equation models. |
Authors: | Patrick Mair [cre, aut], Jan De Leeuw [aut] |
Maintainer: | Patrick Mair <[email protected]> |
License: | GPL-2 |
Version: | 1.0-6 |
Built: | 2024-12-11 19:24:14 UTC |
Source: | https://github.com/r-forge/psychor |
This function performs optimal scaling by maximizing a certain aspect of the correlation matrix.
corAspect(data, aspect = "aspectSum", level = "nominal", itmax = 100, eps = 1e-06, ...)
data |
Data frame or matrix |
aspect |
Function on the correlation matrix (see details) |
level |
Vector with scale level of the variables ("nominal" or "ordinal"). If all variables have the same scale level, a single value suffices |
itmax |
Maximum number of iterations |
eps |
Convergence criterion |
... |
Additional parameters for aspect |
We provide various pre-specified aspects:
"aspectAbs"
takes the sum of the absolute values of the correlations to the power pow. The optional argument pow defaults to 1.
"aspectSum"
the sum of the correlations to the power of pow. Again, the default is pow = 1.
"aspectDeterminant"
computes the determinant of the correlation matrix; no additional arguments needed.
"aspectEigen"
the sum of the first p eigenvalues (principal component analysis). By default the argument p = 1.
"aspectSMC"
the squared multiple correlation (multiple regression) with respect to a target variable. By default targvar = 1, which implies that the first variable of the dataset is taken as response.
"aspectSumSMC"
uses the sum of all squared multiple correlations (path analysis).
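To make the pre-specified aspects concrete, the following base-R sketch evaluates each of them on a small, made-up correlation matrix. It does not call the package; the variable names are illustrative, and summing over all matrix entries (including the diagonal) is an assumption, since the text above does not say which entries the package uses.

```r
# Toy 3 x 3 correlation matrix (made-up values, not package data)
r <- matrix(c(1.0, 0.5, 0.3,
              0.5, 1.0, 0.4,
              0.3, 0.4, 1.0), nrow = 3, byrow = TRUE)

pow <- 2       # exponent for the sum-type aspects
p <- 1         # number of leading eigenvalues
targvar <- 1   # index of the response variable for the SMC

asp_sum <- sum(r^pow)        # sum of correlations to the power pow
asp_abs <- sum(abs(r)^pow)   # sum of absolute correlations to the power pow
asp_det <- det(r)            # determinant of the correlation matrix
asp_eig <- sum(eigen(r, symmetric = TRUE)$values[seq_len(p)])  # first p eigenvalues

# Squared multiple correlation of each variable on all others:
# 1 - 1 / (corresponding diagonal element of the inverse correlation matrix)
smc <- 1 - 1 / diag(solve(r))
asp_smc <- smc[targvar]      # SMC for the chosen target variable
asp_sumsmc <- sum(smc)       # sum of all SMCs
```

Each quantity is a scalar function of the correlation matrix, which is what makes it usable as a target for optimal scaling.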
Alternatively, users can write their own aspect, e.g. a function myAspect(r, ...) with r as the correlation matrix. This function must return a list with the function value as the first element and the first derivative with respect to r as the second. Then set aspect = myAspect; additional arguments are passed through ... in corAspect().
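As an illustration of the required return structure, here is a minimal sketch of a user-defined aspect for the sum of squared correlations. The list element names f and g are choices made for this example; the text only fixes the order (value first, derivative second). A finite-difference check is included to confirm that the derivative matches the function value.

```r
# Hypothetical user-defined aspect: sum of squared correlations
myAspect <- function(r, ...) {
  f <- sum(r^2)        # function value
  g <- 2 * r           # elementwise derivative of f with respect to r
  list(f = f, g = g)   # value first, derivative second
}

# Made-up 2 x 2 correlation matrix for a quick check
r <- matrix(c(1.0, 0.5,
              0.5, 1.0), nrow = 2)
out <- myAspect(r)

# Forward finite difference for the (1, 2) entry
h <- 1e-6
r_h <- r
r_h[1, 2] <- r_h[1, 2] + h
num_grad <- (myAspect(r_h)$f - out$f) / h

# With the package loaded, the aspect would then be passed as described above:
# res <- corAspect(data, aspect = myAspect)
```

Checking the derivative numerically before handing the function to the optimizer is a cheap way to catch sign or scaling mistakes.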
loss |
Final value of the loss function |
catscores |
Resulting category scores (after optimal scaling) |
cormat |
Correlation matrix based on the scores |
eigencor |
Eigenvalues of the correlation matrix |
indmat |
Indicator matrix (dummy coded) |
scoremat |
Transformed data matrix (i.e., with category scores resulting from optimal scaling) |
burtmat |
Burt matrix |
niter |
Number of iterations |
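For readers unfamiliar with the indmat and burtmat components, the sketch below builds a dummy-coded indicator matrix and the corresponding Burt matrix for a tiny made-up data frame in base R. It uses the standard definition of the Burt matrix as the cross-product of the indicator matrix; the helper name indicator and the exact layout the package returns are assumptions.

```r
# Toy data with two categorical variables (made-up values)
dat <- data.frame(a = factor(c("x", "y", "x", "z")),
                  b = factor(c("lo", "hi", "hi", "lo")))

# Dummy-code one factor: one 0/1 column per category
indicator <- function(f) {
  m <- outer(as.integer(f), seq_len(nlevels(f)), "==") * 1
  colnames(m) <- levels(f)
  m
}

# Bind the per-variable blocks into one indicator matrix
G <- do.call(cbind, lapply(dat, indicator))

# Burt matrix: cross-product of the indicator matrix; its diagonal holds
# the category frequencies, its off-diagonal blocks the two-way tables
B <- crossprod(G)
```

Every row of G contains exactly one 1 per variable, so the row sums equal the number of variables.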
Jan de Leeuw, Patrick Mair
Mair, P., & De Leeuw, J. (2010). Scaling variables by optimizing correlational and non-correlational aspects in R. Journal of Statistical Software, 32(9), 1-23. doi:10.18637/jss.v032.i09
de Leeuw, J. (1988). Multivariate analysis with optimal scaling. In S. Das Gupta and J.K. Ghosh, Proceedings of the International Conference on Advances in Multivariate Statistical Analysis, pp. 127-160. Calcutta: Indian Statistical Institute.
## maximizes the first eigenvalue
data(galo)
res.eig1 <- corAspect(galo[,1:4], aspect = "aspectEigen")
res.eig1
summary(res.eig1)

## maximizes the first 2 eigenvalues
res.eig2 <- corAspect(galo[,1:4], aspect = "aspectEigen", p = 2)
res.eig2

## maximizes the absolute value of cubic correlations
res.abs3 <- corAspect(galo[,1:4], aspect = "aspectAbs", pow = 3)
res.abs3

## maximizes the sum of squared correlations
res.cor2 <- corAspect(galo[,1:4], aspect = "aspectSum", pow = 2)
res.cor2

## maximizes the determinant
res.det <- corAspect(galo[,1:4], aspect = "aspectDeterminant")
res.det

## maximizes SMC, IQ as target variable
res.smc <- corAspect(galo[,1:4], aspect = "aspectSMC", targvar = 2)
res.smc

## maximizes the sum of SMC
res.sumsmc <- corAspect(galo[,1:4], aspect = "aspectSumSMC")
res.sumsmc
At 4 points in time the objects (n = 1204 adolescents) were asked to rate cigarette, marijuana, and alcohol consumption on a 5-point scale.
duncan
Data frame with marijuana (POT), cigarette (CIG), and alcohol (ALC) consumption.
Category labels:
1 ... never consumed
2 ... previous but no use over the last 6 months
3 ... current use of less than 4 times a month
4 ... current use of between 4 and 29 times a month
5 ... current use of 30 or more times a month
Duncan, S. C., Duncan, T. E., and Hops, H. (1998). Progressions of alcohol, cigarette, and marijuana use in adolescence. Journal of Behavioral Medicine, 21, 375-388.
data(duncan)
duncan
The objects (individuals) are 1290 school children in the sixth grade of elementary school in the city of Groningen (Netherlands) in 1959.
galo
Data frame with the five variables Gender, IQ, Advice, SES (father's occupation) and School. IQ (original range 60 to 144) has been categorized into 9 ordered categories and the schools are enumerated from 1 to 37.
SES:
LoWC = Lower white collar; MidWC = Middle white collar; Prof = Professional, Managers; Shop = Shopkeepers; Skil = Schooled labor; Unsk = Unskilled labor.
Advice:
Agr = Agricultural; Ext = Extended primary education; Gen = General; Grls = Secondary school for girls; Man = Manual, including housekeeping; None = No further education; Uni = Pre-University.
Peschar, J.L. (1975). School, Milieu, Beroep. Groningen: Tjeenk Willink.
data(galo)
galo
This function performs optimal scaling in order to achieve linearizing transformations for each bivariate regression.
lineals(data, level = "nominal", itmax = 100, eps = 1e-06)
data |
Data frame or matrix |
level |
Vector with scale level of the variables ("nominal" or "ordinal"). If all variables have the same scale level, a single value suffices |
itmax |
Maximum number of iterations |
eps |
Convergence criterion |
This function can be used as a preprocessing tool for categorical and ordinal data for subsequent factor analytical techniques such as structural equation models (SEM) using the resulting correlation matrix based on the transformed data. The estimates of the corresponding structural parameters are consistent if all bivariate regressions can be linearized.
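The notion of a linearizable bivariate regression can be illustrated in base R. For two scored variables, the correlation ratio is never smaller than the squared correlation, and the two coincide exactly when the conditional means of one variable lie on a straight line in the scores of the other. The sketch below uses made-up scores and a hypothetical helper eta2; it illustrates the relationship between correlations and correlation ratios only and is not the package's loss computation.

```r
# Correlation ratio (eta squared) of numeric scores y given grouping g:
# between-group sum of squares divided by total sum of squares
eta2 <- function(y, g) {
  cond_mean <- ave(y, g)   # conditional mean of y within each category of g
  sum((cond_mean - mean(y))^2) / sum((y - mean(y))^2)
}

# Made-up scores; x takes three distinct values
x <- c(1, 1, 2, 2, 3, 3)
y_lin <- c(1.0, 2.0, 2.5, 3.5, 4.0, 5.0)  # conditional means 1.5, 3.0, 4.5 (linear in x)
y_non <- c(1.0, 2.0, 4.5, 5.5, 2.0, 3.0)  # conditional means 1.5, 5.0, 2.5 (not linear)

# Gap between correlation ratio and squared correlation
gap_lin <- eta2(y_lin, x) - cor(x, y_lin)^2  # essentially zero: regression is linear
gap_non <- eta2(y_non, x) - cor(x, y_non)^2  # clearly positive: regression is nonlinear
```

A scaling that drives such gaps toward zero for all pairs of variables is one for which the correlation matrix summarizes the bivariate relationships without loss.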
loss |
Final value of the loss function |
catscores |
Resulting category scores (after optimal scaling) |
cormat |
Correlation matrix based on the scores |
cor.rat |
Matrix with correlation ratios |
indmat |
Indicator matrix (dummy coded) |
scoremat |
Transformed data matrix (i.e., with category scores resulting from optimal scaling) |
burtmat |
Burt matrix |
niter |
Number of iterations |
Jan de Leeuw, Patrick Mair
Mair, P., & De Leeuw, J. (2010). Scaling variables by optimizing correlational and non-correlational aspects in R. Journal of Statistical Software, 32(9), 1-23. doi:10.18637/jss.v032.i09
de Leeuw, J. (1988). Multivariate analysis with linearizable regressions. Psychometrika, 53, 437-454.
data(galo)
res.lin <- lineals(galo)
summary(res.lin)
This method provides regression plots and transformation plots for objects of class "aspect", i.e. solutions of corAspect and lineals.
## S3 method for class 'aspect'
plot(x, plot.type, plot.var = c(1,2), xlab, ylab, main, type, ...)
x |
Object of class "aspect" |
plot.type |
Type of plot to be produced (details see below): "regplot" or "transplot" |
plot.var |
For |
xlab |
Label x-axis. |
ylab |
Label y-axis. |
main |
Plot title. |
type |
Whether points, lines or both should be plotted. |
... |
Additional graphical parameters. |
The regression plot ("regplot") provides two plots. First, the unscaled solution is plotted: a frequency grid for the categories of the first variable (var1; x-axis) and the categories of the second variable (var2; y-axis) is produced. The regression lines are based on the category-weighted means of the relative frequencies: the blue line is based on the var1 means on the x-axis and the var2 categories on the y-axis, and the red line is based on the var1 categories on the x-axis and the var2 means on the y-axis. In a second device the scaled solution is plotted. The frequency grid is determined by the var1 scores (x-axis) and the var2 scores (y-axis). Now, instead of the var1/var2 categories, the var1 scores (blue line y-axis) and the row scores (red line x-axis) are used.
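The quantities behind the unscaled regression plot can be sketched in base R: a frequency grid of the two variables plus the conditional means that define the two lines. The data and variable names below are made up for illustration, and the plot method itself may compute these values differently.

```r
# Made-up integer category codes for two variables
var1 <- c(1, 1, 2, 2, 2, 3, 3, 3)
var2 <- c(1, 2, 1, 2, 3, 2, 3, 3)

# Frequency grid: var1 categories in rows, var2 categories in columns
freq <- table(var1, var2)

# Red line: var1 categories on the x-axis, mean of var2 within each on the y-axis
mean_var2 <- tapply(var2, var1, mean)

# Blue line: mean of var1 within each var2 category on the x-axis,
# var2 categories on the y-axis
mean_var1 <- tapply(var1, var2, mean)
```

After optimal scaling, the same construction applies with category scores in place of the raw category codes.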
The transformation plot ("transplot") plots the raw categories against the computed scores.
## Regression plots using galo data
data(galo)
res <- lineals(galo[,1:4])
#plot(res, plot.type = "regplot", plot.var = c("advice","SES"))
#plot(res, plot.type = "transplot")
The dataset is about the use of public Internet terminals. For this package we extracted a subset of 8 items.
wurzer
A data frame (n = 215) with the following items:
Do you know at least one place where you can find such a terminal? (yes/no)
Have you already used such a terminal? (yes/no)
How often do you use the Internet on each of the following locations: home, work, cafe, terminal, cellphone? (5-point scales; see below)
Which of the following descriptions fits you best? (I'm here on vacation/I am from here/I'm here on business travel)
The 5-point items have the following categories: daily (1), almost daily (2), several times a week (3), several times a month (3), once a month (4), less frequently (5).
Wurzer, M. (2006). An Application of Configural Frequency Analysis: Evaluation of the Usage of Internet Terminals. Master's thesis, University of Vienna, Austria.
data(wurzer)
wurzer