Title: | Cluster Optimized Proximity Scaling |
---|---|
Description: | Multidimensional scaling (MDS) methods that aim at pronouncing the clustered appearance of the configuration (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>). They achieve this by transforming proximities/distances with explicit power functions and penalizing the fitting criterion with a clusteredness index, the OPTICS Cordillera (Rusch, Hornik & Mair, 2018, <doi:10.1080/10618600.2017.1349664>). There are two variants: One for finding the configuration directly (COPS-C) for any Minkowski distance with given explicit power transformations and implicit ratio, interval and nonmetric optimal scaling transformations (Borg & Groenen, 2005, ISBN:978-0-387-28981-6), and one for using the augmented fitting criterion to find optimal hyperparameters for the explicit transformations (P-COPS). The package contains various functions, wrappers, methods and classes for fitting, plotting and displaying a large number of different MDS models (most of the functionality in smacofx) in the COPS framework. The package further contains a function for pattern search optimization, the ``Adaptive Luus-Jaakola Algorithm'' (Rusch, Mair & Hornik, 2021, <doi:10.1080/10618600.2020.1869027>) and a function to calculate the phi-distances for count data or histograms. |
Authors: | Thomas Rusch [aut, cre] , Patrick Mair [aut] , Kurt Hornik [ctb] |
Maintainer: | Thomas Rusch <[email protected]> |
License: | GPL-2 | GPL-3 |
Version: | 1.14-1 |
Built: | 2025-01-07 13:10:10 UTC |
Source: | https://github.com/r-forge/stops |
Matrix of Jaccard distances between 70 countries (Hungary and Greece were combined to be the same observation) based on their binary time series of having had a banking crisis in a year from 1800 to 2010 or not. See data(bankingCrises) in package Ecdat for more info. The last column is Reinhart & Rogoff's classification as a low- (3), middle- (2) or high-income country (1).
A 69 x 70 matrix.
data(bankingCrises) in library(Ecdat)
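As a rough illustration of how such a matrix can arise, the sketch below computes Jaccard distances between binary series with base R's dist(method = "binary"). The toy matrix `crises` and its row labels are invented for illustration and are not taken from the Ecdat data.

```r
# Toy binary crisis indicators (rows = hypothetical countries, columns = years).
crises <- rbind(
  A = c(1, 1, 0, 0),
  B = c(1, 0, 1, 0),
  C = c(0, 0, 0, 1)
)

# method = "binary" gives the Jaccard distance: among the years in which at
# least one series is 1, the share of years in which exactly one series is 1.
jac <- dist(crises, method = "binary")
round(as.matrix(jac), 3)
```

Series that never share a crisis year (here A and C) are at the maximum distance of 1.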
This uses an approximation to power stress that makes use of smacofx as workhorse. Free parameters are kappa, lambda and nu.
cop_apstress( dis, theta = c(1, 1, 1), type = "ratio", ndim = 2, weightmat = 1 - diag(nrow(dis)), init = NULL, itmaxi = 1000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = "sd" )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of parameters to optimize over. Must be of length three, with the first the kappa argument, the second the lambda argument and the third the nu argument. One cannot supply upsilon and tau as of yet. Defaults to 1 1 1. |
type |
MDS type. |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a binary matrix of nonnegative weights. |
init |
(optional) initial configuration |
itmaxi |
number of iterations. default is 1000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted |
A list with the components
stress: the stress-1 of the configuration
stress.m: default normalized stress (sqrt(stress-1))
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the theta parameters used for fitting (kappa, lambda, nu)
fit: the returned object of the fitting procedure (typically of class smacofB or smacofP)
cordillera: the cordillera object
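The relation between the stress, OC and copstress components can be sketched as follows. This is a minimal illustration that assumes the penalized form from Rusch, Mair & Hornik (2021), where the weighted cordillera is subtracted from the weighted stress; `copstress_sketch` is a hypothetical helper and not a function of the package.

```r
# Hypothetical helper (not part of the package): combines a badness-of-fit
# value and a clusteredness value into one penalized loss.
# Assumption: copstress = stressweight * stress - cordweight * OC.
copstress_sketch <- function(stress, OC, stressweight = 1, cordweight = 0.5) {
  stressweight * stress - cordweight * OC
}

# Lower stress and higher clusteredness both reduce the loss.
loose   <- copstress_sketch(stress = 0.20, OC = 0.10)
compact <- copstress_sketch(stress = 0.20, OC = 0.30)
c(loose = loose, compact = compact)
```

Under this assumption, raising cordweight shifts the trade-off towards more clustered configurations at the cost of fit.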
The free parameter that pcops optimizes over is lambda for power transformations of the observed proximities.
cop_cmdscale( dis, theta = 1, type = "ratio", weightmat = NULL, ndim = 2, init = NULL, itmaxi = 1000, add, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. |
type |
MDS type. Ignored here. |
weightmat |
(optional) a matrix of nonnegative weights |
ndim |
number of dimensions of the target space |
init |
(optional) initial configuration |
itmaxi |
number of iterations. No effect here. |
add |
should the dissimilarities be made Euclidean? Defaults to TRUE. |
... |
additional arguments to be passed to the fitting procedure smacofx::cmdscale. Note that eig=TRUE is always used and cannot be changed (the GOF is needed). If nothing is supplied, the default is add=TRUE, which the author recommends in all cases to avoid negative eigenvalues. |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the badness-of-fit value (this isn't stress here but 1-(sum_ndim(max(eigenvalues,0))/sum_n(max(eigenvalues,0))), i.e., 1-GOF[2])
stress.m: default normalized stress (manually calculated)
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure, which is a cmdscalex object with some extra slots for the parameters and stresses
cordillera: the cordillera object
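The badness-of-fit value 1-GOF[2] described above can be reproduced in a small sketch. It uses stats::cmdscale from base R as a stand-in for smacofx::cmdscale (an assumption; the package's own wrapper may differ in details), the built-in eurodist data, and an arbitrary illustrative lambda.

```r
lambda <- 1.5  # illustrative power for the observed proximities

# Power-transform the dissimilarities, then run classical scaling with the
# additive constant (add = TRUE) and the eigenvalues returned (eig = TRUE).
delta <- as.dist(as.matrix(eurodist)^lambda)
fit <- cmdscale(delta, k = 2, eig = TRUE, add = TRUE)

# Badness-of-fit as 1 - GOF[2].
badness <- 1 - fit$GOF[2]

# The same value from the eigenvalues: positive parts of the first ndim
# eigenvalues relative to the positive parts of all eigenvalues.
ev <- fit$eig
manual <- 1 - sum(pmax(ev[1:2], 0)) / sum(pmax(ev, 0))
c(badness = badness, manual = manual)
```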
The free parameter is lambda for power transformations of the observed proximities. The power for the fitted distances is internally fixed to 1 and the power for the weights (=delta) is -2. A weight matrix is allowed because smacof is used for fitting.
cop_elastic( dis, theta = 1, type = "ratio", ndim = 2, weightmat = 1 - diag(nrow(dis)), init = NULL, itmaxi = 1000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = "sd" )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. Defaults to 1. |
type |
MDS type. |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a matrix of nonnegative weights (NOT the elscal weights) |
init |
(optional) initial configuration |
itmaxi |
number of iterations. default is 1000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted |
A list with the components
stress: the stress-1
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure (plus a slot for the original data $deltaorig)
cordillera: the cordillera object
PCOPS version of elastic scaling with powers
cop_powerelastic( dis, theta = c(1, 1), type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; a vector of length two where the first element is kappa (for the fitted distances), the second lambda (for the observed proximities). If a scalar for the free parameters is given it is recycled. Defaults to 1 1. |
type |
MDS type. Defaults to "ratio". |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress-1 value (sqrt(stress.m))
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (kappa, lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
This is power stress with free kappa and lambda but nu is internally fixed to 1, so no weight transformation.
cop_powermds( dis, theta = c(1, 1), type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; a vector of length 2 where the first element is kappa (for the fitted distances), the second lambda (for the observed proximities). If a scalar is given it is recycled. Defaults to 1,1. |
type |
MDS type. Defaults to ratio. |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress-1 value
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (kappa, lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
This is power stress with free kappa and lambda but nu is fixed to -1 and the weights are delta.
cop_powersammon( dis, theta = c(1, 1), type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; a vector of length two where the first element is kappa (for the fitted distances), the second lambda (for the observed proximities). If a scalar is given it is recycled for the free parameters. Defaults to 1 1. |
type |
MDS type. defaults to ratio. |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations (of powerstress). default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress1 value (sqrt(stress.m))
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the explicit parameters used for fitting (kappa, lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
Power stress with free kappa, lambda and nu (the theta argument) and ratio and interval optimal scaling.
cop_powerstress( dis, theta = c(1, 1, 1), type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; the first is kappa (for the fitted distances), the second lambda (for the observed proximities), the third nu (for the weights). If a scalar is given it is recycled. Defaults to 1 1 1. |
type |
MDS type. Default is ratio. |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (kappa, lambda, nu)
fit: the returned object of the fitting procedure plus a slot for the original data $deltaorig
cordillera: the cordillera object
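To make the roles of kappa, lambda and nu concrete, the following sketch evaluates a raw (unnormalized) power stress for a fixed configuration. It assumes the ratio form sum over i<j of w_ij^nu * (delta_ij^lambda - d_ij(X)^kappa)^2 with Euclidean fitted distances; `power_stress_raw` is a hypothetical helper, not the package's fitting routine, and it does no optimization or normalization.

```r
# Hypothetical helper (not part of the package): raw power stress of a given
# configuration X for dissimilarities delta.
# Assumed form: sum_{i<j} w_ij^nu * (delta_ij^lambda - d_ij(X)^kappa)^2
power_stress_raw <- function(delta, X, kappa = 1, lambda = 1, nu = 1, w = NULL) {
  delta <- as.matrix(delta)
  d <- as.matrix(dist(X))                     # Euclidean fitted distances
  if (is.null(w)) w <- 1 - diag(nrow(delta))  # unit weights off the diagonal
  lt <- lower.tri(delta)                      # each pair i<j counted once
  sum(w[lt]^nu * (delta[lt]^lambda - d[lt]^kappa)^2)
}

# A 3-4-5 triangle: the dissimilarities are reproduced exactly by X.
X <- matrix(c(0, 0,
              3, 0,
              0, 4), ncol = 2, byrow = TRUE)
delta <- dist(X)

power_stress_raw(delta, X)                          # identical powers: no misfit
power_stress_raw(delta, X, kappa = 2, lambda = 1)   # mismatched powers: misfit
```

With equal powers on both sides the perfect configuration has zero raw stress, whereas squaring only the fitted distances produces a positive value for the same configuration.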
This is a power stress where kappa and lambda are free to vary but restricted to be equal, so the same exponent will be used for distances and dissimilarities. nu (for the weights) is also free.
cop_rpowerstress( dis, theta = c(1, 1, 1), type = "ratio", weightmat = NULL, init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; the first two arguments are for kappa and lambda and should be equal (for the fitted distances and observed proximities), the third nu (for the weights). Internally the kappa and lambda are equated based on theta[1]. If a scalar is given it is recycled (so all elements of theta are equal); if a vector of length 2 is given, it gets expanded to c(theta[1],theta[1],theta[2]). Defaults to 1 1 1. |
type |
MDS type. Defaults to "ratio". |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress1 value (sqrt(stress.m))
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the explicit parameters used for fitting (kappa=lambda, nu)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
Free parameter is kappa=2r for the fitted distances.
cop_rstress( dis, theta = 1, type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the kappa=2*r transformation for the fitted distances. Defaults to 1. Note that what is returned is r, not kappa. |
type |
MDS type. Defaults to "ratio". |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress-1 value
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (r)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
Uses smacofx::sammon wrapper for MASS::sammon. The free parameter is lambda for power transformations of the observed proximities.
cop_sammon( dis, theta = 1, type = "ratio", ndim = 2, init = NULL, weightmat = NULL, itmaxi = 1000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. Defaults to 1. |
type |
MDS type. Only "ratio" here. |
ndim |
number of dimensions of the target space |
init |
(optional) initial configuration |
weightmat |
(optional) a matrix of nonnegative weights. |
itmaxi |
number of iterations. Default is 1000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
if TRUE the configuration is scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress-1
stress.m: default normalized stress (stress-1^2)
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure smacofx::sammon
cordillera: the cordillera object
Uses smacofSym, so it can deal with a weightmatrix and different types. The free parameter is lambda for power transformations of the observed proximities.
cop_sammon2( dis, theta = 1, type = "ratio", ndim = 2, weightmat = 1 - diag(nrow(dis)), init = NULL, itmaxi = 1000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = "sd" )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. Defaults to 1. |
type |
MDS type. Default is ratio. |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a matrix of nonnegative weights (NOT the sammon weights) |
init |
(optional) initial configuration |
itmaxi |
number of iterations. default is 1000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted |
A list with the components
stress: the stress-1 value
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
The free parameter is lambda for power transformations of the observed proximities. The power for the fitted distances is internally fixed to 1 and the power for the weights is 1.
cop_smacofSphere( dis, theta = 1, type = "ratio", ndim = 2, weightmat = NULL, init = NULL, itmaxi = 5000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = "sd", stresstype = "default" )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. Defaults to 1. |
type |
MDS type |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
itmaxi |
number of iterations. default is 5000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted |
stresstype |
which stress to report. Only takes smacof's default stress currently. |
A list with the components
stress: the stress-1 value
stress.m: default normalized stress
copstress: the weighted loss value
OC: the OPTICS cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
The free parameter is lambda for power transformations of the observed proximities. The power for the fitted distances is internally fixed to 1 and the power for the weights is 1.
cop_smacofSym( dis, theta = 1, type = "ratio", ndim = 2, weightmat = 1 - diag(nrow(dis)), init = NULL, itmaxi = 1000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = "sd", stresstype = "default" )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector; should be a scalar for the lambda (proximity) transformation. Defaults to 1. |
type |
MDS type. |
ndim |
number of dimensions of the target space |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
itmaxi |
number of iterations. default is 1000 |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted |
stresstype |
which stress to report. Only takes smacof's default stress currently. |
A list with the components
stress: the stress
stress.m: default normalized stress
copstress: the weighted loss value
OC: the Optics cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
The free parameter is lambda for the observed proximities. Fitted distances are transformed with power 2, weights have an exponent of 1. Note that the lambda here works as a multiplier of 2 (as sstress has f(delta^2)).
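As a rough illustration of the note on lambda, the following sketch assumes that the proximities end up being raised to the power 2*lambda. This is one reading of the note above, not a statement about the package internals, and `sstress_proximity_power` is a hypothetical helper that is not part of the package.

```r
# Hypothetical helper (not part of the package): effective power applied to
# the proximities, ASSUMING that s-stress uses delta^(2 * lambda).
sstress_proximity_power <- function(delta, lambda = 1) {
  delta^(2 * lambda)
}

delta <- c(1, 2, 3)
sstress_proximity_power(delta, lambda = 1)    # same as delta^2
sstress_proximity_power(delta, lambda = 1.5)  # same as delta^3
```

Under this reading, theta = 1 corresponds to the classical s-stress transformation of the proximities.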
cop_sstress( dis, theta = 1, type = "ratio", weightmat = 1 - diag(nrow(dis)), init = NULL, ndim = 2, itmaxi = 10000, ..., stressweight = 1, cordweight = 0.5, q = 1, minpts = ndim + 1, epsilon = 10, rang = NULL, verbose = 0, scale = "sd", normed = TRUE )
dis |
numeric matrix or dist object of a matrix of proximities |
theta |
the theta vector of powers; this must be a scalar of the lambda transformation for the observed proximities. Defaults to 1. Note that the lambda here works as a multiplier of 2 (as sstress has f(delta^2)). |
type |
MDS type, defaults to "ratio". |
weightmat |
(optional) a matrix of nonnegative weights |
init |
(optional) initial configuration |
ndim |
number of dimensions of the target space |
itmaxi |
number of iterations. default is 10000. |
... |
additional arguments to be passed to the fitting procedure |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 0.5 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
scale |
should the configuration be scale adjusted |
normed |
should the cordillera be normed; defaults to TRUE |
A list with the components
stress: the stress-1 value
stress.m: default normalized stress
copstress: the weighted loss value
OC: the Optics cordillera value
parameters: the parameters used for fitting (lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
About the function cops: The high level function allows for minimizing copstress for a clustered MDS configuration. It allows choosing between COPS-C (finding a configuration from copstress with cordillera penalty) and profile COPS (finding hyperparameters for MDS models with power transformations). It is a wrapper for copstressMin and pcops.
cops( dis, variant = c("1", "2", "Variant1", "Variant2", "v1", "v2", "COPS-C", "P-COPS", "configuration-c", "profile", "copstress-c", "p-copstress", "COPS-P", "copstress-p", "cops-c", "p-cops", "copsc", "pcops"), ... )
dis |
a dissimilarity matrix or a dist object |
variant |
a character string specifying which variant of COPS to fit. Allowed is any of the following "1","2","Variant1","Variant2","v1","v2","COPS-C","P-COPS","configuration-c","profile","copstress-c","p-copstress". Defaults to "COPS-C". |
... |
arguments to be passed to |
For COPS-C (Variant 1) see copstressMin, for P-COPS (Variant 2) see pcops.
dis<-as.matrix(smacof::kinshipdelta)
#COPS-C with stressweight 0.75 and cordweight 0.25
res1<-cops(dis,variant="COPS-C",stressweight=0.75,cordweight=0.25, minpts=2,itmax=1000) #use higher itmax in real applications
res1
summary(res1)
plot(res1)
plot(res1,"reachplot")
#s-stress type copstress (i.e. kappa=2, lambda=2)
res3<-cops(dis,variant="COPS-C",kappa=2,lambda=2,stressweight=0.5,cordweight=0.5)
res3
summary(res3)
plot(res3)
# power-stress type profile copstress
# search for optimal kappa and lambda between
# kappa=1,lambda=0.5 and kappa=3,lambda=5
# nu is fixed on -1
ws<-1/dis
diag(ws)<-1
res5<-cops(dis,variant="P-COPS",loss="powerstress", theta=c(1.4,3,-1), lower=c(1,0.5,-1),upper=c(3,5,-1), weightmat=ws, stressweight=0.9,cordweight=0.1)
res5
summary(res5)
plot(res5)
Calculates copstress for given MDS object
copstress( obj, stressweight = 1, cordweight = 5, q = 1, minpts = 2, epsilon = 10, rang = NULL, verbose = 0, normed = TRUE, scale = c("std", "sd", "proc", "none"), init, ... )
obj |
MDS object (supported are sammon, cmdscale, smacof, rstress, powermds) |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; defaults to 5 (see the usage above) |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to 2 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 10 |
rang |
range of the distances (min distance minus max distance). If NULL (default) the cordillera will be normed to each configuration's maximum distance, so an absolute value of goodness-of-clusteredness. |
verbose |
numeric value that prints information on the fitting process; >2 is very verbose (copstress level), >3 is extremely verbose (up to MDS optimization level) |
normed |
should the cordillera be normed; defaults to TRUE |
scale |
should the configuration be scale adjusted. |
init |
a reference configuration when doing procrustes adjustment |
... |
additional arguments to be passed to the cordillera function |
A list with the components
copstress: the weighted loss value
OC: the Optics cordillera value
parameters: the parameters used for fitting (kappa, lambda)
cordillera: the cordillera object
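To make the "weighted loss value" concrete, here is a minimal sketch. It assumes the copstress form stressweight * stress - cordweight * OC from the COPS paper cited in the package description. `copstress_sketch` is a hypothetical name and the numbers are made up for illustration; the package function works on fitted MDS objects instead.

```r
# Sketch only: combines a stress value and an OPTICS Cordillera value into a
# copstress-type loss, ASSUMING the form
#   copstress = stressweight * stress - cordweight * OC.
# copstress_sketch is a hypothetical name, not a function of the package.
copstress_sketch <- function(stress, oc, stressweight = 1, cordweight = 0.5) {
  stressweight * stress - cordweight * oc
}

# A more clustered configuration with higher stress versus a less clustered
# configuration with lower stress (made-up numbers).
copstress_sketch(stress = 0.30, oc = 0.60)
copstress_sketch(stress = 0.20, oc = 0.10)
```

Under this form a larger cordweight rewards clusteredness more strongly, so the first configuration yields the lower loss despite its higher stress.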
Minimizing Copstress to obtain a clustered ratio, interval, spline or ordinal PS configuration with given explicit power transformations theta. The function allows mix-and-match of explicit (via theta) and implicit (via type) transformations by setting the kappa, lambda, nu (or theta) and type arguments. This function also supports fitting any Minkowski distance in the configuration.
copstressMin( delta, kappa = 1, lambda = 1, nu = 1, theta = c(kappa, lambda, nu), type = c("ratio", "interval", "ordinal"), ties = "primary", weightmat = 1 - diag(nrow(delta)), ndim = 2, init = NULL, stressweight = 0.975, cordweight = 0.025, q = 1, minpts = ndim + 1, epsilon = max(10, max(delta)), dmax = NULL, rang, optimmethod = c("NelderMead", "Newuoa", "BFGS", "SANN", "hjk", "solnl", "solnp", "subplex", "snomadr", "hjk-Newuoa", "hjk-BFGS", "BFGS-hjk", "Newuoa-hjk", "cmaes", "direct", "direct-Newuoa", "direct-BFGS", "genoud", "gensa"), verbose = 0, scale = c("sd", "rmsq", "proc", "none"), normed = TRUE, accuracy = 1e-07, itmax = 10000, stresstype = c("stress-1", "stress"), principal = FALSE, minkp = 2, spline.degree = 2, spline.intKnots = 2, ... ) copsc( delta, kappa = 1, lambda = 1, nu = 1, theta = c(kappa, lambda, nu), type = c("ratio", "interval", "ordinal"), ties = "primary", weightmat = 1 - diag(nrow(delta)), ndim = 2, init = NULL, stressweight = 0.975, cordweight = 0.025, q = 1, minpts = ndim + 1, epsilon = max(10, max(delta)), dmax = NULL, rang, optimmethod = c("NelderMead", "Newuoa", "BFGS", "SANN", "hjk", "solnl", "solnp", "subplex", "snomadr", "hjk-Newuoa", "hjk-BFGS", "BFGS-hjk", "Newuoa-hjk", "cmaes", "direct", "direct-Newuoa", "direct-BFGS", "genoud", "gensa"), verbose = 0, scale = c("sd", "rmsq", "proc", "none"), normed = TRUE, accuracy = 1e-07, itmax = 10000, stresstype = c("stress-1", "stress"), principal = FALSE, minkp = 2, spline.degree = 2, spline.intKnots = 2, ... 
) copStressMin( delta, kappa = 1, lambda = 1, nu = 1, theta = c(kappa, lambda, nu), type = c("ratio", "interval", "ordinal"), ties = "primary", weightmat = 1 - diag(nrow(delta)), ndim = 2, init = NULL, stressweight = 0.975, cordweight = 0.025, q = 1, minpts = ndim + 1, epsilon = max(10, max(delta)), dmax = NULL, rang, optimmethod = c("NelderMead", "Newuoa", "BFGS", "SANN", "hjk", "solnl", "solnp", "subplex", "snomadr", "hjk-Newuoa", "hjk-BFGS", "BFGS-hjk", "Newuoa-hjk", "cmaes", "direct", "direct-Newuoa", "direct-BFGS", "genoud", "gensa"), verbose = 0, scale = c("sd", "rmsq", "proc", "none"), normed = TRUE, accuracy = 1e-07, itmax = 10000, stresstype = c("stress-1", "stress"), principal = FALSE, minkp = 2, spline.degree = 2, spline.intKnots = 2, ... )
delta |
numeric matrix or dist object of a matrix of proximities |
kappa |
power transformation for fitted distances |
lambda |
power transformation for proximities (only used if type="ratio" or "interval") |
nu |
power transformation for weights |
theta |
the theta vector of powers; the first is kappa (for the fitted distances if it exists), the second lambda (for the observed proximities if it exists and type="ratio" or "interval"), the third is nu (for the weights if it exists). If fewer than three elements are given as argument, it will be recycled. Defaults to 1 1 1. Will override any kappa, lambda, nu parameters if they are given and do not match. |
type |
what type of MDS to fit. Currently one of "ratio", "interval" or "ordinal". Default is "ratio". |
ties |
the handling of ties for ordinal (nonmetric) MDS. Possible are "primary" (default), "secondary" or "tertiary". |
weightmat |
(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals |
ndim |
number of dimensions of the target space |
init |
(optional) initial configuration |
stressweight |
weight to be used for the fit measure; defaults to 0.975 |
cordweight |
weight to be used for the cordillera; defaults to 0.025 |
q |
the norm of the cordillera; defaults to 1 |
minpts |
the minimum points to make up a cluster in OPTICS, see |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked, see |
dmax |
The winsorization limit of reachability distances in the OPTICS Cordillera. If supplied, it should be either a numeric value that matches 'max(rang)' or 'NULL'; if 'NULL' it is found as 1.5 times (for kappa >1) or 1 times (for kappa <=1) the maximum reachability value of the power Torgerson model with the same lambda. If 'dmax' and 'rang' are supplied and 'dmax' is not 'max(rang)', a warning is given and 'rang' takes precedence. |
rang |
range of the reachabilities to be considered. If missing it is found from the initial configuration by taking 0 as the lower boundary and dmax (see above) as upper boundary. See also |
optimmethod |
What optimizer to use? Choose one string of 'Newuoa' ( |
verbose |
numeric value that prints information on the fitting process; >2 is very verbose |
scale |
Scale the configuration (in MDS stress is invariant up to a scaling factor). One of "none" (so no extra scaling of the configuration but normalized to sum delta^2=1), "sd" (configuration divided by the highest standard deviation of any of the columns), "proc" (procrustes adjustment to the initial fit) and "rmsq" (configuration divided by the maximum root mean square of the columns). Default is "sd" which often gives a nicer spread on the axes. Note that the scaled configuration is returned as $conf and the unscaled as $usconf, so manual calculation of the OC should be done with $conf. |
normed |
should the Cordillera be normed; defaults to TRUE. |
accuracy |
numerical accuracy, defaults to 1e-7. |
itmax |
maximum number of iterations. Defaults to 10000. For the two-step algorithms if itmax is exceeded by the first solver, the second algorithm is run for at least 0.1*itmax (so overall itmax may be exceeded by a factor of 1.1). |
stresstype |
which stress to use in the copstress. Defaults to stress-1. If anything else is set, explicitly normed stress which is (stress-1)^2 is used. Using stress-1 puts more weight on MDS fit. |
principal |
If ‘TRUE’, principal axis transformation is applied to the final configuration. |
minkp |
Power of the Minkowski distance. Defaults to 2 (Euclidean distance). |
spline.degree |
Degree of the spline for ‘mspline’ MDS type |
spline.intKnots |
Number of interior knots of the spline for ‘mspline’ MDS type |
... |
additional arguments to be passed to the optimization procedure |
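The "sd" and "rmsq" options of the scale argument can be pictured with a short sketch. It follows the verbal description in the argument table only; `scale_conf_sketch` is a hypothetical helper, not the internal code of the package, and the "proc" option is left out.

```r
# Sketch of the "sd" and "rmsq" scalings as described for the scale argument.
# scale_conf_sketch is a hypothetical helper, not the package's internal code.
scale_conf_sketch <- function(conf, scale = c("sd", "rmsq", "none")) {
  scale <- match.arg(scale)
  switch(scale,
    sd   = conf / max(apply(conf, 2, stats::sd)),  # divide by largest column sd
    rmsq = conf / max(sqrt(colMeans(conf^2))),     # divide by largest column root mean square
    none = conf
  )
}

set.seed(1)
conf <- matrix(rnorm(20), ncol = 2)
scaled <- scale_conf_sketch(conf, "sd")
max(apply(scaled, 2, sd))  # the largest column sd is now 1
```

Both options divide the whole configuration by a single constant, so the relative positions of the points are unchanged.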
This is an extremely flexible approach to least squares proximity scaling: It supports ratio power stress; ratio, interval and ordinal r stress and ratio, interval and ordinal MDS with or without a COPS penalty. Famous special cases of these models that can be fitted are multiscale MDS if kappa->0 and delta=log(delta), Alscal MDS (sstress) with lambda=kappa=2, sammon type mapping with weightmat=delta and nu=-1, elastic scaling with weightmat=delta and nu=-2. Due to mix-and-match this function also allows to fit models that have not yet been published, such as for example an "elastic scaling ordinal s-stress with cops penalty".
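To show how kappa, lambda, nu and minkp enter the fit part of the loss, the following sketch evaluates a raw power stress for a fixed configuration. It assumes the loss sum over i<j of w_ij^nu * (delta_ij^lambda - d_ij(X)^kappa)^2 and does no normalization, optimal scaling or optimization; `power_stress_sketch` is a hypothetical helper.

```r
# Sketch of a raw power stress value for a given configuration, ASSUMING
#   sum_{i<j} w_ij^nu * (delta_ij^lambda - d_ij(X)^kappa)^2.
# power_stress_sketch is a hypothetical helper, not part of the package.
power_stress_sketch <- function(conf, delta, kappa = 1, lambda = 1, nu = 1,
                                weightmat = 1 - diag(nrow(delta)), minkp = 2) {
  # Minkowski distances between the rows of the configuration
  d <- as.matrix(dist(conf, method = "minkowski", p = minkp))
  res <- weightmat^nu * (delta^lambda - d^kappa)^2
  sum(res[lower.tri(res)])  # each pair counted once, diagonal ignored
}

conf <- rbind(c(0, 0), c(3, 4), c(6, 8))
delta <- as.matrix(dist(conf))
power_stress_sketch(conf, delta)                         # perfect fit: 0
power_stress_sketch(conf, delta, kappa = 2, lambda = 2)  # s-stress type, also 0
power_stress_sketch(conf, 2 * delta)                     # misfit: 25 + 100 + 25 = 150
```

Setting kappa = lambda = 2 gives the s-stress type residuals mentioned above, and minkp = 1 switches the fitted distances to the city-block metric.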
If one wants to fit these models without the cops penalty, we recommend using powerStressMin (for ratio and interval MDS with any power transformation for weights, dissimilarities and distances) or rStressMin (for ratio, interval and ordinal MDS with power transformations for distances and weights) as these use majorization.
Some optimizers (including the default hjk-Newuoa) will print a warning if itmax is (too) small or if there was no convergence. Consider increasing itmax then.
For some solvers there sometimes may be an error [NA/NaN/Inf in foreign function call (arg 3)] stemming from smacof::transform(). This happens when the algorithm places two objects at exactly the same place so their fitted distance is 0. This is good from an OPTICS Cordillera point of view (as it is more clustered) which is why some solvers like to pick that up, but it can lead to an issue in the optimal scaling in smacof. This can usually be mitigated when specifying the model by either using less cordweight, less itmax, less accuracy or combining the two offending objects into one (so include them as a combined row in the distance matrix).
We might eventually switch to newuoa in nloptr.
A copsc object (inheriting from smacofP). A list with the components
delta: the original untransformed dissimilarities
tdelta: the explicitly transformed dissimilarities
dhat: the explicitly transformed dissimilarities (dhats), optimally scaled and normalized (which are approximated by the fit)
confdist: Configuration distances, the transformed fitted distances
conf: the configuration (normed) and scaled as specified in scale.
usconf: the unscaled configuration (normed to sum delta^2=1). Scaling applied to usconf gives conf.
parameters, par, pars: the theta vector of power transformations (kappa, lambda, nu)
niter: number of iterations of the optimizer.
stress: the square root of explicitly normalized stress (calculated for confo).
spp: stress per point
ndim: number of dimensions
model: Fitted model name
call: the call
nobj: the number of objects
type, loss, losstype: stresstype
stress.m: The stress used for copstress. If stresstype="stress-1" this is like $stress else it is stress^2
copstress: the copstress loss value
resmat: the matrix of residuals
weightmat: the matrix of untransformed weights
tweightmat: the transformed weighting matrix (here weightmat^nu)
OC: the (normed) OPTICS Cordillera object (calculated for scaled conf)
OCv: the (normed) OPTICS Cordillera value alone (calculated for scaled conf)
optim: the object returned from the optimization procedure
stressweight, cordweight: the weights of the stress and OC respectively (v_1 and v_2)
optimmethod: The solver used
type: the type of MDS fitted
minkowski: the power of the minkowski distance fitted in the configuration.
dis<-as.matrix(smacof::kinshipdelta)
set.seed(1)
## Copstress with stressweight 0.75, cordweight 0.25 and L2 configuration distance
res1<-copstressMin(dis,stressweight=0.75,cordweight=0.25, itmax=100) #use higher itmax about 10000
res1
summary(res1)
plot(res1) #super clustered
## Copstress with stressweight 0.75, cordweight 0.25 and L1 configuration distance
res2<-copstressMin(dis,stressweight=0.75,cordweight=0.25, itmax=100, minkp=1) #use higher itmax about 10000
res2
plot(res2)
##Alias name
res1<-copsc(dis,stressweight=0.75, cordweight=0.25,itmax=100)
## Elastic scaling mspline s-stress with cops penalty
res3<-copsc(dis,type="mspline",kappa=2,nu=-2,weightmat=dis, stressweight=0.5, cordweight=0.5,itmax=100)
Double centering of a matrix
doubleCenter(x)
x |
numeric matrix |
the double centered matrix
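For readers unfamiliar with the operation, double centering can be sketched as follows: subtract the row means and the column means from every entry and add the grand mean back. `double_center_sketch` is a hypothetical name; whether the package function applies any further scaling is not stated here.

```r
# Sketch of double centering: x_ij - rowmean_i - colmean_j + grandmean.
# double_center_sketch is a hypothetical name, not the package function.
double_center_sketch <- function(x) {
  x <- as.matrix(x)
  sweep(sweep(x, 1, rowMeans(x)), 2, colMeans(x)) + mean(x)
}

x <- matrix(c(1, 2, 3, 4, 6, 9), nrow = 2)
xc <- double_center_sketch(x)
rowMeans(xc)  # all (numerically) zero
colMeans(xc)  # all (numerically) zero
```

After double centering all row means and column means vanish, which is the property classical (Torgerson) scaling relies on.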
Explicit Normalization. Normalizes distances.
enorm(x, w = 1)
x |
numeric matrix |
w |
weight |
a constant
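The help text does not spell out the constant. A common choice for explicit normalization, and the one assumed in this sketch, is the weighted root sum of squares sqrt(sum(w * x^2)), so that dividing by it gives a weighted sum of squares of 1. `enorm_sketch` is a hypothetical name and may differ from the package function.

```r
# ASSUMED form of the normalizing constant: sqrt(sum(w * x^2)).
# enorm_sketch is a hypothetical name, not the package function.
enorm_sketch <- function(x, w = 1) sqrt(sum(w * x^2))

delta <- as.matrix(dist(matrix(1:6, ncol = 2)))
delta_normed <- delta / enorm_sketch(delta)
sum(delta_normed^2)  # 1 after explicit normalization
```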
Adaptive means that the search space reduction factors in the number of iterations; makes convergence faster at about 100 iterations
ljoptim( x, fun, ..., red = ifelse(adaptive, 0.99, 0.95), lower, upper, acc = 1e-06, accd = 1e-04, itmax = 1000, verbose = 0, adaptive = TRUE )
x |
optional starting values |
fun |
function to minimize |
... |
additional arguments to be passed to the function to be optimized |
red |
value of the reduction of the search region |
lower |
The lower constraints of the search region |
upper |
The upper constraints of the search region |
acc |
if the numerical accuracy of two successive target function values is below this, stop the optimization; defaults to 1e-6 |
accd |
if the width of the search space is below this, stop the optimization; defaults to 1e-4 |
itmax |
maximum number of iterations |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose |
adaptive |
should the adaptive version be used? defaults to TRUE. |
A list with the components (see also optim)
par: The position of the optimum in the search space (parameters that minimize the function; argmin fun)
value: The value of the objective function at the optimum (min fun)
counts: The number of iterations performed at convergence, with entries function for the number of iterations and gradient, which is currently always NA
convergence: 0 indicates successful completion by the accd or acc criterion, 1 indicates that the iteration limit was reached, 99 indicates a problem
message: is NULL (only for compatibility or future use)
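The basic idea behind a Luus-Jaakola type search can be sketched in a few lines: sample a candidate uniformly around the current point, move there if it improves the objective, and otherwise shrink the search region by the factor red. This is a plain, non-adaptive illustration under these assumptions and not the ljoptim implementation; `lj_sketch` and its return structure are hypothetical.

```r
# Minimal sketch of a (non-adaptive) Luus-Jaakola random search, NOT the
# package's ljoptim implementation. lj_sketch is a hypothetical name.
lj_sketch <- function(x, fun, lower, upper, red = 0.95, itmax = 1000,
                      accd = 1e-4) {
  lower <- rep_len(lower, length(x))
  upper <- rep_len(upper, length(x))
  width <- upper - lower            # current width of the search region
  best <- fun(x)
  iter <- 0
  while (iter < itmax && max(width) > accd) {
    iter <- iter + 1
    # draw a candidate uniformly around the current point, clipped to the box
    cand <- x + runif(length(x), -0.5, 0.5) * width
    cand <- pmin(pmax(cand, lower), upper)
    val <- fun(cand)
    if (val < best) {               # accept improvements only
      x <- cand
      best <- val
    } else {
      width <- width * red          # otherwise shrink the search region
    }
  }
  list(par = x, value = best, counts = iter,
       convergence = if (max(width) <= accd) 0 else 1)
}

set.seed(1)
start <- c(2, -3)
res <- lj_sketch(start, function(z) sum(z^2), lower = -5, upper = 5)
res$value
```

Because candidates are only accepted when they improve the objective, the returned value can never be worse than the value at the starting point.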
fbana <- function(x) {
  x1 <- x[1]
  x2 <- x[2]
  100 * (x2 - x1 * x1)^2 + (1 - x1)^2
}
res1<-ljoptim(c(-1.2,1),fbana,lower=-5,upper=5,accd=1e-16,acc=1e-16)
res1
set.seed(210485)
fwild <- function (x) 10*sin(0.3*x)*sin(1.3*x^2) + 0.00001*x^4 + 0.2*x+80
plot(fwild, -50, 50, n = 1000, main = "ljoptim() minimising 'wild function'")
res2<-ljoptim(50, fwild,lower=-50,upper=50,adaptive=FALSE,accd=1e-16,acc=1e-16)
points(res2$par,res2$value,col="red",pch=19)
res2
A symmetric distance matrix of 32 MATCH-ADT modules based on their usage by clinicians. The raw data were counts of how often each module has been used with each of 449 youths, resulting in a count profile of each module. Based on that the phi-distance between the modules has been calculated.
A 32 x 32 distance matrix.
This is an object that inherits from class distance (see package analogue) and matrix.
Take matrix to a power
mkPower(x, r)
x |
matrix |
r |
numeric (power) |
a matrix
Metaparameter selection for MDS models based on the Profile COPS approach (COPS Variant 2). It uses copstress for hyperparameter selection of explicit transformations (currently power transformations). It is a special case of a STOPS model and predated it; stops has more functionality and can be seen as the successor. pcops uses explicitly normalized stress for copstress (not stress-1).
pcops( dis, loss = c("stress", "smacofSym", "smacofSphere", "strain", "sammon", "rstress", "powermds", "sstress", "elastic", "powersammon", "powerelastic", "powerstress", "sammon2", "powerstrain", "apstress", "rpowerstress"), type = "ratio", weightmat = NULL, ndim = 2, init = NULL, theta = c(1, 1, 1), stressweight = 1, cordweight, q = 2, minpts = ndim + 1, epsilon = 100, rang, optimmethod = c("ALJ", "pso", "SANN", "direct", "directL", "stogo", "MADS", "hjk"), lower = 0.5, upper = 5, verbose = 0, scale = c("proc", "sd", "none", "std"), normed = TRUE, s = 4, acc = 1e-05, itmaxo = 200, itmaxi = 5000, ... )
dis |
numeric matrix or dist object of a matrix of proximities |
loss |
which loss function to be used for fitting, defaults to strain. See Details. |
type |
MDS type which may be one of "ratio", "interval", "ordinal". Defaults to "ratio". Note that not all loss arguments support all types; if a type is not supported, there will be an error with information on which types are supported. In that case choose another type. |
weightmat |
(optional) a matrix of nonnegative weights; defaults to 1 for all off diagonals |
ndim |
number of dimensions of the target space |
init |
(optional) initial configuration. If not supplied, the Torgerson scaling result of the dissimilarity matrix dis^theta[2]/enorm(dis^theta[2],weightmat) is used. |
theta |
the theta vector of free parameters; see details for the number of free parameters for each loss function. Defaults to 1 for all free parameters. Make sure to supply a theta of the correct length as the mechanisms in place to automatically choose theta/upper/lower are dependent on the optimizer and ad hoc: If this is a vector with more elements than necessary, it is either cut (so for a vector of length 3 and a function with 2 free parameters, the first two elements of the vector are used) or there will be an error. If a scalar is given as argument and the number of free parameters is larger than 1, the scalar will be recycled and this may also make the optimizers equate all free parameters. |
stressweight |
weight to be used for the fit measure; defaults to 1 |
cordweight |
weight to be used for the cordillera; if missing gets estimated from the initial configuration so that copstress = 0 for theta=1 |
q |
the norm of the cordillera; defaults to 2 (see the usage above) |
minpts |
the minimum points to make up a cluster in OPTICS; defaults to ndim+1 |
epsilon |
the epsilon parameter of OPTICS, the neighbourhood that is checked; defaults to 100 (see the usage above) |
rang |
range of the minimum reachabilities to be considered. If missing it is found from the initial configuration by taking 1.5 times the maximal minimum reachability of the model with theta=1. If NULL it will be normed to each configuration's minimum and maximum distance, so an absolute value of goodness-of-clusteredness. Note that the latter is not necessarily desirable when comparing configurations for their relative clusteredness. See also |
optimmethod |
What general purpose optimizer to use? Defaults to our adaptive LJ version (ALJ). Also allows particle swarm optimization with s particles ("pso", |
lower |
A vector of the lower box constraints of the search region. Its length must match the length of theta. |
upper |
A vector of the upper box constraints of the search region. Its length must match the length of theta. |
verbose |
numeric value that prints information on the fitting process; >2 is extremely verbose. Note that for models with some parameters fixed, the iteration progress of the optimizer also shows different values for the fixed parameters, because due to the modular setup we always optimize over a three parameter vector. These values are inconsequential, however, as internally they will be fixed. |
scale |
should the configuration be scaled and/or centered for calculating the cordillera? "std" standardizes each column of the configurations to mean=0 and sd=1 (typically not a good idea), "sd" scales the configuration by the maximum standard deviation of any column (default), "proc" adjusts the fitted configuration to the init configuration (or the Torgerson scaling solution if init=NULL). This parameter only has an effect for calculating the cordillera, the fitted and returned configuration is NOT scaled. |
normed |
should the cordillera be normed; defaults to TRUE |
s |
number of particles if pso is used |
acc |
termination threshold difference of two successive outer minimization steps. |
itmaxo |
iterations of the outer step (optimization over the hyperparameters; if solver allows it). Defaults to 200. |
itmaxi |
iterations of the inner step (optimization of the MDS). Defaults to 5000. |
... |
additional arguments to be passed to the optimization procedure |
Currently allows for the following models:
Power transformations applied to observed proximities only (theta, upper, lower should be numeric scalars): Strain loss/Torgerson scaling (strain, workhorse: smacofx::cmdscale), Stress for symmetric matrices (smacofSym, stress, smacofSphere for scaling onto a sphere; workhorse: smacof::smacofSym), Sammon mapping (sammon, workhorse: smacofx::sammon; or sammon2, workhorse: smacof::smacofSym), elastic scaling (elastic, workhorse: smacof::smacofSym), Alscal or S-Stress (sstress, workhorse: smacofx::powerStressMin)
Power transformations of fitted distances only (theta, upper, lower should be numeric scalars): r-stress (rstress, workhorse: smacofx::rStressMin)
Power transformations applied to fitted distances and observed proximities (theta, upper, lower should be numeric of length 2): Power MDS (powermds, workhorse: smacofx::powerStressMin), Sammon Mapping/elastic scaling with powers (powersammon, powerelastic, workhorse: smacofx::powerStressMin)
Power transformations applied to fitted distances, observed proximities and weights (theta, upper, lower should be numeric of length 3): power stress (POST-MDS, powerstress, workhorse: smacofx::powerStressMin), restricted power stress with equal transformations for distances and proximities (rpowerstress, workhorse: smacofx::powerStressMin), approximated power stress (apstress, workhorse: smacof::smacofSym)
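Since theta, lower and upper must have the right length for the chosen loss, the grouping listed above can be turned into a small lookup. This is a convenience sketch only; `theta_length_sketch` is a hypothetical helper that is not part of the package, and it covers only the loss names named in the list above.

```r
# Number of free parameters per loss, following the grouping listed above.
# theta_length_sketch is a hypothetical helper, not part of the package.
theta_length_sketch <- function(loss) {
  one <- c("strain", "stress", "smacofSym", "smacofSphere", "sammon",
           "sammon2", "elastic", "sstress", "rstress")
  two <- c("powermds", "powersammon", "powerelastic")
  three <- c("powerstress", "rpowerstress", "apstress")
  if (loss %in% one) {
    1L
  } else if (loss %in% two) {
    2L
  } else if (loss %in% three) {
    3L
  } else {
    stop("loss not covered by this sketch: ", loss)
  }
}

theta_length_sketch("strain")       # 1
theta_length_sketch("powermds")     # 2
theta_length_sketch("powerstress")  # 3
```

A check like `length(theta) == theta_length_sketch(loss)` before calling the fitting function avoids relying on the ad hoc recycling and truncation described for the theta argument.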
A list with the components
copstress: the weighted loss value
OC: the OPTICS cordillera for the scaled configuration (as defined by scale)
optim: the object returned from the optimization procedure
stress: the stress (square root of stress.m)
stress.m: default normalized stress
parameters: the parameters used for fitting (kappa, lambda)
fit: the returned object of the fitting procedure
cordillera: the cordillera object
dis<-as.matrix(smacof::kinshipdelta)
set.seed(210485)
#configuration is scaled with highest column sd for calculating cordillera
res1<-pcops(dis,loss="strain",lower=0.1,upper=5,minpts=2)
res1
summary(res1)
plot(res1)
Squared p-distances
pdist(x, p)
x | numeric matrix |
p | the order p > 0 of the Minkowski distance |
squared Minkowski distance matrix
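As an illustration of what a squared Minkowski distance matrix looks like, here is a minimal base R sketch. The helper name sq_minkowski is hypothetical and is not part of the package; it only assumes that the squared p-distance is the Minkowski distance of order p, squared elementwise, as the description above suggests. It is not meant to reproduce the internals of pdist.

```r
# Hypothetical helper (not part of cops): squared Minkowski distances
# between the rows of x, computed with stats::dist from base R.
sq_minkowski <- function(x, p = 2) {
  stopifnot(is.numeric(x), p > 0)
  as.matrix(stats::dist(x, method = "minkowski", p = p))^2
}

x <- rbind(c(0, 0), c(3, 4), c(6, 8))
d2 <- sq_minkowski(x, p = 2)  # squared Euclidean distances
d1 <- sq_minkowski(x, p = 1)  # squared city-block distances
d2[1, 2]  # 25
d1[1, 2]  # 49
```

For p = 2 the first two rows are at Euclidean distance 5, so the squared entry is 25; for p = 1 the city-block distance is 7, giving 49.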
The pairwise phi-distance of two vectors x and y is sqrt(sum(((x[i]-y[i])^2)/((x[i]+y[i])*(sum(x)+sum(y))))). The function calculates this for all pairs of rows of a matrix or data frame X.
phidistance(X)
X | an n times p numeric matrix or data frame |
A symmetric n times n matrix of pairwise phi-distances between the rows of X, with zeros on the main diagonal. The result is an object of class distance and matrix.
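The formula in the description can be transcribed directly into base R. The sketch below is only an illustration under stated assumptions: the helper names phi_pair and phi_rows are hypothetical, the result is a plain matrix rather than an object of class distance, and every pair of rows is assumed to satisfy x[i] + y[i] > 0 so that no term divides by zero.

```r
# Hypothetical helpers for illustration; not the cops implementation.
phi_pair <- function(x, y) {
  # Direct transcription of the formula given in the description.
  sqrt(sum((x - y)^2 / ((x + y) * (sum(x) + sum(y)))))
}

phi_rows <- function(X) {
  X <- as.matrix(X)
  n <- nrow(X)
  D <- matrix(0, n, n)  # the diagonal stays at zero
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      if (i < j) {
        # fill both triangles so the result is symmetric
        D[i, j] <- D[j, i] <- phi_pair(X[i, ], X[j, ])
      }
    }
  }
  D
}

X <- rbind(c(1, 3), c(3, 1), c(2, 2))  # toy count data
D <- phi_rows(X)
D[1, 2]  # 0.5
```

For rows 1 and 2 each of the two terms equals 4 / (4 * 8) = 0.125, so the sum is 0.25 and the phi-distance is 0.5.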
S3 plot method for p-cops objects
## S3 method for class 'pcops'
plot(x, plot.type, main, asp = 1, ...)
x | an object of class pcops |
plot.type | String indicating which type of plot to produce: one of "confplot", "reachplot", "resplot", "transplot", "Shepard", "stressplot", "bubbleplot", "histogram" (see plot.smacofP, plot.smacofB and plot.copsc). Note that not all plots may be available for all losses. |
main | the main title of the plot |
asp | aspect ratio of the x/y axes; defaults to 1, which gives an accurate representation of the fitted distances. |
... | Further plot arguments passed on: see 'plot.smacofP', 'plot.smacofB' and 'plot' for detailed information. |
See plot.smacofP
library(cops)
dis <- as.matrix(smacof::kinshipdelta)
resl <- pcops(dis, loss = "stress", lower = 0.1, upper = 5, minpts = 2)
# one call per available plot type
plot(resl, plot.type = "confplot")
plot(resl, plot.type = "reachplot")
plot(resl, plot.type = "Shepard")
plot(resl, plot.type = "transplot")
plot(resl, plot.type = "stressplot")
plot(resl, plot.type = "bubbleplot")
plot(resl, plot.type = "histogram")
procruster: a procrustes function
procruster(x)
x | numeric matrix |
a matrix
Secular Equation
secularEq(a, b)
a | matrix |
b | matrix |
Squared distances
sqdist(x)
x | numeric matrix |
squared distance matrix
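The help page does not say which metric sqdist uses. Assuming it means squared Euclidean distances between the rows of x, one common way to obtain such a matrix is from the Gram matrix G = x x', via d_ij^2 = g_ii + g_jj - 2 g_ij. The helper name sq_euclid below is hypothetical and the sketch is not the package's implementation.

```r
# Hypothetical helper (not part of cops): squared Euclidean distances
# between the rows of x, computed from the Gram matrix.
sq_euclid <- function(x) {
  x <- as.matrix(x)
  g <- tcrossprod(x)                            # Gram matrix x %*% t(x)
  d2 <- outer(diag(g), diag(g), "+") - 2 * g    # g_ii + g_jj - 2 * g_ij
  d2[d2 < 0] <- 0                               # clip negative round-off
  d2
}

x <- rbind(c(0, 0), c(3, 4), c(6, 8))
d2 <- sq_euclid(x)
d2[1, 3]  # 100
```

The result agrees with as.matrix(dist(x))^2 up to floating point error, which makes for a quick sanity check.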