Title: | Optimal Subset Selection for Transformation Models |
---|---|
Description: | Greedy optimal subset selection for transformation models (Hothorn et al., 2018, <doi:10.1111/sjos.12291>) based on the abess algorithm (Zhu et al., 2020, <doi:10.1073/pnas.2014241117>). Applicable to models from packages 'tram' and 'cotram'. |
Authors: | Lucas Kook [aut, cre], Sandra Siegfried [ctb], Torsten Hothorn [ctb] |
Maintainer: | Lucas Kook <[email protected]> |
License: | GPL-3 |
Version: | 0.0-6 |
Built: | 2025-01-20 13:32:23 UTC |
Source: | https://github.com/r-forge/ctm |
Optimal subset selection for multivariate transformation models
abess_mmlt( mltargs, supp, k_max = supp, thresh = NULL, init = TRUE, m_max = 10, m0 = NULL, ... )
mltargs |
Arguments passed to |
supp |
support size of the coefficient vector |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
m0 |
Transformation model for initialization |
... |
Currently ignored |
List containing the fitted model via mmlt, active set A and inactive set I.
Optimal subset selection for transformation models
abess_tram( formula, data, modFUN, supp, mandatory = NULL, k_max = supp, thresh = NULL, init = TRUE, m_max = 10, m0 = NULL, ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
modFUN |
function for fitting a transformation model, e.g., |
supp |
support size of the coefficient vector |
mandatory |
formula of mandatory covariates, which will always be included
and estimated in the model. Note that this also changes the initialization
of the active set. The active set is then computed with respect to the
model residuals of |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
m0 |
Transformation model for initialization |
... |
additional arguments supplied to |
List containing the fitted model via modFUN, active set A and inactive set I.
set.seed(24101968)
library(tramvs)
N <- 1e2
P <- 5
nz <- 3
beta <- rep(c(1, 0), c(nz, P - nz))
X <- matrix(rnorm(N * P), nrow = N, ncol = P)
Y <- 1 + X %*% beta + rnorm(N)
dat <- data.frame(y = Y, x = X)
abess_tram(y ~ ., dat, modFUN = Lm, supp = 3)
AIC "tramvs"
## S3 method for class 'tramvs'
AIC(object, ...)
object |
object of class |
... |
additional arguments to |
Numeric vector containing AIC of best model
Optimal subset selection in a BoxCox-type transformation model
BoxCoxVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Coef "abess_tram"
## S3 method for class 'abess_tram'
coef(object, ...)
object |
object of class |
... |
additional arguments to |
Named numeric vector containing coefficient estimates
see coef.tram
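A minimal sketch of how this method might be used: fit one model at a fixed support size with abess_tram() and extract its coefficients. The simulated data follow the pattern of the package example; attaching 'tram' explicitly for Lm() and the variable names n, p, fit and cf are assumptions made for illustration.

```r
library(tram)    # assumed to provide Lm()
library(tramvs)

set.seed(1)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
# Only the first three columns of X influence the response
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

# Single abess run with support size 3
fit <- abess_tram(y ~ ., dat, modFUN = Lm, supp = 3)

# Named numeric vector of coefficient estimates
cf <- coef(fit)
print(cf)
```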
Coef "mmltvs"
## S3 method for class 'mmltvs'
coef(object, best_only = FALSE, ...)
object |
Object of class |
best_only |
Whether to return the coefficients of the best model only (default: FALSE) |
... |
additional arguments to |
Vector (best_only = TRUE) or matrix (best_only = FALSE) of coefficients
Coef "tramvs"
## S3 method for class 'tramvs'
coef(object, best_only = FALSE, ...)
object |
Object of class |
best_only |
Whether to return the coefficients of the best model only (default: FALSE) |
... |
additional arguments to |
Vector (best_only = TRUE) or matrix (best_only = FALSE) of coefficients
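The two return shapes can be compared with a short sketch. It assumes simulated data in the style of the package example and that 'tram' supplies Lm(); the object names cf_path and cf_best are illustrative only.

```r
library(tram)    # assumed to provide Lm()
library(tramvs)

set.seed(2)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

# Fit the whole path of support sizes without the progress bar
res <- tramvs(y ~ ., data = dat, modFUN = Lm, verbose = FALSE)

cf_path <- coef(res)                    # matrix of coefficients along the path
cf_best <- coef(res, best_only = TRUE)  # coefficients of the best model only

print(dim(cf_path))
print(cf_best)
```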
Optimal subset selection in a Colr-type transformation model
ColrVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Compute correlation for initializing the active set
cor_init(m0, mb)
m0 |
|
mb |
|
Vector of correlations for initializing the active set; depends on the type of model (see, e.g., cor_init.default)
Default method for computing correlation
## Default S3 method:
cor_init(m0, mb)
m0 |
|
mb |
|
Vector of correlations for initializing the active set
Method for computing correlations in mmlts
## S3 method for class 'mmlt'
cor_init(m0, mb)
m0 |
|
mb |
|
Vector of correlations for initializing the active set
Shift-scale tram method for computing correlation
## S3 method for class 'stram'
cor_init(m0, mb)
m0 |
|
mb |
|
Vector of correlations for initializing the active set, includes both shift and scale residuals
Optimal subset selection in a cotram model
cotramVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Optimal subset selection in a Coxph-type transformation model
CoxphVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Optimal subset selection in a Lehmann-type transformation model
LehmannVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Optimal subset selection in an Lm-type transformation model
LmVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
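As a hedged illustration of the model-specific wrappers, the sketch below calls LmVS() on simulated data. It assumes that LmVS() behaves like tramvs() with modFUN = Lm, as the reference to tramvs for the return value suggests; the data-generating code and object names are illustrative.

```r
library(tram)
library(tramvs)

set.seed(3)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

# Restrict the search to support sizes up to 3
res <- LmVS(y ~ ., data = dat, supp_max = 3)
print(res)
```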
logLik "tramvs"
## S3 method for class 'tramvs'
logLik(object, ...)
object |
object of class |
... |
additional arguments to |
Numeric vector containing log-likelihood of best model,
see logLik.tram
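A short sketch of extracting likelihood-based summaries from a fitted path, using the documented logLik() and AIC() methods for "tramvs" objects. The simulated data and the names ll and aic are assumptions for illustration.

```r
library(tram)    # assumed to provide Lm()
library(tramvs)

set.seed(4)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

res <- tramvs(y ~ ., data = dat, modFUN = Lm, verbose = FALSE)

ll <- logLik(res)  # log-likelihood of the best model
aic <- AIC(res)    # AIC of the best model
print(ll)
print(aic)
```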
Select optimal subset based on high dimensional BIC in mmlts
mmltVS( mltargs, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, verbose = TRUE, parallel = FALSE, m0 = NULL, future_args = list(strategy = "multisession", workers = supp_max), ... )
mltargs |
Arguments passed to |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
verbose |
show progress bar (default: |
parallel |
toggle for parallel computing via
|
m0 |
Transformation model for initialization |
future_args |
arguments passed to |
... |
Arguments passed on to
|
L0-penalized (i.e., best subset selection) multivariate transformation models using the abess algorithm.
object of class "mmltvs", containing the regularization path (information criterion SIC and coefficients coefs), the best fit (best_fit) and all other models (all_fits)
Plot "tramvs" object
## S3 method for class 'tramvs'
plot(x, which = c("tune", "path"), ...)
x |
object of class |
which |
plotting either the regularization path ( |
... |
additional arguments to |
Returns invisible(NULL)
Optimal subset selection in a Polr-type transformation model
PolrVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Predict "tramvs"
## S3 method for class 'tramvs'
predict(object, ...)
object |
object of class |
... |
additional arguments to |
See predict.tram
Print "tramvs"
## S3 method for class 'tramvs'
print(x, ...)
x |
object of class |
... |
ignored |
"tramvs" object is returned invisibly
Residuals "tramvs"
## S3 method for class 'tramvs'
residuals(object, ...)
object |
object of class |
... |
additional arguments to |
Numeric vector containing residuals of best model,
see residuals.tram
SIC generic
SIC(object, ...)
object |
Model to compute SIC from |
... |
for methods compatibility only |
Numeric vector (best_only = TRUE) or data.frame with SIC values
SIC "tramvs"
## S3 method for class 'tramvs'
SIC(object, best_only = FALSE, ...)
object |
object of class |
best_only |
Whether to return the SIC of the best model only (default: FALSE) |
... |
for methods compatibility only |
Numeric vector (best_only = TRUE) or data.frame with SIC values
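The following sketch contrasts the two return types of this method on a simulated example. The data-generating code and the names sic_all and sic_best are illustrative assumptions; only SIC() and its best_only argument come from the documentation above.

```r
library(tram)    # assumed to provide Lm()
library(tramvs)

set.seed(5)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

res <- tramvs(y ~ ., data = dat, modFUN = Lm, verbose = FALSE)

sic_all <- SIC(res)                     # data.frame with SIC values along the path
sic_best <- SIC(res, best_only = TRUE)  # SIC of the best model only
print(sic_all)
print(sic_best)
```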
Simulate "tramvs"
## S3 method for class 'tramvs'
simulate(object, nsim = 1, seed = NULL, ...)
object |
object of class |
nsim |
number of simulations |
seed |
random seed for simulation |
... |
additional arguments to |
See simulate.mlt
Summary "tramvs"
## S3 method for class 'tramvs'
summary(object, ...)
object |
object of class |
... |
ignored |
"tramvs" object is returned invisibly
Support "tramvs"
## S3 method for class 'tramvs'
support(object, ...)
object |
object of class |
... |
ignored |
Character vector containing active set of best fit
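A minimal sketch of reading off the selected covariates, assuming simulated data in the style of the package example. Attaching 'tram' explicitly is an assumption made so that Lm() and the support() generic are available on the search path; the name act is illustrative.

```r
library(tram)
library(tramvs)

set.seed(6)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

res <- tramvs(y ~ ., data = dat, modFUN = Lm, verbose = FALSE)

# Names of the covariates in the active set of the best fit
act <- support(res)
print(act)
```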
Optimal subset selection in a Survreg model
SurvregVS( formula, data, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Additional arguments supplied to |
See tramvs
Select optimal subset based on high dimensional BIC
tramvs( formula, data, modFUN, mandatory = NULL, supp_max = NULL, k_max = NULL, thresh = NULL, init = TRUE, m_max = 10, m0 = NULL, verbose = TRUE, parallel = FALSE, future_args = list(strategy = "multisession", workers = supp_max), ... )
formula |
object of class |
data |
data frame containing the variables in the model. |
modFUN |
function for fitting a transformation model, e.g., |
mandatory |
formula of mandatory covariates, which will always be included
and estimated in the model. Note that this also changes the initialization
of the active set. The active set is then computed with respect to the
model residuals of |
supp_max |
maximum support which to call |
k_max |
maximum support size to consider during the splicing algorithm.
Defaults to |
thresh |
threshold when to stop splicing. Defaults to
0.01 * |
init |
initialize active set. Defaults to |
m_max |
maximum number of iterations of the splicing algorithm. |
m0 |
Transformation model for initialization |
verbose |
show progress bar (default: |
parallel |
toggle for parallel computing via
|
future_args |
arguments passed to |
... |
Arguments passed on to
|
L0-penalized (i.e., best subset selection) transformation models using the abess algorithm.
object of class "tramvs", containing the regularization path (information criterion SIC and coefficients coefs), the best fit (best_fit) and all other models (all_fits)
set.seed(24101968)
library("tramvs")
N <- 1e2
P <- 5
nz <- 3
beta <- rep(c(1, 0), c(nz, P - nz))
X <- matrix(rnorm(N * P), nrow = N, ncol = P)
Y <- 1 + X %*% beta + rnorm(N)
dat <- data.frame(y = Y, x = X)
res <- tramvs(y ~ ., data = dat, modFUN = Lm)
plot(res, type = "b")
plot(res, which = "path")
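To complement the plotting example, the sketch below limits the search with supp_max and inspects the components of the returned object. It assumes that the list elements carry the names given in the return value description (SIC, coefs, best_fit, all_fits); the simulated data and object names are illustrative.

```r
library(tram)    # assumed to provide Lm()
library(tramvs)

set.seed(7)
n <- 100
p <- 5
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- as.numeric(1 + X[, 1:3] %*% rep(1, 3) + rnorm(n))
dat <- data.frame(y = y, x = X)

# Consider support sizes up to 3 only and suppress the progress bar
res <- tramvs(y ~ ., data = dat, modFUN = Lm, supp_max = 3, verbose = FALSE)

# Components of the returned object (names assumed from the documentation)
print(names(res))
print(res$SIC)
```

Restricting supp_max can shorten the run when only small models are of interest, since fewer support sizes need to be fitted.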