Title: | Constrained B-Splines (Sparse Matrix Based) |
---|---|
Description: | Qualitatively Constrained (Regression) Smoothing Splines via Linear Programming and Sparse Matrices. |
Authors: | Pin T. Ng [aut], Martin Maechler [aut, cre] |
Maintainer: | Martin Maechler <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.3-9 |
Built: | 2024-11-08 19:15:23 UTC |
Source: | https://github.com/r-forge/curves-etc |
Computes constrained quantile curves using linear or
quadratic splines. The median spline ( loss) is a robust
(constrained) smoother.
cobs(x, y, constraint = c("none", "increase", "decrease", "convex", "concave", "periodic"), w = rep(1,n), knots, nknots = if(lambda == 0) 6 else 20, method = "quantile", degree = 2, tau = 0.5, lambda = 0, ic = c("AIC", "SIC", "BIC", "aic", "sic", "bic"), knots.add = FALSE, repeat.delete.add = FALSE, pointwise = NULL, keep.data = TRUE, keep.x.ps = TRUE, print.warn = TRUE, print.mesg = TRUE, trace = print.mesg, lambdaSet = exp(seq(log(lambda.lo), log(lambda.hi), length.out = lambda.length)), lambda.lo = f.lambda*1e-4, lambda.hi = f.lambda*1e3, lambda.length = 25, maxiter = 100, rq.tol = 1e-8, toler.kn = 1e-6, tol.0res = 1e-6, nk.start = 2)
cobs(x, y, constraint = c("none", "increase", "decrease", "convex", "concave", "periodic"), w = rep(1,n), knots, nknots = if(lambda == 0) 6 else 20, method = "quantile", degree = 2, tau = 0.5, lambda = 0, ic = c("AIC", "SIC", "BIC", "aic", "sic", "bic"), knots.add = FALSE, repeat.delete.add = FALSE, pointwise = NULL, keep.data = TRUE, keep.x.ps = TRUE, print.warn = TRUE, print.mesg = TRUE, trace = print.mesg, lambdaSet = exp(seq(log(lambda.lo), log(lambda.hi), length.out = lambda.length)), lambda.lo = f.lambda*1e-4, lambda.hi = f.lambda*1e3, lambda.length = 25, maxiter = 100, rq.tol = 1e-8, toler.kn = 1e-6, tol.0res = 1e-6, nk.start = 2)
x |
vector of covariate; missing values are omitted. |
y |
vector of response variable. It must have the same length as
|
constraint |
character (string) specifying the kind of
constraint; must be one of the values in the default list above;
may be abbreviated. More flexible constraints can be specified via
the |
w |
vector of weights the same length as |
knots |
vector of locations of the knot mesh; if missing,
|
nknots |
maximum number of knots; defaults to 6 for regression B-splines, 20 for smoothing B-splines. |
method |
character string specifying the method for generating
|
degree |
degree of the splines; 1 for linear spline (equivalent
to |
tau |
desired quantile level; defaults to 0.5 (median). |
lambda |
penalty parameter |
ic |
string indicating the information criterion used in knot
deletion and addition for the regression B-spline method, i.e., when
Note that the default was |
knots.add |
logical indicating if an additional step of stepwise knot addition should be performed for regression B-splines. |
repeat.delete.add |
logical indicating if an additional step of stepwise knot deletion should be performed for regression B-splines. |
pointwise |
an optional three-column matrix with each row specifies one of the following constraints:
|
keep.data |
logical indicating if the |
keep.x.ps |
logical indicating if the pseudo design matrix
|
print.warn |
flag for printing of interactive warning messages;
true by default; set to |
print.mesg |
logical flag or integer for printing of intermediate messages; true
by default. Probably needs to be set to |
trace |
integer |
lambdaSet |
numeric vector of lambda values to use for grid search;
in that case, defaults to a geometric sequence (a “grid in
log scale”) from |
lambda.lo , lambda.hi
|
lower and upper bound of the grid search
for lambda (when |
lambda.length |
number of grid points in the grid search for optimal lambda. |
maxiter |
upper bound of the number of iterations; defaults to 100. |
rq.tol |
numeric convergence tolerance for the interior point
algorithm called from |
toler.kn |
numeric tolerance for shifting the boundary knots
outside; defaults to |
tol.0res |
|
nk.start |
number of starting knots used in automatic knot selection. Defaults to the minimum of 2 knots. |
cobs()
computes the constraint quantile smoothing B-spline with
penalty when lambda is not zero.
If lambda < 0, an optimal lambda will be chosen using Schwarz type
information criterion.
If lambda > 0, the supplied lambda will be used.
If lambda = 0, cobs computes the constraint quantile regression B-spline
with no penalty using the provided knots or those selected by Akaike or
Schwarz information criterion.
an object of class cobs
, a list with components
call |
the |
tau , degree
|
same as input |
constraint |
as input (but no more abbreviated). |
pointwise |
as input. |
coef |
B-spline coefficients. |
knots |
the final set of knots used in the computation. |
ifl |
exit code := |
icyc |
length 2: number of cycles taken to achieve convergence for final lambda, and total number of cycles for all lambdas. |
k |
the effective dimensionality of the final fit. |
k0 |
(usually the same) |
x.ps |
the pseudo design matrix |
resid |
vector of residuals from the fit. |
fitted |
vector of fitted values from the fit. |
SSy |
the sum of squares around centered |
lambda |
the penalty parameter used in the final fit. |
pp.lambda |
vector of all lambdas used for
lambda search when |
pp.sic |
vector of Schwarz information criteria evaluated at
|
Ng, P. and Maechler, M. (2007) A Fast and Efficient Implementation of Qualitatively Constrained Quantile Smoothing Splines, Statistical Modelling 7(4), 315-328.
Koenker, R. and Ng, P. (2005) Inequality Constrained Quantile Regression, Sankhya, The Indian Journal of Statistics 67, 418–440.
He, X. and Ng, P. (1999) COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.
Koenker, R. and Ng, P. (1996) A Remark on Bartels and Conn's Linearly Constrained L1 Algorithm, ACM Transaction on Mathematical Software 22, 493–495.
Ng, P. (1996) An Algorithm for Quantile Smoothing Splines, Computational Statistics & Data Analysis 22, 99–118.
Bartels, R. and Conn A. (1980)
Linearly Constrained Discrete Problems,
ACM Transaction on Mathematical Software 6, 594–608.
A postscript version of the paper that describes the details of COBS can be downloaded from https://www2.nau.edu/PinNg/cobs.html.
smooth.spline
for unconstrained smoothing
splines; bs
for unconstrained (regression)
B-splines.
x <- seq(-1,3,,150) y <- (f.true <- pnorm(2*x)) + rnorm(150)/10 ## specify pointwise constraints (boundary conditions) con <- rbind(c( 1,min(x),0), # f(min(x)) >= 0 c(-1,max(x),1), # f(max(x)) <= 1 c(0, 0, 0.5))# f(0) = 0.5 ## obtain the median REGRESSION B-spline using automatically selected knots Rbs <- cobs(x,y, constraint= "increase", pointwise = con) Rbs plot(Rbs, lwd = 2.5) lines(spline(x, f.true), col = "gray40") lines(predict(cobs(x,y)), col = "blue") mtext("cobs(x,y) # completely unconstrained", 3, col= "blue") ## compute the median SMOOTHING B-spline using automatically chosen lambda Sbs <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1) summary(Sbs) plot(Sbs) ## by default includes SIC ~ lambda Sb1 <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1, degree = 1) summary(Sb1) plot(Sb1) plot(Sb1, which = 2) # only the data + smooth rug(Sb1$knots, col = 4, lwd = 1.6)# (too many knots) xx <- seq(min(x) - .2, max(x)+ .2, len = 201) pxx <- predict(Sb1, xx, interval = "both") lines(pxx, col = 2) mtext(" + pointwise and simultaneous 95% - confidence intervals") matlines(pxx[,1], pxx[,-(1:2)], col= rep(c("green3","blue"), c(2,2)), lty=2)
x <- seq(-1,3,,150) y <- (f.true <- pnorm(2*x)) + rnorm(150)/10 ## specify pointwise constraints (boundary conditions) con <- rbind(c( 1,min(x),0), # f(min(x)) >= 0 c(-1,max(x),1), # f(max(x)) <= 1 c(0, 0, 0.5))# f(0) = 0.5 ## obtain the median REGRESSION B-spline using automatically selected knots Rbs <- cobs(x,y, constraint= "increase", pointwise = con) Rbs plot(Rbs, lwd = 2.5) lines(spline(x, f.true), col = "gray40") lines(predict(cobs(x,y)), col = "blue") mtext("cobs(x,y) # completely unconstrained", 3, col= "blue") ## compute the median SMOOTHING B-spline using automatically chosen lambda Sbs <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1) summary(Sbs) plot(Sbs) ## by default includes SIC ~ lambda Sb1 <- cobs(x,y, constraint="increase", pointwise= con, lambda= -1, degree = 1) summary(Sb1) plot(Sb1) plot(Sb1, which = 2) # only the data + smooth rug(Sb1$knots, col = 4, lwd = 1.6)# (too many knots) xx <- seq(min(x) - .2, max(x)+ .2, len = 201) pxx <- predict(Sb1, xx, interval = "both") lines(pxx, col = 2) mtext(" + pointwise and simultaneous 95% - confidence intervals") matlines(pxx[,1], pxx[,-(1:2)], col= rep(c("green3","blue"), c(2,2)), lty=2)
Print, summary and other methods for cobs
objects.
## S3 method for class 'cobs' print(x, digits = getOption("digits"), ...) ## S3 method for class 'cobs' summary(object, digits = getOption("digits"), ...) ## S3 method for class 'cobs' coef(object, ...) ## S3 method for class 'cobs' fitted(object, ...) ## S3 method for class 'cobs' knots(Fn, ...) ## S3 method for class 'cobs' residuals(object, ...)
## S3 method for class 'cobs' print(x, digits = getOption("digits"), ...) ## S3 method for class 'cobs' summary(object, digits = getOption("digits"), ...) ## S3 method for class 'cobs' coef(object, ...) ## S3 method for class 'cobs' fitted(object, ...) ## S3 method for class 'cobs' knots(Fn, ...) ## S3 method for class 'cobs' residuals(object, ...)
x , object , Fn
|
object of class |
digits |
number of digits to use for printing. |
... |
further arguments passed from and to methods. |
These are methods for fitted COBS objects, as computed by
cobs
.
print.cobs()
returns its argument invisibly.
The coef()
, fitted()
, knots()
, and
residuals()
methods return a numeric vector.
Martin Maechler
predict.cobs
for the predict
method,
plot.cobs
for the plot
method,
and cobs
for examples.
example(cobs) Sbs # uses print.* summary(Sbs) coef(Sbs) knots(Sbs)
example(cobs) Sbs # uses print.* summary(Sbs) coef(Sbs) knots(Sbs)
Compute a univariate concave or convex regression, i.e.,
for given vectors, in
, where
has to be
strictly sorted (
), compute an
-vector
minimizing the weighted sum of squares
under the constraints
for and
,
for concavity.
For convexity (
convex=TRUE
), replace by
and
by
.
conreg(x, y = NULL, w = NULL, convex = FALSE, method = c("Duembgen06_R", "SR"), tol = c(1e-10, 1e-7), maxit = c(500, 20), adjTol = TRUE, verbose = FALSE)
conreg(x, y = NULL, w = NULL, convex = FALSE, method = c("Duembgen06_R", "SR"), tol = c(1e-10, 1e-7), maxit = c(500, 20), adjTol = TRUE, verbose = FALSE)
x , y
|
numeric vectors giving the values of the predictor and
response variable. Alternatively a single “plotting”
structure (two-column matrix / y-values only / list, etc) can be
specified: see |
w |
for |
convex |
logical indicating if convex or concave regression is desired. |
method |
a character string indicating the method used,
|
tol |
convergence tolerance(s); do not make this too small! |
maxit |
maximal number of (outer and inner) iterations of knot selection. |
adjTol |
(for |
verbose |
logical or integer indicating if (and how much) knot placement and fitting iterations should be “reported”. |
Both algorithms need some numerical tolerances because of rounding
errors in computation of finite difference ratios.
The active-set "Duembgen06_R"
method notably has two different
such tolerances which were both 1e-7
up to March
2016.
The two default tolerances (and the exact convergence checks) may change in the future, possibly to become more adaptive.
an object of class conreg
which is basically a list with components
x |
sorted (and possibly aggregated) abscissa values |
y |
corresponding y values. |
w |
corresponding weights, only for |
yf |
corresponding fitted values. |
convex |
logical indicating if a convex or a concave fit has been computed. |
iKnots |
integer vector giving indices of the knots,
i.e. locations where the fitted curve has kinks.
Formally, these are exactly those indices where the constraint is
fulfilled strictly, i.e., those
|
call |
the |
iter |
integer (vector of length one or two) with the number of
iterations used (in the outer and inner loop for |
Note that there are several methods defined for conreg
objects,
see predict.conreg
or methods(class = "conreg")
.
Notably print
and plot
; also
predict
, residuals
, fitted
,
knots
.
Also, interpSplineCon()
to construct a more smooth
(cubic) spline, and isIsplineCon()
which checks
if the int is strictly concave or convex the same as the
conreg()
result from which it was constructed.
Lutz Duembgen programmed the original Matlab code in July 2006; Martin Maechler ported it to R, tested, catch infinite loops, added more options, improved tolerance, etc; from April 27, 2007.
isoreg
for isotone (monotone) regression;
CRAN packages ftnonpar, cobs, logcondens.
## Generated data : N <- 100 f <- function(X) 4*X*(1 - X) xx <- seq(0,1, length=501)# for plotting true f() set.seed(1)# -> conreg does not give convex cubic x <- sort(runif(N)) y <- f(x) + 0.2 * rnorm(N) plot(x,y, cex = 0.6) lines(xx, f(xx), col = "blue", lty=2) rc <- conreg(x,y) lines(rc, col = 2, force.iSpl = TRUE) # 'force.iSpl': force the drawing of the cubic spline through the kinks title("Concave Regression in R") y2 <- y ## Trivial cases work too: (r.1 <- conreg(1,7)) (r.2 <- conreg(1:2,7:6)) (r.3 <- conreg(1:3,c(4:5,1)))
## Generated data : N <- 100 f <- function(X) 4*X*(1 - X) xx <- seq(0,1, length=501)# for plotting true f() set.seed(1)# -> conreg does not give convex cubic x <- sort(runif(N)) y <- f(x) + 0.2 * rnorm(N) plot(x,y, cex = 0.6) lines(xx, f(xx), col = "blue", lty=2) rc <- conreg(x,y) lines(rc, col = 2, force.iSpl = TRUE) # 'force.iSpl': force the drawing of the cubic spline through the kinks title("Concave Regression in R") y2 <- y ## Trivial cases work too: (r.1 <- conreg(1,7)) (r.2 <- conreg(1:2,7:6)) (r.3 <- conreg(1:3,c(4:5,1)))
Methods for conreg
objects
## S3 method for class 'conreg' fitted(object, ...) ## S3 method for class 'conreg' residuals(object, ...) ## S3 method for class 'conreg' knots(Fn, ...) ## S3 method for class 'conreg' lines(x, type = "l", col = 2, lwd = 1.5, show.knots = TRUE, add.iSpline = TRUE, force.iSpl = FALSE, ...) ## S3 method for class 'conreg' plot(x, type = "l", col = 2, lwd = 1.5, show.knots = TRUE, add.iSpline = TRUE, force.iSpl = FALSE, xlab = "x", ylab = expression(s[c](x)), sub = "simple concave regression", col.sub = col, ...) ## S3 method for class 'conreg' predict(object, x, deriv = 0, ...) ## S3 method for class 'conreg' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'conreg' fitted(object, ...) ## S3 method for class 'conreg' residuals(object, ...) ## S3 method for class 'conreg' knots(Fn, ...) ## S3 method for class 'conreg' lines(x, type = "l", col = 2, lwd = 1.5, show.knots = TRUE, add.iSpline = TRUE, force.iSpl = FALSE, ...) ## S3 method for class 'conreg' plot(x, type = "l", col = 2, lwd = 1.5, show.knots = TRUE, add.iSpline = TRUE, force.iSpl = FALSE, xlab = "x", ylab = expression(s[c](x)), sub = "simple concave regression", col.sub = col, ...) ## S3 method for class 'conreg' predict(object, x, deriv = 0, ...) ## S3 method for class 'conreg' print(x, digits = max(3, getOption("digits") - 3), ...)
object , Fn , x
|
an R object of class |
type , col , lwd , xlab , ylab , sub , col.sub
|
plotting arguments as in
|
show.knots |
logical indicating the spline knots should be marked additionally. |
add.iSpline |
logical indicating if an interpolation
spline should be considered for plotting. This is only used when it
is itself concave/convex, unless |
force.iSpl |
logical indicating if an interpolating spline is drawn even when it is not convex/concave. |
deriv |
for |
digits |
number of significant digits for printing. |
... |
further arguments, potentially passed to methods. |
Martin Maechler
conreg
, ....
example(conreg, echo = FALSE) class(rc) # "conreg" rc # calls the print method knots(rc) plot(rc) ## and now _force_ the not-quite-concave cubic spline : plot(rc, force.iSpl=TRUE) xx <- seq(-0.1, 1.1, length=201) # slightly extrapolate ## Get function s(x) and first derivative s'(x) : yx <- predict(rc, xx) y1 <- predict(rc, xx, deriv = 1) op <- par(las=1) plot(xx, yx, type = "l", main="plot(xx, predict( conreg(.), xx))") par(new=TRUE) # draw the first derivative "on top" plot(xx, y1, type = "l", col = "blue", axes = FALSE, ann = FALSE) abline(h = 0, lty="1A", col="blue") axis(4, col="blue", col.axis="blue", col.ticks="blue") mtext("first derivative s'(.)", col="blue") par(op)
example(conreg, echo = FALSE) class(rc) # "conreg" rc # calls the print method knots(rc) plot(rc) ## and now _force_ the not-quite-concave cubic spline : plot(rc, force.iSpl=TRUE) xx <- seq(-0.1, 1.1, length=201) # slightly extrapolate ## Get function s(x) and first derivative s'(x) : yx <- predict(rc, xx) y1 <- predict(rc, xx, deriv = 1) op <- par(las=1) plot(xx, yx, type = "l", main="plot(xx, predict( conreg(.), xx))") par(new=TRUE) # draw the first derivative "on top" plot(xx, y1, type = "l", col = "blue", axes = FALSE, ann = FALSE) abline(h = 0, lty="1A", col="blue") axis(4, col="blue", col.axis="blue", col.ticks="blue") mtext("first derivative s'(.)", col="blue") par(op)
Estimate the B-spline coefficients for a regression quantile smoothing spline with optional constraints, using Ng(1996)'s algorithm.
drqssbc2(x, y, w = rep.int(1,n), pw, knots, degree, Tlambda, constraint, ptConstr, maxiter = 100, trace = 0, nrq = length(x), nl1, neqc, niqc, nvar, tau = 0.5, select.lambda, give.pseudo.x = FALSE, rq.tol = 1e-8 * sc.y, tol.0res = 1e-6, print.warn = TRUE, rq.print.warn = FALSE)
drqssbc2(x, y, w = rep.int(1,n), pw, knots, degree, Tlambda, constraint, ptConstr, maxiter = 100, trace = 0, nrq = length(x), nl1, neqc, niqc, nvar, tau = 0.5, select.lambda, give.pseudo.x = FALSE, rq.tol = 1e-8 * sc.y, tol.0res = 1e-6, print.warn = TRUE, rq.print.warn = FALSE)
x |
numeric vector, sorted increasingly, the abscissa values |
y |
numeric, same length as |
w |
numeric vector of weights, same length as |
pw |
penalty weights vector passed to |
knots |
numeric vector of knots for the splines. |
degree |
integer, must be 1 or 2. |
Tlambda |
vector of smoothing parameter values |
constraint |
see |
ptConstr |
|
maxiter |
maximal number of iterations; defaults to 100. |
trace |
integer or logical indicating the tracing level of the underlying algorithms; not much implemented (due to lack of trace in quantreg ...) |
nrq |
integer, |
nl1 |
integer, number of observations in the l1 norm that correspond to roughness measure (may be zero). |
neqc |
integer giving the number of equations. |
niqc |
integer giving the number of inequality
constraints; of the same length as |
nvar |
integer giving the number of equations and constraints. |
tau |
desired quantile level; defaults to 0.5 (median). |
select.lambda |
logical indicating if an optimal lambda should be
selected from the vector of |
give.pseudo.x |
logical indicating if the pseudo design matrix
|
rq.tol |
numeric convergence tolerance for the interior point
algorithm called from |
tol.0res |
tolerance used to check for zero residuals, i.e.,
|
print.warn |
logical indicating if warnings should be printed, when the algorithm seems to have behaved somewhat unexpectedly. |
rq.print.warn |
logical indicating if warnings should be printed
from inside the |
This is an auxiliary function for cobs
, possibly
interesting on its own. Depending on degree
, either
l1.design2
or loo.design2
are
called for construction of the sparse design matrix.
Subsequently, either rq.fit.sfnc
or
rq.fit.sfn
is called as the main “work horse”.
This documentation is currently sparse; read the source code!
a list with components
comp1 |
Description of ‘comp1’ |
comp2 |
Description of ‘comp2’ |
...
Pin Ng; this help page: Martin Maechler.
Ng, P. (1996) An Algorithm for Quantile Smoothing Splines, Computational Statistics & Data Analysis 22, 99–118.
The main function cobs
and its auxiliary
qbsks2
which calls drqssbc2()
repeatedly.
l1.design2
and loo.design2
;
further rq.fit.sfnc
and
rq.fit.sfn
from package quantreg.
set.seed(1243) x <- 1:32 fx <- (x-5)*(x-15)^2*(x-21) y <- fx + round(rnorm(x,s = 0.25),2)
set.seed(1243) x <- 1:32 fx <- (x-5)*(x-15)^2*(x-21) y <- fx + round(rnorm(x,s = 0.25),2)
The DublinWind
data frame is basically the time series of daily
average wind speeds from 1961 to 1978, measured in Dublin, Ireland.
These are 6574 observations (18 full years among which four leap years).
data(DublinWind)
data(DublinWind)
This data frame contains the following columns:
numeric vector of average daily wind speed in knots
an integer vector giving the day number of the year, i.e., one of 1:366.
The periodic pattern along the 18 years measured and the autocorrelation are to be taken into account for analysis, see the references. This is Example 3 of the COBS paper.
From shar file available from https://www2.nau.edu/PinNg/cobs.html
Has also been available from Statlib; then, with more variables, e.g., in
help(wind, package = "gstat")
from CRAN package gstat.
Haslett, J. and Raftery, A. (1989) Space-Time Modelling with Long-Memory Dependence: Assessing Ireland's Wind Power Resource (with Discussion: 22-50). Applied Statistics 38, 1–50. doi:10.2307/2347679
COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.
He, X. and Ng, P. (1999) COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.
data(DublinWind) str(DublinWind) plot(speed ~ day, data = DublinWind)# not so nice; want time scale ## transform 'day' to correct "Date" object; and then plot Dday <- seq(from = as.Date("1961-01-01"), by = 1, length = nrow(DublinWind)) plot(speed ~ Dday, data = DublinWind, type = "l", main = paste("DublinWind speed daily data, n=", nrow(DublinWind))) ##--- ~ He & Ng "Example 3" %% much more is in ../tests/wind.R co.o50 <- with(DublinWind, ## use nknots > (default) 6 : cobs(day, speed, knots.add = TRUE, constraint= "periodic", nknots = 10, tau = .5, method = "uniform")) summary(co.o50) lines(Dday, fitted(co.o50), col=2, lwd = 2) ## the periodic "smooth" function - in one period plot(predict(co.o50), type = "o", cex = 0.2, col=2, xlab = "day", ylim = c(0,20)) points(speed ~ day, data = DublinWind, pch = ".")
data(DublinWind) str(DublinWind) plot(speed ~ day, data = DublinWind)# not so nice; want time scale ## transform 'day' to correct "Date" object; and then plot Dday <- seq(from = as.Date("1961-01-01"), by = 1, length = nrow(DublinWind)) plot(speed ~ Dday, data = DublinWind, type = "l", main = paste("DublinWind speed daily data, n=", nrow(DublinWind))) ##--- ~ He & Ng "Example 3" %% much more is in ../tests/wind.R co.o50 <- with(DublinWind, ## use nknots > (default) 6 : cobs(day, speed, knots.add = TRUE, constraint= "periodic", nknots = 10, tau = .5, method = "uniform")) summary(co.o50) lines(Dday, fitted(co.o50), col=2, lwd = 2) ## the periodic "smooth" function - in one period plot(predict(co.o50), type = "o", cex = 0.2, col=2, xlab = "day", ylim = c(0,20)) points(speed ~ day, data = DublinWind, pch = ".")
The exHe
data frame has 10 rows and 2 columns. It is an
example for which smooth.spline
cannot be used.
data(exHe)
data(exHe)
This data frame contains the following columns:
only values 0, 1, and 2.
10 randomly generated values
Xuming He wrote about this
JUST FOR FUN:
I was testing COBS using the following "data". For comparison, I tried
smooth.spline in S+. I never got an answer back! No warning messages either.
The point is that even the well-tested algorithm like
smooth.spline
could leave you puzzled.
To tell you the truth, the response values here were generated by white noise. An ideal fitted curve would be a flat line. See for yourself what COBS would do in this case.
Originally found at the bottom of
http://ux6.cso.uiuc.edu/~x-he/ftp.html
, the web resource
directory of Xuming He at the time, say 2006.
data(exHe) plot(exHe, main = "He's 10 point example and cobs() fits") tm <- tapply(exHe$y, exHe$x, mean) lines(unique(exHe$x), tm, lty = 2) cH. <- with(exHe, cobs(x, y, degree=1, constraint = "increase")) cH <- with(exHe, cobs(x, y, lambda=0.2, degree=1, constraint = "increase")) plot(exHe) lines(predict(cH.), type = "o", col="tomato3", pch = "i")# constant lines(predict(cH), type = "o", col=2, pch = "i") cHn <- cobs(exHe$x, exHe$y, degree=1, constraint = "none") lines(predict(cHn), col= 3, type = "o", pch = "n") cHd <- cobs(exHe$x, exHe$y, degree=1, constraint = "decrease") lines(predict(cHd), col= 4, type = "o", pch = "d")
data(exHe) plot(exHe, main = "He's 10 point example and cobs() fits") tm <- tapply(exHe$y, exHe$x, mean) lines(unique(exHe$x), tm, lty = 2) cH. <- with(exHe, cobs(x, y, degree=1, constraint = "increase")) cH <- with(exHe, cobs(x, y, lambda=0.2, degree=1, constraint = "increase")) plot(exHe) lines(predict(cH.), type = "o", col="tomato3", pch = "i")# constant lines(predict(cH), type = "o", col=2, pch = "i") cHn <- cobs(exHe$x, exHe$y, degree=1, constraint = "none") lines(predict(cHn), col= 3, type = "o", pch = "n") cHd <- cobs(exHe$x, exHe$y, degree=1, constraint = "decrease") lines(predict(cHd), col= 4, type = "o", pch = "d")
Time Series of length 113 of annual average global surface temperature deviations from 1880 to 1992.
data(globtemp)
data(globtemp)
This is Example 1 of the COBS paper, where the hypothesis of a monotonely increasing trend is considered; Koenker and Schorfheide (1994) consider modeling the autocorrelations.
‘temp.data’ in file ‘cobs.shar’ available from https://www2.nau.edu/PinNg/cobs.html
He, X. and Ng, P. (1999) COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.
Koenker, R. and Schorfheide F. (1994) Quantile Spline Models for Global Temperature Change; Climate Change 28, 395–404.
data(globtemp) plot(globtemp, main = "Annual Global Temperature Deviations") str(globtemp) ## forget about time-series, just use numeric vectors: year <- as.vector(time(globtemp)) temp <- as.vector(globtemp) ##---- Code for Figure 1a of He and Ng (1999) ---------- a50 <- cobs(year, temp, knots.add = TRUE, degree = 1, constraint = "increase") summary(a50) ## As suggested in the warning message, we increase the number of knots to 9 a50 <- cobs(year, temp, nknots = 9, knots.add = TRUE, degree = 1, constraint = "increase") summary(a50) ## Here, we use the same knots sequence chosen for the 50th percentile a10 <- cobs(year, temp, nknots = length(a50$knots), knots = a50$knot, degree = 1, tau = 0.1, constraint = "increase") summary(a10) a90 <- cobs(year, temp, nknots = length(a50$knots), knots = a50$knot, degree = 1, tau = 0.9, constraint = "increase") summary(a90) which(hot.idx <- temp >= a90$fit) which(cold.idx <- temp <= a10$fit) normal.idx <- !hot.idx & !cold.idx plot(year, temp, type = "n", ylab = "Temperature (C)", ylim = c(-.7,.6)) lines(predict(a50, year, interval = "both"), col = 2) lines(predict(a10, year, interval = "both"), col = 3) lines(predict(a90, year, interval = "both"), col = 3) points(year, temp, pch = c(1,3)[2 - normal.idx]) ## label the "hot" and "cold" days text(year[hot.idx], temp[hot.idx] + .03, labels = year[hot.idx]) text(year[cold.idx],temp[cold.idx]- .03, labels = year[cold.idx])
data(globtemp) plot(globtemp, main = "Annual Global Temperature Deviations") str(globtemp) ## forget about time-series, just use numeric vectors: year <- as.vector(time(globtemp)) temp <- as.vector(globtemp) ##---- Code for Figure 1a of He and Ng (1999) ---------- a50 <- cobs(year, temp, knots.add = TRUE, degree = 1, constraint = "increase") summary(a50) ## As suggested in the warning message, we increase the number of knots to 9 a50 <- cobs(year, temp, nknots = 9, knots.add = TRUE, degree = 1, constraint = "increase") summary(a50) ## Here, we use the same knots sequence chosen for the 50th percentile a10 <- cobs(year, temp, nknots = length(a50$knots), knots = a50$knot, degree = 1, tau = 0.1, constraint = "increase") summary(a10) a90 <- cobs(year, temp, nknots = length(a50$knots), knots = a50$knot, degree = 1, tau = 0.9, constraint = "increase") summary(a90) which(hot.idx <- temp >= a90$fit) which(cold.idx <- temp <= a10$fit) normal.idx <- !hot.idx & !cold.idx plot(year, temp, type = "n", ylab = "Temperature (C)", ylim = c(-.7,.6)) lines(predict(a50, year, interval = "both"), col = 2) lines(predict(a10, year, interval = "both"), col = 3) lines(predict(a90, year, interval = "both"), col = 3) points(year, temp, pch = c(1,3)[2 - normal.idx]) ## label the "hot" and "cold" days text(year[hot.idx], temp[hot.idx] + .03, labels = year[hot.idx]) text(year[cold.idx],temp[cold.idx]- .03, labels = year[cold.idx])
From a "conreg"
object representing a linear
spline,
interpSplineCon()
produces the corresponding (cubic)
spline (via package splines' interpSpline()
)
by interpolating at the knots, thus “smoothing the kinks”.
isIsplineCon()
determines if the spline fulfills the
same convexity / concavity constraints as the underlying
"conreg"
object.
interpSplineCon(object, ...) isIsplineCon(object, isp, ...)
interpSplineCon(object, ...) isIsplineCon(object, isp, ...)
object |
an R object as resulting from |
isp |
optionally, the result of |
... |
optional further arguments passed to
|
interpSplineCon()
returns the
interpSpline()
interpolation spline object.
isIsplineCon()
is TRUE
(or FALSE
),
indicating if the convexity/concavity constraints are fulfilled (in
knot intervals).
Martin Maechler
cc <- conreg(cars[,"speed"], cars[,"dist"], convex=TRUE) iS <- interpSplineCon(cc) (isC <- isIsplineCon(cc)) # FALSE: not strictly convex ## Passing the interpolation spline --- if you have it anyway --- ## is more efficient (faster) : stopifnot(identical(isC, isIsplineCon(cc, isp = iS))) ## the interpolation spline is not quite convex: plot(cc) with(cars, points(dist ~ speed, col = adjustcolor(1, 1/2))) lines(predict(iS, seq(1,28, by=1/4)), col = adjustcolor("forest green", 3/4), lwd=2)
cc <- conreg(cars[,"speed"], cars[,"dist"], convex=TRUE) iS <- interpSplineCon(cc) (isC <- isIsplineCon(cc)) # FALSE: not strictly convex ## Passing the interpolation spline --- if you have it anyway --- ## is more efficient (faster) : stopifnot(identical(isC, isIsplineCon(cc, isp = iS))) ## the interpolation spline is not quite convex: plot(cc) with(cars, points(dist ~ speed, col = adjustcolor(1, 1/2))) lines(predict(iS, seq(1,28, by=1/4)), col = adjustcolor("forest green", 3/4), lwd=2)
COBS (cobs
) auxiliary function for constructing the
pointwise constraint specification list from the pointwise
3-column matrix (as used as argument in cobs
).
mk.pt.constr(pointwise)
mk.pt.constr(pointwise)
pointwise |
numeric 3-column matrix, see |
A list with components
n.equal |
number of equality constraints |
n.greater |
number of “greater” constraints |
n.smaller |
number of “smaller” constraints |
n.gradient |
number of gradient constraints |
Unless the input pointwise
was NULL
, the result also
has corresponding components:
equal |
3-column matrix ofequality constraints |
greater |
3-column matrix of“greater” constraints |
smaller |
3-column matrix of“smaller” constraints |
gradient |
3-column matrix ofgradient constraints |
Martin Maechler
## from ?cobs: x <- seq(-1,3,,150) con <- rbind(c( 1,min(x),0), # f(min(x)) >= 0 c(-1,max(x),1), # f(max(x)) <= 1 c(0, 0, 0.5))# f(0) = 0.5 r <- mk.pt.constr(con) str(r)
## from ?cobs: x <- seq(-1,3,,150) con <- rbind(c( 1,min(x),0), # f(min(x)) >= 0 c(-1,max(x),1), # f(max(x)) <= 1 c(0, 0, 0.5))# f(0) = 0.5 r <- mk.pt.constr(con) str(r)
The plot
method for cobs
objects.
If there was lambda
selection, it provides two plots by default.
## S3 method for class 'cobs' plot(x, which = if(x$select.lambda) 1:2 else 2, show.k = TRUE, col = par("col"), l.col = c("red","pink"), k.col = gray(c(0.6, 0.8)), lwd = 2, cex = 0.4, ylim = NULL, xlab = deparse(x$call[[2]]), ylab = deparse(x$call[[3]]), main = paste(deparse(x$call, width.cutoff = 100), collapse="\n"), subtit= c("choosing lambda", "data & spline curve") , ...)
## S3 method for class 'cobs' plot(x, which = if(x$select.lambda) 1:2 else 2, show.k = TRUE, col = par("col"), l.col = c("red","pink"), k.col = gray(c(0.6, 0.8)), lwd = 2, cex = 0.4, ylim = NULL, xlab = deparse(x$call[[2]]), ylab = deparse(x$call[[3]]), main = paste(deparse(x$call, width.cutoff = 100), collapse="\n"), subtit= c("choosing lambda", "data & spline curve") , ...)
x |
object of class |
which |
integer vector specifying which plots should be drawn; |
show.k |
logical indicating if the “effective
dimensionality” |
col , l.col , k.col
|
colors for plotting; |
lwd , cex
|
line width and point size for the 2nd plot
(i.e. |
ylim |
y-axis limits, see |
xlab , ylab , main
|
axis annotation; see also |
subtit |
a vector of length 2, specifying a sub title for each
plot (according to |
... |
further arguments passed and to internal
|
plot(.)
produces two side-by-side plots in case there was a
search for the optimal lambda(which = 1:2
), and only the
(second) data plus spline curve plot otherwise (which = 2
).
Martin Maechler
There are several other methods for COBS objects, see, e.g.
summary.cobs
or predict.cobs
.
cobs
for examples.
example(cobs) plot(Sbs) plot(fitted(Sbs), resid(Sbs), main = "Tukey-Anscombe plot for cobs()", sub = deparse(Sbs$call))
example(cobs) plot(Sbs) plot(fitted(Sbs), resid(Sbs), main = "Tukey-Anscombe plot for cobs()", sub = deparse(Sbs$call))
Compute predicted values and simultaneous or pointwise confidence
bounds for cobs
objects.
## S3 method for class 'cobs' predict(object, z, deriv = 0L, minz = knots[1], maxz = knots[nknots], nz = 100, interval = c("none", "confidence", "simultaneous", "both"), level = 0.95, ...)
## S3 method for class 'cobs' predict(object, z, deriv = 0L, minz = knots[1], maxz = knots[nknots], nz = 100, interval = c("none", "confidence", "simultaneous", "both"), level = 0.95, ...)
object |
object of class |
z |
vector of grid points at which the fitted values are
evaluated; defaults to an equally spaced grid with |
deriv |
scalar integer specifying (the order of) the derivative that should be computed. |
minz |
numeric needed if |
maxz |
analogous to |
nz |
number of grid points in |
interval |
type of interval calculation, see below |
level |
confidence level |
... |
further arguments passed to and from methods. |
a matrix of predictions and bounds if interval
is set (not
"none"). The columns are named z
, fit
, further
cb.lo
and cb.up
for the simultaneous
confidence
band, and ci.lo
and ci.up
the pointwise
confidence
intervals according to specified level
.
If z
has been specified, it is unchanged in the result.
Martin Maechler, based on He and Ng's code in cobs()
.
cobs
the model fitting function.
example(cobs) # continuing : (pRbs <- predict(Rbs)) #str(pSbs <- predict(Sbs, xx, interval = "both")) str(pSbs <- predict(Sbs, xx, interval = "none")) plot(x,y, xlim = range(xx), ylim = range(y, pSbs[,2], finite = TRUE), main = "COBS Median smoothing spline, automatical lambda") lines(pSbs, col = "red") lines(spline(x,f.true), col = "gray40") #matlines(pSbs[,1], pSbs[,-(1:2)], # col= rep(c("green","blue"),c(2,2)), lty=2)
example(cobs) # continuing : (pRbs <- predict(Rbs)) #str(pSbs <- predict(Sbs, xx, interval = "both")) str(pSbs <- predict(Sbs, xx, interval = "none")) plot(x,y, xlim = range(xx), ylim = range(y, pSbs[,2], finite = TRUE), main = "COBS Median smoothing spline, automatical lambda") lines(pSbs, col = "red") lines(spline(x,f.true), col = "gray40") #matlines(pSbs[,1], pSbs[,-(1:2)], # col= rep(c("green","blue"),c(2,2)), lty=2)
Compute B-spline coefficients for regression quantile B-spline with stepwise knots selection and quantile B-spline with fixed knots regression spline, using Ng (1996)'s algorithm.
qbsks2(x,y,w,pw, knots,nknots, degree,Tlambda, constraint, ptConstr, maxiter, trace, nrq,nl1, neqc, tau, select.lambda, ks, do.select, knots.add, repeat.delete.add, ic, print.mesg, give.pseudo.x = TRUE, rq.tol = 1e-8, tol.kn = 1e-6, tol.0res = 1e-6, print.warn, nk.start)
qbsks2(x,y,w,pw, knots,nknots, degree,Tlambda, constraint, ptConstr, maxiter, trace, nrq,nl1, neqc, tau, select.lambda, ks, do.select, knots.add, repeat.delete.add, ic, print.mesg, give.pseudo.x = TRUE, rq.tol = 1e-8, tol.kn = 1e-6, tol.0res = 1e-6, print.warn, nk.start)
x |
numeric vector, sorted increasingly, the abscissa values |
y |
numeric, same length as |
w |
numeric vector of weights, same length as |
pw |
penalty weights vector ... ... |
knots |
numeric vector of knots of which |
nknots |
number of |
degree |
integer specifying polynomial degree; must be 1 or 2. |
Tlambda |
(vector of) smoothing parameter(s) |
constraint |
string (or empty) specifying the global constraints;
see |
ptConstr |
|
maxiter |
non-negative integer: maximal number of iterations,
passed to |
trace |
integer or logical indicating the tracing level of the underlying algorithms; not implemented (due to lack of trace in quantreg ...) |
nrq , nl1 , neqc
|
integers specifying dimensionalities, directly
passed to |
tau |
desired quantile level (in interval |
select.lambda |
passed to |
ks |
number used as offset in SIC/AIC/BIC. |
do.select |
logical indicating if knots shall be selected (instead of used as specified). |
knots.add , repeat.delete.add
|
logicals, see |
ic |
information criterion to use, see |
print.mesg |
an integer indicating how |
give.pseudo.x |
logical indicating if the pseudo design matrix
|
rq.tol |
numeric convergence tolerance for the interior point
algorithm called from |
tol.kn |
“tolerance” for shifting the outer knots. |
tol.0res |
tolerance passed to |
print.warn |
flag indicating if and how much warnings and
information is to be printed; currently just passed to
|
nk.start |
number of starting knots used in automatic knot selection. |
This is an auxiliary function for cobs(*, lambda = 0)
,
possibly interesting on its own. This documentation is currently sparse; read
the source code!
a list with components
coef |
.. |
fidel |
.. |
k |
dimensionality of model fit. |
ifl |
integer “flag”; the return code. |
icyc |
integer of length 2, see |
knots |
the vector of inner knots. |
nknots |
the number of inner knots. |
nvar |
the number of “variables”, i.e. unknowns including constraints. |
lambda |
the penalty factor, chosen or given. |
pseudo.x |
the pseudo design matrix |
Pin Ng; this help page: Martin Maechler.
Ng, P. (1996) An Algorithm for Quantile Smoothing Splines, Computational Statistics & Data Analysis 22, 99–118.
See also the references in cobs
.
the main function cobs
; further
drqssbc2
which is called from qbsks2()
.
The USArmyRoofs
data frame has 153 observations of roof sections of US
Army bases and 2 columns, age
and fci
. This is Example 2
of He & Ng (1999).
data(USArmyRoofs)
data(USArmyRoofs)
This data frame contains the following columns:
numeric vector giving the roof's age in years.
numeric, giving the FCI, the flash condition index, i.e., the percentage of flashing which is in good condition.
From shar file available from https://www2.nau.edu/PinNg/cobs.html
He, X. and Ng, P. (1999) COBS: Qualitatively Constrained Smoothing via Linear Programming; Computational Statistics 14, 315–337.
data(USArmyRoofs) plot(fci ~ age, data = USArmyRoofs, main = "US Army Roofs data")
data(USArmyRoofs) plot(fci ~ age, data = USArmyRoofs, main = "US Army Roofs data")