Title: | Generalized Pareto Distribution and Peaks Over Threshold |
---|---|
Description: | Some functions useful to perform a Peak Over Threshold analysis in univariate and bivariate cases, see Beirlant et al. (2004) <doi:10.1002/0470012382>. A user guide is available in the vignette. |
Authors: | Christophe Dutang [aut, cre] , Mathieu Ribatet [aut] |
Maintainer: | Christophe Dutang <[email protected]> |
License: | GPL (>= 2) |
Version: | 1.1-11 |
Built: | 2024-11-16 06:20:56 UTC |
Source: | https://github.com/r-forge/pot |
The POT package aims to provide operational tools to analyze peak over threshold. This package relies on the extreme value theory (EVT) to model the tail of any continuous distribution. Tail modelling, in particular POT modelling, is of great importance for many financial and environmental applications.
The POT package was first committed to the CRAN in April 2005 and is still in development. The package is hosted in R-forge. Since November 2016, the package has a new maintainer.
The main motivation was to provide practical tools for probabilistic modelling of high flood flows. However, the strength of the EVT is that results do not depend on the process to be modelled. Thus, one can use the POT package to analyze precipitations, floods, financial times series, earthquakes and so on...
The POT package can perform univariate and bivariate extreme value analysis; first order Markov chains can also be considered. For instance, the (univariate) GPD is currently fitted using 18 estimators. These estimators rely on three different techniques:
Likelihood maximization: MLE, LME, MPLE
Moment Approaches: MOM, PWM, MED
Distance Minimization: MDPD and MGF estimators.
Contrary to the univariate case, there is no finite parametrisation to model bivariate exceedances over thresholds. The POT packages allows 6 parametrisation for the bivariate GPD: the logistic, negative logistic and mixed models - with their respective asymmetric counterparts.
Lastly, first order Markov chains can be fitted using the bivariate GPD for the joint distribution of two consecutive observations.
We have written a package vignette to help new users.
This users guide is a part of the package - just run vignette("POT")
once the package is loaded.
Mathieu Ribatet and Christophe Dutang.
Computes analysis of deviance for “bvpot” object
## S3 method for class 'bvpot' anova(object, object2, ..., half = FALSE)
## S3 method for class 'bvpot' anova(object, object2, ..., half = FALSE)
object , object2
|
Two objects of class “bvpot”, most often return of the
|
... |
Other options to be passed to the |
half |
Logical. For some non-regular testing problems the deviance
difference is known to be one half of a chi-squared random
variable. Set half to |
This function returns an object of class anova. These objects represent analysis-of-deviance tables.
Circumstances may arise such that the asymptotic distribution of the test statistic is not chi-squared. In particular, this occurs when the smaller model is constrained at the edge of the parameter space. It is up to the user recognize this, and to interpret the output correctly.
In some cases the asymptotic distribution is known to be one half of a
chi-squared; you can set half = TRUE
in these cases.
Mathieu Ribatet (Alec Stephenson for the “Warning” case)
x <- rgpd(1000, 0, 1, -0.25) y <- rgpd(1000, 2, 0.5, 0) M0 <- fitbvgpd(cbind(x,y), c(0, 2)) M1 <- fitbvgpd(cbind(x,y), c(0,2), model = "alog") anova(M0, M1) ##Non regular case M0 <- fitbvgpd(cbind(x,y), c(0, 2)) M1 <- fitbvgpd(cbind(x,y), c(0, 2), alpha = 1) anova(M0, M1, half = TRUE)
x <- rgpd(1000, 0, 1, -0.25) y <- rgpd(1000, 2, 0.5, 0) M0 <- fitbvgpd(cbind(x,y), c(0, 2)) M1 <- fitbvgpd(cbind(x,y), c(0,2), model = "alog") anova(M0, M1) ##Non regular case M0 <- fitbvgpd(cbind(x,y), c(0, 2)) M1 <- fitbvgpd(cbind(x,y), c(0, 2), alpha = 1) anova(M0, M1, half = TRUE)
Computes analysis of deviance for “uvpot” object
## S3 method for class 'uvpot' anova(object, object2, ...)
## S3 method for class 'uvpot' anova(object, object2, ...)
object , object2
|
Two objects of class “uvpot”, most often return of the
|
... |
Other options to be passed to the |
This function returns an object of class anova. These objects represent analysis-of-deviance tables.
Mathieu Ribatet
x <- rgpd(1000, 0, 1, -0.15) M0 <- fitgpd(x, 0, shape = -0.15) M1 <- fitgpd(x, 0) anova(M0, M1)
x <- rgpd(1000, 0, 1, -0.15) M0 <- fitgpd(x, 0, shape = -0.15) M1 <- fitgpd(x, 0) anova(M0, M1)
Density, distribution function and random generation for six different parametric bivariate GPD
rbvgpd(n, alpha, model = "log", asCoef, asCoef1, asCoef2, mar1 = c(0,1,0), mar2 = mar1) pbvgpd(q, alpha, model = "log", asCoef, asCoef1, asCoef2, mar1 = c(0,1,0), mar2 = mar1, lower.tail = TRUE)
rbvgpd(n, alpha, model = "log", asCoef, asCoef1, asCoef2, mar1 = c(0,1,0), mar2 = mar1) pbvgpd(q, alpha, model = "log", asCoef, asCoef1, asCoef2, mar1 = c(0,1,0), mar2 = mar1, lower.tail = TRUE)
n |
The number of observations to be simulated. |
q |
A matrix or vector with two columns at which the distribution is computed. |
alpha |
Dependence parameter for the logistic, asymmetric logistic, negative logistic, asymmetric negative logistic, mixed and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
asCoef , asCoef1 , asCoef2
|
The asymmetric coefficients for the asymmetric logistic, asymmetric negative logistic and asymmetric mixed models. |
mar1 , mar2
|
Vectors of length 3 giving the marginal parameters. |
lower.tail |
Logical. If |
The logistic and asymmetric logistic models respectively are simulated using bivariate versions of Algorithms 1.1 and 1.2 in Stephenson(2003). All other models are simulated using a root finding algorithm to simulate from the conditional distributions.
Generate a random vector of length n
.
Mathieu Ribatet (Alec Stephenson for the C codes)
Stephenson, A. G. (2003) Simulating multivariate extreme value distributions of logistic type. Extremes, 6(1), 49–60.
x <- rbvgpd(1000, alpha = 0.25, model = "log", mar1 = c(0,1,0.25), mar2 = c(2,0.5, -0.15)) y <- rbvgpd(1000, alpha = 0.75, model = "nlog", mar1 = c(0,1,0.25), mar2 = c(2,0.5, -0.15)) par(mfrow=c(1,2)) plot(x);plot(y)
x <- rbvgpd(1000, alpha = 0.25, model = "log", mar1 = c(0,1,0.25), mar2 = c(2,0.5, -0.15)) y <- rbvgpd(1000, alpha = 0.75, model = "nlog", mar1 = c(0,1,0.25), mar2 = c(2,0.5, -0.15)) par(mfrow=c(1,2)) plot(x);plot(y)
Provide two measures to assess for asymptotic dependence or independence
chimeas(data, u.range, n.u = 500, xlab, ylabs, ci = 0.95, boot = FALSE, n.boot = 250, block.size = 50, show.bound = TRUE, which = 1:2, ask = nb.fig < length(which) && dev.interactive(), ..., col.ci = "grey", col.bound = "blue", lty.ci = 1, lty.bound = 1)
chimeas(data, u.range, n.u = 500, xlab, ylabs, ci = 0.95, boot = FALSE, n.boot = 250, block.size = 50, show.bound = TRUE, which = 1:2, ask = nb.fig < length(which) && dev.interactive(), ..., col.ci = "grey", col.bound = "blue", lty.ci = 1, lty.bound = 1)
data |
A matrix with 2 columns with the data. |
u.range |
Numeric vection of length 2 (may be missing): the range for the probabilities. |
n.u |
The number of probabilities to be considered |
xlab , ylabs
|
The x-axis and ylabs labels. ylabs must be of length 2 |
ci |
The probability level for the confidence intervals |
boot |
Logical. If |
n.boot |
The number of bootstrap replicates. |
block.size |
The size of the “contiguous” blocks. See details. |
show.bound |
Logical. If |
which |
Which plot should be plotted? |
ask |
Logical. Should user be asked before each plot is computed? |
... |
Additional options to be passed to the |
col.ci , col.bound
|
The color for the confidence intervals and theoretical bounds. |
lty.ci , lty.bound
|
The line type for the confidence intervals and theoretical bounds. |
These two plots help us to understand the dependence relationship
between the two data set. The sign of determines
if the variables are positively or negatively correlated. Two variable
are asymptotically independent if
. For the independent case,
for all u in (0,1). For the perfect dependence case,
for all u in (0,1). Note that for a
bivariate extreme value model,
for all u in (0,1).
The measure is only useful for
asymptotically independent variables. Indeed, for asymptotically
dependent variable, we have
. For
asymptotically independent variables,
reflects the strength
of the dependence between variables. For independent variables,
for all u in (0,1).
If there is (short range) dependence between observations, users may
need to use bootstrap confidence intervals. Bootstrap series are
obtained by sampling contiguous blocks, of length l
say,
uniformly with replacement from the original observations. The block
length l
should be chosen to be much greater than the
short-range dependence and much smaller than the total number of
observations.
A graphic window.
Mathieu Ribatet
Coles, S., Heffernan, J. and Tawn, J. (1999) Dependence measures for extreme value analyses. Extremes 2 339–365.
tailind.test
, specdens
,
tsdep.plot
mc <- simmc(200, alpha = 0.9) mc2 <- simmc(100, alpha = 0.2) ##An independent case par(mfrow = c(1,2)) chimeas(cbind(mc[1:100], mc2)) ##Asymptotic dependence par(mfrow = c(1,2)) chimeas(cbind(mc[seq(1,200, by = 2)], mc[seq(2,200,by = 2)])) ##The same but with bootstrap ci par(mfrow = c(1,2)) chimeas(cbind(mc[seq(1,200, by = 2)], mc[seq(2,200,by = 2)]), boot = TRUE, n.boot=50)
mc <- simmc(200, alpha = 0.9) mc2 <- simmc(100, alpha = 0.2) ##An independent case par(mfrow = c(1,2)) chimeas(cbind(mc[1:100], mc2)) ##Asymptotic dependence par(mfrow = c(1,2)) chimeas(cbind(mc[seq(1,200, by = 2)], mc[seq(2,200,by = 2)])) ##The same but with bootstrap ci par(mfrow = c(1,2)) chimeas(cbind(mc[seq(1,200, by = 2)], mc[seq(2,200,by = 2)]), boot = TRUE, n.boot=50)
A function to identify clusters of exceedances of a time series.
clust(data, u, tim.cond = 1, clust.max = FALSE, plot = FALSE, only.excess = TRUE, ...)
clust(data, u, tim.cond = 1, clust.max = FALSE, plot = FALSE, only.excess = TRUE, ...)
data |
A matrix/data.frame with two columns. Columns names must
be |
u |
Numeric. A value giving the threshold. |
tim.cond |
A time condition to ensure independence between
events. Should be in the same unit than |
clust.max |
Logical. If |
plot |
Logical. If |
only.excess |
Logical. If |
... |
Optional parameters to be passed in |
The clusters of exceedances are defined as follows:
The first exceedance initiates the first cluster;
The first observation under the threshold u
“ends” the
current cluster unless tim.cond
does not hold;
The next exceedance initiates a new cluster;
The process is iterated as needed.
This function differs from the function clusters
of evd
Package as independence condition i.e. tim.cond
could be a
“temporal” condition. That is, two events are considered independent
if the inter-arrival time is greater than a fixed duration.
However, it is also possible to used the “index” independence as in
clust
by setting data[,"time"] =
1:length(data[,"obs"])
.
If clust.max
is FALSE
, a list containing the clusters of
exceedances is returned. Else, a matrix containing the cluster maxima,
related dates and indices are returned.
In any case, the returned object has an attribute exi
giving
an estimation of the Extremal Index, that is the inverse of the
average cluster size.
Mathieu Ribatet
clusters
of package evd
.
data(ardieres) par(mfrow=c(1,2)) clust(ardieres, 4, 10 / 365) clust(ardieres, 4, 10 / 365, clust.max = TRUE) clust(ardieres, 4, 10 / 365, clust.max = TRUE, plot = TRUE) ##The same but with optional arguments passed to function ``plot'' clust(ardieres, 4, 10 / 365, clust.max = TRUE, plot = TRUE, xlab = "Time (Years)", ylab = "Flood discharges", xlim = c(1972, 1980))
data(ardieres) par(mfrow=c(1,2)) clust(ardieres, 4, 10 / 365) clust(ardieres, 4, 10 / 365, clust.max = TRUE) clust(ardieres, 4, 10 / 365, clust.max = TRUE, plot = TRUE) ##The same but with optional arguments passed to function ``plot'' clust(ardieres, 4, 10 / 365, clust.max = TRUE, plot = TRUE, xlab = "Time (Years)", ylab = "Flood discharges", xlim = c(1972, 1980))
Plot estimates of the Extremal Index
exiplot(data, u.range, tim.cond = 1, n.u = 50, xlab, ylab, ...)
exiplot(data, u.range, tim.cond = 1, n.u = 50, xlab, ylab, ...)
data |
A matrix/data.frame with two columns. Columns names must
be |
u.range |
A numeric vector of length 2. Specify the range of threshold for which the Extremal Index is estimated. |
tim.cond |
A time condition to ensure independence between
events. Should be in the same unit that |
n.u |
Numeric. The number of thresholds at which the Extremal Index is estimated. |
xlab , ylab
|
Optional character strings to label the x and y axis. |
... |
Optional options to be passed to the |
Returns invisibly a matrix with two columns. The first one thresh
giving the threshold and the second one exi
the related Extremal
Index estimate.
Mathieu Ribatet
'pot'
modelcoef
extracts model coefficients of an object of class 'pot'
## S3 method for class 'pot' coef(object, ...)
## S3 method for class 'pot' coef(object, ...)
object |
An object of class |
... |
Other arguments to be passed to the |
Standard coef
object: see coef
.
Christophe Dutang
set.seed(123) x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) coef(fmle)
set.seed(123) x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) coef(fmle)
Compute (profile) confidence intervals for the scale, shape GPD parameters and also for GPD quantiles.
## S3 method for class 'uvpot' confint(object, parm, level = 0.95, ..., range, prob, prof = TRUE)
## S3 method for class 'uvpot' confint(object, parm, level = 0.95, ..., range, prob, prof = TRUE)
object |
|
parm |
Charater string specifies for which variable confidence
intervals are computed. One of |
level |
Numeric. The confidence level. |
... |
Optional parameters. See details. |
range |
Vector of dimension two. It gives the lower and upper bound on which the profile likelihood is performed. Only required when "prof = TRUE". |
prob |
The probability of non exceedance. |
prof |
Logical. If |
Additional options can be passed using "..." in the function
call. Possibilites are related to the specific functions:
link{gpd.fiscale}
, link{gpd.fishape}
,
link{gpd.firl}
, link{gpd.pfscale}
,
link{gpd.pfshape}
, link{gpd.pfrl}
.
Returns a vector of the lower and upper bound for the (profile) confidence
interval. Moreover, a graphic of the profile likelihood function is
displayed when prof = TRUE
.
Mathieu Ribatet
link{gpd.fiscale}
, link{gpd.fishape}
,
link{gpd.firl}
, link{gpd.pfscale}
,
link{gpd.pfshape}
and link{gpd.pfrl}
x <- rgpd(100, 0, 1, 0.25) fmle <- fitgpd(x, 0) confint(fmle, prob = 0.2) confint(fmle, parm = "shape")
x <- rgpd(100, 0, 1, 0.25) fmle <- fitgpd(x, 0) confint(fmle, prob = 0.2) confint(fmle, parm = "shape")
convassess
is a generic function used to assess the convergence of
the estimation procedure to the global maximum. The function invokes particular methods
which depend on the class
of the first argument.
This function uses several starting values to assess the sensitiveness of the
fitted object with respect to starting values.
convassess(object, n = 50) ## S3 method for class 'uvpot' convassess(object, n = 50) ## S3 method for class 'mcpot' convassess(object, n = 50) ## S3 method for class 'bvpot' convassess(object, n = 50)
convassess(object, n = 50) ## S3 method for class 'uvpot' convassess(object, n = 50) ## S3 method for class 'mcpot' convassess(object, n = 50) ## S3 method for class 'bvpot' convassess(object, n = 50)
object |
A fitted object. When using the POT package, an object
of class |
n |
The number of starting values to be tested. |
The starting values are defined using the unbiased probability
weighted moments fitted on n
bootstrap samples.
Graphics: the considered starting values, the objective values derived from numerical optimizations and traceplots for all estimated parameters. In addition, it returns invisibly all these informations.
Mathieu Ribatet
set.seed(1) ##Univariate Case x <- rgpd(30, 0, 1, 0.2) fgpd1 <- fitgpd(x, 0, "med") convassess(fgpd1) ##Bivariate Case x <- rbvgpd(50, model = "log", alpha = 0.5, mar1 = c(0, 1, 0.2)) fgpd2 <- fitbvgpd(x, c(0.5,0.5)) convassess(fgpd2)
set.seed(1) ##Univariate Case x <- rgpd(30, 0, 1, 0.2) fgpd1 <- fitgpd(x, 0, "med") convassess(fgpd1) ##Bivariate Case x <- rbvgpd(50, model = "log", alpha = 0.5, mar1 = c(0, 1, 0.2)) fgpd2 <- fitbvgpd(x, c(0.5,0.5)) convassess(fgpd2)
dens
is a generic function used to plot the density.
The function invokes particular methods
which depend on the class
of the first argument.
So the function plots density for univariate POT models.
dens(object, ...) ## S3 method for class 'uvpot' dens(object, main, xlab, ylab, dens.adj = 1, kern.lty = 2, rug = TRUE, plot.kernel = TRUE, plot.hist = TRUE, hist.col = NULL, ...)
dens(object, ...) ## S3 method for class 'uvpot' dens(object, main, xlab, ylab, dens.adj = 1, kern.lty = 2, rug = TRUE, plot.kernel = TRUE, plot.hist = TRUE, hist.col = NULL, ...)
object |
A fitted object. When using the POT package, an object
of class |
main |
The title of the graphic. If missing, the title is set to
|
xlab , ylab
|
The labels for the x and y axis. If missing, they are
set to |
dens.adj |
Numeric. The adjustment for the kernel density
estimation in the |
kern.lty |
The line type for the kernel density estimation. This
corresponds to the |
rug |
Logical. Should we call the |
plot.kernel |
Logical. Should the kernel density estimate be plotted? |
plot.hist |
Logical. Should the histogram be plotted? |
hist.col |
The color to fill the histogram. |
... |
Other arguments to be passed to the |
The density plot consists of plotting on the same windows the theoretical density and a kernel estimation one. If the theoretical model is correct, then the two densities should be “similar”.
A graphical window.
Mathieu Ribatet
x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") dens(pwmu)
x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") dens(pwmu)
Compute the density of the extremal index using simulations from a fitted markov chain model.
dexi(x, n.sim = 1000, n.mc = length(x$data), plot = TRUE, ...)
dexi(x, n.sim = 1000, n.mc = length(x$data), plot = TRUE, ...)
x |
A object of class |
n.sim |
The number of simulation of Markov chains. |
n.mc |
The length of the simulated Markov chains. |
plot |
Logical. If |
... |
Optional parameters to be passed to the
|
The Markov chains are simulated using the simmc
function to obtained dependent realisations of standard
uniform realizations. Then, they are transformed to correspond to the
parameter of the fitted markov chain model. Thus, if
is the location, scale and shape parameters ; and
is the probability of exceedance of
,
then by defining :
the realizations are distributed such as the probability
of exceedance of
is equal to
.
At last, the extremal index for each generated Markov chain is estimated using the estimator of Ferro and Segers (2003) (and thus avoid any declusterization).
The function returns a optionally plot of the kernel density estimate of the extremal index. In addition, the vector of extremal index estimations is returned invisibly.
Mathieu Ribatet
Fawcett L., and Walshaw D. (2006) Markov chain models for extreme wind speed. Environmetrics, 17:(8) 795–809.
Ferro, C. and Segers, J. (2003) Inference for clusters of extreme values. Journal of the Royal Statistical Society. Series B 65:(2) 545–556.
Ledford A., and Tawn, J. (1996) Statistics for near Independence in Multivariate Extreme Values. Biometrika, 83 169–187.
Smith, R., and Tawn, J., and Coles, S. (1997) Markov chain models for threshold exceedances. Biometrika, 84 249–268.
mc <- simmc(100, alpha = 0.25) mc <- qgpd(mc, 0, 1, 0.25) fgpd1 <- fitmcgpd(mc, 2, shape = 0.25, scale = 1) dexi(fgpd1, n.sim = 100)
mc <- simmc(100, alpha = 0.25) mc <- qgpd(mc, 0, 1, 0.25) fgpd1 <- fitmcgpd(mc, 2, shape = 0.25, scale = 1) dexi(fgpd1, n.sim = 100)
The Dispersion Index Plot
diplot(data, u.range, main, xlab, ylab, nt = max(200, nrow(data)), conf=0.95, ...)
diplot(data, u.range, main, xlab, ylab, nt = max(200, nrow(data)), conf=0.95, ...)
data |
A matrix with two column. The first one represents the date of events (in a numeric format) and the second the data associated with those dates. |
u.range |
A numeric vector of length two giving the limit of threshold analyzed. If missing, default values are taken. |
main |
The title of the plot. |
xlab , ylab
|
Labels for the x and y axis. |
nt |
The number of thresholds at which the dispersion index plot is evaluated. |
conf |
The confident coefficient for the plotted confidence intervals. |
... |
Other arguments to be passed to the |
According to the Extreme Value Theory, the number of exceedance
over a high threshold in a fixed period - generally a year - must be
distributed as Poisson process. As for a random variable Poisson
distributed, the ratio of the variance and the mean is equal to 1, one
can test if the ratio differs from
1. Moreover, confidence levels for
DI
can be calculated by
testing against a distribution with
M
-1 degree of
freedom, M
being the total number of fixed periods -generally
the total number of years in the sample. So, the Poisson hypothesis is
not rejected if the estimated DI
is within the range
It returns invisibly a list with two components. The first one
'thresh'
gives the thresholds analyzed. The second 'DI'
gives the dispersion index relative to the threshold.
Mathieu Ribatet
Cunnane, C. (1979) Note on the poisson assumption in partial duration series model. Water Resource Research, 15(2) :489–494.
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) diplot(ardieres)
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) diplot(ardieres)
Compute Fisher based confidence intervals on parameter and return level for the GP distribution. This is achieved through asymptotic theory and the Observed information matrix of Fisher and eventually the Delta method.
gpd.fishape(object, conf = 0.95) gpd.fiscale(object, conf = 0.95) gpd.firl(object, prob, conf = 0.95)
gpd.fishape(object, conf = 0.95) gpd.fiscale(object, conf = 0.95) gpd.firl(object, prob, conf = 0.95)
object |
|
prob |
The probability of non exceedance. |
conf |
Numeric. The confidence level. |
Returns a vector of the lower and upper bound for the confidence interval.
Mathieu Ribatet
rp2prob
, prob2rp
,
gpd.pfscale
,
gpd.pfshape
, gpd.pfrl
and
confint
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) f1 <- fitgpd(ardieres[,"obs"], 5, 'mle') gpd.fishape(f1) gpd.fiscale(f1)
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) f1 <- fitgpd(ardieres[,"obs"], 5, 'mle') gpd.fishape(f1) gpd.fiscale(f1)
Maximum (Penalized) Likelihood, Unbiased Probability Weighted Moments,Biased Probability Weighted Moments, Moments, Pickands', Minimum Density Power Divergence, Medians, Likelihood Moment and Maximum Goodness-of-Fit Estimators to fit Peaks Over a Threshold to a GP distribution.
fitgpd(data, threshold, est = "mle", ...)
fitgpd(data, threshold, est = "mle", ...)
data |
A numeric vector. |
threshold |
A numeric value giving the threshold for the
GPD. The |
est |
A string giving the names of the estimator. It can be
|
... |
Other optional arguments to be passed to the
|
This function returns an object of class "uvpot"
with components:
fitted.values |
A vector containing the estimated parameters. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters of the model that have been held fixed. |
param |
A vector containing all parameters (optimized and fixed). |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
convergence , counts , message
|
Components taken from the
list returned by |
threshold |
The threshold passed to argument |
nat , pat
|
The number and proportion of exceedances. |
data |
The data passed to the argument |
exceed |
The exceedances, or the maxima of the clusters of exceedances. |
scale |
The scale parameter for the fitted generalized Pareto distribution. |
std.err.type |
The standard error type - for |
var.thresh |
Logical. Specify if the threshold is a varying one -
|
The Maximum Likelihood estimator is obtained through a slightly
modified version of Alec Stephenson's fpot.norm
function in the
evd
package.
For the 'mple'
estimator, the likelihood function is penalized
using the following function :
where and
are the penalizing
constants. Coles and Dixon (1999) suggest the use of
.
The 'lme'
estimator has a special parameter 'r'
. Zhang
(2007) shows that a value of -0.5 should be accurate in most of the
cases. However, other values such as r < 0.5
can be
explored. In particular, if r
is approximatively equal to the
opposite of the true shape parameter value, then the lme
estimate is equivalent to the mle
estimate.
The 'pwmb'
estimator has special parameters 'a'
and
'b'
. These parameters are called the "plotting-position"
values. Hosking and Wallis (1987) recommend the use of a = 0.35
and b = 0
(the default). However, different values can be
tested.
For the 'pwmu'
and 'pwmb'
approaches, one can pass the
option 'hybrid = TRUE'
to use hybrid estimators as proposed by
Dupuis and Tsao (1998). Hybrid estimators avoid to have no feasible
points.
The mdpd
estimator has a special parameter 'a'
. This is
a parameter of the "density power divergence". Juarez and Schucany
(2004) recommend the use of a = 0.1
, but any value of
a
such as a > 0
can be used (small values are recommend
yet).
The med
estimator admits two extra arguments tol
and
maxit
to control the "stopping-rule" of the optimization
process.
The mgf
approach uses goodness-of-fit statistics to estimate
the GPD parameters. There are currently 8 different statitics: the
Kolmogorov-Smirnov "KS"
, Cramer von Mises "CM"
, Anderson
Darling "AD"
, right tail Anderson Darling "ADR"
, left
tail Anderson Darling "ADL"
, right tail Anderson Darling
(second degree) "AD2R"
, left tail Anderson Darling (second
degree) "AD2L"
and the Anderson Darling (second degree)
"AD2"
statistics.
Mathieu Ribatet
Coles, S. (2001) An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics. London.
Coles, S. and Dixon, M. (1999) Likelihood-Based Inference for Extreme Value Models. Extremes 2(1):5–23.
Dupuis, D. and Tsao (1998) M. A hybrid estimator for generalized Pareto and extreme-value distributions. Communications in Statistics-Theory and Methods 27:925–941.
Hosking, J. and Wallis, J. (1987) Parameters and Quantile Estimation for the Generalized Pareto Distribution. Technometrics 29:339–349.
Juarez, S. and Schucany, W. (2004) Robust and Efficient Estimation for the Generalized Pareto Distribution. Extremes 7:237–251.
Luceno, A. (2006) Fitting the generalized Pareto distribution to data using maximum goodness-of-fit estimators. Computational Statistics and Data Analysis 51:904–917.
Peng, L. and Welsh, A. (2001) Robust Estimation of the Generalized Pareto Distribution. Extremes 4:53–65.
Embrechts, P and Kluppelberg, C. and Mikosch, T (1997) Modelling Extremal Events for Insurance and Finance. Springers.
Pickands, J. (1975) Statistical Inference Using Extreme Order Statistics. Annals of Statistics. 3:119–131.
Zhang, J. (2007) Likelihood Moment Estimation for the Generalized Pareto Distribution. Australian and New Zealand Journal of Statistics. 49(1):69–77.
The following usual generic functions are available
print
,
plot
,
confint
and
anova
as well as new generic functions
retlev
,
qq
,
pp
,
dens
and
convassess
.
x <- rgpd(200, 1, 2, 0.25) mle <- fitgpd(x, 1, "mle")$param pwmu <- fitgpd(x, 1, "pwmu")$param pwmb <- fitgpd(x, 1, "pwmb")$param pickands <- fitgpd(x, 1, "pickands")$param ##Check if Pickands estimates ##are valid or not !!! med <- fitgpd(x, 1, "med", ##Sometimes the fitting algo is not start = list(scale = 2, shape = 0.25))$param ##accurate. So specify ##good starting values is ##a good idea. mdpd <- fitgpd(x, 1, "mdpd")$param lme <- fitgpd(x, 1, "lme")$param mple <- fitgpd(x, 1, "mple")$param ad2r <- fitgpd(x, 1, "mgf", stat = "AD2R")$param print(rbind(mle, pwmu, pwmb, pickands, med, mdpd, lme, mple, ad2r)) ##Use PWM hybrid estimators fitgpd(x, 1, "pwmu", hybrid = FALSE) ##Now fix one of the GPD parameters ##Only the MLE, MPLE and MGF estimators are allowed ! fitgpd(x, 1, "mle", scale = 2) fitgpd(x, 1, "mple", shape = 0.25)
x <- rgpd(200, 1, 2, 0.25) mle <- fitgpd(x, 1, "mle")$param pwmu <- fitgpd(x, 1, "pwmu")$param pwmb <- fitgpd(x, 1, "pwmb")$param pickands <- fitgpd(x, 1, "pickands")$param ##Check if Pickands estimates ##are valid or not !!! med <- fitgpd(x, 1, "med", ##Sometimes the fitting algo is not start = list(scale = 2, shape = 0.25))$param ##accurate. So specify ##good starting values is ##a good idea. mdpd <- fitgpd(x, 1, "mdpd")$param lme <- fitgpd(x, 1, "lme")$param mple <- fitgpd(x, 1, "mple")$param ad2r <- fitgpd(x, 1, "mgf", stat = "AD2R")$param print(rbind(mle, pwmu, pwmb, pickands, med, mdpd, lme, mple, ad2r)) ##Use PWM hybrid estimators fitgpd(x, 1, "pwmu", hybrid = FALSE) ##Now fix one of the GPD parameters ##Only the MLE, MPLE and MGF estimators are allowed ! fitgpd(x, 1, "mle", scale = 2) fitgpd(x, 1, "mple", shape = 0.25)
Fitting a bivariate extreme value distribution to bivariate exceedances over thresholds using censored maximum likelihood procedure.
fitbvgpd(data, threshold, model = "log", start, ..., cscale = FALSE, cshape = FALSE, std.err.type = "observed", corr = FALSE, warn.inf = TRUE, method = "BFGS")
fitbvgpd(data, threshold, model = "log", start, ..., cscale = FALSE, cshape = FALSE, std.err.type = "observed", corr = FALSE, warn.inf = TRUE, method = "BFGS")
data |
A matrix with two columns which gives the observation
vector for margin 1 and 2 respectively. |
threshold |
A numeric vector for the threshold (of length 2). |
model |
A character string which specifies the model used. Must
be one of |
start |
Optional. A list for starting values in the fitting procedure. |
... |
Additional parameters to be passed to the
|
cscale |
Logical. Should the two scale parameters be equal? |
cshape |
Logical. Should the two shape parameters be equal? |
std.err.type |
The type of the standard error. Currently, one
must specify |
corr |
Logical. Should the correlation matrix be computed? |
warn.inf |
Logical. Should users be warned if likelihood is not finite at starting values? |
method |
The optimization method, see |
The bivariate exceedances are fitted using censored likelihood procedure. This methodology is fully described in Ledford (1996).
Most of models are described in Kluppelberg (2006).
The function returns an object of class c("bvpot","pot")
. As
usual, one can extract several features using fitted
(or
fitted.values
), deviance
,
logLik
and AIC
functions.
fitted.values |
The maximum likelihood estimates of the bivariate extreme value distribution. |
std.err |
A vector containing the standard errors - only present when the observed information matrix is not singular. |
var.cov |
The asymptotic variance covariance matrix - only presents when the observed information matrix is not singular. |
deviance |
The deviance. |
corr |
The correlation matrix. |
convergence , counts , message
|
Informations taken from the
|
threshold |
The marginal thresholds. |
pat |
The marginal proportion above the threshold. |
nat |
The marginal number above the threshold. |
data |
The bivariate matrix of observations. |
exceed1 , exceed2
|
The marginal exceedances. |
call |
The call of the current function. |
model |
The model for the bivariate extreme value distribution. |
chi |
The chi statistic of Coles (1999). A value near 1 (resp. 0) indicates perfect dependence (resp. independence). |
Because of numerical problems, their exists artificial numerical constraints imposed on each model. These are:
For the logistic and asymmetric logistic models:
must lie in [0.05, 1] instead of [0,1];
For the negative logistic model: must lie in
[0.01, 15] instead of
;
For the asymmetric negative logistic model: must
lie in [0.2, 15] instead of
;
For the mixed and asymmetric mixed models: None artificial numerical constraints are imposed.
For this purpose, users must check if estimates are near these artificial numerical constraints. Such cases may lead to substantial biases on the GP parameter estimates. One way to detect quickly if estimates are near the border constraints is to look at the standard errors for the dependence parameters. Small values (i.e. < 1e-5) often indicates that numerical constraints have been reached.
In addition, users must be aware that the mixed and asymmetric mixed models can not deal with perfect dependence.
Thus, user may want to plot the Pickands' dependence function to see
if variable are near independence or dependence cases using the
pickdep
function.
Mathieu Ribatet
Coles, S., Heffernan, J. and Tawn, J. (1999) Dependence Measure for Extreme Value Analyses. Extremes, 2:4 339–365.
Kl\"uppelberg, C., and May A. (2006) Bivariate extreme value distributions based on polynomial dependence functions. Mathematical Methods in the Applied Sciences, 29: 1467–1480.
Ledford A., and Tawn, J. (1996) Statistics for near Independence in Multivariate Extreme Values. Biometrika, 83: 169–187.
The following usual generic functions are available
print
,
plot
and
anova
as well as new generic functions
retlev
and
convassess
.
For optimization in R, see optim
.
x <- rgpd(1000, 0, 1, 0.25) y <- rgpd(1000, 3, 1, -0.25) ind <- fitbvgpd(cbind(x, y), c(0, 3), "log") ind ind2 <- fitbvgpd(cbind(x, y), c(0, 3), "log", alpha = 1) ind2 ind3 <- fitbvgpd(cbind(x, y), c(0, 3), cscale = TRUE) ind3 ##The mixed model can not deal with perfect dependent variables ##Thus, there is a substantial bias in GPD parameter estimates dep <- fitbvgpd(cbind(x, x + 3), c(0, 3), "mix") dep
x <- rgpd(1000, 0, 1, 0.25) y <- rgpd(1000, 3, 1, -0.25) ind <- fitbvgpd(cbind(x, y), c(0, 3), "log") ind ind2 <- fitbvgpd(cbind(x, y), c(0, 3), "log", alpha = 1) ind2 ind3 <- fitbvgpd(cbind(x, y), c(0, 3), cscale = TRUE) ind3 ##The mixed model can not deal with perfect dependent variables ##Thus, there is a substantial bias in GPD parameter estimates dep <- fitbvgpd(cbind(x, x + 3), c(0, 3), "mix") dep
Estimation of the extremal index using interexceedance times.
fitexi(data, threshold)
fitexi(data, threshold)
data |
A matrix with two columns: |
threshold |
The threshold. |
The extremal index estimator proposed by Ferro and Segers (2003) is based on interexceedance times. In particular, it does not require a specific declusterization of the time series.
The tim.cond
gives an “automatic” procedure to decluster
the time series without any subjective choice to define the
independence condition between clusters.
This function returns a list with two components. The first one
exi
gives the estimate of the extremal index; while the
second, tim.cond
gives the time condition for independence
between events to be passed to the clust
function.
Mathieu Ribatet
Ferro, C. and Segers, J. (2003) Inference for clusters of extreme values. Journal of the Royal Statistical Society. Series B 65:2 545–556.
n.obs <- 500 x <- rexp(n.obs + 1) y <- pmax(x[-1], x[-(n.obs + 1)])## The extremal index is 0.5 u <- quantile(y, 0.95) fitexi(y, u)
n.obs <- 500 x <- rexp(n.obs + 1) y <- pmax(x[-1], x[-(n.obs + 1)])## The extremal index is 0.5 u <- quantile(y, 0.95) fitexi(y, u)
Fitting a Markov chain to cluster exceedances using a bivariate extreme value distribution and a censored maximum likelihood procedure.
fitmcgpd(data, threshold, model = "log", start, ..., std.err.type = "observed", corr = FALSE, warn.inf = TRUE, method = "BFGS")
fitmcgpd(data, threshold, model = "log", start, ..., std.err.type = "observed", corr = FALSE, warn.inf = TRUE, method = "BFGS")
data |
A vector of observations. |
threshold |
The threshold value. |
model |
A character string which specifies the model used. Must
be one of |
start |
Optional. A list for starting values in the fitting procedure. |
... |
Additional parameters to be passed to the
|
std.err.type |
The type of the standard error. Currently, one
must specify |
corr |
Logical. Should the correlation matrix be computed? |
warn.inf |
Logical. Should users be warned if likelihood is not finite at starting values? |
method |
The optimization method, see |
The Markov Chain model is defined as follows:
As exceedances above a (high enough) threshold are of interest, it is assumed that the marginal are GPD distributed, while the joint distribution is represented by a bivariate extreme value distribution. Smith et al. (1997) present theoretical results about this Markov Chain model.
The bivariate exceedances are fitted using censored likelihood procedure. This methodology is fully described in Ledford (1996).
Most of models are described in Kluppelberg (2006).
The function returns an object of class c("mcpot", "uvpot",
"pot")
. As usual, one can extract several features using
fitted
(or fitted.values
),
deviance
, logLik
and AIC
functions.
fitted.values |
The maximum likelihood estimates of the Markov chain including estimated parameters of the bivariate extreme value distribution. |
std.err |
A vector containing the standard errors - only present when the observed information matrix is not singular. |
var.cov |
The asymptotic variance covariance matrix - only presents when the observed information matrix is not singular. |
deviance |
The deviance. |
corr |
The correlation matrix. |
convergence , counts , message
|
Informations taken from the
|
threshold |
The threshold. |
pat |
The proportion above the threshold. |
nat |
The number above the threshold. |
data |
The observations. |
exceed |
The exceedances. |
call |
The call of the current function. |
model |
The model for the bivariate extreme value distribution. |
chi |
The chi statistic of Coles (1999). A value near 1 (resp. 0) indicates perfect dependence (resp. independence). |
Because of numerical problems, there exists artificial numerical constraints imposed on each model. These are:
For the logistic and asymmetric logistic models:
must lie in [0.05, 1] instead of [0,1];
For the negative logistic model: must lie in
[0.01, 15] instead of
;
For the asymmetric negative logistic model: must
lie in [0.2, 15] instead of
;
For the mixed and asymmetric mixed models: None artificial numerical constraints are imposed.
For this purpose, users must check if estimates are near these artificial numerical constraints. Such cases may lead to substantial biases on the GP parameter estimates. One way to detect quickly if estimates are near the border constraints is to look at the standard errors for the dependence parameters. Small values (i.e. < 1e-5) often indicates that numerical constraints have been reached.
In addition, users must be aware that the mixed and asymmetric mixed models can not deal with perfect dependence.
Thus, user may want to plot the Pickands' dependence function to see
if variable are near independence or dependence cases using the
pickdep
function.
In addition, we recommend to fix the marginal parameters. Indeed, even this is a two steps optimization procedure, this avoid numerical troubles - the likelihood function for the Markov chain model seems to be problematic. Thus, estimates are often better using the two stages approach.
Mathieu Ribatet
Kl\"uppelberg, C., and May A. (2006) Bivariate extreme value distributions based on polynomial dependence functions. Mathematical Methods in the Applied Sciences, 29 1467–1480.
Ledford A., and Tawn, J. (1996) Statistics for near Independence in Multivariate Extreme Values. Biometrika, 83 169–187.
Smith, R., and Tawn, J., and Coles, S. (1997) Markov chain models for threshold exceedances. Biometrika, 84 249–268
The following usual generic functions are available
print
,
plot
as well as new generic functions
retlev
and
convassess
.
See also pickdep
.
For optimization in R, see optim
.
mc <- simmc(1000, alpha = 0.25) mc <- qgpd(mc, 0, 1, 0.25) ##A first application when marginal parameter are estimated fitmcgpd(mc, 0) ##Another one where marginal parameters are fixed fmle <- fitgpd(mc, 0) fitmcgpd(mc, 0, scale = fmle$param["scale"], shape = fmle$param["shape"])
mc <- simmc(1000, alpha = 0.25) mc <- qgpd(mc, 0, 1, 0.25) ##A first application when marginal parameter are estimated fitmcgpd(mc, 0) ##Another one where marginal parameters are fixed fmle <- fitgpd(mc, 0) fitmcgpd(mc, 0, scale = fmle$param["scale"], shape = fmle$param["shape"])
This function estimates the point process characterisation from exceedances above a threshold.
fitpp(data, threshold, noy = length(data) / 365.25, start, ..., std.err.type = "observed", corr = FALSE, method = "BFGS", warn.inf = TRUE)
fitpp(data, threshold, noy = length(data) / 365.25, start, ..., std.err.type = "observed", corr = FALSE, method = "BFGS", warn.inf = TRUE)
data |
A numeric vector. |
threshold |
A numeric value giving the threshold for the GPD. |
noy |
Numeric. The number of year of observation. |
start |
A named list that gives the starting values for the optimization routine. Each list argument must correspond to one parameter to be estimated. May be missing. |
... |
Other optional arguments to be passed to the
|
std.err.type |
A character string. If "observed", the standard errors are derived from the observed Fisher information matrix. If "none", standard errors are not computed. |
corr |
Logical. Does the asymptotic correlation matrix has to be
computed? Default is "not computed" - e.g. |
method |
A character string specifying which numerical
optimization procedure has to be used. See |
warn.inf |
Logical. If |
This function returns a list with components:
fitted.values |
A vector containing the estimated parameters. |
std.err |
A vector containing the standard errors. |
fixed |
A vector containing the parameters of the model that have been held fixed. |
param |
A vector containing all parameters (optimized and fixed). |
deviance |
The deviance at the maximum likelihood estimates. |
corr |
The correlation matrix. |
convergence , counts , message
|
Components taken from the
list returned by |
threshold |
The threshold passed to argument |
nat , pat
|
The number and proportion of exceedances. |
data |
The data passed to the argument |
exceed |
The exceedances, or the maxima of the clusters of exceedances. |
scale |
The scale parameter for the fitted generalized Pareto distribution. |
std.err.type |
The standard error type - for |
var.thresh |
Logical. Specify if the threshold is a varying one -
|
Mathieu Ribatet
Coles, S. (2001) An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics. London.
Embrechts, P and Kluppelberg, C. and Mikosch, T (1997) Modelling Extremal Events for Insurance and Finance. Springers.
Pickands, J. (1975) Statistical Inference Using Extreme Order Statistics. Annals of Statistics. 3:119–131.
x <- rgpd(1000, 0, 1, 0.2) fitpp(x, 0)
x <- rgpd(1000, 0, 1, 0.2) fitpp(x, 0)
A data frame containing flood discharges, in units of cubic meters per second, of the Ardieres River at Beaujeu (FRANCE), over a period of 33 years and the related date of those events.
data(ardieres)
data(ardieres)
A data frame with two columns: "time" and "obs".
Mathieu Ribatet
data(ardieres) plot(ardieres, xlab = "Time (Years)", ylab = expression(paste("Flood discharges ", m^2/s, sep="")), type = "l")
data(ardieres) plot(ardieres, xlab = "Time (Years)", ylab = expression(paste("Flood discharges ", m^2/s, sep="")), type = "l")
Density, distribution function, quantile function and random generation for the GP distribution with location equal to 'loc', scale equal to 'scale' and shape equal to 'shape'.
rgpd(n, loc = 0, scale = 1, shape = 0) pgpd(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, lambda = 0) qgpd(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, lambda = 0) dgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)
rgpd(n, loc = 0, scale = 1, shape = 0) pgpd(q, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, lambda = 0) qgpd(p, loc = 0, scale = 1, shape = 0, lower.tail = TRUE, lambda = 0) dgpd(x, loc = 0, scale = 1, shape = 0, log = FALSE)
x , q
|
vector of quantiles. |
p |
vector of probabilities. |
n |
number of observations. |
loc |
vector of the location parameters. |
scale |
vector of the scale parameters. |
shape |
a numeric of the shape parameter. |
lower.tail |
logical; if TRUE (default), probabilities are |
log |
logical; if TRUE, probabilities p are given as log(p). |
lambda |
a single probability - see the "value" section. |
If 'loc', 'scale' and 'shape' are not specified they assume the default values of '0', '1' and '0', respectively.
The GP distribution function for loc = , scale =
and shape =
is
for and
, where
. If
, the distribution is defined by continuity corresponding to the
exponential distribution.
By definition, the GP distribution models exceedances above a
threshold. In particular, the function is a suited
candidate to model
for large enough.
However, it may be usefull to model the "non conditional" quantiles,
that is the ones related to . Using
the conditional probability definition, one have :
where .
When , the "conditional" distribution
is equivalent to the "non conditional" distribution.
dgpd(0.1) rgpd(100, 1, 2, 0.2) qgpd(seq(0.1, 0.9, 0.1), 1, 0.5, -0.2) pgpd(12.6, 2, 0.5, 0.1) ##for non conditional quantiles qgpd(seq(0.9, 0.99, 0.01), 1, 0.5, -0.2, lambda = 0.9) pgpd(2.6, 2, 2.5, 0.25, lambda = 0.5)
dgpd(0.1) rgpd(100, 1, 2, 0.2) qgpd(seq(0.1, 0.9, 0.1), 1, 0.5, -0.2) pgpd(12.6, 2, 0.5, 0.1) ##for non conditional quantiles qgpd(seq(0.9, 0.99, 0.01), 1, 0.5, -0.2, lambda = 0.9) pgpd(2.6, 2, 2.5, 0.25, lambda = 0.5)
Transforms GPD observations to unit Frechet ones and vice versa
gpd2frech(x, loc = 0, scale = 1, shape = 0, pat = 1) frech2gpd(z, loc = 0, scale = 1, shape = 0, pat = 1)
gpd2frech(x, loc = 0, scale = 1, shape = 0, pat = 1) frech2gpd(z, loc = 0, scale = 1, shape = 0, pat = 1)
x , z
|
The vector of observations. |
loc , scale , shape
|
The location, scale and shape parameters respectively. |
pat |
The proportion above the threshold, i.e. Pr[X > log] = pat. |
Let ,
be the realisation
of a GPD random variable. Thus, the transformation to unit Frechet is
defined as:
A numeric vector.
Mathieu Ribatet
x <- rgpd(10, 0, 1, 0.25) z <- gpd2frech(x, 0, 1, 0.25) z all(frech2gpd(z, 0, 1, 0.25) == x)
x <- rgpd(10, 0, 1, 0.25) z <- gpd2frech(x, 0, 1, 0.25) z all(frech2gpd(z, 0, 1, 0.25) == x)
Compute the sample L-moments - unbiased version.
samlmu(x, nmom = 4, sort.data = TRUE)
samlmu(x, nmom = 4, sort.data = TRUE)
x |
a vector of data |
nmom |
a numeric value giving the number of sample L-moments to be computed |
sort.data |
a logical which specifies if the vector of data x should be sorted or not. |
This function returns a vector of length nmom
corresponding to the
sample L-moments. Note that for orders greater or equal than 3 it is the L-moments
ratio that is sample L-coefficient of variation, sample L-skewness, sample L-kurtosis, ...
Hosking, J. R. M. (1990) L-moment analysis and estimation of order statistics. Journal of the Royal Statistical Society Series B, 52: 105–124.
x <- runif(50) samlmu(x, nmom = 5)
x <- runif(50) samlmu(x, nmom = 5)
Plots of sample L-Skewness ans L-Kurtosis estimates at various thresholds for peaks over threshold modelling, using the Generalized Pareto parametrization.
lmomplot(data, u.range, nt = max(50, length(data)), identify = TRUE, ...)
lmomplot(data, u.range, nt = max(50, length(data)), identify = TRUE, ...)
data |
A numeric vector. |
u.range |
A numeric vector of length two, giving the limits for the thresholds at which the model is fitted. |
nt |
The number of thresholds at which the sample L-moments are evaluated. |
identify |
Logical. If |
... |
Other arguments to be passed to the model fit
function |
For each thresholds, sample L-skewness and L-kurtosis are computed. If data are GP distributed, one have :
So, a threshold is acceptable if sample are near the theoretical curve.
L-moments plot are really difficult to interpret. It can help us to say if the GP distribution is suited to model data.
Mathieu Ribatet
Hosking, J. R. M. and Wallis, J. R. (1997) Regional Frequency Analysis. Cambridge University Press.
Begueria, S. (2005) Uncertainties in partial duration series modelling of extremes related to the choice of the threshold value. Journal of Hydrology, 303(1-4): 215–230.
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] lmomplot(flows, identify = FALSE)
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] lmomplot(flows, identify = FALSE)
Extract Log-Likelihood for object of class 'pot'
## S3 method for class 'pot' logLik(object, ...)
## S3 method for class 'pot' logLik(object, ...)
object |
An object of class |
... |
Other arguments to be passed to the |
Standard logLik
object: see logLik
.
Mathieu Ribatet
x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) logLik(fmle)
x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) logLik(fmle)
The empirical mean residual life plot.
mrlplot(data, u.range, main, xlab, ylab, nt = max(100, length(data)), lty = rep(1,3), col = c('grey', 'black', 'grey'), conf = 0.95, lwd = c(1, 1.5, 1), ...)
mrlplot(data, u.range, main, xlab, ylab, nt = max(100, length(data)), lty = rep(1,3), col = c('grey', 'black', 'grey'), conf = 0.95, lwd = c(1, 1.5, 1), ...)
data |
A numeric vector. |
u.range |
A numeric vector of length two, giving the limits for
the thresholds at which the mean residual life plot is
evaluated. If |
main |
Plot title. |
xlab , ylab
|
x and y axis labels. |
nt |
The number of thresholds at which the mean residual life plot is evaluated. |
lty , col , lwd
|
Arguments passed to |
conf |
The (pointwise) confidence coefficient for the plotted confidence intervals. |
... |
Other arguments to be passed to |
The empirical mean residual life plot is the locus of points
where are
the
observations that exceed the threshold
. If the
exceedances of a threshold
are generalized Pareto, the
empirical mean residual life plot should be approximately linear for
.
The confidence intervals within the plot are symmetric intervals based on the approximate normality of sample means.
A list with components x
and y
is invisibly returned.
The components contain those objects that were passed to the formal
arguments x
and y
of matplot
in order to create
the mean residual life plot.
Stuart Coles and Alec Stephenson
Coles, S. (2001) An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics. London.
Embrechts, P., Kl\"uppelberg, C., and Mikosch, T. (1997) Modelling Extremal Events for Insurance and Finance.
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] mrlplot(flows)
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] mrlplot(flows)
Return and optionally plot the Pickands' dependence function.
pickdep(object, main, bound = TRUE, plot = TRUE, ...)
pickdep(object, main, bound = TRUE, plot = TRUE, ...)
object |
A object of class |
main |
May be missing. If present, the plot title. |
bound |
Logical. Should the perfect dependent and independent case bounds be plotted? |
plot |
Logical. Should the dependence function be plotted? |
... |
Optional parameters to be passed to the
|
It is common to parametrize a bivariate extreme value distribution
according to the Pickands' representation (Pickands, 1981). That is,
if is any bivariate extreme value distribution, then it has
the following parametrization:
where are unit Frechet.
is the Pickands' dependence function. It has the following
properties:
is defined on [0,1];
;
;
is a convex function;
For two independent (unit Frechet) random variables, ;
For two perfectly dependent (unit Frechet) random variables,
.
The function returns an invisible function: the Pickands' dependence function. Moreover, the returned object has an attribute which specifies the model for the bivariate extreme value distribution.
If plot = TRUE
, then the dependence function is plotted.
Mathieu Ribatet
Pickands, J. (1981) Multivariate Extreme Value Distributions Proceedings 43rd Session International Statistical Institute
x <- rbvgpd(1000, alpha = 0.9, model = "mix", mar1 = c(0,1,0.25), mar2 = c(2,0.5,0.1)) Mmix <- fitbvgpd(x, c(0,2), "mix") pickdep(Mmix)
x <- rbvgpd(1000, alpha = 0.9, model = "mix", mar1 = c(0,1,0.25), mar2 = c(2,0.5,0.1)) Mmix <- fitbvgpd(x, c(0,2), "mix") pickdep(Mmix)
Plot several graphics to judge goodness of fit of the fitted model.
## S3 method for class 'bvpot' plot(x, mains, which = 1:3, ask = nb.fig < length(which) && dev.interactive(), ...)
## S3 method for class 'bvpot' plot(x, mains, which = 1:3, ask = nb.fig < length(which) && dev.interactive(), ...)
x |
An object of class |
mains |
May be missing. If present a 3–vector of character strings which gives the titles of the plots. |
which |
a numeric vector which specifies which plot must be drawn
: |
ask |
Logical. If |
... |
Other parameters to pass to the |
Several plots.
Mathieu Ribatet
x <- rbvgpd(1000, alpha = 0.55, mar1 = c(0,1,0.25), mar2 = c(2,0.5,0.1)) Mlog <- fitbvgpd(x, c(0, 2), "log") layout(matrix(c(1,1,2,2,0,3,3,0), 2, byrow = TRUE)) plot(Mlog)
x <- rbvgpd(1000, alpha = 0.55, mar1 = c(0,1,0.25), mar2 = c(2,0.5,0.1)) Mlog <- fitbvgpd(x, c(0, 2), "log") layout(matrix(c(1,1,2,2,0,3,3,0), 2, byrow = TRUE)) plot(Mlog)
Plot several graphics to judge goodness of fit of the fitted model.
## S3 method for class 'mcpot' plot(x, opy, exi, mains, which = 1:4, ask = nb.fig < length(which) && dev.interactive(), acf.type = "partial", ...)
## S3 method for class 'mcpot' plot(x, opy, exi, mains, which = 1:4, ask = nb.fig < length(which) && dev.interactive(), acf.type = "partial", ...)
x |
An object of class |
opy |
Numeric. The number of Observation Per Year (or more generally per block). If missing, the function warns and set it to 365. |
exi |
Numeric. The extremal index value. If missing, the estimator of Ferro and Segers (2003) is used. |
mains |
May be missing. If present a 4–vector of character strings which gives the titles of the plots. |
which |
a numeric vector which specifies which plot must be
drawn: |
ask |
Logical. If |
acf.type |
The type of auto correlation to be plotted. Must be
one of |
... |
Other parameters to pass to the |
Several plots and returns invisibly the return level function.
See the warning for the return level estimation in documentation of
the retlev.mcpot
function.
For the return level plot, the observations are not plotted as these
are dependent realisations. In particular, the return periods computed
using the prob2rp
are inaccurate.
Mathieu Ribatet
Ferro, C. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society B. 65: 545–556.
set.seed(123) mc <- simmc(200, alpha = 0.5) mc <- qgpd(mc, 0, 1, 0.25) Mclog <- fitmcgpd(mc, 1) par(mfrow=c(2,2)) rlMclog <- plot(Mclog) rlMclog(T = 3)
set.seed(123) mc <- simmc(200, alpha = 0.5) mc <- qgpd(mc, 0, 1, 0.25) Mclog <- fitmcgpd(mc, 1) par(mfrow=c(2,2)) rlMclog <- plot(Mclog) rlMclog(T = 3)
Produces QQ-plot, Probability Plot and a Density Plot of the fitted model versus the empirical one. Another function computes the Return Level Plot of the fitted model.
## S3 method for class 'uvpot' plot(x, npy, main, which = 1:4, ask = nb.fig < length(which) && dev.interactive(),ci = TRUE, ...)
## S3 method for class 'uvpot' plot(x, npy, main, which = 1:4, ask = nb.fig < length(which) && dev.interactive(),ci = TRUE, ...)
x |
A fitted object of class |
npy |
The mean Number of events Per Year - or more generally a block. |
main |
optional. A string vector corresponding to the title of each plot. |
which |
a numeric vector which specifies which plot must be drawn
: |
ask |
Logical. If |
ci |
Logical. If |
... |
Other parameters to pass to the |
Mathieu Ribatet
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) fgpd <- fitgpd(ardieres[, "obs"], 6, 'mle') npy <- fgpd$nat / 33.4 ##33.4 is the total record length (in year) par(mfrow=c(2,2)) plot(fgpd, npy = npy)
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) fgpd <- fitgpd(ardieres[, "obs"], 6, 'mle') npy <- fgpd$nat / 33.4 ##33.4 is the total record length (in year) par(mfrow=c(2,2)) plot(fgpd, npy = npy)
pp
is a generic function used to show probability-probability plot.
The function invokes particular methods
which depend on the class
of the first argument.
So the function makes a probability probability plot for univariate POT models.
pp(object, ...) ## S3 method for class 'uvpot' pp(object, main, xlab, ylab, ci = TRUE, ...)
pp(object, ...) ## S3 method for class 'uvpot' pp(object, main, xlab, ylab, ci = TRUE, ...)
object |
A fitted object. When using the POT package, an object
of class |
main |
The title of the graphic. If missing, the title is set to
|
xlab , ylab
|
The labels for the x and y axis. If missing, they are
set to |
ci |
Logical. If |
... |
Other arguments to be passed to the |
The probability probability plot consists of plotting the theoretical probabilities in function of the empirical model ones. The theoretical probabilities are computed from the fitted GPD, while the empirical ones are computing from a particular plotting position estimator. This plotting position estimator is suited for the GPD case (Hosking, 1995) and are defined by:
where is the total number of observations.
If the theoretical model is correct, then points should be “near”
the line .
A graphical window.
Mathieu Ribatet
Hosking, J. R. M. and Wallis, J. R. (1995). A comparison of unbiased and plotting-position estimators of L moments. Water Resources Research. 31(8): 2019–2025.
x <- rgpd(75, 1, 2, 0.1) pwmb <- fitgpd(x, 1, "pwmb") pp(pwmb)
x <- rgpd(75, 1, 2, 0.1) pwmb <- fitgpd(x, 1, "pwmb") pp(pwmb)
Print a “bvpot” object
## S3 method for class 'bvpot' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'bvpot' print(x, digits = max(3, getOption("digits") - 3), ...)
x |
An object of class |
digits |
The number of digits to be printed. |
... |
Other options to be passed to the |
Print on screen.
Mathieu Ribatet
print.uvpot
, print.mcpot
,
print
set.seed(123) x <- rgpd(500, 0, 1, 0.2) y <- rgpd(500, 2, 0.5, -0.1) Mlog <- fitbvgpd(cbind(x, y), c(0, 2)) Mlog
set.seed(123) x <- rgpd(500, 0, 1, 0.2) y <- rgpd(500, 2, 0.5, -0.1) Mlog <- fitbvgpd(cbind(x, y), c(0, 2)) Mlog
Print an “mcpot” object
## S3 method for class 'mcpot' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'mcpot' print(x, digits = max(3, getOption("digits") - 3), ...)
x |
An object of class |
digits |
The number of digits to be printed. |
... |
Other options to be passed to the |
Print on screen.
Mathieu Ribatet
print.uvpot
, print.bvpot
,
print
x <- simmc(1000, alpha = 0.5) x <- qgpd(x, 0, 1, 0.15) Mc <- fitmcgpd(x, 0) Mc
x <- simmc(1000, alpha = 0.5) x <- qgpd(x, 0, 1, 0.15) Mc <- fitmcgpd(x, 0) Mc
Print an “uvpot” object
## S3 method for class 'uvpot' print(x, digits = max(3, getOption("digits") - 3), ...)
## S3 method for class 'uvpot' print(x, digits = max(3, getOption("digits") - 3), ...)
x |
An object of class |
digits |
The number of digits to be printed. |
... |
Other options to be passed to the |
Print on screen.
Mathieu Ribatet
print.bvpot
, print.mcpot
,
print
x <- rgpd(500, 0, 1, 0.2) MLE <- fitgpd(x, 0) MLE
x <- rgpd(500, 0, 1, 0.2) MLE <- fitgpd(x, 0) MLE
Compute profiled confidence intervals on parameter and return level for the GP distribution. This is achieved through the profile likelihood procedure.
gpd.pfshape(object, range, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...) gpd.pfscale(object, range, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...) gpd.pfrl(object, prob, range, thresh, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...)
gpd.pfshape(object, range, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...) gpd.pfscale(object, range, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...) gpd.pfrl(object, prob, range, thresh, xlab, ylab, conf = 0.95, nrang = 100, vert.lines = TRUE, ...)
object |
|
prob |
The probability of non exceedance. |
range |
Vector of dimension two. It gives the lower and upper bound on which the profile likelihood is performed. |
thresh |
Optional. The threshold. Only needed with non constant threshold. |
xlab , ylab
|
Optional Strings. Allows to label the x-axis and y-axis. If missing, default value are considered. |
conf |
Numeric. The confidence level. |
nrang |
Numeric. It specifies the number of profile likelihood
computed on the whole range |
vert.lines |
Logical. If |
... |
Optional parameters to be passed to the
|
Returns a vector of the lower and upper bound for the profile confidence interval. Moreover, a graphic of the profile likelihood function is displayed.
Mathieu Ribatet
Coles, S. (2001). An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics. London.
gpd.fiscale
, gpd.fishape
,
gpd.firl
and confint
data(ardieres) events <- clust(ardieres, u = 4, tim.cond = 8 / 365, clust.max = TRUE) MLE <- fitgpd(events[, "obs"], 4, 'mle') gpd.pfshape(MLE, c(0, 0.8)) rp2prob(10, 2) gpd.pfrl(MLE, 0.95, c(12, 25))
data(ardieres) events <- clust(ardieres, u = 4, tim.cond = 8 / 365, clust.max = TRUE) MLE <- fitgpd(events[, "obs"], 4, 'mle') gpd.pfshape(MLE, c(0, 0.8)) rp2prob(10, 2) gpd.pfrl(MLE, 0.95, c(12, 25))
qq
is a generic function used to show quantile-quantile plot.
The function invokes particular methods
which depend on the class
of the first argument.
So the function makes a quantile quantile plot for univariate POT models.
qq(object, ...) ## S3 method for class 'uvpot' qq(object, main, xlab, ylab, ci = TRUE, ...)
qq(object, ...) ## S3 method for class 'uvpot' qq(object, main, xlab, ylab, ci = TRUE, ...)
object |
A fitted object. When using the POT package, an object
of class |
main |
The title of the graphic. If missing, the title is set to
|
xlab , ylab
|
The labels for the x and y axis. If missing, they are
set to |
ci |
Logical. If |
... |
Other arguments to be passed to the |
The quantile quantile plot consists of plotting the observed quantiles
in function of the theoretical ones. The theoretical quantiles
are computed from the fitted GPD, that
is:
where is the fitted quantile function and
are empirical probabilities defined by :
where is the total number of observations - see Hosking
(1995).
If the theoretical model is correct, then points should be “near”
the line .
A graphical window.
Mathieu Ribatet
Hosking, J. R. M. and Wallis, J. R. (1995). A comparison of unbiased and plotting-position estimators of L moments. Water Resources Research. 31(8): 2019–2025.
x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") qq(pwmu)
x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") qq(pwmu)
retlev
is a generic function used to show return level plot.
The function invokes particular methods
which depend on the class
of the first argument.
So the function makes a return level plot for POT models.
retlev(object, ...) ## S3 method for class 'uvpot' retlev(object, npy, main, xlab, ylab, xlimsup, ci = TRUE, points = TRUE, ...) ## S3 method for class 'mcpot' retlev(object, opy, exi, main, xlab, ylab, xlimsup, ...)
retlev(object, ...) ## S3 method for class 'uvpot' retlev(object, npy, main, xlab, ylab, xlimsup, ci = TRUE, points = TRUE, ...) ## S3 method for class 'mcpot' retlev(object, opy, exi, main, xlab, ylab, xlimsup, ...)
object |
A fitted object. When using the POT package, an object
of class |
npy |
The mean Number of events Per Year (or more generally per block).if missing, setting it to 1. |
main |
The title of the graphic. If missing, the title is set to
|
xlab , ylab
|
The labels for the x and y axis. If missing, they are
set to |
xlimsup |
Numeric. The right limit for the x-axis. If missing, a suited value is computed. |
ci |
Logical. Should the 95% pointwise confidence interval be plotted? |
points |
Logical. Should observations be plotted? |
... |
Other arguments to be passed to the |
opy |
The number of Observations Per Year (or more generally per block). If missing, it is set it to 365 i.e. daily values with a warning. |
exi |
Numeric. The extremal index. If missing, an estimate is
given using the |
For class "uvpot"
, the return level plot consists of plotting the theoretical quantiles
in function of the return period (with a logarithmic scale for the
x-axis). For the definition of the return period see the
prob2rp
function. Thus, the return level plot consists
of plotting the points defined by:
where is the return period related to the non
exceedance probability
,
is the
fitted quantile function.
If points = TRUE
, the probabilities related to
each observation are computed using the following plotting position
estimator proposed by Hosking (1995):
where is the total number of observations.
If the theoretical model is correct, the points should be “close” to the “return level” curve.
For class "mcpot"
, let be the first
observations from a stationary sequence with marginal distribution
function
. Thus, we can use the following (asymptotic)
approximation:
where is the extremal index.
Thus, to obtain the T-year return level, we equate this equation to
and solve for
.
A graphical window. In addition, it returns invisibly the return level function.
For class "mcpot"
, though this is computationally expensive, we recommend to give the
extremal index estimate using the dexi
function. Indeed,
there is a severe bias when using the Ferro and Segers (2003)
estimator - as it is estimated using observation and not the Markov
chain model.
Mathieu Ribatet
Hosking, J. R. M. and Wallis, J. R. (1995). A comparison of unbiased and plotting-position estimators of L moments. Water Resources Research. 31(8): 2019–2025.
Ferro, C. and Segers, J. (2003). Inference for clusters of extreme values. Journal of the Royal Statistical Society B. 65: 545–556.
#for uvpot class x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") rl.fun <- retlev(pwmu) rl.fun(100) #for mcpot class data(ardieres) Mcalog <- fitmcgpd(ardieres[,"obs"], 5, "alog") retlev(Mcalog, opy = 990)
#for uvpot class x <- rgpd(75, 1, 2, 0.1) pwmu <- fitgpd(x, 1, "pwmu") rl.fun <- retlev(pwmu) rl.fun(100) #for mcpot class data(ardieres) Mcalog <- fitmcgpd(ardieres[,"obs"], 5, "alog") retlev(Mcalog, opy = 990)
Plot return levels for a fitted bivariate extreme value distribution.
## S3 method for class 'bvpot' retlev(object, p = seq(0.75,0.95,0.05), main, n = 5000, only.excess = FALSE, ...)
## S3 method for class 'bvpot' retlev(object, p = seq(0.75,0.95,0.05), main, n = 5000, only.excess = FALSE, ...)
object |
An object of class |
p |
A vector of probabilities for which return levels must be drawn. |
main |
The title of the graphic window. May be missing. |
n |
The number (default: 5000) of points needed to draw return levels lines. |
only.excess |
Logical. If |
... |
Other parameters to pass to the |
Any bivariate extreme value distribution has the Pickands' representation form i.e.:
where corresponds to
transformed to be
unit Frechet distributed and
which lies in
.
Thus, for a fixed probability and
, we have the
corresponding
,
values:
At last, the are transformed back to their original
scale.
Plot return levels for a fitted bivariate extreme value distribution. Moreover, an invisible list is return which gives the points used to draw the current plot.
Mathieu Ribatet
x <- rbvgpd(1000, alpha = 0.25, mar1 = c(0, 1, 0.25)) Mlog <- fitbvgpd(x, c(0, 0), "log") retlev(Mlog)
x <- rbvgpd(1000, alpha = 0.25, mar1 = c(0, 1, 0.25)) Mlog <- fitbvgpd(x, c(0, 0), "log") retlev(Mlog)
Compute return period from probability of non exceedance and vice versa.
rp2prob(retper, npy) prob2rp(prob, npy)
rp2prob(retper, npy) prob2rp(prob, npy)
retper |
The return period. |
prob |
the probability of non exceedance. |
npy |
The mean Number of events per year (block). |
The return period is defined by:
where is the mean number of events per year (block),
is the probability of non exceedance.
Returns a table with mean numbers of events per year, return periods and probabilities of non exceedance associated.
Mathieu Ribatet
rp2prob(50, 1.8) prob2rp(0.6, 2.2)
rp2prob(50, 1.8) prob2rp(0.6, 2.2)
Simulation of first order Markov chains, such that each pair of consecutive values has the dependence structure of one of nine parametric bivariate extreme value distributions.
simmc(n, alpha, model = "log", asCoef, asCoef1, asCoef2, margins = "uniform")
simmc(n, alpha, model = "log", asCoef, asCoef1, asCoef2, margins = "uniform")
n |
Number of observations. |
alpha |
Dependence parameter for the logistic, asymmetric logistic, negative logistic, asymmetric negative logistic, mixed and asymmetric mixed models. |
asCoef , asCoef1 , asCoef2
|
The asymmetric coefficients for the asymmetric logistic, asymmetric negative logistic and asymmetric mixed models. |
model |
The specified model; a character string. Must be
either |
margins |
The marginal distribution of each value; a
character string. Must be either |
A numeric vector of length n
.
Alec Stephenson (modified for the POT package by Mathieu Ribatet)
simmc(100, alpha = 0.1, model = "log") simmc(100, alpha = 1.2, model = "nlog", margins = "gum")
simmc(100, alpha = 0.1, model = "log") simmc(100, alpha = 1.2, model = "nlog", margins = "gum")
Simulate a synthetic Markov chain from a fitted 'mcpot'
object.
simmcpot(object, plot = TRUE, ...)
simmcpot(object, plot = TRUE, ...)
object |
An object of class |
plot |
Logical. If |
... |
Other optional arguments to be passed to the
|
The simulated Markov chain is computed as follows:
Simulate a Markov chain prob
with uniform margins on
(0,1) and with the fixed extreme value dependence given by
object
;
For all prob
such as , set
(where
pat
is given by
object$pat
);
For all prob
such as , set
. Thus,
prob2
are uniformly distributed on
(0,1);
For all prob2
, set mc = qgpd(prob2, thresh,
scale, shape)
, where thresh, scale, shape
are given by the
object$threshold, object$param["scale"]
and
object$param["shape"]
respectively.
A Markov chain which has the same features as the fitted object. If
plot = TRUE
, the Markov chain is plotted.
Mathieu Ribatet
data(ardieres) flows <- ardieres[,"obs"] Mclog <- fitmcgpd(flows, 5) par(mfrow = c(1,2)) idx <- which(flows <= 5) flows[idx] <- NA plot(flows, main = "Ardieres Data") flowsSynth <- simmcpot(Mclog, main = "Simulated Data")
data(ardieres) flows <- ardieres[,"obs"] Mclog <- fitmcgpd(flows, 5) par(mfrow = c(1,2)) idx <- which(flows <= 5) flows[idx] <- NA plot(flows, main = "Ardieres Data") flowsSynth <- simmcpot(Mclog, main = "Simulated Data")
Plot the spectral density for a bivariate extreme value distribution or an extreme Markov chain model.
specdens(object, main, plot = TRUE, ...)
specdens(object, main, plot = TRUE, ...)
object |
An object of class |
main |
The title of the graphic window. May be missing. |
plot |
Logical. Should the spectral density be plotted? The default is to plot it. |
... |
Other options to be passed to the |
Any bivariate extreme value distribution has the following representation:
where holds:
is called the spectral measure with density
. Thus,
is called the spectral density. In
addition,
has a total mass of 2.
For two independent random variables, the spectral measure consists of
two points of mass 1 at . For two perfect dependent
random variables, the spectral measure consists of a single point of
mass 2 at
.
Plot the spectral density for a fitted bivariate extreme value distribution. Moreover, the spectral density is returned invisibly.
Mathieu Ribatet
retlev.bvpot
, pickdep
and
plot.bvpot
par(mfrow=c(1,2)) ##Spectral density for a Markov Model mc <- simmc(1000, alpha = 0.25, model = "log") mc <- qgpd(mc, 0, 1, 0.1) Mclog <- fitmcgpd(mc, 0, "log") specdens(Mclog) ##Spectral density for a bivariate POT model x <- rgpd(500, 5, 1, -0.1) y <- rgpd(500, 2, 0.2, -0.25) Manlog <- fitbvgpd(cbind(x,y), c(5,2), "anlog") specdens(Manlog)
par(mfrow=c(1,2)) ##Spectral density for a Markov Model mc <- simmc(1000, alpha = 0.25, model = "log") mc <- qgpd(mc, 0, 1, 0.1) Mclog <- fitmcgpd(mc, 0, "log") specdens(Mclog) ##Spectral density for a bivariate POT model x <- rgpd(500, 5, 1, -0.1) y <- rgpd(500, 2, 0.2, -0.25) Manlog <- fitbvgpd(cbind(x,y), c(5,2), "anlog") specdens(Manlog)
Compactly display the structure of an object of class 'pot'
## S3 method for class 'pot' summary(object, ...)
## S3 method for class 'pot' summary(object, ...)
object |
An object of class |
... |
Other arguments to be passed to the |
Standard summary
object: see summary
.
Christophe Dutang
set.seed(123) x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) summary(fmle)
set.seed(123) x <- rgpd(500, 0, 1, -0.15) fmle <- fitgpd(x, 0) summary(fmle)
Several tests for tail independence (e.g. asymptotic independence) for a bivariate extreme value distribution
tailind.test(data, c = -0.1, emp.trans = TRUE, chisq.n.class = 4)
tailind.test(data, c = -0.1, emp.trans = TRUE, chisq.n.class = 4)
data |
A matrix with two columns given the data. |
c |
A negative numeric. Must be close to zero to approximate accurately asymptotic results. |
emp.trans |
Logical. If |
chisq.n.class |
A numeric given the number of classes for the Chi squared test. |
These tests are based on an asymptotic results shown by Falk and Michel
(2006). Let be a random vector which follows in its
upper tail a bivariate extreme value distribution with reverse
exponential margins. The conditional distribution function of
, given that
, converges to
,
, if
iff
and
are
asymptotically independent. Otherwise, the limit is
This function returns a table with the Neymann-Pearson, Fisher, Kolmogorov-Smirnov and Chi-Square statistics and the related p-values.
Mathieu Ribatet
Falk, M. and Michel, Rene(2006) Testing for tail independence in extreme value models. Annals of the Institute of Statistical Mathematics 58: 261–290
##A total independence example x <- rbvgpd(7000, alpha = 1, mar1 = c(0, 1, 0.25)) tailind.test(x) ##An asymptotically dependent example y <- rbvgpd(7000, alpha = 0.75, model = "nlog", mar1 = c(0, 1, 0.25), mar2 = c(2, 0.5, -0.15)) tailind.test(y) ##A perfect dependence example z <- rnorm(7000) tailind.test(cbind(z, 2*z - 5))
##A total independence example x <- rbvgpd(7000, alpha = 1, mar1 = c(0, 1, 0.25)) tailind.test(x) ##An asymptotically dependent example y <- rbvgpd(7000, alpha = 0.75, model = "nlog", mar1 = c(0, 1, 0.25), mar2 = c(2, 0.5, -0.15)) tailind.test(y) ##A perfect dependence example z <- rnorm(7000) tailind.test(cbind(z, 2*z - 5))
Plots of parameter estimates at various thresholds for peaks over threshold modelling, using the Generalized Pareto or Point Process representation.
tcplot(data, u.range, cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, nt = 25, which = 1:npar, conf = 0.95, lty = 1, lwd = 1, type = "b", cilty = 1, ask = nb.fig < length(which) && dev.interactive(), ...)
tcplot(data, u.range, cmax = FALSE, r = 1, ulow = -Inf, rlow = 1, nt = 25, which = 1:npar, conf = 0.95, lty = 1, lwd = 1, type = "b", cilty = 1, ask = nb.fig < length(which) && dev.interactive(), ...)
data |
A numeric vector. |
u.range |
A numeric vector of length two, giving the limits for the thresholds at which the model is fitted. |
cmax |
Logical; if |
r , ulow , rlow
|
Arguments used for the identification of clusters
of exceedances. Ignored if |
nt |
The number of thresholds at which the model is fitted. |
which |
If a subset of the plots is required, specify a
subset of the numbers |
conf |
The (pointwise) confidence coefficient for the plotted confidence intervals. Use zero to suppress. |
lty , lwd
|
The line type and width of the line connecting the parameter estimates. |
type |
The form taken by the line connecting the parameter
estimates and the points denoting these estimates. Possible
values include |
cilty |
The line type of the lines depicting the confidence intervals. |
ask |
Logical; if |
... |
Other arguments to be passed to the model fit
function |
For each of the nt
thresholds a peaks over threshold model
is fitted using the function fitgpd
. The maximum likelihood
estimates for the shape and the modified scale parameter (modified by
subtracting the shape multiplied by the threshold) are plotted against
the thresholds. If the threshold u
is a valid threshold to be
used for peaks over threshold modelling, the parameter estimates
depicted should be approximately constant above u
.
A list is invisibly returned. Each component is a matrix with three columns giving parameter estimates and confidence limits.
Stuart Coles and Alec Stephenson
Coles, S. (2001) An Introduction to Statistical Modelling of Extreme Values. Springer Series in Statistics. London.
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] par(mfrow=c(1,2)) tcplot(flows, u.range = c(0, 15) )
data(ardieres) ardieres <- clust(ardieres, 4, 10 / 365, clust.max = TRUE) flows <- ardieres[, "obs"] par(mfrow=c(1,2)) tcplot(flows, u.range = c(0, 15) )
This function performs a mobile average windows on the whole time series. Thus, if the time series represents flood discharges, it returns the averaged discharges over a specific duration.
ts2tsd(ts, d, vol = FALSE, method = "linear")
ts2tsd(ts, d, vol = FALSE, method = "linear")
ts |
The time series. It consists of two columns: one named
|
d |
Numeric which corresponds of the duration for the mobile window. |
vol |
Logical. If |
method |
Specifies the interpolation method to be used. Choices
are |
A mobile windows of length d
is performed on the whole time
sire. The “discrete” time series in first transformed in a
function; interpolation are obtained using the approx
function. Thus, if f(t) is the function representing the time series,
volume over duration d
is defined by:
while average values are:
Returns a time series like object ts
. In particular
ts[,"time"]
and tsd[,"time"]
are identical.
Please note that as the time series is interpolated, caution should be taken if the method to interpolate is not efficient.
Note that object d
should have the same unit than
ts[,"time"]
.
Mathieu Ribatet
data(ardieres) tsd <- ts2tsd(ardieres, 3 / 365) plot(ardieres, type = "l", col = "blue") lines(tsd, col = "green")
data(ardieres) tsd <- ts2tsd(ardieres, 3 / 365) plot(ardieres, type = "l", col = "blue") lines(tsd, col = "green")
A diagnostic tool to assess for short range asymptotic dependence within a stationary time series.
tsdep.plot(data, u, ..., xlab, ylab, n.boot = 100, show.lines = TRUE, lag.max, ci = 0.95, block.size = 5 * lag.max, angle = 90, arrow.length = 0.1)
tsdep.plot(data, u, ..., xlab, ylab, n.boot = 100, show.lines = TRUE, lag.max, ci = 0.95, block.size = 5 * lag.max, angle = 90, arrow.length = 0.1)
data |
The time series observations. |
u |
The threshold. |
... |
Optional arguments to be passed to the |
xlab , ylab
|
The x and y-axis labels. |
n.boot |
Numeric. The number of replicates to compute the bootstrap confidence interval. |
show.lines |
Logical. If |
lag.max |
The maximum lag to be explored - may be missing. |
ci |
The level for the bootstrap confidence interval. The default is the 95% confidence interval. |
block.size |
The size for the contiguous bootstrap approach. |
angle |
The angle at the end of the error bar. If |
arrow.length |
The length to be passed in the function
|
Let X_t
be a stationary sequence of unit Frechet random
variables. By stationarity, the joint survivor function
of
does not depend on
.
One parametric representation for is given by
for some parameter and a
slowly varying function
.
The statistic is defined by
This statistic belongs to (-1,1] and is a measure of extremal
dependence. corresponds to
asymptotic dependence,
to positive extremal association,
to “near” independence and
to negative extremal association.
This function plot the statictics
against the lag. Bootstrap confidence intervals are also drawn. The
function returns invisibly this statistic and the confidence bounds.
Mathieu Ribatet
Ledford, A. and Tawn, J. (2003) Diagnostics for dependence within time series extremes. L. R. Statist. Soc. B. 65, Part 2, 521–543.
Ledford, A. and Tawn, J (1996) Statistics for near independence in multivariate extreme values. Biometrika 83 169–187.
##An independent case tsdep.plot(runif(5000), u = 0.95, lag.max = 5) ##Asymptotic dependence mc <- simmc(5000, alpha = 0.2) tsdep.plot(mc, u = 0.95, lag.max = 5)
##An independent case tsdep.plot(runif(5000), u = 0.95, lag.max = 5) ##Asymptotic dependence mc <- simmc(5000, alpha = 0.2) tsdep.plot(mc, u = 0.95, lag.max = 5)