Title: | DPQ (Density, Probability, Quantile) Distribution Computations using MPFR |
---|---|
Description: | An extension to the 'DPQ' package with computations for 'DPQ' (Density (pdf), Probability (cdf) and Quantile) functions, where the functions here partly use the 'Rmpfr' package and hence the underlying 'MPFR' and 'GMP' C libraries. |
Authors: | Martin Maechler [aut, cre] |
Maintainer: | Martin Maechler <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.3-3 |
Built: | 2024-11-02 19:12:53 UTC |
Source: | https://github.com/r-forge/specfun |
An extension to the 'DPQ' package with computations for 'DPQ' (Density (pdf), Probability (cdf) and Quantile) functions, where the functions here partly use the 'Rmpfr' package and hence the underlying 'MPFR' and 'GMP' C libraries.
The DESCRIPTION file:
Package: | DPQmpfr |
Title: | DPQ (Density, Probability, Quantile) Distribution Computations using MPFR |
Version: | 0.3-3 |
VersionNote: | Last CRAN: 0.3-2 on 2023-12-05; 0.3-1 on 2021-05-17 |
Date: | 2024-08-19 |
Authors@R: | person("Martin","Maechler", role=c("aut","cre"), email="[email protected]", comment = c(ORCID = "0000-0002-8685-9910")) |
Description: | An extension to the 'DPQ' package with computations for 'DPQ' (Density (pdf), Probability (cdf) and Quantile) functions, where the functions here partly use the 'Rmpfr' package and hence the underlying 'MPFR' and 'GMP' C libraries. |
Depends: | R (>= 3.6.0) |
Imports: | DPQ (>= 0.5-3), Rmpfr (>= 0.9-0), gmp (>= 0.6-4), sfsmisc, stats, graphics, methods, utils |
Suggests: | Matrix |
SuggestsNote: | Matrix for its test-tools-1.R |
License: | GPL (>= 2) |
Encoding: | UTF-8 |
URL: | https://specfun.r-forge.r-project.org/, https://r-forge.r-project.org/R/?group_id=611, https://r-forge.r-project.org/scm/viewvc.php/pkg/DPQmpfr/?root=specfun, svn://svn.r-forge.r-project.org/svnroot/specfun/pkg/DPQmpfr |
BugReports: | https://r-forge.r-project.org/tracker/?atid=2462&group_id=611 |
Repository: | https://r-forge.r-universe.dev |
RemoteUrl: | https://github.com/r-forge/specfun |
RemoteRef: | HEAD |
RemoteSha: | c953b5aa5accf58418aa63e70638b0c0850a9c0d |
Author: | Martin Maechler [aut, cre] (<https://orcid.org/0000-0002-8685-9910>) |
Maintainer: | Martin Maechler <[email protected]> |
Index of help topics:
DPQmpfr-package DPQ (Density, Probability, Quantile) Distribution Computations using MPFR algdivM Compute log(gamma(b)/gamma(a+b)) Accurately, also via 'Rmpfr' dbetaD94 Ding(1994) (non-central) Beta Distribution Functions dhyperQ Exact Hypergeometric Distribution Probabilites dntJKBm Non-central t-Distribution Density gam1M Compute 1/Gamma(x+1) - 1 Accurately ldexp Numeric / Mpfr Utilities for DPQmpfr lgamma1pM Compute log( Gamma(x+1) ) Arbitrarily (MPFR) Accurately pbeta_ser Beta Distribution Function - 'BPSER' Series Expansion from TOMS 708 pnormAsymp Asymptotic Approximations of Extreme Tail 'pnorm()' and 'qnorm()' pnormL_LD10 Bounds for 1-Phi(.) - Mill's Ratio related Bounds for pnorm() qbBaha2017 Accurate qbeta() values from Baharev et al (2017)'s Program stirlerrM Stirling Formula Approximation Error
Martin Maechler [aut, cre] (<https://orcid.org/0000-0002-8685-9910>)
Maintainer: Martin Maechler <[email protected]>
Packages DPQ, Rmpfr are both used by this package.
## An example how mpfr-numbers "just work" with reasonable R functions: .srch <- search() ; doAtt <- is.na(match("Rmpfr:package", .srch)) if(doAtt) require(Rmpfr) nu.s <- 2^seqMpfr(mpfr(-30, 64), mpfr(100, 64), by = 1/mpfr(4, 64)) b0 <- DPQ::b_chi(nu.s) b1 <- DPQ::b_chi(nu.s, one.minus=TRUE) stopifnot(inherits(b0,"mpfr"), inherits(b1, "mpfr"), b0+b1 == 1, diff(log(b1)) < 0) plot(nu.s, log(b1), type="l", log="x") plot(nu.s[-1], diff(log(b1)), type="l", log="x") if(doAtt) # detach the package(s) we've attached above for(pkg in setdiff(search(), .srch)) detach(pkg, character.only=TRUE)
## An example how mpfr-numbers "just work" with reasonable R functions: .srch <- search() ; doAtt <- is.na(match("Rmpfr:package", .srch)) if(doAtt) require(Rmpfr) nu.s <- 2^seqMpfr(mpfr(-30, 64), mpfr(100, 64), by = 1/mpfr(4, 64)) b0 <- DPQ::b_chi(nu.s) b1 <- DPQ::b_chi(nu.s, one.minus=TRUE) stopifnot(inherits(b0,"mpfr"), inherits(b1, "mpfr"), b0+b1 == 1, diff(log(b1)) < 0) plot(nu.s, log(b1), type="l", log="x") plot(nu.s[-1], diff(log(b1)), type="l", log="x") if(doAtt) # detach the package(s) we've attached above for(pkg in setdiff(search(), .srch)) detach(pkg, character.only=TRUE)
Computes
in a numerically stable way.
The name ‘algdiv’ is from the auxiliary function in R's (TOMS 708) implementation of
pbeta()
.
As package DPQ provides R's Mathlib (double precision) as R
function algdiv()
, we append ‘M’ to show the reliance on
the Rmpfr package.
algdivM(a, b, usePr = NULL)
algdivM(a, b, usePr = NULL)
a , b
|
numeric or numeric-alike vectors (recycled to the same length
if needed), typically inheriting from |
usePr |
positive integer specifying the precision in bits, or
|
Note that this is also useful to compute the Beta function
Clearly,
In our ‘../tests/qbeta-dist.R’ file, we look into computing
accurately for
.
We are proposing a nice solution there.
How is this related to algdiv()
?
Additionally, we have defined
such that
fulfills simply
see logQab_asy
from package DPQ.
a numeric vector of length max(length(a), length(b))
(if neither
is of length 0, in which case the result has length 0 as well).
Martin Maechler (for the Rmpfr version).
Didonato, A. and Morris, A., Jr, (1992) Algorithm 708: Significant digit computation of the incomplete beta function ratios, ACM Transactions on Mathematical Software 18, 360–373.
gamma
, beta
;
the (double precision) version algdiv()
in DPQ,
and also in DPQ, the asymptotic approximation
logQab_asy()
.
Qab <- algdivM(2:3, 8:14) cbind(a = 2:3, b = 8:14, Qab) # recycling with a warning ## algdivM() and my logQab_asy() give *very* similar results for largish b: (lQab <- DPQ::logQab_asy(3, 100)) all.equal( - algdivM(3, 100), lQab, tolerance=0) # 1.283e-16 !! ## relative error 1 + lQab/ algdivM(3, 1e10) # 0 (64b F 30 Linux; 2019-08-15) ## in-and outside of "certified" argument range {b >= 8}: a. <- c(1:3, 4*(1:8))/32 b. <- seq(1/4, 20, by=1/4) ad <- t(outer(a., b., algdivM)) ## direct computation: f.algdiv0 <- function(a,b) lgamma(b) - lgamma(a+b) f.algdiv1 <- function(a,b) lgamma(b) - lgamma(a+b) ad.d <- t(outer(a., b., f.algdiv0)) matplot (b., ad.d, type = "o", cex=3/4, main = quote(log(Gamma(b)/Gamma(a+b)) ~" vs. algdivM(a,b)")) mtext(paste0("a[1:",length(a.),"] = ", paste0(paste(head(paste0(formatC(a.*32), "/32")), collapse=", "), ", .., 1"))) matlines(b., ad, type = "l", lwd=4, lty=1, col=adjustcolor(1:6, 1/2)) abline(v=1, lty=3, col="midnightblue") # The larger 'b', the more accurate the direct formula wrt algdivM() all.equal(ad[b. >= 1,], ad.d[b. >= 1,] )# 1.5e-5 all.equal(ad[b. >= 2,], ad.d[b. >= 2,], tol=0)# 3.9e-9 all.equal(ad[b. >= 4,], ad.d[b. >= 4,], tol=0)# 4.6e-13 all.equal(ad[b. >= 6,], ad.d[b. >= 6,], tol=0)# 3.0e-15 all.equal(ad[b. >= 8,], ad.d[b. >= 8,], tol=0)# 2.5e-15 (not much better)
Qab <- algdivM(2:3, 8:14) cbind(a = 2:3, b = 8:14, Qab) # recycling with a warning ## algdivM() and my logQab_asy() give *very* similar results for largish b: (lQab <- DPQ::logQab_asy(3, 100)) all.equal( - algdivM(3, 100), lQab, tolerance=0) # 1.283e-16 !! ## relative error 1 + lQab/ algdivM(3, 1e10) # 0 (64b F 30 Linux; 2019-08-15) ## in-and outside of "certified" argument range {b >= 8}: a. <- c(1:3, 4*(1:8))/32 b. <- seq(1/4, 20, by=1/4) ad <- t(outer(a., b., algdivM)) ## direct computation: f.algdiv0 <- function(a,b) lgamma(b) - lgamma(a+b) f.algdiv1 <- function(a,b) lgamma(b) - lgamma(a+b) ad.d <- t(outer(a., b., f.algdiv0)) matplot (b., ad.d, type = "o", cex=3/4, main = quote(log(Gamma(b)/Gamma(a+b)) ~" vs. algdivM(a,b)")) mtext(paste0("a[1:",length(a.),"] = ", paste0(paste(head(paste0(formatC(a.*32), "/32")), collapse=", "), ", .., 1"))) matlines(b., ad, type = "l", lwd=4, lty=1, col=adjustcolor(1:6, 1/2)) abline(v=1, lty=3, col="midnightblue") # The larger 'b', the more accurate the direct formula wrt algdivM() all.equal(ad[b. >= 1,], ad.d[b. >= 1,] )# 1.5e-5 all.equal(ad[b. >= 2,], ad.d[b. >= 2,], tol=0)# 3.9e-9 all.equal(ad[b. >= 4,], ad.d[b. >= 4,], tol=0)# 4.6e-13 all.equal(ad[b. >= 6,], ad.d[b. >= 6,], tol=0)# 3.0e-15 all.equal(ad[b. >= 8,], ad.d[b. >= 8,], tol=0)# 2.5e-15 (not much better)
The three functions "p" (cumulative distribution, CDF), "d" (density
(PDF)), and "q" (quantile) use Ding(1994)'s algorithm A, B, and C, respectively,
each of which implements a recursion formula using only simple
arithmetic and log
and exp
.
These are particularly useful also for using with high precision
"mpfr"
numbers from the Rmpfr CRAN package.
dbetaD94(x, shape1, shape2, ncp = 0, log = FALSE, eps = 1e-10, itrmax = 100000L, verbose = FALSE) pbetaD94(q, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE, log_scale = (a * b > 0) && (a + b > 100 || c >= 500), eps = 1e-10, itrmax = 100000L, verbose = FALSE) qbetaD94(p, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE, log_scale = (a * b > 0) && (a + b > 100 || c >= 500), delta = 1e-6, eps = delta^2, itrmax = 100000L, iterN = 1000L, verbose = FALSE)
dbetaD94(x, shape1, shape2, ncp = 0, log = FALSE, eps = 1e-10, itrmax = 100000L, verbose = FALSE) pbetaD94(q, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE, log_scale = (a * b > 0) && (a + b > 100 || c >= 500), eps = 1e-10, itrmax = 100000L, verbose = FALSE) qbetaD94(p, shape1, shape2, ncp = 0, lower.tail = TRUE, log.p = FALSE, log_scale = (a * b > 0) && (a + b > 100 || c >= 500), delta = 1e-6, eps = delta^2, itrmax = 100000L, iterN = 1000L, verbose = FALSE)
x , q
|
numeric vector of values in |
shape1 , shape2
|
the two shape parameters of the beta distribution, must be positive. |
ncp |
the noncentrality parameter; by default zero for the (central) beta distribution; if positive, we have a noncentral beta distribution. |
p |
numeric vector of probabilities, |
log , log.p
|
logical indicating if the density or probability
values should be |
lower.tail |
logical indicating if the lower or upper tail
probability should be computed, or for |
eps |
a non-negative number specifying the desired accuracy for computing F() and f(). |
itrmax |
the maximal number of steps for computing F() and f(). |
delta |
[For |
iterN |
[For |
log_scale |
logical indicating if most of the computations should
happen in |
verbose |
logical (or integer) indicating the amount of diagnostic output during computation; by default none. |
In all three cases, a numeric vector with the same attributes as
x
(or q
respectively),
containing (an approximation) to the correponding beta distribution function.
Martin Maechler, notably log_scale
was not part of Ding's
proposals.
Cherng G. Ding (1994) On the computation of the noncentral beta distribution. Computational Statistics & Data Analysis 18, 449–455.
pbeta
. Package Rmpfr's pbetaI()
needs
both shape1
and shape2
to be integer but is typically more
efficient than the current pbetaD94()
implementation.
## Low precision (eps, delta) values as "e.g." in Ding(94): ------------------ ## Compare with Table 3 of Baharev_et_al 2017 %% ===> ./qbBaha2017.Rd <<<<<<<<<<< aa <- c(0.5, 1, 1.5, 2, 2.5, 3, 5, 10, 25) bb <- c(1:15, 10*c(2:5, 10, 25, 50)) utime <- qbet <- matrix(NA_real_, length(aa), length(bb), dimnames = list(a = formatC(aa), b = formatC(bb))) (doExtras <- DPQmpfr:::doExtras()) if(doExtras) qbetL <- utimeL <- utime p <- 0.95 delta <- 1e-4 eps <- 1e-6 system.t.usr <- function(expr) system.time(gcFirst = FALSE, expr)[["user.self"]] system.time( for(ia in seq_along(aa)) { a <- aa[ia]; cat("\n--==--\na=",a,":\n") for(ib in seq_along(bb)) { b <- bb[ib]; cat("\n>> b=",b,"\n") utime [ia, ib] <- system.t.usr( qbet[ia, ib] <- qbetaD94(p, a, b, ncp = 0, delta=delta, eps=eps, verbose = 2)) if(doExtras) utimeL[ia, ib] <- system.t.usr( qbetL[ia, ib] <- qbetaD94(p, a, b, ncp = 0, delta=delta, eps=eps, verbose = 2, log_scale=TRUE)) } cat("\n") } )# system.time(.): ~ 1 sec (lynne i7-7700T, Fedora 32, 2020) sum(print(table(round(1000*utime)))) # lynne .. : ## 0 1 2 3 4 5 6 7 8 9 10 11 14 15 16 29 ## 53 94 15 3 3 12 2 2 2 2 1 2 3 1 2 1 ## [1] 198 if(doExtras) print(sum(print(table(round(1000*utimeL))))) # lynne .. :
## Low precision (eps, delta) values as "e.g." in Ding(94): ------------------ ## Compare with Table 3 of Baharev_et_al 2017 %% ===> ./qbBaha2017.Rd <<<<<<<<<<< aa <- c(0.5, 1, 1.5, 2, 2.5, 3, 5, 10, 25) bb <- c(1:15, 10*c(2:5, 10, 25, 50)) utime <- qbet <- matrix(NA_real_, length(aa), length(bb), dimnames = list(a = formatC(aa), b = formatC(bb))) (doExtras <- DPQmpfr:::doExtras()) if(doExtras) qbetL <- utimeL <- utime p <- 0.95 delta <- 1e-4 eps <- 1e-6 system.t.usr <- function(expr) system.time(gcFirst = FALSE, expr)[["user.self"]] system.time( for(ia in seq_along(aa)) { a <- aa[ia]; cat("\n--==--\na=",a,":\n") for(ib in seq_along(bb)) { b <- bb[ib]; cat("\n>> b=",b,"\n") utime [ia, ib] <- system.t.usr( qbet[ia, ib] <- qbetaD94(p, a, b, ncp = 0, delta=delta, eps=eps, verbose = 2)) if(doExtras) utimeL[ia, ib] <- system.t.usr( qbetL[ia, ib] <- qbetaD94(p, a, b, ncp = 0, delta=delta, eps=eps, verbose = 2, log_scale=TRUE)) } cat("\n") } )# system.time(.): ~ 1 sec (lynne i7-7700T, Fedora 32, 2020) sum(print(table(round(1000*utime)))) # lynne .. : ## 0 1 2 3 4 5 6 7 8 9 10 11 14 15 16 29 ## 53 94 15 3 3 12 2 2 2 2 1 2 3 1 2 1 ## [1] 198 if(doExtras) print(sum(print(table(round(1000*utimeL))))) # lynne .. :
Computes exact probabilities for the hypergeometric distribution
(see, e.g., dhyper()
in R), using package gmp's
big integer and rational numbers, notably chooseZ()
.
dhyperQ(x, m, n, k) phyperQ(x, m, n, k, lower.tail=TRUE) phyperQall(m, n, k, lower.tail=TRUE)
dhyperQ(x, m, n, k) phyperQ(x, m, n, k, lower.tail=TRUE) phyperQall(m, n, k, lower.tail=TRUE)
x |
the number of white balls drawn without replacement from an urn which contains both black and white balls. |
m |
the number of white balls in the urn. |
n |
the number of black balls in the urn. |
k |
the number of balls drawn from the urn, hence must be in
|
lower.tail |
logical indicating if the lower or upper tail probability should be computed. |
a bigrational (class "bigq"
from package gmp) vector
“as” x
; currently of length one (as all the function
arguments must be “scalar”, currently).
Martin Maechler
chooseZ
(pkg gmp),
and R's own Hypergeometric
## dhyperQ() is simply function (x, m, n, k) { stopifnot(k - x == as.integer(k - x)) chooseZ(m, x) * chooseZ(n, k - x) / chooseZ(m + n, k) } # a case where phyper(11, 15, 0, 12, log=TRUE) gave 'NaN' (phyp5.0.12 <- cumsum(dhyperQ(0:12, m=15,n=0,k=12))) stopifnot(phyp5.0.12 == c(rep(0, 12), 1)) for(x in 0:9) stopifnot(phyperQ(x, 10,7,8) + phyperQ(x, 10,7,8, lower.tail=FALSE) == 1) (ph. <- phyperQall(m=10, n=7, k=8)) ## Big Rational ('bigq') object of length 8: ## [1] 1/2431 5/374 569/4862 2039/4862 3803/4862 4685/4862 4853/4862 1 stopifnot(identical(gmp::c_bigq(list(0, ph.)), 1- c(phyperQall(10,7,8, lower.tail=FALSE), 0))) (doExtras <- DPQmpfr:::doExtras()) if(doExtras) { # too slow for standard testing k <- 5000 system.time(ph <- phyper(k, 2*k, 2*k, 2*k)) # 0 (< 0.001 sec) system.time(phQ <- phyperQ(k, 2*k, 2*k, 2*k)) # 5.6 (was 6.3) sec ## Relative error of R's phyper() stopifnot(print(gmp::asNumeric(1 - ph/phQ)) < 1e-14) # seen 1.063e-15 }
## dhyperQ() is simply function (x, m, n, k) { stopifnot(k - x == as.integer(k - x)) chooseZ(m, x) * chooseZ(n, k - x) / chooseZ(m + n, k) } # a case where phyper(11, 15, 0, 12, log=TRUE) gave 'NaN' (phyp5.0.12 <- cumsum(dhyperQ(0:12, m=15,n=0,k=12))) stopifnot(phyp5.0.12 == c(rep(0, 12), 1)) for(x in 0:9) stopifnot(phyperQ(x, 10,7,8) + phyperQ(x, 10,7,8, lower.tail=FALSE) == 1) (ph. <- phyperQall(m=10, n=7, k=8)) ## Big Rational ('bigq') object of length 8: ## [1] 1/2431 5/374 569/4862 2039/4862 3803/4862 4685/4862 4853/4862 1 stopifnot(identical(gmp::c_bigq(list(0, ph.)), 1- c(phyperQall(10,7,8, lower.tail=FALSE), 0))) (doExtras <- DPQmpfr:::doExtras()) if(doExtras) { # too slow for standard testing k <- 5000 system.time(ph <- phyper(k, 2*k, 2*k, 2*k)) # 0 (< 0.001 sec) system.time(phQ <- phyperQ(k, 2*k, 2*k, 2*k)) # 5.6 (was 6.3) sec ## Relative error of R's phyper() stopifnot(print(gmp::asNumeric(1 - ph/phQ)) < 1e-14) # seen 1.063e-15 }
dntJKBm
is a fully Rmpfr-ified vectorized version of
dntJKBf()
from DPQ; see there.
dtWVm(x, df, ncp)
computes the density function of the t distribution with
df
degrees of freedom and non-centrality parameter ncp
,
according to Wolfgang Viechtbauer's proposal in 2002, using an asymptotic
formula for “large” df
.
dntJKBm(x, df, ncp, log = FALSE, M = 1000) # __ Deprecated __ use DPQ :: dntJKBf dtWVm (x, df, ncp, log = FALSE)
dntJKBm(x, df, ncp, log = FALSE, M = 1000) # __ Deprecated __ use DPQ :: dntJKBf dtWVm (x, df, ncp, log = FALSE)
x |
numeric or |
df |
degrees of freedom ( |
ncp |
non-centrality parameter |
log |
as in |
M |
the number of terms to be used, a positive integer. |
See dtWV
's details (package DPQ).
As DPQ's dntJKBf()
is already fully
mpfr-ized, dntJKBm()
is deprecated.
an mpfr
vector of the same length as the maximum
of the lengths of x, df, ncp
.
Package DPQ's dntJKBf()
is already fully
mpfr-ized, and hence dntJKBm()
is redundant, and therefore deprecated.
Martin Maechler
R's dt
, and package DPQ's
dntJKBf()
and dtWV()
.
tt <- seq(0, 10, len = 21) ncp <- seq(0, 6, len = 31) dt3R <- outer(tt, ncp, dt , df = 3) dt3WV <- outer(tt, ncp, dtWVm, df = 3) all.equal(dt3R, dt3WV) # rel.err 0.00063 dt25R <- outer(tt, ncp, dt , df = 25) dt25WV <- outer(tt, ncp, dtWVm, df = 25) all.equal(dt25R, dt25WV) # rel.err 1.1e-5 x <- -10:700 fx <- dt (x, df = 22, ncp =100) lfx <- dt (x, df = 22, ncp =100, log=TRUE) lfV <- dtWVm(x, df = 22, ncp =100, log=TRUE) head(lfx, 15) # shows that R's dt(*, log=TRUE) implementation is "quite suboptimal" ## graphics opa <- par(no.readonly=TRUE) par(mar=.1+c(5,4,4,3), mgp = c(2, .8,0)) plot(fx ~ x, type="l") par(new=TRUE) ; cc <- c("red", adjustcolor("orange", 0.4)) plot(lfx ~ x, type = "o", pch=".", col=cc[1], cex=2, ann=FALSE, yaxt="n") sfsmisc::eaxis(4, col=cc[1], col.axis=cc[1], small.args = list(col=cc[1])) lines(x, lfV, col=cc[2], lwd=3) dtt1 <- " dt"; dtt2 <- "(x, df=22, ncp=100"; dttL <- paste0(dtt2,", log=TRUE)") legend("right", c(paste0(dtt1,dtt2,")"), paste0(c(dtt1,"dtWVm"), dttL)), lty=1, lwd=c(1,1,3), col=c("black", cc), bty = "n") par(opa) # reset ## For dntJKBm(), see example(dntJKBf, package="DPQ")
tt <- seq(0, 10, len = 21) ncp <- seq(0, 6, len = 31) dt3R <- outer(tt, ncp, dt , df = 3) dt3WV <- outer(tt, ncp, dtWVm, df = 3) all.equal(dt3R, dt3WV) # rel.err 0.00063 dt25R <- outer(tt, ncp, dt , df = 25) dt25WV <- outer(tt, ncp, dtWVm, df = 25) all.equal(dt25R, dt25WV) # rel.err 1.1e-5 x <- -10:700 fx <- dt (x, df = 22, ncp =100) lfx <- dt (x, df = 22, ncp =100, log=TRUE) lfV <- dtWVm(x, df = 22, ncp =100, log=TRUE) head(lfx, 15) # shows that R's dt(*, log=TRUE) implementation is "quite suboptimal" ## graphics opa <- par(no.readonly=TRUE) par(mar=.1+c(5,4,4,3), mgp = c(2, .8,0)) plot(fx ~ x, type="l") par(new=TRUE) ; cc <- c("red", adjustcolor("orange", 0.4)) plot(lfx ~ x, type = "o", pch=".", col=cc[1], cex=2, ann=FALSE, yaxt="n") sfsmisc::eaxis(4, col=cc[1], col.axis=cc[1], small.args = list(col=cc[1])) lines(x, lfV, col=cc[2], lwd=3) dtt1 <- " dt"; dtt2 <- "(x, df=22, ncp=100"; dttL <- paste0(dtt2,", log=TRUE)") legend("right", c(paste0(dtt1,dtt2,")"), paste0(c(dtt1,"dtWVm"), dttL)), lty=1, lwd=c(1,1,3), col=c("black", cc), bty = "n") par(opa) # reset ## For dntJKBm(), see example(dntJKBf, package="DPQ")
Utilities for package DPQmpfr
ldexp(f, E)
ldexp(f, E)
f |
‘fraction’, as such with absolute value in |
E |
integer-valued exponent(s). |
ldexp()
is a simple wrapper, either calling
DPQ::ldexp
from DPQ or
ldexpMpfr
from the Rmpfr package,
computed accurately and fast on typical platforms with internally binary arithmetic.
either a numeric or a "mpfr"
, depending on the type of
f
, vector as (the recyled) combination of f
and E
.
ldexp
from package DPQ and
ldexpMpfr
from package Rmpfr.
ldexp(1:10, 2) ldexp(Rmpfr::Const("pi", 96), -2:2) # = pi * (1/4 1/2 1 2 4)
ldexp(1:10, 2) ldexp(Rmpfr::Const("pi", 96), -2:2) # = pi * (1/4 1/2 1 2 4)
FIXME: "R's own" double prec version is now in package DPQ: e.g. ~/R/Pkgs/DPQ/man/gam1.Rd
FIXME2: R-only implementation is in ~/R/Pkgs/DPQ/TODO_R_versions_gam1_etc.R
Computes accurately in
for numeric argument
a
;
For "mpfr"
numbers, the precision is increased intermediately such
that should not lose precision.
gam1M(a, usePr = NULL)
gam1M(a, usePr = NULL)
a |
a numeric or numeric-alike, typically inheriting from |
usePr |
the precision to use; the default, |
https://dlmf.nist.gov/ states the well-know Taylor series for
with ,
, (Euler's gamma,
), with
recursion
.
Hence,
Consequently, for ,
,
.
require(Rmpfr) # Const(), mpfr(), zeta() gam <- Const("gamma", 128) z <- zeta(mpfr(1:7, 128)) (c3 <- (gam^2 -z[2])/2) # -0.655878071520253881077019515145 (c4 <- (gam*c3 - z[2]*c2 + z[3])/3) # -0.04200263503409523552900393488 (c4 <- gam*(gam^2/6 - z[2]/2) + z[3]/3) (c5 <- (gam*c4 - z[2]*c3 + z[3]*c2 - z[4])/4) # 0.1665386113822914895017007951 (c5 <- (gam^4/6 - gam^2*z[2] + z[2]^2/2 + gam*z[3]*4/3 - z[4])/4)
a numeric-alike vector like a
.
Martin Maechler building on C code of TOMS 708
TOMS 708, see pbeta
##' naive direct formula: g1 <- function(u) 1/gamma(u+1) - 1 ##' @title gam1() from TOMS 708 -- translated to R (*and* vectorized) ##' @author Martin Maechler gam1R <- function(a, chk=TRUE) { ## == 1/gamma(a+1) - 1 -- accurately ONLY for -0.5 <= a <= 1.5 if(!length(a)) return(a) ## otherwise: if(chk) stopifnot(-0.5 <= a, a <= 1.5) # if not, the computation below is non-sense! d <- a - 0.5 ## t := if(a > 1/2) a-1 else a ==> t in [-0.5, 0.5] <==> |t| <= 0.5 R <- t <- a dP <- d > 0 t[dP] <- d[dP] - 0.5 if(any(N <- (t < 0.))) { ## L30: */ r <- c(-.422784335098468, -.771330383816272, -.244757765222226, .118378989872749, 9.30357293360349e-4, -.0118290993445146, .00223047661158249, 2.66505979058923e-4, -1.32674909766242e-4) s1 <- .273076135303957 s2 <- .0559398236957378 t_ <- t[N] top <- (((((((r[9] * t_ + r[8]) * t_ + r[7]) * t_ + r[6]) * t_ + r[5]) * t_ + r[4] ) * t_ + r[3]) * t_ + r[2]) * t_ + r[1] bot <- (s2 * t_ + s1) * t_ + 1. w <- top / bot ## if (d > 0.) : if(length(iP <- which(dP[N]))) R[N & dP] <- (t_ * w)[iP] / a[N & dP] ## else d <= 0 : if(length(iN <- which(!dP[N]))) R[N & !dP] <- a[N & !dP] * (w[iN] + 0.5 + 0.5) } if(any(Z <- (t == 0))) ## L10: a in {0, 1} R[Z] <- 0. if(any(P <- t > 0)) { ## t > 0; L20: */ p <- c( .577215664901533, -.409078193005776, -.230975380857675, .0597275330452234, .0076696818164949, -.00514889771323592, 5.89597428611429e-4 ) q <- c(1., .427569613095214, .158451672430138, .0261132021441447, .00423244297896961) t <- t[P] top <- (((((p[7] * t + p[6])*t + p[5])*t + p[4])*t + p[3])*t + p[2])*t + p[1] bot <- (((q[5] * t + q[4]) * t + q[3]) * t + q[2]) * t + 1. w <- top / bot ## if (d > 0.) ## L21: */ if(length(iP <- which(dP[P]))) R[P & dP] <- t[iP] / a[P & dP] * (w[iP] - 0.5 - 0.5) ## else d <= 0 : if(length(iN <- which(!dP[P]))) R[P & !dP] <- a[P & !dP] * w[iN] } R } ## gam1R() u <- seq(-.5, 1.5, by=1/16); set.seed(1); u <- sample(u) # permuted (to check logic) g11 <- vapply(u, gam1R, 1) # [-.5, 1.5] == the interval for which the above gam1() was made gam1. <- gam1R(u) cbind(u, gam1., D = sfsmisc::relErrV(gam1., g1(u)))[order(u),] # looks "too good", as we are not close (but different) to {0, 1} stopifnot( identical(g11, gam1.) ) all.equal(g1(u), gam1., tolerance = 0) # 6.7e-16 ("too good", see above) stopifnot( all.equal(g1(u), gam1.) ) ## Comparison using Rmpfr; slightly extending [-.5, 1.5] interval (and getting much closer to {0,1}) u <- seq(-0.525, 1.525, length.out = 2001) uM <- Rmpfr::mpfr(u, 128) gam1M. <- gam1M(uM) relE <- Rmpfr::asNumeric(sfsmisc::relErrV(gam1M., gam1R(u, chk=FALSE))) rbind(rErr = summary(relE), `|rE|` = summary(abs(relE))) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## rErr -3.280e-15 -3.466e-16 1.869e-17 1.526e-16 4.282e-16 1.96e-14 ## |rE| 1.343e-19 2.363e-16 3.861e-16 6.014e-16 6.372e-16 1.96e-14 stopifnot(max(abs(relE)) < 1e-13) relEtit <- expression("Relative Error of " ~~ gam1(u) %~%{} == frac(1, Gamma(u+1)) - 1) #% plot(relE ~ u, type="l", ylim = c(-1,1) * 2.5e-15, main = relEtit) grid(lty = 3); abline(v = c(-.5, 1.5), col = adjustcolor(4, 1/2), lty=2, lwd=2) ## what about the direct formula -- how bad is it really ? relED <- Rmpfr::asNumeric(sfsmisc::relErrV(gam1M., g1(u))) plot(relE ~ u, type="l", ylim = c(-1,1) * 1e-14, main = relEtit) lines(relED ~ u, col = adjustcolor(2, 1/2), lwd = 2); abline(v = (-1:3)/2, lty=2, col="orange3") mtext("comparing with direct formula 1/gamma(u+1) - 1", col=2, cex=3/4) legend("top", c("gam1R(u)", "1/gamma(u+1) - 1"), col = 1:2, lwd=1:2, bty="n") ## direct is clearly *worse* , but not catastrophical
##' naive direct formula: g1 <- function(u) 1/gamma(u+1) - 1 ##' @title gam1() from TOMS 708 -- translated to R (*and* vectorized) ##' @author Martin Maechler gam1R <- function(a, chk=TRUE) { ## == 1/gamma(a+1) - 1 -- accurately ONLY for -0.5 <= a <= 1.5 if(!length(a)) return(a) ## otherwise: if(chk) stopifnot(-0.5 <= a, a <= 1.5) # if not, the computation below is non-sense! d <- a - 0.5 ## t := if(a > 1/2) a-1 else a ==> t in [-0.5, 0.5] <==> |t| <= 0.5 R <- t <- a dP <- d > 0 t[dP] <- d[dP] - 0.5 if(any(N <- (t < 0.))) { ## L30: */ r <- c(-.422784335098468, -.771330383816272, -.244757765222226, .118378989872749, 9.30357293360349e-4, -.0118290993445146, .00223047661158249, 2.66505979058923e-4, -1.32674909766242e-4) s1 <- .273076135303957 s2 <- .0559398236957378 t_ <- t[N] top <- (((((((r[9] * t_ + r[8]) * t_ + r[7]) * t_ + r[6]) * t_ + r[5]) * t_ + r[4] ) * t_ + r[3]) * t_ + r[2]) * t_ + r[1] bot <- (s2 * t_ + s1) * t_ + 1. w <- top / bot ## if (d > 0.) : if(length(iP <- which(dP[N]))) R[N & dP] <- (t_ * w)[iP] / a[N & dP] ## else d <= 0 : if(length(iN <- which(!dP[N]))) R[N & !dP] <- a[N & !dP] * (w[iN] + 0.5 + 0.5) } if(any(Z <- (t == 0))) ## L10: a in {0, 1} R[Z] <- 0. if(any(P <- t > 0)) { ## t > 0; L20: */ p <- c( .577215664901533, -.409078193005776, -.230975380857675, .0597275330452234, .0076696818164949, -.00514889771323592, 5.89597428611429e-4 ) q <- c(1., .427569613095214, .158451672430138, .0261132021441447, .00423244297896961) t <- t[P] top <- (((((p[7] * t + p[6])*t + p[5])*t + p[4])*t + p[3])*t + p[2])*t + p[1] bot <- (((q[5] * t + q[4]) * t + q[3]) * t + q[2]) * t + 1. w <- top / bot ## if (d > 0.) ## L21: */ if(length(iP <- which(dP[P]))) R[P & dP] <- t[iP] / a[P & dP] * (w[iP] - 0.5 - 0.5) ## else d <= 0 : if(length(iN <- which(!dP[P]))) R[P & !dP] <- a[P & !dP] * w[iN] } R } ## gam1R() u <- seq(-.5, 1.5, by=1/16); set.seed(1); u <- sample(u) # permuted (to check logic) g11 <- vapply(u, gam1R, 1) # [-.5, 1.5] == the interval for which the above gam1() was made gam1. <- gam1R(u) cbind(u, gam1., D = sfsmisc::relErrV(gam1., g1(u)))[order(u),] # looks "too good", as we are not close (but different) to {0, 1} stopifnot( identical(g11, gam1.) ) all.equal(g1(u), gam1., tolerance = 0) # 6.7e-16 ("too good", see above) stopifnot( all.equal(g1(u), gam1.) ) ## Comparison using Rmpfr; slightly extending [-.5, 1.5] interval (and getting much closer to {0,1}) u <- seq(-0.525, 1.525, length.out = 2001) uM <- Rmpfr::mpfr(u, 128) gam1M. <- gam1M(uM) relE <- Rmpfr::asNumeric(sfsmisc::relErrV(gam1M., gam1R(u, chk=FALSE))) rbind(rErr = summary(relE), `|rE|` = summary(abs(relE))) ## Min. 1st Qu. Median Mean 3rd Qu. Max. ## rErr -3.280e-15 -3.466e-16 1.869e-17 1.526e-16 4.282e-16 1.96e-14 ## |rE| 1.343e-19 2.363e-16 3.861e-16 6.014e-16 6.372e-16 1.96e-14 stopifnot(max(abs(relE)) < 1e-13) relEtit <- expression("Relative Error of " ~~ gam1(u) %~%{} == frac(1, Gamma(u+1)) - 1) #% plot(relE ~ u, type="l", ylim = c(-1,1) * 2.5e-15, main = relEtit) grid(lty = 3); abline(v = c(-.5, 1.5), col = adjustcolor(4, 1/2), lty=2, lwd=2) ## what about the direct formula -- how bad is it really ? relED <- Rmpfr::asNumeric(sfsmisc::relErrV(gam1M., g1(u))) plot(relE ~ u, type="l", ylim = c(-1,1) * 1e-14, main = relEtit) lines(relED ~ u, col = adjustcolor(2, 1/2), lwd = 2); abline(v = (-1:3)/2, lty=2, col="orange3") mtext("comparing with direct formula 1/gamma(u+1) - 1", col=2, cex=3/4) legend("top", c("gam1R(u)", "1/gamma(u+1) - 1"), col = 1:2, lwd=1:2, bty="n") ## direct is clearly *worse* , but not catastrophical
Computes accurately notably when
.
For
"mpfr"
numbers, the precision is increased intermediately such
that should not lose precision.
R's "own" double prec version is
soon
available in package in DPQ,
under the name gamln1()
(from TOMS 708).
lgamma1pM(a, usePr = NULL, DPQmethod = c("lgamma1p", "algam1"))
lgamma1pM(a, usePr = NULL, DPQmethod = c("lgamma1p", "algam1"))
a |
a numeric or numeric-alike vector, typically inheriting from
|
usePr |
positive integer specifying the precision in bits, or
|
DPQmethod |
a character string; must be the name of an
|
a numeric-alike vector like a
.
Martin Maechler
TOMS 708, see pbeta
lgamma()
(and gamma()
(same page)),
and our algdivM()
; further, package DPQ's
lgamma1p()
and
(if already available) gamln1()
.
## Package {DPQ}'s lgamma1p(): lgamma1p <- DPQ::lgamma1p lg1 <- function(u) lgamma(u+1) # the simple direct form u <- seq(-.5, 1.5, by=1/16); set.seed(1); u <- sample(u) # permuted (to check logic) g11 <- vapply(u, lgamma1p, numeric(1)) lgamma1p. <- lgamma1p(u) all.equal(lg1(u), g11, tolerance = 0) # see 3.148e-16 stopifnot(exprs = { all.equal(lg1(u), g11, tolerance = 2e-15) identical(g11, lgamma1p.) }) ## Comparison using Rmpfr; slightly extending the [-.5, 1.5] interval: u <- seq(-0.525, 1.525, length.out = 2001) lg1p <- lgamma1pM( u) lg1pM <- lgamma1pM(Rmpfr::mpfr(u, 128)) asNumeric <- Rmpfr::asNumeric relErrV <- sfsmisc::relErrV if(FALSE) { # DPQ "latest" version __FIXME__ lng1 <- DPQ::lngam1(u) relE <- asNumeric(relErrV(lg1pM, cbind(lgamma1p = lg1p, lngam1 = lng1))) } else { relE <- asNumeric(relErrV(lg1pM, cbind(lgamma1p = lg1p)))#, lngam1 = lng1))) } ## FIXME: lgamma1p() is *NOT* good around u =1. -- even though it should ## and the R-only vs (not installed) *does* "work" (is accurate there) ????? ## --> ~/R/Pkgs/DPQ/TODO_R_versions_gam1_etc.R if(FALSE) { matplot(u, relE, type="l", ylim = c(-1,1) * 2.5e-15, main = expression("relative error of " ~~ lgamma1p(u) == log( Gamma(u+1) ))) } else { plot(relE ~ u, type="l", ylim = c(-1,1) * 2.5e-15, main = expression("relative error of " ~~ lgamma1p(u) == log( Gamma(u+1) ))) } grid(lty = 3); abline(v = c(-.5, 1.5), col = adjustcolor(4, 1/2), lty=2, lwd=2) ## what about the direct formula -- how bad is it really ? relED <- asNumeric(relErrV(lg1pM, lg1(u))) lines(relED ~ u, col = adjustcolor(2, 1/2), lwd = 2)
## Package {DPQ}'s lgamma1p(): lgamma1p <- DPQ::lgamma1p lg1 <- function(u) lgamma(u+1) # the simple direct form u <- seq(-.5, 1.5, by=1/16); set.seed(1); u <- sample(u) # permuted (to check logic) g11 <- vapply(u, lgamma1p, numeric(1)) lgamma1p. <- lgamma1p(u) all.equal(lg1(u), g11, tolerance = 0) # see 3.148e-16 stopifnot(exprs = { all.equal(lg1(u), g11, tolerance = 2e-15) identical(g11, lgamma1p.) }) ## Comparison using Rmpfr; slightly extending the [-.5, 1.5] interval: u <- seq(-0.525, 1.525, length.out = 2001) lg1p <- lgamma1pM( u) lg1pM <- lgamma1pM(Rmpfr::mpfr(u, 128)) asNumeric <- Rmpfr::asNumeric relErrV <- sfsmisc::relErrV if(FALSE) { # DPQ "latest" version __FIXME__ lng1 <- DPQ::lngam1(u) relE <- asNumeric(relErrV(lg1pM, cbind(lgamma1p = lg1p, lngam1 = lng1))) } else { relE <- asNumeric(relErrV(lg1pM, cbind(lgamma1p = lg1p)))#, lngam1 = lng1))) } ## FIXME: lgamma1p() is *NOT* good around u =1. -- even though it should ## and the R-only vs (not installed) *does* "work" (is accurate there) ????? ## --> ~/R/Pkgs/DPQ/TODO_R_versions_gam1_etc.R if(FALSE) { matplot(u, relE, type="l", ylim = c(-1,1) * 2.5e-15, main = expression("relative error of " ~~ lgamma1p(u) == log( Gamma(u+1) ))) } else { plot(relE ~ u, type="l", ylim = c(-1,1) * 2.5e-15, main = expression("relative error of " ~~ lgamma1p(u) == log( Gamma(u+1) ))) } grid(lty = 3); abline(v = c(-.5, 1.5), col = adjustcolor(4, 1/2), lty=2, lwd=2) ## what about the direct formula -- how bad is it really ? relED <- asNumeric(relErrV(lg1pM, lg1(u))) lines(relED ~ u, col = adjustcolor(2, 1/2), lwd = 2)
Compute a version of the Beta cumulative distribution function
(pbeta()
in R), namely using the series expansion, named
BPSER()
, from “TOMS 708”, i.e., Didonato and Morris (1992).
This “pure R” function exists for didactical or documentational reasons on one hand,
as R's own pbeta()
uses this expansion when appropriate and
other algorithms otherwise.
On the other hand, using high precision q
and MPFR arithmetic (via
package Rmpfr) may allow to get highly accurate pbeta()
values.
pbeta_ser(q, shape1, shape2, log.p = FALSE, eps = 1e-15, errPb = 0, verbose = FALSE)
pbeta_ser(q, shape1, shape2, log.p = FALSE, eps = 1e-15, errPb = 0, verbose = FALSE)
q , shape1 , shape2
|
quantiles and shape parameters of the Beta
distribution, |
log.p |
if TRUE, probabilities |
eps |
non-negative number; |
errPb |
an integer code, typically in |
verbose |
logical indicating if console output about intermediate results should be printed. |
pbeta_ser()
crucially needs three auxiliary functions which we
“mpfr-ized” as well: gam1M()
,
lgamma1pM()
, and algdivM
.
An approximation to the Beta probability
for
(where
shape1
, and shape2
).
Didonato and Morris and R Core team; separate packaging by Martin Maechler.
Didonato, A. and Morris, A., Jr, (1992) Algorithm 708: Significant digit computation of the incomplete beta function ratios, ACM Transactions on Mathematical Software 18, 360–373; doi:10.1145/131766.131776.
pbeta
, DPQmpfr's own pbetaD94
;
even more pbeta()
approximations in package DPQ, e.g.,
pnbetaAS310
, or pbetaRv1
.
In addition, for integer shape parameters, the potentially “fully accurate”
finite sum base pbetaI()
in package Rmpfr.
(p. <- pbeta_ser(1/2, shape1 = 2, shape2 = 3, verbose=TRUE)) (lp <- pbeta_ser(1/2, shape1 = 2, shape2 = 3, log.p = TRUE)) all.equal(lp, log(p.), tolerance=0) # 1.48e-16 stopifnot(all.equal(lp, log(p.), tolerance = 1e-13)) ## Using Vectorize() in order to allow vector 'q' e.g. for curve(): str(pbetaSer <- Vectorize(pbeta_ser, "q")) curve(pbetaSer(x, 1.5, 4.5)); abline(h=0:1, v=0:1, lty=2, col="gray") curve(pbeta (x, 1.5, 4.5), add=TRUE, col = adjustcolor(2, 1/4), lwd=3) ## now using mpfr-numbers: half <- 1/Rmpfr::mpfr(2, 256) (p2 <- pbeta_ser(half, shape1 = 1, shape2 = 123))
(p. <- pbeta_ser(1/2, shape1 = 2, shape2 = 3, verbose=TRUE)) (lp <- pbeta_ser(1/2, shape1 = 2, shape2 = 3, log.p = TRUE)) all.equal(lp, log(p.), tolerance=0) # 1.48e-16 stopifnot(all.equal(lp, log(p.), tolerance = 1e-13)) ## Using Vectorize() in order to allow vector 'q' e.g. for curve(): str(pbetaSer <- Vectorize(pbeta_ser, "q")) curve(pbetaSer(x, 1.5, 4.5)); abline(h=0:1, v=0:1, lty=2, col="gray") curve(pbeta (x, 1.5, 4.5), add=TRUE, col = adjustcolor(2, 1/4), lwd=3) ## now using mpfr-numbers: half <- 1/Rmpfr::mpfr(2, 256) (p2 <- pbeta_ser(half, shape1 = 1, shape2 = 123))
Bounds for , i.e.,
pnorm(x, *,
lower.tail=FALSE)
, typically related to Mill's Ratio.
pnormL_LD10(x, lower.tail = FALSE, log.p = FALSE) pnormU_S53 (x, lower.tail = FALSE, log.p = FALSE)
pnormL_LD10(x, lower.tail = FALSE, log.p = FALSE) pnormU_S53 (x, lower.tail = FALSE, log.p = FALSE)
x |
positive (at least non-negative) numeric |
lower.tail , log.p
|
logical, see, e.g., |
vector/array/mpfr like x
.
Martin Maechler
Lutz Duembgen (2010)
Bounding Standard Gaussian Tail Probabilities;
arXiv preprint 1012.2063
,
https://arxiv.org/abs/1012.2063
pnorm
. The same functions “numeric-only” are in my
DPQ package.
x <- seq(1/64, 10, by=1/64) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) matplot(x, px, type="l") # all on top of each other matplot(x, (D <- px[,2:3] - px[,1]), type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## check they are lower and upper bounds indeed : stopifnot(D[,"Lo"] < 0, D[,"Up"] > 0) matplot(x[x>4], D[x>4,], type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ### zoom out to larger x : [1, 1000] x <- seq(1, 1000, by=1/4) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) matplot(x, px, type="l") # all on top of each other matplot(x, (D <- px[,2:3] - px[,1]), type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## check they are lower and upper bounds indeed : table(D[,"Lo"] < 0) # no longer always true table(D[,"Up"] > 0) ## not even when equality (where it's much better though): table(D[,"Lo"] <= 0) table(D[,"Up"] >= 0) ## *relative* differences: matplot(x, (rD <- 1 - px[,2:3] / px[,1]), type="l", log = "x") abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## abs() matplot(x, abs(rD), type="l", log = "xy", axes=FALSE, # NB: curves *cross* main = "relative differences 1 - pnormUL(x, *)/pnorm(x,*)") legend("top", c("Low.Bnd(D10)", "Upp.Bnd(S53)"), bty="n", col=1:2, lty=1:2) sfsmisc::eaxis(1, sub10 = 2) sfsmisc::eaxis(2) abline(h=(1:4)*2^-53, col=adjustcolor(1, 1/4)) ### zoom out to LARGE x : --------------------------- x <- 2^seq(0, 30, by = 1/64) col4 <- adjustcolor(1:4, 1/2) options(width = 111) -> oop # (nicely printing "tables") if(FALSE)## or even HUGE: x <- 2^seq(4, 513, by = 1/16) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , a0 = dnorm(x, log=TRUE) , a1 = dnorm(x, log=TRUE) - log(x) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) doLegTit <- function(col=1:4) { title(main = "relative differences 1 - pnormUL(x, *)/pnorm(x,*)") legend("top", c("phi(x)", "phi(x)/x", "Low.Bnd(D10)", "Upp.Bnd(S53)"), bty="n", col=col, lty=1:4) } ## *relative* differences are relevant: matplot(x, (rD <- 1 - px[,-1] / px[,1]), type="l", log = "x", ylim = c(-1,1)/2^8, col=col4) ; doLegTit() abline(h=0, lty=3, col=adjustcolor(1, 1/2)) if(x[length(x)] > 1e150) # the "HUGE" case (not default) print( tail(cbind(x, px), 20) ) ##--> For very large x ~= 1e154, the approximations overflow *later* than pnorm() itself !! ## abs(rel.Diff) ---> can use log-log: matplot(x, abs(rD), type="l", log = "xy", xaxt="n", yaxt="n"); doLegTit() sfsmisc::eaxis(1, sub10=2) sfsmisc::eaxis(2) abline(h=(1:4)*2^-53, col=adjustcolor(1, 1/4)) ## lower.tail=TRUE (w/ log.p=TRUE) works "the same" for x < 0: require(Rmpfr) x <- - 2^seq(0, 30, by = 1/64) ## == log1mexp <- Rmpfr::log1mexp # Rmpfr version >= 0.8-2 (2020-11-11 on CRAN) px <- cbind( lQ = pnorm (x, lower.tail=TRUE, log.p=TRUE) , a0 = log1mexp(- dnorm(-x, log=TRUE)) , a1 = log1mexp(-(dnorm(-x, log=TRUE) - log(-x))) , Lo = log1mexp(-pnormL_LD10(-x, lower.tail=TRUE, log.p=TRUE)) , Up = log1mexp(-pnormU_S53 (-x, lower.tail=TRUE, log.p=TRUE)) ) matplot(-x, (rD <- 1 - px[,-1] / px[,1]), type="l", log = "x", ylim = c(-1,1)/2^8, col=col4) ; doLegTit() abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## Comparison with Rmpfr::erf() / erfc() based pnorm(): ## Set the exponential ranges to maximal -- to evade underflow as long as possible .mpfr_erange_set(value = (1-2^-52) * .mpfr_erange(c("min.emin","max.emax"))) l2t <- seq(0, 32, by=1/4) twos <- mpfr(2, 1024)^l2t Qt <- pnorm(twos, lower.tail=FALSE) pnU <- pnormU_S53 (twos, log.p=TRUE) pnL <- pnormL_LD10(twos, log.p=TRUE) logQt <- log(Qt) M <- cbind(twos, Qt, logQt = logQt, pnU) roundMpfr(M, 40) dM <- asNumeric(cbind(dU = pnU - logQt, dL = logQt - pnL, # NB: the numbers are *negative* rdU= 1 - pnU/logQt, rdL = pnL/logQt - 1)) data.frame(l2t, dM) ## The bounds are ok (where Qt does not underflow): L < p < U : stopifnot(pnU > pnL, pnU > logQt, (logQt > pnL)[Qt > 0]) roundMpfr(cbind(twos, pnL, pnU, D=pnU-pnL, relD=(pnU-pnL)/((pnU+pnL)/2)), 40) ## ----- R's pnorm() -- is it always inside [L, U] ?? --------------------- nQt <- stats::pnorm(asNumeric(twos), lower.tail=FALSE, log.p=TRUE) data.frame(l2t, check.names=FALSE , nQt , "L <= p" = c(" ", "W")[2 -(pnL <= nQt)] , "p <= U" = c(" ", "W")[2- (nQt <= pnU)]) ## ==> pnorm() is *outside* sometimes for l2t >= 7.25; always as soon as l2t >= 9.25 ## *but* the relative errors are around c_epsilon in all these cases : plot (2^l2t, asNumeric(abs(nQt-pnL)/abs(pnU)), type="o", cex=1/4, log="xy", axes=FALSE) sfsmisc::eaxis(1, sub10 = 2) sfsmisc::eaxis(2) lines(2^l2t, asNumeric(abs(nQt-pnU)/abs(pnU)), type="o", cex=1/4, col=2) abline(h=c(1:4)*2^-53, lty=2, col=adjustcolor(1, 1/4)) options(oop)# reverting
x <- seq(1/64, 10, by=1/64) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) matplot(x, px, type="l") # all on top of each other matplot(x, (D <- px[,2:3] - px[,1]), type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## check they are lower and upper bounds indeed : stopifnot(D[,"Lo"] < 0, D[,"Up"] > 0) matplot(x[x>4], D[x>4,], type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ### zoom out to larger x : [1, 1000] x <- seq(1, 1000, by=1/4) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) matplot(x, px, type="l") # all on top of each other matplot(x, (D <- px[,2:3] - px[,1]), type="l") # the differences abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## check they are lower and upper bounds indeed : table(D[,"Lo"] < 0) # no longer always true table(D[,"Up"] > 0) ## not even when equality (where it's much better though): table(D[,"Lo"] <= 0) table(D[,"Up"] >= 0) ## *relative* differences: matplot(x, (rD <- 1 - px[,2:3] / px[,1]), type="l", log = "x") abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## abs() matplot(x, abs(rD), type="l", log = "xy", axes=FALSE, # NB: curves *cross* main = "relative differences 1 - pnormUL(x, *)/pnorm(x,*)") legend("top", c("Low.Bnd(D10)", "Upp.Bnd(S53)"), bty="n", col=1:2, lty=1:2) sfsmisc::eaxis(1, sub10 = 2) sfsmisc::eaxis(2) abline(h=(1:4)*2^-53, col=adjustcolor(1, 1/4)) ### zoom out to LARGE x : --------------------------- x <- 2^seq(0, 30, by = 1/64) col4 <- adjustcolor(1:4, 1/2) options(width = 111) -> oop # (nicely printing "tables") if(FALSE)## or even HUGE: x <- 2^seq(4, 513, by = 1/16) px <- cbind( lQ = pnorm (x, lower.tail=FALSE, log.p=TRUE) , a0 = dnorm(x, log=TRUE) , a1 = dnorm(x, log=TRUE) - log(x) , Lo = pnormL_LD10(x, lower.tail=FALSE, log.p=TRUE) , Up = pnormU_S53 (x, lower.tail=FALSE, log.p=TRUE)) doLegTit <- function(col=1:4) { title(main = "relative differences 1 - pnormUL(x, *)/pnorm(x,*)") legend("top", c("phi(x)", "phi(x)/x", "Low.Bnd(D10)", "Upp.Bnd(S53)"), bty="n", col=col, lty=1:4) } ## *relative* differences are relevant: matplot(x, (rD <- 1 - px[,-1] / px[,1]), type="l", log = "x", ylim = c(-1,1)/2^8, col=col4) ; doLegTit() abline(h=0, lty=3, col=adjustcolor(1, 1/2)) if(x[length(x)] > 1e150) # the "HUGE" case (not default) print( tail(cbind(x, px), 20) ) ##--> For very large x ~= 1e154, the approximations overflow *later* than pnorm() itself !! ## abs(rel.Diff) ---> can use log-log: matplot(x, abs(rD), type="l", log = "xy", xaxt="n", yaxt="n"); doLegTit() sfsmisc::eaxis(1, sub10=2) sfsmisc::eaxis(2) abline(h=(1:4)*2^-53, col=adjustcolor(1, 1/4)) ## lower.tail=TRUE (w/ log.p=TRUE) works "the same" for x < 0: require(Rmpfr) x <- - 2^seq(0, 30, by = 1/64) ## == log1mexp <- Rmpfr::log1mexp # Rmpfr version >= 0.8-2 (2020-11-11 on CRAN) px <- cbind( lQ = pnorm (x, lower.tail=TRUE, log.p=TRUE) , a0 = log1mexp(- dnorm(-x, log=TRUE)) , a1 = log1mexp(-(dnorm(-x, log=TRUE) - log(-x))) , Lo = log1mexp(-pnormL_LD10(-x, lower.tail=TRUE, log.p=TRUE)) , Up = log1mexp(-pnormU_S53 (-x, lower.tail=TRUE, log.p=TRUE)) ) matplot(-x, (rD <- 1 - px[,-1] / px[,1]), type="l", log = "x", ylim = c(-1,1)/2^8, col=col4) ; doLegTit() abline(h=0, lty=3, col=adjustcolor(1, 1/2)) ## Comparison with Rmpfr::erf() / erfc() based pnorm(): ## Set the exponential ranges to maximal -- to evade underflow as long as possible .mpfr_erange_set(value = (1-2^-52) * .mpfr_erange(c("min.emin","max.emax"))) l2t <- seq(0, 32, by=1/4) twos <- mpfr(2, 1024)^l2t Qt <- pnorm(twos, lower.tail=FALSE) pnU <- pnormU_S53 (twos, log.p=TRUE) pnL <- pnormL_LD10(twos, log.p=TRUE) logQt <- log(Qt) M <- cbind(twos, Qt, logQt = logQt, pnU) roundMpfr(M, 40) dM <- asNumeric(cbind(dU = pnU - logQt, dL = logQt - pnL, # NB: the numbers are *negative* rdU= 1 - pnU/logQt, rdL = pnL/logQt - 1)) data.frame(l2t, dM) ## The bounds are ok (where Qt does not underflow): L < p < U : stopifnot(pnU > pnL, pnU > logQt, (logQt > pnL)[Qt > 0]) roundMpfr(cbind(twos, pnL, pnU, D=pnU-pnL, relD=(pnU-pnL)/((pnU+pnL)/2)), 40) ## ----- R's pnorm() -- is it always inside [L, U] ?? --------------------- nQt <- stats::pnorm(asNumeric(twos), lower.tail=FALSE, log.p=TRUE) data.frame(l2t, check.names=FALSE , nQt , "L <= p" = c(" ", "W")[2 -(pnL <= nQt)] , "p <= U" = c(" ", "W")[2- (nQt <= pnU)]) ## ==> pnorm() is *outside* sometimes for l2t >= 7.25; always as soon as l2t >= 9.25 ## *but* the relative errors are around c_epsilon in all these cases : plot (2^l2t, asNumeric(abs(nQt-pnL)/abs(pnU)), type="o", cex=1/4, log="xy", axes=FALSE) sfsmisc::eaxis(1, sub10 = 2) sfsmisc::eaxis(2) lines(2^l2t, asNumeric(abs(nQt-pnU)/abs(pnU)), type="o", cex=1/4, col=2) abline(h=c(1:4)*2^-53, lty=2, col=adjustcolor(1, 1/4)) options(oop)# reverting
These functions provide the first terms of asymptotic series approximations to
pnorm()
's (extreme) tail, from Abramawitz and Stegun's
26.2.13 (p.932),
or qnorm()
where the approximations have been derived via
iterative plugin using Abramowitz and Stegun's formula.
pnormAsymp(x, k, lower.tail = FALSE, log.p = FALSE) qnormAsymp(p, lp = .DT_Clog(p, lower.tail = lower.tail, log.p = log.p), order, M_2PI =, lower.tail = TRUE, log.p = missing(p))
pnormAsymp(x, k, lower.tail = FALSE, log.p = FALSE) qnormAsymp(p, lp = .DT_Clog(p, lower.tail = lower.tail, log.p = log.p), order, M_2PI =, lower.tail = TRUE, log.p = missing(p))
x |
positive (at least non-negative) numeric vector. |
k |
integer |
p |
numeric vector of probabilities, possibly transformed, depending
on |
lp |
numeric (vector) of |
order |
an integer in |
M_2PI |
the number |
lower.tail |
logical; if true, probabilities are |
log.p |
logical; if |
see both help pages pnormAsymp
and
qnormAsymp
from our package DPQ.
vector/array/mpfr like first argument x
or p
or lp
, respectively.
Martin Maechler
pnorm
. The same functions “numeric-only” are in my
DPQ package with more extensive documentation.
require("Rmpfr") # (in strong dependencies of this pkg {DPQmpfr}) x <- seq(1/64, 10, by=1/64) xm <- mpfr( x, 96) "TODO" ## More extreme tails: ---------------------------------------------- ## ## 1. pnormAsymp() --------------------- lx <- c((2:10)*2, 25, (3:9)*10, (1:9)*100, (1:8)*1000, (2:7)*5000) lxm <- mpfr(lx, 256) Px <- pnorm(lxm, lower.tail = FALSE, log.p=TRUE) PxA <- sapplyMpfr(setNames(0:5, paste("k =",0:5)), pnormAsymp, x=lxm, lower.tail = FALSE, log.p=TRUE) if(interactive()) roundMpfr(PxA, 40) # rel.errors : relE <- asNumeric(1 - PxA/Px) options(width = 99) -> oop # (nicely printing the matrices) cbind(lx, relE) matplot(lx, abs(relE), type="b", cex = 1/2, log="xy", pch=as.character(0:5), axes=FALSE, main = "|relE( <pnormAsymp(lx, k=*, lower.tail=FALSE, log.p=TRUE) )|") sfsmisc::eaxis(1, sub10=2); sfsmisc::eaxis(2) legend("bottom", paste("k =", 0:5), col=1:6, lty=1:5, pch = as.character(0:5), pt.cex=1/2, bty="n") ## NB: rel.Errors go down to 7e-59 ==> need precision of -log2(7e-59) ~ 193.2 bits ## 2. qnormAsymp() --------------------- QPx <- sapplyMpfr(setNames(0:5, paste("k =",0:5)), function(k) qnormAsymp(Px, order=k, lower.tail = FALSE, log.p=TRUE)) (relE.q <- asNumeric(QPx/lx - 1)) # note how consistent the signs are (!) <==> have upper/lower bounds matplot(-asNumeric(Px), abs(relE.q), type="b", cex = 1/2, log="xy", pch=as.character(0:5), xlab = quote(-Px), axes=FALSE, main = "|relE( <qnormAsymp(Px, k=*, lower.tail=FALSE, log.p=TRUE) )|") sfsmisc::eaxis(1, sub10=2); sfsmisc::eaxis(2) legend("bottom", paste("k =", 0:5), col=1:6, lty=1:5, pch = as.character(0:5), pt.cex=1/2, bty="n") options(oop) # {revert to previous state}
require("Rmpfr") # (in strong dependencies of this pkg {DPQmpfr}) x <- seq(1/64, 10, by=1/64) xm <- mpfr( x, 96) "TODO" ## More extreme tails: ---------------------------------------------- ## ## 1. pnormAsymp() --------------------- lx <- c((2:10)*2, 25, (3:9)*10, (1:9)*100, (1:8)*1000, (2:7)*5000) lxm <- mpfr(lx, 256) Px <- pnorm(lxm, lower.tail = FALSE, log.p=TRUE) PxA <- sapplyMpfr(setNames(0:5, paste("k =",0:5)), pnormAsymp, x=lxm, lower.tail = FALSE, log.p=TRUE) if(interactive()) roundMpfr(PxA, 40) # rel.errors : relE <- asNumeric(1 - PxA/Px) options(width = 99) -> oop # (nicely printing the matrices) cbind(lx, relE) matplot(lx, abs(relE), type="b", cex = 1/2, log="xy", pch=as.character(0:5), axes=FALSE, main = "|relE( <pnormAsymp(lx, k=*, lower.tail=FALSE, log.p=TRUE) )|") sfsmisc::eaxis(1, sub10=2); sfsmisc::eaxis(2) legend("bottom", paste("k =", 0:5), col=1:6, lty=1:5, pch = as.character(0:5), pt.cex=1/2, bty="n") ## NB: rel.Errors go down to 7e-59 ==> need precision of -log2(7e-59) ~ 193.2 bits ## 2. qnormAsymp() --------------------- QPx <- sapplyMpfr(setNames(0:5, paste("k =",0:5)), function(k) qnormAsymp(Px, order=k, lower.tail = FALSE, log.p=TRUE)) (relE.q <- asNumeric(QPx/lx - 1)) # note how consistent the signs are (!) <==> have upper/lower bounds matplot(-asNumeric(Px), abs(relE.q), type="b", cex = 1/2, log="xy", pch=as.character(0:5), xlab = quote(-Px), axes=FALSE, main = "|relE( <qnormAsymp(Px, k=*, lower.tail=FALSE, log.p=TRUE) )|") sfsmisc::eaxis(1, sub10=2); sfsmisc::eaxis(2) legend("bottom", paste("k =", 0:5), col=1:6, lty=1:5, pch = as.character(0:5), pt.cex=1/2, bty="n") options(oop) # {revert to previous state}
Compuate "accurate" qbeta()
values from
Baharev et al (2017)'s Program.
data("qbBaha2017")
data("qbBaha2017")
FIXME: Their published table only shows 6 digits, but running their (32-bit statically linked) Linux executable ‘mindiffver’ (from their github repos, see "source") with their own ‘input.txt’ gives 12 digits accuracy, which we should be able to increase even more, see https://github.com/baharev/mindiffver/blob/master/README.md
A numeric matrix, with guaranteed accuracy
qbeta(0.95, a,b)
values, for
and
with
str()
num [1:9, 1:22] 0.902 0.95 0.966 0.975 0.98 ... - attr(*, "dimnames")=List of 2 ..$ a: chr [1:9] "0.5" "1" "1.5" "2" ... ..$ b: chr [1:22] "1" "2" "3" "4" ...
MM constructed this data as follows (TODO: say more..):
ff <- "~/R/MM/NUMERICS/dpq-functions/beta-gamma-etc/Baharev_et_al-2017_table3.txt" qbB2017 <- t( data.matrix(read.table(ff)) ) dimnames(qbB2017) <- dimnames(qbet) saveRDS(qbB2017, "..../qbBaha2017.rds")
This matrix comprises all entries of Table 3, p. 776 of
Baharev, A., Schichl, H. and Rév, E. (2017)
Computing the noncentral-F distribution and the power of the F-test with
guaranteed accuracy;
Comput. Stat. 32(2), 763–779.
doi:10.1007/s00180-016-0701-3
The paper mentions the first author's ‘github’ repos where source code and executables are available from: https://github.com/baharev/mindiffver/
data(qbBaha2017) str(qbBaha2017) str(ab <- lapply(dimnames(qbBaha2017), as.numeric)) stopifnot(ab$a == c((1:6)/2, 5, 10, 25), ab$b == c(1:15, 10*c(2:5, 10, 25, 50))) matplot(ab$b, t(qbBaha2017)[,9:1], type="l", log = "x", xlab = "b", ylab = "qbeta(.95, a,b)", main = "Guaranteed accuracy 95% percentiles of Beta distribution") legend("right", paste("a = ", format(ab$a)), lty=1:5, col=1:6, bty="n") ## Relative error of R's qbeta() -- given that the table only shows 6 ## digits, there is *no* relevant error: R's qbeta() is accurate enough: x.ab <- do.call(expand.grid, ab) matplot(ab$b, 1 - t(qbeta(0.95, x.ab$a, x.ab$b) / qbBaha2017), main = "rel.error of R's qbeta() -- w/ 6 digits, it is negligible", ylab = "1 - qbeta()/'true'", type = "l", log="x", xlab="b") abline(h=0, col=adjustcolor("gray", 1/2))
data(qbBaha2017) str(qbBaha2017) str(ab <- lapply(dimnames(qbBaha2017), as.numeric)) stopifnot(ab$a == c((1:6)/2, 5, 10, 25), ab$b == c(1:15, 10*c(2:5, 10, 25, 50))) matplot(ab$b, t(qbBaha2017)[,9:1], type="l", log = "x", xlab = "b", ylab = "qbeta(.95, a,b)", main = "Guaranteed accuracy 95% percentiles of Beta distribution") legend("right", paste("a = ", format(ab$a)), lty=1:5, col=1:6, bty="n") ## Relative error of R's qbeta() -- given that the table only shows 6 ## digits, there is *no* relevant error: R's qbeta() is accurate enough: x.ab <- do.call(expand.grid, ab) matplot(ab$b, 1 - t(qbeta(0.95, x.ab$a, x.ab$b) / qbBaha2017), main = "rel.error of R's qbeta() -- w/ 6 digits, it is negligible", ylab = "1 - qbeta()/'true'", type = "l", log="x", xlab="b") abline(h=0, col=adjustcolor("gray", 1/2))
Compute the log()
of the error of Stirling's formula for .
Used in certain accurate approximations of (negative) binomial and Poisson probabilities.
stirlerrM()
currently simply uses the direct mathematical formula,
based on lgamma()
, adapted for use with mpfr
-numbers.
stirlerrM(n, minPrec = 128L) stirlerrSer(n, k)
stirlerrM(n, minPrec = 128L) stirlerrSer(n, k)
n |
numeric or “numeric-alike” vector, typically
“large” positive integer or half integer valued, here typically an
|
k |
integer scalar, now in |
minPrec |
minimal precision (in bits) to be used when coercing
number-alikes, say, biginteger ( |
Stirling's approximation to has been
where by definition the error is the difference of the left and right
hand side of this formula, in -scale,
See the vignette log1pmx, bd0, stirlerr, ... from package
DPQ, where the series expansion of is used with
11 terms, starting with
a numeric or other “numeric-alike” class vector, e.g.,
mpfr
, of the same length as n
.
In principle, the direct formula should be replaced by a few terms of the
series in powers of for large
n
, but we assume using
high enough precision for n
should be sufficient and “easier”.
Martin Maechler
Catherine Loader, see dbinom
;
Martin Maechler (2021) log1pmx(), bd0(), stirlerr() – Computing Poisson, Binomial, Gamma Probabilities in R. https://CRAN.R-project.org/package=DPQ/vignettes/log1pmx-etc.pdf
dbinom
; rational exact dbinomQ()
in package gmp.
stirlerr()
in package
DPQ which is a pure R version R's mathlib-internal C function.
### ---------------- Regular R double precision ------------------------------- n <- n. <- c(1:10, 15, 20, 30, 50*(1:6), 100*(4:9), 10^(3:12)) (stE <- stirlerrM(n)) # direct formula is *not* good when n is large: require(graphics) plot(stirlerrM(n) ~ n, log = "x", type = "b", xaxt="n") # --> *negative for large n! sfsmisc::eaxis(1, sub10=3) ; abline(h = 0, lty=3, col=adjustcolor(1, 1/2)) oMax <- 22 # was 8 originally str(stirSer <- sapply(setNames(1:oMax, paste0("k=",1:oMax)), function(k) stirlerrSer(n, k))) cols <- 1:(oMax+1) matlines(n, stirSer, col = cols, lty=1) leg1 <- c("stirlerrM(n): [dble prec] direct f()", paste0("stirlerrSer(n, k=", 1:oMax, ")")) legend("top", leg1, pch = c(1,rep(NA,oMax)), col=cols, lty=1, bty="n") ## for larger n, current values are even *negative* ==> dbl prec *not* sufficient ## y in log-scale [same conclusion] plot (stirlerrM(n) ~ n, log = "xy", type = "b", ylim = c(1e-13, 0.08)) matlines(n, stirSer, col = cols, lty=1) legend("bottomleft", leg1, pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## the numbers: options(digits=4, width=111) stEmat. <- cbind(sM = setNames(stirlerrM(n),n), stirSer) stEmat. # note *bad* values for (n=1, k >= 8) ! ## for printing n=<nice>: 8 N <- Rmpfr::asNumeric dfm <- function(n, mm) data.frame(n=formatC(N(n)), N(mm), check.names=FALSE) ## relative differences: dfm(n, stEmat.[,-1]/stEmat.[,1] - 1) # => stirlerrM() {with dbl prec} deteriorates after ~ n = 200--500 ### ---------------- MPFR High Accuracy ------------------------------- stopifnot(require(gmp), require(Rmpfr)) n <- as.bigz(n.) ## now repeat everything .. from above ... FIXME shows bugs ! ## fully accurate using big rational arithmetic class(stEserQ <- sapply(setNames(1:oMax, paste0("k=",1:oMax)), function(k) stirlerrSer(n=n, k=k))) # list .. stopifnot(sapply(stEserQ, class) == "bigq") # of exact big rationals str(stEsQM <- lapply(stEserQ, as, Class="mpfr"))# list of oMax; each prec. 128..702 stEsQM. <- lapply(stEserQ, .bigq2mpfr, precB = 512) # constant higher precision stEsQMm <- sapply(stEserQ, asNumeric) # a matrix -- "exact" values of Stirling series stEM <- stirlerrM(mpfr(n, 128)) # now ok (loss of precision, but still ~ 10 digits correct) stEM4k <- stirlerrM(mpfr(n, 4096))# assume practically "perfect"/ "true" stirlerr() values ## ==> what's the accuracy of the 128-bit 'stEM'? N <- asNumeric # short dfm <- function(n, mm) data.frame(n=formatC(N(n)), N(mm), check.names=FALSE) dfm(n, stEM/stEM4k - 1) ## ........... ## 29 1e+06 4.470e-25 ## 30 1e+07 -7.405e-23 ## 31 1e+08 -4.661e-21 ## 32 1e+09 -7.693e-20 ## 33 1e+10 3.452e-17 (still ok) ## 34 1e+11 -3.472e-15 << now start losing --> 128 bits *not* sufficient! ## 35 1e+12 -3.138e-13 <<<< ## same conclusion via number of correct (decimal) digits: dfm(n, log10(abs(stEM/stEM4k - 1))) plot(N(-log10(abs(stEM/stEM4k - 1))) ~ N(n), type="o", log="x", xlab = quote(n), main = "#{correct digits} of 128-bit stirlerrM(n)") ubits <- c(128, 52) # above 128-bit and double precision abline(h = ubits* log10(2), lty=2) text(1, ubits* log10(2), paste0(ubits,"-bit"), adj=c(0,0)) stopifnot(identical(stirlerrM(n), stEM)) # for bigz & bigq, we default to precBits = 128 all.equal(roundMpfr(stEM4k, 64), stirlerrSer (n., oMax)) # 0.00212 .. because of 1st few n. ==> drop these all.equal(roundMpfr(stEM4k,64)[n. >= 3], stirlerrSer (n.[n. >= 3], oMax)) # 6.238e-8 plot(asNumeric(abs(stirlerrSer(n., oMax) - stEM4k)) ~ n., log="xy", type="b", main="absolute error of stirlerrSer(n, oMax) & (n, 5)") abline(h = 2^-52, lty=2); text(1, 2^-52, "52-bits", adj=c(1,-1)/oMax) lines(asNumeric(abs(stirlerrSer(n., 5) - stEM4k)) ~ n., col=2) plot(asNumeric(stirlerrM(n)) ~ n., log = "x", type = "b") for(k in 1:oMax) lines(n, stirlerrSer(n, k), col = k+1) legend("top", c("stirlerrM(n)", paste0("stirlerrSer(n, k=", 1:oMax, ")")), pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## y in log-scale plot(asNumeric(stirlerrM(n)) ~ n., log = "xy", type = "b", ylim = c(1e-13, 0.08)) for(k in 1:oMax) lines(n, stirlerrSer(n, k), col = k+1) legend("topright", c("stirlerrM(n)", paste0("stirlerrSer(n, k=", 1:oMax, ")")), pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## all "looks" perfect (so we could skip this) ## The numbers ... reused ## stopifnot(sapply(stEserQ, class) == "bigq") # of exact big rationals ## str(stEsQM <- lapply(stEserQ, as, Class="mpfr"))# list of oMax; each prec. 128..702 ## stEsQM. <- lapply(stEserQ, .bigq2mpfr, precB = 512) # constant higher precision ## stEsQMm <- sapply(stEserQ, asNumeric) # a matrix -- "exact" values of Stirling series ## stEM <- stirlerrM(mpfr(n, 128)) # now ok (loss of precision, but still ~ 10 digits correct) ## stEM4k <- stirlerrM(mpfr(n, 4096))# assume "perfect" stEmat <- cbind(sM = stEM4k, stEsQMm) signif(asNumeric(stEmat), 6) # prints nicely -- large n = 10^e: see ~= 1/(12 n) = 0.8333 / n ## print *relative errors* nicely : ## simple double precision version of direct formula (cancellation for n >> 1 !): stE <- stirlerrM(n.) # --> bad for small n; catastrophically bad for n >= 10^7 ## relative *errors* dfm(n , cbind(stEsQMm, dbl=stE)/stEM4k - 1) ## only "perfect" Series (showing true mathematical approx. error; not *numerical* relE <- N(stEsQMm / stEM4k - 1) dfm(n, relE) matplot(n, relE, type = "b", log="x", ylim = c(-1,1) * 1e-12) ## |rel.Err| in [log log] matplot(n, abs(N(relE)), type = "b", log="xy") matplot(n, pmax(abs(N(relE)), 1e-19), type = "b", log="xy", ylim = c(1e-17, 1e-12)) matplot(n, pmax(abs(N(relE)), 1e-19), type = "b", log="xy", ylim = c(4e-17, 1e-15)) abline(h = 2^-(53:52), lty=3)
### ---------------- Regular R double precision ------------------------------- n <- n. <- c(1:10, 15, 20, 30, 50*(1:6), 100*(4:9), 10^(3:12)) (stE <- stirlerrM(n)) # direct formula is *not* good when n is large: require(graphics) plot(stirlerrM(n) ~ n, log = "x", type = "b", xaxt="n") # --> *negative for large n! sfsmisc::eaxis(1, sub10=3) ; abline(h = 0, lty=3, col=adjustcolor(1, 1/2)) oMax <- 22 # was 8 originally str(stirSer <- sapply(setNames(1:oMax, paste0("k=",1:oMax)), function(k) stirlerrSer(n, k))) cols <- 1:(oMax+1) matlines(n, stirSer, col = cols, lty=1) leg1 <- c("stirlerrM(n): [dble prec] direct f()", paste0("stirlerrSer(n, k=", 1:oMax, ")")) legend("top", leg1, pch = c(1,rep(NA,oMax)), col=cols, lty=1, bty="n") ## for larger n, current values are even *negative* ==> dbl prec *not* sufficient ## y in log-scale [same conclusion] plot (stirlerrM(n) ~ n, log = "xy", type = "b", ylim = c(1e-13, 0.08)) matlines(n, stirSer, col = cols, lty=1) legend("bottomleft", leg1, pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## the numbers: options(digits=4, width=111) stEmat. <- cbind(sM = setNames(stirlerrM(n),n), stirSer) stEmat. # note *bad* values for (n=1, k >= 8) ! ## for printing n=<nice>: 8 N <- Rmpfr::asNumeric dfm <- function(n, mm) data.frame(n=formatC(N(n)), N(mm), check.names=FALSE) ## relative differences: dfm(n, stEmat.[,-1]/stEmat.[,1] - 1) # => stirlerrM() {with dbl prec} deteriorates after ~ n = 200--500 ### ---------------- MPFR High Accuracy ------------------------------- stopifnot(require(gmp), require(Rmpfr)) n <- as.bigz(n.) ## now repeat everything .. from above ... FIXME shows bugs ! ## fully accurate using big rational arithmetic class(stEserQ <- sapply(setNames(1:oMax, paste0("k=",1:oMax)), function(k) stirlerrSer(n=n, k=k))) # list .. stopifnot(sapply(stEserQ, class) == "bigq") # of exact big rationals str(stEsQM <- lapply(stEserQ, as, Class="mpfr"))# list of oMax; each prec. 128..702 stEsQM. <- lapply(stEserQ, .bigq2mpfr, precB = 512) # constant higher precision stEsQMm <- sapply(stEserQ, asNumeric) # a matrix -- "exact" values of Stirling series stEM <- stirlerrM(mpfr(n, 128)) # now ok (loss of precision, but still ~ 10 digits correct) stEM4k <- stirlerrM(mpfr(n, 4096))# assume practically "perfect"/ "true" stirlerr() values ## ==> what's the accuracy of the 128-bit 'stEM'? N <- asNumeric # short dfm <- function(n, mm) data.frame(n=formatC(N(n)), N(mm), check.names=FALSE) dfm(n, stEM/stEM4k - 1) ## ........... ## 29 1e+06 4.470e-25 ## 30 1e+07 -7.405e-23 ## 31 1e+08 -4.661e-21 ## 32 1e+09 -7.693e-20 ## 33 1e+10 3.452e-17 (still ok) ## 34 1e+11 -3.472e-15 << now start losing --> 128 bits *not* sufficient! ## 35 1e+12 -3.138e-13 <<<< ## same conclusion via number of correct (decimal) digits: dfm(n, log10(abs(stEM/stEM4k - 1))) plot(N(-log10(abs(stEM/stEM4k - 1))) ~ N(n), type="o", log="x", xlab = quote(n), main = "#{correct digits} of 128-bit stirlerrM(n)") ubits <- c(128, 52) # above 128-bit and double precision abline(h = ubits* log10(2), lty=2) text(1, ubits* log10(2), paste0(ubits,"-bit"), adj=c(0,0)) stopifnot(identical(stirlerrM(n), stEM)) # for bigz & bigq, we default to precBits = 128 all.equal(roundMpfr(stEM4k, 64), stirlerrSer (n., oMax)) # 0.00212 .. because of 1st few n. ==> drop these all.equal(roundMpfr(stEM4k,64)[n. >= 3], stirlerrSer (n.[n. >= 3], oMax)) # 6.238e-8 plot(asNumeric(abs(stirlerrSer(n., oMax) - stEM4k)) ~ n., log="xy", type="b", main="absolute error of stirlerrSer(n, oMax) & (n, 5)") abline(h = 2^-52, lty=2); text(1, 2^-52, "52-bits", adj=c(1,-1)/oMax) lines(asNumeric(abs(stirlerrSer(n., 5) - stEM4k)) ~ n., col=2) plot(asNumeric(stirlerrM(n)) ~ n., log = "x", type = "b") for(k in 1:oMax) lines(n, stirlerrSer(n, k), col = k+1) legend("top", c("stirlerrM(n)", paste0("stirlerrSer(n, k=", 1:oMax, ")")), pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## y in log-scale plot(asNumeric(stirlerrM(n)) ~ n., log = "xy", type = "b", ylim = c(1e-13, 0.08)) for(k in 1:oMax) lines(n, stirlerrSer(n, k), col = k+1) legend("topright", c("stirlerrM(n)", paste0("stirlerrSer(n, k=", 1:oMax, ")")), pch=c(1,rep(NA,oMax)), col=1:(oMax+1), lty=1, bty="n") ## all "looks" perfect (so we could skip this) ## The numbers ... reused ## stopifnot(sapply(stEserQ, class) == "bigq") # of exact big rationals ## str(stEsQM <- lapply(stEserQ, as, Class="mpfr"))# list of oMax; each prec. 128..702 ## stEsQM. <- lapply(stEserQ, .bigq2mpfr, precB = 512) # constant higher precision ## stEsQMm <- sapply(stEserQ, asNumeric) # a matrix -- "exact" values of Stirling series ## stEM <- stirlerrM(mpfr(n, 128)) # now ok (loss of precision, but still ~ 10 digits correct) ## stEM4k <- stirlerrM(mpfr(n, 4096))# assume "perfect" stEmat <- cbind(sM = stEM4k, stEsQMm) signif(asNumeric(stEmat), 6) # prints nicely -- large n = 10^e: see ~= 1/(12 n) = 0.8333 / n ## print *relative errors* nicely : ## simple double precision version of direct formula (cancellation for n >> 1 !): stE <- stirlerrM(n.) # --> bad for small n; catastrophically bad for n >= 10^7 ## relative *errors* dfm(n , cbind(stEsQMm, dbl=stE)/stEM4k - 1) ## only "perfect" Series (showing true mathematical approx. error; not *numerical* relE <- N(stEsQMm / stEM4k - 1) dfm(n, relE) matplot(n, relE, type = "b", log="x", ylim = c(-1,1) * 1e-12) ## |rel.Err| in [log log] matplot(n, abs(N(relE)), type = "b", log="xy") matplot(n, pmax(abs(N(relE)), 1e-19), type = "b", log="xy", ylim = c(1e-17, 1e-12)) matplot(n, pmax(abs(N(relE)), 1e-19), type = "b", log="xy", ylim = c(4e-17, 1e-15)) abline(h = 2^-(53:52), lty=3)