Package 'copulaData'

Title: Data Sets for Copula Modeling
Description: Data sets used for copula modeling in addition to those in the R package 'copula'. These include a random subsample from the US National Education Longitudinal Study (NELS) of 1988 and nursing home data from Wisconsin.
Authors: Marius Hofert [aut, cre], Ivan Kojadinovic [aut], Martin Maechler [aut], Jun Yan [aut], Edward W. Frees [dtc] (NELS and nursingHomes)
Maintainer: Marius Hofert <[email protected]>
License: GPL (>= 3) | file LICENCE
Version: 0.0-2
Built: 2024-10-18 05:45:36 UTC
Source: https://github.com/r-forge/copula

National Education Longitudinal Study Data

Description

Random sample of size 1000 from the US National Education Longitudinal Study (NELS) data containing the mathematics, science and reading scores, together with covariates, of 8th graders in 1988.

Usage

data("NELS88")

Format

A data.frame with the following variables:
ID: identification number of the school to which the student belongs.
Math: standardized score of the student on a mathematics achievement test, rescaled by an Item Response Theory (IRT) method so that a higher score indicates greater proficiency in mathematics.
Science: standardized score of the student on a science achievement test.
Reading: standardized score of the student on a reading achievement test.
Minority: factor indicating whether the student is a member of an ethnic minority group.
SES: numeric measure of the socio-economic status of the student and family.
Female: factor indicating whether the student is female.
Public: factor indicating whether the school is publicly funded.
Size: size of the student's school.
Urban: factor indicating whether the school is located in an urban environment.
Rural: factor indicating whether the school is located in a rural environment.

Source

Edward W. Frees, ‘Student Achievement Data’ in https://sites.google.com/a/wisc.edu/jed-frees/tutorial-multivariate-regression-using-copulas.

Original source: National Center for Education Statistics, https://nces.ed.gov/surveys/nels88/

Examples

data("NELS88")
str(NELS88)
ftable(xtabs(~ Urban + Rural + Public, NELS88))
## Add a more sensible variable: an ordered factor with rural < agglo < urban
NELS88. <- within(NELS88, {
    UR <- factor(Urban:Rural, labels = c("agglo", "rural", "urban"))
    Urbanity <- ordered(UR, levels = c("rural", "agglo", "urban"))
    rm(UR)
})
unique(NELS88.[, c("Urban", "Rural", "Urbanity")]) # indeed, just 3 combinations

xtabs(~ Minority + Urbanity, NELS88.) # (_not_ independent)
tab. <- xtabs(~ Public + Urbanity + Female + Minority, NELS88.)
ftable(tab.)
summary(tab.) # chi-squared test: very clearly not independent
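
For copula modeling, the three test scores are usually first mapped to rank-based pseudo-observations in the unit cube. The following is a minimal sketch and not part of the package's own examples: pseudo_obs() is a hypothetical helper defined here for illustration (the 'copula' package provides pobs() for the same purpose), and the requireNamespace() guard assumes that 'copulaData' may not be installed.

```r
## Hypothetical helper (not part of 'copulaData'): rank-based
## pseudo-observations, i.e. column-wise ranks divided by (n + 1).
## Ties receive average ranks (the default of rank()).
pseudo_obs <- function(x) {
    x <- as.matrix(x)
    apply(x, 2, rank) / (nrow(x) + 1)
}

## Only touch the data set if the package is available
if (requireNamespace("copulaData", quietly = TRUE)) {
    data("NELS88", package = "copulaData")
    scores <- NELS88[, c("Math", "Science", "Reading")]
    U <- pseudo_obs(scores)   # n x 3 matrix with entries in (0, 1)
    summary(U)
    ## Pairwise Kendall's tau, a margin-free measure of dependence
    print(cor(scores, method = "kendall"))
}
```

Dividing by n + 1 rather than n keeps all pseudo-observations strictly inside (0, 1), which avoids evaluating copula densities on the boundary of the unit cube.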

Wisconsin Nursing Homes Utilization Data

Description

Data set containing the occupancy rate (utilization) and covariates of 377 nursing homes in Wisconsin between 1995 and 2001.

Usage

data("nursingHomes")

Format

A data.frame with the following variables:
ID: nursing home identification number.
Rate: occupancy rate; see Sun et al. (2008, Equation (7)) for how it is computed.
LnNumBed: logarithm of the number of beds of the nursing home.
LnSqrFoot: logarithm of the net square footage of the nursing home.
CRYear: cost report year.
Pro: indicator of whether the nursing home runs on a for-profit basis.
TaxExempt: indicator of whether the nursing home is tax exempt.
SelfIns: indicator of whether the nursing home self-funds its insurance.
MCert: indicator of whether the nursing home is accredited as Medicare Certified.
Urban: indicator of whether the nursing home is located in an urban environment.
See also Sun et al. (2008, Table 2).

Source

Edward W. Frees, Wisconsin Department of Health and Family Services (now named “Wisconsin Department of Health Services”)

References

Sun, J., Frees, E. W. and Rosenberg, M. A. (2008) Heavy-tailed longitudinal data modeling using copulas. Insurance: Mathematics and Economics 42, 817–830.

Examples

data("nursingHomes")
str(nursingHomes)
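
Since the data cover several cost report years, a natural first check is how many records each nursing home and each year contribute. The following is a minimal sketch under the assumption that the data are in long format, with one row per nursing home (ID) and cost report year (CRYear); panel_summary() is a hypothetical helper defined here for illustration and is not part of the package.

```r
## Hypothetical helper (not part of 'copulaData'): summarise a long-format
## panel by counting distinct units and records per period.
panel_summary <- function(d, id = "ID", time = "CRYear") {
    list(n_units  = length(unique(d[[id]])),   # number of distinct homes
         per_year = table(d[[time]]),          # records per cost report year
         per_unit = table(table(d[[id]])))     # homes by number of records
}

## Only touch the data set if the package is available
if (requireNamespace("copulaData", quietly = TRUE)) {
    data("nursingHomes", package = "copulaData")
    print(panel_summary(nursingHomes))
}
```

If per_unit shows homes with fewer records than there are years, the panel is unbalanced, which matters for any longitudinal copula model fitted to Rate.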