Automating HRV analysis: RHRVEasy

RHRVEasy automates all steps of a Heart Rate Variability (HRV) analysis, including data processing, indices calculation, and statistical analysis. It takes as input a list of folders, each containing the recordings of a same population. It calculates time, frequency, and nonlinear domain HRV indices, and then it applies hypothesis test, and corrects the significance levels. If there are more than two experimental groups and statistically significant differences are found, it performs a post-hoc analysis to find out which groups have the differences.

0. Set up required to run this tutorial

This tutorial uses the recordings of the Normal Sinus Rhythm RR Interval Database (hereinafter referred to as NSR_DB), a subset of the RR interval time series from healthy subjects (referred to as HEALTHY_DB), and the Congestive Heart Failure RR Interval Database (referred to as CHF_DB). The former two databases comprise data from healthy individuals, while the latter consists of recordings from patients with severe cardiac pathology. Consequently, significant disparities in numerous HRV indices are anticipated between the healthy databases and the CHF_DB.

The three databases are available in the GitHub repository for the book “Heart Rate Variability Analysis with the R package RHRV”, under the data/Chapter8 folder, within the data/Chapter8 directory. To execute this tutorial, download this folder to your local machine and define the following variables:

library("RHRV")

basePath <- "book_data"  # adjust as needed
NSR_DB <- file.path(basePath, "normal")
CHF_DB <- file.path(basePath, "chf")
HEALTHY_DB <- file.path(basePath, "healthy")

RHRVEasy permits creating an Excel spreadsheet with all the HRV indices calculated for each recording. The following variable must contain the folder on the local machine where the Excel spreadsheet is to be saved:

spreadsheetPath <- basePath

1. Time and frequency analysis

RHRVEasy enables the user to carry out a full HRV analysis by just invoking a function with a single mandatory parameter: a list with the folders containing the recordings of the experimental groups. This list must have at least two folders. Each folder must contain all the RR recordings of the same experimental group and no additional files, as RHRVEasy will try to open all the files within these folders. The name that will be used to refer to each experimental group within RHRVEasy will be the name of the folder in which its recordings are located.

The following function call computes the time and frequency indices for the NSR_DB and CHF_DB databases, and performs a statistical comparison of each index correcting the significance level with the Bonferroni method. Note the use of the nJobs to use several cores and parallelize the computations. With nJobs = -1, it uses all available cores; if an integer greater than 0 is indicated, it uses the number of cores indicated by the integer.

easyAnalysis <- RHRVEasy(folders = c(NSR_DB, CHF_DB), nJobs = -1)

When the returned object is displayed in the console, it shows which indices present statistically significant differences:

print(easyAnalysis)
## Significant differences in SDNN (Kruskal-Wallis rank sum test, bonferroni p-value = 1.117154e-07):
##   chf's mean95% CI: (61.91503, 94.0085) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (131.1187, 148.1985) [Bootstrap CI without adjustment]
## 
## Significant differences in SDANN (Kruskal-Wallis rank sum test, bonferroni p-value = 3.799696e-07):
##   chf's mean95% CI: (48.19527, 80.0444) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (122.0759, 139.05) [Bootstrap CI without adjustment]
## 
## Significant differences in SDNNIDX (Kruskal-Wallis rank sum test, bonferroni p-value = 0.01426098):
##   chf's mean95% CI: (29.96821, 47.6446) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (47.0144, 54.5201) [Bootstrap CI without adjustment]
## 
## Significant differences in IRRR (Kruskal-Wallis rank sum test, bonferroni p-value = 1.492754e-07):
##   chf's mean95% CI: (78.67064, 124.1918) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (189.5291, 215.7118) [Bootstrap CI without adjustment]
## 
## Significant differences in TINN (Kruskal-Wallis rank sum test, bonferroni p-value = 1.452872e-06):
##   chf's mean95% CI: (243.1949, 373.8965) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (511.0544, 586.6332) [Bootstrap CI without adjustment]
## 
## Significant differences in HRVi (Kruskal-Wallis rank sum test, bonferroni p-value = 1.452872e-06):
##   chf's mean95% CI: (15.96148, 23.78737) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (32.80169, 37.58583) [Bootstrap CI without adjustment]
## 
## Significant differences in ULF (Kruskal-Wallis rank sum test, bonferroni p-value = 1.74099e-08):
##   chf's mean95% CI: (1182.117, 4410.562) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (7215.618, 9824.658) [Bootstrap CI without adjustment]
## 
## Significant differences in VLF (Kruskal-Wallis rank sum test, bonferroni p-value = 0.002535127):
##   chf's mean95% CI: (52.21509, 135.5065) [Bootstrap CI without adjustment]
##   normal's mean95% CI: (131.5723, 175.2834) [Bootstrap CI without adjustment]

All computed indices, as well as all p-values resulting from all comparisons, are stored in data.frames contained in the object. Two different sets of p-values are available; the ones obtained before (p.value) and after (adj.p.value) applying the significance level correction:

# HRVIndices
head(easyAnalysis$HRVIndices)
## # A tibble: 6 × 16
##   file       group  SDNN SDANN SDNNIDX pNN50  SDSD rMSSD  IRRR MADRR  TINN  HRVi
##   <chr>      <fct> <dbl> <dbl>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 chf201_rr… chf    75.5  52.9    49.6  2.03  20.2  20.2  93.8  7.81  358. 22.9 
## 2 chf202_rr… chf    88.5  75.8    39.6  6.13  34.7  34.7 117.  15.6   350. 22.4 
## 3 chf203_rr… chf    38.8  30.9    21.7  1.20  17.3  17.3  46.9  7.81  170. 10.9 
## 4 chf204_rr… chf    55.1  39.1    36.0  4.84  33.0  33.0  70.3  7.81  237. 15.2 
## 5 chf205_rr… chf    34.9  26.1    19.5  1.97  23.7  23.7  46.9  7.81  169. 10.8 
## 6 chf206_rr… chf    41.2  34.9    14.8  2.02  18.9  18.9  31.2  7.81  122.  7.79
## # ℹ 4 more variables: ULF <dbl>, VLF <dbl>, LF <dbl>, HF <dbl>
# Statistical analysis
head(easyAnalysis$stats)
## # A tibble: 6 × 4
##   HRVIndex method                             p.value adj.p.value
##   <chr>    <chr>                                <dbl>       <dbl>
## 1 SDNN     Kruskal-Wallis rank sum test 0.00000000798 0.000000112
## 2 SDANN    Kruskal-Wallis rank sum test 0.0000000271  0.000000380
## 3 SDNNIDX  Kruskal-Wallis rank sum test 0.00102       0.0143     
## 4 pNN50    Kruskal-Wallis rank sum test 0.774         1          
## 5 SDSD     Kruskal-Wallis rank sum test 0.0891        1          
## 6 rMSSD    Kruskal-Wallis rank sum test 0.0891        1

The format parameter specifies the format in which the RR intervals are stored. All formats supported by the RHRV package can be used: WFDB, ASCII, RR, Polar, Suunto, EDFPlus or Ambit (check the RHRV website for more information). The default format is RR, where the beat distances in seconds are stored in a single column of an ASCII file. This is the format of the three databases used in this tutorial.

By default, the frequency analysis is performed using the Fourier transform. It is also possible to use the Wavelet transform pasing the value 'wavelet' to the typeAnalysis parameter (check the paper “García, C. A., Otero, A., Vila, X., & Márquez, D. G. (2013). A new algorithm for wavelet-based heart rate variability analysis. Biomedical Signal Processing and Control, 8(6), 542-550” for details):

easyAnalysisWavelet <- RHRVEasy(
  folders = c(NSR_DB, CHF_DB), 
  typeAnalysis = 'wavelet', 
  n_jobs = -1
)

Note that the significant indices are the same as the previous ones.

2. Correction of the significance level

Given that multiple statistical tests are performed on several HRV indices, a correction of the significance level should be applied. The Bonferroni method is used by default. This behavior can be overridden with the parameter correctionMethod of RHRVEasy. The possible values of this parameter besides bonferroni are holm, hochberg, hommel, BH (Benjamini & Hochberg), fdr (false discovery rate), BY (Benjamini & Yekutieli), and none (indicating that no correction is to be made). Furthermore, there is no need to recompute the HRV indices to apply a different correction method, but the RHRVEasyStats function can be used to this end. The confidence level can also be changed using the significance parameter (in both RHRVEasy and RHRVEasyStats functions).

easyAnalysisFDR <- RHRVEasyStats(easyAnalysis, correctionMethod =  'fdr')
pValues <- merge(
  easyAnalysis$stats, 
  easyAnalysisFDR$stats,
  by = setdiff(names(easyAnalysis$stats), "adj.p.value"),
  suffixes = c(".bonf", ".fdr")
)
#Let us compare the p-values obtained with different correction methods 
print(
  head(
    pValues[, c("HRVIndex", "p.value", "adj.p.value.bonf", "adj.p.value.fdr")]
  )
) 
##   HRVIndex      p.value adj.p.value.bonf adj.p.value.fdr
## 1       HF 5.601495e-01     1.000000e+00    6.032380e-01
## 2     HRVi 1.037766e-07     1.452872e-06    2.421454e-07
## 3     IRRR 1.066253e-08     1.492754e-07    4.975847e-08
## 4       LF 1.651479e-02     2.312071e-01    2.568968e-02
## 5    MADRR 6.319903e-02     8.847864e-01    8.847864e-02
## 6    pNN50 7.744691e-01     1.000000e+00    7.744691e-01

3. Saving the indices to an Excel spreadsheet

If the argument saveHRVindicesInPath is specified when invoking the function RHRVEasy, an Excel spreadsheet with all the HRV indices calculated for each recording will be created in the path specified by this parameter. The name of the spreadsheet generated is “<group 1 name>Vs<group 2 name> .xlsx”:

easyAnalysis <- RHRVEasy(folders = c(NSR_DB, CHF_DB), 
                         saveHRVIndicesInPath = spreadsheetPath)

This spreadsheet can also be generated from the object returned by RHRVEasy by calling the function SaveHRVIndices.

SaveHRVIndices(easyAnalysis, saveHRVIndicesInPath = spreadsheetPath)

4. Comparing more than two experimental groups

If the analysis involves three or more groups, when statistically significant differences are found among them it does not necessarily mean that there are statistically significant differences between all pairs of groups. In such a scenario post-hoc tests are used to find which pairs of groups present differences:

#Comparison of the three databases
easyAnalysis3 <- RHRVEasy(
  folders = c(NSR_DB, CHF_DB, HEALTHY_DB),
  nJobs = -1
)
print(easyAnalysis3)
## Significant differences in SDNN (Kruskal-Wallis rank sum test, bonferroni p-value = 3.543622e-07):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf    0.00799    
##     2 normal  chf    0.000000282
##     ----------------------------
##     chf's mean95% CI: (63.20538, 92.2515) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (123.242, 158.269) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (131.665, 147.9961) [Bootstrap CI without adjustment]
## 
## Significant differences in SDANN (Kruskal-Wallis rank sum test, bonferroni p-value = 1.345688e-06):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1 group2 adj.p.value
##     1 normal chf    0.000000403
##     ---------------------------
##     chf's mean95% CI: (47.61222, 81.42191) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (105.1872, 134.0331) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (120.4753, 138.5329) [Bootstrap CI without adjustment]
## 
## Significant differences in SDNNIDX (Kruskal-Wallis rank sum test, bonferroni p-value = 0.001063849):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf        0.00111
##     ----------------------------
##     chf's mean95% CI: (29.1345, 47.73994) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (56.23389, 74.9991) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (47.0101, 54.33106) [Bootstrap CI without adjustment]
## 
## Significant differences in IRRR (Kruskal-Wallis rank sum test, bonferroni p-value = 3.688167e-07):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf    0.00395    
##     2 normal  chf    0.000000425
##     ----------------------------
##     chf's mean95% CI: (77.3305, 124.7238) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (179.9086, 234.5556) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (187.6484, 215.9975) [Bootstrap CI without adjustment]
## 
## Significant differences in MADRR (Kruskal-Wallis rank sum test, bonferroni p-value = 0.006224158):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf        0.00237
##     ----------------------------
##     chf's mean95% CI: (8.62069, 11.85345) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (16.55556, 24.66667) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (11.28472, 14.03356) [Bootstrap CI without adjustment]
## 
## Significant differences in TINN (Kruskal-Wallis rank sum test, bonferroni p-value = 1.350844e-06):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf     0.000933  
##     2 normal  chf     0.00000519
##     ----------------------------
##     chf's mean95% CI: (244.0477, 371.3618) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (533.6798, 701.4795) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (511.6379, 586.4394) [Bootstrap CI without adjustment]
## 
## Significant differences in HRVi (Kruskal-Wallis rank sum test, bonferroni p-value = 1.350844e-06):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf     0.000933  
##     2 normal  chf     0.00000519
##     ----------------------------
##     chf's mean95% CI: (15.85798, 23.7487) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (34.45, 45.19331) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (32.68737, 37.61479) [Bootstrap CI without adjustment]
## 
## Significant differences in ULF (Kruskal-Wallis rank sum test, bonferroni p-value = 5.860632e-08):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1 group2  adj.p.value
##     1 normal chf    0.0000000162
##     ----------------------------
##     chf's mean95% CI: (1075.296, 4358.885) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (4995.594, 8167.694) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (7063.468, 9898.164) [Bootstrap CI without adjustment]
## 
## Significant differences in VLF (Kruskal-Wallis rank sum test, bonferroni p-value = 0.0005669878):
##   Significant differences in the post-hoc tests (Dunn's all-pairs test + bonferroni-p-value adjustment):
##       group1  group2 adj.p.value
##     1 healthy chf        0.00239
##     2 normal  chf        0.00977
##     ----------------------------
##     chf's mean95% CI: (54.04686, 134.9712) [Bootstrap CI without adjustment]
##     healthy's mean95% CI: (171.6335, 340.8925) [Bootstrap CI without adjustment]
##     normal's mean95% CI: (130.0847, 177.0061) [Bootstrap CI without adjustment]

Note that the stats data.frame now contains a column named pairwise storing the results of the post-hoc analysis for those indices where the omnibus test has been significant:

print(head(easyAnalysis3$stats))
## # A tibble: 6 × 5
##   HRVIndex method                            p.value adj.p.value pairwise
##   <chr>    <chr>                               <dbl>       <dbl> <list>  
## 1 SDNN     Kruskal-Wallis rank sum test 0.0000000253 0.000000354 <tibble>
## 2 SDANN    Kruskal-Wallis rank sum test 0.0000000961 0.00000135  <tibble>
## 3 SDNNIDX  Kruskal-Wallis rank sum test 0.0000760    0.00106     <tibble>
## 4 pNN50    Kruskal-Wallis rank sum test 0.0186       0.260       <NULL>  
## 5 SDSD     Kruskal-Wallis rank sum test 0.0301       0.421       <NULL>  
## 6 rMSSD    Kruskal-Wallis rank sum test 0.0301       0.421       <NULL>
# Let's print the post-hoc comparisons for "SDNN"
print(head(easyAnalysis3$stats$pairwise[[1]]))
## # A tibble: 3 × 6
##   HRVIndex group1  group2  method                     p.value adj.p.value
##   <chr>    <chr>   <chr>   <chr>                        <dbl>       <dbl>
## 1 SDNN     healthy chf     Dunn's all-pairs test 0.000296     0.00799    
## 2 SDNN     normal  chf     Dunn's all-pairs test 0.0000000104 0.000000282
## 3 SDNN     normal  healthy Dunn's all-pairs test 0.861        1

5. Overwriting default parameters

Any parameter of any RHRV function can be specified as an additional parameter of the RHRVEasy function; in this case, the default value used for that parameter will be overwritten by the one specified for the user. The default values used in the RHRVEasy package are the same as those used in the RHRV package. For more information about the parameters available you can consult the RHRV website. For example, the following analysis modifies the the limits of the ULF, VLF, LF and HF spectral bands, and uses an interpolation frequency (freqhr) of 2 Hz:

easyAnalysisOverwritten <- RHRVEasy(folders = c(NSR_DB, CHF_DB),
                                    freqhr = 2, 
                                    ULFmin = 0, ULFmax = 0.02, 
                                    VLFmin = 0.02,  VLFmax = 0.07, 
                                    LFmin = 0.07, LFmax = 0.20, 
                                    HFmin = 0.20, HFmax = 0.5)

6. Nonlinear analysis

The calculation of the nonlinear indices requires considerable computational resources, specially the Recurrence Quantification Analysis (RQA). Whereas in a typical HRV analysis the computation of all the time and frequency domain indices for a few dozens of recordings often completes within a few minutes, the computation of the nonlinear indices could last many hours. That’s why the boolean parameters nonLinear and doRQA are set to FALSE by default. If these parameters are not changed, only time and frequency indices will be calculated, as in the previous sections.

Warning: the following sentence, will take several hours to execute on a medium to high performance PC. You may reproduce the results of the paper by running this chunk of code.

fullAnalysis <- RHRVEasy(
  folders = c(NSR_DB, CHF_DB, HEALTHY_DB),
  nJobs = -1,
  nonLinear =  TRUE, 
  doRQA = TRUE,
  saveHRVIndicesInPath = spreadsheetPath
)