The ideas here are based on Chapter 9 of Särndal, Swensson and Wretman, which deals with two-phase sampling and the regression estimation of population totals. Here we are interested in multiphase sampling and in settings where many statistics may be computed with the same set of calibrated weights, so the priorities and notation are slightly different.
As in the book, we consider the setting where the sampling probabilities are known (or well-estimated) in advance and the raking adjustments are small. When the sampling probabilities $\pi$ are grossly incorrect (e.g., because of missing data or non-response), so that the raking adjustments are not small, we would want different computations, such as those of Chang & Kott (Biometrika, 2008).
In multiphase sampling we have a sequence of $K$ nested subsamples, with sampling probabilities that can depend on any data observed at previous phases. We write $\pi_{i,1},\ \pi_{i,2|1},\ \pi_{i,3|2},\ \dots,\ \pi_{i,k|k-1},\ \dots,\ \pi_{i,K|K-1}$ for the sampling probabilities of observation $i$ at each phase, and similarly for pairwise probabilities. In general, we use a ${}_{,k|k-1}$ subscript for quantities related to the $k$th phase of sampling and a ${}_{,k}$ subscript for quantities related to the cumulative effect of sampling from phase 1 through phase $k$. All sums are over the population unless otherwise specified; restriction to samples is achieved with the sampling indicators $R_{i,k}$ (and $R_i\equiv R_{i,K}$). We also write $$\pi^\dagger_{ij,k}=\pi^*_{ij,K}/\pi^*_{ij,k}=\prod_{\ell=k+1}^K \pi_{ij,\ell|\ell-1}$$ for the forward cumulative pairwise probabilities.
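To make the bookkeeping concrete, here is a minimal numerical sketch (Python, with made-up per-phase probabilities) of how the cumulative probabilities $\pi^*_{i,k}$ and the forward cumulative probabilities are built up from the per-phase conditional probabilities; the pairwise versions are built the same way from one matrix per phase.

```python
import numpy as np

# Hypothetical per-phase conditional probabilities pi_{i,k|k-1} for a toy
# population of 4 units and K = 3 phases (rows = units, columns = phases).
phase_probs = np.array([
    [0.5, 0.8, 0.6],
    [0.5, 0.7, 0.9],
    [0.4, 0.8, 0.5],
    [0.6, 0.6, 0.7],
])

# Cumulative probabilities pi*_{i,k} = pi_{i,1} pi_{i,2|1} ... pi_{i,k|k-1}
pi_star = np.cumprod(phase_probs, axis=1)

# Forward cumulative probabilities pi*_{i,K} / pi*_{i,k}
pi_dagger = pi_star[:, [-1]] / pi_star
```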
The product $\pi^*_i \equiv \pi^*_{i,K} = \pi_{i,1}\pi_{i,2|1}\pi_{i,3|2}\cdots\pi_{i,K|K-1}$ is not in general the marginal sampling probability for observation $i$, because each $\pi_{k|k-1}$ can depend on data up to phase $k-1$, so $\pi^*_i=\pi^*_{i,K}$ is a random variable. To get the true marginal probabilities we would need to integrate out all the dependence on data measured at intermediate phases. However, it is still true that $E[R_{i,k}/\pi^*_{i,k}]=1$, which is the key fact we need to estimate totals $$\hat T_X = \sum_i \frac{R_i}{\pi_i^*}X_i$$
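Continuing the toy arrays above, with hypothetical values of $x$ and of the final-phase indicators, the estimated total is just the weighted sum over the phase-$K$ sample:

```python
# Hypothetical data: x is the variable of interest (only needed where
# R_K == 1) and R_K indicates membership in the final-phase sample.
x = np.array([2.0, 5.0, 3.0, 4.0])
R_K = np.array([1, 0, 0, 1])

# hat T_X = sum_i R_{i,K} x_i / pi*_{i,K}
T_hat = np.sum(R_K * x / pi_star[:, -1])
```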
It’s also still true (per Särndal et al.) that the variance of a total can be estimated by something very like the Horvitz-Thompson formula $$\widehat{\mathrm{var}}[\hat T_X]= \sum_{i,j} \check{X}_i\check{X}_j\check{\Delta}_{ij}$$ where $\check X_i = X_i/\pi^*_i$, $\Delta_{ij} = \pi^*_{ij} - \pi^*_i\pi^*_j$, and $\check\Delta_{ij} = \Delta_{ij}/\pi^*_{ij} = 1 - \pi^*_i\pi^*_j/\pi^*_{ij}$.
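A direct implementation of this estimator is a double sum over sampled pairs; the sketch below assumes a matrix `pi_pair` of cumulative pairwise probabilities $\pi^*_{ij}$ (with diagonal $\pi^*_i$), which is not part of the toy example above.

```python
import numpy as np

def ht_total_variance(x, R, pi1, pi_pair):
    """Horvitz-Thompson-type variance estimate for a total.

    x       : variable of interest
    R       : final-phase sampling indicators R_i
    pi1     : cumulative probabilities pi*_i
    pi_pair : cumulative pairwise probabilities pi*_ij, with diagonal pi*_i
    """
    idx = np.flatnonzero(R)                      # restrict sums to the sample
    x_check = x[idx] / pi1[idx]                  # check-x_i = x_i / pi*_i
    sub = np.ix_(idx, idx)
    delta_check = 1.0 - np.outer(pi1[idx], pi1[idx]) / pi_pair[sub]
    return x_check @ delta_check @ x_check       # sum_ij check-x_i check-x_j check-Delta_ij
```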
In the absence of raking we could speed up computation using a recursive relationship that constructs $\check\Delta$ from the weighted covariances $\check\Delta_1$, $\check\Delta_{2|1}$, and so on at each phase, subsetted down to the subsample remaining at phase $K$. This also has the advantage of simplicity. However, we can’t do this precomputation with raking, and it has the disadvantage of not giving components of variance at each phase, which we like having. We do use the recursive combination for constructing $\check\Delta_{k|k-1}$ in multistage designs within a single phase.
Instead, we need to use a summation over phases. We consider an estimated total as a telescoping sum over phases $$\hat T_X - T_X = (\hat T_{X,1} - T_X) + (\hat T_{X,2} - \hat T_{X,1}) + \cdots + (\hat T_X - \hat T_{X,K-1})$$ where each term is the error incurred by one phase of sampling. That is
$$\hat T_{X,1}-T_X=\sum_i \frac{R_{i,1}}{\pi^*_{i,1}}x_i- \sum_i x_i$$ $$\hat T_{X,2}-\hat T_{X,1}=\sum_i \frac{R_{i,2}}{\pi^*_{i,2}}x_i- \sum_i \frac{R_{i,1}}{\pi^*_{i,1}}x_i$$ and in general $$\hat T_{X,k}-\hat T_{X,k-1}=\sum_i \frac{R_{i,k}}{\pi^*_{i,k}}x_i- \sum_i \frac{R_{i,k-1}}{\pi^*_{i,k-1}}x_i$$ These terms are all uncorrelated, because conditional on the sampling at previous phases each has mean zero. They aren’t independent, because the data available for sampling at phase $k$ depend on all previous phases, but they do form a martingale difference sequence. I will write $\check x_{i,k}$ for $x_i/\pi^*_{i,k}$, the weighted observation at phase $k$, so that we have $$\hat T_{X,k}-\hat T_{X,k-1}=\sum_i R_{i,k}\check{x}_{i,k}-\sum_i R_{i,k-1}\check{x}_{i,k-1}$$
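For intuition only, here is a sketch of the telescoping decomposition on the toy example, pretending $x$ were known for the whole population so that every $\hat T_{X,k}$ can actually be computed; the cumulative indicators `R` are hypothetical.

```python
# Hypothetical cumulative sampling indicators R_{i,k}: column k-1 marks
# membership in the phase-k subsample (nested, so each row only drops from 1 to 0).
R = np.array([
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
])

# Phase-k estimates hat T_{X,k} = sum_i R_{i,k} x_i / pi*_{i,k}
T_by_phase = (R * x[:, None] / pi_star).sum(axis=0)

# Telescoping terms: first entry is hat T_{X,1} - T_X, then the successive
# differences hat T_{X,k} - hat T_{X,k-1}; together they sum to hat T_X - T_X.
phase_errors = np.diff(np.concatenate(([x.sum()], T_by_phase)))
```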
The variance of this sum is thus the sum of variances $$\mathrm{var}[\hat T_X] = \mathrm{var}[\hat T_{X,1}-T_X] + \mathrm{var}[\hat T_{X,2}-\hat T_{X,1}] + \cdots + \mathrm{var}[\hat T_X-\hat T_{X,K-1}]$$ and each variance is (conditional on the sampling so far) of the usual Horvitz-Thompson form $$\mathrm{var}[\hat T_{X,k}-\hat T_{X,k-1}] = \sum_{i,j} R_{i,k-1}R_{j,k-1}\,\mathrm{cov}\left[R_{i,k|k-1},R_{j,k|k-1}\right]\check{x}_{i,k}\check{x}_{j,k}$$ which could be estimated at phase $k$ by $$\widehat{\mathrm{var}}_k\left[\hat T_{X,k}-\hat T_{X,k-1}\right]=\sum_{i,j}\frac{R_{i,k}R_{j,k}}{\pi_{ij,k|k-1}}\mathrm{cov}\left[R_{i,k|k-1},R_{j,k|k-1}\right]\check{x}_{i,k}\check{x}_{j,k}=\sum_{i,j}R_{i,k}R_{j,k}\check{\Delta}_{ij,k|k-1}\check{x}_{i,k}\check{x}_{j,k}$$ That’s still not enough, because we don’t necessarily have $x$ until phase $K$, so we need to weight down to phase $K$: $$\widehat{\mathrm{var}}_K\left[\hat T_{X,k}-\hat T_{X,k-1}\right]=\sum_{i,j}R_{i,k}R_{j,k}\frac{R_{i,K}R_{j,K}}{\pi^*_{ij,K}/\pi^*_{ij,k}}\check{\Delta}_{ij,k|k-1}\check{x}_{i,k}\check{x}_{j,k}=\sum_{i,j}\frac{R_{i,K}R_{j,K}}{\pi^\dagger_{ij,k}}\check{\Delta}_{ij,k|k-1}\check{x}_{i,k}\check{x}_{j,k}$$
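A sketch of the phase-$k$ contribution, assuming the per-phase matrices $\check\Delta_{ij,k|k-1}$ and the cumulative pairwise probability matrices $\pi^*_{ij,k}$ are available (all argument names are hypothetical):

```python
import numpy as np

def phase_variance_contribution(x, R_K, pi_star, pi_pair_cum, delta_check, k):
    """Estimated phase-k contribution to var[hat T_X], weighted down to phase K.

    x            : variable of interest (observed where R_K == 1)
    R_K          : final-phase indicators R_{i,K}
    pi_star      : cumulative probabilities pi*_{i,k}, one column per phase
    pi_pair_cum  : list of matrices of cumulative pairwise probabilities pi*_{ij,k}
    delta_check  : list of matrices check-Delta_{ij,k|k-1}, one per phase
    k            : phase number (1-based)
    """
    idx = np.flatnonzero(R_K)                   # pairs still present at phase K
    sub = np.ix_(idx, idx)
    x_check_k = x[idx] / pi_star[idx, k - 1]    # check-x_{i,k}
    # pi-dagger_{ij,k} = pi*_{ij,K} / pi*_{ij,k}
    pi_dagger_pair = pi_pair_cum[-1][sub] / pi_pair_cum[k - 1][sub]
    weighted_delta = delta_check[k - 1][sub] / pi_dagger_pair
    return x_check_k @ weighted_delta @ x_check_k
```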
Raking of phase $k$ to phase $k-1$ involves estimating raking adjustments $g_{i,k|k-1}$ that satisfy calibration constraints on variables $A_i$ available at phase $k-1$: $$\sum_i R_{i,k}g_{i,k|k-1}\frac{1}{\pi_{i,k}^*}A_i=\sum_i R_{i,k-1}\frac{1}{\pi^*_{i,k-1}}A_i$$ Let $\Pi_A$ and $\Pi_{\bar A}$ be the projections onto and orthogonal to the space spanned by $A$ given weights $1/\pi^*_{k-1}$, and let $\hat x_i = \Pi_A x_i$ and $e_i = \Pi_{\bar A} x_i$. The calibrated total estimate is $$\hat T_X=\sum_i\frac{R_ig_{i,k|k-1}}{\pi^*_i}x_i=\sum_iR_ig_{i,k|k-1}\check{x}_i=\sum_iR_ig_{i,k|k-1}\check{e}_{i,k}+\sum_i R_{i,k}g_{i,k|k-1}\check{\hat x}_{i,k}$$ Applying the calibration constraints to the second term, we have $$\sum_iR_ig_{i,k|k-1}\check{e}_{i,k}+\sum_i R_{i,k}g_{i,k|k-1}\check{\hat x}_{i,k}=\sum_iR_ig_{i,k|k-1}\check{e}_{i,k}+\sum_i R_{i,k-1}\check{\hat x}_{i,k-1}$$ and since $\check{\hat x}_{i,k-1}$ is not random (to first order) conditional on phase $k-1$, the estimated variance contribution for phase $k$ comes from just the first term and is $$\widehat{\textrm{var}}_{k|k-1}\left[\hat T_X\right]=\sum_{i,j}\frac{R_{i,K}R_{j,K}}{\pi^\dagger_{ij,k}}g_{i,k|k-1}g_{j,k|k-1}\check{e}_{i,k}\check{e}_{j,k}\check{\Delta}_{ij,k|k-1}$$ with the variances of the other phases being unaffected.
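As a sketch of the remaining ingredient, the residuals $e_i$ come from a weighted projection onto the calibration variables. The version below uses linear (regression) calibration for the projection, which is an assumption for illustration; with raking the $g$-weights come from iterative proportional fitting instead, but the residuals enter the phase-$k$ variance contribution in the same way, with $g_{i,k|k-1}e_i$ taking the place of $x_i$ in the previous sketch.

```python
import numpy as np

def calibration_residuals(x, A, w):
    """Residuals e_i = x_i - hat x_i from the weighted projection of x onto
    the columns of A, with weights w = 1/pi*_{i,k-1} (the phase k-1 weights).

    This is the linear-calibration version of Pi_{A-bar} x; it is a stand-in
    for the raking case, where g differs but e is used in the same way.
    """
    Aw = A * w[:, None]                           # W A, with W = diag(w)
    beta = np.linalg.solve(A.T @ Aw, Aw.T @ x)    # weighted least-squares coefficients
    return x - A @ beta                           # e = x - hat x
```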