dipy.boots.resampling

dipy.boots.resampling.abc(x, statistic=bs_se, alpha=0.05, eps=1e-05)

Calculates the bootstrap confidence interval by approximating the BCa.

Parameters :

x : np.ndarray

Observed data (e.g. chosen gold standard estimate used for bootstrap)

statistic : method

Method to calculate the desired statistic, given x and a vector of probability proportions (a flat probability-density vector)

alpha : float (0, 1)

Desired significance level for the confidence interval's initial endpoint (Default: 0.05)

eps : float (optional)

Specifies the step size used in calculating the numerical derivatives T' and T''. (Default: 1e-5)

See also

__tt, __tt_dot, __tt_dot_dot, __calc_z0

Notes

Unlike the BCa method of calculating the bootstrap confidence interval, the ABC method is computationally less demanding (requiring roughly 3% of the computation) and is fairly accurate (sometimes outperforming BCa). It requires no bootstrap resampling; instead it uses numerical derivatives via a Taylor series to approximate the BCa calculation. However, the ABC method requires the statistic to be smooth and to follow a multinomial distribution.

References

[2] DiCiccio, T.J., Efron, B., 1996. Bootstrap confidence intervals. Statistical Science 11(3), 189-228.
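
A minimal usage sketch for abc follows. Since the docstring above omits a Returns section, treating the return value as the confidence interval is an assumption, not documented behavior:

    import numpy as np
    from dipy.boots.resampling import abc, bs_se

    rng = np.random.default_rng(0)
    x = rng.normal(size=500)  # observed data (e.g. a gold-standard sample)

    # alpha=0.05 targets the interval endpoints; eps sets the step size for
    # the numerical derivatives. NOTE: interpreting the result as the
    # confidence interval is an assumption.
    ci = abc(x, statistic=bs_se, alpha=0.05, eps=1e-5)
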
dipy.boots.resampling.bootstrap(x, statistic=bs_se, B=1000, alpha=0.95)

Bootstrap resampling [1] to accurately estimate the standard error and confidence interval of a desired statistic of a probability distribution function (pdf).

Parameters :

x : ndarray (N, 1)

Observable sample to resample. N should be reasonably large.

statistic : method (optional)

Method to calculate the desired statistic. (Default: calculate bootstrap standard error)

B : integer (optional)

Total number of bootstrap resamples in bootstrap pdf. (Default: 1000)

alpha : float (optional)

Percentile for the confidence interval of the statistic. (Default: 0.95)

Returns :

bs_pdf : ndarray (M, 1)

Bootstrap probability distribution function of the statistic.

se : float

Standard error of the statistic.

ci : ndarray (2, 1)

Confidence interval of the statistic.

See also

numpy.std, numpy.random.random

Notes

Bootstrap resampling is non-parametric. It is quite powerful for determining the standard error and the confidence interval of a sample distribution. The key characteristics of the bootstrap are:

  1. uniform weighting among all samples (1/n)
  2. resampling with replacement

In general, the sample size should be large to ensure the accuracy of the estimates. The number of bootstrap resamples should be large as well, since it also influences the accuracy of the estimate.
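
The resampling loop the notes describe can be sketched in a few lines of NumPy. This is an illustrative sketch of the general procedure under the documented convention that alpha is the confidence percentile; it is not the module's actual implementation, and all names here are hypothetical:

    import numpy as np

    def bootstrap_sketch(x, statistic=np.std, B=1000, alpha=0.95, seed=None):
        # Draw B resamples of x with replacement, each sample weighted 1/n,
        # and summarize the statistic over the resulting bootstrap pdf.
        rng = np.random.default_rng(seed)
        x = np.asarray(x)
        n = len(x)
        bs_pdf = np.array([statistic(x[rng.integers(0, n, size=n)])
                           for _ in range(B)])
        se = bs_pdf.std(ddof=1)  # bootstrap standard error of the statistic
        # Percentile interval at, e.g., 2.5% and 97.5% for alpha=0.95.
        ci = np.percentile(bs_pdf, [100 * (1 - alpha) / 2,
                                    100 * (1 + alpha) / 2])
        return bs_pdf, se, ci
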

References

[1] Efron, B., 1979. Bootstrap methods: another look at the jackknife (1977 Rietz Lecture). Ann. Stat. 7, 1-26.
dipy.boots.resampling.bs_se(bs_pdf)

Calculates the bootstrap standard error estimate of a statistic.
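
The usual formula for this estimate is the sample standard deviation of the bootstrap pdf. The sketch below illustrates that formula; the actual bs_se implementation may differ in detail:

    import numpy as np

    def bs_se_sketch(bs_pdf):
        # Standard error = sample standard deviation of the bootstrap pdf.
        bs_pdf = np.asarray(bs_pdf, dtype=float)
        return np.sqrt(np.sum((bs_pdf - bs_pdf.mean()) ** 2)
                       / (bs_pdf.size - 1))
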

dipy.boots.resampling.jackknife(pdf, statistic=std, M=None)

Jackknife resampling [R3] to quickly estimate the bias and standard error of a desired statistic in a probability distribution function (pdf).

Parameters :

pdf : ndarray (N, 1)

Probability distribution function to resample. N should be reasonably large.

statistic : method (optional)

Method to calculate the desired statistic. (Default: calculate standard deviation)

M : integer (M < N)

Total number of samples in jackknife pdf. (Default: M == N)

Returns :

jk_pdf : ndarray (M, 1)

Jackknife probability distribution function of the statistic.

bias : float

Bias of the jackknife pdf of the statistic.

se : float

Standard error of the statistic.

See also

numpy.std, numpy.mean, numpy.random.random

Notes

Jackknife resampling, like bootstrap resampling, is non-parametric. However, it requires a large distribution to be accurate, and in some ways it can be considered deterministic: if one removes the same set of samples, one will get the same estimates of the bias and variance.

In the context of this implementation, the sample size should be at least larger than the asymptotic convergence of the statistic (ACstat); preferably larger than ACstat + np.greater(ACbias, ACvar).

The clear benefit of the jackknife is its ability to estimate the bias of the statistic. The most powerful application of this is estimating the bias of a bootstrap-estimated standard error. In fact, one could "bootstrap the bootstrap" (nested bootstrap) of the estimated standard error, but the bootstrap's inaccuracy in characterizing the true mean would incur a poor estimate of the bias (recall: bias = mean[sample_est] - mean[true population]).
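
A compact leave-one-out sketch of the procedure, using the standard jackknife bias and standard-error formulas; it is illustrative only (it ignores the M subsampling option), not DIPY's implementation:

    import numpy as np

    def jackknife_sketch(pdf, statistic=np.std):
        # Leave one sample out at a time and recompute the statistic.
        pdf = np.asarray(pdf, dtype=float)
        n = pdf.size
        theta_full = statistic(pdf)
        jk_pdf = np.array([statistic(np.delete(pdf, i)) for i in range(n)])
        # Standard jackknife estimates of bias and standard error.
        bias = (n - 1) * (jk_pdf.mean() - theta_full)
        se = np.sqrt((n - 1) / n * np.sum((jk_pdf - jk_pdf.mean()) ** 2))
        return jk_pdf, bias, se
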

References

[R3] Efron, B., 1979. Bootstrap methods: another look at the jackknife (1977 Rietz Lecture). Ann. Stat. 7, 1-26.