spike_inference.Rd
Estimation and inference for AR-1 spike problem using an L0 penalty
spike_inference( dat, decay_rate, tuning_parameter, window_size, sig2 = NULL, return_conditioning_sets = FALSE, return_ci = FALSE, two_sided = FALSE, alpha = 0.05 )
dat | Numeric vector; observed data. |
---|---|
decay_rate | Numeric; specified AR-1 decay rate \(\gamma\), a number between 0 and 1 (non-inclusive). |
tuning_parameter | Numeric; tuning parameter \(\lambda\) for L0 spike estimation, a non-negative number. |
window_size | Numeric; window size for fixed window hypothesis testing, a non-negative integer. |
sig2 | Numeric; noise variance for the observed data, a non-negative number. If unknown (NULL), sample variance of residuals is used instead. |
return_conditioning_sets | Logical; Should the conditioning set S be returned? |
return_ci | Logical; if TRUE, the confidence interval for the change in calcium is computed and returned. |
two_sided | Logical; if TRUE, a 2-sided p-value is computed and returned. |
alpha | Numeric; significance level for the hypothesis test, a number between 0 and 1 (non-inclusive). |
Returns a list with elements:
spikes
the set of spikes,
pvals
p-values associated with each spike,
LCB
lower confidence band for each spike,
UCB
upper confidence band for each spike.
Consider the AR-1 generative model $$Y_t = c_t + \epsilon_t, \epsilon_t \sim N(0, \sigma^2),$$ where \(c_t = \gamma c_{t-1} + z_t\) and \(z_t \sim Poisson(poisMean)\). In words, this says between spikes (when \(z_t=0\)), calcium decays exponentially at a known rate \(\gamma\in(0,1)\), which is taken to be known. Further denote the locations of true spikes, \(\{t:z_t\geq 0\}\) as \(\{0 = \tau_0 < \tau_1 < \ldots < \tau_K < \tau_{K+1} = T\}\).
This function first estimates spikes via L0 penalty based on noisy observations \(y_t, t = 1, \ldots, T\) by solving the following optimization problem $$ minimize_{c_1,...,c_T\geq 0} \frac{1}{2} \sum_{t=1}^{T} ( y_t - c_t )^2 + \lambda \sum_{t=2}^{T} 1(c_t \neq \gamma c_t-1).$$ Estimated spikes correspond to the time t such that estimated calcium does not decay exponentially, i.e., \(\{\cdots,\hat{\tau}_j,\cdots\} = \{t: \hat{c}_{t+1} -\gamma c_{t} \neq 0\} \).
Now suppose that we want to test whether the calcium is exponentially decaying near an estimated spike \(\hat{\tau}_j\); or equivalently, the null hypothesis of the form \(H_{0}: \nu^T c = 0\) versus \(H_{1}: \nu^T c> 0\) for suitably chosen \(\nu\) (see Section 2 in Chen et al. (2021+) for details). This function computes the following p-value $$P(\nu^T Y \geq \nu^T y | \hat{\tau}_j \in M(Y), \nu^T Y> 0, \Pi_\nu^\perp Y = \Pi_\nu^\perp y)$$ where \(M(Y)\) is the set of spikes estimated from Y via the L0 method, \(\Pi_\nu^\perp\) is the orthogonal projection to the orthogonal complement of \(\nu\). In particular, this p-value controls the selective Type I error (see Section 3 in Chen et al. (2021+) for details).
In addition, we implement a \(1-\alpha\) confidence interval for the parameter \( \nu^T c\), the increase in calcium associated with an estimated spike \(\hat{\tau}_j\) (see Section 4 in Chen et al. (2021+) for details).
Chen YT, Jewell SW, Witten DM. (2021+) Quantifying uncertainty in spikes estimated from calcium imaging data. arXiv:2103.0781 [statME].
Jewell, S. W., Hocking, T. D., Fearnhead, P., & Witten, D. M. (2019). Fast nonconvex deconvolution of calcium imaging data. Biostatistics.
Jewell, S., Fearnhead, P., and Witten, D. (2019+). Testing for a change in mean after changepoint detection. Technical report.
Maidstone, R., Hocking, T., Rigaill, G., & Fearnhead, P. (2017). On optimal multiple changepoint algorithms for large data. Statistics and Computing, 27(2), 519-533.
Rigaill, G. (2015). A pruned dynamic programming algorithm to recover the best segmentations with 1 to K_max change-points. Journal de la Societe Francaise de Statistique, 156(4), 180-205.
gam <- 0.98 LAMBDA <- 0.7 sigma <- 0.3 n_length <- 1000 curr_sim <- simulate_ar1(n = n_length, gam = gam, poisMean = 0.01, sd = sigma, seed = 2) curr_fit_spike <- spike_estimates(dat = curr_sim$fl, decay_rate = gam, tuning_parameter = LAMBDA) curr_inference_spike <- spike_inference(dat = curr_sim$fl, decay_rate = gam, tuning_parameter = LAMBDA, window_size = 2, sig2 = sigma*sigma, return_ci = TRUE) ### a summary of the inferential results summary(curr_inference_spike)#> estimated_spikes pvals LCB UCB #> 1 95 0.0026865876 0.3313205 1.5624120 #> 2 237 0.3080037641 -0.9602564 0.9293162 #> 3 369 0.0003211625 0.5423357 1.7345743 #> 4 463 0.0489093674 -0.1322414 1.1940410 #> 5 503 0.0086375856 0.1706250 1.4210234 #> 6 515 0.0007364613 0.4639493 1.6659810 #> 7 538 0.7342421299 -8.5600416 1.0333508 #> 8 561 0.1274106669 -0.5842248 1.2708549 #> 9 571 0.0011170360 0.4546186 1.6866898 #> 10 610 0.0370503683 -0.1092493 1.4964073 #> 11 627 0.0020417663 0.3977844 1.6573791 #> 12 662 0.0046657711 0.2894321 1.5793793 #> 13 781 0.0059471283 0.2203090 1.4564512 #> 14 914 0.0142668035 0.1420499 1.6782515 #> 15 924 0.6120957544 -4.3107861 1.0106799 #> 16 946 0.0011172539 0.4747208 1.7228907