Estimation and inference for AR-1 spike problem using an L0 penalty

spike_inference(
  dat,
  decay_rate,
  tuning_parameter,
  window_size,
  sig2 = NULL,
  return_conditioning_sets = FALSE,
  return_ci = FALSE,
  two_sided = FALSE,
  alpha = 0.05
)

Arguments

dat	Numeric vector; observed data.
decay_rate	Numeric; specified AR-1 decay rate $\gamma$, a number between 0 and 1 (non-inclusive).
tuning_parameter	Numeric; tuning parameter $\lambda$ for L0 spike estimation, a non-negative number.
window_size	Numeric; window size for fixed window hypothesis testing, a non-negative integer.
sig2	Numeric; noise variance for the observed data, a non-negative number. If unknown (NULL), sample variance of residuals is used instead.
return_conditioning_sets	Logical; Should the conditioning set S be returned?
return_ci	Logical; if TRUE, the confidence interval for the change in calcium is computed and returned.
two_sided	Logical; if TRUE, a 2-sided p-value is computed and returned.
alpha	Numeric; significance level for the hypothesis test, a number between 0 and 1 (non-inclusive).

Value

Returns a list with elements:

spikes the set of spikes,
pvals p-values associated with each spike,
LCB lower confidence band for each spike,
UCB upper confidence band for each spike.

Details

Consider the AR-1 generative model $$Y_t = c_t + \epsilon_t, \epsilon_t \sim N(0, \sigma^2),$$ where $c_t = \gamma c_{t-1} + z_t$ and $z_t \sim Poisson(poisMean)$. In words, this says between spikes (when $z_t=0$), calcium decays exponentially at a known rate $\gamma\in(0,1)$, which is taken to be known. Further denote the locations of true spikes, $\{t:z_t\geq 0\}$ as $\{0 = \tau_0 < \tau_1 < \ldots < \tau_K < \tau_{K+1} = T\}$.

This function first estimates spikes via L0 penalty based on noisy observations $y_t, t = 1, \ldots, T$ by solving the following optimization problem $$ minimize_{c_1,...,c_T\geq 0} \frac{1}{2} \sum_{t=1}^{T} ( y_t - c_t )^2 + \lambda \sum_{t=2}^{T} 1(c_t \neq \gamma c_t-1).$$ Estimated spikes correspond to the time t such that estimated calcium does not decay exponentially, i.e., $\{\cdots,\hat{\tau}_j,\cdots\} = \{t: \hat{c}_{t+1} -\gamma c_{t} \neq 0\} $.

Now suppose that we want to test whether the calcium is exponentially decaying near an estimated spike $\hat{\tau}_j$; or equivalently, the null hypothesis of the form $H_{0}: \nu^T c = 0$ versus $H_{1}: \nu^T c> 0$ for suitably chosen $\nu$ (see Section 2 in Chen et al. (2021+) for details). This function computes the following p-value $$P(\nu^T Y \geq \nu^T y | \hat{\tau}_j \in M(Y), \nu^T Y> 0, \Pi_\nu^\perp Y = \Pi_\nu^\perp y)$$ where $M(Y)$ is the set of spikes estimated from Y via the L0 method, $\Pi_\nu^\perp$ is the orthogonal projection to the orthogonal complement of $\nu$. In particular, this p-value controls the selective Type I error (see Section 3 in Chen et al. (2021+) for details).

In addition, we implement a $1-\alpha$ confidence interval for the parameter $ \nu^T c$, the increase in calcium associated with an estimated spike $\hat{\tau}_j$ (see Section 4 in Chen et al. (2021+) for details).

References

Chen YT, Jewell SW, Witten DM. (2021+) Quantifying uncertainty in spikes estimated from calcium imaging data. arXiv:2103.0781 [statME].

Jewell, S. W., Hocking, T. D., Fearnhead, P., & Witten, D. M. (2019). Fast nonconvex deconvolution of calcium imaging data. Biostatistics.

Jewell, S., Fearnhead, P., and Witten, D. (2019+). Testing for a change in mean after changepoint detection. Technical report.

Maidstone, R., Hocking, T., Rigaill, G., & Fearnhead, P. (2017). On optimal multiple changepoint algorithms for large data. Statistics and Computing, 27(2), 519-533.

Rigaill, G. (2015). A pruned dynamic programming algorithm to recover the best segmentations with 1 to K_max change-points. Journal de la Societe Francaise de Statistique, 156(4), 180-205.

Examples


gam <- 0.98
LAMBDA <- 0.7
sigma <- 0.3
n_length <- 1000
curr_sim <- simulate_ar1(n = n_length, gam = gam, poisMean = 0.01, sd = sigma, seed = 2)
curr_fit_spike <- spike_estimates(dat = curr_sim$fl, decay_rate = gam, tuning_parameter = LAMBDA)
curr_inference_spike <- spike_inference(dat = curr_sim$fl, decay_rate = gam,
 tuning_parameter = LAMBDA, window_size = 2, sig2 = sigma*sigma, return_ci = TRUE)
### a summary of the inferential results
summary(curr_inference_spike)
#>    estimated_spikes        pvals        LCB       UCB
#> 1                95 0.0026865876  0.3313205 1.5624120
#> 2               237 0.3080037641 -0.9602564 0.9293162
#> 3               369 0.0003211625  0.5423357 1.7345743
#> 4               463 0.0489093674 -0.1322414 1.1940410
#> 5               503 0.0086375856  0.1706250 1.4210234
#> 6               515 0.0007364613  0.4639493 1.6659810
#> 7               538 0.7342421299 -8.5600416 1.0333508
#> 8               561 0.1274106669 -0.5842248 1.2708549
#> 9               571 0.0011170360  0.4546186 1.6866898
#> 10              610 0.0370503683 -0.1092493 1.4964073
#> 11              627 0.0020417663  0.3977844 1.6573791
#> 12              662 0.0046657711  0.2894321 1.5793793
#> 13              781 0.0059471283  0.2203090 1.4564512
#> 14              914 0.0142668035  0.1420499 1.6782515
#> 15              924 0.6120957544 -4.3107861 1.0106799
#> 16              946 0.0011172539  0.4747208 1.7228907

dat	Numeric vector; observed data.
decay_rate	Numeric; specified AR-1 decay rate \(\gamma\), a number between 0 and 1 (non-inclusive).
tuning_parameter	Numeric; tuning parameter \(\lambda\) for L0 spike estimation, a non-negative number.
window_size	Numeric; window size for fixed window hypothesis testing, a non-negative integer.
sig2	Numeric; noise variance for the observed data, a non-negative number. If unknown (NULL), sample variance of residuals is used instead.
return_conditioning_sets	Logical; Should the conditioning set S be returned?
return_ci	Logical; if TRUE, the confidence interval for the change in calcium is computed and returned.
two_sided	Logical; if TRUE, a 2-sided p-value is computed and returned.
alpha	Numeric; significance level for the hypothesis test, a number between 0 and 1 (non-inclusive).