# Coming Attractions of The Canadian Journal of Statistics: 2017 Issue 4

In the final issue of 2017, *The Canadian Journal of Statistics* presents eight papers covering topics on time series, nonparametric regression models, two-stage cluster sampling, current status data, and missing observations.

Gaussian mixtures of autoregressive models are useful for explaining the heterogeneous behaviour of time series. One important task is to infer the number of autoregressive regimes and the autoregressive orders. Information-theoretic criteria such as AIC or BIC are commonly used for such inference, and they typically evaluate each regime/autoregressive-order combination separately in order to choose an optimal model. However, the number of combinations can be so large that such an approach becomes computationally infeasible. To handle this issue, KHALILI, CHEN and STEPHENS develop a computationally efficient regularization method for simultaneous autoregressive-order and parameter estimation when the number of autoregressive regimes is predetermined. They propose a regularized Bayesian information criterion (RBIC) to select the number of regimes.
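To give a rough sense of how quickly an exhaustive search grows, the following minimal Python sketch counts the regime/order combinations that a criterion such as AIC or BIC would have to evaluate one by one. The function name `count_candidate_models` and the counting convention are assumptions made for illustration only (each regime may take any order from 0 to `max_order`, and regimes are treated as labelled, so the count is an upper bound); this is not the authors' method.

```python
def count_candidate_models(max_regimes: int, max_order: int) -> int:
    """Count regime/order combinations in an exhaustive search.

    Illustrative assumption: each of K regimes may take any autoregressive
    order in 0..max_order, and regimes are treated as labelled, so this
    over-counts models that differ only by a relabelling of regimes.
    """
    orders_per_regime = max_order + 1
    # For K regimes there are orders_per_regime ** K order assignments;
    # sum over K = 1, ..., max_regimes.
    return sum(orders_per_regime ** k for k in range(1, max_regimes + 1))


if __name__ == "__main__":
    for k_max, q in [(2, 5), (3, 10), (5, 10)]:
        print(f"up to {k_max} regimes, orders 0..{q}: "
              f"{count_candidate_models(k_max, q)} candidate models")
```

Under these assumptions, allowing up to five regimes with orders up to 10 already yields 177155 candidate models, each of which would require its own fit.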

The second paper concerns the analysis of time series with limited or censored data. Practitioners commonly disregard the censored observations or replace them with some function of the limit of detection. However, this treatment usually results in biased estimates. Motivated by this, SCHUMACHER, LACHOS and DEY propose an analytically tractable and efficient stochastic approximation of the EM algorithm to obtain the maximum likelihood estimates of the parameters of censored regression models with autoregressive errors. The authors also develop an R package, “ARCensReg,” for the implementation of their method.
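The bias caused by the two ad hoc treatments can be seen in a small simulation. The sketch below is only an illustration under assumed settings (an AR(1) series, left-censoring at a hypothetical limit of detection `lod`, and the sample mean as the quantity of interest); the function names are invented for this example, and it does not implement the authors' stochastic approximation EM algorithm or use the ARCensReg package.

```python
import random
from statistics import mean


def simulate_ar1(n, phi, mu, sigma, seed):
    """Simulate an AR(1) series around mean mu (illustrative settings)."""
    rng = random.Random(seed)
    y = [mu]
    for _ in range(n - 1):
        y.append(mu + phi * (y[-1] - mu) + rng.gauss(0.0, sigma))
    return y


def censoring_comparison(n=2000, phi=0.5, mu=0.0, sigma=1.0, lod=-0.5, seed=1):
    """Compare the sample mean under two ad hoc treatments of left-censoring.

    Values below the (hypothetical) limit of detection `lod` are treated
    as censored.
    """
    y = simulate_ar1(n, phi, mu, sigma, seed)
    observed = [v for v in y if v >= lod]      # censored values discarded
    substituted = [max(v, lod) for v in y]     # censored values set to lod
    return {
        "full": mean(y),                       # uses the uncensored truth
        "dropped": mean(observed),
        "substituted": mean(substituted),
        "n_censored": n - len(observed),
    }


if __name__ == "__main__":
    for name, value in censoring_comparison().items():
        print(name, value)
```

With left-censoring, both treatments push the sample mean above the value computed from the complete series, because every discarded or replaced observation lies below the limit of detection.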

The next two papers deal with incomplete data that have missing observations or are interval-censored. MORIKAWA, KIM and KANO consider a problem with data that are missing not at random (MNAR). Handling MNAR data often requires two types of modelling, one for the outcome and the other for the response propensity; correctly specifying these two models is often difficult. Consequently, the authors propose a semiparametric maximum likelihood method for analyzing MNAR data where a parametric model is used for the response propensity and a nonparametric model is used for the outcome part. The resulting analysis is more robust than the fully parametric approach.

Focusing on bivariate current status or case I interval-censored failure time data, HU, ZHOU and SUN propose a sieve maximum likelihood estimation approach under copula models where marginal proportional hazards models are assumed. The proposed method leaves the underlying copula model completely unspecified and can be easily implemented. The authors establish that the proposed estimators are strongly consistent and asymptotically normal.

The next two articles consider problems of nonparametric regression. ZAMBOM and KIM develop a new method to test for heteroscedasticity in a multiple nonparametric regression model. The test statistic is based on a high-dimensional one-way ANOVA constructed with the absolute values of the residuals, and its asymptotic distribution is derived under the null hypothesis of homoscedasticity and under local alternatives. The properties of the proposed test statistic are preserved when a correctly specified parametric mean function is used to obtain the residuals.

To analyze multi-curve data, DE SOUZA, HECKMAN and XU examine a switching nonparametric regression model where each curve is driven by a latent state process. The state at any particular point determines a smooth function, forcing the individual curve to “switch” from one function to another. They develop an EM algorithm to estimate the model parameters and also obtain standard errors for the parameter estimates of the state process. They focus on three types of hidden states: those that are independent and identically distributed, those that follow a Markov structure, and those that are independent but with distribution depending on some covariates.

In the seventh paper, HOLMQUIST and GUSTAFSSON present a likelihood-based test for clustering among subpopulation mean directions for circular data. The test is based on a two-level hierarchical model with von Mises distributed variation on each level. The properties of the test are investigated and compared to the commonly applied techniques of second-order analysis and pseudo-pooling of directions.

In the final article, KIM, PARK and LEE consider informative cluster sampling with generalized linear mixed models. When a sample is obtained from a two-stage cluster sampling scheme with unequal selection probabilities, the sample distribution can differ from that of the population and the sampling design can be informative. The authors propose an estimation approach for the model parameters using the EM algorithm. To avoid using the intractable sample likelihood function, they use a normal approximation for the sampling distribution of the profile pseudo maximum likelihood estimator for the random effects in the level-one model.

Enjoy the new issue!

Grace Y. Yi

*CJS* Editor