CJS Coming Attractions


In the third issue of 2018, The Canadian Journal of Statistics presents eight papers covering a number of topics including Bayesian inference, model checking, variable selection, sampling design, prediction, and missing data.

The first article, coauthored by AL-LABADI and EVANS, explores model checking procedures based on the use of the Dirichlet process and relative belief. The authors discuss the unique advantages of using such a combination. In the implementation of the proposed methods, it is important to properly select the hyperparameters for the Dirichlet process. The authors advocate a particular choice for the base distribution that avoids prior-data conflict and double use of the data. Several examples are presented to demonstrate the performance of the approach.

For sparse and high-dimensional data analysis, a valid approximation of the L-zero norm plays a key role. However, there has been little study of the L-zero norm approximation in the Bayesian paradigm. Motivated by this, GOH and DEY introduce a new prior, called the Gaussian and diffused-gamma prior, which leads to a nice L-zero norm approximation under maximum a posteriori estimation. To develop a general likelihood function, the authors utilize a class of divergence measures that can handle various types of data, including count, binary, and continuous data.
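To make the connection concrete, here is a minimal Python sketch (using numpy and scipy) of how maximum a posteriori estimation turns into approximate L-zero penalized regression: maximizing the posterior amounts to minimizing the negative log-likelihood plus the negative log-prior, so any prior whose negative log behaves like a multiple of the L-zero norm yields an approximate L-zero penalty. The sketch uses a generic smooth surrogate for the L-zero norm rather than the authors' Gaussian and diffused-gamma prior, and the simulated data and the tuning constants lam and tau are purely illustrative.

```python
# Generic sketch only -- not the Gaussian and diffused-gamma prior of the paper.
# MAP estimation minimizes -log likelihood(beta) - log prior(beta); a prior whose
# negative log resembles lam * ||beta||_0 gives an approximate L0 penalty.  The
# surrogate sum(beta^2 / (beta^2 + tau)) tends to ||beta||_0 as tau -> 0.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.5, 0, 0, 0, 0, 0, 0, 0, 0])
y = X @ beta_true + rng.normal(size=n)

lam, tau = 2.0, 1e-3  # hypothetical tuning constants

def neg_log_posterior(beta):
    rss = 0.5 * np.sum((y - X @ beta) ** 2)            # Gaussian negative log-likelihood (up to constants)
    l0_surrogate = np.sum(beta**2 / (beta**2 + tau))   # smooth approximation of ||beta||_0
    return rss + lam * l0_surrogate

beta_map = minimize(neg_log_posterior, x0=np.zeros(p), method="BFGS").x
print(np.round(beta_map, 2))  # near-zero entries correspond to excluded variables
```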

The next two papers consider variable selection from different perspectives. There has been limited work on variable selection for recurrent event data, and ZHAO, SUN, LI and SUN propose the broken adaptive ridge regression approach for simultaneous parameter estimation and variable selection. Rather than directly generalizing penalized procedures for linear models, the authors introduce a new penalty function. In addition to establishing the oracle property, they also show that the method has the clustering or grouping effect when the covariates are highly correlated.
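For readers unfamiliar with the broken adaptive ridge idea, the following sketch illustrates its generic form for an ordinary linear model: ridge fits are iteratively reweighted so that coefficients shrinking toward zero receive ever larger penalties and eventually vanish. This is only an illustration of the general device, not the recurrent-event estimator or penalty function developed in the paper, and the simulated data and the constants lam and eps are hypothetical.

```python
# Generic broken-adaptive-ridge-style iteration for a linear model (illustrative only).
import numpy as np

rng = np.random.default_rng(1)
n, p = 200, 8
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0, 0, -2.0, 0, 0, 0.5, 0])
y = X @ beta_true + rng.normal(size=n)

lam, eps = 1.0, 1e-8
beta = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)   # initial ridge fit
for _ in range(50):
    D = np.diag(1.0 / np.maximum(beta**2, eps))              # weights 1 / beta_j^2 grow as beta_j -> 0
    beta = np.linalg.solve(X.T @ X + lam * D, X.T @ y)       # reweighted ridge update
print(np.round(beta, 3))  # small coefficients are driven essentially to zero
```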

Concerning sparse finite mixtures of regression models, KHALILI and VIDYASHANKAR describe hypothesis testing approaches that take into account model selection uncertainty. The methods asymptotically control the family-wise error rate at a prespecified nominal level, while accounting for variable selection uncertainty. The authors provide examples of consistent model selectors and describe methods for improving the finite-sample performance. The performance of the methods is demonstrated via numerical studies.

The fifth and sixth papers examine statistical designs. YI and LI discuss the asymptotic optimality of statistical inference for response-adaptive designs; such designs have ethical advantages over traditional methods for clinical trials. The authors derive an upper bound on the power of statistical tests and show that the Wald statistic is asymptotically optimal in the sense of achieving this bound.

HU considers the construction of robust sampling designs for the estimation of threshold probabilities in spatial studies. In particular, by averaging the mean squared error of the predicted values relative to the true values over all possible covariance structures in a neighbourhood of the experimenter’s nominal choice, the author proposes designs for which the estimation of the threshold probabilities is robust to possibly misspecified regression responses or covariance structures.

The next article is related to small area estimation, which often involves constructing predictions with an estimated model followed by a benchmarking step. In the benchmarking operation, the predictions are modified so that their weighted sums satisfy constraints. The most common constraint requires a weighted sum of the predictions to be equal to the same weighted sum of the original observations. In this setting, BERG and FULLER propose two benchmarking procedures for nonlinear models: a linear additive adjustment and a method based on an augmented model for the expectation function. They also present variance estimators for the benchmarked predictors.
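As a simple illustration of the benchmarking constraint, the following sketch applies one textbook-style linear additive adjustment, in which the gap between the two weighted sums is distributed across the predictions in proportion to the weights. It is not necessarily the adjustment proposed by Berg and Fuller, and the weights, observations, and predictions shown are hypothetical.

```python
# Minimal linear additive benchmarking sketch (generic, illustrative values).
import numpy as np

w = np.array([0.4, 0.3, 0.2, 0.1])     # benchmarking weights (hypothetical)
y = np.array([10.0, 8.0, 5.0, 2.0])    # direct observations
pred = np.array([9.2, 8.5, 4.6, 2.4])  # model-based predictions

gap = w @ y - w @ pred                 # discrepancy between the two weighted sums
pred_bench = pred + gap * w / (w @ w)  # spread the gap in proportion to the weights

print(np.isclose(w @ pred_bench, w @ y))  # True: the constraint now holds exactly
```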

The final paper, coauthored by BINDELE and ADEKPEDJOU, deals with rank-based inference in the presence of missing data. The authors propose a robust and efficient approach for estimating the true regression parameters when some responses are missing not at random. The large-sample properties of the proposed estimator are established under mild regularity conditions. Monte Carlo simulation experiments are carried out to show that the new estimator is more efficient than the least squares estimator when the model error distribution is heavy-tailed or contaminated or when the data contain gross outliers.
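To give a flavour of rank-based regression, here is a generic complete-data sketch of a Jaeckel-type estimator, which minimizes a dispersion of the residuals built from Wilcoxon scores and thereby limits the influence of heavy tails and outliers. It deliberately ignores the missing-not-at-random adjustment that is the paper's actual contribution, and the simulated data are purely illustrative.

```python
# Generic Jaeckel-type rank-based regression sketch (complete data, illustrative only).
import numpy as np
from scipy.optimize import minimize
from scipy.stats import rankdata

rng = np.random.default_rng(2)
n, p = 150, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -0.5, 2.0])
y = X @ beta_true + rng.standard_t(df=2, size=n)     # heavy-tailed errors

def jaeckel_dispersion(beta):
    e = y - X @ beta
    a = np.sqrt(12) * (rankdata(e) / (n + 1) - 0.5)  # Wilcoxon scores based on residual ranks
    return np.sum(a * e)                             # dispersion to be minimized

beta_rank = minimize(jaeckel_dispersion, x0=np.zeros(p), method="Nelder-Mead").x
print(np.round(beta_rank, 2))
```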

Enjoy the new issue!

Grace Y. Yi
CJS Editor

Wednesday, July 25, 2018
