CJS editor's corner


The 50th anniversary jubilee of The Canadian Journal of Statistics ended with a wonderful free-to-read special issue edited by Bruce Smith, Wendy Lou, Grace Yi, and Bruno Rémillard that celebrates the work of prominent Canadian statisticians.

In 2023, I hope to keep my New Year’s resolution to revive the tradition of offering you a digest of upcoming issues of CJS in Liaison. The March issue is being prepared as I write. It will feature 17 research articles, all of which are already available online in Early View.

The issue opens with a series of papers concerned with various problems related to regression models. To test hypotheses about the coefficients in high-dimensional partially linear models, Zhao, Lin, and Zhang [1] explicitly avoid estimation of the coefficients and use U-statistics methodology. This makes their procedure applicable in both sparse and non-sparse models. Variable selection is another classical problem in regression, which Miyawaki and MacEachern [2] examine from a decision-theoretic perspective by adding cost to predictors. They consider two Bayesian approaches, addressing both model and parameter uncertainty.

To contend with heavy-tailed errors and outliers in the explanatory variables, Cao, Kang, and Wang [3] propose weighted composite quantile regression with weights based on principal components. Their estimation procedure uses a SCAD-L2 penalty and enjoys the oracle property even if the error variance is infinite. Yet another issue that abounds in regression models is heteroscedastic errors. Burak and Kashlak [4] propose the so-called analytic wild bootstrap to construct confidence regions for the regression parameters in such settings; their procedure achieves coverage similar to that of the wild bootstrap while being computationally much more efficient. Zhang, Zhang, and Ma [9] tackle Poisson count regression when covariates are measured with error and several instrumental variables are available. Their approach uses model averaging to take various potential instruments into account, and yields averaging weights that minimize the asymptotic prediction risk.

Several articles make contributions to the analysis of complex data. Making connections between nonparametric regression and graphon estimation leads Madrid-Padilla and Chen [5] to a novel estimation procedure for binary networks, where the presence or absence of an edge is a Bernoulli variable with success probability determined by a graphon. Motivated by streaming health data that may arrive at a fast rate and in large volumes, Luo and Song [6] develop a new Kalman filter and an online estimation tool for linear state-space mixed models that can account for heterogeneity between different data batches. The challenging modelling of longitudinal data is often accomplished using continuous-time hidden Markov models that represent the latent trajectories behind observed data. Luo, Stephens, and Buckeridge [7] propose a model-based Bayesian methodology to cluster these latent trajectories, without knowing the number of clusters a priori. Clustering is also the subject of the article by Hu, Yang, Xue, and Dey [8], albeit in the entirely different context of sports analytics. They propose a Bayesian zero-inflated Poisson regression with clustered coefficients to elucidate different shooting habits of basketball players.

Three articles address issues arising in clinical trials. Assuming a generalized linear model framework, Gavanji, Jiang, and Chen [10] use a penalized likelihood approach to test for the presence of a biomarker cut point that would divide participants in a clinical trial into two groups depending on how well they respond to treatment. Feng, Prasangika, and Zuo [11] consider an additive hazards model with time-varying coefficients for multivariate current-status data with informative censoring, and develop inference based on local linear and partial likelihood techniques. Sun, Heng, Lee, and Gilbert [12] model conditional cumulative incidence functions of HIV-1 infections when covariates are missing, treating different infection types as competing risks. To this end, they develop estimation and inference in generalized semiparametric regression models based on a doubly robust augmented inverse probability weighted complete-case approach.

The March issue also features contributions to experimental design and survey sampling. Abousaleh and Zhou [13] provide an alternative approach to constructing minimax optimal designs that are robust to error variance misspecification in regression models with heteroscedastic errors. They also provide an algorithm to find such designs in discrete design spaces. Krieger, Azriel, and Kapelner [14] present a new experimental design to divide participants in a clinical trial into two groups to minimize error in treatment effect estimation. Their design addresses both robustness to misspecification in response models and large covariate imbalance.

Uniform designs are widely used space-filling designs, e.g., in computer and physical experiments to investigate complex systems. Liu, Wang, and Sun [15] study the uniform projection criterion with the aim of minimizing the average discrepancy over all two-dimensional projections of a design. Survey data integration is paramount to improved policy making. Erciulescu, Opsomer, and Schneider [16] propose a multilevel hierarchical Bayes model to combine data from two surveys, where estimation of two dependent variables is of interest at granular levels but where only the much smaller of the two surveys collects data on both variables.

The issue closes with a theoretical paper by Wu, Yu, Yang, Ding, and Wang [17]. These authors show that, for certain functions and under certain regularity conditions, the expected value of a function of a weighted sum of weakly dependent random variables can be approximated by the same function evaluated at the expected value of that sum.
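
As a rough illustration of the kind of statement involved, and not the authors' actual result or conditions, the approximation can be written out in symbols. The notation below (weights w_{ni}, variables X_i, function f, mean \mu_n) is assumed here for the sketch and does not come from the paper. A heuristic second-order Taylor expansion, assuming f is twice differentiable and the sum has finite variance, suggests why the approximation is plausible when the variance of the weighted sum is small:

```latex
\begin{align*}
S_n &= \sum_{i=1}^{n} w_{ni} X_i, \qquad \mu_n = \mathbb{E}[S_n], \\
\mathbb{E}\bigl[f(S_n)\bigr]
  &\approx f(\mu_n) + \tfrac{1}{2}\, f''(\mu_n)\, \operatorname{Var}(S_n), \\
\mathbb{E}\bigl[f(S_n)\bigr]
  &\approx f(\mu_n) \quad \text{when } \operatorname{Var}(S_n) \text{ is small.}
\end{align*}
```

The first-order term vanishes because \mathbb{E}[S_n - \mu_n] = 0, so the leading error is governed by the variance of S_n; controlling that variance under weak dependence is where regularity conditions would typically enter.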

Wishing you inspiring reading,

Johanna G. Nešlehová

Editor in Chief, The Canadian Journal of Statistics

Table of Contents of the March 2023 Issue of The Canadian Journal of Statistics

  1. A new test for high-dimensional regression coefficients in partially linear models by Fanrong Zhao, Nan Lin, and Baoxue Zhang
  2. Economic variable selection by Koji Miyawaki and Steven N. MacEachern
  3. Doubly robust weighted composite quantile regression based on SCAD-L2 by Zhimiao Cao, Xiaoning Kang, and Mingqiu Wang
  4. Nonparametric confidence regions via the analytic wild bootstrap by Katherine L. Burak and Adam B. Kashlak
  5. Graphon estimation via nearest-neighbour algorithm and two-dimensional fused-lasso denoising by Oscar Hernan Madrid-Padilla and Yanzhen Chen
  6. Multivariate online regression analysis with heterogeneous streaming data by Lan Luo and Peter X.-K. Song
  7. Bayesian clustering for continuous-time hidden Markov models by Yu Luo, David A. Stephens, and David L. Buckeridge
  8. Zero-inflated Poisson model with clustered regression coefficients: Application to heterogeneity learning of field goal attempts of professional basketball players by Guanyu Hu, Hou-Cheng Yang, Yishu Xue, and Dipak K. Dey
  9. A model-averaging treatment of multiple instruments in Poisson models with errors by Xiaomeng Zhang, Xinyu Zhang, and Yanyuan Ma
  10. Penalized likelihood ratio test for a biomarker threshold effect in clinical trials based on generalized linear models by Parisa Gavanji, Wenyu Jiang, and Bingshu E. Chen
  11. Regression analysis of multivariate current status data under a varying coefficients additive hazards frailty model by Yanqin Feng, K. D. Prasangika, and Guoxin Zuo
  12. Estimation of conditional cumulative incidence functions under generalized semiparametric regression models with missing covariates, with application to analysis of biomarker correlates in vaccine trials by Yanqing Sun, Fei Heng, Unkyung Lee, and Peter B. Gilbert
  13. Minimax A-, c-, and I-optimal regression designs for models with heteroscedastic errors by Hanan Abousaleh and Julie Zhou
  14. Better experimental design by hybridizing binary matching with imbalance optimization by Abba M. Krieger, David A. Azriel, and Adam Kapelner
  15. Two-dimensional projection uniformity for space-filling designs by Sixu Liu, Yaping Wang, and Fasheng Sun
  16. Statistical data integration using multilevel models to predict employee compensation by Andreea L. Erciulescu, Jean D. Opsomer, and Benjamin J. Schneider
  17. On asymptotic approximation of ratio models for weakly dependent sequences by Yi Wu, Wei Yu, Wenzhi Yang, Saisai Ding, and Xuejun Wang
Sunday, January 29, 2023
