Papers in the Journal of Computational and Graphical Statistics (Page 2)

#21

JOURNAL ARTICLE

Sample-wise Combined Missing Effect Model with Penalization.

Jialu Li, Guan Yu, Qizhai Li, Yufeng Liu

Modern high-dimensional statistical inference often faces the problem of missing data. In recent decades, many studies have focused on this topic and provided strategies including complete-sample analysis and imputation procedures. However, complete-sample analysis discards the information in incomplete samples, while imputation procedures accumulate errors from each single imputation. In this paper, we propose a new method, Sample-wise COmbined missing effect Model with penalization (SCOM), to deal with missing data occurring in predictors...

37274355

2023: Journal of Computational and Graphical Statistics

#22

JOURNAL ARTICLE

Cross-Validated Loss-Based Covariance Matrix Estimator Selection in High Dimensions.

Philippe Boileau, Nima S Hejazi, Mark J van der Laan, Sandrine Dudoit

The covariance matrix plays a fundamental role in many modern exploratory and inferential statistical procedures, including dimensionality reduction, hypothesis testing, and regression. In low-dimensional regimes, where the number of observations far exceeds the number of variables, the optimality of the sample covariance matrix as an estimator of this parameter is well-established. High-dimensional regimes do not admit such a convenience. Thus, a variety of estimators have been derived to overcome the shortcomings of the canonical estimator in such settings...

37273839

2023: Journal of Computational and Graphical Statistics

#23

JOURNAL ARTICLE

Generalized Connectivity Matrix Response Regression with Applications in Brain Connectivity Studies.

Jingfei Zhang, Will Wei Sun, Lexin Li

Multiple-subject network data have been emerging rapidly in recent years, where a separate connectivity matrix is measured over a common set of nodes for each individual subject, along with subject covariate information. In this article, we propose a new generalized matrix response regression model, where the observed network is treated as a matrix-valued response and the subject covariates as predictors. The new model characterizes the population-level connectivity pattern through a low-rank intercept matrix, and the effect of subject covariates through a sparse slope tensor...

36970553

2023: Journal of Computational and Graphical Statistics

#24

JOURNAL ARTICLE

Streamlined Variational Inference for Linear Mixed Models with Crossed Random Effects.

Marianne Menictas, Gioia Di Credico, Matt P Wand

We derive streamlined mean field variational Bayes algorithms for fitting linear mixed models with crossed random effects. In the most general situation, where the dimensions of the crossed groups are arbitrarily large, streamlining is hindered by lack of sparseness in the underlying least squares system. Because of this, we also consider a hierarchy of relaxations of the mean field product restriction. The least stringent product restriction delivers a high degree of inferential accuracy. However, this accuracy must be weighed against its higher storage and computing demands...

36873962

2023: Journal of Computational and Graphical Statistics

#25

JOURNAL ARTICLE

Multiple domain and multiple kernel outcome-weighted learning for estimating individualized treatment regimes.

Shanghong Xie, Thaddeus Tarpey, Eva Petkova, R Todd Ogden

Individualized treatment rules (ITRs) recommend treatments that are tailored specifically according to each patient's own characteristics. It can be challenging to estimate optimal ITRs when there are many features, especially when these features have arisen from multiple data domains (e.g., demographics, clinical measurements, neuroimaging modalities). Considering data from complementary domains and using multiple similarity measures to capture the potential complex relationship between features and treatment can potentially improve the accuracy of assigning treatments...

36970034

2022: Journal of Computational and Graphical Statistics

#26

JOURNAL ARTICLE

A scalable hierarchical lasso for gene-environment interactions.

Natalia Zemlianskaia, W James Gauderman, Juan Pablo Lewinger

We describe a regularized regression model for the selection of gene-environment (G×E) interactions. The model focuses on a single environmental exposure and induces a main-effect-before-interaction hierarchical structure. We propose an efficient fitting algorithm and screening rules that can discard large numbers of irrelevant predictors with high accuracy. We present simulation results showing that the model outperforms existing joint selection methods for G×E interactions in terms of selection performance, scalability and speed, and provide a real data application...

36793591

2022: Journal of Computational and Graphical Statistics

#27

JOURNAL ARTICLE

Latent Network Estimation and Variable Selection for Compositional Data Via Variational EM.

Nathan Osborne, Christine B Peterson, Marina Vannucci

Network estimation and variable selection have been extensively studied in the statistical literature, but only recently have those two challenges been addressed simultaneously. In this article, we seek to develop a novel method to simultaneously estimate network interactions and associations to relevant covariates for count data, and specifically for compositional data, which have a fixed sum constraint. We use a hierarchical Bayesian model with latent layers and employ spike-and-slab priors for both edge and covariate selection...

36776345

2022: Journal of Computational and Graphical Statistics

#28

JOURNAL ARTICLE

A User-Friendly Computational Framework for Robust Structured Regression with the L2 Criterion.

Jocelyn T Chi, Eric C Chi

We introduce a user-friendly computational framework for implementing robust versions of a wide variety of structured regression methods with the L2 criterion. In addition to introducing an algorithm for performing L2E regression, our framework enables robust regression with the L2 criterion for additional structural constraints, works without requiring complex tuning procedures on the precision parameter, can be used to identify heterogeneous subpopulations, and can incorporate readily available non-robust structured regression solvers...

36721836

2022: Journal of Computational and Graphical Statistics

#29

JOURNAL ARTICLE

Variable selection with multiply-imputed datasets: choosing between stacked and grouped methods.

Jiacong Du, Jonathan Boss, Peisong Han, Lauren J Beesley, Michael Kleinsasser, Stephen A Goutman, Stuart Batterman, Eva L Feldman, Bhramar Mukherjee

Penalized regression methods are used in many biomedical applications for variable selection and simultaneous coefficient estimation. However, missing data complicates the implementation of these methods, particularly when missingness is handled using multiple imputation. Applying a variable selection algorithm on each imputed dataset will likely lead to different sets of selected predictors. This paper considers a general class of penalized objective functions which, by construction, force selection of the same variables across imputed datasets...

36644406

2022: Journal of Computational and Graphical Statistics

#30

JOURNAL ARTICLE

Estimation and model selection for nonparametric function-on-function regression.

Zhanfeng Wang, Hao Dong, Ping Ma, Yuedong Wang

Regression models with a functional response and functional covariate have received significant attention recently. While various nonparametric and semiparametric models have been developed, there is an urgent need for model selection and diagnostic methods. In this article, we develop a unified framework for estimation and model selection in nonparametric function-on-function regression. We propose a general nonparametric functional regression model with the model space constructed through smoothing spline analysis of variance (SS ANOVA)...

36594047

2022: Journal of Computational and Graphical Statistics

#31

JOURNAL ARTICLE

Finite-Sample Two-Group Composite Hypothesis Testing via Machine Learning.

Tianyu Zhan, Jian Kang

In the problem of composite hypothesis testing, identifying the potential uniformly most powerful (UMP) unbiased test is of great interest. Beyond typical hypothesis settings with exponential family, it is usually challenging to prove the existence of, and further construct, such UMP unbiased tests with finite sample size. For example, in the COVID-19 pandemic, with limited prior assumptions about the treatment under investigation and the standard of care, adaptive clinical trials are appealing due to ethical considerations and the ability to accommodate uncertainty while conducting the trial...

36506350

2022: Journal of Computational and Graphical Statistics

#32

JOURNAL ARTICLE

Bayesian Distance Weighted Discrimination.

Eric F Lock

Distance weighted discrimination (DWD) is a linear discrimination method that is particularly well-suited for classification tasks with high-dimensional data. The DWD coefficients minimize an intuitive objective function, which can be solved efficiently using state-of-the-art optimization techniques. However, DWD has not yet been cast into a model-based framework for statistical inference. In this article we show that DWD identifies the mode of a proper Bayesian posterior distribution that results from a particular link function for the class probabilities and a shrinkage-inducing proper prior distribution on the coefficients...

36465095

2022: Journal of Computational and Graphical Statistics

#33

JOURNAL ARTICLE

Smoothing splines approximation using Hilbert curve basis selection.

Cheng Meng, Jun Yu, Yongkai Chen, Wenxuan Zhong, Ping Ma

Smoothing splines have been used pervasively in nonparametric regressions. However, the computational burden of smoothing splines is significant when the sample size n is large. When the number of predictors d ≥ 2, the computational cost for smoothing splines is at the order of O(n^3) using the standard approach. Many methods have been developed to approximate smoothing spline estimators by using q basis functions instead of n ones, resulting in a computational cost of the order O(nq^2). These methods are called the basis selection methods...

36407675

2022: Journal of Computational and Graphical Statistics

#34

JOURNAL ARTICLE

AdaptSPEC-X: Covariate-Dependent Spectral Modeling of Multiple Nonstationary Time Series.

Michael Bertolacci, Ori Rosen, Edward Cripps, Sally Cripps

We present the AdaptSPEC-X method for the joint analysis of a panel of possibly nonstationary time series. The approach is Bayesian and uses a covariate-dependent infinite mixture model to incorporate multiple time series, with mixture components parameterized by a time-varying mean and log spectrum. The mixture components are based on AdaptSPEC, a nonparametric model which adaptively divides the time series into an unknown number of segments and estimates the local log spectra by smoothing splines. AdaptSPEC-X extends AdaptSPEC in three ways...

36329784

2022: Journal of Computational and Graphical Statistics

#35

JOURNAL ARTICLE

Adaptive Preferential Sampling in Phylodynamics With an Application to SARS-CoV-2.

Lorenzo Cappello, Julia A Palacios

Longitudinal molecular data of rapidly evolving viruses and pathogens provide information about disease spread and complement traditional surveillance approaches based on case count data. The coalescent is used to model the genealogy that represents the sample ancestral relationships. The basic assumption is that coalescent events occur at a rate inversely proportional to the effective population size N_e(t), a time-varying measure of genetic diversity. When the sampling process (collection of samples over time) depends on N_e(t), the coalescent and the sampling processes can be jointly modeled to improve estimation of N_e(t)...

36035966

2022: Journal of Computational and Graphical Statistics
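
For reference, the rate statement in the Cappello and Palacios abstract above can be written out in standard coalescent notation. This is a sketch under assumed notation, not a formula quoted from the paper: k(t) denotes the number of lineages ancestral to the sample at time t, a symbol the abstract does not define.

```latex
\lambda(t) \;=\; \binom{k(t)}{2}\,\frac{1}{N_e(t)}
```

Under this form, a small N_e(t) gives a high rate, so coalescent events tend to cluster in periods of low effective population size.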

#36

JOURNAL ARTICLE

A single-index model with a surface-link for optimizing individualized dose rules.

Hyung Park, Eva Petkova, Thaddeus Tarpey, R Todd Ogden

This paper focuses on the problem of modeling and estimating interaction effects between covariates and a continuous treatment variable on an outcome, using a single-index regression. The primary motivation is to estimate an optimal individualized dose rule and individualized treatment effects. To model possibly nonlinear interaction effects between patients' covariates and a continuous treatment variable, we employ a two-dimensional penalized spline regression on an index-treatment domain, where the index is defined as a linear projection of the covariates...

35873662

2022: Journal of Computational and Graphical Statistics

#37

JOURNAL ARTICLE

Fast Univariate Inference for Longitudinal Functional Models.

Erjia Cui, Andrew Leroux, Ekaterina Smirnova, Ciprian M Crainiceanu

We propose fast univariate inferential approaches for longitudinal Gaussian and non-Gaussian functional data. The approach consists of three steps: (1) fit massively univariate pointwise mixed effects models; (2) apply any smoother along the functional domain; and (3) obtain joint confidence bands using analytic approaches for Gaussian data or a bootstrap of study participants for non-Gaussian data. Methods are motivated by two applications: (1) Diffusion Tensor Imaging (DTI) measured at multiple visits along the corpus callosum of multiple sclerosis (MS) patients; and (2) physical activity data measured by body-worn accelerometers for multiple days...

35712524

2022: Journal of Computational and Graphical Statistics
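
The first two steps listed in the Cui et al. abstract above (pointwise fits, then smoothing along the functional domain) can be sketched as follows. This is a simplified illustration, not the authors' implementation: ordinary least squares stands in for the pointwise mixed effects models, a moving average stands in for "any smoother", step (3) on confidence bands is omitted, and all function names and the toy data are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x (assumed stand-in for a mixed model fit)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def moving_average(values, half_width=2):
    """Centred moving average; the window is truncated at both ends of the domain."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def pointwise_then_smooth(x, curves, half_width=2):
    """Step 1: one regression per grid point. Step 2: smooth the estimates over the grid."""
    n_grid = len(curves[0])
    raw = [ols_slope(x, [curve[s] for curve in curves]) for s in range(n_grid)]
    return raw, moving_average(raw, half_width)


# Toy data: 50 subjects, each with a curve observed on a 30-point grid.
rng = random.Random(0)
n_subjects, n_grid = 50, 30
x = [rng.gauss(0, 1) for _ in range(n_subjects)]
true_beta = [s / (n_grid - 1) for s in range(n_grid)]  # coefficient rises from 0 to 1
curves = [
    [true_beta[s] * xi + rng.gauss(0, 0.5) for s in range(n_grid)]
    for xi in x
]
raw_beta, smooth_beta = pointwise_then_smooth(x, curves)
```

Fitting each grid point separately keeps every model small and independent of the others; the smoothing pass then borrows strength across neighbouring grid points.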

#38

JOURNAL ARTICLE

The Chi-Square Test of Distance Correlation.

Cencheng Shen, Sambit Panda, Joshua T Vogelstein

Distance correlation has gained much recent attention in the data science community: the sample statistic is straightforward to compute and asymptotically equals zero if and only if the variables are independent, making it an ideal choice to discover any type of dependency structure given sufficient sample size. One major bottleneck is the testing process: because the null distribution of distance correlation depends on the underlying random variables and metric choice, it typically requires a permutation test to estimate the null and compute the p-value, which is very costly for large amounts of data...

35707063

2022: Journal of Computational and Graphical Statistics
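
To make the Shen et al. abstract above concrete, here is a minimal sketch of the sample distance correlation and the permutation test it describes as the costly baseline. It assumes one-dimensional samples, the Euclidean metric, and the biased (V-statistic) form of the estimator; the function names are hypothetical, and this is not the chi-square test proposed in the paper.

```python
import math
import random


def _centered_distances(v):
    """Pairwise |v_i - v_j| with row means, column means and grand mean removed."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(row) / n for row in d]
    grand_mean = sum(row_mean) / n
    # d is symmetric, so column means equal row means.
    return [
        [d[i][j] - row_mean[i] - row_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def _mean_product(a, b):
    """Average of the elementwise product of two n x n matrices."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n**2


def distance_correlation(x, y):
    """Biased sample distance correlation of two equally long 1-D samples."""
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = _mean_product(a, b)
    denom = math.sqrt(_mean_product(a, a) * _mean_product(b, b))
    if denom == 0.0:
        return 0.0  # one sample is constant
    return math.sqrt(max(dcov2, 0.0) / denom)


def permutation_p_value(x, y, n_perm=200, seed=0):
    """Permutation p-value: shuffle y to break any dependence on x."""
    rng = random.Random(seed)
    observed = distance_correlation(x, y)
    y_perm = list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if distance_correlation(x, y_perm) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Each call to `distance_correlation` costs O(n^2) time and memory, and the permutation test repeats it `n_perm` times, which is the expense the abstract refers to.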

#39

JOURNAL ARTICLE

Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA).

H Robert Frost

We present a novel technique for sparse principal component analysis. This method, named Eigenvectors from Eigenvalues Sparse Principal Component Analysis (EESPCA), is based on the formula for computing squared eigenvector loadings of a Hermitian matrix from the eigenvalues of the full matrix and associated sub-matrices. We explore two versions of the EESPCA method: a version that uses a fixed threshold for inducing sparsity and a version that selects the threshold via cross-validation. Relative to the state-of-the-art sparse PCA methods of Witten et al...

35693984

2022: Journal of Computational and Graphical Statistics
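
The formula mentioned in the EESPCA abstract above is presumably the eigenvector-eigenvalue identity. As a sketch, with notation assumed here rather than taken from the paper: let A be an n × n Hermitian matrix with eigenvalues λ_1(A), ..., λ_n(A) and unit eigenvectors v_1, ..., v_n, let v_{i,j} be the j-th component of v_i, and let M_j be the (n−1) × (n−1) sub-matrix obtained by deleting row j and column j of A. Then

```latex
|v_{i,j}|^2 \prod_{k=1,\, k \neq i}^{n} \bigl(\lambda_i(A) - \lambda_k(A)\bigr)
  \;=\; \prod_{k=1}^{n-1} \bigl(\lambda_i(A) - \lambda_k(M_j)\bigr)
```

When λ_i(A) is a simple eigenvalue, dividing both sides by the product on the left gives the squared loading |v_{i,j}|^2 from eigenvalues alone.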

#40

JOURNAL ARTICLE

Interval censored recursive forests.

Hunyong Cho, Nicholas P Jewell, Michael R Kosorok

We propose interval censored recursive forests (ICRF), an iterative tree ensemble method for interval censored survival data. This nonparametric regression estimator addresses the splitting bias problem of existing tree-based methods and iteratively updates survival estimates in a self-consistent manner. Consistent splitting rules are developed for interval censored data, convergence is monitored using out-of-bag samples, and kernel-smoothing is applied. The ICRF is uniformly consistent and displays high prediction accuracy in both simulations and applications to avalanche and national mortality data...

35685204

2022: Journal of Computational and Graphical Statistics
