Most recent papers in the Journal of Computational and Graphical Statistics

#1

JOURNAL ARTICLE

Improving and Extending STERGM Approximations Based on Cross-Sectional Data and Tie Durations.

Chad Klumb, Martina Morris, Steven M Goodreau, Samuel M Jenness

Temporal exponential-family random graph models (TERGMs) are a flexible class of models for network ties that change over time. Separable TERGMs (STERGMs) are a subclass of TERGMs in which the dynamics of tie formation and dissolution can be separated within each discrete time step and may depend on different factors. The Carnegie et al. (2015) approximation improves estimation efficiency for a subclass of STERGMs, allowing them to be reliably estimated from inexpensive cross-sectional study designs. This approximation adapts to cross-sectional data by attempting to construct a STERGM with two specific properties: a cross-sectional equilibrium distribution defined by an exponential-family random graph model (ERGM) for the network structure, and geometric tie duration distributions defined by constant hazards for tie dissolution...

38455738

2024: Journal of Computational and Graphical Statistics

#2

JOURNAL ARTICLE

More Powerful Selective Inference for the Graph Fused Lasso.

Yiqun Chen, Sean Jewell, Daniela Witten

The graph fused lasso, which includes as a special case the one-dimensional fused lasso, is widely used to reconstruct signals that are piecewise constant on a graph, meaning that nodes connected by an edge tend to have identical values. We consider testing for a difference in the means of two connected components estimated using the graph fused lasso. A naive procedure such as a z-test for a difference in means will not control the selective Type I error, since the hypothesis that we are testing is itself a function of the data...

38250478

2023: Journal of Computational and Graphical Statistics

#3

JOURNAL ARTICLE

A Stochastic Approximation-Langevinized Ensemble Kalman Filter Algorithm for State Space Models with Unknown Parameters.

Tianning Dong, Peiyi Zhang, Faming Liang

Inference for high-dimensional, large-scale, and long-series dynamic systems is a challenging task in modern data science. Existing algorithms, such as the particle filter or sequential importance sampler, do not scale well to the dimension of the system and the sample size of the dataset, and often suffer from the sample degeneracy issue for long series data. The recently proposed Langevinized ensemble Kalman filter (LEnKF) addresses these difficulties in a coherent way. However, it cannot be applied when the dynamic system contains unknown parameters...

38240013

2023: Journal of Computational and Graphical Statistics

#4

JOURNAL ARTICLE

A quantum parallel Markov chain Monte Carlo.

Andrew J Holbrook

We propose a novel hybrid quantum computing strategy for parallel MCMC algorithms that generate multiple proposals at each step. This strategy makes the rate-limiting step within parallel MCMC amenable to quantum parallelization by using the Gumbel-max trick to turn the generalized accept-reject step into a discrete optimization problem. When combined with new insights from the parallel MCMC literature, such an approach allows us to embed target density evaluations within a well-known extension of Grover's quantum search algorithm...

38127472

2023: Journal of Computational and Graphical Statistics
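The Gumbel-max trick named in the abstract above can be illustrated classically. The sketch below is not the paper's quantum algorithm. It only shows the general identity that adding independent standard Gumbel noise to log-weights and taking the argmax yields a draw from the corresponding categorical distribution, which is what lets a sampling step be recast as a discrete optimization. The function names `gumbel_max_sample` and `empirical_frequencies` are hypothetical and chosen for illustration.

```python
import numpy as np


def gumbel_max_sample(log_weights, rng):
    """Draw one index with probability proportional to exp(log_weights).

    Illustrative helper (hypothetical name): perturbs each log-weight with
    independent standard Gumbel noise and returns the argmax.
    """
    log_weights = np.asarray(log_weights, dtype=float)
    gumbel_noise = rng.gumbel(loc=0.0, scale=1.0, size=log_weights.shape)
    return int(np.argmax(log_weights + gumbel_noise))


def empirical_frequencies(log_weights, n_draws, seed=0):
    """Estimate selection frequencies by repeated Gumbel-max draws."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(len(log_weights))
    for _ in range(n_draws):
        counts[gumbel_max_sample(log_weights, rng)] += 1
    return counts / n_draws
```

With unnormalized log-weights `[0.0, log 3]`, the empirical frequencies should approach 0.25 and 0.75 as the number of draws grows.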

#5

JOURNAL ARTICLE

Fast Marginal Likelihood Estimation of Penalties for Group-Adaptive Elastic Net.

Mirrelijn M van Nee, Tim van de Brug, Mark A van de Wiel

Elastic net penalization is widely used in high-dimensional prediction and variable selection settings. Auxiliary information on the variables, for example, groups of variables, is often available. Group-adaptive elastic net penalization exploits this information to potentially improve performance by estimating group penalties, thereby penalizing important groups of variables less than other groups. Estimating these group penalties is, however, hard due to the high dimension of the data. Existing methods are computationally expensive or not generic in the type of response...

38013849

2023: Journal of Computational and Graphical Statistics

#6

JOURNAL ARTICLE

A General Framework for Identifying Hierarchical Interactions and Its Application to Genomics Data.

Zhang Xiao, Shi Xingjie, Liu Yiming, Liu Xu, Shuangge Ma

The analysis of hierarchical interactions has long been a challenging problem due to the large number of candidate main effects and interaction effects, and the need for accommodating the "main effects, interactions" hierarchy. The two-stage analysis methods enjoy simplicity and low computational cost, but contradict the fact that the outcome of interest is attributable to the joint effects of multiple main factors and their interactions. The existing joint analysis methods can accurately describe the underlying data generating process, but suffer from prohibitively high computational cost...

38009111

2023: Journal of Computational and Graphical Statistics

#7

JOURNAL ARTICLE

Transfer learning of individualized treatment rules from experimental to real-world data.

Lili Wu, Shu Yang

Individualized treatment effect lies at the heart of precision medicine. Interpretable individualized treatment rules (ITRs) are desirable for clinicians or policymakers due to their intuitive appeal and transparency. The gold-standard approach to estimating the ITRs is randomized experiments, where subjects are randomized to different treatment groups and the confounding bias is minimized to the extent possible. However, experimental studies are limited in external validity because of their selection restrictions, and therefore the underlying study population is not representative of the target real-world population...

37997592

2023: Journal of Computational and Graphical Statistics

#8

JOURNAL ARTICLE

Cost-based feature selection for network model choice.

Louis Raynal, Till Hoffmann, Jukka-Pekka Onnela

Selecting a small set of informative features from a large number of possibly noisy candidates is a challenging problem with many applications in machine learning and approximate Bayesian computation. In practice, the cost of computing informative features also needs to be considered. This is particularly important for networks because the computational costs of individual features can span several orders of magnitude. We addressed this issue for the network model selection problem using two approaches. First, we adapted nine feature selection methods to account for the cost of features...

37982131

2023: Journal of Computational and Graphical Statistics

#9

JOURNAL ARTICLE

A Dirichlet model of alignment cost in mixed-membership unsupervised clustering.

Xiran Liu, Naama M Kopelman, Noah A Rosenberg

Mixed-membership unsupervised clustering is widely used to extract informative patterns from data in many application areas. For a shared data set, the stochasticity and unsupervised nature of clustering algorithms can cause difficulties in comparing clustering results produced by different algorithms, or even multiple runs of the same algorithm, as outcomes can differ owing to permutation of the cluster labels or genuine differences in clustering results. Here, with a focus on inference of individual genetic ancestry in population-genetic studies, we study the cost of misalignment of mixed-membership unsupervised clustering replicates under a theoretical model of cluster memberships...

37982130

2023: Journal of Computational and Graphical Statistics

#10

JOURNAL ARTICLE

Algorithms for Sparse Support Vector Machines.

Alfonso Landeros, Kenneth Lange

Many problems in classification involve huge numbers of irrelevant features. Variable selection reveals the crucial features, reduces the dimensionality of feature space, and improves model interpretation. In the support vector machine literature, variable selection is achieved by $\ell_1$ penalties. These convex relaxations seriously bias parameter estimates toward 0 and tend to admit too many irrelevant features...

37982129

2023: Journal of Computational and Graphical Statistics

#11

JOURNAL ARTICLE

clusterMLD: An Efficient Hierarchical Clustering Method for Multivariate Longitudinal Data.

Junyi Zhou, Ying Zhang, Wanzhu Tu

Longitudinal data clustering is challenging because the grouping has to account for the similarity of individual trajectories in the presence of sparse and irregular times of observation. This paper puts forward a hierarchical agglomerative clustering method based on a dissimilarity metric that quantifies the cost of merging two distinct groups of curves, which are depicted by B-splines for the repeatedly measured data. Extensive simulations show that the proposed method has superior performance in determining the number of clusters, classifying individuals into the correct clusters, and in computational efficiency...

37859643

2023: Journal of Computational and Graphical Statistics

#12

JOURNAL ARTICLE

Bayesian Trend Filtering via Proximal Markov Chain Monte Carlo.

Qiang Heng, Hua Zhou, Eric C Chi

Proximal Markov Chain Monte Carlo is a novel construct that lies at the intersection of Bayesian computation and convex optimization, which helped popularize the use of nondifferentiable priors in Bayesian statistics. Existing formulations of proximal MCMC, however, require hyperparameters and regularization parameters to be prespecified. In this work, we extend the paradigm of proximal MCMC by introducing a new class of nondifferentiable priors called epigraph priors. As a proof of concept, we place trend filtering, which was originally a nonparametric regression problem, in a parametric setting to provide a posterior median fit along with credible intervals as measures of uncertainty...

37822489

2023: Journal of Computational and Graphical Statistics

#13

JOURNAL ARTICLE

Stability Approach to Regularization Selection for Reduced-Rank Regression.

Canhong Wen, Qin Wang, Yuan Jiang

The reduced-rank regression model is a popular model to deal with multivariate response and multiple predictors, and is widely used in biology, chemometrics, econometrics, engineering, and other fields. In the reduced-rank regression modelling, a central objective is to estimate the rank of the coefficient matrix that represents the number of effective latent factors in predicting the multivariate response. Although theoretical results such as rank estimation consistency have been established for various methods, in practice rank determination still relies on information criterion based methods such as AIC and BIC or subsampling based methods such as cross validation...

37810194

2023: Journal of Computational and Graphical Statistics

#14

JOURNAL ARTICLE

Lessons from West Virginia's Pandemic Response.

Bradley S Price, John P Saldanha, Dariane Drake, Katherine Kopp

In this editorial discussion we describe our experience developing and implementing predictive models during the pandemic response in the state of West Virginia. We provide insights on the importance of communication and on the dynamic environment that affects predictive modeling in situations such as those that we faced. It is our hope that this work brings insight to those who may experience similar challenges while working in public health policy.

37790240

2023: Journal of Computational and Graphical Statistics

#15

JOURNAL ARTICLE

Ultra-Fast Approximate Inference Using Variational Functional Mixed Models.

Shuning Huo, Jeffrey S Morris, Hongxiao Zhu

While Bayesian functional mixed models have been shown to be effective for modeling functional data with various complex structures, their application to extremely high-dimensional data is limited due to computational challenges involved in posterior sampling. We introduce a new computational framework that enables ultra-fast approximate inference for high-dimensional data in functional form. This framework adopts a parsimonious basis to represent functional observations, which facilitates efficient compression and parallel computing in basis space...

37608921

2023: Journal of Computational and Graphical Statistics

#16

JOURNAL ARTICLE

Practical Network Modeling via Tapered Exponential-family Random Graph Models.

Bart Blackburn, Mark S Handcock

Exponential-family Random Graph Models (ERGMs) have long been at the forefront of the analysis of relational data. The exponential-family form allows complex network dependencies to be represented. Models in this class are interpretable, flexible and have a strong theoretical foundation. The availability of powerful user-friendly open-source software allows broad accessibility and use. However, ERGMs sometimes suffer from a serious condition known as near-degeneracy, in which the model exhibits unrealistic probabilistic behavior or a severe lack-of-fit to real network data...

37608920

2023: Journal of Computational and Graphical Statistics

#17

JOURNAL ARTICLE

Multiway sparse distance weighted discrimination.

Bin Guo, Lynn E Eberly, Pierre-Gilles Henry, Christophe Lenglet, Eric F Lock

Modern data often take the form of a multiway array. However, most classification methods are designed for vectors, i.e., 1-way arrays. Distance weighted discrimination (DWD) is a popular high-dimensional classification method that has been extended to the multiway context, with dramatic improvements in performance when data have multiway structure. However, the previous implementation of multiway DWD was restricted to classification of matrices, and did not account for sparsity. In this paper, we develop a general framework for multiway classification which is applicable to any number of dimensions and any degree of sparsity...

37377729

2023: Journal of Computational and Graphical Statistics

#18

JOURNAL ARTICLE

Template independent component analysis with spatial priors for accurate subject-level brain network estimation and inference.

Amanda F Mejia, David Bolin, Yu Ryan Yue, Jiongran Wang, Brian S Caffo, Mary Beth Nebel

Independent component analysis is commonly applied to functional magnetic resonance imaging (fMRI) data to extract independent components (ICs) representing functional brain networks. While ICA produces reliable group-level estimates, single-subject ICA often produces noisy results. Template ICA is a hierarchical ICA model using empirical population priors to produce more reliable subject-level estimates. However, this and other hierarchical ICA models assume unrealistically that subject effects are spatially independent...

37377728

2023: Journal of Computational and Graphical Statistics

#19

JOURNAL ARTICLE

Testing Biased Randomization Assumptions and Quantifying Imperfect Matching and Residual Confounding in Matched Observational Studies.

Kan Chen, Siyu Heng, Qi Long, Bo Zhang

One central goal of design of observational studies is to embed non-experimental data into an approximate randomized controlled trial using statistical matching. Despite empirical researchers' best intention and effort to create high-quality matched samples, residual imbalance due to observed covariates not being well matched often persists. Although statistical tests have been developed to test the randomization assumption and its implications, few provide a means to quantify the level of residual confounding due to observed covariates not being well matched in matched samples...

37334200

2023: Journal of Computational and Graphical Statistics

#20

JOURNAL ARTICLE

Fast Multilevel Functional Principal Component Analysis.

Erjia Cui, Ruonan Li, Ciprian M Crainiceanu, Luo Xiao

We introduce fast multilevel functional principal component analysis (fast MFPCA), which scales up to high dimensional functional data measured at multiple visits. The new approach is orders of magnitude faster than the original MFPCA (Di et al., 2009) and achieves comparable estimation accuracy. Methods are motivated by the National Health and Nutrition Examination Survey (NHANES), which contains minute-level physical activity information of more than 10000 participants over multiple days and 1440 observations per day...

37313008

2023: Journal of Computational and Graphical Statistics
