Most recent papers in the journal Statistica Sinica

#1

JOURNAL ARTICLE

Use of random integration to test equality of high dimensional covariance matrices.

Yunlu Jiang, Canhong Wen, Yukang Jiang, Xueqin Wang, Heping Zhang

Testing the equality of two covariance matrices is a fundamental problem in statistics, and especially challenging when the data are high-dimensional. Through a novel use of random integration, we can test the equality of high-dimensional covariance matrices without assuming parametric distributions for the two underlying populations, even if the dimension is much larger than the sample size. The asymptotic properties of our test for an arbitrary number of covariates and sample size are studied in depth under a general multivariate model...

37799490

October 2023: Statistica Sinica

#2

JOURNAL ARTICLE

Globally Adaptive Longitudinal Quantile Regression with High Dimensional Compositional Covariates.

Huijuan Ma, Qi Zheng, Zhumin Zhang, Huichuan Lai, Limin Peng

In this work, we propose a longitudinal quantile regression framework that enables a robust characterization of heterogeneous covariate-response associations in the presence of high-dimensional compositional covariates and repeated measurements of both response and covariates. We develop a globally adaptive penalization procedure, which can consistently identify covariate sparsity patterns across a continuum set of quantile levels. The proposed estimation procedure properly aggregates longitudinal observations over time, and ensures the satisfaction of the sum-zero coefficient constraint that is needed for proper interpretation of the effects of compositional covariates...

37483468

May 2023: Statistica Sinica

#3

JOURNAL ARTICLE

An Efficient Greedy Search Algorithm for High-dimensional Linear Discriminant Analysis.

Hannan Yang, D Y Lin, Quefeng Li

High-dimensional classification is an important statistical problem that has applications in many areas. One widely used classifier is the Linear Discriminant Analysis (LDA). In recent years, many regularized LDA classifiers have been proposed to solve the problem of high-dimensional classification. However, these methods rely on inverting a large matrix or solving large-scale optimization problems to render classification rules; such methods are computationally prohibitive when the dimension is ultra-high. With the emergence of big data, it is increasingly important to develop more efficient algorithms to solve the high-dimensional LDA problem...

37455685

May 2023: Statistica Sinica

#4

JOURNAL ARTICLE

Marginal Bayesian Posterior Inference using Recurrent Neural Networks with Application to Sequential Models.

Thayer Fisher, Alex Luedtke, Marco Carone, Noah Simon

In Bayesian data analysis, it is often important to evaluate quantiles of the posterior distribution of a parameter of interest (e.g., to form posterior intervals). In multi-dimensional problems, when non-conjugate priors are used, this is often difficult, generally requiring either an analytic or sampling-based approximation, such as Markov chain Monte-Carlo (MCMC), Approximate Bayesian computation (ABC) or variational inference. We discuss a general approach that reframes this as a multi-task learning problem and uses recurrent deep neural networks (RNNs) to approximately evaluate posterior quantiles...

37409184

May 2023: Statistica Sinica

#5

JOURNAL ARTICLE

HETEROGENEITY ANALYSIS VIA INTEGRATING MULTI-SOURCES HIGH-DIMENSIONAL DATA WITH APPLICATIONS TO CANCER STUDIES.

Tingyan Zhong, Qingzhao Zhang, Jian Huang, Mengyun Wu, Shuangge Ma

This study has been motivated by cancer research, in which heterogeneity analysis plays an important role and can be roughly classified as unsupervised or supervised. In supervised heterogeneity analysis, the finite mixture of regression (FMR) technique is used extensively, under which the covariates affect the response differently in subgroups. High-dimensional molecular and, very recently, histopathological imaging features have been analyzed separately and shown to be effective for heterogeneity analysis...

38037567

April 2023: Statistica Sinica

#6

JOURNAL ARTICLE

Sieve estimation of a class of partially linear transformation models with interval-censored competing risks data.

Xuewen Lu, Yan Wang, Dipankar Bandyopadhyay, Giorgos Bakoyannis

In this paper, we consider a class of partially linear transformation models with interval-censored competing risks data. Under a semiparametric generalized odds rate specification for the cause-specific cumulative incidence function, we obtain optimal estimators of the large number of parametric and nonparametric model components via maximizing the likelihood function over a joint B-spline and Bernstein polynomial spanned sieve space. Our specification considers a relatively simpler finite-dimensional parameter space, approximating the infinite-dimensional parameter space as n → ∞, thereby allowing us to study the almost sure consistency, and rate of convergence for all parameters, and the asymptotic distributions and efficiency of the finite-dimensional components...

37234206

April 2023: Statistica Sinica

#7

JOURNAL ARTICLE

PENALIZED REGRESSION FOR MULTIPLE TYPES OF MANY FEATURES WITH MISSING DATA.

Kin Yau Wong, Donglin Zeng, D Y Lin

Recent technological advances have made it possible to measure multiple types of many features in biomedical studies. However, some data types or features may not be measured for all study subjects because of cost or other constraints. We use a latent variable model to characterize the relationships across and within data types and to infer missing values from observed data. We develop a penalized-likelihood approach for variable selection and parameter estimation and devise an efficient expectation-maximization algorithm to implement our approach...

37197479

April 2023: Statistica Sinica

#8

JOURNAL ARTICLE

HIGH-DIMENSIONAL FACTOR REGRESSION FOR HETEROGENEOUS SUBPOPULATIONS.

Peiyao Wang, Quefeng Li, Dinggang Shen, Yufeng Liu

In modern scientific research, data heterogeneity is commonly observed owing to the abundance of complex data. We propose a factor regression model for data with heterogeneous subpopulations. The proposed model can be represented as a decomposition of heterogeneous and homogeneous terms. The heterogeneous term is driven by latent factors in different subpopulations. The homogeneous term captures common variation in the covariates and shares common regression coefficients across subpopulations. Our proposed model attains a good balance between a global model and a group-specific model...

37854586

January 2023: Statistica Sinica

#9

JOURNAL ARTICLE

Interval estimation for operating characteristic of continuous biomarkers with controlled sensitivity or specificity.

Yijian Huang, Isaac Parakati, Dattatraya H Patil, Martin G Sanda

The receiver operating characteristic (ROC) curve provides a comprehensive performance assessment of a continuous biomarker over the full threshold spectrum. Nevertheless, a medical test is often required to operate at a certain high level of sensitivity or specificity. A diagnostic accuracy metric directly targeting the clinical utility is specificity at the controlled sensitivity level, or vice versa. While the empirical point estimation is readily adopted in practice, the nonparametric interval estimation is challenged by the fact that the variance involves density functions due to the estimated threshold...

37193541

January 2023: Statistica Sinica

#10

JOURNAL ARTICLE

An Online Projection Estimator for Nonparametric Regression in Reproducing Kernel Hilbert Spaces.

Tianyu Zhang, Noah Simon

The goal of nonparametric regression is to recover an underlying regression function from noisy observations, under the assumption that the regression function belongs to a prespecified infinite-dimensional function space. In the online setting, in which the observations come in a stream, it is generally computationally infeasible to refit the whole model repeatedly. As yet, there are no methods that are both computationally efficient and statistically rate optimal. In this paper, we propose an estimator for online nonparametric regression...

37153711

January 2023: Statistica Sinica

#11

JOURNAL ARTICLE

Feature-weighted elastic net: using "features of features" for better prediction.

J Kenneth Tay, Nima Aghaeepour, Trevor Hastie, Robert Tibshirani

In some supervised learning settings, the practitioner might have additional information on the features used for prediction. We propose a new method which leverages this additional information for better prediction. The method, which we call the feature-weighted elastic net ("fwelnet"), uses these "features of features" to adapt the relative penalties on the feature coefficients in the elastic net penalty. In our simulations, fwelnet outperforms the lasso in terms of test mean squared error and usually gives an improvement in true positive rate or false positive rate for feature selection...

37102071

January 2023: Statistica Sinica

#12

JOURNAL ARTICLE

Robust Inference for Partially Observed Functional Response Data.

Yeonjoo Park, Xiaohui Chen, Douglas G Simpson

Irregular functional data in which densely sampled curves are observed over different ranges pose a challenge for modeling and inference, and sensitivity to outlier curves is a concern in applications. Motivated by applications in quantitative ultrasound signal analysis, this paper investigates a class of robust M-estimators for partially observed functional data including functional location and quantile estimators. Consistency of the estimators is established under general conditions on the partial observation process...

36353392

October 2022: Statistica Sinica

#13

JOURNAL ARTICLE

Prior Knowledge Guided Ultra-high Dimensional Variable Screening with Application to Neuroimaging Data.

Jie He, Jian Kang

Variable screening is a powerful and efficient tool for dimension reduction under ultrahigh dimensional settings. However, most existing methods overlook useful prior knowledge in specific applications. In this work, from a Bayesian modeling perspective, we develop a unified variable screening procedure for the linear regression model. We discuss different constructions of posterior mean screening (PMS) statistics to incorporate different types of prior knowledge according to specific applications. With non-informative prior specifications, PMS is equivalent to high-dimensional ordinary least-square projections (HOLP)...

36052338

October 2022: Statistica Sinica

#14

JOURNAL ARTICLE

STATISTICAL INFERENCE IN QUANTILE REGRESSION FOR ZERO-INFLATED OUTCOMES.

Wodan Ling, Bin Cheng, Ying Wei, Joshua Z Willey, Ying Kuen Cheung

An extension of quantile regression is proposed to model zero-inflated outcomes, which have become increasingly common in biomedical studies. The method is flexible enough to depict complex and nonlinear associations between the covariates and the quantiles of the outcome. We establish the theoretical properties of the estimated quantiles, and develop inference tools to assess the quantile effects. Extensive simulation studies indicate that the novel method generally outperforms existing zero-inflated approaches and the direct quantile regression in terms of the estimation and inference of the heterogeneous effect of the covariates...

36349247

July 2022: Statistica Sinica

#15

JOURNAL ARTICLE

A spline-based nonparametric analysis for interval-censored bivariate survival data.

Yuan Wu, Ying Zhang, Junyi Zhou

In this manuscript we propose a spline-based sieve nonparametric maximum likelihood estimation method for the joint distribution function with bivariate interval-censored data. We study the asymptotic behavior of the proposed estimator by proving the consistency and deriving the rate of convergence. Based on the sieve estimate of the joint distribution, we also develop an efficient nonparametric test for making inference about the dependence between two interval-censored event times and establish its asymptotic normality...

35795611

July 2022: Statistica Sinica

#16

JOURNAL ARTICLE

Maximum Likelihood Estimation for Cox Proportional Hazards Model with a Change Hyperplane.

Yu Deng, Jianwen Cai, Donglin Zeng

We propose a Cox proportional hazards model with a change hyperplane to allow the effect of risk factors to differ depending on whether a linear combination of baseline covariates exceeds a threshold. The proposed model is a natural extension of the change-point hazards model. We maximize the partial likelihood function for estimation and suggest an m-out-of-n bootstrapping procedure for inference. We establish the asymptotic distribution of the estimators and show that the estimators for the change hyperplane converge in distribution to an integrated composite Poisson process defined on a multidimensional space...

35431516

April 2022: Statistica Sinica

#17

JOURNAL ARTICLE

SEMIPARAMETRIC DOSE FINDING METHODS FOR PARTIALLY ORDERED DRUG COMBINATIONS.

Matthieu Clertant, Nolan A Wages, John O'Quigley

We investigate a statistical framework for Phase I clinical trials that test the safety of two or more agents in combination. For such studies, the traditional assumption of a simple monotonic relation between dose and the probability of an adverse event no longer holds. Nonetheless, the dose toxicity (adverse event) relationship will obey an assumption of partial ordering in that there will be pairs of combinations for which the ordering of the toxicity probabilities is known. Some authors have considered how to best estimate the maximum tolerated dose (a dose providing a rate of toxicity as close as possible to some target rate) in this setting...

36643072

2022: Statistica Sinica

#18

JOURNAL ARTICLE

Robust inference of conditional average treatment effects using dimension reduction.

Ming-Yueh Huang, Shu Yang

Personalized treatment aims at tailoring treatments to individual characteristics. An important step is to understand how a treatment effect varies across individual characteristics, known as the conditional average treatment effect (CATE). In this study, we make robust inferences of the CATE from observational data, which becomes challenging with a multivariate confounder. To reduce the curse of dimensionality, while keeping the nonparametric advantages, we propose double dimension reductions that achieve different goal...

36415324

2022: Statistica Sinica

#19

JOURNAL ARTICLE

Hypothesis Testing for Network Data with Power Enhancement.

Yin Xia, Lexin Li

Comparing two population means of network data is of paramount importance in a wide range of scientific applications. Numerous existing network inference solutions focus on global testing of entire networks, without comparing individual network links. The observed data often take the form of vectors or matrices, and the problem is formulated as comparing two covariance or precision matrices under a normal or matrix normal distribution. Moreover, many tests suffer from a limited power under a small sample size...

35002179

2022: Statistica Sinica

#20

JOURNAL ARTICLE

Exchangeable Markov multi-state survival processes.

Walter Dempsey

We consider exchangeable Markov multi-state survival processes, which are temporal processes taking values over a state-space S, with at least one absorbing failure state b ∈ S, that satisfy the natural invariance properties of exchangeability and consistency under subsampling...

34707337

October 2021: Statistica Sinica
