A Projection-Based Feature Screening Method for Multiple Responses

Feng Zou, Hengjian Cui

Acta Mathematica Sinica, Chinese Series, 2025, Vol. 68, Issue (1): 1-29. DOI: 10.12386/A20230182

Abstract

In this paper, we propose a nonnegative projection correlation coefficient (NPCC) to measure the dependence between two random vectors, with projection directions drawn from the standard multivariate normal distribution. The NPCC is nonnegative and equals zero if and only if the two random vectors are independent. Its estimation is free of tuning parameters and requires no moment conditions on the random vectors. Based on the NPCC, we further propose a novel feature screening procedure for ultrahigh dimensional data that is robust and model-free, and that enjoys both the sure screening and rank consistency properties under weak assumptions. Monte Carlo simulation studies indicate that the NPCC-based screening procedure has strong competitive advantages over existing methods. Finally, a real data example illustrates the application of the proposed procedure.
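
The abstract does not give the NPCC estimator, so the following is only a minimal sketch of the general marginal screening recipe it describes: score each feature by a dependence measure against the full response vector, rank the scores, and keep the top d. As a stand-in for the NPCC, the sketch uses the sample distance correlation of Székely, Rizzo and Bakirov, which is a different measure. The function names (`distance_correlation`, `screen_features`), the simulated data, and the default cutoff d = floor(n / log n) (a common convention in the screening literature) are all assumptions for illustration, not taken from the paper.

```python
import numpy as np

def _centered_distances(z):
    """Double-centered pairwise Euclidean distance matrix of the rows of z."""
    z = np.asarray(z, dtype=float)
    if z.ndim == 1:
        z = z[:, None]
    d = np.linalg.norm(z[:, None, :] - z[None, :, :], axis=2)
    return (d - d.mean(axis=0, keepdims=True)
              - d.mean(axis=1, keepdims=True) + d.mean())

def distance_correlation(x, y):
    """Sample distance correlation (stand-in measure, NOT the NPCC)."""
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0:
        return 0.0
    return float(np.sqrt(max(dcov2, 0.0) / denom))

def screen_features(X, Y, d=None, measure=distance_correlation):
    """Rank the columns of X by marginal dependence on Y and keep the top d."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))  # assumed default cutoff, not from the paper
    d = min(d, p)
    scores = np.array([measure(X[:, k], Y) for k in range(p)])
    keep = np.argsort(-scores)[:d]  # indices of the d largest scores
    return keep, scores

# Hypothetical simulated data: two responses, each driven by one feature.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
Y = np.column_stack([
    2.0 * X[:, 0] + 0.5 * rng.standard_normal(n),
    2.0 * X[:, 1] + 0.5 * rng.standard_normal(n),
])
keep, scores = screen_features(X, Y)
print(sorted(keep.tolist()))
```

Because `measure` is a parameter, any other dependence measure between a scalar feature and a response vector, including an NPCC implementation, could be passed in without changing the ranking step.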

Key words

projection correlation coefficient / dependence measure / sure screening property / rank consistency

Cite this article

Feng Zou, Hengjian Cui. A Projection-Based Feature Screening Method for Multiple Responses. Acta Mathematica Sinica, Chinese Series, 2025, 68(1): 1-29. https://doi.org/10.12386/A20230182
