Amemiya, T. (1976). Selection of Regressors. Technical Report 225, Stanford University, Stanford, CA.
Andrew, G., and Gao, J. (2007). “Scalable Training of $L_1$-Regularized Log-Linear Models.” In Proceedings of the 24th International Conference on Machine Learning, 33–40. Corvallis, OR: ACM.
Boyd, S., Parikh, N., Chu, E., Peleato, B., and Eckstein, J. (2011). “Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers.” Foundations and Trends in Machine Learning 3:1–122.
Browne, M. W. (1982). “Covariance Structures.” In Topics in Applied Multivariate Analysis, edited by D. M. Hawkins, 72–141. Cambridge: Cambridge University Press.
Burnham, K. P., and Anderson, D. R. (2002). Model Selection and Multimodel Inference. 2nd ed. New York: Springer-Verlag.
Darlington, R. B. (1968). “Multiple Regression in Psychological Research and Practice.” Psychological Bulletin 69:161–182.
Dennis, J. E., Gay, D. M., and Welsch, R. E. (1981). “An Adaptive Nonlinear Least-Squares Algorithm.” ACM Transactions on Mathematical Software 7:348–368.
Dennis, J. E., and Mei, H. H. W. (1979). “Two New Unconstrained Optimization Algorithms Which Use Function and Gradient Values.” Journal of Optimization Theory and Applications 28:453–482.
Edwards, D., and Berry, J. J. (1987). “The Efficiency of Simulation-Based Multiple Comparisons.” Biometrics 43:913–928.
Efron, B., Hastie, T. J., Johnstone, I. M., and Tibshirani, R. (2004). “Least Angle Regression.” Annals of Statistics 32:407–499. With discussion.
Eskow, E., and Schnabel, R. B. (1991). “Algorithm 695: Software for a New Modified Cholesky Factorization.” ACM Transactions on Mathematical Software 17:306–312.
Fletcher, R. (1987). Practical Methods of Optimization. 2nd ed. Chichester, UK: John Wiley & Sons.
Furnival, G. M., and Wilson, R. W. (1974). “Regression by Leaps and Bounds.” Technometrics 16:499–511.
Gay, D. M. (1983). “Subroutines for Unconstrained Minimization.” ACM Transactions on Mathematical Software 9:503–524.
Hastie, T. J., Tibshirani, R. J., and Friedman, J. H. (2001). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer-Verlag.
Hsu, J. C. (1992). “The Factor Analytic Approach to Simultaneous Inference in the General Linear Model.” Journal of Computational and Graphical Statistics 1:151–168.
Hsu, J. C., and Nelson, B. L. (1998). “Multiple Comparisons in the General Linear Model.” Journal of Computational and Graphical Statistics 7:23–41.
Judge, G. G., Griffiths, W. E., Hill, R. C., and Lee, T.-C. (1980). The Theory and Practice of Econometrics. New York: John Wiley & Sons.
Kramer, C. Y. (1956). “Extension of Multiple Range Tests to Group Means with Unequal Numbers of Replications.” Biometrics 12:307–310.
Lawless, J. F., and Singhal, K. (1978). “Efficient Screening of Nonnormal Regression Models.” Biometrics 34:318–327.
Littell, R. C., Freund, R. J., and Spector, P. C. (1991). SAS System for Linear Models. 3rd ed. Cary, NC: SAS Institute Inc.
Liu, H., Yao, T., and Li, R. (2016). “Global Solutions to Folded Concave Penalized Nonconvex Learning.” Annals of Statistics 44:629–659.
Moré, J. J. (1978). “The Levenberg-Marquardt Algorithm: Implementation and Theory.” In Lecture Notes in Mathematics, vol. 630, edited by G. A. Watson, 105–116. Berlin: Springer-Verlag.
Moré, J. J., and Sorensen, D. C. (1983). “Computing a Trust-Region Step.” SIAM Journal on Scientific and Statistical Computing 4:553–572.
Muller, K. E., and Fetterman, B. A. (2002). Regression and ANOVA: An Integrated Approach Using SAS Software. Cary, NC: SAS Institute Inc.
Nelson, P. R. (1982). “Exact Critical Points for the Analysis of Means.” Communications in Statistics—Theory and Methods 11:699–709.
Nelson, P. R. (1991). “Numerical Evaluation of Multivariate Normal Integrals with Correlations $\rho_{lj} = -\alpha_l \alpha_j$.” In Frontiers of Statistical Scientific Theory and Industrial Applications: Proceedings of the ICOSCO I Conference, edited by A. Öztürk and E. C. van der Meulen, 97–114. Columbus, OH: American Sciences Press.
Nelson, P. R. (1993). “Additional Uses for the Analysis of Means and Extended Tables of Critical Values.” Technometrics 35:61–71.
Nesterov, Y. (2013). “Gradient Methods for Minimizing Composite Objective Function.” Mathematical Programming 140:125–161.
Osborne, M. R., Presnell, B., and Turlach, B. A. (2000). “A New Approach to Variable Selection in Least Squares Problems.” IMA Journal of Numerical Analysis 20:389–404.
Ott, E. R. (1967). “Analysis of Means: A Graphical Procedure.” Industrial Quality Control 24:101–109. Reprinted in Journal of Quality Technology 15 (1983): 10–18.
Schabenberger, O., Gregoire, T. G., and Kong, F. (2000). “Collections of Simple Effects and Their Relationship to Main Effects and Interactions in Factorials.” American Statistician 54:210–214.
Searle, S. R. (1971). Linear Models. New York: John Wiley & Sons.
Stein, C. (1960). “Multiple Regression.” In Contributions to Probability and Statistics: Essays in Honor of Harold Hotelling, edited by I. Olkin, S. G. Ghurye, W. Hoeffding, W. G. Madow, and H. B. Mann, 264–305. Stanford, CA: Stanford University Press.
Tibshirani, R. (1996). “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society, Series B 58:267–288.
Westfall, P. H., Tobias, R. D., Rom, D., Wolfinger, R. D., and Hochberg, Y. (1999). Multiple Comparisons and Multiple Tests Using the SAS System. Cary, NC: SAS Institute Inc.
Westfall, P. H., and Young, S. S. (1993). Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment. New York: John Wiley & Sons.
Winer, B. J. (1971). Statistical Principles in Experimental Design. 2nd ed. New York: McGraw-Hill.
Yuan, M., and Lin, Y. (2006). “Model Selection and Estimation in Regression with Grouped Variables.” Journal of the Royal Statistical Society, Series B 68:49–67.
Zou, H. (2006). “The Adaptive Lasso and Its Oracle Properties.” Journal of the American Statistical Association 101:1418–1429.
Zou, H., and Hastie, T. (2005). “Regularization and Variable Selection via the Elastic Net.” Journal of the Royal Statistical Society, Series B 67:301–320.
Zou, H., and Zhang, H. H. (2009). “On the Adaptive Elastic-Net with a Diverging Number of Parameters.” Annals of Statistics 37:1733–1751.