Shared Concepts

Folded Concave Penalized Selection

This section applies to the glm action in the regression action set.

The folded concave penalized (FCP) selection coefficients $\boldsymbol{\beta} = (\beta_1, \beta_2, \ldots, \beta_m)^T$ are the solution to the regularized least squares problem

$$\min \; \frac{1}{2n} \left\| \mathbf{y} - \mathbf{X}\boldsymbol{\beta} \right\|^2 + \sum_{j=1}^{m} P_{\lambda,\alpha}\left(|\beta_j|\right)$$

where $P_{\lambda,\alpha}(\cdot)$ is a folded concave penalty.
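The objective above can be evaluated directly once a penalty function is chosen. The following is a minimal sketch, not the action's implementation: the function name `fcp_objective`, the toy data, and the placeholder linear penalty are all hypothetical and serve only to show how the two terms combine.

```python
import numpy as np

def fcp_objective(beta, X, y, penalty):
    """Evaluate (1/(2n)) * ||y - X beta||^2 + sum_j penalty(|beta_j|)."""
    beta = np.asarray(beta, dtype=float)
    n = X.shape[0]                       # number of observations
    residual = y - X @ beta
    loss = residual @ residual / (2.0 * n)
    return loss + np.sum(penalty(np.abs(beta)))

# Made-up data for illustration only.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

# Placeholder penalty (linear in theta); it is neither SCAD nor MCP.
placeholder_penalty = lambda theta: 0.1 * theta

value = fcp_objective([1.0, 2.0], X, y, placeholder_penalty)
print(value)
```

Because the penalty is passed in as a callable, the same helper works with any penalty that accepts an array of nonnegative values.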

A variety of nonconvex penalties have been proposed. Two of the earliest and most influential nonconvex penalties are the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP).

SCAD

When the method subparameter value is 'scad', the model selection action performs the SCAD selection method, which minimizes the least squares loss plus the smoothly clipped absolute deviation (SCAD) penalty function $P^{\text{scad}}_{\lambda,\alpha}(\cdot)$:

$$P^{\text{scad}}_{\lambda,\alpha}(\theta) =
\begin{cases}
\lambda\theta & \text{if } 0 \le \theta \le \lambda \\
\dfrac{-1}{2(\alpha-1)}\left(\theta^2 - 2\alpha\lambda\theta + \lambda^2\right) & \text{if } \lambda < \theta < \alpha\lambda \\
\dfrac{\alpha+1}{2}\,\lambda^2 & \text{if } \theta > \alpha\lambda
\end{cases}$$
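As a quick check of the piecewise definition, the SCAD penalty can be written as a vectorized function. This is a sketch under stated assumptions: the name `scad_penalty` is hypothetical, inputs are assumed to be nonnegative, and alpha is assumed to be greater than 1 so that the denominator is positive. The boundary point where theta equals alpha times lambda is assigned to the constant branch, which is harmless because the middle and constant branches agree there.

```python
import numpy as np

def scad_penalty(theta, lam, alpha):
    """SCAD penalty for theta >= 0, lam > 0, alpha > 1 (assumed)."""
    theta = np.asarray(theta, dtype=float)
    linear = lam * theta
    quadratic = -(theta**2 - 2.0 * alpha * lam * theta + lam**2) / (2.0 * (alpha - 1.0))
    constant = (alpha + 1.0) / 2.0 * lam**2
    # Select the branch that applies to each element of theta.
    return np.where(theta <= lam, linear,
                    np.where(theta < alpha * lam, quadratic, constant))

# Illustrative values: lam = 1, alpha = 3.
print(scad_penalty([0.5, 1.0, 2.0, 3.0, 5.0], lam=1.0, alpha=3.0))
```

With these illustrative values the penalty grows linearly up to theta = 1, bends over between 1 and 3, and stays flat at 2 afterward.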

The quadratic program (QP) reformulation of the SCAD problem is given by

where $\mathbf{0} = (0, \ldots, 0)^T$ and $\mathbf{1} = (1, \ldots, 1)^T$.

Liu, Yao, and Li (2016) show that the preceding QP (1) is equivalent to the following mixed integer linear program (MILP):

where $\mathcal{M} > 0$ is a sufficiently large constant.

MCP

When the method subparameter value is 'mcp', the model selection action performs the MCP selection method, which minimizes the least squares loss plus the minimax concave penalty (MCP) function $P^{\text{mcp}}_{\lambda,\alpha}(\cdot)$:

$$P^{\text{mcp}}_{\lambda,\alpha}(\theta) =
\begin{cases}
\lambda\theta - \dfrac{1}{2\alpha}\,\theta^2 & \text{if } 0 \le \theta \le \alpha\lambda \\
\dfrac{\alpha}{2}\,\lambda^2 & \text{if } \theta > \alpha\lambda
\end{cases}$$
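The MCP definition translates to code in the same way. The sketch below assumes nonnegative inputs and a positive alpha; the name `mcp_penalty` is hypothetical and is not part of the action's interface.

```python
import numpy as np

def mcp_penalty(theta, lam, alpha):
    """MCP penalty for theta >= 0, lam > 0, alpha > 0 (assumed)."""
    theta = np.asarray(theta, dtype=float)
    concave = lam * theta - theta**2 / (2.0 * alpha)
    constant = alpha / 2.0 * lam**2
    # The concave branch applies up to and including theta = alpha * lam.
    return np.where(theta <= alpha * lam, concave, constant)

# Illustrative values: lam = 1, alpha = 2.
print(mcp_penalty([0.0, 1.0, 2.0, 3.0], lam=1.0, alpha=2.0))
```

At theta = alpha * lam the two branches give the same value, so the penalty is continuous and flat beyond that point.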

The QP reformulation of the MCP problem is given by

Liu, Yao, and Li (2016) show that the preceding QP (3) is equivalent to the following MILP:

where $\mathcal{M} > 0$ is a sufficiently large constant.

SOLVER

When the solver subparameter value is 'MILP' in the fcpSelectionOptions subparameter, the coefficients $\boldsymbol{\beta}$ are obtained by solving the minimization problem (2) or (4). You can also specify the bigM subparameter in the fcpSelectionOptions subparameter to set the value of $\mathcal{M}$, and the intTol subparameter in the fcpSelectionOptions subparameter to set the tolerance of the integer variables $\{\mathbf{z}_k\}_{k=1}^{4}$. By default, $\mathcal{M} = 2\,\lvert \mathbf{y}^T \mathbf{X} \rvert_\infty$ and intTol=1E-7.
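The default big-M value can be illustrated numerically. This sketch rests on an assumption: the infinity subscript is read as the largest absolute entry of the row vector formed by multiplying the transpose of y by X. The function name `default_big_m` and the toy data are hypothetical, and whether the action centers or scales the data before this computation is not stated here.

```python
import numpy as np

def default_big_m(X, y):
    """Two times the largest absolute entry of y^T X (assumed reading)."""
    return 2.0 * np.max(np.abs(y @ X))

# Made-up data for illustration only.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

big_m = default_big_m(X, y)   # y^T X = [4, 5], so the result is 10
print(big_m)
```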

When the solver subparameter value is 'NLP' in the fcpSelectionOptions subparameter, the coefficients $\boldsymbol{\beta}$ are obtained by solving the minimization problem (1) or (3).

LAMBDAGRID

When you search for the optimal model, the default values of $\lambda_{\max}$ and $\lambda_{\min}$ are, respectively,

$$\lambda_{\max} = \sigma_y \sqrt{\log(m)/m}, \qquad \lambda_{\min} = \frac{\sigma_y}{20\sqrt{\log(m)}}$$

where $\sigma_y$ is the standard deviation of $\mathbf{y}$, $m$ is the length of $\boldsymbol{\beta}$, and $\log$ is the natural logarithm.
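The default bounds can be computed from the response vector and the number of coefficients. In this sketch the name `default_lambda_bounds` is hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, because the text does not say which divisor is used. The check for m of at least 2 is added here because log(1) is zero, which would make the lower bound undefined.

```python
import math
import numpy as np

def default_lambda_bounds(y, m):
    """Return (lambda_max, lambda_min) from the default formulas."""
    if m < 2:
        raise ValueError("m must be at least 2 so that log(m) > 0")
    sigma_y = float(np.std(y, ddof=1))   # assumption: sample standard deviation
    lam_max = sigma_y * math.sqrt(math.log(m) / m)
    lam_min = sigma_y / (20.0 * math.sqrt(math.log(m)))
    return lam_max, lam_min

# Made-up response values and m = 10 coefficients.
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
lam_max, lam_min = default_lambda_bounds(y, m=10)
print(lam_max, lam_min)
```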

Suppose that the number of steps in the search for a suitable $\lambda$ value is $n_\lambda$. You can set this number by specifying the maxIterLambda subparameter. When you specify 'LINSPACE' for the lambdaGrid subparameter, the values of the $\lambda$ series are generated as follows:

$$\lambda_i^{\text{linspace}} = \lambda_{\min} + (i-1) \times \delta_\lambda^{\text{linspace}}, \quad i = 1, 2, \ldots, n_\lambda$$

where $\delta_\lambda^{\text{linspace}} = (\lambda_{\max} - \lambda_{\min}) / (n_\lambda - 1)$.
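The linearly spaced grid follows directly from the formula. The sketch below uses the hypothetical name `lambda_linspace` and assumes at least two grid points, because the step size divides by the number of points minus one. The endpoint values are made up for illustration.

```python
def lambda_linspace(lam_min, lam_max, n_lambda):
    """Evenly spaced values from lam_min to lam_max, inclusive."""
    if n_lambda < 2:
        raise ValueError("n_lambda must be at least 2")
    delta = (lam_max - lam_min) / (n_lambda - 1)
    # i runs from 1 to n_lambda, matching the indexing in the formula.
    return [lam_min + (i - 1) * delta for i in range(1, n_lambda + 1)]

linear_grid = lambda_linspace(0.1, 1.0, 4)   # approximately 0.1, 0.4, 0.7, 1.0
print(linear_grid)
```

The first value equals the lower bound and the last value equals the upper bound, up to floating-point rounding.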

When the lambdaGrid subparameter value is 'LOGSPACE' in the fcpSelectionOptions subparameter, the values of the $\lambda$ series are generated as follows:

$$\log_{10}\left(\lambda_i^{\text{logspace}}\right) = \log_{10}(\lambda_{\min}) + (i-1) \times \delta_\lambda^{\text{logspace}}, \quad i = 1, 2, \ldots, n_\lambda$$

where $\delta_\lambda^{\text{logspace}} = \left(\log_{10}(\lambda_{\max}) - \log_{10}(\lambda_{\min})\right) / (n_\lambda - 1)$.
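The logarithmically spaced grid applies the same stepping rule to the base-10 exponents. In this sketch the name `lambda_logspace` is hypothetical, both bounds are assumed to be positive, and at least two grid points are assumed. The endpoint values are made up for illustration.

```python
import math

def lambda_logspace(lam_min, lam_max, n_lambda):
    """Values whose base-10 logarithms are evenly spaced."""
    if n_lambda < 2:
        raise ValueError("n_lambda must be at least 2")
    lo, hi = math.log10(lam_min), math.log10(lam_max)
    delta = (hi - lo) / (n_lambda - 1)
    # Step through the exponents, then map back with a power of 10.
    return [10.0 ** (lo + (i - 1) * delta) for i in range(1, n_lambda + 1)]

log_grid = lambda_logspace(0.01, 1.0, 3)   # approximately 0.01, 0.1, 1.0
print(log_grid)
```

Compared with the linear grid, this spacing places more candidate values near the lower bound.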

Last updated: March 05, 2026