Tune the regularization parameter for an angle-based large-margin classifier by the ET-Lasso method (Yang, et al., 2019).

Usage

et.abclass(
  x,
  y,
  loss = c("logistic", "boost", "hinge.boost", "lum"),
  penalty = c("glasso", "lasso"),
  weights = NULL,
  offset = NULL,
  intercept = TRUE,
  control = list(),
  nstages = 2L,
  nfolds = 0L,
  stratified = TRUE,
  alignment = c("fraction", "lambda"),
  refit = FALSE,
  ...
)

Arguments

x

A numeric matrix representing the design matrix. No missing values are allowed. The coefficient estimates for constant columns will be zero. Thus, to include an intercept term, one should set the argument intercept to TRUE instead of adding an all-one column to x.

y

An integer vector, a character vector, or a factor vector representing the response label.

loss

A character value specifying the loss function. The available options are "logistic" for the logistic deviance loss, "boost" for the exponential loss approximating Boosting machines, "hinge.boost" for a hybrid of SVM and AdaBoost machines, and "lum" for large-margin unified machines (LUM). See Liu, et al. (2011) for details.

penalty

A character vector specifying the name of the penalty. The available options are "glasso" (the default) and "lasso".

weights

A numeric vector for nonnegative observation weights. Equal observation weights are used by default.

offset

An optional numeric matrix for offsets of the decision functions.

intercept

A logical value indicating if an intercept should be considered in the model. The default value is TRUE and the intercept is excluded from regularization.

control

A list of control parameters. See abclass.control() for details.

nstages

A positive integer specifying the number of stages in the ET-Lasso procedure. By default, two rounds of tuning by random permutations will be performed, as suggested in Yang, et al. (2019).

nfolds

A nonnegative integer specifying the number of folds for cross-validation. The default value is 0, as shown in the usage above, and no cross-validation will be performed if nfolds is less than 2.

stratified

A logical value indicating if the cross-validation procedure should be stratified by the response label. The default value is TRUE to ensure that the same number of categories is used in validation and training.

alignment

A character vector specifying how to align the lambda sequence used in the main fit with the cross-validation fits. The available options are "fraction" for allowing cross-validation fits to have their own lambda sequences and "lambda" for using the same lambda sequence of the main fit. The option "lambda" will be applied if a meaningful lambda is specified. The default value is "fraction".

refit

Either a logical value indicating if a new classifier should be trained using the selected predictors, or a named list that will be passed to abclass.control() to specify how the new classifier should be trained.

...

Other control parameters passed to abclass.control().

Value

An S3 object of class et.abclass and abclass.
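The following is a minimal sketch of a basic call on simulated data, using only the arguments listed in the usage above. The simulated design matrix, the three-class label, and the object name fit are assumptions made for illustration and do not come from the package documentation.

```r
library(abclass)

set.seed(123)
n <- 200
p <- 50

## simulated design matrix (illustrative only)
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

## three-class label that depends on the first two columns only
score <- x[, 1] - x[, 2]
y <- cut(score, breaks = c(-Inf, -0.5, 0.5, Inf), labels = c("a", "b", "c"))

## two-stage ET-Lasso tuning with the logistic loss and group-lasso penalty
fit <- et.abclass(x, y, loss = "logistic", penalty = "glasso", nstages = 2L)

## the returned object should carry both classes named in the Value section
inherits(fit, "et.abclass")
inherits(fit, "abclass")
```

Calling str(fit, max.level = 1) is a safe way to see which components the returned object contains without assuming their names.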

Details

The ET-Lasso procedure is intended for tuning the lambda parameter only. The cross-validation arguments (nfolds, stratified, and alignment) allow one to estimate, by cross-validation, the prediction accuracy of the model estimates resulting from the ET-Lasso procedure, which can be helpful for choosing other tuning parameters (e.g., alpha).
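As a rough sketch of the workflow described above, the call below requests five-fold stratified cross-validation on top of the ET-Lasso tuning for two candidate values of alpha. It assumes that alpha is accepted by abclass.control() and can therefore be supplied through the ... argument; the simulated data and the helper name fit_one are hypothetical.

```r
library(abclass)

set.seed(42)
n <- 200
p <- 50
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
score <- x[, 1] - x[, 2]
y <- cut(score, breaks = c(-Inf, -0.5, 0.5, Inf), labels = c("a", "b", "c"))

## hypothetical helper: ET-Lasso tuning of lambda for a fixed alpha,
## followed by five-fold stratified cross-validation of the result
fit_one <- function(alpha) {
    et.abclass(
        x, y,
        loss = "logistic",
        penalty = "lasso",
        nstages = 2L,
        nfolds = 5L,
        stratified = TRUE,
        alignment = "fraction",
        alpha = alpha  # assumption: forwarded to abclass.control() via ...
    )
}

fits <- lapply(c(0.5, 1), fit_one)

## inspect the top-level components before relying on any of them
str(fits[[1]], max.level = 1)
```

The cross-validation results stored in each fitted object can then be compared across the alpha values; the component that holds them is not documented on this page, so the sketch inspects the object rather than naming it.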

References

Yang, S., Wen, J., Zhan, X., & Kifer, D. (2019). ET-Lasso: A new efficient tuning of lasso-type regularization for high-dimensional data. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (pp. 607–616).