
Experimental implementations of multi-category classifiers with sup-norm penalties proposed by Zhang et al. (2008) and Li & Zhang (2021).

Usage

supclass(
  x,
  y,
  model = c("logistic", "psvm", "svm"),
  penalty = c("lasso", "scad"),
  start = NULL,
  control = list(),
  ...
)

supclass.control(
  lambda = 0.1,
  adaptive_weight = NULL,
  scad_a = 3.7,
  maxit = 50,
  epsilon = 1e-04,
  shrinkage = 1e-04,
  warm_start = TRUE,
  standardize = TRUE,
  verbose = 0L,
  ...
)

Arguments

x

A numeric matrix representing the design matrix. No missing values are allowed. The coefficient estimates for constant columns will be zero. Thus, one should set the argument intercept to TRUE to include an intercept term instead of adding an all-one column to x.

y

An integer vector, a character vector, or a factor vector representing the response label.

model

A character vector specifying the classification model. The available options are "logistic" for the multinomial logistic regression model, "psvm" for the proximal support vector machine (PSVM), and "svm" for the multi-category support vector machine.

penalty

A character vector specifying the penalty function for the sup-norms. The available options are "lasso" for the sup-norm regularization proposed by Zhang et al. (2008) and "scad" for the supSCAD regularization proposed by Li & Zhang (2021).

start

A numeric matrix representing the starting values for the quadratic approximation procedure behind the scenes.

control

A list of named control parameters accepted by supclass.control().

...

Optional control parameters passed to supclass.control().

lambda

A numeric vector specifying the tuning parameter lambda. The default value is 0.1. Users should tune this parameter for a better model fit. The specified lambda values will be sorted in decreasing order internally and only the unique values will be kept; supplying a grid is sketched after this argument list.

adaptive_weight

A numeric vector or matrix representing the adaptive penalty weights. The default value is NULL for equal weights. Zhang et al. (2008) proposed two ways to employ the adaptive weights: the first approach applies the weights to the sup-norms of the coefficient estimates, while the second approach multiplies the weights element-wise with the coefficient estimates inside the sup-norms. The first or the second approach will be applied if a numeric vector or a matrix is specified, respectively (both are sketched after this argument list). The adaptive weights are supported for the lasso penalty only.

scad_a

A positive number specifying the tuning parameter a in the SCAD penalty.

maxit

A positive integer specifying the maximum number of iterations. The default value is 50 as suggested in Li & Zhang (2021).

epsilon

A positive number specifying the relative tolerance that determines convergence. The default value is 1e-4.

shrinkage

A nonnegative tolerance used to shrink estimates whose sup-norms are close enough to zero (within the specified tolerance) to exact zeros. The default value is 1e-4.

warm_start

A logical value indicating if the estimates from the last lambda should be used as the starting values for the next lambda. If FALSE, the user-specified starting values will be used instead.

standardize

A logical value indicating if a standardization procedure should be performed so that each column of the design matrix has mean zero and unit standard deviation.

verbose

A nonnegative integer specifying if the estimation procedure is allowed to print out intermediate steps and results. The default value is 0 for a silent estimation procedure.
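
The following minimal sketch illustrates supplying a lambda grid and adaptive weights through the dots, using the train_x and train_y objects constructed in the Examples section below. The grid values, the weight values, and the assumed p-by-k orientation of the weight matrix are illustrative assumptions rather than package defaults.

## sketch only: fit over a small lambda grid (values are arbitrary)
fit <- supclass(train_x, train_y, model = "logistic", penalty = "lasso",
                lambda = c(1, 0.1, 0.01))

## sketch only: adaptive weights for the lasso penalty
p <- ncol(train_x)      # number of predictors
k <- nlevels(train_y)   # number of categories
## a length-p vector weights each predictor's sup-norm (first approach)
w_vec <- rep(1, p)
## a matrix weights the coefficients element-wise inside the sup-norms
## (second approach); the p-by-k orientation is an assumption here
w_mat <- matrix(1, nrow = p, ncol = k)
fit_w <- supclass(train_x, train_y, model = "logistic", penalty = "lasso",
                  adaptive_weight = w_vec)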

Details

For the multinomial logistic model or the proximal SVM model, this function utilizes quadprog::solve.QP() to solve the equivalent quadratic programming problem; for the multi-category SVM, it utilizes GNU GLPK to solve the equivalent linear programming problem via the package Rglpk. It is recommended to use a recent version of GLPK.
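
Since the "svm" model relies on the Rglpk package (and thus on GLPK), a guarded call avoids an error when the solver is unavailable. A minimal sketch, again using the example data constructed below:

## sketch only: fit the multi-category SVM if Rglpk is installed
if (requireNamespace("Rglpk", quietly = TRUE)) {
    fit_svm <- supclass(train_x, train_y, model = "svm", penalty = "lasso")
}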

References

Zhang, H. H., Liu, Y., Wu, Y., & Zhu, J. (2008). Variable selection for the multicategory SVM via adaptive sup-norm regularization. Electronic Journal of Statistics, 2, 149--167.

Li, N., & Zhang, H. H. (2021). Sparse learning with non-convex penalty in multi-classification. Journal of Data Science, 19(1), 56--74.

Examples

library(abclass)
set.seed(123)

## toy examples for demonstration purpose
## reference: example 1 in Zhang and Liu (2014)
ntrain <- 100 # size of training set
ntest <- 1000 # size of testing set
p0 <- 2       # number of actual predictors
p1 <- 2       # number of random predictors
k <- 3        # number of categories

n <- ntrain + ntest; p <- p0 + p1
train_idx <- seq_len(ntrain)
y <- sample(k, size = n, replace = TRUE)         # response
mu <- matrix(rnorm(p0 * k), nrow = k, ncol = p0) # mean vector
## normalize the mean vector so that they are distributed on the unit circle
mu <- mu / apply(mu, 1, function(a) sqrt(sum(a ^ 2)))
x0 <- t(sapply(y, function(i) rnorm(p0, mean = mu[i, ], sd = 0.25)))
x1 <- matrix(rnorm(p1 * n, sd = 0.3), nrow = n, ncol = p1)
x <- cbind(x0, x1)
train_x <- x[train_idx, ]
test_x <- x[- train_idx, ]
y <- factor(paste0("label_", y))
train_y <- y[train_idx]
test_y <- y[- train_idx]

## regularization with the sup-norm lasso penalty
options("mc.cores" = 1)
model <- supclass(train_x, train_y, model = "psvm", penalty = "lasso")
pred <- predict(model, test_x)
table(test_y, pred)
#>          pred
#> test_y    label_1 label_2 label_3
#>   label_1     317       4       2
#>   label_2       0     327       0
#>   label_3       0       0     350
mean(test_y == pred) # accuracy
#> [1] 0.994
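
For comparison, the supSCAD penalty of Li & Zhang (2021) can be fitted on the same toy data. A sketch only (output not shown; the default scad_a = 3.7 is kept):

## regularization with the supSCAD penalty
model_scad <- supclass(train_x, train_y, model = "psvm", penalty = "scad")
pred_scad <- predict(model_scad, test_x)
mean(test_y == pred_scad) # accuracy on the testing set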