Multi-category angle-based large-margin classifiers with regularization by the elastic-net or groupwise penalty.
Usage
abclass(
    x,
    y,
    intercept = TRUE,
    weight = NULL,
    loss = c("logistic", "boost", "hinge-boost", "lum"),
    control = list(),
    ...
)

abclass.control(
    lambda = NULL,
    alpha = 1,
    nlambda = 50L,
    lambda_min_ratio = NULL,
    grouped = TRUE,
    group_weight = NULL,
    group_penalty = c("lasso", "scad", "mcp"),
    dgamma = 1,
    lum_a = 1,
    lum_c = 1,
    boost_umin = -5,
    maxit = 100000L,
    epsilon = 1e-04,
    standardize = TRUE,
    varying_active_set = TRUE,
    verbose = 0L,
    ...
)
Arguments
- `x`: A numeric matrix representing the design matrix. No missing values are allowed. The coefficient estimates for constant columns will be zero; thus, one should set the argument `intercept` to `TRUE` to include an intercept term instead of adding an all-one column to `x`.
- `y`: An integer vector, a character vector, or a factor representing the response labels.
- `intercept`: A logical value indicating if an intercept should be included in the model. The default value is `TRUE`, and the intercept is excluded from regularization.
- `weight`: A numeric vector of nonnegative observation weights. Equal observation weights are used by default.
- `loss`: A character value specifying the loss function. The available options are `"logistic"` for the logistic deviance loss, `"boost"` for the exponential loss approximating boosting machines, `"hinge-boost"` for a hybrid of the SVM and AdaBoost machines, and `"lum"` for the large-margin unified machines (LUM). See Liu, et al. (2011) for details.
- `control`: A list of control parameters. See `abclass.control()` for details.
- `...`: Other control parameters passed to `abclass.control()`.
- `lambda`: A numeric vector specifying the tuning parameter lambda. If this argument is left as `NULL` (the default), a data-driven lambda sequence will be generated and used according to the specified `alpha`, `nlambda`, and `lambda_min_ratio`. A specified `lambda` will be sorted in decreasing order internally, and only the unique values will be kept.
- `alpha`: A numeric value in [0, 1] representing the mixing parameter alpha. The default value is `1.0`.
- `nlambda`: A positive integer specifying the length of the internally generated lambda sequence. This argument will be ignored if a valid `lambda` is specified. The default value is `50`.
- `lambda_min_ratio`: A positive number specifying the ratio of the smallest lambda to the largest lambda. The default value is `1e-4` if the sample size is larger than the number of predictors, and `1e-2` otherwise.
- `grouped`: A logical value. Experimental flag to apply group penalties.
- `group_weight`: A numeric vector of nonnegative values representing the adaptive penalty factors for the specified group penalty.
- `group_penalty`: A character vector specifying the name of the group penalty.
- `dgamma`: A positive number specifying the increment to the minimal gamma parameter for the group SCAD or group MCP penalty.
- `lum_a`: A positive number greater than one representing the parameter a in LUM, which will be used only if `loss = "lum"`. The default value is `1.0`.
- `lum_c`: A nonnegative number specifying the parameter c in LUM, which will be used only if `loss = "hinge-boost"` or `loss = "lum"`. The default value is `1.0`.
- `boost_umin`: A negative number for adjusting the boosting loss in the internal majorization procedure.
- `maxit`: A positive integer specifying the maximum number of iterations. The default value is `10^5`.
- `epsilon`: A positive number specifying the relative tolerance that determines convergence. The default value is `1e-4`.
- `standardize`: A logical value indicating if each column of the design matrix should be standardized internally to have mean zero and unit standard deviation. The default value is `TRUE`. Notice that the coefficient estimates are always returned on the original scale.
- `varying_active_set`: A logical value indicating if the active set should be updated after each cycle of the coordinate-majorization-descent algorithm. The default value is `TRUE`, which usually gives a more efficient estimation procedure.
- `verbose`: A nonnegative integer specifying if the estimation procedure should print out intermediate steps/results. The default value is `0` for a silent estimation procedure.
Value
The function `abclass()` returns an object of class `abclass` representing a trained classifier; the function `abclass.control()` returns an object of class `abclass.control` representing a list of control parameters.
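Based on the description above, the returned control object can be inspected like any other list; a minimal sketch (assuming the `abclass` package is installed):

```r
library(abclass)

## abclass.control() returns a classed list of control parameters
ctrl <- abclass.control(nlambda = 10)
class(ctrl)    # expected to be "abclass.control"
is.list(ctrl)  # expected to be TRUE
ctrl$nlambda   # the value passed in, i.e., 10
```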
References
Zhang, C., & Liu, Y. (2014). Multicategory Angle-Based Large-Margin Classification. Biometrika, 101(3), 625--640.
Liu, Y., Zhang, H. H., & Wu, Y. (2011). Hard or Soft Classification? Large-Margin Unified Machines. Journal of the American Statistical Association, 106(493), 166--177.
Examples
library(abclass)
set.seed(123)
## toy examples for demonstration purpose
## reference: example 1 in Zhang and Liu (2014)
ntrain <- 100 # size of training set
ntest <- 100 # size of testing set
p0 <- 5 # number of actual predictors
p1 <- 5 # number of random predictors
k <- 5 # number of categories
n <- ntrain + ntest; p <- p0 + p1
train_idx <- seq_len(ntrain)
y <- sample(k, size = n, replace = TRUE) # response
mu <- matrix(rnorm(p0 * k), nrow = k, ncol = p0) # mean vector
## normalize the mean vectors so that they lie on the unit sphere
mu <- mu / apply(mu, 1, function(a) sqrt(sum(a ^ 2)))
x0 <- t(sapply(y, function(i) rnorm(p0, mean = mu[i, ], sd = 0.25)))
x1 <- matrix(rnorm(p1 * n, sd = 0.3), nrow = n, ncol = p1)
x <- cbind(x0, x1)
train_x <- x[train_idx, ]
test_x <- x[- train_idx, ]
y <- factor(paste0("label_", y))
train_y <- y[train_idx]
test_y <- y[- train_idx]
## Regularization through the elastic-net penalty (lasso, since alpha = 1)
control1 <- abclass.control(nlambda = 5, lambda_min_ratio = 1e-3,
                            alpha = 1, grouped = FALSE)
model1 <- abclass(train_x, train_y, loss = "logistic",
                  control = control1)
pred1 <- predict(model1, test_x, s = 5)
table(test_y, pred1)
#> pred1
#> test_y label_1 label_2 label_3 label_4 label_5
#> label_1 22 0 3 0 0
#> label_2 0 15 0 4 1
#> label_3 0 0 12 0 2
#> label_4 1 0 0 16 0
#> label_5 0 1 1 0 22
mean(test_y == pred1) # accuracy
#> [1] 0.87
## groupwise regularization via group lasso
model2 <- abclass(train_x, train_y, loss = "boost",
                  grouped = TRUE, nlambda = 5)
pred2 <- predict(model2, test_x, s = 5)
table(test_y, pred2)
#> pred2
#> test_y label_1 label_2 label_3 label_4 label_5
#> label_1 24 0 1 0 0
#> label_2 0 19 0 1 0
#> label_3 0 0 13 0 1
#> label_4 1 1 0 15 0
#> label_5 0 1 1 0 22
mean(test_y == pred2) # accuracy
#> [1] 0.93
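The same data can also be fit with the other documented loss functions and group penalties. A sketch using the LUM loss with a group MCP penalty follows (it reuses `train_x`, `train_y`, and `test_x` from above; the chosen `nlambda` and solution index `s` are arbitrary, and no accuracy is shown since results depend on the simulated data):

```r
## groupwise regularization via group MCP with the LUM loss
control3 <- abclass.control(nlambda = 5, grouped = TRUE,
                            group_penalty = "mcp")
model3 <- abclass(train_x, train_y, loss = "lum",
                  control = control3)
pred3 <- predict(model3, test_x, s = 5)
mean(test_y == pred3) # accuracy
```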