Title: | Recommended Learners for 'mlr3' |
---|---|
Description: | Recommended Learners for 'mlr3'. Extends 'mlr3' with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting. |
Authors: | Michel Lang [aut], Quay Au [aut], Stefan Coors [aut], Patrick Schratz [aut], Marc Becker [cre, aut] |
Maintainer: | Marc Becker <[email protected]> |
License: | LGPL-3 |
Version: | 0.8.0 |
Built: | 2024-10-26 08:19:15 UTC |
Source: | https://github.com/mlr-org/mlr3learners |
More learners are implemented in the mlr3extralearners package. A guide on how to create custom learners is covered in the book: https://mlr3book.mlr-org.com. Feel invited to contribute a missing learner to the mlr3 ecosystem!
Useful links:
Report bugs at https://github.com/mlr-org/mlr3learners/issues
Generalized linear models with elastic net regularization.
Calls glmnet::cv.glmnet() from package glmnet.
The default for hyperparameter family is set to "binomial" or "multinomial", depending on the number of classes.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.cv_glmnet")
lrn("classif.cv_glmnet")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
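The two instantiation routes described above can be sketched as follows. This is a minimal illustration; the hyperparameter values (alpha = 0.5, nfolds = 5) are arbitrary choices for the example, not recommendations from the package.

```r
library(mlr3)
library(mlr3learners)

# Route 1: retrieve the learner from the dictionary
learner_dict = mlr_learners$get("classif.cv_glmnet")

# Route 2: the sugar function, which also accepts hyperparameters
# and the predict type at construction time (illustrative values)
learner_sugar = lrn("classif.cv_glmnet",
  alpha = 0.5, nfolds = 5, predict_type = "prob")

print(learner_dict$id)
print(learner_sugar$param_set$values)
```

Both routes return an object of the same class; the sugar function is simply more convenient when hyperparameters should be set in the same call.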
Id | Type | Default | Levels | Range |
alignment | character | lambda | lambda, fraction | - |
alpha | numeric | 1 | | |
big | numeric | 9.9e+35 | | |
devmax | numeric | 0.999 | | |
dfmax | integer | - | | |
epsnr | numeric | 1e-08 | | |
eps | numeric | 1e-06 | | |
exclude | integer | - | | |
exmx | numeric | 250 | | |
fdev | numeric | 1e-05 | | |
foldid | untyped | NULL | - | |
gamma | untyped | - | - | |
grouped | logical | TRUE | TRUE, FALSE | - |
intercept | logical | TRUE | TRUE, FALSE | - |
keep | logical | FALSE | TRUE, FALSE | - |
lambda.min.ratio | numeric | - | | |
lambda | untyped | - | - | |
lower.limits | untyped | - | - | |
maxit | integer | 100000 | | |
mnlam | integer | 5 | | |
mxitnr | integer | 25 | | |
mxit | integer | 100 | | |
nfolds | integer | 10 | | |
nlambda | integer | 100 | | |
offset | untyped | NULL | - | |
parallel | logical | FALSE | TRUE, FALSE | - |
penalty.factor | untyped | - | - | |
pmax | integer | - | | |
pmin | numeric | 1e-09 | | |
prec | numeric | 1e-10 | | |
predict.gamma | numeric | gamma.1se | | |
relax | logical | FALSE | TRUE, FALSE | - |
s | numeric | lambda.1se | | |
standardize | logical | TRUE | TRUE, FALSE | - |
standardize.response | logical | FALSE | TRUE, FALSE | - |
thresh | numeric | 1e-07 | | |
trace.it | integer | 0 | | |
type.gaussian | character | - | covariance, naive | - |
type.logistic | character | - | Newton, modified.Newton | - |
type.measure | character | deviance | deviance, class, auc, mse, mae | - |
type.multinomial | character | - | ungrouped, grouped | - |
upper.limits | untyped | - | - | |
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to model fitting to comply with the stats::glm() convention that the negative class is provided as the first factor level.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifCVGlmnet
new()
Creates a new instance of this R6 class.
LearnerClassifCVGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet() with type set to "nonzero".
LearnerClassifCVGlmnet$selected_features(lambda = NULL)
lambda
(numeric(1)) Custom lambda, defaults to the active lambda depending on parameter set.
Returns (character()) of feature names.
clone()
The objects of this class are cloneable with this method.
LearnerClassifCVGlmnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
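A short sketch of how selected_features() might be used after training. The task, the seed, and the value nfolds = 5 are illustrative assumptions; lambda.min is taken from the fitted cv.glmnet object stored in $model.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)  # cross-validation folds are random
  task = tsk("sonar")
  learner = lrn("classif.cv_glmnet", nfolds = 5)
  learner$train(task)

  # Features with non-zero coefficients at the active lambda
  selected = learner$selected_features()

  # The same query at an explicitly chosen penalty,
  # here the lambda with minimal cross-validated error
  selected_min = learner$selected_features(lambda = learner$model$lambda.min)

  print(selected)
  print(length(selected_min))
}
```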
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("glmnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.cv_glmnet")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Generalized linear models with elastic net regularization.
Calls glmnet::glmnet() from package glmnet.
Caution: This learner differs from learners calling glmnet::cv.glmnet() in that it does not use the internal optimization of parameter lambda.
Instead, lambda needs to be tuned by the user (e.g., via mlr3tuning).
When lambda is tuned, the glmnet model is retrained in each tuning iteration.
While fitting the whole path of lambdas would be more efficient, as is done by default in glmnet::glmnet(), tuning/selecting the parameter at prediction time (using parameter s) is currently not supported in mlr3 (at least not in an efficient manner).
Tuning the s parameter is, therefore, currently discouraged.
When the data are i.i.d. and efficiency is key, we recommend using the respective auto-tuning counterparts in mlr_learners_classif.cv_glmnet or mlr_learners_regr.cv_glmnet.
However, in some situations this is not applicable, usually when data are imbalanced or not i.i.d. (longitudinal, time-series) and tuning requires custom resampling strategies (blocked design, stratification).
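To make the last point concrete, the following sketch compares a few fixed lambda values under a stratified cross-validation. It is an illustration only, not the package's recommended tuning workflow: the candidate values, the fold count, and the choice to set s equal to lambda are assumptions made for this example, and a proper search would normally go through mlr3tuning.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  task = tsk("sonar")

  # Custom resampling: stratify the folds on the target column
  task$col_roles$stratum = task$target_names
  resampling = rsmp("cv", folds = 3)
  resampling$instantiate(task)

  # Illustrative candidate penalties
  lambdas = c(0.001, 0.01, 0.1)

  # One full model fit per candidate and fold, as described above
  errors = vapply(lambdas, function(l) {
    learner = lrn("classif.glmnet", lambda = l, s = l)
    rr = resample(task, learner, resampling)
    unname(rr$aggregate(msr("classif.ce")))
  }, numeric(1))

  print(data.frame(lambda = lambdas, classif.ce = errors))
}
```

Reusing the same instantiated resampling for every candidate keeps the comparison fair, since all values are evaluated on identical folds.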
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.glmnet")
lrn("classif.glmnet")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
Id | Type | Default | Levels | Range |
alpha | numeric | 1 | | |
big | numeric | 9.9e+35 | | |
devmax | numeric | 0.999 | | |
dfmax | integer | - | | |
eps | numeric | 1e-06 | | |
epsnr | numeric | 1e-08 | | |
exact | logical | FALSE | TRUE, FALSE | - |
exclude | integer | - | | |
exmx | numeric | 250 | | |
fdev | numeric | 1e-05 | | |
gamma | numeric | 1 | | |
intercept | logical | TRUE | TRUE, FALSE | - |
lambda | untyped | - | - | |
lambda.min.ratio | numeric | - | | |
lower.limits | untyped | - | - | |
maxit | integer | 100000 | | |
mnlam | integer | 5 | | |
mxit | integer | 100 | | |
mxitnr | integer | 25 | | |
nlambda | integer | 100 | | |
newoffset | untyped | - | - | |
offset | untyped | NULL | - | |
penalty.factor | untyped | - | - | |
pmax | integer | - | | |
pmin | numeric | 1e-09 | | |
prec | numeric | 1e-10 | | |
relax | logical | FALSE | TRUE, FALSE | - |
s | numeric | 0.01 | | |
standardize | logical | TRUE | TRUE, FALSE | - |
standardize.response | logical | FALSE | TRUE, FALSE | - |
thresh | numeric | 1e-07 | | |
trace.it | integer | 0 | | |
type.gaussian | character | - | covariance, naive | - |
type.logistic | character | - | Newton, modified.Newton | - |
type.multinomial | character | - | ungrouped, grouped | - |
upper.limits | untyped | - | - | |
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to model fitting to comply with the stats::glm() convention that the negative class is provided as the first factor level.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGlmnet
new()
Creates a new instance of this R6 class.
LearnerClassifGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet() with type set to "nonzero".
LearnerClassifGlmnet$selected_features(lambda = NULL)
lambda
(numeric(1)) Custom lambda, defaults to the active lambda depending on parameter set.
Returns (character()) of feature names.
clone()
The objects of this class are cloneable with this method.
LearnerClassifGlmnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("glmnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.glmnet")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
k-Nearest-Neighbor classification.
Calls kknn::kknn() from package kknn.
store_model: See note.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.kknn")
lrn("classif.kknn")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, kknn
Id | Type | Default | Levels | Range |
k | integer | 7 | | |
distance | numeric | 2 | | |
kernel | character | optimal | rectangular, triangular, epanechnikov, biweight, triweight, cos, inv, gaussian, rank, optimal | - |
scale | logical | TRUE | TRUE, FALSE | - |
ykernel | untyped | NULL | - | |
store_model | logical | FALSE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifKKNN
new()
Creates a new instance of this R6 class.
LearnerClassifKKNN$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifKKNN$clone(deep = FALSE)
deep
Whether to make a deep clone.
There is no training step for k-NN models; the training data are merely stored and processed during the predict step.
Therefore, $model returns a list with the following elements:
formula: Formula for calling kknn::kknn() during $predict().
data: Training data for calling kknn::kknn() during $predict().
pv: Training parameters for calling kknn::kknn() during $predict().
kknn: Model as returned by kknn::kknn(), only available after $predict() has been called. This is not stored by default; you must set hyperparameter store_model to TRUE.
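The behaviour described in this note can be checked with a small sketch. The value k = 5 and the sonar task are illustrative assumptions; the element names inspected are the ones listed above.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("kknn", quietly = TRUE)) {
  task = tsk("sonar")
  ids = partition(task)

  # Ask the learner to keep the kknn fit produced at predict time
  learner = lrn("classif.kknn", k = 5, store_model = TRUE)
  learner$train(task, row_ids = ids$train)

  # After training, only formula, data and parameters are available
  print(names(learner$model))

  # The actual kknn::kknn() call happens here
  prediction = learner$predict(task, row_ids = ids$test)

  # With store_model = TRUE, the fitted object is now part of $model
  print(class(learner$model$kknn))
}
```

Keeping the fit increases the memory footprint of the learner, which is presumably why it is opt-in.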
Hechenbichler, Klaus, Schliep, Klaus (2004). “Weighted k-nearest-neighbor techniques and ordinal classification.” Technical Report Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich. doi:10.5282/ubm/epub.1769.
Samworth, J R (2012). “Optimal weighted nearest neighbour classifiers.” The Annals of Statistics, 40(5), 2733–2763. doi:10.1214/12-AOS1049.
Cover, Thomas, Hart, Peter (1967). “Nearest neighbor pattern classification.” IEEE transactions on information theory, 13(1), 21–27. doi:10.1109/TIT.1967.1053964.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("kknn", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.kknn")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Linear discriminant analysis.
Calls MASS::lda() from package MASS.
Parameters method and prior exist for training and prediction but accept different values for each. Therefore, arguments for the predict stage have been renamed to predict.method and predict.prior, respectively.
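As a sketch of the renamed arguments, the following sets a training-stage method and a predict-stage predict.method on the same learner. The particular combination ("mle" with "predictive") and the iris task are illustrative assumptions; both values come from the level sets in the parameter table of this learner.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("MASS", quietly = TRUE)) {
  task = tsk("iris")

  learner = lrn("classif.lda",
    method = "mle",                 # passed to MASS::lda() during training
    predict.method = "predictive",  # passed as `method` at predict time
    predict_type = "prob")

  learner$train(task)
  prediction = learner$predict(task)

  # Posterior class probabilities, one row per observation
  print(head(prediction$prob))
}
```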
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.lda")
lrn("classif.lda")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, MASS
Id | Type | Default | Levels | Range |
dimen | untyped | - | - | |
method | character | moment | moment, mle, mve, t | - |
nu | integer | - | | |
predict.method | character | plug-in | plug-in, predictive, debiased | - |
predict.prior | untyped | - | - | |
prior | untyped | - | - | |
tol | numeric | - | | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifLDA
new()
Creates a new instance of this R6 class.
LearnerClassifLDA$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifLDA$clone(deep = FALSE)
deep
Whether to make a deep clone.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, http://www.stats.ox.ac.uk/pub/MASS4/.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("MASS", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.lda")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Classification via logistic regression.
Calls stats::glm() with family set to "binomial".
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to model fitting to comply with the stats::glm() convention that the negative class is provided as the first factor level.
model:
Actual default: TRUE.
Adjusted default: FALSE.
Reason for change: Save some memory.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.log_reg")
lrn("classif.log_reg")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, stats
Id | Type | Default | Levels | Range |
dispersion | untyped | NULL | - | |
epsilon | numeric | 1e-08 | | |
etastart | untyped | - | - | |
maxit | numeric | 25 | | |
model | logical | TRUE | TRUE, FALSE | - |
mustart | untyped | - | - | |
offset | untyped | - | - | |
singular.ok | logical | TRUE | TRUE, FALSE | - |
start | untyped | NULL | - | |
trace | logical | FALSE | TRUE, FALSE | - |
x | logical | FALSE | TRUE, FALSE | - |
y | logical | TRUE | TRUE, FALSE | - |
To ensure reproducibility, this learner always uses the default contrasts:
contr.treatment() for unordered factors, and
contr.poly() for ordered factors.
Setting the option "contrasts" does not have any effect.
Instead, set the respective hyperparameter or use mlr3pipelines to create dummy features.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifLogReg
new()
Creates a new instance of this R6 class.
LearnerClassifLogReg$new()
loglik()
Extract the log-likelihood (e.g., via stats::logLik()) from the fitted model.
LearnerClassifLogReg$loglik()
clone()
The objects of this class are cloneable with this method.
LearnerClassifLogReg$clone(deep = FALSE)
deep
Whether to make a deep clone.
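A minimal sketch of the loglik() method listed above. Restricting the sonar task to three features (V1, V11, V12) is an assumption made only to keep the logistic regression small and well-behaved; it is not part of the original example.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
# Illustrative subset of features to keep the model small
task$select(c("V1", "V11", "V12"))

learner = lrn("classif.log_reg")
learner$train(task)

# Log-likelihood of the fitted glm; the method requires a trained model
ll = learner$loglik()
print(ll)

# The returned object can be passed on to other stats helpers, e.g. AIC()
print(AIC(ll))
```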
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("stats", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.log_reg")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Multinomial log-linear models via neural networks.
Calls nnet::multinom() from package nnet.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.multinom")
lrn("classif.multinom")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3learners, nnet
Id | Type | Default | Levels | Range |
Hess | logical | FALSE | TRUE, FALSE | - |
abstol | numeric | 1e-04 | | |
censored | logical | FALSE | TRUE, FALSE | - |
decay | numeric | 0 | | |
entropy | logical | FALSE | TRUE, FALSE | - |
mask | untyped | - | - | |
maxit | integer | 100 | | |
MaxNWts | integer | 1000 | | |
model | logical | FALSE | TRUE, FALSE | - |
linout | logical | FALSE | TRUE, FALSE | - |
rang | numeric | 0.7 | | |
reltol | numeric | 1e-08 | | |
size | integer | - | | |
skip | logical | FALSE | TRUE, FALSE | - |
softmax | logical | FALSE | TRUE, FALSE | - |
summ | character | 0 | 0, 1, 2, 3 | - |
trace | logical | TRUE | TRUE, FALSE | - |
Wts | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifMultinom
new()
Creates a new instance of this R6 class.
LearnerClassifMultinom$new()
loglik()
Extract the log-likelihood (e.g., via stats::logLik()) from the fitted model.
LearnerClassifMultinom$loglik()
clone()
The objects of this class are cloneable with this method.
LearnerClassifMultinom$clone(deep = FALSE)
deep
Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("nnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.multinom")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Naive Bayes classification.
Calls e1071::naiveBayes() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.naive_bayes")
lrn("classif.naive_bayes")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3learners, e1071
Id | Type | Default | Range |
eps | numeric | 0 | |
laplace | numeric | 0 | |
threshold | numeric | 0.001 | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifNaiveBayes
new()
Creates a new instance of this R6 class.
LearnerClassifNaiveBayes$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifNaiveBayes$clone(deep = FALSE)
deep
Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
library(mlr3)
library(mlr3learners)

if (requireNamespace("e1071", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.naive_bayes")
  print(learner)

  # Define a Task
  task = tsk("sonar")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Print feature importance, if the learner supports it
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}
Single Layer Neural Network.
Calls nnet::nnet.formula() from package nnet.
Note that modern neural networks with multiple layers are connected via package mlr3torch.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.nnet")
lrn("classif.nnet")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, nnet
Id | Type | Default | Levels | Range |
Hess | logical | FALSE | TRUE, FALSE | - |
MaxNWts | integer | 1000 | | |
Wts | untyped | - | - | |
abstol | numeric | 1e-04 | | |
censored | logical | FALSE | TRUE, FALSE | - |
contrasts | untyped | NULL | - | |
decay | numeric | 0 | | |
mask | untyped | - | - | |
maxit | integer | 100 | | |
na.action | untyped | - | - | |
rang | numeric | 0.7 | | |
reltol | numeric | 1e-08 | | |
size | integer | 3 | | |
skip | logical | FALSE | TRUE, FALSE | - |
subset | untyped | - | - | |
trace | logical | TRUE | TRUE, FALSE | - |
formula | untyped | - | - | |
size: Adjusted default: 3L. Reason for change: no default in nnet().
formula: if not provided, the formula is set to task$formula().
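As a minimal sketch of the two notes above, the adjusted size default can be inspected and overridden at construction. The values size = 5L and maxit = 200L are arbitrary illustrative choices, not recommendations from the package.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("nnet", quietly = TRUE)) {
  # mlr3learners sets size = 3L because nnet() itself has no default
  default_size = lrn("classif.nnet")$param_set$values$size

  # Override the adjusted default; trace = FALSE silences the optimizer log
  learner = lrn("classif.nnet", size = 5L, maxit = 200L, trace = FALSE)

  # No formula is supplied, so the learner falls back to task$formula()
  task = tsk("sonar")
  learner$train(task)
  print(learner$model)
}
```

With 60 features and 5 hidden units the network stays well below the MaxNWts limit of 1000 weights listed in the parameter table.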
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifNnet
new()
Creates a new instance of this R6 class.
LearnerClassifNnet$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifNnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
Ripley BD (1996). Pattern Recognition and Neural Networks. Cambridge University Press. doi:10.1017/cbo9780511812651.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("nnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.nnet")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Quadratic discriminant analysis.
Calls MASS::qda() from package MASS.
Parameters method and prior exist for training and prediction but accept different values for each. Therefore, arguments for the predict stage have been renamed to predict.method and predict.prior, respectively.
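The renaming described above can be sketched as follows. The choice of the iris task and of the values "mle" and "debiased" is only for illustration; both values appear in the parameter table of this learner.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("MASS", quietly = TRUE)) {
  # method is used when fitting with MASS::qda();
  # predict.method is the renamed method argument of the predict stage
  learner = lrn("classif.qda", method = "mle", predict.method = "debiased")

  task = tsk("iris")
  learner$train(task)
  prediction = learner$predict(task)

  # Training-set accuracy, shown only to confirm the pipeline runs
  print(prediction$score(msr("classif.acc")))
}
```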
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.qda")
lrn("classif.qda")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, MASS
Id | Type | Default | Levels | Range |
method | character | moment | moment, mle, mve, t | - |
nu | integer | - | | |
predict.method | character | plug-in | plug-in, predictive, debiased | - |
predict.prior | untyped | - | - | |
prior | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifQDA
new()
Creates a new instance of this R6 class.
LearnerClassifQDA$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifQDA$clone(deep = FALSE)
deep
Whether to make a deep clone.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, http://www.stats.ox.ac.uk/pub/MASS4/.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("MASS", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.qda")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Random classification forest.
Calls ranger::ranger() from package ranger.
mtry: This hyperparameter can alternatively be set via our hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
num.threads: Actual default: NULL, triggering auto-detection of the number of CPUs. Adjusted value: 1. Reason for change: Conflicting with parallelization via future.
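To make the mtry.ratio conversion concrete, here is a small sketch that evaluates the formula by hand for the sonar task. The ratio 0.25 is an arbitrary illustrative value.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
n_features = length(task$feature_names)  # the sonar task has 60 features

# Formula from the text above, evaluated by hand
mtry.ratio = 0.25
mtry = max(ceiling(mtry.ratio * n_features), 1)
print(mtry)

# Set the ratio on the learner; mtry itself then has to stay unset
learner = lrn("classif.ranger", mtry.ratio = mtry.ratio)
```

Using the ratio rather than an absolute mtry keeps a configuration meaningful when the number of features changes, for example after a feature selection step.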
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.ranger")
lrn("classif.ranger")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
Id | Type | Default | Levels | Range |
alpha | numeric | 0.5 | | |
always.split.variables | untyped | - | - | |
class.weights | untyped | NULL | - | |
holdout | logical | FALSE | TRUE, FALSE | - |
importance | character | - | none, impurity, impurity_corrected, permutation | - |
keep.inbag | logical | FALSE | TRUE, FALSE | - |
max.depth | integer | NULL | | |
min.bucket | integer | 1 | | |
min.node.size | integer | NULL | | |
minprop | numeric | 0.1 | | |
mtry | integer | - | | |
mtry.ratio | numeric | - | | |
num.random.splits | integer | 1 | | |
node.stats | logical | FALSE | TRUE, FALSE | - |
num.threads | integer | 1 | | |
num.trees | integer | 500 | | |
oob.error | logical | TRUE | TRUE, FALSE | - |
regularization.factor | untyped | 1 | - | |
regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
replace | logical | TRUE | TRUE, FALSE | - |
respect.unordered.factors | character | ignore | ignore, order, partition | - |
sample.fraction | numeric | - | | |
save.memory | logical | FALSE | TRUE, FALSE | - |
scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
se.method | character | infjack | jack, infjack | - |
seed | integer | NULL | | |
split.select.weights | untyped | NULL | - | |
splitrule | character | gini | gini, extratrees, hellinger | - |
verbose | logical | TRUE | TRUE, FALSE | - |
write.forest | logical | TRUE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRanger
new()
Creates a new instance of this R6 class.
LearnerClassifRanger$new()
importance()
The importance scores are extracted from the model slot variable.importance. Parameter importance must be set to "impurity", "impurity_corrected", or "permutation".
LearnerClassifRanger$importance()
Named numeric().
oob_error()
The out-of-bag error, extracted from model slot prediction.error.
LearnerClassifRanger$oob_error()
numeric(1).
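A short sketch of both accessors follows. It assumes the importance hyperparameter from the parameter table is set at construction, here to "impurity"; num.trees = 100L is only chosen to keep the run short.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("ranger", quietly = TRUE)) {
  # Without an importance mode, no importance scores are stored in the model
  learner = lrn("classif.ranger", importance = "impurity", num.trees = 100L)

  task = tsk("sonar")
  learner$train(task)

  # Named numeric vector with one score per feature
  imp = learner$importance()
  print(head(imp, 5))

  # Out-of-bag error taken from the fitted model
  oob = learner$oob_error()
  print(oob)
}
```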
clone()
The objects of this class are cloneable with this method.
LearnerClassifRanger$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wright, Marvin N., Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("ranger", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.ranger")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Support vector machine for classification.
Calls e1071::svm() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.svm")
lrn("classif.svm")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, e1071
Id | Type | Default | Levels | Range |
cachesize | numeric | 40 | | |
class.weights | untyped | NULL | - | |
coef0 | numeric | 0 | | |
cost | numeric | 1 | | |
cross | integer | 0 | | |
decision.values | logical | FALSE | TRUE, FALSE | - |
degree | integer | 3 | | |
epsilon | numeric | 0.1 | | |
fitted | logical | TRUE | TRUE, FALSE | - |
gamma | numeric | - | | |
kernel | character | radial | linear, polynomial, radial, sigmoid | - |
nu | numeric | 0.5 | | |
scale | untyped | TRUE | - | |
shrinking | logical | TRUE | TRUE, FALSE | - |
tolerance | numeric | 0.001 | | |
type | character | C-classification | C-classification, nu-classification | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifSVM
new()
Creates a new instance of this R6 class.
LearnerClassifSVM$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifSVM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Cortes, Corinna, Vapnik, Vladimir (1995). “Support-vector networks.” Machine Learning, 20(3), 273–297. doi:10.1007/BF00994018.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("e1071", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.svm")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
eXtreme Gradient Boosting classification.
Calls xgboost::xgb.train() from package xgboost.
If not specified otherwise, the evaluation metric is set to the default "logloss" for binary classification problems and set to "mlogloss" for multiclass problems. This was necessary to silence a deprecation warning.
Note that using the watchlist parameter directly will lead to problems when wrapping this mlr3::Learner in a mlr3pipelines GraphLearner as the preprocessing steps will not be applied to the data in the watchlist. See the section Early Stopping and Validation on how to do this.
nrounds: Actual default: no default. Adjusted default: 1000. Reason for change: Without a default, construction of the learner would error. The lightgbm learner has a default of 1000, so we use the same here.
nthread: Actual value: Undefined, triggering auto-detection of the number of CPUs. Adjusted value: 1. Reason for change: Conflicting with parallelization via future.
verbose: Actual default: 1. Adjusted default: 0. Reason for change: Reduce verbosity.
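The adjusted values listed above can be inspected on a freshly constructed learner and overridden like any other hyperparameter. This is a minimal sketch; the replacement values (50 rounds, 2 threads) are arbitrary and only illustrate the mechanism.

```r
library(mlr3)
library(mlr3learners)

# Values that are set at construction (expected to include the
# adjusted nrounds, nthread and verbose described above)
learner = lrn("classif.xgboost")
print(learner$param_set$values[c("nrounds", "nthread", "verbose")])

# Override them explicitly, e.g. fewer rounds and two threads
learner = lrn("classif.xgboost", nrounds = 50L, nthread = 2L, verbose = 1L)
print(learner$param_set$values[c("nrounds", "nthread", "verbose")])
```

Raising nthread is only sensible when the surrounding resampling or tuning is not already parallelized via future.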
In order to monitor the validation performance during the training, you can set the $validate field of the Learner.
For information on how to configure the validation set, see the Validation section of mlr3::Learner.
This validation data can also be used for early stopping, which can be enabled by setting the early_stopping_rounds parameter.
The final (or in the case of early stopping best) validation scores can be accessed via $internal_valid_scores, and the optimal nrounds via $internal_tuned_values.
The internal validation measure can be set via the eval_metric parameter that can be a mlr3::Measure, a function, or a character string for the internal xgboost measures.
Using an mlr3::Measure is slower than the internal xgboost measures, but allows using the same measure for tuning and validation.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.xgboost")
lrn("classif.xgboost")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, xgboost
Id | Type | Default | Levels | Range |
alpha | numeric | 0 | | |
approxcontrib | logical | FALSE | TRUE, FALSE | - |
base_score | numeric | 0.5 | | |
booster | character | gbtree | gbtree, gblinear, dart | - |
callbacks | untyped | list() | - | |
colsample_bylevel | numeric | 1 | | |
colsample_bynode | numeric | 1 | | |
colsample_bytree | numeric | 1 | | |
device | untyped | "cpu" | - | |
disable_default_eval_metric | logical | FALSE | TRUE, FALSE | - |
early_stopping_rounds | integer | NULL | | |
eta | numeric | 0.3 | | |
eval_metric | untyped | - | - | |
feature_selector | character | cyclic | cyclic, shuffle, random, greedy, thrifty | - |
gamma | numeric | 0 | | |
grow_policy | character | depthwise | depthwise, lossguide | - |
interaction_constraints | untyped | - | - | |
iterationrange | untyped | - | - | |
lambda | numeric | 1 | | |
lambda_bias | numeric | 0 | | |
max_bin | integer | 256 | | |
max_delta_step | numeric | 0 | | |
max_depth | integer | 6 | | |
max_leaves | integer | 0 | | |
maximize | logical | NULL | TRUE, FALSE | - |
min_child_weight | numeric | 1 | | |
missing | numeric | NA | | |
monotone_constraints | untyped | 0 | - | |
nrounds | integer | - | | |
normalize_type | character | tree | tree, forest | - |
nthread | integer | 1 | | |
ntreelimit | integer | NULL | | |
num_parallel_tree | integer | 1 | | |
objective | untyped | "binary:logistic" | - | |
one_drop | logical | FALSE | TRUE, FALSE | - |
outputmargin | logical | FALSE | TRUE, FALSE | - |
predcontrib | logical | FALSE | TRUE, FALSE | - |
predinteraction | logical | FALSE | TRUE, FALSE | - |
predleaf | logical | FALSE | TRUE, FALSE | - |
print_every_n | integer | 1 | | |
process_type | character | default | default, update | - |
rate_drop | numeric | 0 | | |
refresh_leaf | logical | TRUE | TRUE, FALSE | - |
reshape | logical | FALSE | TRUE, FALSE | - |
seed_per_iteration | logical | FALSE | TRUE, FALSE | - |
sampling_method | character | uniform | uniform, gradient_based | - |
sample_type | character | uniform | uniform, weighted | - |
save_name | untyped | NULL | - | |
save_period | integer | NULL | | |
scale_pos_weight | numeric | 1 | | |
skip_drop | numeric | 0 | | |
strict_shape | logical | FALSE | TRUE, FALSE | - |
subsample | numeric | 1 | | |
top_k | integer | 0 | | |
training | logical | FALSE | TRUE, FALSE | - |
tree_method | character | auto | auto, exact, approx, hist, gpu_hist | - |
tweedie_variance_power | numeric | 1.5 | | |
updater | untyped | - | - | |
verbose | integer | 1 | | |
watchlist | untyped | NULL | - | |
xgb_model | untyped | NULL | - | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifXgboost
internal_valid_scores
(named list() or NULL) The validation scores extracted from model$evaluation_log. If early stopping is activated, this contains the validation scores of the model for the optimal nrounds, otherwise for the nrounds of the final model.
internal_tuned_values
(named list() or NULL) If early stopping is activated, this returns a list with nrounds, which is extracted from $best_iteration of the model, and otherwise NULL.
validate
(numeric(1) or character(1) or NULL) How to construct the internal validation data. This parameter can be either NULL, a ratio, "test", or "predefined".
new()
Creates a new instance of this R6 class.
LearnerClassifXgboost$new()
importance()
The importance scores are calculated with xgboost::xgb.importance().
LearnerClassifXgboost$importance()
Named numeric().
clone()
The objects of this class are cloneable with this method.
LearnerClassifXgboost$clone(deep = FALSE)
deep
Whether to make a deep clone.
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi:10.1145/2939672.2939785.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
## Not run:
if (requireNamespace("xgboost", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.xgboost")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
## End(Not run)
## Not run:
# Train learner with early stopping on spam data set
task = tsk("spam")
# use 30 percent for validation
# Set early stopping parameter
learner = lrn("classif.xgboost",
  nrounds = 100,
  early_stopping_rounds = 10,
  validate = 0.3
)
# Train learner with early stopping
learner$train(task)
# Inspect optimal nrounds and validation performance
learner$internal_tuned_values
learner$internal_valid_scores
## End(Not run)
Generalized linear models with elastic net regularization.
Calls glmnet::cv.glmnet() from package glmnet.
The default for hyperparameter family is set to "gaussian".
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.cv_glmnet")
lrn("regr.cv_glmnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
Id | Type | Default | Levels | Range |
alignment | character | lambda | lambda, fraction | - |
alpha | numeric | 1 | | |
big | numeric | 9.9e+35 | | |
devmax | numeric | 0.999 | | |
dfmax | integer | - | | |
eps | numeric | 1e-06 | | |
epsnr | numeric | 1e-08 | | |
exclude | integer | - | | |
exmx | numeric | 250 | | |
family | character | gaussian | gaussian, poisson | - |
fdev | numeric | 1e-05 | | |
foldid | untyped | NULL | - | |
gamma | untyped | - | - | |
grouped | logical | TRUE | TRUE, FALSE | - |
intercept | logical | TRUE | TRUE, FALSE | - |
keep | logical | FALSE | TRUE, FALSE | - |
lambda | untyped | - | - | |
lambda.min.ratio | numeric | - | | |
lower.limits | untyped | - | - | |
maxit | integer | 100000 | | |
mnlam | integer | 5 | | |
mxit | integer | 100 | | |
mxitnr | integer | 25 | | |
nfolds | integer | 10 | | |
nlambda | integer | 100 | | |
offset | untyped | NULL | - | |
parallel | logical | FALSE | TRUE, FALSE | - |
penalty.factor | untyped | - | - | |
pmax | integer | - | | |
pmin | numeric | 1e-09 | | |
prec | numeric | 1e-10 | | |
predict.gamma | numeric | gamma.1se | | |
relax | logical | FALSE | TRUE, FALSE | - |
s | numeric | lambda.1se | | |
standardize | logical | TRUE | TRUE, FALSE | - |
standardize.response | logical | FALSE | TRUE, FALSE | - |
thresh | numeric | 1e-07 | | |
trace.it | integer | 0 | | |
type.gaussian | character | - | covariance, naive | - |
type.logistic | character | - | Newton, modified.Newton | - |
type.measure | character | deviance | deviance, class, auc, mse, mae | - |
type.multinomial | character | - | ungrouped, grouped | - |
upper.limits | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrCVGlmnet
new()
Creates a new instance of this R6 class.
LearnerRegrCVGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet() with type set to "nonzero".
LearnerRegrCVGlmnet$selected_features(lambda = NULL)
lambda
(numeric(1)) Custom lambda, defaults to the active lambda depending on parameter set.
(character()) of feature names.
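The following sketch shows how selected_features() might be called after training, once with the active lambda and once with a custom one. The value lambda = 0.5 is an arbitrary illustration, and the returned set can be empty when the penalty shrinks every coefficient to zero.

```r
library(mlr3)
library(mlr3learners)

if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(1)
  learner = lrn("regr.cv_glmnet")
  task = tsk("mtcars")
  learner$train(task)

  # Features with non-zero coefficients at the active lambda
  selected = learner$selected_features()
  print(selected)

  # Same query for a custom lambda (illustrative value)
  selected_custom = learner$selected_features(lambda = 0.5)
  print(selected_custom)
}
```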
clone()
The objects of this class are cloneable with this method.
LearnerRegrCVGlmnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("glmnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.cv_glmnet")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Generalized linear models with elastic net regularization. Calls glmnet::glmnet() from package glmnet.
The default for hyperparameter family is set to "gaussian".
Caution: This learner differs from learners calling glmnet::cv.glmnet() in that it does not use the internal optimization of parameter lambda. Instead, lambda needs to be tuned by the user (e.g., via mlr3tuning). When lambda is tuned, the glmnet model is retrained in each tuning iteration. Fitting the whole path of lambdas, as glmnet::glmnet() does by default, would be more efficient, but tuning/selecting the parameter at prediction time (using parameter s) is currently not supported in mlr3 (at least not in an efficient manner). Tuning the s parameter is, therefore, currently discouraged.
When the data are i.i.d. and efficiency is key, we recommend using the respective auto-tuning counterparts in mlr_learners_classif.cv_glmnet or mlr_learners_regr.cv_glmnet. However, in some situations this is not applicable, usually when data are imbalanced or not i.i.d. (longitudinal, time-series) and tuning requires custom resampling strategies (blocked design, stratification).
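As a rough sketch of tuning lambda by hand, the snippet below runs a random search with mlr3tuning over a log-scaled search space. The search range, the 3-fold cross-validation, and the budget of 10 evaluations are illustrative assumptions rather than recommendations from the package, and the tune() call assumes a recent mlr3tuning release. For non-i.i.d. data, the resampling would be swapped for a custom strategy.

```r
if (requireNamespace("glmnet", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE) &&
    requireNamespace("mlr3tuning", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)
  library(mlr3tuning)

  task = tsk("mtcars")
  learner = lrn("regr.glmnet")

  # Assumed search range for lambda, sampled on the log scale
  search_space = paradox::ps(
    lambda = paradox::p_dbl(lower = 1e-4, upper = 1, logscale = TRUE)
  )

  # Every evaluated lambda triggers a refit of glmnet on each fold
  instance = tune(
    tuner = tnr("random_search"),
    task = task,
    learner = learner,
    resampling = rsmp("cv", folds = 3),
    measures = msr("regr.rmse"),
    term_evals = 10,
    search_space = search_space
  )

  # Best lambda, back-transformed to the original scale
  best_lambda = instance$result_learner_param_vals$lambda
  print(best_lambda)
}
```

Because lambda is listed as an untyped hyperparameter, the sketch passes an explicit search space instead of attaching a tuning range to the learner itself.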
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.glmnet")
lrn("regr.glmnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
Id | Type | Default | Levels | Range |
alignment | character | lambda | lambda, fraction | - |
alpha | numeric | 1 | | |
big | numeric | 9.9e+35 | | |
devmax | numeric | 0.999 | | |
dfmax | integer | - | | |
eps | numeric | 1e-06 | | |
epsnr | numeric | 1e-08 | | |
exact | logical | FALSE | TRUE, FALSE | - |
exclude | integer | - | | |
exmx | numeric | 250 | | |
family | character | gaussian | gaussian, poisson | - |
fdev | numeric | 1e-05 | | |
gamma | numeric | 1 | | |
grouped | logical | TRUE | TRUE, FALSE | - |
intercept | logical | TRUE | TRUE, FALSE | - |
keep | logical | FALSE | TRUE, FALSE | - |
lambda | untyped | - | - | |
lambda.min.ratio | numeric | - | | |
lower.limits | untyped | - | - | |
maxit | integer | 100000 | | |
mnlam | integer | 5 | | |
mxit | integer | 100 | | |
mxitnr | integer | 25 | | |
newoffset | untyped | - | - | |
nlambda | integer | 100 | | |
offset | untyped | NULL | - | |
parallel | logical | FALSE | TRUE, FALSE | - |
penalty.factor | untyped | - | - | |
pmax | integer | - | | |
pmin | numeric | 1e-09 | | |
prec | numeric | 1e-10 | | |
relax | logical | FALSE | TRUE, FALSE | - |
s | numeric | 0.01 | | |
standardize | logical | TRUE | TRUE, FALSE | - |
standardize.response | logical | FALSE | TRUE, FALSE | - |
thresh | numeric | 1e-07 | | |
trace.it | integer | 0 | | |
type.gaussian | character | - | covariance, naive | - |
type.logistic | character | - | Newton, modified.Newton | - |
type.multinomial | character | - | ungrouped, grouped | - |
upper.limits | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrGlmnet
new()
Creates a new instance of this R6 class.
LearnerRegrGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet() with type set to "nonzero".
LearnerRegrGlmnet$selected_features(lambda = NULL)
lambda
(numeric(1)) Custom lambda, defaults to the active lambda depending on parameter set.
(character()) of feature names.
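A minimal sketch of how selected_features() might be used after training with a single, fixed penalty. The value lambda = 0.5 is an arbitrary illustration, not a tuned or recommended setting.

```r
if (requireNamespace("glmnet", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")

  # lambda = 0.5 is an arbitrary illustrative penalty
  learner = lrn("regr.glmnet", lambda = 0.5)
  learner$train(task)

  # Names of the features with non-zero coefficients at the active lambda
  selected = learner$selected_features()
  print(selected)
}
```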
clone()
The objects of this class are cloneable with this method.
LearnerRegrGlmnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("glmnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.glmnet")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
k-Nearest-Neighbor regression. Calls kknn::kknn() from package kknn.
store_model: See note.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.kknn")
lrn("regr.kknn")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, kknn
Id | Type | Default | Levels | Range |
k | integer | 7 | | |
distance | numeric | 2 | | |
kernel | character | optimal | rectangular, triangular, epanechnikov, biweight, triweight, cos, inv, gaussian, rank, optimal | - |
scale | logical | TRUE | TRUE, FALSE | - |
ykernel | untyped | NULL | - | |
store_model | logical | FALSE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrKKNN
new()
Creates a new instance of this R6 class.
LearnerRegrKKNN$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrKKNN$clone(deep = FALSE)
deep
Whether to make a deep clone.
There is no training step for k-NN models; the training data are only stored and then processed during the predict step. Therefore, $model returns a list with the following elements:
formula: Formula for calling kknn::kknn() during $predict().
data: Training data for calling kknn::kknn() during $predict().
pv: Training parameters for calling kknn::kknn() during $predict().
kknn: Model as returned by kknn::kknn(), only available after $predict() has been called. This is not stored by default; you must set hyperparameter store_model to TRUE.
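The behaviour described in this note can be checked with a short sketch like the one below. The choice of k = 5 and of predicting on the training task are illustrative assumptions only.

```r
if (requireNamespace("kknn", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")

  # store_model = TRUE asks the learner to keep the kknn::kknn() result
  learner = lrn("regr.kknn", k = 5, store_model = TRUE)
  learner$train(task)

  # After $train(), only the ingredients for the later kknn::kknn() call exist
  model_elements = names(learner$model)
  has_fit_after_train = !is.null(learner$model$kknn)

  # The kknn object is created (and stored) during $predict()
  prediction = learner$predict(task)
  has_fit_after_predict = !is.null(learner$model$kknn)

  print(c(after_train = has_fit_after_train, after_predict = has_fit_after_predict))
}
```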
Hechenbichler, Klaus, Schliep, Klaus (2004). “Weighted k-nearest-neighbor techniques and ordinal classification.” Technical Report Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich. doi:10.5282/ubm/epub.1769.
Samworth, J R (2012). “Optimal weighted nearest neighbour classifiers.” The Annals of Statistics, 40(5), 2733–2763. doi:10.1214/12-AOS1049.
Cover, Thomas, Hart, Peter (1967). “Nearest neighbor pattern classification.” IEEE transactions on information theory, 13(1), 21–27. doi:10.1109/TIT.1967.1053964.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("kknn", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.kknn")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Kriging regression. Calls DiceKriging::km() from package DiceKriging.
The predict type hyperparameter type defaults to "SK" (simple kriging).
The additional hyperparameter nugget.stability is used to overwrite the hyperparameter nugget with nugget.stability * var(y) before training to improve the numerical stability. We recommend a value of 1e-8.
The additional hyperparameter jitter can be set to add N(0, [jitter])-distributed noise to the data before prediction to avoid perfect interpolation. We recommend a value of 1e-12.
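A small sketch of the two stability hyperparameters in use, with the values recommended above. The control = list(trace = FALSE) setting is an assumption added here only to silence the optimizer output, and predicting on the training rows is chosen to show the case where perfect interpolation would otherwise occur.

```r
if (requireNamespace("DiceKriging", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")

  learner = lrn("regr.km",
    nugget.stability = 1e-8,        # nugget becomes nugget.stability * var(y)
    jitter = 1e-12,                 # tiny noise added to the data before prediction
    control = list(trace = FALSE),  # assumption: only suppresses optimizer output
    predict_type = "se"
  )

  # The nugget implied by the rule above, computed by hand for comparison
  implied_nugget = 1e-8 * var(task$truth())

  learner$train(task)
  prediction = learner$predict(task)
  print(head(prediction$se))
}
```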
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.km")
lrn("regr.km")
Task type: “regr”
Predict Types: “response”, “se”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, DiceKriging
Id | Type | Default | Levels | Range |
bias.correct | logical | FALSE | TRUE, FALSE | - |
checkNames | logical | TRUE | TRUE, FALSE | - |
coef.cov | untyped | NULL | - | |
coef.trend | untyped | NULL | - | |
coef.var | untyped | NULL | - | |
control | untyped | NULL | - | |
cov.compute | logical | TRUE | TRUE, FALSE | - |
covtype | character | matern5_2 | gauss, matern5_2, matern3_2, exp, powexp | - |
estim.method | character | MLE | MLE, LOO | - |
gr | logical | TRUE | TRUE, FALSE | - |
iso | logical | FALSE | TRUE, FALSE | - |
jitter | numeric | 0 | | |
kernel | untyped | NULL | - | |
knots | untyped | NULL | - | |
light.return | logical | FALSE | TRUE, FALSE | - |
lower | untyped | NULL | - | |
multistart | integer | 1 | | |
noise.var | untyped | NULL | - | |
nugget | numeric | - | | |
nugget.estim | logical | FALSE | TRUE, FALSE | - |
nugget.stability | numeric | 0 | | |
optim.method | character | BFGS | BFGS, gen | - |
parinit | untyped | NULL | - | |
penalty | untyped | NULL | - | |
scaling | logical | FALSE | TRUE, FALSE | - |
se.compute | logical | TRUE | TRUE, FALSE | - |
type | character | SK | SK, UK | - |
upper | untyped | NULL | - | |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrKM
new()
Creates a new instance of this R6 class.
LearnerRegrKM$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrKM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Roustant O, Ginsbourger D, Deville Y (2012). “DiceKriging, DiceOptim: Two R Packages for the Analysis of Computer Experiments by Kriging-Based Metamodeling and Optimization.” Journal of Statistical Software, 51(1), 1–55. doi:10.18637/jss.v051.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("DiceKriging", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.km")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Ordinary linear regression. Calls stats::lm().
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.lm")
lrn("regr.lm")
Task type: “regr”
Predict Types: “response”, “se”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”
Required Packages: mlr3, mlr3learners, stats
Id | Type | Default | Levels | Range |
df | numeric | Inf | | |
interval | character | - | none, confidence, prediction | - |
level | numeric | 0.95 | | |
model | logical | TRUE | TRUE, FALSE | - |
offset | logical | - | TRUE, FALSE | - |
pred.var | untyped | - | - | |
qr | logical | TRUE | TRUE, FALSE | - |
scale | numeric | NULL | | |
singular.ok | logical | TRUE | TRUE, FALSE | - |
x | logical | FALSE | TRUE, FALSE | - |
y | logical | FALSE | TRUE, FALSE | - |
rankdeficient | character | - | warnif, simple, non-estim, NA, NAwarn | - |
tol | numeric | 1e-07 | | |
verbose | logical | FALSE | TRUE, FALSE | - |
To ensure reproducibility, this learner always uses the default contrasts: contr.treatment() for unordered factors, and contr.poly() for ordered factors.
Setting the option "contrasts" does not have any effect. Instead, set the respective hyperparameter or use mlr3pipelines to create dummy features.
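One way to control the factor encoding explicitly is sketched below with mlr3pipelines. The use of the iris data, the regression target Sepal.Length, and sum contrasts are illustrative assumptions; the "encode" PipeOp replaces factor columns by numeric dummy columns before they reach stats::lm().

```r
if (requireNamespace("mlr3pipelines", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)
  library(mlr3pipelines)

  # iris has one unordered factor (Species); the target is an arbitrary choice
  task = as_task_regr(iris, target = "Sepal.Length", id = "iris_regr")

  # Encode factors with sum contrasts, then fit the linear model
  graph_learner = as_learner(po("encode", method = "sum") %>>% lrn("regr.lm"))
  graph_learner$train(task)

  # Coefficient names of the fitted lm show the encoded dummy columns
  coef_names = names(coef(graph_learner$model$regr.lm$model))
  print(coef_names)
}
```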
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrLM
new()
Creates a new instance of this R6 class.
LearnerRegrLM$new()
loglik()
Extract the log-likelihood (e.g., via stats::logLik()) from the fitted model.
LearnerRegrLM$loglik()
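A brief sketch of calling loglik() on a trained learner and comparing it with stats::logLik() applied directly to the stored lm object. Training on the full mtcars task is an illustrative choice.

```r
if (requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")
  learner = lrn("regr.lm")
  learner$train(task)

  # Log-likelihood via the learner method
  ll = learner$loglik()

  # Same quantity computed directly on the underlying lm fit
  ll_direct = stats::logLik(learner$model)

  print(ll)
}
```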
clone()
The objects of this class are cloneable with this method.
LearnerRegrLM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("stats", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.lm")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Single Layer Neural Network. Calls nnet::nnet.formula() from package nnet.
Note that modern neural networks with multiple layers are available via package mlr3torch.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.nnet")
lrn("regr.nnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, nnet
Id | Type | Default | Levels | Range |
Hess | logical | FALSE | TRUE, FALSE | - |
MaxNWts | integer | 1000 | | |
Wts | untyped | - | - | |
abstol | numeric | 1e-04 | | |
censored | logical | FALSE | TRUE, FALSE | - |
contrasts | untyped | NULL | - | |
decay | numeric | 0 | | |
mask | untyped | - | - | |
maxit | integer | 100 | | |
na.action | untyped | - | - | |
rang | numeric | 0.7 | | |
reltol | numeric | 1e-08 | | |
size | integer | 3 | | |
skip | logical | FALSE | TRUE, FALSE | - |
subset | untyped | - | - | |
trace | logical | TRUE | TRUE, FALSE | - |
formula | untyped | - | - | |
size: Adjusted default: 3L. Reason for change: no default in nnet().
formula: if not provided, the formula is set to task$formula().
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrNnet
new()
Creates a new instance of this R6 class.
LearnerRegrNnet$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrNnet$clone(deep = FALSE)
deep
Whether to make a deep clone.
Ripley BD (1996). Pattern Recognition and Neural Networks. Cambridge University Press. doi:10.1017/cbo9780511812651.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.ranger, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("nnet", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.nnet")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Random regression forest. Calls ranger::ranger() from package ranger.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.ranger")
lrn("regr.ranger")
Task type: “regr”
Predict Types: “response”, “se”, “quantiles”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
Id | Type | Default | Levels | Range |
alpha | numeric | 0.5 | | |
always.split.variables | untyped | - | - | |
holdout | logical | FALSE | TRUE, FALSE | - |
importance | character | - | none, impurity, impurity_corrected, permutation | - |
keep.inbag | logical | FALSE | TRUE, FALSE | - |
max.depth | integer | NULL | | |
min.bucket | integer | 1 | | |
min.node.size | integer | 5 | | |
minprop | numeric | 0.1 | | |
mtry | integer | - | | |
mtry.ratio | numeric | - | | |
node.stats | logical | FALSE | TRUE, FALSE | - |
num.random.splits | integer | 1 | | |
num.threads | integer | 1 | | |
num.trees | integer | 500 | | |
oob.error | logical | TRUE | TRUE, FALSE | - |
regularization.factor | untyped | 1 | - | |
regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
replace | logical | TRUE | TRUE, FALSE | - |
respect.unordered.factors | character | ignore | ignore, order, partition | - |
sample.fraction | numeric | - | | |
save.memory | logical | FALSE | TRUE, FALSE | - |
scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
se.method | character | infjack | jack, infjack | - |
seed | integer | NULL | | |
split.select.weights | untyped | NULL | - | |
splitrule | character | variance | variance, extratrees, maxstat | - |
verbose | logical | TRUE | TRUE, FALSE | - |
write.forest | logical | TRUE | TRUE, FALSE | - |
mtry: This hyperparameter can alternatively be set via our hyperparameter mtry.ratio as mtry = max(ceiling(mtry.ratio * n_features), 1). Note that mtry and mtry.ratio are mutually exclusive.
num.threads: Actual default: NULL, triggering auto-detection of the number of CPUs. Adjusted value: 1. Reason for change: Conflicting with parallelization via future.
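The mtry.ratio rule can be checked with a short sketch. The ratio of 0.25 and the reduced number of trees are illustrative assumptions; the mtry value stored in the fitted ranger object is compared with the formula given above.

```r
if (requireNamespace("ranger", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")
  n_features = length(task$feature_names)

  # Assumed ratio for illustration; mtry itself must stay unset
  mtry_ratio = 0.25
  expected_mtry = max(ceiling(mtry_ratio * n_features), 1)

  learner = lrn("regr.ranger", mtry.ratio = mtry_ratio, num.trees = 50)
  learner$train(task)

  # mtry actually passed on to ranger::ranger()
  used_mtry = learner$model$mtry
  print(c(expected = expected_mtry, used = used_mtry))
}
```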
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrRanger
new()
Creates a new instance of this R6 class.
LearnerRegrRanger$new()
importance()
The importance scores are extracted from the model slot variable.importance. Parameter importance must be set to "impurity", "impurity_corrected", or "permutation".
LearnerRegrRanger$importance()
Named numeric().
oob_error()
The out-of-bag error, extracted from model slot prediction.error.
LearnerRegrRanger$oob_error()
numeric(1).
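A minimal sketch of both accessor methods. The permutation importance mode, the number of trees, and the seed are illustrative assumptions; without an importance mode set at training time, importance() has no scores to return.

```r
if (requireNamespace("ranger", quietly = TRUE) &&
    requireNamespace("mlr3learners", quietly = TRUE)) {
  library(mlr3)
  library(mlr3learners)

  task = tsk("mtcars")

  # The importance mode must be chosen before training
  learner = lrn("regr.ranger", importance = "permutation", num.trees = 100, seed = 1)
  learner$train(task)

  # Named numeric vector of importance scores, one entry per feature
  imp = learner$importance()

  # Out-of-bag prediction error of the fitted forest
  oob = learner$oob_error()

  print(imp)
  print(oob)
}
```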
clone()
The objects of this class are cloneable with this method.
LearnerRegrRanger$clone(deep = FALSE)
deep
Whether to make a deep clone.
Wright, N. M, Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.svm, mlr_learners_regr.xgboost
if (requireNamespace("ranger", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.ranger")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
Support vector machine for regression. Calls e1071::svm() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.svm")
lrn("regr.svm")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, e1071
Id | Type | Default | Levels | Range |
cachesize | numeric | 40 | | |
coef0 | numeric | 0 | | |
cost | numeric | 1 | | |
cross | integer | 0 | | |
degree | integer | 3 | | |
epsilon | numeric | 0.1 | | |
fitted | logical | TRUE | TRUE, FALSE | - |
gamma | numeric | - | | |
kernel | character | radial | linear, polynomial, radial, sigmoid | - |
nu | numeric | 0.5 | | |
scale | untyped | TRUE | - | |
shrinking | logical | TRUE | TRUE, FALSE | - |
tolerance | numeric | 0.001 | | |
type | character | eps-regression | eps-regression, nu-regression | - |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrSVM
new()
Creates a new instance of this R6 class.
LearnerRegrSVM$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrSVM$clone(deep = FALSE)
deep
Whether to make a deep clone.
Cortes, Corinna, Vapnik, Vladimir (1995). “Support-vector networks.” Machine Learning, 20(3), 273–297. doi:10.1007/BF00994018.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.xgboost
if (requireNamespace("e1071", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.svm")
  print(learner)
  # Define a Task
  task = tsk("mtcars")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # print the model
  print(learner$model)
  # importance method
  if ("importance" %in% learner$properties) print(learner$importance)
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
eXtreme Gradient Boosting regression.
Calls xgboost::xgb.train()
from package xgboost.
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
Note that using the watchlist parameter directly will lead to problems when wrapping this mlr3::Learner in a mlr3pipelines GraphLearner, as the preprocessing steps will not be applied to the data in the watchlist.
See the section Early Stopping and Validation for the recommended way to supply validation data.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.xgboost")
lrn("regr.xgboost")
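As a minimal sketch of how hyperparameters might be passed at construction or changed afterwards, assuming mlr3, mlr3learners and xgboost are installed (the specific values for nrounds, eta, max_depth and subsample are arbitrary illustrations, not recommendations):

```r
library(mlr3)
library(mlr3learners)

# Construct the learner and set a few hyperparameters at construction time
learner = lrn("regr.xgboost", nrounds = 50, eta = 0.1, max_depth = 3)

# Hyperparameters can also be changed later through the parameter set
learner$param_set$values$subsample = 0.8

# Inspect the currently set values
print(learner$param_set$values)
```

Parameters that are not set explicitly fall back to the defaults listed in the table below, apart from the adjusted initial values documented for this learner.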
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, xgboost
Id | Type | Default | Levels | Range |
alpha | numeric | 0 | | |
approxcontrib | logical | FALSE | TRUE, FALSE | - |
base_score | numeric | 0.5 | | |
booster | character | gbtree | gbtree, gblinear, dart | - |
callbacks | untyped | list() | - | |
colsample_bylevel | numeric | 1 | | |
colsample_bynode | numeric | 1 | | |
colsample_bytree | numeric | 1 | | |
device | untyped | "cpu" | - | |
disable_default_eval_metric | logical | FALSE | TRUE, FALSE | - |
early_stopping_rounds | integer | NULL | | |
eta | numeric | 0.3 | | |
eval_metric | untyped | "rmse" | - | |
feature_selector | character | cyclic | cyclic, shuffle, random, greedy, thrifty | - |
gamma | numeric | 0 | | |
grow_policy | character | depthwise | depthwise, lossguide | - |
interaction_constraints | untyped | - | - | |
iterationrange | untyped | - | - | |
lambda | numeric | 1 | | |
lambda_bias | numeric | 0 | | |
max_bin | integer | 256 | | |
max_delta_step | numeric | 0 | | |
max_depth | integer | 6 | | |
max_leaves | integer | 0 | | |
maximize | logical | NULL | TRUE, FALSE | - |
min_child_weight | numeric | 1 | | |
missing | numeric | NA | | |
monotone_constraints | untyped | 0 | - | |
normalize_type | character | tree | tree, forest | - |
nrounds | integer | - | | |
nthread | integer | 1 | | |
ntreelimit | integer | NULL | | |
num_parallel_tree | integer | 1 | | |
objective | untyped | "reg:squarederror" | - | |
one_drop | logical | FALSE | TRUE, FALSE | - |
outputmargin | logical | FALSE | TRUE, FALSE | - |
predcontrib | logical | FALSE | TRUE, FALSE | - |
predinteraction | logical | FALSE | TRUE, FALSE | - |
predleaf | logical | FALSE | TRUE, FALSE | - |
print_every_n | integer | 1 | | |
process_type | character | default | default, update | - |
rate_drop | numeric | 0 | | |
refresh_leaf | logical | TRUE | TRUE, FALSE | - |
reshape | logical | FALSE | TRUE, FALSE | - |
sampling_method | character | uniform | uniform, gradient_based | - |
sample_type | character | uniform | uniform, weighted | - |
save_name | untyped | NULL | - | |
save_period | integer | NULL | | |
scale_pos_weight | numeric | 1 | | |
seed_per_iteration | logical | FALSE | TRUE, FALSE | - |
skip_drop | numeric | 0 | | |
strict_shape | logical | FALSE | TRUE, FALSE | - |
subsample | numeric | 1 | | |
top_k | integer | 0 | | |
training | logical | FALSE | TRUE, FALSE | - |
tree_method | character | auto | auto, exact, approx, hist, gpu_hist | - |
tweedie_variance_power | numeric | 1.5 | | |
updater | untyped | - | - | |
verbose | integer | 1 | | |
watchlist | untyped | NULL | - | |
xgb_model | untyped | NULL | - | |
To monitor the validation performance during training, you can set the $validate field of the Learner.
For information on how to configure the validation set, see the Validation section of mlr3::Learner.
This validation data can also be used for early stopping, which can be enabled by setting the early_stopping_rounds parameter.
The final (or, in the case of early stopping, best) validation scores can be accessed via $internal_valid_scores, and the optimal nrounds via $internal_tuned_values.
The internal validation measure can be set via the eval_metric parameter, which can be a mlr3::Measure, a function, or a character string for the internal xgboost measures.
Using an mlr3::Measure is slower than the internal xgboost measures, but allows using the same measure for tuning and validation.
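The following sketch illustrates setting the internal validation measure through eval_metric as a character string, assuming mlr3, mlr3learners and xgboost are installed. The choice of "mae" (one of xgboost's built-in metrics) and the values for nrounds, early_stopping_rounds and validate are illustrative assumptions only:

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")

# Hold out 30 percent of the training data for validation and monitor
# the mean absolute error on it; stop if it does not improve for 10 rounds
learner = lrn("regr.xgboost",
  nrounds = 100,
  early_stopping_rounds = 10,
  eval_metric = "mae",
  validate = 0.3
)
learner$train(task)

# Validation scores and the number of boosting rounds selected by early stopping
valid_scores = learner$internal_valid_scores
tuned = learner$internal_tuned_values
print(valid_scores)
print(tuned)
```

Because the validation split is drawn at random, the reported scores and the selected nrounds will typically differ between runs unless a seed is set.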
nrounds:
Actual default: no default.
Adjusted default: 1000.
Reason for change: Without a default, construction of the learner would error. The lightgbm learner has a default of 1000, so we use the same here.
nthread:
Actual value: Undefined, triggering auto-detection of the number of CPUs.
Adjusted value: 1.
Reason for change: Conflicting with parallelization via future.
verbose:
Actual default: 1.
Adjusted default: 0.
Reason for change: Reduce verbosity.
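The adjusted initial values can be inspected on a freshly constructed learner and overridden if needed. This is a small sketch assuming mlr3, mlr3learners and xgboost are installed; the value 2 for nthread is an arbitrary example:

```r
library(mlr3)
library(mlr3learners)

learner = lrn("regr.xgboost")

# Values that are set at construction, as described above
initial = learner$param_set$values[c("nrounds", "nthread", "verbose")]
print(initial)

# Override a single value, e.g. to let xgboost use two threads
# when no outer parallelization via future is active
learner$param_set$values$nthread = 2
```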
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrXgboost
internal_valid_scores
(named list() or NULL)
The validation scores extracted from model$evaluation_log.
If early stopping is activated, this contains the validation scores of the model for the optimal nrounds, otherwise those at the nrounds of the final model.
internal_tuned_values
(named list() or NULL)
If early stopping is activated, this returns a list with nrounds, which is extracted from $best_iteration of the model, and otherwise NULL.
validate
(numeric(1) or character(1) or NULL)
How to construct the internal validation data. This parameter can be either NULL, a ratio, "test", or "predefined".
Returns the $best_iteration when early stopping is activated.
new()
Creates a new instance of this R6 class.
LearnerRegrXgboost$new()
importance()
The importance scores are calculated with xgboost::xgb.importance().
LearnerRegrXgboost$importance()
Named numeric().
clone()
The objects of this class are cloneable with this method.
LearnerRegrXgboost$clone(deep = FALSE)
deep
Whether to make a deep clone.
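As an illustration of the importance() method listed above, the following sketch trains the learner and retrieves the feature importance scores. It assumes mlr3, mlr3learners and xgboost are installed; nrounds = 20 is an arbitrary small value chosen to keep the example fast:

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")
learner = lrn("regr.xgboost", nrounds = 20)

# importance() requires a trained model
learner$train(task)

# Named numeric vector of importance scores, one entry per used feature
imp = learner$importance()
print(head(imp))
```

Note that importance() is a method and has to be called with parentheses; accessing learner$importance without them returns the function itself.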
Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi:10.1145/2939672.2939785.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners)
for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner: mlr_learners_classif.cv_glmnet, mlr_learners_classif.glmnet, mlr_learners_classif.kknn, mlr_learners_classif.lda, mlr_learners_classif.log_reg, mlr_learners_classif.multinom, mlr_learners_classif.naive_bayes, mlr_learners_classif.nnet, mlr_learners_classif.qda, mlr_learners_classif.ranger, mlr_learners_classif.svm, mlr_learners_classif.xgboost, mlr_learners_regr.cv_glmnet, mlr_learners_regr.glmnet, mlr_learners_regr.kknn, mlr_learners_regr.km, mlr_learners_regr.lm, mlr_learners_regr.nnet, mlr_learners_regr.ranger, mlr_learners_regr.svm
## Not run:
if (requireNamespace("xgboost", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("regr.xgboost")
  print(learner)

  # Define a Task
  task = tsk("mtcars")

  # Create train and test set
  ids = partition(task)

  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)

  # Print the model
  print(learner$model)

  # Importance method (only called if the learner supports it)
  if ("importance" %in% learner$properties) print(learner$importance())

  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)

  # Score the predictions
  predictions$score()
}

## End(Not run)

## Not run:
# Train learner with early stopping on the mtcars data set
task = tsk("mtcars")

# Use 30 percent of the data for validation and
# set the early stopping parameter
learner = lrn("regr.xgboost",
  nrounds = 100,
  early_stopping_rounds = 10,
  validate = 0.3
)

# Train learner with early stopping
learner$train(task)

# Inspect optimal nrounds and validation performance
learner$internal_tuned_values
learner$internal_valid_scores

## End(Not run)