| Title: | Recommended Learners for 'mlr3' |
|---|---|
| Description: | Recommended Learners for 'mlr3'. Extends 'mlr3' with interfaces to essential machine learning packages on CRAN. This includes, but is not limited to: (penalized) linear and logistic regression, linear and quadratic discriminant analysis, k-nearest neighbors, naive Bayes, support vector machines, and gradient boosting. |
| Authors: | Michel Lang [aut] (ORCID: <https://orcid.org/0000-0001-9754-0393>), Quay Au [aut] (ORCID: <https://orcid.org/0000-0002-5252-8902>), Stefan Coors [aut] (ORCID: <https://orcid.org/0000-0002-7465-2146>), Patrick Schratz [aut] (ORCID: <https://orcid.org/0000-0003-0748-6624>), Marc Becker [cre, aut] (ORCID: <https://orcid.org/0000-0002-8115-0400>), John Zobolas [aut] (ORCID: <https://orcid.org/0000-0002-3609-8674>) |
| Maintainer: | Marc Becker <[email protected]> |
| License: | LGPL-3 |
| Version: | 0.14.0 |
| Built: | 2026-05-20 10:23:55 UTC |
| Source: | https://github.com/mlr-org/mlr3learners |
More learners are implemented in the mlr3extralearners package. A guide on how to create custom learners is covered in the book: https://mlr3book.mlr-org.com. Feel invited to contribute a missing learner to the mlr3 ecosystem!
Maintainer: Marc Becker [email protected] (ORCID)
Authors:
Michel Lang [email protected] (ORCID)
Quay Au [email protected] (ORCID)
Stefan Coors [email protected] (ORCID)
Patrick Schratz [email protected] (ORCID)
John Zobolas [email protected] (ORCID)
Useful links:
Report bugs at https://github.com/mlr-org/mlr3learners/issues
Generalized linear models with elastic net regularization.
Calls glmnet::cv.glmnet() from package glmnet.
The default for hyperparameter family is set to "binomial" or "multinomial",
depending on the number of classes.
If a Task contains a column with the offset role, it is automatically incorporated during training via the offset argument in glmnet::glmnet().
During prediction, the offset column from the test set is used only if use_pred_offset = TRUE (default), passed via the newoffset argument in glmnet::predict.glmnet().
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.cv_glmnet")
lrn("classif.cv_glmnet")
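As a minimal sketch of the two construction routes above: `lrn()` also accepts hyperparameter values and learner fields such as `predict_type` as named arguments. The specific values chosen here (`alpha = 0.5`, `nfolds = 5`) are illustrative assumptions, not recommendations.

```r
library(mlr3)
library(mlr3learners)

# Retrieve the learner from the dictionary with default settings.
learner_a = mlr_learners$get("classif.cv_glmnet")

# Sugar function: set hyperparameters and the predict type in one call.
learner_b = lrn("classif.cv_glmnet", alpha = 0.5, nfolds = 5, predict_type = "prob")

class(learner_a)[1]
learner_b$param_set$values[c("alpha", "nfolds")]
learner_b$predict_type
```

Hyperparameters can also be changed after construction via `learner_b$param_set$set_values(...)`.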
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alignment | character | lambda | lambda, fraction | - |
| alpha | numeric | 1 | | |
| big | numeric | 9.9e+35 | | |
| devmax | numeric | 0.999 | | |
| dfmax | integer | - | | |
| epsnr | numeric | 1e-08 | | |
| eps | numeric | 1e-06 | | |
| exclude | integer | - | | |
| exmx | numeric | 250 | | |
| fdev | numeric | 1e-05 | | |
| foldid | untyped | NULL | - | |
| gamma | untyped | - | - | |
| grouped | logical | TRUE | TRUE, FALSE | - |
| intercept | logical | TRUE | TRUE, FALSE | - |
| keep | logical | FALSE | TRUE, FALSE | - |
| lambda.min.ratio | numeric | - | | |
| lambda | untyped | - | - | |
| lower.limits | untyped | - | - | |
| maxit | integer | 100000 | | |
| mnlam | integer | 5 | | |
| mxitnr | integer | 25 | | |
| mxit | integer | 100 | | |
| nfolds | integer | 10 | | |
| nlambda | integer | 100 | | |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
| parallel | logical | FALSE | TRUE, FALSE | - |
| penalty.factor | untyped | - | - | |
| pmax | integer | - | | |
| pmin | numeric | 1e-09 | | |
| prec | numeric | 1e-10 | | |
| predict.gamma | numeric | gamma.1se | | |
| relax | logical | FALSE | TRUE, FALSE | - |
| s | numeric | lambda.1se | | |
| standardize | logical | TRUE | TRUE, FALSE | - |
| standardize.response | logical | FALSE | TRUE, FALSE | - |
| thresh | numeric | 1e-07 | | |
| trace.it | integer | 0 | | |
| type.gaussian | character | - | covariance, naive | - |
| type.logistic | character | - | Newton, modified.Newton | - |
| type.measure | character | deviance | deviance, class, auc, mse, mae | - |
| type.multinomial | character | - | ungrouped, grouped | - |
| upper.limits | untyped | - | - | |
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to
model fitting to comply with the stats::glm() convention that the negative class is provided
as the first factor level.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifCVGlmnet
new()
Creates a new instance of this R6 class.
LearnerClassifCVGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet()
with type set to "nonzero".
LearnerClassifCVGlmnet$selected_features(lambda = NULL)
lambda: (numeric(1)) Custom lambda; defaults to the active lambda depending on the parameter set.
Returns: (character()) of feature names.
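A short sketch of how selected_features() might be used after training. The task (`sonar`), the fold count, and the custom value `lambda = 0.1` are assumptions chosen for illustration only.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
learner = lrn("classif.cv_glmnet", nfolds = 5)

set.seed(1)
learner$train(task)

# Features with non-zero coefficients at the active lambda.
selected = learner$selected_features()

# The same query at a custom (hypothetical) penalty value.
selected_custom = learner$selected_features(lambda = 0.1)

length(selected)
length(selected_custom)
```

A larger lambda penalizes coefficients more strongly, so it typically yields a shorter feature list.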
clone()
The objects of this class are cloneable with this method.
LearnerClassifCVGlmnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.cv_glmnet")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Generalized linear models with elastic net regularization.
Calls glmnet::glmnet() from package glmnet.
Caution: This learner differs from learners calling glmnet::cv.glmnet()
in that it does not use the internal optimization of parameter lambda.
Instead, lambda needs to be tuned by the user (e.g., via mlr3tuning).
When lambda is tuned, glmnet is retrained in each tuning iteration.
While fitting the whole path of lambdas would be more efficient, as is done
by default in glmnet::glmnet(), tuning/selecting the parameter at prediction time
(using parameter s) is currently not supported in mlr3
(at least not in an efficient manner).
Tuning the s parameter is, therefore, currently discouraged.
When the data are i.i.d. and efficiency is key, we recommend using the respective
auto-tuning counterparts mlr_learners_classif.cv_glmnet or
mlr_learners_regr.cv_glmnet.
However, in some situations this is not applicable, usually when data are
imbalanced or not i.i.d. (longitudinal, time-series) and tuning requires
custom resampling strategies (blocked design, stratification).
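The following is a rough sketch of selecting lambda under a custom resampling scheme, here stratified cross-validation. It uses a plain loop over candidate values rather than mlr3tuning, purely to keep the example self-contained; the candidate grid and the fold count are hypothetical. Each fit uses a single lambda, and `s` is set to the same value so that prediction uses exactly the fitted penalty.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")

# Stratify the folds by the target via the "stratum" column role.
task$col_roles$stratum = task$target_names

set.seed(1)
resampling = rsmp("cv", folds = 3)
resampling$instantiate(task)

lambdas = c(0.001, 0.01, 0.1)  # hypothetical candidate values

errors = vapply(lambdas, function(l) {
  # One model per candidate: fit with a single lambda, predict at the same value.
  learner = lrn("classif.glmnet", lambda = l, s = l)
  rr = resample(task, learner, resampling)
  unname(rr$aggregate(msr("classif.ce")))
}, numeric(1))

data.frame(lambda = lambdas, classif.ce = errors)
```

Because the resampling is instantiated once, every candidate is evaluated on the same folds, which keeps the comparison fair.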
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.glmnet")
lrn("classif.glmnet")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alpha | numeric | 1 | | |
| big | numeric | 9.9e+35 | | |
| devmax | numeric | 0.999 | | |
| dfmax | integer | - | | |
| eps | numeric | 1e-06 | | |
| epsnr | numeric | 1e-08 | | |
| exact | logical | FALSE | TRUE, FALSE | - |
| exclude | integer | - | | |
| exmx | numeric | 250 | | |
| fdev | numeric | 1e-05 | | |
| gamma | numeric | 1 | | |
| intercept | logical | TRUE | TRUE, FALSE | - |
| lambda | untyped | - | - | |
| lambda.min.ratio | numeric | - | | |
| lower.limits | untyped | - | - | |
| maxit | integer | 100000 | | |
| mnlam | integer | 5 | | |
| mxit | integer | 100 | | |
| mxitnr | integer | 25 | | |
| nlambda | integer | 100 | | |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
| penalty.factor | untyped | - | - | |
| pmax | integer | - | | |
| pmin | numeric | 1e-09 | | |
| prec | numeric | 1e-10 | | |
| relax | logical | FALSE | TRUE, FALSE | - |
| s | numeric | 0.01 | | |
| standardize | logical | TRUE | TRUE, FALSE | - |
| standardize.response | logical | FALSE | TRUE, FALSE | - |
| thresh | numeric | 1e-07 | | |
| trace.it | integer | 0 | | |
| type.gaussian | character | - | covariance, naive | - |
| type.logistic | character | - | Newton, modified.Newton | - |
| type.multinomial | character | - | ungrouped, grouped | - |
| upper.limits | untyped | - | - | |
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to
model fitting to comply with the stats::glm() convention that the negative class is provided
as the first factor level.
If a Task contains a column with the offset role, it is automatically incorporated during training via the offset argument in glmnet::glmnet().
During prediction, the offset column from the test set is used only if use_pred_offset = TRUE (default), passed via the newoffset argument in glmnet::predict.glmnet().
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifGlmnet
new()
Creates a new instance of this R6 class.
LearnerClassifGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet()
with type set to "nonzero".
LearnerClassifGlmnet$selected_features(lambda = NULL)
lambda: (numeric(1)) Custom lambda; defaults to the active lambda depending on the parameter set.
Returns: (character()) of feature names.
clone()
The objects of this class are cloneable with this method.
LearnerClassifGlmnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.glmnet")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
k-Nearest-Neighbor classification.
Calls kknn::kknn() from package kknn.
store_model:
See note.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.kknn")
lrn("classif.kknn")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, kknn
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| k | integer | 7 | | |
| distance | numeric | 2 | | |
| kernel | character | optimal | rectangular, triangular, epanechnikov, biweight, triweight, cos, inv, gaussian, rank, optimal | - |
| scale | logical | TRUE | TRUE, FALSE | - |
| ykernel | untyped | NULL | - | |
| store_model | logical | FALSE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifKKNN
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifKKNN$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifKKNN$clone(deep = FALSE)
deep: Whether to make a deep clone.
There is no training step for k-NN models, just storing the training data to
process it during the predict step.
Therefore, $model returns a list with the following elements:
formula: Formula for calling kknn::kknn() during $predict().
data: Training data for calling kknn::kknn() during $predict().
pv: Training parameters for calling kknn::kknn() during $predict().
kknn: Model as returned by kknn::kknn(), only available after $predict() has been called.
This is not stored by default, you must set hyperparameter store_model to TRUE.
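To make the structure of `$model` concrete, here is a small sketch, assuming the `sonar` task and an arbitrary `k = 5`. It inspects the stored list after training and again after prediction with `store_model = TRUE`.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")
learner = lrn("classif.kknn", k = 5, store_model = TRUE)
learner$train(task)

# After training, the model only holds what is needed to call kknn::kknn() later.
model_names = names(learner$model)
model_names

# The kknn object becomes available once predictions have been computed.
prediction = learner$predict(task)
class(learner$model$kknn)
```

With the default `store_model = FALSE`, the `kknn` element is not filled in by `$predict()`, which keeps the learner object smaller.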
Hechenbichler, Klaus, Schliep, Klaus (2004). “Weighted k-nearest-neighbor techniques and ordinal classification.” Technical Report Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich. doi:10.5282/ubm/epub.1769.
Samworth, J R (2012). “Optimal weighted nearest neighbour classifiers.” The Annals of Statistics, 40(5), 2733–2763. doi:10.1214/12-AOS1049.
Cover, Thomas, Hart, Peter (1967). “Nearest neighbor pattern classification.” IEEE transactions on information theory, 13(1), 21–27. doi:10.1109/TIT.1967.1053964.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.kknn")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Linear discriminant analysis.
Calls MASS::lda() from package MASS.
Parameters method and prior exist for training and prediction but
accept different values for each. Therefore, arguments for
the predict stage have been renamed to predict.method and predict.prior,
respectively.
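A brief sketch of setting the training and prediction variants independently. The chosen values (`method = "mle"`, `predict.method = "predictive"`) are arbitrary picks from the levels listed in the parameter table.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")

# `method` is used when fitting; `predict.method` is used when predicting.
learner = lrn("classif.lda",
  method = "mle",
  predict.method = "predictive",
  predict_type = "prob"
)

learner$train(task)
prediction = learner$predict(task)
head(prediction$prob, 3)
```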
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.lda")
lrn("classif.lda")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, MASS
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| dimen | untyped | - | - | |
| method | character | moment | moment, mle, mve, t | - |
| nu | integer | - | | |
| predict.method | character | plug-in | plug-in, predictive, debiased | - |
| predict.prior | untyped | - | - | |
| prior | untyped | - | - | |
| tol | numeric | - | | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifLDA
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifLDA$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifLDA$clone(deep = FALSE)
deep: Whether to make a deep clone.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, http://www.stats.ox.ac.uk/pub/MASS4/.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.lda")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Classification via logistic regression.
Calls stats::glm() with family set to "binomial".
Starting with mlr3 v0.5.0, the order of class labels is reversed prior to
model fitting to comply with the stats::glm() convention that the negative class is provided
as the first factor level.
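The stats::glm() convention mentioned above can be illustrated in base R alone. This sketch uses simulated data (the variable names and the data-generating process are made up) and shows that the model describes the probability of the second factor level, so swapping the level order flips the sign of the coefficients.

```r
# Simulated data: larger x makes the outcome "pos" more likely.
set.seed(1)
x = rnorm(200)
y = factor(ifelse(x + rnorm(200) > 0, "pos", "neg"), levels = c("neg", "pos"))

# glm() treats the first level ("neg") as failure and models P(y == "pos").
fit = glm(y ~ x, family = binomial())
coef_neg_first = unname(coef(fit)["x"])

# With "pos" as the first level, the model describes P(y == "neg") instead.
y_rev = factor(y, levels = c("pos", "neg"))
fit_rev = glm(y_rev ~ x, family = binomial())
coef_pos_first = unname(coef(fit_rev)["x"])

c(neg_first = coef_neg_first, pos_first = coef_pos_first)
```

This is why the learner reorders the class labels before fitting: the reported probabilities then refer to the positive class of the task.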
model:
Actual default: TRUE.
Adjusted default: FALSE.
Reason for change: Save some memory.
If a Task has a column with the role offset, it will automatically be used during training.
The offset is incorporated through the formula interface to ensure compatibility with stats::glm().
We add it to the model formula as offset(<column_name>) and also include it in the training data.
During prediction, the default behavior is to use the offset column from the test set (enabled by use_pred_offset = TRUE).
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
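A hedged sketch of the offset handling described above, using simulated data. The column names `x`, `off`, and `y` are hypothetical, and the example assumes an mlr3 version that supports the offset column role.

```r
library(mlr3)
library(mlr3learners)

# Simulated data with a known offset column.
set.seed(1)
n = 200
dat = data.frame(x = rnorm(n), off = runif(n, -1, 1))
dat$y = factor(ifelse(dat$x + dat$off + rnorm(n) > 0, "pos", "neg"))

task = as_task_classif(dat, target = "y", positive = "pos")

# Assign the offset role; the column is then no longer used as a feature.
task$set_col_roles("off", roles = "offset")

learner = lrn("classif.log_reg", predict_type = "prob")
learner$train(task)

# Default: the offset column of the prediction data is applied.
p_with = learner$predict(task)$prob[, "pos"]

# Disable the offset adjustment at prediction time.
learner$param_set$set_values(use_pred_offset = FALSE)
p_without = learner$predict(task)$prob[, "pos"]

summary(p_with - p_without)
```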
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.log_reg")
lrn("classif.log_reg")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, stats
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| dispersion | untyped | NULL | - | |
| epsilon | numeric | 1e-08 | | |
| etastart | untyped | - | - | |
| maxit | numeric | 25 | | |
| model | logical | TRUE | TRUE, FALSE | - |
| mustart | untyped | - | - | |
| singular.ok | logical | TRUE | TRUE, FALSE | - |
| start | untyped | NULL | - | |
| trace | logical | FALSE | TRUE, FALSE | - |
| x | logical | FALSE | TRUE, FALSE | - |
| y | logical | TRUE | TRUE, FALSE | - |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
To ensure reproducibility, this learner always uses the default contrasts:
contr.treatment() for unordered factors, and
contr.poly() for ordered factors.
Setting the option "contrasts" does not have any effect.
Instead, set the respective hyperparameter or use mlr3pipelines to create dummy features.
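For readers unfamiliar with the two contrast schemes named above, this base R sketch shows what they encode. It illustrates the contrast functions themselves, not the learner's internals; the factor names and levels are made up.

```r
f_unordered = factor(c("a", "b", "c"))
f_ordered = factor(c("low", "mid", "high"),
  levels = c("low", "mid", "high"), ordered = TRUE)

# Treatment contrasts: one 0/1 indicator per non-reference level.
treatment = contr.treatment(levels(f_unordered))
treatment

# Polynomial contrasts: orthogonal linear (.L) and quadratic (.Q) terms.
poly = contr.poly(nlevels(f_ordered))
poly

# Design matrix with both contrasts requested explicitly.
X = model.matrix(~ f_unordered + f_ordered,
  contrasts.arg = list(f_unordered = "contr.treatment", f_ordered = "contr.poly"))
colnames(X)
```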
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifLogReg
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifLogReg$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifLogReg$clone(deep = FALSE)
deep: Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.log_reg")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Multinomial log-linear models via neural networks.
Calls nnet::multinom() from package nnet.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.multinom")
lrn("classif.multinom")
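A minimal multiclass sketch, assuming the built-in `iris` task. Since `trace` defaults to TRUE, it is switched off here to suppress the optimizer log; `maxit = 200` is an arbitrary choice.

```r
library(mlr3)
library(mlr3learners)

task = tsk("iris")  # three classes

learner = lrn("classif.multinom", trace = FALSE, maxit = 200, predict_type = "prob")
learner$train(task)

# One probability column per class.
prediction = learner$predict(task)
head(prediction$prob, 3)
```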
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3learners, nnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| Hess | logical | FALSE | TRUE, FALSE | - |
| abstol | numeric | 1e-04 | | |
| censored | logical | FALSE | TRUE, FALSE | - |
| decay | numeric | 0 | | |
| entropy | logical | FALSE | TRUE, FALSE | - |
| mask | untyped | - | - | |
| maxit | integer | 100 | | |
| MaxNWts | integer | 1000 | | |
| model | logical | FALSE | TRUE, FALSE | - |
| linout | logical | FALSE | TRUE, FALSE | - |
| rang | numeric | 0.7 | | |
| reltol | numeric | 1e-08 | | |
| size | integer | - | | |
| skip | logical | FALSE | TRUE, FALSE | - |
| softmax | logical | FALSE | TRUE, FALSE | - |
| summ | character | 0 | 0, 1, 2, 3 | - |
| trace | logical | TRUE | TRUE, FALSE | - |
| Wts | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifMultinom
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifMultinom$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifMultinom$clone(deep = FALSE)
deep: Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("classif.multinom")
    print(learner)

    # Define a Task
    task = tsk("sonar")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Naive Bayes classification.
Calls e1071::naiveBayes() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.naive_bayes")
lrn("classif.naive_bayes")
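A small sketch of requesting class probabilities and scoring them with a probability-based measure. The `sonar` task, the seed, and the choice of AUC are assumptions made for illustration.

```r
library(mlr3)
library(mlr3learners)

task = tsk("sonar")

set.seed(1)
ids = partition(task)

# Predict probabilities so that threshold-free measures such as AUC can be used.
learner = lrn("classif.naive_bayes", predict_type = "prob")
learner$train(task, row_ids = ids$train)

prediction = learner$predict(task, row_ids = ids$test)
auc = prediction$score(msr("classif.auc"))
auc
```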
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”
Required Packages: mlr3, mlr3learners, e1071
| Id | Type | Default | Range |
|---|---|---|---|
| eps | numeric | 0 | |
| laplace | numeric | 0 | |
| threshold | numeric | 0.001 | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifNaiveBayes
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifNaiveBayes$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifNaiveBayes$clone(deep = FALSE)
deepWhether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
# Define the Learner and set parameter values
learner = lrn("classif.naive_bayes")
print(learner)
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
Single Layer Neural Network.
Calls nnet::nnet.formula() from package nnet.
Note that modern neural networks with multiple layers are available via package mlr3torch.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.nnet")
lrn("classif.nnet")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, nnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| Hess | logical | FALSE | TRUE, FALSE | - |
| MaxNWts | integer | 1000 | | |
| Wts | untyped | - | - | |
| abstol | numeric | 1e-04 | | |
| censored | logical | FALSE | TRUE, FALSE | - |
| contrasts | untyped | NULL | - | |
| decay | numeric | 0 | | |
| mask | untyped | - | - | |
| maxit | integer | 100 | | |
| na.action | untyped | - | - | |
| rang | numeric | 0.7 | | |
| reltol | numeric | 1e-08 | | |
| size | integer | 3 | | |
| skip | logical | FALSE | TRUE, FALSE | - |
| subset | untyped | - | - | |
| trace | logical | TRUE | TRUE, FALSE | - |
| formula | untyped | - | - | |
size:
Adjusted default: 3L.
Reason for change: no default in nnet().
formula: if not provided, the formula is set to task$formula().
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifNnet
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifNnet$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifNnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Ripley BD (1996). Pattern Recognition and Neural Networks. Cambridge University Press. doi:10.1017/cbo9780511812651.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
# Define the Learner and set parameter values
learner = lrn("classif.nnet")
print(learner)
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
Quadratic discriminant analysis.
Calls MASS::qda() from package MASS.
Parameters method and prior exist for training and prediction but
accept different values for each. Therefore, arguments for
the predict stage have been renamed to predict.method and predict.prior,
respectively.
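To illustrate the renaming, here is a minimal sketch that sets a training-stage method and a predict-stage predict.method on the same learner. The choice of "mle" and "predictive" and the use of the iris task are assumptions made for illustration only.

```r
library(mlr3)
library(mlr3learners)

# method is passed to MASS::qda() during training;
# predict.method is the renamed prediction-stage argument
learner = lrn("classif.qda", method = "mle", predict.method = "predictive")

task = tsk("iris")
ids = partition(task)

learner$train(task, row_ids = ids$train)
prediction = learner$predict(task, row_ids = ids$test)
prediction$score()
```

Keeping the two stages under separate parameter ids means both can be tuned independently without one overwriting the other.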
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.qda")
lrn("classif.qda")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, MASS
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| method | character | moment | moment, mle, mve, t | - |
| nu | integer | - | | |
| predict.method | character | plug-in | plug-in, predictive, debiased | - |
| predict.prior | untyped | - | - | |
| prior | untyped | - | - | |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifQDA
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifQDA$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifQDA$clone(deep = FALSE)
deep: Whether to make a deep clone.
Venables WN, Ripley BD (2002). Modern Applied Statistics with S, Fourth edition. Springer, New York. ISBN 0-387-95457-0, http://www.stats.ox.ac.uk/pub/MASS4/.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
# Define the Learner and set parameter values
learner = lrn("classif.qda")
print(learner)
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
Random classification forest.
Calls ranger::ranger() from package ranger.
mtry:
This hyperparameter can alternatively be set via our hyperparameter mtry.ratio
as mtry = max(ceiling(mtry.ratio * n_features), 1).
Note that mtry and mtry.ratio are mutually exclusive.
num.threads:
Actual default: 2, using two threads, while also respecting environment variable
R_RANGER_NUM_THREADS, options(ranger.num.threads = N), or options(Ncpus = N), with
precedence in that order.
Adjusted value: 1.
Reason for change: Conflicting with parallelization via future.
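The mtry.ratio conversion described above can be sketched in base R. The helper name mtry_from_ratio is hypothetical and exists only for this illustration; it simply mirrors the stated formula.

```r
# Hypothetical helper mirroring mtry = max(ceiling(mtry.ratio * n_features), 1)
mtry_from_ratio = function(mtry.ratio, n_features) {
  max(ceiling(mtry.ratio * n_features), 1)
}

# With 60 features (as in the sonar task):
mtry_from_ratio(0.5, 60)   # half of the features: 30
mtry_from_ratio(0.01, 60)  # ceiling(0.6) = 1
mtry_from_ratio(0, 60)     # ceiling(0) = 0, raised to the minimum of 1
```

The outer max() guarantees that at least one feature is considered per split, even for very small ratios.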
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.ranger")
lrn("classif.ranger")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| always.split.variables | untyped | - | - | |
| class.weights | untyped | NULL | - | |
| holdout | logical | FALSE | TRUE, FALSE | - |
| importance | character | - | none, impurity, impurity_corrected, permutation | - |
| keep.inbag | logical | FALSE | TRUE, FALSE | - |
| max.depth | integer | NULL | | |
| min.bucket | untyped | 1L | - | |
| min.node.size | untyped | NULL | - | |
| mtry | integer | - | | |
| mtry.ratio | numeric | - | | |
| na.action | character | na.learn | na.learn, na.omit, na.fail | - |
| num.random.splits | integer | 1 | | |
| node.stats | logical | FALSE | TRUE, FALSE | - |
| num.threads | integer | 1 | | |
| num.trees | integer | 500 | | |
| oob.error | logical | TRUE | TRUE, FALSE | - |
| regularization.factor | untyped | 1 | - | |
| regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
| replace | logical | TRUE | TRUE, FALSE | - |
| respect.unordered.factors | character | - | ignore, order, partition | - |
| sample.fraction | numeric | - | | |
| save.memory | logical | FALSE | TRUE, FALSE | - |
| scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
| seed | integer | NULL | | |
| split.select.weights | untyped | NULL | - | |
| splitrule | character | gini | gini, extratrees, hellinger | - |
| verbose | logical | TRUE | TRUE, FALSE | - |
| write.forest | logical | TRUE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifRanger
new()
Creates a new instance of this R6 class.
LearnerClassifRanger$new()
importance()
The importance scores are extracted from the model slot variable.importance.
The importance parameter must be set to "impurity", "impurity_corrected", or
"permutation".
LearnerClassifRanger$importance()
Named numeric().
oob_error()
The out-of-bag error, extracted from model slot prediction.error.
LearnerClassifRanger$oob_error()
numeric(1).
selected_features()
The set of features used for node splitting in the forest.
LearnerClassifRanger$selected_features()
character().
clone()
The objects of this class are cloneable with this method.
LearnerClassifRanger$clone(deep = FALSE)
deep: Whether to make a deep clone.
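The methods documented above might be used as in the following sketch. The settings importance = "impurity" and num.trees = 100 are illustrative assumptions; any of the non-"none" importance modes listed in the parameter table should work the same way.

```r
library(mlr3)
library(mlr3learners)

# The importance mode has to be chosen before training,
# otherwise there are no scores to extract afterwards
learner = lrn("classif.ranger", importance = "impurity", num.trees = 100)

task = tsk("sonar")
learner$train(task)

# Named numeric vector of importance scores
head(learner$importance())

# Out-of-bag error, a single number
learner$oob_error()

# Features that were used for at least one split
head(learner$selected_features())
```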
Wright, Marvin N., Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
# Define the Learner and set parameter values
# (an importance mode is required for the importance method below)
learner = lrn("classif.ranger", importance = "impurity")
print(learner)
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
Support vector machine for classification.
Calls e1071::svm() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.svm")
lrn("classif.svm")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, e1071
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| cachesize | numeric | 40 | | |
| class.weights | untyped | NULL | - | |
| coef0 | numeric | 0 | | |
| cost | numeric | 1 | | |
| cross | integer | 0 | | |
| decision.values | logical | FALSE | TRUE, FALSE | - |
| degree | integer | 3 | | |
| epsilon | numeric | 0.1 | | |
| fitted | logical | TRUE | TRUE, FALSE | - |
| gamma | numeric | - | | |
| kernel | character | radial | linear, polynomial, radial, sigmoid | - |
| nu | numeric | 0.5 | | |
| scale | untyped | TRUE | - | |
| shrinking | logical | TRUE | TRUE, FALSE | - |
| tolerance | numeric | 0.001 | | |
| type | character | C-classification | C-classification, nu-classification | - |
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifSVM
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifSVM$new()
clone()
The objects of this class are cloneable with this method.
LearnerClassifSVM$clone(deep = FALSE)
deep: Whether to make a deep clone.
Cortes, Corinna, Vapnik, Vladimir (1995). “Support-vector networks.” Machine Learning, 20(3), 273–297. doi:10.1007/BF00994018.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
# Define the Learner and set parameter values
learner = lrn("classif.svm")
print(learner)
# Define a Task
task = tsk("sonar")
# Create train and test set
ids = partition(task)
# Train the learner on the training ids
learner$train(task, row_ids = ids$train)
# Print the model
print(learner$model)
# Importance method
if ("importance" %in% learner$properties) print(learner$importance())
# Make predictions for the test rows
predictions = learner$predict(task, row_ids = ids$test)
# Score the predictions
predictions$score()
eXtreme Gradient Boosting classification.
Calls xgboost::xgb.train() from package xgboost.
Note that using the evals parameter directly will lead to problems
when wrapping this mlr3::Learner in a mlr3pipelines GraphLearner
as the preprocessing steps will not be applied to the data in evals.
See the section Early Stopping and Validation on how to do this.
nrounds:
Actual default: no default.
Adjusted default: 1000.
Reason for change: Without a default, construction of the learner would error. The lightgbm learner has a default of 1000, so we use the same here.
nthread:
Actual value: Undefined, triggering auto-detection of the number of CPUs.
Adjusted value: 1.
Reason for change: Conflicting with parallelization via future.
verbose:
Actual default: 1.
Adjusted default: 0.
Reason for change: Reduce verbosity.
verbosity:
Actual default: 1.
Adjusted default: 0.
Reason for change: Reduce verbosity.
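The adjusted values described above are preset on the learner and can be inspected or overridden through its parameter set. The following is a minimal sketch; the values 50 and 2 are illustrative assumptions.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("classif.xgboost")

# Values preset by the learner at construction
learner$param_set$values

# Override them explicitly; 50 rounds and 2 threads are illustrative choices
learner$param_set$set_values(nrounds = 50, nthread = 2)
learner$param_set$values$nrounds
```

When the learner runs inside a parallelized resampling or tuning, raising nthread may oversubscribe the available cores, which is the conflict the adjusted value is meant to avoid.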
In order to monitor the validation performance during the training, you can set the $validate field of the Learner.
For information on how to configure the validation set, see the Validation section of mlr3::Learner.
This validation data can also be used for early stopping, which can be enabled by setting the early_stopping_rounds parameter.
The final (or, in the case of early stopping, best) validation scores can be accessed via $internal_valid_scores, and the optimal nrounds via $internal_tuned_values.
The internal validation measure can be set via the custom_metric parameter, which can be an mlr3::Measure, a function, or a character string for the internal xgboost measures.
Using an mlr3::Measure is slower than the internal xgboost measures, but allows using the same measure for tuning and validation.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("classif.xgboost")
lrn("classif.xgboost")
Task type: “classif”
Predict Types: “response”, “prob”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, xgboost
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alpha | numeric | 0 | | |
| approxcontrib | logical | FALSE | TRUE, FALSE | - |
| base_score | numeric | 0.5 | | |
| booster | character | gbtree | gbtree, gblinear, dart | - |
| callbacks | untyped | list() | - | |
| colsample_bylevel | numeric | 1 | | |
| colsample_bynode | numeric | 1 | | |
| colsample_bytree | numeric | 1 | | |
| device | untyped | "cpu" | - | |
| disable_default_eval_metric | logical | FALSE | TRUE, FALSE | - |
| early_stopping_rounds | integer | NULL | | |
| eta | numeric | 0.3 | | |
| evals | untyped | NULL | - | |
| eval_metric | untyped | - | - | |
| custom_metric | untyped | - | - | |
| extmem_single_page | logical | FALSE | TRUE, FALSE | - |
| feature_selector | character | cyclic | cyclic, shuffle, random, greedy, thrifty | - |
| gamma | numeric | 0 | | |
| grow_policy | character | depthwise | depthwise, lossguide | - |
| interaction_constraints | untyped | - | - | |
| iterationrange | untyped | - | - | |
| lambda | numeric | 1 | | |
| max_bin | integer | 256 | | |
| max_cached_hist_node | integer | 65536 | | |
| max_cat_to_onehot | integer | - | | |
| max_cat_threshold | numeric | - | | |
| max_delta_step | numeric | 0 | | |
| max_depth | integer | 6 | | |
| max_leaves | integer | 0 | | |
| maximize | logical | NULL | TRUE, FALSE | - |
| min_child_weight | numeric | 1 | | |
| missing | numeric | NA | | |
| monotone_constraints | untyped | 0 | - | |
| nrounds | integer | - | | |
| normalize_type | character | tree | tree, forest | - |
| nthread | integer | - | | |
| num_parallel_tree | integer | 1 | | |
| objective | untyped | "binary:logistic" | - | |
| one_drop | logical | FALSE | TRUE, FALSE | - |
| print_every_n | integer | 1 | | |
| rate_drop | numeric | 0 | | |
| refresh_leaf | logical | TRUE | TRUE, FALSE | - |
| seed | integer | - | | |
| seed_per_iteration | logical | FALSE | TRUE, FALSE | - |
| sampling_method | character | uniform | uniform, gradient_based | - |
| sample_type | character | uniform | uniform, weighted | - |
| save_name | untyped | NULL | - | |
| save_period | integer | NULL | | |
| scale_pos_weight | numeric | 1 | | |
| skip_drop | numeric | 0 | | |
| subsample | numeric | 1 | | |
| top_k | integer | 0 | | |
| training | logical | FALSE | TRUE, FALSE | - |
| tree_method | character | auto | auto, exact, approx, hist, gpu_hist | - |
| tweedie_variance_power | numeric | 1.5 | | |
| updater | untyped | - | - | |
| use_rmm | logical | - | TRUE, FALSE | - |
| validate_features | logical | TRUE | TRUE, FALSE | - |
| verbose | integer | - | | |
| verbosity | integer | - | | |
| xgb_model | untyped | NULL | - | |
If a Task has a column with the role offset, it will automatically be used during training.
The offset is incorporated through the xgboost::xgb.DMatrix interface, using the base_margin field.
No offset is applied during prediction for this learner.
mlr3::Learner -> mlr3::LearnerClassif -> LearnerClassifXgboost
internal_valid_scores (named list() or NULL)
The validation scores extracted from model$evaluation_log.
If early stopping is activated, this contains the validation scores of the model at the optimal nrounds;
otherwise, it contains the scores at the nrounds of the final model.
internal_tuned_values (named list() or NULL)
If early stopping is activated, this returns a list with nrounds,
which is extracted from $best_iteration of the model. Otherwise, it is NULL.
validate (numeric(1) or character(1) or NULL)
How to construct the internal validation data. This parameter can be either NULL,
a ratio, "test", or "predefined".
model (any)
The fitted model. Only available after $train() has been called.
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerClassif$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerClassifXgboost$new()
importance()
The importance scores are calculated with xgboost::xgb.importance().
LearnerClassifXgboost$importance()
Named numeric().
clone()
The objects of this class are cloneable with this method.
LearnerClassifXgboost$clone(deep = FALSE)
deep: Whether to make a deep clone.
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
The outputmargin, predcontrib, predinteraction, and predleaf parameters are not supported.
You can still call e.g. predict(learner$model, newdata = newdata, outputmargin = TRUE) to get these predictions.
Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi:10.1145/2939672.2939785.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
## Not run:
if (requireNamespace("xgboost", quietly = TRUE)) {
  # Define the Learner and set parameter values
  learner = lrn("classif.xgboost")
  print(learner)
  # Define a Task
  task = tsk("sonar")
  # Create train and test set
  ids = partition(task)
  # Train the learner on the training ids
  learner$train(task, row_ids = ids$train)
  # Print the model
  print(learner$model)
  # Importance method
  if ("importance" %in% learner$properties) print(learner$importance())
  # Make predictions for the test rows
  predictions = learner$predict(task, row_ids = ids$test)
  # Score the predictions
  predictions$score()
}
## End(Not run)
## Not run:
# Train learner with early stopping on spam data set
task = tsk("spam")
# Use 30 percent for validation and set the early stopping parameter
learner = lrn("classif.xgboost",
  nrounds = 100,
  early_stopping_rounds = 10,
  validate = 0.3
)
# Train learner with early stopping
learner$train(task)
# Inspect optimal nrounds and validation performance
learner$internal_tuned_values
learner$internal_valid_scores
## End(Not run)
Generalized linear models with elastic net regularization.
Calls glmnet::cv.glmnet() from package glmnet.
The default for hyperparameter family is set to "gaussian".
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.cv_glmnet")
lrn("regr.cv_glmnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alignment | character | lambda | lambda, fraction | - |
| alpha | numeric | 1 | | |
| big | numeric | 9.9e+35 | | |
| devmax | numeric | 0.999 | | |
| dfmax | integer | - | | |
| eps | numeric | 1e-06 | | |
| epsnr | numeric | 1e-08 | | |
| exclude | integer | - | | |
| exmx | numeric | 250 | | |
| family | character | gaussian | gaussian, poisson | - |
| fdev | numeric | 1e-05 | | |
| foldid | untyped | NULL | - | |
| gamma | untyped | - | - | |
| grouped | logical | TRUE | TRUE, FALSE | - |
| intercept | logical | TRUE | TRUE, FALSE | - |
| keep | logical | FALSE | TRUE, FALSE | - |
| lambda | untyped | - | - | |
| lambda.min.ratio | numeric | - | | |
| lower.limits | untyped | - | - | |
| maxit | integer | 100000 | | |
| mnlam | integer | 5 | | |
| mxit | integer | 100 | | |
| mxitnr | integer | 25 | | |
| nfolds | integer | 10 | | |
| nlambda | integer | 100 | | |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
| parallel | logical | FALSE | TRUE, FALSE | - |
| penalty.factor | untyped | - | - | |
| pmax | integer | - | | |
| pmin | numeric | 1e-09 | | |
| prec | numeric | 1e-10 | | |
| predict.gamma | numeric | gamma.1se | | |
| relax | logical | FALSE | TRUE, FALSE | - |
| s | numeric | lambda.1se | | |
| standardize | logical | TRUE | TRUE, FALSE | - |
| standardize.response | logical | FALSE | TRUE, FALSE | - |
| thresh | numeric | 1e-07 | | |
| trace.it | integer | 0 | | |
| type.gaussian | character | - | covariance, naive | - |
| type.logistic | character | - | Newton, modified.Newton | - |
| type.measure | character | deviance | deviance, class, auc, mse, mae | - |
| type.multinomial | character | - | ungrouped, grouped | - |
| upper.limits | untyped | - | - | |
If a Task contains a column with the offset role, it is automatically incorporated during training via the offset argument in glmnet::glmnet().
During prediction, the offset column from the test set is used only if use_pred_offset = TRUE (default), passed via the newoffset argument in glmnet::predict.glmnet().
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
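The offset behaviour described above can be illustrated directly with glmnet. This is a minimal sketch, not the learner's internal code: the row split of mtcars, the use of log(wt) as an offset, and all variable names are assumptions made for illustration.

```r
# Sketch: offset at training and prediction time with glmnet (illustrative only).
has_glmnet = requireNamespace("glmnet", quietly = TRUE)

if (has_glmnet) {
  set.seed(1)
  x = as.matrix(mtcars[, c("cyl", "disp", "hp", "drat", "qsec")])
  y = mtcars$mpg
  off = log(mtcars$wt)  # hypothetical offset column
  train = 1:24
  test = 25:32

  # Training: the offset is passed via the `offset` argument
  fit = glmnet::cv.glmnet(x[train, ], y[train], offset = off[train], nfolds = 5)

  # Prediction with the test-set offset (mirrors use_pred_offset = TRUE)
  pred_offset = predict(fit, newx = x[test, ], newoffset = off[test],
    s = "lambda.1se")

  # Prediction with a zero offset (mirrors use_pred_offset = FALSE)
  pred_zero = predict(fit, newx = x[test, ], newoffset = rep(0, length(test)),
    s = "lambda.1se")

  # On the link scale, the two predictions differ by exactly the offset
  print(all.equal(as.numeric(pred_offset - pred_zero), off[test]))
}
```

With a zero offset the prediction reduces to the linear predictor alone, which is why the difference between the two prediction vectors recovers the offset values.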
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrCVGlmnet
new()
Creates a new instance of this R6 class.
LearnerRegrCVGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet()
with type set to "nonzero".
LearnerRegrCVGlmnet$selected_features(lambda = NULL)
lambda: (numeric(1))
Custom lambda, defaults to the active lambda depending on parameter set.
Returns: (character()) of feature names.
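As a hedged usage sketch of selected_features(), the following queries the selected features once at the active lambda and once at a custom value. It assumes that learner$model is the object returned by glmnet::cv.glmnet() and therefore carries a lambda.min element; the task and the nfolds value are arbitrary choices for illustration.

```r
# Sketch: query selected features at different lambda values (illustrative only).
pkgs = c("mlr3", "mlr3learners", "glmnet")
has_pkgs = all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (has_pkgs) {
  library(mlr3)
  library(mlr3learners)
  set.seed(1)

  task = tsk("mtcars")
  learner = lrn("regr.cv_glmnet", nfolds = 5)
  learner$train(task)

  # Features with non-zero coefficients at the active lambda
  feats_default = learner$selected_features()

  # Features at a custom lambda, here the lambda.min stored by cv.glmnet
  # (assumption: learner$model is the cv.glmnet object)
  feats_min = learner$selected_features(lambda = learner$model$lambda.min)

  print(feats_default)
  print(feats_min)
}
```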
clone()
The objects of this class are cloneable with this method.
LearnerRegrCVGlmnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.cv_glmnet")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Generalized linear models with elastic net regularization.
Calls glmnet::glmnet() from package glmnet.
The default for hyperparameter family is set to "gaussian".
Caution: This learner differs from learners calling glmnet::cv.glmnet()
in that it does not use the internal optimization of parameter lambda.
Instead, lambda needs to be tuned by the user (e.g., via mlr3tuning).
When lambda is tuned, the glmnet model is retrained in each tuning iteration.
While fitting the whole path of lambdas would be more efficient, as is done
by default in glmnet::glmnet(), tuning/selecting the parameter at prediction time
(using parameter s) is currently not supported in mlr3
(at least not in an efficient manner).
Tuning the s parameter is, therefore, currently discouraged.
When the data are i.i.d. and efficiency is key, we recommend using the respective
auto-tuning counterparts mlr_learners_classif.cv_glmnet or
mlr_learners_regr.cv_glmnet.
However, in some situations this is not applicable, usually when data are
imbalanced or not i.i.d. (longitudinal, time-series) and tuning requires
custom resampling strategies (blocked design, stratification).
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.glmnet")
lrn("regr.glmnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, glmnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alignment | character | lambda | lambda, fraction | - |
| alpha | numeric | 1 | | |
| big | numeric | 9.9e+35 | | |
| devmax | numeric | 0.999 | | |
| dfmax | integer | - | | |
| eps | numeric | 1e-06 | | |
| epsnr | numeric | 1e-08 | | |
| exact | logical | FALSE | TRUE, FALSE | - |
| exclude | integer | - | | |
| exmx | numeric | 250 | | |
| family | character | gaussian | gaussian, poisson | - |
| fdev | numeric | 1e-05 | | |
| gamma | numeric | 1 | | |
| grouped | logical | TRUE | TRUE, FALSE | - |
| intercept | logical | TRUE | TRUE, FALSE | - |
| keep | logical | FALSE | TRUE, FALSE | - |
| lambda | untyped | - | | - |
| lambda.min.ratio | numeric | - | | |
| lower.limits | untyped | - | | - |
| maxit | integer | 100000 | | |
| mnlam | integer | 5 | | |
| mxit | integer | 100 | | |
| mxitnr | integer | 25 | | |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
| nlambda | integer | 100 | | |
| parallel | logical | FALSE | TRUE, FALSE | - |
| penalty.factor | untyped | - | | - |
| pmax | integer | - | | |
| pmin | numeric | 1e-09 | | |
| prec | numeric | 1e-10 | | |
| relax | logical | FALSE | TRUE, FALSE | - |
| s | numeric | 0.01 | | |
| standardize | logical | TRUE | TRUE, FALSE | - |
| standardize.response | logical | FALSE | TRUE, FALSE | - |
| thresh | numeric | 1e-07 | | |
| trace.it | integer | 0 | | |
| type.gaussian | character | - | covariance, naive | - |
| type.logistic | character | - | Newton, modified.Newton | - |
| type.multinomial | character | - | ungrouped, grouped | - |
| upper.limits | untyped | - | | - |
If a Task contains a column with the offset role, it is automatically incorporated during training via the offset argument in glmnet::glmnet().
During prediction, the offset column from the test set is used only if use_pred_offset = TRUE (default), passed via the newoffset argument in glmnet::predict.glmnet().
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrGlmnet
new()
Creates a new instance of this R6 class.
LearnerRegrGlmnet$new()
selected_features()
Returns the set of selected features as reported by glmnet::predict.glmnet()
with type set to "nonzero".
LearnerRegrGlmnet$selected_features(lambda = NULL)
lambda: (numeric(1))
Custom lambda, defaults to the active lambda depending on parameter set.
Returns: (character()) of feature names.
clone()
The objects of this class are cloneable with this method.
LearnerRegrGlmnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Friedman J, Hastie T, Tibshirani R (2010). “Regularization Paths for Generalized Linear Models via Coordinate Descent.” Journal of Statistical Software, 33(1), 1–22. doi:10.18637/jss.v033.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.glmnet")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
k-Nearest-Neighbor regression.
Calls kknn::kknn() from package kknn.
store_model:
See note.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.kknn")
lrn("regr.kknn")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, kknn
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| k | integer | 7 | | |
| distance | numeric | 2 | | |
| kernel | character | optimal | rectangular, triangular, epanechnikov, biweight, triweight, cos, inv, gaussian, rank, optimal | - |
| scale | logical | TRUE | TRUE, FALSE | - |
| ykernel | untyped | NULL | | - |
| store_model | logical | FALSE | TRUE, FALSE | - |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrKKNN
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrKKNN$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrKKNN$clone(deep = FALSE)
deep: Whether to make a deep clone.
There is no training step for k-NN models; the training data is only stored
and then processed during the predict step.
Therefore, $model returns a list with the following elements:
formula: Formula for calling kknn::kknn() during $predict().
data: Training data for calling kknn::kknn() during $predict().
pv: Training parameters for calling kknn::kknn() during $predict().
kknn: Model as returned by kknn::kknn(), only available after $predict() has been called.
This is not stored by default; you must set the hyperparameter store_model to TRUE.
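The life cycle of $model described in this note can be sketched as follows. The task and the train/test split are arbitrary choices for illustration; the code only runs when mlr3, mlr3learners and kknn are installed.

```r
# Sketch: the kknn element of $model appears only after $predict() (illustrative only).
pkgs = c("mlr3", "mlr3learners", "kknn")
has_pkgs = all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (has_pkgs) {
  library(mlr3)
  library(mlr3learners)
  set.seed(1)

  task = tsk("mtcars")
  ids = partition(task)

  # store_model = TRUE asks the learner to keep the object returned by kknn::kknn()
  learner = lrn("regr.kknn", store_model = TRUE)
  learner$train(task, row_ids = ids$train)

  # After training, only the stored ingredients for the later kknn() call exist
  print(names(learner$model))

  # The fitted kknn object is added once $predict() has been called
  predictions = learner$predict(task, row_ids = ids$test)
  print(class(learner$model$kknn))
}
```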
Hechenbichler, Klaus, Schliep, Klaus (2004). “Weighted k-nearest-neighbor techniques and ordinal classification.” Technical Report Discussion Paper 399, SFB 386, Ludwig-Maximilians University Munich. doi:10.5282/ubm/epub.1769.
Samworth, J R (2012). “Optimal weighted nearest neighbour classifiers.” The Annals of Statistics, 40(5), 2733–2763. doi:10.1214/12-AOS1049.
Cover, Thomas, Hart, Peter (1967). “Nearest neighbor pattern classification.” IEEE transactions on information theory, 13(1), 21–27. doi:10.1109/TIT.1967.1053964.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.kknn")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Kriging regression.
Calls DiceKriging::km() from package DiceKriging.
The predict type hyperparameter "type" defaults to "SK" (simple kriging).
The additional hyperparameter nugget.stability is used to overwrite the
hyperparameter nugget with nugget.stability * var(y) before training to
improve the numerical stability. We recommend a value of 1e-8.
The additional hyperparameter jitter can be set to add
N(0, [jitter])-distributed noise to the data before prediction to avoid
perfect interpolation. We recommend a value of 1e-12.
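To make the nugget.stability rule concrete, the sketch below computes the nugget value that would result for an example target vector and then passes the two recommended values to the learner. The use of mtcars$mpg as the target and the variable names are assumptions for illustration; the learner is only constructed, not trained.

```r
# Sketch: how nugget.stability translates into a nugget value (illustrative only).
y = mtcars$mpg           # example target, an assumption for this sketch
nugget_stability = 1e-8  # value recommended in the text
nugget = nugget_stability * var(y)
print(nugget)

# Passing both stabilizing hyperparameters to the learner, if the packages are installed
pkgs = c("mlr3", "mlr3learners")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  library(mlr3)
  library(mlr3learners)
  learner = lrn("regr.km", nugget.stability = 1e-8, jitter = 1e-12)
  print(learner$param_set$values)
}
```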
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.km")
lrn("regr.km")
Task type: “regr”
Predict Types: “response”, “se”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, DiceKriging
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| bias.correct | logical | FALSE | TRUE, FALSE | - |
| checkNames | logical | TRUE | TRUE, FALSE | - |
| coef.cov | untyped | NULL | | - |
| coef.trend | untyped | NULL | | - |
| coef.var | untyped | NULL | | - |
| control | untyped | NULL | | - |
| cov.compute | logical | TRUE | TRUE, FALSE | - |
| covtype | character | matern5_2 | gauss, matern5_2, matern3_2, exp, powexp | - |
| estim.method | character | MLE | MLE, LOO | - |
| gr | logical | TRUE | TRUE, FALSE | - |
| iso | logical | FALSE | TRUE, FALSE | - |
| jitter | numeric | 0 | | |
| kernel | untyped | NULL | | - |
| knots | untyped | NULL | | - |
| light.return | logical | FALSE | TRUE, FALSE | - |
| lower | untyped | NULL | | - |
| multistart | integer | 1 | | |
| noise.var | untyped | NULL | | - |
| nugget | numeric | - | | |
| nugget.estim | logical | FALSE | TRUE, FALSE | - |
| nugget.stability | numeric | 0 | | |
| optim.method | character | BFGS | BFGS, gen | - |
| parinit | untyped | NULL | | - |
| penalty | untyped | NULL | | - |
| scaling | logical | FALSE | TRUE, FALSE | - |
| se.compute | logical | TRUE | TRUE, FALSE | - |
| type | character | SK | SK, UK | - |
| upper | untyped | NULL | | - |
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrKM
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrKM$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrKM$clone(deep = FALSE)
deep: Whether to make a deep clone.
Roustant O, Ginsbourger D, Deville Y (2012). “DiceKriging, DiceOptim: Two R Packages for the Analysis of Computer Experiments by Kriging-Based Metamodeling and Optimization.” Journal of Statistical Software, 51(1), 1–55. doi:10.18637/jss.v051.i01.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.km")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Ordinary linear regression.
Calls stats::lm().
If a Task has a column with the role offset, it will automatically be used during training.
The offset is incorporated through the formula interface to ensure compatibility with stats::lm().
We add it to the model formula as offset(<column_name>) and also include it in the training data.
During prediction, the default behavior is to use the offset column from the test set (enabled by use_pred_offset = TRUE).
Otherwise, if the user sets use_pred_offset = FALSE, a zero offset is applied, effectively disabling the offset adjustment during prediction.
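The formula-based offset mechanism can be reproduced with stats::lm() alone. The following is a minimal sketch rather than the learner's internal code: the column name off, its definition as log(wt), and the row split are assumptions for illustration.

```r
# Sketch: offset handling with stats::lm() (illustrative, not the learner's internals).
dat = mtcars
dat$off = log(dat$wt)  # hypothetical offset column
train = dat[1:24, ]
test = dat[25:32, ]

# Training: the offset enters through the formula as offset(<column_name>)
fit = lm(mpg ~ hp + qsec + offset(off), data = train)

# Prediction using the offset column of the test set (mirrors use_pred_offset = TRUE)
pred_offset = predict(fit, newdata = test)

# Prediction with a zero offset (mirrors use_pred_offset = FALSE)
test_zero = test
test_zero$off = 0
pred_zero = predict(fit, newdata = test_zero)

# The two predictions differ by exactly the test-set offset
print(all.equal(unname(pred_offset - pred_zero), test$off))
```

Because the offset has a fixed coefficient of 1, replacing it with zeros removes its contribution without changing the estimated coefficients.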
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.lm")
lrn("regr.lm")
Task type: “regr”
Predict Types: “response”, “se”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”
Required Packages: mlr3, mlr3learners, stats
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| df | numeric | Inf | | |
| interval | character | - | none, confidence, prediction | - |
| level | numeric | 0.95 | | |
| model | logical | TRUE | TRUE, FALSE | - |
| pred.var | untyped | - | | - |
| qr | logical | TRUE | TRUE, FALSE | - |
| scale | numeric | NULL | | |
| singular.ok | logical | TRUE | TRUE, FALSE | - |
| x | logical | FALSE | TRUE, FALSE | - |
| y | logical | FALSE | TRUE, FALSE | - |
| rankdeficient | character | - | warnif, simple, non-estim, NA, NAwarn | - |
| tol | numeric | 1e-07 | | |
| verbose | logical | FALSE | TRUE, FALSE | - |
| use_pred_offset | logical | TRUE | TRUE, FALSE | - |
To ensure reproducibility, this learner always uses the default contrasts:
contr.treatment() for unordered factors, and
contr.poly() for ordered factors.
Setting the option "contrasts" does not have any effect.
Instead, set the respective hyperparameter or use mlr3pipelines to create dummy features.
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrLM
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrLM$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrLM$clone(deep = FALSE)
deep: Whether to make a deep clone.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.lm")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Single Layer Neural Network.
Calls nnet::nnet.formula() from package nnet.
Note that modern neural networks with multiple layers are available via package mlr3torch.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.nnet")
lrn("regr.nnet")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, nnet
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| Hess | logical | FALSE | TRUE, FALSE | - |
| MaxNWts | integer | 1000 | | |
| Wts | untyped | - | | - |
| abstol | numeric | 1e-04 | | |
| censored | logical | FALSE | TRUE, FALSE | - |
| contrasts | untyped | NULL | | - |
| decay | numeric | 0 | | |
| mask | untyped | - | | - |
| maxit | integer | 100 | | |
| na.action | untyped | - | | - |
| rang | numeric | 0.7 | | |
| reltol | numeric | 1e-08 | | |
| size | integer | 3 | | |
| skip | logical | FALSE | TRUE, FALSE | - |
| subset | untyped | - | | - |
| trace | logical | TRUE | TRUE, FALSE | - |
| formula | untyped | - | | - |
size:
Adjusted default: 3L.
Reason for change: no default in nnet().
formula: if not provided, the formula is set to task$formula().
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrNnet
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrNnet$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrNnet$clone(deep = FALSE)
deep: Whether to make a deep clone.
Ripley BD (1996). Pattern Recognition and Neural Networks. Cambridge University Press. doi:10.1017/cbo9780511812651.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.ranger,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.nnet")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method (call the method; printing it without parentheses shows the function)
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Random regression forest.
Calls ranger() from package ranger.
In addition to the uncertainty estimation methods provided by the ranger package, the learner provides an ensemble standard deviation and a law of total variance uncertainty estimation. Both methods compute the empirical mean and variance of the training data points that fall into the predicted leaf nodes. The ensemble standard deviation method calculates the standard deviation of the means of the leaf nodes. The law of total variance method calculates the mean of the variances of the leaf nodes plus the variance of the means of the leaf nodes. Formulas for both methods are given in Hutter et al. (2015).
For these two methods, the parameter sigma2.threshold sets a minimal value for the variance of the leaf nodes:
if the variance of a leaf node is below this threshold, it is set to the threshold value (as described in the paper).
Default is 1e-2.
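The two estimators can be sketched in a few lines of base R for a single prediction point. This is an illustration under stated assumptions, not the learner's implementation: the per-tree leaf means and variances are made-up numbers, the variance of the means is computed with a divisor of the number of trees, and the threshold is applied only where leaf variances enter the formula.

```r
# Sketch: ensemble standard deviation and law of total variance for one test point.
# Made-up leaf statistics of four trees (assumption for illustration).
leaf_means = c(2.0, 2.5, 1.5, 2.0)
leaf_vars = c(0.04, 0.001, 0.09, 0.0)
sigma2_threshold = 1e-2

# Leaf variances below the threshold are raised to the threshold
leaf_vars_clamped = pmax(leaf_vars, sigma2_threshold)

# Variance of the leaf means across trees (divisor B, an assumption of this sketch)
var_of_means = mean((leaf_means - mean(leaf_means))^2)

# Ensemble standard deviation: spread of the per-tree means
se_ensemble = sqrt(var_of_means)

# Law of total variance: mean of leaf variances plus variance of leaf means
se_total = sqrt(mean(leaf_vars_clamped) + var_of_means)

print(c(ensemble = se_ensemble, total_variance = se_total))
```

The law of total variance estimate is never smaller than the ensemble standard deviation, since it adds the non-negative within-leaf variance term.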
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.ranger")
lrn("regr.ranger")
Task type: “regr”
Predict Types: “response”, “se”, “quantiles”
Feature Types: “logical”, “integer”, “numeric”, “character”, “factor”, “ordered”
Required Packages: mlr3, mlr3learners, ranger
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| always.split.variables | untyped | - | | - |
| holdout | logical | FALSE | TRUE, FALSE | - |
| importance | character | - | none, impurity, impurity_corrected, permutation | - |
| keep.inbag | logical | FALSE | TRUE, FALSE | - |
| max.depth | integer | NULL | | |
| min.bucket | integer | 1 | | |
| min.node.size | integer | 5 | | |
| mtry | integer | - | | |
| mtry.ratio | numeric | - | | |
| na.action | character | na.learn | na.learn, na.omit, na.fail | - |
| node.stats | logical | FALSE | TRUE, FALSE | - |
| num.random.splits | integer | 1 | | |
| num.threads | integer | 1 | | |
| num.trees | integer | 500 | | |
| oob.error | logical | TRUE | TRUE, FALSE | - |
| poisson.tau | numeric | 1 | | |
| regularization.factor | untyped | 1 | | - |
| regularization.usedepth | logical | FALSE | TRUE, FALSE | - |
| replace | logical | TRUE | TRUE, FALSE | - |
| respect.unordered.factors | character | - | ignore, order, partition | - |
| sample.fraction | numeric | - | | |
| save.memory | logical | FALSE | TRUE, FALSE | - |
| scale.permutation.importance | logical | FALSE | TRUE, FALSE | - |
| se.method | character | infjack | jack, infjack, ensemble_standard_deviation, law_of_total_variance | - |
| sigma2.threshold | numeric | 0.01 | | |
| seed | integer | NULL | | |
| split.select.weights | untyped | NULL | | - |
| splitrule | character | variance | variance, extratrees, maxstat, beta, poisson | - |
| verbose | logical | TRUE | TRUE, FALSE | - |
| write.forest | logical | TRUE | TRUE, FALSE | - |
mtry:
This hyperparameter can alternatively be set via our hyperparameter mtry.ratio
as mtry = max(ceiling(mtry.ratio * n_features), 1).
Note that mtry and mtry.ratio are mutually exclusive.
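The conversion from mtry.ratio to mtry can be checked by hand. The following is a minimal sketch in base R that re-implements the formula above; the helper name `ratio_to_mtry` is hypothetical and not part of mlr3learners.

```r
# Hypothetical helper mirroring mtry = max(ceiling(mtry.ratio * n_features), 1)
ratio_to_mtry = function(mtry.ratio, n_features) {
  max(ceiling(mtry.ratio * n_features), 1)
}

# mtcars has 10 features once the target "mpg" is removed
n_features = 10

ratio_to_mtry(0.33, n_features)  # ceiling(3.3) = 4
ratio_to_mtry(0.01, n_features)  # ceiling(0.1) = 1
ratio_to_mtry(0, n_features)     # ceiling(0) = 0, raised to 1 by max()
```

With the learner itself, the same ratio would be passed as lrn("regr.ranger", mtry.ratio = 0.33). Because mtry and mtry.ratio are mutually exclusive, only one of them should be set.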
num.threads:
Actual default: 2, using two threads, while also respecting environment variable
R_RANGER_NUM_THREADS, options(ranger.num.threads = N), or options(Ncpus = N), with
precedence in that order.
Adjusted value: 1.
Reason for change: Conflicting with parallelization via future.
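If ranger's own multithreading is wanted instead of parallelization via future, the adjusted value can be overridden per learner. This is a sketch that assumes mlr3 and mlr3learners are installed; the thread count of 2 is an arbitrary illustrative value.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("regr.ranger")

# Value pre-set by mlr3learners (documented above as 1)
learner$param_set$values$num.threads

# Opt in to more threads for this learner only
learner$param_set$set_values(num.threads = 2)

# Alternative: mlr3's set_threads() adjusts the learner's thread parameter
set_threads(learner, n = 2)
learner$param_set$values$num.threads
```

Raising num.threads while also running resampling in parallel via future can oversubscribe the CPU, which is the conflict the adjusted value avoids.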
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrRanger
new()
Creates a new instance of this R6 class.
LearnerRegrRanger$new()
importance()
The importance scores are extracted from the model slot variable.importance.
The hyperparameter importance must be set to "impurity", "impurity_corrected", or
"permutation" before training.
LearnerRegrRanger$importance()
Named numeric().
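A short sketch of how the importance scores might be obtained, assuming mlr3, mlr3learners, and ranger are installed. The choice of "permutation" and of 100 trees is only for illustration.

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")

# The importance type has to be requested before training
learner = lrn("regr.ranger", importance = "permutation", num.trees = 100)
learner$train(task)

# Named numeric vector with one score per feature
imp = learner$importance()
head(imp, 3)
```

Calling $importance() on a forest trained without an importance type is expected to fail, since ranger then stores no scores in variable.importance.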
oob_error()
The out-of-bag error, extracted from model slot prediction.error.
LearnerRegrRanger$oob_error()
numeric(1)
selected_features()
The set of features used for node splitting in the forest.
LearnerRegrRanger$selected_features()
character().
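Both accessors above can be tried on a small forest. The sketch assumes mlr3, mlr3learners, and ranger are installed; the number of trees is an arbitrary illustrative value.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("regr.ranger", num.trees = 100)
learner$train(tsk("mtcars"))

# Out-of-bag error taken from the model slot prediction.error;
# for regression forests ranger reports this as a mean squared error
oob = learner$oob_error()
oob

# Names of the features that were used in at least one split
feats = learner$selected_features()
feats
```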
clone()
The objects of this class are cloneable with this method.
LearnerRegrRanger$clone(deep = FALSE)
deep: Whether to make a deep clone.
Wright, N. M, Ziegler, Andreas (2017). “ranger: A Fast Implementation of Random Forests for High Dimensional Data in C++ and R.” Journal of Statistical Software, 77(1), 1–17. doi:10.18637/jss.v077.i01.
Breiman, Leo (2001). “Random Forests.” Machine Learning, 45(1), 5–32. ISSN 1573-0565, doi:10.1023/A:1010933404324.
Hutter, Frank, Xu, Lin, Hoos, H. H, Leyton-Brown, Kevin (2015). “Algorithm runtime prediction: methods and evaluation.” In Proceedings of the 24th International Conference on Artificial Intelligence, series IJCAI'15, 4197–4201.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.svm,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.ranger", importance = "impurity")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
Support vector machine for regression.
Calls e1071::svm() from package e1071.
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.svm")
lrn("regr.svm")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, e1071
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| cachesize | numeric | 40 | | |
| coef0 | numeric | 0 | | |
| cost | numeric | 1 | | |
| cross | integer | 0 | | |
| degree | integer | 3 | | |
| epsilon | numeric | 0.1 | | |
| fitted | logical | TRUE | TRUE, FALSE | - |
| gamma | numeric | - | | |
| kernel | character | radial | linear, polynomial, radial, sigmoid | - |
| nu | numeric | 0.5 | | |
| scale | untyped | TRUE | - | |
| shrinking | logical | TRUE | TRUE, FALSE | - |
| tolerance | numeric | 0.001 | | |
| type | character | eps-regression | eps-regression, nu-regression | - |
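The hyperparameters in the table can be set at construction. The following sketch assumes mlr3, mlr3learners, and e1071 are installed, and the values are illustrative, not tuned. It sets type and kernel explicitly, on the assumption that hyperparameters such as cost, epsilon, and gamma may only be valid once the setting they depend on has been given a value.

```r
library(mlr3)
library(mlr3learners)

# type and kernel are set explicitly so that the dependent
# hyperparameters (cost, epsilon, gamma) are valid to set
learner = lrn("regr.svm",
  type = "eps-regression",
  kernel = "radial",
  cost = 10,
  epsilon = 0.1,
  gamma = 0.1
)

task = tsk("mtcars")
ids = partition(task)

learner$train(task, row_ids = ids$train)
prediction = learner$predict(task, row_ids = ids$test)

# Root mean squared error on the held-out rows
prediction$score(msr("regr.rmse"))
```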
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrSVM
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrSVM$new()
clone()
The objects of this class are cloneable with this method.
LearnerRegrSVM$clone(deep = FALSE)
deep: Whether to make a deep clone.
Cortes, Corinna, Vapnik, Vladimir (1995). “Support-vector networks.” Machine Learning, 20(3), 273–297. doi:10.1007/BF00994018.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.xgboost
    # Define the Learner and set parameter values
    learner = lrn("regr.svm")
    print(learner)

    # Define a Task
    task = tsk("mtcars")

    # Create train and test set
    ids = partition(task)

    # Train the learner on the training ids
    learner$train(task, row_ids = ids$train)

    # Print the model
    print(learner$model)

    # Importance method
    if ("importance" %in% learner$properties) print(learner$importance())

    # Make predictions for the test rows
    predictions = learner$predict(task, row_ids = ids$test)

    # Score the predictions
    predictions$score()
eXtreme Gradient Boosting regression.
Calls xgboost::xgb.train() from package xgboost.
To compute on GPUs, you first need to compile xgboost yourself and link against CUDA. See https://xgboost.readthedocs.io/en/stable/build.html#building-with-gpu-support.
Note that using the evals parameter directly will lead to problems when wrapping this mlr3::Learner in an mlr3pipelines GraphLearner,
because the preprocessing steps will not be applied to the data in evals.
See the section Early Stopping and Validation on how to set up validation data instead.
If a Task has a column with the role offset, it will automatically be used during training.
The offset is incorporated through the xgboost::xgb.DMatrix interface, using the base_margin field.
No offset is applied during prediction for this learner.
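A sketch of how an offset column might be declared, assuming a recent mlr3 version that supports the offset column role, plus mlr3learners and xgboost. The column name `baseline` and its constant value are hypothetical and chosen only for illustration.

```r
library(mlr3)
library(mlr3learners)

# Hypothetical offset: a constant baseline equal to the mean target value
data = mtcars
data$baseline = mean(data$mpg)

task = as_task_regr(data, target = "mpg")

# Move the column from the feature set to the offset role
task$set_col_roles("baseline", roles = "offset")
task$col_roles$offset

# The offset is picked up automatically during training
learner = lrn("regr.xgboost", nrounds = 20)
learner$train(task)
```

Since no offset is applied at prediction time, predictions from this learner do not include the baseline.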
This mlr3::Learner can be instantiated via the dictionary mlr3::mlr_learners or with the associated sugar function mlr3::lrn():
mlr_learners$get("regr.xgboost")
lrn("regr.xgboost")
Task type: “regr”
Predict Types: “response”
Feature Types: “logical”, “integer”, “numeric”
Required Packages: mlr3, mlr3learners, xgboost
| Id | Type | Default | Levels | Range |
|---|---|---|---|---|
| alpha | numeric | 0 | | |
| approxcontrib | logical | FALSE | TRUE, FALSE | - |
| base_score | numeric | 0.5 | | |
| booster | character | gbtree | gbtree, gblinear, dart | - |
| callbacks | untyped | list() | - | |
| colsample_bylevel | numeric | 1 | | |
| colsample_bynode | numeric | 1 | | |
| colsample_bytree | numeric | 1 | | |
| device | untyped | "cpu" | - | |
| disable_default_eval_metric | logical | FALSE | TRUE, FALSE | - |
| early_stopping_rounds | integer | NULL | | |
| eta | numeric | 0.3 | | |
| evals | untyped | NULL | - | |
| eval_metric | untyped | - | - | |
| custom_metric | untyped | - | - | |
| extmem_single_page | logical | FALSE | TRUE, FALSE | - |
| feature_selector | character | cyclic | cyclic, shuffle, random, greedy, thrifty | - |
| gamma | numeric | 0 | | |
| grow_policy | character | depthwise | depthwise, lossguide | - |
| huber_slope | numeric | 1 | | |
| interaction_constraints | untyped | - | - | |
| iterationrange | untyped | - | - | |
| lambda | numeric | 1 | | |
| max_bin | integer | 256 | | |
| max_cached_hist_node | integer | 65536 | | |
| max_cat_to_onehot | integer | - | | |
| max_cat_threshold | numeric | - | | |
| max_delta_step | numeric | 0 | | |
| max_depth | integer | 6 | | |
| max_leaves | integer | 0 | | |
| maximize | logical | NULL | TRUE, FALSE | - |
| min_child_weight | numeric | 1 | | |
| missing | numeric | NA | | |
| monotone_constraints | untyped | 0 | - | |
| nrounds | integer | - | | |
| normalize_type | character | tree | tree, forest | - |
| nthread | integer | - | | |
| num_parallel_tree | integer | 1 | | |
| objective | untyped | "reg:squarederror" | - | |
| one_drop | logical | FALSE | TRUE, FALSE | - |
| print_every_n | integer | 1 | | |
| rate_drop | numeric | 0 | | |
| refresh_leaf | logical | TRUE | TRUE, FALSE | - |
| seed | integer | - | | |
| seed_per_iteration | logical | FALSE | TRUE, FALSE | - |
| sampling_method | character | uniform | uniform, gradient_based | - |
| sample_type | character | uniform | uniform, weighted | - |
| save_name | untyped | NULL | - | |
| save_period | integer | NULL | | |
| scale_pos_weight | numeric | 1 | | |
| skip_drop | numeric | 0 | | |
| subsample | numeric | 1 | | |
| top_k | integer | 0 | | |
| training | logical | FALSE | TRUE, FALSE | - |
| tree_method | character | auto | auto, exact, approx, hist, gpu_hist | - |
| tweedie_variance_power | numeric | 1.5 | | |
| updater | untyped | - | - | |
| use_rmm | logical | - | TRUE, FALSE | - |
| validate_features | logical | TRUE | TRUE, FALSE | - |
| verbose | integer | - | | |
| verbosity | integer | - | | |
| xgb_model | untyped | NULL | - | |
In order to monitor the validation performance during the training, you can set the $validate field of the Learner.
For information on how to configure the validation set, see the Validation section of mlr3::Learner.
This validation data can also be used for early stopping, which can be enabled by setting the early_stopping_rounds parameter.
The final validation scores (or, with early stopping, the best ones) can be accessed via $internal_valid_scores, and the optimal nrounds via $internal_tuned_values.
The internal validation measure can be set via the custom_metric parameter, which can be a mlr3::Measure, a function, or a character string for the internal xgboost measures.
Using an mlr3::Measure is slower than the internal xgboost measures, but allows the same measure to be used for tuning and validation.
nrounds:
Actual default: no default.
Adjusted default: 1000.
Reason for change: Without a default, construction of the learner would error. The lightgbm learner has a default of 1000, so we use the same here.
nthread:
Actual value: Undefined, triggering auto-detection of the number of CPUs.
Adjusted value: 1.
Reason for change: Conflicting with parallelization via future.
verbose:
Actual default: 1.
Adjusted default: 0.
Reason for change: Reduce verbosity.
verbosity:
Actual default: 1.
Adjusted default: 0.
Reason for change: Reduce verbosity.
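The adjusted values described above can be inspected on a freshly constructed learner and overridden where needed. This sketch assumes mlr3, mlr3learners, and xgboost are installed; the replacement values are arbitrary.

```r
library(mlr3)
library(mlr3learners)

learner = lrn("regr.xgboost")

# Values pre-set by mlr3learners at construction
learner$param_set$values[c("nrounds", "nthread", "verbose")]

# Override them for this learner only (illustrative values)
learner$param_set$set_values(nrounds = 200, nthread = 2)
learner$param_set$values$nrounds
```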
mlr3::Learner -> mlr3::LearnerRegr -> LearnerRegrXgboost
internal_valid_scores (named list() or NULL)
The validation scores extracted from model$evaluation_log.
If early stopping is activated, this contains the validation scores of the model at the optimal nrounds;
otherwise, it contains the scores at the nrounds of the final model.
internal_tuned_values (named list() or NULL)
If early stopping is activated, this returns a list with nrounds,
which is extracted from $best_iteration of the model; otherwise, it is NULL.
validate (numeric(1) or character(1) or NULL)
How to construct the internal validation data. This parameter can be either NULL,
a ratio, "test", or "predefined".
model (any)
The fitted model. Only available after $train() has been called.
Returns the $best_iteration when early stopping is activated.
Inherited methods: mlr3::Learner$base_learner(), mlr3::Learner$configure(), mlr3::Learner$encapsulate(), mlr3::Learner$format(), mlr3::Learner$help(), mlr3::Learner$predict(), mlr3::Learner$predict_newdata(), mlr3::Learner$print(), mlr3::Learner$reset(), mlr3::Learner$selected_features(), mlr3::Learner$train(), mlr3::LearnerRegr$predict_newdata_fast()
new()
Creates a new instance of this R6 class.
LearnerRegrXgboost$new()
importance()
The importance scores are calculated with xgboost::xgb.importance().
LearnerRegrXgboost$importance()
Named numeric().
clone()
The objects of this class are cloneable with this method.
LearnerRegrXgboost$clone(deep = FALSE)
deep: Whether to make a deep clone.
The outputmargin, predcontrib, predinteraction, and predleaf parameters are not supported.
You can still call e.g. predict(learner$model, newdata = newdata, outputmargin = TRUE) to get these predictions.
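The direct call mentioned above might look as follows. This is a sketch that assumes mlr3, mlr3learners, and xgboost are installed, and that the learner was trained on the same task, so that the feature columns of newdata match those seen during training.

```r
library(mlr3)
library(mlr3learners)

task = tsk("mtcars")
ids = partition(task)

learner = lrn("regr.xgboost", nrounds = 20)
learner$train(task, row_ids = ids$train)

# Numeric feature matrix for the test rows, in the task's feature order
newdata = as.matrix(task$data(rows = ids$test, cols = task$feature_names))

# Bypass the mlr3 wrapper and query the underlying model directly
margin = predict(learner$model, newdata = newdata, outputmargin = TRUE)
head(margin)
```

Predictions obtained this way are plain numeric vectors and not mlr3 Prediction objects, so they cannot be scored with $score().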
Chen, Tianqi, Guestrin, Carlos (2016). “Xgboost: A scalable tree boosting system.” In Proceedings of the 22nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 785–794. ACM. doi:10.1145/2939672.2939785.
Chapter in the mlr3book: https://mlr3book.mlr-org.com/chapters/chapter2/data_and_basic_modeling.html#sec-learners
Package mlr3extralearners for more learners.
as.data.table(mlr_learners) for a table of available Learners in the running session (depending on the loaded packages).
mlr3pipelines to combine learners with pre- and postprocessing steps.
Extension packages for additional task types:
mlr3proba for probabilistic supervised regression and survival analysis.
mlr3cluster for unsupervised clustering.
mlr3tuning for tuning of hyperparameters, mlr3tuningspaces for established default tuning spaces.
Other Learner:
mlr_learners_classif.cv_glmnet,
mlr_learners_classif.glmnet,
mlr_learners_classif.kknn,
mlr_learners_classif.lda,
mlr_learners_classif.log_reg,
mlr_learners_classif.multinom,
mlr_learners_classif.naive_bayes,
mlr_learners_classif.nnet,
mlr_learners_classif.qda,
mlr_learners_classif.ranger,
mlr_learners_classif.svm,
mlr_learners_classif.xgboost,
mlr_learners_regr.cv_glmnet,
mlr_learners_regr.glmnet,
mlr_learners_regr.kknn,
mlr_learners_regr.km,
mlr_learners_regr.lm,
mlr_learners_regr.nnet,
mlr_learners_regr.ranger,
mlr_learners_regr.svm
    ## Not run:
    if (requireNamespace("xgboost", quietly = TRUE)) {
      # Define the Learner and set parameter values
      learner = lrn("regr.xgboost")
      print(learner)

      # Define a Task
      task = tsk("mtcars")

      # Create train and test set
      ids = partition(task)

      # Train the learner on the training ids
      learner$train(task, row_ids = ids$train)

      # Print the model
      print(learner$model)

      # Importance method
      if ("importance" %in% learner$properties) print(learner$importance())

      # Make predictions for the test rows
      predictions = learner$predict(task, row_ids = ids$test)

      # Score the predictions
      predictions$score()
    }
    ## End(Not run)

    ## Not run:
    # Train learner with early stopping on the mtcars data set
    task = tsk("mtcars")

    # Use 30 percent for validation
    # Set early stopping parameter
    learner = lrn("regr.xgboost",
      nrounds = 100,
      early_stopping_rounds = 10,
      validate = 0.3
    )

    # Train learner with early stopping
    learner$train(task)

    # Inspect optimal nrounds and validation performance
    learner$internal_tuned_values
    learner$internal_valid_scores
    ## End(Not run)