--- title: Common Issues output: rmarkdown::html_vignette: toc: true vignette: > %\VignetteIndexEntry{Common Issues when Creating a new Learner} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- This vignette lists some common issues that one should pay attention to when implementing a `Learner`. ## Ordering Features When implementing the private `$.predict()` method it is a good idea to retrieve the columns in the same order as they were during `$.train()`. While this might not matter in most cases, we have encountered examples where this was necessary to ensure that the learner worked as expected. For this purpose, the `ordered_features` helper function exists that is also used in the learner template. ## Accessing Internals From `$state` Sometimes, one needs additional information that might not be stored in the machine learning model itself. In such cases, one might be tempted to access data from the `$state` of a `Learner` beyond `$state$model`. However, this is a rather internal data structure that might change in the future and one should not rely on it. Furthermore, the `$state` that is created when a learner is `resample`d is not the same that is created in a manual `$train()`. If you absolutely need additional information, you can make the private `$.train()` method return a `list()` with the actual model and additional metadata so that both is available during `$.train()`. In such cases, it is a good idea to first consult with the maintainer of `mlr3extralearners` if this is really necessary. ## Default vs. Initial Parameters When creating a parameter (via `p_dbl()` and friends), there are two similar arguments that can be specified, namely `init` and `default`. On the one hand, the argument `default` should describe the default value that the upstream package uses when no specific value is set. E.g., if one were to connect the linear model to `mlr3`, the default for the parameter `singualr.ok` can be accessed via `formals(lm)$singular.ok`. Note that these default values do not set any specific parameter values in the `$param_set` of the `Learner`. On the other hand, the `init` field describes to what the parameter value should be initialized to (so it is then also accessible via `learner$param_set$values$`). By default, it is not initialized to any value, which means that the default behavior is used. ## Complex Defaults When annotating the default value for a parameter (argument `default`) in the `ParamSet`, there are cases where defaults are complex expressions such as `sample.int(10000L)`. In such cases, it is ok to not specify any `default` value for the parameter, which `paradox` then interpretes as having a complex default.