Title: | Support for Spatial Objects Within the 'mlr3' Ecosystem |
---|---|
Description: | Extends the 'mlr3' ML framework with methods for spatial objects. Data storage and prediction are supported for packages 'terra', 'raster' and 'stars'. |
Authors: | Marc Becker [aut, cre] , Patrick Schratz [aut] |
Maintainer: | Marc Becker <[email protected]> |
License: | LGPL-3 |
Version: | 0.5.0 |
Built: | 2024-12-27 02:57:47 UTC |
Source: | https://github.com/mlr-org/mlr3spatial |
Extends the 'mlr3' ML framework with methods for spatial objects. Data storage and prediction are supported for packages 'terra', 'raster' and 'stars'.
Book on mlr3: https://mlr3book.mlr-org.com
Use cases and examples gallery: https://mlr3gallery.mlr-org.com
Cheat Sheets: https://github.com/mlr-org/mlr3cheatsheets
Preprocessing and machine learning pipelines: mlr3pipelines
Analysis of benchmark experiments: mlr3benchmark
More classification and regression tasks: mlr3data
Solid selection of good classification and regression learners: mlr3learners
Even more learners: https://github.com/mlr-org/mlr3extralearners
Tuning of hyperparameters: mlr3tuning
Hyperband tuner: mlr3hyperband
Visualizations for many mlr3 objects: mlr3viz
Survival analysis and probabilistic regression: mlr3proba
Cluster analysis: mlr3cluster
Feature selection filters: mlr3filters
Feature selection wrappers: mlr3fselect
Interface to real (out-of-memory) data bases: mlr3db
Performance measures as plain functions: mlr3measures
"mlr3.debug"
: If set to TRUE, parallelization via future is disabled to simplify debugging and provide more concise tracebacks. Note that results computed with debug mode enabled use a different seeding mechanism and are not reproducible.
"mlr3.allow_utf8_names"
: If set to TRUE, checks on the feature names are relaxed, allowing non-ASCII characters in column names. This is an experimental and temporary option to pave the way for text analysis, and will likely be removed in a future version of the package.
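Both options are ordinary R options, so they can be set with base R's options(). The following is a minimal sketch; saving and restoring the previous values is a general R idiom and not something the package requires.

```r
# Set both options and keep the previous values so they can be restored.
old = options(mlr3.debug = TRUE, mlr3.allow_utf8_names = TRUE)

# Inspect the current setting.
debug_on = getOption("mlr3.debug")
print(debug_on)

# Restore whatever was configured before.
options(old)
```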
Maintainer: Marc Becker
Authors:
Patrick Schratz
Becker M, Schratz P (2024). mlr3spatial: Support for Spatial Objects Within the 'mlr3' Ecosystem. https://mlr3spatial.mlr-org.com, https://github.com/mlr-org/mlr3spatial.
Useful links:
Report bugs at https://github.com/mlr-org/mlr3spatial/issues
Wraps a DataBackend around spatial objects.
Currently these S3 methods are only alternative ways for writing DataBackendRaster$new(). They do not support coercing from other backends yet.
## S3 method for class 'stars'
as_data_backend(data, primary_key = NULL, ...)

## S3 method for class 'SpatRaster'
as_data_backend(data, primary_key = NULL, ...)

## S3 method for class 'RasterBrick'
as_data_backend(data, primary_key = NULL, ...)

## S3 method for class 'RasterStack'
as_data_backend(data, primary_key = NULL, ...)

## S3 method for class 'sf'
as_data_backend(data, primary_key = NULL, keep_rownames = FALSE, ...)
data |
(terra::SpatRaster) |
primary_key |
( |
... |
( |
keep_rownames |
( |
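As a minimal sketch of the SpatRaster method, the raster file shipped with the package (the same file used in the predict_spatial() example) can be wrapped in a backend. The variable names are illustrative only.

```r
library(mlr3)
library(mlr3spatial)

# Load the example raster that ships with mlr3spatial.
stack = terra::rast(system.file("extdata", "leipzig_raster.tif", package = "mlr3spatial"))

# Wrap the raster in a DataBackend via the S3 method.
backend = as_data_backend(stack)

print(class(backend))
print(backend$nrow)      # one row per raster cell
print(backend$colnames)
```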
Convert object to a TaskClassifST. This is an S3 generic, specialized for at least the following objects:
TaskClassifST: Ensure the identity.
data.frame() and DataBackend: Provides an alternative to the constructor of TaskClassifST.
sf::sf: Extracts spatial metadata before construction.
TaskRegr: Calls convert_task().
as_task_classif_st(x, ...)

## S3 method for class 'TaskClassifST'
as_task_classif_st(x, clone = FALSE, ...)

## S3 method for class 'data.frame'
as_task_classif_st(
  x,
  target,
  id = deparse(substitute(x)),
  positive = NULL,
  coordinate_names,
  crs = NA_character_,
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'DataBackend'
as_task_classif_st(
  x,
  target,
  id = deparse(substitute(x)),
  positive = NULL,
  coordinate_names,
  crs,
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'sf'
as_task_classif_st(
  x,
  target = NULL,
  id = deparse(substitute(x)),
  positive = NULL,
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'TaskRegrST'
as_task_classif_st(
  x,
  target = NULL,
  drop_original_target = FALSE,
  drop_levels = TRUE,
  ...
)
x |
(any) |
... |
(any) |
clone |
( |
target |
( |
id |
( |
positive |
( |
coordinate_names |
( |
crs |
( |
coords_as_features |
( |
label |
( |
drop_original_target |
( |
drop_levels |
( |
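A minimal sketch of the data.frame method follows. The toy data, column names and the CRS string are hypothetical and chosen only for illustration; the argument names are those of the usage above.

```r
library(mlr3)
library(mlr3spatial)

# Hypothetical toy data: coordinates x/y, one feature, one factor target.
df = data.frame(
  x = seq(12.30, 12.39, by = 0.01),
  y = seq(51.30, 51.39, by = 0.01),
  ndvi = seq(0.1, 1.0, by = 0.1),
  cover = factor(rep(c("forest", "water"), 5))
)

task = as_task_classif_st(
  df,
  target = "cover",
  id = "toy_cover",
  positive = "forest",
  coordinate_names = c("x", "y"),
  crs = "EPSG:4326"
)

# With the default coords_as_features = FALSE, x and y are not features.
print(task$feature_names)
```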
Convert object to a TaskRegrST. This is an S3 generic, specialized for at least the following objects:
TaskRegrST: Ensure the identity.
data.frame() and DataBackend: Provides an alternative to the constructor of TaskRegrST.
sf::sf: Extracts spatial metadata before construction.
TaskClassif: Calls convert_task().
as_task_regr_st(x, ...)

## S3 method for class 'TaskRegrST'
as_task_regr_st(x, clone = FALSE, ...)

## S3 method for class 'data.frame'
as_task_regr_st(
  x,
  target,
  id = deparse(substitute(x)),
  coordinate_names,
  crs = NA_character_,
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'DataBackend'
as_task_regr_st(
  x,
  target,
  id = deparse(substitute(x)),
  coordinate_names,
  crs,
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'sf'
as_task_regr_st(
  x,
  target = NULL,
  id = deparse(substitute(x)),
  coords_as_features = FALSE,
  label = NA_character_,
  ...
)

## S3 method for class 'TaskClassifST'
as_task_regr_st(
  x,
  target = NULL,
  drop_original_target = FALSE,
  drop_levels = TRUE,
  ...
)
x |
(any) |
... |
(any) |
clone |
( |
target |
( |
id |
( |
coordinate_names |
( |
crs |
( |
coords_as_features |
( |
label |
( |
drop_original_target |
( |
drop_levels |
( |
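The sf method takes neither coordinate_names nor crs, because these are extracted from the sf object. The sketch below assumes the sf package is installed and uses hypothetical point data; sf::st_as_sf() is used only to build the input.

```r
library(mlr3)
library(mlr3spatial)

if (requireNamespace("sf", quietly = TRUE)) {
  # Hypothetical point data with a numeric target.
  df = data.frame(
    x = c(12.30, 12.35, 12.40, 12.45),
    y = c(51.30, 51.32, 51.34, 51.36),
    ndvi = c(0.2, 0.4, 0.6, 0.8),
    yield = c(1.1, 1.9, 3.2, 3.8)
  )
  pts = sf::st_as_sf(df, coords = c("x", "y"), crs = 4326)

  # Coordinates and CRS are taken from the sf object.
  task = as_task_regr_st(pts, target = "yield")
  print(task$crs)
  print(task$coordinate_names)
}
```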
mlr3::DataBackend for terra::SpatRaster raster objects.
There are two different ways the reading of values is performed internally:
"Block mode" reads complete rows of the raster file and subsets the requested cells. This mode is faster than "cell mode" if the complete raster file is iterated over.
"Cell mode" reads individual cells. This is faster than "block mode" if only a few cells are sampled.
"Block mode" is activated if $data(rows)
is used with an increasing integer sequence, e.g. 200:300.
If only a single cell is requested, "cell mode" is used.
mlr3::DataBackend -> DataBackendRaster
rownames (integer())
Returns vector of all distinct row identifiers, i.e. the contents of the primary key column.
colnames (character())
Returns vector of all column names.
nrow (integer(1))
Number of rows (observations).
ncol (integer(1))
Number of columns (variables).
stack (SpatRaster)
Raster stack.
new()
Creates a new instance of this R6 class.
DataBackendRaster$new(data)
data
(terra::SpatRaster)
The input terra::SpatRaster.
data()
Returns a slice of the raster in the specified format. Currently, the only supported format is "data.table".
The rows must be addressed as a vector of cell indices; columns must be referred to via layer names. Queries for rows with no matching row id and queries for columns with no matching column name are silently ignored.
Rows are guaranteed to be returned in the same order as rows; columns may be returned in an arbitrary order. Duplicated row ids result in duplicated rows; duplicated column names lead to an exception.
DataBackendRaster$data(rows, cols, data_format = "data.table")
rows (integer())
Row indices. Row indices start with 1 in the upper left corner of the raster and increase from left to right and then from top to bottom. The last cell is in the bottom right corner, and its row index equals the number of cells in the raster.
cols (character())
Column names.
data_format (character(1))
Desired data format. Currently only "data.table" is supported.
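The row-index convention for rows (1 in the upper left corner, counting left to right and then top to bottom) means that the cell in raster row r and raster column c has index (r - 1) * ncol + c. The sketch below checks this on a hypothetical 3 x 4 raster using terra::cellFromRowCol(); the raster itself is made up for illustration.

```r
# Hypothetical empty raster with 3 rows and 4 columns.
r = terra::rast(nrows = 3, ncols = 4)

# Cell in raster row 2, raster column 3, computed by hand.
cell = (2 - 1) * terra::ncol(r) + 3

# The same cell index as computed by terra.
cell_terra = terra::cellFromRowCol(r, 2, 3)

# The last index equals the number of cells.
last = terra::cellFromRowCol(r, 3, 4)

print(c(cell, cell_terra, last))
```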
head()
Retrieve the first n rows.
DataBackendRaster$head(n = 6L)
n (integer(1))
Number of rows.
data.table::data.table() of the first n rows.
distinct()
Returns a named list of vectors of distinct values for each column specified. If na_rm is TRUE, missing values are removed from the returned vectors of distinct values. Non-existing rows and columns are silently ignored.
DataBackendRaster$distinct(rows, cols, na_rm = TRUE)
rows (integer())
Row indices. Row indices start with 1 in the upper left corner of the raster and increase from left to right and then from top to bottom. The last cell is in the bottom right corner, and its row index equals the number of cells in the raster.
cols (character())
Column names.
na_rm (logical(1))
Whether to remove NAs or not.
Named list() of distinct values.
missings()
Returns the number of missing values per column in the specified slice of data. Non-existing rows and columns are silently ignored.
DataBackendRaster$missings(rows, cols)
rows (integer())
Row indices. Row indices start with 1 in the upper left corner of the raster and increase from left to right and then from top to bottom. The last cell is in the bottom right corner, and its row index equals the number of cells in the raster.
cols (character())
Column names.
Total of missing values per column (named numeric()).
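A short sketch of distinct() and missings() on a slice of the example raster follows. The choice of the first 100 cells and of the first layer is arbitrary and only for illustration.

```r
library(mlr3)
library(mlr3spatial)

stack = terra::rast(system.file("extdata", "leipzig_raster.tif", package = "mlr3spatial"))
backend = DataBackendRaster$new(stack)
layers = names(stack)

# Distinct values of the first layer within the first 100 cells.
d = backend$distinct(rows = 1:100, cols = layers[1], na_rm = TRUE)

# Number of missing values per layer within the same cells.
m = backend$missings(rows = 1:100, cols = layers)

print(names(d))
print(m)
```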
coordinates()
Returns the coordinates of rows. If rows is missing, all coordinates are returned.
DataBackendRaster$coordinates(rows)
rows (integer())
Row indices. Row indices start with 1 in the upper left corner of the raster and increase from left to right and then from top to bottom. The last cell is in the bottom right corner, and its row index equals the number of cells in the raster.
data.table::data.table() of coordinates of rows.
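As a sketch, the coordinates of the first three cells can be requested and compared by eye with terra::xyFromCell(), which returns cell-centre coordinates for the same cell indices. That the two agree is an assumption based on the description above, so the comparison is printed rather than asserted.

```r
library(mlr3)
library(mlr3spatial)

stack = terra::rast(system.file("extdata", "leipzig_raster.tif", package = "mlr3spatial"))
backend = DataBackendRaster$new(stack)

# Coordinates of the first three cells (upper left, moving right).
xy = backend$coordinates(rows = 1:3)
print(xy)

# Cell-centre coordinates of the same cells according to terra.
print(terra::xyFromCell(stack, 1:3))
```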
mlr3::DataBackend for sf::sf vector objects.
mlr3::DataBackend -> mlr3::DataBackendDataTable -> DataBackendVector
sfc (sf::sfc)
Returns the sfc object.
new()
Creates a new instance of this R6 class.
DataBackendVector$new(data, primary_key)
data (sf)
The input sf::sf vector object.
primary_key (character(1) | integer())
Name of the primary key column, or integer vector of row ids.
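A minimal sketch of wrapping an sf object, assuming the sf package is installed and that the as_data_backend() method for sf returns a DataBackendVector. The leipzig data set shipped with the package is used as input.

```r
library(mlr3)
library(mlr3spatial)

if (requireNamespace("sf", quietly = TRUE)) {
  data("leipzig", package = "mlr3spatial")

  # Wrap the sf object; the geometry is kept separately from the attribute table.
  backend = as_data_backend(leipzig)

  print(class(backend))
  print(backend$sfc)
}
```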
Point survey of land cover in Leipzig. Includes Sentinel-2 spectral bands and NDVI.
Copernicus Sentinel Data (2021). Retrieved from Copernicus Open Access Hub and processed by European Space Agency.
if (requireNamespace("sf")) {
  library(sf)
  data("leipzig", package = "mlr3spatial")
  print(leipzig)
}
This function predicts directly on various spatial objects with a trained mlr3 learner.
predict_spatial(
  newdata,
  learner,
  chunksize = 200L,
  format = "terra",
  filename = NULL
)
newdata |
(terra::SpatRaster | |
learner |
(Learner). Learner with trained model. |
chunksize |
( The default of |
format |
( |
filename |
( |
Spatial object of the class given in argument format.
library(mlr3)
library(mlr3spatial)
library(terra, exclude = "resample")

# fit rpart on training points
task_train = tsk("leipzig")
learner = lrn("classif.rpart")
learner$train(task_train)

# load raster
stack = rast(system.file("extdata", "leipzig_raster.tif", package = "mlr3spatial"))

# predict land cover classes
pred = predict_spatial(stack, learner, chunksize = 1L)
This task specializes TaskClassif for spatiotemporal classification problems.
A spatial example task is available via tsk("ecuador").
The coordinate reference system passed during initialization must match the one which was used during data creation, otherwise offsets of multiple meters may occur.
By default, coordinates are not used as features. This can be changed by setting coords_as_features = TRUE.
mlr3::Task -> mlr3::TaskSupervised -> mlr3::TaskClassif -> TaskClassifST
crs (character(1))
Returns coordinate reference system of the task.
coordinate_names (character())
Returns coordinate names.
coords_as_features (logical(1))
If TRUE, coordinates are used as features.
mlr3::Task$add_strata()
mlr3::Task$cbind()
mlr3::Task$data()
mlr3::Task$filter()
mlr3::Task$format()
mlr3::Task$formula()
mlr3::Task$head()
mlr3::Task$help()
mlr3::Task$levels()
mlr3::Task$missings()
mlr3::Task$rbind()
mlr3::Task$rename()
mlr3::Task$select()
mlr3::Task$set_col_roles()
mlr3::Task$set_levels()
mlr3::Task$set_row_roles()
mlr3::TaskClassif$droplevels()
mlr3::TaskClassif$truth()
new()
Creates a new instance of this R6 class. The function as_task_classif_st() provides an alternative way to construct classification tasks.
TaskClassifST$new(
  id,
  backend,
  target,
  positive = NULL,
  label = NA_character_,
  coordinate_names,
  crs = NA_character_,
  coords_as_features = FALSE,
  extra_args = list()
)
id (character(1))
Identifier for the new instance.
backend (DataBackend)
Either a DataBackend, or any object which is convertible to a DataBackend with as_data_backend(). E.g., an sf will be converted to a DataBackendDataTable.
target (character(1))
Name of the target column.
positive (character(1))
Only for binary classification: Name of the positive class. The levels of the target columns are reordered accordingly, so that the first element of $class_names is the positive class, and the second element is the negative class.
label (character(1))
Label for the new instance.
coordinate_names (character())
The column names of the coordinates in the data.
crs (character(1))
Coordinate reference system. WKT2 or EPSG string.
coords_as_features (logical(1))
If TRUE, coordinates are used as features.
extra_args (named list())
Named list of constructor arguments, required for converting task types via convert_task().
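The constructor arguments above can be sketched on a small data.frame, which is converted to a backend internally. All data, column names and the CRS string are hypothetical and used only for illustration.

```r
library(mlr3)
library(mlr3spatial)

# Hypothetical toy data: two coordinate columns, one feature, one binary target.
df = data.frame(
  x = seq(0, 0.9, by = 0.1),
  y = seq(0.9, 0, by = -0.1),
  ndvi = seq(0.1, 1.0, by = 0.1),
  cover = factor(rep(c("forest", "water"), 5))
)

task = TaskClassifST$new(
  id = "toy_cover",
  backend = df,
  target = "cover",
  positive = "water",
  coordinate_names = c("x", "y"),
  crs = "EPSG:4326",
  coords_as_features = TRUE
)

print(task$class_names)    # positive class comes first
print(task$feature_names)  # includes x and y because coords_as_features = TRUE
```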
coordinates()
Returns coordinates of observations.
TaskClassifST$coordinates(row_ids = NULL)
row_ids (integer())
Vector of row indices as a subset of task$row_ids.
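A brief sketch of querying coordinates for a subset of rows follows. The task is built from hypothetical toy data so that the example does not depend on any shipped data set.

```r
library(mlr3)
library(mlr3spatial)

# Hypothetical toy task with coordinate columns x and y.
df = data.frame(
  x = seq(0, 0.9, by = 0.1),
  y = seq(0.9, 0, by = -0.1),
  ndvi = seq(0.1, 1.0, by = 0.1),
  cover = factor(rep(c("forest", "water"), 5))
)
task = as_task_classif_st(df, target = "cover", coordinate_names = c("x", "y"), crs = "EPSG:4326")

# Coordinates of the first three observations only.
ids = task$row_ids[1:3]
xy = task$coordinates(row_ids = ids)
print(xy)

# With the default row_ids = NULL, all observations are returned.
xy_all = task$coordinates()
print(nrow(xy_all))
```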
print()
Print the task.
TaskClassifST$print(...)
...
Arguments passed to the $print() method of the superclass.
clone()
The objects of this class are cloneable with this method.
TaskClassifST$clone(deep = FALSE)
deep
Whether to make a deep clone.
This task specializes TaskRegr for spatiotemporal regression problems.
A spatial example task is available via tsk("cookfarm_mlr3").
The coordinate reference system passed during initialization must match the one which was used during data creation, otherwise offsets of multiple meters may occur.
By default, coordinates are not used as features. This can be changed by setting coords_as_features = TRUE.
mlr3::Task -> mlr3::TaskSupervised -> mlr3::TaskRegr -> TaskRegrST
crs (character(1))
Returns coordinate reference system of the task.
coordinate_names (character())
Returns coordinate names.
coords_as_features (logical(1))
If TRUE, coordinates are used as features.
mlr3::Task$add_strata()
mlr3::Task$cbind()
mlr3::Task$data()
mlr3::Task$droplevels()
mlr3::Task$filter()
mlr3::Task$format()
mlr3::Task$formula()
mlr3::Task$head()
mlr3::Task$help()
mlr3::Task$levels()
mlr3::Task$missings()
mlr3::Task$rbind()
mlr3::Task$rename()
mlr3::Task$select()
mlr3::Task$set_col_roles()
mlr3::Task$set_levels()
mlr3::Task$set_row_roles()
mlr3::TaskRegr$truth()
new()
Creates a new instance of this R6 class. The function as_task_regr_st() provides an alternative way to construct regression tasks.
TaskRegrST$new(
  id,
  backend,
  target,
  label = NA_character_,
  coordinate_names,
  crs = NA_character_,
  coords_as_features = FALSE,
  extra_args = list()
)
id (character(1))
Identifier for the new instance.
backend (DataBackend)
Either a DataBackend, or any object which is convertible to a DataBackend with as_data_backend(). E.g., an sf will be converted to a DataBackendDataTable.
target (character(1))
Name of the target column.
label (character(1))
Label for the new instance.
coordinate_names (character())
The column names of the coordinates in the data.
crs (character(1))
Coordinate reference system. WKT2 or EPSG string.
coords_as_features (logical(1))
If TRUE, coordinates are used as features.
extra_args (named list())
Named list of constructor arguments, required for converting task types via convert_task().
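The regression constructor can be sketched in the same way on a small data.frame. The data, column names and the CRS string are hypothetical; with the default coords_as_features = FALSE, the coordinate columns are kept out of the feature set.

```r
library(mlr3)
library(mlr3spatial)

# Hypothetical toy data with a numeric target.
df = data.frame(
  x = seq(0, 0.9, by = 0.1),
  y = seq(0.9, 0, by = -0.1),
  ndvi = seq(0.1, 1.0, by = 0.1),
  yield = seq(1, 10, by = 1)
)

task = TaskRegrST$new(
  id = "toy_yield",
  backend = df,
  target = "yield",
  coordinate_names = c("x", "y"),
  crs = "EPSG:4326"
)

print(task$coordinate_names)
print(task$feature_names)  # only ndvi, because coordinates are not features by default
```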
coordinates()
Returns coordinates of observations.
TaskRegrST$coordinates(row_ids = NULL)
row_ids (integer())
Vector of row indices as a subset of task$row_ids.
print()
Print the task.
TaskRegrST$print(...)
...
Arguments passed to the $print() method of the superclass.
clone()
The objects of this class are cloneable with this method.
TaskRegrST$clone(deep = FALSE)
deep
Whether to make a deep clone.