Other R packages may be useful for generating basis expansions for certain kinds of data. This page lists several options.
The embed
package provides several methods for encoding categorical predictors
based on their relationship with an outcome variable. Users should note
that this creates a feedback effect, where the outcome variable is used
to define the predictors, which may cause problems in certain
statistical workflows.
The embed
package can also discretize continuous variables based on their
relationship with an outcome variable, using CART, and extract PCA
components from a set of numerical predictors. Users should note that
ridge() regression automatically shrinks lower-variance PCA
components more than higher-variance components. Normalizing the PCA
components gives them all the same variance and removes this
differential shrinkage, so providing normalized components to a
predictive model may lead to unintuitive results in some cases.
Selecting a sparse subset of principal components may be useful, however.
The conText
package estimates context-specific word and document
embeddings.
The text2vec
package provides a number of tools to convert text to numeric vectors,
including fitting custom GloVe models and topic modeling, and is
designed to handle large-scale data.
The text2map
package and its accompanying text2map.pretrained
package (not on CRAN) provide access to a number of pre-trained word
embeddings.
The torchvision
package provides functions for various transformations of image data. It
also provides access to pre-trained models from which image embeddings
may be extracted.
The torchaudio
package provides functions for various transformations of audio data,
including a number of spectral transformations.
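
The note above about ridge penalties and PCA components can be checked with a small base R sketch. It does not call ridge() or embed. The simulated data, the penalty value lambda, and all variable names are assumptions made purely for illustration. It relies on the standard result that a ridge penalty shrinks the coefficient on a principal component with singular value d by the factor d^2 / (d^2 + lambda).

```r
# Illustrative data only: three predictors with very different variances
set.seed(1)
n <- 200
X <- cbind(rnorm(n, sd = 5), rnorm(n, sd = 1), rnorm(n, sd = 0.2))
X <- scale(X, center = TRUE, scale = FALSE)
y <- rnorm(n)

sv <- svd(X)
d <- sv$d        # singular values; d^2 / (n - 1) is each component's variance
lambda <- 10     # arbitrary penalty chosen for illustration

# Theoretical ridge shrinkage factor for each raw principal component
shrink_raw <- d^2 / (d^2 + lambda)

# Check against an explicit ridge fit on the raw PC scores (no intercept)
S <- X %*% sv$v
b_ols <- solve(crossprod(S), crossprod(S, y))
b_ridge <- solve(crossprod(S) + lambda * diag(ncol(S)), crossprod(S, y))
shrink_fit <- as.numeric(b_ridge / b_ols)

# Normalized scores all share the singular value sqrt(n - 1),
# so the same penalty shrinks every component by the same factor
d_norm <- rep(sqrt(n - 1), length(d))
shrink_norm <- d_norm^2 / (d_norm^2 + lambda)

round(rbind(raw = shrink_raw, fitted = shrink_fit, normalized = shrink_norm), 3)
```

In the printed table, the raw and fitted rows agree and decrease from the first component to the last, while the normalized row is constant. In other words, normalizing the scores discards the variance ordering that the ridge penalty would otherwise use.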