Introduction to Preprocessing
Sequentia provides a number of useful preprocessing methods for sequential data:
Custom Transformations (Custom)
Constant Trimming (TrimConstants)
Minmax Scaling (MinMaxScale)
Centering (Center)
Standardizing (Standardize)
Downsampling (Downsample)
Filtering (Filter)
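To give a sense of what some of these transformations compute, here is a minimal NumPy sketch of centering, standardizing and min-max scaling applied to a single observation sequence of shape (T, D). It is an illustration only, not Sequentia's implementation: the function names are hypothetical, and it assumes that statistics are computed per feature (column) over the time axis.

```python
import numpy as np

def center(x):
    # Subtract each feature's mean over time (assumed per-column behaviour).
    return x - x.mean(axis=0)

def standardize(x):
    # Zero mean and unit variance per feature; assumes no constant columns.
    return (x - x.mean(axis=0)) / x.std(axis=0)

def min_max_scale(x, low=0.0, high=1.0):
    # Rescale each feature linearly into [low, high]; assumes no constant columns.
    x_min, x_max = x.min(axis=0), x.max(axis=0)
    return low + (x - x_min) * (high - low) / (x_max - x_min)

# One observation sequence with 4 time steps and 2 features.
x = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
centered = center(x)
standardized = standardize(x)
scaled = min_max_scale(x)
```

The library classes listed above may differ in details such as how constant features or scaling ranges are handled.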
Additionally, the provided Compose class makes it possible to apply multiple transformations.
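The idea of chaining transformations can be sketched in plain NumPy as follows. This is not the Compose class itself, whose constructor is not documented in this section: `compose`, `center` and `downsample` are hypothetical stand-ins, and the sketch assumes that steps are applied in the order given.

```python
import numpy as np

def compose(*steps):
    # Return a function that applies each step in order, left to right.
    def apply(x):
        for step in steps:
            x = step(x)
        return x
    return apply

# Two hypothetical single-sequence transformations.
def center(x):
    return x - x.mean(axis=0)  # remove the per-feature mean

def downsample(x, n=2):
    return x[::n]  # keep every n-th frame (simple decimation)

pipeline = compose(center, downsample)
x = np.arange(12, dtype=float).reshape(6, 2)  # 6 time steps, 2 features
y = pipeline(x)  # centered first, then reduced to 3 time steps
```

Order matters in such a chain: centering before downsampling uses the mean of all frames, whereas the reverse order would use only the frames that remain.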
Note
The existing preprocessing methods in sequentia.preprocessing are currently only applicable to lists of numpy.ndarray objects, and therefore cannot be applied as transformations for torch.Tensor objects.
Unfortunately, this means that the preprocessing methods can only be used to preprocess data for sequentia.classifiers.knn.KNNClassifier and sequentia.classifiers.hmm.HMMClassifier, and not sequentia.classifiers.rnn.DeepGRU.
It is possible to attempt to use these transformations on torch.Tensor objects by bypassing validation when applying the transformation,
import torch
from sequentia.preprocessing import Center
x = torch.rand(5, 3)
x = Center()(x, validate=False)  # skips input validation
but this will likely not work due to differences between numpy.ndarray and torch.Tensor.
Each of the transformations follows a similar interface, based on the abstract Transform class:

class sequentia.preprocessing.Transform
Base class representing a single transformation.

__call__(X, validate=True)
Applies the transformation to the observation sequence(s).
 Parameters
 X: numpy.ndarray (float) or list of numpy.ndarray (float)
An individual observation sequence or a list of multiple observation sequences.
 validate: bool
Whether or not to validate the input sequences.
 Returns
 transformed: numpy.ndarray (float) or list of numpy.ndarray (float)
The transformed input observation sequence(s).
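The single-sequence versus list behaviour described for __call__ can be sketched with a small helper. This is an assumption-laden illustration, not the library's code: `apply_transform` and `halve` are hypothetical names, and the validation shown (checking for floating-point numpy.ndarray inputs) is only a guess at what validate=True might involve.

```python
import numpy as np

def apply_transform(transform, X, validate=True):
    # Hypothetical helper mirroring the documented __call__ behaviour.
    single = isinstance(X, np.ndarray)
    sequences = [X] if single else list(X)
    if validate:
        # Assumed validation: every sequence is a float numpy.ndarray.
        for x in sequences:
            if not isinstance(x, np.ndarray) or not np.issubdtype(x.dtype, np.floating):
                raise TypeError("each sequence must be a float numpy.ndarray")
    out = [transform(x) for x in sequences]
    # Return the same kind of container that was passed in.
    return out[0] if single else out

def halve(x):
    return x / 2

one = apply_transform(halve, np.ones((3, 2)))                        # a single array
many = apply_transform(halve, [np.ones((3, 2)), np.ones((5, 2))])    # a list of arrays
```

Passing validate=False in this sketch simply skips the type check, which is why non-NumPy inputs may get further before failing.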

transform(x)
Applies the transformation to a single observation sequence.
 Parameters
 x: numpy.ndarray (float)
An individual observation sequence.
 Returns
 transformed: numpy.ndarray (float)
The transformed input observation sequence.

fit(X, validate=True)
Fit the transformation on the provided observation sequence(s) (without transforming them).
 Parameters
 X: numpy.ndarray (float) or list of numpy.ndarray (float)
An individual observation sequence or a list of multiple observation sequences.
 validate: bool
Whether or not to validate the input sequences.

fit_transform(X, validate=True)
Fit the transformation with the provided observation sequence(s) and transform them.
 Parameters
 X: numpy.ndarray (float) or list of numpy.ndarray (float)
An individual observation sequence or a list of multiple observation sequences.
 validate: bool
Whether or not to validate the input sequences.
 Returns
 transformed: numpy.ndarray (float) or list of numpy.ndarray (float)
The transformed input observation sequence(s).
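To illustrate how fit, transform and fit_transform relate, here is a minimal sketch of a stateful transformation that uses the same method names. `MinMaxSketch` is a hypothetical, self-contained class that does not subclass Transform; it assumes that fitting learns per-feature statistics across all provided sequences and that later calls reuse them. The validate argument is accepted only to mirror the documented signatures.

```python
import numpy as np

class MinMaxSketch:
    """Hypothetical transformation following the documented method names."""

    def fit(self, X, validate=True):
        # Learn per-feature minima and maxima across all sequences.
        # `validate` is unused here; it only mirrors the interface.
        X = [X] if isinstance(X, np.ndarray) else X
        stacked = np.vstack(X)
        self.min_, self.max_ = stacked.min(axis=0), stacked.max(axis=0)

    def transform(self, x):
        # Scale a single sequence using the fitted statistics.
        return (x - self.min_) / (self.max_ - self.min_)

    def __call__(self, X, validate=True):
        # Accept either one sequence or a list of sequences.
        if isinstance(X, np.ndarray):
            return self.transform(X)
        return [self.transform(x) for x in X]

    def fit_transform(self, X, validate=True):
        self.fit(X, validate)
        return self(X, validate)

train = [np.array([[0.0, 10.0], [2.0, 30.0]]), np.array([[4.0, 20.0]])]
scaler = MinMaxSketch()
scaled_train = scaler.fit_transform(train)      # fit on train, then scale it
scaled_new = scaler(np.array([[1.0, 15.0]]))    # reuse the fitted statistics
```

Separating fit from transform in this way lets statistics learned on training sequences be applied unchanged to unseen sequences.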
