unary_operations

Unary (one reference argument) transformations.

This module contains the base class and concrete classes for unary transformations, those that take a single reference argument (i.e. a column or transformation name).

Classes

InclusionTransformation

class InclusionTransformation(*, name: str = None, output: bool = False, arg: str, in_str: str):

Represents the test for substring inclusion in a column's entries.

Check whether in_str (the test string) is in the elements of arg (the column).
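The check described above can be illustrated with a minimal plain-Python sketch. This is not the library's implementation: the helper name contains_substring is hypothetical, and converting each entry with str() is an assumption about how non-string entries are handled.

```python
from typing import Any, List


def contains_substring(column: List[Any], in_str: str) -> List[bool]:
    """Return, for each entry, whether in_str occurs in its string form."""
    # Converting with str() is an assumption; the library may treat
    # non-string entries differently.
    return [in_str in str(entry) for entry in column]


flags = contains_substring(["apple pie", "banana", "pineapple"], "apple")
print(flags)  # [True, False, True]
```

The result has one boolean per entry of the column, which matches the idea of testing every element of arg against in_str.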

Arguments

  • arg: The argument to the transformation as a string.
  • in_str: The string to test for inclusion.
  • name: The name of the transformation. If not provided a unique name will be generated from the class name.
  • output: Whether or not this transformation should be included in the final output. Defaults to False.

Raises

  • TransformationRegistryError: If the transformation name is already in use.
  • TransformationRegistryError: If the transformation name hasn't been provided and the transformation is not registered.

Method generated by attrs for class InclusionTransformation.

Variables

  • static in_str : str

Static methods


schema

def schema() -> marshmallow.schema.Schema:

Inherited from:

StringUnaryOperation.schema

Gets an instance of the Schema associated with this Transformation.

Raises

  • TypeError: If the transformation doesn't have a TransformationSchema as the schema.

OneHotEncodingTransformation

class OneHotEncodingTransformation(*, name: str = None, output: bool = False, arg: str, unknown_suffix: str = 'UNKNOWN', raw_values: Union[List[Any], Dict[Any, Optional[str]]]):

One hot encoding transformation.

Represents the transformation of a column into a series of one-hot encoded columns.

Arguments

  • arg: Column or transformation reference to one-hot encode.

  • name: The name of the transformation. If not provided a unique name will be generated from the class name.

  • output: Whether or not this transformation should be included in the final output. Defaults to False.

  • unknown_suffix: The suffix to use to create a column for encoding unknown values. The column will be created as {name}_{unknown_suffix}. Default is "UNKNOWN".

  • values (named raw_values in the constructor signature): Column values that should be one-hot encoded. This can be either a list of values, in which case the one-hot encoding produces columns named {name}_{value}, or a dictionary mapping values to desired column suffixes, in which case the encoding uses those suffixes. If an entry in the dictionary maps to None, the column name is generated as {name}_{value}.

    If name is not set, the column or transformation reference from arg will be used instead.

    Any value found in the column which is not enumerated in this argument will be encoded in a {name}_{unknown_suffix} column. This column is therefore protected, and any value or value-column mapping that could clash with it will raise ValueError. If you need to encode such a value, unknown_suffix must be changed.

Raises

  • TransformationRegistryError: If the transformation name is already in use.
  • TransformationRegistryError: If the transformation name hasn't been provided and the transformation is not registered.
  • ValueError: If any name in values would cause a clash with the unknown value column created by unknown_suffix or with another generated column.
  • ValueError: If no values were provided.
  • ValueError: If no name is provided and the reference in arg cannot be found.
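The naming and clash rules above can be sketched as follows. This is a minimal illustration written from the description alone, not the library's code: the function one_hot_column_names and its prefix parameter are hypothetical, and the error messages are invented.

```python
from typing import Any, Dict, List, Optional, Union


def one_hot_column_names(
    prefix: str,
    values: Union[List[Any], Dict[Any, Optional[str]]],
    unknown_suffix: str = "UNKNOWN",
) -> Dict[Any, str]:
    """Map each value to its output column name, checking for clashes."""
    if not values:
        raise ValueError("No values were provided.")
    # Normalise a list into a dict whose suffixes are all unset.
    if not isinstance(values, dict):
        values = {value: None for value in values}
    # The unknown-value column is protected.
    unknown_col = f"{prefix}_{unknown_suffix}"
    mapping: Dict[Any, str] = {}
    for value, suffix in values.items():
        # A None suffix falls back to the value itself.
        column = f"{prefix}_{value if suffix is None else suffix}"
        if column == unknown_col or column in mapping.values():
            raise ValueError(f"Column name clash: {column}")
        mapping[value] = column
    return mapping


names = one_hot_column_names("colour", {"red": None, "blue": "B"})
print(names)  # {'red': 'colour_red', 'blue': 'colour_B'}
```

In this sketch, passing ["UNKNOWN"] as the values with the default suffix raises ValueError, which is the situation where unknown_suffix would need to be changed.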

Method generated by attrs for class OneHotEncodingTransformation.

Variables

  • static unknown_suffix : str
  • static values : Dict[Any, str]
  • columns : List[str] - Lists the columns that will be output.
  • prefix - The prefix for generated column names: name if it is set, otherwise extracted from arg (which should be a column or transformation reference).
  • unknown_col : str - Returns the name of the column that unknown values are encoded to.
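To show how the output columns and the unknown column relate, here is a small self-contained sketch of the encoding step. It assumes a value-to-column mapping of the kind stored in values; the function one_hot_encode is hypothetical, and the column order (unknown column last) is an assumption rather than documented behaviour.

```python
from typing import Any, Dict, List


def one_hot_encode(
    entries: List[Any], value_to_column: Dict[Any, str], unknown_col: str
) -> List[Dict[str, int]]:
    """Encode each entry as a row with exactly one column set to 1."""
    # Placing the unknown column last is an assumption of this sketch.
    columns = list(value_to_column.values()) + [unknown_col]
    rows = []
    for entry in entries:
        # Values not enumerated in the mapping go to the unknown column.
        hot = value_to_column.get(entry, unknown_col)
        rows.append({column: int(column == hot) for column in columns})
    return rows


rows = one_hot_encode(
    ["red", "green"],
    {"red": "colour_red", "blue": "colour_blue"},
    "colour_UNKNOWN",
)
print(rows[1])  # {'colour_red': 0, 'colour_blue': 0, 'colour_UNKNOWN': 1}
```

Every row contains all output columns, and a value that was not enumerated ("green" here) sets only the unknown column.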

Static methods


schema

def schema() -> marshmallow.schema.Schema:

Inherited from:

StringUnaryOperation.schema

Gets an instance of the Schema associated with this Transformation.

Raises

  • TypeError: If the transformation doesn't have a TransformationSchema as the schema.

StringUnaryOperation

class StringUnaryOperation(*, name: str = None, output: bool = False, arg: str):

This class represents any UnaryOperation where arg can only be a string.

Arguments

  • arg: The argument to the transformation as a string.
  • name: The name of the transformation. If not provided a unique name will be generated from the class name.
  • output: Whether or not this transformation should be included in the final output. Defaults to False.

Raises

  • TransformationRegistryError: If the transformation name is already in use.
  • TransformationRegistryError: If the transformation name hasn't been provided and the transformation is not registered.

Method generated by attrs for class StringUnaryOperation.

Variables

  • static arg : str

Static methods


schema

def schema() -> marshmallow.schema.Schema:

Inherited from:

UnaryOperation.schema

Gets an instance of the Schema associated with this Transformation.

Raises

  • TypeError: If the transformation doesn't have a TransformationSchema as the schema.

UnaryOperation

class UnaryOperation(*, name: str = None, output: bool = False, arg: Any):

The abstract base class for all unary operation transformations.

Arguments

  • arg: The argument to the transformation.
  • name: The name of the transformation. If not provided a unique name will be generated from the class name.
  • output: Whether or not this transformation should be included in the final output. Defaults to False.

Raises

  • TransformationRegistryError: If the transformation name is already in use.
  • TransformationRegistryError: If the transformation name hasn't been provided and the transformation is not registered.
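The naming behaviour shared by these classes (a generated unique name when none is given, and an error when a name is already in use) can be pictured with the sketch below. Everything in it is hypothetical: the registry set, the resolve_name helper, the NameInUseError class standing in for TransformationRegistryError, and the generated-name format are assumptions for illustration, not the library's internals.

```python
import itertools
from typing import Optional, Set

_used_names: Set[str] = set()  # hypothetical registry of names in use
_counter = itertools.count(1)  # source of unique numeric suffixes


class NameInUseError(Exception):
    """Hypothetical stand-in for TransformationRegistryError."""


def resolve_name(cls_name: str, name: Optional[str] = None) -> str:
    """Return a registered name, generating one from cls_name if needed."""
    if name is None:
        # The "<classname>_<n>" format is an assumption of this sketch.
        name = f"{cls_name.lower()}_{next(_counter)}"
        while name in _used_names:
            name = f"{cls_name.lower()}_{next(_counter)}"
    elif name in _used_names:
        raise NameInUseError(f"Name already in use: {name}")
    _used_names.add(name)
    return name


first = resolve_name("InclusionTransformation")
second = resolve_name("InclusionTransformation")
print(first, second)  # inclusiontransformation_1 inclusiontransformation_2
```

Under these assumptions, calling resolve_name twice with the same explicit name raises NameInUseError on the second call, mirroring the first error case listed above.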

Method generated by attrs for class UnaryOperation.

Variables

  • static arg : Any

Static methods


schema

def schema() -> marshmallow.schema.Schema:

Inherited from:

Transformation.schema

Gets an instance of the Schema associated with this Transformation.

Raises

  • TypeError: If the transformation doesn't have a TransformationSchema as the schema.