filtering_algorithm
Algorithm for filtering data records based on configurable strategies.
Classes
AgeRangeFilterArgs
class AgeRangeFilterArgs(*args, **kwargs):
Arguments for AGE_RANGE filter strategy.
This filtering strategy keeps only records whose value in a given column falls within a specified age range.
Ancestors
- builtins.dict
FilterStrategy
class FilterStrategy( value, names=None, *, module=None, qualname=None, type=None, start=1,):
Enumeration of available filtering strategies.
Inherits from str to allow for easy conversion to string and comparison with other strings. The ordering of the inheritance is important (first str, then Enum). This replicates the StrEnum behaviour in Python 3.11+. TODO: [Python 3.11] Convert to StrEnum when Python 3.11 is the minimum version.
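The str-then-Enum inheritance described above can be illustrated with a small stand-alone sketch. The class name `FilterStrategySketch` and the member values are assumptions made for illustration; only the strategy names LATEST, FREQUENCY and AGE_RANGE appear in this reference.

```python
from enum import Enum


class FilterStrategySketch(str, Enum):
    """Hypothetical stand-in for FilterStrategy; member values are assumed."""

    LATEST = "latest"
    FREQUENCY = "frequency"
    AGE_RANGE = "age_range"


# Members compare equal to plain strings because str comes first in the bases.
is_equal = FilterStrategySketch.LATEST == "latest"

# A plain string can be converted back to a member by value lookup.
member = FilterStrategySketch("frequency")

# The underlying string is available through .value.
raw = FilterStrategySketch.AGE_RANGE.value
```

This is presumably why `strategies` can be typed as `Sequence[Union[FilterStrategy, str]]`: either form can be normalised to the same enum member.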
FilterStrategyClass
class FilterStrategyClass( value, names=None, *, module=None, qualname=None, type=None, start=1,):
Enumeration map of filter strategies to TypedDict and classes.
FrequencyFilterArgs
class FrequencyFilterArgs(*args, **kwargs):
Arguments for FREQUENCY filter strategy.
This filtering strategy keeps only records whose ID occurs with a frequency in the specified range.
Ancestors
- builtins.dict
Variables
- static
id_column : Union[str, list[str]]
- static
max_frequency : int
- static
min_frequency : int
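As a rough illustration of how these three keys might drive a frequency filter, here is a minimal pure-Python sketch. It is not the library's implementation: `FrequencyArgsSketch` and `filter_by_frequency` are hypothetical names, and the inclusive bounds, the optional keys (`total=False`) and the default minimum of 1 are all assumptions.

```python
from collections import Counter
from typing import TypedDict, Union


class FrequencyArgsSketch(TypedDict, total=False):
    """Hypothetical mirror of the documented FrequencyFilterArgs keys."""

    id_column: Union[str, list[str]]
    min_frequency: int
    max_frequency: int


def filter_by_frequency(records: list[dict], args: FrequencyArgsSketch) -> list[dict]:
    """Keep records whose ID occurs between min and max times (assumed inclusive)."""
    cols = args["id_column"]
    cols = [cols] if isinstance(cols, str) else list(cols)

    def key(record: dict) -> tuple:
        # A composite ID is the tuple of all ID column values.
        return tuple(record[c] for c in cols)

    counts = Counter(key(r) for r in records)
    low = args.get("min_frequency", 1)  # assumed default
    high = args.get("max_frequency")  # None means no upper bound (assumed)
    return [
        r
        for r in records
        if counts[key(r)] >= low and (high is None or counts[key(r)] <= high)
    ]


records = [
    {"patient_id": "a"},
    {"patient_id": "a"},
    {"patient_id": "b"},
    {"patient_id": "c"},
    {"patient_id": "c"},
    {"patient_id": "c"},
]
exactly_two = filter_by_frequency(
    records, {"id_column": "patient_id", "min_frequency": 2, "max_frequency": 2}
)
at_least_two = filter_by_frequency(
    records, {"id_column": "patient_id", "min_frequency": 2}
)
```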
LatestFilterArgs
class LatestFilterArgs(*args, **kwargs):
Arguments for LATEST filter strategy.
This filtering strategy keeps only the latest records per ID.
Ancestors
- builtins.dict
Variables
- static
date_column : str
- static
id_column : Union[str, list[str]]
- static
num_latest : int
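The "latest records per ID" behaviour can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: `keep_latest` is a hypothetical helper, the default of one record per ID is assumed, and the date column is assumed to hold values that sort chronologically.

```python
from collections import defaultdict
from datetime import date
from typing import Union


def keep_latest(
    records: list[dict],
    id_column: Union[str, list[str]],
    date_column: str,
    num_latest: int = 1,  # assumed default
) -> list[dict]:
    """Keep the num_latest most recent records for each ID."""
    cols = [id_column] if isinstance(id_column, str) else list(id_column)

    # Group records by their (possibly composite) ID.
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for record in records:
        groups[tuple(record[c] for c in cols)].append(record)

    kept: list[dict] = []
    for rows in groups.values():
        # Newest first, then take the first num_latest rows of each group.
        newest_first = sorted(rows, key=lambda r: r[date_column], reverse=True)
        kept.extend(newest_first[:num_latest])
    return kept


scans = [
    {"pid": "p1", "scan_date": date(2021, 1, 1)},
    {"pid": "p1", "scan_date": date(2023, 3, 1)},
    {"pid": "p1", "scan_date": date(2022, 6, 1)},
    {"pid": "p2", "scan_date": date(2020, 5, 5)},
]
latest_two = keep_latest(scans, id_column="pid", date_column="scan_date", num_latest=2)
```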
RecordFilterAlgorithm
class RecordFilterAlgorithm( datastructure: DataStructure, strategies: Sequence[Union[FilterStrategy, str]], filter_args_list: list[FilterArgs],):
Algorithm factory for filtering records based on various strategies.
Arguments
**kwargs
: Additional keyword arguments.
datastructure
: The data structure to use for the algorithm.
filter_args_list
: List of strategy-specific arguments.
strategies
: List of filtering strategies.
Attributes
class_name
: The name of the algorithm class.
datastructure
: The data structure to use for the algorithm.
fields_dict
: A dictionary mapping all attributes that will be serialized in the class to their marshmallow field type (e.g. fields_dict = {"class_name": fields.Str()}).
filter_args_list
: List of strategy-specific arguments.
nested_fields
: A dictionary mapping all nested attributes to a registry that contains class names mapped to the respective classes (e.g. nested_fields = {"datastructure": datastructure.registry}).
strategies
: List of filtering strategies.
Raises
ValueError
: If required parameters for a strategy are missing
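The check behind this ValueError might look roughly like the sketch below. Everything here is an assumption made for illustration: the helper name `validate_filter_args`, the `REQUIRED_KEYS` table, the choice of which keys count as required, and the idea that `strategies[i]` is paired with `filter_args_list[i]`. The reference above only states that a ValueError is raised when required parameters are missing.

```python
from typing import Any, Mapping, Sequence

# Hypothetical table of required keys per strategy value (assumed, not documented).
REQUIRED_KEYS: dict[str, set[str]] = {
    "latest": {"id_column", "date_column"},
    "frequency": {"id_column"},
}


def validate_filter_args(
    strategies: Sequence[Any], filter_args_list: Sequence[Mapping[str, Any]]
) -> None:
    """Raise ValueError if any strategy lacks one of its required arguments."""
    if len(strategies) != len(filter_args_list):
        raise ValueError("Expected one argument dictionary per strategy.")
    for strategy, args in zip(strategies, filter_args_list):
        # Accept either an enum member (use its value) or a plain string.
        name = str(getattr(strategy, "value", strategy))
        missing = REQUIRED_KEYS.get(name, set()) - set(args)
        if missing:
            raise ValueError(
                f"Strategy {name!r} is missing required arguments: {sorted(missing)}"
            )


# A complete argument set passes silently.
validate_filter_args(["latest"], [{"id_column": "pid", "date_column": "scan_date"}])

# A missing date_column is reported.
try:
    validate_filter_args(["latest"], [{"id_column": "pid"}])
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
```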
Ancestors
- BaseNonModelAlgorithmFactory
- BaseAlgorithmFactory
- abc.ABC
- bitfount.federated.roles._RolesMixIn
- bitfount.types._BaseSerializableObjectMixIn
Variables
- static
fields_dict : ClassVar[T_FIELDS_DICT]
Methods
create
def create(self, role: Union[str, Role], **kwargs: Any) -> Any:
Create an instance representing the role specified.
modeller
def modeller( self, **kwargs: Any,) -> NoResultsModellerAlgorithm:
Modeller-side of the algorithm.
worker
def worker( self, **kwargs: Any,) -> bitfount.federated.algorithms.filtering_algorithm._WorkerSide:
Worker-side of the algorithm.