filters

Task-level filter types for runtime data filtering.

This module provides the data structures for specifying filters at task runtime that work alongside existing datasource-level filters. Task filters are defined in DataStructure, serialized to workers, and merged with datasource config using intersection logic (most restrictive bounds win).

Example:

>>> from bitfount.data.filters import TaskFilter
>>> filter = TaskFilter(
...     filter_type="min-frames",
...     value=49,
... )

Module

Functions

meets_range_criteria

def meets_range_criteria(value: RangeBoundInput, min_value: Optional[RangeBoundInput] = None, max_value: Optional[RangeBoundInput] = None) -> bool:

Check if a value falls within an optional min/max range.

This helper is used by filter methods to determine if a data value meets the configured filter criteria. Both bounds are inclusive.

Arguments

  • value: The value to check. Must not be NaN.
  • min_value: Optional minimum bound (inclusive). If None, no lower bound.
  • max_value: Optional maximum bound (inclusive). If None, no upper bound.

Returns True if value is within the range, False otherwise.

Raises

  • TypeError: If value and bounds have incompatible types (e.g., int vs date).
  • ValueError: If value is NaN.

Note: datetime values (whether passed as value or as a bound) are automatically converted to date, discarding the time component, so that comparisons are consistent.
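The behaviour described above (inclusive bounds, NaN rejection, datetime-to-date conversion) can be sketched roughly as follows. This is an illustrative stand-in, not the library's implementation: `in_range_sketch`, `_normalize`, and the `Bound` alias are hypothetical names, and `Bound` only approximates `RangeBoundInput`.

```python
import datetime
import math
from typing import Optional, Union

# Hypothetical alias standing in for the module's RangeBoundInput.
Bound = Union[int, float, datetime.date, datetime.datetime]


def _normalize(v: Bound) -> Union[int, float, datetime.date]:
    # datetime is a subclass of date, so test for it first and drop the time part.
    if isinstance(v, datetime.datetime):
        return v.date()
    return v


def in_range_sketch(
    value: Bound,
    min_value: Optional[Bound] = None,
    max_value: Optional[Bound] = None,
) -> bool:
    """Inclusive range check mirroring the documented behaviour."""
    if isinstance(value, float) and math.isnan(value):
        raise ValueError("value must not be NaN")
    value = _normalize(value)
    # Comparing incompatible types (e.g. int vs date) raises TypeError in Python.
    if min_value is not None and value < _normalize(min_value):
        return False
    if max_value is not None and value > _normalize(max_value):
        return False
    return True
```

In this sketch a value equal to either bound passes, and a datetime late on a given day still satisfies a max bound of that same date because the time component is dropped before comparing.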

resolve_bool_filter

def resolve_bool_filter(ds_value: bool, task_value: Optional[TaskFilterValue]) -> bool:

Resolve a bool filter. The task can enable the filter even if the datasource has it disabled.

Arguments

  • ds_value: Datasource filter value (defaults to False if not set)
  • task_value: Task filter value (should be bool if not None)

Returns True if either datasource or task enables the filter, False otherwise
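The "either side can enable" rule amounts to a logical OR. A minimal sketch, using the hypothetical name `resolve_bool_sketch` and assuming a missing task value counts as "not enabled":

```python
from typing import Optional


def resolve_bool_sketch(ds_value: bool, task_value: Optional[bool]) -> bool:
    # Assumption: a task value of None leaves the datasource setting untouched.
    return ds_value or bool(task_value)
```

Under this rule a task can switch the filter on but cannot switch it off once the datasource has enabled it, which is consistent with task filters only ever restricting the data.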

resolve_date_filter

def resolve_date_filter(ds_value: Union[Date, DateTD, None], task_value: Optional[TaskFilterValue], is_min: bool) -> Optional[datetime.date]:

Resolve a date filter value using intersection logic.

Converts Date/DateTD objects to datetime.date objects.

Arguments

  • ds_value: Datasource filter value (Date or DateTD)
  • task_value: Task filter value (should be Date or DateTD if not None)
  • is_min: True for minimum filters (take max), False for maximum (take min)

Returns the resolved date value, or None.

resolve_float_filter

def resolve_float_filter(ds_value: Optional[float], task_value: Optional[TaskFilterValue], is_min: bool) -> Optional[float]:

Resolve a float filter value using intersection logic.

Arguments

  • ds_value: Datasource filter value
  • task_value: Task filter value (should be float/int if not None)
  • is_min: True for minimum filters (take max), False for maximum (take min)

Returns the resolved float value, or None.

resolve_int_filter

def resolve_int_filter(ds_value: Optional[int], task_value: Optional[TaskFilterValue], is_min: bool) -> Optional[int]:

Resolve an integer filter value using intersection logic.

Arguments

  • ds_value: Datasource filter value
  • task_value: Task filter value (should be int if not None)
  • is_min: True for minimum filters (take max), False for maximum (take min)

Returns the resolved integer value, or None.
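The `is_min` rule shared by the numeric resolvers (take the larger of two minimum bounds, the smaller of two maximum bounds) can be sketched as below. `resolve_bound_sketch` is a hypothetical name, and the handling of a missing value (fall back to whichever side is set) is an assumption rather than documented behaviour.

```python
from typing import Optional, Union

Number = Union[int, float]


def resolve_bound_sketch(
    ds_value: Optional[Number],
    task_value: Optional[Number],
    is_min: bool,
) -> Optional[Number]:
    # Assumption: if only one side sets a bound, that bound is used as-is.
    if ds_value is None:
        return task_value
    if task_value is None:
        return ds_value
    # Intersection of ranges: the most restrictive bound wins.
    return max(ds_value, task_value) if is_min else min(ds_value, task_value)
```

For example, a datasource minimum of 10 frames combined with a task minimum of 49 resolves to 49, while maximum bounds of 500 and 200 resolve to 200.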

resolve_list_filter

def resolve_list_filter(ds_value: Optional[list[str]], task_value: Optional[TaskFilterValue]) -> Optional[list[str]]:

Resolve a list filter value by taking the union of both lists.

Arguments

  • ds_value: Datasource filter value
  • task_value: Task filter value (should be list[str] if not None)

Returns the union of the two lists, or None.
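A union of two optional string lists might look like the sketch below. The name `resolve_list_sketch` is hypothetical, and both the order-preserving de-duplication and the "None only when both inputs are None" behaviour are assumptions.

```python
from typing import Optional


def resolve_list_sketch(
    ds_value: Optional[list[str]],
    task_value: Optional[list[str]],
) -> Optional[list[str]]:
    # Assumption: None is returned only when neither side supplies a list.
    if ds_value is None and task_value is None:
        return None
    # dict.fromkeys removes duplicates while keeping first-seen order.
    return list(dict.fromkeys((ds_value or []) + (task_value or [])))
```

For a list of required field names, the union is the more restrictive result: a file must then carry every field named by either the datasource or the task.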

resolve_modality_filter

def resolve_modality_filter(ds_value: "Optional[Literal['OCT', 'SLO']]", task_value: Optional[TaskFilterValue]) -> Optional[Literal['OCT', 'SLO']]:

Resolve a modality filter value - must match if both present.

Arguments

  • ds_value: Datasource filter value
  • task_value: Task filter value (should be str if not None)

Returns the resolved modality value, or None.

Raises

  • ValueError: If both values present but don't match
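The "must match if both present" rule can be illustrated with the following sketch. `resolve_modality_sketch` is a hypothetical name, and returning whichever side is set when only one is present is an assumption.

```python
from typing import Literal, Optional

Modality = Literal["OCT", "SLO"]


def resolve_modality_sketch(
    ds_value: Optional[Modality],
    task_value: Optional[Modality],
) -> Optional[Modality]:
    if ds_value is not None and task_value is not None and ds_value != task_value:
        # Two different modalities cannot both be satisfied by one file.
        raise ValueError(
            f"Modality mismatch: datasource={ds_value!r}, task={task_value!r}"
        )
    # Assumption: fall back to whichever side is set (or None if neither is).
    return ds_value if ds_value is not None else task_value
```

Raising on a mismatch, rather than silently picking one side, reflects that the intersection of "OCT only" and "SLO only" is empty.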

resolve_str_filter

def resolve_str_filter(filter_name: str, ds_value: Optional[str], task_value: Optional[TaskFilterValue]) -> Optional[str]:

Resolve a string filter value - must match if both present.

should_persist_skip_to_cache

def should_persist_skip_to_cache(*, value: Optional[RangeBoundInput] = None, datasource_min: Optional[RangeBoundInput] = None, datasource_max: Optional[RangeBoundInput] = None, missing_fields: Optional[set[str]] = None, datasource_required_fields: Optional[set[str]] = None, file_modality: Optional[OphthalmologyModalityType] = None, datasource_modality: Optional[OphthalmologyModalityType] = None, file_series_description: Optional[str] = None, datasource_series_description: Optional[str] = None) -> bool:

Determine if a skipped file should be persisted to the skip cache.

Checks whether a file that failed the merged filter would also fail the datasource-only filter. If it would, the skip is cached. If it would not, the file was excluded only by the task-level filter, so the skip is not cached.

Assumes the file has ALREADY failed the merged filter check.

Arguments

  • value: File's value for range filters (e.g., num_frames, file_size).
  • datasource_min: Minimum bound from datasource config.
  • datasource_max: Maximum bound from datasource config.
  • missing_fields: Field names the file is missing.
  • datasource_required_fields: Fields required by datasource config.
  • file_modality: The file's actual modality (e.g., "OCT", "SLO").
  • datasource_modality: Modality required by datasource config.
  • file_series_description: The file's actual series description.
  • datasource_series_description: Series description required by datasource config.

Raises

  • ValueError: If multiple filter categories provided or none provided.
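For the range-filter category only, the decision can be sketched as below. `should_persist_range_skip_sketch` is a hypothetical, simplified stand-in: it ignores the required-field, modality, and series-description categories and does not reproduce the library's validation of which arguments were supplied.

```python
from typing import Optional, Union

Number = Union[int, float]


def should_persist_range_skip_sketch(
    value: Number,
    datasource_min: Optional[Number] = None,
    datasource_max: Optional[Number] = None,
) -> bool:
    """Return True if the file also fails the datasource-only bounds.

    Assumes the caller has already established that the file failed the
    merged (task + datasource) filter.
    """
    below_min = datasource_min is not None and value < datasource_min
    above_max = datasource_max is not None and value > datasource_max
    return below_min or above_max
```

As an illustration, take a datasource minimum of 10 frames and a task minimum of 49. A 30-frame file fails the merged filter but passes the datasource bound, so the sketch returns False and the skip is not cached. A 5-frame file fails both, so the skip is persisted.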

to_date

def to_date(value: Union[Date, DateTD, dict[str, int], None]) -> Optional[datetime.date]:

Convert a Date or DateTD object to a datetime.date object.
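A rough sketch of the dict branch of such a conversion is shown below. `to_date_sketch` is a hypothetical name, and it does not handle the library's Date or DateTD types. Defaulting a missing month or day to 1 is an assumption made for illustration, not documented behaviour.

```python
import datetime
from typing import Optional, Union


def to_date_sketch(
    value: Union[datetime.date, dict[str, int], None],
) -> Optional[datetime.date]:
    if value is None:
        return None
    # datetime is a subclass of date, so strip the time part first.
    if isinstance(value, datetime.datetime):
        return value.date()
    if isinstance(value, datetime.date):
        return value
    # Assumption: "year" is required; missing "month"/"day" default to 1.
    return datetime.date(value["year"], value.get("month", 1), value.get("day", 1))
```

With these assumptions, `{"year": 2020}` maps to 1 January 2020 and `{"year": 2020, "month": 5, "day": 17}` maps to 17 May 2020.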

Classes

MergedFilterConfig

class MergedFilterConfig(    min_num_frames: Optional[int] = None,    max_num_frames: Optional[int] = None,    minimum_dob: Optional[date] = None,    maximum_dob: Optional[date] = None,    check_required_fields: bool = False,    required_field_names: Optional[list[str]] = None,    file_creation_min_date: Optional[date] = None,    file_creation_max_date: Optional[date] = None,    file_modification_min_date: Optional[date] = None,    file_modification_max_date: Optional[date] = None,    min_file_size: Optional[float] = None,    max_file_size: Optional[float] = None,    modality: "Optional[Literal['OCT', 'SLO']]" = None,    min_acquisition_date: Optional[date] = None,    max_acquisition_date: Optional[date] = None,    series_description: Optional[str] = None,):

Effective filter configuration after merging task and datasource filters.

This represents the final, resolved filter values that should be applied when loading data. It is produced by merging task-level filters with datasource-level filters using intersection logic (most restrictive wins).

All fields are optional as different datasource types support different filter capabilities.

Variables

  • static check_required_fields : bool
  • static max_file_size : Optional[float]
  • static max_num_frames : Optional[int]
  • static min_file_size : Optional[float]
  • static min_num_frames : Optional[int]
  • static modality : Optional[Literal['OCT', 'SLO']]
  • static required_field_names : Optional[list[str]]
  • static series_description : Optional[str]

TaskFilter

class TaskFilter(filter_type: str, value: TaskFilterValue):

A single task-level filter to apply at runtime.

Task-level filters work with datasource-level filters using intersection logic. They can only further restrict data selection, not expand it.

Arguments

  • filter_type: The type of filter (kebab-case string matching TaskFilterType).
  • value: The filter value. Its type depends on filter_type:
      - Date filters: dict with year (required), month (optional), day (optional)
      - Size filters: float (MB)
      - Frame filters: int
      - check-required-fields: bool to enable required field checking
      - required-field-names: list[str] of required field names to check
      - modality: str, either "OCT" or "SLO"

Variables

  • static filter_type : str
  • static value : Union[Date, DateTD, int, float, str, list[str], dict[str, int]]

TaskFilterType

class TaskFilterType(*args, **kwds):

Types of task-level filters available at runtime.

Each filter type maps to an existing filter capability in the codebase. Values use kebab-case for FE compatibility.

Ancestors

Variables

  • static CHECK_REQUIRED_FIELDS
  • static FILE_CREATION_MAX_DATE
  • static FILE_CREATION_MIN_DATE
  • static FILE_MODIFICATION_MAX_DATE
  • static FILE_MODIFICATION_MIN_DATE
  • static MAX_DOB
  • static MAX_FILE_SIZE
  • static MAX_FRAMES
  • static MIN_DOB
  • static MIN_FILE_SIZE
  • static MIN_FRAMES
  • static MODALITY
  • static REQUIRED_FIELD_NAMES
  • static SCAN_ACQUISITION_MAX_DATE
  • static SCAN_ACQUISITION_MIN_DATE
  • static SERIES_DESCRIPTION