background_file_counter

Background file counting utility that works with any FileSystemIterableSource.

Module

Functions

get_background_file_count

def get_background_file_count(datasource: "FileSystemIterableSource") -> Optional[int]:

Get the background file count for a datasource.

serialize_datasource_configs

def serialize_datasource_configs(datasource: "FileSystemIterableSource") -> Dict[str, Any]:

Create serializable data from datasource configuration.

Arguments

  • datasource: The datasource to serialize

Returns: Serialized datasource configuration.

start_background_file_counting

def start_background_file_counting(datasource: "FileSystemIterableSource") -> None:

Start background file counting for a datasource.

stop_background_file_counting

def stop_background_file_counting(datasource: "FileSystemIterableSource") -> None:

Stop background file counting for a datasource.

Classes

BackgroundFileCounter

class BackgroundFileCounter(datasource: FileSystemIterableSource):

Non-intrusive background file counter for FileSystemIterableSource.

Arguments

  • datasource: The FileSystemIterableSource to count files for.

The BackgroundFileCounter class counts the files in a FileSystemIterableSource in a separate process while a task is executing. Using threads instead of processes is discouraged: the Flask server operates in single-threaded mode, so counting files in a thread becomes a blocking task that can stall execution within the app.

When the datasource has selected_file_names_override populated (e.g. after a RecordFilterAlgorithm runs during initial setup), the count is already known and no background process is spawned. This avoids an unnecessary full directory scan and ensures the reported count reflects only the filtered files.

Methods


get_count

def get_count(self) -> Optional[int]:

Get the current file count (None if counting is not yet complete).

is_counting_complete

def is_counting_complete(self) -> bool:

Check if counting is complete.

start_counting

def start_counting(self) -> None:

Start background file counting.

If the file count is already known (e.g. from selected_file_names_override), the count is stored immediately without spawning a background process.

stop_counting

def stop_counting(self) -> None:

Stop background file counting.
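
Because get_count returns None until counting has finished, callers will typically poll. The helper below is one possible way to combine the four methods documented above. wait_for_count, CounterLike, and the timeout and poll_interval parameters are hypothetical names introduced for this example and are not part of the module. It also assumes that calling stop_counting after counting has completed is harmless.

```python
import time
from typing import Optional, Protocol


class CounterLike(Protocol):
    """Structural type covering the four documented methods."""

    def start_counting(self) -> None: ...
    def is_counting_complete(self) -> bool: ...
    def get_count(self) -> Optional[int]: ...
    def stop_counting(self) -> None: ...


def wait_for_count(
    counter: CounterLike,
    timeout: float = 5.0,
    poll_interval: float = 0.1,
) -> Optional[int]:
    """Start counting, poll until done or timed out, then always stop.

    Returns the file count, or None if the timeout expired first.
    """
    counter.start_counting()
    deadline = time.monotonic() + timeout
    try:
        while not counter.is_counting_complete():
            if time.monotonic() >= deadline:
                return None  # gave up; the count is still unknown
            time.sleep(poll_interval)
        return counter.get_count()
    finally:
        # Assumption: stopping an already finished counter is a no-op.
        counter.stop_counting()
```

Any object exposing these four methods can be passed in, so the helper can be exercised with a small stub in tests without touching the file system.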