dataframe_generation_extensions

Additional functionality for DataFrame processing.

Provides functions that can be used for additional column generation.

Module

Functions

extract_hypertransmission

def extract_hypertransmission(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for extracting hypertransmission area.

extract_is_os_disruption

def extract_is_os_disruption(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for extracting IS/OS disruption area.

extract_json_value

def extract_json_value(df: pd.DataFrame, json_column: str, key: str, new_column_name: str) -> pandas.core.frame.DataFrame:

Extracts a specific value from a JSON string or dictionary column.

Arguments

  • df: The DataFrame to process.
  • json_column: The name of the column containing JSON strings or dictionaries.
  • key: The key to extract from the JSON or dictionary.
  • new_column_name: The name for the new column containing the extracted values.

Returns

The DataFrame with the new column added.
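The behaviour described above can be sketched roughly as follows. This is an illustrative stand-in, not the library's implementation: the name `extract_json_value_sketch` is hypothetical, and returning `None` for unparseable or non-dictionary values is an assumption, since the handling of such cells is not documented here.

```python
import json
from typing import Any

import pandas as pd


def extract_json_value_sketch(
    df: pd.DataFrame, json_column: str, key: str, new_column_name: str
) -> pd.DataFrame:
    """Hypothetical sketch of extracting one key from a JSON/dict column."""

    def _get(cell: Any) -> Any:
        # Dictionaries are read directly.
        if isinstance(cell, dict):
            return cell.get(key)
        # Strings are parsed as JSON first.
        if isinstance(cell, str):
            try:
                parsed = json.loads(cell)
            except json.JSONDecodeError:
                return None  # assumption: unparseable strings yield None
            return parsed.get(key) if isinstance(parsed, dict) else None
        return None  # assumption: any other type (e.g. NaN) yields None

    df[new_column_name] = df[json_column].apply(_get)
    return df


example = pd.DataFrame({"meta": ['{"eye": "L"}', {"eye": "R"}, "not json"]})
result = extract_json_value_sketch(example, "meta", "eye", "laterality")
```

Here the `meta` column mixes a JSON string, a dictionary, and an invalid string, so the new `laterality` column shows all three code paths.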

extract_neurosensory_retina_atrophy

def extract_neurosensory_retina_atrophy(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for extracting neurosensory retina atrophy area.

extract_rpe_atrophy

def extract_rpe_atrophy(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for extracting RPE atrophy area.

extract_rpe_disruption

def extract_rpe_disruption(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for extracting RPE disruption area.

generate_bitfount_patient_id

def generate_bitfount_patient_id(df: pd.DataFrame, name_col: str = "Patient's Name", dob_col: str = "Patient's Birth Date") -> pandas.core.frame.DataFrame:

Adds a BitfountPatientID column to the provided DataFrame.

This mutates the input dataframe with the new column.

The generated IDs are the hash of the concatenated string of a Bitfount-specific key, full name, and date of birth.
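As a rough illustration of the scheme described above, the sketch below hashes a key, name, and date of birth for each row. Several details are assumptions not confirmed by this page: the hash algorithm (SHA-256 here), the key value (`EXAMPLE_KEY` is a placeholder), the absence of any separator or normalisation, and the function name `patient_id_sketch` itself.

```python
import hashlib

import pandas as pd

# Hypothetical placeholder; the real Bitfount-specific key is not documented here.
EXAMPLE_KEY = "example-key"


def patient_id_sketch(
    df: pd.DataFrame,
    name_col: str = "Patient's Name",
    dob_col: str = "Patient's Birth Date",
) -> pd.DataFrame:
    """Hypothetical sketch: add a hashed ID column built from name and DOB."""

    def _hash_row(row: pd.Series) -> str:
        # Concatenate key, full name and date of birth, then hash.
        # SHA-256 is an assumption; the source does not name the hash function.
        raw = f"{EXAMPLE_KEY}{row[name_col]}{row[dob_col]}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    # Mutates the input DataFrame, matching the documented behaviour.
    df["BitfountPatientID"] = df.apply(_hash_row, axis=1)
    return df


patients = pd.DataFrame(
    {
        "Patient's Name": ["Jane Doe", "Jane Doe", "John Roe"],
        "Patient's Birth Date": ["1980-01-01", "1980-01-01", "1975-06-30"],
    }
)
out = patient_id_sketch(patients)
```

Because the ID depends only on the key, name, and date of birth, the two identical `Jane Doe` rows receive the same ID, while `John Roe` receives a different one.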

generate_subfoveal_indicator

def generate_subfoveal_indicator(df: pd.DataFrame, distance_from_fovea_col: str = 'distance_from_fovea_centre', max_distance: float = 0.1) -> pandas.core.frame.DataFrame:

Adds a 'Subfoveal?' column to the provided DataFrame.

This mutates the input dataframe with the new column.

The column will contain 'Y' if the distance from fovea is less than the specified maximum distance, 'N' if it's greater, and 'Fovea not detected' if the distance value is not available.

Arguments

  • df: The DataFrame to add the column to.
  • distance_from_fovea_col: The name of the column containing the distance from fovea. Defaults to DISTANCE_FROM_FOVEA_CENTRE_COL.
  • max_distance: The maximum distance to consider as subfoveal. Defaults to 0.1.

Returns

The modified DataFrame with the new column.

Raises

  • DataFrameExtensionError: If the distance from fovea column is not available in the DataFrame.
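The labelling rule above can be sketched as follows. This is a minimal stand-in under stated assumptions, not the library's code: `subfoveal_sketch` and `DataFrameExtensionErrorSketch` are hypothetical names, and a distance exactly equal to `max_distance` is treated as 'N' here because the documentation does not say how that boundary is handled.

```python
import pandas as pd


class DataFrameExtensionErrorSketch(Exception):
    """Hypothetical stand-in for DataFrameExtensionError."""


def subfoveal_sketch(
    df: pd.DataFrame,
    distance_from_fovea_col: str = "distance_from_fovea_centre",
    max_distance: float = 0.1,
) -> pd.DataFrame:
    # The distance column must exist, otherwise the extension cannot run.
    if distance_from_fovea_col not in df.columns:
        raise DataFrameExtensionErrorSketch(
            f"Column '{distance_from_fovea_col}' not found in DataFrame."
        )

    def _label(distance: float) -> str:
        if pd.isna(distance):
            return "Fovea not detected"
        # Assumption: equality with max_distance falls on the 'N' side.
        return "Y" if distance < max_distance else "N"

    # Mutates the input DataFrame, matching the documented behaviour.
    df["Subfoveal?"] = df[distance_from_fovea_col].apply(_label)
    return df


scans = pd.DataFrame({"distance_from_fovea_centre": [0.05, 0.5, None]})
labelled = subfoveal_sketch(scans)
```

The three example rows cover a distance below the threshold, one above it, and a missing value.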

generate_subfoveal_indicator_extension

def generate_subfoveal_indicator_extension(df: pd.DataFrame) -> pandas.core.frame.DataFrame:

Extension function for generating the subfoveal indicator column.

Note that this is a wrapper function since extensions do not support parameters yet. Once they do, we can remove this wrapper function.

id_safe_string

def id_safe_string(s: str) -> str:

Converts a string to a normalised version safe for use in IDs.

In particular, converts accented/diacritic characters to their closest ASCII representation, ensures lowercase, and replaces any non-word characters with underscores.

This allows us to map potentially different spellings (e.g. Francois John-Smith vs François John Smith) to the same string (francois_john_smith).
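One plausible way to implement this normalisation with the standard library is shown below. It is a sketch only: the function name `id_safe_string_sketch` is hypothetical, and the use of NFKD decomposition followed by ASCII encoding is an assumption about how "closest ASCII representation" is obtained.

```python
import re
import unicodedata


def id_safe_string_sketch(s: str) -> str:
    """Hypothetical sketch of ID-safe string normalisation."""
    # Decompose accented characters (NFKD) and drop the combining marks
    # by discarding everything that cannot be encoded as ASCII.
    ascii_only = (
        unicodedata.normalize("NFKD", s).encode("ascii", "ignore").decode("ascii")
    )
    # Lowercase, then replace every non-word character with an underscore.
    return re.sub(r"\W", "_", ascii_only.lower())


first = id_safe_string_sketch("Francois John-Smith")
second = id_safe_string_sketch("François John Smith")
```

Both spellings from the example above reduce to the same string under this sketch.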

safe_format_date

def safe_format_date(value: Any) -> Any:

Safely format a date string.

Arguments

  • value: The input value, which can be a date string, integer, or NaN.

Returns

Formatted date string or the original value as a string if formatting fails.
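The fall-back behaviour can be illustrated with the sketch below. The accepted input formats (`%Y%m%d` and `%Y-%m-%d`) and the output format (`%Y-%m-%d`) are assumptions chosen for the example, since this page does not state which formats the real function uses; `safe_format_date_sketch` is a hypothetical name.

```python
from datetime import datetime
from typing import Any

# Assumed formats for illustration only; not taken from the library.
ASSUMED_INPUT_FORMATS = ("%Y%m%d", "%Y-%m-%d")
ASSUMED_OUTPUT_FORMAT = "%Y-%m-%d"


def safe_format_date_sketch(value: Any) -> Any:
    """Hypothetical sketch: format a date, or return the value as a string."""
    text = str(value)
    for fmt in ASSUMED_INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(ASSUMED_OUTPUT_FORMAT)
        except ValueError:
            continue  # try the next assumed format
    # Formatting failed for every format: return the original value as a string.
    return text


from_int = safe_format_date_sketch(19800101)
from_nan = safe_format_date_sketch(float("nan"))
from_text = safe_format_date_sketch("not a date")
```

Integers are converted to strings before parsing, and values such as NaN that cannot be parsed come back unchanged in string form rather than raising.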

Classes

DataFrameExtensionError

class DataFrameExtensionError(*args, **kwargs):

Indicates an error whilst trying to apply an extension function.

Ancestors

DataFrameExtensionFunction

class DataFrameExtensionFunction(*args, **kwargs):

Callback protocol for DataFrame extension functions.
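Judging by the function signatures in this module, an extension function takes a DataFrame and returns a DataFrame. A callback protocol of that shape might look like the sketch below; the names `ExtensionFunctionSketch`, `add_row_count`, and `apply_extensions` are hypothetical and are not part of the library.

```python
from typing import List, Protocol

import pandas as pd


class ExtensionFunctionSketch(Protocol):
    """Hypothetical callback protocol: DataFrame in, DataFrame out."""

    def __call__(self, df: pd.DataFrame) -> pd.DataFrame: ...


def add_row_count(df: pd.DataFrame) -> pd.DataFrame:
    # A toy extension that adds one generated column.
    df["row_count"] = len(df)
    return df


def apply_extensions(
    df: pd.DataFrame, extensions: List[ExtensionFunctionSketch]
) -> pd.DataFrame:
    # Apply each extension in order, passing the result along.
    for extension in extensions:
        df = extension(df)
    return df


frame = apply_extensions(pd.DataFrame({"a": [1, 2, 3]}), [add_row_count])
```

Any plain function with a matching signature satisfies such a protocol structurally, which is consistent with the parameterless wrapper functions described earlier in this module.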