
pod

Pods for responding to tasks.

Classes

DatasourceContainer

class DatasourceContainer(
    name: str,
    datasource: Union[BaseSource, ViewDatasourceConfig],
    datasource_details: PodDetailsConfig,
    data_config: PodDataConfig,
    schema: BitfountSchema,
):

Contains a datasource and all the data related to it.

This represents a datasource configuration post-data-loading/configuration and so the data config and schema must be present.

Variables

  • static name : str

DatasourceContainerConfig

class DatasourceContainerConfig(
    name: str,
    datasource: Union[BaseSource, ViewDatasourceConfig],
    datasource_details: Optional[PodDetailsConfig] = None,
    data_config: Optional[PodDataConfig] = None,
    schema: Optional[Union[str, os.PathLike, BitfountSchema]] = None,
):

Contains a datasource and maybe some data related to it.

This represents a datasource configuration pre-data-loading/configuration and so the data config and schema are not required.

Variables

  • static name : str
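
The relationship between the two container classes can be illustrated with a minimal sketch. The class and function names below (ContainerConfigSketch, ContainerSketch, finalize) are hypothetical stand-ins, not part of the Bitfount API, and the real loading step is likely more involved. The sketch only shows the idea that optional fields in the pre-loading configuration are all populated by the time the post-loading container exists.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ContainerConfigSketch:
    """Hypothetical pre-loading config: only name and datasource are required."""

    name: str
    datasource: Any
    datasource_details: Optional[Any] = None
    data_config: Optional[Any] = None
    schema: Optional[Any] = None


@dataclass
class ContainerSketch:
    """Hypothetical post-loading container: every field must be present."""

    name: str
    datasource: Any
    datasource_details: Any
    data_config: Any
    schema: Any


def _first_set(value: Optional[Any], fallback: Any) -> Any:
    """Return value unless it is None, in which case return fallback."""
    return value if value is not None else fallback


def finalize(
    config: ContainerConfigSketch,
    default_details: Any,
    default_data_config: Any,
    generated_schema: Any,
) -> ContainerSketch:
    """Fill in any missing optional fields and build the full container."""
    return ContainerSketch(
        name=config.name,
        datasource=config.datasource,
        datasource_details=_first_set(config.datasource_details, default_details),
        data_config=_first_set(config.data_config, default_data_config),
        schema=_first_set(config.schema, generated_schema),
    )
```

In this sketch a schema supplied in the config takes precedence, and a generated one is used only when none was given.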

Pod

class Pod(
    name: str,
    datasource: Optional[BaseSource] = None,
    datasources: Optional[Iterable[DatasourceContainerConfig]] = None,
    username: Optional[str] = None,
    data_config: Optional[PodDataConfig] = None,
    schema: Optional[Union[str, os.PathLike, BitfountSchema]] = None,
    pod_details_config: Optional[PodDetailsConfig] = None,
    hub: Optional[BitfountHub] = None,
    message_service: Optional[MessageServiceConfig] = None,
    access_manager: Optional[BitfountAM] = None,
    pod_keys: Optional[RSAKeyPair] = None,
    approved_pods: Optional[List[str]] = None,
    differential_privacy: Optional[DPPodConfig] = None,
    pod_db: Union[bool, PodDbConfig] = False,
    show_datapoints_with_results_in_db: bool = True,
    update_schema: bool = False,
    secrets: Optional[Union[APIKeys, JWT]] = None,
):

Makes data and computation available remotely and responds to tasks.

The basic component of the Bitfount network is the Pod (Processor of Data). Pods are co-located with data, check that users are authorized to perform the requested operations on that data, and then run any approved computation. Creating a Pod registers it with Bitfount Hub.

Example usage:
import bitfount as bf

pod = bf.Pod(
    name="really_cool_data",
    data="/path/to/data",
)
pod.start()
Tip: Once you start a Pod, you can leave it running in the background. It will automatically respond to any tasks without any intervention required.

Arguments

  • name: Name of the pod. This will appear on Bitfount Hub and Bitfount AM. This is also used for the name of the table in a single-table BaseSource.
  • datasource: (Deprecated, use datasources instead) A concrete instance of the BaseSource object.
  • datasources: The list of datasources to be associated and registered with this pod. Each has its own data config and schema (although these are not necessarily present at this point).
  • username: Username of the user who is registering the pod. Defaults to None.
  • data_config: (Deprecated, use datasources instead) Configuration for the data. Defaults to None.
  • schema: (Deprecated, use datasources instead) Schema for the data. This can be a BitfountSchema object or a Path to a serialized BitfountSchema. This will be generated automatically if not provided. Defaults to None.
  • pod_details_config: (Deprecated, use datasources instead) Configuration for the pod details. Defaults to None.
  • hub: Bitfount Hub to register the pod with. Defaults to None.
  • message_service: Configuration for the message service. Defaults to None.
  • access_manager: Access manager to use for checking access. Defaults to None.
  • pod_keys: Keys for the pod. Defaults to None.
  • approved_pods: List of other pod identifiers this pod is happy to share a training task with. Required if the protocol uses the SecureAggregator aggregator.
  • differential_privacy: Differential privacy configuration for the pod. Defaults to None.
  • pod_db: Whether the results should be stored in a database. Defaults to False. If set to True, a SQLite database is created for the pod to enable results storage for protocols that return them. The database also keeps track of the pod's datapoints so that any repeated task is run only on new datapoints.
  • show_datapoints_with_results_in_db: Whether the original datapoints should be included in the results database. Defaults to True. This argument is ignored if pod_db argument is set to False.
  • update_schema: Whether the schema needs to be re-generated even if provided. Defaults to False.
  • secrets: Secrets for authenticating with Bitfount services. If not provided then an interactive flow will trigger for authentication.
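
The datapoint tracking described for pod_db can be pictured with a small SQLite sketch. This is not Bitfount's actual schema or code; the table name seen, the task_key argument, and the new_datapoints function are assumptions made purely for illustration. It records which datapoints a given task has already processed and returns only the ones not seen before.

```python
import sqlite3
from typing import Iterable, List


def new_datapoints(
    conn: sqlite3.Connection, task_key: str, datapoint_ids: Iterable[str]
) -> List[str]:
    """Return the datapoints not yet recorded for task_key, then record them.

    Hypothetical helper: the table layout is an assumption for this sketch.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "task TEXT NOT NULL, datapoint TEXT NOT NULL, "
        "PRIMARY KEY (task, datapoint))"
    )
    rows = conn.execute("SELECT datapoint FROM seen WHERE task = ?", (task_key,))
    already_seen = {row[0] for row in rows}
    fresh = [d for d in datapoint_ids if d not in already_seen]
    # OR IGNORE keeps the insert safe if the input itself contains duplicates.
    conn.executemany(
        "INSERT OR IGNORE INTO seen (task, datapoint) VALUES (?, ?)",
        [(task_key, d) for d in fresh],
    )
    conn.commit()
    return fresh
```

Running the same task twice over a growing dataset would then yield only the added datapoints the second time, while a different task starts with a clean slate.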

Attributes

  • datasources: The set of datasources associated with this pod.
  • name: Name of the pod.
  • pod_identifier: Identifier of the pod.
  • private_key: Private key of the pod.

Raises

  • PodRegistrationError: If the pod could not be registered for any reason.
  • DataSourceError: If the BaseSource for the provided datasource has not been initialised properly. To initialise it properly, call super().__init__(**kwargs) in the __init__ of the DataSource.

Variables

  • datasource : Optional[DatasourceContainer] - If there is only a single datasource, this is a shorthand for retrieving it.

    If there is more than one datasource (or no datasources) this will log a warning and return None.

  • name : str - Pod name property.
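
The behaviour of the datasource shorthand can be sketched as follows. PodSketch is a hypothetical stand-in class that stores plain strings in place of DatasourceContainer objects; only the single-versus-many logic described above is being illustrated, and the warning text is invented for the sketch.

```python
import logging
from typing import List, Optional

logger = logging.getLogger(__name__)


class PodSketch:
    """Hypothetical stand-in holding a list of datasources."""

    def __init__(self, datasources: List[str]) -> None:
        self.datasources = list(datasources)

    @property
    def datasource(self) -> Optional[str]:
        """Return the only datasource, or None (with a warning) otherwise."""
        if len(self.datasources) == 1:
            return self.datasources[0]
        logger.warning(
            "Expected exactly one datasource but found %d; returning None.",
            len(self.datasources),
        )
        return None
```

Callers that may be dealing with several datasources should therefore check for None, or iterate over datasources directly.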

Methods


start

def start(self) -> None:

Starts a pod instance, listening for tasks.

Whenever a task is received, a worker is created to handle it. The pod runs continuously and orchestrates training asynchronously whenever a task arrives, i.e. multiple tasks can run concurrently.

start_async

async def start_async(self) -> None:

Starts a pod instance, listening for tasks.

Whenever a task is received, a worker is created to handle it. The pod runs continuously and orchestrates training asynchronously whenever a task arrives, i.e. multiple tasks can run concurrently.
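
The "one worker per task, many tasks at once" pattern can be sketched with asyncio. Everything here is illustrative: handle_task, listen, start_async_sketch and start_sketch are hypothetical names, the queue stands in for the message service, and the None sentinel exists only so the sketch terminates (a real pod keeps listening). Having the blocking function wrap the coroutine with asyncio.run is also an assumption, not a statement about how Pod.start is implemented.

```python
import asyncio
from typing import Iterable, List, Optional


async def handle_task(task_id: int, results: List[int]) -> None:
    """Hypothetical worker: simulate some work for a single task."""
    await asyncio.sleep(0.01)
    results.append(task_id)


async def listen(queue: "asyncio.Queue[Optional[int]]", results: List[int]) -> None:
    """Spawn one worker per incoming task without waiting for earlier ones."""
    workers = []
    while True:
        task_id = await queue.get()
        if task_id is None:  # sentinel used only to end this sketch
            break
        workers.append(asyncio.create_task(handle_task(task_id, results)))
    # Wait for all in-flight workers before returning.
    await asyncio.gather(*workers)


async def start_async_sketch(task_ids: Iterable[int]) -> List[int]:
    """Coroutine variant: feed the queue, then listen until the sentinel."""
    queue: "asyncio.Queue[Optional[int]]" = asyncio.Queue()
    for task_id in task_ids:
        queue.put_nowait(task_id)
    queue.put_nowait(None)
    results: List[int] = []
    await listen(queue, results)
    return results


def start_sketch(task_ids: Iterable[int]) -> List[int]:
    """Blocking variant: run the coroutine on a fresh event loop."""
    return asyncio.run(start_async_sketch(task_ids))
```

The coroutine form is the one to await from code that already runs inside an event loop, while the blocking form suits a plain script.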