dataloaders

HuggingFace compatible dataloaders.

Classes

HuggingFaceBitfountDataLoader

class HuggingFaceBitfountDataLoader(    dataset: _IterableHuggingFaceDataset, batch_size: int = 1, shuffle: bool = False,):

Wraps a PyTorch DataLoader with bitfount functions.

Arguments

batch_size: The batch size for the dataloader. Defaults to 1.
dataset: An pytorch compatible dataset.
shuffle: A boolean value indicating whether the values in the dataset should be shuffled. Defaults to False.

Attributes

batch_size: The batch size for the dataloader. Defaults to 1.
shuffle: A boolean value indicating whether the values in the dataset should be shuffled. Defaults to False.

Ancestors

BitfountDataLoader

Methods

expect_key_in_iter

def expect_key_in_iter(self) ‑> bool:

Will there be a data key entry in the output from iteration?

HuggingFaceIterableBitfountDataLoader

class HuggingFaceIterableBitfountDataLoader(    dataset: _IterableBitfountDataset,    batch_size: int = 1,    shuffle: bool = False,    secure_rng: bool = False,):

Wraps a PyTorch DataLoader with bitfount functions.

Arguments

batch_size: The batch size for the dataloader. Defaults to None.
dataset: An HuggingFace compatible dataset.

Ancestors

BitfountDataLoader

Variables

static dataset : bitfount.data.huggingface.datasets._IterableHuggingFaceDataset

buffer_size : int - Number of elements to buffer.

The size of the buffer is the greater of the batch size and default buffer size unless the dataset is smaller than the default buffer in which case the dataset size is used. PyTorch already ensures that the batch size is not greater than the dataset size under the hood.

Static methods

convert_input_target

def convert_input_target(    batch: _DataBatchAllowingText,) ‑> list[typing.Union[torch.Tensor, collections.abc.Sequence[torch.Tensor]]]:

Convert the input and target to match the hugging face expected inputs_.

convert_input_target_key

def convert_input_target_key(    batch: _DataBatchAllowingTextWithKey,) ‑> list[typing.Union[torch.Tensor, collections.abc.Sequence[torch.Tensor], collections.abc.Sequence[str]]]:

Convert the input and target to match the hugging face expected inputs_.

Ensures that the data key is also passed through to the output.

Methods

expect_key_in_iter

def expect_key_in_iter(self) ‑> bool:

Will there be a data key entry in the output from iteration?

Classes​

HuggingFaceBitfountDataLoader​

Ancestors​

Methods​

expect_key_in_iter​

HuggingFaceIterableBitfountDataLoader​

Ancestors​

Variables​

Static methods​

convert_input_target​

convert_input_target_key​

Methods​

expect_key_in_iter​

Classes

HuggingFaceBitfountDataLoader

Ancestors

Methods

expect_key_in_iter

HuggingFaceIterableBitfountDataLoader

Ancestors

Variables

Static methods

convert_input_target

convert_input_target_key

Methods

expect_key_in_iter