Datasets
Datasets in Bitfount act as references to your data, storing only metadata and schema—not the raw data itself. Your datasets always remain on your system and are never transferred or stored by Bitfount.
This guide covers how to connect a dataset to Bitfount, link it to a project, and manage dataset access.
Connecting datasets
Before using a dataset in a project, you must first connect it to Bitfount using Bitfount Desktop. Connecting a dataset to Bitfount is like registering it—only its metadata (name, description, and schema) is stored, never the raw data itself.
Format
It's important to ensure your dataset is formatted correctly to be compatible with the task used in the project. If you are joining an existing project, please check with the project contact to ensure your dataset meets the requirements for the task.
Selecting a data source
To connect a dataset, click Connect dataset, either from the Datasets page or within the project when you link a dataset, and choose from the available data sources supported by Bitfount.
If your dataset contains DICOM files and you intend to run Ophthalmic tasks, we recommend selecting the DICOM (Ophthalmology) data source for optimal compatibility.
After selecting a data source, enter a dataset name and, optionally, a description, then click Connect dataset. The system will then process the connection, making the dataset available within Bitfount.
Once connected, the dataset should appear as Online.
Can't find the data source you need? Please reach out to the Bitfount support team—we're happy to help you connect your dataset to Bitfount.
Schema
When you connect a dataset, Bitfount automatically generates a schema that defines the column names and data types within your dataset. This schema is used to verify compatibility with the task used in a project, and does not contain any actual data (such as patient records), only structural information about the dataset.
If you are working with data scientists, they may also reference the schema to design analyses and tasks that align with your dataset's structure.
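To illustrate what "structural information only" means, here is a minimal Python sketch. It is not Bitfount's implementation: the function names `infer_schema` and `missing_columns`, the sample rows, and the type labels are all hypothetical, chosen only to show how a schema can describe columns and types without retaining any values, and how such a schema could be checked against a task's requirements.

```python
from typing import Any


def infer_schema(rows: list[dict[str, Any]]) -> dict[str, str]:
    """Map each column name to a type name; no cell values are kept.

    Hypothetical helper for illustration, not part of any Bitfount API.
    """
    schema: dict[str, str] = {}
    for row in rows:
        for column, value in row.items():
            type_name = type(value).__name__
            # The first type seen is recorded; a conflicting type later
            # on marks the column as "mixed".
            if schema.setdefault(column, type_name) != type_name:
                schema[column] = "mixed"
    return schema


def missing_columns(schema: dict[str, str], required: dict[str, str]) -> list[str]:
    """Return required columns that are absent or have a different type."""
    return sorted(
        column
        for column, type_name in required.items()
        if schema.get(column) != type_name
    )


# Made-up example rows; only their structure ends up in the schema.
rows = [
    {"patient_id": 1, "eye": "left", "thickness_um": 251.5},
    {"patient_id": 2, "eye": "right", "thickness_um": 248.0},
]
schema = infer_schema(rows)
print(schema)
# {'patient_id': 'int', 'eye': 'str', 'thickness_um': 'float'}

# A hypothetical task that needs an "eye" column and a "scan_date" column.
print(missing_columns(schema, {"eye": "str", "scan_date": "str"}))
# ['scan_date']
```

The resulting dictionary holds only column names and type labels, which is why a schema of this kind can be shared for compatibility checks without exposing the underlying records.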
Managing datasets
Status
When you start Bitfount Desktop, the system automatically attempts to establish a connection with all connected datasets, whether they are online or offline.
If needed, you can manually take a dataset offline from the Settings tab, which will temporarily disable task execution for that dataset.
Tasks cannot run until Bitfount has finished connecting all datasets at startup.
History
A full audit trail is available for datasets via the Activity history tab. To view project-specific activity, navigate to the same tab in the relevant project.
Archiving
From the Settings tab, you can archive your dataset. Archiving does not delete the raw data source connected to Bitfount. Archived datasets can be unarchived and reused in projects when appropriate.
Access
You can view all projects the dataset is currently linked to via the Linked projects tab on the dataset's detail page. You can unlink a dataset from a project at any time by clicking the Unlink dataset button within the project's Datasets tab.
If linking your dataset to projects does not fit your use case and you are working directly with a data scientist, please see Managing Pod Access. That guide outlines how to manage direct access to datasets outside the context of a project via the Assigned roles tab.