Postprocessing
Postprocessing turns raw model outputs into human-friendly results and task artefacts. This is an optional final step in the data pipeline and is not required for all models.
Postprocessing is supported by the following algorithms:
- bitfount.ModelInference
- bitfount.HuggingFaceImageClassificationInference
- bitfount.HuggingFaceNERInference
- bitfount.HuggingFaceTextClassificationInference
Available postprocessors
Bitfount provides a suite of built-in postprocessors to handle common output-preparation needs. You can mix and match them, even chaining several together using the compound type.
Built-in postprocessor types
General postprocessors
These postprocessors can be used with any of the supported algorithms listed above.
| Name | Description | Example Use Case |
|---|---|---|
| rename | Rename DataFrame columns. | Change "pred" column to "Prediction". |
| transform | Apply a transformation function from bitfount.transformations on output columns. | Apply softmax or custom transform to logits. |
| json_restructure | Move fields between levels in nested JSON structures. | Move a key from nested JSON upwards for flattening. |
| string_to_json | Parse columns containing JSON as strings into JSON objects. | Safely load prediction results stored as strings. |
| json_key_rename | Rename keys within JSON fields in columns. | Change "class1" to "cat" inside prediction JSON. |
| json_wrap_in_list | Wrap JSON fields in an additional list. | Ensure all predictions are in a JSON array format. |
| compound | Chain multiple postprocessors together in sequence. | Apply string_to_json followed by json_key_rename. |
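To make the compound use case concrete, the sketch below mimics parsing a JSON string and then renaming a key inside it, applied in sequence. It is an illustration in plain Python, not Bitfount's implementation: the helper names `parse_json_column`, `rename_json_keys` and `chain`, and the list-of-dicts row format, are assumptions made for this example.

```python
import json

def parse_json_column(rows, column):
    """Parse the JSON string stored under `column` in every row (hypothetical helper)."""
    return [{**row, column: json.loads(row[column])} for row in rows]

def rename_json_keys(rows, column, mapping):
    """Rename top-level keys of the JSON object stored under `column` (hypothetical helper)."""
    return [
        {**row, column: {mapping.get(k, k): v for k, v in row[column].items()}}
        for row in rows
    ]

def chain(rows, steps):
    """Apply each (function, kwargs) step in order, feeding each output into the next step."""
    for func, kwargs in steps:
        rows = func(rows, **kwargs)
    return rows

# A prediction stored as a JSON string, as a model might emit it.
rows = [{"prediction": '{"class1": 0.9, "class2": 0.1}'}]
result = chain(
    rows,
    [
        (parse_json_column, {"column": "prediction"}),
        (rename_json_keys, {"column": "prediction", "mapping": {"class1": "cat"}}),
    ],
)
print(result)  # [{'prediction': {'cat': 0.9, 'class2': 0.1}}]
```

Order matters here: the key rename only works once the string has been parsed, which is why the two steps are chained rather than applied independently.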
Hugging Face postprocessors
These postprocessors are designed specifically for use with Hugging Face algorithms.
| Name | Description | Example Use Case |
|---|---|---|
| huggingface_apply_id_to_labels | Maps model output IDs to human-readable labels using a mapping file from the Hugging Face model repository. | Convert numeric class IDs to descriptive labels for multi-headed models. |
| ner_deidentification | De-identifies text by replacing named entities detected by NER models with placeholder tokens. | Remove patient names from clinical text after NER inference. |
Example configuration in YAML
Postprocessors are supplied as a list of dictionaries. The only required key in each dictionary is type, which names the postprocessor to use.
All other keys are specific to that postprocessor and are passed to it as keyword arguments.
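The "type plus keyword arguments" convention can be pictured with a small dispatch sketch. This is only a mental model under stated assumptions, not how Bitfount resolves postprocessors internally: `REGISTRY`, `rename_columns` and `build_postprocessors` are hypothetical names, and rows are plain dictionaries rather than a DataFrame.

```python
from functools import partial

def rename_columns(rows, mapping):
    """Hypothetical stand-in for a rename step working on list-of-dict rows."""
    return [{mapping.get(key, key): value for key, value in row.items()} for row in rows]

# Hypothetical registry: maps each `type` name to the callable that implements it.
REGISTRY = {"rename": rename_columns}

def build_postprocessors(configs):
    """Turn a list of config dictionaries into ready-to-call functions."""
    built = []
    for config in configs:
        kwargs = dict(config)           # copy so the caller's config is not mutated
        type_name = kwargs.pop("type")  # the one required key; KeyError if absent
        built.append(partial(REGISTRY[type_name], **kwargs))
    return built

configs = [{"type": "rename", "mapping": {"pred": "Prediction"}}]
rows = [{"pred": 1, "id": "a"}]
for postprocess in build_postprocessors(configs):
    rows = postprocess(rows)
print(rows)  # [{'Prediction': 1, 'id': 'a'}]
```

In this picture, a misspelled or unsupported key in the configuration would surface as an unexpected-keyword error from the target function, which is one reason to check each postprocessor's documented arguments.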
More information about the available postprocessors and their arguments can be found in the API documentation.
ModelInference example
algorithm:
  - name: bitfount.ModelInference
    model:
      bitfount_model:
        model_ref: MyModel
        model_version: 2
        username: amin-nejad
      hyperparameters:
        batch_size: 8
    arguments:
      postprocessors:
        - type: rename
          columns: ["logits"]
          mapping:
            logits: probabilities
        - type: transform
          columns: ["probabilities"]
          transform: softmax
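For intuition, the two steps in the configuration above amount to renaming the logits column and then normalising its values with softmax. The sketch below reproduces that effect in plain Python; it is an assumption-laden illustration (list-of-dict rows, a hand-written `softmax`), not the code Bitfount runs.

```python
import math

def softmax(values):
    """Numerically stable softmax: subtract the max before exponentiating."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

# Raw model output: one row holding a list of logits (made-up numbers).
rows = [{"logits": [2.0, 1.0, 0.1]}]

# Step 1 (rename): logits -> probabilities.
renamed = [{"probabilities": row["logits"]} for row in rows]

# Step 2 (transform): apply softmax to the renamed column.
result = [{"probabilities": softmax(row["probabilities"])} for row in renamed]

print(result[0]["probabilities"])
```

After the second step each row's values are positive and sum to 1, so the largest logit becomes the most probable class.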
Hugging Face image classification example
algorithm:
  - name: bitfount.HuggingFaceImageClassificationInference
    arguments:
      model_id: google/vit-base-patch16-224
      postprocessors:
        - type: huggingface_apply_id_to_labels
          model_id: google/vit-base-patch16-224
          filepath: config.json
          key: id2label
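The mapping step configured above can be sketched as a lookup in the id2label section of a config.json file. The snippet below is a hedged illustration only: the JSON content and labels are made up for the example rather than taken from the real model repository, and `ids_to_labels` is a hypothetical helper, not a Bitfount function.

```python
import json

# Stand-in for the contents of config.json; the labels here are invented for illustration.
config_text = '{"id2label": {"0": "tabby cat", "1": "golden retriever"}}'

# Select the mapping stored under the configured key.
id2label = json.loads(config_text)["id2label"]

def ids_to_labels(class_ids, mapping):
    """Map integer class IDs to labels; JSON object keys are strings, so convert first."""
    return [mapping[str(class_id)] for class_id in class_ids]

labels = ids_to_labels([1, 0], id2label)
print(labels)  # ['golden retriever', 'tabby cat']
```

The string conversion is the detail most likely to trip up a hand-rolled version, since model outputs are usually integers while JSON keys are always strings.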