Image Classification Loader

class oxen.loaders.ImageClassificationLoader(imagery_root_dir, label_file, df_file, path_name='path', label_name='label', resize_to=None, resize_method='crop')

Prepares data from an Oxen repository for use in supervised image classification tasks.

__init__(imagery_root_dir, label_file, df_file, path_name='path', label_name='label', resize_to=None, resize_method='crop')

Creates a new ImageClassificationLoader.

Parameters:
  • imagery_root_dir (str) – Directory relative to which the image paths in the DataFrame file are specified.

  • label_file (str) – Path to a text file containing a line-separated list of canonical labels for the dataset.

  • df_file (str) – Path to a tabular file containing the image paths and associated labels (and any additional metadata).

  • path_name (str) – Column name in df_file containing the image paths.

  • label_name (str) – Column name in df_file containing the image labels.

  • resize_to (int | None) – Size to which images should be resized (square, in pixels).

  • resize_method (str) –

    Method to use for resizing images. One of “crop”, “pad”, or “squash”.

    crop: resize (preserving aspect ratio) such that the smaller side equals the target size, then center crop

    pad: resize (preserving aspect ratio) such that the larger side equals the target size, then pad with zeros equally on all sides

    squash: resize (not preserving aspect ratio)
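
The geometry behind the three resize methods can be sketched in plain Python. This is only an illustration of the descriptions above, not the loader's actual implementation; resize_plan is a hypothetical helper that computes sizes, not pixel data.

```python
def resize_plan(width, height, target, method="crop"):
    """Return (resized_w, resized_h, (excess_w, excess_h)) for a square target.

    Hypothetical helper: it mirrors the documented behaviour of each
    resize method but does not touch any image data.
    """
    if method == "squash":
        # Ignore the aspect ratio: both sides become the target size.
        return target, target, (0, 0)
    if method == "crop":
        # Scale so that the smaller side equals the target size.
        scale = target / min(width, height)
    elif method == "pad":
        # Scale so that the larger side equals the target size.
        scale = target / max(width, height)
    else:
        raise ValueError(f"unknown resize method: {method!r}")
    new_w, new_h = round(width * scale), round(height * scale)
    # "crop": pixels to trim per axis; "pad": pixels to fill with zeros per axis.
    return new_w, new_h, (abs(new_w - target), abs(new_h - target))


# A 400x200 image with a target size of 100:
print(resize_plan(400, 200, 100, "crop"))    # (200, 100, (100, 0))
print(resize_plan(400, 200, 100, "pad"))     # (100, 50, (0, 50))
print(resize_plan(400, 200, 100, "squash"))  # (100, 100, (0, 0))
```

With “crop”, the 100 excess pixels of width would be trimmed evenly from the left and right; with “pad”, the 50 missing pixels of height would be filled with zeros.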

run()
Returns:

  • outputs[0] (images) (np.ndarray) – All images found in the dataset, as a numpy array of shape (n, h, w, c)

  • outputs[1] (labels) (np.ndarray) – Encoded labels for training, index-matched to the images array

  • outputs[2] (mapper) (dict) – A dictionary mapping the encoded labels to their canonical names
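
The relationship between the label file, the encoded labels, and the mapper can be illustrated with a small sketch. It assumes each label is encoded as its zero-based line index in the label file; the source does not state the actual encoding scheme, and the label values here are hypothetical.

```python
import io

# Hypothetical contents of a label file: one canonical label per line.
label_file = io.StringIO("black\nblond\nbrown\n")

# Assumption: a label's encoding is its zero-based position in the file.
canonical = [line.strip() for line in label_file if line.strip()]
mapper = {index: name for index, name in enumerate(canonical)}
encode = {name: index for index, name in mapper.items()}

# Hypothetical values from the label column of the tabular file.
raw_labels = ["brown", "black", "brown", "blond"]

encoded = [encode[name] for name in raw_labels]  # what a model trains on
decoded = [mapper[code] for code in encoded]     # back to canonical names

print(encoded)  # [2, 0, 2, 1]
print(decoded)  # ['brown', 'black', 'brown', 'blond']
```

Under that assumption, looking up a predicted class in mapper recovers its human-readable label.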

Usage

from oxen import LocalRepo
from oxen.loaders import ImageClassificationLoader

repo = LocalRepo()

# Demo data for supervised image classification
repo.clone("https://hub.oxen.ai/ba/dataloader-images")

loader = ImageClassificationLoader(
    imagery_root_dir=repo.path,
    label_file=f"{repo.path}/annotations/labels.txt",
    df_file=f"{repo.path}/annotations/train.csv",
    path_name="file",
    label_name="hair_color",
)

X_train, y_train, mapper = loader.run()
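
To make the roles of imagery_root_dir, path_name, and label_name in the call above more concrete, the following sketch resolves image paths and labels from a small in-memory table using only the standard library. The rows and the directory name are hypothetical stand-ins; the real annotations/train.csv may contain other columns and values, and the loader may read it differently.

```python
import csv
import io
import os

# Hypothetical rows standing in for a tabular file such as annotations/train.csv.
df_text = "file,hair_color\nimages/0001.jpg,brown\nimages/0002.jpg,black\n"

imagery_root_dir = "dataloader-images"  # hypothetical stand-in for repo.path
path_name, label_name = "file", "hair_color"

rows = list(csv.DictReader(io.StringIO(df_text)))

# Paths in the path column are interpreted relative to imagery_root_dir.
image_paths = [os.path.join(imagery_root_dir, row[path_name]) for row in rows]
labels = [row[label_name] for row in rows]

for path, label in zip(image_paths, labels):
    print(path, label)
```

If the path column held absolute paths instead, joining them with a root directory would not be appropriate, so it is worth checking how the paths in df_file are written.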