Python#

The Oxen python interface makes it easy to integrate Oxen datasets directly into machine learning dataloaders or other data pipelines.

Authentication#

Generate a user config file to set your name and email for commits.

First, create a .oxen directory in your home directory.

# Mac/Linux
mkdir ~/.oxen

# Windows
cd %userprofile%
mkdir .oxen

Then run the following to create a user config file in the new directory.

import oxen

oxen.auth.create_user_config(name="Ox", email="hello@oxen.ai")

To contribute to an OxenHub repository, run the following to add your OxenHub API key to your user config.

import oxen

oxen.auth.add_host_auth(host="hub.oxen.ai", token=f"{YOUR_API_KEY}")

Repositories#

There are two types of repositories one can interact with, a LocalRepo and a RemoteRepo.

Local Repo#

To fully clone all the data to your local machine, you can use the LocalRepo class.

import oxen

repo = LocalRepo("path/to/repository")
repo.clone("https://hub.oxen.ai/ox/CatDogBBox")

If there is a specific version of your data you want to access, you can specify the branch when cloning.

repo.clone("https://hub.oxen.ai/ox/CatDogBBox", branch="my-pets")

Once you have a repository locally, you can perform the same operations you might via the command line, through the python api.

For example, you can checkout a branch, add a file, commit, and push the data to the same remote you cloned it from.

import oxen

repo = LocalRepo("path/to/repository")
repo.clone("https://hub.oxen.ai/ox/CatDogBBox")
repo.checkout()

Remote Repo#

If you don’t want to download the data locally, you can use the RemoteRepo class to interact with a remote repository on OxenHub.

import oxen

repo = RemoteRepo("https://hub.oxen.ai/ox/CatDogBBox")

To stage and commit files to a specific version of the data, you can checkout an existing branch or create a new one.

repo.create_branch("dev")
repo.checkout("dev")

You can then stage files to the remote repository by specifying the file path and destination directory.

repo.add("new-cat.png", "images") # Stage to images/new-cat.png on remote
repo.commit("Adding another training image")

Note that no β€œpush” command is required here, since the above code creates a commit directly on the remote branch.