wildboar.datasets._repository
Module Contents
Classes
Bundle | Base class for handling dataset bundles
JSONRepository | A repository is a collection of bundles
NpBundle | Bundle of NumPy binary files
Repository | A repository is a collection of bundles
- class wildboar.datasets._repository.Bundle(*, key, version, name, tag=None, arrays=None, description=None, collections=None)
Base class for handling dataset bundles.
- Parameters:
key (str) – A unique key of the bundle
version (str) – The version of the bundle
name (str) – Human-readable name of the bundle
description (str) – Description of the bundle
arrays (list) – The arrays of the dataset
- list(archive, collection=None)
List all datasets in this bundle.
- Parameters:
archive (ZipFile) – The bundle file
collection (str, optional) – The collection name
- Returns:
dataset_names – A sorted list of datasets in the bundle
- Return type:
list
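The listing step can be illustrated without wildboar itself. The following is a minimal sketch, assuming each dataset is stored in the zip archive as a member named `<dataset>.npy`; that naming scheme and the helper `list_dataset_names` are hypothetical and are not taken from the library.

```python
import io
import zipfile


def list_dataset_names(archive):
    """Return a sorted list of dataset names found in a zip archive.

    Assumption: each dataset is a member named ``<dataset>.npy``.
    This layout is hypothetical and only serves the illustration.
    """
    suffix = ".npy"
    names = {
        member[: -len(suffix)]
        for member in archive.namelist()
        if member.endswith(suffix)
    }
    return sorted(names)


# Build a small in-memory archive to exercise the helper.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("GunPoint.npy", b"")
    archive.writestr("Beef.npy", b"")
    archive.writestr("README.txt", b"")  # ignored: not a dataset member

with zipfile.ZipFile(buffer) as archive:
    dataset_names = list_dataset_names(archive)
```

Members that do not match the assumed suffix are skipped, and the result is sorted to match the documented return value.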
- load(name, archive)
Load a dataset from the bundle.
- Parameters:
name (str) – Name of the dataset
archive (ZipFile) – The zip-file bundle
- Returns:
x (ndarray) – Data samples
y (ndarray) – Data labels
n_training_samples (int) – Number of samples intended for training. The value is at most x.shape[0]
extras (dict, optional) – Extra numpy arrays
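The returned values can be turned into a training part and a test part. The sketch below assumes that the first `n_training_samples` rows of `x` and `y` are the training samples and the rest are test samples; the documentation above only states the count, so this ordering is an assumption, and `split_train_test` is a hypothetical helper.

```python
import numpy as np


def split_train_test(x, y, n_training_samples):
    """Split samples and labels at index ``n_training_samples``.

    Assumption: training samples come first, test samples after them.
    """
    if not 0 <= n_training_samples <= x.shape[0]:
        raise ValueError("n_training_samples must be in [0, x.shape[0]]")
    x_train, x_test = x[:n_training_samples], x[n_training_samples:]
    y_train, y_test = y[:n_training_samples], y[n_training_samples:]
    return x_train, x_test, y_train, y_test


# Toy data standing in for the values returned by ``load``.
x = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 1, 0, 1, 0, 1])
x_train, x_test, y_train, y_test = split_train_test(x, y, 4)
```

When `n_training_samples` equals `x.shape[0]`, the test part is simply empty.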
- class wildboar.datasets._repository.JSONRepository(url)
Bases: Repository
A repository is a collection of bundles.
- property download_url
The URL template for downloading bundles.
- Returns:
the download URL
- Return type:
str
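As an illustration of what a URL template could look like, the sketch below fills a template with `str.format`. The template string, the placeholder names `{key}` and `{version}`, and the helper `bundle_url` are all hypothetical; the placeholders that an actual repository uses are not specified in this reference.

```python
# Hypothetical template; the placeholder names are assumptions.
download_url = "https://example.org/bundles/{key}/{key}-v{version}.zip"


def bundle_url(template, *, key, version):
    """Fill the assumed placeholders of a download URL template."""
    return template.format(key=key, version=version)


url = bundle_url(download_url, key="toy", version="1.0")
```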
- property version
The repository version.
- Returns:
the version of the repository
- Return type:
str
- class wildboar.datasets._repository.NpBundle(*, key, version, name, tag=None, arrays=None, description=None, collections=None)
Bases: Bundle
Bundle of NumPy binary files.
- Parameters:
key (str) – A unique key of the bundle
version (str) – The version of the bundle
name (str) – Human-readable name of the bundle
description (str) – Description of the bundle
arrays (list) – The arrays of the dataset
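To make the idea of a zip archive holding NumPy binary files concrete, here is a minimal sketch that writes two arrays into an in-memory archive in `.npy` format and reads them back. The member names `toy_x.npy` and `toy_y.npy` and both helper functions are hypothetical; the layout that NpBundle actually expects is not described in this reference.

```python
import io
import zipfile

import numpy as np


def write_array(archive, member, array):
    """Serialize ``array`` in .npy format and store it under ``member``."""
    payload = io.BytesIO()
    np.save(payload, array)
    archive.writestr(member, payload.getvalue())


def read_array(archive, member):
    """Read one .npy member of the archive back into an ndarray."""
    return np.load(io.BytesIO(archive.read(member)))


# Hypothetical member names, chosen only for this illustration.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    write_array(archive, "toy_x.npy", np.array([[1.0, 2.0], [3.0, 4.0]]))
    write_array(archive, "toy_y.npy", np.array([0, 1]))

with zipfile.ZipFile(buffer) as archive:
    x = read_array(archive, "toy_x.npy")
    y = read_array(archive, "toy_y.npy")
```

Reading each member fully into a `BytesIO` buffer keeps the example independent of whether the zip member stream supports seeking.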
- class wildboar.datasets._repository.Repository
A repository is a collection of bundles.
- abstract property download_url
The URL template for downloading bundles.
- Returns:
the download URL
- Return type:
str
- abstract property name
Name of the repository.
- Returns:
the name of the repository
- Return type:
str
- abstract property version
The repository version.
- Returns:
the version of the repository
- Return type:
str
- abstract property wildboar_requires
The minimum required wildboar version.
- Returns:
the minimum required version
- Return type:
str
- get_bundle(key)
Get a bundle with the specified key.
- Parameters:
key (str) – Key of the bundle
- Returns:
bundle – A bundle or None
- Return type:
Bundle, optional
- abstract get_bundles()
Get all bundles.
- Returns:
a dictionary mapping each bundle key to its bundle
- Return type:
dict
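The relationship between the abstract members and the key lookup can be sketched with a stand-in class. The code below does not import wildboar: `SketchRepository` only mirrors the member names documented above, and `InMemoryRepository` is a hypothetical subclass. Implementing `get_bundle` as a dictionary lookup on `get_bundles()` is an assumption about how "a bundle or None" could be realized, not a description of the library's code.

```python
from abc import ABC, abstractmethod


class SketchRepository(ABC):
    """Stand-in mirroring the documented interface; not wildboar's class."""

    @property
    @abstractmethod
    def name(self):
        """Name of the repository."""

    @property
    @abstractmethod
    def version(self):
        """The repository version."""

    @property
    @abstractmethod
    def download_url(self):
        """The URL template for downloading bundles."""

    @property
    @abstractmethod
    def wildboar_requires(self):
        """The minimum required wildboar version."""

    @abstractmethod
    def get_bundles(self):
        """Return a dict mapping bundle keys to bundles."""

    def get_bundle(self, key):
        # Assumed behavior: return the bundle for `key`, or None if absent.
        return self.get_bundles().get(key)


class InMemoryRepository(SketchRepository):
    """Hypothetical repository backed by a plain dictionary."""

    def __init__(self, bundles):
        self._bundles = dict(bundles)

    @property
    def name(self):
        return "in-memory"

    @property
    def version(self):
        return "1.0"

    @property
    def download_url(self):
        # Placeholder names in this template are assumptions.
        return "https://example.org/{key}-v{version}.zip"

    @property
    def wildboar_requires(self):
        return "1.0"

    def get_bundles(self):
        return dict(self._bundles)


# Any object can stand in for a bundle in this sketch.
repository = InMemoryRepository({"toy": "toy-bundle"})
found = repository.get_bundle("toy")
missing = repository.get_bundle("unknown")
```

Because every abstract member is overridden, `InMemoryRepository` can be instantiated, while instantiating `SketchRepository` directly raises `TypeError`.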
- list_datasets(bundle, *, cache_dir, collection=None, version=None, tag=None, create_cache_dir=True, progress=True, force=False)