datafs.core package
Submodules
datafs.core.data_api module
-
class datafs.core.data_api.DataAPI(default_versions=None, **kwargs)
Bases: object
-
DefaultAuthorityName = None
-
batch_get_archive(archive_names, default_versions=None)
Batch version of get_archive()
Parameters:
Returns: archives – List of DataArchive objects. If an archive is not found, it is omitted (batch_get_archive does not raise a KeyError on invalid archive names).
Return type:
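The difference between the single and batch lookups can be illustrated with a plain dictionary. This is only a sketch of the documented behavior (raise on a missing name versus silently omit it); `get_one`, `get_many`, and the `archives` dictionary are hypothetical stand-ins and are not part of DataFS.

```python
# Hypothetical stand-in for the archives known to an API object.
archives = {"a": "archive-a", "b": "archive-b"}


def get_one(name):
    """Mimic get_archive: a missing name raises KeyError."""
    return archives[name]


def get_many(names):
    """Mimic batch_get_archive: missing names are omitted, not raised."""
    return [archives[name] for name in names if name in archives]


found = get_many(["a", "missing", "b"])
print(found)
```

With this shape, a caller that needs to know which names were missing has to compare the result against the requested names itself.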
-
cache
-
create(archive_name, authority_name=None, versioned=True, raise_on_err=True, metadata=None, tags=None, helper=False)
Create a DataFS archive
Parameters:
- archive_name (str) – Name of the archive
- authority_name (str) – Name of the data service to use as the archive’s data authority
- versioned (bool) – If true, store all versions with explicit version numbers (default True)
- raise_on_err (bool) – Raise an error if the archive already exists (default True)
- metadata (dict) – Dictionary of additional archive metadata
- helper (bool) – If true, interactively prompt for required metadata (default False)
-
delete_archive(archive_name)
Delete an archive
Parameters: archive_name (str) – Name of the archive to delete
-
filter(pattern=None, engine='path', prefix=None)
Performs a filtered search over the entire universe of archives according to pattern or prefix.
Parameters:
- prefix (str) – String matching the beginning characters of the archive or set of archives you are filtering on
- pattern (str) – String matching characters within the archive or set of archives you are filtering on
- engine (str) – One of ‘str’, ‘path’, or ‘regex’, indicating the type of pattern you are filtering on
Returns:
Return type: generator
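The three engine values can be illustrated with a standalone generator over a list of names. This is a sketch, not the DataFS implementation: it assumes ‘str’ means substring matching, ‘path’ means shell-style wildcards (via `fnmatch`), and ‘regex’ means `re.search`. The helper `filter_names` and the sample archive names are hypothetical.

```python
import fnmatch
import re


def filter_names(names, pattern=None, engine="path", prefix=None):
    """Yield names that match an optional prefix and an optional pattern."""
    if engine not in ("str", "path", "regex"):
        raise ValueError("engine must be 'str', 'path', or 'regex'")
    for name in names:
        if prefix is not None and not name.startswith(prefix):
            continue
        if pattern is None:
            yield name
        elif engine == "str" and pattern in name:
            # Assumed meaning of 'str': plain substring match.
            yield name
        elif engine == "path" and fnmatch.fnmatchcase(name, pattern):
            # Assumed meaning of 'path': shell-style wildcards.
            yield name
        elif engine == "regex" and re.search(pattern, name):
            yield name


names = ["team1/raw/a.csv", "team1/clean/a.csv", "team2/raw/b.csv"]
by_prefix = list(filter_names(names, prefix="team1"))
by_path = list(filter_names(names, pattern="*/raw/*", engine="path"))
by_regex = list(filter_names(names, pattern=r"team\d/clean", engine="regex"))
by_str = list(filter_names(names, pattern="raw", engine="str"))
```

Like the documented method, the sketch returns a generator, so results are produced lazily and must be wrapped in `list()` to be reused.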
-
get_archive(archive_name, default_version=None)
Retrieve a data archive
Parameters:
- archive_name (str) – Name of the archive to retrieve
- default_version (version) – str or StrictVersion giving the default version number to be used on read operations
Returns: archive – New DataArchive object
Return type:
Raises: KeyError – A KeyError is raised when the archive_name is not found
-
static hash_file(f)
Utility function for hashing file contents
Overload this function to change the file equality checking algorithm
Parameters: f (file-like) – File-like object or file path from which to compute checksum value
Returns: checksum – dictionary with {‘algorithm’: ‘md5’, ‘checksum’: hexdigest}
Return type: dict
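A function returning a dictionary of the shape described above could look like the following. This is a standalone sketch built on the standard library `hashlib` module, not the DataFS source. The name `hash_file_sketch` and the `chunk_size` parameter are assumptions introduced for illustration.

```python
import hashlib
import io


def hash_file_sketch(f, chunk_size=65536):
    """Return {'algorithm': 'md5', 'checksum': hexdigest} for a path or binary file object."""
    md5 = hashlib.md5()

    def consume(stream):
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            md5.update(chunk)

    if hasattr(f, "read"):
        consume(f)
    else:
        # Anything without a read() method is treated as a file path.
        with open(f, "rb") as stream:
            consume(stream)

    return {"algorithm": "md5", "checksum": md5.hexdigest()}


result = hash_file_sketch(io.BytesIO(b"hello"))
print(result)
```

An overloading class would keep the same return shape and change only the algorithm name and digest, so that checksums recorded with different algorithms remain distinguishable.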
-
listdir(location, authority_name=None)
List archive path components at a given location
Note
When using listdir on versioned archives, listdir will provide the version numbers when a full archive path is supplied as the location argument. This is because DataFS stores the archive path as a directory and the versions as the actual files when versioning is on.
Parameters:
- location (str) – Path of the “directory” to search. location can be a path relative to the authority root (e.g. /MyFiles/Data) or can include the authority as a protocol (e.g. my_auth://MyFiles/Data). If the authority is specified as a protocol, the authority_name argument is ignored.
- authority_name (str) – Name of the authority to search (optional). If no authority is specified, the default authority is used (if only one authority is attached or if DefaultAuthorityName is assigned).
Returns: Archive path components that exist at the given “directory” location on the specified authority
Return type:
Raises: ValueError – A ValueError is raised if the authority is ambiguous or invalid
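The precedence rule for the two ways of naming an authority can be sketched as a small parser. This is an illustration of the documented rule only. The helper `split_location` is hypothetical, and how DataFS itself parses the string is not shown in this reference.

```python
def split_location(location, authority_name=None):
    """Return (authority_name, path_components) for a location string."""
    if "://" in location:
        # An authority given as a protocol overrides the authority_name argument.
        authority_name, location = location.split("://", 1)
    components = [part for part in location.split("/") if part]
    return authority_name, components


with_protocol = split_location("my_auth://MyFiles/Data", authority_name="other")
relative = split_location("/MyFiles/Data")
print(with_protocol, relative)
```

When the sketch returns `None` as the authority, a caller would still have to resolve the default authority, which is the step where the documented ValueError for an ambiguous authority would apply.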
-
manager
-
-
exception datafs.core.data_api.PermissionError
Bases: exceptions.NameError
datafs.core.data_archive module
-
class datafs.core.data_archive.DataArchive(api, archive_name, authority_name, archive_path, versioned=True, default_version=None)
Bases: object
Set tags for a given archive
-
archive_path
-
delete()
Delete the archive
Warning
Deleting an archive will erase all data and metadata permanently. For help setting user permissions, see Administrative Tools
Deletes tags for a given archive
-
download(filepath, version=None)
Downloads a file from authority to local path
- First checks the cache to see whether the file is there and, if so, whether it is up to date
- If it is not up to date, downloads the file to the cache
-
get_dependencies(version=None)
Parameters: version (str) – String representing the version number whose dependencies you are looking up
-
get_local_path(*args, **kwds)
Returns a local path for read/write
Parameters:
- version (str) – Version number of the file to retrieve (default latest)
- bumpversion (str) – Version component to update on write if archive is versioned. Valid bumpversion values are ‘major’, ‘minor’, and ‘patch’, representing the three components of the strict version numbering system (e.g. “1.2.3”). If bumpversion is None the version number is not updated on write. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, bumpversion is ignored.
- prerelease (str) – Prerelease component of archive version to update on write if archive is versioned. Valid prerelease values are ‘alpha’ and ‘beta’. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, prerelease is ignored.
- metadata (dict) – Updates to archive metadata. Pass {key: None} to remove a key from the archive’s metadata.
Returns a list of tags for the archive
-
get_version_path(version=None)
Returns a storage path for the archive and version
If the archive is versioned, the version number is used as the file path and the archive path is the directory. If not, the archive path is used as the file path.
Parameters: version (str or object) – Version number to use as file name on versioned archives (default latest unless default_version set)
Examples
>>> arch = DataArchive(None, 'arch', None, 'a1', versioned=False)
>>> print(arch.get_version_path())
a1
>>>
>>> ver = DataArchive(None, 'ver', None, 'a2', versioned=True)
>>> print(ver.get_version_path('0.0.0'))
a2/0.0
>>>
>>> print(ver.get_version_path('0.0.1a1'))
a2/0.0.1a1
>>>
>>> print(ver.get_version_path('latest'))
Traceback (most recent call last):
...
AttributeError: 'NoneType' object has no attribute 'manager'
-
getmeta(version=None, *args, **kwargs)
Get the value of a filesystem meta value, if it exists
-
open(*args, **kwds)
Opens a file for read/write
Parameters:
- mode (str) – Specifies the mode in which the file is opened (default ‘r’)
- version (str) – Version number of the file to open (default latest)
- bumpversion (str) – Version component to update on write if archive is versioned. Valid bumpversion values are ‘major’, ‘minor’, and ‘patch’, representing the three components of the strict version numbering system (e.g. “1.2.3”). If bumpversion is None the version number is not updated on write. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, bumpversion is ignored.
- prerelease (str) – Prerelease component of archive version to update on write if archive is versioned. Valid prerelease values are ‘alpha’ and ‘beta’. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, prerelease is ignored.
- metadata (dict) – Updates to archive metadata. Pass {key: None} to remove a key from the archive’s metadata.
Additional args and kwargs are sent to the file system opener
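The metadata convention described above, where a value of None removes a key, can be sketched as a dictionary merge. This illustrates the documented rule only. `apply_metadata_updates` is a hypothetical helper, not a DataFS function, and the sample keys are invented.

```python
def apply_metadata_updates(current, updates):
    """Return a copy of current with updates applied; None values delete keys."""
    merged = dict(current)
    for key, value in updates.items():
        if value is None:
            # {key: None} removes the key, if it is present.
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged


before = {"source": "survey", "units": "m"}
after = apply_metadata_updates(before, {"units": None, "reviewed": True})
print(after)
```

One consequence of this convention is that None cannot be stored as an actual metadata value.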
-
update(filepath, cache=False, remove=False, bumpversion=None, prerelease=None, dependencies=None, metadata=None, message=None)
Enter a new version to a DataArchive
Parameters:
- filepath (str) – The path to the file on your local file system
- cache (bool) – Turn on caching for this archive if not already on before update
- remove (bool) – removes a file from your local directory
- bumpversion (str) – Version component to update on write if archive is versioned. Valid bumpversion values are ‘major’, ‘minor’, and ‘patch’, representing the three components of the strict version numbering system (e.g. “1.2.3”). If bumpversion is None the version number is not updated on write. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, bumpversion is ignored.
- prerelease (str) – Prerelease component of archive version to update on write if archive is versioned. Valid prerelease values are ‘alpha’ and ‘beta’. Either bumpversion or prerelease (or both) must be a non-None value. If the archive is not versioned, prerelease is ignored.
- metadata (dict) – Updates to archive metadata. Pass {key: None} to remove a key from the archive’s metadata.
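The effect of the three bumpversion values on a strict version number can be sketched as follows. This is a simplified illustration and not the DataFS versioning code. It assumes the common convention that components below the bumped one reset to zero, and it ignores the prerelease component entirely. The function name `bump` is hypothetical.

```python
def bump(version, bumpversion):
    """Return the next 'major.minor.patch' string for the given component."""
    major, minor, patch = (int(part) for part in version.split("."))
    if bumpversion == "major":
        # Assumption: lower components reset to zero.
        return "{}.0.0".format(major + 1)
    if bumpversion == "minor":
        return "{}.{}.0".format(major, minor + 1)
    if bumpversion == "patch":
        return "{}.{}.{}".format(major, minor, patch + 1)
    raise ValueError("bumpversion must be 'major', 'minor', or 'patch'")


next_major = bump("1.2.3", "major")
next_minor = bump("1.2.3", "minor")
next_patch = bump("1.2.3", "patch")
print(next_major, next_minor, next_patch)
```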
-
versioned
datafs.core.data_file module
-
datafs.core.data_file.get_local_path(*args, **kwds)
Context manager for retrieving a system path for I/O and updating on change
Parameters: