Creating Data Archives
Archives are the basic unit of a DataFS filesystem. They are essentially files, metadata, history, versions, and dependencies wrapped into a single object.
You can create archives from the command line interface or from Python.
Create an archive using the create command:
$ datafs create my_archive
created versioned archive <DataArchive local://my_archive>
Archives can be named anything, as long as the data service you use can handle the name.
For example, Amazon’s S3 storage cannot handle underscores in object names. If you create an archive with underscores in the name, you will receive an error on write (rather than on archive creation). Since this is an error specific to the storage service, we do not catch this error on creation.
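Because the name is only rejected at write time, it can help to check it yourself before creating the archive. The following is a minimal sketch of such a pre-flight check. The rule table and the `name_is_allowed` function are hypothetical and are not part of DataFS; the patterns are illustrative assumptions, not rules taken from any storage provider's documentation.

```python
import re

# Hypothetical per-service naming rules (assumptions for illustration only).
NAME_RULES = {
    "no_underscores": re.compile(r"[^_]+"),
    "anything": re.compile(r".+", re.DOTALL),
}


def name_is_allowed(archive_name, rule):
    """Return True if the whole archive name satisfies the named rule."""
    return NAME_RULES[rule].fullmatch(archive_name) is not None


print(name_is_allowed("my_archive", "no_underscores"))  # False
print(name_is_allowed("my-archive", "no_underscores"))  # True
print(name_is_allowed("my_archive", "anything"))        # True
```

Running a check like this before `datafs create` surfaces a naming problem early instead of on the first write.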
Arbitrary metadata can be added as keyword arguments:
$ datafs create my_archive --description 'my test archive'
created versioned archive <DataArchive local://my_archive>
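One way to picture how extra `--key value` options become archive metadata is sketched here. The `parse_metadata` function is hypothetical and only illustrates the idea; it is not how the DataFS command line is actually implemented.

```python
import shlex


def parse_metadata(args):
    """Turn ['--key', 'value', ...] into {'key': 'value', ...}."""
    if len(args) % 2 != 0:
        raise ValueError("expected --key value pairs")
    metadata = {}
    for flag, value in zip(args[0::2], args[1::2]):
        if not flag.startswith("--"):
            raise ValueError("expected an option, got {!r}".format(flag))
        # Drop the leading dashes to get the metadata field name.
        metadata[flag[2:]] = value
    return metadata


# shlex.split applies shell-style quoting, so the quoted phrase stays whole.
tokens = shlex.split("--description 'my test archive'")
print(parse_metadata(tokens))  # {'description': 'my test archive'}
```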
Administrators can set up metadata requirements using the manager’s Administrative Tools. If these required fields are not provided, an error will be raised on archive creation.
For example, when connected to a manager requiring the ‘description’ field:
$ datafs create my_archive --doi '10.1038/nature15725' \
>     --author "burke"
Traceback (most recent call last):
...
AssertionError: Required value "description" not found. Use helper=True or the --helper flag for assistance.
Trying again with a --description "[desc]" argument will work as expected.
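The required-field check can be sketched as follows. This is a minimal illustration, assuming the manager's requirements are available as a plain list of field names. `REQUIRED_FIELDS` and `check_required` are hypothetical names, and only the error message is taken from the output shown above.

```python
# Assumed manager configuration: one required metadata field.
REQUIRED_FIELDS = ["description"]


def check_required(metadata, required=REQUIRED_FIELDS):
    """Raise AssertionError for the first required field that is missing."""
    for key in required:
        if key not in metadata:
            raise AssertionError(
                'Required value "{}" not found. Use helper=True or '
                "the --helper flag for assistance.".format(key)
            )


metadata = {"doi": "10.1038/nature15725", "author": "burke"}

try:
    check_required(metadata)
except AssertionError as err:
    print(err)

# Adding the missing field lets the same check pass silently.
metadata["description"] = "[desc]"
check_required(metadata)
```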
Using the Helper
Instead of providing all fields in the create call, you can optionally use the helper flag. Using the flag --helper will start an interactive prompt, requesting each required item of metadata:
$ datafs create my_archive --helper
created versioned archive <DataArchive local://my_archive>
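The behavior of an interactive helper can be sketched as a loop that asks only for the required fields that are still missing. The `prompt_for_missing` function below is hypothetical and is not the DataFS implementation. The `ask` parameter defaults to the built-in `input`, and the example swaps in scripted answers so that it runs without a terminal.

```python
def prompt_for_missing(metadata, required, ask=input):
    """Return a copy of metadata with every missing required field filled in."""
    completed = dict(metadata)
    for key in required:
        if key not in completed:
            # Only fields that were not supplied up front are requested.
            completed[key] = ask("{}: ".format(key))
    return completed


# Scripted answers stand in for a user typing at the prompt.
answers = iter(["my test archive"])
result = prompt_for_missing(
    {"author": "burke"},
    required=["description", "author"],
    ask=lambda prompt: next(answers),
)
print(result)  # {'author': 'burke', 'description': 'my test archive'}
```

Passing the prompt function as an argument keeps the sketch easy to test; in interactive use the default `input` would be called once per missing field.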