Managing Data Dependencies¶
Dependency graphs can be tracked explicitly in datafs, and each version can have its own dependencies.
You can specify dependencies from within Python or using the command line interface.
Note
Dependencies are not currently validated in any way, so entering a dependency that is not a valid archive name or version will not raise an error.
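Because nothing is checked at write time, a caller may want to validate a dependency dictionary before passing it in. The following is a minimal sketch of one way to do that, not part of datafs: find_invalid_dependencies and the known_versions mapping are hypothetical names, and the caller is assumed to have assembled known_versions (archive name to list of existing version strings) by other means.

```python
def find_invalid_dependencies(dependencies, known_versions):
    """Return a dict of problems keyed by archive name (empty if all are valid).

    Hypothetical helper, not part of datafs. ``known_versions`` maps each
    known archive name to the list of version strings that exist for it.
    """
    problems = {}
    for name, version in dependencies.items():
        if name not in known_versions:
            problems[name] = 'unknown archive'
        elif version is not None and version not in known_versions[name]:
            # None means "unpinned", so only pinned versions are checked
            problems[name] = 'unknown version {}'.format(version)
    return problems


# Illustrative data only; archive4 is deliberately missing from ``known``
known = {'archive2': ['1.0', '1.1'], 'archive3': ['0.1']}
deps = {'archive2': '1.1', 'archive3': None, 'archive4': '2.0'}
print(find_invalid_dependencies(deps, known))
# prints {'archive4': 'unknown archive'}
```

An empty result means every entry names a known archive and, where pinned, a known version.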
View the source for the code samples on this page in Python API: Dependencies.
Specifying Dependencies¶
On write¶
Dependencies can be set using the dependencies argument to DataArchive’s update(), open(), or get_local_path() methods.
The dependencies argument must be a dictionary with archive names as keys and version numbers as values. A value of None is also a valid dependency specification: the version is treated as unpinned and is always interpreted as the dependency’s latest version.
For example:
>>> my_archive = api.create('my_archive')
>>> with my_archive.open('w+',
...         dependencies={'archive2': '1.1', 'archive3': None}) as f:
...     res = f.write(u'contents depend on archive 2 v1.1')
...
>>> my_archive.get_dependencies()
{'archive2': '1.1', 'archive3': None}
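To make the difference between pinned and unpinned entries concrete, here is a small sketch of how a None value could be resolved at read time. It is an illustration under stated assumptions rather than datafs code: resolve_unpinned and the latest_versions mapping are hypothetical, and the version strings are made up.

```python
def resolve_unpinned(dependencies, latest_versions):
    """Replace None (unpinned) versions with the current latest version.

    Hypothetical helper, not part of datafs. ``latest_versions`` maps
    archive names to their latest version string.
    """
    resolved = {}
    for name, version in dependencies.items():
        if version is None:
            # Unpinned: follow whatever is latest right now
            resolved[name] = latest_versions[name]
        else:
            # Pinned: keep the recorded version unchanged
            resolved[name] = version
    return resolved


deps = {'archive2': '1.1', 'archive3': None}
print(resolve_unpinned(deps, {'archive2': '1.2', 'archive3': '0.3'}))
# prints {'archive2': '1.1', 'archive3': '0.3'}
```

The pinned entry stays at 1.1 even though a newer version exists, while the unpinned entry changes whenever a new version of archive3 is written.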
After write¶
Dependencies can also be added to the latest version of an archive using the set_dependencies() method:
>>> with my_archive.open('w+') as f:
...     res = f.write(u'contents depend on archive 2 v1.2')
...
>>> my_archive.set_dependencies({'archive2': '1.2'})
>>> my_archive.get_dependencies()
{'archive2': '1.2'}
Using a requirements file¶
If a requirements file is present at api creation, all archives written with that api object will have the specified dependencies by default.
For example, with the following requirements file saved as requirements_data.txt:
dep1==1.0
dep2==0.4.1a3
Archives written while in this working directory will have these requirements:
>>> import datafs
>>> api = datafs.get_api(requirements='requirements_data.txt')
>>> my_archive = api.get_archive('my_archive')
>>> with my_archive.open('w+') as f:
...     res = f.write(u'depends on dep1 and dep2')
...
>>> my_archive.get_dependencies()
{'dep1': '1.0', 'dep2': '0.4.1a3'}
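The mapping from requirements lines to the dependency dictionary can be sketched as below. This is not datafs’s own parser: parse_requirements is a hypothetical name, and treating lines without == as unpinned (None) and skipping # comment lines are assumptions made for the illustration.

```python
def parse_requirements(text):
    """Parse 'name==version' lines into a dependency dictionary.

    Hypothetical sketch, not datafs's parser. Assumptions: blank lines and
    lines starting with '#' are skipped, and a line without '==' is treated
    as an unpinned dependency (version None).
    """
    dependencies = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        # partition returns an empty separator when '==' is absent
        name, sep, version = line.partition('==')
        dependencies[name.strip()] = version.strip() if sep else None
    return dependencies


requirements_text = """dep1==1.0
dep2==0.4.1a3
"""
print(parse_requirements(requirements_text))
# prints {'dep1': '1.0', 'dep2': '0.4.1a3'}
```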
Using Dependencies¶
Retrieve dependencies with DataArchive’s get_dependencies() method:
>>> my_archive.get_dependencies()
{'dep1': '1.0', 'dep2': '0.4.1a3'}
Get dependencies for older versions using the version argument:
>>> my_archive.get_dependencies(version='0.0.1')
{'archive2': '1.1', 'archive3': None}