Managing Data Dependencies

Dependency graphs can be tracked explicitly in datafs, and each version can have its own dependencies.

You can specify dependencies from the command line interface or from within Python.

Note

Dependencies are not currently validated in any way, so entering a dependency that is not a valid archive name or version will not raise an error.

Specifying Dependencies

On write

Dependencies can be set with the --dependency option of the update command. To specify several dependencies, pass one --dependency archive[==version] argument per dependency.

Each --dependency value should have the syntax archive_name==version. Supplying only the archive name sets the version to None. None is a valid dependency specification: the version is treated as unpinned and is always interpreted as the dependency’s latest version.

For example:

$ datafs create my_archive

$ echo "contents depend on archive 2 v1.1" >> arch.txt

$ datafs update my_archive arch.txt --dependency "archive2==1.1" --dependency "archive3"

$ datafs get_dependencies my_archive
{'archive2': '1.1', 'archive3': None}
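The archive[==version] syntax used above maps naturally onto a dictionary like the one get_dependencies prints. The following is a minimal Python sketch of that mapping, not DataFS's own parser; parse_dependencies is a hypothetical helper introduced here purely for illustration.

```python
def parse_dependencies(specs):
    """Map 'archive[==version]' strings to a {name: version} dict.

    Hypothetical helper for illustration; not part of the DataFS API.
    """
    dependencies = {}
    for spec in specs:
        # Split on the first '==' only; the remainder is the version string.
        name, sep, version = spec.partition("==")
        version = version.strip()
        # No '==' (or nothing after it) means the dependency is unpinned.
        dependencies[name.strip()] = version if sep and version else None
    return dependencies


deps = parse_dependencies(["archive2==1.1", "archive3"])
print(deps)
```

An unpinned entry such as "archive3" ends up with the value None, matching the convention described above.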

After write

Dependencies can also be set on the latest version of an archive using the set_dependencies command:

$ datafs set_dependencies my_archive --dependency archive2==1.2

$ datafs get_dependencies my_archive
{'archive2': '1.2'}

Using a requirements file

If a requirements file is present at API creation, all archives written with that API object will have the specified dependencies by default.

For example, given the following requirements file, saved as requirements_data.txt:

dep1==1.0
dep2==0.4.1a3

Archives written while in this working directory will have these requirements:

$ echo "depends on dep1 and dep2" > arch.txt

$ datafs update my_archive arch.txt --requirements_file 'requirements_data.txt'

$ datafs get_dependencies my_archive
{'dep1': '1.0', 'dep2': '0.4.1a3'}
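To illustrate how a file like requirements_data.txt corresponds to the dictionary shown above, here is a rough sketch that reads one archive[==version] entry per line. parse_requirements is a hypothetical helper, and skipping blank lines is an assumption; DataFS's actual parsing may differ.

```python
def parse_requirements(text):
    """Read one 'archive[==version]' entry per line into a dict.

    Hypothetical helper for illustration; not part of the DataFS API.
    """
    dependencies = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Assumption: blank lines carry no requirement and are skipped.
            continue
        name, sep, version = line.partition("==")
        version = version.strip()
        # Entries without a version are recorded as unpinned (None).
        dependencies[name.strip()] = version if sep and version else None
    return dependencies


requirements_text = "dep1==1.0\ndep2==0.4.1a3\n"
print(parse_requirements(requirements_text))
```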

Using Dependencies

Retrieve dependencies with the get_dependencies command:

$ datafs get_dependencies my_archive
{'dep1': '1.0', 'dep2': '0.4.1a3'}

Get dependencies for older versions using the --version argument:

$ datafs get_dependencies my_archive --version 0.0.1
{'archive2': '1.1', 'archive3': None}
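One way to picture per-version dependencies is a mapping from archive version to dependency dictionary, with the latest version used when no version is requested. The sketch below is a toy model under that assumption, not DataFS's storage format; the version label 0.0.2 and the lookup_dependencies helper are hypothetical.

```python
# Toy model: each archive version carries its own dependency dict.
# The "0.0.2" label is hypothetical; only "0.0.1" appears in the example above.
history = {
    "0.0.1": {"archive2": "1.1", "archive3": None},
    "0.0.2": {"dep1": "1.0", "dep2": "0.4.1a3"},
}
latest_version = "0.0.2"


def lookup_dependencies(history, latest_version, version=None):
    """Return the dependencies of `version`, or of the latest version if omitted."""
    key = latest_version if version is None else version
    # Return a copy so callers cannot mutate the stored record.
    return dict(history[key])


print(lookup_dependencies(history, latest_version))
print(lookup_dependencies(history, latest_version, version="0.0.1"))
```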