A package manager for machine learning datasets and models.
Design of the overall architecture and documentation to start onboarding contributors.
Current state of the Architecture
The CLI uses either a legacy (interactive) command-line interface or a ratatui frontend. The latter is
experimental and hidden behind the feature flag tui.
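As a rough illustration of how a compile-time feature flag and the runtime `--tui` option could work together, here is a minimal sketch. The `Frontend` enum, the `select_frontend` function, and the silent fallback to the command line are assumptions made for this example, not taken from the Nebula sources.

```rust
/// Which user interface the binary should start (hypothetical type).
#[derive(Debug, PartialEq)]
enum Frontend {
    /// The legacy (optionally interactive) command-line tool.
    CommandLine,
    /// The experimental ratatui-based terminal UI.
    Tui,
}

/// Picks the frontend. `tui_requested` stands for the `--tui` option.
/// `cfg!(feature = "tui")` is evaluated at compile time and is true only
/// when the crate was built with the `tui` feature enabled.
fn select_frontend(tui_requested: bool) -> Frontend {
    if tui_requested && cfg!(feature = "tui") {
        Frontend::Tui
    } else {
        // Assumption: fall back to the command line instead of failing.
        Frontend::CommandLine
    }
}

fn main() {
    let tui_requested = std::env::args().any(|arg| arg == "--tui");
    println!("starting frontend: {:?}", select_frontend(tui_requested));
}
```

With Cargo, the feature would typically be switched on with `cargo run --features tui -- --tui`; without it, this sketch always selects the command-line frontend.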
The Nebula CLI provides a set of commands to interact with the Nebula package manager.
The following is copied from the output of nebula_cli --help:
A package manager for machine learning datasets and models acting as client for Nebula registries.
Usage: nebula_cli [OPTIONS] [COMMAND]
Commands:
  init       init a virtual environment in the given folder (not yet)
  status     prints status information (not yet)
  install    Installs a package (not yet)
  update     Updates a specific package or all packages (not yet)
  uninstall  Uninstall a specific package or all packages (not yet)
  search     Searches packages by complex criteria (not yet)
  list       List packages that fit simple criteria e.g.(non)-installed,
  sync       Sync the local cache with the remote registry
  help       Print this message or the help of the given subcommand(s)

Options:
      --tui                 use a [ratatui] based terminal user interface instead of a simple cmd-tool
  -i, --interactive         start the cmd-tool in interactive mode, that allows typing multiple commands
  -v, --verbose             use verbose output, only in non TUI mode
  -t, --tick-rate <FLOAT>   Tick rate, i.e. number of ticks per second in tui [default: 4]
  -f, --frame-rate <FLOAT>  Frame rate, i.e. number of frames per second in tui [default: 60]
  -h, --help                Print help
  -V, --version             Print version
help # Shows help for a specific command
Examples:
nebula sync # gets the newest metadata locally from the remote registry
nebula search climate_data # Search for packages related to climate data
nebula install neural_net_model_v2 --version 1.0.1 # Install a specific version of a model
nebula install climate_dataset_2023 # Install the latest version of a dataset
nebula update --all # Update all installed datasets and models
nebula uninstall outdated_model # Remove an outdated model
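To make the option handling above concrete, the following sketch parses the documented global options with the standard library only, so that it stays self-contained. The `Options` struct and the `parse_args` and `parse_rate` functions are hypothetical names for this example; the README does not show how the real tool parses its arguments, and it may well rely on an argument-parsing crate instead. The defaults (4 ticks and 60 frames per second) are taken from the help text.

```rust
/// Global options as listed in the help output (hypothetical struct).
#[derive(Debug, PartialEq)]
struct Options {
    tui: bool,
    interactive: bool,
    verbose: bool,
    tick_rate: f64,
    frame_rate: f64,
    /// Subcommand followed by its own arguments, e.g. ["install", "foo"].
    command: Vec<String>,
}

/// Parses the value that must follow `-t`/`-f`.
fn parse_rate(flag: &str, value: Option<String>) -> Result<f64, String> {
    let raw = value.ok_or_else(|| format!("{flag} expects a value"))?;
    raw.parse::<f64>()
        .map_err(|_| format!("{flag}: '{raw}' is not a number"))
}

/// Parses the arguments that follow the program name.
fn parse_args<I: IntoIterator<Item = String>>(args: I) -> Result<Options, String> {
    let mut opts = Options {
        tui: false,
        interactive: false,
        verbose: false,
        tick_rate: 4.0,   // default from the help text
        frame_rate: 60.0, // default from the help text
        command: Vec::new(),
    };
    let mut iter = args.into_iter();
    while let Some(arg) = iter.next() {
        if !arg.starts_with('-') {
            // The first non-option token starts the subcommand;
            // everything after it is passed through untouched.
            opts.command.push(arg);
            opts.command.extend(iter.by_ref());
            break;
        }
        match arg.as_str() {
            "--tui" => opts.tui = true,
            "-i" | "--interactive" => opts.interactive = true,
            "-v" | "--verbose" => opts.verbose = true,
            "-t" | "--tick-rate" => opts.tick_rate = parse_rate(&arg, iter.next())?,
            "-f" | "--frame-rate" => opts.frame_rate = parse_rate(&arg, iter.next())?,
            other => return Err(format!("unknown option: {other}")),
        }
    }
    Ok(opts)
}

fn main() {
    match parse_args(std::env::args().skip(1)) {
        Ok(opts) => println!("{opts:?}"),
        Err(msg) => eprintln!("error: {msg}"),
    }
}
```

In this sketch, global options must appear before the subcommand; anything after the subcommand name (such as `--version 1.0.1` in the install example) is left for the subcommand to interpret.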
The Nebula CLI communicates with the registry via gRPC, using Tonic. The registry can be self-hosted if desired, and the registry URL can be configured through the CLI.
For now, the registry supports the following endpoints:
service NebulaPackageQuery {
  // Gets detailed information for one specific package
  rpc GetPackageInfo (PackageRequest) returns (PackageInfo);
  // List all packages with very simple search criteria
  rpc ListPackages (ListPackagesRequest) returns (PackageList);
  // Search packages applying several filters
  rpc SearchPackages (SearchPackagesRequest) returns (PackageList);
}
For more information see the proto file.
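To give new contributors a feel for the shape of this service, here is a synchronous, in-memory sketch of the three endpoints as a plain Rust trait. It is not the Tonic-generated code: real Tonic services are asynchronous and wrap messages in request and response types. The message and service names come from the listing above, but every field (`name`, `version`, `description`, `limit`, `query`), the `InMemoryRegistry` type, and the sample data are assumptions made for illustration; the actual definitions live in the proto file.

```rust
use std::collections::BTreeMap;

// All message fields below are assumptions; see the proto file for the real ones.
#[derive(Debug, Clone, PartialEq)]
struct PackageInfo {
    name: String,
    version: String,
    description: String,
}
struct PackageRequest {
    name: String,
}
struct ListPackagesRequest {
    limit: usize,
}
struct SearchPackagesRequest {
    query: String,
}
#[derive(Debug, PartialEq)]
struct PackageList {
    packages: Vec<PackageInfo>,
}

/// Synchronous stand-in for the three RPCs of the service.
trait NebulaPackageQuery {
    fn get_package_info(&self, req: PackageRequest) -> Option<PackageInfo>;
    fn list_packages(&self, req: ListPackagesRequest) -> PackageList;
    fn search_packages(&self, req: SearchPackagesRequest) -> PackageList;
}

/// Hypothetical registry that keeps its metadata in memory, keyed by name.
struct InMemoryRegistry {
    packages: BTreeMap<String, PackageInfo>,
}

impl InMemoryRegistry {
    fn new(items: Vec<PackageInfo>) -> Self {
        let packages = items.into_iter().map(|p| (p.name.clone(), p)).collect();
        Self { packages }
    }
}

impl NebulaPackageQuery for InMemoryRegistry {
    fn get_package_info(&self, req: PackageRequest) -> Option<PackageInfo> {
        self.packages.get(&req.name).cloned()
    }

    fn list_packages(&self, req: ListPackagesRequest) -> PackageList {
        // Returns at most `limit` packages, in name order.
        let packages = self.packages.values().take(req.limit).cloned().collect();
        PackageList { packages }
    }

    fn search_packages(&self, req: SearchPackagesRequest) -> PackageList {
        // Case-insensitive substring match on name and description.
        let needle = req.query.to_lowercase();
        let packages = self
            .packages
            .values()
            .filter(|p| {
                p.name.to_lowercase().contains(&needle)
                    || p.description.to_lowercase().contains(&needle)
            })
            .cloned()
            .collect();
        PackageList { packages }
    }
}

/// Made-up sample metadata, reusing package names from the CLI examples.
fn sample_registry() -> InMemoryRegistry {
    let pkg = |name: &str, version: &str, description: &str| PackageInfo {
        name: name.to_string(),
        version: version.to_string(),
        description: description.to_string(),
    };
    InMemoryRegistry::new(vec![
        pkg("neural_net_model_v2", "1.0.1", "Example model"),
        pkg("climate_dataset_2023", "0.1.0", "Example climate dataset"),
    ])
}

fn main() {
    let registry = sample_registry();
    let hits = registry.search_packages(SearchPackagesRequest { query: "climate".to_string() });
    for p in hits.packages {
        println!("{} {}", p.name, p.version);
    }
}
```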
The datasets and models themselves are stored elsewhere for now; based on the URL, the client is expected to retrieve them by sending further GET requests.
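The follow-up download step could look roughly like the sketch below, which splits a plain `http://` URL into host, port, and path and builds the text of the corresponding HTTP/1.1 GET request. It only constructs the request and does not touch the network. The function names and the example URL are hypothetical, HTTPS and IPv6 hosts are deliberately not handled, and the real client may well use an HTTP library instead.

```rust
/// Splits a plain `http://` URL into (host, port, path).
/// Returns None for other schemes, an empty host, or an invalid port.
fn parse_http_url(url: &str) -> Option<(String, u16, String)> {
    let rest = url.strip_prefix("http://")?;
    // Everything up to the first '/' is the authority; the rest is the path.
    let (authority, path) = match rest.find('/') {
        Some(idx) => (&rest[..idx], &rest[idx..]),
        None => (rest, "/"),
    };
    let (host, port) = match authority.rsplit_once(':') {
        Some((host, port)) => (host, port.parse::<u16>().ok()?),
        None => (authority, 80), // default HTTP port
    };
    if host.is_empty() {
        return None;
    }
    Some((host.to_string(), port, path.to_string()))
}

/// Builds the request text for a simple, non-persistent GET.
fn build_get_request(host: &str, port: u16, path: &str) -> String {
    // The Host header carries the port only when it is not the default.
    let host_header = if port == 80 {
        host.to_string()
    } else {
        format!("{host}:{port}")
    };
    format!("GET {path} HTTP/1.1\r\nHost: {host_header}\r\nConnection: close\r\n\r\n")
}

fn main() {
    // Hypothetical download URL, as it might be returned in package metadata.
    let url = "http://example.com/packages/climate_dataset_2023.tar.gz";
    match parse_http_url(url) {
        Some((host, port, path)) => print!("{}", build_get_request(&host, port, &path)),
        None => eprintln!("unsupported URL: {url}"),
    }
}
```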
In the distant future, we might implement a web interface for the Nebula registry.
The following contributors have helped to start this project, contributed code, actively maintain it (including documentation), or have been awesome contributors in other ways. We'd like to take a moment to recognize them.
The BSD 3-Clause License.