Datasets.Dataset_utils
Utilities for downloading and managing datasets.
Return the platform-specific cache directory path for the given dataset.
The default location is "~/.cache/ocannl/datasets/dataset_name/".
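As a rough illustration of how the default location could be assembled, here is a minimal sketch. The names `cache_dir_under` and `default_cache_dir` are hypothetical (this page does not show the real function name or signature), and the sketch only covers the HOME-based default, not any other platform-specific lookup the module may perform.

```ocaml
(* Hypothetical helper: joins the path components of the documented default
   location under an explicitly supplied home directory. *)
let cache_dir_under ~home dataset_name =
  List.fold_left Filename.concat home
    [ ".cache"; "ocannl"; "datasets"; dataset_name ]
  ^ Filename.dir_sep

(* Assumption: HOME identifies the home directory; if it is unset, the
   sketch falls back to the current directory. *)
let default_cache_dir dataset_name =
  let home =
    Option.value (Sys.getenv_opt "HOME") ~default:Filename.current_dir_name
  in
  cache_dir_under ~home dataset_name
```

Separating the pure path computation from the environment lookup keeps the first function easy to exercise with a fixed home directory.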
Parameters
Returns
Download a file from a URL to a destination path.
Creates parent directories as needed, downloads the file from url, and saves it to dest_path.
Parameters
Raises
Failure on download or write error.
Ensure a file exists at the given path, downloading if necessary.
Checks if dest_path exists. If not, downloads the file from url.
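The check-then-download behaviour can be sketched as follows. The name `ensure_file_with` and the `download` callback are hypothetical: the callback stands in for this module's download function, whose exact name and signature are not shown on this page.

```ocaml
(* Hypothetical sketch: call [download] only when [dest_path] is missing.
   [download] is assumed to take ~url and ~dest_path and to raise Failure
   on download or write errors, which this function lets propagate. *)
let ensure_file_with ~download ~url ~dest_path =
  if not (Sys.file_exists dest_path) then download ~url ~dest_path
```

Passing the download step as an argument is a choice made for this sketch only; it lets the existence check be tried out without network access.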
Parameters
Raises
Failure on download or write error.
val ensure_extracted_archive :
  url:string ->
  archive_path:string ->
  extract_dir:string ->
  check_file:string ->
  unit
Ensure an archive is downloaded, extracted, and a file exists.
Checks if check_file (relative to extract_dir) exists. If not, downloads the archive from url to archive_path, extracts it into extract_dir, and verifies check_file is present. Currently supports only .tar.gz archives.
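The sequence described above (check, download, extract, verify) might look roughly like this. The function `ensure_extracted_with` and its `download` and `extract` callbacks are hypothetical names introduced for illustration; how the real function performs the download and the .tar.gz extraction is not shown on this page.

```ocaml
(* Hypothetical sketch of the documented control flow. [download] and
   [extract] are assumed callbacks standing in for the real steps. *)
let ensure_extracted_with ~download ~extract ~url ~archive_path ~extract_dir
    ~check_file =
  (* check_file is interpreted relative to extract_dir. *)
  let marker = Filename.concat extract_dir check_file in
  if not (Sys.file_exists marker) then begin
    download ~url ~dest_path:archive_path;
    extract ~archive_path ~extract_dir;
    (* Verify that extraction actually produced the expected file. *)
    if not (Sys.file_exists marker) then
      failwith
        (Printf.sprintf "%s not found in %s after extraction" check_file
           extract_dir)
  end
```

Because the marker file is checked first, repeated calls after a successful extraction skip both the download and the extraction.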
Parameters
extract_dir to verify extraction.
Raises
Failure on download, extraction, or missing check_file.
Ensure a gzip-compressed file is decompressed to a target path.
If target_path exists, does nothing and returns true. Otherwise, if gz_path exists, decompresses it to target_path.
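The three outcomes (target already present, archive present, neither present) can be sketched as below. The name `ensure_decompressed_with` and the `decompress` callback are hypothetical; the callback stands in for whatever gzip routine the module uses, which this page does not specify.

```ocaml
(* Hypothetical sketch of the documented return-value logic. [decompress]
   is an assumed callback that writes the decompressed contents of
   [gz_path] to [target_path] and raises Failure on a gzip error. *)
let ensure_decompressed_with ~decompress ~gz_path ~target_path =
  if Sys.file_exists target_path then true
  else if Sys.file_exists gz_path then begin
    decompress ~gz_path ~target_path;
    (* Report whether the target exists after the operation. *)
    Sys.file_exists target_path
  end
  else false
```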
Parameters
Returns
true if target_path exists after the operation.
false if gz_path does not exist.
Raises
Failure on gzip decompression error.
Parse a CSV cell as a float.
Attempts to convert value to a float. On failure, raises Failure with a descriptive message including context ().
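A minimal sketch of this kind of parser, assuming that context is a plain string and that surrounding whitespace should be ignored (neither is confirmed by this page). The name `float_of_cell` and the exact message format are hypothetical.

```ocaml
(* Hypothetical sketch: parse a CSV cell as a float, raising Failure with
   the caller-supplied context on error. Trimming is an assumption. *)
let float_of_cell ~context value =
  match float_of_string_opt (String.trim value) with
  | Some f -> f
  | None ->
      failwith (Printf.sprintf "%s: cannot parse %S as float" context value)
```

With this sketch, `float_of_cell ~context:"row 2, column price" "abc"` raises `Failure` with the message `row 2, column price: cannot parse "abc" as float`, which points directly at the offending cell.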
Parameters
Returns
Raises
Failure if value cannot be parsed as a float.
Parse a CSV cell as an integer.
Attempts to convert value to an int. On failure, raises Failure with a descriptive message including context ().
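To show how a context string can be built per cell, here is a sketch that parses a whole comma-separated line of integers. Both `int_of_cell` and `ints_of_row` are hypothetical names, the context format is an assumption, and the naive comma split does not handle quoted fields.

```ocaml
(* Hypothetical sketch: parse one cell as an int, raising Failure with the
   caller-supplied context on error. *)
let int_of_cell ~context value =
  match int_of_string_opt (String.trim value) with
  | Some n -> n
  | None -> failwith (Printf.sprintf "%s: cannot parse %S as int" context value)

(* Parse every cell of a simple comma-separated line, recording the row
   and column index in each cell's context. *)
let ints_of_row ~row line =
  String.split_on_char ',' line
  |> List.mapi (fun col cell ->
         int_of_cell ~context:(Printf.sprintf "row %d, column %d" row col) cell)
```

Note that `int_of_string_opt` also accepts prefixed literals such as `0x1F`, so a stricter format would need an extra check.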
Parameters
Returns
Raises
Failure if value cannot be parsed as an int.
Recursively create a directory and its parents.
Creates the directory at path, along with any missing parent directories. If path already exists as a directory, does nothing.
Parameters
Raises
Unix.Unix_error if creation fails for any reason other than the directory already existing.
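Recursive creation can be sketched with the standard library alone. This is not the module's implementation: the sketch uses `Sys.mkdir` (OCaml 4.12 or later), so failures surface as `Sys_error` rather than `Unix.Unix_error`, and raising `Failure` when the path exists as a non-directory is an assumption of this sketch.

```ocaml
(* Sketch: create [path] and any missing parents; do nothing if [path]
   is already a directory. *)
let rec mkdir_p path =
  if Sys.file_exists path then begin
    if not (Sys.is_directory path) then
      failwith (path ^ " exists but is not a directory")
  end
  else begin
    let parent = Filename.dirname path in
    (* Filename.dirname returns its argument at the root, which stops
       the recursion. *)
    if parent <> path then mkdir_p parent;
    (* Tolerate a concurrent creation of the same directory. *)
    try Sys.mkdir path 0o755
    with Sys_error _ when Sys.file_exists path && Sys.is_directory path -> ()
  end
```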