🧰 API Reference

exasol.bucketfs.Service

class exasol.bucketfs.Service(url: str, credentials: Mapping[str, Mapping[str, str]] | None = None, verify: bool | str = True, service_name: str | None = None)[source]

Bases: object

Provides a simple-to-use API for accessing a BucketFS service.

buckets

Lists all available buckets.

__init__(url: str, credentials: Mapping[str, Mapping[str, str]] | None = None, verify: bool | str = True, service_name: str | None = None)[source]

Create a new Service instance.

Parameters:
  • url – URL of the BucketFS service, e.g. http(s)://127.0.0.1:2580.

  • credentials – A mapping containing credentials (username and password) for buckets, e.g. {"bucket1": {"username": "foo", "password": "bar"}}.

  • verify – Either a boolean, in which case it controls whether we verify the server’s TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.

  • service_name – Optional name of the bucketfs service.

property buckets: MutableMapping[str, Bucket]

List all available buckets.

exasol.bucketfs.BucketLike

class exasol.bucketfs.BucketLike(*args, **kwargs)[source]

Bases: Protocol

Definition of the Bucket interface. It is compatible with both on-premises and SaaS BucketFS systems.

delete(path: str) None[source]

Deletes a file in the bucket.

Parameters:

path – Path of the file to be deleted.

Q. What happens if the path doesn't exist? A. Nothing happens; no error is raised.

Q. What happens if the path points to a directory? A. The same. There are no directories as such in BucketFS, so a directory path is just a non-existent file.

download(path: str, chunk_size: int = 8192) Iterable[ByteString][source]

Downloads a file from the bucket. The content of the file will be provided in chunks of the specified size. The full content of the file can be constructed using code similar to the line below: content = b"".join(api.download(path))

Parameters:
  • path – Path of the file in the bucket that should be downloaded.

  • chunk_size – Size of the chunks the file content will be delivered in.

Q. What happens if the file specified by the path doesn't exist? A. A BucketFsError will be raised.

Q. What happens if the path points to a directory? A. The same, since a "directory" in BucketFS is just a non-existent file.
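As a sketch of the chunking contract described above, the following stdlib-only snippet reassembles chunked content; the iter_chunks helper and the in-memory stream are illustrative stand-ins, not part of the library:

```python
import io

def iter_chunks(stream, chunk_size=8192):
    """Yield the stream content in chunks, mirroring the download() contract."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

# Reassemble the full content, as the docstring above suggests:
data = b"x" * 20000
content = b"".join(iter_chunks(io.BytesIO(data), chunk_size=8192))
assert content == data
```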

property files: Iterable[str]

Returns an iterator over the bucket files.

A usage example: print(list(bucket_api.files)) Output: [dir1/subdir1/file1.dat, dir1/subdir2/file2.txt, …]

Note that the paths will look like the example above, i.e. POSIX style, with no slash at the start or the end.

property name: str

Returns the bucket name.

property udf_path: str

Returns the path to the bucket’s base directory, as it’s seen from a UDF.

upload(path: str, data: ByteString | BinaryIO) None[source]

Uploads a file to the bucket.

Parameters:
  • path – Path in the bucket where the file should be uploaded.

  • data – Either a binary array or a binary stream, e.g. a file opened in binary mode.

Q. What happens if the parent is missing? A. The bucket doesn't care about the structure of the file's path. From the perspective of a file system, the bucket appears to create the missing parent, but in reality it just stores the data indexed by the provided path.

Q. What happens if the path points to an existing file? A. That's fine; the file will be updated.

Q. What happens if the path points to an existing directory? A. The bucket doesn't care about the structure of the file's path. From the perspective of a file system, a file and a directory with the same name will coexist.

Q. What should the path look like? A. It should look like a POSIX path, but it must not contain any of the NTFS-invalid characters. It can have leading and/or trailing slashes, which will be removed. If the path doesn't conform to this format, a BucketFsError will be raised.
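The path rules above can be sketched as follows; the helper name and the exact set of rejected characters are assumptions for illustration, not the library's implementation:

```python
NTFS_INVALID = set('<>:"|?*\\')  # assumed character set, for illustration only

def normalize_bucket_path(path: str) -> str:
    """Strip leading/trailing slashes; reject NTFS-invalid characters."""
    if any(c in NTFS_INVALID for c in path):
        raise ValueError(f"invalid character in path: {path!r}")
    return path.strip("/")

assert normalize_bucket_path("/dir1/file1.dat/") == "dir1/file1.dat"
```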

exasol.bucketfs.Bucket

class exasol.bucketfs.Bucket(name: str, service: str, username: str, password: str, verify: bool | str = True, service_name: str | None = None)[source]

Bases: object

Implementation of the BucketLike interface for the BucketFS in Exasol On-Premises database.

Parameters:
  • name – Name of the bucket.

  • service – URL of the service hosting this bucket.

  • username – Username used for authentication.

  • password – Password used for authentication.

  • verify – Either a boolean, in which case it controls whether we verify the server’s TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.

  • service_name – Optional name of the BucketFS service.

delete(path) None[source]
download(path: str, chunk_size: int = 8192) Iterable[ByteString][source]
property files: Iterable[str]
property name: str
property udf_path: str
upload(path: str, data: ByteString | BinaryIO | Iterable[ByteString]) None[source]

exasol.bucketfs.SaaSBucket

class exasol.bucketfs.SaaSBucket(url: str, account_id: str, database_id: str, pat: str)[source]

Bases: object

Implementation of the BucketLike interface for the BucketFS in Exasol SaaS.

Parameters:
  • url – Url of the Exasol SaaS service.

  • account_id – SaaS user account ID.

  • database_id – SaaS database ID.

  • pat – Personal Access Token

delete(path: str) None[source]
download(path: str, chunk_size: int = 8192) Iterable[ByteString][source]
property files: Iterable[str]
property name: str
property udf_path: str
upload(path: str, data: ByteString | BinaryIO) None[source]

exasol.bucketfs.MountedBucket

class exasol.bucketfs.MountedBucket(service_name: str = 'bfsdefault', bucket_name: str = 'default', base_path: str | None = None)[source]

Bases: object

Implementation of the BucketLike interface backed by a normal file system. The targeted use case is accessing BucketFS files from a UDF.

Parameters:
  • service_name – Name of the BucketFS service (not a service url). Defaults to ‘bfsdefault’.

  • bucket_name – Name of the bucket. Defaults to ‘default’.

  • base_path – Instead of specifying the names of the service and the bucket, one can provide a full path to the root directory. This can be a useful option for testing when the backend is a local file system. If this parameter is not provided the root directory is set to buckets/<service_name>/<bucket_name>.
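A minimal sketch of the documented root-directory resolution; the helper is hypothetical, only the buckets/<service_name>/<bucket_name> default comes from the description above:

```python
from pathlib import Path

def resolve_root(service_name="bfsdefault", bucket_name="default", base_path=None):
    """Mirror the documented default root: buckets/<service_name>/<bucket_name>."""
    if base_path is not None:
        return Path(base_path)
    return Path("buckets") / service_name / bucket_name

assert resolve_root() == Path("buckets/bfsdefault/default")
assert resolve_root(base_path="/tmp/testroot") == Path("/tmp/testroot")
```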

delete(path: str) None[source]
download(path: str, chunk_size: int) Iterable[ByteString][source]
property files: list[str]
property name: str
property udf_path: str
upload(path: str, data: ByteString | BinaryIO) None[source]

exasol.bucketfs.path.PathLike

class exasol.bucketfs._path.PathLike(*args, **kwargs)[source]

Bases: Protocol

Definition of the PathLike view of the files in a Bucket.

as_udf_path() str[source]

This method is specific to a BucketFS flavour of the PathLike. It returns a corresponding path, as it’s seen from a UDF.

as_uri() str[source]

Represent the path as a file URI. Can be used to reconstruct the location/path.

exists() bool[source]

Return True if the path points to an existing file or directory.

is_dir() bool[source]

Return True if the path points to a directory, False if it points to another kind of file.

is_file() bool[source]

Return True if the path points to a regular file, False if it points to another kind of file.

iterdir() Generator[PathLike, None, None][source]

When the path points to a directory, yield path objects of the directory contents.

Note

If the path points to a file, then iterdir() will yield nothing.

Yields:

All direct children of the pathlike object.

joinpath(*path_segments) PathLike[source]

Calling this method is equivalent to combining the path with each of the given path segments in turn.

Returns:

A new pathlike object pointing to the combined path.
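Since joinpath follows pathlib semantics, its behavior can be illustrated with pathlib.PurePosixPath as a stand-in for a PathLike:

```python
from pathlib import PurePosixPath

base = PurePosixPath("dir1")
combined = base.joinpath("subdir1", "file1.dat")
assert combined == PurePosixPath("dir1/subdir1/file1.dat")
# The / operator is the usual shorthand for the same combination:
assert base / "subdir1" / "file1.dat" == combined
```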

property name: str

A string representing the final path component, excluding the drive and root, if any.

property parent: PathLike

The logical parent of this path.

read(chunk_size: int = 8192) Iterable[ByteString][source]

Read the content of the file behind this path.

Only works for PathLike objects which return True for is_file().

Parameters:

chunk_size – Size of the chunks yielded by the iterator.

Returns:

An iterator which can be used to read the contents of the path in chunks.

rm() None[source]

Remove this file.

Note

If exists() and is_file() yield true for this path, the path will be deleted; otherwise an exception will be raised.

Raises:

FileNotFoundError – If the file does not exist.

rmdir(recursive: bool = False) None[source]

Removes this directory.

Note

To stay close to pathlib, rmdir with recursive set to False (the default) won't delete non-empty directories.

Parameters:

recursive – If true, the directory itself and its entire contents (files and subdirectories) will be deleted. If false and the directory is not empty, an error will be raised.

property root: str

A string representing the root, if any.

property suffix: str

The file extension of the final component, if any.

walk(top_down: bool = True) Generator[tuple[PathLike, list[str], list[str]], None, None][source]

Generate the file names in a directory tree by walking the tree either top-down or bottom-up.

Note

Tries to mimic https://docs.python.org/3/library/pathlib.html#pathlib.Path.walk as closely as possible, except for the functionality associated with the parameters of the pathlib walk.

Yields:

A 3-tuple of (dirpath, dirnames, filenames).
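The same contract can be sketched with the stdlib os.walk, which this method is modeled after; the temporary tree below is purely illustrative:

```python
import os
import tempfile

# Build a small tree and walk it top-down; os.walk yields the same
# (dirpath, dirnames, filenames) triples described above.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "dir1", "subdir1"))
    with open(os.path.join(root, "dir1", "subdir1", "file1.dat"), "wb"):
        pass
    triples = [
        (os.path.relpath(dirpath, root), dirnames, filenames)
        for dirpath, dirnames, filenames in os.walk(root)
    ]

assert (".", ["dir1"], []) in triples
assert ("dir1/subdir1", [], ["file1.dat"]) in triples
```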

write(data: ByteString | BinaryIO | Iterable[ByteString]) None[source]

Writes data to this path.

Q. Should it create the parent directory if it doesn't exist? A. Yes, it should.

After a successful write, exists() will yield true for this path. If the file already existed, it will be overwritten.

Parameters:

data – Data which shall be written to the path.

Raises:

NotAFileError – if the pathlike object is not a file path.
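The write semantics above can be sketched with a dict-backed store, assuming the flat path-to-bytes model described for BucketFS buckets; the store and helper are illustrative stand-ins:

```python
# A dict-backed store stands in for a bucket: paths are plain keys,
# so missing "parents" need no explicit creation and writes overwrite.
store = {}

def write_sketch(path, data):
    store[path] = data

write_sketch("dir1/sub/file.dat", b"v1")
write_sketch("dir1/sub/file.dat", b"v2")  # overwriting an existing path is fine
assert store["dir1/sub/file.dat"] == b"v2"
assert "dir1/sub" not in store  # no separate directory entry exists
```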

exasol.bucketfs.path.build_path

exasol.bucketfs._path.build_path(**kwargs) PathLike[source]

Creates a PathLike object based on a bucket in one of the BucketFS storage backends. It provides the same interface for the following BucketFS implementations:

  • On-Premises

  • SaaS

  • BucketFS files mounted as a read-only directory in a UDF

Parameters:
  • backend – This is a mandatory parameter that indicates the BucketFS storage backend. The available backends are defined in the StorageBackend enumeration. Currently, these are "onprem", "saas" and "mounted". The parameter value can be provided either as a string, e.g. "onprem", or as an enum, e.g. StorageBackend.onprem.

  • path

    Optional parameter that selects a path within the bucket. If not provided, the returned PathLike object corresponds to the root of the bucket. Hence, an alternative way of creating a PathLike pointing to a particular file or directory is as in the code below: path = build_path(…) / "the_desired_path"

    The rest of the arguments are backend specific.

    On-prem arguments:

  • url – URL of the BucketFS service, e.g. http(s)://127.0.0.1:2580.

  • username – BucketFS username (generally, different from the DB username).

  • password – BucketFS user password.

  • bucket_name – Name of the bucket. Currently, a PathLike cannot span multiple buckets.

  • verify – Either a boolean, in which case it controls whether we verify the server’s TLS certificate, or a string, in which case it must be a path to a CA bundle to use. Defaults to True.

  • service_name

    Optional name of the BucketFS service.

    SaaS arguments:

  • url – URL of the Exasol SaaS service. Defaults to 'https://cloud.exasol.com'.

  • account_id – SaaS user account ID, e.g. ‘org_LVeOj4pwXhPatNz5’ (given example is not a valid ID of an existing account).

  • database_id – Database ID, e.g. ‘msduZKlMR8QCP_MsLsVRwy’ (given example is not a valid ID of an existing database).

  • pat

    Personal Access Token, e.g. ‘exa_pat_aj39AsM3bYR9bQ4qk2wiG8SWHXbRUGNCThnep5YV73az6A’ (given example is not a valid PAT).

    Mounted BucketFS directory arguments:

  • service_name – Name of the BucketFS service (not a service url). Defaults to ‘bfsdefault’.

  • bucket_name – Name of the bucket. Currently, a PathLike cannot span multiple buckets.

  • base_path – Explicitly specified root path in a file system. This is an alternative to providing the service_name and the bucket_name.

exasol.bucketfs.as_bytes

exasol.bucketfs.as_bytes(chunks: Iterable[ByteString]) ByteString[source]

Transforms a set of byte chunks into a bytes-like object.

Parameters:

chunks – which shall be concatenated.

Returns:

A single continuous bytes-like object.
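A minimal sketch of what as_bytes does, assuming simple concatenation of the chunks; the helper name is hypothetical:

```python
def as_bytes_sketch(chunks):
    """Concatenate byte chunks into a single bytes object."""
    return b"".join(chunks)

assert as_bytes_sketch([b"he", b"llo"]) == b"hello"
```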

exasol.bucketfs.as_string

exasol.bucketfs.as_string(chunks: Iterable[ByteString], encoding: str = 'utf-8') str[source]

Transforms a set of byte chunks into a string.

Parameters:
  • chunks – which shall be converted into a single string.

  • encoding – which shall be used to convert the bytes to a string.

Returns:

A string representation of the converted bytes.

exasol.bucketfs.as_file

exasol.bucketfs.as_file(chunks: Iterable[ByteString], filename: str | Path) Path[source]

Writes a set of byte chunks to a file.

Parameters:
  • chunks – which shall be written to file.

  • filename – for the file which is to be created.

Returns:

A path to the created file.
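A stdlib sketch of the as_file behavior, writing the chunks to the given file and returning its path; the helper name is hypothetical:

```python
import tempfile
from pathlib import Path

def as_file_sketch(chunks, filename):
    """Write the chunks to the given file and return its path."""
    path = Path(filename)
    with path.open("wb") as f:
        for chunk in chunks:
            f.write(chunk)
    return path

target = as_file_sketch([b"he", b"llo"], Path(tempfile.mkdtemp()) / "out.dat")
assert target.read_bytes() == b"hello"
```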

exasol.bucketfs.as_hash

exasol.bucketfs.as_hash(chunks: Iterable[ByteString], algorithm: str = 'sha1') ByteString[source]

Calculate the hash for a set of byte chunks.

Parameters:
  • chunks – which shall be used as input for the checksum.

  • algorithm – which shall be used for calculating the checksum.

Returns:

A string representing the hex digest.
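A sketch of incremental hashing over chunks with hashlib, as the parameters above suggest; returning the hex digest as a string follows the docstring and is an assumption, not the library's code:

```python
import hashlib

def as_hash_sketch(chunks, algorithm="sha1"):
    """Feed the chunks into the hash incrementally and return the hex digest."""
    hasher = hashlib.new(algorithm)
    for chunk in chunks:
        hasher.update(chunk)
    return hasher.hexdigest()

# Hashing chunk-by-chunk matches hashing the concatenated content:
assert as_hash_sketch([b"he", b"llo"]) == hashlib.sha1(b"hello").hexdigest()
```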

exasol.bucketfs.MappedBucket

class exasol.bucketfs.MappedBucket(bucket: Bucket, chunk_size: int = 8192)[source]

Bases: object

Wraps a bucket and adds various convenience features to it (e.g. index-based access).

Attention

Even though this class provides a very convenient interface, it should be used with care: while it may not be obvious, all the provided features involve interactions with a BucketFS service in the background (upload, download, sync, etc.). Keep this in mind when using this class.

__init__(bucket: Bucket, chunk_size: int = 8192)[source]

Creates a new MappedBucket.

Parameters:
  • bucket – which shall be wrapped.

  • chunk_size – which shall be used for downloads.

property chunk_size: int

Chunk size which will be used for downloads.

exasol.bucketfs.BucketFsError

exception exasol.bucketfs.BucketFsError(*args, **kwargs)[source]

Error occurred while interacting with the BucketFS service.