API Reference¶

pyexasol.connect(**kwargs) → ExaConnection[source]¶

Create a new connection object.

Parameters:: **kwargs – For details, refer to the pyexasol.ExaConnection class.

pyexasol.connect_local_config(config_section, config_path=None, **kwargs) → ExaConnection[source]¶

Constructor for connection objects based on a local config file.

Info:

The default config path is ~/.pyexasol.ini
Extra arguments override values from config

Parameters:

config_section – Name of config section (required!)
config_path – Custom path to local config file
kwargs – Arguments for “connect()” function

pyexasol.http_transport(ipaddr, port, compression=False, encryption=True) → ExaHTTPTransportWrapper[source]¶

Constructor for HTTP Transport wrapper for parallel HTTP Transport (EXPORT or IMPORT)

Parameters:

ipaddr – IP address of one of Exasol nodes received from pyexasol.ExaConnection.get_nodes()
port – Port of one of Exasol nodes received from pyexasol.ExaConnection.get_nodes()
compression – Use zlib compression for HTTP transport, must be the same as compression of main connection
encryption – Use SSL/TLS encryption for HTTP transport, must be the same as encryption of main connection

Info:

Compression and encryption arguments should match pyexasol.connect()

How to use:

Parent process opens main connection to Exasol with pyexasol.connect()
Parent process creates any number of child processes (possibly on remote host or another container)
Every child process starts HTTP transport sub-connection with pyexasol.http_transport()
and gets “ipaddr:port” string using ExaHTTPTransportWrapper.address
Every child process sends address string to parent process using any communication method (Pipe, Queue, Redis, etc.)
Parent process runs .export_parallel() or .import_parallel(), which initiates EXPORT or IMPORT query in Exasol
Every child process receives or sends a chunk of data using ExaHTTPTransportWrapper.export_*() or .import_*()
Parent process waits for Exasol query and for child processes to finish

All child processes should run in parallel. It is NOT possible to process some data first, and process some more data later.

If an exception is raised in child process, it will close the pipe used for HTTP transport. Closing the pipe prematurely will cause SQL query to fail and will raise an exception in parent process. Parent process is responsible for closing other child processes and cleaning up.

PyExasol does not provide a complete solution to manage child processes, only examples. The final solution depends on your hardware, network configuration, cloud provider and container orchestration software.

class pyexasol.ExaConnection(dsn: str | None = None, user: str | None = None, password: str | None = None, schema: str = '', autocommit=True, snapshot_transactions=None, connection_timeout=10, socket_timeout=30, query_timeout=0, compression: bool = False, encryption: bool = True, fetch_dict: bool = False, fetch_mapper=None, fetch_size_bytes=5242880, lower_ident: bool = False, quote_ident: bool = False, json_lib: str = 'json', verbose_error: bool = True, debug: bool = False, debug_logdir=None, udf_output_bind_address=None, udf_output_connect_address=None, udf_output_dir=None, http_proxy=None, resolve_hostnames: bool = True, client_name=None, client_version=None, client_os_username=None, protocol_version=3, websocket_sslopt: dict | None = None, access_token: str | None = None, refresh_token: str | None = None)[source]¶

Bases: object

Warning

Threads may share the module, but not connections One connection may be used by different threads, just not at the same time pyexasol.ExaConnection.abort_query() is an exception, it is meant to be called from another thread

Note

It is advisable to use multiprocessing instead of threading and create a new connection in each sub-process

Public Attributes:

attr:: Read-only dict of attributes of current connection.
login_info:: Read-only dict of login information returned by second response of LOGIN command.
options:: Read-only dict of arguments passed to pyexasol.ExaConnection.connect().

__init__(dsn: str | None = None, user: str | None = None, password: str | None = None, schema: str = '', autocommit=True, snapshot_transactions=None, connection_timeout=10, socket_timeout=30, query_timeout=0, compression: bool = False, encryption: bool = True, fetch_dict: bool = False, fetch_mapper=None, fetch_size_bytes=5242880, lower_ident: bool = False, quote_ident: bool = False, json_lib: str = 'json', verbose_error: bool = True, debug: bool = False, debug_logdir=None, udf_output_bind_address=None, udf_output_connect_address=None, udf_output_dir=None, http_proxy=None, resolve_hostnames: bool = True, client_name=None, client_version=None, client_os_username=None, protocol_version=3, websocket_sslopt: dict | None = None, access_token: str | None = None, refresh_token: str | None = None)[source]¶

Exasol connection object

Parameters:

dsn – Connection string, same format as standard JDBC / ODBC drivers (e.g. 10.10.127.1..11:8564)
user – Username
password – Password
schema – Open schema after connection (Default: ‘’, no schema)
autocommit – Enable autocommit on connection (Default: True)
snapshot_transactions – Explicitly enable or disable snapshot transactions on connection (Default: None, database default)
connection_timeout – Socket timeout in seconds used to establish connection (Default: 10)
socket_timeout – Socket timeout in seconds used for requests after connection was established (Default: 30)
query_timeout – Maximum execution time of queries before automatic abort, in seconds (Default: 0, no timeout)
compression – Use zlib compression both for WebSocket and HTTP transport (Default: False)
encryption – Use SSL to encrypt client-server communications for WebSocket and HTTP transport (Default: True)
fetch_dict – Fetch result rows as dicts instead of tuples (Default: False)
fetch_mapper – Use custom mapper function to convert Exasol values into Python objects during fetching (Default: None)
fetch_size_bytes – Maximum size of data message for single fetch request in bytes (Default: 5Mb)
lower_ident – Automatically lowercase identifiers (table names, column names, etc.) returned from relevant functions (Default: False)
quote_ident – Add double quotes and escape identifiers passed to relevant functions (export_*, import_*, ext.*, etc.) (Default: False)
json_lib – Supported values: rapidjson, ujson, orjson, json (Default: json)
verbose_error – Display additional information when error occurs (Default: True)
debug – Output debug information for client-server communication and connection attempts to STDERR
debug_logdir – Store debug information into files in debug_logdir instead of outputting it to STDERR
udf_output_bind_address – Specific server_address to bind TCP server for UDF script output (default: (‘’, 0))
udf_output_connect_address – Specific SCRIPT_OUTPUT_ADDRESS value to connect from Exasol to UDF script output server (default: inherited from TCP server)
udf_output_dir – Directory to store captured UDF script output logs, split by <session_id>_<statement_id>/<vm_num>
http_proxy – HTTP proxy string in Linux http_proxy format (default: None)
resolve_hostnames – Explicitly resolve host names to IP addresses before connecting. Deactivating this will let the operating system resolve the host name (default: True)
client_name – Custom name of client application displayed in Exasol sessions tables (Default: PyExasol)
client_version – Custom version of client application (Default: pyexasol.__version__)
client_os_username – Custom OS username displayed in Exasol sessions table (Default: getpass.getuser())
protocol_version – Major WebSocket protocol version requested for connection (Default: pyexasol.PROTOCOL_V3)
websocket_sslopt – Set custom SSL options for WebSocket client (Default: None)
access_token – OpenID access token to use for the login process
refresh_token – OpenID refresh token to use for the login process

abort_query()[source]¶

Abort running query

Warning

This function should be called from a separate thread and has no response Response should be checked in the main thread which started execution of query

There are three possible outcomes of calling this function:

Query is aborted normally, connection remains active
Query was stuck in a state which cannot be aborted, so Exasol has to terminate connection
Query might be finished successfully before abort call had a chance to take effect

close(disconnect=True)[source]¶

Closes connection to database.

Parameters:: disconnect – If true send optional “disconnect” command to free resources and close session on Exasol server side properly.

Note

Please note that “disconnect” should always be False when .close() is being called from .req()-like functions to prevent an infinite loop if websocket exception happens during handling of “disconnect” command

cls_extension¶: alias of ExaExtension

cls_formatter¶: alias of ExaFormatter

cls_logger¶: alias of ExaLogger

cls_meta¶: alias of ExaMetaData

cls_statement¶: alias of ExaStatement

commit()[source]¶: Wrapper for query ‘COMMIT’

current_schema()[source]¶

Get the name of the current schema.

Returns:: Name of currently opened schema. Returns an empty string if no schema was opened.

property exasol_db_version: Version | None¶

Version of the Exasol database of the current session.

The login information is returned by the second response of LOGIN command and calls this “releaseVersion”.

execute(query: str, query_params: dict | None = None) → ExaStatement[source]¶

Execute SQL query with optional query formatting parameters

Parameters:

query – SQL query text, possibly with placeholders
query_params – Values for placeholders

Returns:

ExaStatement object

Examples

>>> con = ExaConnection(...)
>>> con.execute(
...        query="SELECT * FROM {table!i} WHERE col1={col1}",
...        query_params={'table': 'users', 'col1':'bar'}
...)

execute_udf_output(query: str, query_params: dict | None = None)[source]¶

Execute SQL query with UDF script, capture output

Note

Exasol should be able to open connection to the machine where current script is running. It is usually OK in the same data centre, but it is normally not working if you try to run this function on local laptop.

Parameters:

query – SQL query text, possibly with placeholders
query_params – Values for placeholders

Returns:

(1) instance of pyexasol.ExaStatement and (2) list of Path objects for script output log files.

Return type:

Return tuple with two elements

Attention

Exasol should be able to open connection to the machine where current script is running

Examples

>>> con = ExaConnection(...)
>>> stmt, output_files = con.execute_udf_output(
...        query="SELECT * FROM {table!i} WHERE col1={col1}",
...        query_params={'table': 'users', 'col1':'bar'}
...)

export_parallel(exa_address_list, query_or_table, query_params=None, export_params=None)[source]¶

This function is part of Parallel API.

Parameters:

exa_address_list – List of ipaddr:port strings obtained from HTTP transport .address.
query_or_table – SQL query or table for the export.
query_params – Values for SQL query placeholders.
export_params – Custom parameters for Export query.

Note

Init HTTP transport in child processes first using pyexasol.http_transport()
Get internal Exasol address from each child process using .address
Pass address strings to parent process, combine into single list and use it for export_parallel() call

export_to_callback(callback: Callable, dst, query_or_table: str, query_params: dict | None = None, callback_params: dict | None = None, export_params: dict | None = None)[source]¶

Export a large amount of data to a user-defined callback function

Parameters:

callback – Callback function.
dst – (optional) Path to file or file-like object where data will be exported to.
query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
callback_params – Dictionary with additional parameters for callback function.
export_params – Custom parameters for EXPORT query.

Returns:

result of callback function

Warning

This function may run out of memory

Examples

>>> cb = lambda args: print(args)
>>> con = ExaConnection(...)
>>> con.export_to_callback(
...    callback=cb,
...    query_or_table="SELECT * FROM table"
... )

export_to_file(dst, query_or_table: str, query_params: dict | None = None, export_params: dict | None = None)[source]¶

Export large amount of data from Exasol to file or file-like object using fast HTTP transport.

Note

File must be opened in binary mode.

Parameters:

dst – Path to file or file-like object where data will be exported to.
query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
export_params – Custom parameters for EXPORT query.

Examples

>>> con = ExaConnection(...)
>>> with open('/tmp/file.csv', 'wb') as f:
...     con.export_to_file(
...         dst=f,
...         query_or_table="SELECT * FROM table"
...     )

export_to_list(query_or_table: str, query_params: dict | None = None, export_params: dict | None = None) → list[source]¶

Export large amount of data from Exasol to basic Python list using fast HTTP transport.

Parameters:

query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
export_params – Custom parameters for EXPORT query.

Returns:

list of tuples

Warning

This function may run out of memory

Examples

>>> con = ExaConnection(...)
>>> myresult = con.export_to_list(
...    query_or_table="SELECT * FROM table"
... )

export_to_pandas(query_or_table: str, query_params: dict | None = None, callback_params: dict | None = None, export_params: dict | None = None) → pandas.DataFrame[source]¶

Export large amount of data from Exasol to pandas.DataFrame.

Parameters:

query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
callback_params – Dictionary with additional parameters for callback function pandas.read_csv.
export_params – Custom parameters for EXPORT query.

Returns:

instance of pandas.DataFrame

Warning

This function may run out of memory

Examples

>>> con = ExaConnection(...)
>>> myresult = con.export_to_pandas(
...    query_or_table="SELECT * FROM table"
... )

export_to_parquet(dst: Path | str, query_or_table: str, query_params: dict | None = None, callback_params: dict | None = None, export_params: dict | None = None)[source]¶

Export large amounts of data from Exasol to local parquet file(s).

Parameters:

dst – Local path to directory for exporting files. Can be one either a Path or str. The default behavior is that the specified directory should be empty. If this is not the case, an exception is thrown.
query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
callback_params –
Dictionary with additional parameters for callback function pyarrow.dataset.write_dataset. Some important defaults to note are:

existing_data_behavior
Set to error, which requires that the specified dst not contain any files or an exception will be thrown..

max_rows_per_file
Set to 0, which means that all rows will be written to 1 file. If max_rows_per_file is altered, ensure that max_rows_per_group is set to a value less than or equal to the value of max_rows_per_file.

use_threads
Set to True and preserve_order is set to False. This means that the writing of multiple files will be done in parallel and that the order is not guaranteed to be preserved.
export_params – Custom parameters for EXPORT query.

export_to_polars(query_or_table: str, query_params: dict | None = None, callback_params: dict | None = None, export_params: dict | None = None) → polars.DataFrame[source]¶

Export large amount of data from Exasol to polars.DataFrame.

Parameters:

query_or_table – SQL query or table from which to export data.
query_params – Values for SQL query placeholders.
callback_params – Dictionary with additional parameters for callback function polars.read_csv.
export_params – Custom parameters for EXPORT query.

Returns:

instance of polars.DataFrame

Warning

This function may run out of memory

Examples

>>> con = ExaConnection(...)
>>> df = con.export_to_polars(
...    query_or_table="SELECT * FROM table"
... )

get_attr()[source]¶

get_nodes(pool_size=None)[source]¶

List of currently active Exasol nodes which is normally used for Parallel.

Parameters:: pool_size – Return list of specific size.
Returns:: list of dictionaries describing active Exasol nodes

Note

Format: {'ipaddr': <ip_address>, 'port': <port>, 'idx': <incremental index of returned node>}

If pool_size is bigger than number of nodes, list will wrap around and nodes will repeat with different ‘idx’
If pool_size is omitted, return every active node once
It is useful to balance workload for parallel IMPORT and EXPORT Exasol shuffles list for every connection
Exasol shuffles list for every connection.

import_from_callback(callback: Callable, src, table: str, callback_params: dict | None = None, import_params: dict | None = None)[source]¶

Import a large amount of data from a user-defined callback function.

Parameters:

callback – Callback function.
src – Source for the callback function.
table – Destination table for IMPORT.
callback_params – Dictionary with additional parameters for callback function.
import_params – Custom parameters for IMPORT query.

Raises:

ValueError – callback argument isn’t callable.

import_from_file(src, table: str, import_params: dict | None = None)[source]¶

Import a large amount of data from a file or file-like object.

Parameters:

src – Source file or file-like object.
table – Destination table for IMPORT.
import_params – Custom parameters for IMPORT query.

Note

File must be opened in binary mode.

import_from_iterable(src: Iterable, table: str, import_params: dict | None = None)[source]¶

Import a large amount of data from an iterable Python object.

Parameters:

src – Source object implementing __iter__. Iterator must return tuples of values.
table – Destination table for IMPORT.
import_params – Custom parameters for IMPORT query.

import_from_pandas(src: pandas.DataFrame, table: str, callback_params: dict | None = None, import_params: dict | None = None)[source]¶

Import a large amount of data from pandas.DataFrame.

Parameters:

src – Source pandas.DataFrame instance.
table – Destination table for IMPORT.
callback_params – Dictionary with additional parameters for callback function pandas.DataFrame.to_csv.
import_params – Custom parameters for IMPORT query.

import_from_parquet(source: list[Path] | Path | str, table: str, callback_params: dict | None = None, import_params: dict | None = None)[source]¶

Import a large amount of data from pyarrow.parquet.Table.

Parameters:

source – Local filepath specification(s) to process. Can be one of: - list[pathlib.Path]: list of specific files - pathlib.Path: can be either a file or directory. If it’s a directory, all files matching this pattern .parquet will be processed. - str: representing a filepath which already contains a glob pattern (e.g., “/local_dir/.parquet”)
table – Destination table for IMPORT.
callback_params – Dict with additional parameters for callback function parquet.ParquetFile.iter_batches.
import_params – Custom parameters for IMPORT query.

import_from_polars(src: polars.LazyFrame | polars.DataFrame, table: str, callback_params: dict | None = None, import_params: dict | None = None)[source]¶

Import a large amount of data from polars.DataFrame or polars.LazyFrame.

Parameters:

src – Source polars.DataFrame or polars.LazyFrame instance.
table – Destination table for IMPORT.
callback_params – Dictionary with additional parameters for callback function polars.DataFrame.write_csv.
import_params – Custom parameters for IMPORT query.

import_parallel(exa_address_list, table, import_params=None)[source]¶

This function is part of Parallel API.

Parameters:

exa_address_list – List of ipaddr:port strings obtained from HTTP transport .address.
table – Table to import to.
import_params – Custom parameters for import.

Note

Init HTTP transport in child processes first using pyexasol.http_transport()
Get internal Exasol address from each child process using .address
Pass address strings to parent process, combine into single list and use it for import_parallel() call

last_statement() → ExaStatement[source]¶

Last created statement object

Returns:: last created statement.
Return type:: ExaStatement

Note

It is mainly used for HTTP transport to access internal IMPORT / EXPORT query, measure execution time and number of rows

Tip

It is useful while working with export_* and import_* functions normally returning result of callback function instead of statement object.

open_schema(schema)[source]¶

Wrapper for OPEN SCHEMA

Parameters:: schema – Schema name

protocol_version() → int[source]¶

Actual protocol version used by the established connection.

Returns:: 0 if connection was not established yet (e.g. due to exception handling), otherwise protocol version as int.

Warning

Actual Protocol version might be downgraded from requested protocol version if Exasol server does not support it

Note

The actual protocol version may be lower than the requested protocol version defined by the protocol_version connection option. For further details, refer to WebSocket Protocol Version.

req(req)[source]¶: Send WebSocket request and wait for response

rollback()[source]¶: Wrapper for query ‘ROLLBACK’

session_id() → str[source]¶

Session id of current session.

Returns:: Unique SESSION_ID of the current session as string.

set_attr(new_attr)[source]¶

set_autocommit(val: str) → None[source]¶

Set autocommit mode.

Parameters:: val – Set False to execute following statements in transaction. Set True to get back to automatic COMMIT after each statement.

Note

Autocommit is True by default because Exasol has to commit indexes and statistics objects even for pure SELECT statements. Lack of default COMMIT may lead to serious performance degradation.

set_query_timeout(val)[source]¶

Set the maximum time in seconds for which a query can run before Exasol kills it automatically.

Parameters:: val – Timeout value in seconds. Set value 0 to disable timeout.

Note

It is highly recommended to set timeout for UDF scripts to avoid potential infinite loops and very long transactions.

threadsafety = 1¶

class pyexasol.ExaStatement(connection, query=None, query_params=None, prepare=False, meta_nosql=False, **options)[source]¶

Bases: object

This class executes and helps to fetch result set of single Exasol SQL statement.

Warning

Unlike typical Cursor object, ExaStatement is not reusable.

Note

pyexasol.ExaStatement may fetch result set rows as tuples (default) or as dict (set fetch_dict=True in connection options).

pyexasol.ExaStatement may use custom data-type mapper during fetching (set fetch_mapper=<func> in connection options). Mapper function accepts two arguments (raw value and dataType object) and returns custom object or value.

pyexasol.ExaStatement fetches big result sets in chunks. The size of chunk may be adjusted (set fetch_size_bytes=<int> in connection options).

Public Attributes:

execution_time:: Execution time of SQL statement. It is measured by wall-clock time of WebSocket request, so real execution time is a bit faster.

__init__(connection, query=None, query_params=None, prepare=False, meta_nosql=False, **options)[source]¶

Parameters:

connection
query
query_params
prepare
meta_nosql
options – additonal kwargs

__iter__()[source]¶

The best way to fetch result set of statement is to use iterator:

Yields:: tuple or dict depending on fetch_dict connection option.

Examples

>>> st = pyexasol.execute('SELECT * FROM table')
... for row in st:
...     print(row)

close()[source]¶: Closes result set handle if it was opened.

Warning

You won’t be able to fetch next chunk of large dataset after calling this function, but no other side effects.

column_names()[source]¶: List of column names.

columns()[source]¶

Retrieves column information of returned data.

Returns:: A dict with keys as column names and values as dataType objects.

Notes

The dict will containt the following data:

Names	Type	Description
type	string	column data type
precision	number	(optional) column precision
scale	number	(optional) column scale
size	number	(optional) maximum size in bytes of a column value
characterSet	string	(optional) character encoding of a text column
withLocalTimeZone	true, false	(optional) specifies if a timestamp has a local time zone
fraction	number	(optional) fractional part of number
srid	number	(optional) spatial reference system identifier

execute_prepared(data=None)[source]¶

fetchall()[source]¶

Fetches all remaining rows.

Returns:: list of tuples or list of dict. Empty list if all rows were fetched previously.

Warning

This function may exhaust available memory.

fetchcol()[source]¶

Fetches all values from the first column.

Returns:: list of values. Empty list if all rows were fetched previously.

fetchmany(size=10000)[source]¶

Fetch multiple rows.

Parameters:: size – Set the specific number of rows to fetch (Default: 10000)
Returns:: list of tuples or list of dict. Empty list if all rows were fetched previously.

fetchone()[source]¶

Fetches one row of data.

Returns:: tuple or dict. None if all rows were fetched.

fetchval()[source]¶

Fetches first column of first row.

Returns:: Value, None if all rows were fetched previously.

Tip

This may be useful for queries returning single value like SELECT count(*) FROM table.

rowcount()[source]¶

Number of selected/processed rows.

Returns:: Total amount of selected rows for statements with result set (num_rows). Total amount of processed rows for DML queries (row_count).

class pyexasol.ExaFormatter(connection)[source]¶

Bases: Formatter

pyexasol.ExaFormatter is a subclass of string.Formatter designed to prevent SQL injections in Exasol dynamic SQL queries.

Note:
It introduces set of placeholders to prevent SQL injections specifically in Exasol dynamic SQL queries. It also completely disables format_spec section of standard formatting since it has no use in context of SQL queries and may cause more harm than good.

You may access these functions using .format property of connection object. Example:

Examples:

>>> C = pyexasol.connect(...)
... print(C.format.escape('abc'))

convert_field(value, conversion)[source]¶

classmethod escape(val)[source]¶

Takes a raw value and converts it into an and escaped string.

Parameters:: val – Value to be escaped.
Returns:: A string where all single quotes ' have been replaced with two single quotes ''.

classmethod escape_ident(val)[source]¶

Takes a raw value and converts it into an and escaped string.

Parameters:: val – Value to be escaped.
Returns:: A string where all double quotes " have been replaced with two double quotes "".

classmethod escape_like(val)[source]¶

Escape LIKE-patterns.

Parameters:: val – Value to be escaped.
Returns:: A string where all double quotes \ have been replaced with \\, where % have been replaced with \%, where _ have been replaced with \_.

format_field(value, format_spec)[source]¶

classmethod quote(val)[source]¶: Escapes a string using pyexasol.ExaFormatter.escape() and wraps it in single quotes '.

classmethod quote_ident(val)[source]¶

Escapes an object or a tuple of objects using pyexasol.ExaFormatter.escape_ident() and wraps it in double quotes ".

Parameters:: val – Raw identifier(s) to be escaped.
Returns:: The formatted and quoted identifier, or joined identifiers if a tuple was provided.
Return type:: str

classmethod safe_decimal(val)[source]¶

Convert a decimal safely to string.

Parameters:: val – Decimal value to convert.
Returns:: Validates identifier as string.
Raises:: ValueError – If value is not valid.

safe_decimal_regexp = re.compile('^(\\+|-)?[0-9]+(\\.[0-9]+)?$')¶

classmethod safe_float(val)[source]¶

Convert a float safely to string.

Parameters:: val – Float value to convert.
Returns:: Validates identifier as string.
Raises:: ValueError – If value is not valid, e.g.: +infinity or -infinity.

safe_float_regexp = re.compile('^(\\+|-)?[0-9]+(\\.[0-9]+((e|E)(\\+|-)[0-9]+)?)?$')¶

classmethod safe_ident(val)[source]¶

Convert a raw identifier safely.

Parameters:: val (str or tuple) – Raw identifier(s).
Returns:: Validates identifier as string.
Raises:: ValueError If passed values is not a valid identifier (e.g. contains spaces) –

Warning

It puts it into SQL query without any quotting.

safe_ident_regexp = re.compile('^[A-Za-z_][A-Za-z0-9_]*$')¶

class pyexasol.ExaMetaData(connection)[source]¶

Bases: object

This class implements lock-free metadata requests using /*snapshot execution*/ SQL hint, as described further in https://docs.exasol.com/db/latest/database_concepts/snapshot_mode.htm.

Note

If you still get locks, please make sure to update Exasol server to the latest minor version.

Examples

You may access these functions using .meta property of connection object.

>>> C = pyexasol.connect(...)
... print(C.meta.sql_columns('SELECT 1 AS id'))

execute_meta_nosql(meta_command, meta_params=None)[source]¶

Execute no SQL metadata command introduced in Exasol 7.0+

Parameters:

meta_command – Metadata command.
meta_params – Parameters for metadata command.

Returns:

ExaStatement

Note

This feature requires WebSocket protocol v2 or higher

List of available commands can be found here.

execute_snapshot(query, query_params=None)[source]¶

Execute query in snapshot transaction mode using SQL hint

Parameters:

query – SQL query text, possibly with placeholders.
query_params – Values for placeholders.

Returns:

ExaStatement

Warning

Please do not try to query normal tables with this method. It will fail during creation of indices or statistics objects.

Note

fetch_dict=Tru is enforced to prevent users from relying on order of columns in system views

list_columns(column_schema_pattern='%', column_table_pattern='%', column_object_type_pattern='%', column_name_pattern='%')[source]¶

List Columns.

Parameters:

column_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')
column_table_pattern – Table name or LIKE-pattern to filter on. (default: '%')
column_object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')
column_name_pattern – Column name or LIKE-pattern to filter on. (default: '%')

Returns:

List of columns from EXA_ALL_COLUMNS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_indices(index_schema_pattern='%', index_table_pattern='%', index_owner_pattern='%')[source]¶

List indicies.

Parameters:

index_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')
index_table_pattern – Table name or LIKE-pattern to filter on. (default: '%')
index_owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')

Returns:

List of indices with sizes from EXA_ALL_INDICES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_object_sizes(object_name_pattern='%', object_type_pattern='%', owner_pattern='%', root_name_pattern='%')[source]¶

List Objects with their respective size.

Parameters:

object_name_pattern – Object name or LIKE-pattern to filter on. (default: '%')
object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')
owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')
root_name_pattern – Root name or LIKE-pattern to filter on.j It normally refers to schema name. (default: '%')

Returns:

List of objects with sizes from EXA_ALL_OBJECT_SIZES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_objects(object_name_pattern='%', object_type_pattern='%', owner_pattern='%', root_name_pattern='%')[source]¶

List Objects.

Parameters:

object_name_pattern – Object name or LIKE-pattern to filter on. (default: '%')
object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')
owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')
root_name_pattern – Root name or LIKE-pattern to filter on.j It normally refers to schema name. (default: '%')

Returns:

List of objects from EXA_ALL_OBJECTS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_schemas(schema_name_pattern='%')[source]¶

List Schemas.

Parameters:: schema_name_pattern – Schema name or LIKE-pattern to filter on. (default: '%')
Returns:: List of schemas from EXA_SCHEMAS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_sql_keywords()[source]¶

Get reserved SQL keywords which cannot be used as identifiers without double-quote escaping

Returns:: List of SQL keywords from EXA_SQL_KEYWORDS system view.

Warning

Never hardcode this list! It might change with next Exasol server version without warning

Note

These keywords cannot be used as identifiers without double quotes.

list_tables(table_schema_pattern='%', table_name_pattern='%')[source]¶

List Tables.

Parameters:

table_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')
table_name_pattern – Table name or LIKE-pattern to filter on. (default: '%')

Returns:

List of tables from EXA_ALL_TABLES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_views(view_schema_pattern='%', view_name_pattern='%')[source]¶

List Views.

Parameters:

view_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')
view_name_pattern – Table name or LIKE-pattern to filter on. (default: '%')

Returns:

List of views from EXA_ALL_VIEWS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

schema_exists(schema_name)[source]¶

Check if schema exists.

Parameters:: schema_name – Name of the schema to check.
Returns:: True if the schema exists, otherwise False.

snapshot_execution_hint = '/*snapshot execution*/'¶

sql_columns(query, query_params=None)[source]¶

Get result set columns of SQL query without executing it

Parameters:

query – SQL query text, possibly with placeholders.
query_params – Values for placeholders.

Returns:

Columns of SQL query result without executing it. Output format is similar to pyexasol.ExaStatement.columns().

table_exists(table_name)[source]¶

Check if table exists.

Parameters:: table_name – Name of the table to check for. If schema was not specified, current_schema is used.
Returns:: True if the table exists, otherwise False.

view_exists(view_name)[source]¶

Check if view exists.

Parameters:: view_name – Name of the table to check for. If schema was not specified, current_schema is used.
Returns:: True if the view exists, otherwise False.

class pyexasol.ExaExtension(connection)[source]¶

Bases: object

This class extends the functionality of a simple SQL driver to address common Exasol-related problems.

Tip

You may access these functions using .ext property of connection object.

Examples

>>> C = pyexasol.connect(...)
... print(C.ext.get_disk_space_usage())

explain_last(details: bool = False)[source]¶

Parameters:

details (bool) –

False, the function returns the average (AVG) or maximum (MAX) values aggregated for all Exasol nodes.
True, the function returns separate rows for each individual Exasol node, with a column labeled “iproc” representing the node.

Returns:

Profiling information for last executed query.

Note

This function should be called immediately after execute() COMMIT, ROLLBACK and FLUSH STATISTICS queries are ignored.

Tip

Details are useful to detect bad data distribution and imbalanced execution If you want to see real values of CPU, MEM, HDD, NET columns, please enable Exasol profiling first with:

ALTER SESSION SET PROFILE = 'ON';

Please refer to Exasol User Manuals for explanations about profiling columns.

export_to_pandas_with_dtype(query_or_table, query_params=None)[source]¶

Export to pandas and attempt to guess correct dtypes based on Exasol columns.

Parameters:

query_or_table – Query or table to export.
query_params – Additional query parameters.

Note

Since pandas has significantly smaller range of allowed values, this function makes many assumptions. Please use it as baseline for your own function for complex cases.

Exasol Type	Pandas Type
Small decimal	int32
Big decimal	int64
Double	float64
Date, Timestamp	datetime64[ns]
Everything else	category (!)

get_columns(object_name)[source]¶

Get information about columns of table or view.

Parameters:: object_name – Object name may be passed as tuple to specify custom schema.

Caution

DEPRECATED, please use .meta.sql_columns instead.

get_columns_sql(query, query_params=None)[source]¶

Get columns of SQL query without executing it.

Parameters:: object_name – Object name may be passed as tuple to specify custom schema.

Caution

DEPRECATED, please use .meta.sql_columns instead.

Note

It relies on a prepared statement that will be closed immediately without execution

get_disk_space_usage()[source]¶

Get the disk space usage of the Exasol DB.

Returns:

A dict with 4 keys, providing all disk space details.

Key	Description
`occupied_size`	How much space is occupied (in bytes)
`free_size`	How much space is available (in bytes)
`total_size`	occupied_size + free_size
`occupied_size_percent`	Percentage of occupied disk space (0-100%)

Note

Exasol still lacks a standard function to measure actual disk space usage. We are trying to mitigate this problem by creating a custom function.

get_reserved_words()[source]¶: Get reserved keywords which cannot be used as identifiers without double-quotes.

Caution

DEPRECATED, please use .meta.list_sql_keywords instead.

Warning

Never hard-code this list! It changes with every Exasol versions.

get_sys_columns(object_name)[source]¶

Get information about columns of table or view (SYS format)

Parameters:: object_name – Object name may be passed as tuple to specify custom schema.

Caution

DEPRECATED, please use .meta.list_columns instead.

get_sys_schemas(schema_name_prefix: str = '')[source]¶

Get information about schemas (SYS format)

Parameters:: schema_name_prefix – Output may be optionally filtered by schema name prefix

Caution

DEPRECATED, please use .meta.list_schemas instead.

get_sys_tables(schema=None, table_name_prefix='')[source]¶

Get information about tables in selected schema(SYS format)

Parameters:

schema
table_name_prefix – Output may be optionally filtered by table name prefix.

Caution

DEPRECATED, please use .meta.list_tables instead.

get_sys_views(schema: str | None = None, view_name_prefix: str = '')[source]¶

Get information about views in selected schema (SYS format)

Parameters:

schema
view_name_prefix – Output may be optionally filtered by view name prefix.

Caution

DEPRECATED, please use .meta.list_views instead.

insert_multi(table_name, data, columns=None)[source]¶

Insert a small number of rows into a table using a prepared statement.

Parameters:

table_name – Target table for INSERT.
data – Source object implementing __iter__ (e.g. list or tuple).
columns – List of column names to specify custom order of columns.

Tip

Compared to .import_from_iterable, this method offers better performance for small data sets of 10,000 rows or fewer.

Use .import_from_iterable for larger data sets and better memory efficiency
Use .import_from_pandas to import from pandas.DataFrame regardless of its size
Use .import_from_parquet to import from parquet files
Use .import_from_polars to import from polars.DataFrame regardless of its size

You may use “columns” argument to specify custom order of columns for insertion If some columns are not included in this list, NULL or DEFAULT value will be used instead

Note

Please note that data should be presented in a row format. You may use zip(*data_cols) to convert columnar format into row format.

class pyexasol.ExaHTTPTransportWrapper(ipaddr: str, port: int, compression: bool = False, encryption: bool = True)[source]¶

Bases: object

Wrapper for Parallel.

You may create this wrapper using pyexasol.http_transport().

Note

Starts an HTTP server, obtains the address (the "ipaddr:port" string), and sends it to the parent process.

Block into export_*() or import_*() call, wait for incoming connection, process data and exit.

property exa_address: str¶: Internal Exasol address as ipaddr:port string.

Note

This string should be passed from child processes to parent process and used as an argument for export_parallel() and import_parallel() functions.

export_to_callback(callback, dst, callback_params=None)[source]¶

Exports chunk of data using callback function.

Parameters:

callback – Callback function.
dst – Export destination for callback function.
callback_params – Dict with additional parameters for callback function.

Returns:

Result of the callback function.

Note

You may use exactly the same callbacks utilized by standard non-parallel export_to_callback() function.

get_proxy()[source]¶: Caution

DEPRECATED, please use .exa_address property

import_from_callback(callback, src, callback_params=None)[source]¶

Import chunk of data using callback function.

Parameters:

callback – Callback function.
src – Import source for the callback function.
callback_params – Dict with additional parameters for the callback function.

Returns:

Result of callback function

Note

You may use exactly the same callbacks utilized by standard non-parallel import_from_callback() function.

class pyexasol.ExaLocalConfig(config_path=None)[source]¶

Bases: object

Parse local config file Prepare arguments for pyexasol.connect() method

Arguments accepting functions class names and tuples are not supported You may pass such arguments directly in code using **kwargs

__init__(config_path=None)[source]¶

arg_types = {'autocommit': <class 'bool'>, 'compression': <class 'bool'>, 'debug': <class 'bool'>, 'encryption': <class 'bool'>, 'fetch_dict': <class 'bool'>, 'fetch_size_bytes': <class 'int'>, 'lower_ident': <class 'bool'>, 'query_timeout': <class 'int'>, 'quote_ident': <class 'bool'>, 'snapshot_transactions': <class 'bool'>, 'socket_timeout': <class 'int'>, 'udf_output_port': <class 'int'>, 'verbose_error': <class 'bool'>}¶

default_filename = '.pyexasol.ini'¶

get_arg_type(k)[source]¶

get_args(section)[source]¶

get_default_path()[source]¶