API Reference

pyexasol.connect(**kwargs) ExaConnection[source]

Create a new connection object. For details regarding kwargs, refer to the pyexasol.ExaConnection class.

pyexasol.connect_local_config(config_section, config_path=None, **kwargs) ExaConnection[source]

Constructor for connection objects based on a local config file.

Info:
  • The default config path is ~/.pyexasol.ini

  • Extra arguments override values from config

Parameters:
  • config_section – Name of config section (required!)

  • config_path – Custom path to local config file

  • kwargs – Arguments for “connect()” function
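As an illustration, a minimal local config file and the matching call might look like this. The section name, DSN and credentials are assumptions for the example, not defaults; the file is standard INI syntax, so it can be inspected with configparser:

```python
# Sketch of a local config file for connect_local_config().
# Section name, DSN and credentials below are illustrative assumptions.
import configparser
import pathlib
import tempfile

config_text = """
[my_exasol]
dsn = exasol-host:8563
user = sys
password = exasol
schema = TEST
compression = True
"""

config_path = pathlib.Path(tempfile.gettempdir()) / "pyexasol.ini"
config_path.write_text(config_text)

# The file is plain INI, so its sections can be checked upfront:
parser = configparser.ConfigParser()
parser.read(config_path)
print(dict(parser["my_exasol"]))

# With a running Exasol instance, the same file would be used like this;
# extra kwargs (e.g. fetch_dict) override values from the config section:
# import pyexasol
# con = pyexasol.connect_local_config("my_exasol", config_path=config_path,
#                                     fetch_dict=True)
```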

pyexasol.http_transport(ipaddr, port, compression=False, encryption=True) ExaHTTPTransportWrapper[source]

Constructor for the HTTP transport wrapper used for parallel HTTP Transport (EXPORT or IMPORT)

Parameters:
  • ipaddr – IP address of one of Exasol nodes received from pyexasol.ExaConnection.get_nodes()

  • port – Port of one of Exasol nodes received from pyexasol.ExaConnection.get_nodes()

  • compression – Use zlib compression for HTTP transport, must be the same as compression of main connection

  • encryption – Use SSL/TLS encryption for HTTP transport, must be the same as encryption of main connection

Info:

Compression and encryption arguments should match pyexasol.connect()

How to use:

  1. The parent process opens the main connection to Exasol with pyexasol.connect()

  2. The parent process creates any number of child processes (possibly on a remote host or in another container)

  3. Every child process starts an HTTP transport sub-connection with pyexasol.http_transport() and gets its “ipaddr:port” string from ExaHTTPTransportWrapper.address

  4. Every child process sends its address string to the parent process using any communication method (Pipe, Queue, Redis, etc.)

  5. The parent process runs .export_parallel() or .import_parallel(), which initiates the EXPORT or IMPORT query in Exasol

  6. Every child process receives or sends a chunk of data using ExaHTTPTransportWrapper.export_*() or .import_*()

  7. The parent process waits for the Exasol query and for all child processes to finish

All child processes should run in parallel. It is NOT possible to process some data first, and process some more data later.

If an exception is raised in a child process, it closes the pipe used for HTTP transport. Closing the pipe prematurely causes the SQL query to fail and raises an exception in the parent process. The parent process is responsible for stopping the other child processes and cleaning up.

PyEXASOL does not provide a complete solution to manage child processes, only examples. The final solution depends on your hardware, network configuration, cloud provider and container orchestration software.
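The workflow above can be sketched as follows. This is a hedged illustration, not a complete solution: the DSN, credentials, table name and output file paths are assumptions, and it needs a running Exasol cluster to actually execute.

```python
# Hedged sketch of parallel HTTP transport; requires a live Exasol cluster.
# DSN, credentials, table name and output paths are illustrative assumptions.
import multiprocessing

def child(node, address_queue):
    import pyexasol
    # Step 3: open an HTTP transport sub-connection on one Exasol node
    http = pyexasol.http_transport(node["ipaddr"], node["port"])
    # Steps 3-4: hand the internal "ipaddr:port" string back to the parent
    address_queue.put(http.address)
    # Step 6: receive this worker's chunk of the result set
    http.export_to_file(f"chunk_{node['idx']}.csv")

def parent():
    import pyexasol
    con = pyexasol.connect(dsn="exasol-host:8563", user="sys", password="exasol")  # step 1
    queue = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=child, args=(node, queue))
             for node in con.get_nodes()]                     # step 2
    for p in procs:
        p.start()
    addresses = [queue.get() for _ in procs]                  # step 4 (parent side)
    con.export_parallel(addresses, "SELECT * FROM users")     # step 5
    for p in procs:
        p.join()                                              # step 7
```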

class pyexasol.ExaConnection(dsn=None, user=None, password=None, schema='', autocommit=True, snapshot_transactions=None, connection_timeout=10, socket_timeout=30, query_timeout=0, compression=False, encryption=True, fetch_dict=False, fetch_mapper=None, fetch_size_bytes=5242880, lower_ident=False, quote_ident=False, json_lib='json', verbose_error=True, debug=False, debug_logdir=None, udf_output_bind_address=None, udf_output_connect_address=None, udf_output_dir=None, http_proxy=None, resolve_hostnames=True, client_name=None, client_version=None, client_os_username=None, protocol_version=3, websocket_sslopt=None, access_token=None, refresh_token=None)[source]

Bases: object

Warning

Threads may share the module, but not connections. One connection may be used by different threads, just not at the same time. pyexasol.ExaConnection.abort_query() is an exception; it is meant to be called from another thread.

Note

It is advisable to use multiprocessing instead of threading and create a new connection in each sub-process

Public Attributes:
attr:

Read-only dict of attributes of current connection.

login_info:

Read-only dict of login information returned by second response of LOGIN command.

options:

Read-only dict of arguments passed to pyexasol.ExaConnection.connect().

__init__(dsn=None, user=None, password=None, schema='', autocommit=True, snapshot_transactions=None, connection_timeout=10, socket_timeout=30, query_timeout=0, compression=False, encryption=True, fetch_dict=False, fetch_mapper=None, fetch_size_bytes=5242880, lower_ident=False, quote_ident=False, json_lib='json', verbose_error=True, debug=False, debug_logdir=None, udf_output_bind_address=None, udf_output_connect_address=None, udf_output_dir=None, http_proxy=None, resolve_hostnames=True, client_name=None, client_version=None, client_os_username=None, protocol_version=3, websocket_sslopt=None, access_token=None, refresh_token=None)[source]

Exasol connection object

Parameters:
  • dsn – Connection string, same format as standard JDBC / ODBC drivers (e.g. 10.10.127.1..11:8564)

  • user – Username

  • password – Password

  • schema – Open schema after connection (Default: ‘’, no schema)

  • autocommit – Enable autocommit on connection (Default: True)

  • snapshot_transactions – Explicitly enable or disable snapshot transactions on connection (Default: None, database default)

  • connection_timeout – Socket timeout in seconds used to establish connection (Default: 10)

  • socket_timeout – Socket timeout in seconds used for requests after connection was established (Default: 30)

  • query_timeout – Maximum execution time of queries before automatic abort, in seconds (Default: 0, no timeout)

  • compression – Use zlib compression both for WebSocket and HTTP transport (Default: False)

  • encryption – Use SSL to encrypt client-server communications for WebSocket and HTTP transport (Default: True)

  • fetch_dict – Fetch result rows as dicts instead of tuples (Default: False)

  • fetch_mapper – Use custom mapper function to convert Exasol values into Python objects during fetching (Default: None)

  • fetch_size_bytes – Maximum size of data message for single fetch request in bytes (Default: 5Mb)

  • lower_ident – Automatically lowercase identifiers (table names, column names, etc.) returned from relevant functions (Default: False)

  • quote_ident – Add double quotes and escape identifiers passed to relevant functions (export_*, import_*, ext.*, etc.) (Default: False)

  • json_lib – Supported values: rapidjson, ujson, orjson, json (Default: json)

  • verbose_error – Display additional information when error occurs (Default: True)

  • debug – Output debug information for client-server communication and connection attempts to STDERR

  • debug_logdir – Store debug information into files in debug_logdir instead of outputting it to STDERR

  • udf_output_bind_address – Specific server_address to bind TCP server for UDF script output (default: (‘’, 0))

  • udf_output_connect_address – Specific SCRIPT_OUTPUT_ADDRESS value to connect from Exasol to UDF script output server (default: inherited from TCP server)

  • udf_output_dir – Directory to store captured UDF script output logs, split by <session_id>_<statement_id>/<vm_num>

  • http_proxy – HTTP proxy string in Linux http_proxy format (default: None)

  • resolve_hostnames – Explicitly resolve host names to IP addresses before connecting. Deactivating this will let the operating system resolve the host name (default: True)

  • client_name – Custom name of client application displayed in Exasol sessions tables (Default: PyEXASOL)

  • client_version – Custom version of client application (Default: pyexasol.__version__)

  • client_os_username – Custom OS username displayed in Exasol sessions table (Default: getpass.getuser())

  • protocol_version – Major WebSocket protocol version requested for connection (Default: pyexasol.PROTOCOL_V3)

  • websocket_sslopt – Set custom SSL options for WebSocket client (Default: None)

  • access_token – OpenID access token to use for the login process

  • refresh_token – OpenID refresh token to use for the login process

abort_query()[source]

Abort running query

Warning

This function should be called from a separate thread and has no response. The response should be checked in the main thread which started the execution of the query.

There are three possible outcomes of calling this function:

  1. Query is aborted normally, connection remains active

  2. Query was stuck in a state which cannot be aborted, so Exasol has to terminate connection

  3. Query might be finished successfully before abort call had a chance to take effect
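The threading pattern can be sketched as below; the helper name and timeout value are illustrative, not part of the API.

```python
# Hedged sketch: schedule abort_query() from a separate thread, since it has
# no response of its own and must not run in the thread executing the query.
import threading

def abort_after(connection, seconds):
    """Call connection.abort_query() from a timer thread after `seconds`."""
    timer = threading.Timer(seconds, connection.abort_query)
    timer.daemon = True
    timer.start()
    return timer

# Usage with a real connection (requires a running Exasol instance):
# timer = abort_after(con, 30.0)
# try:
#     con.execute("SELECT ...")   # response is checked here, in the main thread
# finally:
#     timer.cancel()              # query finished before the abort fired
```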

close(disconnect=True)[source]

Closes connection to database.

Parameters:

disconnect – If True, send the optional “disconnect” command to free resources and properly close the session on the Exasol server side.

Note

Please note that “disconnect” should always be False when .close() is being called from .req()-like functions, to prevent an infinite loop if a websocket exception happens during handling of the “disconnect” command

cls_extension

alias of ExaExtension

cls_formatter

alias of ExaFormatter

cls_logger

alias of ExaLogger

cls_meta

alias of ExaMetaData

cls_statement

alias of ExaStatement

commit()[source]

Wrapper for query ‘COMMIT’

current_schema()[source]

Get the name of the current schema.

Returns:

Name of currently opened schema. Returns an empty string if no schema was opened.

execute(query, query_params=None) ExaStatement[source]

Execute SQL query with optional query formatting parameters

Parameters:
  • query – SQL query text, possibly with placeholders

  • query_params – Values for placeholders

Returns:

ExaStatement object

Examples

>>> con = ExaConnection(...)
>>> con.execute(
...        query="SELECT * FROM {table!i} WHERE col1={col1}",
...        query_params={'table': 'users', 'col1':'bar'}
...)
execute_udf_output(query, query_params=None)[source]

Execute SQL query with UDF script, capture output

Note

Exasol should be able to open a connection to the machine where the current script is running. This usually works within the same data centre, but normally does not work when running this function on a local laptop.

Parameters:
  • query – SQL query text, possibly with placeholders

  • query_params – Values for placeholders

Returns:

(1) instance of pyexasol.ExaStatement and (2) list of Path objects for script output log files.

Return type:

Tuple with two elements.

Examples

>>> con = ExaConnection(...)
>>> stmt, output_files = con.execute_udf_output(
...        query="SELECT * FROM {table!i} WHERE col1={col1}",
...        query_params={'table': 'users', 'col1':'bar'}
...)
export_parallel(exa_address_list, query_or_table, query_params=None, export_params=None)[source]

This function is part of HTTP Transport (parallel) API.

Parameters:
  • exa_address_list – List of ipaddr:port strings obtained from HTTP transport .address.

  • query_or_table – SQL query or table for the export.

  • query_params – Values for SQL query placeholders.

  • export_params – Custom parameters for Export query.

Note

  • Init HTTP transport in child processes first using pyexasol.http_transport()

  • Get internal Exasol address from each child process using .address

  • Pass address strings to parent process, combine into single list and use it for export_parallel() call

export_to_callback(callback, dst, query_or_table, query_params=None, callback_params=None, export_params=None)[source]

Export large amount of data to user-defined callback function

Parameters:
  • callback – Callback function

  • dst – Export destination passed to the callback function

  • query_or_table – SQL query or table for export.

  • query_params – Values for SQL query placeholders.

  • callback_params – Dict with additional parameters for the callback function.

  • export_params – Custom parameters for Export query.

Returns:

result of callback function

Warning

  • This function may run out of memory

Examples

>>> cb = lambda args: print(args)
>>> con = ExaConnection(...)
>>> con.export_to_callback(
...    callback=cb,
...    query_or_table="SELECT * FROM table"
... )
export_to_file(dst, query_or_table, query_params=None, export_params=None)[source]

Export large amount of data from Exasol to file or file-like object using fast HTTP transport.

Note

File must be opened in binary mode.

Parameters:
  • dst – Path to file or file-like object.

  • query_or_table – SQL query or table for export.

  • query_params – Values for SQL query placeholders.

  • export_params – Custom parameters for Export query.

Examples

>>> con = ExaConnection(...)
>>> with open('/tmp/file.csv', 'wb') as f:
...     con.export_to_file(
...         dst=f,
...         query_or_table="SELECT * FROM table"
...     )
export_to_list(query_or_table, query_params=None, export_params=None)[source]

Export large amount of data from Exasol to basic Python list using fast HTTP transport.

Parameters:
  • query_or_table – SQL query or table for export.

  • query_params – Values for SQL query placeholders.

  • export_params – Custom parameters for Export query.

Returns:

list of tuples

Warning

  • This function may run out of memory

Examples

>>> con = ExaConnection(...)
>>> myresult = con.export_to_list(
...    query_or_table="SELECT * FROM table"
... )
export_to_pandas(query_or_table, query_params=None, callback_params=None, export_params=None)[source]

Export large amount of data from Exasol to pandas.DataFrame.

Parameters:
  • query_or_table – SQL query or table for export.

  • query_params – Values for SQL query placeholders.

  • export_params – Custom parameters for Export query.

Returns:

instance of pandas.DataFrame

Warning

  • This function may run out of memory

Examples

>>> con = ExaConnection(...)
>>> myresult = con.export_to_pandas(
...    query_or_table="SELECT * FROM table"
... )
get_attr()[source]
get_nodes(pool_size=None)[source]

Return a list of currently active Exasol nodes, which is normally used for HTTP Transport (parallel).

Parameters:

pool_size – Return a list of the specified size.

Returns:

list of dictionaries describing active Exasol nodes

Note

Format: {'ipaddr': <ip_address>, 'port': <port>, 'idx': <incremental index of returned node>}

  • If pool_size is bigger than the number of nodes, the list will wrap around and nodes will repeat with different ‘idx’

  • If pool_size is omitted, every active node is returned once

  • It is useful to balance workload for parallel IMPORT and EXPORT

  • Exasol shuffles the list for every connection.
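The pool_size wrap-around behaviour can be pictured with a small pure-Python sketch. This mirrors the documented behaviour and is not the library implementation; in particular, the base value of 'idx' is an assumption here:

```python
# Pure-Python illustration of the documented get_nodes(pool_size=...) wrap-around.
# NOT the library implementation; the 'idx' base is an assumption.
def wrap_pool(nodes, pool_size=None):
    if pool_size is None:
        pool_size = len(nodes)                   # every active node once
    return [{**nodes[i % len(nodes)], "idx": i}  # wrap around, fresh 'idx'
            for i in range(pool_size)]

nodes = [{"ipaddr": "10.10.1.1", "port": 8563},
         {"ipaddr": "10.10.1.2", "port": 8563}]
pool = wrap_pool(nodes, pool_size=5)
print([n["ipaddr"] for n in pool])
# The first node repeats at positions 0, 2 and 4, each with a different 'idx'.
```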

import_from_callback(callback, src, table, callback_params=None, import_params=None)[source]

Import a large amount of data from a user-defined callback function.

Parameters:
  • callback – Callback function.

  • src – Source for the callback function.

  • table – Destination table for IMPORT.

  • callback_params – Dict with additional parameters for callback function

  • import_params – Custom parameters for IMPORT query.

Raises:

ValueError – callback argument isn’t callable.

import_from_file(src, table, import_params=None)[source]

Import a large amount of data from a file or file-like object.

Parameters:
  • src – Source file or file-like object.

  • table – Destination table for IMPORT.

  • import_params – Custom parameters for import query.

Note

File must be opened in binary mode.

import_from_iterable(src, table, import_params=None)[source]

Import a large amount of data from an iterable Python object.

Parameters:
  • src – Source object implementing __iter__. Iterator must return tuples of values.

  • table – Destination table for IMPORT.

  • import_params – Custom parameters for import query.
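Since the iterator must return tuples of values, a generator works well; a hedged sketch (table name and values are illustrative assumptions):

```python
# Hedged sketch: stream rows into Exasol from a generator without building
# the full dataset in memory. Table name and values are illustrative.
def row_generator(n):
    for i in range(n):
        yield (i, f"user_{i}")        # one tuple per row

rows = list(row_generator(3))
print(rows)

# With a live connection the generator is passed directly:
# con.import_from_iterable(row_generator(1_000_000), "users")
```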

import_from_pandas(src, table, callback_params=None, import_params=None)[source]

Import a large amount of data from pandas.DataFrame.

Parameters:
  • src – Source pandas.DataFrame instance.

  • table – Destination table for IMPORT.

  • import_params – Custom parameters for import query.

import_parallel(exa_address_list, table, import_params=None)[source]

This function is part of HTTP Transport (parallel) API.

Parameters:
  • exa_address_list – List of ipaddr:port strings obtained from HTTP transport .address.

  • table – Table to import to.

  • import_params – Custom parameters for import.

Note

  • Init HTTP transport in child processes first using pyexasol.http_transport()

  • Get internal Exasol address from each child process using .address

  • Pass address strings to parent process, combine into single list and use it for import_parallel() call

last_statement() ExaStatement[source]

Last created statement object

Returns:

last created statement.

Return type:

ExaStatement

Note

It is mainly used for HTTP transport to access internal IMPORT / EXPORT query, measure execution time and number of rows

Tip

It is useful when working with export_* and import_* functions, which normally return the result of a callback function instead of a statement object.

open_schema(schema)[source]

Wrapper for OPEN SCHEMA

Parameters:

schema – Schema name

protocol_version()[source]

Actual protocol version used by the established connection.

Returns:

0 if connection was not established yet (e.g. due to exception handling), otherwise protocol version as int.

Warning

The actual protocol version might be downgraded from the requested protocol version if the Exasol server does not support it

Note

The actual protocol version may be lower than the requested protocol version defined by the protocol_version connection option. For further details, refer to WebSocket protocol versions.

req(req)[source]

Send WebSocket request and wait for response

rollback()[source]

Wrapper for query ‘ROLLBACK’

session_id()[source]

Session id of current session.

Returns:

Unique SESSION_ID of the current session as string.

set_attr(new_attr)[source]
set_autocommit(val)[source]

Set autocommit mode.

Parameters:

val – Set False to execute the following statements in a transaction. Set True to return to automatic COMMIT after each statement.

Note

Autocommit is True by default because Exasol has to commit indexes and statistics objects even for pure SELECT statements. Lack of default COMMIT may lead to serious performance degradation.
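A hedged sketch of a manual transaction built on set_autocommit(); the table, columns and helper name are illustrative assumptions, not part of the API:

```python
# Illustrative transaction pattern: disable autocommit, commit on success,
# roll back on failure, then restore the default mode. Not library code.
def transfer_funds(con, amount, src_id, dst_id):
    con.set_autocommit(False)             # following statements share one transaction
    try:
        con.execute("UPDATE accounts SET balance = balance - {a} WHERE id = {s}",
                    {"a": amount, "s": src_id})
        con.execute("UPDATE accounts SET balance = balance + {a} WHERE id = {d}",
                    {"a": amount, "d": dst_id})
        con.commit()                      # both updates become visible together
    except Exception:
        con.rollback()                    # undo the partial transaction
        raise
    finally:
        con.set_autocommit(True)          # restore automatic COMMIT
```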

set_query_timeout(val)[source]

Set the maximum time in seconds for which a query can run before Exasol kills it automatically.

Parameters:

val – Timeout value in seconds. Set value 0 to disable timeout.

Note

It is highly recommended to set timeout for UDF scripts to avoid potential infinite loops and very long transactions.

threadsafety = 1
class pyexasol.ExaStatement(connection, query=None, query_params=None, prepare=False, meta_nosql=False, **options)[source]

Bases: object

This class executes and helps to fetch result set of single Exasol SQL statement.

Warning

Unlike typical Cursor object, ExaStatement is not reusable.

Note

pyexasol.ExaStatement may fetch result set rows as tuples (default) or as dict (set fetch_dict=True in connection options).

pyexasol.ExaStatement may use custom data-type mapper during fetching (set fetch_mapper=<func> in connection options). Mapper function accepts two arguments (raw value and dataType object) and returns custom object or value.

pyexasol.ExaStatement fetches big result sets in chunks. The size of chunk may be adjusted (set fetch_size_bytes=<int> in connection options).

Public Attributes:
execution_time:

Execution time of the SQL statement. It is measured by the wall-clock time of the WebSocket request, so the real execution time is slightly shorter.

__init__(connection, query=None, query_params=None, prepare=False, meta_nosql=False, **options)[source]
Parameters:
  • connection

  • query

  • query_params

  • prepare

  • meta_nosql

  • options – additional kwargs

__iter__()[source]

The best way to fetch the result set of a statement is to use the iterator:

Yields:

tuple or dict depending on fetch_dict connection option.

Examples

>>> st = pyexasol.execute('SELECT * FROM table')
... for row in st:
...     print(row)
close()[source]

Closes result set handle if it was opened.

Warning

You won’t be able to fetch the next chunk of a large dataset after calling this function, but there are no other side effects.

column_names()[source]

List of column names.

columns()[source]

Retrieves column information of returned data.

Returns:

A dict with keys as column names and values as dataType objects.

Notes

The dict will contain the following data:

  • type (string) – column data type

  • precision (number) – (optional) column precision

  • scale (number) – (optional) column scale

  • size (number) – (optional) maximum size in bytes of a column value

  • characterSet (string) – (optional) character encoding of a text column

  • withLocalTimeZone (true, false) – (optional) specifies if a timestamp has a local time zone

  • fraction (number) – (optional) fractional part of number

  • srid (number) – (optional) spatial reference system identifier

execute_prepared(data=None)[source]
fetchall()[source]

Fetches all remaining rows.

Returns:

list of tuples or list of dict. Empty list if all rows were fetched previously.

Warning

This function may exhaust available memory.

fetchcol()[source]

Fetches all values from the first column.

Returns:

list of values. Empty list if all rows were fetched previously.

fetchmany(size=10000)[source]

Fetch multiple rows.

Parameters:

size – Set the specific number of rows to fetch (Default: 10000)

Returns:

list of tuples or list of dict. Empty list if all rows were fetched previously.

fetchone()[source]

Fetches one row of data.

Returns:

tuple or dict. None if all rows were fetched.

fetchval()[source]

Fetches first column of first row.

Returns:

Value, None if all rows were fetched previously.

Tip

This may be useful for queries returning single value like SELECT count(*) FROM table.

rowcount()[source]

Number of selected/processed rows.

Returns:

Total amount of selected rows for statements with result set (num_rows). Total amount of processed rows for DML queries (row_count).

class pyexasol.ExaFormatter(connection)[source]

Bases: Formatter

pyexasol.ExaFormatter is a subclass of string.Formatter designed to prevent SQL injections in Exasol dynamic SQL queries.

Note:

It introduces a set of placeholders to prevent SQL injections specifically in Exasol dynamic SQL queries. It also completely disables the format_spec section of standard formatting, since it has no use in the context of SQL queries and may cause more harm than good.

You may access these functions using the .format property of the connection object.

Examples:

>>> C = pyexasol.connect(...)
... print(C.format.escape('abc'))
convert_field(value, conversion)[source]
classmethod escape(val)[source]

Takes a raw value and converts it into an escaped string.

Parameters:

val – Value to be escaped.

Returns:

A string where all single quotes ' have been replaced with two single quotes ''.

classmethod escape_ident(val)[source]

Takes a raw value and converts it into an escaped string.

Parameters:

val – Value to be escaped.

Returns:

A string where all double quotes " have been replaced with two double quotes "".

classmethod escape_like(val)[source]

Escape LIKE-patterns.

Parameters:

val – Value to be escaped.

Returns:

A string where all backslashes \ have been replaced with \\, all % have been replaced with \%, and all _ have been replaced with \_.

format_field(value, format_spec)[source]
classmethod quote(val)[source]

Escapes a string using pyexasol.ExaFormatter.escape() and wraps it in single quotes '.

classmethod quote_ident(val)[source]

Escapes an object or a tuple of objects using pyexasol.ExaFormatter.escape_ident() and wraps it in double quotes ".

Parameters:

val – Raw identifier(s) to be escaped.

Returns:

The formatted and quoted identifier, or joined identifiers if a tuple was provided.

Return type:

str
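The escape rules above can be reproduced in a few lines of plain Python; this sketch mirrors the documented behaviour and is not the library's implementation (use con.format / ExaFormatter in real code):

```python
# Plain-Python mirror of the documented ExaFormatter escape rules
# (for illustration only; use con.format / ExaFormatter in real code).
def escape(val):
    return str(val).replace("'", "''")          # ' -> ''

def escape_ident(val):
    return str(val).replace('"', '""')          # " -> ""

def escape_like(val):
    return (str(val).replace("\\", "\\\\")      # \ -> \\
                    .replace("%", "\\%")        # % -> \%
                    .replace("_", "\\_"))       # _ -> \_

def quote(val):
    return f"'{escape(val)}'"                   # wrap escaped value in single quotes

print(quote("it's"))
```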

classmethod safe_decimal(val)[source]

Convert a decimal safely to string.

Parameters:

val – Decimal value to convert.

Returns:

The validated value as a string.

Raises:

ValueError – If value is not valid.

safe_decimal_regexp = re.compile('^(\\+|-)?[0-9]+(\\.[0-9]+)?$')
classmethod safe_float(val)[source]

Convert a float safely to string.

Parameters:

val – Float value to convert.

Returns:

The validated value as a string.

Raises:

ValueError – If value is not valid, e.g.: +infinity or -infinity.

safe_float_regexp = re.compile('^(\\+|-)?[0-9]+(\\.[0-9]+((e|E)(\\+|-)[0-9]+)?)?$')
classmethod safe_ident(val)[source]

Convert a raw identifier safely.

Parameters:

val (str or tuple) – Raw identifier(s).

Returns:

The validated identifier as a string.

Raises:

ValueError – If the passed value is not a valid identifier (e.g. contains spaces)

Warning

The identifier is put into the SQL query without any quoting.

safe_ident_regexp = re.compile('^[A-Za-z_][A-Za-z0-9_]*$')
class pyexasol.ExaMetaData(connection)[source]

Bases: object

This class implements lock-free meta data requests using /*snapshot execution*/ SQL hint described in IDEA-476.

Note

If you still get locks, please make sure to update Exasol server to the latest minor version

Examples

You may access these functions using the .meta property of the connection object.

>>> C = pyexasol.connect(...)
... print(C.meta.sql_columns('SELECT 1 AS id'))
execute_meta_nosql(meta_command, meta_params=None)[source]

Execute a no-SQL metadata command introduced in Exasol 7.0+

Parameters:
  • meta_command – Metadata command.

  • meta_params – Parameters for metadata command.

Returns:

ExaStatement

Note

This feature requires WebSocket protocol v2 or higher

A list of available commands can be found in the Exasol WebSocket API documentation.

execute_snapshot(query, query_params=None)[source]

Execute query in snapshot transaction mode using SQL hint

Parameters:
  • query – SQL query text, possibly with placeholders.

  • query_params – Values for placeholders.

Returns:

ExaStatement

Warning

Please do not try to query normal tables with this method. It will fail during creation of indices or statistics objects.

Note

fetch_dict=True is enforced to prevent users from relying on the order of columns in system views

list_columns(column_schema_pattern='%', column_table_pattern='%', column_object_type_pattern='%', column_name_pattern='%')[source]

List Columns.

Parameters:
  • column_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')

  • column_table_pattern – Table name or LIKE-pattern to filter on. (default: '%')

  • column_object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')

  • column_name_pattern – Column name or LIKE-pattern to filter on. (default: '%')

Returns:

List of columns from EXA_ALL_COLUMNS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_indices(index_schema_pattern='%', index_table_pattern='%', index_owner_pattern='%')[source]

List indices.

Parameters:
  • index_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')

  • index_table_pattern – Table name or LIKE-pattern to filter on. (default: '%')

  • index_owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')

Returns:

List of indices with sizes from EXA_ALL_INDICES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_object_sizes(object_name_pattern='%', object_type_pattern='%', owner_pattern='%', root_name_pattern='%')[source]

List Objects with their respective size.

Parameters:
  • object_name_pattern – Object name or LIKE-pattern to filter on. (default: '%')

  • object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')

  • owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')

  • root_name_pattern – Root name or LIKE-pattern to filter on. It normally refers to schema name. (default: '%')

Returns:

List of objects with sizes from EXA_ALL_OBJECT_SIZES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_objects(object_name_pattern='%', object_type_pattern='%', owner_pattern='%', root_name_pattern='%')[source]

List Objects.

Parameters:
  • object_name_pattern – Object name or LIKE-pattern to filter on. (default: '%')

  • object_type_pattern – Object type or LIKE-pattern to filter on. (default: '%')

  • owner_pattern – Owner name or LIKE-pattern to filter on. (default: '%')

  • root_name_pattern – Root name or LIKE-pattern to filter on. It normally refers to schema name. (default: '%')

Returns:

List of objects from EXA_ALL_OBJECTS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_schemas(schema_name_pattern='%')[source]

List Schemas.

Parameters:

schema_name_pattern – Schema name or LIKE-pattern to filter on. (default: '%')

Returns:

List of schemas from EXA_SCHEMAS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_sql_keywords()[source]

Get reserved SQL keywords which cannot be used as identifiers without double-quote escaping

Returns:

List of SQL keywords from EXA_SQL_KEYWORDS system view.

Warning

Never hardcode this list! It might change with the next Exasol server version without warning

Note

These keywords cannot be used as identifiers without double quotes.

list_tables(table_schema_pattern='%', table_name_pattern='%')[source]

List Tables.

Parameters:
  • table_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')

  • table_name_pattern – Table name or LIKE-pattern to filter on. (default: '%')

Returns:

List of tables from EXA_ALL_TABLES system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

list_views(view_schema_pattern='%', view_name_pattern='%')[source]

List Views.

Parameters:
  • view_schema_pattern – Schema name or LIKE-pattern to filter on. (default: '%')

  • view_name_pattern – View name or LIKE-pattern to filter on. (default: '%')

Returns:

List of views from EXA_ALL_VIEWS system view matching LIKE-pattern.

Note

Patterns are case-sensitive. You may escape LIKE-patterns.

schema_exists(schema_name)[source]

Check if schema exists.

Parameters:

schema_name – Name of the schema to check.

Returns:

True if the schema exists, otherwise False.

snapshot_execution_hint = '/*snapshot execution*/'
sql_columns(query, query_params=None)[source]

Get result set columns of SQL query without executing it

Parameters:
  • query – SQL query text, possibly with placeholders.

  • query_params – Values for placeholders.

Returns:

Columns of SQL query result without executing it. Output format is similar to pyexasol.ExaStatement.columns().

table_exists(table_name)[source]

Check if table exists.

Parameters:

table_name – Name of the table to check for. If schema was not specified, current_schema is used.

Returns:

True if the table exists, otherwise False.

view_exists(view_name)[source]

Check if view exists.

Parameters:

view_name – Name of the view to check for. If schema was not specified, current_schema is used.

Returns:

True if the view exists, otherwise False.
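The *_exists helpers combine naturally with DDL; a hedged sketch (the schema name and helper name are illustrative; the {s!i} placeholder follows the identifier formatting convention shown earlier):

```python
# Hedged sketch: create a schema only when it does not exist yet.
# Schema name is illustrative; requires a live connection to run.
def ensure_schema(con, schema_name):
    if not con.meta.schema_exists(schema_name):
        con.execute("CREATE SCHEMA {s!i}", {"s": schema_name})

# ensure_schema(con, "STAGING")
```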

class pyexasol.ExaExtension(connection)[source]

Bases: object

This class extends the functionality of a simple SQL driver to address common Exasol-related problems.

Tip

You may access these functions using the .ext property of the connection object.

Examples

>>> C = pyexasol.connect(...)
... print(C.ext.get_disk_space_usage())
explain_last(details=False)[source]
Parameters:

details (bool) –

  • If False, the function returns the average (AVG) or maximum (MAX) values aggregated over all Exasol nodes.

  • If True, the function returns separate rows for each individual Exasol node, with a column labeled “iproc” identifying the node.

Returns:

Profiling information for last executed query.

Note

This function should be called immediately after execute(). COMMIT, ROLLBACK and FLUSH STATISTICS queries are ignored.

Tip

Details are useful to detect bad data distribution and imbalanced execution. If you want to see real values of CPU, MEM, HDD, NET columns, please enable Exasol profiling first with:

ALTER SESSION SET PROFILE = 'ON';

Please refer to Exasol User Manuals for explanations about profiling columns.

export_to_pandas_with_dtype(query_or_table, query_params=None)[source]

Export to pandas and attempt to guess correct dtypes based on Exasol columns.

Parameters:
  • query_or_table – Query or table to export.

  • query_params – Additional query parameters.

Note

Since pandas supports a significantly smaller range of values, this function makes many assumptions. Please use it as a baseline for your own function in complex cases.

Exasol Type      Pandas Type
---------------  --------------
Small decimal    int32
Big decimal      int64
Double           float64
Date, Timestamp  datetime64[ns]
Everything else  category (!)
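The mapping can be restated as a small function. This is NOT pyexasol's actual implementation; in particular, the 9-digit precision cut-off between "small" and "big" decimals is our assumption for illustration:

```python
# Illustrative restatement of the dtype mapping table above.
# NOT pyexasol's implementation; the precision threshold is an assumption.

def guess_pandas_dtype(sql_type, precision=0, scale=0):
    if sql_type == "DECIMAL" and scale == 0:
        # int32 holds ~9 decimal digits; wider decimals need int64
        return "int32" if precision <= 9 else "int64"
    if sql_type == "DOUBLE":
        return "float64"
    if sql_type in ("DATE", "TIMESTAMP"):
        return "datetime64[ns]"
    return "category"  # everything else
```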

get_columns(object_name)[source]

Get information about columns of table or view.

Parameters:

object_name – Object name may be passed as tuple to specify custom schema.

Caution

DEPRECATED, please use .meta.sql_columns instead.

get_columns_sql(query, query_params=None)[source]

Get columns of SQL query without executing it.

Parameters:
  • query – SQL query text, possibly with placeholders.

  • query_params – Values for placeholders.

Caution

DEPRECATED, please use .meta.sql_columns instead.

Note

It relies on a prepared statement, which is closed immediately without execution.

get_disk_space_usage()[source]

Get the disk space usage of the Exasol DB.

Returns:

A dict with 4 keys, providing all disk space details.

Key                    Description
---------------------  ------------------------------------------
occupied_size          How much space is occupied (in bytes)
free_size              How much space is available (in bytes)
total_size             occupied_size + free_size
occupied_size_percent  Percentage of occupied disk space (0-100%)

Note

Exasol still lacks a standard function to measure actual disk space usage. We are trying to mitigate this problem by creating a custom function.
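A small formatting helper for the returned dict, as a sketch; the key names follow the table above, and the helper name is ours, not part of the API:

```python
# Sketch: human-readable summary of the dict returned by
# .ext.get_disk_space_usage(). `format_disk_usage` is hypothetical.

def format_disk_usage(usage):
    gib = 1024 ** 3  # sizes are reported in bytes
    return (f"{usage['occupied_size'] / gib:.1f} GiB used of "
            f"{usage['total_size'] / gib:.1f} GiB "
            f"({usage['occupied_size_percent']:.0f}% occupied)")
```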

get_reserved_words()[source]

Get reserved keywords which cannot be used as identifiers without double-quotes.

Caution

DEPRECATED, please use .meta.list_sql_keywords instead.

Warning

Never hard-code this list! It changes with every Exasol version.

get_sys_columns(object_name)[source]

Get information about columns of table or view (SYS format)

Parameters:

object_name – Object name may be passed as tuple to specify custom schema.

Caution

DEPRECATED, please use .meta.list_columns instead.

get_sys_schemas(schema_name_prefix='')[source]

Get information about schemas (SYS format)

Parameters:

schema_name_prefix – Output may be optionally filtered by schema name prefix

Caution

DEPRECATED, please use .meta.list_schemas instead.

get_sys_tables(schema=None, table_name_prefix='')[source]

Get information about tables in the selected schema (SYS format).

Parameters:
  • schema – Schema to inspect. If schema was not specified, current_schema is used.

  • table_name_prefix – Output may be optionally filtered by table name prefix.

Caution

DEPRECATED, please use .meta.list_tables instead.

get_sys_views(schema=None, view_name_prefix='')[source]

Get information about views in the selected schema (SYS format).

Parameters:
  • schema – Schema to inspect. If schema was not specified, current_schema is used.

  • view_name_prefix – Output may be optionally filtered by view name prefix.

Caution

DEPRECATED, please use .meta.list_views instead.

insert_multi(table_name, data, columns=None)[source]

Insert a small number of rows into a table using a prepared statement.

Parameters:
  • table_name – Target table for INSERT.

  • data – Source object implementing __iter__ (e.g. list or tuple).

  • columns – List of column names to specify custom order of columns.

Tip

Compared to .import_from_iterable, this method offers better performance for small data sets of 10,000 rows or fewer.

  • Use .import_from_iterable for larger data sets and better memory efficiency

  • Use .import_from_pandas to import from data frame regardless of its size

You may use the “columns” argument to specify a custom order of columns for insertion. If some columns are not included in this list, NULL or DEFAULT values will be used instead.

Note

Please note that data should be presented in a row format. You may use zip(*data_cols) to convert columnar format into row format.
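The columnar-to-row conversion mentioned in the note can be sketched as follows; the table and column names in the commented call are hypothetical:

```python
# Columnar source data: one sequence per column.
user_ids = [1, 2, 3]
user_names = ["Alice", "Bob", "Carol"]

# zip(*data_cols) converts columnar format into the row format
# that insert_multi() expects.
data_cols = [user_ids, user_names]
rows = list(zip(*data_cols))
# rows == [(1, 'Alice'), (2, 'Bob'), (3, 'Carol')]

# With a live connection (hypothetical table and column names):
# C.ext.insert_multi("users", rows, columns=["user_id", "user_name"])
```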

class pyexasol.ExaHTTPTransportWrapper(ipaddr, port, compression=False, encryption=True)[source]

Bases: object

Wrapper for HTTP Transport (parallel).

You may create this wrapper using pyexasol.http_transport().

Note

Starts an HTTP server, obtains the address (the "ipaddr:port" string), and sends it to the parent process.

It then blocks in the export_*() or import_*() call, waits for an incoming connection, processes the data and exits.

property exa_address

Internal Exasol address as ipaddr:port string.

Note

This string should be passed from child processes to parent process and used as an argument for export_parallel() and import_parallel() functions.

export_to_callback(callback, dst, callback_params=None)[source]

Export a chunk of data using a callback function.

Parameters:
  • callback – Callback function.

  • dst – Export destination for callback function.

  • callback_params – Dict with additional parameters for callback function.

Returns:

Result of the callback function.

Note

You may use exactly the same callbacks utilized by standard non-parallel export_to_callback() function.
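The child-process side of a parallel EXPORT can be sketched as below. This assumes `worker` is an ExaHTTPTransportWrapper from pyexasol.http_transport() and that, as with the non-parallel callbacks, the callback receives a file-like `pipe` streaming raw CSV; the helper and callback names are ours:

```python
import io

def child_export(worker, dst):
    """Child-process side of a parallel EXPORT (sketch, not part of pyexasol)."""
    def save_chunk(pipe, dst, **kwargs):
        # Copy this node's chunk of raw CSV to the destination object.
        dst.write(pipe.read())
        return dst

    return worker.export_to_callback(save_chunk, dst)
```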

get_proxy()[source]

Caution

DEPRECATED, please use .exa_address property

import_from_callback(callback, src, callback_params=None)[source]

Import a chunk of data using a callback function.

Parameters:
  • callback – Callback function.

  • src – Import source for the callback function.

  • callback_params – Dict with additional parameters for the callback function.

Returns:

Result of the callback function.

Note

You may use exactly the same callbacks utilized by standard non-parallel import_from_callback() function.