HTTP Transport

The main purpose of HTTP transport is to reduce the massive fetching overhead associated with large data sets (1M+ rows). It relies on the native Exasol commands EXPORT and IMPORT, which are specifically designed to move large amounts of data. Data is transferred in CSV format with optional zlib compression.

This is a powerful tool that bypasses the creation of intermediate Python objects altogether and dramatically increases performance.

PyEXASOL offloads HTTP communication and decompression to a separate thread using the threading module. The main thread deals with a simple pipe opened in binary mode.

You may specify a custom callback function to read from or write to the raw pipe and apply complex logic. Use callback_params to pass additional parameters to the callback function (e.g. options for pandas).

You may also specify import_params or export_params to alter the IMPORT or EXPORT query and modify the CSV data stream.

Pre-defined Functions

Export from Exasol to pandas

Export data from Exasol into a pandas.DataFrame. You may use the callback_params argument to pass custom options to the pandas read_csv function.

# Read from SQL
df = C.export_to_pandas("SELECT * FROM users")

# Read from table
df = C.export_to_pandas("users")
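
For example, a minimal sketch of passing extra read_csv options via callback_params (the specific pandas options below are only an illustration):

# Read all columns as strings and treat 'N/A' as NULL on the pandas side
df = C.export_to_pandas("users", callback_params={'dtype': str, 'na_values': ['N/A']})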

Import from pandas to Exasol

Import data from a pandas.DataFrame into an Exasol table. You may use the callback_params argument to pass custom options to the pandas to_csv function.

C.import_from_pandas(df, "users")
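
For example, a minimal sketch of passing extra to_csv options via callback_params (the date_format value is only an illustration):

# Format datetime columns explicitly while writing the CSV stream
C.import_from_pandas(df, "users", callback_params={'date_format': '%Y-%m-%d'})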

Import from list (à la INSERT)

my_list = [
    (1, 'Bob', False, '2018-01-01'),
    (2, 'Gill', True, '2018-02-01'),
]

C.import_from_iterable(my_list, "users")

Import from generator

This function is suitable for very large INSERTs: the generator yields rows one by one, so the full data set never has to be held in memory.

def my_generator():
    for i in range(5):
        yield (i, 'Bob', True, '2017-01-01')

C.import_from_iterable(my_generator(), "users")

Import from file

Import data from a file, path object, or file-like object opened in binary mode. You may import from process STDIN using sys.stdin.buffer.

# Import from file defined with string path
C.import_from_file('/test/my_file.csv', "users")

# Import from path object
C.import_from_file(pathlib.Path('/test/my_file.csv'), "users")

# Import from opened file
with open('/test/my_file.csv', 'rb') as file:
    C.import_from_file(file, "users")

# Import from STDIN
C.import_from_file(sys.stdin.buffer, "users")

Export to file

Export data from Exasol into a file, path object, or file-like object opened in binary mode. You may export to process STDOUT using sys.stdout.buffer.

# Export into file defined with string path
C.export_to_file('my_file.csv', "users")

# Export into STDOUT
C.export_to_file(sys.stdout.buffer, "users")

Parameters

Please refer to the Exasol User Manual to learn more about IMPORT and EXPORT parameters.

import_params

Name                Example                               Description
column_separator    ,                                     Column separator for CSV
column_delimiter                                          Column delimiter for CSV (quoting)
columns             ['id', 'name']                        List of table columns in the data source, useful if the column order of the data source does not match the column order of the Exasol table
csv_cols            ["1..5", "6 FORMAT='999.99'", "8"]    List of CSV columns with optional numeric or date formats
row_separator       LF                                    Row separator for CSV (line endings)
encoding            UTF8                                  File encoding
null                N                                     Custom NULL value
skip                1                                     How many first rows to skip, useful for skipping a header
trim                TRIM                                  Trim mode: TRIM, RTRIM, LTRIM
format              gz                                    Import file or stream compressed with gz, bzip2, zip
comment             This is a query description           Add a comment before the beginning of the query
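
For example, a sketch combining several import_params in a single call (the file path and column list are placeholders):

# Skip the header line and map the CSV columns onto specific table columns
C.import_from_file('/test/my_file.csv', "users", import_params={
    'skip': 1,
    'columns': ['id', 'name'],
})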

export_params

Name                Example                               Description
column_separator    ,                                     Column separator for CSV
column_delimiter                                          Column delimiter for CSV (quoting)
columns             ['id', 'name']                        List of table columns, useful to reorder table columns during export from table
csv_cols            ["1..5", "6 FORMAT='999.99'", "8"]    List of CSV columns with optional numeric or date formats
row_separator       LF                                    Row separator for CSV (line endings)
encoding            UTF8                                  File encoding
with_column_names   True                                  Add column names as the first line, useful for Pandas
null                N                                     Custom NULL value
delimit             AUTO                                  Delimiter mode: AUTO, ALWAYS, NEVER
format              gz                                    Export file or stream compressed with gz, bzip2, zip
comment             This is a query description           Add a comment before the beginning of the query
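
Similarly, a sketch combining several export_params, assuming export_to_file accepts an export_params keyword in the same way import_from_file accepts import_params (the file name is a placeholder):

# Write a gzip-compressed CSV with column names as the first line
C.export_to_file('my_file.csv.gz', "users", export_params={
    'with_column_names': True,
    'format': 'gz',
})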

The comment parameter for adding comments to queries

For any export_* or import_* call, you can add a comment that will be inserted before the beginning of the query.

This can be used for profiling and auditing. Example:

C.import_from_file('/test/my_file.csv', 'users', import_params={'comment': '''
This comment will be inserted before the query.
This query is importing users from CSV.
'''})

The comment is inserted as a block comment (/* <comment> */). The block comment closing sequence (*/) is forbidden in the comment.

How to write custom EXPORT / IMPORT functions

A full collection of pre-defined callback functions is available in the callback.py module.

Example of a callback exporting into a basic Python list.

import csv
import io

# Define callback function
def export_to_list(pipe, dst, **kwargs):
    # Wrap the raw binary pipe into a text stream and parse it as CSV
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    reader = csv.reader(wrapped_pipe, lineterminator='\n', **kwargs)

    return [row for row in reader]

# Run EXPORT using the defined callback function
C.export_to_callback(export_to_list, None, 'my_table')

Example of a callback importing from Pandas into Exasol.

import csv
import io

df = <pandas.DataFrame>

def import_from_pandas(pipe, src, **kwargs):
    # Wrap the raw binary pipe into a text stream and write the DataFrame as CSV
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    return src.to_csv(wrapped_pipe, header=False, index=False, quoting=csv.QUOTE_NONNUMERIC, **kwargs)

# Run IMPORT using the defined callback function
C.import_from_callback(import_from_pandas, df, 'my_table')