HTTP Transport¶
The main purpose of HTTP transport is to reduce the fetching overhead associated with large data sets (1M+ rows). It relies on the native Exasol commands EXPORT and IMPORT, which are specifically designed to move large amounts of data. Data is transferred in CSV format with optional zlib compression.
This is a powerful tool that bypasses the creation of intermediate Python objects altogether and dramatically improves performance.
PyEXASOL offloads HTTP communication and decompression to a separate thread using the threading module. The main thread deals with a simple pipe opened in binary mode.
You may specify a custom callback function to read from or write to the raw pipe and to apply more complex logic. Use callback_params to pass additional parameters to the callback function (e.g. options for pandas).
You may also specify import_params or export_params to alter the IMPORT or EXPORT query and modify the CSV data stream.
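For example, the following sketch (the file path and table name are illustrative) skips a CSV header line and reads a gzip-compressed file by passing import_params to the pre-defined import_from_file function described below; both parameters are listed in the tables further down:
# Sketch: skip the header line and read a gzip-compressed file
C.import_from_file('/test/my_file.csv.gz', 'users', import_params={'skip': 1, 'format': 'gz'})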
Pre-defined Functions¶
Export from Exasol to pandas¶
Export data from Exasol into a pandas.DataFrame. You may use the callback_params argument to pass custom options for the pandas read_csv function.
# Read from SQL
pd = C.export_to_pandas("SELECT * FROM users")
# Read from table
pd = C.export_to_pandas("users")
Import from pandas to Exasol¶
Import data from a pandas.DataFrame into an Exasol table. You may use the callback_params argument to pass custom options for the pandas to_csv function.
C.import_from_pandas(pd, "users")
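Similarly, the following sketch forwards a to_csv option through callback_params (again assuming the options reach pandas unchanged):
# Sketch: control how datetime columns are written by forwarding a to_csv option
C.import_from_pandas(pd, "users", callback_params={'date_format': '%Y-%m-%d'})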
Import from list (a-la INSERT)¶
my_list = [
    (1, 'Bob', False, '2018-01-01'),
    (2, 'Gill', True, '2018-02-01'),
]

C.import_from_iterable(my_list, "users")
Import from generator¶
This function is suitable for very large INSERTs, as long as the generator yields rows one by one and does not run out of memory.
def my_generator():
    for i in range(5):
        yield (i, 'Bob', True, '2017-01-01')

C.import_from_iterable(my_generator(), "users")
Import from file¶
Import data from a file, path object, or file-like object opened in binary mode. You may import from process STDIN using sys.stdin.buffer.
# Import from file defined with string path
C.import_from_file('/test/my_file.csv', "users")
# Import from path object
C.import_from_file(pathlib.Path('/test/my_file.csv'), "users")
# Import from opened file
file = open('/test/my_file.csv', 'rb')
C.import_from_file(file, "users")
file.close()
# Import from STDIN
C.import_from_file(sys.stdin.buffer, "users")
Export to file¶
Export data from Exasol into a file, path object, or file-like object opened in binary mode. You may export to process STDOUT using sys.stdout.buffer.
# Export into file defined with string path
C.export_to_file('my_file.csv', "users")
# Export into STDOUT
C.export_to_file(sys.stdout.buffer, "users")
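The following sketch combines export_to_file with export_params (assuming export_to_file accepts an export_params dictionary, analogous to the import_params example further down); the individual parameters are listed in the tables below:
# Sketch: compressed export with column names as the first line
C.export_to_file('my_file.csv.gz', "users", export_params={'format': 'gz', 'with_column_names': True})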
Parameters¶
Please refer to the Exasol User Manual to learn more about IMPORT and EXPORT parameters.
import_params¶
Name | Example | Description
---|---|---
column_separator | , | Column separator for CSV
column_delimiter | " | Column delimiter for CSV (quoting)
columns | ['id', 'name'] | List of table columns in the data source, useful if the column order of the data source does not match the column order of the Exasol table
csv_cols | ["1..5", "6 FORMAT='999.99'", "8"] | List of CSV columns with optional formats
row_separator | LF | Row separator for CSV (line endings)
encoding | UTF8 | File encoding
skip | 1 | Number of initial rows to skip, useful for skipping a header
null | \N | Custom NULL value
trim | TRIM | Trim mode: TRIM, RTRIM, LTRIM
format | gz | Import file or stream compressed with gz, bzip2, zip
comment | This is a query description | Add a comment before the beginning of the query
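For example, the following sketch (the file path and table name are illustrative) uses the columns parameter for a data source whose column order does not match the Exasol table:
# Sketch: the CSV file stores name first, then id
C.import_from_file('/test/my_file.csv', 'users', import_params={'columns': ['name', 'id']})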
export_params¶
Name | Example | Description
---|---|---
column_separator | , | Column separator for CSV
column_delimiter | " | Column delimiter for CSV (quoting)
columns | ['id', 'name'] | List of table columns, useful to reorder table columns during export from a table
csv_cols | ["1..5", "6 FORMAT='999.99'", "8"] | List of CSV columns with optional formats
row_separator | LF | Row separator for CSV (line endings)
encoding | UTF8 | File encoding
with_column_names | True | Add column names as the first line, useful for Pandas
null | \N | Custom NULL value
delimit | AUTO | Delimiter mode: AUTO, ALWAYS, NEVER
format | gz | Export file or stream compressed with gz, bzip2, zip
comment | This is a query description | Add a comment before the beginning of the query
The comment parameter, for adding comments to queries¶
For any export_* or import_* call, you can add a comment that will be inserted before the beginning of the query.
This can be used for profiling and auditing. Example:
C.import_from_file('/test/my_file.csv', 'users', import_params={'comment': '''
This comment will be inserted before the query.
This query is importing user from CSV.
'''})
The comment is inserted as a block comment (/* <comment> */). The block comment closing sequence (*/) is forbidden in the comment.
How to write custom EXPORT / IMPORT functions¶
A full collection of pre-defined callback functions is available in the callback.py module.
Example of a callback exporting into a basic Python list.
import csv
import io

# Define callback function
def export_to_list(pipe, dst, **kwargs):
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    reader = csv.reader(wrapped_pipe, lineterminator='\n', **kwargs)
    return [row for row in reader]

# Run EXPORT using the defined callback function
my_list = C.export_to_callback(export_to_list, None, 'my_table')
Example of a callback importing from Pandas into Exasol.
df = ...  # pandas.DataFrame with the data to import

def import_from_pandas(pipe, src, **kwargs):
    wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
    return src.to_csv(wrapped_pipe, header=False, index=False, quoting=csv.QUOTE_NONNUMERIC, **kwargs)

# Run IMPORT using the defined callback function
C.import_from_callback(import_from_pandas, df, 'my_table')