Getting Started

Welcome to PyExasol

PyExasol is the officially supported Python connector for Exasol databases. It is designed to handle large volumes of data efficiently and provides a performance boost over traditional ODBC/JDBC solutions.

Why Choose PyExasol?

  • Easy and fast access to Exasol from Python

  • Bulk import and export between Exasol and pandas or polars

  • Exasol UDF debugging support

Prerequisites

  • Exasol >= 7.1

  • Python >= 3.9

Optional Dependencies

  • orjson is required for json_lib=orjson to improve JSON parsing performance

  • pandas is required for HTTP Transport functions working with pandas.DataFrame

  • polars is required for HTTP Transport functions working with polars.DataFrame

  • pproxy is used in the Examples to test an HTTP proxy

  • rapidjson is required for json_lib=rapidjson to improve JSON parsing performance

  • ujson is required for json_lib=ujson to improve JSON parsing performance
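As a minimal sketch of how the json_lib option relates to these optional packages, the helper below checks which of them can be imported and falls back to the standard-library json module. The function name choose_json_lib and the preference order are assumptions made for illustration; they are not part of PyExasol.

```python
import importlib.util

# Preference order is an assumption for this example, not a PyExasol recommendation.
CANDIDATES = ("orjson", "rapidjson", "ujson")


def choose_json_lib(candidates=CANDIDATES):
    """Return the first importable JSON library name, else the stdlib 'json'."""
    for name in candidates:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return "json"


json_lib = choose_json_lib()
print(json_lib)

# Hypothetical usage (requires pyexasol and a reachable database):
# C = pyexasol.connect(dsn='<host:port>', user='sys', password='exasol',
#                      json_lib=json_lib)
```

Picking the library at runtime keeps the same script usable whether or not the optional packages were installed.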

Installing

PyExasol is distributed through PyPI. It can be installed via pip, poetry, or any other compatible dependency management tool:

pip install pyexasol

To install with optional dependencies, use:

pip install pyexasol[<optional-package-name>]

For a list of optional dependencies, see Optional Dependencies.

First Steps

As first steps, we recommend running basic queries and exporting data from an Exasol table into well-known Python packages such as pandas or polars.

Note

These examples assume a newly installed or otherwise safe-to-test Exasol database. If that is not the case, check the API Reference and the table you intend to query before running them, particularly the export examples, to make sure the output is what you expect (and not, e.g., millions of rows).

Run basic query

Note

For more options when running a basic query, see pyexasol.ExaStatement, the object returned by C.execute().

import pyexasol

# Using a context manager for the DB connection ensures proper resource
# management, such as closing the connection after normal use or when an
# exception is raised.
with pyexasol.connect(dsn='<host:port>', user='sys', password='exasol') as C:
    with C.execute("SELECT * FROM EXA_ALL_USERS") as stmt:
        # to fetch 1 row
        print(stmt.fetchone())

        # to fetch n=3 rows
        print(stmt.fetchmany(3))

        # to fetch all remaining rows
        print(stmt.fetchall())

    # Not needed for the code to run; it shows a benefit of the context manager:
    # the statement is closed once its block exits.
    print(stmt.is_closed)
# Likewise, the connection is closed once the outer block exits.
print(C.is_closed)

with pyexasol.connect(dsn='<host:port>', user='sys', password='exasol') as C:
    with C.execute("SELECT * FROM EXA_ALL_USERS") as stmt:
        # to iterate through all rows
        for row in stmt:
            print(row)
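For larger result sets, the fetchmany() call shown above can be repeated until it returns no rows. The sketch below wraps that loop in a generator. The names iter_batches and FakeStatement are hypothetical; FakeStatement only stands in for the statement returned by C.execute() so the example runs without a database.

```python
import itertools


def iter_batches(stmt, batch_size=1000):
    """Yield lists of rows from a statement-like object until it is exhausted."""
    while True:
        rows = stmt.fetchmany(batch_size)
        if not rows:  # an empty list means no rows are left
            break
        yield rows


class FakeStatement:
    """Hypothetical stand-in that mimics only fetchmany() for this demo."""

    def __init__(self, rows):
        self._rows = iter(rows)

    def fetchmany(self, size):
        return list(itertools.islice(self._rows, size))


for batch in iter_batches(FakeStatement(range(5)), batch_size=2):
    print(batch)
```

With a real connection, the same generator would be called as iter_batches(stmt) inside the with C.execute(...) block.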

Export data into a DataFrame

Using pandas

# pip install pyexasol[pandas]
import pyexasol

# The context manager closes the connection once the export has finished.
with pyexasol.connect(dsn='<host:port>', user='sys', password='exasol', compression=True) as C:
    df = C.export_to_pandas("SELECT * FROM EXA_ALL_USERS")
print(df.head())

Using polars

# pip install pyexasol[polars]
import pyexasol

# The context manager closes the connection once the export has finished.
with pyexasol.connect(dsn='<host:port>', user='sys', password='exasol', compression=True) as C:
    df = C.export_to_polars("SELECT * FROM EXA_ALL_USERS")
print(df.head())
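When the target table may be large, one way to guard against exporting millions of rows is to cap the query before passing it to an export function. The sketch below wraps a query in a subselect with a LIMIT clause. The helper name preview_query and the alias q are hypothetical, and the wrapping approach is an assumption that should be checked against your query and Exasol version.

```python
def preview_query(query, max_rows=100):
    """Wrap a SELECT statement so that at most max_rows rows are returned."""
    if not isinstance(max_rows, int) or max_rows < 0:
        raise ValueError("max_rows must be a non-negative integer")
    # Drop surrounding whitespace and a trailing semicolon so the query
    # can be embedded as a subselect.
    inner = query.strip().rstrip(";")
    return f"SELECT * FROM ({inner}) AS q LIMIT {max_rows}"


sql = preview_query("SELECT * FROM EXA_ALL_USERS", max_rows=10)
print(sql)

# Hypothetical usage (requires a reachable database):
# df = C.export_to_pandas(sql)
```

Building the capped statement as a plain string keeps it usable with both export_to_pandas and export_to_polars.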

Diving Deeper

The PyExasol documentation covers many topics at different levels of experience: