0.22.0 - 2021-11-19
BREAKING (!): HTTP transport was significantly reworked in this version. Now it uses threading instead of subprocess to handle CSV data streaming.
There are no changes in the common single-process HTTP transport.
There are some breaking changes in parallel HTTP transport:
Argument mode was removed from http_transport() function, it is no longer needed.
Word “proxy” used in context of HTTP transport was replaced with “exa_address” in documentation and code. Word “proxy” now refers to connections routed through an actual HTTP proxy only.
Function ExaHTTPTransportWrapper.get_proxy() was replaced with property ExaHTTPTransportWrapper.exa_address. Function .get_proxy() is still available for backwards compatibility, but it is deprecated.
Module pyexasol_utils.http_transport no longer exists.
Constants HTTP_EXPORT and HTTP_IMPORT are no longer exposed in pyexasol module.
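The rename from .get_proxy() to .exa_address follows a common Python pattern: the new property holds the value, and the old method remains as a thin alias that emits a deprecation warning. The sketch below illustrates that pattern only. The class TransportWrapperSketch, the warning text and the address value are hypothetical stand-ins, not pyexasol's actual implementation.

```python
import warnings


class TransportWrapperSketch:
    """Hypothetical stand-in for a transport wrapper; not pyexasol code."""

    def __init__(self, exa_address: str):
        self._exa_address = exa_address

    @property
    def exa_address(self) -> str:
        # New-style access: a plain read-only property.
        return self._exa_address

    def get_proxy(self) -> str:
        # Old-style access: still works, but warns the caller.
        warnings.warn(
            "get_proxy() is deprecated, use the exa_address property instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.exa_address


# Made-up address, used only to exercise both access styles.
wrapper = TransportWrapperSketch("10.0.0.1:8563")
```

Code written against the old method keeps working under this pattern, and switching to the property is a one-line change at each call site.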
Rationale:
Threading provides much better compatibility with Windows OS and various exotic setups (e.g. uWSGI).
Orphan “http_transport” processes will no longer be a problem.
Modern Pandas and Dask can (mostly) release GIL when reading or writing CSV streams.
HTTP thread is primarily dealing with network I/O and zlib compression, which (mostly) release GIL as well.
Execution time for small data sets might be improved by 1-2s, since another Python interpreter is no longer started from scratch. Execution time for very large data sets might be ~2-5% worse for CPU bound workloads and unchanged for network bound workloads.
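The thread-based approach described in the rationale can be sketched with the standard library alone: a worker thread serializes rows to CSV, compresses them with zlib and writes them into a pipe, while the main thread reads and parses the stream. This is an illustration of the general technique under simplifying assumptions, not pyexasol's implementation. The function names stream_rows_compressed and read_rows and the sample rows are hypothetical.

```python
import csv
import io
import os
import threading
import zlib


def stream_rows_compressed(rows, write_fd):
    """Worker thread: serialize rows to CSV, compress, write to a pipe."""
    compressor = zlib.compressobj()
    with os.fdopen(write_fd, "wb") as sink:
        for row in rows:
            buf = io.StringIO()
            csv.writer(buf).writerow(row)
            sink.write(compressor.compress(buf.getvalue().encode("utf-8")))
        sink.write(compressor.flush())
    # Leaving the "with" block closes the write end, which signals
    # end-of-stream to the reader.


def read_rows(read_fd):
    """Main thread: read from the pipe, decompress and parse the CSV."""
    decompressor = zlib.decompressobj()
    chunks = []
    with os.fdopen(read_fd, "rb") as source:
        while True:
            data = source.read(65536)
            if not data:
                break
            chunks.append(decompressor.decompress(data))
    chunks.append(decompressor.flush())
    text = b"".join(chunks).decode("utf-8")
    return list(csv.reader(io.StringIO(text)))


# Hypothetical sample data; a real transport would stream database rows.
rows = [[str(i), f"name_{i}"] for i in range(1000)]
read_fd, write_fd = os.pipe()
worker = threading.Thread(target=stream_rows_compressed, args=(rows, write_fd))
worker.start()
result = read_rows(read_fd)
worker.join()
```

Because the worker lives inside the same interpreter, no second Python process has to start, and nothing is left running once the thread has been joined.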
Also, examples were rearranged, refactored, and grouped into multiple categories in this version.