SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software
Exasol Data Warehouse Administration and SQL Development tutorials, guides and code samples, tools and downloads for SQL Database Professionals

Split String UDF Script Function using Python on Exasol Database

Exasol is a high performance analytic database where SQL developers can create UDF scripts using Python, R, Java or Lua for problems that cannot be solved with SQL codes easily. In this Exasol SQL tutorial, I want to show how analytic database developers can create Python UDF script for building a split function.

Exasol analytics database

Exasol has a huge list of built-in functions I could not see a string split function for SQL database developers.
And instead of building old style string split functions created by identifying the place of comma character for example as the delimiter and looping until all seperator characters are processed, I chosed to create a UDF script using Python.
Python string has a split() method which I want to use in my Exasol split functions built as a UDF script

Here is the sample split string UDF script for SQL developers running Exasol analytic database.

 "myStr" VARCHAR(2000) UTF8
) AS
def run(ctx):
 txt = ctx.myStr
 for line in txt.split(','): ctx.emit(line)

I use DbVisualizer to manage and develop code on Exasol database.
Please note that since we are developing our UDF script using Python language, the indents within source code is important.
Here is the output when I execute above Python script to create an UDF script on Exasol database.

Exasol UDF Script for string split function

Now I can see my split string UDF script in the object directory under Scripts > UDFs of database schema named Kodyaz. Kodyaz is the schema which I was connected when I executed Python UDF script create command.

Exasol database developers can use the UDF Script for splitting string as follows.

select kodyaz.split('apple,grape,banana')

The output of the above SQL SELECT statement is as follows

Python UDF script for splitting string on Exasol database

If the UDF script function was emitting or returning more than one column value, the SELECT statement could be used as follows to execute the UDF script as a sub-select

select str as fruit
from (
 select kodyaz.split('apple,grape,banana')
) as tbl

SQL Select from UDF Script on Exasol database

Our sample string split UDF script function only takes string parameter which is to be splitted by the predefined character comma ","

Split a Sentence into list of Words using UDF Script on Exasol Database

Exasol database developers can enhance the sample Python UDF script for string splitting adding an additional parameter which is used as separator character or splitting character.
Following is the UDF script SPLIT_BY developed in Python with 2 input string parameters; one for string to be splitted and one as the splitting character.

 myStr NVARCHAR(2000),
 splitChar NVARCHAR(1)
 str NVARCHAR(2000)
) AS
def run(ctx):
 for line in ctx.myStr.split(ctx.splitChar): ctx.emit(line)

Above Split_By UDF script can be used to split a given sentence into a list of contained words by passing the space character as the split character.

Let's execute Split_By Python UDF Script on Exasol database with sample data as follows

select kodyaz.split_by('This is a sample UDF script in Python on Exasol database', ' ')

list of words in a sentence by using split UDF script on Exasol SQL database

I hope SQL database developers on Exasol analytic databases will find this Python UDF script useful as a string split function.

Of course, for performance developers can compare UDF scripts created using different languages like Java, R and Lua beside Python.


Copyright © 2004 - 2021 Eralper YILMAZ. All rights reserved.