SQL Server administration and T-SQL development, Web Programming with ASP.NET, HTML5 and Javascript, Windows Phone 8 app development, SAP Smartforms and ABAP Programming, Windows 7, Visual Studio and MS Office software
Development resources, articles, tutorials, code samples, tools and downloads for AWS Amazon Web Services, Redshift, AWS Lambda Functions, S3 Buckets, VPC, EC2, IAM

Data Virtuality File Connector and Query CSV File Data


Using Data Virtuality File Connector, users can define a path for data files on a file share or local path and query data files like CSV files or XML/JSON files using SQL queries. In this tutorial, I want to show the steps how a local directory can be defined as a data source to Data Virtuality Server using File Connector and how the contents of a CSV file can be displayed and a SQL query can be build on top of CSV file data.


Create Connection to Local Path using Data Virtuality File Connector

On my local computer I have a number of CSV files which contains some data. Using Data Virtuality File Connector I want to introduce the file directory (or a file share) as a data source for Data Virtuality Logical Data Warehouse for my installation.

In this tutorial, I want to show the steps of using File Connector with a sample and how to access data over this file connection.

Launch Data Virtuality Studio and connect to the target Data Virtuality server as the first step.
On Data Explorer window right click on Data sources node to create a new Data Source using File connector.

add new data source to Data Virtuality server

When the "Select data source type" screen is displayed, among File node choose File Connector as seen in following screenshot.

File Connector to create data source on Data Virtuality

To configure the File Connector settings for a new file folder, provide similar entries according to your case seen in below screenshot.
Provide a descriptive name to your folder path. This will be the schema name when you try to reach files in this path via the file connector.
Enter the directory path. Since my files containing data are stored in "C:/Kodyaz" folder I entered this file folder path
Keep other parameters as offered default.

directory path configuration parameter for file connector

Click Next to continue and if you wish you can mark "Gather statistics" checkbox to collect statistical information about this data source. Click on Finish button to complete setup.

If you get the message "Data source added." then you have successfully defined a new data source using File connector to reach data contained in files within given path.

After the file folder is added as a new data source, you will not see the files as a table or view under the data source schema name. It will be empty showing 0 tables/views.

no tables and views for file data source type

But using the procedures that are assigned to data sources created with File Connector, it is possible to access files and data within the data source or file folder.

procedures for File Connector to read data on Data Virtuality

For example, to get the list of all files in target directory or file path, following CALL command can be used to execute getFiles procedure. Just double click on getFiles, a template for the CALL procedure command will be displayed in a new SQL editor. You can modify the procedure call command in the SQL editor according to your requirement.

call "LocalFilesFolder.getFiles"(
 "pathAndPattern" => '*.csv'
);;
Code

The output of the above SQL command will give us the filePath including file name and folder, last modified time, last access time, creation time, whether it is a regular file or a directory, the file size and a file key, etc.

getFiles to list data files on File Connector data source


Create Virtual Schema

Within Data Virtuality Studio, in Data Explorer window, right click on "Virtual Schemas" and from context menu option choose "Create schema" as follows:

create virtual schema

Give a descriptive name to your new virtual schema. In this example tutorial, I will use "localfiles" since the schema will contain data from local files connected by Data Virtuality File Connector

Data Virtualirt Studio create virtual schema wizard

Then click Add button. The new schame localfiles will be automatically listed under the list "Virtual schemas" on Data Explorer window. If not, you can use the Refresh list context menu option displayed by a right-click on Virtual Schemas node.

virtual schema details for file connector data sources

We are now ready to create a view within the virtual schema so it is possible to query the data within data files any time easily.


Create View

In this step, we will create a new view within our recently created Virtual Schema localfiles which SELECT and display all data in sample schools.csv file which locates on file folder defined by the file connector.

create view localfiles.Schools as
SELECT
"csv_table"."schoolID",
"csv_table"."schoolName",
"csv_table"."schoolCity",
"csv_table"."schoolState",
"csv_table"."schoolNick"
FROM
(call "LocalFilesFolder".getFiles('schools.csv')) f,
 TEXTTABLE(to_chars(f.file,'UTF-8')
  COLUMNS
  "schoolID" STRING ,
  "schoolName" STRING ,
  "schoolCity" STRING ,
  "schoolState" STRING ,
  "schoolNick" STRING
  DELIMITER ','
  QUOTE '"'
  HEADER 1
 )
"csv_table"
Code

Please note that, above TEXTTABLE() function splits the file column data that is fetched with getFiles() procedure into rows as well as into columns defined in COLUMNS parameter and other parameters like DELIMETER, HEADER line count, etc.

create view in virtual schema for Data Virtuality File Connector data source


Execute SQL Query on CSV File via View

Data professionals can now build SQL query to query data that .csv file contains using standard SQL query. It is also possible to use view data defined by using file connector within SQL queries and join with other data sources, too.

Following SQL query is a very sample SELECT query without any filtering criteria or a JOIN condition.

SELECT  "schoolID", "schoolName",  "schoolCity", "schoolState", "schoolNick" FROM "localfiles.Schools";;
Code

sample SQL query on CSV file data using Data Virtuality Studio

I hope Data Virtuality users now have a quick step to use File Connector with text files including data and stored on local folders. As seen in this tutorial, it is easy to map to the file directory using File connector using as data source type and read file contents with procedures available for the data type.



AWS


Copyright © 2004 - 2024 Eralper YILMAZ. All rights reserved.