What is Alteryx?

Alteryx is a platform for end-to-end automation of analytics, machine learning, and data science processes, with the aim of accelerating digital transformation.

A commonly cited estimate is that analysts spend around 80% of their time preparing and combining data from different sources, leaving only about 20% for actual analysis.

Alteryx aims to reverse these figures, so that most of the time goes to analysis rather than preparation.

Pre-requisites

Extracting Data from SQL in Alteryx

Create a database called Sales and import some data from a CSV file. You can get free sample datasets from Kaggle to follow along with this tutorial.

Check that the data imported successfully into your database
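The import and the check above can be sketched in code. The snippet below is a minimal illustration, not part of the original walkthrough: it reads a CSV header and builds PostgreSQL statements to create a matching table, load the file, and count the loaded rows. The table name `orders`, the sample columns, and the choice to type every column as TEXT are assumptions made for the example, and the identifier quoting is deliberately naive.

```python
import csv
import io


def build_import_sql(csv_text: str, table: str) -> list[str]:
    """Build SQL to create a table from a CSV header, load it, and count rows.

    Assumption: every column is created as TEXT; real data would need proper types.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    columns = ", ".join(f'"{name.strip()}" TEXT' for name in header)
    return [
        f'CREATE TABLE "{table}" ({columns});',
        f'COPY "{table}" FROM STDIN WITH (FORMAT csv, HEADER true);',
        f'SELECT COUNT(*) FROM "{table}";',
    ]


# Hypothetical sample data standing in for a downloaded dataset.
sample = "order_id,region,amount\n1,North,100.50\n2,South,75.00\n"

statements = build_import_sql(sample, "orders")
# Number of data rows in the CSV (header excluded); the COUNT(*) result should match it.
expected_rows = sum(1 for _ in csv.reader(io.StringIO(sample))) - 1

for statement in statements:
    print(statement)
print("rows expected after import:", expected_rows)
```

Comparing the `COUNT(*)` result with the number of data rows in the file is a quick way to confirm nothing was dropped during the import.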

In Alteryx, we will create a new Workflow and select an Input data node for PostgreSQL

You’ll need the latest ODBC (Open Database Connectivity) driver for PostgreSQL

Add your database as an ODBC data source

Add a Browse node, run your Alteryx workflow, and verify that the data was extracted from your SQL database correctly
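For reference, the ODBC connection used above boils down to a connection string. The sketch below assembles one in the keyword style the PostgreSQL ODBC driver accepts. The driver name `PostgreSQL Unicode`, the host, the user name, and the password are assumptions for illustration; the driver name registered on your machine may differ (for example, it can carry an `(x64)` suffix on Windows).

```python
def build_pg_odbc_connection_string(
    server: str,
    database: str,
    user: str,
    password: str,
    port: int = 5432,
    driver: str = "PostgreSQL Unicode",  # assumed driver name; check your ODBC administrator
) -> str:
    """Assemble a DSN-less ODBC connection string for PostgreSQL."""
    parts = {
        "Driver": "{" + driver + "}",
        "Server": server,
        "Port": str(port),
        "Database": database,
        "Uid": user,
        "Pwd": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# Hypothetical credentials for a local database named Sales.
connection_string = build_pg_odbc_connection_string(
    server="localhost", database="Sales", user="alteryx_user", password="change-me"
)
print(connection_string)
```

Seeing the individual fields spelled out can help when a connection fails, since each one (server, port, database, user) can be checked separately.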

Extracting Data from Amazon S3 Buckets

Let’s take a look at how to extract data from cloud storage in Alteryx using AWS S3

Create a new S3 Bucket on AWS

Upload your dataset

Give access to your bucket using IAM (Identity & Access Management)

Create an access key and store it somewhere safe; we’ll need it to connect from Alteryx
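As an illustration of the IAM step, here is a minimal sketch of a read-only policy document for a single bucket, built as a Python dictionary and printed as JSON. The bucket name `my-sales-bucket` is hypothetical, and this is only one reasonable policy, not necessarily the one used in the original walkthrough. Depending on how a client browses buckets, further permissions such as `s3:ListAllMyBuckets` may also be needed.

```python
import json


def build_read_only_bucket_policy(bucket: str) -> dict:
    """Return an IAM policy allowing listing and reading objects in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Reading applies to the objects inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


policy = build_read_only_bucket_policy("my-sales-bucket")  # hypothetical bucket name
print(json.dumps(policy, indent=2))
```

Granting only list and read access keeps the key used by the extract workflow from being able to modify or delete data.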

Back in your Alteryx workflow, create another Input node and use Amazon S3 Quick connect as the data source

Enter the access key and secret key you created above
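One way to keep the access key and secret key out of notes and files is to hold them in environment variables and read them when needed. The sketch below is an assumption about how you might do that outside Alteryx: it uses the conventional `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` variable names, and the key values shown are placeholders, not real credentials.

```python
import os


def load_aws_keys(env=None):
    """Read the access key and secret key from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    names = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[names[0]], env[names[1]]


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last few characters so a key can be shown safely."""
    return "*" * max(len(value) - visible, 0) + value[-visible:]


# Placeholder values for demonstration; in practice these come from os.environ.
demo_env = {
    "AWS_ACCESS_KEY_ID": "EXAMPLEACCESSKEY",
    "AWS_SECRET_ACCESS_KEY": "example-secret-key-value",
}
access_key, secret_key = load_aws_keys(demo_env)
print("access key:", mask(access_key))
print("secret key:", mask(secret_key))
```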

Select the bucket and object (the CSV file we uploaded)

Change the file format to CSV, then add a Browse node and run the workflow

We’ve now successfully created an extract workflow for both an on-premises SQL database and a cloud data source

Merging Data Sources

Now that we have multiple data sources, we can merge them into a single data stream using the Join tools in Alteryx
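To make the idea of a join concrete, the sketch below emulates in plain Python what the Alteryx Join tool produces: a J output with matched records, an L output with unmatched left records, and an R output with unmatched right records. The field names (`customer_id`, `name`, `amount`) and the sample rows are hypothetical, and the `Right_` prefix for clashing field names only loosely mirrors how Alteryx renames duplicates.

```python
from collections import defaultdict


def join_streams(left, right, key):
    """Join two lists of dict records on `key`, returning J, L, and R outputs."""
    right_index = defaultdict(list)
    for row in right:
        right_index[row[key]].append(row)

    joined, left_only, matched_keys = [], [], set()
    for left_row in left:
        matches = right_index.get(left_row[key], [])
        if not matches:
            left_only.append(left_row)
            continue
        matched_keys.add(left_row[key])
        for right_row in matches:
            merged = dict(left_row)
            for field, value in right_row.items():
                if field == key:
                    continue  # the key is already present from the left record
                # Prefix right-hand fields that would overwrite a left-hand field.
                merged[f"Right_{field}" if field in left_row else field] = value
            joined.append(merged)

    right_only = [row for row in right if row[key] not in matched_keys]
    return {"J": joined, "L": left_only, "R": right_only}


# Hypothetical records standing in for the SQL and S3 extracts.
sql_rows = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
    {"customer_id": 3, "name": "Linus"},
]
s3_rows = [
    {"customer_id": 1, "amount": 100},
    {"customer_id": 1, "amount": 50},
    {"customer_id": 4, "amount": 20},
]

result = join_streams(sql_rows, s3_rows, "customer_id")
print("J:", result["J"])
print("L:", result["L"])
print("R:", result["R"])
```

Checking the L and R outputs after a join is a useful habit: records that land there failed to match, which often points to a key formatting difference between the two sources.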