Alteryx is a platform that automates analytics, machine learning, and data science processes end to end, with the aim of accelerating digital transformation.
A commonly quoted estimate is that analysts spend around 80% of their time preparing and combining data from different sources, leaving only 20% for actual analysis.
Alteryx aims to reverse these figures.
Prerequisites
Alteryx Designer
PostgreSQL
AWS S3 Bucket
Extracting data from SQL in Alteryx
Create a database called Sales and import some data from a CSV file. You can get free sample datasets from Kaggle to follow along with this tutorial.
Check that the data was imported successfully into your database.
In Alteryx, create a new workflow and add an Input Data node for PostgreSQL.
You’ll need the latest ODBC (Open Database Connectivity) driver for PostgreSQL.
Add your database as an ODBC data source.
Add a Browse node, then run your Alteryx workflow to verify that the data was extracted correctly from your SQL database.
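If you want to check the import outside Alteryx, the steps above can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the Alteryx workflow: the helper names, the credentials, and the orders table are hypothetical, and the driver name "PostgreSQL Unicode" is the one psqlODBC usually registers but may differ on your machine.

```python
def build_pg_odbc_conn_str(host, database, user, password,
                           port=5432, driver="PostgreSQL Unicode"):
    """Build a DSN-less ODBC connection string for PostgreSQL.

    The driver name is an assumption: check your ODBC Data Source
    Administrator for the exact name installed on your system.
    """
    parts = {
        "Driver": "{" + driver + "}",
        "Server": host,
        "Port": str(port),
        "Database": database,
        "Uid": user,
        "Pwd": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def count_rows(conn, table):
    """Return the row count of a table through any DB-API 2.0 connection."""
    # Table names cannot be passed as query parameters, so only plain
    # identifiers are accepted before the name is placed in the SQL text.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")
    cursor = conn.cursor()
    cursor.execute(f"SELECT COUNT(*) FROM {table}")
    return cursor.fetchone()[0]


# Hypothetical values for illustration only.
conn_str = build_pg_odbc_conn_str("localhost", "Sales", "analyst", "secret")

# With the third-party pyodbc package installed, usage might look like:
#   import pyodbc
#   conn = pyodbc.connect(conn_str)
#   print(count_rows(conn, "orders"))
```

The row count returned here should match the number of data rows in the CSV file you imported, and the number of records shown by the Browse node.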
Extracting Data from Amazon S3 Buckets
Let’s take a look at how to extract data from cloud storage in Alteryx using AWS S3.
Create a new S3 bucket on AWS.
Upload your dataset.
Grant access to your bucket using IAM (Identity & Access Management).
Create an access key and store it somewhere safe; we’ll need it to connect from Alteryx.
Back in your Alteryx workflow, create another Input node and use Amazon S3 Quick connect as the data source.
Enter the access key and secret key you created above.
Select the bucket and the object (the CSV file we uploaded).
Change the file format to CSV, then add a Browse node and run the workflow.
We’ve now successfully created an extract workflow that reads from both an on-premises SQL database and a cloud data source.
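For the IAM step above, it is usually safer to grant the access key read-only rights on the one bucket rather than full S3 access. The sketch below builds such a policy document in Python. The function name and the bucket name my-sales-bucket are hypothetical, and it assumes that listing the bucket and reading its objects is all the workflow needs; depending on how the connector populates its bucket list, an additional permission such as s3:ListAllMyBuckets may be required.

```python
import json


def read_only_bucket_policy(bucket):
    """Return an IAM policy document allowing read-only access to one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Reading applies to the objects inside the bucket.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


# Hypothetical bucket name; replace it with your own.
policy = read_only_bucket_policy("my-sales-bucket")
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console as an inline policy for the user whose access key you enter in Alteryx.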
Merging Data Sources
Now that we have multiple data sources, we can merge them into a single data stream using the Join tools in Alteryx.
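To make the idea concrete, here is a small Python sketch of what a join on a shared key does. The Join tool in Alteryx exposes three outputs: L (left records with no match), J (joined records), and R (right records with no match). The function below mimics that split. It is an illustration only, not how Alteryx implements the tool, and the field names order_id, amount, and region are hypothetical.

```python
from collections import defaultdict


def join_records(left, right, key):
    """Split two lists of dict records into (left_only, joined, right_only)."""
    # Index the right-hand records by key so each lookup is cheap.
    right_by_key = defaultdict(list)
    for row in right:
        right_by_key[row[key]].append(row)

    left_only, joined = [], []
    matched_keys = set()
    for left_row in left:
        matches = right_by_key.get(left_row[key], [])
        if not matches:
            left_only.append(left_row)
            continue
        matched_keys.add(left_row[key])
        for right_row in matches:
            merged = dict(left_row)
            for field, value in right_row.items():
                if field == key:
                    continue
                # Prefix clashing right-hand field names to keep both values.
                name = field if field not in left_row else f"Right_{field}"
                merged[name] = value
            joined.append(merged)

    right_only = [row for row in right if row[key] not in matched_keys]
    return left_only, joined, right_only


# Hypothetical rows standing in for the PostgreSQL and S3 extracts.
sql_rows = [
    {"order_id": 1, "amount": 120.0},
    {"order_id": 2, "amount": 75.5},
    {"order_id": 3, "amount": 40.0},
]
s3_rows = [
    {"order_id": 2, "region": "EMEA"},
    {"order_id": 3, "region": "APAC"},
    {"order_id": 4, "region": "AMER"},
]

left_only, joined, right_only = join_records(sql_rows, s3_rows, "order_id")
print(len(left_only), len(joined), len(right_only))
```

In this sample, order 1 exists only in the SQL extract, order 4 exists only in the S3 file, and orders 2 and 3 appear in both, so only those two end up in the joined output.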