MadKudu Docs

Home

MadKudu SQL

MadKudu can give you direct access to your data through a direct SQL connection to Amazon Redshift.

1. Use cases

With MadKudu SQL, you can access your data directly in Redshift. Now you can run custom SQL queries, export large amount of data, or even build your own machine learning models.

What’s included?
The export include:

  • events
  • account
  • contacts
    *
    from all your existing integrations, as well as:

  • company and person information

  • the results of MadKudu models

2. Getting started

  1. MadKudu SQL is turned on demand. To get started, send us an email at hello@madkudu.com.

  2. Once enabled, your access URL and username will be available on the MadKudu SQL integrations page.

2.1 Options

When setting up MadKudu SQL, we will create and maintain a Redshift cluster for you. No need to have an AWS account or your own Redshift cluster.

If you would like the data to be hosted on your own cluster, we can do that as well!

If you prepare to load the data yourself, we can also unload the data to your S3 bucket and let you load it yourself at your convenience.

3. Compatible clients

Postico

For Mac users, Postico works out of the box without having to install any extra driver.

SQL Workbench/J

Amazon recommends to use SQL Workbench/J to run SQL queries on the database. Please see the “Getting started” section above for more details.

Follow these instructions to set SQL Workbench/J for MadKudu SQL access.

Note: Make sure to start by installing the Redshift JDBC driver.

Excel

Excel can be a great way to run some analyzes and creates reports. To connect to Redshift from Excel, follow these instructions.

Python

Python is a very popular way to analyze data. To connect to your Redshift instance from Python, you can use a library like psycopg2.

To make it work on a mac:

  1. Install psycopg2 by running pip install psycopg2 in a terminal.

  2. Use a python script to connect to the data (see an example here).

  3. You now have access to all the tables documented below.

R

R is a great free open source software environment for statistical computing and graphics. You can use a library like this one to pull data sets directly into R.

Tableau

MadKudu SQL is compatible with Tableau. You will first need to install the Tableau Redshift Driver.

Mode Analytics / Metabase

MadKudu SQL is fully compatible with Mode Analytics and Metabase. Choose the Redshift database type and simply enter your credentials.

3. Schema

When connecting to MadKudu SQL you will have access to multiple tables, one for each type of data.

These tables are a combination of standard tables and tables that are specific to your data.

3.1. List of standard tables

Here are some of the standard tables you will see:

  • accounts
  • accounts_aggregations
  • accounts_firmographics
  • contacts
  • contacts_aggregations
  • contacts_demographics
  • contacts_to_accounts
  • events

These tables will always have the same standard columns. In addition, accounts and contacts will have a column for each of your custom properties.

events

The events table contains all the events recorded for your account.

It contains the following columns:

  • event_id
  • hash_key
  • event
  • event_text
  • event_timestamp
  • meta_event
  • contact_id

events_raw

The eventsraw table has one column, for each of your event properties, prefixed with `a`.

For example, if one of your events has a property category in your analytics platform, then the events table will have this information in the a_categorycolumn.

(Note: it’s a bit confusing, but this prefixing is necessary to avoid collision with SQL reserved keywords)

3.2. Sample queries

Joining events to contacts

SELECT
        *
FROM
        events AS e
INNER JOIN     
        contacts AS c
        ON e.contact_id = c.contact_id
LIMIT 1000;

Find the email address associated to an event

SELECT
        e.event_id,
        e.event_timestamp,
        e.event,
        c.contact_id,
        c.email
FROM
        events AS e
LEFT JOIN
        contacts AS c
ON
        e.contact_id = c.contact_id
LIMIT 1000;

Find the company info associated to a contact

SELECT
  c.contact_id,
  c.email,
  af.*
FROM
  contacts AS c
INNER JOIN
  contacts_to_accounts AS c2a
ON c.contact_id = c2a.contact_id
INNER JOIN
  accounts_firmographics AS af
ON c2a.account_id = af.account_id;

Find the likelihood to convert for users in their trial

SELECT
        t2.email,
        t1.likelihood_to_convert_segment,
        t1.logit
FROM
        analysis_scoring_conversion_free_trial t1
INNER JOIN
        contacts t2 on t1.contact_id = t2.contact_id
LIMIT 100;

4. Frequently asked questions

How frequently is the data refreshed?

Data is refreshed once a day.

How much does it cost?

Cost is based on the amount of data stored in Redshift. Please drop us a line at hello@madkudu.com to get a price.

How secure is it?

MadKudu enforce a strict security policy to make sure your data is safe. SSL connection to the database is mandatory. Whitelist of your IP address is highly recommended.