1. Introduction to Tableau¶
1.1. What is Tableau¶
By now, you’ll know that Tableau is going to help you make sense of your data, but how?
Tableau is a visualization software used in businesses to create static and dynamic plots that can easily be shared within and across organizations.
This software is used by people in many different occupations, such as:
Business Analysts
Data Visualization Analysts
Data Scientists
Software developers
Engineers
to name a few.
It isn’t just used across many occupations; companies of all sizes use it too, including giants like:
Amazon, Apple, CIBC, Coca-Cola, Lululemon, Lenovo, LinkedIn, PayPal, etc.
1.2. Tableau Examples¶
I guess it would be nice to know why, right? Well, Tableau makes some beautiful visualizations.
Here is a dashboard I made for the company “Fresh Prep” that you can explore on Tableau Public.
This took around 5 weeks to build. Note that the dashboard you are seeing uses fake data.
This is just the tip of the iceberg though.
Let’s take a look at a few exceptionally beautiful examples.
1.2.3. Dinosaur2 by Rahul Patil¶
1.2.5. Piano Classroom by Nir Smilga¶
To see more, go to Tableau’s Feature Gallery
1.3. History of Tableau¶
Tableau was founded in 2003 in California (classic).
In 2019, the company was acquired by Salesforce for $15.7 billion, which gives you a sense of how valuable this tool is considered to be.
Normally this wouldn’t be too important, but I do want you to know about the expertise behind this software and why it is credible.
There are 3 main characters involved:
Chris Stolte who is the Co-founder and technical advisor
Ph.D. in Computer Science (from Stanford University)
Co-inventor on five software patents related to information visualization
Pat Hanrahan who is the Co-founder and Chief Scientist
A founding employee at Pixar Animation Studios
Received three Academy Awards for his work in rendering and computer graphics research
Professor of Computer Science and Electrical Engineering at Stanford University, teaching computer graphics
Christian Chabot, who is the Co-founder and chairman
MBA from Stanford University
CEO and co-founder of BeeLine Software, a pioneer of next-generation digital mapping technology
The short version is that Tableau was built by some very talented individuals.
1.4. Other Similar Tools Available¶
Microsoft Power BI (sold as part of a Microsoft package)
Looker (owned by Google)
QlikView
Domo
many, many others…
1.5. Pros and Cons¶
1.7. Where is visualization situated in the grand scheme of data science?¶
Visualization can absolutely be an end-goal!
Visualization dashboards can be used to target individuals for sales, find shortcomings in production, and help identify trends in the data, which can feed into a prediction component.
1.8. Quick Quiz¶
What must you have in order for us to have some form of version control beyond saving files with different names?
Who is Tableau’s Co-founder and technical advisor?
Name me a company that uses Tableau.
What visualization tool was acquired by Google?
True or False: You must have some coding experience to use Tableau.
Solutions!
Tableau Server
Chris Stolte
Amazon, Apple, CIBC, Coca-Cola, Lululemon, Lenovo, LinkedIn, PayPal, Nokia, Dell, Cisco, Forbes, eBay, Intel, Ferrari, Deloitte, etc.
Looker
False
1.9. Getting Started¶
First, let’s open up the application. I’ll be using the software installed on my computer.
You can also use Tableau Public, which is a free version of Tableau that allows you to use most of the software’s functions. The biggest downside is that it does not let you save your work locally.
When you open it, you’ll be greeted with the home screen. I have a few projects already on the go but you’ll have your own as well.
1.10. Connecting to Data¶
Tableau lets you either connect to a database server or upload a file. We are going to discuss one of each.
1.10.1. Connecting to MySQL Server¶
One of the most common database servers, and the example we are going to show you here, is MySQL.
Clicking MySQL for the first time results in the following popup.
You’ll have to follow the instructions here and make sure you have downloaded the 2 necessary packages in order for Tableau to connect.
Once you’ve installed them, you’ll have to close Tableau and reopen it.
Now when you click on MySQL, you will get a new popup where you’ll have to enter the required information for your server.
Note that there are MANY other servers that are options. Simply click More… under the “To a Server” heading and take your pick!
1.11. CSV, Excel, etc.¶
In addition to connecting to a server, Tableau can load a multitude of file types. The major ones are CSVs, Excel files, and JSON files. To upload a file, we can simply use More… under the “To a File” heading.
We are going to use a CSV named street-trees.csv for our demo.
Locate your file and click Open.
I’ll be using the street-trees.csv file from this Google Drive link.
After you’ve selected your file, you’ll be directed to a new screen where you’ll be asked to Update Now or Automatically Update. I generally select the former.
This will now give you a view of the data in the CSV.
This is where you’ll see some symbols on top of each column.
What are these?
1.12. Introduction to data types¶
Each column in your data will have a data type. This represents the kind of information that is stored in a column.
Tableau will designate a data type when you connect to a source.
The data types offered in Tableau are as follows: (source).
Data type |
---|
Text (string) values |
Date values |
Date & Time values |
Numerical values |
Boolean values (relational only) |
Geographic values (used with maps) |
Cluster Group (used with Find Clusters in Data) |
Note that, generally speaking, Tableau will guess which type goes with each column, but you’ll soon find out that Tableau isn’t always right. For example, the column Date Planted should be changed to a “Date” data type, and Longitude and Latitude should both be Geographic data types.
The good news is this is easy to fix!
Let’s convert the column Date Planted first.
Simply click on the icon (ABC in this case) and select the desired type. We are changing this to a date.
Converting the columns Longitude and Latitude will take 2 steps instead.
We can’t convert them straight to a geographical location, since neither Longitude nor Latitude is an option when the column is of type string.
So, we need to change the column to a “Number (decimal)” first!
Once the column is a number, we can then select the appropriate Longitude option under the Geographical Location menu.
Ta-Da! Now we have the appropriate globe icon, meaning the column is now a Geographical data type.
We will have to repeat this for the Latitude column now.
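If you think better in code, here is a rough Python sketch of the same idea: every value starts life as text and has to be converted to a more specific type. This is only an analogy for what we just clicked through in Tableau, not how Tableau works internally. The sample rows are made up, only the column names come from our data, and the date format (YYYY-MM-DD) is an assumption.

```python
import csv
import io
from datetime import date

# Made-up sample rows; only the column names come from the tutorial,
# and the ISO date format (YYYY-MM-DD) is an assumption.
RAW = """Tree Id,Date Planted,Latitude,Longitude
1,2004-03-15,49.2606,-123.1140
2,1999-11-02,49.2827,-123.1207
"""


def parse_row(row):
    """Give each field a more specific type than the string it was read as."""
    return {
        "Tree Id": row["Tree Id"],  # an identifier, so it stays a string
        "Date Planted": date.fromisoformat(row["Date Planted"]),  # string -> date
        "Latitude": float(row["Latitude"]),  # string -> decimal number
        "Longitude": float(row["Longitude"]),  # string -> decimal number
    }


rows = [parse_row(r) for r in csv.DictReader(io.StringIO(RAW))]
print(rows[0]["Date Planted"].year, rows[0]["Latitude"])
```

Just like in Tableau, the coordinates have to become numbers before anything can treat them as positions on a map.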
1.13. Tableau Setup¶
Let’s stop playing with our data for a second and get into the actual visualization screen!
Clicking on Sheet 1 will take us to a worksheet where we can make our first plot.
1.13.1. Workspace¶
Tableau has a great image that will help you identify what you need in your workspace but here is my version.
Home icon: On the top left of the workspace you’ll see this icon. Clicking it takes you back to the home page that displays all your projects.
Toolbar
I’m not showing all the options, just the ones you’ll use often.
Shelves
These are essentially your x- and y-axes, and they are where you place columns from your data source.
Cards
Assigning columns to cards can build on your visualization by encoding marks with colour, size, shape, text, and details like hovers.
Viz Area
Also known as the “View”, this is exactly what it sounds like; this is where your visualization will be displayed.
Show Me Window
This is going to be a really useful window when you first start using Tableau. If you have already selected columns in any of the shelves or cards, Tableau will give you the option of several types of graphs that could be appropriate for the columns selected. It will also highlight the graph type that best matches the data.
Data or Analytics Pane
This is going to be a busy pane for you. Here is where all your data columns are displayed. Your columns are split into 2 different types (with the possibility of more) and 2 different colours. What do they mean though?!
1.13.2. Data Pane¶
1.13.2.1. Dimensions vs Measures¶
There is a faint line that splits the columns from the data source into 2 categories.
Dimensions and,
Measures
Tableau describes Dimensions as “qualitative values (such as names, dates, or geographical data)” whereas Measures are numeric, quantitative values.
This is important to know, since sometimes you’ll have to switch a column from the category Tableau chose by default to the other one.
1.13.2.2. Continuous vs Discrete columns¶
Have you noticed there are 2 different colours of icons?
Green measures and dimensions mean the field (column) is being expressed in a continuous manner. Continuous data is data that can take on any possible value. An example would be a person’s height or the time it takes to microwave popcorn.
Blue measures and dimensions mean the field (column) is being expressed discretely. Discrete data can only take certain values. Examples include the number of employees at a company (we can’t really have half an employee; would that be the left or the right side?) or the number of vehicles each company produces on a yearly basis.
1.13.3. Calculations, Sets, Parameters¶
You’ll possibly have to make other kinds of fields besides dimensions and measures, such as:
Calculations: These are values calculated from existing columns that are not currently in your data source. You can create these new fields using a formula, calculating the values and saving them as part of your data.
Sets: A section of the data that you define from a column in your data source and a criterion you choose. For example, you may want a subsection of the data that includes only corporate customers and leaves out retail customers.
Parameters: These are values that can be used as placeholders in formulas for calculations and filters.
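To make these three ideas a bit more concrete, here is a small Python sketch of each one. It is only an analogy, not Tableau’s formula language, and the customer records, field names, and the `min_profit` threshold are all made up for illustration.

```python
# Made-up customer records; the field names and values are hypothetical.
customers = [
    {"name": "Acme", "segment": "Corporate", "sales": 1200.0, "cost": 900.0},
    {"name": "Bo", "segment": "Retail", "sales": 300.0, "cost": 250.0},
    {"name": "Cyan", "segment": "Corporate", "sales": 800.0, "cost": 700.0},
]

# Calculation: a new field derived from existing columns.
for customer in customers:
    customer["profit"] = customer["sales"] - customer["cost"]

# Set: the subset of rows that meet a criterion (corporate customers only).
corporate = [c["name"] for c in customers if c["segment"] == "Corporate"]

# Parameter: a placeholder value that a formula or filter refers to.
min_profit = 150.0
high_profit = [c["name"] for c in customers if c["profit"] >= min_profit]

print(corporate)
print(high_profit)
```

Changing `min_profit` changes which customers pass the filter without touching the filter itself, which is the role a parameter plays.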
We may touch on this later on, but for now, let’s move on.
1.13.4. Worksheets vs Dashboards vs Story¶
Worksheet
This is where you create each individual visualization. You create 1 viz per worksheet.
Dashboard
A dashboard is where you lay out your different worksheets and add any filtering toggles that you want to make available for the user.
You can only add worksheets that have been already made. That means that if you want to add a graph to a dashboard, you have to first make it from a worksheet.
It’s likely as well that you’ll have more worksheets and graphs than you want to place in your dashboard. We will talk a little bit more about this in our section on Exploratory Data Visualization.
Story
A sequence of worksheets or dashboards that work together to tell a story.
You can create stories to tell a data narrative, provide context, demonstrate how decisions relate to outcomes, or simply make a compelling case.
Each individual sheet in a story is called a story point.
We will discuss more on the importance of storytelling later on in this course, but this will likely be one of the tabs you use less frequently.
1.14. Quick Quiz¶
True or False: Connecting to a MySQL server is relatively simple.
Tableau sorts data columns into how many different data types?
Which of the following is not part of the Tableau Workspace: Toolbar, Cards, Data Pane, Calculation Keys, Show Me Window.
What is the difference between Dimensions and Measures?
True or False: Columns that are blue are continuous data, whereas columns in green are discrete.
Solutions!
True
7
Calculation Keys
Dimensions are qualitative fields and Measures are quantitative fields.
False
1.15. Making some Viz! (It’s about time!)¶
1.15.1. Bar¶
The question we want to answer with this plot is: how many trees are there in each Vancouver neighbourhood?
Step by Step Instructions
1. We are going to drag the column named Neighbourhood Name, found on the left-hand side under the heading “Tables”, to the Columns shelf.
2. We are interested in the count of trees in each neighbourhood. We don’t have a count of trees as a column, but since the column Tree Id is unique (that means that every row in the data has a different value for Tree Id), we can use it to count the rows (other columns would work here too). Let’s drag the Tree Id column from the left of the screen to the Rows shelf.
3. We need to convert this Tree Id variable to a “Measure”, specifically a “Count”, so that we get 1 value for each neighbourhood. We can do this by right-clicking on it and selecting Count.
Voila! A bar chart!
4. Let’s change the colour. Go to the Marks card and select a new colour.
5. Let’s edit our y-axis label. Right-click on the axis and click “Edit Axis…”
Under “Axis”, you can edit your axis “Title”.
You can edit the title of the graph in two ways:
By editing the title or…
By editing the sheet name by double-clicking the sheet at the bottom. (I prefer this way)
6. You can sort the bars by clicking the icon beside the axis title or the icon in the toolbar.
7. Let’s switch the orientation of the bar chart. On the toolbar, right above Columns, you’ll see a Swap Rows and Columns icon. This transposes your graph.
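If it helps to see what the Count aggregation is doing behind the scenes, here is a minimal Python sketch of the same calculation. It is an analogy rather than Tableau’s actual implementation. The rows are made up to stand in for street-trees.csv; only the two column names come from our data.

```python
from collections import Counter

# Made-up rows standing in for street-trees.csv; only the two column
# names come from the tutorial.
trees = [
    {"Tree Id": "101", "Neighbourhood Name": "KITSILANO"},
    {"Tree Id": "102", "Neighbourhood Name": "KITSILANO"},
    {"Tree Id": "103", "Neighbourhood Name": "FAIRVIEW"},
    {"Tree Id": "104", "Neighbourhood Name": "KITSILANO"},
    {"Tree Id": "105", "Neighbourhood Name": "SUNSET"},
    {"Tree Id": "106", "Neighbourhood Name": "SUNSET"},
]

# One row per tree, so counting rows per neighbourhood gives the same
# number as counting the unique Tree Id values in each neighbourhood.
counts = Counter(tree["Neighbourhood Name"] for tree in trees)

# most_common() orders from largest to smallest, like sorting the bars.
for name, n in counts.most_common():
    print(f"{name:<10} {'#' * n} ({n})")
```

Each printed line is one bar: the neighbourhood is the dimension and the count is the measure.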
1.15.2. Bubble¶
To answer the same question, we could also have done a bubble chart.
Step by Step Instructions
1. Click on Show Me in the top right corner of the workspace.
2. This will drop down other suggested plots for the measures and dimensions we have dragged to the Columns and Rows shelves.
3. Select the bubble diagram that’s at the bottom right of the Show Me menu.
And voila! Bubble plot! Easy as that!
It would be nice to add some colour, no?
4. To colour the marks by a field, we can drag it to the colour option on the Marks card. Here we are dragging Neighbourhood Name.
This was a problematic choice though: there seem to be duplicate colours, so it’s not an effective colour channel.
5. Let’s remove Neighbourhood Name from the colour channel by right-clicking it in the Marks card and selecting Remove from the dropdown menu.
6. Let’s instead drag the Tree Id column to the colour channel.
Now, this looks bad; we want the count of trees to be represented by a colour scale.
7. We can change this by right-clicking Tree Id in the Marks card and converting it to a Count Measure.
8. Now we can see that the number of trees in each neighbourhood is not only represented by the size of each bubble but also by the colour. The scale on the right side of the plot gives us an idea of how the colours translate to quantity.
1.15.3. Highlight Table¶
We could also use a “highlight table” to answer this question.
Step by Step Instructions
1. Let’s go back to the Show Me menu and click the highlight table option.
And we are done! That’s the only step needed!
1.15.4. Making a New Worksheet¶
To create a new viz we need to make a new worksheet. We can do this in 2 ways:
By clicking the dropdown menu on the icon in the toolbar and selecting New Worksheet.
By clicking the icon at the bottom left of the workspace.
1.15.5. Aggregation Plots¶
This is very similar to how you would make a COUNT bar plot, with one minor difference: we are no longer using a “Count” Measure but instead perhaps Average, Median, Max or Min.
Our question now is: What is the average diameter of each tree genus?
Step by Step Instructions:
1. Drag the column named Genus Name, found on the left-hand side under the heading “Tables”, to the Columns shelf.
2. We want the mean diameter for each genus, so we can drag diameter to the Rows shelf.
3. This is where things differ. We right-click the diameter and transform it to a Measure, specifying Average.
4. Instead of using a bar chart, a dot plot might be a better choice. We can convert it by clicking the dropdown menu under the “Marks” card. Selecting “Circle” or “Shape” will instantly convert it.
5. I am going to rotate the axis since I find this to be a more effective plot.
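Here is a small Python sketch of what the Average aggregation is computing: group the rows by genus, then take the mean of each group. Again, this is an analogy and not Tableau’s own code. The rows and diameter values are made up, and the exact column names are assumed.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows; the diameters are invented and the column names are assumed.
trees = [
    {"Genus Name": "ACER", "Diameter": 10.0},
    {"Genus Name": "ACER", "Diameter": 14.0},
    {"Genus Name": "PRUNUS", "Diameter": 6.0},
    {"Genus Name": "PRUNUS", "Diameter": 9.0},
    {"Genus Name": "PRUNUS", "Diameter": 3.0},
]

# Collect the diameters that belong to each genus.
diameters_by_genus = defaultdict(list)
for tree in trees:
    diameters_by_genus[tree["Genus Name"]].append(tree["Diameter"])

# Reduce each group to a single value: this is the aggregation step.
average_diameter = {
    genus: mean(values) for genus, values in diameters_by_genus.items()
}
print(average_diameter)
```

Swapping `mean` for `statistics.median`, `max` or `min` gives the other aggregations mentioned above.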
Tip!
You can add your own shape icons by adding a folder to your “My Tableau Repository” folder under “Shapes”.
We will show you how to do this in Class 3 or 4.
1.15.6. Drill Down and hierarchies¶
You’ll notice that with this tree data a tree can have a genus and a species. There are multiple species in each genus.
We can create a hierarchy from these columns so that we can “drill down” from one field to the next in the graph.
1. First, we identify the second step in our hierarchy and drag it under the field that it encompasses.
For us, this means dragging Species Name under Genus Name.
2. This will produce a popup to create the hierarchy. You can name it anything, such as Tree types, or, as we did, simply list the steps in the hierarchy.
3. Once the hierarchy is made, you’ll see the steps under the title you just wrote as well as a small + icon next to the Genus Name field in the Rows shelf.
4. If we click the plus next to Genus Name, species will populate beside it.
5. The graph will now reflect the hierarchy and sort the trees accordingly.
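One way to picture drilling down is as grouping by one more column. The Python sketch below counts trees per genus (the collapsed view) and then per genus and species (the drilled-down view). It is only an analogy, and the genus and species values are made up.

```python
from collections import Counter

# Made-up (genus, species) pairs; the values are for illustration only.
trees = [
    ("ACER", "RUBRUM"),
    ("ACER", "RUBRUM"),
    ("ACER", "PLATANOIDES"),
    ("PRUNUS", "SERRULATA"),
]

# Collapsed view: one count per genus.
by_genus = Counter(genus for genus, _species in trees)

# Drilled-down view: one count per genus and species pair.
by_species = Counter(trees)

for genus, total in sorted(by_genus.items()):
    print(genus, total)
    for (parent, species), n in sorted(by_species.items()):
        if parent == genus:
            print("   ", species, n)
```

The species counts inside a genus always add up to that genus’s total, which is why the two levels fit together as a hierarchy.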
1.15.7. Maps¶
Maps can seem pretty intimidating, as they can be complex and provide a lot of information in a small space. Luckily for us, maps can be quick, friendly, and easy to make with Tableau.
Step by Step Instructions
1. Drag Longitude to the Columns shelf.
2. Drag Latitude to the Rows shelf.
And you have made a map with all the trees!
Warning
This won’t work so easily if you have not specified a geographical data type for the columns.
Tip!
If you don’t have latitude and longitude columns, you may still be able to make a map if you have a column that you can assign any of the following geographical data types.
Airport
US telephone Area Code
U.S. Core Based Statistical Areas (CBSA)
Country/Region
County (U.S. counties, French départements, German Kreise, etc.)
NUTS Europe
State/Province
ZIP Code/Postcode
3. Let’s tidy this up a bit though. Decrease the size of your markers by clicking the Size icon under the Marks card.
4. Right-click the map and select Map Layers… from the dropdown. This gives you the ability to customize the appearance of your map.
5. Change the map Style to Normal.
6. Add opacity to the map with Washout.
7. Add different Map Layers such as Streets, Highways, Routes and Zip Code Boundaries.
8. Add a title like you’ve done before and you’ve got a functioning map in under 5 minutes.
1.16. Distributing and Saving¶
We can save our work by clicking the floppy disk icon on the top left of the toolbar.
1.16.1. Workbook (.twb)¶
Tableau by default will save all your sheets and dashboards as a workbook, a .twb file.
This does require you to connect to the data source locally.
If you send this to colleagues, you will need to send them the data you used, along with any images or additional files you used in your dashboard or worksheets.
You’re more likely to use this if you are using dynamic data. This means that you are connected to a data source that updates regularly.
1.16.2. Packaged Workbook (.twbx)¶
Instead, you could save your work as a packaged workbook, where all of the files and data sources used are bundled into a single file. The extension for this file type is .twbx.
That means the workbook is no longer linked to the original data source or images.
If you are working with static data or want to send your colleague your dashboard as a sample, this is likely the better option! Just remember that this format will take up more storage space.
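A .twbx file is a zip archive that bundles the .twb workbook together with its data and images, so you can peek inside one with ordinary zip tools. The Python sketch below lists the contents of such an archive. To keep it runnable without Tableau, it builds a tiny stand-in archive in memory; the file names and contents inside it are hypothetical, and `list_packaged_files` is just a helper name chosen for this example.

```python
import io
import zipfile


def list_packaged_files(twbx):
    """Return the names of the files bundled inside a packaged workbook.

    `twbx` can be a path to a .twbx file or an open binary file object.
    """
    with zipfile.ZipFile(twbx) as archive:
        return archive.namelist()


# Build a tiny stand-in archive in memory so the example runs anywhere.
# The file names and contents below are hypothetical.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("trees.twb", "<workbook/>")
    archive.writestr("Data/street-trees.csv", "Tree Id\n1\n")
buffer.seek(0)

names = list_packaged_files(buffer)
print(names)
```

With a real packaged workbook you would pass its path instead of `buffer`. Seeing the data file sitting inside the archive is also a good reminder of why this format takes up more storage space.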
1.17. Quick Quiz¶
Where can we find the button/tool necessary to sort your plot?
What is one way we can convert a dimension to a measure?
True or False: When using the “Show Me” window to change the graph to a different type, the fields can change shelves and/or switch between dimensions and measures.
True or False: It’s important that every plot has a dimension/measure in the Columns and Rows shelves.
Which file extension does not save the data in the workbook?
Solutions!
In the toolbar or by clicking the sort icon on the axis of the plot.
Right-clicking and selecting the appropriate statistic. We can also right-click and convert them in the Data pane.
True
False
.twb