2. Plot Types

2.1. More Plotting Examples

Alright, let’s now take a shot at creating some of the plots we just learned about.

I’ll be using the street-trees.csv file as the data source; it is available in the Google Drive here.

2.1.1. Pie Chart

Let’s say we want to see the proportion of trees planted with root barriers to trees planted without root barriers.

First, make a new worksheet.

Step by Step Instructions

1. Drag the Root Barrier field to the Color icon.

2. Drag the Tree Id field to the Area icon.

3. Convert the Tree Id field to a Count Measure by right-clicking it and selecting from the dropdown menu.

4. Click on the Show Me menu on the top right side of the workspace.

5. Select the pie chart icon.

And voila! We made a pie chart 🥧!
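
For readers who want to check the numbers behind the pie, here is a minimal Python sketch of the same aggregation: a count of Tree Id per Root Barrier value, turned into shares of the whole. The sample rows and the "Y"/"N" values are made up for illustration and are an assumption about how the data is coded.

```python
from collections import Counter

# Made-up sample rows (assumption); the real street-trees.csv has many more rows.
trees = [
    {"Tree Id": 1, "Root Barrier": "Y"},
    {"Tree Id": 2, "Root Barrier": "N"},
    {"Tree Id": 3, "Root Barrier": "N"},
    {"Tree Id": 4, "Root Barrier": "N"},
]

# Equivalent of COUNT(Tree Id) for each Root Barrier value.
counts = Counter(row["Root Barrier"] for row in trees)

# Each slice's share of the whole pie.
total = sum(counts.values())
shares = {barrier: n / total for barrier, n in counts.items()}

for barrier, share in shares.items():
    print(f"Root Barrier = {barrier}: {counts[barrier]} trees ({share:.0%})")
```

The angle of each pie slice is proportional to these shares, which is all the chart encodes.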

2.1.2. Stacked Bars

A pie chart may not be the best visualization to see the proportion of trees planted with root barriers to trees planted without root barriers.

Let’s see what it looks like as a stacked bar chart.

Make a new worksheet or clear the current sheet you are on.

Step by Step Instructions

1. Drag the Tree Id field to the Rows shelf.

2. Convert this field to a Count Measure by right-clicking and selecting the appropriate measure.

3. We can now add the stacking part of this bar chart by dragging Root Barrier to the Color icon in the Marks card.

4. Let’s transpose this graph so it’s a little clearer by clicking the Swap Rows and Columns icon in the toolbar.

Great! Here is our stacked bar chart!

2.1.3. Side-by-Side Bars

That still may not be the right plot for this question. Let’s try a bar plot with the categories side by side.

Step by Step Instructions

1. Drag the Root Barrier field to the Columns shelf.

2. Drag the Tree Id field to the Rows shelf again.

You may have to indicate that you want to Add All Members.

3. Convert the Tree Id field to a Count Measure by right-clicking and selecting the appropriate measure.

4. Let’s add a little bit of colour to this plot. This isn’t a necessary step, but we are doing it for consistency when comparing with the previous two charts.

Now that we’ve done that, which of the three plots (pie, stacked bars, or side-by-side bars) do you prefer?

2.1.4. Scatter Plot

With this particular data source, we don’t really have 2 good continuous numeric columns. To demonstrate how to make a scatter plot, we are going to use what we have and make the best of it.

Let’s plot and see if there is a relationship between the diameter of the trees’ trunks and their height.

Step by Step Instructions

1. First, let’s drag the Height Range Id column to the Columns shelf.

2. Let’s make sure this is a continuous field and convert it by right-clicking and selecting Continuous from the dropdown.

3. Next, drag the Diameter Measure to the Rows shelf.

4. We need to make sure this Diameter field becomes a continuous Dimension as well, which we can do by right-clicking and selecting it from the dropdown.

Great!
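
A scatter plot shows a relationship visually. If you also want a single number for it, one common option (not part of the Tableau steps above) is the Pearson correlation coefficient. The sketch below computes it by hand; the (Height Range Id, Diameter) pairs are made up for illustration.

```python
import math

# Made-up (Height Range Id, Diameter) pairs, for illustration only.
points = [(1, 3.0), (2, 6.0), (2, 8.0), (4, 15.0), (5, 20.0)]


def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)


r = pearson(points)
print(f"r = {r:.2f}")
```

A value near 1 or -1 suggests a strong linear relationship and a value near 0 suggests a weak one. It is a complement to looking at the plot, not a replacement for it.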

2.1.5. Line Graph

We are now interested in answering the question: how many trees were planted over the years?

Before you start, let’s make a new worksheet.

Step by Step Instructions

1. Drag the Date Planted field to the Columns shelf and the Tree Id field to the Rows shelf.

2. We are again interested in the number of trees planted on given dates, so once again we want to transform this field to a Count Measure.

3. Since Date Planted is a continuous variable, it’s a good idea to right-click and transform this field into a Continuous Dimension.

This automatically plots the number of trees planted each year (but there are null values!).

4. We can change the YEAR(Date Planted) field to:

  • MONTH(Date Planted) (top month choice when right-clicking) - which aggregates months together for all years.

  • MONTH(Date Planted) (bottom month choice when right-clicking) - which will make a sequential plot.

We are going to stick with the year dimension though!
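
The difference between the year view and the two MONTH options is easier to see with a small example. The Python sketch below is only an illustration with made-up dates, where None stands in for a null Date Planted. It counts trees per year, per month pooled across all years (like the top MONTH choice), and per month of each year (like the bottom MONTH choice).

```python
from collections import Counter
from datetime import date

# Made-up planting dates (assumption); None stands in for a missing value.
date_planted = [
    date(2015, 3, 10),
    date(2015, 11, 2),
    date(2016, 3, 21),
    None,
    date(2016, 3, 5),
]

known = [d for d in date_planted if d is not None]
null_count = len(date_planted) - len(known)

# Like YEAR(Date Planted): one count per year.
per_year = Counter(d.year for d in known)

# Like the top MONTH option: months pooled across all years (every March together).
per_month_part = Counter(d.month for d in known)

# Like the bottom MONTH option: each month of each year kept separate, in sequence.
per_month_value = Counter((d.year, d.month) for d in known)

print(sorted(per_year.items()))         # [(2015, 2), (2016, 2)]
print(sorted(per_month_part.items()))   # [(3, 3), (11, 1)]
print(sorted(per_month_value.items()))  # [((2015, 3), 1), ((2015, 11), 1), ((2016, 3), 2)]
print(null_count)                       # 1
```

Note how the three March plantings collapse into one count in the pooled version but stay split by year in the sequential one.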

5. We can add a circle for clarity at each year as part of our line graph by dragging a second Tree Id field to the Rows shelf.

Warning

You may get a popup warning when you do this. I select Add All Members, since we are converting the field to a Count Measure right after.

6. We need to make sure we also convert it to a Count Measure.

At first, we should get 2 graphs on top of each other.

7. We can right-click one of them and select “Dual Axis”.

This will superimpose one on another with a left and a right axis title.

8. We can hide the one on the right by right-clicking the axis and unticking the “Show Header” option.

9. In the Marks card, select CNT(Tree Id) (2), and from the dropdown, select Circle.

Now we have a line plot with points!

10. To change the colour of the line and the points, we need to make sure we change the colour of both measures by selecting the “All” tab under the “Marks” card on the right.

11. Don’t forget to give it a title and edit the y-axis label as we did before!

2.1.6. Histograms

Let’s now start practicing making distributions. Tableau doesn’t easily facilitate density plots, so we are going to stick with learning how to make histograms.

Perhaps we are interested in the distribution of tree trunk diameters. Remember, histograms are used to visualize the distribution of a continuous numeric variable.

Step by Step Instructions

1. First, drag the Diameter Measure to the Columns shelf.

2. You can then go to the Show Me menu and click on the Histogram option. Tableau will then assign the correct measures to the shelves and cards.

3. And there you have a histogram! Now, this already seems a little problematic because we didn’t choose the bin size and it’s clear that our distribution is skewed.

It might also be helpful to see this distribution shape without the outliers on the far right and with different bin size.
The majority of the data looks like it’s between 0-50 so let’s make the bin size 2 and limit the axis to 0-50.

4. You’ll notice that Tableau’s been kind and has made us a new continuous dimension named Diameter (bin). Right-click on this new field and click on Edit from the dropdown menu.

This is where we are going to change the bin size.

5. This will result in a popup window where we can change the size of the bins. Let’s go ahead and change it to 2. Remember bin size can cause bias in your plots so be careful when choosing this value. Click OK.

6. Now we can see that our bars are a lot thinner (If only exercising was this easy).

7. Let’s fix the axis range now. You won’t have to do this often, but for this particular problem and question, removing the outliers could give us a clearer distribution shape.

Right-click the axis we want to limit and from the dropdown click Edit Axis….

8. From the popup, select a Fixed range and set the Fixed end to 50 for this plot.

Great!

9. We are going to go one step further and change the tick mark intervals too. Click on the Tick Marks option at the top of the popup window.

10. We can decrease the tick interval to 2 to help make our bar values easier to identify.

And we did it!

We can now see that the majority of trees in Vancouver have a diameter between 2 and 3 cm. We also see that the distribution is strongly skewed to the right.
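
To make the effect of the bin size and the fixed axis range concrete, here is a minimal Python sketch of the same idea. It assumes left-closed bins of width 2 (each bin includes its lower edge), and the diameters are made up for illustration, with one large value playing the role of a far-right outlier.

```python
import math
from collections import Counter

BIN_SIZE = 2   # the bin size chosen in step 5
AXIS_MAX = 50  # the fixed end of the axis chosen in step 8

# Made-up diameters (assumption); 120.0 plays the role of a far-right outlier.
diameters = [2.5, 3.0, 3.9, 4.0, 7.25, 12.0, 120.0]


def bin_start(value, size=BIN_SIZE):
    """Lower edge of the bin that contains value."""
    return math.floor(value / size) * size


# Count of trees per bin.
histogram = Counter(bin_start(d) for d in diameters)

# Fixing the axis range only hides bins outside 0-50; the rows are still in the data.
visible = {start: n for start, n in sorted(histogram.items()) if 0 <= start < AXIS_MAX}

for start, n in visible.items():
    print(f"[{start}, {start + BIN_SIZE}): {'#' * n}")
```

Changing BIN_SIZE regroups the same values into different bars, which is why the choice of bin size can change the apparent shape of the distribution.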

2.1.7. Boxplot

Although there is an option to make boxplots using the Show Me menu, I find that it can often plot things differently than how I want them to. These are the steps I generally take.

Suppose we want to see whether the distribution of trunk diameter differs between trees planted with root barriers and trees planted without them.

Step by Step Instructions

1. Begin by dragging Root Barrier to the Columns shelf.

2. Next, you’ll want to drag the Diameter field to the Rows shelf. You’ll now have a beautiful bar plot measuring the sum of all the trees’ diameters for each barrier type.

3. Since we want individual observations for each tree (somewhat), we need to convert the Diameter column to a dimension.

4. Let’s change the mark. Convert the mark from Automatic to Circle.

This will produce a circle for each tree now.

5. This is where we make the box part of our boxplot! Right-click on the axis with the continuous variable - in our case, that’s Diameter. Select the Add Reference Line option.

6. When we select this option, a popup with many different option tabs displays. We want the Boxplot tab!

7. Here we want to “Hide the underlying marks (except outliers)”. The reason we are hiding them, in this case, is because we have THOUSANDS of observations! If our dataset was smaller, it might be a good idea to show all the underlying marks.

8. We can also change the colour of the box Fill to a green palette which goes nicely with our tree theme.

We can now leave this popup screen by clicking OK.

9. Ok, so our outlying observations are rather large right now. Let’s decrease the size.

Ahh, that’s a bit cleaner.

10. We can also change the points to a green colour to go with the rest of the plot. This can be done by clicking the Color icon.

11. This is a completed boxplot! One thing you can do to get a better idea of the distributions is to transpose them.

Ahh, beautiful!
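
If you are curious what the box, whiskers, and outlier points summarize, the Python sketch below computes them for one group. It assumes whiskers that reach the furthest data points within 1.5 times the interquartile range, and it uses Python’s "inclusive" quartile method, which may not match Tableau’s quartile calculation exactly. The diameters are made up for illustration.

```python
import statistics

# Made-up diameters for one Root Barrier group (assumption); 40.0 is a deliberate outlier.
diameters = [2.0, 3.0, 3.0, 4.0, 5.0, 6.0, 8.0, 40.0]

# Quartiles: the box spans q1 to q3, with the median as its center line.
q1, median, q3 = statistics.quantiles(diameters, n=4, method="inclusive")
iqr = q3 - q1

# Points beyond these fences are drawn as individual outliers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

inside = [d for d in diameters if lower_fence <= d <= upper_fence]
outliers = [d for d in diameters if d < lower_fence or d > upper_fence]

print(f"box: {q1} to {q3}, median {median}")
print(f"whiskers: {min(inside)} to {max(inside)}")
print(f"outliers: {outliers}")
```

Hiding the underlying marks except outliers corresponds to drawing only the values in the outliers list as circles.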

Tip!

When you have multiple boxplots and you want to sort them in some order, using the sorting buttons in the toolbar won’t quite sort them properly or may not sort them how you are intending them to.

The best way to sort your boxplots by some criterion is as follows:

1. Click on the dimension field - here it’s our Root Barrier column and from the dropdown select Sort…

2. This will produce a popup window where we select the Nested option to sort our data by.

3. We can select if we want the field to be sorted in Ascending or Descending order, choose a field name (Diameter for us) and then choose an Aggregation. We are going to be selecting Median which is the center line of our boxes in the boxplot.
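
The sort described above amounts to ordering the categories by the median of each group. A minimal Python sketch of that idea, with made-up diameters per Root Barrier value:

```python
import statistics

# Made-up diameters per Root Barrier value (assumption).
groups = {
    "N": [3.0, 5.0, 9.0, 12.0],
    "Y": [2.0, 3.0, 4.0],
}

# Median diameter per group: the center line of each box.
medians = {barrier: statistics.median(values) for barrier, values in groups.items()}

# Ascending order by median; pass reverse=True for descending.
order = sorted(medians, key=medians.get)
print(order)
```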

2.1.8. Heatmap

Let’s see what the joint distribution is for the presence of a curb and if the tree has root barriers or not.

This will need a heat map or a heat map with a size channel. Let’s explore the former first.

Step by Step Instructions

1. Drag the first discrete dimension to the Columns shelf. Here we will drag the Root Barrier column.

2. We then can drag our second discrete dimension to the Rows shelf. We will drag the Curb column.

3. To add a count field, we will drag the Tree Id to the Detail icon in the Marks card. As we have done before, we “Add all members” when prompted by the popup.

4. We now transform this field to a Count Measure by right-clicking and selecting it from the drop-down.

5. Although we already have square marks, let’s make that explicit and convert the Automatic mark to a Square mark. This is to make sure nothing is transformed when we add additional fields to our graph.

6. We can include a value in each quadrant by dragging the Tree Id to the Label icon.

7. We then must convert it to a Count Measure by right-clicking and selecting it accordingly.

Nice!
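
The number in each square is a joint count: how many trees fall into each combination of Root Barrier and Curb. A minimal Python sketch of that table, using made-up (Root Barrier, Curb) pairs and assumed "Y"/"N" values:

```python
from collections import Counter

# Made-up (Root Barrier, Curb) pairs, one per tree (assumption).
trees = [("N", "Y"), ("N", "Y"), ("N", "N"), ("Y", "Y"), ("N", "Y")]

# Equivalent of COUNT(Tree Id) for every combination of the two dimensions.
joint = Counter(trees)

barriers = sorted({barrier for barrier, _ in trees})
curbs = sorted({curb for _, curb in trees})

# Print the grid: Root Barrier across the columns, Curb down the rows.
print("Curb \\ Root Barrier", *barriers)
for curb in curbs:
    # A Counter returns 0 for combinations that never occur.
    print(curb, *(joint[(barrier, curb)] for barrier in barriers))
```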

2.1.9. Heatmap with Size Channel

If we also want to include an area channel in the plot, we can continue from the steps of the heatmap.

1. Here we will add all the counts of the trees by dragging the Tree Id to the Size icon.

2. As we have seen many times before, we transform the dimension to a Count Measure by right-clicking and selecting it from the dropdown menu.

3. The labels seem to ruin the aesthetics of this plot, so let’s remove them by right-clicking and selecting Remove.

That’s better. Nice job!

2.2. Quick Quiz

  1. True or False: Sorting a boxplot can be done by using the sort buttons on the toolbar.

  2. True or False: Histograms can be made with a click from the Show Me window.

  3. What column type are the fields used in the Columns and Rows shelves for scatter plots: Continuous or Discrete?

  4. Which of the following fields acts as a hierarchy by default: Row Id, Date Issued, Gender, or Latitude?

  5. What mark shape is needed for a heatmap?