NYC radial heatmap

Download this notebook from GitHub (right-click to download).



In [1]:
import numpy as np
import pandas as pd
import holoviews as hv

hv.extension("bokeh")

Declaring data

NYC Taxi Data

Let's dive into a concrete example, namely the New York City taxi data (For-Hire Vehicle (“FHV”) records). The following data contains hourly pickup counts for the entire year of 2016.

Considerations: Thinking about taxi pickup counts, we might expect higher taxi usage during business hours. In addition, public holidays should be clearly distinguishable from regular business days. Furthermore, we might expect high taxi pickup counts on Friday and Saturday nights.

Design: To model the above ideas, we assign days with an hourly split to the radial segments and week of year to the annulars. This allows us to detect daily/hourly periodicity and weekly trends. To get more familiar with the mapping of segments and annulars, take a look at the following radial heatmap:

In [2]:
# load example data
df_nyc = pd.read_csv("../../../assets/nyc_taxi.csv.gz", parse_dates=["Pickup_date"])

# create relevant time columns
df_nyc["Day & Hour"] = df_nyc["Pickup_date"].dt.strftime("%A %H:00")
df_nyc["Week of Year"] = df_nyc["Pickup_date"].dt.strftime("Week %W")
df_nyc["Date"] = df_nyc["Pickup_date"].dt.strftime("%Y-%m-%d")

heatmap = hv.HeatMap(df_nyc, ["Day & Hour", "Week of Year"], ["Pickup_Count", "Date"])
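
To make the mapping concrete, the following sketch applies the same strftime patterns to a synthetic stand-in for the CSV and counts the resulting labels. The frame df_demo and its placeholder counts are assumptions for illustration; the real file may have missing hours, in which case the counts could differ.

```python
import pandas as pd

# Synthetic stand-in for the real CSV: one row per hour of 2016.
# Column names mirror the example above; the counts are placeholders.
hours = pd.date_range(
    "2016-01-01 00:00", "2016-12-31 23:00", freq=pd.Timedelta(hours=1)
)
df_demo = pd.DataFrame({"Pickup_date": hours, "Pickup_Count": 0})

# Same label columns as in the example.
df_demo["Day & Hour"] = df_demo["Pickup_date"].dt.strftime("%A %H:00")
df_demo["Week of Year"] = df_demo["Pickup_date"].dt.strftime("Week %W")

# Distinct segment labels (weekday x hour) and annular labels (weeks).
n_segments = df_demo["Day & Hour"].nunique()
n_annulars = df_demo["Week of Year"].nunique()

# Labels of the very first hour of the year.
first_row = df_demo.iloc[0][["Day & Hour", "Week of Year"]].tolist()

print(n_segments, n_annulars, first_row)
```

With complete hourly data, each of the 7 x 24 weekday/hour combinations becomes one segment. Because %W counts weeks from the first Monday, the days before it fall into "Week 00", so a full year can produce more than 52 annulars.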

Plot

At first glance: First, let's take a closer look at the segments and annulars mentioned above. Segments correspond to hours of a given day, whereas annulars represent entire weeks. If you use the hover tool, you will quickly get an idea of how segments and annulars are organized. Color encodes the pickup counts, with blue being low and red being high.

Plot improvements: The plot clearly shows systematic patterns; however, the default plot options make it harder to read than necessary. Therefore, before we dive into the results, let's improve the readability of the plot:

  • Remove annular ticks: The week-of-year information is not very important, so we hide it via yticks=None.
  • Custom segment ticks: Right now, segment labels are given by day and hour. We don't need hourly information, but we do want every day to be labeled. We can pass a tuple of day names: xticks=("Friday", ..., "Thursday").
  • Add segment markers: We also want to help the viewer distinguish each day more clearly. Hence, we add marker lines via xmarks=7.
  • Rotate heatmap: The week starts with Monday and ends with Sunday. Accordingly, we rotate the plot so that Sunday and Monday are at the top. This can be done via start_angle=np.pi*19/14. The default segment order follows the sort order present in the data, and the default starting angle is at 12 o'clock.
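
The value 19/14 * pi can be reproduced with a little arithmetic. The sketch below assumes that segments advance clockwise from the starting angle and that the data order begins with Friday (2016-01-01 was a Friday); the variable names are illustrative and not part of the HoloViews API.

```python
import math

# Day order as it appears in the data: the year 2016 starts on a Friday.
days = ("Friday", "Saturday", "Sunday", "Monday",
        "Tuesday", "Wednesday", "Thursday")

# Angular width of one day when seven days share the full circle.
day_width = 2 * math.pi / len(days)

# Number of day blocks that precede Monday in the data order.
offset = days.index("Monday")

# The default start is 12 o'clock (pi/2). Assuming segments advance
# clockwise, shifting the start counter-clockwise by `offset` days
# places the Sunday/Monday boundary at the top.
start_angle = math.pi / 2 + offset * day_width

print(start_angle / math.pi)  # multiple of pi, compare with 19/14
```

Under these assumptions, three day blocks (Friday, Saturday, Sunday) of width 2*pi/7 each are added to pi/2, which gives 7*pi/14 + 12*pi/14 = 19*pi/14.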

Let's see the result of these modifications:

In [3]:
xticks = ("Friday", "Saturday", "Sunday", "Monday", "Tuesday", "Wednesday", "Thursday")
heatmap.opts(
    radial=True, start_angle=np.pi*19/14, width=600, height=600,
    yticks=None, xticks=xticks, xmarks=7, ymarks=3)
Out[3]:

After tweaking the plot defaults, we're comfortable with the given visualization and can focus on the story the plot tells us.

There are many interesting findings in this visualization:

  1. Taxi pickup counts are high between 7-9am and 5-10pm on weekdays, which matches business hours as expected. In contrast, on weekends there is not much going on until 11am.
  2. Friday and Saturday nights clearly stand out with the highest pickup densities, as expected.
  3. Public holidays can be easily identified. For example, taxi pickup counts are comparatively low around Christmas and Thanksgiving.
  4. Weather phenomena also influence taxi service. There is a very dark blue stripe at the beginning of the year, starting on Saturday, January 23rd and lasting until Sunday, January 24th. Interestingly, one of the biggest blizzards in the history of NYC hit the city on those days.
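
To find these events in the plot, it can help to translate a calendar date into the labels used by the heatmap. The following sketch reuses the strftime patterns from the data preparation step. The helper heatmap_labels is hypothetical, and the concrete 2016 dates (blizzard in late January, Thanksgiving on November 24, Christmas on December 25) are assumptions added for illustration.

```python
from datetime import date


def heatmap_labels(d):
    """Return the weekday name and the annular label for a date.

    Hypothetical helper: it mirrors the "%A" and "Week %W" patterns
    used for the "Day & Hour" and "Week of Year" columns.
    """
    return d.strftime("%A"), d.strftime("Week %W")


# Dates assumed for the events mentioned in the findings.
events = {
    "Blizzard start": date(2016, 1, 23),
    "Blizzard end": date(2016, 1, 24),
    "Thanksgiving": date(2016, 11, 24),
    "Christmas": date(2016, 12, 25),
}

for name, d in events.items():
    weekday, week = heatmap_labels(d)
    print(f"{name}: {weekday}, {week}")
```

The weekday identifies the group of segments and the week label identifies the annular, which is the same information the hover tool shows for each cell.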
