Choropleth maps with geopandas, Bokeh and Panel

November 15 2019

Background

A choropleth is a map made of polygons (shapes), each representing a geographical area. The polygons are coloured according to some measured value for that region. For example, you could colour each country in the world according to its average life expectancy. All you need are coordinate files for the shapes and data values that can be matched to the polygon names.

Requirements

GeoPandas, Bokeh, Panel and Matplotlib can all be installed with pip or conda.

Imports:

import pandas as pd
import geopandas as gpd
import json
import matplotlib as mpl
import matplotlib.pyplot as plt

from bokeh.io import output_file, show, output_notebook, export_png
from bokeh.models import ColumnDataSource, GeoJSONDataSource, LinearColorMapper, ColorBar
from bokeh.plotting import figure
from bokeh.palettes import brewer

import panel as pn
import panel.widgets as pnw

Read Shapefile

A shapefile is a vector data format for storing the geometry and attributes of geographic features. It is stored as a set of related files, usually distributed together as a zip archive. Geopandas reads it easily. Here we use a shapefile from naturalearthdata covering all the countries in the world. It’s read into a DataFrame-like structure and the columns are renamed.

shapefile = 'data/ne_110m_admin_0_countries.shp'
#Read shapefile using Geopandas
gdf = gpd.read_file(shapefile)[['ADMIN', 'ADM0_A3', 'geometry']]
#Rename columns.
gdf.columns = ['country', 'country_code', 'geometry']
#Drop row 159, which should be Antarctica in this file
gdf = gdf.drop(gdf.index[159])
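
Since a shapefile is really several files, a read can fail if one of them is missing from a downloaded archive. The snippet below is a minimal sketch of a sanity check, not part of the original workflow: it lists which of the three mandatory components (.shp, .shx, .dbf) are absent from a zip file. The function name and the file names in the toy archive are hypothetical.

```python
import io
import zipfile
from pathlib import PurePosixPath

# The three components every shapefile needs: geometry, index, attribute table.
REQUIRED_EXTENSIONS = {'.shp', '.shx', '.dbf'}

def missing_shapefile_parts(zip_source):
    """Return the required extensions that are absent from a zip archive."""
    with zipfile.ZipFile(zip_source) as archive:
        found = {PurePosixPath(name).suffix.lower() for name in archive.namelist()}
    return sorted(REQUIRED_EXTENSIONS - found)

# Build a small in-memory archive (hypothetical file names) that lacks the .dbf file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    archive.writestr('countries.shp', b'')
    archive.writestr('countries.shx', b'')
buffer.seek(0)
print(missing_shapefile_parts(buffer))  # ['.dbf']
```

The same function accepts a path to a zip file on disk in place of the in-memory buffer.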

Fetch data

For this example we use datasets from ourworldindata, who store their data in a semi-standardized csv format on github. This means we can fetch the data by dataset name, which we will use later for an interactive map. The function below takes a dataset name, looks up its url in a table and downloads the data into a pandas dataframe. This is then merged into the GeoDataFrame object that we created above. We also return the name of the data column, which is used to set the colors.

owid = pd.read_csv('data/owid.csv').set_index('name')
def get_dataset(name,key=None,year=None):
    """Fetch an OWID dataset by name and merge it with the countries GeoDataFrame"""

    url = owid.loc[name].url
    df = pd.read_csv(url)
    #Keep only the rows for the requested year
    if year is not None:
        df = df[df['Year'] == year]
    if key is None:
        #name of column for plotting is always the third one
        key = df.columns[2]
    #merge with the geopandas dataframe
    merged = gdf.merge(df, left_on = 'country', right_on = 'Entity', how = 'left')
    #countries with no data get a value of 0
    merged[key] = merged[key].fillna(0)
    return merged, key
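
Because the merge is on country names, any country spelled differently in the two tables silently ends up with no data (and then a value of 0). The following is a small sketch of how such mismatches could be spotted with the indicator option of pandas merge. The two toy tables and their values are made up for illustration and only mimic the Entity/Year/value layout.

```python
import pandas as pd

# Toy stand-ins for the shapefile table and an OWID dataset (values are made up).
countries = pd.DataFrame({'country': ['France', 'United States of America', 'Kenya']})
values = pd.DataFrame({
    'Entity': ['France', 'United States', 'Kenya'],
    'Year': [2010, 2010, 2010],
    'value': [1.2, 0.9, 0.4],
})

# indicator=True adds a '_merge' column saying where each row came from.
merged = countries.merge(values, left_on='country', right_on='Entity',
                         how='left', indicator=True)

# Rows marked 'left_only' are countries that found no match in the dataset.
unmatched = merged.loc[merged['_merge'] == 'left_only', 'country'].tolist()
print(unmatched)  # ['United States of America']
```

Names that turn up here could be renamed in one of the tables before merging.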

Plot with matplotlib

Geopandas objects can be plotted directly with matplotlib.

datasetname='Land surface temperature anomaly'
data,key = get_dataset(datasetname, year=2010)
fig, ax = plt.subplots(1, figsize=(14, 8))
data.plot(column=key, cmap='OrRd', linewidth=0.8, ax=ax, edgecolor='black')
ax.axis('off')
ax.set_title('%s 2010' %datasetname, fontsize=18)

Plot with bokeh

Bokeh draws maps the way it would draw any polygons. First the geodataframe (with the color data column added) is converted into a GeoJSONDataSource object, which automatically creates fields called ‘xs’ and ‘ys’ holding the coordinates. We then plot the values as patches.

def get_geodatasource(gdf):
    """Get GeoJSONDataSource from geopandas object"""
    #to_json() already returns a GeoJSON string
    return GeoJSONDataSource(geojson = gdf.to_json())

def bokeh_plot_map(gdf, column=None, title=''):
    """Plot bokeh choropleth map of a column in a GeoDataFrame"""

    geosource = get_geodatasource(gdf)
    palette = brewer['OrRd'][8]
    palette = palette[::-1]
    vals = gdf[column]
    #Instantiate LinearColorMapper that linearly maps numbers in a range, into a sequence of colors.
    color_mapper = LinearColorMapper(palette = palette, low = vals.min(), high = vals.max())
    color_bar = ColorBar(color_mapper=color_mapper, label_standoff=8, width=500, height=20,
                         location=(0,0), orientation='horizontal')

    tools = 'wheel_zoom,pan,reset'
    #In Bokeh 3.0 and later use height and width instead of plot_height and plot_width
    p = figure(title = title, plot_height=400 , plot_width=850, toolbar_location='right', tools=tools)
    p.xgrid.grid_line_color = None
    p.ygrid.grid_line_color = None
    #Add patch renderer to figure
    p.patches('xs','ys', source=geosource, fill_alpha=1, line_width=0.5, line_color='black',  
              fill_color={'field' :column , 'transform': color_mapper})
    #Specify figure layout.
    p.add_layout(color_bar, 'below')
    return p
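
To make the colour mapping step less of a black box, here is a rough sketch in plain Python of what a linear mapping from values to palette entries does. It only approximates the idea behind LinearColorMapper. Bokeh's own implementation may differ in details, such as how out-of-range or missing values are coloured. The function name and the four-colour palette are hypothetical.

```python
def linear_color_index(value, low, high, n_colors):
    """Index of the palette colour for value under a linear mapping."""
    if high == low:
        return 0
    # Position of the value within the range, as a fraction from 0 to 1.
    fraction = (value - low) / (high - low)
    index = int(fraction * n_colors)
    # Clamp so that values at or beyond the ends use the first or last colour.
    return max(0, min(n_colors - 1, index))

# Hypothetical four-colour palette running from light to dark.
palette = ['#fff7ec', '#fdd49e', '#fc8d59', '#d7301f']
low, high = 0.0, 8.0
for value in (0.0, 1.9, 4.0, 8.0):
    print(value, palette[linear_color_index(value, low, high, len(palette))])
```

The range is split into equal-width bins, one per colour, so a few extreme values can push most countries into the lightest bins.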

Interactive plot with panel

If we want to be able to choose the year or dataset interactively, we can use Panel to make a small dashboard. This uses a few datasets from OWID and a range of years. Data may not be present for all years. This could be improved a lot: for example, the color scale changes with each year, which is confusing.

def map_dash():
    """Map dashboard"""

    map_pane = pn.pane.Bokeh(width=400)
    data_select = pnw.Select(name='dataset',options=list(owid.index))
    year_slider = pnw.IntSlider(start=1950,end=2018,value=2010)
    def update_map(event):
        #Rebuild the map whenever the dataset or the year changes
        gdf,key = get_dataset(name=data_select.value,year=year_slider.value)
        map_pane.object = bokeh_plot_map(gdf, key)
    year_slider.param.watch(update_map,'value')
    #Trigger once so that the map is drawn when the dashboard is created
    year_slider.param.trigger('value')
    data_select.param.watch(update_map,'value')
    app = pn.Column(pn.Row(data_select,year_slider),map_pane)
    return app

app = map_dash()
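
One possible way to deal with the changing scale mentioned above is to compute the colour limits once over all years, before filtering, and reuse them for every map. The sketch below shows the idea on a made-up table in the Entity/Year/value layout. Passing these limits into the plotting function would need extra arguments (for example hypothetical `low` and `high` parameters) that the code above does not have.

```python
import pandas as pd

# Toy dataset in the Entity/Year/value layout (numbers are made up).
df = pd.DataFrame({
    'Entity': ['France', 'France', 'Kenya', 'Kenya'],
    'Year': [2000, 2010, 2000, 2010],
    'value': [0.2, 1.5, -0.1, 0.8],
})
key = 'value'

# Compute the colour limits once, over every year, before filtering.
low, high = df[key].min(), df[key].max()

for year in (2000, 2010):
    subset = df[df['Year'] == year]
    # The per-year range differs, but low/high stay the same for every map.
    print(year, subset[key].min(), subset[key].max(), (low, high))
```

With fixed limits, a given colour means the same value in every year, at the cost of less contrast in years where the values are close together.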

The code in a Jupyter notebook and data files can be found together here.