Geometry Data

In [1]:
import numpy as np
import holoviews as hv
from holoviews import opts

hv.extension('bokeh')

Representing paths

The Path element represents a collection of path geometries with optional associated values. Each path geometry may be split into sub-geometries on NaN-values and may be associated with scalar values or array values varying along its length. In analogy to GEOS geometry types a Path is a collection of LineString and MultiLineString geometries with associated values.

While other formats can be supported through extensible interfaces (e.g. geopandas and shapely objects in GeoViews), HoloViews natively supports representing paths as one or more columnar data structures, including arrays, dataframes, and dictionaries of column arrays and scalars. A simple path geometry may therefore be drawn using:

In [2]:
hv.Path({'x': [1, 2, 3, 4, 5], 'y': [0, 0, 1, 1, 2]}, ['x', 'y'])
Out[2]:

Here the dictionary of x- and y-coordinates could also be a NumPy array with two columns or a dataframe with 'x' and 'y' columns.

To draw multiple paths, wrap the data structures in a list. A value can also be associated with each path by declaring it as a value dimension:

In [3]:
p = hv.Path([{'x': [1, 2, 3, 4, 5], 'y': [0, 0, 1, 1, 2], 'value': 0},
             {'x': [5, 4, 3, 2, 1], 'y': [2, 2, 1, 1, 0], 'value': 1}], vdims='value').opts(color='value')
p
Out[3]:

Multi-geometry

Splitting the geometries in this way allows a separate value to be assigned to each geometry. Often, however, multiple geometries share the same value, in which case it may be desirable to represent them as a multi-geometry by combining the coordinates and separating them with a NaN value:

In [4]:
hv.Path([{'x': [1, 2, 3, 4, 5, np.nan, 5, 4, 3, 2, 1],
          'y': [0, 0, 1, 1, 2, np.nan, 2, 2, 1, 1, 0], 'value': 0}],
        vdims='value').opts(color='value')
Out[4]:

This is a more efficient format, particularly when there are very many small geometries with the same value.
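
The NaN-separator convention can be illustrated without HoloViews. The following is a minimal plain-Python sketch, using a hypothetical helper named `split_on_nan`, of how combined coordinates break back into individual sub-geometries. It is not HoloViews' own implementation, only an illustration of the format:

```python
import math

def split_on_nan(xs, ys):
    """Split parallel coordinate lists into sub-geometries at NaN separators."""
    parts, current = [], []
    for x, y in zip(xs, ys):
        if math.isnan(x) or math.isnan(y):
            # A NaN closes the current sub-geometry (if it has any points).
            if current:
                parts.append(current)
            current = []
        else:
            current.append((x, y))
    if current:
        parts.append(current)
    return parts

nan = float('nan')
multi_x = [1, 2, 3, 4, 5, nan, 5, 4, 3, 2, 1]
multi_y = [0, 0, 1, 1, 2, nan, 2, 2, 1, 1, 0]

parts = split_on_nan(multi_x, multi_y)
print(len(parts))   # 2
print(parts[1][0])  # (5, 2)
```

Both sub-geometries live in one pair of coordinate lists and share one 'value' entry, which is where the saving comes from.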

Scalar vs. continuously varying value dimensions

Unlike Contours, which are limited to representing iso-contours (isolines), i.e. curves along which a function of two variables has a constant value, a Path element may also have values that vary continuously along its length. Below we declare a path with such a varying value:

In [5]:
a, b, delta = 3, 5, np.pi/2.

vs = np.linspace(0, np.pi*2, 200)
xs = np.sin(a * vs + delta)
ys = np.sin(b * vs)

hv.Path([{'x': xs, 'y': ys, 'value': vs}], vdims='value').opts(
    color='value', cmap='hsv')
Out[5]:

Note that since not all data formats allow storing scalar values as actual scalars, 1D arrays matching the length of the coordinates but with only one unique value are also considered scalar. For example, the following is a valid Contours element even though the value dimension is supplied as an array rather than a scalar:

In [6]:
hv.Contours([{'x': xs, 'y': ys, 'value': np.ones(200)}], vdims='value').opts(color='value')
Out[6]:
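
The "one unique value" rule described above can be sketched in plain Python. The helper name `is_scalar_like` is hypothetical and the check is only an illustration of the rule, not the test HoloViews performs internally:

```python
def is_scalar_like(values):
    """Return True for a true scalar or a sequence holding one unique value."""
    try:
        unique = set(values)
    except TypeError:
        # Not iterable, so it is already a scalar.
        return True
    return len(unique) == 1

print(is_scalar_like(0))                # True
print(is_scalar_like([1.0] * 200))      # True
print(is_scalar_like([0.0, 0.5, 1.0]))  # False
```

Under this rule a constant array such as the one passed to Contours above counts as scalar, while the varying values used for the Path do not.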

Representing Polygons

The Polygons element represents a collection of polygon geometries with associated scalar values. Each polygon geometry may be split into sub-geometries on NaN-values and may be associated with scalar values. In analogy to GEOS geometry types a Polygons element is a collection of Polygon and MultiPolygon geometries. Polygon geometries are defined as a set of coordinates describing the exterior bounding ring and any number of interior holes.

In summary, Polygons can be represented in much the same way as Paths above, but have a special reserved key, 'holes', to store the polygon interiors. The holes are stored as a list-of-lists of arrays. This nested format is necessary to unambiguously associate holes with the sub-geometries in a multi-geometry. In the simplest case of a single Polygon geometry the format looks like this:

In [7]:
xs = [1, 2, 3]
ys = [2, 0, 7]
holes = [[[(1.5, 2), (2, 3), (1.6, 1.6)], [(2.1, 4.5), (2.5, 5), (2.3, 3.5)]]]

hv.Polygons([{'x': xs, 'y': ys, 'holes': holes}])
Out[7]:

The 'x' and 'y' coordinates represent the exterior of the Polygon and the list-of-lists of holes defines two interior regions inside the polygon.

In a multi-Polygon arrangement where two Polygon geometries are separated by NaNs, the purpose of the nested format becomes a bit clearer. Here the polygon from above still has the two holes but the second polygon does not have any holes, which we declare with an empty list:

In [8]:
xs = [1, 2, 3, np.nan, 6, 7, 3]
ys = [2, 0, 7, np.nan, 7, 5, 2]

holes = [
    [[(1.5, 2), (2, 3), (1.6, 1.6)], [(2.1, 4.5), (2.5, 5), (2.3, 3.5)]],
    []
]

hv.Polygons([{'x': xs, 'y': ys, 'holes': holes}])
Out[8]:
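
The bookkeeping behind the nested format can be sketched in plain Python: the outer list should hold one list of holes per NaN-separated exterior. The helper name `pair_holes` is hypothetical, and the sketch assumes NaNs appear only as single separators between exteriors. It is not how HoloViews validates its input:

```python
import math

def pair_holes(exterior_x, nested_holes):
    """Map each sub-geometry index to the number of holes declared for it."""
    # Assumption: every NaN is a single separator between two exteriors.
    n_geoms = 1 + sum(1 for x in exterior_x if math.isnan(x))
    if len(nested_holes) != n_geoms:
        raise ValueError(
            f"expected {n_geoms} hole lists, got {len(nested_holes)}")
    return {i: len(h) for i, h in enumerate(nested_holes)}

nan = float('nan')
exterior_x = [1, 2, 3, nan, 6, 7, 3]
nested_holes = [
    [[(1.5, 2), (2, 3), (1.6, 1.6)], [(2.1, 4.5), (2.5, 5), (2.3, 3.5)]],
    [],  # the second polygon has no holes
]

hole_counts = pair_holes(exterior_x, nested_holes)
print(hole_counts)  # {0: 2, 1: 0}
```

The empty list is what keeps the positions aligned: without it there would be no way to tell which exterior the two holes belong to.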

If a polygon has no holes at all, the 'holes' key may be omitted entirely, as for the second and third polygons below:

In [9]:
hv.Polygons([{'x': xs, 'y': ys, 'holes': holes, 'value': 0},
             {'x': [4, 6, 6], 'y': [0, 2, 1], 'value': 1},
             {'x': [-3, -1, -6], 'y': [3, 2, 1], 'value': 3}], vdims='value')
Out[9]:

Accessing the data

To access the underlying data, the geometry elements (Path/Contours/Polygons) implement a split method. By default it simply returns a list of elements, each containing only one geometry:

In [10]:
poly = hv.Polygons([
    {'x': xs, 'y': ys, 'holes': holes, 'value': 0},
    {'x': [4, 6, 6], 'y': [0, 2, 1], 'value': 1}
], vdims='value')

hv.Layout(poly.split())
Out[10]:

Using the datatype argument, the data may instead be returned in the desired format, e.g. 'dictionary', 'array' or 'dataframe'. Here we return the 'dictionary' format:

In [11]:
poly.split(datatype='dictionary')
Out[11]:
[{'x': array([ 1.,  2.,  3.,  1., nan,  6.,  7.,  3.,  6.]),
  'y': array([ 2.,  0.,  7.,  2., nan,  7.,  5.,  2.,  7.]),
  'holes': [[[(1.5, 2), (2, 3), (1.6, 1.6)],
    [(2.1, 4.5), (2.5, 5), (2.3, 3.5)]],
   []],
  'value': 0,
  'geom_type': 'Polygon'},
 {'x': array([4, 6, 6, 4]),
  'y': array([0, 2, 1, 0]),
  'value': 1,
  'geom_type': 'Polygon'}]

Note that this conversion may be lossy if the converted format has no way of representing 'holes' or other data.
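
In the dictionary output above, each exterior ring comes back with its first vertex repeated at the end, i.e. the rings are explicitly closed. A minimal sketch of that closing step, using a hypothetical `close_ring` helper rather than anything from the HoloViews API:

```python
def close_ring(xs, ys):
    """Append the first vertex to the end of a ring unless it is already closed."""
    if xs and (xs[0], ys[0]) != (xs[-1], ys[-1]):
        return xs + [xs[0]], ys + [ys[0]]
    # Empty or already closed: return copies unchanged.
    return list(xs), list(ys)

closed = close_ring([4, 6, 6], [0, 2, 1])
print(closed)  # ([4, 6, 6, 4], [0, 2, 1, 0])
```

This reproduces the coordinates shown for the second polygon in the output; code consuming split data should therefore expect one more vertex per ring than was originally supplied.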