Visualizing building footprint areas

This Quarto doc walks through the steps of porting data over from R to Python and back again: loading the data in R, applying a rectangle-packing algorithm to it in Python, and then visualizing the result in R.

Load the data

Read in the data file downloaded from the EUBUCCO website (this takes a relatively long time):

library(sf)
dat_full <- read_sf('v0_1-SVN.gpkg')

This dataset has more than 1 million rows. For the purposes of this visualization, I’m only taking 1,000 randomly sampled rows:

dat <- dat_full[sample(nrow(dat_full), 1000), ]  

Calculate area of each polygon

Each row in the geom column contains a polygon. We can use an sf package function to calculate the area of each polygon and save it in a separate column. For plotting, we’ll assume each polygon is a square (which it isn’t) and save the sides of the square as well.

library(dplyr)
dat$area <- st_area(dat$geom) 

dat <- dat |>
  mutate(side_x = sqrt(area), side_y = sqrt(area)) |>
  st_drop_geometry() |>
  # keep only the side lengths to pass on to Python
  select(side_x, side_y)

dat$side_x <- as.numeric(dat$side_x)
dat$side_y <- as.numeric(dat$side_y)

library(knitr)
kable(head(dat))
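The square assumption boils down to one formula: a square with area A has side sqrt(A). Here is a minimal sketch of that step in Python, using made-up areas rather than real EUBUCCO values; the `square_side` helper is hypothetical and only mirrors the `sqrt(area)` call in the R code above.

```python
import math


def square_side(area: float) -> float:
    """Side length of a square that has the given area."""
    if area < 0:
        raise ValueError("area must be non-negative")
    return math.sqrt(area)


# Made-up footprint areas for illustration (not taken from the dataset).
example_areas = [25.0, 100.0, 150.0]

# One side length per footprint; side_x and side_y are the same value.
sides = [square_side(a) for a in example_areas]

# Squaring the side gives the area back, up to floating-point error.
recovered = [s * s for s in sides]
```

Because both sides are equal, storing `side_x` and `side_y` separately is redundant for squares, but it matches the (width, height) layout that a rectangle packer expects.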

Rectangle packing

Now, we pass these data to Python and use the rpack library to derive coordinates to pack rectangles into a plot:

dat_py = r.dat
# round to the nearest integer: rpack.pack only accepts integer sizes
dat_py = round(dat_py, 0).astype(int)

# convert to a list of [side_x, side_y] lists
dat_list = dat_py.values.tolist()

import rpack
import pandas as pd
import numpy as np

positions = rpack.pack(dat_list)

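# rpack.pack returns one (x, y) position per rectangle, in input order
# (np.vstack replaces np.row_stack, which is deprecated in NumPy 2.0)
positions = pd.DataFrame(np.vstack(positions))
positions.columns = ['x_coord', 'y_coord']

# reset both indexes so positions and side lengths line up row by row
positions = pd.concat([positions.reset_index(drop=True), dat_py.reset_index(drop=True)], axis=1)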
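Before plotting, it can be worth checking that the packing did what we expect. The sketch below is a sanity check I'd suggest, not part of rpack: it assumes positions are (x, y) corners and sizes are (width, height) pairs in the same order, as in the code above. The helper names (`overlaps`, `any_overlap`, `fill_ratio`) and the toy layout are made up for illustration.

```python
from itertools import combinations


def overlaps(pos_a, size_a, pos_b, size_b):
    """True if two axis-aligned rectangles share interior area."""
    (ax, ay), (aw, ah) = pos_a, size_a
    (bx, by), (bw, bh) = pos_b, size_b
    # Rectangles that only touch along an edge do not count as overlapping.
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def any_overlap(positions, sizes):
    """True if any pair of rectangles overlaps (checks every pair)."""
    rects = list(zip(positions, sizes))
    return any(
        overlaps(pa, sa, pb, sb)
        for (pa, sa), (pb, sb) in combinations(rects, 2)
    )


def fill_ratio(positions, sizes):
    """Share of the enclosing box covered by rectangles.

    Assumes the layout starts at (0, 0) and has no overlaps.
    """
    width = max(x + w for (x, _), (w, _) in zip(positions, sizes))
    height = max(y + h for (_, y), (_, h) in zip(positions, sizes))
    covered = sum(w * h for w, h in sizes)
    return covered / (width * height)


# Hand-made toy layout (hypothetical, not produced by rpack).
toy_sizes = [(3, 3), (2, 2), (1, 1)]
toy_positions = [(0, 0), (3, 0), (3, 2)]

toy_ok = not any_overlap(toy_positions, toy_sizes)
toy_fill = fill_ratio(toy_positions, toy_sizes)
```

With the real data, the same functions could be called on the position tuples and on `dat_list`. The pairwise check is quadratic, which is fine for 1,000 rectangles (about half a million comparisons) but would be slow for the full dataset.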

Visualization

Port the Python results back to R and visualize.

library(reticulate)
library(ggplot2)
library(showtext)
library(ggtext)

data_coord <- py$positions

# combine the two dataframes
#data_final <- cbind(dat, data_coord)

data_coord$x_end <- data_coord$x_coord + data_coord$side_x
data_coord$y_end <- data_coord$y_coord + data_coord$side_y
# this is slightly stupid, but whatever. recalculate the area
data_coord$area <- data_coord$side_x * data_coord$side_y

color_scale <- c(
  "#271220", 
  "#24101e", 
  "#574557", 
  "#a192ad", 
  "#e0d3f3", 
  "#e0c8e5"
)

font_add_google(name = 'Major Mono Display', family = 'MajorMono')
showtext_auto()

ggplot() +
  geom_rect(data = data_coord, 
            aes(xmin = x_coord, ymin = y_coord,
                xmax = x_end, ymax = y_end,
                fill = log(area)), color = '#F2eef8',
            linewidth = .08) +
  scale_fill_gradientn(colours = rev(color_scale)) +
  labs(title = '1,000 BUILDING FOOTPRINTS',
       caption = 'Each square represents the area of one of 1,000 randomly sampled building footprints in Slovenia. <br>Plotted with rectangle-packer. Data: EUBUCCO.\t') +
  coord_fixed() +
  theme_void() +
  theme(legend.position = 'none',
        plot.background = element_rect(fill = '#F2eef8', color = NA),
        plot.title = element_text(size = 130, family = 'MajorMono',
                                  color = '#271220', hjust = 0.5),
        plot.caption = element_markdown(lineheight = .4,
                                        size = 36, 
                                        family = 'MajorMono',
                                        color = '#271220'))

Save the plot:

ggsave('monochrome_viz.png', width = 12, height = 8, dpi = 300)

[Figure: building area visualization]
