The ggip package provides data visualization of IP addresses and networks. It achieves this by mapping one-dimensional IP data onto a two-dimensional grid.
This introductory vignette gives a quickstart guide on how to use ggip functions to generate plots.
The central component of any ggip plot is the coordinate system, as specified by
coord_ip(). It determines exactly how IP data is mapped to the 2D grid, and ensures this mapping is applied consistently across all plotted layers.
coord_ip() function must be called once per plot, and takes arguments:
||The region of IP space visualized by the entire 2D grid. By default, the entire IPv4 space is visualized. This argument allows you to zoom into a region of IP space or to visualize IPv6 space.|
||The network prefix length corresponding to a pixel in the 2D grid. Increasing this argument effectively improves the resolution of the plot.|
||The path taken across the 2D grid while mapping IP data.|
More details about this mapping are found in
Behind the scenes, ggip searches the plotted data sets for any columns that are
ip_network() vectors. For each matching column, it replaces the vector with a data frame containing both the original IP data and the mapped Cartesian coordinates. This means the plotted data set now contains a nested data frame column.
As an example, consider a data set featuring two columns. The
label column is a character vector and the
address column is an
#> # A tibble: 3 x 2 #> label address #> <chr> <ip_addr> #> 1 A 0.0.0.0 #> 2 B 192.168.0.1 #> 3 C 255.255.255.255
This data set is transformed such that the
address column is now a data frame. It contains an
ip column with the original
ip_address() vector, and
y columns with the Cartesian coordinates on the 2D grid.
#> # A tibble: 3 x 2 #> label address$ip $x $y #> <chr> <ip_addr> <int> <int> #> 1 A 0.0.0.0 0 255 #> 2 B 192.168.0.1 214 142 #> 3 C 255.255.255.255 255 255
These transformed data frame columns are available when specifying aesthetics. The nested columns can be accessed using the usual
$ syntax (see examples below).
Layers from ggplot2 and other external packages don’t know about the internal data transformation used by ggip. For this reason, these layers expect their positional aesthetics (e.g.
y) to be specified explicitly. Fortunately, we can extract the Cartesian coordinates from our data frame columns using the
As an example, we plot an
ip_address() vector as points accompanied by labels. Note that we’ve specified the
label aesthetics at the top level of the plot, and then the
geom_label() layers have picked them up later.
tibble(address = ip_address(c("0.0.0.0", "188.8.131.52", "192.168.0.1"))) %>% ggplot(aes(x = address$x, y = address$y, label = address$ip)) + geom_point() + geom_label(nudge_x = c(10, 0, -10), nudge_y = -10) + coord_ip(expand = TRUE) + theme_ip_light()
Similarly, we plot
ip_network() vectors using layers corresponding to rectangles.
iana_ipv4 %>% ggplot(aes(xmin = network$xmin, ymin = network$ymin, xmax = network$xmax, ymax = network$ymax)) + geom_rect(aes(fill = allocation)) + scale_fill_brewer(palette = "Accent", name = NULL) + coord_ip() + theme_ip_dark()
Note: There are small gaps between the rectangles because networks are mapped onto a 2D grid (i.e. discrete), whereas ggplot2 visualizes the continuous 2D plane. This can be resolved by adding/subtracting 0.5 to the positional aesthetics. However, this gap is often helpful to distinguish networks.
Layers from ggip do know about the internal data transformation, so they take an
ip aesthetic corresponding to the data frame column. They can then automatically extract the relevant positional information. This is easier because the name of the data frame column is also the name of the original
ip_network() column in the input data set.
As an example, we plot a heatmap of an
tibble(address = sample_ipv4(10000)) %>% ggplot(aes(ip = address)) + stat_summary_address() + scale_fill_viridis_c(guide = "none") + coord_ip() + theme_ip_dark()