
The Mapper package provides an R implementation of the Mapper framework (see 1). The package includes:

  • Efficient implementations of Mapper components using Rcpp

  • Practical default filters, covers, and other settings for those unfamiliar with Mapper

  • Composable API via method chaining

  • Pre-configured tools for visualizing and interacting with mappers

The package is designed to make modifying or extending the Mapper method simple and efficient, without limiting its generality.

Installation

Install the Mapper package from GitHub as follows:

require("devtools")
devtools::install_github("peekxc/Mapper")

A CRAN release is planned for the near future.

Getting started

Mapper takes as input a point cloud \(X\) and a reference map \(f : X \to Z\), and returns a topological summary of \(X\) expressed via a cover of the codomain of the map. For example, consider a point cloud sampled from an ‘eight-curve’ in \(\mathbb{R}^2\):

\[g(t) = [\cos(t), \sin(t)\cos(t)],\; t \in \Big(-\frac{1}{2}\pi, \frac{3}{2}\pi\Big)\]

In the example below, the data set \(X\) is created from equally spaced samples over \(t\), and the map chosen is simply the \(x\)-coordinate of the shape, i.e. \(f(X) = Z = x_1\).

t <- seq(-0.5*pi, (3/2)*pi, length.out = 100) + runif(100, min=0.01, max = 0.02)
eight <- cbind(x1=cos(t), x2=sin(t)*cos(t))
f_x <- matrix(cos(t))

## View the data along with the mapping 
layout(matrix(1:2, nrow = 1))
plot(eight, pch = 20, col = bin_color(f_x), main = expression(X %subset% R^2))
stripchart(f_x, pch = "|", main = expression(f(X) %subset% R)) 
points(cbind(f_x, 1), pch = "|", col = bin_color(f_x), cex = 2)

Below is an illustrative example of how one may go about constructing a mapper.

There are multiple options for visualizing the mapper. A default plotting method is available that uses an igraph-determined layout:

plot(m$simplicial_complex)

For other visualization options, see below.

Customizing Mapper

Almost any component of the Mapper method can be customized.

Want to change the metric? Pass the name of any proximity measure used in the proxy package.

## See ?proxy::pr_DB for more details.
m$use_distance_measure("manhattan") ## This is stored as m$measure

Prefer a different linkage criterion to cluster with? Any of the criteria used by hclust can be swapped in.
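
For orientation, the linkage names are the ones accepted by the method argument of stats::hclust. The sketch below uses base R only, on a small synthetic data set (the data and variable names are illustrative), to show how the choice of linkage is passed to hclust. It does not call the Mapper package.

```r
set.seed(1)
# Two well-separated synthetic clusters of 20 points each in the plane (illustrative data)
X <- rbind(matrix(rnorm(40, mean = 0, sd = 0.1), ncol = 2),
           matrix(rnorm(40, mean = 3, sd = 0.1), ncol = 2))
d <- dist(X)  # Euclidean distance by default

# A few of the linkage criteria accepted by hclust()'s `method` argument
linkages <- c("single", "complete", "average", "ward.D2")
trees <- lapply(linkages, function(lk) hclust(d, method = lk))
names(trees) <- linkages

# Cut each dendrogram into two clusters and keep the labels
labels <- lapply(trees, cutree, k = 2)
```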

Or, just replace the clustering algorithm entirely by supplying a function.

If you prefer a different covering, just assign a valid object inheriting from CoverRef.

m$use_cover(cover="fixed interval", number_intervals = 10L, percent_overlap = 50)
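
To make the two parameters concrete, here is a base-R sketch of how overlapping intervals could be derived from number_intervals and percent_overlap. The helper name fixed_interval_bounds is hypothetical. The overlap convention is also an assumption: each interval is widened so that percent_overlap percent of its length is shared with each adjacent interval. The package's internals may differ.

```r
# Hypothetical helper (not part of the Mapper package): endpoints of
# overlapping intervals covering the range of a 1-d filter.
fixed_interval_bounds <- function(f, number_intervals = 10L, percent_overlap = 50) {
  stopifnot(percent_overlap >= 0, percent_overlap < 100)
  rng <- range(f)
  base_len <- diff(rng) / number_intervals  # interval length with no overlap
  # Assumed convention: widen each interval so that `percent_overlap` percent
  # of its length is shared with each adjacent interval.
  len <- base_len / (1 - percent_overlap / 100)
  centers <- rng[1] + base_len * (seq_len(number_intervals) - 0.5)
  cbind(lower = centers - len / 2, upper = centers + len / 2)
}

t <- seq(-0.5 * pi, 1.5 * pi, length.out = 100)
f_x <- cos(t)
bounds <- fixed_interval_bounds(f_x, number_intervals = 10L, percent_overlap = 50)

# Indices of the points falling in each interval (the preimages of the cover sets)
preimages <- lapply(seq_len(nrow(bounds)), function(i) {
  which(f_x >= bounds[i, "lower"] & f_x <= bounds[i, "upper"])
})
```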

A list of available covering methods, their corresponding parameters, and their generators can be printed as follows:

Alternatively, you can create your own cover. See the article on how to make a custom cover.

Prior to constructing the simplicial complex, Mapper requires applying the pullback operation. Computationally, the pullback applies the clustering algorithm to subsets of the data given by the cover, which decomposes the data set into connected components. In Mapper, these connected components are represented as vertices. To view which vertices are mapped from the sets in the cover, use the pullback member:

The vertices are stored as a named list. Each vertex contains a vector of the indices of the points that comprise the connected component.
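
The pullback step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the five-interval cover, the fixed cut height eps, and the helper pullback_vertices are all hypothetical, and the eight-curve is sampled without the random jitter used earlier.

```r
# Eight-curve sample and its filter (the x-coordinate)
t <- seq(-0.5 * pi, 1.5 * pi, length.out = 100)
X <- cbind(x1 = cos(t), x2 = sin(t) * cos(t))
f_x <- X[, "x1"]

# Hypothetical cover of [-1, 1] by five overlapping intervals
bounds <- rbind(c(-1.0, -0.7), c(-0.8, -0.2), c(-0.3, 0.3), c(0.2, 0.8), c(0.7, 1.0))
cover <- lapply(seq_len(nrow(bounds)), function(i) {
  which(f_x >= bounds[i, 1] & f_x <= bounds[i, 2])
})

# Cluster each preimage with single linkage, cutting the tree at height `eps`.
# Every resulting cluster becomes one vertex, stored as a vector of point indices.
pullback_vertices <- function(X, cover, eps) {
  vertices <- list()
  for (idx in cover) {
    if (length(idx) == 0L) next
    cl <- if (length(idx) == 1L) 1L else
      cutree(hclust(dist(X[idx, , drop = FALSE]), method = "single"), h = eps)
    vertices <- c(vertices, unname(split(idx, cl)))
  }
  names(vertices) <- as.character(seq_along(vertices))
  vertices
}

vertices <- pullback_vertices(X, cover, eps = 0.3)
```

With this cover, the two intervals that sit strictly inside a lobe each split into two clusters (the upper and lower branches), while the intervals containing the lobe tips or the crossing point stay connected.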

Once you’re satisfied with the clustering, you can construct the nerve, the principal output of Mapper. The complex is stored in a Simplex Tree (see 2), which is available via the $simplicial_complex member. Initially, the complex is empty:

The maximum dimension of the nerve is up to you. It’s common to restrict the mapper to its \(1\)-skeleton.

m$construct_nerve(k = 0L)
m$construct_nerve(k = 1L)
plot(m$simplicial_complex)
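
Conceptually, the \(1\)-skeleton of the nerve has one vertex per cluster and an edge wherever two clusters share at least one point. The base-R sketch below illustrates this on a hand-written vertex list. Both the list and the helper nerve_edges are hypothetical, and the brute-force pairwise check is only for illustration; it is not a description of how the package computes the nerve.

```r
# Hypothetical vertex list: each entry holds the point indices of one cluster
vertices <- list(
  "1" = 1:5,
  "2" = 4:9,
  "3" = c(8L, 9L, 10L, 11L),
  "4" = 12:15
)

# Edges of the 1-skeleton: pairs of vertices whose point sets intersect
nerve_edges <- function(vertices) {
  ids <- names(vertices)
  if (length(ids) < 2L) return(matrix(character(0), ncol = 2))
  pairs <- t(combn(ids, 2))  # one candidate pair per row
  keep <- apply(pairs, 1, function(p) {
    length(intersect(vertices[[p[1]]], vertices[[p[2]]])) > 0L
  })
  pairs[keep, , drop = FALSE]
}

edges <- nerve_edges(vertices)
```

Here vertices "1" and "2" share points 4 and 5, and "2" and "3" share points 8 and 9, so those two pairs become edges; vertex "4" stays isolated.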

By default, the construct... series of functions act by side effect and return the instance invisibly, which makes them suitable for chaining. If you want to inspect the result before modifying the instance, pass modify=FALSE. For example, to list the vertices that have a non-empty intersection:

The \(1\)-skeleton can be exported to any of the usual graph-type data structures.
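
As a generic illustration of such structures, an edge list can be turned into an adjacency matrix or an adjacency list with base R alone. The vertex ids and edges below are made up for the example, and the package's own export methods are not shown here.

```r
# Hypothetical 1-skeleton: vertex ids and an edge list (one edge per row)
ids <- c("1", "2", "3", "4")
edges <- rbind(c("1", "2"), c("2", "3"), c("3", "1"))

# Adjacency matrix (symmetric, since the graph is undirected)
adj <- matrix(0L, nrow = length(ids), ncol = length(ids), dimnames = list(ids, ids))
adj[edges] <- 1L                           # index by (row name, column name) pairs
adj[edges[, 2:1, drop = FALSE]] <- 1L      # mirror each edge

# Adjacency list: neighbours of each vertex
adj_list <- lapply(ids, function(v) ids[adj[v, ] == 1L])
names(adj_list) <- ids
```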

Visualizing the mapper

To get a quick overview of what the mapper looks like, you can use the default plotting method above given by the simplextree package.

Alternatively, the \(1\)-skeleton can be automatically converted to an igraph object and customized as needed.

plot(m$as_igraph(), vertex.label=NA)

For more interactive visualization options, consider the (experimental) pixiplex package.

References

1. Singh, Gurjeet, Facundo Mémoli, and Gunnar E. Carlsson. “Topological methods for the analysis of high dimensional data sets and 3D object recognition.” Eurographics Symposium on Point-Based Graphics (SPBG). 2007.

2. Boissonnat, Jean-Daniel, and Clément Maria. “The simplex tree: An efficient data structure for general simplicial complexes.” Algorithmica 70.3 (2014): 406-427.