The **Mapper** package provides a configurable implementation of the Mapper framework (see 1). The package includes:

Efficient implementation of the Mapper framework

Multiple options for visualizing and interacting with

*mappers*Practical defaults for those unfamiliar with Mapper

Expressive parameterization interface through method chaining

Portable R6 generators for Mappers components

The package was designed to make modifying or extending the Mapper method simple and efficient, without limiting its generality.

Install the **Mapper** package from github as follows:

```
require("devtools")
devtools::install_github("peekxc/mapper")
```

A CRAN release is planned for the near future.

Mapper takes as input a point cloud \(X\) and a reference map \(f : X \to Z\), and returns a topological summary of \(X\) expressed via a cover equipped on the codomain of the map. Here’s an example where points are sampled from an ‘eight-curve’ in \(\mathbb{R}^2\):

\[g(t) = [\cos(t), \sin(t)\cos(t)],\; t \in \Big(-\frac{1}{2}\pi, \frac{3}{2}\pi\Big)\] In the example below, \(X\) is created from equally spaced samples over \(t\), and the map chosen is simply the \(x\)-coordinate of the shape, i.e. \(f(X) = Z = x_1\).

```
t <- seq(-0.5*pi, (3/2)*pi, length.out = 100)
eight <- cbind(x1=cos(t), x2=sin(t)*cos(t))
f_x <- matrix(cos(t))
## Use a rainbow palette
rbw_col <- rainbow(length(f_x), start = 0, end = 4/6)
col <- rbw_col[cut(f_x, length(f_x), labels = FALSE)]
## View the data along with the mapping
layout(matrix(1:2, nrow = 1))
plot(eight, pch = 20, col = col, main = expression(X %subset% R^2))
stripchart(f_x, pch = "|", main = expression(f(X) %subset% R))
points(cbind(f_x, 1), pch = "|", col = col, cex = 2)
```

Below is one possible way to construct a *mapper*.

```
m <- MapperRef$
new(X = eight)$
use_cover(filter_values = f_x,
typename = "restrained rectangular",
number_intervals = 10, percent_overlap = 50)$
use_distance_measure(measure = "euclidean")$
use_clustering_algorithm(cl = "single")$
compute_k_skeleton(k = 1L)
print(m)
```

```
## Mapper construction for 100 objects
## Cover: (typename = Restrained Rectangular, number intervals = [10], overlap = [50])
```

```
Mapper construction for 100 objects
Cover: (typename = Restrained Rectangular, number intervals = [10], overlap = [50])
```

There are multiple options one may use to visualize **mapper**. The easiest is to export the \(1\)-skeleton to a mature graph plotting library, such as `igraph`

:

```
am <- m$simplicial_complex$as_adjacency_matrix()
g <- igraph::graph_from_adjacency_matrix(am, mode = "undirected", add.colnames = NA)
avg_filter_val <- sapply(m$vertices, function(idx) mean(f_x[idx,]))
plot(g, layout=igraph::layout_with_fr(g, niter = 5500),
vertex.size = 18L, vertex.color = bin_color(avg_filter_val), label.cex = 0.3)
```

For other visualization options, see the section below.

Almost any component of the Mapper method can be customized.

Want to change the metric? Pass the name of any proximity measure used in the proxy package.

```
## See ?proxy::pr_DB for more details.
m$use_distance_measure("manhattan") ## This is stored as m$measure
```

Prefer a different linkage criteria to cluster over? Any of the criteria used by hclust can be swapped in.

`m$use_clustering_algorithm(cl = "average", num_bins = 10L)`

Or, just replace the clustering algorithm entirely

```
## If using a custom metric, just compute the distance matrix
m$use_clustering_algorithm(cl = function(X, idx, num_bins=10){
dist_x <- dist(X[idx,], method = self$measure)
hc <- hclust(dist_x, method = "average")
cutoff_first_bin(hc, diam = max(dist_x), num_bins)
})
```

If you prefer a different covering, just assign a valid object inheriting from `CoverRef`

.

```
m$use_cover(filter_values = f_x,
typename="fixed rectangular",
number_intervals = 10L, percent_overlap=50)
```

A list of available covering methods, their correspondings parameters, and their generators can be printed as follows:

```
## Typename: Generator: Parameters:
## fixed rectangular FixedRectangularCover number_intervals, percent_overlap
## restrained rectangular RestrainedRectangularCover number_intervals, percent_overlap
## adaptive AdaptiveCover number_intervals, percent_overlap, quantile_method
## ball BallCover epsilon
```

Alternatively, you can create your own cover. See the article on how to make custom cover.

The cover is composed of subsets of the data indexed by an **index set**

`print(m$cover$index_set)`

`## [1] "(1)" "(2)" "(3)" "(4)" "(5)" "(6)" "(7)" "(8)" "(9)" "(10)"`

These indices can be used as keys into the collection of sets comprising the cover.

`## [1] TRUE`

This can be useful if you want to only update only part of the complex for some reason, e.g. re-cluster part of the data with different clustering parameters. For example, to re-cluster the data associated with the preimages the first two indices, you can do:

```
clustering_params <- list(num_bins=5L)
m$compute_vertices(which_levels = m$cover$index_set[1:2], clustering_params)
```

The same rule applies to computing the edges.

```
idx_pairs <- t(apply(combn(5, 2), 2, function(ls_idx) { m$cover$index_set[ls_idx] }))
m$compute_edges(which_level_pairs = idx_pairs)
```

If the construction has too many edges, you could require the intersection size be above a certain threshold (default is 1). This isn’t technically part of Mappers definition, but could be useful in practical applications.

`m$compute_edges(min_weight = 10)`

The simplicial complex is available via the `$simplicial_complex`

member, and the points in contained in each vertex is available via the `$vertices`

member.

`## Simplex Tree with (15, 15) (0, 1)-simplices`

`head(m$vertices, 2)`

```
## $`0`
## [1] 63 64 65 66 67 68 69 70 71 72 73 74 75 76
## attr(,"level_set")
## [1] 1
##
## $`1`
## [1] 59 60 61 62 63 64 65 66 67 68
## attr(,"level_set")
## [1] 2
```

The complex is stored in a Simplex Tree (see 2), and the vertices are stored as a list. Each vertex also has an attribute “level_set” denoting the index in the index set of the cover its associated with.

The \(1\)-skeleton can be exported to any of the usual graph-type data structures.

To get a quick overview of what the graph looks like, you can convert to an igraph object. By default, the vertices are sized logarithmically according to how many points they have in them and colored based on the mean value of the points \(f\) values. Each vertex is given a label of the form “x:y” where ‘x’ denotes the vertex id, and ‘y’ denotes the number of points in the vertex.

More more interactive plotting methods, the **mapper** can be converted to a grapher object.

```
## Visualize interactively w/ grapher
library("grapher")
m$as_grapher(construct_widget = TRUE) %>% center()
```

- Singh, Gurjeet, Facundo Mémoli, and Gunnar E. Carlsson. “Topological methods for the analysis of high dimensional data sets and 3d object recognition.” SPBG. 2007.
- Boissonnat, Jean-Daniel, and Clément Maria. “The simplex tree: An efficient data structure for general simplicial complexes.” Algorithmica 70.3 (2014): 406-427.

- Report a bug at

https://github.com/peekxc/mapper

- file LICENSE

- Matt Piekenbrock

Author, maintainer, copyright holder - All authors...