The `Mapper`

package provides an R implementation of the *Mapper* framework (see 1). The package includes:

Efficient implementations of

*Mapper*components using RcppPractical default filters, covers, and other settings for those unfamiliar with

*Mapper*Composable API via method chaining

Pre-configured tools for visualizing and interacting with

*mappers*

The package is designed to make modifying or extending the *Mapper* method simple and efficient, *without* limiting its generality.

Install the `Mapper`

package from github as follows:

```
require("devtools")
devtools::install_github("peekxc/Mapper")
```

A CRAN release is planned for the near future.

*Mapper* takes as input a point cloud \(X\) and a reference map \(f : X \to Z\), and returns a topological summary of \(X\) expressed via a cover equipped to the codomain of the map. For example, consider a point cloud sampled from an ‘eight-curve’ in \(\mathbb{R}^2\):

\[g(t) = [\cos(t), \sin(t)\cos(t)],\; t \in \Big(-\frac{1}{2}\pi, \frac{3}{2}\pi\Big)\] In the example below, the data set \(X\) is created from equally spaced samples over \(t\), and the map chosen is simply the \(x\)-coordinate of the shape, i.e. \(f(X) = Z = x_1\).

```
t <- seq(-0.5*pi, (3/2)*pi, length.out = 100) + runif(100, min=0.01, max = 0.02)
eight <- cbind(x1=cos(t), x2=sin(t)*cos(t))
f_x <- matrix(cos(t))
## View the data along with the mapping
layout(matrix(1:2, nrow = 1))
plot(eight, pch = 20, col = bin_color(f_x), main = expression(X %subset% R^2))
stripchart(f_x, pch = "|", main = expression(f(X) %subset% R))
points(cbind(f_x, 1), pch = "|", col = bin_color(f_x), cex = 2)
```

Below is a illustrative example of how one may go about constructing a *mapper*.

```
m <- MapperRef$new(X = eight)$
use_filter(filter= f_x)$
use_cover(cover = "fixed interval", number_intervals = 5, percent_overlap = 20)$
use_clustering_algorithm(cl = "single")$
construct_k_skeleton(k = 1L)
print(m)
#> Mapper construction with (7, 8) (0, 1)-simplices
#> Fixed Interval Cover: (number intervals = [5], percent overlap = [20]%)
```

There are multiple options one may use to visualize **mapper**. A default plotting method is available using an `igraph`

determined layout:

`plot(m$simplicial_complex)`

For other visualization options, see below.

Almost any component of the Mapper method can be customized.

Want to change the metric? Pass the name of any proximity measure used in the proxy package.

```
## See ?proxy::pr_DB for more details.
m$use_distance_measure("manhattan") ## This is stored as m$measure
```

Prefer a different linkage criteria to cluster with? Any of the criteria used by hclust can be swapped in.

`m$use_clustering_algorithm(cl = "average")`

Or, just replace the clustering algorithm entirely by supplying a function.

```
## If using a custom metric, just compute the distance matrix
## pid := open set index
## idx := point indices in the preimage of the open set indexed by 'pid'
## self := mapper instance object ('m' in this case)
m$use_clustering_algorithm(cl = function(pid, idx, self){
dist_x <- dist(self$X(idx), method = self$measure)
hc <- hclust(dist_x, method = "average")
eps <- cutoff_first_threshold(hc)
cutree(hc, h = eps)
})
```

If you prefer a different covering, just assign a valid object inheriting from `CoverRef`

.

A list of available covering methods, their correspondings parameters, and their generators can be printed as follows:

```
covers_available()
#> Typename: Generator: Parameters:
#> fixed interval FixedIntervalCover number_intervals, percent_overlap
#> restrained interval RestrainedIntervalCover number_intervals, percent_overlap
#> ball BallCover epsilon
```

Alternatively, you can create your own cover. See the article on how to make custom cover.

Prior to constructing the simplicial complex, *Mapper* requires applying the *pullback* operation. Computationally, the pullback applies the clustering algorithm to subsets of the data given by the cover, which decomposes the data set into connected components. In Mapper, these connected components are represented as vertices. To view which vertices are mapped from the sets in the cover, use the `pullback`

member:

```
m$construct_pullback()
str(m$pullback)
#> List of 10
#> $ (1) : num 0
#> $ (2) : num [1:2] 1 2
#> $ (3) : num [1:2] 3 4
#> $ (4) : num 5
#> $ (5) : num 6
#> $ (6) : num 7
#> $ (7) : num 8
#> $ (8) : num [1:2] 9 10
#> $ (9) : num [1:2] 11 12
#> $ (10): num 13
```

The vertices are stored are stored as a named list. Each vertex contains a vector of the indices that representing the points that comprise the connected component.

```
sapply(m$vertices, length)
#> 0 1 2 3 4 5 6 7 8 9 10 11 12 13
#> 25 9 9 7 7 14 15 13 12 7 7 10 10 26
```

Once you’re satisfied with the clustering, you can construct the nerve, the principal output of *Mapper*. The complex is stored in a Simplex Tree (see 2), which available via the `$simplicial_complex`

member. Initially, the complex is empty:

The maximum dimension of the nerve is up to you. It’s common restrict the *mapper* to \(1\)-skeleton.

```
m$construct_nerve(k = 0L)
m$construct_nerve(k = 1L)
plot(m$simplicial_complex)
```

By default, the `construct...`

series of functions enact side-effects and return the instance invisibly, making them suitable to chain. If you want to inspect the result before modifying the instance, pass `modify=FALSE`

. For example, to list the vertices that have a non-empty intersection:

```
m$construct_nerve(k = 1L, modify = FALSE)
#> [,1] [,2]
#> [1,] 0 1
#> [2,] 0 2
#> [3,] 1 3
#> [4,] 2 4
#> [5,] 3 5
#> [6,] 4 5
#> [7,] 5 6
#> [8,] 6 7
#> [9,] 7 8
#> [10,] 8 9
#> [11,] 8 10
#> [12,] 9 11
#> [13,] 10 12
#> [14,] 11 13
#> [15,] 12 13
```

The \(1\)-skeleton can be exported to any of the usual graph-type data structures.

To get a quick overview of what the **mapper** looks like, you can use the default plotting method above given by the simplextree package.

Alternatively, the \(1\)-skeleton can be automatically converted to igraph objects and customized as needed.

For more interactive visualization options, consider the (experimental) pixiplex package.

1. Singh, Gurjeet, Facundo Mémoli, and Gunnar E. Carlsson. “Topological methods for the analysis of high dimensional data sets and 3d object recognition.” SPBG. 2007.

2. Boissonnat, Jean-Daniel, and Clément Maria. “The simplex tree: An efficient data structure for general simplicial complexes.” Algorithmica 70.3 (2014): 406-427.