vignettes/UsingCustomCover.Rmd
UsingCustomCover.Rmd
Perhaps one of the most important aspects of parameterizing Mapper is the choice of the cover. Given data \(X\) and a continuous map \(f : X \to Z\), the cover (often denoted as \(\mathcal{U} = \{U_\alpha\}_{\alpha \in A}\)) is constructed over the values of \(Z\). Since \(f\) is continuous, \(\mathcal{U}\) also forms a cover of \(X\). That is, the sets
\[\{ f^{-1}(U_\alpha), \alpha \in A \}\] form an open cover over \(X\), i.e. \(X \subseteq \bigcup\limits_{\alpha \in A} f^{-1}(U_\alpha)\). By applying a partial clustering to each of these open sets in \(X\), \(f^{-1}(U_\alpha)\) decomposes into a set of path-connected components. These form the \(0\)-simplices of the resulting mapper construction.
The Mapper
package theoretically can support any covering strategy, so long as a few requirements are satisfied. All covers can be constructed either directly via their R6 generators, or more succinctly using the member function associated with a object. See for more details. A few frequently used methods are included in the package. Their typenames, generators, and parameters can be listed via:
## Typename: Generator: Parameters:
## fixed interval FixedIntervalCover number_intervals, percent_overlap
## restrained interval RestrainedIntervalCover number_intervals, percent_overlap
## ball BallCover epsilon
To use a custom covering strategy, a new R6
class must be defined satisfying the following requirements:
The class inherits from the CoverRef
class.
The class overrides the construct_cover
function.
Upon calling construct_cover
, the class populates the level_sets
member with a named list whose names are uniquely indexed by the index_set
member, and whose values are integer indices that form a valid cover.
Below is an example of a simplified version of the source code used to create the BallCover
generator.
BallCoverEx <- R6::R6Class(
classname = "BallCoverEx",
inherit = CoverRef,
public = list(
epsilon=NA, ## cover parameter
initialize = function(epsilon){
super$initialize(typename="ball")
self$epsilon <- epsilon
}
)
)
First, a generator is defined that satisfies (1). The public fields represent the parameters of the cover. Note that the initializer must be overridden so that the filter values along with the unique ‘typename’ of the cover are first passed to the parent CoverRef
initializer. When that happens, a few checks are performed, and the following private fields become available to the derived class via the private
namespace.
private$
__.level_sets:__
list of data indices that intersect the open sets
__.index_set:__
character vector of keys that uniquely index the level sets
__.typename:__
string identifier of the covering method
Additional information can be found in the documentation of the base class, see ?CoverRef
for more details.
The only other required function to override is the construct_cover
method. Below is an example that uses the RANN
package to construct a cover.
BallCoverEx$set("public", "construct_cover", function(filter, index=NULL){
## Get filter values
fv <- filter()
f_dim <- ncol(fv)
f_size <- nrow(fv)
## Construct the balls
ball_cover <- RANN::nn2(fv, query = fv, searchtype = "radius", radius = self$epsilon)
## Union them together
ds <- union_find(f_size)
apply(ball_cover$nn.idx, 1, function(idx){
connected_idx <- idx[idx != 0] - 1L
if (length(connected_idx) > 0){ ds$union_all(connected_idx) }
})
## Construct the intersections between the open sets and the data
cc <- ds$connected_components()
self$index_set <- as.character(unique(cc))
ls <- lapply(self$index_set, function(idx){ which(cc == as.integer(idx)) })
self$level_sets <- structure(ls, names=self$index_set)
## If index is requested, return the preimages at that index,
## otherwise return the instance invisibly
if (!missing(index)){ return(self$level_sets[[index]]) }
invisible(self)
})
A disjoint-set data structure is used to track the intersections between each ball (see ?union_find
for usage details). Note how the index set uniquely indexes the open sets. Although in this case the balls don’t necessarily represent the classical notion of a “level set”, the member is named that way due to their relaxed interpretation as such in the other covers.
The above example is all one needs to add use a cover not included in the package. However, there are other methods or extensions that may be worth implementing. For example, the above example cover isn’t type-safe. A better example would use active bindings to make fields which can be assigned to like regular member fields, but also type-check on assignment.
Another important method that can be overridden is the $level_sets_to_compare
member, which is inherited by the base class. This method controls which of the open sets to compare (pairwise) in computing the \(1\)-skeleton. By default, it returns all pairwise combinations of indices, which can obviously be greatly improved depending on how the cover is parameterized!
Once you have a cover generator, you can check if its compatible with the package using the $validate
function. If it does, it should silently return.
If you do design a new covering method not used in the package, consider submitting a pull request incase others might find it useful!