Fraction-Score: A Generalized Support Measure for Weighted and Maximal Co-Location Pattern Mining

Co-location patterns, which capture the phenomenon that objects with certain labels are often located in close geographic proximity, are defined based on a support measure which quantifies the prevalence of a pattern candidate in the form of a label set. Existing support measures share the idea of counting the number of instances of a given label set $C$ as its support, where an instance of $C$ is an object set whose objects collectively carry all labels in $C$ and are located close to one another. However, they suffer from various weaknesses, e.g., they fail to capture all possible instances, or overlook the cases where multiple instances overlap. In this paper, we propose a new measure called Fraction-Score which counts instances fractionally if they overlap. Fraction-Score captures all possible instances, and handles the cases where instances overlap appropriately (so that the supports defined are more meaningful and anti-monotonic). We develop efficient algorithms to solve the co-location pattern mining problem defined with Fraction-Score. Furthermore, to obtain representative patterns, we develop an efficient algorithm for mining the maximal co-location patterns, which are those patterns without proper superset patterns. We conduct extensive experiments using real and synthetic datasets, which verify the superiority of our proposals.

but also location information about their habitats; in urban areas, points of interest (POIs) such as restaurants and shops are also associated with labels such as their business types and brands as well as their locations (e.g., in Google Maps); and in epidemiology, patients are usually recorded with not only demographic information like their jobs, ages and races, but also location information like their home addresses. We call an object an instance of a label if the object carries the label. One interesting pattern on these objects is the co-location pattern [18], [19], [26], [28]. A co-location pattern corresponds to a set of labels whose instances are frequently located in close geographic proximity (i.e., the instances are within a distance d from each other). As an example, Snack Bar shops and Beauty Salon shops are often found located near each other [26], forming a co-location pattern.
Similar to frequent itemsets in the context of transaction data [1], co-location patterns are defined based on a support measure, which quantifies how frequently the instances of the labels in a given label set are located close together. In the context of transaction data, the support of an itemset is defined as the number of transactions that contain all items in the itemset. Unfortunately, this definition cannot be straightforwardly adapted to our context since there exist no explicit transactions in spatial data.
We say that a set of objects is an instance of a label set if the objects carry all labels in the label set and are located within distance d from each other. The challenge of defining the support properly is mainly due to the fact that different instances of a label set usually overlap with each other, which leads to a dilemma: enumerating all instances would over-count the support, while using heuristics would miss some instances completely. Fig. 1 shows an example. Both sets $\{R_7, C_1\}$ and $\{R_8, C_1\}$ are instances of the label set {restaurant, church}. However, the two sets overlap at the object $C_1$. In the literature, several support measures for co-location patterns have been proposed, namely partitioning-based [28], construction-based [26], enumeration-based [18], [19], [28], and participation-based [18], [19], [28], [44], [45], [46]. The major idea shared by these approaches is to count, for a given label set, the number of its instances for measuring the support. However, as will be discussed in Section II, they all suffer from various weaknesses such as missing instances, over-counting instances, or not being anti-monotonic.
An instance is said to be a row instance if it does not have a proper subset which is also an instance of the same label set. For example, the set $\{R_7, C_1\}$ is a row instance of the label set {restaurant, church} in Fig. 1, while $\{R_7, R_8, C_1\}$ is not. In our prior work [8], we proposed a new support measure called Fraction-Score which puts all possible row instances into different groups and then counts the groups. Specifically, it selects a label and then puts all row instances sharing the same object with the selected label in the same group. Compared to the participation-based approach that also groups the row instances (to be detailed in Section II), Fraction-Score avoids the over-counting problem. The major idea is to count each group as a fractional unit of prevalence instead of an entire one, where the fraction value is calculated by amortizing the contribution of an object among all the row instances that the object is involved in.
Here, we briefly illustrate how the fraction values are calculated (the detailed definitions will be introduced in Section III). Consider Fig. 1 and the label set {restaurant, church}. Suppose that "restaurant" is the label used for grouping the row instances. In this case, there would be eight groups, formed by $R_1$-$R_8$, respectively. Consider the group formed by $R_1$. It involves only one row instance, namely $\{R_1, C_1\}$. The fraction associated with the group by $R_1$ would be set to 1/8. The intuition is that the group involves an object $C_1$, and there are 8 groups (or objects with the label "restaurant", namely $R_1$-$R_8$) that share $C_1$; thus, each of these groups (including the one by $R_1$) would be associated with a fraction 1/8 (of $C_1$). Similarly, the fraction associated with each group by $R_2$-$R_6$ would be set to 1/8. The fraction associated with the group by $R_7$ would be set to 1, which is explained as follows. First, the row instances in this group, namely $\{R_7, C_1\}$, $\{R_7, C_2\}$, and $\{R_7, C_3\}$, involve three churches, namely $C_1$, $C_2$, and $C_3$. Second, the fractions w.r.t. these objects are 1/8, 1/2, and 1/2, respectively (the fraction 1/8 of $C_1$ is explained as above, and the fraction 1/2 of $C_2$ ($C_3$) is explained by the fact that $C_2$ ($C_3$) is shared by two groups, namely those by $R_7$ and $R_8$). Third, the fractions are first aggregated (using a sum function), giving 1/8 + 1/2 + 1/2 = 9/8, and then bounded by 1 (using a min function) simply because each group cannot be counted as more than one unit. Similarly, the fraction associated with the group by $R_8$ is 1.
The sum of fractions, 1/8 · 6 + 1 + 1 = 2.75, corresponds to the support of {restaurant, church} by Fraction-Score. This is more meaningful than 8, the support defined by the participation-based approach (discussed in Section II), since there are roughly three units of prevalence of the label set (one in the left region, one in the top-right region, and one in the middle region which overlaps with the other two).
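The arithmetic of this example can be reproduced with a minimal Python sketch. The neighborhood table `near` is an assumption read off the description above (restaurants $R_1$-$R_6$ are close to $C_1$ only, while $R_7$ and $R_8$ are close to $C_1$, $C_2$ and $C_3$), and the function name `grouped_support` is hypothetical. The sketch covers only the unweighted, size-2 case with "restaurant" as the grouping label.

```python
from fractions import Fraction

# Assumed neighborhoods for the running example: the churches that lie
# within distance d of each restaurant (read off the description of Fig. 1).
near = {f"R{i}": ["C1"] for i in range(1, 7)}
near["R7"] = ["C1", "C2", "C3"]
near["R8"] = ["C1", "C2", "C3"]

def grouped_support(near):
    """Sum of bounded fractions, grouping row instances by the keys of `near`."""
    # Count how many groups share each object of the other label.
    sharers = {}
    for others in near.values():
        for o in others:
            sharers[o] = sharers.get(o, 0) + 1
    total = Fraction(0)
    for others in near.values():
        # Each shared object contributes an equal fraction to every sharer.
        received = sum(Fraction(1, sharers[o]) for o in others)
        # A group can never count as more than one unit.
        total += min(received, Fraction(1))
    return total

support = grouped_support(near)
participation_count = len(near)  # one full unit per group
print(support, float(support), participation_count)
```

Under these assumptions the sketch yields 11/4 (that is, 2.75) for the fractional count, against 8 when every group is counted as a full unit.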
The example above illustrates the case of an unweighted dataset. However, in some cases, each object has a weight attribute which quantifies its importance. For example, the NeuroSynth dataset [39] (details will be given in Section VI-A) contains a mapping between labels (e.g., "depression" and "anxiety") and the activated locations in the brain. Each object weight is a relevance score between the label and the location. As a generalization of the definition in [8], which defined Fraction-Score based on unweighted objects, the Fraction-Score proposed in this journal extension also works well on these weighted datasets, since it seamlessly captures object weights in its support definition. The unweighted case proposed in [8] is a special case with all weights equal to 1.
Moreover, as will be shown later, the support defined by Fraction-Score satisfies the desirable anti-monotonicity property. Based on Fraction-Score, we define co-location patterns using a pre-set parameter, the minimum support.
Since Fraction-Score satisfies the anti-monotonicity property, we adopt an Apriori-like algorithm for mining the co-location patterns. One key component of the algorithm is to compute the support of a given label set C, which is not as straightforward in our case as in the transaction data scenario. To compute C's support, we design an algorithm in which a basic operation is to decide whether there exists a row instance of C that involves a particular object. We show that the decision problem of this operation is NP-hard (w.r.t. |C|). In fact, this operation is also necessary when the supports defined by the participation-based approach [18], [19], [28], [44], [45], [46] are computed, where it is solved by materializing all row instances of C. Nevertheless, we observe that complete materialization is overkill, since the operation can be finished by finding just one row instance involving the object, if one exists. Besides, we notice that although the decision problem is NP-hard in general, it can be easily solved in certain cases. Motivated by these observations, we design a filtering-and-verification approach for the decision problem, which performs a few efficient pre-checking procedures (i.e., filtering) for cases where the decision problem can be answered easily, and performs a verification procedure for the remaining cases. Note that the algorithm improves over the one in [8] in both memory usage and efficiency by additionally including a memory-saving strategy and further filtering and pruning steps.
In addition, we found that the number of patterns returned is large in some cases, which might make it difficult for users to interpret the results. Thus, we study the maximal co-location pattern [43] mining problem based on Fraction-Score, where a pattern is maximal if it has no proper superset pattern. It is particularly useful when we want to obtain a smaller set of patterns that can concisely represent all the co-location patterns. We propose an efficient algorithm for mining all maximal co-location patterns. The major idea is to generate candidate maximal patterns from the size-2 patterns, and verify them in a top-down manner. Thus, it avoids the unnecessary computations in the above Apriori-like algorithm, which is designed for mining all patterns. Compared to the existing maximal pattern mining algorithm [38] that generates the candidate patterns from the size-2 instance table, we do not need to materialize the instances. In the verification, the filtering-and-verification approach is also adopted, with an additional filter for better efficiency.
The contributions of this paper are summarized as follows.
• We show the weaknesses of existing support measures and propose a new and better one called Fraction-Score, which avoids these weaknesses and satisfies the desirable anti-monotonicity property.
• For a fundamental operation involved in mining the co-location patterns, we provide hardness results and design an efficient algorithm.
• We propose an efficient algorithm for answering the maximal co-location pattern mining problem based on Fraction-Score.
• We conducted extensive experiments on both real and synthetic datasets, which showed the superiority of Fraction-Score as well as the efficiency of the proposed algorithms.
This journal extension adds substantial new technical contributions over [8] by (1) generalizing the definition of Fraction-Score to be applicable to weighted objects (Section III-C); (2) improving the algorithms to be more memory-saving and efficient (Section IV); (3) proposing an efficient algorithm to find the maximal patterns (Section V); (4) including an additional real dataset NeuroSynth [39] to evaluate our algorithms (Section VI); and (5) releasing the source code of our algorithms.
The rest of the paper is organized as follows. Section II reviews some related work. Section III gives the formal definition of Fraction-Score and defines our problems. Section IV adopts an Apriori-like algorithm for mining the co-location patterns and introduces an algorithm for computing the support defined by Fraction-Score. Section V discusses the maximal co-location pattern mining based on Fraction-Score. Section VI presents the experimental results. Section VII concludes the paper and provides some future directions.

II. RELATED WORK

A. Support Measures for Co-Location Pattern Mining
The co-location pattern mining problem has been studied extensively using different support measures. We illustrate the weaknesses of the different approaches as follows. The partitioning-based approach [28] uses a grid to partition the space into many cells, constructs for each cell a transaction involving all objects within the cell, and then defines supports based on the generated transactions as if they were conventional transaction data [1]. With this approach, only those instances within individual cells are considered, while those across cells are missed, since two objects that are within distance d but lie in different cells never appear in the same transaction.
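The boundary effect of grid partitioning can be seen in a small Python sketch. The point coordinates, the cell size and the helper names (`cell_of`, `close_pairs`, `pairs_seen_by_grid`) are all hypothetical and serve only to illustrate the weakness; this is not the procedure of [28].

```python
import math

def cell_of(p, cell_size):
    """Grid cell (column, row) that contains point p = (x, y)."""
    return (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))

def close_pairs(points, d):
    """All index pairs whose Euclidean distance is at most d."""
    return {(i, j)
            for i in range(len(points))
            for j in range(i + 1, len(points))
            if math.dist(points[i], points[j]) <= d}

def pairs_seen_by_grid(points, d, cell_size):
    """Close pairs whose two objects fall into the same grid cell."""
    return {(i, j) for (i, j) in close_pairs(points, d)
            if cell_of(points[i], cell_size) == cell_of(points[j], cell_size)}

# Points 0 and 1 are close but separated by the cell boundary at x = 1.0;
# points 2 and 3 are close and share a cell.
points = [(0.9, 0.5), (1.1, 0.5), (0.2, 0.5), (0.4, 0.5)]
all_pairs = close_pairs(points, d=0.3)
kept = pairs_seen_by_grid(points, d=0.3, cell_size=1.0)
missed = all_pairs - kept
print(sorted(kept), sorted(missed))
```

In this toy setting the pair (2, 3) is kept, whereas the pair (0, 1) is lost even though its two points are only 0.2 apart.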
The construction-based approach [26] constructs instances of a given label set heuristically and counts the number of constructed instances as the support. This approach is not robust, because some instances of a label set might be missed due to its heuristic nature.
The enumeration-based approach [18], [19], [28] counts for a given label set all its row instances. With this approach, no instances can be missed, but the support definition is not anti-monotonic, which is counter-intuitive. That is, the support of a label set can be larger than that of its subset, which breaks the anti-monotonicity property that is important both for making sense semantically and for enabling the design of efficient algorithms for frequent pattern mining. The insight into the problem is that this approach may reuse one object in many row instances, and since the object contributes wholly to every row instance that it is involved in, the support is over-measured. Due to this problem, the supports defined by this approach are not used on their own, but as components for defining the confidence of a rule candidate [18], [19], [28].
The participation-based approach [18], [19], [28], [44], [45], [46] considers all possible row instances, but instead of counting each individual row instance, it puts the row instances into different groups and then counts the groups. Specifically, it selects a label and then puts all row instances sharing the same object with the selected label in the same group. The rationale is that all row instances within a group are counted as one unit of prevalence since they are all based on the same object with a particular label. Nevertheless, in cases where some row instances across different groups share an object, this approach would count them as multiple units of prevalence (one for each group), i.e., the object's contribution is over-counted. To illustrate, consider Fig. 1 and the label set {restaurant, bank, church}. Suppose that "restaurant" is the label used for grouping the row instances. There would be eight groups, each based on one of the restaurants $R_1$-$R_8$. Within each group, all row instances contain the same restaurant. Thus, the support defined by the participation-based approach would be equal to 8. Nevertheless, among these eight groups, many share objects with the labels "bank" and/or "church" (e.g., $\{R_3, B_1, C_1\}$ and $\{R_6, B_1, C_1\}$ are two row instances from two different groups since they contain different restaurants, but they share their bank and church, i.e., $B_1$ and $C_1$). In this case, the prevalence is over-measured.
Note that the partitioning-based, construction-based and participation-based approaches can be adapted to handle weighted objects. The details can be found in Appendix A, available online.
In [49], Zhang et al. proposed to improve the efficiency of co-location pattern mining by adopting a multi-way join approach. In [20], Huang et al. developed an FP-tree based algorithm for the co-location pattern mining problem. Motivated by the fact that it is expensive to generate row instances of a size-(k + 1) label set via joining the row instances of two size-k label sets, the authors of [44], [45], [46] proposed some partial join and joinless techniques which materialize some transactions of spatial objects such that those row instances within transactions can be generated without the join process [28]; for those row instances across different transactions, they still use the join operation. In [4], Boinski and Zakrzewicz developed a new method to efficiently process co-location pattern queries using a materialized, improved candidate pattern instance tree (iCPI-tree).

B. Condensed Co-Location Pattern Mining
In [43], Yoo and Bow studied the closed top-k co-location pattern mining problem. The authors also studied the maximal co-location pattern mining problem [42]. In [38], Yao et al. proposed to construct a graph based on size-2 co-location patterns, and then find maximal cliques as the maximal co-location pattern candidates for better efficiency. In [23], Liu et al. studied the problem of summarizing co-location patterns. In [31], Wang et al. proposed a redundancy reduction method for co-location patterns. All these studies aimed at finding a representative set of patterns that is of a smaller size. However, their definitions and methods are designed based on the participation-based measures, and thus cannot be applied to Fraction-Score.

C. Variants of Co-Location Pattern Mining
Some works defined the spatial co-location pattern based on regions and polygons. In [35], Xiong et al. presented a framework for mining co-location patterns for extended spatial objects, e.g., polygons and line strings. In [10], Ding et al. studied the problem of mining regional (or local) co-location patterns. In [11], Eick et al. studied the problem of finding regions, each represented as a set of spatial objects, by using a clustering-like algorithm where the interestingness score of a region is based on how much the objects representing the region have their continuous values correlated with each other. In [12], [13], the authors studied the problem of finding co-location patterns where a set C of spatial labels corresponds to a pattern if the clusterings, each based on the objects with a spatial label in C, have at least a certain degree of overlap, which is captured by the area intersected by the polygons formed based on the clusters. In [6], Celik et al. proposed to find zonal or local co-location patterns which represent subsets of label types that are frequently located in a subset of space (i.e., a zone). In [33], Wang et al. studied the problem of finding regions, each represented by a set of cells linked with each other, where two labels co-occur more frequently than they do globally. In [25], Long et al. proposed to find the co-location patterns from regional objects, and defined the proximity relationship between instances by their overlapping area.
Some other studies related to the co-location pattern mining problem are reviewed as follows. In [22], Koperski and Han aimed to find strong association rules, where a rule indicates a certain association relationship among a set of spatial and possibly non-spatial predicates. In [3], Barua and Sander studied the problem of finding statistically significant co-location patterns based on hypothesis testing, where some models are assumed, which limits its application scope. In [21], Huang and Zhang proposed to cluster the set of spatial labels, where the similarity between two labels is measured with some spatial statistical functions [9]. In [37], Yang et al. studied the co-location pattern mining problem with the consideration of distance decay effects and also direction information.

III. FRACTION-SCORE AND PROBLEM DEFINITION
Section III-A introduces some notations. Section III-B gives an overview of Fraction-Score, and Section III-C presents its formal definition. Section III-D defines our problems.

A. Notations
Let O be a set of n objects. Each object o ∈ O has a location o.λ, a weight o.w in the range [0,1] that represents the importance of the object, and a set of (categorical) labels (e.g., a shop brand name such as Starbucks). For ease of presentation, we assume that each object o has only one single label, denoted by o.t, but the concepts and algorithms introduced in this paper can easily be applied to the general case by duplicating each object with multiple labels into several copies, each with one label. For example, object $A_1$ in Fig. 2 has the label • and a weight 0.8.
Let T be the set of all possible labels of the objects, i.e., $T = \{o.t \mid o \in O\}$. Let $O_t$ be the set of objects with label t, i.e., $O_t = \{o \mid o.t = t\}$. Given a label t, we use $W_t$ to denote the sum of weights of the objects in $O_t$, i.e., $W_t = \sum_{o \in O_t} o.w$, and $W_{max}$ to denote the largest $W_t$ among all t ∈ T.
Given two objects o and o′, we denote the distance between them by d(o, o′). Depending on the application, different metrics such as Euclidean distance and Haversine distance could be used for defining the distance. For ease of illustration, we use Euclidean distance in this paper. Given a set S of objects, S is said to be a neighbor set if every two objects in S are within distance d of each other. We denote by Disk(o, r) the disk with its center at o.λ and its radius equal to r. Given a label set C, a set S of objects is said to be an instance of C if S is a neighbor set and covers all labels in C (i.e., C ⊆ {o.t | o ∈ S}). An instance of C is said to be a row instance of C if none of its proper subsets is an instance of C. The main notations that are used throughout the paper are summarized in Table II.
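The notions of instance and row instance can be made concrete with a short Python sketch. The `Obj` type, the function names and the coordinates are hypothetical; the sketch assumes Euclidean distance and one label per object, and takes a neighbor set to be a set of objects that are pairwise within distance d. Because every subset of a neighbor set is again a neighbor set, it suffices to test the removal of single objects when checking for a row instance.

```python
import math
from itertools import combinations
from typing import NamedTuple

class Obj(NamedTuple):
    name: str
    label: str
    loc: tuple  # (x, y) coordinates, hypothetical

def is_neighbor_set(S, d):
    """True if every two objects in S are within distance d."""
    return all(math.dist(a.loc, b.loc) <= d for a, b in combinations(S, 2))

def is_instance(S, C, d):
    """True if S is a neighbor set that covers every label in C."""
    return is_neighbor_set(S, d) and set(C) <= {o.label for o in S}

def is_row_instance(S, C, d):
    """True if S is an instance of C and no proper subset of S is."""
    S = list(S)
    if not is_instance(S, C, d):
        return False
    # If any proper subset were an instance, so would be some subset
    # obtained by dropping exactly one object.
    return not any(is_instance(S[:i] + S[i + 1:], C, d) for i in range(len(S)))

R7 = Obj("R7", "restaurant", (0.0, 0.0))
R8 = Obj("R8", "restaurant", (0.0, 1.0))
C1 = Obj("C1", "church", (1.0, 0.0))
C = {"restaurant", "church"}
d = 1.5

print(is_row_instance([R7, C1], C, d))      # a row instance
print(is_instance([R7, R8, C1], C, d))      # an instance ...
print(is_row_instance([R7, R8, C1], C, d))  # ... but not a row instance
```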

B. Overview of Fraction-Score
As in the participation-based approach, Fraction-Score groups the row instances of C by the objects with a given label t in C, i.e., all row instances involving the same object with label t are put in the same group. Note that this is always possible since each row instance involves exactly one object with the label t; otherwise, a proper subset of it would also be an instance of C, a contradiction. To solve the over-counting problem when instances across different groups share an object, say o′, with a label t′ other than t, Fraction-Score assigns a fraction of o′ to each group among all groups whose row instances share o′. This fraction is equal to o′.w divided by the total number of such groups. That is, Fraction-Score splits the object weight o′.w into equal fractions and distributes these fractions to all groups of row instances that share o′. Note that for each label other than t in C, the object o (and essentially the corresponding group of row instances) may receive multiple fractions, since there are multiple objects other than o in the group that might be shared by other groups. We use an appropriate aggregation function on these fractions, which gives an aggregated one for the object o (or equivalently the corresponding group), and then sum the (aggregated) fractions of all groups to be the support. We note that for each label t in C, we would have a grouping of the row instances of C and correspondingly a support. To capture the worst-case prevalence, we choose the minimum among all these supports as the final support, which is then normalized into [0,1] by being divided by a constant.

C. Formal Definition of Fraction-Score
We start by defining some concepts related to fractions. Let t be the label used for grouping the row instances of C. We denote by Obj(t, C) the set of objects o which have the label t and are involved in some row instances of C. Conceptually, each object o in Obj(t, C) corresponds to a group of row instances of C (by label t). To illustrate, consider Fig. 2. Suppose C is {×, •} and × is used for grouping the row instances of C (we will use this setting as our running example in this section unless otherwise specified). For an object o′ with a label other than t, let Θ(o′, t, d) denote the set of objects with label t that are within distance d of o′; the weight o′.w is split equally among the objects in Θ(o′, t, d). In Fig. 2, a fraction 0.8 from $A_1$'s weight is distributed to $B_1$ and a fraction 0.1 from $A_9$'s weight is distributed to each of $B_2$-$B_5$. The intuition here is that $A_1$'s weight is shared by 1 group (with a fraction of 0.8) and $A_9$'s by 4 groups (each with an equal fraction 0.1, i.e., 1/4 of 0.4). Now, we take the perspective of how object o receives fractions of objects located nearby. Specifically, it receives a fraction of each object o′ with o ∈ Θ(o′, t, d), and the amount of fraction of o′ that o receives is o′.w / |Θ(o′, t, d)| (see the example in Fig. 2). Note that this is a generalization of the definition for the unweighted case in [8].
Object o may receive fractions from multiple objects, which need to be aggregated. This is achieved in two steps. First, we aggregate the fractions from those objects with the same label using a sum function, since the fraction of one object could contribute to forming a row instance and that of another object could contribute to forming another row instance within the same group (i.e., these fractions are complementary to one another for forming row instances). Second, we bound the aggregated fraction for a label by one unit since each group cannot be counted as more than one unit (recall that the row instances within each group share one single object with the label used for grouping the row instances). In summary, the aggregated fraction of objects sharing a label t′ ∈ C − {t} that o receives (these objects form the set Θ(o, t′, d)), denoted by $\Delta_{label}(o, t')$, is defined as
$$\Delta_{label}(o, t') = \min\Big\{ \sum_{o' \in \Theta(o, t', d)} \frac{o'.w}{|\Theta(o', t, d)|},\ 1 \Big\}$$
(see the example in Fig. 3).
Now, we are ready to introduce the formal definition of Fraction-Score. Instead of materializing all row instances of C and then grouping the row instances by the objects with the label t explicitly as existing studies did [18], [19], [28], we only maintain the grouping conceptually. Recall that Obj(t, C) denotes the set of objects o which have the label t and are involved in some row instances of C. For each object o in Obj(t, C), we aggregate the fractions it receives w.r.t. all labels t′ in C − {t} using a min function, since it corresponds to the worst-case scenario that one object is shared by multiple groups. We denote the aggregated fraction o receives w.r.t. C by $\Delta_{labelSet}(o, C)$, i.e.,
$$\Delta_{labelSet}(o, C) = \min_{t' \in C - \{t\}} \Delta_{label}(o, t').$$
The above definition is for cases where |C| ≥ 2, and in the case when |C| = 1, we simply define $\Delta_{labelSet}(o, C) = o.w$ (see the example in Fig. 2).
We then define the support given the label t for grouping row instances, denoted by sup(C|t), as the sum of the aggregated fractions that the objects in Obj(t, C) receive w.r.t. C, i.e.,
$$sup(C|t) = \sum_{o \in Obj(t, C)} \Delta_{labelSet}(o, C) \qquad (4)$$
(see the example in Fig. 3). Note that depending on the choice of label t, we may have different sup(C|t). To capture the worst-case prevalence, we choose the label for which the value is the smallest. Besides, we normalize the value to [0,1] by dividing it by the maximum total weight $W_{max}$ among the labels in T. In summary, the support of a given label set C, denoted by sup(C), is defined as
$$sup(C) = \frac{\min_{t \in C} sup(C|t)}{W_{max}}.$$
It is worth mentioning that all row instances are captured and counted appropriately by Fraction-Score. All instances that are involved in any row instance (and thus possibly contributing to the support of the label set) are considered, and thus no instance is missed. Moreover, Fraction-Score satisfies the anti-monotonicity property.
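As a hedged illustration of how the pieces of this subsection fit together, the following Python sketch evaluates the support of a size-2 label set as read from the prose: fractions are summed per label, bounded by one unit, summed over the grouping objects, minimized over the grouping label, and divided by the largest total label weight. Several things are assumptions: the nearby objects of o with label t are taken to be those with label t within Euclidean distance d of o; the coordinates and the unit weights of the × objects are invented, and only the weights 0.8 and 0.4 follow the running example; all function names are hypothetical. Restricting to |C| = 2 sidesteps the test of whether an object is involved in a row instance, since an object with no close partner simply receives a fraction of 0.

```python
import math
from typing import NamedTuple

class Obj(NamedTuple):
    name: str
    label: str
    loc: tuple  # hypothetical coordinates
    w: float    # weight in [0, 1]

def theta(o, t, objects, d):
    """Objects with label t within distance d of o (assumed reading)."""
    return [p for p in objects if p.label == t and math.dist(o.loc, p.loc) <= d]

def delta_label(o, t_other, objects, d):
    """Sum of the fractions o receives from label t_other, bounded by 1."""
    received = sum(p.w / len(theta(p, o.label, objects, d))
                   for p in theta(o, t_other, objects, d))
    return min(received, 1.0)

def sup_given(C, t, objects, d):
    """sup(C|t) for a size-2 label set C, grouping by label t."""
    (t_other,) = set(C) - {t}
    return sum(delta_label(o, t_other, objects, d)
               for o in objects if o.label == t)

def sup(C, objects, d):
    """Minimum of sup(C|t) over t in C, normalized by the largest W_t."""
    labels = {o.label for o in objects}
    w_max = max(sum(o.w for o in objects if o.label == t) for t in labels)
    return min(sup_given(C, t, objects, d) for t in C) / w_max

objects = [
    Obj("A1", "dot", (0.0, 0.0), 0.8), Obj("B1", "cross", (0.5, 0.0), 1.0),
    Obj("A9", "dot", (10.0, 0.0), 0.4),
    Obj("B2", "cross", (10.5, 0.0), 1.0), Obj("B3", "cross", (9.5, 0.0), 1.0),
    Obj("B4", "cross", (10.0, 0.5), 1.0), Obj("B5", "cross", (10.0, -0.5), 1.0),
]
C = {"cross", "dot"}
print(sup_given(C, "cross", objects, d=1.0))  # 0.8 + 4 * 0.1
print(sup_given(C, "dot", objects, d=1.0))    # each dot object is capped at 1
print(sup(C, objects, d=1.0))
```

With these invented inputs, grouping by × gives about 1.2, grouping by • gives 2, and the normalized support is about 0.24, since the × objects carry the largest total weight, 5.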
Lemma 1 (Anti-monotonicity property): Given two label sets C′ and C, where C′ is a subset of C, we have sup(C′) ≥ sup(C).
Proof: The correctness relies on the fact that sup(C′|t) ≥ sup(C|t) for any t in C′, which could be verified by checking the following facts against (4):

D. Problem Definition
We formally define the co-location pattern mining problem.
Problem (Co-location Pattern Mining): Given a set O of objects, each with a location, a weight and a label, a distance threshold d for defining neighbor sets, and a user parameter min-sup, the co-location pattern mining problem is to find all co-location patterns, where a label set C is a co-location pattern if sup(C) ≥ min-sup.
A closely related problem, the co-location rule mining problem [8], can be answered easily once the co-location patterns have been found. Due to the page limit, we refer the reader to our previous work [8] for the details.
Besides, we define the maximal pattern mining problem as follows. Formally, a pattern C is a maximal pattern if there is no superset C′ ⊃ C that is a pattern. For example, if both label sets C = {×} and C′ = {×, •} are co-location patterns, C is not a maximal pattern since C′ ⊃ C. It is noteworthy that closed co-location pattern mining, where a pattern C is closed if there is no superset C′ ⊃ C that is closed and sup(C) = sup(C′), is not suitable in our setting, since our Fraction-Score definition usually leads to different support values for a pattern and its subsets.
Problem (Maximal Co-location Pattern Mining): Given a set O of objects, each with a location, a weight and a label, a distance threshold d for defining neighbor sets, and a user parameter min-sup, the maximal co-location pattern mining problem is to find all maximal co-location patterns, where a label set C is a maximal co-location pattern if sup(C) ≥ min-sup and there is no superset C′ ⊃ C that is a pattern.
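The definition of maximality can be illustrated by a naive Python sketch that filters an already-mined collection of patterns. The function name and the label strings are hypothetical, and this quadratic post-filter only illustrates the definition; it is not the top-down algorithm of Section V.

```python
def maximal_patterns(patterns):
    """Keep the patterns that have no proper superset among the patterns."""
    pats = [frozenset(p) for p in patterns]
    # `p < q` is the proper-subset test on frozensets.
    return [p for p in pats if not any(p < q for q in pats)]

patterns = [{"cross"}, {"dot"}, {"cross", "dot"}, {"triangle"}]
print(maximal_patterns(patterns))
```

Here {cross} and {dot} are dropped because {cross, dot} is a pattern, while {triangle} is kept since no pattern strictly contains it.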

IV. CO-LOCATION PATTERN MINING ALGORITHMS
Section IV-A presents an algorithm for mining the co-location patterns based on Fraction-Score. Section IV-B details the support computation algorithms. Section IV-C discusses the problem of deciding whether an object is involved in any row instance of a given label set, and Section IV-D presents a filtering-and-verification approach for it.

A. An Apriori-Like Algorithm
Since the fraction-based prevalence measure satisfies the anti-monotonicity property (Lemma 1), we design an Apriori-like algorithm for computing all co-location patterns from O. The major idea is to iteratively construct co-location pattern candidates and then verify them in ascending order of their sizes. Specifically, we use $C_k$ (k ≥ 1) to denote the set of co-location pattern candidates of size k and $L_k$ (k ≥ 1) the set of confirmed co-location patterns of size k. The algorithm proceeds iteratively. At the first iteration, it computes $C_1$ as {{t} | t ∈ T} and $L_1$ as {{t} | sup({t}) ≥ min-sup, t ∈ T}. At the kth iteration (k ≥ 2), it generates $C_k$ from $L_{k-1}$ and computes $L_k$ as the set of candidates C in $C_k$ with sup(C) ≥ min-sup. Here, $C_k$ is generated by combining any two patterns in $L_{k-1}$ only, and the rationale is that, by the anti-monotonicity property, it cannot happen that a label set is in $L_k$ while one of its subsets is not in $L_{k-1}$.
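The level-wise procedure can be sketched in Python as follows. The support function is passed in as a parameter and is replaced here by a hypothetical lookup table of made-up, anti-monotonic values. The function name `apriori`, the stopping rule (stop when a level is empty) and the extra subset-pruning step are standard Apriori conventions assumed for the sketch rather than details taken from the text.

```python
from itertools import combinations

def apriori(labels, support, min_sup):
    """Return all label sets C with support(C) >= min_sup, level by level."""
    level = [frozenset([t]) for t in sorted(labels)
             if support(frozenset([t])) >= min_sup]
    patterns = list(level)
    k = 2
    while level:
        prev = set(level)
        # Candidates of size k: unions of two confirmed size-(k-1) patterns.
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == k}
        # Standard Apriori pruning (an assumption of this sketch): every
        # size-(k-1) subset of a candidate must already be a pattern.
        candidates = {c for c in candidates if all(c - {t} in prev for t in c)}
        level = [c for c in candidates if support(c) >= min_sup]
        patterns.extend(level)
        k += 1
    return patterns

# Hypothetical, anti-monotonic support values used only for illustration.
table = {
    frozenset("a"): 0.9, frozenset("b"): 0.8, frozenset("c"): 0.3,
    frozenset("ab"): 0.6, frozenset("ac"): 0.2, frozenset("bc"): 0.1,
    frozenset("abc"): 0.05,
}
found = apriori("abc", lambda C: table.get(C, 0.0), min_sup=0.5)
print(found)
```

With min-sup = 0.5, the toy table yields the patterns {a}, {b} and {a, b}; the label c is discarded at the first level, so no candidate containing it is ever evaluated.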
As could be noticed, a key procedure involved in the above Apriori-like algorithm is to compute for a given label set C its support, i.e., sup(C). Different from the case of transaction databases [1], where the procedure can be finished by scanning the transactions once and counting how many transactions involve the label set, this procedure is non-trivial in our scenario. Besides, none of the algorithms proposed for this procedure in existing studies on mining co-location patterns [18], [19], [26], [28] can be used for the procedure based on Fraction-Score. First, the procedure based on the partitioning-based approach is the same as that on transaction databases and thus not applicable. Second, the one based on the construction-based approach [26] is far from being applicable here since it is based on heuristics only and involves no concept of fractions. Third, those based on the enumeration-based and participation-based approaches [18], [19], [28] all materialize and count all row instances of a given label set, while the support by Fraction-Score does not rely on counting row instances of a given label set.
We note here that our main technical focus in this paper is on computing the supports defined by Fraction-Score, which is orthogonal to existing studies aiming for faster and more scalable frequent pattern mining techniques [30], [32]. In fact, these techniques could be easily adapted to our problem since the supports defined by Fraction-Score satisfy the anti-monotonicity property.

B. An Algorithm for Computing the Support
Our algorithm consists of two procedures, namely FractionComputation, which collects the information of $\Delta_{label}(o, t)$ for all objects $o$ and all labels $t$, and SupportComputation, which computes the support of a given label set $C$ based on this information.

… memory for storing the information $\Delta_{label}(o, t)$. For better storage efficiency, we have the following two strategies. First, we do not need to store the fractions of those objects $o \in O_t$ whose label $t$ has a total weight $W_t \le \text{min-sup}/W_{max}$, since these labels $t$ cannot be involved in any co-location pattern. This is a new strategy that cannot be found in [8]. Second, we adopt a maintenance-on-demand strategy, i.e., only those $\Delta_{label}(o, t)$'s with $t \in \bigcup_{o' \in Disk(o, d)} \{o'.t\}$ are computed, given the fact that the objects within the neighborhood of an object usually involve not that many labels. Based on these strategies, the memory usage for storing the fractions would be much smaller than $O(|O| \cdot |T|)$.
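As a rough illustration of FractionComputation, the sketch below lets each object split its weight equally among its neighbours of each other label and accumulates what each neighbour receives. It is a sketch under assumptions, not the authors' Algorithm 1: the name `fraction_computation`, the `(location, label, weight)` tuple layout and the brute-force neighbour scan are all hypothetical, and any capping step that the original algorithm may apply to the aggregated fractions is not modelled because it cannot be recovered from the text here. Storing entries in a dictionary only when they are produced mirrors the maintenance-on-demand strategy.

```python
from collections import defaultdict
from math import dist

def fraction_computation(objects, d):
    """objects: list of (location, label, weight); returns delta[(i, t)].

    delta[(i, t)] is the aggregated fraction that object i receives from
    objects carrying label t (uncapped in this sketch).
    """
    delta = defaultdict(float)
    for j, (loc_j, label_j, w_j) in enumerate(objects):
        # Group the neighbours of object j (within distance d) by label.
        neigh = defaultdict(list)
        for i, (loc_i, label_i, _) in enumerate(objects):
            if i != j and dist(loc_i, loc_j) <= d:
                neigh[label_i].append(i)
        # Object j splits its weight equally among its neighbours of each
        # label other than its own; every such neighbour receives one share.
        for t, members in neigh.items():
            if t == label_j:
                continue
            share = w_j / len(members)
            for i in members:
                delta[(i, label_j)] += share
    return delta
```

A spatial index would replace the quadratic neighbour scan in any realistic setting; the scan is kept here only to make the sketch self-contained.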
SupportComputation: Algorithm 2 presents the SupportComputation procedure. First, it initializes sup(C) to be infinity (line 1). Then, it tries to use different labels in $C$ for grouping the row instances of $C$ conceptually (line 2). For a specific label $t$, it first initializes sup(C|t) as 0 (line 3), and then, for each object $o \in O_t$ which is involved in some row instance of $C$, it adds the fraction that $o$ receives w.r.t. $C$, which is computed by the "FractionAggregation" procedure (whose details are presented in Algorithm 3), to sup(C|t). To speed up the additions, if sup(C|t) > sup(C), it terminates the search on $t$ and proceeds with the next label, as sup(C|t) cannot contribute to a smaller sup(C).
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.

Algorithm 2: SupportComputation(C, O).
Require: a label set $C$ and an object set $O$

In practice, we can speed up the procedure if we only need to compute the supports of label sets whose support is at least min-sup, as follows. Specifically, we keep track of an upper bound of sup(C|t), denoted by $sup(C|t)_{UB}$, by assuming that there is a row instance of $C$ which involves the remaining objects $o \in O_t$. If $sup(C|t)_{UB} < \text{min-sup}$, we know that $C$ cannot be a pattern.
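The SupportComputation and FractionAggregation procedures described in this subsection can be sketched as follows. All names (`support_computation`, `fraction_aggregation`, `objects_by_label`, `label_of`, `delta_label`, `is_involved`) are hypothetical; `is_involved` stands in for the OIRI test, and the sketch assumes $|C| \ge 2$. Normalization of the support and the upper-bound pruning are not modelled.

```python
import math

def fraction_aggregation(o, C, label_of, delta_label):
    """Fraction object o receives w.r.t. C: the minimum over the other labels."""
    best = math.inf
    for t in C - {label_of[o]}:
        best = min(best, delta_label.get((o, t), 0.0))
    return best

def support_computation(C, objects_by_label, label_of, delta_label, is_involved):
    """sup(C) = min over t in C of sup(C|t), with early termination per label."""
    sup_C = math.inf
    for t in C:
        sup_Ct = 0.0
        for o in objects_by_label.get(t, []):
            if not is_involved(o, C):   # the OIRI test, supplied by the caller
                continue
            sup_Ct += fraction_aggregation(o, C, label_of, delta_label)
            if sup_Ct > sup_C:          # this label cannot lower sup(C)
                break
        sup_C = min(sup_C, sup_Ct)
    return sup_C
```

Passing the OIRI test as a callable keeps the sketch independent of which filtering or verification method is used to answer it.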
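The "FractionAggregation" procedure, which, for an object $o$ in $O$, computes the fraction it receives w.r.t. a label set $C$, i.e., $\Delta_{labelSet}(o, C)$, is presented in Algorithm 3. First, it initializes the fraction $o$ receives w.r.t. $C$ as $\infty$ (line 1). Second, for each label $t$ in $C - \{o.t\}$ (line 2), it updates $\Delta_{labelSet}(o, C)$ to $\Delta_{label}(o, t)$ if the latter is smaller (lines 3-4). Finally, it returns $\Delta_{labelSet}(o, C)$ (line 5).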

C. OIRI: Is Object o Involved in a Row Instance of C
There is one issue in Algorithm 2 that remains unsolved, namely, the step to decide whether an object $o$ is involved in any row instance of a given label set $C$ (line 5 in Algorithm 2). We denote this problem by OIRI. Unfortunately, the OIRI problem is NP-hard, as stated in the following theorem.
Theorem 1: The OIRI problem, which is to decide, for a given label set $C$ and an object $o$, whether there exists a row instance of $C$ involving $o$, is NP-hard.
Proof: The proof can be found in Appendix B, available online.

D. A Filtering-and-Verification Approach for OIRI
A naive method for OIRI is to enumerate all row instances of $C$ and check whether there exists one involving object $o$. However, as is known from existing studies [44], [45], [46], the procedure of materializing all row instances of a given label set is very expensive. In this paper, we develop a filtering-and-verification approach for OIRI, which involves two phases, namely a filtering phase and a verification phase. The filtering phase solves OIRI for easy cases and the verification phase handles all remaining cases. The details are introduced as follows.
1) Filtering Phase: The filtering phase is motivated by the fact that, in certain cases, the remaining issue OIRI could be easy to solve with some information re-used:
• Filter 1: (For $|C| = 2$ only.) Let $t'$ denote the label in $C \setminus \{o.t\}$. We check if $\Delta_{label}(o, t') > 0$. If so, we return "yes". Otherwise, we return "no" (since $\Delta_{label}(o, t') > 0$ if and only if $o$ is involved in a row instance of $C$). Note that compared to [8], this filter is newly included.
• Filter 2: We check if there exists a row instance $S$ of $C$, which was found previously when answering another OIRI instance for a different object $o'$ and label set $C$, such that $o$ is involved in $S$. If so, we return "yes". To support this checking, we could keep track of all those objects that are involved in row instances that have been found.
2) Verification Phase: We propose three methods for the verification phase as follows.
Dia-CoSKQ-Adapt: This method is based on the close relationship between OIRI and Dia-CoSKQ. In the proof of the NP-hardness of OIRI, we show that any decision problem instance of Dia-CoSKQ could be transformed to an OIRI problem instance. Here, we further show that an arbitrary instance of OIRI could be answered by solving a corresponding optimization problem instance of Dia-CoSKQ. Specifically, given an instance of OIRI which involves a set $O$ of spatial objects, a set $C$ of labels, a real number $d$, and one object $o$ in $O$, we consider a Dia-CoSKQ problem which is to find, from a given set $D$ of POIs, a set $S$ of POIs that covers all query keywords of a given query $q$ and makes the diameter of $S \cup \{q\}$ the smallest, where the set $D$ of POIs includes one POI for each object $o'$ in Disk(o, d) with its location as $o'.\lambda$ and its set of keywords as $\{o'.t\}$, and the query $q$ has its location at $o.\lambda$ and its set of query keywords as $C - \{o.t\}$. It could be verified that if the diameter of $S \cup \{q\}$ is at most $d$, the answer of the OIRI instance is "yes"; otherwise, the answer is "no". Based upon this, we can utilize the exact algorithm proposed in [24] for OIRI. Note that we could do slightly better by adopting an early-stopping strategy: whenever a set $S$ with the diameter of $S \cup \{q\}$ at most $d$ is found, it returns "yes" immediately.
Combinatorial-Search: We notice that enumerating all row instances of $C$ is more than necessary for answering the question of OIRI. In fact, it would be sufficient to find one row instance of $C$ which involves $o$, if it exists, to answer the question. Besides, there are two constraints that could be utilized for refining the search space. First, it is safe to focus the search on those objects which are near $o$, specifically, those in Disk(o, d), since those objects outside this disk have their distances from $o$ larger than $d$ and cannot be involved in the same row instance together with $o$. Second, it is enough to consider those object sets that only contain objects corresponding to different labels in $C$, since other object sets either do not carry the labels in $C$ or have proper subsets which carry all the labels in $C$. Based upon the above two constraints, we design an algorithm for searching for a possible row instance of $C$ involving $o$, if there exists one, as follows.
• Step 1: it finds all objects in Disk(o, d) by performing a range query with its center at $o$ and its radius of $d$.
• Step 2: it prunes the objects that already returned "no" as the answer in the previous iterations for the same label set $C$. Note that this step is new as compared to [8].
• Step 3: it indexes the remaining objects using an inverted index, which stores the objects in different lists, each of which corresponds to a label and contains all objects with this label.
• Step 4: it tries all combinations of objects from those lists corresponding to the labels in $C - \{o.t\}$, and for each combination $S$ which contains $|C - \{o.t\}|$ objects, it checks whether the maximum pairwise distance of $S$ is at most $d$. If such a combination is found, it stops by returning "yes"; otherwise, it returns "no".

Optimization-Search: In Combinatorial-Search, there is a step which is to enumerate all combinations of some objects in Disk(o, d) indexed by their labels in $C' = C - \{o.t\}$ and see whether there exists a combination with the diameter at most the value $d$. An alternative for this step is to compute the set of objects in Disk(o, d) which covers all labels in $C'$ and has the smallest diameter and then compare this diameter against $d$ to answer the question, i.e., if this diameter is at most $d$, it returns "yes", and otherwise, it answers "no". In the literature, the problem of finding a set of objects which covers a given set of labels/keywords and has the smallest diameter has been studied [15], [47], [48] and is called the m-closest keywords (mCK) problem. Based upon this, we can utilize the exact algorithm proposed in [15] for mCK to do this step, and the resulting method corresponds to Optimization-Search. Similar to the Dia-CoSKQ-Adapt method, an early-stopping strategy could be adopted here.
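Steps 1, 3 and 4 of Combinatorial-Search can be sketched as below. This is an illustrative brute-force version under assumptions: `combinatorial_search` and the `(location, label)` tuple layout are hypothetical, the range query is a linear scan rather than an index lookup, and the pruning of Step 2 is omitted.

```python
from itertools import combinations, product
from math import dist

def combinatorial_search(o, C, objects, d):
    """Decide whether object index o is in some row instance of label set C."""
    loc_o, label_o = objects[o]
    # Steps 1 and 3: range query around o, then an inverted index
    # mapping each remaining label to the locations found in Disk(o, d).
    index = {t: [] for t in C - {label_o}}
    for i, (loc, label) in enumerate(objects):
        if i != o and label in index and dist(loc, loc_o) <= d:
            index[label].append(loc)
    # Step 4: pick one object per remaining label and test pairwise distances.
    # Distances to o itself already hold because of the range query.
    for combo in product(*index.values()):
        if all(dist(a, b) <= d for a, b in combinations(combo, 2)):
            return True   # stop at the first row instance found
    return False
```

Returning at the first qualifying combination reflects the observation that one row instance involving $o$ is enough to answer the decision problem.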
3) Time Complexity Analysis: Since the verification phase dominates the time cost of the approach, we focus on the verification phase only. The complexity of Dia-CoSKQ-Adapt is [24], where $n_1$ ($n_1 \ll |O|$) is the number of objects that carry a label $t \in C - \{o.t\}$ and $k_3$ ($k_3 \ll |O|$) is the number of objects shared by the results of range queries. The complexity of Combinatorial-Search is , where $C_{range}$ is the cost of performing the range query in Step 1, $k_1$ ($k_1 \ll |O|$) is the number of objects returned by the range query in Step 1, and $k_2$ ($k_2 \ll |O|$) is the maximum number of objects in an inverted list constructed in Step 3. While the worst-case time complexity is exponential, the algorithm is feasible in practice with the help of index structures such as inverted lists and also because of the nature of the problem (e.g., the exponent $|C|$ is small in most cases), and this will be verified by the experiments. The complexity of Optimization-Search is ) [15].

V. MAXIMAL CO-LOCATION PATTERN MINING
Section V-A presents an algorithm for mining the maximal co-location patterns based on Fraction-Score. Section V-B details the support computation algorithm. Section V-C analyzes the time complexity.

A. An Algorithm for Finding the Maximal Patterns
A straightforward solution to find all maximal patterns is to first find all co-location patterns using the algorithms proposed in Section IV, and then check the maximality of each pattern one by one. This method, however, incurs unnecessary computations, as most of the patterns are not maximal and will not be in the result.
To this end, we propose an algorithm that generates the candidate maximal patterns and checks the maximality of each candidate, so that it avoids those unnecessary computations as much as possible. It consists of the following steps.

Inspired by [38], we construct a graph $G$ from $L_2$ and find the maximal cliques to generate the candidate maximal patterns. In particular, each label $t$ corresponds to a vertex $v$ in $G$. If two labels form a size-2 pattern (i.e., the pair can be found in $L_2$), the corresponding pair of vertices is connected by an edge in $G$. We then find the set $CMP$ of all maximal cliques in $G$ by utilizing the Bron-Kerbosch algorithm [5]. Different from [38], which generates the candidate patterns from the size-2 instance table, we do not need to materialize the instances.

… if $C' \notin CMP$ then $CMP \leftarrow CMP \cup \{C'\}$
16: $j \leftarrow j - 1$
return $MP$

… the set of candidate maximal patterns $CMP$ (lines 4-5). Third, it iteratively checks each label set $C$ in $CMP$ in descending order of their sizes $j$, where $2 \le j \le \max_{C \in CMP} |C|$ (lines 7-16). Consider an iteration that processes size $j$ and label set $C$. If $C$ is a subset of any pattern in the result, we can safely skip $C$. Otherwise, it invokes the procedure "SupportComputationMaximal" (to be discussed below), which takes a label set $C$ and an object set $O$ as inputs, and computes the support sup(C). If sup(C) $\ge$ min-sup, $C$ is added to $MP$. Otherwise, it constructs the subsets of $C$, denoted by $C'$, with $|C'| = j - 1$, and inserts $C'$ into $CMP$ if $C'$ does not exist in $CMP$. The iterations end when all candidates in $CMP$ have been iterated. Finally, it returns $MP$ as the result.
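The candidate generation and descending-size verification described above can be sketched as follows. This is a simplified illustration, not the authors' Algorithm 4: the names `maximal_cliques` and `maximal_patterns` are hypothetical, the Bron-Kerbosch routine is the basic variant without pivoting, and the support measure is supplied as a callable `sup` (a stand-in for SupportComputationMaximal).

```python
def maximal_cliques(adj):
    """Bron-Kerbosch without pivoting; adj maps a vertex to its neighbour set."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(frozenset(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques

def maximal_patterns(size2_patterns, sup, min_sup):
    """size2_patterns: 2-label frozensets (L_2); sup: label set -> support."""
    adj = {}
    for a, b in size2_patterns:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    candidates = set(maximal_cliques(adj))      # CMP
    result = []                                 # MP
    j = max((len(c) for c in candidates), default=0)
    while j >= 2:
        for c in [c for c in candidates if len(c) == j]:
            if any(c <= m for m in result):
                continue                        # covered by a found pattern
            if sup(c) >= min_sup:
                result.append(c)
            else:                               # fall back to (j-1)-subsets
                candidates |= {c - {t} for t in c}
        j -= 1
    return result
```

Because candidates are processed from the largest size downwards, a label set is only evaluated when no superset of it has already been accepted, which is what avoids computing supports for most non-maximal patterns.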
Theorem 2: The MaximalPatternMining algorithm correctly finds all maximal co-location patterns.
Proof: The completeness can be proven as follows. It is easy to see that all maximal patterns of size 2 can be found in Step 1. For maximal patterns with size larger than 2, we show that they must exist in $CMP$. Specifically, we prove it by contradiction. Suppose there exists a maximal pattern $C$ not in $CMP$. Then, either (1) there exists a superset of $C$ that is in $CMP$, or (2) there exists a subset $C' \subset C$ with $|C'| = 2$ that is not a pattern. In the former case, $C$ is not a maximal pattern by definition. In the latter case, $C$ cannot be a pattern by the anti-monotonicity property. Both cases lead to contradictions. Thus, all maximal patterns $C$ are in $CMP$. The correctness is guaranteed as the algorithm calculates the support of each candidate pattern.

B. Algorithm SupportComputationMaximal
In fact, since the definition of sup(C) does not change, we can reuse the "SupportComputation" procedure (i.e., Algorithm 2) to calculate the support value of a label set $C$.
Nevertheless, to further improve the performance, we include an additional filter in the filtering phase of "SupportComputation". The resulting procedure is called "SupportComputationMaximal". In particular, the additional filter takes advantage of … If so, we return "yes" (since $o$ must also be involved in a row instance of $C$). To support this checking, we maintain the objects involved in the row instances of each label set with size $(k + 1)$ when we process the label set with size $k$.

C. Time Complexity Analysis
It is easy to see that the time complexity of SupportComputationMaximal is the same as that of SupportComputation, denoted by $\theta$. We analyze the time complexity of Algorithm 4 as follows.
The complexity of MaximalPatternMining is dominated by Step

A. Experimental Set-up
Datasets: We use both real and synthetic datasets, as shown in Table III. The first real dataset, U.K., is the set of POIs of the United Kingdom.² Each POI has a textual description (e.g., supermarket, bank, cinema) and a GPS location. It consists of 182,334 objects with 36 types (i.e., labels). The second real dataset, NeuroSynth [39], was developed as an automated brain mapping framework that uses text mining to generate a large database of mappings between neural and cognitive states. The database contains a mapping between terms (e.g., "depression" and "anxiety") and the activated locations in the brain (3D coordinates in the MNI stereotaxic space, which we mapped to 3D Euclidean space). It contains 507,891 locations (i.e., objects) with 3,229 terms (i.e., labels). The object weights, obtained from text mining, are relevance scores between the labels and the locations.
The synthetic datasets are generated by following existing studies [18], [28] as follows.
Step 1 (Label Set Generation): We generate $N_{co\_loc}$ subsets of labels one by one, and for each one, we construct it by sampling a certain number of labels randomly, where the number follows a Poisson distribution with mean $\lambda_1$. We then construct $m_{overlap}$ maximal co-location patterns (i.e., label sets) from each set of labels constructed by augmenting it …

… where $n_1$ is equal to the number of non-noisy labels (i.e., those generated in Step 1). We then construct $(r_{noisy\_num} \times n_2)$ noisy instances based on the noisy labels similarly as we did based on non-noisy labels (i.e., via Step 2), and put each noisy instance at a random grid cell, where $n_2$ is equal to the number of non-noisy instances (i.e., those generated in Step 2). We set $N_{co\_loc}$, $\lambda_1$, $D$, $d$, $r_{noisy\_label}$, and $r_{noisy\_num}$ as 20, 5, $10^6$, 10, 0.5, and 0.5, respectively. By following existing studies [18], [28], we set the other parameters as shown in Table IV (with the default ones in bold). Note that the numbers of objects and labels in the synthetic datasets depend on the parameter settings. Under the default settings, the dataset contains 94,028 objects and 462 labels. In addition to the unweighted datasets, we further assign weights to generate weighted datasets. Specifically, we assign each object a weight picked uniformly at random in the range [0, 1] to form the weighted datasets.
Algorithms: For the co-location pattern mining problem, we test our Filtering-and-Verification approach. For comparison, we adapt the Join-less algorithm from [44] for two reasons. First, it is the state-of-the-art algorithm for co-location pattern mining. Second, though originally designed for the participation-based measure, it involves procedures of computing the row instances of a given label set, which is shared by our Fraction-Score measure. Specifically, the adapted algorithm works as follows. First, it generates all star neighborhoods. Second, for each label set $C$, it finds all the row instances from the corresponding star neighborhoods. Third, to check whether an object $o$ is involved in $C$, it checks whether $o$ exists in one of the row instances of $C$.
For the maximal pattern mining problem, we test our MaximalPatternMining algorithm. For comparison, we adapt the SGCT algorithm from [38], which is the state-of-the-art algorithm for maximal co-location pattern mining. Similar to the above, though it is originally designed for the participation-based measure, we adapt it for our Fraction-Score measure. Specifically, the adapted algorithm works as follows. First, it finds the size-2 patterns and candidate maximal patterns. Second, for each candidate $C$, it generates all row instances and stores them in a condensed instance tree. Third, to check whether an object $o$ is involved in $C$, it checks whether $o$ exists in the tree. All algorithms were implemented in C/C++ and are memory-based. All experiments were conducted on a Linux platform with a 2.66 GHz machine and 32 GB RAM.

B. Experiment Results on Co-Location Pattern Mining
1) Effectiveness Results on Synthetic Datasets: We compare Fraction-Score with the other approaches in terms of how close the measured supports are to the ground truths. Note that we did not include the enumeration-based approach here since it is used for defining the confidence of a rule candidate only, as mentioned in Section II. Besides, we use the unweighted synthetic datasets only for the study here since this allows the flexibility to generate datasets where the ground-truth supports could be estimated accurately. For this particular experiment, we set the parameter $m_{clump}$, i.e., the number of objects to be generated for a label, to be a random number from a uniform distribution over [1, 5] instead of a fixed number as we do for other experiments, and the purpose here is to test the robustness of the support measures. Specifically, we estimate the ground-truth support of a pattern as the maximum number of disjoint row instances of the pattern. Based on the way we generate the synthetic datasets, this is close to the number of instances of a label (which follows $Pois(\lambda_2)$) with the smallest $m_{clump}$ value among the labels in the pattern. For normalization, we then divide it by the maximum number of objects that have a specific label in $T$.
Fig. 4 shows the results of patterns with top-10 supports, where the x-axis corresponds to the patterns (in descending order of their supports) and the y-axis shows the actual supports. According to these results, the supports by Fraction-Score are closest to the ground truths among all approaches. This could be explained by the fact that the row instances that overlap with each other are not counted multiple times when collecting ground truths, which is reasonable, while the participation-based approach would count those row instances which share some objects with their labels different from the one used for grouping the row instances as if they share nothing. The partitioning-based approach under-measures the supports since it misses some of the row instances, and the construction-based approach misses some of the row instances due to its heuristic nature.
We also studied how the fractions in Fraction-Score are distributed. The results showed that only around one-fifth of the patterns have their fractions equal to 1. Due to the page limit, please refer to our previous work [8] for details.
2) Effectiveness Results on the U.K. Dataset: We study the effectiveness of different support measures on the U.K. dataset. Specifically, we ran our algorithm and found the co-location patterns with top-5 supports (with the setting of $d$ = 1000 m). Table V presents the patterns, each with its supports computed by other approaches also shown. According to the results, the supports by the participation-based approach are very close to 1 (which is mainly because this measure has a normalization step of dividing by the number of occurrences of the label but not the maximum among all labels as Fraction-Score does), and the supports by the partitioning-based and construction-based approaches are slightly smaller than those by Fraction-Score (which is mainly because the former ones miss some row instances while Fraction-Score captures all instances appropriately).
We also visualized the objects involving the labels in two different patterns. The distributions shown in the visualizations are consistent with our computation results. Due to the page limit, please refer to [8] for details.
3) Effectiveness Results on the NeuroSynth Dataset: We further study the effectiveness of our Fraction-Score on the NeuroSynth dataset. Specifically, in the 3D space with $x, y, z \in [-100, 100]$ mapped from the MNI space, we found the co-location patterns by setting $d = 20$. We selected four interesting patterns, and computed their supports with different approaches. Note that the baseline approaches originally do not support weighted datasets, and we adapted them to handle the case with weights. For the adaptation details, please refer to Appendix A, available online.
The results are shown in Table VI. According to the results, we found that autism spectrum disorder (ASD) is often correlated with pain, speech and working memory (WM), which conforms with the findings in existing studies [16], [34]. We found that ASD and Parkinson's disease (PD) have similar activated locations in the brain, which is also an ongoing research direction in the medical field [14], [17]. In addition, we have observations on the supports by other approaches similar to the above. The supports by the participation-based approach are very close to 1, which decreases the ability to distinguish patterns from label sets. The supports by the partitioning-based and construction-based approaches are smaller than those by Fraction-Score.
4) Results on the Filtering-and-Verification Approach: Filtering Phase: In this part, we show the results reflecting the effectiveness of the filtering phase. Consider Fig. 5(a), where we vary min-sup and measure the percentage of OIRI instances that are found by each of the four filters in the filtering phase and also that by the verification phase. These results show that more than 80% of OIRI instances could be found in the filtering phase, and thus less than 20% of OIRI instances would be left in the verification phase. Besides, we notice that when min-sup increases, the filtering powers of Filters 1 and 2 increase while that of Filter 3 decreases. The former is because the number of large co-location patterns decreases when min-sup increases and, as a consequence, it is more likely that size-2 patterns have a larger portion, which benefits Filter 1, and it is easier to find a row instance of a label set, which benefits Filter 2. The latter is because when min-sup increases, it becomes rare for Disk(o, d) to not cover all labels of a label set (which is of a small size) and thus the filtering power of Filter 3 decreases. The results on the other datasets provide similar clues and thus they are omitted.
Verification Phase: We conducted experiments on both real and synthetic datasets for studying the performance of the three methods proposed for the verification phase. Due to the page limit, the results can be found in Appendix C, available online. According to the results, Combinatorial-Search runs the fastest consistently under all settings. This could probably be explained by the fact that the exact algorithms employed in Dia-CoSKQ-Adapt and Optimization-Search were originally designed for optimization problems (i.e., the Dia-CoSKQ and mCK problems) while OIRI is a decision problem. These exact algorithms involve extra steps for finding an optimal solution and thus they take more time. Therefore, we focus on Combinatorial-Search in the verification phase for the remaining experiments. With Combinatorial-Search used in the verification phase, the breakdown of the running time is shown in Fig. 5(b).
We also compared the overall improvement of the Filtering-and-Verification approach over the one proposed in [8]. The results can be found in Appendix D, available online. According to the results, the updated Filtering-and-Verification algorithm runs faster and uses less memory in most cases, which demonstrates the effectiveness of the additional filtering and pruning steps and the strategy to reduce memory usage.

5) Filtering-and-Verification vs State-of-the-Art:
In this part, we compare the performance of Filtering-and-Verification and Join-less [44], in terms of running time and memory consumption.
Effect of min-sup: Fig. 6 shows the results on the real dataset where we vary min-sup. According to Fig. 6(a), the running times of both algorithms decrease when min-sup increases. This is because fewer co-location patterns would be found when min-sup increases. Besides, our Filtering-and-Verification approach runs much faster than the Join-less method, which could be explained by the fact that the former only needs to check whether some objects are involved in any of the row instances while the latter needs to find all row instances of each co-location pattern. According to Fig. 6(b), our Filtering-and-Verification approach consumes significantly less memory than the Join-less method, which is because the former only maintains the fractions received by each object for each label while the latter needs to store all row instances of each co-location pattern. Fig. 7 shows the results on the NeuroSynth dataset where we vary min-sup, where the results for Join-less with min-sup ≤ 0.4 are not shown because it takes more than 1 day to run. According to Fig. 7(a), the running times of both algorithms decrease when min-sup increases. Our Filtering-and-Verification approach runs faster than the Join-less method, which is because we only check if the objects are involved in any row instances, while Join-less finds all row instances for each pattern. According to Fig. 7(b), our Filtering-and-Verification approach consumes less memory than the Join-less method, since the Join-less method needs to store all row instances of the patterns. The results on the synthetic datasets, where we vary other parameter settings, can be found in Appendix E, available online.
6) Scalability Test: We further generated 5 synthetic datasets with sizes {180k, 360k, 540k, 720k, 900k} from the real dataset for the scalability test. According to the results, our Filtering-and-Verification method could scale up on large datasets of size 1 M, while the Join-less method cannot scale to large datasets, e.g., it ran for more than 2 days on a dataset of size about 180k. The results can be found in Appendix F, available online.

C. Experiment Results on Maximal Pattern Mining
In this part, we compare the performance of our MaximalPatternMining algorithm and SGCT [38], in both running time and memory consumption.
Fig. 8 shows the results on the weighted synthetic dataset where we vary min-sup. According to Fig. 8(a), the running times of both algorithms decrease when min-sup increases. This is because fewer co-location patterns exist and thus the sizes of the maximal patterns would decrease. Besides, our MaximalPatternMining runs much faster than the SGCT method, which is because (1) our two-phase approach prunes more non-promising candidates, and (2) we do not need to generate and store all row instances, while SGCT materializes all of them. According to Fig. 8(b), the two algorithms have similar memory usage.
Fig. 9 shows the results on the NeuroSynth dataset, where the results for SGCT with min-sup < 0.4 are not shown because it takes more than 1 day to run. According to Fig. 9(a), our MaximalPatternMining runs consistently faster than SGCT, which is because MaximalPatternMining has more effective pruning to reduce the number of candidate patterns. The results for the unweighted synthetic and U.K. datasets can be found in Appendix G, available online.
Summary of Results: Our Fraction-Score metric measures the prevalence of co-location pattern candidates more properly than existing ones. Three filters in the filtering phase are effective (e.g., they filter more than 80% of OIRI instances), and among the three methods in the verification phase, Combinatorial-Search works the best. Besides, our Filtering-and-Verification approach works consistently better than the state-of-the-art in terms of both running time and memory consumption. Moreover, our

Fig. 1. A small portion of a real dataset of POIs in the United Kingdom, including 8 restaurants (blue), 3 banks (green) and 3 churches (red), where the icons indicate the labels of the spatial objects and the disks have their centers at $C_1$-$C_3$ and radii all equal to $d$.

Fig. 2. A toy example where × and • are two labels, and $A_1$-$A_9$ and $B_1$-$B_9$ are 18 objects, each with exactly one label indicated by the shape representing the object and with its weight indicated by the value in blue.

Fig. 3. The distribution of the fractions from the perspective of objects with label •, represented by the arrows with solid lines and the values in black.

• Filter 3: We check if all objects in Disk(o, d) together carry all labels in $C$. If not, we return "no" (since all possible sets of objects in Disk(o, d) correspond to subsets of the set containing all objects in Disk(o, d) and thus, they cannot carry all labels in $C$ either).
• Filter 4: We check if all objects in Disk(o, d/2) together carry all labels in $C - \{o.t\}$. If so, we return "yes" (since there exists a set $S$ of objects in Disk(o, d/2) including $o$ that has $\max_{o, o' \in S} d(o, o') \le d$ and corresponds to a row instance of $C$).
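Filters 3 and 4 reduce to two label-coverage tests on disks of radius $d$ and $d/2$, which can be sketched as below. The function names and the `(location, label)` tuple layout are hypothetical, and the disk lookup is a linear scan used only to keep the sketch self-contained. Filter 4 is sound because any two points in Disk(o, d/2) are at most $d$ apart by the triangle inequality.

```python
from math import dist

def labels_in_disk(o, objects, radius):
    """Labels carried by the objects within `radius` of object index o."""
    loc_o = objects[o][0]
    return {label for loc, label in objects if dist(loc, loc_o) <= radius}

def filter_3_and_4(o, C, objects, d):
    """Return "no", "yes", or None when neither filter decides the instance."""
    if not C <= labels_in_disk(o, objects, d):
        return "no"    # Filter 3: Disk(o, d) does not cover all labels in C
    if C - {objects[o][1]} <= labels_in_disk(o, objects, d / 2):
        return "yes"   # Filter 4: Disk(o, d/2) covers C - {o.t}
    return None        # undecided: pass on to the verification phase
```

An undecided result (`None`) corresponds to the OIRI instances that are left for the verification phase.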

• Step 1. (Finding size-2 patterns): We find the size-2 patterns, denoted by $L_2$, using the algorithms discussed in Section IV.
• Step 2. (Generating Candidate Maximal Patterns): Inspired by …
• Step 3. (Finding Maximal Patterns): We find the maximal patterns $MP$ from the candidate set $CMP$. The major idea is to iteratively verify the candidate patterns in descending order of their sizes. If a candidate pattern $C$ is not maximal (i.e., sup(C) < min-sup), all its subsets $C'$ with $|C'| = |C| - 1$ are constructed as candidate patterns to be checked. The iterations stop when $|C'| = m$.

Algorithm 4 shows the maximal pattern mining algorithm. It takes a set $O$ of objects and a label set $C$ as inputs, finds all maximal patterns, and stores them in $MP$. Specifically, it first finds all size-2 patterns, denoted by $L_2$. Second, it constructs a graph $G$ from $L_2$, and finds the set of maximal cliques in $G$ to be

3. The complexity is $O((|L_2| + |CMP|) \cdot \theta)$, since it needs to compute the supports of at most $(|L_2| + |CMP|)$ label sets.

VI. EMPIRICAL STUDIES
Section VI-A details the experimental set-up. Section VI-B reports the results on co-location pattern mining, and Section VI-C presents the results on maximal pattern mining.
Table I summarizes them and compares them with Fraction-Score.

TABLE II NOTATION

we say that S is a neighbor set if the maximum pairwise distance within S is bounded by a distance threshold d, i.e., max_{o′,o″ ∈ S} d(o′, o″) ≤ d. Given an object o and a real number r, we denote by Disk

B 5 } and each object in Obj(×, C) corresponds to a group of C's row instances. Consider an object o in Obj(t, C) and another object o′ with its label different from t (i.e., o′.t ≠ t). If some row instances in the group formed by o involve o′, i.e., o′ is shared by this group, we know that o must be located in Disk(o′, d), since otherwise o and o′ cannot be involved in the same row instance of C. Thus, the potential number of groups that o′ could be shared by is bounded by the number of objects which are located in Disk(o′, d) and have the label t. Let us denote by Θ(o′, t, d) the set of objects which are located in Disk(o′, d) and carry the label t (note that o ∈ Θ(o′, t, d)). Motivated by the previous observation, Fraction-Score splits o′ into |Θ(o′, t, d)| equal fractions, each equal to o′.w/|Θ(o′, t, d)|, and then distributes each fraction to an object in Θ(o′, t, d). To illustrate, consider Fig. 2. We have

Algorithm 1: FractionComputation(O, T, d, min-sup).
Require: an object set O, a label set T, a distance threshold d and a support threshold min-sup
Ensure: the aggregated fraction each object o ∈ O receives w.r.t. each t ∈ T, i.e., Δ_label(o, t)
Δ_label(o′, o.t) ← 1

computes the support of a given label set C based on this information.

FractionComputation: Algorithm 1 presents FractionComputation. First, it initializes |Neigh(o, t, d)| and Δ_label(o, t) for each object o ∈ O and each label t ∈ T as 0 (lines 1-4). Second, for each object o ∈ O, it proceeds as follows. It counts the number of objects in Disk(o, d) which have a label t (lines 6-7). Then, it distributes a fraction o.w/|Neigh(o, o′.t, d)|

: a label set C and an object set O

Require: an object set O, a label set C, and an object o in O
Ensure: the aggregated fraction object o receives w.r.t. C, i.e., Δ_labelSet(o, C)
1: Δ_labelSet(o, C) ← ∞
2: for label t in C − {o.t} do
3:   if Δ_label(o, t) < Δ_labelSet(o, C) then
4:     Δ_labelSet(o, C) ← Δ_label(o, t)
5: return Δ_labelSet(o, C)

sup(C) (lines 4-7). Finally, it returns the smallest sup(C|t) for a label t ∈ C as sup(C) (lines 8-9).
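To make the fraction bookkeeping concrete, the sketch below distributes each object's weight among its differently labelled neighbours and then takes the minimum over the labels of a label set, mirroring the listing for Δ_labelSet(o, C). It is a simplified illustration under assumptions: the `Obj` record, the index-keyed dictionary, and the quadratic neighbour scan are choices made for this example. The listing fragment `Δ_label(o′, o.t) ← 1` suggests that received fractions may be capped at 1; the condition around it is not recoverable here, so the sketch leaves the cap out, and it does not compute sup(C).

```python
from collections import defaultdict
from dataclasses import dataclass
from math import hypot, inf


@dataclass(frozen=True)
class Obj:
    """Hypothetical object record: position, label t and weight w."""
    x: float
    y: float
    t: str
    w: float = 1.0


def fraction_computation(objects, d):
    """Return {(receiver_index, label): aggregated fraction}, i.e. Delta_label."""
    delta_label = defaultdict(float)
    for o in objects:
        # Group the objects in Disk(o, d) by label: Neigh(o, t, d) for each t.
        neigh = defaultdict(list)
        for j, p in enumerate(objects):
            if hypot(p.x - o.x, p.y - o.y) <= d:
                neigh[p.t].append(j)
        for t, members in neigh.items():
            if t == o.t:
                continue                  # only differently labelled objects receive
            share = o.w / len(members)    # o.w / |Neigh(o, t, d)|
            for j in members:
                delta_label[(j, o.t)] += share
    return delta_label


def delta_label_set(delta_label, objects, i, C):
    """Minimum of Delta_label(o, t) over t in C - {o.t}, as in the listing above."""
    best = inf
    for t in set(C) - {objects[i].t}:
        best = min(best, delta_label.get((i, t), 0.0))
    return best
```

For instance, an object with label A that has two B-neighbours within distance d hands half of its weight to each of them, so each B-object's aggregated fraction for label A grows by 0.5.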

TABLE III DATASETS USED IN THE EXPERIMENTS

the top-down approach in our maximal pattern mining algorithm to reuse information from previous checking. It is inserted after Filter 2, and is as follows.
Filter 2′: We check if there exists a row instance of C′ ⊃ C involving o for the label set C′ that satisfies |C′| = |C| + 1.

TABLE IV PARAMETERS AND SETTINGS

with one more random label.
Step 2 (Instance Construction): For each maximal co-location pattern, we construct a certain number of instances, where the number follows a Poisson distribution with mean λ_2, each by creating m_clump objects for each label in this instance and putting them inside a random grid cell of size d × d from the spatial frame of size D × D.
Step 3 (Noise Injection): We generate (r_noisy_label × n_1) noisy labels,