Stable Learned Bloom Filters for Data Streams


Qiyu Liu§, Libin Zheng§∗, Yanyan Shen†, and Lei Chen§

§The Hong Kong University of Science and Technology, †Shanghai Jiao Tong University
{qliuau, lzhengab, leichen}@cse.ust.hk, [email protected]

ABSTRACT

Bloom filters and their variants are elegant, space-efficient probabilistic data structures for approximate set membership queries. It has recently been shown that the space cost of Bloom filters can be significantly reduced by combining them with pre-trained machine learning models; the resulting structures are called Learned Bloom Filters (LBF). An LBF eases the space requirement of a Bloom filter by answering part of the queries with a classifier. However, current LBF structures generally target a static member set. Their performance inevitably decays when members are added to the set, and such updates are common in real-world data streaming applications such as duplicate item detection, malicious URL checking, and web caching. To adapt LBF to data streams, we propose the Stable Learned Bloom Filter (SLBF), which addresses the performance decay on insertion-intensive workloads by combining a classifier with updatable backup filters. Specifically, we propose two SLBF structures, Single SLBF (s-SLBF) and Grouping SLBF (g-SLBF). The theoretical analysis of these two structures shows that the expected false positive rate (FPR) of SLBF is asymptotically constant as new members are inserted. Extensive experiments on real-world datasets show that SLBF introduces a similar level of false negative rate (FNR) but yields a better FPR/storage trade-off compared with the state-of-the-art (non-learned) Bloom filters optimized for data streams.

PVLDB Reference Format:

Qiyu Liu, Libin Zheng, Yanyan Shen and Lei Chen. Stable Learned Bloom Filters for Data Streams. PVLDB, 13(11): 2355-2367, 2020.

DOI: https://doi.org/10.14778/3407790.3407830

1. INTRODUCTION

Bloom filters (BF) [7] are simple, space-efficient probabilistic data structures designed for answering membership queries, that is, testing whether a queried element x is in a given set S. Due to their great importance, optimizations and variants of the Bloom filter as well as its applications, especially in databases and networking, have attracted much research effort over the past decades [11, 16, 17, 19, 18, 23, 15, 13]. Although Bloom filters have been well explored and evaluated in both academia and industry, a recent proposal, “The Case for Learned Index Structures” [25], suggests that machine learning models like neural networks can be combined with traditional index structures, including B-trees, hash tables and Bloom filters, to further improve the space utilization and query efficiency.

∗Libin Zheng and Yanyan Shen are corresponding authors.

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-nd/4.0/. For any use beyond those covered by this license, obtain permission by emailing [email protected]. Copyright is held by the owner/author(s). Publication rights licensed to the VLDB Endowment.

Proceedings of the VLDB Endowment, Vol. 13, No. 11. ISSN 2150-8097.

Figure 1: Illustration of the learned Bloom filter built on an element set S = {x, y, z} with a query element w. (The learned oracle predicts x as positive and y, z, w as negative; the negative predictions are passed to a 16-bit backup filter.)

In their seminal work, Kraska et al. [25] claim that membership queries can be regarded as an instance of the classification problem on a dataset {(xi, yi = 1) | xi ∈ S} ∪ {(xi, yi = 0) | xi ∈ N}, where S and N are the sets of members and non-members. They first use a classifier that classifies elements into members and non-members. Then, to remove possible false negatives, a small backup Bloom filter is built on the set SN = {x | x ∈ S, x is predicted as non-member} to distinguish the true members from the predicted non-members. Such a data structure, called Learned Bloom Filter (LBF), is illustrated in Figure 1. For query processing, a non-member decision is made if and only if both the classifier and the backup filter determine the queried element to be a non-member.

Example 1. As shown in Figure 1, let us consider a toy element set S = {x, y, z} for constructing an LBF. x is correctly predicted by the classifier as a member of S, so it needs no further treatment. In contrast, y and z, which are wrongly classified, are inserted into a standard Bloom filter [7] with an array of 16 bits and 3 hash functions. When the constructed LBF is queried with an element w, supposing that the classifier determines w to be a non-member, w is further tested against the backup filter, which finally yields a non-member decision, i.e., w ∉ S.

Similar to the standard BF, LBF has one-sided error (i.e., only false positives). Compared with non-learned filters, the advantage of LBF is that, on a static element set, it requires much less storage while retaining competitive query efficiency and error rate. The reason is that the space cost of LBF comes from storing both the classifier and the backup filter. However, SN, the set used to build the backup filter, is relatively small, and storing a well-trained classifier usually requires much less space [25].

Figure 2: Simulation result of the FPR decay effect for LBF (x-axis: #insertions, from 0 to 140,000; y-axis: false positive rate, from 0.0 to 1.0). The backup filter is initialized with #expected elements = 1,000, #bits per element = 9.85, #hash functions = 6, and FPR is calculated by varying the number of insertions and trying 4 different combinations of Fp (FP rate) and Fn (FN rate) of the classifier: (Fp = 0.01, Fn = 0.5), (Fp = 0.1, Fn = 0.3), (Fp = 0.25, Fn = 0.25), and (Fp = 0.5, Fn = 0.1).

However, similar to the standard BF, LBF is designed for static element set S whose total cardinality is known in advance. As a result, when there are new elements outside S being inserted to the filter, the false positive rate (FPR) would inevitably grow due to the limited space of backup filter, which is determined upon its construction. Compared with standard BF, such performance decay effect caused by insertions is more severe for LBFs, as the backup filter is usually small, and the insertion capacity can be easily reached.

Example 2. (FPR Decay) Figure 2 illustrates this effect by reporting the FPR versus the number of insertions according to the theoretical analysis for LBF in [30], under the assumption that newly inserted elements are sampled from the same distribution as S. It shows that the FPR approaches 100% after 80K new insertions under all four settings. More specifically, though a smaller Fn (false negative rate of the classifier) can slow down the increase of FPR, it introduces a higher initial FPR. Intuitively, this is because Fp and Fn usually trade off against each other, and Fp is a lower bound of the FPR (discussed in Section 2.2).
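The decay in Example 2 can be approximated with a few lines of Python. This is a rough sketch, not the exact model of [30]: it assumes the backup filter already holds its 1,000 expected elements, that each new insertion reaches the backup filter with probability Fn, and that the backup filter follows the standard approximation (1 − e^{−kn/m})^k. The function names are hypothetical.

```python
import math


def bf_fpr(n_items: float, m_bits: float, k: int) -> float:
    """Standard Bloom filter FPR approximation (1 - e^{-kn/m})^k."""
    return (1.0 - math.exp(-k * n_items / m_bits)) ** k


def lbf_fpr_after_insertions(n_inserted, fp, fn, n0=1000, bits_per_elem=9.85, k=6):
    """Approximate LBF FPR after n_inserted new elements (sketch assumptions above)."""
    m_bits = n0 * bits_per_elem            # backup filter size is fixed at construction
    backup_items = n0 + fn * n_inserted    # elements the classifier misses go to the backup
    return fp + (1.0 - fp) * bf_fpr(backup_items, m_bits, k)


# The four (Fp, Fn) settings listed in the caption of Figure 2.
settings = [(0.01, 0.5), (0.1, 0.3), (0.25, 0.25), (0.5, 0.1)]
for fp, fn in settings:
    curve = [lbf_fpr_after_insertions(n, fp, fn) for n in (0, 20_000, 80_000)]
    print(f"Fp={fp}, Fn={fn}:", [round(v, 3) for v in curve])
```

Under these assumptions, a smaller Fn slows the growth of the curve but starts from a higher initial FPR, because the initial FPR is dominated by Fp.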

Though space-efficient, LBF is only applicable to a predefined element set and query-only workloads, which limits its real-world applications and motivates us to design new learned filters usable on insertion-intensive workloads. However, building such an insertion-aware LBF is non-trivial.

Existing works addressing this dynamic insertion issue all target standard BF [19, 6, 13, 18]. They cannot be directly applied to the context of LBF due to the existence of an extra classifier. The major challenges lie in four aspects.

• First, ML models are usually non-deterministic. Therefore, we need to devise a new mathematical model for analyzing the performance of learned filters over data streams, just like what is done for the LBF on static sets in [30].

• Second, different from the standard BF, since we wish the FPR to be controlled when the filter is applied to an unbounded element stream with limited storage, it would inevitably introduce false negatives. Consequently, we need to carefully quantify the false negative rate and achieve a proper trade-off among FPR, FNR, and storage.

• Third, when handling a static element set, overfitting of the classifier is usually good since it improves both Fp and Fn, which is contrary to the case of streaming data, where we hope that the classifier generalizes well to future elements.

• Fourth, parameter setting becomes harder in the context of dynamic insertions. For the original learned Bloom filter on a static element set, the parameters of the backup filter, e.g., the number of hash functions, are optimized based on the size of the element set, the classifier performance and a user-provided FPR threshold, all of which will change as new elements are inserted into the filter.

Figure 3: Left: FPR vs. #insertions under the same storage cost (curves for the original LBF, which rises to 100%, and our SLBF). Right: sketch of performance features under the same expected FPR upper bound (original LBF, our SLBF, standard BF, and updatable BF variants, positioned by storage (low/high) and FPR decay (no/yes)).

To handle the dynamic insertion of new elements, in this paper we design a new insertion-aware LBF structure called Stable LBF (SLBF). SLBF is expected to have the following features: 1) the performance decay effect is under control, i.e., the FPR has a non-trivial upper bound even for a large number of insertions; 2) the total storage cost is limited and less than that of a standard filter at the same error level; and 3) the membership query is as efficient as with a standard filter, i.e., it takes O(1) time. Figure 3 outlines the major performance merits of our SLBF. When applied to dynamic insertions, SLBF has a low storage cost and avoids the performance decay issue. Many real-world applications, like duplicate detection [29], IP traffic monitoring [26], and search engine refinement [22], can benefit from using our SLBF. The applications of SLBF, as well as a detailed discussion of the proper choice among BF, LBF and our SLBF in different scenarios, are presented in Appendix A and Appendix B.

To the best of our knowledge, this is the first work that considers optimizing LBF over dynamic insertion workloads. We summarize the major technical contributions as follows.

• We introduce two new learned Bloom filters, Single Stable Learned Bloom Filter (s-SLBF) and Grouping Stable Learned Bloom Filter (g-SLBF), to achieve the three objectives mentioned above.

• We perform a detailed analysis of the performance of our proposed data structures over dynamic insertion workloads, based on which we explain their parameter setting and classifier selection.

• We conduct extensive experimental studies on real data, which show that g-SLBF can reduce the storage cost by up to 97%.

The rest of this paper is organized as follows. Section 2 introduces some preliminaries and formulates the problem of devising an insertion-aware Bloom filter with a performance guarantee. We present our s-SLBF and g-SLBF and analyze their performance over data streams in Section 3. Section 4 discusses the parameter settings for our SLBF. We report the experimental results in Section 5. Finally, we review previous works in Section 6 and conclude in Section 7.

2. PRELIMINARIES

This section overviews the structures and analytical results of standard Bloom filters and learned Bloom filters. Then, we discuss the property of stability for Bloom filters on unbounded data streams. For quick reference, all the notations used hereafter are summarized in Table 1.

Table 1: Notations and descriptions.

| Notation | Description |
|---|---|
| S | the element set of n distinct members |
| h1, · · · , hk | k independent hash functions |
| B[1 · · · m] | the array of m bits used in BF |
| f : x → [0, 1] | a trained classifier |
| τ | the decision threshold of classifier f |
| SBF[1 · · · m] | the array of m counters used in stable BF |
| g | the number of total groups |
| τ1 < · · · < τg−1 | a partition of the interval [0, 1] |
| SBF1, · · · , SBFg | a collection of g stable BFs |
| Maxj | the max value of each counter of SBFj |
| mj | the number of counters of SBFj |
| Kj | the number of hash functions of SBFj |
| Pj | the number of decrements of SBFj |
| Fp | the false positive prob. of classifier f |
| Fn | the false negative prob. of classifier f |
| x1, · · · , xN | a sequence of N elements to be inserted |

2.1 Standard Bloom Filter

Given a set S of n member elements, a standard Bloom filter [7] represents S using a bit array B of size m and k independent hash functions h1, · · · , hk which uniformly map the elements to the range 1 ∼ m (inclusive). The bit array is initialized to all 0’s. Then, for each x ∈ S, the bits located at h1(x), · · · , hk(x) are set to 1. For a membership test of an element y, a positive answer is given if all k bits pointed to by h1(y), · · · , hk(y) are 1; otherwise a negative answer is returned. Such a construction and query mechanism ensures that there are no false negatives but possibly false positives. After inserting all n elements of S into the bit array B, the probability that an arbitrary bit of B is still 0 can be calculated as:

$$\Pr(B[i] = 0) = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}. \tag{1}$$

We denote p0 = Pr(B[i] = 0). For any non-member element y ∉ S, the false positive rate (FPR) is then given by

$$\mathrm{FPR}_{BF} = \Pr\big(B[h_1(y)] = 1 \wedge \cdots \wedge B[h_k(y)] = 1\big) = (1 - p_0)^k \approx \left(1 - e^{-kn/m}\right)^k. \tag{2}$$

In practice, n is known in advance as the expected size of the element set, and m is the size of the bit array. Given n and m, the optimal number of hash functions, obtained by setting the derivative of Eq. (2) to zero, is k_opt = (m/n) ln 2, and the corresponding optimal FPR is 0.5^{(m/n) ln 2} ≈ 0.6185^{m/n}. More generally, the FPR of the standard Bloom filter as well as its variants like the Cuckoo filter [16] on a static element set can be modeled as α^t, where α ∈ (0, 1) is a constant and t = m/n stands for the number of bits used to encode each element.
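As a quick numerical check of the optimal setting, the following sketch evaluates Eq. (2) at k_opt and compares it with the closed form 0.5^{(m/n) ln 2}. The sizes (10,000 bits, 1,000 elements) and function names are illustrative assumptions, and k_opt is treated as a real number; in practice k must be rounded to an integer.

```python
import math


def optimal_num_hashes(m_bits: int, n_items: int) -> float:
    """k_opt = (m/n) * ln 2, the minimizer of Eq. (2)."""
    return (m_bits / n_items) * math.log(2)


def bf_fpr(m_bits: int, n_items: int, k: float) -> float:
    """Eq. (2): approximate FPR of a standard Bloom filter."""
    return (1.0 - math.exp(-k * n_items / m_bits)) ** k


m_bits, n_items = 10_000, 1_000                 # hypothetical sizes: 10 bits per element
k_opt = optimal_num_hashes(m_bits, n_items)     # about 6.93
fpr_at_k_opt = bf_fpr(m_bits, n_items, k_opt)
closed_form = 0.5 ** ((m_bits / n_items) * math.log(2))
print(k_opt, fpr_at_k_opt, closed_form)

# Rounding k to a neighboring integer can only increase the FPR.
print(bf_fpr(m_bits, n_items, 6), bf_fpr(m_bits, n_items, 7))
```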

2.2 Learned Bloom Filter

The formal definition of the learned Bloom filter (LBF), which was first introduced by Kraska et al. [25] and further refined by Mitzenmacher [30], is given as follows.

Definition 1. (Learned Bloom Filter) Given an element set S, an LBF can be represented as a triple (f, τ, BF) where f : x → [0, 1] is a pre-trained classifier (learned oracle), τ is the decision threshold of f where f(x) ≥ τ implies that x is predicted as a member of S, and BF is the backup standard Bloom filter built on the set of all elements in S that are wrongly predicted as non-members, i.e., {x | f(x) < τ, x ∈ S}.

As shown in Figure 1, to process a membership query of element y, we trust the positive predictions but challenge the negative outputs of the classifier, and a negative answer is given if and only if both the classifier and the backup filter determine that y is a non-member. Such a construction of LBF ensures one-sided error (i.e., no false negatives), and for any non-member element y ∉ S, the FPR is given by:

$$\mathrm{FPR}_{LBF} = \Pr(f(y) \ge \tau) + \Pr(f(y) < \tau) \cdot \alpha^{m/|S_N|}, \tag{3}$$

where m is the number of bits allocated to the backup filter BF, SN = {x | f(x) < τ, x ∈ S}, and α is a constant that depends on the implementation of the backup filter.

In Eq. (3), Pr(f (y) ≥ τ ) can be interpreted as the false positive probability of the classifier, which is essentially a random variable and depends on how y is picked, i.e., the query distribution. In statistics literature, Pr(f (y) ≥ τ ) can be estimated by using a probe dataset which is assumed to be sampled from the same distribution of the dataset used to train classifier f . Given an LBF (f, τ, BF ), supposing the classifier f and the backup filter BF use ζ and m bits respectively, combining Eq. (2) and Eq. (3), an LBF is better than a standard BF consuming the same space (i.e., ζ + m bits) if the following inequality holds,

$$F_p + (1 - F_p) \cdot \alpha^{b/F_n} < \alpha^{\zeta/n + b}, \tag{4}$$

where Fp is the false positive probability of the classifier, Fn = |SN|/n, b = m/n, and α is a constant (Section 2.1).

Note that, the left- and right-hand sides of Eq. (4) stand for the FPR of LBF and BF with the same storage and the optimal number of hash functions.
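The inequality in Eq. (4) is easy to evaluate numerically. The sketch below assumes α = 0.6185 (the optimal standard BF constant from Section 2.1) and uses made-up values for the classifier size ζ, the element count n, and the backup budget b; the function names are hypothetical.

```python
def lbf_fpr(fp: float, fn: float, b: float, alpha: float = 0.6185) -> float:
    """Left-hand side of Eq. (4): Fp + (1 - Fp) * alpha^(b / Fn)."""
    return fp + (1.0 - fp) * alpha ** (b / fn)


def bf_fpr_same_space(zeta_bits: float, n: int, b: float, alpha: float = 0.6185) -> float:
    """Right-hand side of Eq. (4): a standard BF that gets all zeta + m bits."""
    return alpha ** (zeta_bits / n + b)


def lbf_beats_bf(fp, fn, b, zeta_bits, n, alpha=0.6185) -> bool:
    return lbf_fpr(fp, fn, b, alpha) < bf_fpr_same_space(zeta_bits, n, b, alpha)


# Hypothetical setting: 1M members, a classifier of 2 bits per member,
# and a backup filter budget of b = 4 bits per member.
n, zeta_bits, b = 1_000_000, 2_000_000, 4.0
print(lbf_beats_bf(fp=0.01, fn=0.5, b=b, zeta_bits=zeta_bits, n=n))  # accurate classifier
print(lbf_beats_bf(fp=0.10, fn=0.5, b=b, zeta_bits=zeta_bits, n=n))  # Fp alone exceeds the BF's FPR
```

The second call shows why Fp acts as a floor: once Fp exceeds the FPR a standard BF achieves with the same space, no backup filter can make the LBF competitive.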

2.3 BF Stability on Data Streams

The construction of either standard BF or LBF relies on knowing the whole picture of the element set S. To start our discussion on BF stability over dynamically growing element sets, we first define the membership testing on data streams.

Definition 2. (Membership Query on Data Stream) Consider an unbounded stream of elements x1, · · · , xn where n can be infinite. For a queried element y, the membership query of y returns true if and only if y ∈ {x1, · · · , xn}, i.e., y has been seen before timestamp n.

As we have stated earlier, any Bloom filter using limited space cannot achieve bounded one-sided error on unbounded data streams. Intuitively, this can be explained using the model FPR = α^{m/n} (Section 2.1), according to which lim_{n→∞} FPR = 1. To achieve a non-trivial FPR bound using limited storage, Deng and Rafiei [13] first introduced the concept of the Stable Bloom Filter (SBF), whose idea is to clear random bits when inserting an element in order to make room for future elements.

An SBF represents a dynamically growing set using an array of m counters SBF[1, · · · , m], instead of the bit array used by the standard BF, and each counter is allocated d bits (i.e., SBF[i] is between 0 and Max = 2^d − 1). To insert an element x, P counters are first randomly selected and decremented by 1 if they are non-zero. Then, similar to a standard BF, K independent hash values h1(x), · · · , hK(x) are calculated and the counters SBF[h1(x)], · · · , SBF[hK(x)] are set to Max. For a membership query of an element y, “yes” is returned if none of SBF[h1(y)], · · · , SBF[hK(y)] is 0; otherwise “no” is returned. The insertion algorithm and membership query processing using SBF are shown in Figure 4a and Figure 4b.
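The insertion and query procedures just described can be sketched in Python as follows. This is a minimal illustration rather than the implementation of [13]: the class name, the 0-based indexing, the seeded random generator, and the use of salted BLAKE2b digests as the K hash functions are all assumptions made here.

```python
import hashlib
import random


class StableBloomFilter:
    """Counter array with random decrements on insert (sketch of an SBF)."""

    def __init__(self, m: int, K: int, P: int, max_val: int, seed: int = 0):
        self.m, self.K, self.P, self.max_val = m, K, P, max_val
        self.cells = [0] * m                  # m counters, each in [0, max_val]
        self._rng = random.Random(seed)

    def _positions(self, item: str):
        # Assumption: K salted BLAKE2b digests stand in for K independent hashes.
        for k in range(self.K):
            digest = hashlib.blake2b(
                item.encode("utf-8"), digest_size=8, salt=k.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.m

    def insert(self, item: str) -> None:
        # Step 1: decrement P randomly chosen counters if they are non-zero.
        for _ in range(self.P):
            idx = self._rng.randrange(self.m)
            if self.cells[idx] > 0:
                self.cells[idx] -= 1
        # Step 2: set the K hashed counters to Max.
        for pos in self._positions(item):
            self.cells[pos] = self.max_val

    def query(self, item: str) -> bool:
        # "yes" only if none of the K hashed counters is 0.
        return all(self.cells[pos] != 0 for pos in self._positions(item))


sbf = StableBloomFilter(m=1024, K=3, P=10, max_val=3)
sbf.insert("alice")
print(sbf.query("alice"))  # True: its K counters were just set to Max
```

Because the decrements happen before the K counters are set, the most recently inserted element is always reported as present; older elements may be forgotten, which is the source of false negatives.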

Function insert(SBF, x)
    for p = 1, · · · , P do
        idx ← Rand(1, m)
        if SBF[idx] > 0 then
            SBF[idx] ← SBF[idx] − 1
        end
    end
    for k = 1, · · · , K do
        SBF[hk(x)] ← Max
    end
end

(a) Insertion algorithm of SBF.

Function query(SBF, y)
    for k = 1, · · · , K do
        if SBF[hk(y)] = 0 then
            return false
        end
    end
    return true
end

(b) Query processing using SBF.

Input: an s-SLBF (f, τ, SBF) and a sequence of N elements to be inserted x1, · · · , xN
    for i = 1, · · · , N do
        if f(xi) ≥ τ then
            continue
        else
            insert(SBF, xi)
        end
    end

(c) Insertion algorithm of s-SLBF.

Input: an s-SLBF (f, τ, SBF) and a query element y
    conf ← f(y)
    if conf ≥ τ then
        return true
    else
        return query(SBF, y)
    end

(d) Query processing using s-SLBF.

Input: a classifier f, a filter array SBF1, · · · , SBFg, and a sequence of N elements x1, · · · , xN
    for i = 1, · · · , N do
        j ← index of the interval [τj−1, τj] which f(xi) belongs to
        insert(SBFj, xi)
    end

(e) Insertion algorithm of g-SLBF.

Input: a classifier f, a filter array SBF1, · · · , SBFg, and a query element y
    j ← index of the interval [τj−1, τj] which f(y) belongs to
    return query(SBFj, y)

(f) Query processing using g-SLBF.

Figure 4: Insertion and membership query processing algorithms of SBF, s-SLBF and g-SLBF.

When applying SBF to a data stream x1, · · · , xn, a key observation is that, with the counter decrement behavior, the fraction of ‘0’ counters in the array tends to a constant as the number of insertions n → ∞. According to Theorem 2 of [13], given an SBF with m counters, denoting p⁽ⁿ⁾₀ as the fraction of counters with value 0 after inserting n elements, the limit of p⁽ⁿ⁾₀ is,

$$\lim_{n \to \infty} p_0^{(n)} = \left(\frac{1}{1 + \frac{1}{P\,(1/K - 1/m)}}\right)^{\mathit{Max}}. \tag{5}$$

In addition, p⁽ⁿ⁾₀ − p⁽ⁿ⁻¹⁾₀ ≈ (K/m)(1 − K/m)^n, which indicates an exponential convergence.

The zero fraction p⁽ⁿ⁾₀ can be interpreted as the probability that an arbitrary counter SBF [i] is 0 after inserting n elements from a data stream. Thus, for any query element y not in the data stream, the false positive rate¹ generated by SBF is given by,

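$$\lim_{n \to \infty} \mathrm{FPR}_{SBF} = \lim_{n \to \infty} \left(1 - p_0^{(n)}\right)^{K} \overset{(m \gg K)}{\approx} \left(1 - \left(\frac{1}{1 + K/P}\right)^{\mathit{Max}}\right)^{K}. \tag{6}$$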

This property of reaching a non-trivial FPR after a large number of insertions, instead of decaying to 1, is called “stable” for Bloom filters on data streams.
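To make the stable point concrete, the sketch below evaluates Eq. (5) and Eq. (6) directly. The parameter values (K = 3, P = 10, Max = 3) and function names are illustrative assumptions; the point is that the stable FPR barely depends on m once m ≫ K, while a larger P lowers it.

```python
def stable_zero_fraction(m: float, K: int, P: int, max_val: int) -> float:
    """Eq. (5): limiting fraction of zero counters."""
    return (1.0 / (1.0 + 1.0 / (P * (1.0 / K - 1.0 / m)))) ** max_val


def stable_fpr(m: float, K: int, P: int, max_val: int) -> float:
    """Stable FPR (1 - p0)^K using the exact limit from Eq. (5)."""
    return (1.0 - stable_zero_fraction(m, K, P, max_val)) ** K


def stable_fpr_approx(K: int, P: int, max_val: int) -> float:
    """Eq. (6): the m >> K approximation, independent of m."""
    return (1.0 - (1.0 / (1.0 + K / P)) ** max_val) ** K


K, P, max_val = 3, 10, 3                      # hypothetical SBF parameters
for m in (10**4, 10**5, 10**6):
    print(m, stable_fpr(m, K, P, max_val))    # nearly identical across m
print(stable_fpr_approx(K, P, max_val))
print(stable_fpr_approx(K, 2 * P, max_val))   # more decrements per insert, lower FPR
```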

However, with the counter decrement operations, SBF achieves stability at the cost of a non-zero number of false negatives, which means an already inserted element xi may be wrongly determined to be a non-member by SBF. According to [13], the false negative rate (FNR) of SBF relates not only to the filter’s parameters but also to how the query elements are distributed, i.e., the query distribution. Please refer to Section 3.3 for a detailed discussion of FNR over data streams.

¹ When referring to the false positive rate over data streams, we simply use FPR, instead of the limit of FPR as n → ∞, unless discussing the convergence rate of FPR.

2.4 Problem Statement

We have overviewed the structures and analytical results of standard BF, LBF, and SBF. On static element sets, compared with the standard BF, which has been used for decades, LBF shows its advantage of reducing the memory cost [25]. This inspires us to devise a new learned Bloom filter structure for approximate membership queries over a dynamic element set (as shown in Definition 2).

Specifically, we consider two operations for such a data structure: insert which adds an element x to the filter, and query which returns the membership testing result of an element y using the filter. When applied to an element stream, such a learned filter is expected to 1) achieve the stable property (i.e., the FPR reaches a non-trivial value as n → ∞); and 2) consume less storage at a competitive FPR/FNR level compared with a non-learned filter optimized on data streams (e.g., the SBF).

3. STABLE LEARNED BLOOM FILTER

In this section, we present two data structures (Section 3.1 and Section 3.2) as well as a theoretical analysis (Section 3.3) to address the approximate membership testing problem for streaming data in the context of learned indexes.

3.1 Single SLBF

To make the original LBF framework stable after an unbounded number of insertions, an intuitive idea is to replace the backup filter in LBF, which is a standard BF, with a stable Bloom filter (Section 2.3). Such structure, as illustrated in Figure 5a, is referred to as single stable Learned Bloom filter (s-SLBF) where single means there is a single backup filter in such framework.

Definition 3. (s-SLBF) An s-SLBF can be represented as a triple (f, τ, SBF ) where f is a pre-trained learned oracle (i.e., classifier), τ is the corresponding decision threshold and SBF is a backup stable Bloom filter.

To insert a new element x (as shown in Figure 4c), the membership confidence f(x) is first calculated and compared with the threshold τ. If f(x) ≥ τ, which means the classifier already predicts x as a member, the insertion process terminates directly; otherwise, x is inserted into SBF. To query an element y (as shown in Figure 4d), a positive answer is returned if f(y) ≥ τ, or if f(y) < τ but SBF determines y to be positive.

Figure 5: Motivation and overview of our stable learned Bloom filter structures. (a) Illustration of s-SLBF: the learned oracle answers positive directly and forwards negative predictions to a single SBF. (b) Illustration of g-SLBF: the classification score range [0.0, 1.0], over which the positive and negative score distributions are drawn, is partitioned into intervals [0.0, τ1), [τ1, τ2), …, [τg−1, 1.0], each mapped to a sub-filter SBFj with parameters (mj, kj, Pj, Maxj).
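The s-SLBF insert and query logic can be sketched on top of a compact SBF. Everything named here is an assumption made for illustration: `SingleSLBF`, the compact `StableBloomFilter` (salted BLAKE2b hashing, 0-based cells), and the toy classifier, which merely stands in for a trained model.

```python
import hashlib
import random
from typing import Callable


class StableBloomFilter:
    """Compact SBF sketch: random decrements, then set K hashed counters to Max."""

    def __init__(self, m: int, K: int, P: int, max_val: int, seed: int = 0):
        self.m, self.K, self.P, self.max_val = m, K, P, max_val
        self.cells = [0] * m
        self._rng = random.Random(seed)

    def _positions(self, item: str):
        for k in range(self.K):  # assumption: salted BLAKE2b as the K hash functions
            d = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=k.to_bytes(8, "little")).digest()
            yield int.from_bytes(d, "little") % self.m

    def insert(self, item: str) -> None:
        for _ in range(self.P):
            idx = self._rng.randrange(self.m)
            if self.cells[idx] > 0:
                self.cells[idx] -= 1
        for pos in self._positions(item):
            self.cells[pos] = self.max_val

    def query(self, item: str) -> bool:
        return all(self.cells[pos] != 0 for pos in self._positions(item))


class SingleSLBF:
    """s-SLBF (f, tau, SBF): trust positive predictions, back up the rest."""

    def __init__(self, classifier: Callable[[str], float], tau: float,
                 sbf: StableBloomFilter):
        self.classifier, self.tau, self.sbf = classifier, tau, sbf

    def insert(self, x: str) -> None:
        if self.classifier(x) >= self.tau:
            return                      # already predicted as a member: nothing to store
        self.sbf.insert(x)

    def query(self, y: str) -> bool:
        return self.classifier(y) >= self.tau or self.sbf.query(y)


def toy_classifier(url: str) -> float:
    """Hypothetical stand-in for a trained model scoring URLs."""
    return 0.9 if url.endswith(".evil") else 0.1


slbf = SingleSLBF(toy_classifier, tau=0.5,
                  sbf=StableBloomFilter(m=1024, K=3, P=10, max_val=3))
slbf.insert("a.evil")       # skipped: the classifier already answers positive
slbf.insert("b.example")    # stored in the backup SBF
print(slbf.query("b.example"), slbf.query("unseen.evil"))
```

The second query illustrates the cost of trusting the classifier: any non-member scored above τ is a false positive that the backup filter cannot correct.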

Though very similar to the original LBF, the way in which the classifier in s-SLBF (i.e., f and τ) is obtained is intrinsically different. Recall that the classifier f used in the original LBF is trained over a binary dataset {(xi, yi = 1) | xi ∈ S} ∪ {(xi, yi = 0) | xi ∈ N}, where S is the element set used to build the filter, which is static, and N is the set of synthetic negative samples. In contrast, since s-SLBF is built before it is fed with the data stream, only prior knowledge, rather than the exact element set S, is available for training the classifier. This yields the fundamental difference in applicable scenarios between LBF and SLBF. Please refer to Appendix A for a detailed discussion.

It is clear that s-SLBF is stable after a substantial number of insertions if we assume the streaming elements follow the distribution used in training the classifier. Suppose the parameters used in SBF are m, K, P, and Max. Based on Eq. (3) and Eq. (6), the expected FPR of s-SLBF is given by

$$E[\mathrm{FPR}] = F_p + (1 - F_p) \cdot \left(1 - \left(\frac{1}{1 + K/P}\right)^{\mathit{Max}}\right)^{K}, \tag{7}$$

where Fp = Pr_{y∼DN}(f(y) ≥ τ) and DN is the distribution of non-members.

If the classifier performs well, at the same expected FPR level (at stable), s-SLBF is supposed to save more space compared with a pure SBF. We explain this advantage using the following example.

According to Eq. (6), for SBFs, the FPR at stable is insensitive to the number of counters m since m ≫ K. Without loss of generality, we assume Fp = 0.01 and Fn = 0.5 for the classifier of s-SLBF. We further pick SBF parameters P, K and Max such that (1 − (1/(1 + K/P))^{Max})^K ≈ 0.1.

Under such settings, according to Eq. (7), the FPR bound at stable of s-SLBF is 0.01 + 0.99 × 0.1 = 0.109 ≈ 0.1, which is at the same level as a pure SBF using identical parameters. Since the total storage cost of an SBF is m · ⌊log₂(Max) + 1⌋ where Max is fixed, the only factor influencing the total storage is the number of counters m. According to Eq. (5) and Eq. (6), for SBFs, increasing or decreasing m does not influence the FPR at stable (converged); however, m influences how fast the FPR converges to its stable point. Thus, for a fair comparison, we compare the storage cost of s-SLBF and SBF under similar stable FPR and convergence rate. Suppose the numbers of counters used by SBF and s-SLBF are m and m′, respectively. Let the two filters have the identical convergence rate; then we have the following equation,

$$\frac{K}{m}\left(1 - \frac{K}{m}\right)^{N} = \frac{K}{m'}\left(1 - \frac{K}{m'}\right)^{N \cdot F_n}. \tag{8}$$

Figure 6: Numerical simulation result of Eq. (8) by varying the number of insertions N and the false negative probability of the classifier (i.e., Fn). (x-axis: Fn from 0.300 to 0.500; y-axis: expected m′; one curve for each N ∈ {50000, 100000, 200000, 1000000, 10000000}.)

By setting m = 10⁶, K = 6, N = 10⁸, and Fn = 0.5, we can numerically solve for m′ ≈ 4.7 × 10⁴, which implies a 53% reduction in terms of total storage. Note that we ignore the space cost of the classifier in s-SLBF since it is usually much smaller than the counter array.

To further understand the relationship between m′ and Fn regarding Eq. (8), we vary Fn from 0.3 to 0.5 and N from 5 × 10⁴ to 10⁷, and show the value of m′ as a function of Fn in Figure 6. The gap between neighboring lines decreases as N increases, which demonstrates that the filter approaches the stable point when the number of insertions N becomes substantially large. We can observe an approximately linear relation between m′ and Fn, which is reasonable since Fn determines how many elements in the insertion stream “escape” from the classifier and are added to the backup SBF. A higher Fn of the classifier implies that more elements need to be inserted into the backup SBF, and thus more counters are required to retain a similar convergence rate. Note that the above simulation fixes other parameters like Fp to simplify the analysis and provide a general insight into the advantage of learned Bloom filters.
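Eq. (8) has no closed-form solution for m′, but it can be solved numerically. The sketch below works in log space to avoid underflow and uses bisection over (K, m), assuming 0 < Fn < 1. The parameter values (m = 10⁵, K = 6, N = 10⁶) are illustrative choices made here and are not meant to reproduce the numbers reported above; the function names are hypothetical.

```python
import math


def log_rate(m: float, K: int, n_inserts: float) -> float:
    """log of (K/m) * (1 - K/m)^n_inserts, evaluated in log space."""
    return math.log(K / m) + n_inserts * math.log1p(-K / m)


def solve_m_prime(m: float, K: int, N: float, fn: float) -> float:
    """Bisection for m' in (K, m) such that both sides of Eq. (8) agree."""
    target = log_rate(m, K, N)          # left-hand side of Eq. (8)
    lo, hi = K + 1e-9, float(m)         # right-hand side is below target at lo, above at hi
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if log_rate(mid, K, fn * N) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


m, K, N = 1e5, 6, 1e6                   # hypothetical parameters
for fn in (0.3, 0.4, 0.5):
    m_prime = solve_m_prime(m, K, N, fn)
    print(f"Fn={fn}: m' ~ {m_prime:.0f}, counters saved ~ {1 - m_prime / m:.1%}")
```

For large N the exponential terms dominate and the solution approaches m′ ≈ Fn · m, which is consistent with the roughly linear relation between m′ and Fn noted above.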

3.2 Grouping SLBF

As a straightforward extension, the s-SLBF has been shown to be stable using potentially lower storage than an SBF, as we expect. However, it also inherits a major drawback from the original LBF framework, i.e., trusting all the positive predictions made by the classifier. This over-reliance makes s-SLBF as well as the original LBF fragile when the classifier is not trustworthy. Such an effect can be explained using Eq. (7), where Fp, the false positive probability of the classifier, is a lower bound of the overall FPR.

Besides the over-reliance on the learned oracle, the single backup filter design also omits useful information provided by the learned oracle. Figure 5a illustrates how the classifier works in s-SLBF, from which we can find that both the false positives and the false negatives of the classifier come from the setup of a hard decision boundary. That is, negative (positive) elements falling on the right (left) side of the boundary are all categorized as positive (negative). This hard decision rule does not distinguish the confidence levels of elements falling on the same side, which is illustrated as follows.

Example 3. Suppose the decision threshold is set to τ = 0.7. There is no difference between elements with prediction scores 0.69 and 0.01: both are treated in the same way, i.e., fed to the same backup SBF. Similarly, elements with scores 0.71 and 0.99 are both directly judged as members without any further action on the backup filter.

From the analysis above, we realize that the major flaw of s-SLBF as well as the original LBF is the single backup filter design. To further improve s-SLBF, we introduce the second data structure, called grouping stable Learned Bloom filter (g-SLBF), which breaks the classification score (in range [0, 1]) into several intervals and allocates an independent sub-filter to each interval.

Definition 4. (g-SLBF) A Grouping SLBF (g-SLBF) consists of a classifier f and g heterogeneous SBFs (also known as “sub-filters”) SBF1, · · · , SBFg, where the j-th sub-filter SBFj is described by a tuple of parameters (mj, Kj, Pj, Maxj). A partition of the interval [0, 1] leads to g sub-intervals, i.e., [τ0 = 0, τ1], [τ1, τ2], · · · , [τg−1, τg = 1], which are used to map an element x to the SBFs according to its prediction value f(x). More specifically, a new element x is inserted into SBFj if f(x) ∈ (τj−1, τj] (as shown in Figure 4e). To test the membership of element y, we directly ask the sub-filter SBFj mapped from f(y) (as shown in Figure 4f).
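Definition 4 can be sketched with a binary search over the thresholds. The names below (`GroupingSLBF`, the compact `StableBloomFilter`, `toy_score`) are hypothetical, and two conventions are assumptions of this sketch: groups are 0-indexed, and a score of exactly 0 is routed to the first group so that the half-open intervals (τj−1, τj] cover all of [0, 1].

```python
import bisect
import hashlib
import random


class StableBloomFilter:
    """Compact SBF sketch: random decrements, then set K hashed counters to Max."""

    def __init__(self, m: int, K: int, P: int, max_val: int, seed: int = 0):
        self.m, self.K, self.P, self.max_val = m, K, P, max_val
        self.cells = [0] * m
        self._rng = random.Random(seed)

    def _positions(self, item: str):
        for k in range(self.K):  # assumption: salted BLAKE2b as the K hash functions
            d = hashlib.blake2b(item.encode(), digest_size=8,
                                salt=k.to_bytes(8, "little")).digest()
            yield int.from_bytes(d, "little") % self.m

    def insert(self, item: str) -> None:
        for _ in range(self.P):
            idx = self._rng.randrange(self.m)
            if self.cells[idx] > 0:
                self.cells[idx] -= 1
        for pos in self._positions(item):
            self.cells[pos] = self.max_val

    def query(self, item: str) -> bool:
        return all(self.cells[pos] != 0 for pos in self._positions(item))


class GroupingSLBF:
    """g-SLBF: route each element to the sub-filter of its score interval."""

    def __init__(self, classifier, thresholds, sub_filters):
        # thresholds = [tau_1, ..., tau_{g-1}], strictly increasing inside (0, 1)
        if len(sub_filters) != len(thresholds) + 1:
            raise ValueError("need exactly one sub-filter per score interval")
        self.classifier = classifier
        self.thresholds = list(thresholds)
        self.sub_filters = list(sub_filters)

    def _group(self, score: float) -> int:
        # bisect_left returns j with tau_j < score <= tau_{j+1} (0-indexed groups)
        return bisect.bisect_left(self.thresholds, score)

    def insert(self, x: str) -> None:
        self.sub_filters[self._group(self.classifier(x))].insert(x)

    def query(self, y: str) -> bool:
        return self.sub_filters[self._group(self.classifier(y))].query(y)


def toy_score(s: str) -> float:
    """Hypothetical stand-in for a trained classifier's confidence."""
    return (len(s) % 10) / 10.0


# Heterogeneous sub-filters: each interval gets its own (m, K, P, Max).
filters = [StableBloomFilter(256, 4, 8, 3, seed=0),
           StableBloomFilter(512, 3, 6, 3, seed=1),
           StableBloomFilter(1024, 2, 4, 1, seed=2)]
gslbf = GroupingSLBF(toy_score, thresholds=[0.3, 0.7], sub_filters=filters)
gslbf.insert("abcdefgh")           # score 0.8 goes to the last sub-filter
print(gslbf.query("abcdefgh"))
```

Unlike the s-SLBF sketch, no score range short-circuits to a positive answer here: every decision passes through some sub-filter.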

As illustrated in Figure 5b, the basic idea of g-SLBF is to partition elements in the insertion stream into several sub-groups based on the membership confidence given by the classifier. Intuitively, for those elements to be inserted with low membership confidence (i.e., located on the left side of the confidence distribution as shown in Figure 5b), since the classifier has a relatively high Fp in this range, we can adjust the corresponding sub-filter SBFj to compensate for the loss of FPR by setting K, P and Max appropriately. Besides, since Fn is low in this range, which means there would not be too many elements to be inserted into SBFj, fewer counters need to be allocated to achieve the desired FPR at stable with a satisfactory convergence speed. On the other hand, for elements with high confidence (i.e., located on the right side in Figure 5b), the FPR requirement of the SBF can be relaxed since elements in this range already have a high prior probability of being a member, and similarly, since the classifier might wrongly determine many member elements to be non-members (i.e., a high value of Fn), more counters are required to let SBFj converge to its stable point.

It is noteworthy that s-SLBF can be regarded as a special case of the g-SLBF by setting g = 2, i.e., only one decision threshold τ, and letting the sub-filter in range [τ, 1] always give positive answers. Compared with s-SLBF, where the positive predictions from the classifier are fully trusted, g-SLBF (g > 2) is more conservative towards the classifier output since all the membership decisions are jointly made by the classifier and the backup filters. Such a property makes the g-SLBF more robust against the quality of the classifier (e.g., when the incoming element stream does not strictly follow the same distribution as the training data). Note that the superiority in robustness of g-SLBF will be demonstrated both analytically and experimentally in the following sections.

3.3 Analytical Results

In this section, we analyze the FPR, FNR and convergence behavior of our two SLBF structures. Note that, we focus on g-SLBF since s-SLBF is a special case of g-SLBF, whose theoretical results naturally apply to s-SLBF. We first give some preliminary notations in the following.

For the $j$-th classification score interval $[\tau_{j-1}, \tau_j]$ in g-SLBF, we define two probabilities $p_j$ and $q_j$ as follows,

$$p_j = \Pr_{x \sim D_N}\big(f(x) \in [\tau_{j-1}, \tau_j]\big), \qquad q_j = \Pr_{x \sim D_P}\big(f(x) \in [\tau_{j-1}, \tau_j]\big), \tag{9}$$

where $D_N$ and $D_P$ are the distributions of non-members and members, respectively. The pair $(p_j, q_j)$ depicts the false positive and false negative behaviors of the classifier in the range $[\tau_{j-1}, \tau_j]$.

Note that it is generally hard to know the exact values of $p_j$ and $q_j$ as $D_N$ and $D_P$ are unknown. However, as shown in the parameter setting (Section 4), we use the test datasets to estimate $p_j$ and $q_j$.

In the analysis presented hereafter, we adopt the following two assumptions about the classifier and data, which are not hard to understand and have been adopted by existing learned Bloom filter works [12, 25, 30].

Assumption 1. The members and the non-members of the filter follow the distributions DP and DN, respectively.

Thus, the FPR of an SLBF is the expected false positive rate over the non-member distribution $D_N$.

Assumption 2. For $j = 1, \cdots, g$, it holds that $p_1 \ge p_2 \ge \cdots \ge p_g$ and $q_1 \le q_2 \le \cdots \le q_g$.

3.3.1 False Positive Rate and Stability

Suppose a sequence of $n$ elements $x_1, \cdots, x_n$, where $x_i \sim D_P$, has been inserted into a g-SLBF. Then, for a new query with an element drawn from the non-member distribution $D_N$, the expected FPR of this g-SLBF (at the stable state) is

$$E[\mathrm{FPR}] = \sum_{j=1}^{g} p_j \cdot \underbrace{\left(1 - \left(\frac{1}{1 + K_j / P_j}\right)^{Max_j}\right)^{K_j}}_{\text{denoted by } \alpha_j}. \tag{10}$$
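Eq. (10) is straightforward to evaluate numerically. The sketch below assumes the form $\alpha_j = (1 - (1/(1 + K_j/P_j))^{Max_j})^{K_j}$; the function names `stable_fpr` and `expected_fpr` are illustrative and not taken from the paper.

```python
def stable_fpr(K, P, Max):
    """alpha_j in Eq. (10): stable FPR of one SBF with parameters (K, P, Max)."""
    return (1.0 - (1.0 / (1.0 + K / P)) ** Max) ** K


def expected_fpr(p, params):
    """E[FPR] = sum_j p_j * alpha_j, where params is a list of (K_j, P_j, Max_j)."""
    return sum(pj * stable_fpr(K, P, Max) for pj, (K, P, Max) in zip(p, params))


# With K = 6, P = 12, Max = 1: alpha = (1 - 1/1.5)^6 = (1/3)^6 = 1/729.
alpha = stable_fpr(6, 12, 1)
```

Note that $\alpha_j$ depends only on $(K_j, P_j, Max_j)$ and not on the number of insertions, which is what makes the expected FPR stable.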

Suppose there are $g$ SBFs with $\alpha_j$'s satisfying $\alpha_1 \le \alpha_2 \le \cdots \le \alpha_g$. Then, considering the intervals depicted in Assumption 2, Lemma 1 describes how to allocate the SBFs to these intervals to minimize $E[\mathrm{FPR}]$, which also validates our discussion in Section 3.1.

Lemma 1. Allocating the filter with αj to the interval [τj−1, τj] for j = 1, · · · , g minimizes E[FPR].

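Proof. Following Assumption 2, $p_1 \ge \cdots \ge p_g$, while $\alpha_1 \le \cdots \le \alpha_g$, i.e., the two sequences are oppositely ordered. Then, according to the rearrangement inequality [20], for any other permutation $\alpha_{\sigma(1)}, \alpha_{\sigma(2)}, \cdots, \alpha_{\sigma(g)}$,

$$\sum_{j=1}^{g} p_j \cdot \alpha_{\sigma(j)} \ge \sum_{j=1}^{g} p_j \cdot \alpha_j = E[\mathrm{FPR}], \tag{11}$$

which proves this lemma.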
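Lemma 1 can be checked by brute force for small $g$. The following sketch enumerates all allocations of sub-filter FPRs to intervals and returns the one with the smallest $\sum_j p_j \alpha_j$; the helper name `best_allocation` is hypothetical, and the input values are the rounded numbers that appear later in Example 4.

```python
from itertools import permutations


def best_allocation(p, alphas):
    """Return the ordering of alphas that minimizes sum_j p_j * alpha_j."""
    return min(
        permutations(alphas),
        key=lambda perm: sum(pj * a for pj, a in zip(p, perm)),
    )


p = [0.485, 0.390, 0.125]          # non-increasing, as in Assumption 2
alphas = [0.0063, 0.0016, 0.0020]  # sub-filter FPRs in arbitrary order
best = best_allocation(p, alphas)
```

For these inputs the minimizer is the ascending order of the $\alpha_j$'s, i.e., the sub-filter with the smallest FPR is paired with the interval carrying the largest $p_j$, as the lemma states.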

Based on Lemma 1, we then prove an upper bound of the expected FPR of g-SLBF (at the stable state), which is free of $p_j$.

Theorem 1 (FPR Upper Bound). An upper bound of the expected FPR of g-SLBF at the stable state is the arithmetic mean of the FPRs of the $g$ sub-filters SBF$_1, \cdots,$ SBF$_g$, i.e., $E[\mathrm{FPR}] \le \frac{1}{g} \sum_{j=1}^{g} \alpha_j$.

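Proof. Since $p_1 \ge p_2 \ge \cdots \ge p_g$ and $\alpha_1 \le \alpha_2 \le \cdots \le \alpha_g$ are oppositely ordered, according to Chebyshev's sum inequality [20], it always holds that

$$E[\mathrm{FPR}] = \sum_{j=1}^{g} p_j \cdot \alpha_j \le g \cdot \left(\frac{1}{g} \sum_{j=1}^{g} p_j\right) \cdot \left(\frac{1}{g} \sum_{j=1}^{g} \alpha_j\right) = 1 \cdot \frac{1}{g} \sum_{j=1}^{g} \alpha_j = \frac{1}{g} \sum_{j=1}^{g} \alpha_j, \tag{12}$$

where the second-to-last step uses $\sum_{j=1}^{g} p_j = 1$, since the $g$ intervals partition $[0, 1]$. Thus we complete the proof.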
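As a quick numerical sanity check of Theorem 1 (an illustration only, not part of the original derivation), take the rounded values that appear later in Example 4: $p = (0.485, 0.390, 0.125)$, which sums to 1 and is non-increasing, and $\alpha = (0.0016, 0.0020, 0.0063)$, which is non-decreasing. Then

$$\sum_{j=1}^{3} p_j \alpha_j = 0.485 \cdot 0.0016 + 0.390 \cdot 0.0020 + 0.125 \cdot 0.0063 = 0.0023435 \;\le\; \frac{1}{3}\,(0.0016 + 0.0020 + 0.0063) = 0.0033,$$

so the bound holds for this instance with some slack.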


Recall that in Section 3.2, we argue that g-SLBF is more robust against the classifier quality than s-SLBF. As discussed earlier, s-SLBF (as well as the original LBF) adopts a single backup filter structure, which makes the classifier's FPR directly upper-bound the overall FPR as a consequence. However, according to Theorem 1, the FPR of g-SLBF is bounded by the arithmetic mean of the sub-filters' FPRs, which is independent of the classifier quality and of any specific distribution assumption. Note that this inequality holds under the assumption that $p_1 \ge p_2 \ge \cdots \ge p_g$, which generally holds if the dataset is "learnable" (see our validation of this assumption in Section 5.3). We also conduct an experimental study to validate the robustness claim by adding distortions to the distribution of the element streams. The results show that, under distribution distortion, the FPR of g-SLBF deteriorates much more slowly than that of s-SLBF. Please refer to Appendix E for more details.

3.3.2 Convergence Rate

The following theorem describes the convergence rate of g-SLBF, i.e., how fast the filter approaches its stable FPR.

Theorem 2 (g-SLBF Convergence). g-SLBF converges to its stable point, which is shown in Eq. (10), at a speed of $O(\exp(-C \cdot n))$, where $n$ is the total number of insertions and $C = \min_j \frac{q_j m_j}{K_j}$ for $j = 1, \cdots, g$.

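Proof. As introduced in Section 2.3, the rate of convergence for sub-filter SBF$_j$ is

$$\frac{K_j}{m_j}\left(1 - \frac{K_j}{m_j}\right)^{q_j \cdot n} = \frac{K_j}{m_j}\left(1 - \frac{K_j}{m_j}\right)^{\frac{K_j}{m_j} \cdot \frac{m_j q_j n}{K_j}} \approx O\!\left(\exp\!\left(-\frac{q_j}{k_j}\, n\right)\right). \tag{13}$$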

Clearly, the g-SLBF approaches its stable point if and only if SBF$_1, \cdots,$ SBF$_g$ are all stable. Thus, the overall convergence rate is that of the slowest sub-filter, i.e., $O\big(\exp(-n \cdot \min_j \frac{q_j m_j}{K_j})\big)$.

3.3.3 False Negative Rate

A false negative occurs when a negative answer is given to a query of a member, i.e., an element which has been inserted before. Similar to SBF, our g-SLBF allows a number of false negatives to achieve a bounded FPR (stable) using limited storage on unbounded data streams. To quantify the influence of false negatives for our data structure, we first review the FNR results of SBF, which is adopted by our g-SLBF as sub-filters.

Different from the FPR, which is determined only by the filter parameters, the FNR of SBF also depends on the characteristics of the input data stream and the query workload. Given an element $x_i$ in a data stream, let $\delta_i$ be the number of timestamps between the most recent insertion and the query of $x_i$, which is referred to as the "gap" of $x_i$. According to [13], for an inserted element $x_i$, the false negative probability is

$$\Pr(FN_i) = 1 - \prod_{j=1}^{K} \big(1 - \Pr(SBF[h_j(x_i)] = 0 \mid \delta_i)\big), \tag{14}$$

where $\Pr(SBF[h_j(x_i)] = 0 \mid \delta_i)$ is the probability that counter $SBF[h_j(x_i)]$ becomes 0 after $\delta_i$ new insertions. Note that, if $\delta_i < Max$, then $\Pr(SBF[h_j(x_i)] = 0 \mid \delta_i)$ is always 0 since it is impossible to decrease the counter to 0.

For our g-SLBF, which adopts a sequence of independent SBFs as sub-filters, suppose an element $x_i$ has been inserted into the $j$-th sub-filter SBF$_j$. According to Eq. (14), its false negative probability is

$$\Pr(FN_i \mid x_i \in SBF_j) = 1 - \big(1 - p_N(\delta_i, k_{ij})\big)^{K_j}, \tag{15}$$

where $p_N(\cdot)$ is the probability that one of the $K_j$ counters that $x_i$ is mapped to has been decremented to 0 after $\delta_i$ insertion operations on the filter (note that $\delta_i$ is the gap of $x_i$).

$p_N(\cdot)$ is a function of $\delta_i$ and $k_{ij}$, where $k_{ij}$ is the probability of a mapped counter being set to $Max_j$. Note that $k_{ij}$ is a random variable which depends on the occurrence frequency of each element in the insertion stream. That is to say, $p_N(\cdot)$ is different for each $x_i$, which makes it rather difficult to compute the FNR precisely, as we do not have prior knowledge of such insertion frequencies. On the other hand, the simulation results in [13] reveal that such frequency features have little impact on the overall FNR provided that the data stream is large enough ($n \to \infty$). Thus, in our work, we derive the expected FNR of g-SLBF by assuming that an element appears only once in the insertion data stream, i.e., there is no duplicate insertion.

Under the above assumption, the $k_{ij}$ take the same form for each $x_i$, namely $k_{ij} = k_j = \frac{1}{n_j}\big(1 + \sum_{l=1}^{n_j - 1} I_l\big)$, where $n_j$ is the number of elements already inserted into the filter SBF$_j$ and $I_l$ is a Bernoulli distributed random variable with $\Pr(I_l = 1) = K_j / m_j$. Consequently, the overall expected FNR of g-SLBF can be derived as

$$E[\mathrm{FNR}] = \sum_{j=1}^{g} \Pr(FN \mid x \in SBF_j) \cdot \Pr(x \in SBF_j) = \sum_{j=1}^{g} \Big(1 - \big(1 - p_N(\tilde{\delta}, k_j)\big)^{K_j}\Big) \cdot q_j, \tag{16}$$

where $\tilde{\delta}$ is the average gap of the data stream. The concrete evaluation of $p_N(\cdot)$ based on $\tilde{\delta}$ and $k_j$ can be found in Appendix C. Once $p_N$ and the filter parameters are determined, we can estimate the FNR using the equation above.
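The structure of Eq. (16) can be sketched as follows. Since the concrete form of $p_N(\cdot)$ is deferred to Appendix C, the sketch takes it as a caller-supplied function; the name `expected_fnr` and the constant stand-in for $p_N$ in the usage lines are hypothetical placeholders, not the paper's formula.

```python
def expected_fnr(q, K, k, avg_gap, p_n):
    """Eq. (16): sum_j q_j * (1 - (1 - p_N(avg_gap, k_j)) ** K_j).

    q[j]  : probability that a member falls into interval j
    K[j]  : number of hash functions of sub-filter j
    k[j]  : probability that a mapped counter is set to Max_j
    p_n   : callable (gap, k_j) -> probability that a counter has decayed to 0
    """
    return sum(
        qj * (1.0 - (1.0 - p_n(avg_gap, kj)) ** Kj)
        for qj, Kj, kj in zip(q, K, k)
    )


# Usage with a purely hypothetical constant p_N (for illustration only).
q = [0.090, 0.347, 0.563]
K = [6, 6, 5]
k = [0.001, 0.002, 0.003]
fnr = expected_fnr(q, K, k, avg_gap=100, p_n=lambda gap, kj: 0.01)
```

The two boundary cases behave as expected: if no counter ever decays ($p_N = 0$) the expected FNR is 0, and if every counter always decays ($p_N = 1$) it equals $\sum_j q_j = 1$.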

In summary, as a side effect, false negatives are inevitable for our g-SLBF to obtain stability over an unbounded number of insertions, which is similar to SBF [13]. Unlike the FPR, determining the FNR of g-SLBF relies on prior knowledge of both the insertion stream and the query stream (to calculate the gap value $\delta_i$ for each inserted element $x_i$). To tackle the false negative issue, in the following section, we devise a parameter setting strategy with the objective of minimizing the FNR while bounding the FPR by a user-given threshold. Detailed evaluation results presented in Section 5 demonstrate that our g-SLBF has a similar FNR compared with SBF but achieves a better FPR/storage trade-off, i.e., at the same FNR and FPR level, our proposed learned filter requires less storage.

3.3.4 Time Complexity

Both the insertion operation and the membership query using g-SLBF take $O(1)$ time. Supposing the model prediction takes $O(M)$ time, the time complexities of insertion and query processing are $O(M + \max_j(P_j + K_j))$ and $O(M + \max_j K_j)$, respectively. Since $M$, $P_j$ and $K_j$ are all user-specified constants, we conclude that insertion and membership testing using g-SLBF take constant time.

4. PARAMETER SETTING

In this section, we discuss how to properly set the parameters of g-SLBF according to the analytical results in Section 3.3. Again, without loss of generality, we focus on g-SLBF, and the parameter setting strategies can be naturally extended to s-SLBF.

Overview. The parameters include the number of groups $g$, the partition values $\tau_1, \cdots, \tau_{g-1}$, and the specification of each sub-filter, i.e., $(m_j, P_j, K_j, Max_j)$ for SBF$_j$. Users provide the desired $g$, an upper bound $\epsilon$ on the expected FPR, and a storage budget $B$ (i.e., number of bits). Then, we set the aforementioned parameters by minimizing the expected FNR (Eq. (16)) while keeping the expected FPR (Eq. (10)) within $\epsilon$ and the storage cost within $B$, similar to the setting of SBF [13].

Setting of $\tau_1, \cdots, \tau_{g-1}$. Given the total number of groups $g$, we uniformly partition the interval $[0, 1]$, leading to $[\frac{j-1}{g}, \frac{j}{g}]$ as $[\tau_{j-1}, \tau_j]$. The intervals are equally important in terms of our analysis in Section 3.3, so we simply adopt a uniform partition. For each decision interval, two test datasets are used to estimate its $p_j$ and $q_j$. Specifically, given the member and non-member sample sets $\tilde{S}_P$ and $\tilde{S}_N$, $p_j$ and $q_j$ can be estimated as

$$\hat{p}_j = \frac{|\{x \mid x \in \tilde{S}_N,\ f(x) \in [\tau_{j-1}, \tau_j]\}|}{|\tilde{S}_N|}, \qquad \hat{q}_j = \frac{|\{x \mid x \in \tilde{S}_P,\ f(x) \in [\tau_{j-1}, \tau_j]\}|}{|\tilde{S}_P|}. \tag{17}$$

Recall that we need to bound the overall expected FPR within $\epsilon$. According to Lemma 1, by assuming $p_j \cdot \alpha_j = C$ where $C$ is a constant, which implies an inversely proportional relationship between $p_j$ and $\alpha_j$, an upper bound of the FPR for each sub-filter SBF$_j$, denoted by $\alpha_j^{obj}$, is then derived as

$$\alpha_j^{obj} = \frac{\frac{1}{\hat{p}_j} \cdot \epsilon}{\sum_{l=1}^{g} \frac{1}{\hat{p}_l}}. \tag{18}$$
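Eqs. (17) and (18) can be sketched as below. The function names are illustrative, the binning uses the half-open convention $(\tau_{j-1}, \tau_j]$ from the insertion rule, and the FPR bound is written as `eps`, which is our reading of the symbol lost in the equation; every $\hat{p}_j$ is assumed to be strictly positive.

```python
import math


def estimate_interval_probs(scores, g):
    """Eq. (17): fraction of scores in each of g uniform intervals of [0, 1]."""
    counts = [0] * g
    for s in scores:
        # Score s falls into interval j (0-based) when s is in (j/g, (j+1)/g].
        j = max(math.ceil(s * g) - 1, 0)
        counts[min(j, g - 1)] += 1
    return [c / len(scores) for c in counts]


def fpr_targets(p_hat, eps):
    """Eq. (18): per-sub-filter FPR bound, inversely proportional to p_hat_j."""
    inv_sum = sum(1.0 / pj for pj in p_hat)
    return [eps / (pj * inv_sum) for pj in p_hat]


# Values from Example 4: p_hat = (0.485, 0.390, 0.125), FPR bound 0.01.
targets = fpr_targets([0.485, 0.390, 0.125], eps=0.01)
rounded = [round(t, 4) for t in targets]
```

With the inputs of Example 4, rounding to four decimals gives 0.0016, 0.0020 and 0.0063, which agrees with the bounds reported there.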

Setting of $K_j$ and $Max_j$. We then determine the values of $K_j$ and $Max_j$ by minimizing the expected FNR (Eq. (16)). According to the observation in [13], the optimal or near-optimal value of $K_j$ is determined mainly by $Max_j$ and the FPR bound $\alpha_j^{obj}$, and is insensitive to $m_j$ and the input data stream. The optimal $Max_j$ is related to the average gap value of the data stream. Besides, $Max_j$ should not be too large, as a large $Max_j$ leads to a significantly larger $P_j$ (counter decrements) during query and thereby low query efficiency. Thus, we search $K_j$ and $Max_j$ in the space $K_j = 1, \cdots, 10$ and $Max_j \in \{1, 3, 7\}$ (corresponding to #bits per counter $\in \{1, 2, 3\}$). We enumerate all possible combinations of $(K_j, Max_j)$ and pick the pair that minimizes the estimated FNR of SBF$_j$ (using Eq. (15) with a presumed average gap value).

Setting of $P_j$. With $K_j$, $Max_j$ and $\alpha_j^{obj}$, $P_j$ can be solved w.r.t. Eq. (6) as follows,

$$P_j = \frac{K_j}{\left(1 - \big(\alpha_j^{obj}\big)^{1/K_j}\right)^{-1/Max_j} - 1}. \tag{19}$$

Figure 7: Simulated FNR for each sub-filter in Example 4 with $\delta = 100$, $m = 10^5$ and $Max = 1$ (x-axis: $K$; y-axis: estimated FNR; curves: SBF 1, SBF 2, SBF 3).

Setting of $m_j$. Finally, to set the number of counters $m_j$ for SBF$_j$ w.r.t. the total bit budget $B$, as we have discussed in Section 5b, we require all the sub-filters to converge at a similar speed. Thus, according to Theorem 2, we have $\frac{q_j m_j}{K_j} = W$ where $W$ is a constant. To bound the total number of bits used, $m_j$ can be determined by

$$m_j = \frac{\frac{K_j}{\hat{q}_j} \cdot B}{\sum_{l=1}^{g} \frac{K_l}{\hat{q}_l} \cdot \lfloor \log_2(Max_l) + 1 \rfloor}. \tag{20}$$

Example 4. (Parameter Setting) Suppose that $g = 3$, $B = 16{,}384$, $\epsilon = 0.01$, and we are given a classifier with $\hat{p}_1 = 0.485$, $\hat{p}_2 = 0.390$, $\hat{p}_3 = 0.125$ and $\hat{q}_1 = 0.090$, $\hat{q}_2 = 0.347$, $\hat{q}_3 = 0.563$. The FPR upper bounds can then be calculated using Eq. (18) as $\alpha_1^{obj} = 0.0016$, $\alpha_2^{obj} = 0.0020$, $\alpha_3^{obj} = 0.0063$. To find the optimal $K_j$ and $Max_j$, we presume a gap value of 100 and compute the FNR over different pairs of $(K_j, Max_j)$, as shown in Figure 7. Note that we only show the case of $Max = 1$ due to limited space. Figure 7 suggests that the optimal settings of $(K_j, Max_j)$ for $j = 1, 2, 3$ w.r.t. FNR are $(6, 1)$, $(6, 1)$ and $(5, 1)$, respectively. Then, according to Eq. (19), all $P_j$'s are set to 12, and according to Eq. (20), the numbers of bits allocated to the sub-filters are 11,764, 3,054, and 1,566, respectively.
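The counter allocation of Eq. (20) can be reproduced approximately with the short sketch below. The function name `allocate_counters` is hypothetical; the inputs are the rounded $\hat{q}_j$ values printed in Example 4, so the result is only expected to be close to, not identical with, the figures quoted there.

```python
import math


def allocate_counters(K, q_hat, Max, B):
    """Eq. (20): counters m_j proportional to K_j / q_hat_j under bit budget B."""
    # Bits per counter: floor(log2(Max_l) + 1), e.g. 1 bit for Max = 1.
    bits = [math.floor(math.log2(mx) + 1) for mx in Max]
    denom = sum(Kl / ql * bl for Kl, ql, bl in zip(K, q_hat, bits))
    return [(Kj / qj) * B / denom for Kj, qj in zip(K, q_hat)]


# Example 4: K = (6, 6, 5), Max = (1, 1, 1), B = 16,384 bits.
m = allocate_counters(
    K=[6, 6, 5], q_hat=[0.090, 0.347, 0.563], Max=[1, 1, 1], B=16384
)
```

The returned values are real-valued and sum to the budget $B$ when every counter uses one bit. They land within a few counters of 11,764, 3,054 and 1,566; the small gap is presumably due to the rounding of $\hat{q}_j$ in the text and to integer rounding of $m_j$.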

5. EXPERIMENTAL STUDY

In this section, we report the implementation details and experimental results on datasets from real-world applications. All the experiments were conducted on an Ubuntu laptop with an Intel(R) Core(TM) i7-8550U CPU @ 1.99GHz and 16GB memory, and all the methods are implemented in C and compiled using GCC with -O3 optimization.

5.1 Baselines and Implementation Details

To show the effectiveness of our proposed data structures, we implement and compare five filters: the standard Bloom filter (BF), the stable Bloom filter (SBF), the original learned Bloom filter (LBF), the single SLBF (s-SLBF) and the grouping SLBF (g-SLBF).

BF and LBF. BF and LBF are the baselines in this experiment, used to show how non-stable filters behave over streaming data. The BF implementation follows the most standard space-optimal Bloom filter scheme [9], where the number of hash functions is always set to the optimum. For LBF, we use relatively simple models like gradient boosting trees for efficiency in the streaming scenario, instead of the deep learning models used in [25] (details are discussed later).

SBF, s-SLBF and g-SLBF. These three filters achieve stability in the scenario of streaming data. The parameters of SBF and s-SLBF & g-SLBF are set according to [13] and our discussion in Section 4, respectively.

Hash Function Implementation. All filters require the computation of $K$ hash values. We adopt xxHash [1], which is an extremely fast, high-quality non-cryptographic hashing scheme. In addition, instead of computing $K$ independent hash values exactly, we use the speedup suggested in [23], where only two independent hash values $h_a$, $h_b$ are calculated and the $j$-th hash value is given by $h_a + j \cdot h_b$ for $j = 1, \cdots, K$.
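The two-hash speedup can be sketched as follows. This is not the paper's C code: to keep the example self-contained, BLAKE2b from Python's standard `hashlib` stands in for xxHash, and the reduction modulo the number of counters `m` is our assumption about how hash values are mapped to counter positions. The name `hash_positions` is illustrative.

```python
import hashlib


def hash_positions(key: bytes, K: int, m: int):
    """Derive K counter positions from two base hashes: (ha + j * hb) mod m."""
    # One 16-byte digest is split into two 64-bit base hash values.
    digest = hashlib.blake2b(key, digest_size=16).digest()
    ha = int.from_bytes(digest[:8], "little")
    hb = int.from_bytes(digest[8:], "little")
    return [(ha + j * hb) % m for j in range(1, K + 1)]


positions = hash_positions(b"example-key", K=6, m=11764)
```

Only one digest computation is needed per element regardless of $K$, which is the point of the speedup; consecutive positions differ by the constant stride $h_b \bmod m$.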

Classifier Implementation. In the first paper on learned indexes [25], deep learning models, i.e., neural networks, are suggested for constructing learned data structures. Specifically, they suggest a Recurrent Neural Network (RNN) for learned Bloom filters. Though deep models perform better on many tasks, considering the real-time requirement of membership query processing over data streams, commonly used deep learning platforms, like Tensorflow and PyTorch, are too heavy to be deployed. Though the inference efficiency issue can be alleviated by using a GPU, migrating data between CPU and GPU might become a new bottleneck. Since this is not a paper introducing new machine learning schemes, in pursuit of efficiency, we test and compare three lightweight models: logistic regression, support vector machine and gradient boosting tree (GBT) based on Catboost [3]. We found that GBT classifiers perform well enough on all three of our real-world tasks considering classification quality (e.g., AUC), storage cost, and inference efficiency (details are discussed later). We therefore adopt GBT as the classifier for both LBF and SLBF in our experiments.

5.2 Datasets, Parameters and Metrics

To demonstrate the effectiveness and efficiency of our stable learned Bloom filters, we test all five filters on three real-world datasets: Amazon, Attack and Higgs. We briefly introduce each dataset as follows, and the statistics of these datasets are summarized in Table 2.

Task 1: Amazon [2]. This dataset consists of resource access records from Amazon employees collected from 2010 to 2011 in which employees are allowed or denied access to resources over time. Each record contains a unique ID and 10 features which are used to build the classifiers.

Task 2: Attack [5]. This is a dataset of web attack traces. A total of 23 features are extracted including packet source/destination and traffic statistics.

Task 3: Higgs [4]. The Higgs data is a scientific dataset which asks for classifying whether a signal process produces Higgs bosons or not. There are in total 28 kinematic features obtained through particle detectors.

Table 2: Statistics of datasets.

Name      #Samples      #Positives    #Negatives
Amazon    91,690        86,382        5,308
Attack    2,278,689     923,216       1,355,473
Higgs     11,000,000    5,829,597     5,170,403

For each dataset, a small portion, specifically 20%, is sampled to obtain a pre-trained classifier and to estimate parameters such as $p_j$ and $q_j$ (Eq. (9)), and the remaining 80% of the data is used to generate the insertion and query streams. We train the classifier for each task using the gradient boosting tree model, and the model information is shown in Table 3.

Insertion Workloads. All positive samples in a dataset are regarded as members, and negative samples are regarded as non-members. For a dataset, the corresponding insertion workload is the sequence of positive samples. Note that, when inserting elements into non-learned filters like BF and SBF, we only insert the unique identifier into the filter, and the features associated with the element are discarded. Similarly, for learned filters like LBF, s-SLBF and g-SLBF, the features are used to calculate the membership confidence score using the classifier, and only identifiers are inserted into the filter if necessary.

Table 3: Classifier information.

Dataset   Training Time   AUC    Inference Throughput   Storage
Amazon    0.27 s          0.87   9 Mops/s               173 KBits
Attack    3.49 s          0.91   11 Mops/s              328 KBits
Higgs     44.8 s          0.82   11 Mops/s              215 KBits

Query Workloads. We need query workloads to evaluate the performance of all filters. According to the analysis in Section 3, FNR and FPR are measured on members and non-members, respectively. Besides, FNR is also affected by the time gap between a member element being inserted into the filter and being queried. Thus, for each dataset, given a gap value $\delta$, we query each element $x_i$ inserted into the filter after $\delta$ insertions of other elements to measure the FNR. After all elements have been inserted (from the positive sample set of each task), we query the filter using the negative sample set to measure the FPR. The empirical FNR and FPR under the query workloads are then calculated as EFNR = #false negatives / #positive samples and EFPR = #false positives / #negative samples. Note that, in the results reported in this section, we fix $\delta$ at 2,000, which is a reasonable value for real-world applications. We also report results with varying $\delta$; they show a clear increasing tendency of FNR as $\delta$ increases for both SBF and our SLBF. This is reasonable since a higher $\delta$ increases the likelihood of a counter in the backup filter being decremented to 0, which leads to false negatives. Please refer to Appendix D for more information.
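The measurement protocol described above can be sketched as a small harness. The `SBF` class is a simplified stand-in (our reading of the stable Bloom filter of [13], with SHA-256 hashing as a convenience), and `run_workload` is a hypothetical helper name. One deviation from the text is labeled in the code: EFNR is normalized by the number of members actually queried, since the last `gap` members never accumulate enough later insertions.

```python
import hashlib
import random
from collections import deque


class SBF:
    """Simplified stable Bloom filter (assumed update rule, see lead-in)."""

    def __init__(self, m, K, P, Max, seed=0):
        self.m, self.K, self.P, self.Max = m, K, P, Max
        self.cells = [0] * m
        self.rng = random.Random(seed)

    def _positions(self, key):
        digest = hashlib.sha256(key.encode()).digest()
        ha = int.from_bytes(digest[:8], "big")
        hb = int.from_bytes(digest[8:16], "big") | 1
        return [(ha + j * hb) % self.m for j in range(1, self.K + 1)]

    def insert(self, key):
        start = self.rng.randrange(self.m)
        for i in range(self.P):  # decrement P adjacent counters
            idx = (start + i) % self.m
            if self.cells[idx] > 0:
                self.cells[idx] -= 1
        for idx in self._positions(key):  # then set K counters to Max
            self.cells[idx] = self.Max

    def query(self, key):
        return all(self.cells[idx] > 0 for idx in self._positions(key))


def run_workload(filt, members, non_members, gap):
    """Return (EFNR, EFPR): query each member after `gap` later insertions."""
    pending = deque()
    false_neg = queried = 0
    for key in members:
        filt.insert(key)
        pending.append(key)
        if len(pending) > gap:
            # The oldest pending key has now seen exactly `gap` later insertions.
            queried += 1
            false_neg += not filt.query(pending.popleft())
    false_pos = sum(filt.query(key) for key in non_members)
    # Deviation from the text: normalize by queried members, not all positives.
    efnr = false_neg / queried if queried else 0.0
    efpr = false_pos / len(non_members)
    return efnr, efpr


members = [f"m{i}" for i in range(2000)]
non_members = [f"n{i}" for i in range(2000)]
efnr, efpr = run_workload(SBF(m=4096, K=3, P=6, Max=3), members, non_members, gap=200)
```

With `gap=0` every member is queried immediately after its own insertion, so no false negative can occur; larger gaps give the decrements more time to clear a member's counters.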

Control Variables. Three parameters, the desired FPR upper bound $\epsilon$, the total storage budget (#bits) $B$, and the number of groups $g$ for g-SLBF, are varied to evaluate the robustness of the filters. Table 4 summarizes the parameter settings for each dataset, where the underlined values are regarded as default values.

Table 4: Parameter setting.

Parameter   Values
ε           0.5%, 1%, 5%, 10%, 20%
g           2, 4, 6, 8, 10
B           Amazon & Attack: 2^14, 2^16, 2^18, 2^20, 2^24
            Higgs: 2^20, 2^22, 2^24, 2^26, 2^28

5.3 Experimental Results

Validation of Assumptions. In Section 3.3, to analyze the performance of our SLBF, we make the assumption that $p_1 \ge p_2 \ge \cdots \ge p_g$ and $q_1 \le q_2 \le \cdots \le q_g$ for a well-trained classifier. To validate this assumption, we use the positive and negative sets of the Attack and Higgs datasets to calculate the corresponding classification scores and draw the histograms shown in Figure 10, from which we can verify our assumption. Besides, the score histogram can also be used to guide the setting of $g$, since we can keep partitioning the interval until the monotonic relationship no longer holds.

Validation of Stability. To verify the stability of the proposed filters, we test g-SLBF using the Amazon dataset by setting $g = 6$, $\epsilon = 10\%$ and varying $B$ in the range $2^{10}, \cdots, 2^{18}$. Specifically, we measure the empirical FPR using the negative sample set after every 4,000 new insertions and plot the results in Figure 11. We can find that EFPR grows