Chapter 4 Adaptive Cuckoo Filters
4.5 Negative Cache
The negative cache is a small hash table storing the true negative keys in full representation.
The larger the hash table, the more negative keys it can hold, which avoids repeated false positive queries for the same keys and hence lowers the overall false positive rate. This thesis does not discuss the tradeoff between the size of the negative cache and the predefined memory budget; it only shows that a small negative cache is effective at lowering the false positive rate under a skewed workload.
We adopt the second-chance strategy to decide when to insert a new false positive key into the negative cache. When a query turns out to be a false positive, we do not insert the key into the negative cache immediately. Instead, we insert a false positive key into the negative cache only when the queried bucket has already seen a false positive query before. Because a skewed workload makes some buckets see more false positives than others, the second-chance strategy prevents the more-queried false positive keys from being replaced too often by less-queried ones.
As described in Chapter 4.3.2, the negative cache is treated as a victim cache, which is queried only when needed, in contrast to a regular cache that sits in the topmost position and must be queried on every lookup. We therefore do not have to query the negative cache on every lookup, which speeds up the lookup process. We query the negative cache only when the queried bucket in a filter has seen false positive queries before.
As described above, we need to know whether a bucket of a filter has encountered false positive queries before. Therefore, we use one bit per bucket to save this information. We pack this bit together with the other 4 bits described in Chapter 4.2 into the extra bits, as shown in Figure 3.
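The interplay between the per-bucket bit, the second-chance strategy, and the victim-cache placement can be sketched as follows. This is a minimal Python illustration, not the C++ implementation used in this thesis: the class ToyFilterWithNegativeCache, its single-table fingerprint filter, the direct-mapped cache layout, the stored set standing in for disk, and all sizes are assumptions made for the example.

```python
import hashlib


def _h(key: str, salt: str) -> int:
    """Deterministic hash so the sketch behaves the same on every run."""
    digest = hashlib.sha256((salt + key).encode()).digest()
    return int.from_bytes(digest[:8], "big")


class ToyFilterWithNegativeCache:
    """Toy fingerprint filter plus a victim-style negative cache (hypothetical)."""

    def __init__(self, num_buckets=8, fp_bits=4, cache_slots=4):
        self.num_buckets = num_buckets
        self.fp_bits = fp_bits
        self.buckets = [set() for _ in range(num_buckets)]  # short fingerprints
        self.had_fp = [False] * num_buckets  # the one extra bit per bucket
        self.cache = [None] * cache_slots    # full keys of known negatives
        self.stored = set()                  # stands in for the on-disk data
        self.cache_probes = 0
        self.disk_accesses = 0

    def _bucket(self, key):
        return _h(key, "bucket") % self.num_buckets

    def _fingerprint(self, key):
        return _h(key, "fp") % (1 << self.fp_bits)

    def _slot(self, key):
        return _h(key, "cache") % len(self.cache)

    def insert(self, key):
        self.stored.add(key)
        self.buckets[self._bucket(key)].add(self._fingerprint(key))

    def maybe_contains(self, key):
        """Filter-only answer; may be a false positive."""
        return self._fingerprint(key) in self.buckets[self._bucket(key)]

    def lookup(self, key):
        """Return True only if the key is really stored."""
        if not self.maybe_contains(key):
            return False  # the filter never produces false negatives
        b = self._bucket(key)
        # Victim-cache placement: probe the cache only for flagged buckets.
        if self.had_fp[b]:
            self.cache_probes += 1
            if self.cache[self._slot(key)] == key:
                return False  # known negative, no disk access needed
        self.disk_accesses += 1  # simulated disk access to verify the key
        if key in self.stored:
            return True
        # False positive. Second chance: cache the key only if this bucket
        # has already seen a false positive; otherwise just set its bit.
        if self.had_fp[b]:
            self.cache[self._slot(key)] = key
        else:
            self.had_fp[b] = True
        return False
```

In this sketch, the first false positive for a key only sets the bit of its bucket, the second one places the key in the negative cache, and from the third lookup on the key is answered from the negative cache without a simulated disk access, as long as no other key has overwritten its slot.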
Chapter 5
Experiments
This chapter presents the results of the experiments that we conducted with our implementation of ACF, using a synthetic benchmark and a real workload from a BUSC server. We compare our method with three other methods (Cuckoo filter, Bloom filter, and split Bloom filters) to study the false positive rate and the performance under changing workload patterns. We also show the difference in lookup performance between ACF and Cuckoo filter. Specifically, this chapter shows the results of the following two ACF variants:
ACF: The complete design presented in Chapter 4.
ACF no-NC: The ACF without the small negative cache (NC); all other components remain the same.
5.1 Benchmark Environment
To study ACF, we ran a series of synthetic workloads using the two ACF variants and three other types of filters: Cuckoo filter (CF), Bloom filter (BF), and split Bloom filters (SBF).
The keys in the benchmark database are uniformly distributed over the domain. This is a natural choice because ACF uses hashing to decide a key’s location, which leads to a uniform distribution anyway. The domain size of the inserted keys and queried keys is 2^24.
For Experiments 1 and 2, we studied two different query workloads: Zipf and Uniform. All queries were generated randomly within the domain and run sequentially. For the Zipf distribution, we study 9 different degrees of skew by varying the distribution parameter a from 1.1 to 1.9. The larger a is, the more skewed the distribution is, and the higher the frequency of the most-queried key. We also studied a real workload from a BUSC server that is also skewed.
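A skewed query stream of this kind could be generated as in the sketch below. The thesis does not state which generator was used for Experiments 1 and 2, so the function zipf_queries, the truncation of the Zipf distribution to a finite domain, and the random mapping of ranks to keys are assumptions for illustration only.

```python
import random
from itertools import accumulate


def zipf_queries(num_queries, domain_size, a, seed=0):
    """Draw keys from [0, domain_size) with truncated Zipf(a) popularity.

    Hypothetical helper: rank r (1-based) gets weight 1 / r**a, and ranks are
    mapped to keys through a random permutation so that hot keys are spread
    over the whole domain.
    """
    rng = random.Random(seed)
    cum_weights = list(
        accumulate(1.0 / (rank ** a) for rank in range(1, domain_size + 1))
    )
    keys = list(range(domain_size))
    rng.shuffle(keys)  # rank -> key mapping
    ranks = rng.choices(range(domain_size), cum_weights=cum_weights, k=num_queries)
    return [keys[r] for r in ranks]
```

A larger value of a concentrates more of the queries on the top-ranked key, which matches the description of skew above. For a domain as large as the one used in the experiments, building the cumulative weight list once and reusing it across runs would be preferable.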
For Experiments 3 to 6, we use YCSB’s Zipf distribution with default parameters. Each data point in the figures is the average of 10 independent workload runs.
All the experiments were conducted in the same way. First, we insert 200,000 distinct keys into ACF, CF, BF and SBF. Then we run 1,000,000 randomly generated queries according to the distributions stated above to measure the false positive rate. For lookup speed, we compare ACF to CF only. The false positive rate is defined as the number of false positives divided by the total number of queried keys that are not in the filter. To compare these 4 types of filters fairly, we make them use the same amount of memory. Specifically, we first insert the 200,000 keys into the ACF and record its used memory, which is then set as the memory budget for all 4 types of filters. For Cuckoo filter and Bloom filter, once the memory budget is set, the filter’s size is fixed. For ACF, the used memory will occasionally be slightly higher than the memory budget, but our shrinking algorithm brings it back below the budget. For SBF, the used memory is dynamic and always below the budget.
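The false positive rate defined above can be computed as in the following sketch. The function name false_positive_rate and its parameters are hypothetical; filter_contains stands for the membership test of whichever filter is being measured.

```python
def false_positive_rate(filter_contains, stored_keys, queries):
    """False positives divided by the number of queried keys not stored.

    filter_contains: callable returning True if the filter reports the key.
    stored_keys: set of keys that were actually inserted (ground truth).
    queries: iterable of queried keys, repeats included.
    """
    negatives = 0
    false_positives = 0
    for q in queries:
        if q in stored_keys:
            continue  # positive queries do not count toward the rate
        negatives += 1
        if filter_contains(q):
            false_positives += 1
    return false_positives / negatives if negatives else 0.0
```

Note that the denominator counts only queries for keys that were never inserted, so a workload with many positive queries does not dilute the rate.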
5.2 Software/Hardware Settings
We ran all of the experiments on a machine with a 6-core Intel processor (3.40 GHz) and 64 GB of main memory, running CentOS version 6.7. We implemented ACF based on the original C++ implementation by the author of Cuckoo filter. For Bloom filter, we use a Python package, pybloom, and revised it so that it can use a specified amount of memory. We implemented split Bloom filters in Python based on pybloom. The split Bloom filters compared here are based on the design in [12], in which all the Bloom filters are resized to their theoretical optimal size every 10,000 lookups. We measured the execution time of ACF and Cuckoo filter using the gettimeofday() function.
5.3 Experiment 1: False Positives
5.3.1 Synthetic Workload
Figures 4a and 4b show the false positive rate of the two ACF variants and the 3 other types of filters under the Zipf distribution with different distribution parameters. We show two different settings of bits per item: 8 bits and 12 bits. The main competitor of ACF is split Bloom filters. Each point in the figures is the average of 10 independent experiments.
In Figures 4a and 4b, it is clear that ACF and SBF adapt to skewed workloads gracefully, as they are designed to: the more skewed the workload is, the better the precision is.
In Figure 4a, SBF has the smallest false positive rate at every degree of skew. It is reasonable that our approach is not the best here because, according to (1) and (2), Cuckoo filter can only outperform Bloom filter in false positive rate when the bits per item is over 10 bits. Accordingly, in Figure 4b, where the bits per item is 12, ACF is the clear winner at every degree of skew. Furthermore, the “ACF no-NC” variant is always worse than ACF, which shows that a small negative cache alone brings a further improvement and demonstrates the effectiveness of the negative cache described in Chapter 4.5.
Figure 6 shows the results under uniform workloads when the bits per item is 12. ACF has a higher false positive rate than Cuckoo filter because there is no skew for it to adapt to. However, we can easily make ACF non-adaptive by turning off the “grow” and “shrink” operations and not using the negative cache, which achieves the same precision as Cuckoo filter.
(a) Average bits per item = 8
(b) Average bits per item = 12
Figure 4. Zipf Workload, Vary Distribution Parameter
5.3.2 Real Workload
Figure 5 shows the results of the experiment with the real workload from the Safe Box server, which uses BUSC as its database engine. This workload is a query log captured over a short period of time, consisting of 200,000 insertions and 8,000,000 lookups from multiple users. The workload is also skewed because some objects are requested more often than others. In Figure 5, ACF again outperforms the other types of filters. Moreover, using a small negative cache halves the false positive rate, from 0.06% to 0.03%.
Figure 5. BUSC Workload
Figure 6. Uniform Workload
5.4 Experiment 2: Speed
In Figure 7, we can see that for both lookup and insertion speed, ACF shows a small performance degradation (7.4%~10.5%) compared with cuckoo filter. ACF is slower because of the added complexity of indexing the queried cuckoo filter, maintaining the extra bits, and using the negative cache. However, this performance penalty is worthwhile because ACF saves many unnecessary disk accesses, which would otherwise severely lower the overall read throughput.
In Figure 5, for example, ACF has a 4x lower false positive rate than cuckoo filter.
Figure 7. Lookup/Insert Speed
5.5 Experiment 3: Grow/Shrink Buckets
In this experiment, our goal is to determine how many buckets to grow/shrink at a time.
Figure 8 shows the results of growing/shrinking different numbers of buckets with a limited or unlimited memory budget. In the limited case, the memory budget is set to the amount of memory used after inserting all keys into the ACF. In Figure 8a, as the number of buckets grown/shrunk at a time increases, the number of shrink operations also increases steadily because it becomes easier to exceed the tight memory budget. The more ACF shrinks, the higher its false positive rate. We can observe that when growing/shrinking more than 13 buckets at a time, the false positive rate is higher than not growing/shrinking at all (red line). In Figure 8b, we set the memory budget of ACF to be infinite, so ACF only grows and never shrinks. The false positive rate does not decrease as we grow more buckets.
Therefore, we can view growing more buckets of a filter as rehashing: it merely prevents the same false positive key from recurring. It is still possible that a filter will encounter multiple different false positive keys. However, since ACF uses many filters, the probability that multiple high-frequency keys fall into the same filter is small. Even if this happens, ACF can adapt to it quickly.
In conclusion, we choose to grow/shrink 1 bucket at a time. In other words, we adjust ACF at the finest granularity, which leads to the best performance.
(a) Limited Memory Budget
(b) Unlimited Memory Budget
Figure 8. Limited/Unlimited Memory Budget, Vary Grow/Shrink Buckets
5.6 Experiment 4: Grow Threshold
In this experiment, we aim to choose appropriate values for the thresholds fp_threshold and neg_threshold in Algorithm 3. The larger the threshold value is, the less frequently ACF grows. We do not want to grow too much because every grow operation causes a disk access, and it might trigger more shrinking, which also causes disk accesses. In contrast, if we grow too conservatively, we cannot adapt to the skewed pattern properly. Thus, we need to strike a good balance between the two extremes.
Figure 9 shows the results of tuning the two threshold values. We show two lines: the false positive rate and the total disk accesses. The total disk accesses include those caused by false positives and those caused by growing/shrinking. In Figure 9a, we fix neg_threshold to 60 and vary fp_threshold from 1 to 40. We can observe that when fp_threshold = 10, the false positive rate is at its minimum and the total disk accesses stop decreasing. Thus, we choose 10 as the value of fp_threshold.
In Figure 9b, we fix fp_threshold to 10 and vary neg_threshold from 10 to 340. Although the false positive rate increases linearly as neg_threshold increases, the total disk accesses are at the minimum when neg_threshold = 60. Thus, we choose 60 as the value of neg_threshold.
The reasoning behind choosing the right thresholds is that we want ACF to react at the best time, keeping the false positive rate low without causing so much growing/shrinking that it offsets the benefit of the low false positive rate.
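The selection criterion used in this experiment can be expressed as a small sketch. It assumes that every false positive and every grow or shrink operation costs exactly one disk access, and that ties in total disk accesses are broken by the lower false positive rate. The names Trial and pick_threshold are hypothetical, and the measurements themselves would have to come from actual runs such as those plotted in Figure 9.

```python
from typing import NamedTuple


class Trial(NamedTuple):
    """Measurements for one candidate threshold value (hypothetical record)."""
    threshold: int
    false_positives: int   # assumed: one disk access per false positive
    resizes: int           # assumed: one disk access per grow or shrink
    negative_queries: int  # queried keys that are not in the filter

    @property
    def total_disk_accesses(self):
        return self.false_positives + self.resizes

    @property
    def false_positive_rate(self):
        return self.false_positives / self.negative_queries


def pick_threshold(trials):
    """Pick the threshold with the fewest total disk accesses."""
    best = min(
        trials,
        key=lambda t: (t.total_disk_accesses, t.false_positive_rate),
    )
    return best.threshold
```

A very low threshold tends to inflate the resizes term, while a very high one tends to inflate the false_positives term, so minimizing the sum captures the balance described above.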
(a) False Positive Rate Threshold
(b) Negative Threshold
Figure 9. Vary Thresholds (fp_threshold, neg_threshold) in Algorithm 3
5.7 Experiment 5: Negative Cache
In this experiment, we aim to choose a proper size for the negative cache and to show the advantages of adopting the second-chance strategy and of treating the negative cache as a victim cache. We turn off growing/shrinking, so ACF uses only the negative cache to adapt to the skewed workload in this experiment.
5.7.1 Negative Cache Size
In Figure 10a, we can observe that the false positive rate decreases as the size of the negative cache increases, which is intuitive because the negative cache can store more false positive keys and the keys stored in the negative cache are less likely to be replaced by others.
We can observe a turning point when the negative cache has 12 buckets, after which the false positive rate decreases slowly. Because only a small fraction of the keys are frequently accessed, a small negative cache is enough to capture the skew. After the turning point, adding more buckets to the negative cache is less effective. Moreover, since the keys in the negative cache are stored in their complete, uncompressed representation, we should make the negative cache as small as possible. Therefore, we heuristically choose the turning point, 12 buckets, as the size of the negative cache.
5.7.2 Second-Chance Strategy and Victim Cache
In Figure 10a, adopting the second-chance strategy achieves a consistently lower false positive rate than not adopting it, an improvement of 8% on average.
In Figure 10b, we place the negative cache before ACF (as a regular cache) or after ACF (as a victim cache) to find which placement has better lookup performance. We can observe that the size of the negative cache does not affect the lookup performance, and that treating the negative cache as a victim cache achieves consistently better performance, an improvement of 15.4% on average.
(a) w/ or w/o Second-Chance Strategy
(b) Victim Cache
Figure 10. Effect of Second-Chance Strategy and Victim Cache
5.8 Experiment 6: Combinations
Figure 11 shows the different combinations of ACF adopting the Grow/Shrink strategy, the Negative Cache, or both. Adopting the Grow/Shrink strategy alone achieves a 31.4% improvement over the non-adaptive version (the first bar). Using the negative cache alone achieves a 33.5% improvement. Combining the two adaptive strategies achieves a 38% improvement, which is an additional 7% and 5% over using only one of the two strategies, respectively.
Figure 11. Combinations
Chapter 6
Conclusion
This thesis presented Adaptive Cuckoo Filters (ACF), a new data structure. ACF can exploit the skewed access pattern and dynamically adjust the size of an array of cuckoo filters to achieve a significantly smaller false positive rate than a single cuckoo filter. In addition, ACF uses a small hash table as a negative cache, which stores false positive keys to further lower the false positive rate. We determined the best parameters for ACF and demonstrated through experiments the effectiveness of adopting the second-chance strategy and treating the negative cache as a victim cache. Extensive experiments showed that when the workload is skewed, ACF can outperform traditional Bloom filters and cuckoo filters, as well as state-of-the-art split Bloom filters, which can also adapt to skewed workloads.
For future work, we plan to carry out a quantitative analysis of ACF and develop a mathematical framework for optimizing ACF under various query distributions.
Bibliography
[1] A. Rousskov and D. Wessels. 1998. Cache digests. In Computer Netw. ISDN Syst., vol. 30, no. 22–23, pp. 2155–2168.
[2] A. Lakshman and P. Malik. 2010. Cassandra: a decentralized structured storage system. In SIGOPS Operating Systems Review, vol. 44, no. 2.
[3] A. Eldawy, J. J. Levandoski, and P. Larson. 2014. Trekking through Siberia: Managing cold data in a memory-optimized database. In Proc. Int. Conf. Very Large Data Bases, pp. 931–942.
[4] B. H. Bloom. 1970. Space/time trade-offs in hash coding with allowable errors. Communications of the ACM, 13(7):422–426.
[5] Bruck, J., Gao, J., and Jiang, A. 2006. Weighted Bloom Filter. In IEEE International Symposium on Information Theory.
[6] Bin Fan, Dave G. Andersen, Michael Kaminsky, and Michael D. Mitzenmacher. 2014. Cuckoo filter: Practically better than Bloom. In Proc. 10th ACM Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT ’14), pages 75–88. doi:10.1145/2674005.2674994.
[7] C. Diaconu, C. Freedman, E. Ismert, P.-A. Larson, P. Mittal, R. Stonecipher, N. Verma, and M. Zwilling. 2013. Hekaton: SQL Server’s memory-optimized OLTP engine. In SIGMOD, pages 1–12.
[8] D. N. Simha, M. Lu, and T.-c. Chiueh. 2012. An update aware storage system for low-locality update intensive workloads. In Proceedings of the International conference on Architectural Support for Programming Languages and Operating Systems, pages 375–386. ACM.
[9] F. Putze, P. Sanders, and J. Singler. 2007. Cache-, hash- and space-efficient Bloom filters. In Experimental Algorithms, pages 108–121. Springer Berlin / Heidelberg.
[10] J. K. Mullin. 1990. Optimal semijoins for distributed database systems. In IEEE Trans. Software Eng., 16(5):558–560.
[11] L. Fan, P. Cao, J. Almeida, and A. Z. Broder. 1998. Summary cache: A scalable wide-area Web cache sharing protocol. In Proc. ACM SIGCOMM, Vancouver, BC, Canada.
[12] L. Sidirourgos and P.-Å. Larson. Adjustable and Updatable Bloom Filters. Available from the authors.
[13] N. P. Jouppi. 1990. Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers. In 17th Annual International Symposium on Computer Architecture. doi:10.1109/ISCA.1990.134547
[14] Zhong, M., Lu, P., Shen, K., and Seiferas, J. 2008. Optimizing Data Popularity Conscious Bloom Filters. In Proceedings of the 27th ACM Symposium on Principles of Distributed Computing.