Efficient Hiding of Collaborative Recommendation Association
Shyue-Liang Wang (1), Ting-Zheng Lai (3), Tzung-Pei Hong (2), Yu-Lung Wu (3)
(1,2) National University of Kaohsiung
(3,4) I-Shou University
Kaohsiung, Taiwan
Privacy Preserving Data Mining
• Privacy Preserving
  – Given D_O
  – Minimum support
  – Minimum confidence
  – R_H
    • Rules to be hidden
  – Find D_M s.t. R_M = R_O - R_H
    • D_M: modified database
    • R_M: modified rules
[Diagram labels: D_O, R_O, D_M, R_M, Modification]
Problem Description (1)
• Collaborative recommendation association rules
  – R3, R5 give the same prediction as R1 to R9
• Input: D_O, min_supp, min_conf, C
• Output: R_O

D_O (min_supp = 33%, min_conf = 70%):
TID  Items
T1   ABC
T2   ABC
T3   ABC
T4   AB
T5   A
T6   AC

Counts: |A|=6, |B|=4, |C|=4, |AB|=4, |AC|=4, |BC|=3, |ABC|=3

No.  Rule    (Support, Confidence)
1    B=>A    (66%, 100%)
2    C=>A    (66%, 100%)
3    B=>C    (50%, 75%)
4    C=>B    (50%, 75%)
5    AB=>C   (50%, 75%)
6    AC=>B   (50%, 75%)
7    BC=>A   (50%, 100%)
8    C=>AB   (50%, 75%)
9    B=>AC   (50%, 75%)
10   A=>B    (66%, 66%)   Not AR
11   A=>C    (66%, 66%)   Not AR
12   A=>BC   (50%, 50%)   Not AR

R_O: {R3, R5}
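The support and confidence figures in the table can be recomputed from the six transactions. The following is a minimal sketch, not the authors' implementation: the helper names (`count`, `stats`, `mine`) are hypothetical, and selecting the rules whose right-hand side is exactly C is an assumed reading of how R_O is chosen.

```python
from fractions import Fraction

# D_O exactly as listed on the slide.
D_O = {"T1": "ABC", "T2": "ABC", "T3": "ABC", "T4": "AB", "T5": "A", "T6": "AC"}

# The twelve candidate rules in the slide's numbering, as (LHS, RHS).
RULES = [("B", "A"), ("C", "A"), ("B", "C"), ("C", "B"),
         ("AB", "C"), ("AC", "B"), ("BC", "A"), ("C", "AB"),
         ("B", "AC"), ("A", "B"), ("A", "C"), ("A", "BC")]

def count(db, items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in db.values() if set(items) <= set(t))

def stats(db, lhs, rhs):
    """(support, confidence) of lhs => rhs as exact fractions."""
    both = count(db, lhs + rhs)
    return Fraction(both, len(db)), Fraction(both, count(db, lhs))

def mine(db, min_supp, min_conf):
    """Numbers of the rules that meet both thresholds."""
    kept = []
    for no, (lhs, rhs) in enumerate(RULES, start=1):
        supp, conf = stats(db, lhs, rhs)
        if supp >= min_supp and conf >= min_conf:
            kept.append(no)
    return kept

strong = mine(D_O, Fraction(33, 100), Fraction(70, 100))
# Assumption: the recommendation rules for C are the strong rules
# whose right-hand side is exactly C.
r_o = [no for no in strong if RULES[no - 1][1] == "C"]
print(strong, r_o)
```

Exact fractions are used so that a support of 2/6 compares cleanly against the 33% threshold instead of depending on floating-point rounding.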
Problem Description (2)
• Input: D_O, H (items to be hidden on RHS), min_supp, min_conf
• Output: D_M, R_M

D_O and D_M (min_supp = 33%, min_conf = 70%, H = {C}):
TID  D_O   D_M
T1   ABC   AB
T2   ABC   ABC
T3   ABC   ABC
T4   AB    AB
T5   A     A
T6   AC    AC

Counts in D_M: |A|=6, |B|=4, |C|=3, |AB|=4, |AC|=3, |BC|=2, |ABC|=2

No.  Rule    (Support, Confidence)  Status
1    B=>A    (66%, 100%)
2    C=>A    (50%, 100%)
3    B=>C    (33%, 50%)             hidden
4    C=>B    (33%, 66%)             lost
5    AB=>C   (33%, 50%)             hidden
6    AC=>B   (33%, 66%)             lost
7    BC=>A   (33%, 100%)
8    C=>AB   (33%, 66%)             lost
9    B=>AC   (33%, 50%)             lost
10   A=>B    (66%, 66%)
11   A=>C    (50%, 50%)
12   A=>BC   (33%, 33%)

R_M: {R1, R2, R7}
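The effect of removing C from a single transaction can be checked by mining the database before and after the change. This is a sketch under stated assumptions rather than the hiding algorithm itself: it assumes T1 is the modified transaction, it treats a rule as "hidden" when its right-hand side consists only of items in H, and the function name `strong_rules` is hypothetical.

```python
from fractions import Fraction

RULES = [("B", "A"), ("C", "A"), ("B", "C"), ("C", "B"),
         ("AB", "C"), ("AC", "B"), ("BC", "A"), ("C", "AB"),
         ("B", "AC"), ("A", "B"), ("A", "C"), ("A", "BC")]

D_O = {"T1": "ABC", "T2": "ABC", "T3": "ABC", "T4": "AB", "T5": "A", "T6": "AC"}
# Assumption: T1 is the transaction that loses item C.
D_M = dict(D_O, T1="AB")

def count(db, items):
    """Number of transactions that contain every item in `items`."""
    return sum(1 for t in db.values() if set(items) <= set(t))

def strong_rules(db, min_supp=Fraction(33, 100), min_conf=Fraction(70, 100)):
    """Set of rule numbers meeting both thresholds in `db`."""
    kept = set()
    for no, (lhs, rhs) in enumerate(RULES, start=1):
        both = count(db, lhs + rhs)
        supp = Fraction(both, len(db))
        conf = Fraction(both, count(db, lhs))
        if supp >= min_supp and conf >= min_conf:
            kept.add(no)
    return kept

H = {"C"}
before, after = strong_rules(D_O), strong_rules(D_M)
gone = before - after
# Assumption: a vanished rule is "hidden" if its RHS uses only items of H,
# otherwise it is an unintended "lost" rule.
hidden = sorted(no for no in gone if set(RULES[no - 1][1]) <= H)
lost = sorted(no for no in gone if not set(RULES[no - 1][1]) <= H)
remaining = sorted(after)
print(remaining, hidden, lost)
```

Four rules that were never targeted fall below the 70% confidence threshold as a by-product of the single change.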
Problem Description (3)

Original transactions:
TID  Items
T1   ABC
T2   ABC
T3   ABC
T4   AB
T5   A
T6   AC

New transactions:
TID  Items
T7   BC
T8   C
T9   ABC

(1) Combine, hide (DCBS):
TID  Items
T1   ABC
T2   ABC
T3   ABC
T4   AB
T5   A
T6   A
T7   B
T8   C
T9   ABC

(2) Hide, combine (MSCR):
TID  Items
T1   AB
T2   ABC
T3   ABC
T4   AB
T5   A
T6   AC

TID  Items
T7   B
T8   C
T9   ABC
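To see which transactions each strategy alters in this example, the resulting databases can be compared with the plain union of the original and new transactions. The sketch below only diffs the tables shown on the slide. It assumes the nine-row table is the DCBS result and the two smaller modified tables together form the MSCR result, and it does not reproduce either algorithm. The helper name `changed` is hypothetical.

```python
# Transactions copied from the slide.
original = {"T1": "ABC", "T2": "ABC", "T3": "ABC", "T4": "AB", "T5": "A", "T6": "AC"}
increment = {"T7": "BC", "T8": "C", "T9": "ABC"}
combined = {**original, **increment}

# Assumed reading of the slide layout: which result belongs to which strategy.
dcbs = {"T1": "ABC", "T2": "ABC", "T3": "ABC", "T4": "AB", "T5": "A",
        "T6": "A", "T7": "B", "T8": "C", "T9": "ABC"}
mscr = {"T1": "AB", "T2": "ABC", "T3": "ABC", "T4": "AB", "T5": "A",
        "T6": "AC", "T7": "B", "T8": "C", "T9": "ABC"}

def changed(before, after):
    """Map each altered TID to (items before, items after)."""
    return {tid: (before[tid], after[tid])
            for tid in before if before[tid] != after[tid]}

dcbs_changes = changed(combined, dcbs)
mscr_changes = changed(combined, mscr)
print(dcbs_changes)
print(mscr_changes)
```

Under this reading both strategies modify two transactions, but they pick different ones among T1 to T6.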
Numerical Experiments (1)
Time Effects
[Chart "Multiple Updates": x-axis Data Size (10K, 15K, 20K, 25K); y-axis Second (0 to 500); series 1-Item-MSCR, 1-Item-DCBS, 2-Item-MSCR, 2-Item-DCBS]
Numerical Experiments (2)
Database Effects
[Chart "Database Effects": x-axis Data Size (10K, 15K, 20K, 25K); y-axis Percentage (0% to 10%); series DCBS, MSCR]
Numerical Experiments (3)
Side Effects for MSCR
[Chart "Multiple Updates Side Effects": x-axis Data Size (10K, 15K, 20K, 25K); y-axis Percentage (0% to 8%); series New Rules, Lost Rules, Hiding Failure]
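The three side-effect series can be expressed as set operations on the rules mined before and after hiding. The slides do not define the measures or their denominators, so the following is a sketch under assumed definitions: new rules appear only after hiding, lost rules are non-targeted rules that disappear, and hiding failures are targeted rules that survive. The function names and the choice of denominators are hypothetical.

```python
def side_effects(rules_before, rules_after, to_hide):
    """Return the new, lost and hiding-failure rule sets."""
    before, after, hide = set(rules_before), set(rules_after), set(to_hide)
    return {
        "new": after - before,              # rules that exist only after hiding
        "lost": (before - hide) - after,    # non-targeted rules that vanished
        "hiding_failure": hide & after,     # targeted rules still minable
    }

def percent(part, whole):
    """Size of `part` as a percentage of `whole` (0.0 if `whole` is empty)."""
    return 100.0 * len(part) / len(whole) if whole else 0.0

# Rule numbers from the six-transaction example on the earlier slides.
before = {1, 2, 3, 4, 5, 6, 7, 8, 9}
after = {1, 2, 7}
to_hide = {3, 5}

effects = side_effects(before, after, to_hide)
# Assumed denominators: all original rules, the non-targeted rules,
# and the targeted rules respectively.
new_pct = percent(effects["new"], before)
lost_pct = percent(effects["lost"], before - to_hide)
fail_pct = percent(effects["hiding_failure"], to_hide)
print(effects, new_pct, lost_pct, fail_pct)
```

In the small example no new rules appear and no targeted rule survives, while four of the seven non-targeted rules are lost.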
Numerical Experiments (4)
Side Effects for DCBS
[Chart "DCBS Side Effects": x-axis Data Size (10K, 15K, 20K, 25K); y-axis Percentage (0% to 8%); series New Rules, Lost Rules, Hiding Failure]