2012 Volume E95.A Issue 12 Pages 2451-2460
Finding rare cases with medical data is important when hospitals or research institutes want to identify rare diseases. To extract meaningful information from a large amount of sensitive medical data, privacy-preserving data mining techniques can be used. A privacy-preserving t-repetition protocol can be used to find rare cases with distributed medical data. A privacy-preserving t-repetition protocol is to find elements which exactly t parties out of n parties have in common in their datasets without revealing their private datasets. A privacy-preserving t-repetition protocol can be used to find not only common cases with a high t but also rare cases with a low t. In 2011, Chun et al. suggested the generic set operation protocol which can be used to find t-repeated elements. In the paper, we first show that the Chun et al.'s protocol becomes infeasible for calculating t-repeated elements if the number of users is getting bigger. That is, the computational and communicational complexities of the Chun et al.'s protocol in calculating t-repeated elements grow exponentially as the number of users grows. Then, we suggest a polynomial-time protocol with respect to the number of users, which calculates t-repeated elements between users.