Seminars

View all Seminars  |  Download ICal for this event

Towards Efficient Privacy-Preserving Two-Party k-Means Clustering Protocol

Series: M.Tech (Research) Colloquium- ON-LINE

Speaker: Ms. Chim Sonali Eknath M.Tech (Research) Dept. of CSA

Date/Time: Mar 24 16:00:00

Location: Microsoft Teams: Link: Chim Sonalis M.Tech. Research Colloquium

Faculty Advisor: Prof. Sanjit Chatterjee

Abstract:
Two-party data mining is a win-win game if played with a guarantee of data privacy
from each other. This guarantee is provided by the use of cryptographic techniques in
designing the two-party protocol. The need to obtain collaborative data mining results
is growing and so is the need for privacy-preserving data mining protocols. Clustering is
one of the data mining techniques and one of the popular clustering algorithms is k-means
clustering. We studied the recent work for the secure two-party k-means clustering by
Bunn and Ostrovosky and found that the protocol is inefficient for practical purposes. The
protocol requires communication rounds which are linear in security parameter for the center
initialization step and are quadratic in security parameter for an iterative Lloyds step of the
k-means clustering algorithm. The challenge in the secure two-party k-means clustering is
the exorbitant communication cost occurring due to the high number of interactions between
the parties for performing computations on the data. Our work attempts to resolve this
problem of inefficiency in k-means clustering protocol in a two-party setting by proposing
some modifications. We have come up with two comparison protocols that are required in
the k-means clustering protocol. One of the protocols is to find a minimum of two shared
numbers which runs in constant communication rounds. Using this protocol as a building
block, another protocol is designed to find a minimum of n shared numbers, which runs
in O(n) communication rounds. We have also improved a protocol that selects a random
value from a domain oblivious to both parties. Apart from this, the idea to avoid the twoparty integer division altogether is incorporated in the k-means clustering protocol. With
these improvements, we propose a two-party k-means clustering protocol for which the
initialization step requires communication rounds linear in security parameter and Lloyds
step requires communications rounds that are independent of the security parameter. The
protocol provides a security guarantee in the semi-honest model except for some minor
information leakage. We argue that this leakage in the protocol can be tolerated considering
the substantial gain in the communication cost We have verified the gain in performance of
the modified protocol by implementing both the k-means clustering protocols and running
their instances in the same set-up.

Microsoft Teams Link: Chim Sonalis M.Tech. Research Colloquium
Link