eScholarship
Open Access Publications from the University of California

UCSF

UC San Francisco Previously Published Works

An Efficient Algorithm for Clustering Data Using Map-Reduce Approach

Abstract

We study the problem of clustering data objects and have implemented a new algorithm, EMaRC, an Efficient Map-Reduce algorithm for Clustering data. Feature selection is the most important part of the clustering process: it involves identifying a subset of features that produces accurate results consistent with those obtained from the original set of features. The main idea of this paper is to deliver effective outcomes when clustering features. The nature of clustering, together with some related concepts, supports the processing of large data sets. A map-reduce step is followed by a feature selection algorithm, which shapes the entire clustering process so that the most effective features are produced efficiently. Efficiency concerns the time required to find effective features, so time complexity is the relevant measure, while effectiveness relates to the quality of the selected feature subset. Based on these criteria, a cluster-based map-reduce feature selection approach is proposed and evaluated in this paper.
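The abstract does not spell out the steps of EMaRC, so the following is only a minimal sketch of the general pattern it describes: a map step over data partitions, a reduce step that merges partial results, and a feature selection step on the merged result. It is not the authors' algorithm. The relevance criterion (absolute Pearson correlation between each feature and a numeric class label), the presence of labels, and all function names (`map_partition`, `reduce_stats`, `relevance`, `select_features`) are assumptions made for illustration.

```python
import math
from functools import reduce


def map_partition(rows):
    """Map step (hypothetical): summarise one partition as partial sums.

    rows is a non-empty list of (feature_vector, numeric_label) pairs.
    """
    n_features = len(rows[0][0])
    stats = {
        "n": 0, "sy": 0.0, "syy": 0.0,
        "sx": [0.0] * n_features,
        "sxx": [0.0] * n_features,
        "sxy": [0.0] * n_features,
    }
    for x, y in rows:
        stats["n"] += 1
        stats["sy"] += y
        stats["syy"] += y * y
        for j, v in enumerate(x):
            stats["sx"][j] += v
            stats["sxx"][j] += v * v
            stats["sxy"][j] += v * y
    return stats


def reduce_stats(a, b):
    """Reduce step (hypothetical): merge two partial summaries by addition."""
    merged = {key: a[key] + b[key] for key in ("n", "sy", "syy")}
    for key in ("sx", "sxx", "sxy"):
        merged[key] = [p + q for p, q in zip(a[key], b[key])]
    return merged


def relevance(stats):
    """Absolute Pearson correlation of each feature with the label.

    Assumed criterion; constant features score 0.
    """
    n = stats["n"]
    var_y = n * stats["syy"] - stats["sy"] ** 2
    scores = []
    for sx, sxx, sxy in zip(stats["sx"], stats["sxx"], stats["sxy"]):
        var_x = n * sxx - sx ** 2
        cov = n * sxy - sx * stats["sy"]
        if var_x > 0 and var_y > 0:
            scores.append(abs(cov) / math.sqrt(var_x * var_y))
        else:
            scores.append(0.0)
    return scores


def select_features(partitions, k):
    """Run map, reduce, then keep the indices of the k highest-scoring features."""
    total = reduce(reduce_stats, map(map_partition, partitions))
    scores = relevance(total)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Toy data split into two partitions: feature 0 tracks the label,
# feature 1 is constant, feature 2 is unrelated to the label.
partitions = [
    [([0.0, 5.0, 1.0], 0), ([1.0, 5.0, 0.0], 1)],
    [([0.0, 5.0, 0.0], 0), ([1.0, 5.0, 1.0], 1)],
]
selected = select_features(partitions, k=1)
```

Because each partition is reduced to a fixed-size summary, the cost of the selection step in this sketch depends on the number of features rather than the number of rows, which is one way the efficiency concern raised in the abstract could be addressed.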

