Calculating Adjusted Rank Index using Locality Sensitive Hashing (LSH): A Gaussian Approach
Citation
Aritra Banerjee, "Calculating Adjusted Rank Index using Locality Sensitive Hashing (LSH): A Gaussian Approach", International Journal of Engineering Trends and Technology (IJETT), V51(1), 41-44, September 2017. ISSN: 2231-5381. www.ijettjournal.org. Published by Seventh Sense Research Group.
Abstract
Locality Sensitive Hashing (LSH) is a technique generally used to reduce the dimensionality of data. In this paper, I use a Gaussian approach to reduce the dimensionality of a massive dataset. The resulting binary matrix is then used to build a neighbourhood graph for the dataset. From this neighbourhood graph, the Adjusted Rank Index (ARI) of the dataset after applying LSH can be calculated. Since LSH is an Approximate Nearest Neighbour (ANN) method, it finds the nearest neighbours of the training dataset only approximately; the ARI value is then checked to see how closely these approximate neighbours agree with the actual neighbours from the different classes in the dataset.
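The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation, and several details are assumptions made only for illustration: the binary matrix is taken to be the sign of each point's projection onto random Gaussian directions, two points are treated as neighbours when their binary codes differ in at most `max_hamming` bits, predicted groups are taken to be the connected components of the neighbourhood graph, and the ARI is computed with the standard Adjusted Rand Index formula. All function and parameter names are hypothetical.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb


def gaussian_lsh_codes(points, n_bits, seed=0):
    """Binary matrix: one row per point, one bit per random Gaussian direction."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Assumption: each hash bit is the sign of a projection onto a N(0, 1) vector.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    return [
        [1 if sum(w * x for w, x in zip(plane, p)) >= 0 else 0 for plane in planes]
        for p in points
    ]


def neighbourhood_graph(codes, max_hamming):
    """Adjacency sets linking points whose codes differ in <= max_hamming bits."""
    adj = {i: set() for i in range(len(codes))}
    for i, j in combinations(range(len(codes)), 2):
        if sum(a != b for a, b in zip(codes[i], codes[j])) <= max_hamming:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def connected_components(adj):
    """Label each node with the index of its connected component (assumed grouping)."""
    labels = {}
    next_label = 0
    for start in adj:
        if start in labels:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            node = stack.pop()
            for nb in adj[node]:
                if nb not in labels:
                    labels[nb] = next_label
                    stack.append(nb)
        next_label += 1
    return [labels[i] for i in range(len(adj))]


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two labelings of the same points."""
    n = len(labels_true)
    pair_counts = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        # Degenerate case (e.g. a single group in both labelings).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)
```

As a usage example, hashing the toy points `[[1.0, 2.0], [1.0, 2.0], [-1.0, -2.0]]` with `max_hamming=0` groups the two identical points together and leaves the third on its own, so comparing the result against the true classes `[0, 0, 1]` gives an index of 1.0. On real data, the number of bits and the Hamming threshold would need tuning, and the index would typically fall below 1.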
Keywords
Locality Sensitive Hashing, Adjusted Rank Index, Gaussian, Approximate Nearest Neighbour.