Approximate Nearest Neighbors Search in Multidimensional Space
Date:
Modern media-processing applications often need fast similarity search over very large datasets. This task can be logically separated into two parts, which, to some extent, can be solved independently:
- Finding a feature representation of the media data (e.g. images) that quantifies the “meaning” of the file;
- Building a large-scale index that makes it possible to search for similar elements given an input query from the user.
This talk discusses the second part of the task. In particular, it gives an overview of modern approaches (as of 2023) to approximate nearest neighbors search (ANNS). The talk is no more than a naive attempt to present an initial understanding of the task and its complexity, and to give some basic tools for solving the ANNS task in multidimensional space. Most of the examples in this talk are based on the brilliant FAISS library, with all the necessary links and references outlined.
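To make the task concrete, here is a minimal sketch of the exact (brute-force) nearest neighbors search that ANNS methods try to approximate at lower cost. It is not taken from the talk and does not use FAISS; the function name `exact_nearest_neighbors`, the Euclidean distance, and the random toy data are assumptions made purely for illustration.

```python
import math
import random


def exact_nearest_neighbors(database, query, k):
    """Return the indices of the k database vectors closest to `query`.

    Brute force: computes the Euclidean distance to every vector,
    so the cost grows linearly with the size of the database.
    """
    distances = [(math.dist(vec, query), idx) for idx, vec in enumerate(database)]
    distances.sort()  # closest vectors first
    return [idx for _, idx in distances[:k]]


# Toy data (an assumption for this sketch): 1000 random 8-dimensional vectors.
random.seed(0)
dim = 8
database = [[random.random() for _ in range(dim)] for _ in range(1000)]

# Query with a copy of a stored vector, so its own index must come back first.
query = list(database[42])
neighbors = exact_nearest_neighbors(database, query, k=3)
print(neighbors)
```

Every query here scans the whole database, which becomes impractical for very large collections; approximate methods accept a small loss in accuracy in exchange for visiting far fewer vectors.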
You can find the slides of this talk here.