Hybrid search
While semantic search using vector embeddings works well for capturing rephrased or paraphrased meaning, it may not work well on queries that contain rare terms or jargon. In these cases, combining semantic search with more traditional sparse retrieval methods (BM25 or TF-IDF), which account for factors like keyword frequency, often helps improve retrieval. To combine the two retrieval mechanisms, you could assign each chunk both scores, with the final score being a weighted combination of the two, or you could use sparse retrieval as a first-pass filter followed by semantic search.
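As a rough illustration of the weighted-combination option, here is a minimal sketch. It assumes you already have a sparse score (for example from BM25) and a semantic similarity score for each chunk ID. The function names, the `alpha` weight, and the min-max normalization are illustrative choices, not a prescribed method; the scores are normalized only because sparse and embedding scores usually live on different scales.

```python
def min_max(scores):
    """Rescale a {chunk_id: score} dict to the 0..1 range."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are identical, so none of them carries ranking signal.
        return {chunk_id: 0.0 for chunk_id in scores}
    return {chunk_id: (s - lo) / (hi - lo) for chunk_id, s in scores.items()}


def hybrid_rank(sparse_scores, dense_scores, alpha=0.5):
    """Rank chunks by a weighted mix of semantic and sparse scores.

    alpha is a hypothetical tuning knob: 1.0 uses only the semantic
    score, 0.0 uses only the sparse (keyword) score.
    """
    sparse_n = min_max(sparse_scores)
    dense_n = min_max(dense_scores)
    chunk_ids = set(sparse_n) | set(dense_n)
    combined = {
        # A chunk missing from one retriever contributes 0 for that part.
        cid: alpha * dense_n.get(cid, 0.0) + (1 - alpha) * sparse_n.get(cid, 0.0)
        for cid in chunk_ids
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Made-up scores for three chunks.
sparse = {"a": 2.0, "b": 10.0, "c": 0.0}   # e.g. BM25
dense = {"a": 0.9, "b": 0.1, "c": 0.5}     # e.g. cosine similarity
ranking = hybrid_rank(sparse, dense, alpha=0.5)
print([chunk_id for chunk_id, _ in ranking])
```

In practice the weight would be tuned on your own queries: a corpus heavy in jargon may favor a lower `alpha`, while conversational queries may favor a higher one.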
Reranking – the final step
Once you have run the initial search to retrieve relevant chunks, a final step of reranking the results helps ensure that the most useful information is presented to the user. The reason is that although the chunks may technically be relevant, they may not be the most helpful answer to the user's query.
There are a few different ways in which reranking is done in practice. One approach is to apply heuristics to certain metadata of the chunks, such as the author, date, or source reliability. A benefit of this approach is that it is usually computationally cheap and fast.
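A metadata heuristic of this kind could look something like the sketch below. The `Chunk` fields, the weights, and the exponential recency decay are all assumptions made for illustration; a real system would use whatever metadata it actually stores and would tune the weights empirically.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Chunk:
    # Hypothetical fields; adapt to the metadata your index carries.
    text: str
    retrieval_score: float      # score from the initial search
    published: date
    source_reliability: float   # 0.0 (unreliable) to 1.0 (trusted)


def heuristic_rerank(chunks, today, recency_weight=0.2,
                     reliability_weight=0.3, half_life_days=365):
    """Reorder chunks by retrieval score plus simple metadata bonuses."""
    def score(chunk):
        age_days = max((today - chunk.published).days, 0)
        # Recency bonus halves every `half_life_days` (assumed decay shape).
        recency = 0.5 ** (age_days / half_life_days)
        return (chunk.retrieval_score
                + recency_weight * recency
                + reliability_weight * chunk.source_reliability)

    return sorted(chunks, key=score, reverse=True)


# Made-up example: an older, less reliable chunk scored slightly higher
# in the initial search than a recent chunk from a trusted source.
old = Chunk("old answer", 0.80, date(2015, 1, 1), 0.2)
new = Chunk("new answer", 0.75, date(2024, 1, 1), 0.9)
reranked = heuristic_rerank([old, new], today=date(2024, 6, 1))
print([chunk.text for chunk in reranked])
```

Because the scoring is plain arithmetic over fields that are already in memory, it adds very little latency on top of the initial search.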
