About CBC Finder

CBC Finder was built using a simple information retrieval architecture.

The index engine parses the collection of CBC articles and tokenizes each document, gathering the unique terms into a vocabulary. From this vocabulary it builds two data structures, a lexicon and an inverted index, which map tokens to the documents that contain them.
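As a rough illustration of the indexing step, the sketch below builds a lexicon and an inverted index from a tiny in-memory collection. It is not CBC Finder's actual code: the function names (`tokenize`, `build_index`), the simple lowercase alphanumeric tokenizer, and the choice to store term frequencies and document lengths are all assumptions made for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Assumed tokenizer: lowercase the text and keep runs of letters/digits.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build (lexicon, inverted_index, doc_lengths) from {doc_id: text}."""
    lexicon = {}                        # term -> integer term id
    inverted_index = defaultdict(dict)  # term id -> {doc id: term frequency}
    doc_lengths = {}                    # doc id -> number of tokens

    for doc_id, text in docs.items():
        tokens = tokenize(text)
        doc_lengths[doc_id] = len(tokens)
        for token in tokens:
            # Assign the next free id the first time a term is seen.
            term_id = lexicon.setdefault(token, len(lexicon))
            postings = inverted_index[term_id]
            postings[doc_id] = postings.get(doc_id, 0) + 1

    return lexicon, dict(inverted_index), doc_lengths


# Hypothetical two-document collection, for illustration only.
example_docs = {
    0: "Toronto weather today",
    1: "Weather warning for Toronto and Ottawa",
}
lexicon, inverted_index, doc_lengths = build_index(example_docs)
```

Looking up a term is then a two-step process: the lexicon gives the term id, and the inverted index gives the documents (with counts) for that id.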

Documents containing a query term are ranked using BM25, a widely used relevance-scoring function in information retrieval.
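For readers unfamiliar with BM25, here is a minimal sketch of how the scoring might look. The parameter defaults (`k1=1.2`, `b=0.75`) and the particular IDF variant (with `+ 1` inside the logarithm to keep it non-negative) are assumptions; the page does not say which values or variant CBC Finder uses. The function name `bm25_scores` and the string-keyed postings layout are likewise hypothetical.

```python
import math


def bm25_scores(query_terms, postings, doc_lengths, k1=1.2, b=0.75):
    """Score documents for a query.

    postings:    {term: {doc_id: term frequency}}
    doc_lengths: {doc_id: number of tokens in the document}
    Returns {doc_id: score} for documents containing at least one query term.
    """
    n_docs = len(doc_lengths)
    avg_len = sum(doc_lengths.values()) / n_docs
    scores = {}

    for term in query_terms:
        plist = postings.get(term, {})
        if not plist:
            continue  # term does not occur in the collection
        # Assumed IDF variant: rarer terms get a larger weight.
        idf = math.log((n_docs - len(plist) + 0.5) / (len(plist) + 0.5) + 1)
        for doc_id, tf in plist.items():
            # Length normalization: longer documents are penalized.
            norm = k1 * (1 - b + b * doc_lengths[doc_id] / avg_len)
            scores[doc_id] = scores.get(doc_id, 0.0) + idf * tf * (k1 + 1) / (tf + norm)

    return scores


# Hypothetical postings and lengths, for illustration only.
example_postings = {"weather": {0: 1, 1: 1}, "ottawa": {1: 1}}
example_lengths = {0: 3, 1: 6, 2: 4}
weather_scores = bm25_scores(["weather"], example_postings, example_lengths)
```

With equal term frequencies, the shorter document scores higher for "weather" because of the length normalization term.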

Finally, a Spring Boot API server returns the top 5 results, which are displayed in the React user interface.
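The "top 5" selection itself is simple once scores exist. The sketch below shows one way it could be done; it is written in Python for illustration only (the real server is Spring Boot), and the function name `top_k` and the `(doc_id, score)` result shape are assumptions.

```python
import heapq


def top_k(scores, k=5):
    """Return the k highest-scoring (doc_id, score) pairs, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


# Hypothetical scores for seven documents, for illustration only.
example_scores = {"a": 0.2, "b": 1.7, "c": 0.9, "d": 3.1, "e": 0.4, "f": 2.5, "g": 1.1}
results = top_k(example_scores)
```

Using a heap-based selection avoids sorting every scored document when only a handful of results are shown.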

Architecture described by Dr. Mark Shmuck in the lecture series (Search Engines).