Introselect algorithm

Introselect

When implementing a beam search, one must track the top K elements to prune the beam. One algorithm for this is Introselect, implemented in the C++ std library as nth_element.

Heaps

Heaps can also select the top-k elements. Push all elements into the heap, then pop k times. Useful for something like, ORDER BY score LIMIT k to pull the top-k elements.

In C++ we can use: make_heap.

Sketches

Sketches can scale to handle large data streams. KLL Sketch enables distribution tracking for large streams of data. https://github.com/apache/datasketches-cpp