Indexing is important for sure. The problem is to preserve privacy and not falling back to heavy weight general purpose Multi-party computation we have to give up a bit on the precision and recall of modern search engines. Minhash, more specifically Locality Sensitive Hashing (LSH) is a good first approximation (Better then Term Freqency, worse them ML based search). Right now much of the web is unqueryable, my first goal was to allow the deep web and TOR services to be searched even at just a rudimentary level.