I was trying to figure out how to write a search engine that could support GamerzCrib forums.
The biggest challenge is scale: I could potentially be looking at over 10 million posts on the server and something like 5 million users a month. I've seen vBulletin forums where search became painfully slow with fewer than a million posts.
After poking around the net, I found Solr, a project of the Apache Software Foundation, the same people behind the Apache web server. Solr is a search server written in Java and built on the Lucene search library. It runs as its own service, accepts updates to the search index, and typically returns search results as XML, all using HTTP as its interface.
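
To make the "HTTP in, XML out" part concrete, here is a rough sketch in Python of what querying Solr might look like from the forum side. The host, port, and `/solr/select` path are assumptions about a default local install (newer versions put a core name in the path), and the `id` and `title` fields are hypothetical names for forum post data, not anything Solr defines. The sample XML is hand-written to mirror the general shape of a Solr XML response so the parsing can be shown without a running server.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: Solr running locally on its usual default port and select path.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(query, rows=10):
    """Build the GET URL for a search, asking Solr for XML output."""
    params = urlencode({"q": query, "rows": rows, "wt": "xml"})
    return f"{SOLR_SELECT_URL}?{params}"


def parse_results(xml_text):
    """Return (total_hits, docs) from a Solr XML response body."""
    root = ET.fromstring(xml_text)
    result = root.find("result")
    total = int(result.get("numFound"))
    # Each <doc> holds typed field elements such as <str name="title">.
    docs = [
        {field.get("name"): field.text for field in doc}
        for doc in result.findall("doc")
    ]
    return total, docs


# Hand-written sample in the shape of a Solr XML response.
# The "id" and "title" fields are hypothetical forum post fields.
SAMPLE_RESPONSE = """<response>
  <lst name="responseHeader">
    <int name="status">0</int>
  </lst>
  <result name="response" numFound="2" start="0">
    <doc>
      <str name="id">post-101</str>
      <str name="title">Zelda guide for beginners</str>
    </doc>
    <doc>
      <str name="id">post-202</str>
      <str name="title">Speedrun guide</str>
    </doc>
  </result>
</response>"""

url = build_search_url("zelda guide", rows=5)
total, docs = parse_results(SAMPLE_RESPONSE)
```

Against a live server, the body passed to `parse_results` would come from fetching `url` with something like `urllib.request.urlopen`. The point is just that the forum software only needs an HTTP client and an XML parser to use the search service.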
