Tony's ramblings on Open Source Software, Life and Photography

Roll Your Own Search With SOLR

I was trying to figure out how to write a search engine that could support GamerzCrib forums.

The biggest challenge is that I could potentially be looking at over 10 million posts on the server and something like 5 million users a month. I've seen vBulletin forums where search became painfully slow with fewer than a million posts.

After poking around the net, I found Solr, a project of the Apache Software Foundation, the same folks behind the Apache web server. Solr is a search server built on Lucene and written in Java. It runs as its own service, accepts updates to the search index, and typically returns search results as XML, all using HTTP as its interface.
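To make the "HTTP in, XML out" idea concrete, here is a minimal sketch of what talking to Solr from a forum script might look like. The host, port, path, field names, and the sample response are all assumptions for illustration, not details from an actual GamerzCrib setup; the parser only handles simple single-valued fields.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Assumption: a Solr instance reachable at this URL (hypothetical host/port/path).
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(query, start=0, rows=10):
    """Build the HTTP GET URL for a Solr query, asking for XML output."""
    params = {"q": query, "start": start, "rows": rows, "wt": "xml"}
    return SOLR_SELECT_URL + "?" + urlencode(params)


def parse_results(xml_text):
    """Return (total hit count, list of docs) from a Solr XML response.

    Simplification: each field is treated as a single text value, so
    multi-valued (<arr>) fields are not unpacked here.
    """
    root = ET.fromstring(xml_text)
    result = root.find("result")
    total = int(result.get("numFound"))
    docs = [
        {field.get("name"): field.text for field in doc}
        for doc in result.findall("doc")
    ]
    return total, docs


# Hypothetical response body, shaped like Solr's XML result format.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="2" start="0">
    <doc><str name="id">post-101</str><str name="title">Best FPS of the year?</str></doc>
    <doc><str name="id">post-202</str><str name="title">FPS drops after patch</str></doc>
  </result>
</response>"""

if __name__ == "__main__":
    print(build_search_url("title:fps"))
    total, docs = parse_results(SAMPLE_RESPONSE)
    print(total, [d["id"] for d in docs])
```

In a real forum you would fetch the URL over HTTP and hand the body to the parser; the sample string just stands in for that round trip.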

iTunes Makes Me Mad

Just how many times do you have to check 'Don't ask me again' when you tell iTunes you don't want to download the latest, almost daily, update? I guess once per reboot, because every single time I reboot the computer, it asks me again within the next 30 minutes.

It's really getting old. For starters, iTunes was installed to be used by a different user login on this computer, so why does it constantly pop up on my profile? I've removed it using HijackThis, removed it from my startup folder, and it still keeps coming back.


The Internet Killed April 1st For Me

I can't stand April Fool's Day anymore. Really. There was a time when April Fool's involved tying your dad's shoelaces together, or putting soap on your brother's toothbrush. You really had to put some effort into it.

Now all people do is sit on their collective rear-ends and come up with stupid things to write about in the hopes that some idiot without a calendar believes it. But with Gmail's 'Custom Time' and Slashdot tending to replace every chance of posting a valid news story with some cockamamie, idiotic post, how can one forget what day it is?


Good website design tutorial

Hehe, this is great:

Photography and Storage

In today's digital photography age, we're producing gigabytes and gigabytes of photos, many more than we ever did in the days of film. That means people who normally wouldn't think about proper backup practices now need to learn them. Suddenly a photographer needs to be computer literate, if not a computer expert. For many of you, this takes you out of your comfort zone.

Here are a couple of key things you need to understand about long-term storage of digital photos so that your grandkids and great-grandkids will have access to the gems of photography (and horrible snapshots) that we all take.

  • Storing photos on your hard drive is no solution. How often have you heard 'my computer died'? What would you do if you bought a new computer - how would you get those photos onto next year's model?
  • Storing photos on an external hard drive is no solution. Many people will argue with me on that one, but I was around when the 'Love Bug' virus hit. 'Love Bug' ate images, replacing them with copies of itself. I knew one attorney's office that lost 4 years of scanned records to the Love Bug virus because by the time they realized they'd been infected, they had already cycled through backups so the virus was on all their backups as well. Hard drives are subject to viruses, hackers, crashes and magnetic fields. Don't rely on them for more than a day - when you reboot tomorrow, that hard drive might be dead already.

Microsoft Live Search Lies.

Not quite sure why they would do this, but Microsoft's 'Live' search robot lies in its user-agent.

Even though it's a bot, it pretends to be Internet Explorer 7.

Here's the DNS of the offending bot: name =

Here's the user-agent it reported:

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)
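If you want to catch this kind of thing automatically, one common approach is to ignore what the user-agent claims and check DNS instead: reverse-resolve the IP, see whether the hostname sits under the crawler's domain, then forward-resolve that hostname to confirm it maps back to the same IP. Below is a rough sketch of that idea. The `.crawler.example` suffix and the 192.0.2.10 address are placeholders I made up, not the real bot's hostname or IP, and the "looks like a browser" test is a deliberately crude assumption.

```python
import socket


def looks_like_browser(user_agent):
    """Crude check: claims to be MSIE and does not admit to being a bot."""
    return "MSIE" in user_agent and "bot" not in user_agent.lower()


def is_verified_crawler(ip, allowed_suffix, reverse_lookup=None, forward_lookup=None):
    """Reverse-resolve ip, check the domain suffix, then forward-confirm.

    The lookup functions are injectable so the logic can be exercised
    without touching real DNS; by default they use the socket module.
    """
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        hostname = reverse_lookup(ip)
        if not hostname.lower().endswith(allowed_suffix):
            return False
        # Forward-confirm: the hostname must resolve back to the same IP.
        return ip in forward_lookup(hostname)
    except OSError:
        # Covers socket.herror and socket.gaierror (failed lookups).
        return False


def is_disguised_crawler(ip, user_agent, allowed_suffix, **lookups):
    """True when DNS says 'crawler' but the user-agent says 'browser'."""
    return looks_like_browser(user_agent) and is_verified_crawler(
        ip, allowed_suffix, **lookups
    )


if __name__ == "__main__":
    ua = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 1.1.4322)"
    # Fake resolvers standing in for real DNS (hypothetical hostnames).
    fake = {
        "reverse_lookup": lambda addr: "crawl-1.crawler.example",
        "forward_lookup": lambda host: ["192.0.2.10"],
    }
    print(is_disguised_crawler("192.0.2.10", ua, ".crawler.example", **fake))
```

The forward-confirm step matters because anyone who controls reverse DNS for their own IP range can make it say whatever they like.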

Doing Video Productions in Linux

I found myself needing to do some video production and editing and decided I'd take the hard way and do it all in Linux. It was a bit harder than I expected, but mostly because I'm doing weird stuff and had hardware issues.

First, I'll give you an idea of what software I'm using:

Flash 8

Yes, I know two of those are Windows apps, but Wine runs them very nicely. The rest are standard packages in Ubuntu.

I'm Officially A Spammer

Oops. Well, I didn't actually spam anyone... nor did I leave an open relay. Really, there was no spam that went out.

What did happen, though, was that I needed to move several web apps to a different physical server very quickly. Since the service sends out notifications at times, I installed Postfix and left the default config.

Buh-Bye Microsoft URL Control

I happened to notice someone had a session open on the site with something called 'Microsoft URL Control' as the user-agent.

I did a little research, and it appears to be a set of leech utilities written by Microsoft that, more often than not, are used to write e-mail address scrapers, referrer spammers, or utilities designed to break CAPTCHAs.

It got me thinking... first, I have no published e-mail addresses except my own on the site, so I wasn't worried about that.
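For anyone who wants to show this user-agent the door, here is one way it could be done at the application level. This is only a sketch assuming a Python WSGI app, not necessarily how I handled it; the function and variable names are made up for the example, and matching on a substring of the user-agent is an assumption about how the string shows up in your logs.

```python
# Lower-cased substrings of user-agents to refuse (assumed list for illustration).
BLOCKED_AGENTS = ("microsoft url control",)


def block_user_agents(app, blocked=BLOCKED_AGENTS):
    """Wrap a WSGI app so requests from blocked user-agents get a 403."""

    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if any(name in agent for name in blocked):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Anything else passes through to the real application untouched.
        return app(environ, start_response)

    return middleware


def hello_app(environ, start_response):
    """Stand-in application used only to demonstrate the wrapper."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello\n"]


if __name__ == "__main__":
    wrapped = block_user_agents(hello_app)
    seen = []
    body = wrapped(
        {"HTTP_USER_AGENT": "Microsoft URL Control - 6.00.8862"},
        lambda status, headers: seen.append(status),
    )
    print(seen[0], body)
```

Keep in mind a user-agent is trivially faked, so this only stops the lazy scrapers that never bothered to change the default.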

Netgear Hates You. And They Would Like You To Know.

I've got an enterprise-level Netgear VPN router.

It sucks, but that's beside the point.

In order to even VIEW forum posts related to my product, I have to register my product. No, I didn't say register for the forums - I mean enter a 'qualifying serial number'. After jumping through hoops, what I find are several hundred other people who a) think it's too hard to configure and b) wonder where the Vista VPN support is that Netgear has been promising for almost a year.