Thursday, January 29, 2009

Searching for audio files on an NFS share from maemo


Searching for audio files on an NFS share from maemo from Ben Martin on Vimeo.

The n810 has only one memory slot. With an 8 GB card in there you might fit 1,000 ogg files onto your storage. That was quite boring, so I instead indexed an NFS share that is over 10x larger ;) I'm using one of the libferris inverted file backends for the index, which is targeted more at desktop assumptions: a faster CPU and expensive disk head seeks. Needless to say, I'm hacking on a custom index implementation for maemo that is oriented toward a slower CPU and the much cheaper seeks of flash-based storage.
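
For anyone who hasn't met the term, here is a toy sketch of what an inverted file index does: it maps each word to the set of files that contain it, so a lookup never has to scan every file. This is only an illustration in Python and not the libferris on-disk format; the function names and the example paths are made up.

```python
import re
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercase word to the set of document ids whose text contains it.

    `docs` is a dict of {document id: text}. Hypothetical helper, not libferris API.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def lookup(index, word):
    """Return the sorted document ids that contain `word` (case-insensitive)."""
    return sorted(index.get(word.lower(), set()))


# Made-up example data: file path -> "artist title" text pulled from tags.
docs = {
    "/music/a.ogg": "Mozart Eine kleine Nachtmusik",
    "/music/b.ogg": "Mozart Requiem",
    "/music/c.ogg": "Mussorgsky Night on Bald Mountain",
}
index = build_inverted_index(docs)
mozart_files = lookup(index, "mozart")
```

Here `mozart_files` ends up holding the two Mozart paths. A real index also has to store this mapping compactly on disk, which is where the CPU versus seek cost trade-off comes in.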

Note that the time between running ferris-music-search and the first message appearing on the console ("Using index...") is almost entirely spent in dynamic linking. That is a major slowdown for running apps on maemo that I've yet to sweep away.

The artist and title info is taken from the ID3 tags in the audio files. Indexing time is roughly 3 to 10 milliseconds per file when performed on the desktop. The inverted file index format is portable from desktop to maemo device. I plan to make the new explicit maemo format portable too, so you can make indexes on powerful machines and rsync them over to the maemo device, assuming you are indexing stuff that is stored on your file server rather than on the maemo device.
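
To put the per-file figure in perspective, here is a rough back-of-the-envelope estimate. The file count of 10,000 is an assumption (about ten times the 1,000 files mentioned above), not a measured number, and the helper name is made up.

```python
def indexing_time_seconds(file_count, ms_per_file):
    """Total indexing time in seconds for a given per-file cost in milliseconds."""
    return file_count * ms_per_file / 1000


ASSUMED_FILE_COUNT = 10_000  # assumption: roughly 10x the 1,000 files of an 8 GB card

best_case = indexing_time_seconds(ASSUMED_FILE_COUNT, 3)    # 3 ms per file
worst_case = indexing_time_seconds(ASSUMED_FILE_COUNT, 10)  # 10 ms per file
print(f"{best_case:.0f} to {worst_case:.0f} seconds")
```

Under that assumption a full index build on the desktop would take somewhere between half a minute and a couple of minutes.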

While you type the first search, on title, possible completions are shown by treating your input as a substring of the title you seek. This is more effective if you keep it in mind, because a few keys from a substring of the title are enough. All searches are performed using regex matching on strings, which is much slower than direct equality because of the huge complications it introduces for indexing. Interestingly, even with more than 10x the number of files you can cram onto an n810, and using nasty slow regex searching, the performance is acceptable much of the time. There are a few cases that I'll improve, particularly regex searching on whole URLs.
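
The substring completion idea can be sketched in a few lines of Python. This is a minimal illustration and not the ferris-music-search code: the function name, the result limit, and the case-insensitive matching are all assumptions, and it scans a plain list where the real tool consults the index.

```python
import re


def complete_title(fragment, titles, limit=5):
    """Return up to `limit` titles that contain `fragment` as a substring.

    The typed fragment is escaped so it is matched literally, and matching is
    case-insensitive (an assumption made for this sketch).
    """
    pattern = re.compile(re.escape(fragment), re.IGNORECASE)
    return [title for title in titles if pattern.search(title)][:limit]


# Made-up titles standing in for the ID3 title field.
titles = [
    "Eine kleine Nachtmusik",
    "Requiem in D minor",
    "Piano Sonata No. 11",
    "Night on Bald Mountain",
]
matches = complete_title("nacht", titles)
```

Typing just "nacht" is enough to single out "Eine kleine Nachtmusik", which is the point of picking a few keys from the middle of a title rather than typing it from the start.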

Notice that the name and URL are shown as columns, so you can easily "group by" a column by clicking on the appropriate header. I also need to include the artist, title, and other ID3 fields in the results. Oh, and I want the ability to click on a few files and see the whole ID3 and metadata of those files "side-by-side", so you do not have to try to read it from the results list.

So now when I see a CD in the shops, I won't have to wonder how many of those Mozart tracks I have already. I can know for sure :-p
