24.4.11

Staff removal and template matching

Tim and I took a step back to see what is ahead and do a little bit of planning. We decided to use template matching to recognize the symbols. These are the major tasks we need to do next:

- Create a bootstrap template library (I am currently working on this; I will henceforth refer to it as BTL: Bootstrap Template Library, and while we're at it, BTC: Bootstrap Template Creation)
- Create a template matcher (Tim is currently working on this)
- Create a training program which will add to the bootstrap template library
- ???
- Profit
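For anyone wondering what "template matching" boils down to, here is a minimal sketch in plain Python. It is not Tim's matcher, just the basic idea under simple assumptions: images are lists of rows of 0/1 pixels, and the score is the fraction of pixels that agree. The names `match_score`, `best_match`, `page` and `notehead` are made up for illustration.

```python
def match_score(image, template, top, left):
    """Fraction of template pixels that agree with the image at (top, left)."""
    h, w = len(template), len(template[0])
    agree = sum(
        1
        for r in range(h)
        for c in range(w)
        if image[top + r][left + c] == template[r][c]
    )
    return agree / (h * w)


def best_match(image, template):
    """Slide the template over every position; return (score, top, left) of the best."""
    h, w = len(template), len(template[0])
    best = (-1.0, 0, 0)
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            score = match_score(image, template, top, left)
            if score > best[0]:  # strict '>' keeps the first of any tied positions
                best = (score, top, left)
    return best


# Toy 5x6 "page" with one tiny symbol whose top-left corner is at row 2, column 3.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]
notehead = [
    [1, 0],
    [1, 1],
]
print(best_match(page, notehead))  # (1.0, 2, 3)
```

A real matcher would need to tolerate noise and scale differences, which is why the quality of the template library matters so much.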

Backtrack a little: so far we've dealt with staff removal, a task which may or may not help with character recognition. It does more harm than good if our staff removal algorithm mangles symbols that overlap with the staves. Having no staff removal at all is better than having a bad one, BUT no staff removal means more tedious BTC. Creating and using a template matcher with a dataset that has staff lines means we would have to account for all the ways staff lines could pass through a symbol: a near-exponential blowup of the number of templates.
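To make the "mangling" worry concrete, here is a deliberately dumb staff remover. It is my own toy, not any of the published algorithms: it blanks every row that is mostly black. The function name `remove_staff_rows` and the 0.8 threshold are assumptions picked for the example.

```python
def remove_staff_rows(image, threshold=0.8):
    """Blank every row whose fraction of black pixels is at least `threshold`."""
    width = len(image[0])
    cleaned = []
    for row in image:
        if sum(row) / width >= threshold:
            cleaned.append([0] * width)  # treat the whole row as a staff line
        else:
            cleaned.append(list(row))    # copy the row untouched
    return cleaned


# One staff line (row 2) crossed by a vertical note stem (column 3).
staff_with_stem = [
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
]
cleaned = remove_staff_rows(staff_with_stem)
for row in cleaned:
    print("".join("#" if px else "." for px in row))
```

The staff line disappears, but so does the stem pixel that sat on it, so the stem is now cut into two pieces. A template for an intact stem would no longer match cleanly, which is exactly the kind of damage a bad staff removal step causes.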

Luckily for us, we found a really cool Gamera library called musicStaves which contains functions that implement some of the current best staff removal algorithms. The thing is, Tim and I had heard of all these algorithms in the various papers we read about staff removal. It's really cool to actually see them in action.

Here is an overview of the algorithms after some testing with the sample set, which consists of super high quality (I'm talking 10+ MB per file) PNGs created when Tim photographed music sheets with his fancy camera.

- Linetracking: the worst. Makes sense, since the algorithm is a very simple one.
- Fujinaga: really good; staves were removed cleanly, for the most part.
- Carter: also really good, not much different from Fujinaga.
- Roach-Tatem: this one actually caused the Gamera GUI to crash, resulting in a bunch of errors. Never using this again.
- Skeleton: this one was also good, very similar to Fujinaga and Carter.

It seems like Fujinaga, Carter, and Skeleton are the best. I stored sample results with Fujinaga, just because the name is cooler.

See for yourself in this sample result.

Also, another sample result.

As you can see, the results are pretty good, but the algorithm fails to remove staves in seemingly random locations. This might be due to Tim's lack of photography skills; nevertheless, we still need to account for the fact that the staff removal algorithms are not perfect and will fail in some, albeit rare, cases.

This presents an annoying problem for templating: a very good staff removal algorithm, apparently, is not good enough to control the size of our BTL. We can't neglect the potential areas of the music sheet where the staff line removal algorithm failed.

I guess BTC will have to include stafflines. That's something I personally don't look forward to. BTC is not difficult, just tedious and highly repetitive. It will look something like this:

mFile = open(musicfile);
while (true) {
    copyRegion(mFile);
    pasteRegion(new ImageFile());
    cry();
}

Now if only this function really existed.
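Jokes aside, the copy-and-paste part is easy to script once the regions are known. Here is a rough sketch, assuming a page is a list of rows of 0/1 pixels and that someone (me) has already marked each symbol by hand as a label plus a (top, left, height, width) box. The names `crop_region` and `build_library` are hypothetical, not part of Gamera.

```python
def crop_region(image, top, left, height, width):
    """Copy a rectangular region out of the page."""
    return [row[left:left + width] for row in image[top:top + height]]


def build_library(image, labelled_boxes):
    """Group cropped regions by symbol label: {label: [template, ...]}."""
    library = {}
    for label, (top, left, height, width) in labelled_boxes:
        template = crop_region(image, top, left, height, width)
        library.setdefault(label, []).append(template)
    return library


# Toy page and hand-marked boxes, for illustration only.
page = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]
boxes = [
    ("notehead", (0, 1, 2, 2)),
    ("rest", (2, 0, 1, 4)),
    ("notehead", (1, 1, 1, 2)),
]
library = build_library(page, boxes)
print(sorted(library))  # ['notehead', 'rest']
```

The tedious part, marking the boxes by hand for every symbol and every way a staff line can cross it, is unfortunately still all mine.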
