15 June, 2010

6/15/10 Daily Journal of AT

So I figured out the problem from yesterday. I had created my files with a low amplitude (0.1), in case I decided to add them together. However, at an amplitude that low, the peaks are lost entirely in our FFT. At an amplitude of 1, they give a much different peak. (I should have guessed when the waving occurred.) *sigh*

So I ran the numbers again, with a higher amplitude and shorter length (ten times as loud, but four percent of the length). Because it was so short, there was no data available for frequencies below 40 Hz. However, the ratio is now what was expected: half the sampling rate (44.1 kHz) over one fourth the buffer size (8192). Well, more or less. Annoyingly, this means that even at the very large buffer size of 32768, there are 2.691650390625 Hz per array element. If I wanted to reduce that and get to (at least) a 1:1 ratio, we'd have to quadruple the buffer size (to 131072), which would drastically increase the time it takes to run the FFT. Potentially, we could merely double the buffer size (to 65536) and have the ratio be about 1.3:1, which would allow us to at least tell the difference between lower notes (C1 and lower).
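To double-check the arithmetic, here is a minimal sketch of that ratio in Java. The class and method names are made up for illustration and are not from our project; the only thing taken from above is the formula itself (half the sampling rate over one fourth the buffer size).

```java
// Sketch only: BinWidth and hzPerElement are hypothetical names, not project code.
final class BinWidth {
    // Hz covered by each array element, using the ratio described in this entry:
    // (sampleRate / 2) / (bufferSize / 4).
    static double hzPerElement(double sampleRate, int bufferSize) {
        return (sampleRate / 2.0) / (bufferSize / 4.0);
    }

    public static void main(String[] args) {
        int[] bufferSizes = {8192, 32768, 65536, 131072};
        for (int size : bufferSizes) {
            System.out.println(size + " -> " + hzPerElement(44100.0, size) + " Hz per element");
        }
    }
}
```

With a 44.1 kHz sampling rate this gives 2.691650390625 Hz for 32768, roughly 1.35 Hz for 65536, and roughly 0.67 Hz for 131072, which is where the doubling and quadrupling figures come from.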

After all that, I made a function to figure out, based on frequency, what note was at the peak. It took a little figuring, but I managed to create it and then improve it. Initially, I had thought to use if/elif statements, but realized that would be a lot of code. Instead, I used a shortcut. I put the frequencies for the lowest scale (Ab0 to G0) into an array called basicNotes, with the corresponding letter values in literalNotes. I also put the distance between each pair of neighboring elements into another array (one element longer) called basicRanges. The program takes an array of floats, presumably frequencies. For each one, it first figures out what scale it's in by testing, in a for loop, whether it's greater than the C of that scale. In a second for loop, it tests against each of the basic notes, multiplied by 2^scale (because each octave doubles the frequency), and returns the note that it's closest to, based on a threshold between notes. (Right now the threshold is halfway between them, which probably isn't exactly right, but it's close enough for jazz.) I had thought to store all this in one array, but Java requires every element of an array to be the same type, so I split them into floats and strings.
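For the curious, the idea can be sketched roughly like this. This is not the actual function: it assumes the base octave runs C0 to B0 with equal-tempered values (A4 = 440 Hz), it takes one frequency instead of an array, and it replaces basicRanges with a plain "nearest candidate" comparison, which amounts to a halfway threshold. Only the names basicNotes and literalNotes come from the description above; NoteFinder and nearestNote are invented for the example.

```java
// Sketch only: a simplified stand-in for the note finder described above.
final class NoteFinder {
    // Assumed equal-tempered frequencies (Hz) for C0 through B0, with A4 = 440 Hz.
    static final double[] basicNotes = {
        16.3516, 17.3239, 18.3540, 19.4454, 20.6017, 21.8268,
        23.1247, 24.4997, 25.9565, 27.5000, 29.1352, 30.8677
    };
    static final String[] literalNotes = {
        "C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"
    };

    static String nearestNote(double frequency) {
        if (!(frequency > 0) || Double.isInfinite(frequency)) {
            throw new IllegalArgumentException("frequency must be positive and finite");
        }
        // Find the octave: keep doubling while the next octave's C is still at or below the input.
        int octave = 0;
        double scale = 1.0;
        while (basicNotes[0] * scale * 2.0 <= frequency) {
            scale *= 2.0;
            octave++;
        }
        // Compare against the 12 notes of this octave plus the C of the next one,
        // so a peak just under a C is not mistaken for the B below it.
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i <= basicNotes.length; i++) {
            double candidate = (i < basicNotes.length)
                    ? basicNotes[i] * scale
                    : basicNotes[0] * scale * 2.0;
            double distance = Math.abs(frequency - candidate);
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        if (best == basicNotes.length) {
            return literalNotes[0] + (octave + 1);
        }
        return literalNotes[best] + octave;
    }

    public static void main(String[] args) {
        double[] peaks = {292.0, 585.4, 522.2, 259.7};
        for (double peak : peaks) {
            System.out.println(peak + " Hz -> " + nearestNote(peak));
        }
    }
}
```

The extra thirteenth candidate matters for peaks like 259.7 Hz, which sit just below C4 and would otherwise be compared only against the notes of the octave underneath.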

The rest of the day was spent on two things: testing different files to see how high their frequency peaks were, to try and get a ballpark range for the filtering, and testing the note finder. The latter is going better. Tomorrow will be cleaning up the files so that they play nice with Axtell's GUI, and trying to find a rule of thumb for getting the higher peaks.

P.S. For the test file we were given, the peaks are (approximately) at 292.0, 585.4, 522.2, and 259.7 Hz. In note terms, they are D4, D5, C5, and C4, respectively. As analyzed in thirds, the first two notes are D4 and C5, second two are D4 and D5, third two are C4 and C5.
