Today was spent re-implementing Peak Data. I know, I know: after all I said before about Threshold being so great at cleaning data, this is unexpected. However, with the amount of testing done by myself and Axtell, it was clear that Threshold was cutting out audible peaks in the upper frequencies while leaving lower peaks that may or may not have existed. I've had to adjust several aspects of the program, including getting rid of the equalizer. Oddly, it was changing the results of Peak Data, even though the function works on a relative scale. Fortunately, I've managed to get fairly consistently accurate data on most of the windows with the same function (rectangular windowing remains the messiest).
In addition, I rewrote Statistics so it just creates a file of the statistical data, rather than the FFT as well. The peak data is still being written to file, which is the most important part of the data. Tomorrow, I hope to move on to getting the constant Q to work with the other functions (I'll have to change the scaling for a lot of them). I'm debating abandoning the BPM finder, as it is not terribly accurate, has taken up a lot of time already, and may not be useful in the long run.
This blog is designed to be a web journal to document the progress made on a 2010 summer research project regarding music analysis and Fast Fourier Transforms.
Showing posts with label monday. Show all posts
19 July, 2010
12 July, 2010
7/12/10 Daily Journal of AT
After a bit of a break, I'm back for the seventh week of the internship. Most of my work today was with Beat Finder. I got the data about as clean as it will ever be, relatively speaking, and moved on to finding the beats per minute of any given song.
Essentially, BPM, or beats per minute, is a way of recording the tempo of a song, originally for use with a metronome. A moderately speedy song would have a BPM of 120, or contain 120 quarter notes per minute (two a second). Slower songs have a lower BPM, and faster songs have a higher BPM.
The path to getting BPM from beat data is a bit complicated. First, the program starts with an array of beats and silences, where each point of data represents 1024 samples. The program measures the space between each beat and stores it in a new array. The array is sorted, then each length is translated into BPM: multiply by the chunk size (1024) to get samples, divide by the number of samples per minute (44100*60) to get minutes per beat, then invert.
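The conversion above can be sketched in Java. This is a minimal illustration, not the actual program: the method and class names are mine, and only the 1024-sample chunk size and 44100 Hz sample rate come from the post.

```java
// Sketch of the gap-to-BPM conversion: a gap between beats, measured in
// 1024-sample chunks, becomes a fraction of a minute, then gets inverted.
public class BpmConversion {
    static final int CHUNK_SIZE = 1024;    // samples per beat-data point
    static final int SAMPLE_RATE = 44100;  // samples per second

    /** Convert a gap between beats (in 1024-sample chunks) to beats per minute. */
    static double gapToBpm(int gapInChunks) {
        double gapInSamples = gapInChunks * (double) CHUNK_SIZE;
        double samplesPerMinute = SAMPLE_RATE * 60.0;
        // minutes per beat = gapInSamples / samplesPerMinute; BPM is its inverse
        return samplesPerMinute / gapInSamples;
    }

    public static void main(String[] args) {
        // A gap of roughly 21.5 chunks corresponds to about 120 BPM.
        System.out.println(gapToBpm(40));
    }
}
```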
The frequency of each BPM is totaled and put into another new array. Right now, the program finds the three most frequently occurring BPMs, the average, and the average after outliers are removed. With the test files I've used, the most frequent BPM and the second average tend to be the same number, so it is slightly redundant. However, if a song changes tempo or has an irregular backbeat, this extra data may become necessary.
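The tallying step might look something like the sketch below. The post doesn't show its code, so all names are illustrative, and the two-standard-deviation outlier cutoff is my assumption; the post only says outliers are removed.

```java
import java.util.*;

public class BpmStats {
    /** The three most frequently occurring BPM values, rounded to whole numbers. */
    static int[] topThree(double[] bpms) {
        Map<Long, Integer> counts = new HashMap<>();
        for (double b : bpms) counts.merge(Math.round(b), 1, Integer::sum);
        return counts.entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())  // most frequent first
                .limit(3)
                .mapToInt(e -> e.getKey().intValue())
                .toArray();
    }

    static double average(double[] bpms) {
        double sum = 0;
        for (double b : bpms) sum += b;
        return sum / bpms.length;
    }

    /** Average after dropping outliers; the 2-standard-deviation rule is my assumption. */
    static double trimmedAverage(double[] bpms) {
        double mean = average(bpms);
        double var = 0;
        for (double b : bpms) var += (b - mean) * (b - mean);
        double sd = Math.sqrt(var / bpms.length);
        double sum = 0;
        int n = 0;
        for (double b : bpms) {
            if (Math.abs(b - mean) <= 2 * sd) { sum += b; n++; }
        }
        return n == 0 ? mean : sum / n;
    }
}
```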
Overall, this seems fairly accurate. However, when testing the BPM by creating a new beat file, it tended to lag behind the song. I'm going to try to fix this tomorrow, then either fix the file generator or move on to a new feature.
28 June, 2010
6/28/10 Daily Journal of AT
Today was pretty good. I started writing my beat finder function, which took some time. Merely reading the file in was hard enough! Once I had the data, I split it into miniwave-sized chunks, squared the value at each point, and totaled each chunk. I then took the average of all the chunks together and compared each chunk to the overall average; if a chunk was higher, it was a beat. That worked okay, so I decided to work on improvements. The first, storing all the data as chunks rather than exact bytes, I had already implemented, so I moved on to getting the variance.
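The basic beat finder described above can be sketched as follows. This is my reconstruction from the description, not the actual code; the post later replaces the plain average with a variance-based threshold.

```java
// Sketch of the first-pass beat finder: square and sum each chunk's samples
// to get its energy, then flag chunks whose energy exceeds the overall average.
public class BeatFinder {
    static boolean[] findBeats(double[][] chunks) {
        int n = chunks.length;
        double[] energy = new double[n];
        double total = 0;
        for (int i = 0; i < n; i++) {
            double e = 0;
            for (double s : chunks[i]) e += s * s;  // squared amplitude, summed
            energy[i] = e;
            total += e;
        }
        double avg = total / n;
        boolean[] beats = new boolean[n];
        for (int i = 0; i < n; i++) {
            beats[i] = energy[i] > avg;  // louder than average counts as a beat
        }
        return beats;
    }
}
```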
That took a good deal longer. First off, I had to actually understand the math and write a separate function for it. Then I realized it made a lot more sense inside my other function, so I moved it, and managed to muck it up. Basically, it always returned a beat, regardless of height. After about an hour and a half of fixing that (and leaving the program a bit messy), it was working more smoothly. So I gave it to Axtell to run on a song, because her computer is much faster than mine.
It ran out of memory.
Fortunately, there's a fix. The memory ran out when reading the file in, which means the file was too large to hold in a single array (the same thing happened with our FFT in the early days). At first, I thought the trick would be to read in the miniwave files instead. However, upon closer inspection, the error proved to be in the size of the array holding the data, which would not be reduced by using the miniwaves.
The answer is, rather than reading everything at once, read just enough data to fill a chunk, add the chunk to a list, and make an array out of the list at the end. I've gotten rid of that error, but for some reason, all my data is set to zero. Well, it's something to fix tomorrow, as well as working on adding the FFT data to it. Probably have to write all the data to a file...which we got rid of...oh, the joys of coding.
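The chunk-at-a-time fix could look like the sketch below. It is an illustration under assumptions the post doesn't state: 16-bit little-endian mono samples, and that any WAV header has already been skipped; names are mine.

```java
import java.io.*;
import java.util.*;

// Sketch of the streaming fix: read one chunk's worth of bytes at a time,
// reduce each chunk to a single energy value, collect those in a list,
// and only build a (much smaller) array at the end.
public class ChunkedReader {
    static final int CHUNK_SAMPLES = 1024;

    static double[] readChunkEnergies(InputStream in) throws IOException {
        List<Double> energies = new ArrayList<>();
        byte[] buf = new byte[CHUNK_SAMPLES * 2];  // 2 bytes per 16-bit sample
        int read;
        while ((read = in.read(buf)) > 0) {
            double e = 0;
            for (int i = 0; i + 1 < read; i += 2) {
                // assemble a little-endian 16-bit sample (assumed format)
                int s = (short) ((buf[i] & 0xFF) | (buf[i + 1] << 8));
                e += (double) s * s;
            }
            energies.add(e);  // one value per chunk, not per sample
        }
        double[] out = new double[energies.size()];
        for (int i = 0; i < out.length; i++) out[i] = energies.get(i);
        return out;
    }
}
```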
21 June, 2010
6/21/10 Daily Journal of AT
Some success, but not a lot of interest today. I made the windowing work by having getPeaks filter the data more in the Hamming and Hanning windows. It took all morning to get right, and I still have a few issues with the note finder. This afternoon I got the file finder working for when the user wants a specific second to FFT. Also, I looked over what the professor was doing. It makes sense, but I don't know if she wants us to change something we're doing, add something, or what.
Yeah, short post today, sorry. Will write more when more is done.
Quote of the day: "If I had a penny for every time something to do with this program was logarithmic..."
14 June, 2010
6/14/10 Daily Journal of AT
Another day where things go wrong. They seem to come in cycles.
I basically had two things I wanted to do today: figure out a better peak finder, and find what peaks correspond to what frequencies. Since the correspondence was easier, I decided to get that out of the way first. It started out fairly easily: I created a number of files of frequencies ranging from 10 to 200 Hz, going up by 10 each time, then wrote a shell program to run an FFT and a peak finder on each of them. (Since there was only one peak, I ran the halving function six times on each, which got the major peak, or occasionally peaks.) I plotted the data and found the ratio. To be on the safe side, I did so for a variety of buffer sizes, from 2^15 down to 2^10. I decided to use the largest, as it gave the most exact data. With 2^15, or 32768, the ratio of array placement to Hz was 28.6. So far so good.
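For comparison with the empirically fitted ratio: the textbook relation between an FFT bin index and frequency depends only on the buffer size and sample rate. Whether it matches the measured 28.6 depends on how the output array is laid out, which the post doesn't show; this sketch just states the standard formula.

```java
// Standard bin-to-frequency relation for an N-point FFT.
public class BinFrequency {
    /** Frequency in Hz represented by bin `index` of an N-point FFT at the given sample rate. */
    static double binToHz(int index, int fftSize, double sampleRate) {
        return index * sampleRate / fftSize;
    }

    public static void main(String[] args) {
        // With N = 32768 and 44100 Hz audio, adjacent bins are about 1.35 Hz apart.
        System.out.println(binToHz(1, 32768, 44100.0));
    }
}
```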
The problems began when I decided to run some more frequencies through; namely, frequencies 50 through 2000, going up by 50. For 50, 100, 200, and 250, the peaks corresponded perfectly with their respective Hz. (150 had some problems with resonance, I think. I don't know why.) However, from 300 onward, there were no peaks at their respective frequencies; every peak was below 300. I thought it might be the multiplier, but changing it didn't help with the higher frequencies, and made the lower ones wrong as well. The only conclusion I could come to was that the FFT was cutting off data, which fit what I was seeing.
However, this flies in the face of the testing data done on one of our files, a440AndOnePartial.wav. When graphed, it clearly gave two peaks, the nearer one much larger. To try and figure out what was going on, I graphed the data from 50, 100, 150, 200, 250, and the a440 file.
A little blurry, but I'll explain.
The five graphs on the left are 50 through 250, with the yellow boxes pointing out the peaks. 150 is hard to see, but looking at 140 and 160 (immediately to the right) it is clear where 150 should be. However, at the top right is the a440-and-partial graph, which, fairly clearly, shows those peaks. In the completely wrong positions!
I haven't had much time to mess with the FFT. I tried taking out the mirroring function, which gave a few nearly accurate peaks for 300 to 500. Beyond that, however, the peaks do not exist. I'm going to do some reading tonight to try and figure out what's wrong, but unless this is fixed, the FFT function is useless for any frequencies above middle C.
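Some background on the mirroring mentioned above: for real-valued input, the magnitude spectrum of an N-point FFT is symmetric, so only bins 0 through N/2 carry independent information, and the usual approach is to discard (not fold) the upper half. Whether that is the actual bug here is speculation; this sketch just shows the discard step.

```java
// Keep only the independent first half of a real-input magnitude spectrum.
public class HalfSpectrum {
    /** Return bins 0 .. N/2 inclusive; bins above N/2 mirror the lower half. */
    static double[] firstHalf(double[] magnitudes) {
        int keep = magnitudes.length / 2 + 1;
        double[] out = new double[keep];
        System.arraycopy(magnitudes, 0, out, 0, keep);
        return out;
    }
}
```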
07 June, 2010
6/7/10 Daily Journal of AT
Today was mostly spent with Audacity, again. Started out finishing translating the FFT function from C++ to Java. Harder than it sounds. Also figured out exactly what each method was doing, and why. Bits have to be reversed (this reorders the input for the iterative in-place computation), and the code had three ways of doing it, each faster than the last. After sorting that out, figured out which of the two FFT shells was more useful. Since Axtell was working with the simple version, focused on the logarithmic one, which was not any faster, but generated much clearer data. After all, human hearing is logarithmic, not linear. Didn't get it working yet.
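The bit reversal mentioned above can be shown with a plain sketch (not one of the three progressively faster versions in the Audacity source): an iterative radix-2 FFT wants its input permuted so that index i swaps with the index whose binary digits are reversed.

```java
// Plain bit-reversal permutation, the reordering an iterative radix-2 FFT needs.
public class BitReversal {
    /** Reverse the lowest `bits` bits of `i`. */
    static int reverseBits(int i, int bits) {
        int r = 0;
        for (int b = 0; b < bits; b++) {
            r = (r << 1) | (i & 1);  // shift result left, append lowest bit of i
            i >>= 1;
        }
        return r;
    }

    /** Permute an array of length 2^bits into bit-reversed order, in place. */
    static void bitReversePermute(double[] data, int bits) {
        for (int i = 0; i < data.length; i++) {
            int j = reverseBits(i, bits);
            if (j > i) {  // swap each pair exactly once
                double t = data[i];
                data[i] = data[j];
                data[j] = t;
            }
        }
    }
}
```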
However, did implement Hamming and Bartlett windowing in Axtell's function. Made the data somewhat clearer. Hopefully will get Hanning to work soon, as, according to Audacity, it provides the clearest graph and most obvious peaks.
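The window coefficients involved are the textbook ones, sketched below (assuming the post's "Bartell" means the Bartlett, i.e. triangular, window). Names are mine; this is not the code from Axtell's function.

```java
// Textbook window functions: each returns the coefficient for sample n of N.
public class Windows {
    static double hamming(int n, int N) {
        return 0.54 - 0.46 * Math.cos(2 * Math.PI * n / (N - 1));
    }

    static double hanning(int n, int N) {
        return 0.5 * (1 - Math.cos(2 * Math.PI * n / (N - 1)));
    }

    static double bartlett(int n, int N) {
        // Triangle: 0 at the edges, 1 at the center.
        return 1.0 - Math.abs((n - (N - 1) / 2.0) / ((N - 1) / 2.0));
    }

    /** Taper a sample buffer with the Hanning window before the FFT. */
    static void applyHanning(double[] buf) {
        for (int i = 0; i < buf.length; i++) buf[i] *= hanning(i, buf.length);
    }
}
```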
Tomorrow: Write/find function to take data from FFT, find peaks, store them, and work on getting windows of a song to better analyze them.