26 July, 2010

Axtell's Notes: July 26

Constant Q is working! Really. It can even be graphed. Here's some proof:



These aren't scaled correctly, since Constant Q is logarithmic and I have been using the FFT grapher just to see whether we were getting any points back. I would have a screenshot of Maple Leaf Rag to show you, but at the pace these were going, a 2-minute song looked like it would take around 7 hours. I'll run it tonight so we can see it tomorrow morning.

We have written the most common kernel for the CQT (min frequency = 16.352 Hz, max frequency = 22050.0 Hz, 12 bins per octave, sample rate = 44100.0 Hz) to a text file so the computer doesn't have to recalculate it each time. If those values are changed (min/max frequency and bins are adjustable in the advanced menu; the sample rate is given by the sound file), the kernel is recalculated but not written to a file.
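The actual kernel-caching code isn't shown here, so this is just a minimal sketch of the idea: write the precomputed values to a text file once, and read them back on later runs instead of recalculating. The class and file names are made up for illustration, and a real CQT kernel is a sparse complex matrix rather than the flat array of doubles used here.

```java
import java.io.*;
import java.util.*;

// Minimal sketch of caching a precomputed CQT kernel to a text file so it
// isn't recalculated on every run. Names are illustrative, not project code.
public class KernelCache {
    // Save one kernel value per line.
    static void save(double[] kernel, String path) throws IOException {
        try (PrintWriter out = new PrintWriter(new FileWriter(path))) {
            for (double v : kernel) out.println(v);
        }
    }

    // Load the kernel back; returns null if no cached copy exists,
    // in which case the caller recomputes it (and may choose not to save).
    static double[] load(String path) {
        File f = new File(path);
        if (!f.exists()) return null;
        List<Double> vals = new ArrayList<>();
        try (BufferedReader in = new BufferedReader(new FileReader(f))) {
            String line;
            while ((line = in.readLine()) != null) vals.add(Double.parseDouble(line));
        } catch (IOException e) {
            return null;
        }
        double[] kernel = new double[vals.size()];
        for (int i = 0; i < kernel.length; i++) kernel[i] = vals.get(i);
        return kernel;
    }
}
```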

Tomorrow's to do list:
-Adding windowing functions to CQT
-Getting the CQT data into a logarithmic scale
-General clean-up and testing

23 July, 2010

7/23/10 Daily Journal of AT

Hey, it's the end of the internship! Okay, not really, since we have at least one more week. But it could be.

The old version of everything is working. Axtell and Gregor are still playing with Constant Q, but we do have a working FFT with peak finder, beat analysis, note generation, and statistics data. And they don't generate errors (well, unless the file you give it is error-ridden, in which case we can't help you). I finished off the grapher for the stats data as well. It prints the average, along with the skew, standard deviation, and spread, with kurtosis shown as changing colors.
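The post doesn't say how the stats grapher turns kurtosis into "changing colors," so here is one hypothetical way to do it: clamp the value and sweep the HSB hue from blue (low kurtosis) to red (high). Everything here, including the class name and the range, is an assumption for illustration.

```java
import java.awt.Color;

// Hypothetical sketch of encoding kurtosis as a color, in the spirit of the
// stats grapher described above (the actual mapping isn't documented).
// Low kurtosis maps toward blue, high kurtosis toward red.
public class KurtosisColor {
    static Color colorFor(double kurtosis, double maxKurtosis) {
        // Clamp into [0, maxKurtosis], then sweep hue from blue (0.66) to red (0.0).
        double t = Math.max(0.0, Math.min(kurtosis, maxKurtosis)) / maxKurtosis;
        float hue = (float) (0.66 * (1.0 - t));
        return Color.getHSBColor(hue, 1.0f, 1.0f);
    }
}
```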

Next week: code clean up!

22 July, 2010

Axtell's Notes: July 22

That problem from yesterday (only graphing the first section of the spectrogram) is somewhere in the Audio object class and how we split the file into many samples of the given length. I've avoided the problem, and the DFT half of Transforms is working now, but I should try to find out what the problem was.

How we split without the Audio object: Make an AudioInputStream of the audio. Make a for loop that makes a mini AudioInputStream that is a sample-size-long section of the full stream. (The way AudioInputStream works, each time through it will start reading where it last stopped.) Get the data from the mini stream and FFT that.

How we tried to split it with the Audio object: Make an Audio object of the audio. Get all the data from that object (in an array). Make a for loop that makes a mini array that is sample size long starting where the last one stopped.

How we're splitting now: Make an Audio object of the audio. Audio has a split() function that makes mini streams, gets the data from each of those, and adds it all to a 2D double array. So it's a combination of the two.
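The splitting described above can be sketched roughly as follows. This is not the project's Audio class, just a self-contained illustration of the same idea: read the stream one sample-size chunk at a time (each read picks up where the last stopped) and collect the decoded values into a 2D array. It assumes 16-bit signed mono little-endian audio.

```java
import javax.sound.sampled.*;
import java.io.ByteArrayInputStream;

// Sketch of the split() idea: consume an AudioInputStream chunk by chunk and
// decode each chunk into a row of a 2D double array, one row per FFT frame.
// Assumes 16-bit signed mono little-endian audio; names are illustrative.
public class Splitter {
    static double[][] split(byte[] audioBytes, AudioFormat format, int samplesPerChunk)
            throws Exception {
        int bytesPerSample = 2;                      // 16-bit mono
        int totalSamples = audioBytes.length / bytesPerSample;
        int chunks = totalSamples / samplesPerChunk;
        AudioInputStream full = new AudioInputStream(
                new ByteArrayInputStream(audioBytes), format, totalSamples);
        double[][] out = new double[chunks][samplesPerChunk];
        byte[] buf = new byte[samplesPerChunk * bytesPerSample];
        for (int c = 0; c < chunks; c++) {
            full.read(buf, 0, buf.length);           // continues where it left off
            for (int i = 0; i < samplesPerChunk; i++) {
                int lo = buf[2 * i] & 0xFF;          // low byte, unsigned
                int hi = buf[2 * i + 1];             // high byte, sign-extended
                out[c][i] = ((hi << 8) | lo) / 32768.0;
            }
        }
        return out;
    }
}
```

Each row of the returned array would then be handed to the FFT (or CQT) on its own.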

I did some time trials with the old and new DFT classes and found that the new one is slower, but it is only noticeable on files of a minute or more. CQT is still not working, but Gregor's working on that. The new DFT is also taking a lot more memory than the old. I had to boost the max memory to 4096MB for it to run Sweet Caroline. The old DFT can run the same file with a max of 2048MB.

The whole morning and half the afternoon were spent on those two projects.

The rest of the day went to working on some null pointer exceptions that come up when running Buffet (formerly BigGUI). They happen because there is a listener on the filename text field that should only fire when the Enter key is pressed, but it isn't very easy to get Java to listen for just the Enter key. While these null pointer errors don't stop the program from running, they are annoying and distracting, so I'm going to get rid of them. I'm working on that, and it should be fixed by the end of tomorrow.
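For what it's worth, in Swing an ActionListener added to a JTextField fires only when Enter is pressed, which is one way to get the "only on Enter" behavior without inspecting key events. The original Buffet code isn't shown, so the field and class names below are hypothetical.

```java
import javax.swing.JTextField;

// Sketch of listening for the Enter key on a filename text field. In Swing,
// an ActionListener on a JTextField fires only when Enter is pressed, which
// avoids reacting to every keystroke. Names here are illustrative.
public class FilenameField {
    final JTextField field = new JTextField(30);
    String submitted = null;   // last filename submitted with Enter

    FilenameField() {
        field.addActionListener(e -> submitted = field.getText());
    }
}
```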

Tomorrow is all-day pair programming to neaten/speed up/shorten/fix all the code and end up with one set of classes that we are all working with, since right now we are all working with different code.

7/22/10 Daily Journal of AT

Today was statistics, statistics, statistics. I got a decent graph out of the stats data, and printed the average, along with three standard deviations, with no problems on several files. I then did a bit of research on skew, to better understand how to best represent it visually, and added in a writer for that. After an hour, I realized it wasn't the drawer that was making the skew far off from the average, but the original statistics data.

Initially, the average and standard deviation were based solely on the heights of the data. While this worked fine, it proved useless when trying to find what the average frequency was. However, when trying to find the centroid (the average frequency weighted by heights), the resulting number is always the same, regardless of the file (silent files have the same average as noisy ones, which is clearly wrong).
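For reference, the standard spectral centroid is the magnitude-weighted average of the bin frequencies. If it comes out identical for every file, one common culprit is dropping the magnitudes from the numerator (summing bare frequencies), which reduces to a constant that depends only on the FFT size. The sketch below is the textbook formula, not the project's Statistics code; names and signatures are assumptions.

```java
// Sketch of the standard spectral centroid: the average bin frequency
// weighted by magnitude. A silent frame returns 0 rather than a constant.
public class SpectralCentroid {
    static double centroid(double[] mags, double sampleRate, int fftSize) {
        double weighted = 0, total = 0;
        for (int k = 0; k < mags.length; k++) {
            double freq = k * sampleRate / fftSize;  // frequency of bin k
            weighted += freq * mags[k];              // magnitude must weight the sum
            total += mags[k];
        }
        return total == 0 ? 0 : weighted / total;
    }
}
```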

I haven't even begun on kurtosis. However, it is good to know now that the data is wrong, and hopefully I can fix it tomorrow. The grapher is not as important, though it may work as a replacement for NoteFinder.

21 July, 2010

Axtell's Notes: July 21

The DFT half of the Transforms class is kind of working. It doesn't return all the samples. It does the a440 file correctly, but the fade file shows only the fade up (not the fade down) even though it has the same number of samples shown as the old DFT. I've kept the old DFT working in a separate folder so I can keep updating rainbows and getting rid of spill.


The new DFT is also slower than the old one. I haven't figured out why yet, but the dialog box that pops up and quacks on completion of the split now also prints the time it took to split and FFT the file. The new DFT is about twice as slow as the old.

The Constant Q still doesn't work. It will run for about ten minutes, and it does get numbers, but none of them get through PeakData, so after all that work the computer shows a blank spectrogram.

Gregor wasn't in today, but should be back in tomorrow afternoon so we'll look at this together and get that working by Friday hopefully.

That was more or less all I got done today, as we were locked out of the lab for a while this morning. Lots of slow and steady progress as we work towards getting the Constant Q working. We need to start looking at cleaning up, commenting, and packaging all our code together this week.

20 July, 2010

Axtell's Notes: July 20

Everything has been moving so slowly that I haven't been bothering to type up all I tried that didn't work. I've been working on Threshold v. PeakData and cleaning up BigGUI (Now called Buffet). We're working with PeakData right now, though that's not perfect yet (Tayloe's been doing more work on that, so look at her posts for more information.)

More importantly, Gregor and I have been working on getting a Java Constant Q Transform method, and yesterday we finally got the same numbers as the MATLAB method. Today we spent all day putting her code (Complex, Audio, DFT and CQT) and my code together. I also did major clean-up of BigGUI, FFTGUI, and the graphers. I'm starting to rename classes with more useful, updated names (e.g. GraphSplitter is now Spectrogram). I made a class Transforms that is a combination of DFT and CQT. We're not sure which will be best, so we made both, and we'll test as we go.

I just got everything compiled and tried running Buffet. The DFT plots some points, but they are clearly wrong:

The Constant Q doesn't return anything as of yet. I hope to get this working tomorrow morning, so I can get the grapher working with CQT data by the end of the day. I'll also be updating the menu to incorporate Constant Q.

7/20/10 Daily Journal of AT

This morning was spent in what I hoped was residual testing. However, I found out something weird: the scaling of the data affects what data is kept in the peak generation. At low multipliers, this can be traced to some numbers being set to zero, so the lower the multiplier, the less data is retained. However, a similar, though less dramatic, effect happens as numbers increase (that is, higher multipliers return fewer peaks). The image below shows the same short sound file at multipliers of 1 through 12, counting from left to right, top to bottom.



As you can see, at the multiplier of 4, the most data is returned. This holds true for most files, regardless of overall loudness. We don't know why this is, but have compensated for it.
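The PeakData internals aren't shown here, so this is only a toy illustration of one mechanism consistent with the low-multiplier half of the behavior described above: if scaled values pass through an integer truncation before the peak test, small multipliers push quiet bins to zero and they never count as peaks. It does not explain the high-multiplier drop-off, and nothing in it is the actual project code.

```java
// Toy illustration (not the actual PeakData code) of integer truncation
// eating quiet bins at low multipliers: the smaller the multiplier, the more
// values truncate to zero and fail the peak threshold.
public class MultiplierDemo {
    static int peaksRetained(double[] data, double multiplier, int threshold) {
        int count = 0;
        for (double v : data) {
            int scaled = (int) (v * multiplier);   // truncation loses quiet bins
            if (scaled > threshold) count++;
        }
        return count;
    }
}
```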

The afternoon was spent working on graphing the statistical data. Unfortunately, there are no pretty pictures to show from that, as I've only managed to get the data in and corrected, while the adapted GUI grapher is giving me trouble. It will possibly be replacing Note Finder when it's finished, as it seems a bit more useful.

19 July, 2010

7/18/10 Daily Journal of AT

Today was spent re-implementing Peak Data. I know, I know, after all that I said before about Threshold being so great at cleaning data, this is unexpected. However, with the amount of testing done by myself and Axtell, it was clear that Threshold was cutting out audible peaks in the upper frequencies, while leaving lower peaks that may or may not have existed. I've had to play with several aspects of the program, including getting rid of the equalizer. Oddly, it was changing the results of Peak Data, even though the function works on a relative scale. Fortunately, I've managed to get fairly consistently accurate data on most of the windows with the same function (rectangular windowing remains the messiest).

In addition, I re-wrote Statistics so it just creates a file of the statistical data, rather than the FFT as well. The peak data, the most important part, is still being written to file. Tomorrow, I hope to move on to getting the Constant Q to work with the other functions (the scaling has to change for a lot of them). I'm debating abandoning the BPM finder, as it is not terribly accurate, has taken up a lot of time already, and may not be useful in the long run.

16 July, 2010

7/16/10 Daily Journal of AT

Hey, it's a blog! There hasn't been much to say for the past two days; it was mostly testing and tweaking numbers in BeatFinder and Threshold. They're mostly stable now, so we should be set with that. I also created a statistics finder to go with the FFT, which works, but doesn't do much except display numbers. At the professor's request, I added a method to write the statistical data, as well as all the FFT data, to a file. (I will add more about the statistical data soon.) On a four-minute song, the data size is 2.4 GB. And I got an error message:


Next week: Making it smaller! Also, graphing statistical data.