Analyzing Music Internship: Daily Blog 2

Today I spent most of the day finishing the articles I started yesterday. Part II of An Introduction to the Mathematics of Digital Signal Processing focused on sampling, transforms, and filtering. Both DFTs and z-transforms were discussed. I skipped the section on z-transforms as we are focusing on DFTs right now.

After finishing the articles, I believe I have a much better understanding of how DFTs and sound sampling work and how to deal with the phenomena that can occur when sampling a signal and applying a DFT. The rest of this post will summarize what I read, (hopefully) in a manner that will make the main points clear to Axtell and Tayloe.

As mentioned yesterday, the first part of the article focused on explaining the algebraic and trigonometric topics that are relevant to understanding transforms and sampling. These topics included a description of the different types of numbers in math: whole (Z), natural (N), integer (I), real (R), and complex (C). Each set is "included" in the sets to its right. For example, the set of integers includes the sets of whole and natural numbers, etc. The set of complex numbers is the most important to the discussion of signal processing, as signals are described with both a real and an imaginary part: x + jy, where x and y are elements of R and j² = -1. j is used to represent the square root of -1 in signal processing because it is a branch of engineering, where i is already used to represent current.

The article then goes on to explain polynomials, roots, exponents, logarithms, and e, giving the rules and identities (if any) for each topic. It then explains sums and series. Summations are important because a DFT is expressed mathematically as a summation.

After the important topics in algebra, the article covers trigonometry. Three units can be used to measure angles in trig: radians, degrees, and grads. Pi/2 radians = 90° = 100 grads. Pythagoras' theorem is discussed and the 13 most common trig identities are given. sin(x) and cos(x) differ only in phase (when x = 0, sin(x) = 0 and cos(x) = 1).

I found the following explanations to be helpful:

frequency = 1/period = repetition rate of a periodic waveform
amplitude = the height of the peaks (or depth of the troughs) relative to the midpoint = strength
timbre = tone quality = general shape
pitch = "tonal height" of a sound

Fourier's theorem states the following:

any periodic waveform can be described as the sum of a number of sinusoidal variations with each one having a particular
- frequency
- amplitude
- phase
the waveform must be periodic and must also satisfy the Dirichlet conditions for a real-valued, periodic function; periodic means that
f(t) = f(t + T) where f is the periodic waveform, t is time, and T is the period
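
To make Fourier's theorem a bit more concrete, here is a minimal sketch that builds a waveform by adding up sinusoids, each with its own frequency, amplitude, and phase. The square wave, the 100 Hz fundamental, and the function name are my own choices for illustration and do not come from the article.

```python
import math

def square_wave_partial_sum(t, period, n_terms):
    """Add up the first n_terms sinusoids of a square wave's Fourier series."""
    total = 0.0
    for i in range(n_terms):
        k = 2 * i + 1                  # a square wave uses odd harmonics only
        frequency = k / period         # frequency of this sinusoid in Hz
        amplitude = 4 / (math.pi * k)  # higher harmonics are weaker
        phase = 0.0                    # every component starts in phase
        total += amplitude * math.sin(2 * math.pi * frequency * t + phase)
    return total

T = 0.01  # assumed period in seconds (a 100 Hz fundamental)
for n_terms in (1, 5, 50):
    value = square_wave_partial_sum(T / 4, T, n_terms)
    print(n_terms, "sinusoids ->", round(value, 4))

# Periodicity check: f(t) and f(t + T) should agree up to rounding error.
t = 0.0013
print(abs(square_wave_partial_sum(t, T, 50) - square_wave_partial_sum(t + T, T, 50)))
```

As more sinusoids are added, the value at a quarter of the period moves toward 1, the height of the ideal square wave.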

The last important piece of information from the first article was Euler's equation: e^(jx) = cos(x) + j sin(x). In other words, the constant e (the base of the natural logarithm) raised to the power of x multiplied by the imaginary unit j can be broken up into two parts: the real part, cos(x), and the imaginary part, sin(x), which is multiplied by j.
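
Euler's equation is easy to check numerically. The sketch below uses Python's built-in complex numbers (Python writes j as `1j`); the helper name and the test angle are arbitrary choices of mine.

```python
import cmath
import math

def euler_sides(x):
    """Return both sides of Euler's equation for an angle x in radians."""
    left = cmath.exp(1j * x)                    # e^(jx)
    right = complex(math.cos(x), math.sin(x))   # cos(x) + j sin(x)
    return left, right

left, right = euler_sides(math.pi / 3)
print(left, right)
print(cmath.isclose(left, right))  # True when the two sides agree
```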

The second article focused on sampling, transforms, and filtering. With sampling, a phenomenon known as "the wagon-wheel effect" can occur when the sampling rate is too low relative to the actual frequency being sampled. The name comes from films: as a wagon speeds up, its wheel appears to spin faster until its rotation rate reaches half the sampling rate (frames per second). Past this point, the wheel appears to decelerate or even go backwards. At R/2 (where R is the sampling rate), the frequency of the wheel is indistinguishable from its negative counterpart. For example, old Western films were shot at 24 frames per second (R = 24 Hz). At F = 12 Hz (F = actual frequency of the wheel), the wheel appears to be rotating at its maximum velocity. At F = 18 Hz, the wheel appears to be rotating at -6 Hz because it is impossible to tell whether the wheel is completing 3/4 of a rotation each frame or -1/4 of a rotation. The Wikipedia article provides a good animation of this effect. The sampling theorem states that, to avoid aliasing or foldover (the wagon-wheel effect), any simple harmonic variation (a sinusoid) occurring at a rate of F Hz must be sampled at least 2F times per second. However, a rate of exactly 2F leaves the same ambiguity as the wheel at R/2, so R cannot exactly equal 2F. The sampling rate must therefore be more than twice the highest frequency component of the original waveform.
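
The wagon-wheel numbers above can be reproduced with a little modular arithmetic. This is only a sketch of the folding idea; the function name and the convention of reporting exactly R/2 as a negative value are assumptions of mine, not something from the article.

```python
def apparent_frequency(actual_hz, sample_rate_hz):
    """Fold a true frequency into the range [-R/2, R/2) that sampling can show."""
    half = sample_rate_hz / 2
    return (actual_hz + half) % sample_rate_hz - half

R = 24  # frames per second in the film example
for F in (6, 12, 18, 24):
    print(f"actual {F} Hz -> appears as {apparent_frequency(F, R)} Hz")
```

With R = 24, a wheel turning at 18 Hz folds to -6 Hz, matching the example, and a wheel turning at exactly 24 Hz appears to stand still.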

Digital signals can be thought of as functions of discrete values of time (n), or as sequences of numbers, with each number representing the instantaneous value of a continuous time function (the original analog signal). I came across a good analogy while reading: "Prism is to light as Fourier analysis is to sound." White noise is the sound equivalent of white light in optics.
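
As a small illustration of a signal as a sequence of numbers, the sketch below samples a sinusoid at discrete times n. The function name and the choice of 8 samples are mine; the 18 Hz and -6 Hz values reuse the film example to show that the two sequences come out the same.

```python
import math

def sample_sinusoid(frequency_hz, sample_rate_hz, n_samples):
    """Return x(n) = sin(2*pi*F*n/R) for n = 0 .. n_samples - 1."""
    return [
        math.sin(2 * math.pi * frequency_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

R = 24  # sampling rate in Hz
fast = sample_sinusoid(18, R, 8)             # above R/2, so it aliases
slow_backwards = sample_sinusoid(-6, R, 8)   # its alias below R/2

# The largest difference between the two sequences is only rounding error.
largest_gap = max(abs(a - b) for a, b in zip(fast, slow_backwards))
print(largest_gap)
```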

The article then discusses the DFT. It is "used to calculate the spectrum of a waveform in terms of a set of harmonically related sinusoids, each with a particular amplitude and phase" and is most commonly implemented by the FFT. Some quick notes about the DFT sequence x(n):

x(n) is modeled as one period with N samples (entire audio file is one period)
FFT generally requires N to be a power of two
more than two samples are required (N > 2)
DFT[x(n)] = X(k), where

\[ X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j\omega nk}, \qquad 0 \le k \le N-1 \]

with \( \omega = 2\pi/N \) and \( e^{-j\omega nk} = \cos(\omega nk) - j\sin(\omega nk) \)

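magnitude (amplitude) = |X(k)| = sqrt(a_k² + b_k²), where a_k and b_k are the real and imaginary parts of X(k)
spacing between harmonics = R/N Hz, so X(k) corresponds to a frequency of kR/N Hz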
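
The DFT summation can be written almost word for word in code. This is a slow, direct sketch for illustration rather than an FFT, and the test signal (a 2 Hz cosine sampled at 8 Hz for 8 samples) is a made-up example of mine.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X(k) = sum over n of x(n) * e^(-j*w*n*k), with w = 2*pi/N."""
    N = len(x)
    w = 2 * math.pi / N
    return [
        sum(x[n] * cmath.exp(-1j * w * n * k) for n in range(N))
        for k in range(N)
    ]

R = 8  # assumed sampling rate in Hz
N = 8  # number of samples (a power of two, as an FFT would want)
F = 2  # frequency of the test cosine in Hz

x = [math.cos(2 * math.pi * F * n / R) for n in range(N)]
X = dft(x)

# abs() of a complex number is sqrt(real^2 + imag^2), i.e. the magnitude.
magnitudes = [abs(Xk) for Xk in X]
for k, magnitude in enumerate(magnitudes):
    print(f"k={k}  frequency={k * R / N} Hz  |X(k)|={magnitude:.3f}")
```

The energy shows up at k = 2 (2 Hz) and at its mirror image k = 6, while the other values of k are essentially zero.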

Hopefully much of this was a review and me pointing out the obvious. I hope this will help clear up any remaining issues with understanding sampling and DFTs. The full articles were written by F.R. Moore and were published in Computer Music Journal Vol. 2 Nos. 1-2. These articles can be accessed through the JSTOR database for further reading.

08 June, 2010
