Converting a wav file to amplitude and frequency values for textual, time-series analysisHow to convert a .wav file to a spectrogram in python3Convert Amplitude BACK to .wav file (C#)Amplitude / Frequency analysis in MatlabCompare two wav files in the frequency domain in javaJava .wav file frequency analysis - incorrect frequencySolving for Amplitude and Frequency in WAV filesfind the frequency amplitude and phase of a wav file by fftHow to Frequency analysis of wav files in 1 second interval?(not graphical)Python audio wav file analysissignal.spectrogram returns too many hzpython numpy error “TypeError: 'numpy.float64' object cannot be interpreted as an integer”
Python π = 1 + (1/2) + (1/3) + (1/4) - (1/5) + (1/6) + (1/7) + (1/8) + (1/9) - (1/10) ...1748 Euler
Password management for kids - what's a good way to start?
Meaning of 誰かの代わりに
How can a class have multiple methods without breaking the single responsibility principle
Reasons for using monsters as bioweapons
Can birds evolve without trees?
If I buy and download a game through second Nintendo account do I own it on my main account too?
Why are sugars in whole fruits not digested the same way sugars in juice are?
A conjectural trigonometric identity
Ernie and the Superconducting Boxes
How to draw twisted cuves?
Backpacking with incontinence
Does the use of a new concept require a prior definition?
Flat maps and Zariski tangent spaces
"Fewer errors means better products" or "Fewer errors mean better products"?
How is Sword Coast North governed?
Plotting Chebyshev polynomials using PolarPlot and FilledCurve
What's the term for a group of people who enjoy literary works?
When did J.K. Rowling decide to make Ron and Hermione a couple?
PI 4 screen rotation from the terminal
What's the proper way of indicating that a car has reached its destination during a dialogue?
Applied Meditation
Export economy of Mars
How to avoid a lengthy conversation with someone from the neighborhood I don't share interests with
Converting a wav file to amplitude and frequency values for textual, time-series analysis
How to convert a .wav file to a spectrogram in python3Convert Amplitude BACK to .wav file (C#)Amplitude / Frequency analysis in MatlabCompare two wav files in the frequency domain in javaJava .wav file frequency analysis - incorrect frequencySolving for Amplitude and Frequency in WAV filesfind the frequency amplitude and phase of a wav file by fftHow to Frequency analysis of wav files in 1 second interval?(not graphical)Python audio wav file analysissignal.spectrogram returns too many hzpython numpy error “TypeError: 'numpy.float64' object cannot be interpreted as an integer”
.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty margin-bottom:0;
I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to csv in a time series format.
Using @Beginner's answer heavily from this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output in an image. I'm trying to simplify that somewhat to get to a text output in csv format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:
time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200
For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html, (Notes: I simplified the wav down to a single channel and removed metadata w/ Audacity to get it to work.)
Heavy props to @Beginner for 99.9% of the following, anything nonsensical is surely mine.
import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks
filepath = "40sw3.wav"
""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
win = window(frameSize)
hopSize = int(frameSize - np.floor(overlapFac * frameSize))
# zeros at beginning (thus center of 1st window should be for sample nr. 0)
samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
# cols for windowing
cols = np.ceil( (len(samples) - frameSize) / float(hopSize)) + 1
# zeros at end (thus samples can be fully covered by frames)
samples = np.append(samples, np.zeros(frameSize))
frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize), strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
frames *= win
return np.fft.rfft(frames)
""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
timebins, freqbins = np.shape(spec)
scale = np.linspace(0, 1, freqbins) ** factor
scale *= (freqbins-1)/max(scale)
scale = np.unique(np.round(scale))
# create spectrogram with new freq bins
newspec = np.complex128(np.zeros([timebins, len(scale)]))
for i in range(0, len(scale)):
if i == len(scale)-1:
newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
else:
newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)
# list center freq of bins
allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
freqs = []
for i in range(0, len(scale)):
if i == len(scale)-1:
freqs += [np.mean(allfreqs[int(scale[i]):])]
else:
freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]
return newspec, freqs
""" compute spectrogram """
def compute_stft(audiopath, binsize=2**10):
samplerate, samples = wav.read(audiopath)
s = stft(samples, binsize)
sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
ims = 20.*np.log10(np.abs(sshow)/10e-6) # amplitude to decibel
return ims, samples, samplerate, freq
""" plot spectrogram """
def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
timebins, freqbins = np.shape(ims)
plt.figure(figsize=(15, 7.5))
plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
plt.colorbar()
plt.xlabel("time (s)")
plt.ylabel("frequency (hz)")
plt.xlim([0, timebins-1])
plt.ylim([0, freqbins])
xlocs = np.float32(np.linspace(0, timebins-1, 5))
plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])
if plotpath:
plt.savefig(plotpath, bbox_inches="tight")
else:
plt.show()
plt.clf()
"" HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT """
ims, samples, samplerate, freq = compute_stft(filepath)
""" Print lengths """
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))
""" Write values to files """
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='n', header='frequencies')
In terms of values out, the file I'm analyzing is approx 19.1 seconds long and the sample rate is 44100, so I’d expect to have about 842k values for any given variable. But I'm not seeing what I expected. Instead here is what I see:
freqs comes out with just a handful of values, 512 and while they appear to be correct range for expected frequency, they are ordered least to greatest, not in time series like I expected. The 512 values, I assume, is the "fast" in FFT, basically down-sampled...
ims, appears to be amplitude, but values seem too high, although sample size is correct. Should be seeing -50 up to ~240dB.
samples . . . not sure.
In short, can someone advise on how I'd get the FFT out to a text file with time, amp, and freq values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I just get out the data?
python-3.x numpy fft wav
add a comment |
I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to csv in a time series format.
Using @Beginner's answer heavily from this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output in an image. I'm trying to simplify that somewhat to get to a text output in csv format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:
time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200
For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html, (Notes: I simplified the wav down to a single channel and removed metadata w/ Audacity to get it to work.)
Heavy props to @Beginner for 99.9% of the following, anything nonsensical is surely mine.
import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks
filepath = "40sw3.wav"
""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
win = window(frameSize)
hopSize = int(frameSize - np.floor(overlapFac * frameSize))
# zeros at beginning (thus center of 1st window should be for sample nr. 0)
samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
# cols for windowing
cols = np.ceil( (len(samples) - frameSize) / float(hopSize)) + 1
# zeros at end (thus samples can be fully covered by frames)
samples = np.append(samples, np.zeros(frameSize))
frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize), strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
frames *= win
return np.fft.rfft(frames)
""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
timebins, freqbins = np.shape(spec)
scale = np.linspace(0, 1, freqbins) ** factor
scale *= (freqbins-1)/max(scale)
scale = np.unique(np.round(scale))
# create spectrogram with new freq bins
newspec = np.complex128(np.zeros([timebins, len(scale)]))
for i in range(0, len(scale)):
if i == len(scale)-1:
newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
else:
newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)
# list center freq of bins
allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
freqs = []
for i in range(0, len(scale)):
if i == len(scale)-1:
freqs += [np.mean(allfreqs[int(scale[i]):])]
else:
freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]
return newspec, freqs
""" compute spectrogram """
def compute_stft(audiopath, binsize=2**10):
samplerate, samples = wav.read(audiopath)
s = stft(samples, binsize)
sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
ims = 20.*np.log10(np.abs(sshow)/10e-6) # amplitude to decibel
return ims, samples, samplerate, freq
""" plot spectrogram """
def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
timebins, freqbins = np.shape(ims)
plt.figure(figsize=(15, 7.5))
plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
plt.colorbar()
plt.xlabel("time (s)")
plt.ylabel("frequency (hz)")
plt.xlim([0, timebins-1])
plt.ylim([0, freqbins])
xlocs = np.float32(np.linspace(0, timebins-1, 5))
plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])
if plotpath:
plt.savefig(plotpath, bbox_inches="tight")
else:
plt.show()
plt.clf()
"" HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT """
ims, samples, samplerate, freq = compute_stft(filepath)
""" Print lengths """
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))
""" Write values to files """
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='n', header='frequencies')
In terms of values out, the file I'm analyzing is approx 19.1 seconds long and the sample rate is 44100, so I’d expect to have about 842k values for any given variable. But I'm not seeing what I expected. Instead here is what I see:
freqs comes out with just a handful of values, 512 and while they appear to be correct range for expected frequency, they are ordered least to greatest, not in time series like I expected. The 512 values, I assume, is the "fast" in FFT, basically down-sampled...
ims, appears to be amplitude, but values seem too high, although sample size is correct. Should be seeing -50 up to ~240dB.
samples . . . not sure.
In short, can someone advise on how I'd get the FFT out to a text file with time, amp, and freq values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I just get out the data?
python-3.x numpy fft wav
add a comment |
I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to csv in a time series format.
Using @Beginner's answer heavily from this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output in an image. I'm trying to simplify that somewhat to get to a text output in csv format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:
time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200
For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html, (Notes: I simplified the wav down to a single channel and removed metadata w/ Audacity to get it to work.)
Heavy props to @Beginner for 99.9% of the following, anything nonsensical is surely mine.
import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks
filepath = "40sw3.wav"
""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
win = window(frameSize)
hopSize = int(frameSize - np.floor(overlapFac * frameSize))
# zeros at beginning (thus center of 1st window should be for sample nr. 0)
samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
# cols for windowing
cols = np.ceil( (len(samples) - frameSize) / float(hopSize)) + 1
# zeros at end (thus samples can be fully covered by frames)
samples = np.append(samples, np.zeros(frameSize))
frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize), strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
frames *= win
return np.fft.rfft(frames)
""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
timebins, freqbins = np.shape(spec)
scale = np.linspace(0, 1, freqbins) ** factor
scale *= (freqbins-1)/max(scale)
scale = np.unique(np.round(scale))
# create spectrogram with new freq bins
newspec = np.complex128(np.zeros([timebins, len(scale)]))
for i in range(0, len(scale)):
if i == len(scale)-1:
newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
else:
newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)
# list center freq of bins
allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
freqs = []
for i in range(0, len(scale)):
if i == len(scale)-1:
freqs += [np.mean(allfreqs[int(scale[i]):])]
else:
freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]
return newspec, freqs
""" compute spectrogram """
def compute_stft(audiopath, binsize=2**10):
samplerate, samples = wav.read(audiopath)
s = stft(samples, binsize)
sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
ims = 20.*np.log10(np.abs(sshow)/10e-6) # amplitude to decibel
return ims, samples, samplerate, freq
""" plot spectrogram """
def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
timebins, freqbins = np.shape(ims)
plt.figure(figsize=(15, 7.5))
plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
plt.colorbar()
plt.xlabel("time (s)")
plt.ylabel("frequency (hz)")
plt.xlim([0, timebins-1])
plt.ylim([0, freqbins])
xlocs = np.float32(np.linspace(0, timebins-1, 5))
plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])
if plotpath:
plt.savefig(plotpath, bbox_inches="tight")
else:
plt.show()
plt.clf()
"" HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT """
ims, samples, samplerate, freq = compute_stft(filepath)
""" Print lengths """
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))
""" Write values to files """
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='n', header='frequencies')
In terms of values out, the file I'm analyzing is approx 19.1 seconds long and the sample rate is 44100, so I’d expect to have about 842k values for any given variable. But I'm not seeing what I expected. Instead here is what I see:
freqs comes out with just a handful of values, 512 and while they appear to be correct range for expected frequency, they are ordered least to greatest, not in time series like I expected. The 512 values, I assume, is the "fast" in FFT, basically down-sampled...
ims, appears to be amplitude, but values seem too high, although sample size is correct. Should be seeing -50 up to ~240dB.
samples . . . not sure.
In short, can someone advise on how I'd get the FFT out to a text file with time, amp, and freq values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I just get out the data?
python-3.x numpy fft wav
I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to csv in a time series format.
Using @Beginner's answer heavily from this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output in an image. I'm trying to simplify that somewhat to get to a text output in csv format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:
time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200
For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html, (Notes: I simplified the wav down to a single channel and removed metadata w/ Audacity to get it to work.)
Heavy props to @Beginner for 99.9% of the following, anything nonsensical is surely mine.
import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks
filepath = "40sw3.wav"
""" short time fourier transform of audio signal """
def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
win = window(frameSize)
hopSize = int(frameSize - np.floor(overlapFac * frameSize))
# zeros at beginning (thus center of 1st window should be for sample nr. 0)
samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
# cols for windowing
cols = np.ceil( (len(samples) - frameSize) / float(hopSize)) + 1
# zeros at end (thus samples can be fully covered by frames)
samples = np.append(samples, np.zeros(frameSize))
frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize), strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
frames *= win
return np.fft.rfft(frames)
""" scale frequency axis logarithmically """
def logscale_spec(spec, sr=44100, factor=20.):
timebins, freqbins = np.shape(spec)
scale = np.linspace(0, 1, freqbins) ** factor
scale *= (freqbins-1)/max(scale)
scale = np.unique(np.round(scale))
# create spectrogram with new freq bins
newspec = np.complex128(np.zeros([timebins, len(scale)]))
for i in range(0, len(scale)):
if i == len(scale)-1:
newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
else:
newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)
# list center freq of bins
allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
freqs = []
for i in range(0, len(scale)):
if i == len(scale)-1:
freqs += [np.mean(allfreqs[int(scale[i]):])]
else:
freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]
return newspec, freqs
""" compute spectrogram """
def compute_stft(audiopath, binsize=2**10):
samplerate, samples = wav.read(audiopath)
s = stft(samples, binsize)
sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
ims = 20.*np.log10(np.abs(sshow)/10e-6) # amplitude to decibel
return ims, samples, samplerate, freq
""" plot spectrogram """
def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
timebins, freqbins = np.shape(ims)
plt.figure(figsize=(15, 7.5))
plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
plt.colorbar()
plt.xlabel("time (s)")
plt.ylabel("frequency (hz)")
plt.xlim([0, timebins-1])
plt.ylim([0, freqbins])
xlocs = np.float32(np.linspace(0, timebins-1, 5))
plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])
if plotpath:
plt.savefig(plotpath, bbox_inches="tight")
else:
plt.show()
plt.clf()
"" HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT """
ims, samples, samplerate, freq = compute_stft(filepath)
""" Print lengths """
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))
""" Write values to files """
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='n', header='frequencies')
In terms of values out, the file I'm analyzing is approx 19.1 seconds long and the sample rate is 44100, so I’d expect to have about 842k values for any given variable. But I'm not seeing what I expected. Instead here is what I see:
freqs comes out with just a handful of values, 512 and while they appear to be correct range for expected frequency, they are ordered least to greatest, not in time series like I expected. The 512 values, I assume, is the "fast" in FFT, basically down-sampled...
ims, appears to be amplitude, but values seem too high, although sample size is correct. Should be seeing -50 up to ~240dB.
samples . . . not sure.
In short, can someone advise on how I'd get the FFT out to a text file with time, amp, and freq values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I just get out the data?
python-3.x numpy fft wav
python-3.x numpy fft wav
edited Mar 27 at 0:50
Foob
asked Mar 26 at 23:49
FoobFoob
64 bronze badges
64 bronze badges
add a comment |
add a comment |
1 Answer
1
active
oldest
votes
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. e.g the FFT of a 1024 samples will contain 512 frequency bins for one window of time or time step, each with an amplitude. If you want a time step of one millisecond, then you will have to offset the window of samples you feed each STFT to center the window at that point in your sample vector. Although with an FFT about 23 milliseconds long, that will involve a high overlap of windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55367798%2fconverting-a-wav-file-to-amplitude-and-frequency-values-for-textual-time-series%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
1 Answer
1
active
oldest
votes
1 Answer
1
active
oldest
votes
active
oldest
votes
active
oldest
votes
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. e.g the FFT of a 1024 samples will contain 512 frequency bins for one window of time or time step, each with an amplitude. If you want a time step of one millisecond, then you will have to offset the window of samples you feed each STFT to center the window at that point in your sample vector. Although with an FFT about 23 milliseconds long, that will involve a high overlap of windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
add a comment |
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. e.g the FFT of a 1024 samples will contain 512 frequency bins for one window of time or time step, each with an amplitude. If you want a time step of one millisecond, then you will have to offset the window of samples you feed each STFT to center the window at that point in your sample vector. Although with an FFT about 23 milliseconds long, that will involve a high overlap of windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
add a comment |
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. e.g the FFT of a 1024 samples will contain 512 frequency bins for one window of time or time step, each with an amplitude. If you want a time step of one millisecond, then you will have to offset the window of samples you feed each STFT to center the window at that point in your sample vector. Although with an FFT about 23 milliseconds long, that will involve a high overlap of windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
Your output format is too limiting, as the audio spectrum at any interval in time usually contains a range of frequencies. e.g the FFT of a 1024 samples will contain 512 frequency bins for one window of time or time step, each with an amplitude. If you want a time step of one millisecond, then you will have to offset the window of samples you feed each STFT to center the window at that point in your sample vector. Although with an FFT about 23 milliseconds long, that will involve a high overlap of windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
answered Mar 28 at 17:36
hotpaw2hotpaw2
62.5k10 gold badges73 silver badges133 bronze badges
62.5k10 gold badges73 silver badges133 bronze badges
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
add a comment |
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be: time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz .001, -115, 1...13000 .002, -110, 2...556 .003, 20, 200...21000 ... 19000, 20, 200...5609 In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.
– Foob
Apr 8 at 19:38
add a comment |
Got a question that you can’t ask on public Stack Overflow? Learn more about sharing private information with Stack Overflow for Teams.
Got a question that you can’t ask on public Stack Overflow? Learn more about sharing private information with Stack Overflow for Teams.
Thanks for contributing an answer to Stack Overflow!
- Please be sure to answer the question. Provide details and share your research!
But avoid …
- Asking for help, clarification, or responding to other answers.
- Making statements based on opinion; back them up with references or personal experience.
To learn more, see our tips on writing great answers.
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f55367798%2fconverting-a-wav-file-to-amplitude-and-frequency-values-for-textual-time-series%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown
Required, but never shown