Converting a wav file to amplitude and frequency values for textual, time-series analysis


I'm processing wav files for amplitude and frequency analysis with FFT, but I am having trouble getting the data out to csv in a time series format.



Relying heavily on @Beginner's answer to this post: How to convert a .wav file to a spectrogram in python3, I'm able to get the spectrogram output as an image. I'm trying to simplify that to get text output in CSV format, but I'm not seeing how to do so. The outcome I'm hoping to achieve would look something like the following:



time_in_ms, amplitude_in_dB, freq_in_kHz
.001, -115, 1
.002, -110, 2
.003, 20, 200
...
19000, 20, 200



For my testing, I have been using http://soundbible.com/2123-40-Smith-Wesson-8x.html. (Note: I simplified the wav down to a single channel and removed the metadata with Audacity to get it to work.)

Heavy props to @Beginner for 99.9% of the following; anything nonsensical is surely mine.



import numpy as np
from matplotlib import pyplot as plt
import scipy.io.wavfile as wav
from numpy.lib import stride_tricks

filepath = "40sw3.wav"


def stft(sig, frameSize, overlapFac=0.5, window=np.hanning):
    """ short time fourier transform of audio signal """
    win = window(frameSize)
    hopSize = int(frameSize - np.floor(overlapFac * frameSize))

    # zeros at beginning (thus center of 1st window should be for sample nr. 0)
    samples = np.append(np.zeros(int(np.floor(frameSize/2.0))), sig)
    # cols for windowing
    cols = np.ceil((len(samples) - frameSize) / float(hopSize)) + 1
    # zeros at end (thus samples can be fully covered by frames)
    samples = np.append(samples, np.zeros(frameSize))

    frames = stride_tricks.as_strided(samples, shape=(int(cols), frameSize), strides=(samples.strides[0]*hopSize, samples.strides[0])).copy()
    frames *= win

    # one row per time frame, one column per frequency bin
    return np.fft.rfft(frames)


def logscale_spec(spec, sr=44100, factor=20.):
    """ scale frequency axis logarithmically """
    timebins, freqbins = np.shape(spec)

    scale = np.linspace(0, 1, freqbins) ** factor
    scale *= (freqbins-1)/max(scale)
    scale = np.unique(np.round(scale))

    # create spectrogram with new freq bins
    newspec = np.complex128(np.zeros([timebins, len(scale)]))
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            newspec[:,i] = np.sum(spec[:,int(scale[i]):], axis=1)
        else:
            newspec[:,i] = np.sum(spec[:,int(scale[i]):int(scale[i+1])], axis=1)

    # list center freq of bins
    allfreqs = np.abs(np.fft.fftfreq(freqbins*2, 1./sr)[:freqbins+1])
    freqs = []
    for i in range(0, len(scale)):
        if i == len(scale)-1:
            freqs += [np.mean(allfreqs[int(scale[i]):])]
        else:
            freqs += [np.mean(allfreqs[int(scale[i]):int(scale[i+1])])]

    return newspec, freqs


def compute_stft(audiopath, binsize=2**10):
    """ compute spectrogram """
    samplerate, samples = wav.read(audiopath)
    s = stft(samples, binsize)
    sshow, freq = logscale_spec(s, factor=1.0, sr=samplerate)
    ims = 20.*np.log10(np.abs(sshow)/10e-6) # amplitude to decibel
    return ims, samples, samplerate, freq


def plot_stft(ims, samples, samplerate, freq, binsize=2**10, plotpath=None, colormap="jet"):
    """ plot spectrogram """
    timebins, freqbins = np.shape(ims)

    plt.figure(figsize=(15, 7.5))
    plt.imshow(np.transpose(ims), origin="lower", aspect="auto", cmap=colormap, interpolation="none")
    plt.colorbar()

    plt.xlabel("time (s)")
    plt.ylabel("frequency (hz)")
    plt.xlim([0, timebins-1])
    plt.ylim([0, freqbins])

    xlocs = np.float32(np.linspace(0, timebins-1, 5))
    plt.xticks(xlocs, ["%.02f" % l for l in ((xlocs*len(samples)/timebins)+(0.5*binsize))/samplerate])
    ylocs = np.int16(np.round(np.linspace(0, freqbins-1, 10)))
    plt.yticks(ylocs, ["%.02f" % freq[i] for i in ylocs])

    if plotpath:
        plt.savefig(plotpath, bbox_inches="tight")
    else:
        plt.show()

    plt.clf()


# HERE IS WHERE I'm ATTEMPTING TO GET IT OUT TO TXT
ims, samples, samplerate, freq = compute_stft(filepath)

# Print lengths
print('ims len:', len(ims))
print('samples len:', len(samples))
print('samplerate:', samplerate)
print('freq len:', len(freq))

# Write values to files
np.savetxt(filepath + '-ims.txt', ims, delimiter=', ', newline='\n', header='ims')
np.savetxt(filepath + '-samples.txt', samples, delimiter=', ', newline='\n', header='samples')
np.savetxt(filepath + '-frequencies.txt', freq, delimiter=', ', newline='\n', header='frequencies')


In terms of output, the file I'm analyzing is approximately 19.1 seconds long and the sample rate is 44100 Hz, so I'd expect about 842k values for any given variable. That is not what I'm seeing. Instead, here is what I see:

freq comes out with only 512 values. They appear to be in the correct range for the expected frequencies, but they are ordered from least to greatest, not as a time series like I expected. I assume the 512 values are the "fast" in FFT, basically down-sampled...

ims appears to be amplitude, but the values seem too high, although the sample size is correct. Should be seeing -50 up to ~240 dB.

samples: not sure.

In short, can someone advise on how I'd get the FFT out to a text file with time, amplitude, and frequency values for the entire sample set? Is savetxt the correct route, or is there a better way? This code can certainly be used to make a great spectrogram, but how can I get just the data out?










      python-3.x numpy fft wav






asked Mar 26 at 23:49 by Foob (edited Mar 27 at 0:50)

























          1 Answer
Your output format is too limiting, because the audio spectrum at any interval in time usually contains a range of frequencies. For example, the FFT of 1024 samples yields about 512 frequency bins (513 from np.fft.rfft, which includes both the 0 Hz and Nyquist bins) for one window of time, or time step, each with its own amplitude. If you want a time step of one millisecond, you will have to offset the window of samples you feed to each STFT frame so that the window is centered at that point in your sample vector. With an FFT window about 23 milliseconds long (1024 samples at 44100 Hz), that means a hop of roughly 44 samples and therefore a very high overlap between consecutive windows. You could use shorter windows, but the time-frequency trade-off will result in proportionately less frequency resolution.
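To put rough numbers on the trade-off described above, here is a minimal sketch. The helper name `stft_resolution` and its parameters are hypothetical (they do not come from the question's code), and the 44100 Hz rate is taken from the question.

```python
def stft_resolution(sample_rate, frame_size, step_ms):
    """Return (window length in ms, bin spacing in Hz, hop in samples, overlap fraction)."""
    window_ms = 1000.0 * frame_size / sample_rate      # how much time one FFT window covers
    bin_spacing_hz = sample_rate / frame_size          # distance between adjacent frequency bins
    hop = max(1, round(sample_rate * step_ms / 1000.0))  # samples to advance per time step
    overlap = 1.0 - hop / frame_size                   # fraction shared by consecutive windows
    return window_ms, bin_spacing_hz, hop, overlap


# Compare the question's 1024-sample window with a shorter 256-sample one,
# both at a requested time step of 1 ms.
for frame_size in (1024, 256):
    window_ms, bin_hz, hop, overlap = stft_resolution(44100, frame_size, step_ms=1.0)
    print(f"{frame_size} samples: window {window_ms:.1f} ms, "
          f"bin spacing {bin_hz:.1f} Hz, hop {hop} samples, overlap {overlap:.0%}")
```

Shortening the window by a factor of four makes each frame cover a quarter of the time, but the bins become four times wider, which is the loss of frequency resolution mentioned above.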






• OK, that makes sense, so what if I extend my format to include 512 frequencies for each time bin, so the output would instead be:

time_in_ms, amplitude_in_dB, freq1_in_kHz...freq512_in_kHz
.001, -115, 1...13000
.002, -110, 2...556
.003, 20, 200...21000
...
19000, 20, 200...5609

In other words, how do I tie time domain back to my discovered frequency domain so that I can identify which frequency buckets appear at which time? Time designation would be arbitrary, sequence # would work too.

– Foob, Apr 8 at 19:38
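One possible way to tie time steps to frequency bins, sketched under assumptions: each row of the STFT output is one time step and each column is one frequency bin, so the table needs one amplitude per bin rather than a single amplitude per row. The function `stft_table`, the synthetic two-tone signal (standing in for the WAV file), and the in-memory `buffer` are all hypothetical names for illustration. The dB values are relative to an amplitude of 1, not calibrated.

```python
import io
import numpy as np


def stft_table(samples, sample_rate, frame_size=1024, hop=512):
    """Return (times_s, freqs_hz, level_db) for a mono signal.

    level_db has shape (n_frames, n_bins): one row per time step,
    one column per frequency bin.
    """
    samples = np.asarray(samples, dtype=np.float64)
    window = np.hanning(frame_size)
    n_frames = 1 + (len(samples) - frame_size) // hop
    frames = np.stack([samples[i * hop:i * hop + frame_size] * window
                       for i in range(n_frames)])
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    level_db = 20.0 * np.log10(np.maximum(magnitude, 1e-12))  # floor avoids log10(0)
    freqs_hz = np.fft.rfftfreq(frame_size, d=1.0 / sample_rate)
    # time of each frame's centre, in seconds
    times_s = (np.arange(n_frames) * hop + frame_size / 2) / sample_rate
    return times_s, freqs_hz, level_db


# Synthetic test signal (an assumption, not the question's recording):
# 0.5 s of 440 Hz followed by 0.5 s of 2000 Hz.
sr = 44100
t = np.arange(sr // 2) / sr
signal = np.concatenate([np.sin(2 * np.pi * 440 * t),
                         np.sin(2 * np.pi * 2000 * t)])

times_s, freqs_hz, level_db = stft_table(signal, sr)

# Wide format: first column is time, remaining columns are dB per frequency bin.
wide = np.column_stack([times_s, level_db])
header = "time_s," + ",".join(f"{f:.1f}Hz" for f in freqs_hz)
buffer = io.StringIO()  # pass a file path instead to write to disk
np.savetxt(buffer, wide, delimiter=",", header=header, comments="", fmt="%.4f")

# Dominant frequency per time step, if a compact per-row summary is enough.
peak_hz = freqs_hz[np.argmax(level_db, axis=1)]
```

With a real file, `signal` and `sr` would come from `scipy.io.wavfile.read`, as in the question. The row index can serve as the sequence number if absolute time is not needed.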











          answered Mar 28 at 17:36
          hotpaw2





















