.. note::
    :class: sphx-glr-download-link-note

    Click :ref:`here <sphx_glr_download_beginner_audio_datasets_tutorial.py>` to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_beginner_audio_datasets_tutorial.py:


Audio Datasets
==============

``torchaudio`` provides easy access to common, publicly accessible
datasets. Please refer to the official documentation for the list of
available datasets.

.. code-block:: default


    # When running this tutorial in Google Colab, install the required packages
    # with the following.
    # !pip install torchaudio

    import torch
    import torchaudio

    print(torch.__version__)
    print(torchaudio.__version__)


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    1.11.0+cu102
    0.11.0+cu102


Preparing data and utility functions (skip this section)
--------------------------------------------------------


.. code-block:: default


    #@title Prepare data and utility functions. {display-mode: "form"}
    #@markdown
    #@markdown You do not need to look into this cell.
    #@markdown Just execute once and you are good to go.

    #-------------------------------------------------------------------------------
    # Preparation of data and helper functions.
    #-------------------------------------------------------------------------------
    import multiprocessing
    import os

    import matplotlib.pyplot as plt
    from IPython.display import Audio, display


    _SAMPLE_DIR = "_sample_data"
    YESNO_DATASET_PATH = os.path.join(_SAMPLE_DIR, "yes_no")
    os.makedirs(YESNO_DATASET_PATH, exist_ok=True)

    def _download_yesno():
      if os.path.exists(os.path.join(YESNO_DATASET_PATH, "waves_yesno.tar.gz")):
        return
      torchaudio.datasets.YESNO(root=YESNO_DATASET_PATH, download=True)

    YESNO_DOWNLOAD_PROCESS = multiprocessing.Process(target=_download_yesno)
    YESNO_DOWNLOAD_PROCESS.start()

    def plot_specgram(waveform, sample_rate, title="Spectrogram", xlim=None):
      waveform = waveform.numpy()

      num_channels, _ = waveform.shape

      figure, axes = plt.subplots(num_channels, 1)
      if num_channels == 1:
        axes = [axes]
      for c in range(num_channels):
        axes[c].specgram(waveform[c], Fs=sample_rate)
        if num_channels > 1:
          axes[c].set_ylabel(f'Channel {c+1}')
        if xlim:
          axes[c].set_xlim(xlim)
      figure.suptitle(title)
      plt.show(block=False)

    def play_audio(waveform, sample_rate):
      waveform = waveform.numpy()

      num_channels, num_frames = waveform.shape
      if num_channels == 1:
        display(Audio(waveform[0], rate=sample_rate))
      elif num_channels == 2:
        display(Audio((waveform[0], waveform[1]), rate=sample_rate))
      else:
        raise ValueError("Waveforms with more than 2 channels are not supported.")


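The following example loads the ``YESNO`` dataset. Each item is a
``(waveform, sample_rate, label)`` tuple; for three of the samples, we plot
the spectrogram and play the audio.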


.. code-block:: default


    YESNO_DOWNLOAD_PROCESS.join()

    dataset = torchaudio.datasets.YESNO(YESNO_DATASET_PATH, download=True)

    for i in [1, 3, 5]:
      waveform, sample_rate, label = dataset[i]
      plot_specgram(waveform, sample_rate, title=f"Sample {i}: {label}")
      play_audio(waveform, sample_rate)


.. rst-class:: sphx-glr-horizontal


    *

      .. image:: /beginner/images/sphx_glr_audio_datasets_tutorial_001.png
            :class: sphx-glr-multi-img

    *

      .. image:: /beginner/images/sphx_glr_audio_datasets_tutorial_002.png
            :class: sphx-glr-multi-img

    *

      .. image:: /beginner/images/sphx_glr_audio_datasets_tutorial_003.png
            :class: sphx-glr-multi-img


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    <IPython.lib.display.Audio object>
    <IPython.lib.display.Audio object>
    <IPython.lib.display.Audio object>
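
The ``label`` shown in each plot title is a list of eight integers, one per
spoken word in the recording, where ``1`` stands for "yes" and ``0`` for "no".
A minimal sketch of turning such a list into readable text follows. The helper
``labels_to_words`` is hypothetical (it is not part of ``torchaudio``), and the
example label list is made up for illustration rather than taken from the
dataset.

.. code-block:: default


    def labels_to_words(labels):
      """Map a YESNO-style label list (1 = "yes", 0 = "no") to a string."""
      words = {0: "no", 1: "yes"}
      # int() keeps the lookup working if the values are NumPy integers
      # or 0-dim tensors instead of plain Python ints.
      return " ".join(words[int(value)] for value in labels)

    # Made-up label list with the same form as ``dataset[i][2]``.
    example_label = [0, 0, 1, 1, 0, 1, 0, 1]
    print(labels_to_words(example_label))

With the dataset loaded as above, the same helper could be applied to
``dataset[i][2]`` to produce a more readable plot title.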


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes  3.243 seconds)


.. _sphx_glr_download_beginner_audio_datasets_tutorial.py:


.. only:: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download

     :download:`Download Python source code: audio_datasets_tutorial.py <audio_datasets_tutorial.py>`


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: audio_datasets_tutorial.ipynb <audio_datasets_tutorial.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.readthedocs.io>`_