Audio Hacking with the esp8266

Wemos Audio

The esp8266 is quite powerful for audio applications: with a CPU frequency of 160 MHz, 4 MB of flash and a WiFi/Web GUI, it easily outperforms the Volca Sample.

The goal for this blog series is to build synthesizers on the esp8266 platform, so we will also go through adding MIDI, Sync24 and trigger inputs.

We will use the WiFi for easy uploading and editing of samples.

The PDM DAC also works for web radio and other audio streaming applications.

We will use the Wemos D1 Mini board with the Arduino IDE and all the source will be available.

WeMos D1 mini

The series starts off with creating the 44.1 kHz, 16-bit PDM audio DAC.

If you have no clue about soldering but still want to follow along, you can order the Wemos D1 Mini with a 3.5mm audio jack prebuilt.

Order the Wemos D1 Mini Audio $10

My work on these free synthesizers is based on donations from people.
If you find the code useful, please consider a $3 donation to keep future developments open source.

Donate $3

We will start by adding an audio output.

esp8266 i2s interface

The esp8266 handles audio through an interface called i2s.
i2s shifts out two 16-bit serial words at high speed, one for the left and one for the right channel, together with a shift clock, and DMA keeps the data flowing.


This interface normally requires an external i2s DAC that converts the serial stream to analog signals.

To make it easier, we are going to build a PDM (Pulse Density Modulation) DAC based on the i2s interface.

PDM is a high-rate bitstream: at a 44.1 kHz sample rate the bit rate is 32 times higher, or about 1.4 MHz.
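The arithmetic behind that figure, as a small sketch. The names are my own for illustration and are not part of the esp8266 core; the factor of 32 is taken here to mean one 32-bit i2s word per audio sample.

```cpp
#include <cstdint>

// Illustrative names, not from the esp8266 core.
constexpr uint32_t kSampleRateHz  = 44100; // audio sample rate
constexpr uint32_t kBitsPerSample = 32;    // PDM bits sent per audio sample

// PDM bit rate in Hz for a given sample rate and number of bits per sample.
constexpr uint32_t pdmBitRateHz(uint32_t sampleRateHz, uint32_t bitsPerSample) {
    return sampleRateHz * bitsPerSample;
}

// 44100 * 32 = 1411200 Hz, i.e. about 1.4 MHz.
static_assert(pdmBitRateHz(kSampleRateHz, kBitsPerSample) == 1411200,
              "unexpected PDM bit rate");
```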


Pulse Density Modulation is a 1-bit DAC, which on its own gives a dynamic range of only about 6dB.
Compared with the roughly 96dB of 16-bit audio, that leaves a LOT of quantization noise, about 90dB to be exact.
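The 6dB and 90dB figures can be checked with the usual rule of thumb of about 6dB of dynamic range per bit. This is a minimal sketch assuming an ideal quantizer; the function name is hypothetical.

```cpp
#include <cmath>

// Ideal dynamic range of an N-bit quantizer in dB: 20 * log10(2^N),
// which is roughly 6.02 dB per bit.
inline double dynamicRangeDb(int bits) {
    return 20.0 * std::log10(std::pow(2.0, bits));
}
```

dynamicRangeDb(1) is about 6dB and dynamicRangeDb(16) is about 96dB, so the gap between a 1-bit stream and a 16-bit sample is about 90dB.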

PDM Spectrum

The good thing is that the noise is in a frequency range far above the audio spectrum and can easily be filtered off with a lowpass filter leaving us just the audio signal.

So delta-sigma coding our 16-bit sample words to PDM will give us one 16-bit DAC output with only an external passive filter.

This is the schematic for the audio output:


But why is it connected to the RX pin?
Isn’t that the serial input pin?

It’s also the i2s data output pin.

Let's show some code

This is the setup() for our first test.

It turns off the WiFi radio to reduce the current draw to about 15mA and sets up the pins and DMA for the i2s subsystem at a 44100Hz sample rate:

#include <Arduino.h>
#include <ESP8266WiFi.h>
#include <i2s.h>
#include <i2s_reg.h>

void setup(void) {
  // ESP8266 Low power
  WiFi.forceSleepBegin(); // turn off ESP8266 RF
  delay(1);               // give RF section time to shutdown
  pinMode(2, INPUT);
  pinMode(15, INPUT);

  i2s_begin();            // start the i2s subsystem (pins and DMA)
  i2s_set_rate(44100);    // 44100Hz sample rate
}

We need a function to write samples to the DMA buffer.
This function generates 16-bit samples by Delta-Sigma coding the bits.

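//Pulse Density Modulated 16-bit I2S DAC
uint32_t i2sACC;       // collects the 32 PDM bits for one sample
uint16_t DAC=0x8000;
uint16_t err;          // running delta-sigma error

void writeDAC(uint16_t DAC) {
  for (uint8_t i=0;i<32;i++) {
    i2sACC=i2sACC<<1;          // make room for the next PDM bit
    if(DAC >= err) {
      i2sACC|=1;               // output a 1
      err += 0xFFFF-DAC;
    }
    else {
      err -= DAC;              // output a 0
    }
  }
  bool flag=i2s_write_sample(i2sACC); // blocks while the DMA buffer is full
}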
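To see what the modulator does without any hardware, here is a host-side sketch of the same loop. It returns the 32 bits instead of calling i2s_write_sample(), and a helper counts how many of them are 1. The names pdmEncode and onesDensity are hypothetical, and averaging the bits is only a crude stand-in for the passive lowpass filter.

```cpp
#include <cstdint>

// Host-side copy of the modulator loop. Returns the 32 PDM bits for one
// sample; `err` carries the running error from one call to the next.
inline uint32_t pdmEncode(uint16_t sample, uint16_t &err) {
    uint32_t word = 0;
    for (uint8_t i = 0; i < 32; i++) {
        word <<= 1;
        if (sample >= err) {
            word |= 1;               // emit a 1
            err += 0xFFFF - sample;
        } else {
            err -= sample;           // emit a 0
        }
    }
    return word;
}

// Fraction of 1 bits over `words` consecutive 32-bit words for a constant
// input. This approximates the level left after lowpass filtering.
inline double onesDensity(uint16_t sample, int words) {
    uint16_t err = 0;
    long ones = 0;
    for (int n = 0; n < words; n++) {
        uint32_t w = pdmEncode(sample, err);
        for (int b = 0; b < 32; b++) {
            ones += (w >> b) & 1u;
        }
    }
    return static_cast<double>(ones) / (32.0 * words);
}
```

For a mid-scale input, onesDensity(0x8000, 1000) comes out very close to 0.5, and a full-scale input of 0xFFFF gives all ones. The density of 1 bits tracks sample/65535, which is what makes the bitstream behave like a DAC once the high-frequency noise is filtered off.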

To test the DAC we generate a slow sine wave:

uint8_t phase;
void loop() {
  writeDAC(0x8000+sine[phase++]); // one table step per sample, phase wraps at 256
}

And the sine wave data (declare it above loop() so the sketch compiles):

// Signed 16-bit sine values in two's complement. Stored as uint16_t so that
// entries above 0x7fff do not cause narrowing errors on newer compilers;
// 0x8000+sine[n] truncated to 16 bits gives the same result either way.
uint16_t sine[256] = {
0x0000, 0x0324, 0x0647, 0x096a, 0x0c8b, 0x0fab, 0x12c8, 0x15e2,
0x18f8, 0x1c0b, 0x1f19, 0x2223, 0x2528, 0x2826, 0x2b1f, 0x2e11,
0x30fb, 0x33de, 0x36ba, 0x398c, 0x3c56, 0x3f17, 0x41ce, 0x447a,
0x471c, 0x49b4, 0x4c3f, 0x4ebf, 0x5133, 0x539b, 0x55f5, 0x5842,
0x5a82, 0x5cb4, 0x5ed7, 0x60ec, 0x62f2, 0x64e8, 0x66cf, 0x68a6,
0x6a6d, 0x6c24, 0x6dca, 0x6f5f, 0x70e2, 0x7255, 0x73b5, 0x7504,
0x7641, 0x776c, 0x7884, 0x798a, 0x7a7d, 0x7b5d, 0x7c29, 0x7ce3,
0x7d8a, 0x7e1d, 0x7e9d, 0x7f09, 0x7f62, 0x7fa7, 0x7fd8, 0x7ff6,
0x7fff, 0x7ff6, 0x7fd8, 0x7fa7, 0x7f62, 0x7f09, 0x7e9d, 0x7e1d,
0x7d8a, 0x7ce3, 0x7c29, 0x7b5d, 0x7a7d, 0x798a, 0x7884, 0x776c,
0x7641, 0x7504, 0x73b5, 0x7255, 0x70e2, 0x6f5f, 0x6dca, 0x6c24,
0x6a6d, 0x68a6, 0x66cf, 0x64e8, 0x62f2, 0x60ec, 0x5ed7, 0x5cb4,
0x5a82, 0x5842, 0x55f5, 0x539b, 0x5133, 0x4ebf, 0x4c3f, 0x49b4,
0x471c, 0x447a, 0x41ce, 0x3f17, 0x3c56, 0x398c, 0x36ba, 0x33de,
0x30fb, 0x2e11, 0x2b1f, 0x2826, 0x2528, 0x2223, 0x1f19, 0x1c0b,
0x18f8, 0x15e2, 0x12c8, 0x0fab, 0x0c8b, 0x096a, 0x0647, 0x0324,
0x0000, 0xfcdc, 0xf9b9, 0xf696, 0xf375, 0xf055, 0xed38, 0xea1e,
0xe708, 0xe3f5, 0xe0e7, 0xdddd, 0xdad8, 0xd7da, 0xd4e1, 0xd1ef,
0xcf05, 0xcc22, 0xc946, 0xc674, 0xc3aa, 0xc0e9, 0xbe32, 0xbb86,
0xb8e4, 0xb64c, 0xb3c1, 0xb141, 0xaecd, 0xac65, 0xaa0b, 0xa7be,
0xa57e, 0xa34c, 0xa129, 0x9f14, 0x9d0e, 0x9b18, 0x9931, 0x975a,
0x9593, 0x93dc, 0x9236, 0x90a1, 0x8f1e, 0x8dab, 0x8c4b, 0x8afc,
0x89bf, 0x8894, 0x877c, 0x8676, 0x8583, 0x84a3, 0x83d7, 0x831d,
0x8276, 0x81e3, 0x8163, 0x80f7, 0x809e, 0x8059, 0x8028, 0x800a,
0x8000, 0x800a, 0x8028, 0x8059, 0x809e, 0x80f7, 0x8163, 0x81e3,
0x8276, 0x831d, 0x83d7, 0x84a3, 0x8583, 0x8676, 0x877c, 0x8894,
0x89bf, 0x8afc, 0x8c4b, 0x8dab, 0x8f1e, 0x90a1, 0x9236, 0x93dc,
0x9593, 0x975a, 0x9931, 0x9b18, 0x9d0e, 0x9f14, 0xa129, 0xa34c,
0xa57e, 0xa7be, 0xaa0b, 0xac65, 0xaecd, 0xb141, 0xb3c1, 0xb64c,
0xb8e4, 0xbb86, 0xbe32, 0xc0e9, 0xc3aa, 0xc674, 0xc946, 0xcc22,
0xcf05, 0xd1ef, 0xd4e1, 0xd7da, 0xdad8, 0xdddd, 0xe0e7, 0xe3f5,
0xe708, 0xea1e, 0xed38, 0xf055, 0xf375, 0xf696, 0xf9b9, 0xfcdc
};
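If you would rather compute a table like this than paste it, here is a sketch of one way to do it. It assumes the values are 32768*sin(x) truncated toward zero, with the positive peak clamped to 32767. The entries I spot-checked against the table (indices 0 to 3, 32, 64, 129 and 192) agree with that assumption; the rest are not verified. The function name is hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// Fills `table` with one full sine cycle of 256 signed 16-bit samples.
// Assumption: scale by 32768, truncate toward zero, clamp the peak to 32767.
inline void fillSineTable(int16_t table[256]) {
    const double kTwoPi = 6.283185307179586;
    for (int i = 0; i < 256; i++) {
        double v = 32768.0 * std::sin(kTwoPi * i / 256.0);
        if (v > 32767.0) {
            v = 32767.0;                     // +32768 does not fit in int16_t
        }
        table[i] = static_cast<int16_t>(v);  // the cast truncates toward zero
    }
}
```

Computing the table in setup() costs 512 bytes of RAM, while a constant table can stay as it is in the sketch.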


And the resulting waveform output is a sine wave at 172Hz.
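The 172Hz figure follows from the sample rate and the table length, assuming the test advances one table entry per output sample. A small sketch with a hypothetical helper name:

```cpp
// Output frequency in Hz when the table index advances by `step` entries
// per output sample.
constexpr double toneFrequencyHz(double sampleRateHz, int tableLength, int step) {
    return sampleRateHz * step / tableLength;
}
```

toneFrequencyHz(44100.0, 256, 1) is 172.265625, and stepping two entries per sample would double that to about 344.5Hz.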


Two important things to keep in mind:

The esp8266 runs an RTOS, and other things happen in the background.
So don’t use delay() or other blocking functions.
Use yield() if something takes a long time.

The DMA buffer is 512 samples long and will run empty in about 11.6ms.
To have uninterrupted audio output you need to feed it samples before it runs out.
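How long a buffer lasts is just its length divided by the sample rate. A sketch with a hypothetical helper name, handy if you change the sample rate later:

```cpp
// Playback time in milliseconds held by `samples` samples at `sampleRateHz`.
constexpr double bufferDurationMs(int samples, double sampleRateHz) {
    return 1000.0 * samples / sampleRateHz;
}
```

For 512 samples at 44100Hz this works out to roughly 11.6ms, so anything in loop() that takes longer than that between writes will cause a dropout.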

Feel free to try and get this running and I’ll be back with a sample player.

