NEWS · September 2nd 2020

Lab Notes: Setting up air quality notifications with Purple Air, Sanity, Vercel, and Twilio

Stuck in a megafire in the middle of a pandemic? This handy little service lets your friends know when to cancel outdoor teaching with crowdsourced sensor networks from Purple Air, Sanity.io as the data store, Vercel for compute, and Twilio for notification!

Even Eidsten Westvang

Even is a Sanity.io co-founder

Haha. So guess where we live:

The Purple Air Sensor Network

Yes, in the Bay Area - out there on the left coast, by the red circles. In the COVID-19 pandemic in the mega-fire. And we’d set our eldest daughter up to hang out with her friends under some parasols on their Chromebooks. Just to see other kids and to pretend things are normal.

But we keep needing to cancel. Because there’s a mega-fire. In a global pandemic. We keep laughing about this. With a slightly hysterical edge, "Haha, the pandemic mega-fire, haha."

So we spend a lot of time looking at air quality indications – Purple Air specifically. Purple Air is a crowdsourced network that uses cheap sensors to measure particulates. And on Friday SO said she would like a text when it looks like the fires might force us all to stay in.

I can't do much about the fire or the pandemic, but this I can actually address over the weekend.

Get the source on GitHub

Deciding on a setup

So how does one build such a thing? What's the absolute easiest way? I don't write a lot of code these days. To give you an idea, this is me last night trying to remember how to get array length in JavaScript:

❯ node
Welcome to Node.js v12.13.0.
Type ".help" for more information.
> a = [1,2,3,4]
[ 1, 2, 3, 4 ]
> len(a)
Thrown:
ReferenceError: len is not defined
> a.len
undefined
> a.count
undefined
> a.len()
Thrown:
TypeError: a.len is not a function
> length(a)
Thrown:
ReferenceError: length is not defined
> a.length
4

So we're going to need three things.

  1. A database – let's use a Sanity.io project with an instant dashboard for our distribution list, Purple Air sensor IDs, measurements and broadcasts
  2. Notifications – Let's use Twilio. Guess we could do email or app notifications, but SMS has a nice emergency services vibe that seems good for this
  3. A little bit of compute – just a few cycles to read and write data between endpoints

Was wondering about where to get the compute from. Five years ago I might have run this as a droplet on DigitalOcean. Or on a tiny instance on GCP or AWS. I would need to persist measurements in a database so maybe run SQLite or even a full Postgres.

Believe it or not, I first tried to use cloud functions on GCP. My reasoning: all the data science people are pythonistas. A quick google and there are libs for converting and getting at all kinds of air quality data. I waste 45 minutes trying to set up Python right. I encountered at least 75% of this chart:

Me, on a Friday eve, setting up Python. Source: https://xkcd.com/1987/

So for compute I instead go for JavaScript on Vercel. I'd rather take the hit of porting libraries to JavaScript to get started right away.

Initial setup: Sanity and Vercel (15min)

sanity init takes 2 (two) minutes. Remember to make your datasets private as we're storing phone numbers and they're semi-personal.

Vercel is just as fast. Started with the web onboarding, selected a Next.js sample as it says it comes with a serverless function. And it's up and deployed and running on local. Very slick given the huge amounts of magic involved.

We also need an API token from manage.sanity.io so I can read/write to Sanity.

I set up my environment variables in Vercel, but don't read the fine print and mistakenly assume vercel dev will inject them locally. It doesn't. You need to run vercel env pull first.

I now have not only a cloud function, but also environment handling on local and in production. That’s nice!

I also have a nice React/Next.js website that I don’t really need, but that’s also working.

Let's also set up our Sanity schemas for the phone numbers of people to message and the sensor ID for the Purple Air sensors we want to listen to:

types: schemaTypes.concat([
    {
      title: "Person",
      name: "person",
      type: "document",
      fields: [
        {
          title: "Name",
          name: "name",
          type: "string",
        },
        {
          title: "Mobile number",
          name: "mobileNumber",
          type: "string",
        },
      ],
    },
    {
      title: "Sensor",
      name: "sensor",
      type: "document",
      fields: [
        {
          title: "ID",
          name: "id",
          type: "number",
        },
      ],
    },
  ]),

That’s nice. Newly updated version of Sanity Studio lookin’ all Chef’s Kiss here:

Getting at Air Quality Data (1 hour - long, fun detour)

Let’s get the data. I am so grateful for async await syntax. Asynchronous JavaScript programming feels like regular expressions to me. It takes an hour to install, at least, if you haven't been around it for a while.

  const sensorURL = `https://www.purpleair.com/json?show=${sensors[0].id}`;
  const sensorResponse = await fetch(sensorURL);

YOLO, no error handling.
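For the less YOLO-inclined, here is a minimal sketch of what error handling could look like. The helper names `parseSensorPayload` and `fetchSensor` are made up for illustration. The URL is the one used above, and the `results` array is how the code later in this post reads the payload. It assumes a `fetch` is available, or that you pass one in.

```javascript
// Hypothetical helper: pull the first sensor out of a parsed Purple Air payload,
// failing loudly instead of crashing later on `undefined`.
function parseSensorPayload(payload) {
  const results = payload && payload.results;
  if (!Array.isArray(results) || results.length === 0) {
    throw new Error("Purple Air response contained no sensor results");
  }
  return results[0];
}

// Hypothetical helper: fetch one sensor and check the HTTP status before parsing.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function fetchSensor(sensorId, fetchImpl = fetch) {
  const response = await fetchImpl(`https://www.purpleair.com/json?show=${sensorId}`);
  if (!response.ok) {
    throw new Error(`Purple Air request failed with HTTP ${response.status}`);
  }
  return parseSensorPayload(await response.json());
}
```

Wrapping the call in a try/catch in the cloud function would then let a flaky sensor produce an error response instead of a half-finished run.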

The raw data looks like this:

{
    ID: 37491,
    Label: "Bowie's House",
    DEVICE_LOCATIONTYPE: 'outside',
    THINGSPEAK_PRIMARY_ID: '846991',
    THINGSPEAK_PRIMARY_ID_READ_KEY: '69RLIL962LMTCY2T',
    THINGSPEAK_SECONDARY_ID: '846992',
    THINGSPEAK_SECONDARY_ID_READ_KEY: 'F9LR6VGYTWZXF5HM',
    Lat: 37.820986,
    Lon: -122.240903,
    PM2_5Value: '19.31',
    LastSeen: 1598669995,
    Type: 'PMS5003+PMS5003+BME280',
    Hidden: 'false',
    DEVICE_BRIGHTNESS: '15',
    DEVICE_HARDWAREDISCOVERED: '2.0+BME280+PMSX003-B+PMSX003-A',
    Version: '6.01',
    LastUpdateCheck: 1598668794,
    Created: 1565974448,
    Uptime: '25800',
    RSSI: '-69',
    Adc: '0.0',
    p_0_3_um: '2150.69',
    p_0_5_um: '610.78',
    p_1_0_um: '125.9',
    p_2_5_um: '20.62',
    p_5_0_um: '4.38',
    p_10_0_um: '2.21',
    pm1_0_cf_1: '11.34',
    pm2_5_cf_1: '19.31',
    pm10_0_cf_1: '23.5',
    pm1_0_atm: '11.34',
    pm2_5_atm: '19.31',
    pm10_0_atm: '23.5',
    isOwner: 0,
    humidity: '55',
    temp_f: '68',
    pressure: '1007.61',
    AGE: 5,
    Stats: '{"v":19.31,"v1":19.63,"v2":20.34,"v3":25.36,"v4":33.71,"v5":27.44,"v6":9.2,"pm":19.31,"lastModified":1598669995428,"timeSinceModified":119916}'
  }

That’s three different pm2_5 values. A little googling tells us it's pm2_5_atm we want as it’s corrected for outdoor atmospherics.

Yeah, and there’s no air quality number (AQI) here! But that's what everyone wants to know.

So how do we calculate an AQI? AQI is based on biology – it’s how much the air is hurting you at any given time. Don't think the air can hurt you? Think again.

Turns out the EPA has 200-page tomes on background, but it’s really just a little table mapping PM2.5 concentration ranges (the breakpoints) to AQI ranges. Within each range you interpolate linearly.

PM25 to AQI conversion index
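To make the table-plus-interpolation idea concrete, here is a rough sketch of the calculation. This is not the code from the library used in this post. The function name `pm25ToAqi` is made up, the breakpoints are the EPA PM2.5 values as they stood when this was written, and clamping anything above the table to 500 is my own assumption.

```javascript
// EPA PM2.5 breakpoints (as of 2020): [concLow, concHigh, aqiLow, aqiHigh]
const PM25_BREAKPOINTS = [
  [0.0, 12.0, 0, 50],
  [12.1, 35.4, 51, 100],
  [35.5, 55.4, 101, 150],
  [55.5, 150.4, 151, 200],
  [150.5, 250.4, 201, 300],
  [250.5, 350.4, 301, 400],
  [350.5, 500.4, 401, 500],
];

// Hypothetical helper: convert a PM2.5 concentration (µg/m³) to a US EPA AQI.
function pm25ToAqi(pm25) {
  const value = Number(pm25);
  if (!Number.isFinite(value) || value < 0) {
    throw new RangeError(`Invalid PM2.5 value: ${pm25}`);
  }
  // The EPA method truncates the concentration to one decimal before the lookup.
  const c = Math.floor(value * 10) / 10;
  for (const [cLow, cHigh, iLow, iHigh] of PM25_BREAKPOINTS) {
    if (c <= cHigh) {
      // Linear interpolation inside the matching range.
      return Math.round(((iHigh - iLow) / (cHigh - cLow)) * (c - cLow) + iLow);
    }
  }
  return 500; // Assumption: clamp anything beyond the table to the top of the scale.
}
```

With the sample reading from above, `pm25ToAqi("19.31")` lands in the second row of the table and comes out at 66.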

Someone wrapped this up nicely in Typescript for us!

So to get from IoT sensors to AQI we’re looking at:

  const sensors = await client.fetch(query); // get sensor listing from Sanity
  const sensorURL = `https://www.purpleair.com/json?show=${sensors[0].id}`; // make a URL
  const response = await fetch(sensorURL); // get the sensor
  const sensorData = await response.json(); // parse the JSON body
  const pm25Atmospheric = Number(sensorData.results[0].pm2_5_atm); // get the atmospheric pm2.5 value (the API returns a string)
  const aqi = convert("pm25", "raw", "usaEpa", pm25Atmospheric); // convert to AQI

This value is wrong though. It's way too high. Turns out cheap sensors get the value wrong. Especially for wood smoke. The Lane Regional Air Protection Agency (LRAPA) bought a couple of Purple Air sensors and put them next to their expensive Scientific Instruments™ and came back with a report with this figure:

Figure from the LRAPA report on purple air sensors

So uh, the sensors agree on the offset on the Y-axis (intercept), they track each other closely (R²), but the slope is clearly different. In fact, the expensive instruments read about half of what the Purple Air sensors report.

I wish the PDF could have had as a title: You, yes you, divide by two for wood smoke!

const WOOD_SMOKE_REBATE_MAGIC_NUMBER = 0.48;
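Applying it is a one-liner, with one gotcha: the API hands back PM2.5 as a string like '19.31'. A small sketch, where `correctForWoodSmoke` is a made-up helper name and the 0.48 is just the constant above:

```javascript
const WOOD_SMOKE_REBATE_MAGIC_NUMBER = 0.48;

// Hypothetical helper: scale a raw Purple Air PM2.5 reading down for wood smoke.
// Accepts the string the API returns (e.g. '19.31') or a number.
function correctForWoodSmoke(rawPm25) {
  const value = Number(rawPm25);
  if (rawPm25 === null || rawPm25 === "" || !Number.isFinite(value)) {
    throw new TypeError(`Not a usable PM2.5 reading: ${rawPm25}`);
  }
  return value * WOOD_SMOKE_REBATE_MAGIC_NUMBER;
}
```

For the sample reading, `correctForWoodSmoke("19.31")` gives roughly 9.27, which is the number that should go into the AQI conversion.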

Wondering how the sensor actually works? There's an appendix at the end!

Sending SMSes

Let’s sign up for Twilio. There are some restrictions on what you can do without a verified account so I throw $20 at them to get a phone number. After that it more or less Just Works™.

// Wait for every message to be handed to Twilio before moving on
await Promise.all(
  people.map((person) =>
    twilio.messages.create({
      from: TWILIO_NUMBER,
      to: person.mobileNumber,
      body: broadcastMessage,
    })
  )
);

I now have a through-line from the data store -> purple air -> AQI LRAPA (or LARPa as I like to think of it) -> my phone! Yatta!

Texts containing AQI measurements sent by SMS


And I still haven't plugged in my laptop to charge it. Two hours have gone by. Still to do:

  • Register sent messages so we can throttle
  • Check if we’re trending one way or another and have crossed a threshold (AQI 75 & 100?)
  • Generate a message and send to everyone
  • Maybe add an image? Chart? To send in an MMS?

Telling “Fine” from “This is Fine”

Ok, so let’s figure out a simple boundary crossing detection from “fine” to “this is fine”.

We’re currently just storing users and sensors with Sanity. Let’s go ahead and use it for measurements and broadcasts as well. I don’t know if sending messages to myself and a couple of our friends is a "broadcast", but at least it sounds cool.

Schema:

{
  title: "Measurement",
  name: "measurement",
  type: "document",
  fields: [
    {
      title: "AQI",
      name: "aqi",
      type: "number",
    },
    {
      title: "Range",
      name: "range",
      type: "number",
    },
    {
      title: "PM25",
      name: "pm25",
      type: "number",
    },
  ],
},
{
  title: "Broadcast",
  name: "broadcast",
  type: "document",
  fields: [
    {
      title: "AQI",
      name: "aqi",
      type: "number",
    },
    {
      title: "Range",
      name: "range",
      type: "number",
    },
  ],
},

Let's do some reading:

const broadcasts = await client.fetch('*[_type == "broadcast"] | order(_createdAt desc)');
const previousBroadcastTime = (broadcasts[0] && broadcasts[0]._createdAt) || 0;
const minutesSinceBroadcast = (new Date().getTime() - new Date(previousBroadcastTime).getTime()) / 1000 / 60;

And some writing:

await client.create({
  _type: "broadcast",
  aqi: aqi,
  range: currentCondition.range,
});

Sweet!

Sanity Studio showing a measurement written over the API

We now need to send messages to people when stuff changes. Let’s make some AQI bands:

const conditions = [
  {
    name: "Excellent",
    valueBeneath: 35,
  },
  {
    name: "Fine",
    valueBeneath: 50,
  },
  {
    name: "OKish",
    valueBeneath: 75,
  },
  {
    name: "Get inside",
    valueBeneath: 100,
  },
  {
    name: "Stay inside",
    valueBeneath: 150,
  },
  {
    name: "Nope. Just Nope.",
    valueBeneath: 200,
  }
];
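Mapping an AQI value onto one of these bands could look roughly like this. `classify` is a made-up helper name, and dumping anything at or above the last threshold into the last band is my assumption about what you'd want there.

```javascript
// The bands from above, in compact form
const conditions = [
  { name: "Excellent", valueBeneath: 35 },
  { name: "Fine", valueBeneath: 50 },
  { name: "OKish", valueBeneath: 75 },
  { name: "Get inside", valueBeneath: 100 },
  { name: "Stay inside", valueBeneath: 150 },
  { name: "Nope. Just Nope.", valueBeneath: 200 },
];

// Hypothetical helper: find the first band whose upper limit is above the AQI.
function classify(aqi, bands) {
  const index = bands.findIndex((band) => aqi < band.valueBeneath);
  // Assumption: values at or above the last threshold fall into the last band.
  const range = index === -1 ? bands.length - 1 : index;
  return { name: bands[range].name, range };
}
```

So `classify(38, conditions)` gives `{ name: "Fine", range: 1 }`, and an AQI of 250 still gets a band instead of falling off the end of the list.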

You really want pretty robust qualifications of what an “event” is when you’re messaging people. If a value hovers around a boundary you could end up sending people messages every 5 minutes. That would be super annoying.

As “Simple Methods for Detecting Zero Crossing” (2003) puts it:

Affects of noise, harmonics, and multi-frequency signal make frequency and period measurements difficult for synchronizing control events. Various methods are presented to minimize errors in period and phase measurements. Both frequency and amplitude domain approaches are analyzed. Post detection processing allows greater accuracy. Static and dynamic hystereses as well as interpolation methods of zero-crossing detection are investigated.

OTOH, life is short. Let’s go with:

  • Send a message when we go between boundaries
  • Don’t send people messages more often than every 5 minutes
  • Don’t send people a message if the absolute AQI value has changed less than 10 since last message broadcast

We also have our “person” objects in Sanity so we could add messaging prefs that you could configure over SMS with Twilio through a little callback. Let’s not do that now. Maybe never.

if (minutesSinceBroadcast < 5 || previousBroadcastRange == currentCondition.range || Math.abs(previousBroadcastAQI - aqi) < 10) {
  res.json({ status: "No change" });
  return;
}
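The same three rules can be pulled out into a small pure function, which makes them easier to poke at without sending anyone a text. This is only a sketch: `minutesSince` and `shouldBroadcast` are made-up names, and the fallback of zeros when there is no earlier broadcast mirrors the defaults used elsewhere in this post.

```javascript
// Hypothetical helper: minutes between an ISO timestamp (like Sanity's _createdAt) and now.
function minutesSince(timestamp, now = new Date()) {
  return (now.getTime() - new Date(timestamp).getTime()) / 1000 / 60;
}

// Hypothetical helper: decide whether a new reading is worth a text.
// `previous` is the last broadcast document (or undefined), `current` has { aqi, range }.
function shouldBroadcast(previous, current, now = new Date()) {
  // No earlier broadcast: fall back to zeros
  const last = previous || { _createdAt: 0, range: 0, aqi: 0 };
  const recentlySent = minutesSince(last._createdAt, now) < 5;
  const rangeNotChanged = last.range === current.range;
  const aqiChangeTooSmall = Math.abs(last.aqi - current.aqi) < 10;
  return !(recentlySent || rangeNotChanged || aqiChangeTooSmall);
}
```

Passing `now` in explicitly keeps the throttle testable: feed it a broadcast from three minutes ago and it stays quiet, no matter how bad the air got.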

With the message:

  const broadcastMessage = `Air quality has gone from '${conditions[previousBroadcastRange].name}' (${previousBroadcastAQI}) to '${currentCondition.name}' (${aqi})`;

CRON in the cloud

I need to run my little cloud function on a timetable. Every minute I want it called. There’s CRON as a service, but I don’t know if I need a new service to do curl every few minutes. I futz around with GitHub Actions for a good 30 minutes but spinning up a container to do a single HTTPS GET feels like going shopping in a Sherman tank.

Also: 5 minutes is the fastest GitHub will spin it. And I want minute intervals. GitHub Actions DX is surprisingly hard. The docs are “top-down abstracts first” and I spend 30 minutes just understanding what goes where.

I so wish Vercel had CRON.

CRON as a service it is then. Easycron.com has won the SEO wars and looks fine. It’s feature-complete af, which isn’t strange given this is all they do:


I picture a world where the humans are gone, but where solar-powered CRON HTTP GET keeps doing its thing. Senselessly sending air quality warnings to non-existent endpoints.

Charts?!

Twilio sends images as well if you just attach a URL to an image. So let’s use Quickchart, which is really just chart.js sitting in the cloud rendering PNGs:

function lineChartURLSpec(measurements) {
  const obj = {
    type: "line",
    data: {
      labels: measurements.map((d) => {
        return "";
      }),
      datasets: [
        {
          label: "LRAPA AQI",
          backgroundColor: "rgb(255, 99, 132)",
          borderColor: "rgb(255, 99, 132)",
          data: measurements
            .map((d) => {
              return d.aqi;
            })
            .reverse(),
          fill: false,
        },
      ],
    },
    options: {
      title: {
        display: true,
        text: `Air quality last ${measurements.length} minutes`,
      },
    },
  };
  return encodeURIComponent(JSON.stringify(obj));
}

End Result

WFH AQI BBQ

The sum total for the server side is 158 lines of pedestrian JS:

import { convert } from "@shootismoke/convert";
const sanityClient = require("@sanity/client");

const client = sanityClient({
  projectId: process.env.SANITY_PROJECT_ID,
  dataset: "production",
  token: process.env.SANITY_TOKEN, // or leave blank to be anonymous user
  useCdn: false, // `false` if you want to ensure fresh data
});

const ACCOUNT_SID = process.env.TWILIO_ACCOUNT_SID;
const TWILIO_TOKEN = process.env.TWILIO_AUTH_TOKEN;
const TWILIO_NUMBER = process.env.TWILIO_NUMBER;
const twilio = require("twilio")(ACCOUNT_SID, TWILIO_TOKEN);

const WOOD_SMOKE_REBATE_MAGIC_NUMBER = 0.48;

const conditions = [
  {
    name: "Excellent",
    valueBeneath: 35,
  },
  {
    name: "Fine",
    valueBeneath: 50,
  },
  {
    name: "OKish",
    valueBeneath: 75,
  },
  {
    name: "Use judgement",
    valueBeneath: 100,
  },
  {
    name: "Get inside",
    valueBeneath: 150,
  },
  {
    name: "Nope. Just Nope.",
    valueBeneath: 200,
  },
];

export default async (req, res) => {
  const people = await client.fetch('*[_type == "person"]');
  const sensors = await client.fetch('*[_type == "sensor"]'); // TODO: average out more

  const sensorURL = `https://www.purpleair.com/json?show=${sensors[0].id}`;
  const sensorResult = await fetch(sensorURL);
  const sensorResponse = await sensorResult.json();
  const atmosphericPM25 = sensorResponse.results[0].pm2_5_atm;
  let aqi = convert("pm25", "raw", "usaEpa", atmosphericPM25 * WOOD_SMOKE_REBATE_MAGIC_NUMBER);

  let currentCondition;
  for (let i = 0; i < conditions.length; i++) {
    currentCondition = conditions[i];
    currentCondition.range = i; // set before the check so an AQI above the last threshold still gets a range
    if (aqi < conditions[i].valueBeneath) {
      break;
    }
  }

  const measurements = await client.fetch('*[_type == "measurement"] | order(_createdAt desc)');

  const measurementDoc = {
    _type: "measurement",
    range: currentCondition.range,
    aqi: +aqi,
    pm25: +atmosphericPM25,
  };

  measurements.unshift(measurementDoc);
  await truncateData(measurements, 120);
  await client.create(measurementDoc); // wait for the write so the function doesn't return before it lands

  const broadcasts = await client.fetch('*[_type == "broadcast"] | order(_createdAt desc)');
  const previousBroadcastAQI = (broadcasts[0] && broadcasts[0].aqi) || 0;
  const previousBroadcastRange = (broadcasts[0] && broadcasts[0].range) || 0;
  const previousBroadcastTime = (broadcasts[0] && broadcasts[0]._createdAt) || 0;
  const minutesSinceBroadcast = (new Date().getTime() - new Date(previousBroadcastTime).getTime()) / 1000 / 60;

  const recentlySent = minutesSinceBroadcast < 5;
  const rangeNotChanged = previousBroadcastRange == currentCondition.range;
  const aqiChangeTooSmall = Math.abs(previousBroadcastAQI - aqi) < 10;

  if (recentlySent || rangeNotChanged || aqiChangeTooSmall) {
    const status = {
      status: {
        recentlySent: recentlySent,
        rangeNotChanged: rangeNotChanged,
        aqiChangeTooSmall: aqiChangeTooSmall,
      },
    };

    res.statusCode = 200;
    res.json(status);
    return;
  }
  await truncateData(broadcasts, 2);

  const broadcastMessage = `Air quality has gone from '${conditions[previousBroadcastRange].name}' (${previousBroadcastAQI}) to '${currentCondition.name}' (${aqi})`;

  await client.create({
    _type: "broadcast",
    aqi: +aqi,
    range: +currentCondition.range,
  });

  const chartURL = `https://quickchart.io/chart?c=${lineChartURLSpec(measurements)}`;

  // Wait for every SMS to be handed to Twilio before the function returns
  await Promise.all(
    people.map((person) =>
      twilio.messages.create({
        from: TWILIO_NUMBER,
        to: person.mobileNumber,
        body: broadcastMessage,
        mediaUrl: chartURL,
      })
    )
  );

  const status = { status: "Broadcast", LRAPA_AQI: aqi };
  res.statusCode = 200;
  res.json(status);
};

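function truncateData(objects, count) {
  // Delete everything past the first `count` documents; the returned promise settles when all deletes are done
  return Promise.all(objects.slice(count).map((doc) => client.delete(doc._id)));
}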

function lineChartURLSpec(measurements) {
  const obj = {
    type: "line",
    data: {
      labels: measurements.map((d) => {
        return "";
      }),
      datasets: [
        {
          label: "LRAPA AQI",
          backgroundColor: "rgb(255, 99, 132)",
          borderColor: "rgb(255, 99, 132)",
          data: measurements
            .map((d) => {
              return d.aqi;
            })
            .reverse(),
          fill: false,
        },
      ],
    },
    options: {
      title: {
        display: true,
        text: `Air quality last ${measurements.length} minutes`,
      },
    },
  };
  return encodeURIComponent(JSON.stringify(obj));
}

The schema for the Sanity.io Studio is another 76 lines:

import createSchema from "part:@sanity/base/schema-creator";
import schemaTypes from "all:part:@sanity/base/schema-type";
export default createSchema({
  name: "default",
  types: schemaTypes.concat([
    {
      title: "People",
      name: "person",
      type: "document",
      fields: [
        {
          title: "Name",
          name: "name",
          type: "string",
        },
        {
          title: "Mobile number",
          name: "mobileNumber",
          type: "string",
        },
      ],
    },
    {
      title: "Sensors",
      name: "sensor",
      type: "document",
      fields: [
        {
          title: "ID",
          name: "id",
          type: "number",
        },
      ],
    },
    {
      title: "Measurement",
      name: "measurement",
      type: "document",
      fields: [
        {
          title: "AQI",
          name: "aqi",
          type: "number",
        },
        {
          title: "Range",
          name: "range",
          type: "number",
        },
        {
          title: "PM25",
          name: "pm25",
          type: "number",
        },
      ],
    },
    {
      title: "Broadcast",
      name: "broadcast",
      type: "document",
      fields: [
        {
          title: "AQI",
          name: "aqi",
          type: "number",
        },
        {
          title: "Range",
          name: "range",
          type: "number",
        },
      ],
    },
  ]),
});

So, here's to this never actually sending a single message to anyone so the kids can hang out in the backyard.

Things TODO still:

  • Catch errors
  • Drop outliers in time series
  • Be able to add a bunch of local sensors, drop outliers and average out
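
One possible way to attack the last two items, and not something the code above does: read several nearby sensors and take the median, which shrugs off a single sensor sitting next to someone's barbecue. `median` and `combineSensors` are made-up helper names for this sketch.

```javascript
// Hypothetical helper: median of a non-empty list of numbers.
function median(values) {
  if (values.length === 0) {
    throw new Error("Cannot take the median of an empty list");
  }
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical helper: combine PM2.5 readings from several sensors into one value.
// Readings arrive as strings from the API; anything that isn't a finite number is dropped.
function combineSensors(readings) {
  const usable = readings.map((r) => Number(r)).filter((n) => Number.isFinite(n));
  return median(usable);
}
```

With readings of '19.31', '20.62' and a wild '250.0', `combineSensors` returns 20.62 where a plain average would land near 97.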

Appendix A: So how do you measure particle concentrations so cheaply, anyway

Inside the Purple Air sensors, you find the PMS5003 sensor made by Plantower. According to their marketing it's exactly the same size as a uh, Zippo lighter:

Double Lung Electronics (!) will happily sell you them on AliExpress for $14 apiece.

A teardown (and cleaning) shows that there are almost no moving parts. They're super simple and therefore very cheap.

So how do you measure both particulate density and size with a single photodiode and a single 658nm laser? Turns out there’s a lot going on when you have particles that are just a bit bigger than the wavelength of light. The way the particles spread light goes from Rayleigh to Mie and there are sinusoidal nonlinearities in the backscatter that you apparently can use to detect particle size:

This property of the scattering lets you make really cheap sensors to detect smoke particles from a mega-fire in a pandemic.