An Influx of Curling

The Winter Olympics are over and we seem to be pretty good at curling. I’ve been doing some curling myself.

The Olympics may be over this year but the ‘lympics are not. The 2022 Winter Paralympics start on the 4ᵗʰ of March and are even crazier. Events include wheelchair curling and blind skiing! If you want to watch some extreme films to get in the mood then the Banff Mountain Film Festival UK & Ireland Tour has already started.

Curling

There used to be a curling rink near me and although I’ve tried it, the curling I’ve been doing recently is a bit different. I needed to extract a CSV export from an InfluxDB instance and my tool of choice was curl.

Here is a guide for how to extract a CSV of air quality readings from a node on the Sensor.Community global sensor network. I needed the particulate matter air pollution readings for a DIY sensor I’d built and deployed. The following examples have been formatted to read better but you can put them on one line if you like.

The first step is to build a query for the data you are interested in. You can use browser dev tools to inspect the queries being issued to the Grafana API and modify as required. In the following query replace NODE_ID with the numeric ID of the node you want the data for and change the time values to the Unix epoch values you want too. The timestamp is included by default so does not need to be selected.

SELECT sds011_p1 AS PM_10, sds011_p2 AS PM_2_5
FROM "feinstaub"
WHERE ("node" =~ /^NODE_ID$/)
AND time >= 1599099790168ms
AND time <= 1618772569065ms

You can then extract to a CSV file using the following curl command. The quotes in the query need to be escaped for bash with \" and epoch=s gets the timestamps in seconds even though we are querying in milliseconds.

curl -H "Accept: application/csv" \
-G "https://maps.sensor.community/grafana/api/datasources/proxy/3/query" \
--data-urlencode "db=feinstaub" \
--data-urlencode "q=SELECT sds011_p1 AS PM_10, sds011_p2 AS PM_2_5 FROM \"feinstaub\" WHERE (\"node\" =~ /^NODE_ID$/) AND time >= 1599099790168ms AND time <= 1618772569065ms" \
--data-urlencode "epoch=s" > data.csv

This can result in a lot of records, so if you want to aggregate to the mean value every hour use the following query.

SELECT mean(sds011_p1) AS PM_10, mean(sds011_p2) AS PM_2_5
FROM "feinstaub"
WHERE ("node" =~ /^NODE_ID$/)
AND time >= 1599099790168ms
AND time <= 1618772569065ms
GROUP BY time(1h)

Then save to a CSV file using the following command.

curl -H "Accept: application/csv" \
-G "https://maps.sensor.community/grafana/api/datasources/proxy/3/query" \
--data-urlencode "db=feinstaub" \
--data-urlencode "q=SELECT mean(sds011_p1) AS PM_10, mean(sds011_p2) AS PM_2_5 FROM \"feinstaub\" WHERE (\"node\" =~ /^NODE_ID$/) AND time >= 1599099790168ms AND time <= 1618772569065ms GROUP BY time(1h)" \
--data-urlencode "epoch=s" > data_hourly.csv

You can then import the data into your favourite tool for analysis data science, whether that is pandas, NumPy, Jupyter Notebook or just a spreadsheet.


This blog is treeware! If you found it useful then please plant a tree.
Donate a treeDonate a tree🌳🌳