The indigoparadox Web Zone

dbfetch and Sensor Data

Back to projects

Web Zone Navigation

Related Web Zones

Other Web Zones

1. Introduction

dbfetch is a utility that will fetch a JSON object (or list of JSON objects) from a web source and flatten it to store relevant fields into a traditional SQL-based relational database. Fields and transformations for flattening JSON blobs are provided in YAML files, the format of which is outlined below.

The notes on this page mostly apply to the classic branch. The master branch went in a slightly new direction that hasn't been tested as thoroughly and may not be as compatible as we'd like.

2. YAML Model Definition

Models are defined in YAML files under the models subdirectory. dbfetch will try to create a table and add columns to match the model. If a column is added to the model for an existing table, dbfetch will attempt to add those columns to the existing table.

All models must have at least one field with the primary_key property.

2 - 1. Top Level

tablename
The name of the SQL table in which to store processed JSON objects as rows.
options
Options pertaining to fetch operations. See Options for more information.
transformations
A list of field names, each containing a list of Transformations that should be applied to each object field before it is stored in a column.
fields
A list of field names, each containing a set of Field Properties for the database column it defines.

2 - 2. Options

delete_undef
Delete all fields of the JSON object which are not listed in the fields list before pushing to the database.

2 - 3. Transformations

Requester.format_date
Takes a UNIX epoch timestamp or ISO-ish date string and attempts to turn it into a DateTime object compatible with an SQL DateTime field.
Requester.format_flatten
Pulls the fields inside of an object field up to the same level as their parent, adding the parent's key as a prefix to each child's key. Removes the parent object field.

2 - 4. Field Properties

type
The SQL type of the column. Generally, int, datetime, float should be usable, among others.
primary_key
A value of true or false, indicating the whether this field is part of the primary key in the database. All models should have at least one primary key, and this should not be modified after the table is created!

3. Configuration

With some models present, dbfetch can then be configured to fetch JSON objects from various sources and process them using those models. The configuration file is dbfetch.ini by default, but this can be changed using the -c command-line argument.

The configuration file is split into stanzas, with one stanza per model to define options for that model and multiple location stanzas per model to define different sources to pull JSON objects from for processing and storage with that model.

3 - 1. Model Stanza Options

connection
The database connection string (any SQLAlchemy-compatible connection string should work).
locations
Comma-separated list of integers matched to location stanzas.

3 - 2. Location Stanza Options

url
Web URL from which to fetch JSON objects.

4. Basic Example

This example will attempt to process the following JSON object, fetched from http://example.com/aqi.json:

aqi.json
1{
2 "pm25_standard": 24,
3 "relative_humidity": 39.67132568359375,
4}

This example will also use a simplified version of the model i2cerv.yml, with irrelevant parts (e.g. for plotting) stripped out.

i2cerv.yml
1---
2tablename: i2cerv
3options: {}
4transformations:
5 updated:
6 - Requester.format_date
7fields:
8 location:
9 type: int
10 primary_key: true
11 updated:
12 type: datetime
13 primary_key: true
14 pm25_standard:
15 type: float
16 relative_humidity:
17 type: float

Finally, the following stanza in the configuration file, dbfetch.ini will activate the model above when it is placed in the models directory and use it to process the JSON object fetched from the example URL into an SQLite database (any valid SQLAlchemy connection string should work):

dbfetch.ini
1[i2cerv]
2connection = sqlite:///i2cerv.db
3locations = 1
4
5[dbfetch-location-1]
6url = http://example.com/aqi.json

5. Flattening Example

This section is under construction!

Please stay tuned for updates pending research and development.

Please see models/waqi.yml for an example of a model that pulls the children from the object.data.x fields up to multiple object.data_x fields.

6. Plotting

This section is under construction!

Please stay tuned for updates pending research and development.

7. MQTT

This section is under construction!

Please stay tuned for updates pending research and development.

Table of Contents

  1. Introduction
  2. YAML Model Definition
  3. Configuration
  4. Basic Example
  5. Flattening Example
  6. Plotting
  7. MQTT