JSON Parsing

Update: I now use JSON for Modern C++, which has a really nice API and reasonable performance. I recommend checking out what other options are available.

jsmn is a C library that parses a JSON stream into tokens.

License and Source

The source repository is available here. jsmn is open source under the MIT license, which is available here.

Usage

Everything is in a pair of C source and header files.

There are only two functions to call. The first, jsmn_init(), initializes the parser handle, which you can allocate on the stack.

The second, jsmn_parse(), takes the parser handle and the token storage, tokenizes the incoming JSON stream, and returns the number of tokens parsed (or a negative error code). So, simply:

jsmntok_t tokens[256];
jsmn_parser parser;
jsmn_init(&parser);
const int count = jsmn_parse(&parser,
                             json_stream_data,
                             json_stream_length,
                             tokens, 256);

Concepts

Now that I have a list of tokens, I can extract the data I want. Each jsmntok_t has fields that describe its type, the start and end offsets of its data within the JSON stream, and its size, which is the number of children it has.

For example, an array:

["A", "B", "C"]

This tokenizes into 4 jsmntok_t. The first token is the root array: type JSMN_ARRAY, start 0 (the beginning of the JSON stream), end 15 (one past the closing bracket, which is the end of the JSON stream), and size 3 (the array contains the 3 strings A, B, and C).

Take one more token: the first string, A. Its token has type JSMN_STRING, start 2, end 3, and size 0. Note that start and end exclude the surrounding quotes, and that end is one past the last character of the data.
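To make the offsets concrete, here is a minimal sketch in C. It deliberately does not include jsmn.h: it declares a stand-in struct, tok_t, with only the four fields described above, and fills in the tokens for the example by hand (the offsets for B and C are my own count, following the same rule as A). The names tok_t, TOK_ARRAY, TOK_STRING, sample_json, sample_tokens, token_text() and skip_token() are illustrative and are not part of jsmn.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for jsmn's token type: only the four fields described above. */
typedef enum { TOK_ARRAY, TOK_STRING } tok_type;

typedef struct {
    tok_type type;
    int start; /* offset of the first character of the data */
    int end;   /* offset one past the last character of the data */
    int size;  /* number of direct children */
} tok_t;

/* The example stream and its four tokens, filled in by hand. */
const char sample_json[] = "[\"A\", \"B\", \"C\"]";
const tok_t sample_tokens[4] = {
    { TOK_ARRAY,   0, 15, 3 },
    { TOK_STRING,  2,  3, 0 },
    { TOK_STRING,  7,  8, 0 },
    { TOK_STRING, 12, 13, 0 },
};

/* Copies the text of one token into out and NUL-terminates it.
 * Returns the number of characters copied, or -1 if out is too small. */
int token_text(const char *js, const tok_t *t, char *out, size_t out_size)
{
    assert(t->start >= 0 && t->end >= t->start);
    size_t len = (size_t)(t->end - t->start);
    if (len + 1 > out_size)
        return -1;
    memcpy(out, js + t->start, len);
    out[len] = '\0';
    return (int)len;
}

/* Returns the index of the token that follows token i and all of its
 * descendants, using only the size field to walk the tree. */
int skip_token(const tok_t *tokens, int i)
{
    int children = tokens[i].size;
    i++;
    for (int k = 0; k < children; k++)
        i = skip_token(tokens, i);
    return i;
}
```

With these definitions, token_text(sample_json, &sample_tokens[1], buf, sizeof buf) fills buf with A, and skip_token(sample_tokens, 0) returns 4, the index just past the whole array. Copying leaves the JSON stream untouched, at the cost of needing a destination buffer.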

Parsing

Parsing will always involve string conversion. For a JSMN_STRING, we can overwrite the closing quote " with a null and lift the string out in place, provided the JSON stream is in a writable buffer. The closing quote is conveniently always at the end offset.
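The in-place trick can be sketched as a small helper. This is an illustration under my own assumptions rather than anything jsmn provides: lift_in_place() is a made-up name, it takes the start and end offsets directly instead of a token, and it assumes the caller passes a writable buffer together with its length.

```c
#include <assert.h>
#include <stddef.h>

/* Terminates the token text in place and returns a pointer to it.
 * js must be writable; start and end are the token's offsets.
 * Returns NULL when end is not inside the buffer, because there is
 * then no delimiter character to overwrite. */
char *lift_in_place(char *js, size_t js_len, int start, int end)
{
    assert(js != NULL && start >= 0 && start <= end);
    if ((size_t)end >= js_len)
        return NULL;
    js[end] = '\0';   /* overwrite the closing quote (or other delimiter) */
    return js + start;
}
```

For the array example, lift_in_place(js, 15, 2, 3) overwrites the quote at offset 3 and returns a pointer to the string A. The bounds check matters for a token that ends exactly at the end of the stream, such as the root array, where end equals the stream length.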

For a JSMN_PRIMITIVE, the character at end is not a quote but whatever delimiter follows the value: usually a comma, but it can also be ], }, or whitespace. We overwrite it with a null and call atoi() or a similar function (strtol(), strtod()) to convert the string into a value. Again, end marks a delimiter in the JSON stream that we don't need once the stream is tokenized. A primitive can be a number, a boolean (true or false), or null. Read the first character to figure out which it is (t or f for a boolean, n for null, - or a digit for a number).
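Here is a rough sketch of the primitive handling, again independent of jsmn.h. The names prim_kind, primitive_kind() and primitive_to_double() are hypothetical. Instead of overwriting the delimiter, this version copies the text into a small local buffer before calling strtod(), so it also works on a read-only stream; the 64-byte limit is an arbitrary choice for the sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PRIM_NULL, PRIM_BOOL, PRIM_NUMBER, PRIM_INVALID } prim_kind;

/* Classifies a primitive by its first character, as described above. */
prim_kind primitive_kind(const char *js, int start, int end)
{
    assert(js != NULL);
    if (end <= start)
        return PRIM_INVALID;
    switch (js[start]) {
    case 't': case 'f':
        return PRIM_BOOL;
    case 'n':
        return PRIM_NULL;
    case '-':
    case '0': case '1': case '2': case '3': case '4':
    case '5': case '6': case '7': case '8': case '9':
        return PRIM_NUMBER;
    default:
        return PRIM_INVALID;
    }
}

/* Converts the text in [start, end) to a double without modifying js.
 * Returns 0 on success, -1 if the text is too long, out of range,
 * or not entirely consumed by strtod(). */
int primitive_to_double(const char *js, int start, int end, double *out)
{
    char buf[64];
    char *stop;
    size_t len;

    assert(js != NULL && out != NULL);
    if (end <= start)
        return -1;
    len = (size_t)(end - start);
    if (len >= sizeof buf)
        return -1;
    memcpy(buf, js + start, len);
    buf[len] = '\0';

    errno = 0;
    double value = strtod(buf, &stop);
    if (stop != buf + len || errno == ERANGE)
        return -1;
    *out = value;
    return 0;
}
```

The intended use is to call primitive_kind() first and only convert when it reports PRIM_NUMBER, since strtod() on its own accepts some spellings that are not JSON numbers.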

Storage

If jsmn_parse() returns JSMN_ERROR_NOMEM (a negative value), the token storage is too small. Reallocate a larger token array, then call jsmn_parse() again with the same parser handle to continue parsing. Of course, this doesn't apply to the example above, where the tokens are allocated on the stack.

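You can also call jsmn_parse() with a null pointer (NULL in C, nullptr in C++) as the token storage to find out how many tokens the stream needs. Call jsmn_init() again before the real parse, since the counting pass advances the parser.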

Reference