JSON Parsing
Update: I now use JSON for Modern C++, which has a really nice API and reasonable performance. I recommend checking out what other options are available.
jsmn is a C library that parses a JSON stream into tokens.
License and Source
The source repository is available here. jsmn is open source under the MIT license, which is available here.
Usage
Everything is in a pair of C source and header files.
There are only two APIs to call. jsmn_init() initializes the parser handle, which you can allocate on the stack.
jsmn_parse() binds token storage to the parser handle, tokenizes the incoming JSON stream and returns the number of parsed tokens. So, simply:
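#include "jsmn.h"

jsmntok_t tokens[256];  /* token storage, here on the stack */
jsmn_parser parser;
jsmn_init(&parser);     /* reset the parser state */
/* returns the number of tokens, or a negative error code */
const int count = jsmn_parse(&parser,
                             json_stream_data,
                             json_stream_length,
                             tokens, 256);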
Concepts
Now that I have a list of tokens, I can extract the data I want. Each jsmntok_t has fields that describe its type, the offsets within the JSON stream that mark the beginning and end of its data, and how many children it has.
For example, an array:
["A", "B", "C"]
It tokenizes into 4 jsmntok_t. The first token is the root array, of type JSMN_ARRAY, with start at 0 (the beginning of the JSON stream), end at 15 (the end of the JSON stream), and size of 3 (the array contains the 3 strings A, B, and C).
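We'll analyze one more token. The token for the first string, A, has type JSMN_STRING, start at 2, end at 3, and size of 0. Note that the offsets exclude the surrounding quotes: start points at the first character of the string and end points at the closing quote.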
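The offsets make it easy to inspect a token without touching the stream. Here is a minimal sketch, not part of jsmn: sketch_tok is a hypothetical stand-in that mirrors the start, end and size fields of jsmntok_t so the snippet builds without jsmn.h, and tok_len() and tok_equals() are helper names I made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in mirroring the start/end/size fields of jsmntok_t,
 * so this sketch builds without jsmn.h. */
typedef struct {
    int start; /* offset of the first character of the token */
    int end;   /* offset one past the last character of the token */
    int size;  /* number of direct children */
} sketch_tok;

/* Length of the token text in bytes. */
size_t tok_len(const sketch_tok *t)
{
    return (size_t)(t->end - t->start);
}

/* Returns 1 if the token text equals the C string s, 0 otherwise.
 * The stream is only read, never modified. */
int tok_equals(const char *json, const sketch_tok *t, const char *s)
{
    return strlen(s) == tok_len(t) &&
           strncmp(json + t->start, s, tok_len(t)) == 0;
}
```

With the offsets from the example, the root array token (start 0, end 15) has a length of 15, and the token with start 2 and end 3 has a length of 1 and compares equal to "A".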
Parsing
Parsing will always involve string conversion. For a JSMN_STRING, we can replace the end quote " with a null and lift the string out, provided the JSON buffer is writable. The end quote is always conveniently marked by the end offset.
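The in-place trick for strings can be sketched in a few lines. lift_string() is a hypothetical helper name, not a jsmn function, and it assumes the JSON lives in a writable buffer and that tokenizing has already finished, because the stream is no longer valid JSON afterwards.

```c
/* Terminates a string token in place and returns a pointer to its text.
 * json must be a writable buffer; start and end are the offsets taken
 * from a JSMN_STRING token. Escape sequences such as \n are left as-is. */
char *lift_string(char *json, int start, int end)
{
    json[end] = '\0'; /* overwrite the closing quote */
    return json + start;
}
```

For the array ["A", "B", "C"], calling lift_string(json, 2, 3) would return a pointer to the C string "A" inside the original buffer, with no copy made.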
For a JSMN_PRIMITIVE, there is no end quote, so we replace the delimiter that follows it (a comma, a closing ] or }, or whitespace) with a null and call atoi() or a similar function to convert the string into a value. Again, end marks the delimiter in the JSON stream, which we don't need once the stream is tokenized. A primitive can be a number, a boolean (true or false) or null. Read the first character to figure out what type it is (t or f for boolean, n for null, - or a digit for a number).
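Here is a rough sketch of both steps for primitives. The names prim_kind, classify_primitive() and primitive_to_long() are my own, not jsmn's. I also swapped two things compared to the text: it uses strtol() rather than atoi() because strtol() can report failure, and it copies the token into a small buffer instead of overwriting the delimiter, so it works on a read-only stream. It assumes integers only and treats anything that is not a boolean or null as a number.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

typedef enum { PRIM_NULL, PRIM_BOOL, PRIM_NUMBER } prim_kind;

/* Classifies a primitive by the first character of its token. */
prim_kind classify_primitive(const char *json, int start)
{
    switch (json[start]) {
    case 't':
    case 'f':
        return PRIM_BOOL;
    case 'n':
        return PRIM_NULL;
    default:
        return PRIM_NUMBER; /* assumed: '-' or a digit */
    }
}

/* Converts an integer primitive without modifying the stream.
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int primitive_to_long(const char *json, int start, int end, long *out)
{
    char buf[32];
    char *stop;
    long value;
    int len = end - start;

    if (len <= 0 || len >= (int)sizeof buf)
        return 0;
    memcpy(buf, json + start, (size_t)len);
    buf[len] = '\0';

    errno = 0;
    value = strtol(buf, &stop, 10);
    if (errno != 0 || *stop != '\0')
        return 0; /* out of range, or not a plain integer */
    *out = value;
    return 1;
}
```

A floating point value such as 1.5 makes primitive_to_long() return 0, so a real parser would probably fall back to strtod() in that case.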
Storage
If jsmn_parse() returns JSMN_ERROR_NOMEM (a negative value), the token storage is too small. Dynamically reallocate the storage, then call jsmn_parse() again to continue parsing. Of course, this doesn't apply to the above example, where the tokens are allocated on the stack.
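The grow-and-retry idea might look something like this. grow_storage() is a hypothetical helper, not a jsmn API, and the doubling policy with a starting capacity of 16 is my own assumption. The jsmn loop itself is shown only as a comment so that the snippet builds without jsmn.h.

```c
#include <stdint.h>
#include <stdlib.h>

/* Doubles *capacity (starting at 16) and resizes the array to match.
 * Returns the new array, or NULL on failure, in which case the old
 * array and *capacity are left untouched. */
void *grow_storage(void *old, size_t *capacity, size_t elem_size)
{
    size_t new_cap = (*capacity == 0) ? 16 : *capacity * 2;
    void *p;

    /* reject zero-sized elements and arithmetic overflow */
    if (elem_size == 0 || new_cap < *capacity || new_cap > SIZE_MAX / elem_size)
        return NULL;
    p = realloc(old, new_cap * elem_size);
    if (p != NULL)
        *capacity = new_cap;
    return p;
}

/* Intended use with jsmn, shown as a comment because it needs jsmn.h.
 * tokens and cap come from an initial grow_storage(NULL, &cap, ...):
 *
 *   int count;
 *   while ((count = jsmn_parse(&parser, js, len, tokens,
 *                              (unsigned int)cap)) == JSMN_ERROR_NOMEM) {
 *       void *p = grow_storage(tokens, &cap, sizeof *tokens);
 *       if (p == NULL)
 *           break;
 *       tokens = p;
 *   }
 */
```

The storage has to be allocated once before the first jsmn_parse() call, since passing a null pointer as storage puts jsmn into counting mode instead of returning JSMN_ERROR_NOMEM.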
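You can also call jsmn_parse() with a null pointer (NULL in C, nullptr in C++) as storage to determine how many tokens you'll get. Call jsmn_init() again before the real parse, since the counting pass advances the parser's position in the stream.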
Reference
- Sample code from GitHub.