pipeline utility for converting mongodb's BSON to JSON
yosh a9f7e9d447 chore: regulate tabs 2024-02-26 21:27:48 -05:00
LICENSE initial commit 2024-01-27 20:11:00 -05:00
Makefile use PRI macros for max portability 2024-01-27 20:24:51 -05:00
README.md void main, clarify readme 2024-01-27 20:16:29 -05:00
bson2json.c chore: regulate tabs 2024-02-26 21:27:48 -05:00

bson2json

a dead-simple no-frills pipeline utility for converting mongodb's BSON format to JSON.

building and installing

make
make install

by default, this installs to /usr/local/bin/bson2json

usage

bson2json does not take any arguments. it only reads stdin and outputs to stdout. as such, common usage is like so:

bson2json < bson_file.bson > json_file.json

# or, a more complicated example...
bsonurl=$(curl https://some.server/api.php | jq -r '.link.filter')
curl "$bsonurl" | bson2json | jq -r '.filter.to.a.specific.value'

bson2json has very minimal error checking. it assumes that the bson you give it is valid, and it doesn't tell you at which byte an error occurred (that'd be weird). if an unrecoverable error occurs, the exit code will be nonzero.

additionally, bson2json does not buffer any input: it writes json as it reads. if an error occurs halfway through, the output will be truncated (and therefore invalid) json. as such, if you're not 100% sure that the bson files you are giving it are valid, validate first and convert second, like so:

bson2json < bson_file.bson >/dev/null && bson2json < bson_file.bson | jq 'some_filter'
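
if running the whole conversion twice is too expensive, a cheaper (and much weaker) check is to look at the framing of the first document. the sketch below is not part of bson2json; `bson_first_doc_len` is a hypothetical name. it relies only on the bson layout of a document: a little-endian int32 total length, the elements, then a single 0x00 terminator. passing it rules out an obviously truncated document, not every kind of invalid bson:

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical pre-check, not taken from bson2json.c.
 * returns the declared length of the first document in `buf` if its
 * framing is intact, or 0 if the buffer is too short, the document is
 * truncated, or the terminator byte is missing. */
static size_t bson_first_doc_len(const uint8_t *buf, size_t avail)
{
	/* smallest possible document: 4 length bytes + 1 terminator */
	if (avail < 5)
		return 0;

	/* the length prefix is little-endian and counts itself */
	uint32_t declared = (uint32_t)buf[0]
	                  | (uint32_t)buf[1] << 8
	                  | (uint32_t)buf[2] << 16
	                  | (uint32_t)buf[3] << 24;

	/* int32 in the spec is signed, so reject anything above INT32_MAX */
	if (declared < 5 || declared > 0x7fffffffu || declared > avail)
		return 0;

	/* every document ends with a 0x00 byte */
	if (buf[declared - 1] != 0x00)
		return 0;

	return declared;
}
```

a caller would read the file into memory, call this on the buffer, and only pipe the data into bson2json when the result is nonzero.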

fallbacks

  • this diverges from the "official" libbson way of converting bson to json, because I think some of the official ways suck
  • because json is much more stripped-down type-wise than bson, some information is lost or changed in conversion. notably:
    • binary data is converted to hex as a string, and doesn't specify what "type" of binary data it is
    • a lot of types that would otherwise be wrapped in a nested document with a key naming their type are flattened: the key is either removed or the value is written as the corresponding json type itself (e.g. double)
    • that's all I remember
  • proper c23 support for _Decimal128 numbers (0x13 for bson) isn't really there in c compilers as of writing this, and I did not want to roll my own implementation of them, so decimal128 numbers are represented as a binary string for the time being
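
to make the "binary data is converted to hex as a string" point concrete, here is a minimal sketch of that kind of conversion. it is not the code from bson2json.c: `hex_json_string` is a hypothetical helper, and the lowercase digits are an assumption (the actual output may use a different case). note that the bson binary subtype byte is simply not part of the result, which is the information loss described above:

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical helper, not taken from bson2json.c.
 * writes `len` bytes of `data` into `out` as a quoted hex json string.
 * `out` must have room for 2 * len + 3 bytes (two quotes plus NUL).
 * returns the number of characters written, excluding the NUL. */
static size_t hex_json_string(const uint8_t *data, size_t len, char *out)
{
	/* lowercase is an assumption made for this sketch */
	static const char digits[] = "0123456789abcdef";
	size_t n = 0;

	out[n++] = '"';
	for (size_t i = 0; i < len; i++) {
		out[n++] = digits[data[i] >> 4];   /* high nibble */
		out[n++] = digits[data[i] & 0x0f]; /* low nibble */
	}
	out[n++] = '"';
	out[n] = '\0';
	return n;
}
```

hex only ever produces characters that are safe inside a json string, so no escaping step is needed for this type.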

testing

I tested this against the libbson test suite and everything looked fine: it only fails on the tests that are meant to fail. all the output was valid json too, so we're good on that front as well.