Show HN: Parson, lightweight JSON parser in C (kgabis.github.com)
46 points by kgabis on Oct 28, 2012 | 35 comments



parson.c, 109 - wrong conditional

Well-written, clean and consistent. A bit too pedantic and therefore too verbose for my taste though :) Why in K&R's name is this -

  JSON_Object *new_obj = (JSON_Object*)parson_malloc(sizeof(JSON_Object));

not written like this -

  JSON_Object *new_obj = parson_malloc(sizeof *new_obj);

On a more serious note - I would move json_*_t structs to the header file and eliminate a bunch of API functions that service the access to these structs. It's a C library after all, everyone knows how to shoot one's own leg.

(edit) Oh, and you don't handle realloc failures properly (at least in json_object_add).


> I would move json_*_t structs to the header file and eliminate a bunch of API functions that service the access to these structs. It's a C library after all, everyone knows how to shoot one's own leg.

From what I know, I would suggest doing that only if the code is never to be used as a shared library, as otherwise the structs can't be changed without recompiling all programs that use the library.

There was a Debian security update to libxml2 in 2008 that made various Gnome apps including, IIRC, the window manager crash for many users in various circumstances (depending on the theme used, etc.)(*), because it changed some struct. Public API functions were provided in that case, but apparently direct access to structs was not discouraged enough to prevent some programs from going the unsafe route anyway (or perhaps the API wasn't sufficiently complete? I don't know the details).

I haven't created any widespread C programs, and don't know if there are ABI checkers that Debian should use / have used, or if it's really just restricting oneself to procedure-based APIs that can avoid such problems.

(*) e.g. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=496125 http://lists.debian.org/debian-user/2008/08/msg01864.html


Right. This would apply if parson were built into a shared library and distributed as an .so. It is a possibility, though Debian's package dependency system exists exactly to safeguard against this sort of situation.


Thanks for pointing out those errors, I already fixed the one with the wrong conditional and will fix the realloc one in the next commit. I'm being explicit with casting, because g++ and vc++ will throw warnings without it.


Of course you'd get warnings compiling C code with a C++ compiler. C doesn't require casting from void* to another pointer type... it's one of its best features :)
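To make the point about void* concrete, here is a minimal sketch in plain C. The struct and function names are hypothetical and are not taken from parson. It shows an allocation with no cast, using sizeof on the dereferenced pointer so the size follows the variable's type.

```c
#include <stdlib.h>

/* Hypothetical struct, for illustration only; not parson's real layout. */
struct point {
    double x;
    double y;
};

/* In C, void* converts implicitly to any object pointer type, so the
 * result of malloc needs no cast. sizeof *p keeps the allocation size
 * tied to the type of p, even if that type changes later. */
struct point *point_new(double x, double y)
{
    struct point *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;   /* allocation failed; caller must check */
    p->x = x;
    p->y = y;
    return p;
}
```

The same source would be rejected by a C++ compiler unless the malloc result is cast, which is the trade-off being discussed in this subthread.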


Certain people would still like to compile this as a "CPP" file. Lua is an example: it's written in C, but it compiles as C++ too (people are then more or less strict about whether the function interfaces should be extern "C" or not).

Anyway, not that important... But there are a whole lot of people who like amalgamated sources (typical examples - juce, freetype2, lua, sqlite) where you include all the rest in one or two files and they compile fine.


Luckily for you, you can link object files produced by the C compiler with object files produced by the C++ compiler. Crisis averted.


"I would move json_*_t structs to the header file and eliminate a bunch of API functions that service the access to these structs. It's a C library after all, everyone knows how to shoot one's own leg."

I strongly disagree. The author did exactly the right thing by hiding the internal data representation behind accessor functions. This makes it easy for him to change the internal data representation in the future without breaking binary compatibility.

See the libabc README for a fuller explanation: https://git.kernel.org/?p=linux/kernel/git/kay/libabc.git;a=...

Summary: "Do not expose any complex structures in your API"

Unrelated: please don't be one of those people who typedef every struct. It's really ugly and obfuscatory (see Linus' remarks on this).
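For readers unfamiliar with the pattern this comment defends, here is a rough sketch of an opaque struct behind accessor functions. All names (jdoc, jdoc_new, and so on) are made up for illustration and are not parson's API. The header part and the implementation part are shown in one block only to keep the example self-contained.

```c
#include <stdlib.h>

/* ---- What would go in the public header (hypothetical names) ---- */
struct jdoc;                               /* incomplete type: layout is hidden */
struct jdoc *jdoc_new(int initial_count);
int jdoc_count(const struct jdoc *doc);
void jdoc_free(struct jdoc *doc);

/* ---- What would stay private in the .c file ---- */
struct jdoc {
    int count;   /* fields can be added or reordered without touching callers */
};

struct jdoc *jdoc_new(int initial_count)
{
    struct jdoc *doc = malloc(sizeof *doc);
    if (doc == NULL)
        return NULL;
    doc->count = initial_count;
    return doc;
}

/* Callers read state only through accessors, never through doc->count. */
int jdoc_count(const struct jdoc *doc)
{
    return doc->count;
}

void jdoc_free(struct jdoc *doc)
{
    free(doc);
}
```

Because callers only ever hold a pointer and never depend on the struct's size or layout, the private definition can change in a later release of a shared library without forcing dependent programs to be recompiled.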


Looking at the source, it seems it doesn't handle surrogate pairs, e.g.,

  ["\ud83d\ude04"]
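As a sketch of what handling this involves (not parson's code, and the function name is hypothetical): once both \uXXXX escapes have been read as 16-bit values, a high surrogate followed by a low surrogate combines into a single code point using the standard UTF-16 arithmetic.

```c
#include <stdint.h>

/* Combine a UTF-16 surrogate pair into one Unicode code point.
 * Returns 0 if hi/lo do not form a valid high/low surrogate pair
 * (0 can never be the result of a valid pair, so it is safe as an
 * error value here). */
uint32_t combine_surrogates(uint32_t hi, uint32_t lo)
{
    if (hi < 0xD800 || hi > 0xDBFF)
        return 0;   /* not a high surrogate */
    if (lo < 0xDC00 || lo > 0xDFFF)
        return 0;   /* not a low surrogate */
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
}
```

For the example above, combine_surrogates(0xD83D, 0xDE04) yields 0x1F604, which a parser would then encode as four UTF-8 bytes.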


You're right, I'll try to add support for them in the future.


Thanks, I've been looking for something like this for a project I have (https://github.com/caldwell/daemon-manager). I did a survey of all the JSON libs for C and C++ about a year ago and found reasons to dislike them all. This actually looks like a reasonable interface, the code itself looks good so far, and I love the fact that it is just a couple files I can drop in.


Just out of curiosity, was json-parser one of the ones you looked at? If so, what did you dislike about it?


No, I haven't seen that one (looks like you wrote it?). I was looking for them at the tail end of 2011 and it looks like json-parser came into existence in March 2012.

json-parser, at first glance, appears to be nice, too. The ones I looked at before were all multiple files and seemingly very complicated--either to use or to build/integrate with my code. I'll keep a pointer to this one too for when I get back on that project.


I've been using http://cjson.sourceforge.net/ for a while. Single file, reasonable interface. It lets you both serialize and de-serialize JSON. Was this a library you looked at when you did your survey?


How does this compare to jansson (http://www.digip.org/jansson/), aside from just being a parser?


Parson may be easier to embed in some projects and it has a different, smaller API.


+1 for jansson. It rocks.


What's the thing with saying "Only 2 files"? What if it is only 1 file, like bottle.py, but the 1 or 2 files contain 10,000 lines of code? I'm just saying.


Yeah, seriously. I just did some work with the s7 implementation of Scheme. "Only two files!"

In all fairness, I think the idea is that you should just drop it in your source without needing to worry about git subrepos (or whatever). But it would perhaps be more elegant to split it up in the repo and have a script stitch it together into the "only two files".

Also s7 turned out to be much easier to include than the other ones I was considering, so I guess I shouldn't be mocking it.


Well, as in the case with bottle, it's probably meant to imply that you can just use it in your project as files, not as a complete library. No need to complicate your Makefiles (or worry about your package framework).


Well, you beat json-parser on SLOC. :-)

Main differences are that you use recursion instead of a stack, and that you dynamically resize your buffers as you parse (whereas json-parser passes through the JSON twice so that it knows exactly how much memory to allocate).
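The two-pass idea can be illustrated with a toy example. This is only a sketch under strong assumptions (a flat, well-formed array of integers such as "[1, 2, 3]") with hypothetical function names; it is not code from json-parser or parson. The first pass counts elements, and the second allocates exactly once and fills the buffer.

```c
#include <stdlib.h>

/* Pass 1: count the integers in a flat array such as "[1, 2, 3]".
 * Assumes well-formed input that contains only integers. */
static size_t count_items(const char *text)
{
    size_t count = 0;
    const char *p = text;
    char *end;
    while (*p != '\0') {
        (void)strtol(p, &end, 10);
        if (end != p) {      /* a number was consumed */
            count++;
            p = end;
        } else {
            p++;             /* skip brackets and commas */
        }
    }
    return count;
}

/* Pass 2: allocate exactly the required size, then fill it.
 * Returns NULL (with *out_count == 0) on empty input or failed allocation. */
long *parse_int_array(const char *text, size_t *out_count)
{
    size_t n = count_items(text);
    size_t i = 0;
    const char *p = text;
    char *end;
    long *items;

    *out_count = 0;
    if (n == 0)
        return NULL;
    items = malloc(n * sizeof *items);
    if (items == NULL)
        return NULL;
    while (*p != '\0' && i < n) {
        long value = strtol(p, &end, 10);
        if (end != p) {
            items[i++] = value;
            p = end;
        } else {
            p++;
        }
    }
    *out_count = i;
    return items;
}
```

The trade-off is reading the input twice in exchange for a single exact allocation, versus reading once and calling realloc as the buffer grows.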


Hmm, looks complex compared to this one that I've been using called JSMN:

http://zserge.bitbucket.org/jsmn.html


Have you really been using jsmn for something or did you just recall the link?

IMHO jsmn doesn't do nearly enough with the JSON to make it usable by an application (it's basically a lexer).


jsmn is great if you use it as a lexer and you know that your input is well-formed. We use it for our games to read in our configuration, gamemodes and level files. We've just built a few wrapper functions around jsmn and called it "jsmx", and it suits our needs perfectly :-)

But you're right, if you want a JSON library to parse arbitrary JSON input, using a more complex library like this one is an easier and more viable route than using jsmn.


jsmn is more of a lexer and it doesn't support unicode.


That style of C programming makes me immediately think how it would be neater in C++. E.g. the front-page example: https://gist.github.com/3969351

Key differences:

* actual namespaces, no need to prefix everything with "JSON_"

* RAII-style resource management, no need for "json_value_free" etc

* objects, so instead of get_array(root_value) you have root_value.get_array()


> json::Array commits = root_value.get_array();

You probably meant:

  json::Array & commits = root_value.get_array();

Don't really want to copy an array there, right? After that you'll probably be tempted to use std::vector as a basis for json::array. Then you'll realize what an abysmal PoS default vector allocator is, and so you'd want to allow customizing it. Hello templates, and then before you know it, the whole thing not only looks like one of them incomprehensible boost beauties, but also compiles into a meg of code with a dozen dependencies. Neat :)


> After that you'll probably be tempted to use std::vector as a basis for json::array. Then you'll realize what an abysmal PoS default vector allocator is, and so you'd want to allow customizing it. Hello templates, and then before you know it, the whole thing not only looks like one of them incomprehensible boost beauties, but also compiles into a meg of code with a dozen dependencies. Neat :)

Writing good code means having the discipline to resist all sorts of temptations, such as the one you describe. And not all slippery slopes are actually all that slippery.

The one good thing about C++ is that you can pick the features you want and mostly ignore the rest. The reason C++ is so humungous is that everyone is using different subset of it.


> The reason C++ is so humungous is that everyone is using different subset of it.

I completely agree with this. Incidentally, this is exactly the reason why C++ libraries are far less useful than the C ones.


RAII would break if longjmp/setjmp are used. For example, if you had the code in C++ but with "C" externs (like zeromq), then RAII might bite someone unexpectedly, because longjmp does not unwind the stack and call the destructors.


You are calling realloc and assigning the result directly, without checking the return value first. realloc can fail and return NULL, in which case you overwrite the previous pointer and leak the memory it pointed to.

I don't see why you duplicated strndup in parson_strndup.

https://github.com/kgabis/parson/blob/master/parson.c#L129

parson_strdup can return NULL and this statement will fail silently.
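The realloc problem raised in this comment is usually avoided by storing the result in a temporary first. A minimal sketch follows, using a hypothetical growable int array rather than parson's actual structures.

```c
#include <stdlib.h>

/* Hypothetical growable array of ints; not parson's data structure. */
struct int_vec {
    int *items;
    size_t count;
    size_t capacity;
};

/* Appends value. Returns 0 on success, or -1 if realloc fails, in which
 * case vec is left unchanged and its existing buffer is still valid. */
int int_vec_push(struct int_vec *vec, int value)
{
    if (vec->count == vec->capacity) {
        size_t new_cap = vec->capacity ? vec->capacity * 2 : 4;
        /* Keep the old pointer until we know realloc succeeded. */
        int *tmp = realloc(vec->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return -1;          /* vec->items is untouched, nothing leaks */
        vec->items = tmp;
        vec->capacity = new_cap;
    }
    vec->items[vec->count++] = value;
    return 0;
}
```

Writing vec->items = realloc(vec->items, ...) directly would replace the only pointer to the old buffer with NULL on failure, which is the leak described above. The sketch ignores overflow of new_cap * sizeof *tmp for brevity.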


strndup is part of the "standard", but not present on some platforms unfortunately. :( I can't think of a modern C runtime that doesn't offer strdup, though.

The realloc check is good advice -- take heed, OP. :)


Realloc, strdup and strndup errors are handled as of now :) (https://github.com/kgabis/parson/commit/ee9be98974b3fdac0be1...) And yes, I've written my own strndup because it's "relatively new".
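For platforms that lack strndup, a fallback is short to write. The following is a sketch with a hypothetical name (my_strndup); it is not the parson_strndup from the linked commit, just one way such a helper could look with the NULL check included.

```c
#include <stdlib.h>
#include <string.h>

/* Copies at most n bytes of s into a freshly allocated, NUL-terminated
 * string. Returns NULL if allocation fails. The caller frees the result. */
char *my_strndup(const char *s, size_t n)
{
    size_t len = 0;
    char *copy;

    /* Find the length without reading past n bytes or the terminator. */
    while (len < n && s[len] != '\0')
        len++;

    copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, len);
    copy[len] = '\0';
    return copy;
}
```

Callers still have to check the result for NULL before using it, which is the silent-failure case mentioned earlier in this subthread.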


True, strndup is relatively new (~7 years) to the POSIX standard.


I'm aware of a few different JSON parsing libraries for C.

* Metaparadigm's "JSON-C" library (see https://github.com/json-c/json-c/wiki)

* Jansson (see http://www.digip.org/jansson/)

* yajl (see http://lloyd.github.com/yajl/)

All of those are licensed extremely permissively (BSD, ISC, etc). yajl seems kind of SAX-like, which I'm not really a big fan of, but I guess some people might like it. It would be good if you had a humungous JSON stream, for the same reasons SAX beats DOM in those cases.

json-c and Jansson seem pretty similar to me. Jansson seems to be more full-featured than json-c, with stuff like a json_equal method, a json_deep_copy method, and the ability to specify custom memory allocation functions. But for most use cases there isn't too much difference between the two.

So I guess my question is really what is the use case you're targeting? It would be nice to have some context about what you see as the niche you're filling.



