j1elo 13 days ago

I'll take the chance to bring attention to the maintenance issues that 'jq' has been having in recent years [1]; there hasn't been a new release since 2018, which IMO wouldn't necessarily be a bad thing, if not for the fact that the main branch has been collecting improvements and bug fixes [2] since then.

A group of motivated users is currently discussing what direction to take; a fork is being considered in order to unblock new development and bug fixes [3]. Maybe someone reading this is able and willing to join their efforts.

[1]: https://github.com/stedolan/jq/issues/2305

[2]: https://github.com/stedolan/jq/pull/1697

[3]: https://github.com/stedolan/jq/issues/2550

  • capableweb 13 days ago

    What exactly is missing/broken in jq right now that warrants a fork? I've been using jq daily for years; I can't remember the last time I hit a bug (it must have been many years ago), and I can't recall any features I've felt were missing in all the years I've been using it.

    For me it's kind of done. It could be faster, but when speed matters I tend to program a solution myself instead; otherwise I feel like it's Done Enough.

    • j1elo 12 days ago

      I wouldn't say I need the program to grow with more features, but at the bare minimum they should have been more diligent about cutting releases after accepting bug fixes, instead of letting those contributions languish on the main development branch, out of reach for users.

      I mean, it would be understandable if the maintainers didn't have the time to keep working on it at all, but clearly the review work was done to accept some patches, so why not cut point releases to let the fixed code reach users via their distribution's channels?

    • Calzifer 13 days ago

      What I miss from jq, and what is implemented but unreleased, is platform-independent line delimiters.

      jq on Windows produces \r\n-terminated lines, which can be annoying when used from Cygwin / MSYS2 / WSL. The '--binary' option to not convert line delimiters is one of those pending improvements.

      https://github.com/stedolan/jq/commit/0dab2b18d73e561f511801...

      • orev 12 days ago

        You’ll have a much better experience in Cygwin/MSYS2/WSL if you treat them like isolated environments and don't call programs from outside of them. If you want to use ‘jq’ (or any tool) within Cygwin, install the Cygwin package. If you rely on the Windows install, you're guaranteed to run into problems like this.

    • goranmoomin 13 days ago

      > What exactly is missing/broken in jq right now which warrants a fork

      AFAIK there are quite a few bug fixes and features that have accumulated on the unreleased main branch, or that were opened as PRs but never merged.

      IIRC I hit one of the bugs while trying to check whether an input document is valid JSON.

      I should check out what's happening with the fork. I've never opened a PR or anything, but I've read the source while trying to understand the jq language conceptually, and I'd say it's quite elegant :)

    • strunz 12 days ago

      The README for jj points out how it is dramatically faster than jq. Presumably some of those improvements would help here.

    • jjoonathan 12 days ago

      > It could be faster

      A decaffeinated sloth could be faster.

cristoperb 13 days ago

I like jq, but jj is so fast it's my go-to for pretty-printing large JSON blobs. Its parsing engine is available as a standalone Go module, and I've used it in a few projects where I needed faster parsing than encoding/json:

https://github.com/tidwall/gjson

  • pcthrowaway 13 days ago

    I don't think I've ever been limited by jq's speed, but good to know there are alternatives if it ever becomes a bottleneck.

    Other than that I can't think of a reason to use this over jq; the query language is perhaps a bit more forgiving in some ways, but not as expressive as jq's (and I've spent ~8 years getting pretty familiar with jq's quirks).

    • lathiat 13 days ago

      The limiting speed factor of jq for me is, by far, figuring out how to write the expression I need to parse a fairly small amount of data. I do a bunch of support analysis, and I'm often writing a one-liner for a shell script to extract some bit of JSON to reuse later in the script. Often this will be used only once, by me or a customer, to run some task.

      Followed closely by figuring out the path to the part of the data I'm interested in. "gron" has been a real time saver there: it converts the JSON into single lines of key/value, so you can grep for any string and find its full path.
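
      For example, with a made-up document:

          $ echo '{"data":{"users":[{"id":7,"name":"Ann"}]}}' | gron | grep Ann
          json.data.users[0].name = "Ann";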

      Switching to a GUI that would let me browse the JSON and copy the path to the current value would probably also help there, but I'm usually in the terminal doing a bunch of different tasks, looking through all manner of command outputs, logs, etc. :)

      Relatedly, my primary use of ChatGPT has been asking it to write jq queries for me; it's not too bad at getting close. Its biggest blind spot seems to be key names containing a dash, which you have to write as ["key-name"].
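
      For example, a dashed key trips up the obvious syntax:

          $ echo '{"key-name": 1}' | jq '.key-name'      # fails: parsed as ".key - name"
          $ echo '{"key-name": 1}' | jq '.["key-name"]'
          1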

      • pdimitar 13 days ago

        > Switching to a GUI to browse the JSON that would let you copy the path to the current value would probably also help there

        Try https://jless.io/ then.

      • Simran-B 13 days ago

        I agree that figuring out non-trivial jq expressions takes a lot of time, often accompanied by a consultation of the somewhat lacking docs and some additional googling.

        Nonetheless, it is pretty slow at processing data. For example, converting a 1 GB JSON array of objects to JSON Lines takes ages, if it works at all. Using the streaming features helps; that gets memory consumption under control and doesn't take super long, but they are hard to comprehend, and it still takes way too long for such a trivial task IMO.
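
        For the record, the streaming incantation for the array-to-JSON-Lines case is something like this (quoting the jq manual's idiom from memory, so treat it as a sketch):

            $ jq -cn --stream 'fromstream(1|truncate_stream(inputs))' big.json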

      • bobnamob 13 days ago

        I’m far more likely to parse JSON in a Clojure REPL session and go from there these days. Learning jq for the odd JSON manipulation I need to do seems like overkill.

        • dicknuckle 12 days ago

          For me it's usually some automation task: gathering a list of IDs from some cloud environment to build infra things.

      • Dobbs 13 days ago

        > Switching to a GUI to browse the JSON that would let you copy the path to the current value would probably also help there

        I use an app called OK JSON on the Mac for this. It's okay.

      • AeroNotix 13 days ago

        emacs has a command to get the current path at point.

        • pdimitar 12 days ago

          Which one is it exactly, please? I'd like to use it.

qhwudbebd 13 days ago

Interesting! I tend to use gron to bring JSON into (and out of) the line-based bailiwick of sed and awk, where I'm most comfortable, rather than learning a custom query language like jq that I'd use much more rarely. But I guess that's at the opposite extreme of (in)efficiency from both this and the original jq.

There might be a nice 'edit just this path in-place in gron-style' recipe to be had out of jj/jq + gron together...
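
Something like this already gets most of the way there, with sed standing in for the editing step (made-up file and key):

    $ gron config.json | sed 's/^json.port = 8080;$/json.port = 9090;/' | gron -u > config.new.json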

  • qhwudbebd 13 days ago

    Are there any gron-like tools for XML? I'm aware it's a harder problem (and an increasingly rare one), but perhaps someone has tackled it nonetheless?

  • robertlagrant 13 days ago

    Just looked up gron - thanks. This looks useful.

maleldil 13 days ago

Am I correct in understanding that this can only manipulate (get or set) values at a JSON path? That is, it's not a replacement for jq?

For example, I frequently use jq for queries like this:

    jq '.data | map(select(.age <= 25))' input.json

Or this:

    jq '.data | map(.country) | sort[]' input.json | uniq -c

Is it possible to do something similar with this tool?

This is not a slight at jj. Even if it's more limited than jq, it's still of great value if it's faster or more ergonomic for a subset of cases. I'm just trying to understand how it fits in my toolbox.

  • TymekDev 10 days ago

    It looks like the README in the jj repository doesn't do the available query syntax justice. jj uses gjson (by the same author) and its syntax [0]. From what I saw, the first one can be handled with:

        jj 'data.#(age<=25)#' -i input.json
    
    I don't think there is a way to sort an array, though (there is an option to have keys sorted, however). Personally, I don't find that much of an annoyance: one could just pipe jj output to `sort | uniq -c`.
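
    For the country query, the closest thing might be (untested):

        jj 'data.#.country' -i input.json

    That returns a single JSON array rather than one value per line, though, so you'd still need to split it up before `sort | uniq -c`.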

    I just discovered that gjson supports custom modifiers [1]. So technically, you could fork jj, add another file registering a `@sort` modifier via `gjson.AddModifier`, and have a custom jj version that supports sorting.
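
    A rough sketch of what that could look like (untested; handles arrays of strings only):

        package main

        import (
            "encoding/json"
            "fmt"
            "sort"

            "github.com/tidwall/gjson"
        )

        func main() {
            // Hypothetical @sort modifier: sorts a JSON array of strings.
            gjson.AddModifier("sort", func(jsonStr, arg string) string {
                var items []string
                if err := json.Unmarshal([]byte(jsonStr), &items); err != nil {
                    return jsonStr // not a string array; pass through unchanged
                }
                sort.Strings(items)
                out, _ := json.Marshal(items)
                return string(out)
            })

            fmt.Println(gjson.Get(`{"countries":["Sweden","Norway"]}`, "countries|@sort"))
            // Output: ["Norway","Sweden"]
        }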

    [0]: https://github.com/tidwall/gjson/blob/master/SYNTAX.md

    [1]: https://github.com/tidwall/gjson/blob/master/SYNTAX.md#modif...

  • zimpenfish 12 days ago

    Annoyingly, I think `jq` might still be the only tool capable of these kinds of things. The rest seem to be "query simple paths and print the result" (which is handy, of course; I often use `gron` to get an idea of the keys I'm after, because the linear format is easier to handle than JSON).

harisamin 12 days ago

A while ago I wrote jlq, a utility explicitly for querying/filtering JSONL/JSON log files. It's powered by SQLite. A nice advantage is that it can persist results to a SQLite database for later inspection, or to pass around. Hope it helps someone :)

https://github.com/hamin/jlq

wvh 12 days ago

I've been using the gjson (get) and sjson (set) libraries this is based on for many years in Go code, to avoid deserialising JSON responses. Those libraries act on a byte array and can get only the value(s) you want, without creating structs and other objects all over the place, giving you a speed bump and fewer allocations if all you need is a simple value. It's been working well.
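
A minimal sketch of the pattern (hypothetical payload; gjson to read, sjson to write):

    package main

    import (
        "fmt"

        "github.com/tidwall/gjson"
        "github.com/tidwall/sjson"
    )

    func main() {
        body := []byte(`{"user":{"id":42,"name":"Tom"}}`) // e.g. an HTTP response body

        // Read a single value straight from the byte slice: no structs, no full decode.
        fmt.Println(gjson.GetBytes(body, "user.id").Int()) // 42

        // Write a value the same way; sjson returns an updated byte slice.
        updated, err := sjson.SetBytes(body, "user.name", "Smith")
        if err != nil {
            panic(err)
        }
        fmt.Println(string(updated)) // {"user":{"id":42,"name":"Smith"}}
    }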

This program could be an alternative to jq for simple uses.

BiteCode_dev 14 days ago

For those wondering, the README states it's a lot faster than jq, which may be the selling point.

  • nigeltao 13 days ago

    jj is faster than jq.

    However, jsonptr is even faster and also runs in a self-imposed SECCOMP_MODE_STRICT sandbox (very secure; also implies no dynamically allocated memory).

      $ time cat citylots.json | jq -cM .features[10000].properties.LOT_NUM
      "091"
      real  0m4.844s
      
      $ time cat citylots.json | jj -r features.10000.properties.LOT_NUM
      "091"
      real  0m0.210s
    
      $ time cat citylots.json | jsonptr -q=/features/10000/properties/LOT_NUM
      "091"
      real  0m0.040s
    
    jsonptr's query format is RFC 6901 (JSON Pointer). More details are at https://nigeltao.github.io/blog/2020/jsonptr.html

    • zokier 13 days ago

      Looks neat. One suggestion: add better build instructions to the wuffs README / getting-started guide. I jumped in and tried to build it using the "build-all.sh" script, which seemed convenient, but gave up (for now) after the nth build failure due to yet another missing dependency. It's extra painful because build-all.sh is slow, so maybe also consider a proper build automation tool (seeing as this is a Google project, maybe Bazel?).

      • nigeltao 12 days ago

        Thanks for the feedback. I'll add better build instructions.

        If you just want the jsonptr program, instead of everything in the repo (the Wuffs compiler (written in Go), the Wuffs standard library (written in Wuffs), tests and benchmarks (written in C/C++), etc) then you can use "build-example.sh" instead of "build-all.sh".

          ./build-example.sh example/jsonptr
        
        For example/jsonptr, that should work "out of the box", with no dependencies required (other than a C++ compiler). For e.g. example/sdl-imageviewer, you'll also need the SDL library.

        Alternatively, you could just invoke g++ directly, as described at the very top of the "More details are at [link]" page in the grand-parent comment.

          $ git clone https://github.com/google/wuffs.git
          $ g++ -O3 -Wall wuffs/example/jsonptr/jsonptr.cc -o my-jsonptr

  • rektide 13 days ago

    Presumably the memory footprint is often far less too.

Willuminaughty 12 days ago

Hey there,

Just wanted to drop a quick note to say how much I'm loving jj. This tool is seriously a game-changer for dealing with JSON from the command line. It's super easy to use and the syntax is a no-brainer.

The fact that jj is a single binary with no dependencies is just the cherry on top. It's so handy to be able to take it with me wherever I go and plug it into whatever I'm working on.

And props to you for the docs - they're really well put together and made it a breeze to get up and running.

Keep up the awesome work! Can't wait to see where you take jj next.

Cheers

Rygian 13 days ago

This behaviour looks confusing to me:

    $ echo '{"name":{"first":"Tom","middle":"null","last":"Smith"}}' | jj name.middle
    null

    $ echo '{"name":{"first":"Tom","last":"Smith"}}' | jj name.middle
    null

It can be avoided with option '-r', which should be the default, but is not.

  • planede 13 days ago

    I don't get this behavior for your second command; it just seems to return an empty string.

    edit:

    There are three cases to cover:

    1. The value at the path exists and is not null.

    2. The value at the path exists and is null.

    3. The value at the path doesn't exist.

    jj seems to potentially conflate 1 and 2 without the -r flag ("middle": "null" and "middle": null, more specifically). It probably conflates "middle": "" and a missing value as well; that's 1 and 3.
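
    With -r the string value keeps its quotes, so at least 1 and 2 become distinguishable (inferring -r's behaviour from the examples elsewhere in this thread):

        $ echo '{"middle":"null"}' | jj -r middle
        "null"
        $ echo '{"middle":null}' | jj -r middle
        null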

asadm 13 days ago

I wish this existed when I was trying to look at a 20G Firebase database JSON dump.

  • vmfunction 13 days ago

    That is what gets me: why did the file get to 20 GB? At that point, just ship a SQLite file.

    • capableweb 13 days ago

      Does it matter why? Sometimes files get big, and you don't control the generation, or changing the generation is a bigger task than just dealing with a "big" (I'd argue 20GB isn't that big anyway) file with standard tools.

      • notorandit 12 days ago

        Nope, it matters a lot! Unstructured, unindexed files usually get that big as the result of some design flaw.

notorandit 13 days ago

Interesting. How often do you manipulate a 1+MB JSON file? Maybe I am wrong, but going from 0.01s to 0.001s doesn't motivate me to switch to jj.

  • untech 12 days ago

    In my field (NLP), datasets are often stored in (sometimes gzipped) JSON Lines format. File sizes can reach 100s of GBs.

    • notorandit 12 days ago

      100s of GBs?

      In those cases, querying un-indexed files seems quite a thinko. Even if you can fit it all in RAM.

      If you only scan that monstrous file sequentially, then you don't need jq or jj or any other "powerful" tool. Just read/write it sequentially.

      If you need to make complex scans and queries, I suspect a database is better suited.

      • untech 10 days ago

        Usually you do indeed scan the file sequentially, doing some filtering/transformation. Since you run that transformation on every record, the speed of the tool used (e.g. jq) really matters.

        Databases are not used in this case because they're a complexity overhead compared to plain-text files. The ability to use Unix pipelines and tools (such as grep) is a bonus.
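
        The typical shape of such a pipeline, with made-up field names:

            $ zcat data.jsonl.gz | jq -c 'select(.lang == "en") | {id, text}' > en.jsonl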

Self-Perfection 13 days ago

I would like to see a comparison with jshon. jshon is way faster than jq and has been available in distro repositories for many years.

  • Alifatisk 13 days ago

    Cool, I didn’t know about jshon. How's the query language?

    • Self-Perfection 13 days ago

      Almost non-existent. A couple of excerpts from the man page:

        {"a":1,"b":[true,false,null,"str"],"c":{"d":4,"e":5}}
        jshon [actions] < sample.json
        jshon -e c -> {"d":4,"e":5}
        jshon -e c -e d -u -p -e e -u -> 4 5
      
      Yet this covers like ~50% of possible use cases for jq.