onderkalaci a year ago

The PG community had a similar patch, which got reverted from PG 15 at the last minute: https://www.depesz.com/2022/04/06/waiting-for-postgresql-15-...

  • kiwicopple a year ago

    this would be our preference. we'll try to support this for the next commitfest, and if it gets merged then we will deprecate our extension in favour of the native solution.

    • mkurz a year ago

      Is there a chance it gets committed during the current commitfest? See https://commitfest.postgresql.org/42/4086/

      This year's feature freeze will be on April 8th (https://www.postgresql.org/message-id/flat/9fbe60ec-fd1b-6ee...), so if it does not get committed within the next two and a half weeks it will miss this year's Postgres release in September...

      • kiwicopple a year ago

        I'll flag it with the team this week. I'm not sure what the blocker was previously, but it might just be a matter of submitting the patch again (with minor changes) so that it's "in" for commitfest, with someone willing to own the work over the next few months.

        • mkurz a year ago

          I think the things that needed to be fixed from last year are already committed (more general stuff not directly related to the JSON patches). Also, according to this message, at least parts of the JSON patches should be committed "...in the next few days..."; however, that was two weeks ago: https://www.postgresql.org/message-id/454db29b-7d81-c97a-bc1...

          I am a bit worried that, even though the patches seem to be "stable", they will miss the deadline... (since I would need those features as well)

          • kiwicopple a year ago

            I don't know any of the people involved in this patch, so I've sent it to Alexander Korotkov to get his opinion. I'll let you know his response after he has a chance to look at it.

            • kiwicopple a year ago

              Alexander's response:

              > This is a very long story, starting from 2017. This patch should finally be committed. Some preliminary infrastructure already landed in PostgreSQL 16. Regarding SQL/JSON itself, I doubt it will be committed to PostgreSQL 16, because feature freeze is coming soon. It's likely to be postponed to PostgreSQL 17.

              > Regarding replacement for pg_jsonschema, I don't think it will be a good replacement. Yes, one can construct a jsonpath expression which checks if particular items have particular data types. But I doubt that is nearly as convenient as jsonschema.

              It looks like there would still be some benefit to pg_jsonschema, unless the community decided they wanted to support jsonschema validation. We could propose this, but I don't think it would arrive in pg core any time soon.
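              For illustration, a jsonpath type check of the kind Alexander describes can be written against stock Postgres using jsonb_path_exists (a sketch; it checks one field's type, which is exactly why it doesn't scale to full schema validation):

              ```sql
              -- Does "age" exist and hold a number? Built-in jsonpath, no extension needed.
              SELECT jsonb_path_exists(
                  '{"name": "ada", "age": 36}'::jsonb,
                  '$.age ? (@.type() == "number")'
              );
              -- true here; if "age" were a string, the filter would reject it
              ```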

    • nextaccountic a year ago

      That JSON_TABLE feature looks pretty useful [0], but it seems complementary to pg_jsonschema. Can it actually be used to validate a json prior to inserting it into the database?

      [0] For my use case there is a problem: I'm representing a sum type (like Rust enums or Haskell ADTs) in SQL tables. Often you have a "tag" field that specifies which variant the data encodes, and each variant has its own properties. When representing this as a table, you usually add the fields of all variants as columns, setting non-null only the fields belonging to each row's variant. The reason I insert JSONs into the database is really just to represent sum types in a better way.
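      To make the table-based encoding concrete, here's a sketch (all names hypothetical) of a sum type flattened into one wide table, with a CHECK constraint enforcing that only the active variant's columns are non-null:

      ```sql
      CREATE TABLE shape (
          id     bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
          tag    text NOT NULL,   -- which variant this row encodes
          radius numeric,         -- "circle" variant only
          width  numeric,         -- "rect" variant only
          height numeric,         -- "rect" variant only
          CHECK (
              (tag = 'circle' AND radius IS NOT NULL AND width IS NULL AND height IS NULL)
           OR (tag = 'rect'   AND radius IS NULL AND width IS NOT NULL AND height IS NOT NULL)
          )
      );
      ```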

    • debugdog a year ago

      Well written article! Also very glad to hear about your plan to support the native implementation. For all of our projects, when we're integrating external services we usually keep the relevant original JSON responses in a jsonb column as a kind of backup. Alongside that, we extract the data we'll be using into queryable columns. Being able to use those "dumps" directly would be a nice thing to have.

  • nextaccountic a year ago

    How can this be used to validate JSON? (and prevent invalid json from being inserted into the database in the first place). I don't see "jsonschema" mentioned in that post.

    Perhaps Postgres could support jsonschema validation natively.

boomskats a year ago

This is long overdue and elegantly done. Great work!

On a personal note, it's great to finally see Neon & Supabase playing with each other. Much more interesting to me than Hasura.

  • endisneigh a year ago

    There’s nothing stopping you from using this with Hasura. Don’t see the relevance.

  • tough a year ago

    The power of FOSS

    • thangngoc89 a year ago

      Hasura is also FOSS

      • zwily a year ago

        Hasura is open core. Given the massive price increase they just did for their hosted version, I’d expect more and more future features will not land in core at all, to push people to pay.

        • klabb3 a year ago

          It’s already quite bad unfortunately. Support for both read replicas and metrics(!) is missing from the open source version. They have a Prometheus exporter, but not on the free tier, afaik.

          I was also concerned by reports of memory consumption/leak issues, as I’m planning to have lots of subscriptions. I don’t know Haskell well enough, but from the outside it does match the symptoms of having dug themselves into an architectural complexity hole, which I assume is much harder to navigate with unorthodox tooling and practices.

          I'm still rooting for Hasura; I like their way of making things dumb and simple, without too much technophilia for its own sake. They genuinely want you to focus on your business problems instead of blasting you with novel and overengineered concepts, like others do.

          • tango12 a year ago

            (from Hasura)

            Scaling subscriptions is hard, but we work with our users/customers at scale to make sure settings are tweaked correctly.

            We have users running 100k-1M concurrent users in production for live-event type platforms. It's not completely trivial to benchmark and set up, because query patterns, streaming vs live queries, etc. have an impact, but it works very reliably. No missing events, no problems disconnecting/reconnecting, no need for sticky sessions, and so on.

            An initial POC benchmark [1] should be a quick affair so if you're trying it out and run into any problems, please hit me up! Email on my bio.

            [1]: https://github.com/hasura/graphql-bench

            • klabb3 a year ago

              Thanks, that looks reassuring. I will share my own benchmarks too if I get around to it.

          • zwily a year ago

            Yeah, I love the product. I’ve been using it a few years in a fairly complicated internal tool used by a dozen people, and it’s been amazing. But I’m pretty discouraged by what I see as its future.

        • tough a year ago

          I have not run Supabase on my own infra, but AFAIK they let you do so, and they're quite good citizens, publishing all the tools they build on top of pg with sane licenses, as far as I remember.

          I love Supabase. Neon is newer but a great alternative for hosted serverless databases; they also shipped a great staging-db-for-PRs feature when they launched, which we integrated at work quite early while it was still in beta, and it saved us a lot of headaches when introducing new features that touched the database.

        • tango12 a year ago

          Hasura CEO here. Totally hear you.

          We're working on a new serverless infrastructure layer that'll make the pricing better for users compared to a DIY API server or to a self-hosted Hasura.

          It's a significant engineering lift on our side - at its core we're engineering Hasura to achieve 90%+ infrastructure utilization (no cold-start, sub-millisecond auto-scaling), and that's what will allow us to do this.

          Not there yet, but we'll be demo-ing and talking about the engineering at HasuraCon in June!

Pxtl a year ago

I've tried to use JSON Schema, and the big gap I couldn't figure out was how to handle polymorphism.

That is, if I have a case where object

    {
        "Foo": {
            "Type":"Fred",
            "Bar":1,
            "Baz":2,
            "Quux":2
        }
    }
and the object

    {
        "Foo": {
            "Type":"Waldo",
            "Bar":1,
            "Corge":2,
            "Xyzzy":7
        }
    }
are both valid, without just allowing any object members or allowing the union of their members.

I did a hack by multiplexing the types into a child-object, but that was ugly and clumsy.

In XSD or any statically-typed programming language I could handle this trivially using types and polymorphism, because "Fred" and "Waldo" would be different types.

But I can't figure out how to do that in JSON Schema.

  • mullsork a year ago

    IIRC that's what oneOf is for. i.e. a discriminated union / sum type. My experience with oneOf is that tooling support for it is terrible.
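    For the two objects in the parent comment, a oneOf schema with a constant "Type" discriminator might look roughly like this (a sketch in draft-07 style):

    ```json
    {
      "type": "object",
      "properties": {
        "Foo": {
          "oneOf": [
            {
              "type": "object",
              "properties": {
                "Type": { "const": "Fred" },
                "Bar": { "type": "integer" },
                "Baz": { "type": "integer" },
                "Quux": { "type": "integer" }
              },
              "required": ["Type", "Bar", "Baz", "Quux"],
              "additionalProperties": false
            },
            {
              "type": "object",
              "properties": {
                "Type": { "const": "Waldo" },
                "Bar": { "type": "integer" },
                "Corge": { "type": "integer" },
                "Xyzzy": { "type": "integer" }
              },
              "required": ["Type", "Bar", "Corge", "Xyzzy"],
              "additionalProperties": false
            }
          ]
        }
      },
      "required": ["Foo"]
    }
    ```

    The "const" discriminator makes validators pick exactly one branch, and "additionalProperties": false prevents the union-of-all-members problem.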

    • Pxtl a year ago

      Ah, I didn't realize you could use oneOf with refs or objects, I've only ever seen it used with primitives. Thanks!

mehdix a year ago

The article is also a joy to read. Well done.

hhthrowaway1230 a year ago

Ah yeah, super nice. Always wanted schema support in Postgres. Love to see them hammering away on top of its solid foundation.

Kydlaw a year ago

> 2022-08-19

Has anyone tried it? Any feedback on it?

  • wendyshu a year ago

    It works just fine but it will only give a boolean value valid/invalid, it won't tell you why it's invalid.
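    For context, typical usage is a CHECK constraint along these lines (function name per the extension's README; treat the details as a sketch):

    ```sql
    -- pg_jsonschema: a failing row raises a generic check_violation,
    -- with no pointer to which part of the document was invalid.
    CREATE TABLE customer (
        id   serial PRIMARY KEY,
        meta jsonb,
        CHECK (jsonb_matches_schema(
            '{"type": "object", "properties": {"tags": {"type": "array"}}}',
            meta
        ))
    );
    ```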

korijn a year ago

So how do you apply this in the real world when dealing with schema versions and database migrations?

  • dtech a year ago

    You could add a schema_version column and use one constraint per schema version. You'd only need this per major version; if a schema change is backwards compatible there's no problem.
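    A sketch of that approach (hypothetical table/column names, assuming pg_jsonschema's check function):

    ```sql
    ALTER TABLE events ADD COLUMN schema_version int NOT NULL DEFAULT 1;

    -- Each row is validated against the schema version it declares.
    ALTER TABLE events ADD CONSTRAINT payload_valid CHECK (
        (schema_version = 1 AND jsonb_matches_schema('...v1 schema here...', payload))
     OR (schema_version = 2 AND jsonb_matches_schema('...v2 schema here...', payload))
    );
    ```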

  • willmeyers a year ago

    Add a new constraint that supports the new schema version, migrate json columns to conform to new schema as need be, and then drop old constraint. This assumes your schema is backwards compatible. If it isn't I would probably not use a DB constraint and just handle everything on the application layer.
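    Sketched as a migration (hypothetical names, assuming pg_jsonschema's check function; NOT VALID lets the new constraint be added before old rows are fixed up):

    ```sql
    -- 1. Add the new-schema constraint, skipping validation of existing rows.
    ALTER TABLE docs ADD CONSTRAINT body_matches_v2 CHECK (
        jsonb_matches_schema('...v2 schema here...', body)
    ) NOT VALID;

    -- 2. Migrate existing rows to conform to the new schema as needed.
    UPDATE docs
    SET body = jsonb_set(body, '{version}', '2')
    WHERE body->>'version' = '1';

    -- 3. Check all rows now pass, then retire the old constraint.
    ALTER TABLE docs VALIDATE CONSTRAINT body_matches_v2;
    ALTER TABLE docs DROP CONSTRAINT body_matches_v1;
    ```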

    • alexvoda a year ago

      If the schema is not backwards compatible it either means you will have to do a data migration or it means you want to keep track of the schema version along with the data. In both cases it is much easier if the json columns are split off into a separate table.

    • korijn a year ago

      Thanks, that makes sense.

anonu a year ago

Well written article, and it looks like a great extension! However, my only issue with JSON Schema is that it becomes unwieldy once your JSON objects get too big: too many keys or too many items in an array, for example. If you are looking to find "where" the issue is in your JSON object, most schema validators don't provide enough guidance; they just say "something is wrong", as this one appears to do.

marwis a year ago

The error detail from the example is pretty awful and missing any useful information. Is it a limitation of the jsonschema library?

naedish a year ago

This is interesting. Would be curious to see if this can replace pydantic for specific cases.

  • kkirsche a year ago

    With pydantic’s funding, it wouldn’t surprise me if they enter this space themselves (the company side not the existing python library side)

    Funding Ref: https://pydantic.dev/announcement/

rapfaria a year ago

Wouldn't you wanna save the json even if the structure is incorrect?

  • rco8786 a year ago

    If you want that behavior, just don't use json schema.

roenxi a year ago

I can see some technical advantages to supporting JSON schema directly; but I suspect most people will be using this extension because SQL is really ugly and they don't want to use it to set up their schemas in pg proper.

It says a lot about how weak the SQL syntax is. An extension to replace CREATE TABLE with a JSON schema construct would be wildly popular.