This is useful. I've been part of a team implementing webhooks in the past and there are a lot of difficult details you need to get right - things like responsible retries, authentication, thin-vs-fat hooks and server-side request forgery.
This document covered all of them. Here's the SSRF bit for example: https://github.com/standard-webhooks/standard-webhooks/blob/...
I wondered, "whose standard?", and was pleasantly surprised to find a list of real names [1] on the website. Nonetheless, I do bristle at the semantic weight of the name, given that it's not attached to any relevant existing institution.
[1]: https://www.standardwebhooks.com/#committee
idk man, no one bristles at the S&P 500.
? That's obviously just a name, it's not making a claim.
A good sign to me is that one of the steering members is Tom Hacohen, founder of Svix (webhooks-as-a-service). We're adopting them where I work and everything from them has been solid. I know he's seen a lot of different use cases and will have put good consideration into the schema they define in their spec file.
Hard agree, they're a great team who have more knowledge in this space than almost anyone. The spec is great and makes a ton of sense.
Thanks a lot Austin! All of us on the committee have a lot of first-hand experience with webhooks, which we then distilled into the spec. :P
I'm curious as to why webhooks are becoming a de facto standard for triggering events between isolated systems.
Why not have a dedicated event bus (Kafka, NATS, etc., though not limited to those) where remote systems connect to dedicated event queues? Push a message onto the queue, and it gets picked up by the remote system.
Authentication is handled by the event bus, which can also act as storage for message (re)delivery, partitioned by customer ID for separation of concerns, and so on.
Is there anything immediately obvious why this wouldn't be the preferred option, or is it just that HTTP is easier to implement across systems?
> dedicated event bus
A webhook is "peer-to-peer" and uses the existing HTTP infrastructure that's already in your application. Whereas an event bus is "centralized" and requires a third service to run/maintain/design. I'd like to avoid that if possible.
Also webhooks make a lot of sense for communication between organizations, which you can't do with a centralized event bus (unless someone is out there running a global event bus that I'm not aware of). Let's say I'm using Managed Service Foo hosted by a third party, and I want to trigger some event in my own system whenever certain things happen inside Managed Service Foo. How else do you expect me to receive those events? Surely a webhook is a lot easier than figuring out an event bus to be shared across our two organizations.
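To make the asymmetry concrete: receiving those events only requires an HTTP endpoint that accepts a POST, which you can stand up with nothing but the standard library. A minimal sketch (the payload shape here is made up for illustration):

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # events collected by the receiver

class WebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Read exactly Content-Length bytes of the request body
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        received.append(json.loads(body))
        self.send_response(200)  # a 2xx tells the sender not to retry
        self.end_headers()

    def log_message(self, *args):
        pass  # silence default per-request logging

# Port 0 asks the OS for any free port
server = HTTPServer(("127.0.0.1", 0), WebhookHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
```

Compare that with provisioning credentials, topics, and consumers on a shared bus spanning two organizations.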
This is also a really good summary of "what to think about while implementing webhooks" -- personally I wish we had thought of doing "thin" webhooks in an implementation I was a part of.
Question:
Why do so many webhooks use HMAC signatures for authorization?
For everything else in APIs, people are perfectly happy to use API tokens/secrets directly in headers.
Why don't webhooks directly share secrets, instead of HMAC signatures?
Like, I understand the advantages of HMAC, but for some reason it seems to be that webhooks are unique in their usage of it.
The only advantage is that it validates that the sender composed the message in the case where there's no shared secret (which is not what the article appears to advocate for).
A shared secret alone, or an HMAC based on a shared secret, just means any party with the secret -- which could include anyone who would need to verify it -- composed the message.
I generally don't do what's advocated for in the article because it doesn't make a lot of sense. I do one of the following instead:
- A shared secret
- A payload signed with an asymmetric key (optionally HMACed as well)
Although this ONLY holds if you're using HTTPS -- which is a separate thing, so maybe they're considering that you might not use HTTPS.
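For concreteness, here's a sketch of HMAC verification in the style the Standard Webhooks scheme describes: the signed content is `{id}.{timestamp}.{payload}`, the secret is base64-encoded behind a `whsec_` prefix, and the signature header carries space-separated `v1,<base64>` entries. Check the spec itself for the authoritative details.

```python
import base64
import hashlib
import hmac

def verify_signature(secret: str, msg_id: str, timestamp: str,
                     payload: bytes, signature_header: str) -> bool:
    """Verify an HMAC-SHA256 webhook signature (Standard Webhooks style)."""
    # Secret is base64 after the "whsec_" prefix
    key = base64.b64decode(secret.removeprefix("whsec_"))
    # Signed content is "{id}.{timestamp}.{payload}"
    signed_content = f"{msg_id}.{timestamp}.".encode() + payload
    expected = base64.b64encode(
        hmac.new(key, signed_content, hashlib.sha256).digest()
    ).decode()
    # Header may list several versioned signatures; accept any matching v1
    for candidate in signature_header.split():
        version, _, sig = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(sig, expected):
            return True
    return False
```

Including the id and timestamp in the signed content is what ties the signature to one delivery and bounds replay, which a bare token in a header doesn't give you.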
In my case I'm using mTLS and verifying the CN of the client cert. This is for an internal service. I'm also surprised the recommended headers don't include the event type. I found it beneficial to be able to route the event before parsing the body and without having to use different endpoints.
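That routing pattern can be sketched like this; note the `webhook-event-type` header name and the handlers are hypothetical (the spec's recommended headers don't define an event type):

```python
# Hypothetical handlers keyed by event type; in practice these would
# dispatch to queues, services, etc.
HANDLERS = {
    "invoice.paid": lambda body: ("billing", body),
    "user.created": lambda body: ("accounts", body),
}

def route(headers: dict, raw_body: bytes):
    """Pick a handler from the event-type header without parsing the body."""
    event_type = headers.get("webhook-event-type")
    handler = HANDLERS.get(event_type)
    if handler is None:
        # e.g. acknowledge and drop, or send to a dead-letter queue
        return ("unhandled", raw_body)
    return handler(raw_body)
```

A single endpoint plus header-based dispatch avoids both parsing every body up front and maintaining one URL per event type.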
Looks great.
What are people using to store and send retries?
Plain ol' SQL
as in, a database queue?
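A database-backed retry queue can be as simple as one table with an attempt counter and a next-attempt timestamp. A sqlite sketch (table and column names are made up for illustration):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE deliveries (
        id INTEGER PRIMARY KEY,
        url TEXT NOT NULL,
        payload TEXT NOT NULL,
        attempts INTEGER NOT NULL DEFAULT 0,
        next_attempt_at REAL NOT NULL
    )
""")

def enqueue(url: str, payload: str) -> None:
    conn.execute(
        "INSERT INTO deliveries (url, payload, next_attempt_at) VALUES (?, ?, ?)",
        (url, payload, time.time()),
    )

def due(now: float):
    """Rows whose next attempt time has arrived."""
    return conn.execute(
        "SELECT id, url, payload, attempts FROM deliveries "
        "WHERE next_attempt_at <= ? ORDER BY next_attempt_at",
        (now,),
    ).fetchall()

def mark_result(row_id: int, attempts: int, ok: bool, now: float) -> None:
    if ok or attempts + 1 >= 5:  # delivered, or give up after 5 tries
        conn.execute("DELETE FROM deliveries WHERE id = ?", (row_id,))
    else:  # exponential backoff: 2^n seconds until the next attempt
        conn.execute(
            "UPDATE deliveries SET attempts = ?, next_attempt_at = ? WHERE id = ?",
            (attempts + 1, now + 2 ** (attempts + 1), row_id),
        )
```

A worker loop just polls `due(time.time())`, attempts the HTTP POST, and calls `mark_result` with the outcome.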
> While this specification does not dictate the structure, or impose any requirements, on the shape, format, and content of the payload it does offer recommendations
Too loose to be a standard but better than nothing.
Yeah, we don't call it a standard for that reason, here's the tagline:
> Open source tools and guidelines for sending webhooks easily, securely and reliably
We have been going back and forth about this. We purposefully made Standard Webhooks more like guidelines than a formal specification (note there's no mention of RFC 2119, for example) so that it's easier to conform to without forcing implementations to make breaking changes, even if that means you don't get the full benefits.
I think people can get a lot of benefits even if they don't follow the whole thing, and it's our job to continue building tools to make it easier to build conforming implementations than non-conforming ones.
The signature format is just weird. Why not include it as part of the request headers?
Edit: It seems to also have a header version. Not sure why there's two different ways to pass a signature here.
It's only passed as a header. Where did you see the other way of passing it? We'll clarify the spec if confusing.
webhooks were a phase we should have passed by now, not something to entrench with a standard. they're a broken, obscure implementation of full-duplex communication between two systems, but thought up as if they were a luxury, auxiliary system whose downtime we should be able to tolerate. if we're required to survive downtime of a webhook system, i think we're right to ask: why are they there in the first place?
What should replace them?
what is your suggestion for duplex communication?
I'm not totally anti-webhook or the person you replied to, but I'd prefer at-most-once delivery via something that establishes a reusable connection (grpc or even websockets?) and backed by an events endpoint like Stripe's where the client can read everything that would have been sent. That way the client can replay all the events at leisure and retries aren't the server's responsibility.
> I'd prefer at-most-once delivery via something that establishes a reusable connection (grpc or even websockets?) and backed by an events endpoint like Stripe's where the client can read everything that would have been sent. That way the client can replay all the events at leisure and retries aren't the server's responsibility.
Isn't that basically a description of SSE (https://en.m.wikipedia.org/wiki/Server-sent_events)?
You could use SSE, long polling, or even a webhook. With the latter two you'd miss out on some of the performance gains of not needing to re-establish a connection (unless the producer does HTTP streaming), but the main stability points are not using webhooks as the sole means of delivery and not flushing events after they're delivered. So many webhook implementations don't go far enough: they just fling events at the consumer with a short-term retry policy, or none at all, and then don't provide a way to see which events were missed.
I mean, you can use anything if you just want a dumb pipe to put an ad-hoc data stream over.
My point is that SSE is already a standard to do exactly the thing you're asking for: publish a one-way event stream, with a built-in mechanism for reconnecting and telling the remote end the last event you received (so you can catch up on anything missed during the disconnected period).
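A simplified sketch of that mechanism: events carry `id:` fields, the client remembers the last one it saw, and on reconnect it sends that back in a `Last-Event-ID` header so the server can replay what was missed. (This parser skips some details of the real SSE spec, e.g. `event:` fields and last-id persistence across id-less events.)

```python
def parse_sse(stream: str):
    """Parse SSE-formatted text into (id, data) events, tracking the last id."""
    events, last_id = [], None
    data_lines, event_id = [], None
    for line in stream.splitlines():
        if line.startswith("id:"):
            event_id = line[3:].strip()
        elif line.startswith("data:"):
            data_lines.append(line[5:].strip())
        elif line == "" and data_lines:  # blank line dispatches the event
            events.append((event_id, "\n".join(data_lines)))
            if event_id is not None:
                last_id = event_id
            data_lines, event_id = [], None
    return events, last_id

def reconnect_headers(last_id):
    """Headers a client sends when re-establishing the stream."""
    return {"Last-Event-ID": last_id} if last_id else {}
```

The catch-up behavior on the server side is exactly the "don't flush events after delivery" point: the server has to retain events long enough to replay from any id a client might send back.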
Right, sure, I've never used it outside a browser but I guess it would work fine. My point is that A) no matter which standard you use for sending a stream of events it's probably better than a webhook, and B) it's still important to provide both a long retention window and good tools to query events for debugging no matter what you use.
I also prefer events over a persistent connection for efficiency, but webhooks are far better when the client is using a function as a service model.
Right: in this day and age there are an almost unlimited array of inexpensive, easy ways to spin up an HTTP endpoint that can receive a POST request.
We have a lot of experience scaling those kinds of endpoints (and making them redundant) too.
Spinning up an always-on server that can maintain a persistent connection (and reconnect automatically if it drops, or when the server is rebooted for an update, and so on) is a whole lot harder. Possible, but not nearly as easy.
My experience is that if your server doesn't retry webhooks, you could be doing something more efficient; and if it does retry webhooks, that indicates it's important not to miss anything, in which case you should use persistent events rather than relying on bug-prone "if there's a 200, drop the event from the DB" retries on the server.
I do think there's an argument for the interoperability of webhooks for integrating different services. I'm skeptical that they're the best choice from a purely technical perspective.