agwa a day ago

Sunlight and static-ct-api are a breath of fresh air in the CT log space. Traditional CT log implementations were built on databases (because that's the easiest way to implement the old API) and were over-complicated due to a misplaced desire for high write availability. This made operating a CT log difficult and expensive (some operators were spending upwards of $100k/year). Consequently, there has been a rash of CT log failures and few organizations willing to run logs. I'm extremely excited by how Sunlight and static-ct-api are changing this.

torbid a day ago

These sound like good improvements, but I still don't really get why the CT log server is responsible for storage at all (as a 3rd-party entity).

Couldn't it just be responsible for its own key and signing incremental advances to a log that all publishers are responsible for storing up to their latest submission to it?

If it needed to restart and some last publisher couldn't give it its latest entries, well, they would deserve that rollback to the last publish from a good publisher.

  • singron a day ago

    The publishers can't entirely do the storage themselves, since the whole point of CT is that they can't retract anything. If they did their own storage, they could roll back any change. Even if the log forms a verification chain, they could do a rollback shortly after issuing a certificate without arousing too much suspicion.

    Maybe there is an acceptable way to shift long-term storage to CAs while using CT verifiers only for short term storage? E.g. they keep track of their last 30 days of signatures for a CA, which can then get cross-verified by other verifiers in that timeframe.

    The storage requirements don't seem that bad though and it might not be worth any reduced redundancy and increased complexity for a different storage scheme. E.g. what keeps me from doing this is the >1Gbps and >1 pager requirements.

    • NoahZuniga 14 hours ago

      > Even if the log forms a verification chain, they could do a rollback shortly after issuing a certificate without arousing too much suspicion.

      This is not true. A rollback is instantly noticeable (because the consistency of Signed Tree Heads can not be demonstrated) and is a very large failure of the log. What could happen is that a log issues a Signed Certificate Timestamp that can be used to show browsers that the cert is in the log, but never incorporates said cert in the log. This is less obvious, but doing this maliciously isn't really going to achieve much, because all certs have to be logged in at least 2 logs to be accepted by browsers.
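      To make the rollback-detection point concrete: every Signed Tree Head commits to a tree size and a root hash, so a monitor that remembers the last STH it saw immediately notices when a log shrinks. A minimal sketch, using RFC 6962 hashing but with illustrative monitor logic and entry names of my own:

```python
import hashlib

def leaf_hash(entry: bytes) -> bytes:
    # RFC 6962: a leaf hash is SHA-256(0x00 || entry)
    return hashlib.sha256(b"\x00" + entry).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    # RFC 6962: an interior node is SHA-256(0x01 || left || right)
    return hashlib.sha256(b"\x01" + left + right).digest()

def tree_head(entries):
    # Merkle Tree Hash over the entries, per RFC 6962 section 2.1
    n = len(entries)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(entries[0])
    # split at the largest power of two smaller than n
    k = 1
    while k * 2 < n:
        k *= 2
    return node_hash(tree_head(entries[:k]), tree_head(entries[k:]))

def monitor_check(old_sth, new_sth):
    """old_sth/new_sth are (tree_size, root_hash) pairs the monitor stored."""
    old_size, _ = old_sth
    new_size, _ = new_sth
    if new_size < old_size:
        return "rollback: log shrank"  # instantly detectable
    return "ok (still need a consistency proof for the overlap)"

log = [b"cert-%d" % i for i in range(6)]
sth_v1 = (len(log), tree_head(log))
rolled_back = log[:4]  # log operator silently drops the last two entries
sth_v2 = (len(rolled_back), tree_head(rolled_back))
print(monitor_check(sth_v1, sth_v2))  # → rollback: log shrank
```

      A real monitor would additionally verify a consistency proof between the two tree heads whenever the size grows, which is what makes rewriting old entries (rather than just truncating) detectable too.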

      > Maybe there is an acceptable way to shift long-term storage to CAs while using CT verifiers only for short term storage? E.g. they keep track of their last 30 days of signatures for a CA, which can then get cross-verified by other verifiers in that timeframe.

      An important source of stress in the PKI community is that there are many CAs, and a significant portion of them don't really want the system to be secure. (Their processes are of course perfect, so all this certificate logging is just them being pestered). Browser operators (and other cert users) do want the system to be secure.

      An important design goal for CT was that it would require very little extra effort from CAs (and this drove many compromises). Google and other members of the CA/Browser Forum would rather spend their goodwill on things that make the system more secure (i.e. shorter certificate lifetimes) than on getting CAs to pay for operating costs of CT logs. The cost for Google to host a CT log is very little.

    • torbid a day ago

      If CAs have to share CT logs, and have to save everything the CT log would save up to their latest submission, then no CA can destroy the log without colluding with other CAs.

      (I.e., your log ends abruptly, but polling any other CA that published to the same CT log shows there is more, including the reasons to shut you down.)

      I don't see how a scheme where the CT signer has this responsibility makes any sense. If they stop operating because they are sick of it, all the CAs involved are left with a somewhat suspicious-looking CT history on things already issued that has to be explained, instead of having always had the responsibility to provide the history up to anything they have signed, whether or not some CT log goes away.

  • michaelt a day ago

    The point of CT logging is to ensure a person can ask "What certificates were issued for example.com?" or "What certificates were issued by Example CA?" and get an answer that's correct - even if the website or CA fucked up or got hacked and certificates are in the hands of people who've tried to cover their tracks.

    This requires the logs be held by independent parties, and retained forever.

    • torbid a day ago

      I understand that. But..

      If 12 CAs send to the same log and all have to save up to their latest entry not to be declared incompetent to be CAs, how would all 12 possibly do a worse job of providing that log on demand than a random 3rd party who has no particular investment at risk?

      (Every other CA in a log is a 3rd party with respect to any other, but one that can actually be told to keep something indefinitely, because it would also need to return the data to legitimize its own issuance.)

      • michaelt a day ago

        As far as I know, CAs don't have to "save up to their latest entry".

        The info they get back from the CT log may be a Merkle hash that partly depends on the other entries in the log, but they don't have to store the entire log, just a short checksum.

        • torbid a day ago

          Right, and this is what I am saying is backwards with the protocol. It is not in anyone's best interest that some random 3rd party takes responsibility to preserve data for CAs indefinitely to prove things. The CA should identify where it has its copy in the extension, and looking at one CA's copy one would find every other CA's copy of the same CT log.

tonymet a day ago

Is any amateur or professional auditing done on the CA system? Something akin to amateur radio auditing?

Consumers and publishers take certificates for granted. I see many broken certs, or brands using the wrong certs and domains for their services.

SSL/TLS has done well to prevent eavesdropping, but it hasn't done well to establish trust and identity.

  • sleevi a day ago

    All the time. Many CA distrust events involved some degree of “amateurs” reporting issues. While I hesitate to call commenters like agwa an amateur, it certainly was not professionally sponsored work by root programs or CAs. This is a key thing that Certificate Transparency enables: amateurs, academics, and the public at large to report CA issues.

    At the same time, it sounds like the issues you describe aren’t CA/issuance issues, but rather, simple misconfigurations. Those aren’t incidents for the ecosystem, although they can definitely be disruptive to the site, but I also wouldn’t expect them to call trust or identity into disrepute. That’d be like arguing my driver’s license is invalid if I handed you my passport; giving you the wrong doc doesn’t invalidate the claims of either, just doesn’t address your need.

    • tonymet 9 hours ago

      It seems more ad hoc and bounty-driven, rather than systematic. Is that a fair perspective?

      • agwa 3 hours ago

        I wish there were bounties :-)

        There is systematic checking - e.g. crt.sh continuously runs linters on certificates found in CT logs, I continuously monitor domains which are likely to be used in test certificates (e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1496088), and it appears the Chrome root program has started doing some continuous compliance monitoring based on CT as well.

        But there is certainly a lot of ad-hoc checking by community members and academics, which as Sleevi said is one of the great things that CT enables.

  • oasisbob a day ago

    Yup, it happens. There was a case I remember where a CA was issuing certs using the .int TLD for their own internal use, which it should not be doing.

    I happened to see it in the CT logs, and when that CA next came up for discussion on the Mozilla dev-security-policy list, their failure to address and disclose the misissuance in a timely manner was enough to stop the process of approving their request for EV recognition; it ended in a denial from Mozilla.

  • dlgeek a day ago

    Yes. All CAs trusted by browsers have to go through WebTrust or ETSI audits by accredited auditors.

    See https://www.mozilla.org/en-US/about/governance/policies/secu... and https://www.ccadb.org/auditors and https://www.ccadb.org/policy#51-audit-statement-content

    • tptacek a day ago

      As I understand them, these are accounting audits, similar to (if perhaps more detailed than) a SOC 2. The real thing keeping CAs from being gravely insecure is the CA death penalty Google will inflict if a CA suffers a security breach that results in any kind of misissuance.

      • creatonez a day ago

        It's not just Google, but also Mozilla, Apple, and Microsoft. They all work together on shutting down bad behavior.

        Apple and Microsoft mainly have power because they control Safari and Edge. Firefox is of course dying, but they still wield significant power because their trusted CA list is copied by all the major Linux distributions that run on servers.

        • tptacek a day ago

          Sure. I think Google and Mozilla have been the prime movers to date, but everyone has upped their game since Verisign/Symantec.

    • tonymet 9 hours ago

      That's good news about the CAs, but how about the publisher certificates that are in use?

  • Spivak a day ago

    I think over the years trust and identity have gone out of scope for TLS—I think for the better. Your identity is your domain and it's not TLS's problem to connect that identity to any real life person or legal entity. I'm sure you still can buy EV certs but no one really cares about them anymore. Certainly browsers no longer care about them. And TLS makes no claim on the trustworthiness of the site you're connecting to, just that the owner of the cert proved control of the domain and that your connection is encrypted.

    I can't even imagine how much a pain it would be to try and moderate certs based on some consistent international notion of trustworthiness. I think the best you could hope to do is have 3rd parties like the BBB sign your cert as a way of them "vouching" for you.

gslin a day ago

> You Should Run a Certificate Transparency Log

And:

> Bandwidth: 2 – 3 Gbps outbound.

I am not sure if this is correct. Is 2-3 Gbps really required for a CT log?

  • remus a day ago

    It seems like Filippo has been working quite closely with people running existing CT logs to try to reduce the requirements for running a log, so I'd assume he has a fairly realistic handle on the requirements.

    Do you have a reason to think his number is off?

    • gslin 20 hours ago

      Let's Encrypt issues 9M certs per day (https://letsencrypt.org/stats/), and its market share is 50%+ (https://w3techs.com/technologies/overview/ssl_certificate), so I assume there are <20M certs issued per day.

      If all certs are sent to just one CT log server, and each cert generates ~10 KB of outbound traffic, that's ~200 GB/day, or ~20 Mbps (spread evenly), which is not in the same ballpark as 2-3 Gbps.

      So I guess there is something I don't understand?
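      For what it's worth, the arithmetic above checks out, using the parent comment's own assumptions (<20M certs/day, ~10 KB of outbound traffic per cert):

```python
certs_per_day = 20_000_000      # assumed upper bound from Let's Encrypt market share
bytes_per_cert = 10 * 1000      # assumed ~10 KB of outbound traffic per cert

bytes_per_day = certs_per_day * bytes_per_cert      # 2e11 B = 200 GB/day
avg_bits_per_second = bytes_per_day * 8 / 86_400    # spread evenly over a day

print(f"{bytes_per_day / 1e9:.0f} GB/day ≈ {avg_bits_per_second / 1e6:.0f} Mbps average")
# → 200 GB/day ≈ 19 Mbps average
```

      So the gap to 2-3 Gbps can't be explained by submission traffic alone; it has to come from somewhere else (as the replies below this comment discuss, it's read traffic).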

      • bo0tzz 19 hours ago

        I've been trying to get an understanding of this number myself as well. I'm not quite there yet, but I believe it's talking about read traffic, ie serving clients that are looking at the log, not handling new certificates coming in.

        • FiloSottile 18 hours ago

          I added a footnote about it. It’s indeed read traffic, so it’s (certificate volume x number of monitors x compression ratio) on average. But then you have to let new monitors catch up, so you need burst.

          It’s unfortunately an estimate, because right now we see 300 Mbps peaks, but as Tuscolo moves to Usable and more monitors implement Static CT, 5-10x is plausible.

          It might turn out that 1 Gbps is enough and the P95 is 500 Mbps. Hard to tell right now, so I didn’t want to get people in trouble down the line.

          Happy to discuss this further with anyone interested in running a log via email or Slack!
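          To illustrate the (certificate volume x number of monitors x compression ratio) formula with plug-in numbers (every figure below is an assumption for the sketch, not a measurement from the thread):

```python
# Illustrative plug-in of the read-traffic formula; all inputs are assumptions.
certs_per_day = 20_000_000
bytes_per_entry = 5_000   # assumed serialized entry size in the static tiles
monitors = 20             # assumed number of monitors each downloading the full log
compression = 0.5         # assumed on-the-wire compression ratio

daily_bytes_out = certs_per_day * bytes_per_entry * monitors * compression
avg_mbps = daily_bytes_out * 8 / 86_400 / 1e6
print(f"average ≈ {avg_mbps:.0f} Mbps, before any catch-up bursts")
# → average ≈ 93 Mbps, before any catch-up bursts
```

          With these made-up inputs the steady-state average lands around 100 Mbps, which is the same order of magnitude as the observed 300 Mbps peaks; catch-up bursts for new monitors are what push the provisioning target into the Gbps range.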

          • bo0tzz 16 hours ago

            Thanks, that clarifies a lot!

    • ApeWithCompiler a day ago

      > or an engineer looking to justify an overprovisioned homelab

      In Germany, 2 – 3 Gbps outbound is a milestone, even for enterprises. As an individual, I am privileged to have 250 Mbps down / 50 Mbps up.

      So it's at least far beyond what any individual in this country could imagine.

      • nucleardog 9 hours ago

        Yeah the requirements aren't too steep here. I could easily host this in my "homelab" if I gave a friend a key to access my utility room if I were away / unavailable.

        But 2-3 Gbps of bandwidth makes this pretty inaccessible unless you're just offloading the bulk of this onto CloudFront/Cloudflare, at which point... it seems to me we don't really have more people running logs in a very meaningful sense, just somebody paying Amazon a _lot_ of money. If I'm doing my math right this is something like 960TB/mo, which is like a $7.2m/yr CloudFront bill. Even with some lesser-known CDN providers we're still talking like $60k/yr.

        Seems to me the bandwidth requirement means this is only going to work if you already have some unmetered connections laying around.

        If anyone wants to pay the build out costs to put an unmetered 10Gbps line out to my house I'll happily donate some massively overprovisioned hardware, redundant power, etc!

      • jeroenhd a day ago

        You can rent 10 Gbps service from various VPS providers if you can't get the bandwidth at home. Your home ISP will probably have something to say about a continuous 2 Gbps upstream anyway, whether it's through data caps or fair use policy.

        Still, even in Germany, with its particularly lacking internet infrastructure for the wealth the country possesses, M-net is slowly rolling out 5 Gbps internet.

  • nomaxx117 14 hours ago

    I wonder how much putting a CDN in front of this would reduce the requirements.

    According to the readme, the bulk of the traffic is highly cacheable, so presumably you could park a CDN in front and substantially reduce the bandwidth requirements.

    • mcpherrinm 13 hours ago

      Yes, the static-ct-api is designed to be highly cacheable by a CDN.

      That is one of the primary motivations of its design over the previous CT API, which had some relatively flexible requests that made caching much less effective.

  • xiconfjs a day ago

    So we are talking about 650 TB+ of traffic per month, or $700 per month just for bandwidth... so surely not a one-man project.

    • dpifke 12 hours ago

      I pay roughly $800/mo each for two 10 Gbps transit connections (including cross-connect fees), plus $150/mo for another 10 Gbps peering connection to my local IX. 2-3 Gbps works out to less than $200/mo. (This is at a colo in Denver for my one-man LLC.)

    • dilyevsky 13 hours ago

      If you’re paying metered you’re off by an order of magnitude - much more expensive. Even bandwidth-based transit will be more expensive than that at most colos.

ncrmro a day ago

Seems like something that might be useful to store on Arweave, a blockchain for storage. Fees go to an endowment that has been calculated to far exceed the cost of growing storage.

udev4096 a day ago

Instead of mainstreaming DANE, you want me to help a bunch of centralized CAs? No thanks. DANE is the future, it will happen

  • jcgl 14 hours ago

    I like the idea and I like DNSSEC too (well, well enough at least—lots of tooling could be better), but DANE can’t catch on faster than DNSSEC does. And DNSSEC isn’t exactly taking the world by storm.

cypherpunks01 a day ago

How does the CT system generally accomplish the goal of append-only entries, with public transparency of when entries were made?

Is this actually a good use case for (gasp) blockchains? Or would it be too much data?

1vuio0pswjnm7 13 hours ago

Original HN title: "You Should Run a Certificate Transparency Log"

johnklos a day ago

[flagged]

  • johnklos a day ago

    I suppose we've got a lot of Google fans here! Do you like not being able to contact anyone there? You could be a YouTube creator with a million followers and you'll never correspond with anyone with any control over anything at Google ;)

    • danpalmer a day ago

      This just doesn't match my experience.

      People love to say it, but when we had GSuite issues at my previous workplace we spoke to GSuite support and had a resolution quickly. When we had GCP queries we spoke to our account manager who gave us a technical contact who escalated internally and got us the advice we needed. When we asked about a particular feature we were added to the alpha stage of an in-development product and spoke with the team directly about that. I've got friends who have had various issues with Pixel phones over the years and they just contact support and get a replacement or fix or whatever.

      Meanwhile I've seen colleagues go down the rabbit hole of AWS support and have a terrible time. For us it was fine but nothing special, I've never experienced the amazing support that I've heard some people talk about.

      We were a <100 person company with a spend quite a bit less than many companies of our size. From what I've heard from YouTubers with a million followers, they have account managers and they always seem to encourage talking to account managers.

      • resize2996 a day ago

        Just to add my own anecdata: My experience with Pixel/GoogleFi support has been some of the worst customer support I've ever experienced, and I have given them boatloads of money.

        source: I used to do vendor relations for a large public org where contractors (medium tech companies) would routinely try to skirt the line on what they had to deliver. I would rather deal with them than GoogleFi, because in that situation there was a certain point where I could give up and hand it off to our lawyers.

      • toast0 a day ago

        > People love to say it, but when we had GSuite issues at my previous workplace we spoke to GSuite support and had a resolution quickly.

        That certainly wasn't my experience. Unless 'we're not going to help you' counts as a resolution. We did get a response quickly, but there was no path to resolving the issues I had other than just ignoring the issues.

      • johnklos a day ago

        But you're giving Google money.

        I should've qualified what I wrote, but what I mean is that no matter who you are, if you don't know someone there and aren't paying them money, there's no way to communicate with humans there.

        It's like companies that won't let you sign up unless you give them a cell phone number, but not only do they not have a number themselves, they don't even have email. Or, for companies like Verizon, they don't have email, but they have phone numbers with countless layers of "voice assistants" you can't skip. It's a new way of "communicating" that's just crazymaking.

        • danpalmer a day ago

          That's true of most companies, unless you're a customer or they think they can sell you something, they're unlikely to give you much time even if you can theoretically call them up.

          In this case, you point to the hypocrisy of being uncontactable but demanding your contact details, except that Google does provide support to customers, and in this relationship they are essentially a customer of your CT log, and given the criticality of that service they rightly expect the service provider to be held to a high standard. I don't think they're holding you to a standard that they themselves wouldn't agree to be held to for a service that critical. I've got to make it clear that this is my personal opinion though.

dboreham a day ago

Add an incentive mechanism to motivate running a server, and hey, it's a blockchain. But those have no practical application, so it must not be a blockchain.

  • schoen a day ago

    There is some historical connection between CT and blockchains.

    http://www.aaronsw.com/weblog/squarezooko

    Ben Laurie read this post by Aaron Swartz while thinking about how a certificate transparency mechanism could work. (I think Peter Eckersley may have told him about it!) The existence proof confirmed that we sort of knew how to make useful append-only data structures with anonymous writers.

    CT dropped the incentive mechanism and the distributed log updates in favor of more centralized log operation, federated logging, and out-of-band audits of identified log operators' behavior. This mostly means that CT lacks the censorship resistance of a blockchain. It also means that someone has to directly pay to operate it, without recouping the expenses of maintaining the log via block rewards. And browser developers have to manually confirm logs' availability properties in order to decide which logs to trust (with -- returning to the censorship resistance property -- no theoretical guarantee that there will always be suitable logs available in the future).

    This has worked really well so far, but everyone is clear on the trade-offs, I think.

  • Dylan16807 a day ago

    Yes, that is correct. (Other than the word "must"? I'm not entirely sure of your intent there.) This is close to a blockchain in some ways, but a blockchain-style incentive mechanism would be a net negative, so it doesn't have that.

    If you figure out a good way to involve an incentive structure like that, let us know!

  • some_random 14 hours ago

    I'm happy to offer an incentive of 100 Cert Points (issued by me, redeemable with me at my discretion) to anyone running CT /s

    In all seriousness, the incentive is primarily in having the data, IMO.