If you're going to host user content on subdomains, then you should probably have your site on the Public Suffix List (https://publicsuffix.org/list/).
That should eventually make its way into various services so they know that a tainted subdomain doesn't taint the entire site.
In the past, browsers used an algorithm which only denied setting wide-ranging cookies for top-level domains with no dots (e.g. com or org). However, this did not work for top-level domains where only third-level registrations are allowed (e.g. co.uk). In these cases, websites could set a cookie for .co.uk which would be passed onto every website registered under co.uk.
Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain (the policies differ with each registry), the only method is to create a list. This is the aim of the Public Suffix List.
(https://publicsuffix.org/learn/)
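To make the quoted explanation concrete, here is a minimal sketch of a list-driven lookup. The rule set is a tiny hand-picked sample rather than the real list, the function names are made up for illustration, and the real matching rules also include wildcard and exception entries, which this sketch ignores.

```python
# Tiny hand-picked sample of suffix rules; the real list is far larger.
SAMPLE_SUFFIXES = {"com", "org", "uk", "co.uk", "io", "github.io"}


def public_suffix(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return the longest matching suffix rule for hostname."""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest candidate first, so "co.uk" wins over "uk".
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    # Fallback when no rule matches: treat the last label as the suffix.
    return labels[-1]


def registrable_domain(hostname, suffixes=SAMPLE_SUFFIXES):
    """Return the suffix plus one label, or None if hostname is itself a suffix."""
    labels = hostname.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(hostname, suffixes).split("."))
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])


print(registrable_domain("www.example.co.uk"))  # example.co.uk
print(registrable_domain("alice.github.io"))    # alice.github.io
print(registrable_domain("co.uk"))              # None
```

The point of the sketch is that nothing in the hostname itself says where the boundary is; only the list does.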
So, once they realized web browsers are all inherently flawed, their solution was to maintain a static list of websites.
God I hate the web. The engineering equivalent of a car made of duct tape.
> Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain
A centralized list like this, covering not just whole domains (e.g. co.uk) but also specific sites (e.g. s3-object-lambda.eu-west-1.amazonaws.com), is both kind of crazy, in that the list will bloat a lot over the years, and a security risk for any platform that needs this functionality but would prefer not to leak any details publicly.
We already have the concept of a .well-known directory that you can use, when talking to a specific site. Similarly, we know how you can nest subdomains, like c.b.a.x, and it's more or less certain that you can't create a subdomain b without the involvement of a, so it should be possible to walk the chain.
Example:
c --> https://b.a.x/.well-known/public-suffix
b --> https://a.x/.well-known/public-suffix
a --> https://x/.well-known/public-suffix
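A rough sketch of that walk, assuming the hypothetical `/.well-known/public-suffix` path from the example (no such endpoint is standardized, and the function name is made up). It only builds the URLs that would be queried; it does not fetch anything.

```python
def delegation_chain(hostname):
    """Yield (label, url) pairs: each label is vouched for by its parent domain."""
    labels = hostname.lower().rstrip(".").split(".")
    # Stop before the last label: the top-level name has no parent to ask.
    for i in range(len(labels) - 1):
        parent = ".".join(labels[i + 1:])
        yield labels[i], f"https://{parent}/.well-known/public-suffix"


for label, url in delegation_chain("c.b.a.x"):
    print(label, "-->", url)
```

Note that a lookup for a deep hostname costs one request per level, which is part of the performance concern raised elsewhere in the thread.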
Maybe ship the domains with the browsers and such and leave generic sites like AWS or whatever to describe things themselves. Hell, maybe that could also have been a TXT record in DNS as well.
> any platform that needs this functionality but would prefer not to leak any details publicly.
I’m not sure how you’d have this - it’s for the public facing side of user hosted content, surely that must be public?
> We already have the concept of a .well-known directory that you can use, when talking to a specific site.
But the point is to help identify dangerous sites; by definition you can’t just let the sites mark themselves as trustworthy and rotate around subdomains. If you have an approach that doesn’t have to trust the site, you also don’t need any definition at the top level; you could just infer it.
It's actually exactly the same concept that came to mind for me. `SomeUser.geocities.com` is "tainted", along with `*.geocities.com`, so `geocities.com/.well-known/i-am-tainted` is actually reasonable.
Although technically it might be better as `.well-known/taint-regex` (now we have three problems), like `TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*" ; ...`
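A minimal sketch of how a client might read such a line, assuming the made-up `TAINT` syntax above and treating the patterns as shell-style globs (which is what the examples look like, despite the "regex" name). The function names are hypothetical.

```python
from fnmatch import fnmatchcase
import re


def parse_taint(line):
    """Extract the quoted glob patterns from a hypothetical TAINT line."""
    if not line.startswith("TAINT"):
        return []
    return re.findall(r'"([^"]+)"', line)


def is_tainted(host, path, patterns):
    """Match patterns containing '/' against host+path, others against host only."""
    for pattern in patterns:
        target = host + path if "/" in pattern else host
        if fnmatchcase(target, pattern):
            return True
    return False


patterns = parse_taint('TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*"')
print(is_tainted("alice.sites.myhost.com", "/", patterns))   # True
print(is_tainted("myhost.com", "/uploads/a.png", patterns))  # True
print(is_tainted("myhost.com", "/about", patterns))          # False
```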
a.scamsite.com gets blocked, so they just put their phishing pages on b.scamsite.com.
The PSL (or your solution) isn’t a “don’t trust subdomains” notification; it’s “if one subdomain is bad, you should still trust the others”, and the problem there is you can’t trust them.
You could combine the two, but you still need the suffix list or similar curation.
Doing this via DNS in the browser in real time would be a performance challenge, though. The PSL affects the scope of cookies (github.io is on the PSL, so a.github.io can't set a cookie that b.github.io can read), so the relevant PSL entry needs to be known before the first HTTP response comes back.
It does smell very much like a feature that is currently implemented as a text file but will eventually need to grow into its own protocol, like, indeed, the hosts file becoming DNS.
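A minimal sketch of the cookie-scope check described here, assuming a tiny illustrative suffix set and a made-up function name. Real browsers follow more detailed rules (for example, they special-case a host that is itself a public suffix); this only shows why the suffix data has to be available locally when the `Set-Cookie` header arrives.

```python
SAMPLE_SUFFIXES = {"com", "io", "github.io"}  # illustrative subset, not the real list


def may_set_cookie_domain(request_host, cookie_domain, suffixes=SAMPLE_SUFFIXES):
    """Decide whether request_host may set a cookie scoped to cookie_domain."""
    host = request_host.lower().rstrip(".")
    domain = cookie_domain.lower().lstrip(".")
    # A cookie scoped to a public suffix would be shared across unrelated sites.
    if domain in suffixes:
        return False
    # Otherwise the host must equal the domain or be a subdomain of it.
    return host == domain or host.endswith("." + domain)


print(may_set_cookie_domain("a.github.io", "github.io"))     # False
print(may_set_cookie_domain("a.github.io", "a.github.io"))   # True
print(may_set_cookie_domain("www.github.com", "github.com")) # True
```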
One key difference between this list and standard DNS (at least as I understand it; maybe they added an extension to DNS I haven't seen) is the list requires independent attestation. You can't trust `foo.com` to just list its subdomains; that would be a trivial attack vector for a malware distributor to say "Oh hey, yeah, trustme.com is a public suffix; you shouldn't treat its subdomains as the same thing" and then spin up malware1.trustme.com, malware2.trustme.com, etc. Domain owners can't be the sole arbiter of whether their domain counts as a "public suffix" from the point of view of user safety.
Neither is nominating a third party for your parking fine.
The point is to get away from centralized gatekeepers, not establish more of them. A hierarchy of disavowal. It’s like cache invalidation for accountability.
If you don’t wanna be held responsible for something, you’d better be prepared to point the finger at someone whois.
Jeep just had an OTA update cause the car to shut down on the highway (it is rumored).
Before we put computers in cars, we had the myriad small things that would break (stuck doors, stuck windows, failed seals, leaking gaskets), a continuous stream of recalls for low-probability safety issues, and the occasional Gremlin or Pinto.
My favorite example is the Hyundai Elantra. They changed the alloy used in one of the parts in the undercarriage. Tested that model to death for a year, as they do, but their proving ground is in the southern United States.
Several winters later, it turns out that road salt attacks the hell out of that alloy and people have wheels flying off their cars in the middle of the road.
Not really. Does the car still drive? That sounds like a software bug; hardly indicative that the entire car is held together with duct tape, but a pretty bad bug nonetheless.
So I can't remember the specifics or find any references, but many years ago I remember reading about a car (Prius maybe?) that would shut off and lock the doors when pulling away from a stop. (Ex: stopped at a red light, when it turns green the car would go far enough to cut off in the middle of an intersection, then trap everyone inside.)
More accurate: a mom-n-pop grocery store has its listing on Google Maps changed to PERMANENTLY CLOSED DUE TO TOXIC HEALTH HAZARDS because the mom-n-pop grocery store didn't submit Form 26B/Z to Google. There was never any health hazard, but now everyone thinks there is, and nobody can/will go there. The fact that Form 26B/Z exists at all is problematic, but what makes it terrible is the way it's used to punish businesses for not filling out a form they didn't know existed.
This is an excellent analogy because it is incumbent upon businesses to follow all the laws, including the ones they don't know about. That's one of the reasons "lawyer" is a profession.
Google doesn't have the force of law (it's in this context acting more like a Yelp: "1 star review --- our secret shopper showed up and the manager didn't give the secret 'we are not criminals' hand sign"), but the basic idea is the same: there is a complex web of interactions that can impact your online presence and experts in the field you can choose to hire for consulting or not.
Didn't used to be that way, but the web used to be a community of 100,000 people, not 5.6 billion. Everything gets more complicated when you add more people.
The other commenter's analogy of a small business is better, I think. The issue with the browser problem is that it doesn't hinder one person getting to one house; it hinders all persons getting to one place the owner _wants_ people to get to easily.
The browser issue can destroy a small business, one thing I think we can universally agree we don't want. If all of the people who come looking for it find it's being marked as malicious or just can't get there at all, they lose customers.
Worse yet, Google holds the keys because everyone uses Chrome, and you have to play their game by their rules just to keep breathing.
Here's the thing though: if someone else held the keys, the scenario would be the same unless there was no safe browsing protection. And if there were no safe browsing protection, we'd be trading one ill for another; small business owners facing a much steeper curve to compete vs. everyone being at more risk from malware actors.
I honestly don't immediately know how to weigh those risks against each other, but I'll note that this community likely underestimates the second one. Most web users are not nearly as tech- or socially-savvy as the average HN reader and the various methods of getting someone to a malware subdomain are increasingly sophisticated.
Cookies shouldn't be tied to domains at all, it's a kludge. They should be tied to cryptographic keypairs (client + server). If the web server needs a cookie, it should request one (in its reply to the client's first request for a given url; the client can submit again to "reply" to this "request"). The client can decide whether it wants to hand over cookie data, and can withhold it from servers that use different or invalid keys. The client can also sign the response. This solves many different security concerns, privacy concerns, and also eliminates the dependency on specific domain names.
I just came up with that in 2 minutes, so it might not be perfect, but you can see how, with a little bit of work, there are much better solutions than "I check for not-evil domain in list!"
> They should be tied to cryptographic keypairs (client + server).
So now, if a website leaks its private key, attackers can exfiltrate cookies from all of its users just by making them open an attacker-controlled link, for as long as the cookie lives (and users don't visit the website to get the rotated key).
> If the web server needs a cookie, it should request one
This adds a round-trip, which slows down the website on slow connections.
> the client can submit again to "reply" to this "request"
This requires significantly overhauling HTTP and load-balancers. The public-suffix list exists because it's an easy workaround that didn't take a decade to specify and implement.
> So now, if a website leaks its private key, attackers can exfiltrate cookies from all of its users just by making them open an attacker-controlled link
This attack already exists in several forms (leaking a TLS private key, DNS hijack, CA validation attack, etc). You could tack a DNS name onto the crypto-cookies if you wanted to, but DNS is trivial to attack.
> This adds a round-trip, which slows down the website on slow connections.
Requests are already slowed down by the gigantic amount of cookies constantly being pushed by default. The server can send a reply-header once which will tell the client which URLs need cookies perpetually, and the client can store that and choose whether it sends the cookies repeatedly or just when requested. This gives the client much more control over when it leaks users' data.
> This requires significantly overhauling HTTP and load-balancers
No change is needed. Web applications already do all of this all the time. (example: the Location: header is frequently sent by web apps in response to specific requests, to say nothing of REST and its many different request and return methods/statuses/headers).
> The public-suffix list exists because it's an easy workaround
So the engine of modern commerce is just a collection of easy hacks. Fantastic.
> This attack already exists in several forms (leaking a TLS private key, DNS hijack, CA validation attack, etc).
An attacker who gets the TLS private key of a website can't use it easily, because they still need to fool users' browsers into connecting to a server they control as the victim domain, which brings us to:
> You could tack a DNS name onto the crypto-cookies if you wanted to, but DNS is trivial to attack.
It's not. I can think of two ways to attack the DNS. Either 1. control or MITM of the victim's authoritative DNS server or 2. poison users' DNS cache.
> Requests are already slowed down by the gigantic amount of cookies constantly being pushed by default
Yes, although adding more data and adding a round-trip have different impacts (high-bandwidth, high-latency connections exist). Lots of cookies and more round-trips is always worse than lots of cookies and fewer round-trips.
> The server can send a reply-header once which will tell the client which URLs need cookies perpetually, and the client can store that and choose whether it sends the cookies repeatedly or just when requested.
Everyone hates configuring caches, so in most cases site operators will leave it at a default "send everything", and we're back to square one.
> No change is needed.
I was thinking that servers need to remember state between the initial client request and when the client sends another request with the cookies. But on second thought, that's indeed not necessary.
> So the engine of modern commerce is just a collection of easy hacks. Fantastic.
A part of the issue is IMO that browsers have become ridiculously bloated everything-programs. You could take about 90% of that out and into dedicated tools and end up with something vastly saner and safer and not a lot less capable for all practical purposes. Instead, we collectively are OK with frosting this atrocious layer cake that is today's web with multiple flavors of security measures of sometimes questionable utility.
> You could take about 90% of that out and into dedicated tools
But then you would lose platform independence, the main selling point of this atrocity.
Having all those APIs in a sandbox that mostly just work on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
The best thing to happen, that I can see, is that a sane subset crystallizes that people start to use dominantly, with the rest becoming legacy, only maintained to keep it working.
I have dreamed of a fresh rewrite of the web since university (and the web was way slimmer back then), but I've become a bit more pragmatic, and I think I now better understand the massive problem of solving trusted human communication. It ain't easy in the real world.
But do we need e.g. serial port or raw USB access straight from a random website? Even WebRTC is a bit of a stretch. There is a lot of cruft in modern browsers that does little except increase attack surface.
This all just drives a need to come up with ever more tacked-on protection schemes because browsers have big targets painted on them.
You remove that, and videoconferencing (for business or person to person) has to rely on downloading an app, meaning whoever is behind the website has to release for 10-15 OSes now. Some already do, but not everyone has that budget so now there's a massive moat around it.
> But do we need e.g. serial port or raw USB access straight from a random website
Being able to flash an IoT device (e.g. an ESP32) from the browser is useful for a lot of people. For the "normies", there was also Stadia allowing you to flash their controller to be a generic Bluetooth/USB one on a website, using WebUSB. Without it, Google would have had to release an app for multiple OSes or, more likely, would have just left the devices as paperweights. Also, you can use FIDO/U2F keys directly now, which is pretty good.
Browsers are the modern Excel, people complain that they do too much and you only need 20%. But it's a different 20% for everyone.
I'll flip that around on you: why oh why do we need browsers to carry these security holes in them? The Stadia flasher is a good example: how do I know that a website doesn't contain a device flasher that will turn one of my connected devices into a malicious actor that will attempt to take over whatever machine it's plugged into?
You know because there is an explicit permission box that pops up and asks if you want to give this website access to a device, and asks you to select that device.
But that still gives completely unvetted direct access to the device to a website! People have been pointing to Itch.io games that supposedly require direct USB access. How hard is it to hide a script in there that reprograms a controller into something malicious?
If you download an executable from a website and run it... pretty much the same thing?
If you give USB access, it is not really a website anymore, but rather an app delivered through the web. I don't see a fundamental difference in trust.
Rather, I am able to verify the web-based version more easily, and I certainly won't give access to a random website, just like I don't download random exes from websites.
Performance is lower, yes, and well... like I said, it is all a big mess. Just look at the global namespace in JS. I still use it because of that power feature called platform independence. What I release, people can (mostly) just use. I (mostly) don't care which OS the user has.
A file that lands on my hard drive is automatically scanned for malware. That same kind of protection isn't in place against malicious scripts downloaded by my browser via an opaque HTTPS connection and run in process.
While that's pretty convenient, I'm worried about what happens when the vendor shuts down the website. "Ugly broken vendor tools" can be run forever in a VM of an old system, but a website would be gone forever unless it's purely client-side and someone archived it.
I have used WebRTC for many years and would miss it a lot. P2P is awesome.
I don't use WebUSB and wouldn't miss it right now, but... the main potential use case is security, and it sounds somewhat reasonable:
"Use in multi-factor authentication
WebUSB in combination with special purpose devices and public identification registries can be used as key piece in an infrastructure scale solution to digital identity on the internet."
> But do we need e.g. serial port or raw USB access straight from a random website?
But do we need audio, images, Canvas, WebGL, etc? The web could just be plain text and we’d get most of the “useful” content still, add images and you get a vast majority of it.
But the idea that the web is a rich environment that has all of these bells and whistles is a good thing imo. Yes there’s attack surface to consider, and it’s not negligible. However, the ability to connect so many different things opens up simple access to things that would otherwise require discrete apps and tooling.
One example that kind of blew my mind is that I wanted a controller overlay for my Twitch stream. After a short bit of looking, there isn’t even a plugin needed in OBS (streaming software). Instead, you add a Web View layer and point it to GamePad Viewer[1] and you’re done.
Serial and USB are possibly a boon for very specific users with very specific accessibility needs. Also, iirc some of the early iPhone jailbreaks worked via websites on a desktop with your iPhone plugged into USB. Sure, these are niche and could probably be served just as well or better with native apps, but the web also makes the barrier to entry so much lower.
Every decent host OS already has a dedicated driver stack to provide game controller input to applications in a useful manner. Why the heck would you ship a reimplementation of that in JS in a website?
If it hasn't been invented yet, you don't need driver software for it, do you? ;)
Anyway, in your scenario the controller would be essentially a one off and you'd be better off writing a native app to interface with it for the one computer this experiment will run on.
Unlikely. The convenience incentives are far too high to leave features on the table.
Not unlike the programming language or the app (growing until it half-implements LISP or half-implements an email client), the browser will grow until it half-implements an operating system.
> Having all those APIs in a sandbox that mostly just work on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
I think the giant downside is that they've written a rootkit that runs on everything, and to try to make up for that they want to make it so only sites they allow can run.
It's not really very powerful at all if nobody can use it, at that point you are better off just not bothering with it at all.
The Internet may remain, but the Web may really be dead.
> to try to make up for that they want to make it so only sites they allow can run
What do you mean? You can run whatever you want on localhost, and it's quite easy to host whatever you want for whoever you want too. Maybe the biggest modern added barrier to entry is that having TLS is strongly encouraged, or even needed for some things, but this is an easily solved problem.
> A part of the issue is IMO that browsers have become ridiculously bloated everything-programs.
I don't see how that solves the issue that PSL tries to fix. I was a script kiddy hosting neopets phishing pages on free cpanel servers from <random>.ripway.com back in 2007. Browsers were way less capable then.
The PSL and the way cookies work are just part of the mess. A new approach could solve that in a different way, taking into account all the experience we have had with script kiddies and professional scammers and phishers since then. But I also don't really have an idea where and how to start.
And of course, if the new solution completely invalidates old sites, it just won't get picked up. People prefer slightly broken but accessible to better designed but inaccessible.
> People prefer slightly broken but accessible to better designed but inaccessible.
We live in a world where whatever FAANG adopts is a de facto standard. Accessible these days means Google/Gmail/Facebook/Instagram/TikTok works. Everything else is usually forced to follow along.
People will adopt whatever gives them access to their daily dose of doomscrolling and then complain about a rather crucial part of their lives, like online banking, not working.
> And of course, if the new solution completely invalidates old sites, it just won't get picked up.
Old sites don't matter, only high-traffic sites riddled with dark patterns matter. That's the reality, even if it is harsh.
> People prefer slightly broken but accessible to better designed but inaccessible.
It's not even broken as the edge cases are addressed by ad-hoc solutions.
OP is complaining about global infrastructure not having a pristine design. At best it's a complaint about a desirable trait. It's hardly a reason to pull the Jr developer card and mindlessly advocate for throwing everything out and starting over.
Are you saying we should make a <Unix Equivalent Of A Browser?> A large set of really simple tools that each do one thing really really really pedantically well?
This might be what's needed to break out of the current local optimum.
Since this is being downvoted: no, I'm quite serious.
CORS lets sites define their own security boundaries between subdomains, with mutual validation. If you're hosting user content in a subdomain, just don't allow-origin it: that is a clear statement that it's not "the same site". The PSL plays absolutely no part in that logic; it seems clear to me that CORS is at least in part intended to replace the PSL.
Do other sites (like Google's safety checks) use CORS for this purpose? Dunno. Seems like they could though? Or am I missing something?
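For reference, a minimal sketch of the "don't allow-origin it" idea, seen from the server side. The origins and the function name are hypothetical. It only covers which origins may read responses cross-origin; it says nothing about cookie scope, which other comments in the thread attribute to the PSL.

```python
# Hypothetical trusted origins; user content would live on some other origin.
TRUSTED_ORIGINS = {"https://example.com", "https://app.example.com"}


def cors_headers(origin, trusted=TRUSTED_ORIGINS):
    """Return CORS response headers for a request carrying the given Origin."""
    if origin in trusted:
        # Echo the exact origin; "Vary: Origin" keeps caches from mixing responses.
        return {"Access-Control-Allow-Origin": origin, "Vary": "Origin"}
    # No allow-origin header: the browser blocks cross-origin reads of the response.
    return {"Vary": "Origin"}


print(cors_headers("https://app.example.com"))
print(cors_headers("https://alice.usercontent.example"))
```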
> God I hate the web. The engineering equivalent of a car made of duct tape.
Most of the complex things I have seen being made (or contributed to) needed duct tape sooner or later. Engineering is the art of trade-offs, of adapting to changing requirements (which can appear due to uncontrollable events external to the project), technology, and costs.
Why would you compare the Web to that? A first fax message would be a more appropriate comparison.
The Web is not a new thing and hardly a technical experiment of a few people any more.
If you add the time since announcing the concept of Web to that trip date, you have a very decent established industry already. With many sport and mass production designs:
For me the web is something along the lines of the definition at https://en.wikipedia.org/wiki/World_Wide_Web; to sum up, a "...universal linked information system...". I think the fax misses too many aspects of the core definition to be a good comparison.
Not sure what is your point about "decent established industry" if we relate to "duct tape". I see two possibilities:
a) you imply that the web does not have a decent established industry (but I would guess not).
> 5. Once running, push choke in gradually, advance spark, reduce throttle.
Not sure about your opinion, but compared to what a car's objective is (move from point A to point B), to me that sounds rather involved. Not sure if it qualifies as "duct tape", but it is definitely not a "nicely implemented system that just works".
To sum up my point: I think on average progress is slower and harder than people think. And that is mostly because people do not have exposure to the work others are doing to improve things until something becomes more "widely available".
I think it's somewhat tribal webdev knowledge that if you host user generated content you need to be on the PSL otherwise you'll eventually end up where Immich is now.
I'm not sure how people who haven't already hit this very issue are supposed to know about it beforehand, though; it's one of those things that you don't really come across until you're hit by it.
I think what gets me more is I don't see an easy way to add suffixes to the list. I'm sure if I dig I can figure it out, but you'd think, given how it's used, they'd have an obvious step-by-step guide on the website.
Ok, so we need a GitHub (Microsoft) account to avoid needing a Google account in case some undocumented system decides to shut down a website we host. Great.
Besides user-uploaded content, it's pretty easy to accidentally destroy the reputation of your main domain with subdomains.
For example:
1. Add a subdomain to test something out
2. Complete your test and remove the subdomain from your site
3. Forget to remove the DNS entry, and now your A record points to an IP address you no longer control
At this point if someone else on that hosting provider gets that IP address assigned, your subdomain is now hosting their content.
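A minimal sketch of a periodic check for this situation: resolve each subdomain you know about and flag any that point at an address you do not currently hold. The function name and the example names and addresses are made up, and the resolver is injectable so the demo runs offline.

```python
import socket


def find_dangling(subdomains, owned_ips, resolve=socket.gethostbyname):
    """Return {hostname: ip} for names that resolve to an address we do not own."""
    dangling = {}
    for name in subdomains:
        try:
            ip = resolve(name)
        except OSError:
            continue  # does not resolve, so nothing is being served under it
        if ip not in owned_ips:
            dangling[name] = ip
    return dangling


# Offline demo with a fake resolver; names and addresses are made up.
fake_dns = {"www.example.com": "203.0.113.10", "test.example.com": "198.51.100.7"}
print(find_dangling(fake_dns, {"203.0.113.10"}, resolve=fake_dns.__getitem__))
# {'test.example.com': '198.51.100.7'}
```

In real use you would pass your actual record names and the addresses currently assigned to you, and leave `resolve` at its default.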
I had this happen to me once with PDF books being served through a subdomain on my site. Of course it's my mistake for not removing the A record (I forgot) but I'll never make that mistake again.
10 years of my domain having a good history may have been tainted in an irreparable way. I don't get warnings visiting my site, but traffic has slowly gotten worse since around that time, despite me posting more and more content. The correlation isn't guaranteed, especially with AI taking away so much traffic, but it's something I do think about.
They clearly are? It seems like GitHub users submitting a PR could/can add a `preview` label, and that would lead to the application + their changes being deployed to a public URL under "*.immich.cloud". So they're hosting content generated by users (a built application based on user patches) on domains under their control.
Ah, that's a different situation then, sorry for misunderstanding the context and thanks for clearing that up! I was under the impression that Immich accepted outside contributions, and those would also have those preview sites created for their pending contributions.
Don't get me wrong, Google is bad/evil in many ways, but the public suffix list exists to solve a real risk to users. Google is flagging this for a legit reason in this particular case.
It's not a legit reason at all. A website isn't "unsafe" just because it looks similar to another one to Google's AI. At best such an automated flag should trigger a human review, not take the website offline.
Google needs to be held liable for the damages they do in cases like this or they will continue to implement the laziest solutions as long as they can externalize the costs.
> the safe search guards catch more bad actors than they false positive good actors.
Well, if the legal system used the same "Guilty until proven innocent" model, we would definitely "catch more bad actors than false positive good actors".
You do not want malware protection to be running at the speed of the legal system.
A better analogy, unfortunately for all the reasons it's unfortunate, is police: acting on the partial knowledge in the field to try to make the not-worst decision.
> people keep using Google's browser because the safe search guards catch more bad actors than they false positive good actors.
This is the first thing I disable in Chrome, Firefox and Edge. The only safe thing they do is safely sending all my browsing history to Google or Microsoft.
That's a reasonable thing for you to do (especially if you have some other signal source you use for malware protection), but HN readers are rarely representative of average users.
This feature is there for my mother-in-law, who never saw a popup ad she didn't like. You might think I'm kidding; I am not. I periodically had to go into her Android device and dump twenty apps she had manually installed from the Play Store because they were in a ring of promoting each other.
Looking through some of the links in this post, I think there are actually two separate issues here:
1. Immich hosts user content on their domain, and should thus be on the public suffix list.
2. When users host an open source self-hosted project like Immich, Jellyfin, etc. on their own domain, it gets flagged as phishing because it looks an awful lot like the publicly hosted version, but it's on a different domain, and possibly a domain that might look suspicious to someone unfamiliar with the project, because it includes the name of the software in the domain. Something like immich.example.com.
The first one is fairly straightforward to deal with, if you know about the public suffix list. I don't know of a good solution for the second though.
I don't think the Internet should be run by being on special lists (other than like, a globally run registry of domain names)...
I get that SPAM, etc., are an issue, but, like f* google-chrome, I want to browse the web, not some carefully curated list of sites some giant tech company has chosen.
A) you shouldn't be using google-chrome at all B) Firefox should definitely not be using that list either C) if you are going to have a "safe sites" list, that should definitely be a non-profit running that, not an automated robot working for a large probably-evil company...
> I don't think the Internet should be run by being on special lists
People are reacting as if this list is some kind of overbearing way of tracking what people do on the web - it's almost the opposite of that. It's worth clarifying this is just a suffix list for user-hosted content. It's neither a list of user-hosted domains nor a list of safe websites generally - it's just suffixes for a very small specific use-case: a company providing subdomains. You can think of this as a registry of domain sub-letters.
For instance:
- GitHub.io is on the list but GitHub.com is not - GitHub.com is still considered safe
- I self-host an Immich instance on my own domain name - my Immich instance isn't flagged & I don't need to add anything to the list because I fully own the domain.
The specific instance is just for Immich themselves who fully own "immich.cloud" but sublet subdomains under it to users.
> if you are going to have a "safe sites" list
This is not a safe sites list! This is not even a sites list at all - suffixes are not sites. This also isn't even a "safe" list - in fact it's really a "dangerous" list for browsers & various tooling to effectively segregate security & privacy contexts.
Google is flagging the Immich domain not because it's missing from the safe list but because it has legitimate dangers & it's missing from the dangerous list that informs web clients of said dangers so they can handle them appropriately.
Firefox and Safari also use the list, at least by default; I think you can turn it off in Firefox. And on the whole, I think it is valuable to have _a_ list of known-unsafe sites. And note that Safe Browsing is a blocklist, not an allowlist.
The problem is that at least some of the people maintaining this list seem to be a little trigger happy. And I definitely think Google probably isn't the best custodian of such a list, as they have obvious conflicts of interest.
> I think it is valuable to have _a_ list of known-unsafe sites
But this is not that list because sites are added using opaque automated processes that are clearly not being reviewed by humans - even if those sites have been removed previously after manual review.
I've coined the phrase "Postel decentralization" to refer to things where people expect there to be some distributed consensus mechanism but it turned out that the design of the internet was to email Jon Postel (https://en.wikipedia.org/wiki/Jon_Postel) to get your name on a list. e.g. how IANA was originally created.
Oh god, you reminded me the horrors of hosting my own mailserver and all of the white/blacklist BS you have to worry about being a small operator (it's SUPER easy to end up on the blacklists, and is SUPER hard to get onto whitelists)
> I don't know of a good solution for the second though.
I know the second issue can be a legitimate problem but I feel like the first issue is the primary problem here & the "solution" to the second issue is a remedy that's worse than the disease.
The public suffix list is a great system (despite getting serious backlash here in HN comments, mainly from people who have jumped to wildly exaggerated conclusions about what it is). Beyond that though, flagging domains for phishing for having duplicate content smells like an anti-self-host policy: sure, there are phishers making clone sites, but the vast majority of sites flagged are going to be legit unless you employ a more targeted heuristic, and doing so isn't incentivised by Google's (or most companies') business model.
> When users host an open source self hosted project like immich, jellyfin, etc. on their own domain...
I was just deploying your_spotify and gave it your-spotify.<my services domain> and there was a warning in the logs that talked about this, linking the issue:
The second is a real problem even with completely unique applications. If they have UI portions that have lookalikes, you will get flagged. At work, I created an application with a sign-in popup. Because it's for internal use only, the form in the popup is very basic, just username and password and a button. Safe Browsing continues to block this application to this day, despite multiple appeals.
Even the first one only works if there's no need to have site-wide user authentication on the domain, because you can't have a domain cookie accessible from subdomains anymore otherwise.
I thought this story would be about some malicious PR that convinced their CI to build a page featuring phishing, malware, porn, etc. It looks like Google is simply flagging their legit, self-created Preview builds as being phishing, and banning the entire domain. Getting immich.cloud on the PSL is probably the right thing to do for other reasons, and may decrease the blast radius here.
Please point me to where GoDaddy or any other hosting site mentions public suffix, or where Apple or Google or Mozilla have a listing hosting best practices that include avoiding false positives by Safe Browsing…
>GoDaddy or any other hosting site mentions public suffix
They don't need to mention it because they handle it on behalf of the client. Them recommending best practices like using separate domains makes as much sense as them recommending what TLS configs to use.
>or where Apple or Google or Mozilla have a listing hosting best practices that include avoiding false positives by Safe Browsing…
Since when were those sites the go-to place to learn how to host a site? Apple doesn't offer anything related to web hosting besides "a computer that can run nginx". Google might be the place to ask if you were your aunt and "google" means "internet" to her. Mozilla is the most plausible one because they host MDN, but hosting documentation on HTML/CSS/JS doesn't necessarily mean they offer hosting advice, any more than you'd expect docs.djangoproject.com to contain hosting advice.
Nothing in this article indicates UGC is the problem. It's that Google thinks there's an "official" central immich and these instances are impersonating it.
What malicious UGC would you even deliver over this domain? An image with scam instructions? CSAM isn't even in scope for Safe Browsing, just phishing and malware.
It's not a "service" at all. It's Google maliciously inserting themselves into the browsing experience of users, including those that consciously choose a non-Google browser, in order to build a global web censorship system.
>You might not think it is, but internet is filled utterly dangerous, scammy, phisy, malwary websites
Google is happy to take their money and show scammy ads. Google ads are the most common vector for fake software support scams. Most people google something like "microsoft support" and end up there. Has Google ever banned their own ad domains?
Google is the last entity I would trust to be neutral here.
The argument would work better if Google wasn't the #1 distributor of scams and malware in the world with adsense. (Which strangely isn't flagged by safe browsing, maybe a coincidence)
Is this a rhetorical question? Safari is just a middleman. G offers seemingly free services in exchange for your data and in order to get a market monopoly. Then they can sell you to their advertisers, squeeze out the competition and become the only Sheriff in town. How many free lunches have you gotten in your career?
People working for famous adtech companies don't like it when people like op burst their bubble. I myself don't like it one bit - keep on changing the world you beautiful geniuses!
Exactly! Most of HN users work for "big tech" and are complete sell outs to their corporate overlords. Majority of them are to blame for the current bloated state of the web along with excessive mass surveillance and anti-privacy state we are in
HN is extremely tone-policed. Lines like "holy shit look in a mirror" are likely to attract downvotes because of their form, with no other factors being considered.
It's full of people described in this blog post [1]. As it concludes, GTFO! Flagging is the IRL equivalent of crying to your superior instead of actually having an argument which is pathetic
I asked dang if I was shadowbanned from flagging. He said yes, if I flag something then it doesn't count because I flagged the wrong things in the past.
The conclusion is that flagging isn't really up to user choice, but is up to dang who decides which things should be flagged and which shouldn't. It's a bit like how on Reddit, the only comments you can see are the ones that agree with the moderators of that subreddit.
> Is that actually relevant when only images are user content?
Yes. For instance in circumstances exactly as described in the thread you are commenting in now and the article it refers to.
Services like google's bad site warning system may use it to indicate that it shouldn't consider a whole domain harmful if it considers a small number of its subdomains to be so, where otherwise they would. It is no guarantee, of course.
Well, using the public suffix list _also_ isolates cookies and treats the subdomains as different sites, which may or may not be desirable.
For example, if users are supposed to log in on the base account in order to access content on the subdomains, then using the public suffix list would be problematic.
Cross domain identity management is a little extra work, but it's far from a difficult problem. I understand the objection to needing to do it when a shared cookie is so easy, but if you want subdomains to be protected from each other because they do not have shared responsibility for each other then it makes sense in terms of privacy & security that they don't automatically share identity tokens and other client-side data.
In another comment in this thread, it was confirmed that these PR host names are only generated from branches internal to Immich or labels applied by maintainers, and that this does not automatically happen for arbitrary PRs submitted by external parties. So this isn’t the use case for the public suffix list - it is in no way public or externally user-generated.
What would you recommend for this actual use case? Even splitting it off to a separate domain name as they’re planning merely reduces the blast radius of Google’s false positive, but does not eliminate it.
If these are dev subdomains that are actually for internal use only, then a very reliable fix is to put basic auth on them, and give internal staff the user/password. It does not have to be strong, in fact it can be super simple. But it will reliably keep out crawlers, including Google.
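For what it's worth, the check itself is tiny. A minimal Python sketch of validating an HTTP Basic `Authorization` header against one shared login, assuming a single user/password pair handed to staff; the `is_authorized` helper and the example credentials are hypothetical, not anything Immich actually runs:

```python
import base64
import hmac


def is_authorized(auth_header, user, password):
    """Check an HTTP Basic Authorization header against one shared login."""
    # The scheme name is case-insensitive, so compare it lowercased.
    if not auth_header or auth_header[:6].lower() != "basic ":
        return False
    try:
        decoded = base64.b64decode(auth_header[6:], validate=True).decode("utf-8")
    except ValueError:
        # Covers both malformed base64 and bytes that are not valid UTF-8.
        return False
    expected = f"{user}:{password}"
    # Constant-time comparison, so timing does not leak how much matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))
```

When this returns False, the server would answer 401 with a `WWW-Authenticate: Basic` header so browsers prompt for the login; a crawler without credentials never gets past that response.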
They didn't say that these are actually for internal use only. They said that they are generated either from maintainers applying labels (as a manual human decision) or from internal PR branches, but they could easily be publicly facing code reviews of internally developed versions, or manually internally approved deployments of externally developed but internally reviewed code.
None of these are the kind of automatic user-generated content that the warning is attempting to detect, I think. And requiring basic auth for everything is quite awkward, especially if the deployment includes API server functionality with bearer token auth combined with unauthenticated endpoints for things like built-in documentation.
Browsers already do various levels of isolation based on domain / subdomains (e.g. cookies). PSL tells them to treat each subdomain as if it were a top level domain because they are operated by (leased out to) different individuals / entities. WRT blocking, it just means that if one subdomain is marked bad, it's less likely to contaminate the rest of the domain since they know it's operated by different people.
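To make the isolation above concrete, here is a rough Python sketch of how a client might derive the "site" (registrable domain) from a suffix list. The three-entry `SUFFIXES` set and both helper functions are made up for illustration; the real PSL is far larger and also has wildcard and exception rules that this ignores.

```python
# Toy suffix set; the real Public Suffix List is far larger and also has
# wildcard (*.) and exception (!) rules, which this sketch ignores.
SUFFIXES = {"com", "co.uk", "github.io"}


def registrable_domain(host, suffixes=SUFFIXES):
    """Return the longest matching public suffix plus one more label."""
    labels = host.lower().rstrip(".").split(".")
    # Try candidate suffixes from longest to shortest. Starting at 1 means
    # a bare suffix is never returned as a registrable domain.
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[i - 1:])
    return None  # host is itself a suffix, or no rule matched


def same_site(a, b, suffixes=SUFFIXES):
    """Two hosts share a site when their registrable domains match."""
    ra = registrable_domain(a, suffixes)
    rb = registrable_domain(b, suffixes)
    return ra is not None and ra == rb
```

With this toy list, `www.github.com` and `gist.github.com` land in the same site, while `alice.github.io` and `bob.github.io` do not. Passing a list that contains `cloud` puts two hypothetical preview hosts such as `pr-1.immich.cloud` and `pr-2.immich.cloud` in one site; adding `immich.cloud` to that list splits them apart.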
Marking for cookie isolation makes sense, but could be done more effectively via standardized metadata sent by the first party themselves rather than a centralized list maintained by a third party.
Informing decisions about blocking doesn't make much sense (IMO) because it's little more than a speed bump for an attacker. Certainly every little bit can potentially help but it also introduces a new central authority, presents an additional hurdle for legitimate operators, introduces a number of new failure modes, and in this case seems relatively trivial for a determined attacker to overcome.
This is not about user content, but about their own preview environments! Google decided their preview environments were impersonating... Something? And decided to block the entire domain.
I think this is only true if you host independent entities. If you simply construct deep names about yourself with a demonstrable chain of authority back, I don't think the PSL wants to know. Otherwise there is no hierarchy: the dots are just convenience strings, and it's a flat namespace the size of the PSL.
There is no law appointing that organization as a world wide authority on tainted/non tainted sites.
The fact it's used by one or more browsers in that way is a lawsuit waiting to happen.
Because they, the browsers, are pointing a finger to someone else and accusing them of criminal behavior. That is what a normal user understands this warning as.
Turns out they are wrong. And in being wrong they may well have harmed the party they pointed at, in reputation and / or sales.
It's remarkable how short sighted this is, given that the web is so international. It's not a defense to say some third party has a list, and you're not on it so you're dangerous.
True, and agreed that lawsuits are likely. Disagree that it's short-sighted. The legal system hasn't caught up with internet technology and global platforms. Until it does, I think browsers are right to implement this despite legal issues they might face.
In what country hasn't the legal system caught up?
The point I raise is that the internet is international. There are N legal systems that are going to deal with this. And in 99% of them this isn't going to end well for Google if plaintiff can show there are damages to a reasonable degree.
It's bonkers in terms of risk management.
If you want to make this a workable system you have to make it very clear this isn't necessarily dangerous at all, or criminal. And that a third party list was used, in part, to flag it. And even then you're impeding visitors to a website with warnings without any evidence that there is in fact something wrong.
If this happens to a political party hosting blogs, it's hunting season.
I meant that there is no global authority for saying which websites are OK and which ones are not. So it's not really that the legal systems in specific countries have not caught up.
Lacking a global authority, Google is right to implement a filter themselves. Most people are really really dumb online and if not as clearly "DO NOT ENTER" as now, I don't think the warnings will work. I agree that from a legal standpoint it's super dangerous. Content moderation (which is basically what this is) is an insanely difficult problem for any platform.
Never host your test environments as Subdomains of your actual production domain.
You'll also run into email reputation as well as cookie hell. You can get a lot of cookies from the production env if not managed well.
This. I cannot believe the rest of the comments on this are seemingly completely missing the problem here & kneejerk-blaming Google for being an evil corp. This is a real issue & I don't feel like the article from the Immich team acknowledges it. Far too much passing the buck, not enough taking ownership.
It's true that putting locks on your front door will reduce the chance of your house getting robbed, but if you do get robbed, the fact that your front door wasn't locked does not in any way absolve the thief for his conduct.
Similarly, if an organization deploys a public system that engages in libel and tortious interference, the fact that jumping through technical hoops might make it less likely to be affected by that system does not in any way absolve the organization for operating it carelessly in the first place.
Just because there are steps you can take to lessen the impact of bad behavior does not mean that the behavior itself isn't bad. You shouldn't have to restrict how you use your own domains to avoid someone else publishing false information about your site. Google should be responsible for mitigating false positives, not the website owners affected by them.
First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
I think there's likely a conversation to be had about messaging - Chrome's warning page seems a little scarier than it should be, Firefox's is more measured in its messaging. But in terms of the API service Google are providing here this is absolutely not a false positive.
The rest of your comment seems to be an analogy about people not being responsible for protecting their home or something, I'm not quite sure. If you leave your apartment unlocked when you go out & a thief steals your housemate's laptop, is your housemate required to exclusively focus on the thief or should they be permitted to request you to be more diligent about locking doors?
> First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
Where are you getting that from? I don't see any evidence that there actually was any malicious activity going on on the Immich domain.
> But in terms of the API service Google are providing here this is absolutely not a false positive.
Google is applying heuristics derived from statistical correlations to classify sites. When a statistical indicator is present, but its target variable is not present, that is the very definition of a false positive.
Just because their verbiage uses uncertainty qualifiers like "may" or "might" doesn't change the fact that they are materially interfering with a third party's activities based on presumptive inferences that have not been validated -- and in fact seem to be invalid -- in this particular case.
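To spell out the definition being used here, a small Python sketch of the four outcomes of a binary classifier. The `classify_outcome` name is made up for illustration and says nothing about how Safe Browsing is actually implemented:

```python
def classify_outcome(flagged, actually_malicious):
    """Name the outcome of one classifier decision against ground truth."""
    if flagged and actually_malicious:
        return "true positive"
    if flagged and not actually_malicious:
        # The indicator fired, but the thing it is meant to detect is absent.
        return "false positive"
    if not flagged and actually_malicious:
        return "false negative"
    return "true negative"
```

Under this definition, a site that gets flagged while hosting nothing malicious is a false positive no matter how hedged the warning text is. The disagreement in this thread is really about what the target variable is: actual malicious content, or the mere possibility of hosting it.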
> If you leave your apartment unlocked when you go out & a thief steals your housemate's laptop, is your housemate required to exclusively focus on the thief or should they be permitted to request you to be more diligent about locking doors?
One has nothing to do with the other. The fact that you didn't lock your door does not legitimize the thief's behavior. Google's behavior is still improper here, even if website operators have the option of investing additional time, effort, or money to reduce the likelihood of being misclassified by Google.
> its target variable is not present, that is the very definition of a false positive
The target variable is user hosted content on subdomains of a domain not listed in Mozilla's public suffix list. Firefox & Chrome apply a much stricter set of security settings for domains on that list, due to the inherent dangers of multiuser domains. That variable is present, Immich have acknowledged it & are migrating to a new domain (which they will hopefully add to Mozilla's list).
> The fact that you didn't lock your door does not legitimize the thief's behavior. Google's behavior is still improper here
I made no claims about legitimising the thief's behaviour - only that leaving your door unlocked was negligent from the perspective of your housemate. That doesn't absolve the thief. Just as any malicious actor trying to compromise Immich users would still be the primary offender here, but that doesn't absolve Immich of a responsibility to take application security seriously.
And I don't really understand where Google fits in your analogy? Is Google the thief? It seems like a confusing analogy.
> First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
Judging by what a person from the Immich team said, that does not seem to be true?
So unless one of the developers in the team published something malicious through that system, it seems Google did not have a legitimate reason for flagging it.
Anyone can open a PR. Deploys are triggered by an Immich collaborator labelling the PR, but it doesn't require them to review or approve the code being deployed.
As I've mentioned in several other comments in this thread by now: The whole preview functionality only works for internal PRs, untrusted ones would never even make it to deployment.
The legitimate reason is that the domain is correctly classified as having user generated active content, because the Immich GitHub repo allows anyone to submit arbitrary code via PR, and PRs can be autodeployed to this domain without passing review or approval.
Domains with user generated active content should typically be listed on Mozilla's Public Suffix list, which Firefox & Chrome both check & automatically apply stricter security settings to, to protect users.
A safe browsing service is not a terrible idea (which is why both Safari & Firefox use Google for this) & while I hate that Google has a monopoly here, I do think a safe browsing service should absolutely block your preview environments if those environments have potential dangers for visitors to them & are accessible to the public.
To be clear, the issue here is that some subdomains pose a risk to the overall domain - visiting any increases your risk from others. It's also related to a GitHub workflow that auto-generates new subdomains on demand, so there's no possibility to have a fixed list of known subdomains since new ones are constantly being created.
It is a terrible idea when what is "safe" is determined arbitrarily by a private corporation that is perhaps the biggest source of malicious behavior on the web.
I think my comment came across a bit harsh - the Immich team are brilliant. I've hosted it for a long time & couldn't be happier & I think my criticisms of the tone of the article are likely a case of ignorance rather than any kind of laziness or dismissiveness.
It's also in general a thankless job maintaining any open-source project, especially one of this scale, so a certain level of kneejerk cynical dismissiveness around stuff like this is expected & very forgivable.
Just really hope the ignorance / knowledge-gap can be closed off though, & perhaps some corrections to certain statements published eventually.
.cloud is used to host the map embedded in their webapp.
In fairness, in my local testing so far, it appears to be an entirely unauthenticated/credential-less service so there's no risk to sessions right now for this particular use-case. That leaves the only risk-factors being phishing & deploy environment credentials.
Happened to me last week. One morning we woke up and the whole company website did not work.
No advance notice with time to fix any possible problem, just blocked.
It gave a very bad image to our clients and users, and we had to explain a false positive from Google's detection.
The culprit, according to google search console, was a double redirect on our web email domain (/ -> inbox -> login).
After moving the webmail to another domain, removing one of the redirects just in case, and asking politely 4 times to be unblocked, it took about 12 hours. There was no real recourse, feedback, or any indication of when it would be solved. And no responsibility taken.
The worst part is the feeling of not being in control of your own business, and of depending on a third party that has nothing to do with us, and that made a huge mistake, to let our clients use our platform.
It would be glorious if everybody unjustly screwed by Google did that. Barring antitrust enforcement, this may be the only way to force them to behave.
it wouldn't work. they'd hire some minimum wage person to go to all of them and just read the terms and conditions you agreed to that include language about arbitration or whatever
In all US states corporations may be represented by lawyers in small claims cases. The actual difference is that in higher courts corporations usually must be represented by lawyers whereas many states allow normal employees to represent corporations when defending small claims cases, but none require it.
This is not accurate. I filed a claim against Bungalow in Oregon. They petitioned the judge to allow their in house attorney I was dealing with to represent them. The judge denied the request citing the Oregon statute that attorneys may not participate in small claims proceedings. Bungalow flew out their director of some division who was ill prepared.
Slam dunk. took all of 6-8 hours of my time end to end. The claim was a single page document. Got the max award allowable. Would have got more had it been California.
55.090 Appearance by parties and attorneys; witnesses. (1) Except as may otherwise be provided by ORS 55.040, no attorney at law nor any person other than the plaintiff and defendant shall become involved in or in any manner interfere with the prosecution or defense of the litigation in the department without the consent of the justice of the justice court, nor shall it be necessary to summon witnesses.
I've been thinking for a while that a coordinated and massive action against a specific company by people all claiming damages in small claims court would be a very effective way of bringing that company to heel.
Valve tried this. But there's no class action arbitration. Meaning that instead of a single class action suit, they had thousands of individual arbitration cases and they were actually begging people to sue them instead. So we could just do that. If they want mandatory arbitration they can have mandatory arbitration. From half of us, just in case it doesn't work.
Another idea that's worth investigating are coordinated payment strikes on leveraged companies that offer monthly services like telco companies. A bunch of their customers going "Oops, guess I can't afford to pay this month, gonna have to eat that 2% late fee next month, or maybe the month after that, or maybe the month after that" on a service that won't be disconnected in the first month could absolutely crush a company that requires that monthly income to pay their debt.
I was under the impression that the Supreme Court had ruled that mandatory arbitration clauses were indeed mandatory. Meaning, if you are subject to a mandatory arbitration clause in some contract, it removes ALL ability for a plaintiff to sue a company.
But, good news, it seems like they are walking back on that. They recently ruled that lower courts must "pause" a suit and the suit can resume if an agreement is not made through arbitration.
I believe so. For me it was helpful to visualize getting up and convincing the judge of the damages.
I’d run a PnL, get average daily income from visitors, then claim that loss as damages. In court I’d bring a simple spreadsheet showing the hole in income as evidence of damages.
If there were contractors to help get the site back up I’d claim their payments as damages and include their invoices as evidence.
I’ve probably got about a thousand accounts that use a Gmail account as the associated email / username. I doubt this is uncommon compared to the number of people with custom domains.
> The culprit, according to google search console, was a double redirect on our web email domain (/ -> inbox -> login).
I find it hard to believe that the double redirect itself tripped it: multiple redirects in a row is completely normal—discouraged in general because it hurts performance, but you encounter them all the time. For example, http://foo.example → https://foo.example → https://www.foo.example (http → https, then add or remove www subdomain) is the recommended pattern. And site root to app path to login page is also pretty common. This then leads me to the conclusion that they’re not disclosing what actually tripped it. Maybe multiple redirects contributed to it, a bad learned behaviour in an inscrutable machine learning model perhaps, but it alone is utterly innocuous. There’s something else to it.
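To illustrate how ordinary a two-hop chain is, here is a small offline Python sketch that walks a redirect table and counts hops. The `follow_redirects` helper and the dict-based table are assumptions for illustration; nothing here reflects how Google's classifier actually evaluates redirects.

```python
def follow_redirects(start, redirects, max_hops=10):
    """Return every URL visited, following a {url: next_url} redirect table."""
    chain = [start]
    while chain[-1] in redirects:
        if len(chain) - 1 >= max_hops:
            raise RuntimeError("too many redirects")
        next_url = redirects[chain[-1]]
        if next_url in chain:
            raise RuntimeError("redirect loop")
        chain.append(next_url)
    return chain


# The http -> https -> www pattern from the comment above.
canonical = {
    "http://foo.example": "https://foo.example",
    "https://foo.example": "https://www.foo.example",
}
chain = follow_redirects("http://foo.example", canonical)
hops = len(chain) - 1  # two redirects, same count as / -> inbox -> login
```

The recommended canonicalisation pattern and the webmail chain from the parent comment both come out as exactly two hops, so hop count on its own can't tell them apart.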
Want to see how often Microsoft accounts redirect you? I'd love to see Google block all of Microsoft, but of course that will never happen, because these tech giants are effectively a cartel looking out for each other. At least in comparison to users and smaller businesses.
The reason Google doesn’t block Microsoft isn’t that they’re “looking out for Microsoft.” They’re looking out for themselves by being aware that blocking something that millions of people use would be bad for business.
I suspect you're right... The problem is, and I've experienced this with many big tech companies, you never really get any explanation. You report an issue, and then, magically, it's "fixed," with no further communication.
I'm permanently banned from the Play Store because 10+ years ago I made a third-party Omegle client, called it Yo-megle (neither Omegle nor Yo-megle still exist now), got a bunch of downloads and good ratings, then about 2 years later got a message from Google saying I was banned for violating trademark law. No actual legal action, just a message from Google. I suppose I'm lucky they didn't delete my entire Google account.
I'm beginning to seriously think we need a new internet, another protocol, other browsers just to break up the insane monopolies that have formed, because the way things are going, soon all discourse will be censored and competitors will be blocked.
We need something that's good for small and medium businesses again, local news and get an actual marketplace going - you know what the internet actually promised.
We have a “new internet”. We have the indie web, VPNs, websites not behind Cloudflare, other browsers. You won’t have a large audience, but a new protocol won't fix that.
Also, plenty of small and medium businesses are doing fine on the internet. You only hear about ones with problems like this. And if these problems become more frequent and public, Google will put more effort into fixing them.
I think the most practical thing we can do is support people and companies who fall through the cracks, by giving them information to understand their situation and recover, and by promoting them.
Perhaps we need a different "type" of internet. I don't have the expertise to even explain what this would look like, but I know that if politics, religion, junk science and a hundred other influences have anything to do with it, it will eventually become too stupid to use.
We had a "smart person only internet". Then it became financially prudent to make it an "everyone internet", then we had the dot com boom, Apple, Google, etc bloom from that.
We _still_ have a "smart person only internet" really, it's just now used mostly for drug and weapon sales ( Tor )
For some group of smart people, there will be a group of smarter people who want to dominate the people they designate "the stupids".
The internet was a technological solution to a social problem. It introduced other social problems, although arguably these to your point are old social problems in a new arena.
But there may be yet another technological solution to the old social problems of monopolism, despotic centralized control, and fraud.
The community around NOSTR are basically building a kind of semantic web, where users identities are verified via their public key, data is routed through content agnostic relays, and trustworthiness is verified by peer recommendation.
They are currently experimenting with replicating many types of services which are currently websites as protocols with data types, with the goal being that all of these services can share available data with each other openly.
It's definitely more of a "bazaar" model than a "cathedral" model, with many open questions, and it's also tough to get a good overview of what is really going on there. But at least it's an attempt.
Stop trying to look for technological answers to political problems. We already have a way to avoid excessive accumulation of power by private entities, it's called "anti-trust laws" (heck, "laws" in general).
Any new protocol not only has to overcome the huge incumbent that is the web, it has to do so grassroots against the power of global capital (trillions of dollars of it). Of course, it also has to work in the first place and not be captured and centralised like another certain open and decentralised protocol has (i.e., the Web).
Is that easier than the states doing their jobs and writing a couple pages of text?
States are made of people, both at the decision level and at street level. Many anti-trust laws were made when the decision-makers were not closely tied to the interests involved; nowadays this seems to be changing. At no point, I think, did people at street level ever understand the actual implications.
A structural solution is to educate and lift the whole population to better understand the implications of their choices.
A tactical solution is to try to limit the collusion between decision-makers and private entities, but this does not seem to be going extremely well.
An "evolutionary" solution (that just happens) used to be to have a war - that would push a lot of people to look for efficiency rather than for some interests. But this is made more complex by nukes.
I don't really see how anti-trust would address something like Google Chrome's safe browsing infrastructure.
The problem is that the divide of alignment of interests there is between new, small companies and users. New companies want to put up a website without tripping over one of the thousand unwritten rules of "How to not look like a phishing site or malware depot" (many of which are unwritten because protecting users and exploiting users is a cat-and-mouse game)... And users don't want to get owned.
Shard Chrome off from Google and it still has incentives to protect users at the cost of new companies' ease of joining the global network as a peer citizen. It may have less signal as a result of a curtailed visibility on the state of millions of pages, but the consequence of that is that it would offer worse safe browsing protection and more users would get owned as a result.
Perhaps the real issue is that (not unlike email) joining the web as a peer citizen has just plain gotten harder in the era of bad actors exploiting the infrastructure to cause harm to people.
Like... You know what never has these problems? My blog. It's a static-site-generated collection of plain HTML that updates once in a blue moon via scp. I'm not worried about Google's safe browsing infrastructure, because I never look like a malware site. And if I did trip over one of the unwritten rules (or if attackers figured out how to weaponize something personal-blog-shaped)? The needs of the many justify Chrome warning people before going to my now-shady site.
> The problem is that the divide of alignment of interests there is between new, small companies and users. New companies want to put up a website without tripping over one of the thousand unwritten rules of "How to not look like a phishing site or malware depot" (many of which are unwritten because protecting users and exploiting users is a cat-and-mouse game)... And users don't want to get owned
Some candidate language:
- Monopolistic companies may not actively impose restrictions which harm others (includes businesses)
or
- Some restrictions are allowed, but the company must respond to an appeal of restrictions within X minutes; Appeals to the company can themselves be appealed to a governmental independent board which binds the company with no further review permitted; All delays and unreasonable responses incur punitive penalties as judged by the board; All penalties must be paid immediately
or
- If an action taken unilaterally by a company 1) harms someone AND 2) is automated: Then, that automation must be immediately, totally, and unconditionally reversed upon the unilateral request of the victim. The company may reinstate the action upon the sworn statement of an employee that they have made the decision as a human, and agree to be accountable for the decision. The decision must then follow the above appeals process.
> Monopolistic companies may not actively impose restrictions which harm others (includes businesses)
That's not generally how monopoly is interpreted in the US (although jurisprudence on this may be shifting). In general, the litmus test is consumer harm. A company is allowed to control 99% of the market if they do it by providing a better experience to consumers than other companies can; that's just "being successful." Microsoft ran afoul of antitrust because their browser sucked and embedding it in the OS made the OS suck too; if they hadn't tried to parlay one product into the other they would be unlikely to have run afoul of US antitrust law, and they haven't run afoul of it over the fact that 70-90% of x86 architecture PCs run Windows.
> Some restrictions are allowed, but the company must respond to an appeal of restrictions within X minutes; Appeals to the company can themselves be appealed to a governmental independent board which binds the company with no further review permitted; All delays and unreasonable responses incur punitive penalties as judged by the board; All penalties must be paid immediately
There may be meat on those bones (a general law restricting how browsers may operate in terms of rendering user content). Risky because it would codify into law a lot of ideas that are merely technical specifications (you can look to other industries to see the consequences of that, like how "five-over-ones" are cropping up in cities all over the US because they satisfy a pretty uniform fire and structural safety building code to the letter). But this could be done without invoking monopoly protection.
> If an action taken unilaterally by a company 1) harms someone AND 2) is automated: Then, that automation must be immediately, totally, and unconditionally reversed upon the unilateral request of the victim.
Too broad. It harms me when Google blocks my malware distribution service because I'm interested in getting malware on your machine; I really want your Bitcoin wallet passwords, you see. ;)
Most importantly: this whole topic is independent of monopolies. We could cut Chrome out of Google tomorrow and the exact same issues with safe browsing impeding new sites with malware-ish shapes would exist (with the only change probably being the false positive rate would go up, since a Chrome cut off from Google would have to build out its detection and reporting logic from scratch without relying on the search crawler DB). More importantly, a user can install another browser that doesn't have site protection today (or, if I understand correctly, switch it off). The reason this is an issue is that users like Chrome and are free to use it and tend to find site protection useful (or at least "not a burden to them") and that's not something Google imposed on the industry, it's a consequence of free user choice.
> Too broad. It harms me when Google blocks my malware distribution service because I'm interested in getting malware on your machine; I really want your Bitcoin wallet passwords, you see. ;)
That's okay, a random company failing to protect users from harm is still better than harming an innocent person by accident. They already fail in many cases, obviously we accept a failure rate above 0%. You also skipped over the rest of that paragraph.
> users like Chrome and are free to use it and tend to find site protection useful (or at least "not a burden to them")
That's okay, Google can abide by the proposal I set forth avoiding automated mistaken harms to people. If they want to build this system that can do great harms to people, they need to first and foremost build in safety nets to address those harms they cause, and only then focus on reducing false negatives.
I think there's an unevaluated tension in goals between keeping users safe from malware here and making it easy for new sites to reach people, regardless of whether those sites display patterns consistent with malware distributors.
I don't think we can easily discard the first in favor of the second. Not nearly as categorically as is done here. Those "false negatives" mean users lose things (bank accounts, privacy, access to their computer) through no fault of their own. We should pause and consider that before weeping and rending our garments that yet another hosting provider solution had a bad day.
You've stopped considering monopoly and correctly considered that the real issue is safe browsing, as a feature, is useful to users and disruptive to new business models. But that's independent of Google; that's the nature of sharing a network between actors that want to provide useful services to people and actors that want to cause harm. If I build a browser today, from scratch, that included safe browsing we'd be in the same place and there'd be no Google in the story.
It's very, very hard to overcome the gravitational forces which encourage centralization, and doing so requires rooting the different communities that you want to exist in their own different communities of people. It's a political governance problem, not a technical one.
Companies have economy of scale (Google, for instance, is running dozens to hundreds of web apps off of one well-maintained fabric) and the ability to force consolidation of labor behind a few ideas by controlling salaries so that the technically hard, detailed, or boring problems actually get solved. Open source volunteer projects rarely have either of those benefits.
In theory, you could compete with Google via
- Well-defined protocols
- That a handful of projects implement (because if it's too many, you split the available talent pool and end up with e.g. seven mediocre photo storage apps that are thin wrappers around a folder instead of one Google Photos with AI image search capability).
- Which solve very technically hard, detailed, or boring technical problems (AI image search is an actual game-changer feature; the difference between "Where is that one photo I took of my dog? I think it was Christmas. Which Christmas, hell I don't know" and "Show me every photo of my dog, no not that dog, the other dog").
I'd even risk putting up bullet point four: "And be willing to provide solutions for problems other people don't want solved without those other people working to torpedo your volunteer project" (there are lots of folks who think AI image detection is de-facto evil and nobody should be working on it, and any open source photo app they can control the fate of will fall short of Google's offering for end-users).
Problem is that as soon as some technology gains traction, it catches the attention of businesses, and that is where the slow but steady enshittification process begins. Not that business necessarily equals enshittification, but in a world dominated by capitalism without borders sooner or later someone will break some unwritten rules and others will have to follow to remain competitive, until that new technology becomes a new web, and we'll be back to square one. To me the problem isn't technical, and neither is its solution.
I'm interested to see how this will work with something like Mastodon.
Since Mastodon is, fundamentally, a protocol and reference implementation, people can come up with their own enshittified nodes or clients... And then the rest of the ecosystem can respond by just ignoring that work.
Yes, technically Truth Social is a Mastodon node. My Mastodon node doesn't have to care.
IPFS has been doing some great work around decentralization that actually scales (Netflix uses it internally to speed up container delivery), but a) it's only good for static content, b) things still need friendly URLs, and c) once it becomes the mainstream, bad actors will find a way to ruin it anyway.
These apply to a lot of other decentralized systems too.
It won't get anywhere unless it addresses the issue of spam, scammers, phishing etc. The whole purpose of Google Safe Browsing is to make life harder for scammers.
I own what I think are the key protocols for the future of browsers and the web, and nobody knows it yet. I'm not committed to forking the web by any means, but I do think I have a once-in-a-generation opportunity to remake the system if I were determined to and knew how to remake it into something better.
I'm afraid this can't be built on the current net topology which is owned by the Stupid Money Govporation and inherently allows for roadblocks in the flow of information. Only a mesh could solve that.
But the Stupid Money Govporation must be dethroned first, and I honestly don't see how that could happen without the help of an ELE like a good asteroid impact.
It will take the same amount of time or less to get to where we are with the current Web.
What we have is the best simulation environment for seeing how things shape up. So fixing it should be the aim; avoiding it will put us on a similar spiral. We'll just go in circles.
I don't know, it is a lot of effort for a decade of fresh air. After that you will see the same policies implemented, because people will look to how the problem was solved in the past.
The one thing I never understood about these warnings is how they don't run afoul of libel laws. They are directly calling you a scammer and "attacker". The same for Microsoft with their unknown executables.
They used to be more generic, saying "We don't know if it's safe", but now they are quite assertive in stating that you are indeed an attacker.
"The people living at this address might be pedophiles and sexual predators. Not saying that they are, but if your children are in the vicinity, I strongly suggest you get them back to safety."
Still asserts that in that house there may be sexual predators. If I lived in that house I wouldn't be happy, and I would want a way of clearing the accusations and proving that there are indeed no sexual predators in my house quicksmart before other people start avoiding it.
You can’t possibly use the “they use the word ‘might’” argument and not mention the death red screen those words are printed over. If you are referring to abidance to the law, you are technically right. If we remove the human factor, you technically are.
I worked for a company who had this happen to an internal development domain, not exposed to the public internet. (We were doing security research on our own software, so we had a pentest payload hosted on one of those domains as part of a reproduction case for a vulnerability we were developing a fix for.)
Our lawyers spoke to Google's lawyers privately, and our domains got added to a whitelist at Google.
It depends, if it's a clear-cut case, then in jurisdictions with a functioning legal system it can be feasible to sue.
Likewise, if it's a fuckup that just needs to be put in front of someone who cares, a lawsuit is actually a surprisingly effective way of doing that. This moves your problem from "annoying customer support interaction that's best dealt with by stonewalling" into "legal says we HAVE to fix this".
Imagine if you bought a plate at Walmart and any time you put food you bought elsewhere on it, it turned red and started playing a warning about how that food will probably kill you because it wasn't Certified Walmart Fresh™
Now imagine it goes one step further, and when you go to eat the food anyway, your Walmart fork retracts into its handle for your safety, of course.
No brand or food supplier would put up with it.
That's what it's like trying to visit or run non-blessed websites and software coming from Google, Microsoft, etc on your own hardware that you "own".
This is the future. Except you don't buy anything, you rent the permission to use it. People from Walmart can brick your carrots remotely even when you don't use this plate, for your safety ofc
> The one thing I never understood about these warnings is how they don't run afoul of libel laws. They are directly calling you a scammer and "attacker"
Being wrong doesn't count as libel.
If a company has a detection tool, makes reasonable efforts to make sure it is accurate, and isn't being malicious, you'll have a hard time making a libel case
There is a truth defence to libel in the USA but there is no good faith defence. Think about it like a traffic accident, you may not have intended to drive into the other car but you still caused damage. Just because you meant well doesn't absolve you from paying for the damages.
If the false positive rate is consistently 0.0%, that is a surefire sign that the detector is not effective enough to be useful.
If a false positive is libel, then any useful malware detector would occasionally do libel. Since libel carries enormous financial consequences, nobody would make a useful malware detector.
I am skeptical that changing the wording in the warning resolves the fundamental tension here. Suppose we tone it down: "This executable has traits similar to known malware." "This website might be operated by attackers."
Would companies affected by these labels be satisfied by this verbiage? How do we balance this against users' likelihood of ignoring the warning in the face of real malware?
The problem is that it's so one sided. They do what they want with no effort to avoid collateral damage and there's nothing we can do about it.
They could at least send a warning email to the RFC2142 abuse@ or hostmaster@ address with a warning and some instructions on a process for having the mistake reviewed.
The first step in filing a libel lawsuit is demanding a retraction from the publisher. I would imagine Google's lawyers respond pretty quickly to those, which is why SafeBrowsing hasn't been similarly challenged.
This may not be a huge issue depending on mitigating controls but are they saying that anyone can submit a PR (containing anything) to Immich, tag the pr with `preview` and have the contents of that PR hosted on https://pr-<num>.preview.internal.immich.cloud?
Doesn't that effectively let anyone host anything there?
I think only collaborators can add labels on github, so not quite. Does seem a bit hazardous though (you could submit a legit PR, get the label, and then commit whatever you want?).
Exposure also extends not just to the owner of the PR but anyone with write access to the branch from which it was submitted. GitHub pushes are ssh-authenticated and often automated in many workflows.
It's the result of failures across the web, really. Most browsers started using Google's phishing site index because they didn't want to maintain one themselves but wanted the phishing resistance Google Chrome has. Microsoft has SmartScreen, but that's just the same risk model but hosted on Azure.
Google's eternal vagueness is infuriating but in this case the whole setup is a disaster waiting to happen. Google's accidental fuck-up just prevented "someone hacked my server after I clicked on pr-xxxx.imiche.app" because apparently the domain's security was set up to allow for that.
You can turn off safe browsing if you don't want these warnings. Google will only stop you from visiting sites if you keep the "allow Google to stop me from visiting some sites" checkbox enabled.
I really don't know how they got nerds to think scummy advertising is cool. If you think about it, the thing they make money on - no user actually wants ads or wants to see them, ever. Somehow Google has some sort of nerd cult where people think it's cool to join such an unethical company.
If you ask, the leaders in that area of Google will tell you something like "we're actually HELPING users because we're giving them targeted ads that are for the things they're looking for at the time they're looking for it, which only makes things for the user better." Then you show them a picture of YouTube ads or something and it transitions to "well, look, we gotta pay for this somehow, and at least it's free, and isn't free information for all really great?"
It's super simple. Check out all the Fediverse alternatives. How many people that talk a big game actually financially support those services? 2% maybe, on the high end.
Things cost money, and at a large scale, there's either capitalism, or communism.
Google's services, especially their free services, are never really free. It's just that the price tag is so well hidden that ordinary users really believe this. But the HN audience is more technical than that and they see through the smokescreen.
Except for those that are making money off ads directly or indirectly, and who believe in their god-given right to my attention and my data.
> I'm increasingly blown away by takes on here that are so dramatic and militant about things that barely even register to most people.
Things 'barely even registering to most people' is not as strong a position as you may think it is. Oxygen barely registers to most people. But take it away and they register it just fine (for a short while). The 'regular' people that you know have been steadily conditioned to an ever worsening experience to the point that they barely recognize the websites they visit when seeing the web with an adblocker for the first time.
> It's just that the price tag is so well hidden that ordinary users really believe this.
And if they die believing that, what price did they really pay? I don't think the difference mostly comes down to a lack of knowledge or understanding, but more a difference of care or assigned value. There are a lot of smart people on HN, but with that often comes exaggerated anxieties and paranoias. If most people don't give a crap about giving their data to Google or allowing the big bad advertisements to penetrate their feeble minds or whatever, vociferously beating that drum just amounts to old-man-yelling-at-cloud-esque FUD.
> Things 'barely even registering to most people' is not as strong a position as you may think it is.
I understand that logically that is neither here nor there, it was more just an expression of exasperation. It's kind of like how I'm equally blown away by how much energy some people put into anti-abortion laws. It's like, ok, everyone can have their opinions, and there's plenty of reasonable discussion to be had, but to put so much negative energy into something that's like, is this really the battle that's worth this much outrage right now? There are literally genocides and violent deportations going on around us. Google are not the bad guys.
Also, I don't use any kind of ad blocker. There are definitely lots of ad-infested unusable experiences out there but Google products are generally among the classiest and most unobtrusive.
The people that put effort into anti-abortion laws are usually trying to force their view of how other people should live onto those other people.
I block ads out of my life because I am easily distracted and have seen the internet go from a great place to a billboard that continuously screams at me for my attention. It's pure self-preservation, I don't begrudge you your 45 minutes of advertising time per day at all.
They created the largest spying instrument in the world that creates hidden profiles (that can never be deleted) documenting web activity, psychological state, medications, etc, etc for billions of people - and have been caught multiple times sharing data with governments (they're probably compromised internally anyway). I would categorize that as unethical. But yeah, you can cheer for the scraps they throw out.
>about things that barely even register to most people.
News flash: This whole website is about things that don't register to most people. It's called hacker news FFS.
In any case, I think a trillion dollar company probably doesn't need defending. They can easily tweak their algorithm to bury this type of stuff; after all this opinion is probably not "relevant" or "useful" to most people.
As of today, only Google Maps has no real competitor on Android. Otherwise, it is possible to drop Google and even get better services. Brands are difficult to compete with.
You're right but I hate that you're right. The only part I disagree with is
>I think they all are pretty happy with the deal and would not switch to a paid ad-free version.
If they were given a low-friction option to pay the advertising price for these services I think a lot would choose it. Advertising pays almost nothing per person. Almost every person could pay more than what it earns to serve them an ad. To use a service ad-free for a year would cost less than $1 per user. This differs by platform, obviously, with stuff like YouTube being far more expensive, but for day-to-day stuff the cost is low.
The open internet is done. Monopolies control everything.
We have an iOS app in the store for 3 years and out of the blue apple is demanding we provide new licenses that don’t exist and threaten to kick our app out. Nothing changed in 3 years.
Getting sick of these companies able to have this level of control over everything, you can’t even self host anymore apparently.
> We have an iOS app in the store for 3 years and out of the blue apple is demanding we provide new licenses that don’t exist and threaten to kick our app out.
I'm fighting this right now on my own domain. Google marked my family Immich instance as dangerous, essentially blocking access from Chrome to all services hosted on the same domain.
I know that I can bypass the warning, but the photo album I sent to my mother-in-law is now effectively inaccessible.
Unless I missed something in the article this seems like a different issue. The article is specifically about the domain "immich.cloud". If you're using your own domain, I'd check to ensure it hasn't actually been compromised by a botnet or similar in some way you haven't noticed.
It may well be a false positive of Google's heuristics but home server security can be challenging - I would look at ruling out the possibility of it being real first.
It certainly sounds like a separate root issue to this article, even if the end result looks the same.
Just in case you're not sure how to deal with it, you need to request a review via the Google Search Console. You'll need a Google account and you have to verify ownership of the domain via DNS (if you want to appeal the whole domain). After that, you can log into the Google Search Console and you can find "Security Issues" under the "Security & Manual Actions" section.
That area will show you the exact URLs that got you put on the block list. You can request a review from there. They'll send you an email after they review the block.
Hopefully that'll save you from trying to hunt down non-existent malware on a half dozen self-hosted services like I ended up doing.
It's a bit ironic that a user installing immich to escape Google's grip ends up having to create again a Google account to be able to remove their Google account.
Reviews via Google Search Console are pointless because they won't stop the same automated process from flagging the domain again. Save your time and get your lawyer to draft a friendly letter instead.
Add a custom "welcome message" in Server Settings (https://my.immich.app/admin/system-settings?isOpen=server) to make your login page look different compared to all other default Immich login pages.
This is probably the easiest non-intrusive tweak to work around the repeated flagging by Safe Browsing, still no 100% guarantee.
I agree that strict access blocking (with extra auth or IP ACL) can work better. Though I've seen in this thread https://news.ycombinator.com/item?id=45676712 and over the Internet that purely internal/private domains get flagged too. Can it be some Chrome + G Safe Browsing integration, e.g. reporting hashes of visited pages?
Immich is a great software package, and I recommend it. Sadly, Google can still flag sites based on domain name patterns, blocking content behind auth or even on your LAN.
That probably wouldn't work, I get hit with Chrome's red screen of annoyance regularly with stuff only reachable on my LAN. I suspect the trigger is that the URLs are like [product name].home.[mydomain.com].
I'm actually already avoiding this issue but for another reason: hackers will scan subdomains matching known products with known vulnerabilities, so hosting a Wordpress behind "wordpress.domain.tld" will get you way more ill-intentioned requests than "tbyehl.domain.tld".
Thus if I started hosting my Immich instance, I would probably put it behind "pxl.domain.tld" or something like that.
Not a guarantee to pass the Google purity test, but, according to some reports, it would avoid raising some red flags.
I love Immich & greatly appreciate the amazing work the team put into maintaining it, but between the OP & this "Cursed Knowledge" page, the apparent team culture of shouting from the rooftops complaints that expose their own ignorance about technology is a little concerning to be honest.
I've now read the entire Cursed Knowledge list & - while I found some of them to be invaluable insights & absolutely love the idea of projects maintaining a public list of this nature to educate - there are quite a few red flags in this particular list.
Before mentioning them: some excellent & valuable, genuinely cursed items: Postgres NOTIFY (albeit adapter-specific), npm scripts, bcrypt string lengths & especially the horrifically cursed Cloudflare fetch: all great knowledge. But...
> Secure contexts are cursed
> GPS sharing on mobile is cursed
These are extremely sane security features. Do we think keeping users secure is cursed? It honestly seems crazy to me for them to have published these items in the list with a straight face.
> PostgreSQL parameters are cursed
Wherein their definition of "cursed" is that PG doesn't support running SQL queries with more than 65535 separate parameters! It seems to me that any sane engineer would expect the limit to be lower than that. The suggestion that making an SQL query with that many parameters is normal seems problematic.
> JavaScript Date objects are cursed
Javascript is zero-indexed by convention. This one's not a huge red flag but it is pretty funny for a programmer to find this problematic.
> Carriage returns in bash scripts are cursed
Non-default local git settings can break your local git repo. This isn't anything to do with bash & everyone knows git has footguns.
If git didn't have this setting, then after checking out a bash file with LFs in it, there are many Windows editors that would not be able to edit that file properly. That's a limitation of those editors & nobody should be using those pieces of software to edit bash files. This is a problem that is entirely out of scope for a VCS & not something Git should ever have tried to solve.
In fact, having git solve this disincentivizes Windows editors from solving it correctly.
You will have the same problem if you build a Linux container image using scripts that were checked out on the windows host machine. What's even more devious is that some editors (at least VS Code) will automatically save .sh files with LF line endings on Windows, so the problem doesn't appear for the original author, only someone who clones the repo later. I spent probably half a day troubleshooting this a while back. IMO it's not the fault of any one tool, it's just a thing that most people will never think about until it bites them.
TL;DR - if your repo will contain bash scripts, use .gitattributes to make sure they have LF line endings.
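For what it's worth, the `.gitattributes` rule people usually reach for here is `*.sh text eol=lf`. If you want to catch the problem in CI or a pre-build step as well, something like the following would do it. This is only a minimal sketch; the helper names `has_crlf` and `to_lf` are made up for illustration and are not part of git or any other tool.

```python
def has_crlf(data: bytes) -> bool:
    """Return True if the script bytes contain any Windows-style CRLF line ending."""
    return b"\r\n" in data


def to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to LF, leaving all other bytes untouched."""
    return data.replace(b"\r\n", b"\n")


# A script as it might look after a checkout with CRLF conversion enabled.
script = b"#!/bin/bash\r\necho hello\r\n"

print(has_crlf(script))  # True
print(to_lf(script))     # b'#!/bin/bash\necho hello\n'
```

Working on raw bytes rather than decoded text avoids Python's own newline translation hiding the carriage returns.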
> JavaScript date objects are 1 indexed for years and days, but 0 indexed for months.
This mix of 0 and 1 indexing in calendar APIs goes back a long way. I first remember it coming from Java but I dimly recall Java was copying a Taligent Calendar API.
Huh. Maybe? I don't want that information available to apps to spy on me. But I do want full file contents available to some of them.
And wait. Uh oh. Does this mean my Syncthing-Fork app (which itself would never strike me as needing location services) might have my phone's images' location be stripped before making their way to my backup system?
EDIT: To answer my last question: My images transferred via Syncthing-Fork on a GrapheneOS device to another PC running Fedora Atomic have persisted the GPS data as verified by exiftool. Location permissions have not been granted to Syncthing-Fork.
Happy I didn't lose that data. But it would appear that permission to your photo files may expose your GPS locations regardless of the location permission.
With the Nextcloud app I remember having to enable full file permissions to preserve the GPS data of auto-uploaded photos a couple of years ago. Which I only discovered some months after these security changes went into effect on my phone. That was fun. I think Android 10 or 11 introduced it.
Looking now I can't even find that setting anymore on my current phone. But the photos still do have the GPS data intact.
I think the “cursed” part (from the developers' point of view) is that some phones do that, some don’t, and if you don’t have both kinds available during testing, you might miss something?
Yep, and it's there for very good reasons. However if you don't know about it, it can be quite surprising and challenging to debug.
Also it's annoying when your phone's permissions optimiser runs and removes the location permissions from e.g. Google Photos, and you realise a few months later that your photos no longer have their location.
There is never a good reason to permanently modify my files, if that is what is going on here. Seems like I wouldn't be able to search my photos by location reliably if that data was stripped from them.
What happens is that when an application without location permissions tries to get photos, the corresponding OS calls strip the geo location data when passing them. The original photos still have it, but the application doesn't, because it doesn't have access to your location.
This was done because most people didn't know that photos contain their location, and people got burned by stalkers and scammers.
It's not if it silently alters the file.
i do want GPS data for geolocation, so that when i import the images in the right places they are already placed where they should be on the map
Every kind of permission should fail the same way, informing the user about the failure, and asking if the user wants to give the permission, deny the access, or use dummy values. If there's more than one permission needed for an operation, you should be able to deny them all, or use any combination of allowing or using dummy values.
And permissions should also not be so wide. You should be able to give permission to the GPS data in pictures you consciously took without giving permission to track your position whenever.
As it says, bulk inserts with large datasets can fail. Inserting a few thousand rows into a table with 30 columns will hit the limit. You might run into this if you were synchronising data between systems or running big batch jobs.
SQLite used to have a limit of 999 query parameters, which was much easier to hit. It's now a roomy 32k.
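For anyone who hasn't hit this yet, here is a minimal sketch of the usual workaround: split the rows so that each multi-row INSERT stays under the bind-parameter limit. It uses Python's built-in sqlite3 so it runs anywhere. The function name, the table `t`, and the `max_params=999` default (mirroring the old SQLite limit mentioned above) are illustrative assumptions; substitute whatever limit your driver and database actually enforce.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, table, columns, rows, max_params=999):
    """Insert rows via multi-row INSERTs, keeping each statement under max_params.

    `table` and `columns` are assumed to be trusted identifiers (they are
    interpolated into the SQL); only the values are bound as parameters.
    """
    rows_per_stmt = max(1, max_params // len(columns))
    row_placeholder = "(" + ", ".join("?" * len(columns)) + ")"
    it = iter(rows)
    statements = 0
    while True:
        batch = list(islice(it, rows_per_stmt))
        if not batch:
            break
        sql = (
            f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
            + ", ".join([row_placeholder] * len(batch))
        )
        # Flatten [(a, b), (c, d)] into [a, b, c, d] to match the placeholders.
        params = [value for row in batch for value in row]
        conn.execute(sql, params)
        statements += 1
    conn.commit()
    return statements


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (a INTEGER, b INTEGER, c INTEGER)")
rows = [(i, i * 2, i * 3) for i in range(1000)]
n_statements = insert_in_batches(conn, "t", ["a", "b", "c"], rows)
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

With 3 columns and a 999-parameter budget, each statement carries at most 333 rows, so 1000 rows go out in 4 statements. Committing per batch instead of once at the end would let you resume midway, at the cost of atomicity.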
In the past I've used batches of data, inserted into a separate table with all the constraints turned off and using UNNEST, and then inserted into the final table once it was done. We ended up both batching the data and using UNNEST because it was faster but it still let us resume midway through.
We probably should have been partitioning the data instead of inserting it twice, but I never got around to fixing that.
COPY is likely a better option if you have access to the host, or provider-specific extensions like aws_s3 if you have those. I'm sure a data engineer would be able to suggest a better ETL architecture than "shove everything into postgres", too.
Was MERGE too slow/expensive? We tend to MERGE from staging or temporary tables when we sync big data sets. If we were on postgres I think we'd use ... ON CONFLICT, but MERGE does work.
> PostgreSQL USER is cursed
> The USER keyword in PostgreSQL is cursed because you can select from it like a table, which leads to confusion if you have a table name user as well.
SQL's "feature" of having table and field names in the same syntactic namespace as an ever expanding set of english language keywords is the original eldritch curse behind it all.
> JavaScript date objects are 1 indexed for years and days, but 0 indexed for months.
I don't disagree that months should be 1-indexed, but I would not make that assumption solely based on days/years being 1-indexed, since 0-indexing those would be psychotic.
The only reason I can think of to 0-index months is so you can do monthName[date.getMonth()] instead of monthName[date.getMonth() - 1].
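To make the trade-off concrete, here is a small sketch in Python rather than JavaScript. The helper `get_month_zero_indexed` is hypothetical and only mimics what `getMonth()` returns; Python's own `date.month` is 1-indexed, which is exactly the case that needs the `- 1`.

```python
from datetime import date

MONTH_NAMES = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def get_month_zero_indexed(d):
    """Hypothetical stand-in for JavaScript's getMonth(): January is 0."""
    return d.month - 1


d = date(2025, 3, 14)

# Zero-indexed month: usable directly as a list index.
name_from_zero_indexed = MONTH_NAMES[get_month_zero_indexed(d)]

# One-indexed month (Python's native convention): needs the "- 1".
name_from_one_indexed = MONTH_NAMES[d.month - 1]
```

Both lookups give the same name; the only difference is where the subtraction lives.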
I don't think adding counterintuitive behavior to your data to save a "- 1" here and there is a good idea, but I guess this is just legacy from the ancient times.
A [StackOverflow thread](https://stackoverflow.com/a/41992352) about this interface says it was introduced by Java way back in 1995, and copied by the first JavaScript implementation.
Why so? Months in written form also start with 1, same as days/years, so it would make sense to match all of them.
For example, the first day of the first month of the first year is 1.1.1 AD (at least for Gregorian calendar), so we could just go with 0-indexed 0.0.0 AD.
Dark-grey text on black is cursed. (Their light theme is readable.)
Also, you can do bulk inserts in postgres using arrays. Take a look at unnest. Standard bulk inserts are cursed in every database, I'm with the devs here that it's not worth fixing them in postgres just for compatibility.
A friend / client of mine used some kind of WordPress type of hosting service with a simple redirect. The host got on the bad sites list.
This also polluted their own domain, even when the redirect was removed, and had the odd side effect that Google would no longer accept email from them. We requested a review and passed it, but the email blacklist appears to be permanent. (I already checked and there are no spam problems with the domain.)
We registered a new domain. Google’s behaviour here incidentally just incentivises bulk registering throwaway domains, which doesn’t make anything any better.
My general policy now is to confine important email to a very, very basic website that you rigidly control the hosting over and just keep static sites on.
Us nerds *really* need to come together in creating a publicly owned browser (non chromium)
Surely among us devs, as we realize app stores are increasingly hostile, that the open web is worth fighting for, and that we have the numbers to build solutions?
Firefox should be on that list. It's clearly a lot closer in functionality to Chrome/Chromium than Servo or Ladybird, so it's easier to switch to it. I like that Servo and Ladybird exist and are developing well, but there's no need to pretend that they're the only available alternatives.
Majority of users are on mobile now, and Firefox mobile sucks ass. I cannot bring myself to use it. Simple things like clicking the home button should take you to homepage, but Firefox opens a new tab. It's so stupid.
I use Firefox Mobile Nightly on Android and appreciate it for the dark mode extension and ad blocking. There are some issues but the benefits outweigh them for me.
I don't even have a Home button that I can see, I must have turned it off in settings? I describe my tab count using scientific notation, though, so I'd be a "new tab" guy, anyway. But I'd also be a proponent of it being configurable.
i think it's great and syncs well with my computer's firefox. i think there should be a setting to choose how to open homepage but i don't mind the extra tabs really.
Funded to the tune of a half billion dollars a year by Google to pretend there's no monopoly, and multiple announcements of them trying to reimagine themselves as an ad-company. They're the best of a bad bunch but they are definitely still part of a bad bunch
Your second point, like their much-criticised (especially on HN) attempts at diversification, is them trying to fight your first point.
Because they're so reliant on Google funding, they're trying to do whatever they can to find alternative revenue streams. Damned if you do, damned if you don't, especially for the HN crowd.
This is #1 on HN for a while now and I suspect it's because many of us are nervous about it happening to us (or have already had our own homelab domains flagged!).
So is there someone from Google around who can send this along to the right team to ensure whatever heuristic has gone wrong here is fixed for good?
I doubt Google the corporation cares one bit, and any individual employees who do care would likely struggle against the system to cause significant change.
The best we all can do is to stop using Google products and encourage our friends and family to do likewise. Make sure in our own work that we don't force others to rely on Google either.
We really need an internet Bill of Rights. Google has too much power to delete your company from existence with no due process or recourse.
If any company controls some (high) percentage of a particular market, say web browsers, search, or e-commerce, or social media, the public's equal access should start to look more like a right and less like an at-will contract.
30 years ago, if a shop had a falling out with the landlord, it could move to the next building over and resume business. Now if you annoy eBay, Amazon or Walmart, you're locked out nationwide. If you're an Uber, Lyft, or Doordash (etc) gig worker and their bots decide they don't like you anymore, then sayonara sucker! Your account has been disabled, have a nice day and don't reapply.
Our regulatory structure and economies of scale encourage consolidation and scale and grant access to this market to these businesses, but we aren't protecting the now powerless individuals and small businesses who are randomly and needlessly tossed out with nobody to answer their pleas of desperation, no explanation of rules broken, and no opportunity to appeal with transparency.
I know someone with a small business that applied for Venmo Business account (which is the main payment method in their community industry) and Venmo refused to open the account and didn't provide any reason as to why saying that they have the right to choose to refuse providing the service, which they do. But all the competitors of that business in the area do have a Venmo and take payment this way so it is basically a revenue loss for that person.
It's a bit frustrating when a company becomes a major player in an industry and can have a life and death sentence on other businesses.
There are alternative payment methods, but people are used to paying a certain way in that industry/area. Similarly, there are other browsers, but people are used to Chrome.
Same thing with Paypal - I opened a business account, was able to do one transaction and was shut down for fraud. I tested a donation to myself. Under $10. Lifetime ban.
That’s not unique to PayPal. Pretty much any payment processor that detects a proprietor paying themselves is going to throw up a red flag for circular cash flow fraud and close the account. Bank-operated payment processors are often slower to catch it, but they will also boot you for this.
with real payment processors you also just call them on the phone and they fix it. That's not a real problem. we do test orders on many go-lives per year and never see this. Yes there are sandboxes, but you always gotta test real transactions by the end.
My bank displays me a popup warning me to check who I'm sending money to every time I make a transfer. If I've made that same transfer before, after showing that, it's also telling me that it won't ask for 2FA for this transfer, because I've made it so many times before.
High quality or even medium quality software and UX is getting harder and harder to find.
I sold some camera equipment on eBay once. PayPal flagged my account as fraudulent, asked for a receipt for the equipment which I did not have (I bought it years before), so they banned my account indefinitely.
Randomly, years later, they turned it back on. Thanks, I guess?
We need to fix the jurisprudence around anti-trust.
> No person engaged in commerce or in any activity affecting commerce shall acquire, directly or indirectly, the whole or any part of the stock or other share capital and no person subject to the jurisdiction of the Federal Trade Commission shall acquire the whole or any part of the assets of another person engaged also in commerce or in any activity affecting commerce, where in any line of commerce or in any activity affecting commerce in any section of the country, the effect of such acquisition may be substantially to lessen competition, or to tend to create a monopoly.
Taken at face value, that would forbid companies from buying any large competitors unless the competitor is already failing. Somehow that got watered down into almost nothing.
The issue is that current law around monopolies defines them from the wrong angle.
Instead of taking a consumer-centric / competition perspective, they should be defined in terms of market share (with markets broadly defined from a consumer perspective).
>10% = some minimal interoperability and reporting requirements
>25% = serious interoperability requirements
>35% = severe and audited interoperability requirements, with a method for gaps to be proposed by competitors, with the end goal of making increasing market share past this point difficult
Close the "but it's free to consumers (because we monetize them in other ways)" loophole that every 90s+ internet business used: instead focus on ensuring competition as measured by market share.
well exactly, Verizon, Amazon, etc all LOVE more regulation. they have armies of lawyers who not only help in constructing the laws, they help pass, lobby and implement them. then the same law firms help amazon, verizon, etc execute it.
It's regulatory capture
now a small competitor wants to do something like get into the wifi game and they're looking at huge fixed fees to get started.
I think 00s+ tech history has demonstrated that the free market is no longer sufficient to promote healthy competition.
Partly a consequence of the biggest tech firms getting bigger.
And partly because of newfound technical ability to achieve mass lock-in (e.g. vendor-controlled encryption, TPMs, vertical integration in platforms, first-party app stores, etc).
The 'but regulatory capture' counter argument rings hollow when the government has given the market a lighter monopoly regulatory touch... and we've ended up with a more concentrated, less competitive market than when it was more heavily regulated.
If you want to make electronics with any complexity, you'll suddenly discover that you need to pay patent fees. And those come as a fixed share of your revenue. Add enough complexity and you can easily be required to pay more than 100% of your revenue in fees.
Looks like market share is not a concern anymore: when one participant adopts a dark pattern others follow, because then the consumer has nowhere to go. What did Orwell call it, collectivist oligarchy?
In 2025 you can use Beeper (or run your own local Matrix server with the opensource bridges) and get the same result with WhatsApp, Signal, Telegram, Discord, Google Messages, etc. etc.
That's always been the case. Jailbreaking your phone is also breaking TOS. Sideloading apps on iPhone by using the developer features is breaking TOS. Almost anything that gives a corporation less money or control over you is against that corporation's TOS. That's not the law, though, and we need to grow a collective spine.
... It's a lot easier to have a spine about risking getting banned from a service if getting banned from that service wouldn't destroy your life.
Well it did have to change its name from GAIM to Pidgin at some point because it infringed on "AIM" by AOL.
And whether or not Pidgin was fully "TOS-compliant" (which it might have been depending on the service we'd be looking at) is not as relevant as whether these terms would have been actually legally enforceable or not.
But seriously; the internet is now overrun with AI Slop, Spam, and automated traffic. To try to do something about it requires curation, somebody needs to decide what is junk, which is completely antithetical to open protocols. This problem is structurally unsolvable, there is no solution, there's either a useless open internet or a useful closed one. The internet is voting with Cloudflare, Discord, Facebook, to be useful, not open. The alternative is trying to figure out how to run a decentralized dictatorship that only allows good things to happen; a delusion.
The only other solution is accountability, a presence tied to your physical identity; so that an attacker cannot just create 100,000 identities from 25,000 IP addresses and smash your small forum with them. That's an even less popular idea, even though it would make open systems actually possible. Building your own search engine or video platform would be super easy, barely an inconvenience. No need for Cloudflare if the police know who every visitor is. No need for a spam filter, if the government can enforce laws perfectly.
Take a look at email, the mother of all open protocols (older than HTTP). What happened? Radical recentralization to companies that had effective spam management, and now we on HN complain we can't break through, someone needs to do something about that centralization, so that we can go back to square one where people get spammed to death again, which will inevitably repeat the discretion required -> who has the best discretion -> flee there cycle. Go figure.
I run an email server with no specific spam filter. Sometimes I get spam. Then I add a filter on my end to delete it and move on. It's nowhere near as bad as people proclaim. Neither is deliverability, for that matter, even after I forgot to set an SPF record and some random internet server sent a bunch of spam on my behalf (which I know because I got the bounces).
You have a dirt path to your house and are therefore convinced the interstate highway system should allow direct residential driveways.
Gmail processes 376 billion emails per day. At that volume, even 0.1% spam getting through is 376 million messages. However, we're not talking about 0.1%, but 45.6% of email being spam globally. For Gmail, that's 171 billion spam messages daily. Congrats that your private server works at your scale. It's completely irrelevant, and only works because bad actors don't care about it.
Imagine though, if we even accepted spam culturally and handled it individually, as per your solution. That would mean spam can get through with brute force, which it can't right now, meaning that 45.6% would probably explode closer to 90%, 95%, or more overnight. It's only manageable at 45.6% for you because Gmail's spam filters are working overtime harming the economics.
Why should curation be centralized? We do not need a "decentralized dictatorship" (what would that even be? that's antithetical) and we certainly do not need a centralized one. It seems crazy that your solution to AI, spam, and "automated traffic" (I don't know what that is, I assume web crawlers and such) is that the police control every single transaction.
First off, we can simply let the user, or client software, choose. Why should we let centralized servers do that by default?
At scale, DNS is somewhat centralized but authorities are disconnected from internet providers and web browsers. They're the best actors to regulate this.
For mail, couldn't we come up with a mail-DNS, that authenticates senders? There could be different limits based on whether you are an individual or a company, and whether you're sending 10'000 emails or just 100.
Regardless of whether these are good solutions -- why jump to extreme ones? "TINA" is not a helpful argument, it's a slogan.
I have no knowledge of DANE but its reliance on DNSSEC makes me worried that it would be difficult for people to adopt it.
Also, I think it solves a different problem: it prevents spoofing/MITM but what about legitimate certificates? We would still need CAs that actually curate their customers and hold them accountable. And we would need email servers/clients to differentiate between strict CAs and ones that are used solely for encryption purposes.
I don't know that DNS should be applied to emails as is anyway but I find it could force spammers to operate with publicly available information which would make holding them accountable easier.
So the solution to AI slop and spam is end of anonymity and total state control of the internet? Talk about the cure being worse than the disease.
The issues with today's internet stem specifically from the centralisation of power in the hands of Google, Apple and the social networks.
Bad search results? Blame Google's monopoly incentivising them intentionally making their results worse.
Difficulty promoting or finding events? Blame Facebook's real revenue model - preventing one to many communications by default and charging for exceptions.
AI overrun with slop? Blame OpenAI and Facebook, both of whom are actively promoting and profiting from the creation of slop.
Automated traffic slowing down sites? It's often the AI companies indexing and reindexing hundreds of times.
Spam? Not a huge issue for anyone that I'm aware of.
The closed internet platforms are the problem. Forcing them to relinquish control over handsets, data and our interpersonal connections is the solution. It will be legislative, or it will be torches to the data centres, likely both. But it is coming.
> The issues with today's internet stem specifically from the centralisation of power in the hands of Google and the social networks.
> Bad search results? Blame Google's monopoly incentivising them intentionally making their results worse.
> Difficulty promoting or finding events? Blame Facebook's real revenue model - preventing one to many communications by default and charging for exceptions.
You're misdiagnosing what happened here. These aren't diseases. These are symptoms that the more open internet, that we had in the early 2000s, completely failed at scale. The disease was the predictable failure of an open system to self-moderate, the symptoms the centralization that followed. You're mistaking effect for cause.
People started using Google, because it was the only tool good enough at digging through manure. Facebook started charging for mass communication, because otherwise, everyone has an excuse why they need to use it. Cloudflare became popular, because the internet didn't care when 40% of traffic was bots, half of them malicious, before AI was even on the scene. And so on.
The open system failed, and was becoming unusable. Big Tech arrived offering proprietary solutions as CPR. They didn't cause the death.
> To try to do something about it requires curation, somebody needs to decide what is junk, which is completely antithetical to open protocols.
The counter-example, of course, is email. SpamAssassin figured this out 24 years (!) ago. There is zero reason you couldn't apply similar heuristics to detect AI-slop or whatever particular kind of content you don't want to accept.
> Radical recentralization to companies that had effective spam management
A. SpamAssassin has never been tested at Gmail scale, and would likely fail in such a scenario.
B. SpamAssassin is benefiting from centralized players, like Gmail, harming spam's economics. You're a free rider from the onslaught that would occur if spamming actually worked. Spam is at 45.6% of email globally with aggressive spam filters, but could easily double, triple, quadruple in volume if filters started failing even moderately. Weaker filters, and we'll start seeing the email DDoS for the first time.
C. Heuristics on AI Content? What are you going to do, run an "AI Detector" model on a GPU for every incoming email? 376 billion of them every day to Gmail alone? This only makes the email DDoS even more likely.
D. Lazy = 99%+ of global computer users - and that's changing as soon as everyone becomes their own paramedic. If you can't convince most people to learn how to save other people's lives, and probably didn't bother yourself, despite it being disproportionately more important, you're never teaching them technical literacy.
I think you misunderstand what I'm getting at. SpamAssassin is older than Gmail. It's an old example, much newer and better spam-filtering-at-scale solutions exist (although SA is still maintained). Trying to claim that only the big boys can filter spam is an uninformed opinion.
No, you don't need an AI model to detect AI content (lmao). Heuristics already exist, and you see people mention them online all the time -- excessive use of lists, em dashes, common phrases, etc. Yes, a basic text heuristic scorer from the 1980s can pick these up without much difficulty. The magic of auto-learning heuristics (which have also existed since the 1980s, and performed fine at scale with less processing power than your smartwatch) is you can train them on whatever content you don't want to receive: marketing, political content, etc. You can absolutely apply this to whatever content suits your fancy, and it doesn't really take any more effort than moving messages you want filtered out to a Junk folder or similar.
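As a rough illustration of the "auto-learning heuristics" idea, here is a toy two-class naive Bayes scorer in Python. It is a sketch only, not how SpamAssassin or any production filter is implemented; the class name, the labels `junk`/`keep`, and the training sentences are all made up for the example.

```python
import math
import re
from collections import Counter


class TinyBayesFilter:
    """Toy naive Bayes text scorer with add-one smoothing."""

    def __init__(self):
        self.word_counts = {"junk": Counter(), "keep": Counter()}
        self.doc_counts = {"junk": 0, "keep": 0}

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z']+", text.lower())

    def train(self, text, label):
        """Learn from one message the user filed as 'junk' or 'keep'."""
        self.word_counts[label].update(self.tokenize(text))
        self.doc_counts[label] += 1

    def junk_log_odds(self, text):
        """Positive means 'looks like junk', negative means 'looks fine'."""
        vocab = set(self.word_counts["junk"]) | set(self.word_counts["keep"])
        total_docs = sum(self.doc_counts.values())
        score = 0.0
        for label, sign in (("junk", 1), ("keep", -1)):
            # Smoothed class prior.
            log_p = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                # The extra +1 reserves mass for unseen words and avoids a
                # zero denominator on an untrained filter.
                log_p += math.log((count + 1) / (total_words + len(vocab) + 1))
            score += sign * log_p
        return score


f = TinyBayesFilter()
f.train("limited time offer click here to claim your free prize", "junk")
f.train("act now exclusive offer free gift click here", "junk")
f.train("meeting notes attached see you at the standup tomorrow", "keep")
f.train("can you review my patch before the release tomorrow", "keep")

junk_score = f.junk_log_odds("claim your free offer now")
keep_score = f.junk_log_odds("patch review notes for tomorrow")
```

Calling `train` each time a message is moved into or out of a Junk folder is the whole "learning" loop; the per-message cost is a handful of dictionary lookups and logarithms.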
FWIW in some jurisdictions you might be able to sue them for tortious interference, which basically means they went out of their way to hurt your business.
I see a lot of comments here about using some browser that will allow ME to see sites I want to see, but I did not see a lot about how do I protect my site or sites of clients from being subjected to this. Is there anything proactive that can be done? A set of checks almost like regression testing? I understand it can be a bit like virus builders using anti virus to test their next virus. But is there a set of best practices that could give you higher probability of not being blocked?
> how do I protect my site or sites of clients from being subjected to this. Is there anything proactive that can be done?
Some steps to prevent this happening to you:
1. Host only code you own & control on your own domain. Unless...
2. If you have a use-case for allowing arbitrary users to publish & host arbitrary code on a domain you own (or subdomains of it), then ensure that domain is a dedicated one, separate from the ones you use for your own code, so that it can't be confused with your own hosted content.
3. If you're allowing arbitrary members of the public to publish arbitrary code for preview/testing purposes on a domain you own - have the same separation in place for that domain as mentioned above.
4. If you have either of the above two use-cases, publish that separated domain on the Mozilla Public Suffix list https://publicsuffix.org/
That would protect your domains from being poisoned by arbitrary publishing, but wouldn't it risk all your users being affected by one user publishing?
Allowing user publishing is an inherent risk - these are good mitigations but nothing will ever be bulletproof.
The main issue is protecting innocent users from themselves - that's a hard one to generalise solutions to & really depends on your publishing workflows.
Beyond that, the last item (Public Suffix list) comes with some decent additional mitigations as an upside - the main one being that Firefox & Chrome both enable more restrictive cookie settings while browsing any domains listed in the public suffix list.
---
All that said - the question asked in the comment at the top of the thread wasn't about protecting users from security risk, but protecting the domain from being flagged by Google. The above steps should at least do that pretty reliably, barring an actual legitimate hack occurring.
A good takeaway is to separate different domains for different purposes.
I had previously been tossing up the pros/cons of this (such as teaching the user to accept millions of arbitrary TLDs as official), but I think this article (and other considerations) has solidified it for me.
The biggest con of this is that to a user it will seem much more like phishing.
It happened to me a while ago that I suddenly got emails from "githubnext.com". Well, I know Github and I know that it's hosted at "github.com". So, to me, that was quite obviously phishing/spam.
This is such a difficult problem. You should be able to buy a “season pass” for $500/year or something that stops anyone from registering adjacent TLDs.
And new TLDs are coming out every day which means that I could probably go buy microsoft.anime if I wanted it.
This is what trademarks are supposed to do, but it’s reactive and not proactive.
PayPal is a real star when it comes to vague, fake-sounding, official domains.
Real users don't care much about phishing as long as you got redirected from the main domain, though. github.io has been accepted for a long time, and githubusercontent.com is invisible 99% of the time. Plus, if your regular users are not developers and still end up on your dev/staging domains, they're bound to be confused regardless.
Maybe a dumb question but what constitutes user-hosted-content?
Is a notion page, github repo, or google doc that has user submitted content that can be publicly shared also user-hosted?
IMO Google should not be able to use definitive language "Dangerous website" if its automated process is not definitive/accurate. A false flag can erode customer trust.
The definition of "active code" is broad & sometimes debatable - e.g. do old MySpace websites count - but broadly speaking the best way of thinking about it is in terms of threat model, & the main two there are:
- credential leakage
- phishing
The first is fairly narrow & pertains to uploading server side code or client javascript. If Alice hosts a login page on alice.immich.cloud that contains some session handling bugs in her code, Mallory can add some code to mallory.immich.cloud to read cookies set on *.immich.cloud to compromise Alice's logins.
The second is much broader as it's mostly about plausible visual impersonation, so it will also cover cases where users can only upload CSS or HTML.
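The cookie part of that threat model can be sketched in a few lines of Python. This is a simplified model of how a browser uses a public suffix list when deciding whether to accept a cookie's Domain attribute; it ignores the real list's wildcard and exception rules, and both suffix sets below are hypothetical (nothing here claims what the real list contains).

```python
def public_suffix(host, suffixes):
    """Return the longest listed suffix matching host, else its last label."""
    labels = host.lower().split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            return candidate
    return labels[-1]


def may_set_cookie(host, cookie_domain, suffixes):
    """True if `host` may set a cookie scoped to `cookie_domain`."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    # The host must be the cookie domain itself or a subdomain of it.
    if host != cookie_domain and not host.endswith("." + cookie_domain):
        return False
    # A cookie scoped to a public suffix would be shared across unrelated
    # sites, so it is rejected.
    return cookie_domain != public_suffix(cookie_domain, suffixes)


# Hypothetical suffix sets, for illustration only.
WITHOUT_ENTRY = {"com", "cloud", "co.uk"}
WITH_ENTRY = WITHOUT_ENTRY | {"immich.cloud"}

shared_before = may_set_cookie("mallory.immich.cloud", "immich.cloud", WITHOUT_ENTRY)
shared_after = may_set_cookie("mallory.immich.cloud", "immich.cloud", WITH_ENTRY)
own_host = may_set_cookie("mallory.immich.cloud", "mallory.immich.cloud", WITH_ENTRY)
```

Under these assumptions, listing the parent domain as a suffix is what stops one subdomain from planting cookies that every sibling subdomain receives, while each subdomain can still set cookies for itself.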
Specifically in this case what Immich is doing here is extremely dangerous & this post from them - while I'll give them the benefit of the doubt on being ignorant - is misinformation.
It may be dangerous but it is an established pattern. There are many cases (like Cloudflare Pages) of others doing the same, hosting strangers' sites on subdomains of a dedicated domain (pages.dev for Cloudflare, immich.cloud for Immich).
By preventing newcomers from using this pattern, Google's system is flawed, severely stifling competition.
It is, but this established pattern is well standardised & documented by the public suffix list project. There are generally two conventions followed for this pattern:
1. Use a separate dedicated domain (Immich didn't do this - they're now switching to one in response to this)
2. List the separate dedicated domain in the public suffix list. As far as I can tell Immich haven't mentioned this.
> what Immich is doing here is extremely dangerous
You fully misunderstand what content is hosted on these sites. It's only builds from internal branches by the core team, there is no path for "external user" content to land on this domain.
>> Unfortunately, Google seems to have the ability to arbitrarily flag any domain and make it immediately unaccessible to users. I'm not sure what, if anything, can be done when this happens, except constantly request another review from the all mighty Google.
Perhaps a complaint to the FTC for abusing the monopoly and lack of due process to harm legitimate business? Or DG COMP (in the EU).
Gathering evidence of harm and seeking alliances with other open-source projects could build momentum.
Looking forward to Louis Rossmann's reaction. Wouldn't be surprised if this leads to a lawsuit over monopolistic behavior - this is clearly abusing their dominant position in the browser space to eliminate competitors in photo sharing.
He's a right-to-repair activist Youtuber who is quite involved in GrayJay, another app made by this company, which is a video player client for other platforms like YouTube.
I'm not sure why his reaction would be relevant, though. It'll just be another rant about how Google has too much control like he's done in the past. He may be right, but there's nothing new to say.
He wasn't just involved with GrayJay, he's actually a member of FUTO - the company behind Immich and GrayJay. Now read grandparent comment one more time:
> Wouldn't be surprised if this leads to a lawsuit over monopolistic behavior
His reaction also matters because he's basically the public face for the company on YouTube and has a huge following. You've probably seen a bunch of social media accounts with the "clippy" character as their avatar. That's a movement started by Louis Rossmann.
I write a couple of libraries for creating GOV.UK services and Google has flagged one of them as dangerous. I've appealed the decision several times but it's like screaming into a void.
I use Google Workspace for my company email, so that's the only way for me to get in contact with a human, but they refuse to go off script and won't help me contact the actual department responsible in any way.
It's now on a proper domain, https://govuk-components.x-govuk.org/ - but other than moving, there's still not much anyone can do if they're incorrectly targeted.
Google is not the only one marking subdomains under netlify.app dangerous. For a good reason though, there's a lot of garbage hosted there. Netlify also doesn't do a good enough job of taking down garbage.
Given the scale of Google, and the nerdiness required to run Immich, I bet it's just an accident. Nevertheless, I'm very curious as to how senior Google staff look at Immich: are they actually registering signals that people use immich-go to empty their Google Photos accounts? Do they see this as something potentially dangerous to their business in the long term?
The nerdsphere has been buzzing with Immich for some time now (I started using it a month back and it lives up to its reputation!), and I assume a lot of Googlers are in that sphere (but not necessarily pro-Google/anti-Immich of course). So I bet they at least know of it. But do they talk about it?
I love Immich but the entire design and interface is so clearly straight up copied from Google photos. It makes me a bit nervous about their exposure, legally.
I think the other very interesting thing in the reddit thread[0] for this is that if you do well-known-domain.yourdomain.tld then you're likely to get whacked by this too. It makes sense I guess. Lots of people are probably clicking gmail.shady.info and getting phished.
Can I use this space to comment on how amazing Immich is? I self host lots of stuff, and there’s this one tier above everything else that’s currently, and exclusively, held by Home Assistant and Immich. It is actually _better_ than Google photos (if you keep your db and thumbs on ssd, and run the top model for image search). You give up nothing, and own all your data.
I think I found it because it was recommended by Immich as the best, but it still only took a day or two to run against my 5 thousand assets. I’ve tested it against whatever Google is using (I keep a part of my library on Google Photos), and it’s far better.
I’m also self-hosting Gitea and Portainer and I’m hitting this issue every few weeks. I appeal, they remove the warning, and after a week it is back. This has been going on for at least 4 years. I have more than 20 appeals, all successfully removing the warning. Ridiculous. I heard legal action is the best option now, any other ideas?
Safe Browsing collects a lot of data, such as hashes of URLs (URLs can be easily decoded by comparison) and probably other interactions with the web like downloads.
But how effective is it in malware detection?
The benefits seem to me dubious. It looks like a feature offered to collect browsing data, useful to maybe 1% in special situations.
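For anyone wondering what "hashes of URLs" means in practice, here is a rough sketch of a hash-prefix lookup. It is heavily simplified and rests on assumptions: as far as I understand the public protocol description, the real service canonicalizes URLs first and confirms a prefix hit against full hashes on the server, none of which is modeled here. The function names and the example hosts are made up for illustration.

```python
import hashlib


def url_expressions(host: str, path: str) -> list[str]:
    """Return simplified host-suffix / path-prefix combinations for a URL.

    Assumption: only the exact path and "/" are tried, and host suffixes
    stop at two labels. The real rules are more detailed.
    """
    labels = host.lower().split(".")
    hosts = [".".join(labels[i:]) for i in range(max(1, len(labels) - 1))]
    paths = [path, "/"] if path != "/" else ["/"]
    return [h + p for h in hosts for p in paths]


def hash_prefix(expression: str, length: int = 4) -> bytes:
    """First `length` bytes of the SHA-256 digest of one expression."""
    return hashlib.sha256(expression.encode("utf-8")).digest()[:length]


def might_be_listed(host: str, path: str, local_prefixes: set[bytes]) -> bool:
    """True if any expression of the URL matches a locally stored prefix."""
    return any(hash_prefix(e) in local_prefixes for e in url_expressions(host, path))


# Hypothetical local list containing a single entry for a bare domain.
local = {hash_prefix("example.com/")}
print(might_be_listed("photos.example.com", "/login", local))
print(might_be_listed("other.org", "/", local))
```

Note how an entry for the bare domain also matches every subdomain in this sketch, which is at least consistent with the "whole domain gets flagged" reports in this thread.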
It's the only thing that has reasonable coverage to effectively block a phishing attack or malware distribution. It can certainly do other things like collecting browsing data, but it does get rid of long-lasting persistent garbage hosted at some bulletproof hosts.
I’ve heard anecdotes of people using an entirely internal domain like “plex.example.com” even if it’s never exposed to the public internet, google might flag it as impersonating plex. Google will sometimes block it based only on name, if they think the name is impersonating another service.
It's unclear exactly what conditions cause a site to get blocked by safe browsing. My nextcloud.something.tld domain has never been flagged, but I’ve seen support threads of other people having issues and the domain name is the best guess.
I'm almost positive GMail scanning messages is one cause. My domain got put on the list for a URL that would have been unknowable to anyone but GMail and my sister who I invited to a shared Immich album. It was a URL like this that got emailed directly to 1 person:
Then suddenly the domain is banned even though there was never a way to discover that URL besides GMail scanning messages. In my case, the server is public so my siblings can access it, but there's nothing stopping Google from banning domains for internal sites that show up in emails they wrongly classify as phishing.
Think of how Google and Microsoft destroyed self hosted email with their spam filters. Now imagine that happening to all self hosted services via abuse of the safe browsing block lists.
if it was just the domain, remember that there is a Cert Transparency log for all TLS certs issued nowadays by valid CAs, which is probably what Google is also using to discover new active domains
It doesn’t seem like email scanning is necessary to explain this. It appears that simply having a “bad” subdomain can trigger this. Obviously this heuristic isn’t working well, but you can see the naive logic of it: anything with the subdomain “apple” might be trying to impersonate Apple, so let’s flag it. This has happened to me on internal domains on my home network that I've exposed to no one. This also has been reported at the jellyfin project: https://github.com/jellyfin/jellyfin-web/issues/4076
That's not going to be gleaned from a CT log or guessed randomly. The URL was only transmitted once to one person via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail.
I've read almost everything linked in this post and on Reddit and, with what you pointed out considered, I'd say the most likely thing that got my domain flagged is having a redirect to a default styled login page.
The thing that really frustrates me if that's the case is that it has a large impact on non-customized self-hosted services and Google makes no effort to avoid the false positives. Something as simple as guidance for self-hosted apps to have a custom login screen to differentiate from each other would make a huge difference.
Of course, it's beneficial to Google if they can make self-hosting as difficult as possible, so there's no incentive to fix things like this.
Well, that's potentially horrifying. I would love for someone to attempt this in as controlled of a manner as possible. I would assume it's possible for anyone using Google DNS servers to also trigger some type of metadata inspection resulting in this type of situation as well.
Also - when you say banned, you're speaking of the "red screen of death" right? Not a broader ban from the domain using Google Workspace services, yeah?
> Also - when you say banned, you're speaking of the "red screen of death" right?
Yes.
> I would love for someone to attempt this in as controlled of a manner as possible.
I'm pretty confident they scanned a URL in GMail to trigger the blocking of my domain. If they've done something as stupid as tying GMail phishing detection heuristics into the safe browsing block list, you might be able to generate a bunch of phishy looking emails with direct links to someone's login page to trigger the "red screen of death".
This reminds me of another post where a scammer sent a gmail message containing https://site.google.com/xxx link to trick users into clicking, but gmail didn't detect the risk.
I'm kind of curious, do you have your own domain for immich or is this part of a malware-flagged subdomain issue? It's kind of wild to me that Google would flag all instances of a particular piece of self-hosted software as malicious.
- A self-hosted project has a demo instance with a default login page (demo.immich.app, demo.jellyfin.org, demo1.nextcloud.com) that is classified as "primary" by google's algorithms
- Any self-hosted instance with the same login page (branding, title, logo, meta html) becomes a candidate for deceptive/phishing by their algorithm. And immich.cloud has a lot of preview envs falling in that category.
BUT in Immich's case, its _demo_ login page has its own big banner, so it is already quite different from others.
Maybe there's no "original" at all. The algorithm/AI just got lost among thousands of identically looking login pages and now considers every other instance as deceptive...
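To make the theory above concrete, a minimal sketch of how such a heuristic could misfire. This is pure guesswork about the mechanism, not Google's actual algorithm: the `LoginPage` fields, the fingerprint scheme, and the `primaries` table are all assumptions for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class LoginPage:
    host: str
    title: str
    logo_bytes: bytes
    meta_description: str


def fingerprint(page: LoginPage) -> str:
    """Hash the branding of a page (title, logo, meta), ignoring the host."""
    h = hashlib.sha256()
    for part in (page.title.encode(), page.logo_bytes, page.meta_description.encode()):
        # Length-prefix each field so adjacent fields cannot blur together.
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()


def looks_deceptive(page: LoginPage, primaries: dict[str, str]) -> bool:
    """Flag a page whose branding matches a known 'primary' on another host."""
    original_host = primaries.get(fingerprint(page))
    return original_host is not None and original_host != page.host


demo = LoginPage("demo.immich.app", "Immich", b"<logo>", "Self-hosted photos")
primaries = {fingerprint(demo): demo.host}

self_hosted = LoginPage("photos.example.org", "Immich", b"<logo>", "Self-hosted photos")
customized = LoginPage("photos.example.org", "Family Photos", b"<logo>", "Self-hosted photos")

print(looks_deceptive(self_hosted, primaries))  # identical branding, other host
print(looks_deceptive(customized, primaries))   # custom title changes the fingerprint
```

Under these assumptions every stock self-hosted instance is indistinguishable from a phishing clone of the demo, while an instance with a customized title is not.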
I'm guessing Google's phishing analysis must be going off the rails seeing all of these login prompts saying "immich" when there's an actual immich cloud product online.
If I were tasked with automatically finding phishing pages, I too would struggle to find a solution to differentiate open-source, self-hosted software from phishing pages.
I find it curious that this is happening to Immich so often while none of my own self-hosted services have ever had this problem, though. Maybe this is why so many self-hosted tools have you configure a name/descriptor/title/whatever for your instance, so they can say "log in to <my amazing photo site>" rather than "log in to Product"? Not that Immich doesn't offer such a setting.
Tangential to the flagging issue, but is there any documentation on how Immich is doing the PR site generation feature? That seems pretty cool, and I'd be curious to learn more.
I’m curious about basically all of it. It seems like such a powerful tool.
I seem to have irritated the parallel commenters tremendously by asking, but it seemed implausible I’d understand the design considerations by just skimming the CI config.
Top of mind would be:
1. How do y'all think about mitigating the risk of somebody launching malicious or spammy PR sites? Is there a limiting factor on whose PRs trigger a launch?
2. Have you seen resource constraint issues or impact to how PRs are used by devs? It seems like Immich is popular enough that it could easily have a ton of inflight PR dev (and thus a ton of parallel PR instances eating resources)
3. Did you borrow this pattern from elsewhere / do you think the current implementation of CI hooks into k8s would be generalizable? I’ve seen this kind of PR preview functionality in other repos that build assets (like CLI tools) or static content (like docs sites), but I think this is the first time I’ve seen it for something that’s a networked service.
1. It only works at all for internal PRs, not for forks. That is a limitation we'd like to lift if we could figure out a way to do it safely though.
2. It's running on a pretty big machine, so I haven't seen it approach any limits yet. We also only create an instance when requested (with a PR label).
3. I've of course been inspired by other examples, but I think the current pattern is mostly my own, if largely just one of the core uses of the flux-operator ResourceSet APIs [1]. It's absolutely generalizable - the main 'loop' [2] just templates whatever Kubernetes resources based on the existence of a PR, you could put absolutely anything in there.
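For readers who want the shape of that 'loop' without reading the CI config, here is a toy sketch in Python. It is not the Immich setup and does not use the flux-operator API: the `PullRequest` type, the resource dictionaries, and the `preview.example.test` domain are hypothetical. Only the two rules come from the answers above: fork PRs are skipped, and an instance is created only when the PR carries the "preview" label.

```python
from dataclasses import dataclass, field


@dataclass
class PullRequest:
    number: int
    from_fork: bool
    labels: set[str] = field(default_factory=set)


def preview_resources(prs: list[PullRequest], base_domain: str = "preview.example.test") -> list[dict]:
    """Template one set of resources per eligible PR (hypothetical shape)."""
    resources = []
    for pr in prs:
        # Rule 1: internal branches only. Rule 2: opt-in via the label.
        if pr.from_fork or "preview" not in pr.labels:
            continue
        name = f"pr-{pr.number}"
        resources.append({"name": name, "host": f"{name}.{base_domain}"})
    return resources


prs = [
    PullRequest(101, from_fork=False, labels={"preview"}),
    PullRequest(102, from_fork=True, labels={"preview"}),
    PullRequest(103, from_fork=False),
]
print(preview_resources(prs))
```

When a PR closes or loses the label it simply drops out of the list, so whatever reconciles the output would remove its resources.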
Sometimes it is also rude to ask without looking in the obvious place first. It signals that "my" time is more precious than "your" time, so I let you do that check for me if I can use someone else's time.
I think we might have hit the inflection point where being rude is more polite. It's not that I want people to be rude to me, it's that I don't want to talk to AI when I intend to be talking to a person, and anyone engaging with me via AI is infinitely more disrespectful than any curse word or rudeness.
These days, when I get a capitalized, grammatically correct sentence — and proper punctuation to boot, there is an unfortunate chance it was written using an AI and I am not engaging fully with a human.
its when my covnersation partner makes human mistakes, like not capitalizing things, or when they tell me i'm a bonehead, that i know i'm talking to a real human not a bot. it makes me feel happier and more respected. i want to interact with humans dammit, and at this point rude people are more likely to be human than polite ones on the internet.
i know you can prompt AIs to make releaistic mistakes too, the arms race truly never ends
Pretty sure Immich is on github, so I assume they have a workflow for it, but in case you're interested in this concept in general, gitlab has first-class support for this which I've been using for years: https://docs.gitlab.com/ci/review_apps/ . Very cool and handy stuff.
This happened to one of our documentation sites. My co-workers all saw it before I did, because Brave (my daily driver) wasn't showing it. I'm not sure if Brave is more relaxed in determining when a site is "dangerous" but I was glad not to be seeing it, because it was a false positive.
Not sure if this is exactly the scenario from the discussed article but it's interesting to understand it nonetheless.
TL;DR the browser regularly downloads a dump of color profile fingerprints of known bad websites. Then when you load whatever website, it calculates the color profile fingerprint of it as well, and looks for matches.
(This could be outdated and there are probably many other signals.)
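A toy version of that idea, for illustration only. The bucket count, the histogram-intersection score, and the 0.9 threshold are my own assumptions, and the pixel data is invented; the real client-side detection may work quite differently.

```python
def color_fingerprint(pixels: list[tuple[int, int, int]], levels: int = 4) -> dict:
    """Quantize each RGB channel into `levels` buckets and return a normalized histogram."""
    step = 256 // levels
    counts: dict[tuple[int, int, int], int] = {}
    for r, g, b in pixels:
        key = (r // step, g // step, b // step)
        counts[key] = counts.get(key, 0) + 1
    total = len(pixels)
    return {k: v / total for k, v in counts.items()}


def similarity(a: dict, b: dict) -> float:
    """Histogram intersection: close to 1.0 for matching profiles, 0.0 for disjoint ones."""
    return sum(min(a.get(k, 0.0), b.get(k, 0.0)) for k in set(a) | set(b))


def matches_known_bad(pixels, known_bad: list[dict], threshold: float = 0.9) -> bool:
    """Compare the page's fingerprint against a downloaded list of bad fingerprints."""
    fp = color_fingerprint(pixels)
    return any(similarity(fp, bad) >= threshold for bad in known_bad)


# Invented data: a mostly white page with a blue button, and a near-identical clone.
known = [color_fingerprint([(255, 255, 255)] * 90 + [(0, 0, 200)] * 10)]
clone = [(250, 250, 250)] * 88 + [(5, 5, 210)] * 12
unrelated = [(200, 0, 0)] * 100

print(matches_known_bad(clone, known))
print(matches_known_bad(unrelated, known))
```

A fingerprint this coarse cannot tell a phishing clone from a legitimate second install of the same software, which fits the false positives people describe here.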
I had this same problem with my self-hosted Home Assistant deployment, where Google marked the entire domain as phishing because it contains a login page that looks like other self-hosted Home Assistant deployments.
Fortunately, I expose it to the internet on its own domain despite running through the same reverse proxy as other projects. It would have sucked if this had happened to a domain used for anything else, since the appeal process is completely opaque.
This can happen to everyone. It happened to Amazon.de's Cloudfront endpoint a week ago. Most people didn't notice because Chrome doesn't look at the intermediate bits in the resolver chain, but DNS providers using Safe Browsing blocked it.
Yes, this is not a new problem: Web browsers have taken on the role of internet police, but they only care about their own judgement and don't afford website operators any due process or recourse. And by web browsers I mean Google because of course everyone just defers to them. "File a complaint with /dev/null" might be how Google operates their own properties but this should not be acceptable for the web as a whole. Google and those integrating their "solutions" need to be held accountable for the damage they cause.
> There is a user in the JavaScript community who goes around adding "backwards compatibility" to projects. They do this by adding 50 extra package dependencies to your project, which are maintained by them.
This is crazy, it happened to the SoGO webmailer, standalone or bundled with the mailcow: dockerized stack as well. They implemented a slight workaround where URLs are being encrypted to avoid pattern detection to flag it as "deceiving".
There has been no response from Google about this. I had my instance flagged 3 times on 2 different domains including all subdomains, displaying a nice red banner on a representative business website. Cool stuff!
Google often marks my homelab domains as dangerous which all point to an A record that is in the private IP space, completely inaccessible to the internet.
The .internal.immich.cloud sites do not have matching certs!
Navigating to https://main.preview.internal.immich.cloud, I'm right away informed by the browser that the connection is not secure due to an issue with the certificate. The problem is that it has the following CN (common name): main.preview.internal.immich.build. The list of alternative names also contains that same domain name. It does not match the site: the certificate's TLD .build is different from the site's .cloud!
I don't see the same problem on external sites like tiles.immich.cloud. That has a CN=immich.cloud with tiles.immich.cloud as an alternative.
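The check that fails here can be sketched in a few lines. This is a simplified model of hostname matching, not a real TLS library: it assumes the client compares the hostname against the certificate's alternative names and only honors a wildcard in the leftmost label. The `*.example.com` names are illustrative.

```python
def name_matches(pattern: str, hostname: str) -> bool:
    """Compare one certificate name against a hostname, label by label."""
    p = pattern.lower().rstrip(".").split(".")
    h = hostname.lower().rstrip(".").split(".")
    if len(p) != len(h):
        return False
    if p[0] == "*":
        # A wildcard stands in for exactly one leftmost label.
        return p[1:] == h[1:]
    return p == h


def cert_covers(hostname: str, subject_alt_names: list[str]) -> bool:
    """True if any alternative name on the certificate matches the hostname."""
    return any(name_matches(name, hostname) for name in subject_alt_names)


# The mismatch described above: a .build certificate served on a .cloud host.
print(cert_covers("main.preview.internal.immich.cloud",
                  ["main.preview.internal.immich.build"]))
# The external site, whose certificate lists the host as an alternative name.
print(cert_covers("tiles.immich.cloud", ["immich.cloud", "tiles.immich.cloud"]))
```

The first call reports no match because the final label differs, which is exactly the warning the browser shows.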
We still don't know what caused it because it happened to the Cloudflare R2 subdomain, and none of the Search Console verification methods work with R2. It also means it's impossible to request verification.
This happened to me, I hosted a Wordpress site and it got 0'day'd (this was probably 8 years ago). Google spotted the list of insane pornographic URLs and banned it. You might want to verify nothing is compromised.
First thing I do when I start to use a browser for the first time is making sure 'Google Safe Browsing' feature is disabled. I don't need yet another annoyance while I browse the web, especially when it's from Google.
> The most alarming thing was realizing that a single flagged subdomain would apparently invalidate the entire domain.
Correct. It works this way because in general the domain has the rights over routing all the subdomains. Which means if you were a spammer, and doing something untoward on a subdomain only invalidated the subdomain, it would be the easiest game in the world to play.
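A rough sketch of why the blast radius is the whole domain, and why being on the Public Suffix List changes it. This is a simplified lookup under stated assumptions: it ignores the wildcard and exception rules of the real list, the suffix sets are tiny hand-written samples, and the `example.cloud` names are hypothetical.

```python
def registrable_domain(host: str, public_suffixes: set[str]) -> str:
    """Return the longest matching public suffix plus one more label."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):  # longest candidate suffix first
        suffix = ".".join(labels[i:])
        if suffix in public_suffixes:
            if i == 0:
                return suffix  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    # Assumption: with no matching suffix, fall back to the last two labels.
    return ".".join(labels[-2:])


suffixes = {"com", "cloud", "co.uk"}

# Without a list entry, every preview shares one reputation unit.
print(registrable_domain("pr-1.preview.example.cloud", suffixes))
print(registrable_domain("tiles.example.cloud", suffixes))

# With the preview zone listed as a suffix, each preview stands alone.
listed = suffixes | {"preview.example.cloud"}
print(registrable_domain("pr-1.preview.example.cloud", listed))
```

In the first case a flag on one preview instance lands on the same registrable domain as the production host; in the second it stays confined to that one instance.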
Honestly, where do people live that the DMV (or equivalent - in some states it is split or otherwise named) is a pain? Every time I've ever been it has been "show up, take a number, wait 5 minutes, get served" - and that's assuming website self-service doesn't suffice.
I’d say this is a clear slight from Google, using their Chrome browser because something or someone is inconveniencing another part of their business, google cloud / google photos.
They did a similar thing with the uBlock Origin extension, flagging it with “this extension might be slowing down your browser” in a big red banner in the last few months of manifest v2 on Chrome. That was after you already had to upload the extension yourself to Chrome, because they took it off the extension store since it was cutting into their ad business.
Google is a massive monopolistic company who will pull strings on one side of their business to help another.
With only Firefox not being based on Chromium and still having manifest v2, the future (5 to 10 years from now) looks bleak. With only 1 browser like this, web devs can phase it out slowly by not taking it into consideration when coding, or Firefox could enshittify to such an extent because of their manifest v2 monopoly that even that won't make it worth it anymore.
Oh and for the ones not in the know, Manifest refers to manifest.json, the file in which a browser extension declares what it is and which permissions and capabilities it needs. The “upgrade” from manifest v2 to v3 changed what extensions are allowed to do and has made it much harder for adblockers to block ads.
This has been a known thing for quite some time and the only solution is to use a separate domain. This problem has existed for so long that at this point we as users adapt to it rather than still expecting Google to fix this.
From their perspective, a few false positives over the total number of actual malicious websites blocked is fractional.
I've had it work for me several times. Most of the time following links/redirects from search engines, ironically a few times from Google itself. Not that I was going to enter anything (the phishing attempts themselves were quite amateurish) but they do help in some rare cases.
When I worked customer service, these phishing blocks worked wonders preventing people from logging in to your-secure-webmail.jobz. People would be filling in phishing forms days after sending out warnings on all official channels. Once Google's algorithm kicked in, the attackers finally needed to switch domains and re-do their phishing attempts.
I had my personal domain I use for self-hosting flagged. I've had the domain for 25 years and it's never had a hint of spam, phishing, or even unintentional issues like compromised sites / services.
It's impossible to know what Google's black box is doing, but, in my case, I suspect my flagging was the result of failing to use a large email provider. I use MXRoute for locally hosted services and network devices because they do a better job of giving me simple, hard limits for sending accounts. That way if anything I have ever gets compromised, the damage in terms of spam will be limited to (ex) 10 messages every 24h.
I invited my sister to a shared Immich album a couple days ago, so I'm guessing that GMail scanned the email notifying her, used the contents + some kind of not-google-or-microsoft sender penalty, and flagged the message as potential spam or phishing. From there, I'd assume the linked domain gets pushed into another system that eventually decides they should blacklist the whole domain.
The thing that really pisses me off is that I just received an email in reply to my request for review and the whole thing is a gas-lighting extravaganza. Google systems indicate your domain no longer contains harmful links or downloads. Keep yourself safe in the future by blah blah blah blah.
Umm. No! It's actually Google's crappy, non-deterministic, careless detection that's flagging my legitimate resources as malicious. Then I have to spend my time running it down and double checking everything before submitting a request to have the false positive mistake on Google's end fixed.
Convince me that Google won't abuse this to make self hosting unbearable.
> I suspect my flagging was the result of failing to use a large email provider.
This seems like the flagging was a result of the same login page detection that the Immich blog post is referencing? What makes you think it's tied to self-hosted email?
I'm not using self hosted email. My theory is that Google treats smaller mail providers as less trustworthy and that increases the odds of having messages flagged for phishing.
In my case, the Google Search Console explicitly listed the exact URL for a newly created shared album as the cause.
I wish I would have taken a screenshot. That URL is not going to be guessed randomly and the URL was only transmitted once to one person via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail. What I was trying to say is that the only way it makes sense to me is if Google via GMail categorized that email as phishing and that kicked off the process to add my domain to the block list.
So, if email categorization / filtering is being used as a heuristic for discovering URLs for the block list, it's possible Google's discriminating against domains that use smaller email hosts that Google doesn't trust as much as themselves, Microsoft, etc..
All around it sucks and Google shouldn't be allowed to use non-deterministic guesswork to put domains on a block list that has a significant negative impact. If they want to operate a clown show like that, they should at least be liable for the outcomes IMO.
I'm in a similar boat. Google's false flag is causing issues for my family members who use Chrome, even for internal services that aren't publicly exposed, just because they're on related subdomains.
It's scary how much control Google has over which content people can access on the web - or even on their local network!
This is another case where it's highly important to "plant your flag" [1] and set up all those services like Search Console, even if you don't plan to use them. Not only can this sort of thing happen, but bad-guys can find crafty ways of hijacking your search console account if you're not super vigilant.
Google Postmaster Console [2] is another one everybody should set up on every domain, even if you don't use gmail. And Google Ads, even if you don't run ads.
I also recommend that people set up Bing search console [3] and some service to monitor DMARC reports.
It's unfortunate that so much of the internet has coalesced around a few private companies, but it's undeniably important to "keep them happy" to make sure your domain's reputation isn't randomly ruined.
It does seem kind of stupid to (apparently) not have google search console, or even a google account according to them, for your business. I don't like Google being in control of so much of the internet - but they are, and it won't do us any good to shout into the void about it when our domain and livelihood is on the line.
Simply opening a case saying that this is our website not impersonating anyone else is unlikely to get anything resolved.
Just because it's your website, and you're not a bad agent doesn't prove that no part of the site is under the control of a bad agent, and that your site isn't accidentally hosting something malicious somewhere, or have some UI that is exploitable for cross-site scripting or whatever.
Sure, but why does Google approve our review over and over again without us making any changes or modifications to the flagged sites/urls? It's a vanilla Immich deployment with docker containers from GitHub pushed there by the core team.
I believe that Jellyfin, Immich, and NextCloud login pages are automatically flagged as dangerous by Google. What's more, I suspect that Google is somehow collecting data from its browser - Chrome.
Google flagged my domain as dangerous once. I do host Jellyfin, Immich, and NextCloud. I run an IP whitelist on the router. All packets from IPs that are not whitelisted are dropped. There are no links to my domain on the internet. At any time, there are 2-3 IPs belonging to me and my family that can load the website. I never whitelisted Google IPs.
How on earth did Google manage to determine that my domain is dangerous?
F you, Google!
Thank goodness I severed that relationship years ago. With so many other great (and ethically superior) products out there to choose from, you'd have to be a true masochist to intentionally throw yourself into their pool of shit.
I've rarely seen a HN comment section this overwhelmingly wrong on a technical topic. This community is usually better than this.
Google is an evil company I want the web to be free of, I resent that even Firefox & Safari use this safe browsing service. Immich is a phenomenal piece of software - I've hosted it myself & sung its praises on HN in the past.
But putting aside David vs Goliath biases here, Google is 100% correct here & what Immich are doing is extremely dangerous. The fact they don't acknowledge that in the blog post shows a security knowledge gap that I'm really hoping is closed over the course of remediating this.
I don't think the Immich team mean any harm but as it currently stands the OP constitutes misinformation.
They're auto-deploying PRs to a subdomain of a domain that they also use for production traffic. This allows any member of the public with a GitHub account to deploy any arbitrary code to that subdomain without any review or approval from the Immich team. That's bad for two reasons:
1. PR deploys on public repos are inherently tricky as code gains access to the server environment, so you need to be diligent about segregating secrets for pr deployments from production secret management. That diligence is a complex & continuous undertaking, especially for an open source project.
2. Anyone with a GitHub account can use your domain for phishing scams or impersonation.
The second issue is why they're flagged by Google (the first issue may be higher risk to the Immich project but it's out of scope for Google's safe browsing service).
To be clear: this isn't about people running their own immich instance. This is about members of the public having the ability to deploy arbitrary code without review.
---
The article from the Immich team does mention they're switching to using a non-production domain (immich.build) for their PR builds which does indicate to me they somewhat understand the issue (though they've explained it badly in the article), but they don't seem to understand the significance or scope.
> This allows any member of the public with a GitHub account to deploy any arbitrary code to that subdomain without any review or approval from the Immich team.
This part is not correct: the "preview" label can be set only by collaborators.
> a subdomain of a domain that they also use for production traffic
To clarify this part: the only production traffic that immich.cloud serves are static map tiles (tiles.immich.cloud)
Overall, I share your concerns, and as you already mentioned, a dedicated "immich.build" domain is the way to go.
> This part is not correct: the "preview" label can be set only by collaborators.
That's good & is a decent starting point. A decent second step might be to have the Github Actions workflow also check the approval status of the PR before deploying (requiring all collaborators to be constantly aware that the risk of applying a label is similar to that of an approval seems less viable)
The workflow is fundamentally unable to deploy a PR from a fork, it only works for internal branches, as it relies on the container image being pushed somewhere which needs secrets available in the CI workflow.
If there are any googlers here, I'd like to report an even more dangerous website. As much as 30-50% of the traffic to it relates to malware or scams, and it has gone unpunished for a very long time.
I see the same scam/deepfake ad(s) pretty much persistently. Maybe they actually differ slightly (they are AI gen mostly), but it's pretty obvious what they are, and I'm sure they get flagged a lot.
They just need to introduce a basic deposit to post ads, and you lose it if you put up a scam ad. Would soon pay for the staff needed to police it, and prevent scammers from bypassing admin by trivially creating new accounts.
I used to flag obvious scam adverts. A bunch of times I'd even get an email response a few weeks later saying it was taken down. But then I'd see it again (maybe slightly different or by a "different" advertiser, who knows). It's whack-a-mole.
The reality is that google profits from scam adverts, so they don't proactively do anything about it and hide behind the "at our scale, we can't effectively do anything about it" argument. Which is complete horseshit because if you can't prevent obvious scams on your platform, you don't deserve to have a platform. Google doesn't have to be running at their scale. "We would make less money" is not a valid excuse. We'd all make more money if we could ignore laws and let people be scammed or taken advantage of.
There's plenty of ways they could solve it, but they choose not to. IMHO this should be a criminal offence and google executives should be harshly punished. It's also why I have a rather negative view of googlers, since they wilfully perpetuate this stuff by working on adtech while nothing is being done about the normal everyday people getting scammed each day. It's only getting worse with AI, but I've been seeing it for years.
What I really don't understand: at least here in Europe, the advertising partner (AdSense) must investigate, at least minimally, whether the advertising is illegal or fraudulent. I understand that sites.google etc. are under "safe harbor", but that's not the point with AdSense, since people at Google "click" the publish button and also get paid to publish that ad.
I have reported over a dozen ads to AdSense (Europe) because of them being outright scams (e.g. on weather apps, an AdSense banner claiming "There is a new upgrade to this program, click here to download it") . Google has invariably closed my reports claiming that they do not find any violation of the adsense policies.
Same thing with Instagram, they accept all scam ads.
Google and Meta are trillion dollar criminal enterprises. The lion's share of their income comes from fraud and scams, with real victims having their lives destroyed. That is the sad truth, no matter how good and important some of their services are. They will never stop their principal source of income.
The law is only for plebs like you and me. Companies get a pass.
I'm still amazed how deploying spyware would've rightfully landed you in jail a couple decades back, but do the same thing on the web under the justification of advertising/marketing and suddenly it's ok.
The same outfit is running a domain called blogger.
Reminds me of MS blocking a website of mine for dangerous script. The offending thing i did was use document.write to put copyright 2025 (with the current year) at the end of static pages.
Microsoft's own Outlook.com flags Windows Insider emails coming from a .microsoft.com domain as junk even after marking the domain as "no junk". They know themselves well.
The integrated button to join a Microsoft Teams meeting directly from my Microsoft Outlook Calendar doesn't work because Microsoft needs to scan the link from Microsoft to Microsoft for malware before proceeding, and the malware scanning service has temporary downtime and serves me static page saying "The content you are accessing cannot currently be verified".
sites.google.com is widely abused, but so is practically any site which allows users to host content of their choice and make it publicly available. Where Google can be different is that they famously refuse to do work which they cannot automate, and they probably cannot (or don’t want to) automate detection/blocking of spam/phishing hosted on sites.google.com and the processing of abuse reports.
Use one of the forks. librewolf, waterfox, zen. Firefox itself lost trust when Mozilla tried to push the new Terms of Use earlier this year. That was so aggressively user-hostile that nobody should trust Mozilla ever again. Using a fork puts an insulation layer between you and Mozilla.
Librewolf is just a directly de-mozillaed and privacy-enhanced Firefox, similar to Ungoogled Chromium. I've been trying to get in the habit of using Zen Browser, which has a bunch of UI changes.
Rolling back a change that causes loss of user trust does not automatically restore that trust. It takes time and ongoing public commitment to regain that trust.
The problem is that all those forks are beholden to Mozilla's corporate interests the same way the chromium derivatives are beholden to Google's corporate interests. What we need is one of the newer independent engines to mature - libweb, servo or blitz.
You can read this as, "I want Mozilla to spend millions developing a competitive Chrome alternative, but I want it for free and aligned with all my personal nitpicks".
Typical freeloader behaviour, moans about free software politics but won't contribute anything themselves.
No they're not. They can pull what they like and not pull what they don't.
Librewolf is trying to be de-Mozillaed, privacy-enhanced Firefox, so it'll probably take whatever not-overtly-spyware patches Mozilla adds. Some others, like Waterfox and Pale Moon, are more selective.
Apparently the "best practise" is using Manifest V3 versus V2.
Reading a bit online (not having any personal/deep knowledge) it seems the original extension also downloaded updates from a private (the developers) server, while that is no longer allowed - they now need to update via the chrome extension, which also means waiting for code review/approval from google.
I can see the security angle there, it is just awkward how much of a vested interest Google has in the whole topic. Ad-blocking is already a grey area (legally), and there is a cat-and-mouse game between blockers and advertisers; it's hard to believe there is only security best-practise going on here.
You know what? I don't even mind them killing it, because of course there are a whole pile of items under the anti-trust label that google is doing so why not one more. But what I do take issue with is the gaslighting, their attempt to make the users believe that this is in the users interests, rather than in google's interests.
If we had functional anti-trust laws then this company would have been broken up long ago, Alphabet or not. But they keep doing these things because we - collectively - let them.
I know they won't. But we have all the tools to force them to care. We just don't use the tools effectively, and between that and lobbying they get a free pass to pretty much do as they please.
As someone who doesn't like Google and absolutely thinks they need to be broken up, no probably not. Google's algorithms around security are so incompetent and useless that stupidity is far more likely than malice here.
Callous disregard for the wellbeing of others is not stupidity, especially when demonstrated by a company ostensibly full of very intelligent people. This behavior - in particular, implementing an overly eager mechanism for damaging the reputation of other people - is simply malicious.
Incompetently or "coincidentally" abusing your monopoly in a way that "happens" to suppress competitors (while whitelisting your own sites) probably won't fly in court. Unless you buy the judge of course.
Intent does not always matter to the law ... and if a C&D is sent, doesn't that imply that intent is subsequently present?
Defamation laws could also apply independently of monopoly laws.
I don't see how this is an issue. To me, this does seem at least confusing, but possibly dangerous.
If you have internal auth testing domains at the same place as user generated content, what's to stop somebody thinking a user-generated page is a legit page when it asks you to log in or something?
If you're going to host user content on subdomains, then you should probably have your site on the Public Suffix List https://publicsuffix.org/list/ . That should eventually make its way into various services so they know that a tainted subdomain doesn't taint the entire site....
God I hate the web. The engineering equivalent of a car made of duct tape.
> Since there was and remains no algorithmic method of finding the highest level at which a domain may be registered for a particular top-level domain
A centralized list like this not just for domains as a whole (e.g. co.uk) but also specific sites (e.g. s3-object-lambda.eu-west-1.amazonaws.com) is both kind of crazy in that the list will bloat a lot over the years, as well as a security risk for any platform that needs this functionality but would prefer not to leak any details publicly.
We already have the concept of a .well-known directory that you can use, when talking to a specific site. Similarly, we know how you can nest subdomains, like c.b.a.x, and it's more or less certain that you can't create a subdomain b without the involvement of a, so it should be possible to walk the chain.
Example:
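A minimal sketch of the chain walk described above, assuming the `/.well-known/public-suffix` path from the example and assuming the walk stops before the bare top-level label. The function name is hypothetical, and nothing here performs a network request; it only lists the URLs a client would consult, nearest ancestor first.

```python
from typing import List

def well_known_chain(hostname: str,
                     path: str = "/.well-known/public-suffix") -> List[str]:
    """List the ancestor URLs to consult for hostname, nearest first.

    Assumption: the bare top-level label (e.g. "x") is skipped because it
    is not expected to serve HTTP itself.
    """
    labels = hostname.lower().rstrip(".").split(".")
    urls = []
    # Drop one leading label at a time: c.b.a.x -> b.a.x -> a.x
    for i in range(1, len(labels) - 1):
        parent = ".".join(labels[i:])
        urls.append(f"https://{parent}{path}")
    return urls

print(well_known_chain("c.b.a.x"))
```

For `c.b.a.x` this lists `https://b.a.x/.well-known/public-suffix` followed by `https://a.x/.well-known/public-suffix`.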
Maybe ship the domains with the browsers and such and leave generic sites like AWS or whatever to describe things themselves. Hell, maybe that could also have been a TXT record in DNS as well.
> any platform that needs this functionality but would prefer not to leak any details publicly.
I’m not sure how you’d have this - it’s for the public facing side of user hosted content, surely that must be public?
> We already have the concept of a .well-known directory that you can use, when talking to a specific site.
But the point is to help identify dangerous sites, by definition you can’t just let the sites mark themselves as trustworthy and rotate around subdomains. If you have an approach that doesn’t have to trust the site, you also don’t need any definition at the top level you could just infer it.
It's actually exactly the same concept that came to mind for me. `SomeUser.geocities.com` is "tainted", along with `*.geocities.com`, so `geocities.com/.wellknown/i-am-tainted` is actually reasonable.
Although technically it might be better as `.wellknown/taint-regex` (now we have three problems), like `TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*" ; ...`
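A rough sketch of how a client might read such a `TAINT` line, assuming the quoted entries are shell-style globs (as the `*` in the example suggests) rather than true regexes. The file format, function names, and the rule that entries without a `/` match the host only are all assumptions made for illustration.

```python
import re
from fnmatch import fnmatchcase
from typing import List

def parse_taint(line: str) -> List[str]:
    """Pull the quoted patterns out of a 'TAINT "a" ; "b"' line (hypothetical format)."""
    if not line.startswith("TAINT"):
        raise ValueError("not a TAINT line")
    return re.findall(r'"([^"]*)"', line)

def is_tainted(host: str, path: str, patterns: List[str]) -> bool:
    """Patterns containing '/' match host+path; the rest match the host only."""
    host = host.lower()
    for pattern in patterns:
        target = host + path if "/" in pattern else host
        if fnmatchcase(target, pattern):
            return True
    return False

patterns = parse_taint('TAINT "*.sites.myhost.com" ; "myhost.com/uploads/*"')
print(is_tainted("someuser.sites.myhost.com", "/", patterns))
print(is_tainted("myhost.com", "/login", patterns))
```

Note that this only parses a self-declared list; it does not address whether the declaring site should be trusted.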
I think we disagree on the problem.
The thing you want to avoid is this:
a.scamsite.com gets blocked so they just put their phishing pages on b.scamsite.com
The psl or your solution isn’t a “don’t trust subdomains” notification it’s “if one subdomain is bad, you should still trust the others” and the problem there is you can’t trust them.
You could combine the two, but you still need the suffix list or similar curation.
It looks like Mozilla does use DNS to verify requests to join the list, at least.
Doing this DNS in the browser in real-time would be a performance challenge, though. PSL affects the scope of cookies (github.io is on the PSL, so a.github.io can't set a cookie that b.github.io can read). So the relevant PSL needs to be known before the first HTTP response comes back.
It does smell very much like a feature that is currently implemented as a text file but will eventually need to grow to its own protocol, like, indeed, the hostfile becoming DNS.
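To make the cookie-scoping point concrete, here is a small sketch using a hand-picked set of suffixes. The set, the function names, and the simplified rule are assumptions; the real list is far longer and also has wildcard and exception rules that this ignores.

```python
from typing import Optional

# Tiny illustrative subset, NOT the real Public Suffix List.
PUBLIC_SUFFIXES = {"com", "io", "uk", "co.uk", "github.io"}

def registrable_domain(host: str) -> Optional[str]:
    """Return the longest matching public suffix plus one more label, or None."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):  # longest candidate suffix first
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            # i == 0 means the host itself is a public suffix
            return ".".join(labels[i - 1:]) if i > 0 else None
    return None  # no known suffix in this sketch

def may_set_cookie(host: str, cookie_domain: str) -> bool:
    """Simplified check: refuse any Domain= value that is a public suffix."""
    host = host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().strip(".")
    if cookie_domain in PUBLIC_SUFFIXES:
        return False
    return host == cookie_domain or host.endswith("." + cookie_domain)

print(registrable_domain("a.github.io"), registrable_domain("b.github.io"))
print(may_set_cookie("a.github.io", "github.io"))
```

Because `github.io` is in the set, `a.github.io` and `b.github.io` come out as separate registrable domains, and a cookie scoped to `github.io` is refused.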
One key difference between this list and standard DNS (at least as I understand it; maybe they added an extension to DNS I haven't seen) is the list requires independent attestation. You can't trust `foo.com` to just list its subdomains; that would be a trivial attack vector for a malware distributor to say "Oh hey, yeah, trustme.com is a public suffix; you shouldn't treat its subdomains as the same thing" and then spin up malware1.trustme.com, malware2.trustme.com, etc. Domain owners can't be the sole arbiter of whether their domain counts as a "public suffix" from the point of view of user safety.
I presume it has to be a curated list otherwise spammers would use it to evade blocks. Otherwise why not just use DNS?
Whois would be the choice. DNS’s less glamourous sibling, purpose built for delegated publication of accountability records
Whois isn't curated either.
Neither is nominating a third party for your parking fine.
The point is to get away from centralized gatekeepers, not establish more of them. A hierarchy of disavowal. It’s like cache invalidation for accountability.
If you don’t wanna be held responsible for something, you’d better be prepared to point the finger at someone whois.
> God I hate the web
This is mostly a browser security mistake but also partly a product of ICANN policy & the design of the domain system, so it's not just the web.
Also, the list isn't really that long, compared to, say, certificate transparency logs; now that's a truly mad solution.
Show me a platform not made out of duct tape and I'll show you a platform nobody uses.
regular cars?
Jeep just had an OTA update cause the car to shut down on the highway (it is rumored).
Before we put computers in cars, we had the myriad small things that would break (stuck doors, stuck windows, failed seals, leaking gaskets), a continuous stream of recalls for low-probability safety issues, and the occasional Gremlin or Pinto.
My favorite example is the Hyundai Elantra. They changed the alloy used in one of the parts in the undercarriage. Tested that model to death for a year, as they do, but their proving ground is in the southern United States.
Several winters later, it turns out that road salt attacks the hell out of that alloy and people have wheels flying off their cars in the middle of the road.
The Honda issue where setting a certain radio station would brick the infotainment? That good enough?
> That good enough?
Not really. Does the car still drive? That sounds like a software bug; hardly indicative that the entire car is held together with duct tape, but a pretty bad bug nonetheless.
So I can't remember the specifics or find any references, but many years ago I remember reading about a car (Prius maybe?) that would shut off and lock the doors when pulling away from a stop. (Ex: stopped at a red light, when it turns green the car would go far enough to cut off in the middle of an intersection then trap everyone inside.)
"This is Fine."
That's terrifying.
The browser still drives when Google throws up a safety warning.
It's just harder to drive to one house, and the homeowner is justifiably irritated about this.
More accurate: a mom-n-pop grocery store has its listing on Google Maps changed to PERMANENTLY CLOSED DUE TO TOXIC HEALTH HAZARDS because the mom-n-pop grocery store didn't submit Form 26B/Z to Google. There was never any health hazard, but now everyone thinks there is, and nobody can/will go there. The fact that Form 26B/Z exists at all is problematic, but what makes it terrible is the way it's used to punish businesses for not filling out a form they didn't know existed.
This is an excellent analogy because it is incumbent upon businesses to follow all the laws, including the ones they don't know about. That's one of the reasons "lawyer" is a profession.
Google doesn't have the force of law (it's in this context acting more like a Yelp: "1 star review --- our secret shopper showed up and the manager didn't give the secret 'we are not criminals' hand sign"), but the basic idea is the same: there is a complex web of interactions that can impact your online presence and experts in the field you can choose to hire for consulting or not.
Didn't used to be that way, but the web used to be a community of 100,000 people, not 5.6 billion. Everything gets more complicated when you add more people.
The other commenter's analogy of a small-business is better I think, the issue with the browser problem is that it doesn't hinder one person getting to one house, it hinders all persons getting to one place the owner _wants_ people to get to easily.
The browser issue can destroy a small business, one thing I think we can universally agree we don't want. If all of the people who come looking for it find it's being marked as malicious or just can't get there at all, they lose customers.
Worse yet, is that Google holds the keys because everyone uses Chrome, and you have to play their game by their rules just to keep breathing.
Here's the thing though: if someone else held the keys, the scenario would be the same unless there was no safe browsing protection. And if there were no safe browsing protection, we'd be trading one ill for another; small business owners facing a much steeper curve to compete vs. everyone being at more risk from malware actors.
I honestly don't immediately know how to weigh those risks against each other, but I'll note that this community likely underestimates the second one. Most web users are not nearly as tech- or socially-savvy as the average HN reader and the various methods of getting someone to a malware subdomain are increasingly sophisticated.
The road network is a much better analogy here.
Never heard of this. Link please?
Don't know about Honda, but there is this Mazda one [0] (Would not be surprised if it affected multiple vendors!)
[0] https://www.soundandvision.com/content/remembering-time-when...
Admitting I'm old, but my HP-11C still gets pretty-regular use.
And judging by eBay prices, or the SwissMicros product line, I suspect I have plenty of company.
"The engineering equivalent of a car made of duct tape"
Kind of. But do you have a better proposition?
I'd probably say we ought to use DNS.
And while we’re at it, 1) mark domains as https-only, and 2) declare when root domains map to a subdomain (e.g. www).
It might amuse you to know that we also already have a text file as a solution for https-only sites.
Cookies shouldn't be tied to domains at all, it's a kludge. They should be tied to cryptographic keypairs (client + server). If the web server needs a cookie, it should request one (in its reply to the client's first request for a given url; the client can submit again to "reply" to this "request"). The client can decide whether it wants to hand over cookie data, and can withhold it from servers that use different or invalid keys. The client can also sign the response. This solves many different security concerns, privacy concerns, and also eliminates the dependency on specific domain names.
I just came up with that in 2 minutes, so it might not be perfect, but you can see how with a little bit of work there's much better solutions than "I check for not-evil domain in list!"
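A toy sketch of the client side of that idea, assuming cookies are stored under a SHA-256 fingerprint of the server's public key and released only on request. The class and method names are hypothetical, and the step where the server proves it holds the matching private key is deliberately left out, so this shows only the bookkeeping, not a secure protocol.

```python
import hashlib
from typing import Dict, List

class KeyBoundCookieJar:
    """Client-side jar keyed by server public-key fingerprint, not by domain."""

    def __init__(self) -> None:
        self._jar: Dict[str, Dict[str, str]] = {}

    @staticmethod
    def fingerprint(public_key: bytes) -> str:
        return hashlib.sha256(public_key).hexdigest()

    def store(self, public_key: bytes, name: str, value: str) -> None:
        self._jar.setdefault(self.fingerprint(public_key), {})[name] = value

    def cookies_for(self, public_key: bytes, requested: List[str]) -> Dict[str, str]:
        """Return only the cookies the server asked for, and only for this key."""
        stored = self._jar.get(self.fingerprint(public_key), {})
        return {name: stored[name] for name in requested if name in stored}

jar = KeyBoundCookieJar()
jar.store(b"server-key-A", "session", "abc123")
print(jar.cookies_for(b"server-key-A", ["session"]))
print(jar.cookies_for(b"server-key-B", ["session"]))
```

A server presenting a different key gets nothing back, regardless of which domain name it answers on.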
> They should be tied to cryptographic keypairs (client + server).
So now, if a website leaks its private key, attackers can exfiltrate cookies from all of its users just by making them open an attacker-controlled link, for as long as the cookie lives (and users don't visit the website to get the rotated key).
> If the web server needs a cookie, it should request one
This adds a round-trip, which slows down the website on slow connections.
> the client can submit again to "reply" to this "request"
This requires significantly overhauling HTTP and load-balancers. The public-suffix list exists because it's an easy workaround that didn't take a decade to specify and implement.
> So now, if a website leaks its private key, attackers can exfiltrate cookies from all of its users just by making them open an attacker-controlled link
This attack already exists in several forms (leaking a TLS private key, DNS hijack, CA validation attack, etc). You could tack a DNS name onto the crypto-cookies if you wanted to, but DNS is trivial to attack.
> This adds a round-trip, which slows down the website on slow connections.
Requests are already slowed down by the gigantic amount of cookies constantly being pushed by default. The server can send a reply-header once which will tell the client which URLs need cookies perpetually, and the client can store that and choose whether it sends the cookies repeatedly or just when requested. This gives the client much more control over when it leaks users' data.
> This requires significantly overhauling HTTP and load-balancers
No change is needed. Web applications already do all of this all the time. (example: the Location: header is frequently sent by web apps in response to specific requests, to say nothing of REST and its many different request and return methods/statuses/headers).
> The public-suffix list exists because it's an easy workaround
So the engine of modern commerce is just a collection of easy hacks. Fantastic.
> This attack already exists in several forms (leaking a TLS private key, DNS hijack, CA validation attack, etc).
An attacker who gets the TLS private key of a website can't use it easily, because they still need to fool users' browser into connecting to a server they control as the victim domain, which brings us to:
> You could tack a DNS name onto the crypto-cookies if you wanted to, but DNS is trivial to attack.
It's not. I can think of two ways to attack the DNS. Either 1. control or MITM of the victim's authoritative DNS server or 2. poison users' DNS cache.
Control/MITM of the authoritative server is not an option for everyone (only ISPs/backbone operators), and according to Cloudflare: "DNS poisoning attacks are not easy" (https://www.cloudflare.com/learning/dns/dns-cache-poisoning/)
> Requests are already slowed down by the gigantic amount of cookies constantly being pushed by default
Yes, although adding more data and adding a round-trip have different impacts (high-bandwidth high-latency connections exist). Lots of cookies and more round-trips is always worse than lots of cookies and fewer round-trips.
> The server can send a reply-header once which will tell the client which URLs need cookies perpetually, and the client can store that and choose whether it sends the cookies repeatedly or just when requested.
Everyone hates configuring caches, so in most cases site operators will leave it at a default "send everything", and we're back to square one.
> No change is needed.
I was thinking that servers need to remember state between the initial client request and when the client sends another request with the cookies. But on second thought that's indeed not necessary.
> So the engine of modern commerce is just a collection of easy hacks. Fantastic.
I'm afraid so
A part of the issue is IMO that browsers have become ridiculously bloated everything-programs. You could take about 90% of that out and into dedicated tools and end up with something vastly saner and safer and not a lot less capable for all practical purposes. Instead, we collectively are OK with frosting this atrocious layer cake that is today's web with multiple flavors of security measures of sometimes questionable utility.
End of random rant.
"You could take about 90% of that out and into dedicated tools "
But then you would lose platform independence, the main selling point of this atrocity.
Having all those APIs in a sandbox that mostly just works on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
The best thing that could happen, as far as I can see, is that a sane subset crystallizes that people predominantly use, with the rest becoming legacy, maintained only to keep it working.
I have dreamed of a fresh rewrite of the web since university (and the web was way slimmer back then), but I have become a bit more pragmatic, and I think I now better understand the massive problem of solving trusted human communication. It ain't easy in the real world.
But do we need e.g serial port or raw USB access straight from a random website? Even WebRTC is a bit of a stretch. There is a lot of cruft in modern browsers that does little except increase attack surface.
This all just drives a need to come up with ever more tacked-on protection schemes because browsers have big targets painted on them.
> Even WebRTC is a bit of a stretch
You remove that, and videoconferencing (for business or person to person) has to rely on downloading an app, meaning whoever is behind the website has to release for 10-15 OSes now. Some already do, but not everyone has that budget so now there's a massive moat around it.
> But do we need e.g serial port or raw USB access straight from a random website
Being able to flash an IoT (e.g. ESP32) device from the browser is useful for a lot of people. For the "normies", there was also Stadia allowing you to flash their controller to be a generic Bluetooth/usb one on a website, using that webUSB. Without it Google would have had to release an app for multiple OSes, or more likely, would have just left the devices as paperweights. Also, you can use FIDO/U2F keys directly now, which is pretty good.
Browsers are the modern Excel, people complain that they do too much and you only need 20%. But it's a different 20% for everyone.
I'll flip that around on you: why oh why do we need browsers to carry these security holes in them? The Stadia flasher is a good example: how do I know that a website doesn't contain a device flasher that will turn one of my connected devices into a malicious actor that will attempt to take over whatever machine it's plugged into?
You know because there is an explicit permission box that pops out and asks if you want to give this website access to a device, and asks you to select that device.
Same as your camera/microphone/location.
But that still gives completely unvetted direct access to the device to a website! People have been pointing to Itch.io games that supposedly require direct USB access. How hard is it to hide a script in there that reprograms a controller into something malicious?
If you download an executable from a website and run it .. pretty much the same thing?
If you give USB access, it is not really a website anymore, but rather an app delivered through the web. I don't see a fundamental difference in trust.
If anything, I can verify the web-based version more easily, and I certainly won't give access to a random website, just like I don't download random exes from websites.
Performance is lower, yes and well ... like I said, it is all a big mess. Just look at the global namespace in js. I still use it because of that power feature called platform independence. What I release, people can (mostly) just use. I (mostly) don't care which OS the user has.
A file that lands on my hard drive is automatically scanned for malware. That same kind of protection isn't in place against malicious scripts downloaded by my browser via an opaque HTTPS connection and run in-process.
And we all know that non-technical users never just click Yes to make the annoying popup go away.
Itch.io games and controller support.
You have sites now that let you debug microcontrollers on your browser, super cool.
Same thing but with firmware updates in the browser. Cross platform, replaced a mess of ugly broken vendor tools.
While that's pretty convenient, I'm worried about what happens when the vendor shuts down the website. "Ugly broken vendor tools" can be run forever in a VM of an old system, but a website would be gone forever unless it's purely client-side and someone archived it.
Just because you can do something doesn't mean you should.
Your micro-controllers should use open standards for their debugging interface and not force people to use the vendor website.
I have used WebRTC for many years and would miss it a lot. P2P is awesome.
I don't use WebUSB and wouldn't miss it right now, but .. the main potential use case is security, and it sounds somewhat reasonable:
"Use in multi-factor authentication
WebUSB in combination with special purpose devices and public identification registries can be used as key piece in an infrastructure scale solution to digital identity on the internet."
https://en.wikipedia.org/wiki/WebUSB
> But do we need e.g serial port or raw USB access straight from a random website?
But do we need audio, images, Canvas, WebGL, etc? The web could just be plain text and we’d get most of the “useful” content still, add images and you get a vast majority of it.
But the idea that the web is a rich environment that has all of these bells and whistles is a good thing imo. Yes there’s attack surface to consider, and it’s not negligible. However, the ability to connect so many different things opens up simple access to things that would otherwise require discrete apps and tooling.
One example that kind of blew my mind is that I wanted a controller overlay for my Twitch stream. After a short bit of looking, there isn’t even a plugin needed in OBS (streaming software). Instead, you add a Web View layer and point it to GamePad Viewer[1] and you’re done.
Serial and USB are possibly a boon for very specific users with very specific accessibility needs. Also, iirc some of the early iPhone jailbreaks worked via websites on a desktop with your iPhone plugged into USB. Sure, these are niche and could probably be served just as well or better with native apps, but the web also makes the barrier to entry so much lower.
[1]: https://gamepadviewer.com/
> But do we need e.g serial port or raw USB access straight from a random website?
Yes. Regards, CIA, Mossad, FSB etc.
How else am I going to make a game in the browser that can be controlled with a controller?
Every decent host OS already has a dedicated driver stack to provide game controller input to applications in a useful manner. Why the heck would you ship a reimplementation of that in JS in a website?
So that you can take input from controllers that haven't been invented yet and won't fit the HID model.
If it hasn't been invented yet, you don't need driver software for it, do you? ;)
Anyway, in your scenario the controller would be essentially a one off and you'd be better off writing a native app to interface with it for the one computer this experiment will run on.
If it hasn't been invented yet we don't know the implications of giving a website access to it either.
And that's before realizing it's already a bad idea with existing devices because they were never designed for giving untrusted actors direct access.
That's why we have a privacy and security sandbox in browsers.
You don't, that's the point: not everything needs to be crammed into a browser.
Unlikely. The convenience incentives are far too high to leave features on the table.
Not unlike the programming language or the app (growing until it half-implements LISP or half-implements an email client), the browser will grow until it half-implements an operating system.
For everyone else, there's already w3m.
> Having all those APIs in a sandbox that mostly just works on billions of devices is pretty powerful, and a potential successor to HTML would have to beat that to be adopted.
I think the giant major downside is that they've written a rootkit that runs on everything, and to try to make up for that they want to make it so only sites they allow can run.
It's not really very powerful at all if nobody can use it, at that point you are better off just not bothering with it at all.
The Internet may remain, but the Web may really be dead.
"It's not really very powerful at all if nobody can use it"
But people do use it, like the both of us right now?
People also use maps, do online banking, play games, start complex interactive learning environments, collaborate in real time on documents etc.
All of that works right now.
> to try to make up for that they want to make it so only sites they allow can run
What do you mean, you can run whatever you want on localhost, and it's quite easy to host whatever you want for whoever you want too. Maybe the biggest modern added barrier to entry is that having TLS is strongly encouraged/even needed for some things, but this is an easily solved problem.
The blog post and several anecdotes in the comments prove otherwise
Not sure if it counts, but I've been enjoying LibreWolf. I believe it's just a stripped-down Firefox.
>A part of the issue is IMO that browsers have become ridiculously bloated everything-programs.
I don't see how that solves the issue that PSL tries to fix. I was a script kiddy hosting neopets phishing pages on free cpanel servers from <random>.ripway.com back in 2007. Browsers were way less capable then.
PSL and the way cookies work is just part of the mess. A new approach could solve that in a different way, taking into account all the experience we have had with script kiddies and professional scammers and phishers since then. But I also don't really have an idea where and how to start.
And of course, if the new solution completely invalidates old sites, it just won't get picked up. People prefer slightly broken but accessible to better designed but inaccessible.
> People prefer slightly broken but accessible to better designed but inaccessible.
We live in a world where whatever FAANG adopts is de facto a standard. Accessible these days means google/gmail/facebook/instagram/tiktok works. Everything else is usually forced to follow along.
People will adopt whatever gives them access to their daily dose of doomscrolling and then complain about rather crucial parts of their lives, like online banking, not working.
> And of course, if the new solution completely invalidates old sites, it just won't get picked up.
Old sites don't matter, only high-traffic sites riddled with dark patterns matter. That's the reality, even if it is harsh.
> People prefer slightly broken but accessible to better designed but inaccessible.
It's not even broken as the edge cases are addressed by ad-hoc solutions.
OP is complaining about global infrastructure not having a pristine design. At best it's a complaint about a desirable trait. It's hardly a reason to pull the Jr developer card and mindlessly advocate for throwing everything out and starting over.
2007 you say and less capable you say?!
Try 90s! We had to fight off ActiveX Plugins left and right in the good olde Internet Explorer! Yarr! ;-)
Are you saying we should make a <Unix Equivalent Of A Browser?> A large set of really simple tools that each do one thing really really really pedantically well?
This might be what's needed to break out of the current local optimum.
Maybe it's time to revive something like the uzbl[1] project, or start something similar.
[1] https://www.uzbl.org/
I haven't thought of it that way, but that might be a solution.
There was an attempt in that direction.
https://www.uzbl.org/
You are right from a technical point, I think, but in reality - how would one begin to make that change?
I'm under the impression that CORS largely solves it?
which is still much too new to be able to shut down the PSL of course. but maybe in 2050.
Since this is being downvoted: no, I'm quite serious.
CORS lets sites define their own security boundaries between subdomains, with mutual validation. If you're hosting user content in a subdomain, just don't allow-origin it: that is a clear statement that it's not "the same site". PSL plays absolutely no part in that logic, it seems clear to me that it's at least in part intended to replace the PSL.
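As a sketch of the "just don't allow-origin it" idea, here is how a server might decide whether to emit `Access-Control-Allow-Origin` for a request. The origins in the allowlist and the function name are made up for illustration. This only shows the allow-origin decision for cross-origin reads; it does not by itself change how cookies are scoped.

```python
from typing import Dict, Optional

# Hypothetical first-party origins; user-content subdomains are left out on purpose.
ALLOWED_ORIGINS = {"https://www.example.com", "https://app.example.com"}

def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Return CORS response headers for an allowed origin, or none at all."""
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response differs per requesting origin.
            "Vary": "Origin",
        }
    return {}

print(cors_headers("https://app.example.com"))
print(cors_headers("https://user123.example.com"))
```

A request from the user-content subdomain gets no CORS headers, so the browser blocks its scripts from reading the response.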
Do other sites (like google's safety checks) use CORS for this purpose? Dunno. Seems like they could though? Or am I missing something?
I think we lost the web somewhere between PageRank and JavaScript. Up to there it was just linked documents and it was mostly fine.
I love the web. It's the corporate capitalistic ad fueled and govt censorship web that is the problem.
Why is it a centrally maintained list of domains, when there is a whole extensible system for attaching metadata to domain names?
> God I hate the web. The engineering equivalent of a car made of duct tape.
Most of the complex things I have seen being made (or contributed to) needed duct tape sooner or later. Engineering is the art of trade-offs, of adapting to changing requirements (that can appear due to uncontrollable events external to the project), technology and costs.
Related, this is how the first long distance automobile trip was done: https://en.wikipedia.org/wiki/Bertha_Benz#First_cross-countr... . Seems to me it had quite some duct tape.
Why would you compare Web to that? A first fax message would be more appropriate comparison.
Web is not a new thing and hardly a technical experiment of a few people any more.
If you add the time since announcing the concept of Web to that trip date, you have a very decent established industry already. With many sport and mass production designs:
https://en.wikipedia.org/wiki/Category:Cars_introduced_in_19...
For me the web is something along the lines of the definition at https://en.wikipedia.org/wiki/World_Wide_Web, to sum up: "...universal linked information system...". I think the fax misses too many aspects of the core definition to be a good comparison.
Not sure what your point is about "decent established industry" if we relate it to "duct tape". I see two possibilities:
a) you imply that the web does not have a decent established industry (but I would guess not).
b) you would claim that there was no "duct tape" in the 1924 car industry. I am no expert, but I would refer you to the article describing the procedure to start a car at https://www.quora.com/How-do-people-start-their-cars-in-the-..., to quote:
> Typical cold-start routine (common 1930s workflow)
> 1. Set hand choke (pull knob).
> 2. Set throttle lever to slight fast‑idle.
> 3. Retard spark if manual advance present.
> 4. Engage starter (electric) or use hand crank.
> 5. Once running, push choke in gradually, advance spark, reduce throttle.
Not sure about your opinion but compared to what a car's objective is (move from point A to point B) to me that sounds rather involved. Not sure if it qualifies as "duct-tape" but definitely it is not a "nicely implemented system that just works".
To summarize my point: I think on average progress is slower and harder than people think. And that is mostly because people do not have exposure to the work others are doing to improve things until something becomes more "widely available".
That's the nature of decentralised control. It's not just DNS, phone numbers work in the same way.
All web encryption is backed by a static list of root certs each browser maintains.
Idk any other way to solve it for the general public (ideally each user would probably pick what root certs they trust), but it does seem crazy.
We already have a solution to solve it: DNS-based Authentication of Named Entities (DANE)
This solution is even more obvious today where most certificates are just DNS lookups with extra steps.
What we need is a web made in a similar way to the wicker-bodied cars of yesteryear
I'm not sure I'm following what inherent flaw you are suggesting browsers had that the public suffix list originators knew they had.
Wait until you learn about the HSTS preload list.
I think it's somewhat tribal webdev knowledge that if you host user generated content you need to be on the PSL otherwise you'll eventually end up where Immich is now.
I'm not sure how people who haven't already hit this very issue are supposed to know about it beforehand though; it's one of those things that you don't really come across until you're hit by it.
This is the first time I hear about https://publicsuffix.org
You're in good company! From 12 days ago: https://news.ycombinator.com/item?id=45538760
I’ve been doing this for at least 15 years and it’s the first I heard of this.
Fun learning new things so often but I never once heard of the public suffix list.
That said, I do know the other best practices mentioned elsewhere
First rule of the public suffix list...
I think what gets me more is I don't see an easy way to add suffixes to the list. I'm sure if I dig I can figure it out, but you'd think given how it's used they'd have an obvious step by step guide on the website.
Last link in the menu header: https://publicsuffix.org/submit/
Which then links to: https://github.com/publicsuffix/list/wiki/Guidelines#submitt...
Fairly obvious and typical webpage > documentation flow I think, doesn't seem too hard to find.
Ok so we need a GitHub (Microsoft) account to avoid needing a Google account in case some undocumented system decides to shut down a website we host. Great.
Besides user uploaded content it's pretty easy to accidentally destroy the reputation of your main domain with subdomains.
For example:
At this point if someone else on that hosting provider gets that IP address assigned, your subdomain is now hosting their content. I had this happen to me once with PDF books being served through a subdomain on my site. Of course it's my mistake for not removing the A record (I forgot) but I'll never make that mistake again.
10 years of my domain having a good history may have gotten tainted in an unrepairable way. I don't get warnings visiting my site but traffic has slowly gotten worse over time since around that time, despite me posting more and more content. The correlation isn't guaranteed, especially with AI taking away so much traffic but it's something I do think about.
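A rough sketch of how one might audit for that kind of leftover A record, assuming you can export your zone's A records and the set of IPs you currently hold. Both inputs, the hostnames, and the `find_dangling` helper are hypothetical here (the addresses come from the documentation ranges):

```python
def find_dangling(a_records, owned_ips):
    """Return hostnames whose A record points at an IP we no longer control.

    a_records: dict mapping hostname -> IPv4 address (e.g. a zone export)
    owned_ips: set of addresses currently assigned to us by the hosting provider
    """
    return sorted(name for name, ip in a_records.items() if ip not in owned_ips)


# Hypothetical zone export: the "books" server was shut down but its record remains.
records = {
    "www.example.com": "192.0.2.10",
    "books.example.com": "198.51.100.7",
}
currently_owned = {"192.0.2.10"}

dangling = find_dangling(records, currently_owned)
```

Anything that shows up in `dangling` is a record worth deleting before someone else is handed that address.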
The Immich domains that are hit by this issue are -not- user generated content.
They clearly are? It seems like GitHub users submitting a PR could/can add a `preview` label, and that would lead to the application + their changes to be deployed to a public URL under "*.immich.cloud". So they're hosted content generated by users (built application based on user patches) on domains under their control.
I'm the guy that built the system, lol. Labels can only be added by maintainers, and the whole system only works for PRs from internal branches.
Ah, that's a different situation then, sorry for misunderstanding the context and thanks for clearing that up! I was under the impression that Immich accepted outside contributions, and those would also have those preview sites created for their pending contributions.
Clearly they are not reading HN enough. It hasn’t even been two weeks since this issue last hit the front page.
I wish this comment were top ranked so it would be clear immediately from the comments what the root issue was.
[flagged]
so its skill issue ??? or just google being bad????
I will go with Google being bad / evil for 500.
Google 90s to 2010 is nothing like Google 2025. There is a reason they removed "Don't be evil" ... being evil and authoritarian makes more money.
Looking at you Manifest V2 ... pour one out for your homies.
Don't get me wrong, Google is bad/evil in many ways, but the public suffix list exists to solve a real risk to users. Google is flagging this for a legit reason in this particular case.
It's not a legit reason at all. A website isn't "unsafe" just because it looks similar to another one to Google's AI. At best such an automated flag should trigger a human review, not take the website offline.
Google needs to be held liable for the damages they do in cases like this or they will continue to implement the laziest solutions as long as they can externalize the costs.
Sympathy for the devil, people keep using Google's browser because the safe search guards catch more bad actors than they false positive good actors.
> the safe search guards catch more bad actors than they false positive good actors.
Well, if the legal system used the same "Guilty until proven innocent" model, we would definitely "catch more bad actors than false positive good actors".
That's a tricky one, isn't it.
You do not want malware protection to be running at the speed of the legal system.
A better analogy, unfortunately for all the reasons it's unfortunate, is police: acting on the partial knowledge in the field to try to make the not-worst decision.
> people keep using Google's browser because the safe search guards catch more bad actors than they false positive good actors.
This is the first thing I disable in Chrome, Firefox and Edge. The only safe thing they do is safely sending all my browsing history to Google or Microsoft.
That's a reasonable thing for you to do (especially if you have some other signal source you use for malware protection), but HN readers are rarely representative of average users.
This feature is there for my mother-in-law, who never saw a popup ad she didn't like. You might think I'm kidding; I am not. I periodically had to go into her Android device and dump twenty apps she had manually installed from the Play Store because they were in a ring of promoting each other.
This is not an honest argument. Most people don't even know this web censorship mechanism exists until they see something (usually legit) blocked.
Do they then switch browsers in response?
Downvoted for saying the truth.
Many Google employees are in here, so I don't expect them to agree with you.
Looking through some of the links in this post, I think there are actually two separate issues here:
1. Immich hosts user content on their domain. And should thus be on the public suffix list.
2. When users host an open source self hosted project like immich, jellyfin, etc. on their own domain it gets flagged as phishing because it looks an awful lot like the publicly hosted version, but it's on a different domain, and possibly a domain that might look suspicious to someone unfamiliar with the project, because it includes the name of the software in the domain. Something like immich.example.com.
The first one is fairly straightforward to deal with, if you know about the public suffix list. I don't know of a good solution for the second though.
I don't think the Internet should be run by being on special lists (other than like, a globally run registry of domain names)...
I get that SPAM, etc., are an issue, but, like f* google-chrome, I want to browse the web, not some carefully curated list of sites some giant tech company has chosen.
A) you shouldn't be using google-chrome at all B) Firefox should definitely not be using that list either C) if you are going to have a "safe sites" list, that should definitely be a non-profit running that, not an automated robot working for a large probably-evil company...
> I don't think the Internet should be run by being on special lists
People are reacting as if this list is some kind of overbearing way of tracking what people do on the web - it's almost the opposite of that. It's worth clarifying this is just a suffix list for user-hosted content. It's neither a list of user-hosted domains nor a list of safe websites generally - it's just suffixes for a very small specific use-case: a company providing subdomains. You can think of this as a registry of domain sub-letters.
For instance:
- GitHub.io is on the list but GitHub.com is not - GitHub.com is still considered safe
- I self-host an immich instance on my own domain name - my immich instance isn't flagged & I don't need to add anything to the list because I fully own the domain.
The specific instance is just for Immich themselves who fully own "immich.cloud" but sublet subdomains under it to users.
> *if you are going to have a "safe sites" list"
This is not a safe sites list! This is not even a sites list at all - suffixes are not sites. This also isn't even a "safe" list - in fact it's really a "dangerous" list for browsers & various tooling to effectively segregate security & privacy contexts.
Google is flagging the Immich domain not because it's missing from the safe list but because it has legitimate dangers & it's missing from the dangerous list that informs web clients of said dangers so they can handle them appropriately.
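To make the "suffix, not site" distinction concrete, here is a minimal sketch of how a client might turn a hostname into the unit it treats as one site. The rule set is a tiny hand-picked stand-in, not the real list, and wildcard and exception rules from the actual PSL format are left out; `public_suffix` and `registrable_domain` are names invented for this example.

```python
# Tiny illustrative rule set; the real list has thousands of entries.
RULES = {"com", "io", "uk", "co.uk", "github.io"}


def public_suffix(host):
    """Return the longest listed suffix of host, falling back to the last label."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])  # longest candidates are tried first
        if candidate in RULES:
            return candidate
    return labels[-1]  # unlisted TLDs are treated as a one-label suffix


def registrable_domain(host):
    """Return the suffix plus one label, or None if host is itself a suffix."""
    labels = host.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(host).split("."))
    if len(labels) <= suffix_len:
        return None
    return ".".join(labels[-(suffix_len + 1):])
```

With these rules `alice.github.io` and `bob.github.io` come out as two separate sites, while `www.github.com` and `gist.github.com` both collapse to `github.com`.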
Firefox and Safari also use the list. At least by default, I think you can turn it off in firefox. And on the whole, I think it is valuable to have _a_ list of known-unsafe sites. And note that Safe Browsing is a blocklist, not an allowlist.
The problem is that at least some of the people maintaining this list seem to be a little trigger happy. And I definitely think Google probably isn't the best custodian of such a list, as they have obvious conflicts of interest.
> I think it is valuable to have _a_ list of known-unsafe sites
But this is not that list because sites are added using opaque automated processes that are clearly not being reviewed by humans - even if those sites have been removed previously after manual review.
>I think it is valuable to have _a_ list of known-unsafe sites
And who should define what is considered an unsafe site, and how?
Ideally there should be several/many and the user should be able to direct their browser as to which they would like to use (or none at all)
It always has been run on special lists.
I've coined the phrase "Postel decentralization" to refer to things where people expect there to be some distributed consensus mechanism but it turned out that the design of the internet was to email Jon Postel (https://en.wikipedia.org/wiki/Jon_Postel) to get your name on a list. e.g. how IANA was originally created.
Oh god, you reminded me the horrors of hosting my own mailserver and all of the white/blacklist BS you have to worry about being a small operator (it's SUPER easy to end up on the blacklists, and is SUPER hard to get onto whitelists)
There are other browsers if you want to browse the web with the blinders off.
It's browser beware when you do, but you can do it.
You can turn it off in Chrome settings if you want.
If you have such strong feelings, you could always use vanilla chromium.
> I don't know of a good solution for the second though.
I know the second issue can be a legitimate problem but I feel like the first issue is the primary problem here & the "solution" to the second issue is a remedy that's worse than the disease.
The public suffix list is a great system (despite getting serious backlash here in HN comments, mainly from people who have jumped to wildly exaggerated conclusions about what it is). Beyond that though, flagging domains for phishing for having duplicate content smells like an anti-self-host policy: sure there are phishers making clone sites, but the vast majority of sites flagged are going to be legit unless you employ a more targeted heuristic, and doing so isn't incentivised by Google's (or most companies') business model.
> When users host an open source self hosted project like immich, jellyfin, etc. on their own domain...
I was just deploying your_spotify and gave it your-spotify.<my services domain> and there was a warning in the logs that talked about this, linking the issue:
https://github.com/Yooooomi/your_spotify/issues/271
That means the Safe Browsing abuse could be weaponized against self-hosted services, oh my...
New directive from the Whitehouse. Block all non approved sites. If you don't do it we will block your merger etc...
Yeah it's only time until someone in power will realize there is already a mechanism for global web censorship that they can make use of.
The second is a real problem even with completely unique applications. If they have UI portions that have lookalikes, you will get flagged. At work, I created an application with a sign-in popup. Because it's for internal use only, the form in the popup is very basic, just username and password and a button. Safe Browsing continues to block this application to this day, despite multiple appeals.
Even the first one only works if there's no need to have site-wide user authentication on the domain, because you can't have a domain cookie accessible from subdomains anymore otherwise.
The issue isn't the user-hosted content - I'm running a release build of Immich on my own server and Google flagged my entire domain.
Is it on your own domain?
Yes, my own domain.
[dead]
Is the subdomain named immich or something more general?
The subdomain is "immich", which has crossed my mind as a potential flagging characteristic.
Thanks for the datapoint. I agree with sibling that it shouldn't be a problem, but am glad to discover from this thread that it may be.
Don't accept that rhetoric. Google shouldn't get to decide how you can design your own website.
They aren't hosting user content; it was their pull request preview domains that were triggering it.
This is very clearly just bad code from Google.
I thought this story would be about some malicious PR that convinced their CI to build a page featuring phishing, malware, porn, etc. It looks like Google is simply flagging their legit, self-created Preview builds as being phishing, and banning the entire domain. Getting immich.cloud on the PSL is probably the right thing to do for other reasons, and may decrease the blast radius here.
The root cause is bad behaviour by google. This is merely a workaround.
[flagged]
Please point me to where GoDaddy or any other hosting site mentions public suffix, or where Apple or Google or Mozilla have a listing hosting best practices that include avoiding false positives by Safe Browsing…
>GoDaddy or any other hosting site mentions public suffix
They don't need to mention it because they handle it on behalf of the client. Them recommending best practices like using separate domains makes as much sense as them recommending what TLS configs to use.
>or where Apple or Google or Mozilla have a listing hosting best practices that include avoiding false positives by Safe Browsing…
Since when were those sites the go-to place to learn how to host a site? Apple doesn't offer anything related to web hosting besides "a computer that can run nginx". Google might be the place to ask if you were your aunt and "google" means "internet" to her. Mozilla is the most plausible one because they host MDN, but hosting documentation on HTML/CSS/JS doesn't necessarily mean they offer hosting advice, any more than expecting docs.djangoproject.com to contain hosting advice.
The underlying question is how are people supposed to know about this before they have a big problem?
[flagged]
Nothing in this article indicates UGC is the problem. It's that Google thinks there's an "official" central immich and these instances are impersonating it.
What malicious UGC would you even deliver over this domain? An image with scam instructions? CSAM isn't even in scope for Safe Browsing, just phishing and malware.
It's not a "service" at all. It's Google maliciously inserting themselves into the browsing experience of users, including those that consciously choose a non-Google browser, in order to build a global web censorship system.
>You might not think it is, but internet is filled utterly dangerous, scammy, phisy, malwary websites
Google is happy to take their money and show scammy ads. Google ads are the most common vector for fake software support scams. Most people google something like "microsoft support" and end up there. Has Google ever banned their own ad domains?
Google is the last entity I would trust to be neutral here.
The argument would work better if Google wasn't the #1 distributor of scams and malware in the world with adsense. (Which strangely isn't flagged by safe browsing, maybe a coincidence)
[flagged]
> Imagine defending the most evil, trillion dollar corp
Hyperbole much?
Don't forget to get your worthless fiat pay check from Google adsense for a successful shilling campaign!
Not at all.
[flagged]
What is Safari getting by using Safe Browsing?
Is this a rhetorical question? Safari is just a middleman. G offers seemingly free services in exchange for your data and in order to get a market monopoly. Then they can sell you to their advertisers, squeeze out the competition and become the only Sheriff in town. How many free lunches have you gotten in your career?
”Competition is for losers.” -Peter Thiel
[flagged]
You should not be downvoted. Either HN has had an influx of ignorant normies or it's google bots attacking any negative comments
People working for famous adtech companies don't like it when people like op burst their bubble. I myself don't like it one bit - keep on changing the world you beautiful geniuses!
Exactly! Most of HN users work for "big tech" and are complete sell outs to their corporate overlords. Majority of them are to blame for the current bloated state of the web along with excessive mass surveillance and anti-privacy state we are in
HN is extremely tone-policed. Lines like "holy shit look in a mirror" are likely to attract downvotes because of their form, with no other factors being considered.
It's full of people described in this blog post [1]. As it concludes, GTFO! Flagging is the IRL equivalent of crying to your superior instead of actually having an argument which is pathetic
[1] - https://geohot.github.io/blog/jekyll/update/2025/10/15/pathe...
HN flagging is just shadow moderation.
I asked dang if I was shadowbanned from flagging. He said yes, if I flag something then it doesn't count because I flagged the wrong things in the past.
The conclusion is that flagging isn't really up to user choice, but is up to dang who decides which things should be flagged and which shouldn't. It's a bit like how on Reddit, the only comments you can see are the ones that agree with the moderators of that subreddit.
Is that actually relevant when only images are user content?
Normally I see the PSL in context of e.g. cookies or user-supplied forms.
> Is that actually relevant when only images are user content?
Yes. For instance in circumstances exactly as described in the thread you are commenting in now and the article it refers to.
Services like google's bad site warning system may use it to indicate that it shouldn't consider a whole domain harmful if it considers a small number of its subdomains to be so, where otherwise they would. It is no guarantee, of course.
Well, using the public suffix list _also_ isolates cookies and treats the subdomains as different sites, which may or may not be desirable.
For example, if users are supposed to log in on the base account in order to access content on the subdomains, then using the public suffix list would be problematic.
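As a sketch of why that breaks shared login cookies: a client that consults the list will refuse a cookie whose Domain attribute is itself a public suffix. The check below is a simplified version of that idea (real cookie handling has more cases, for example a host that is exactly equal to the suffix), and the hostnames, suffix sets, and the `cookie_domain_allowed` helper are all made up for illustration.

```python
def cookie_domain_allowed(request_host, domain_attr, public_suffixes):
    """Decide whether a Set-Cookie Domain attribute would be accepted.

    Simplified rule: reject if the attribute is a public suffix, otherwise
    accept only when the request host equals or is a subdomain of it.
    """
    domain = domain_attr.lower().lstrip(".")
    host = request_host.lower()
    if domain in public_suffixes:
        return False  # a cookie this wide would leak across unrelated tenants
    return host == domain or host.endswith("." + domain)


# Hypothetical hosting platform, before and after it is added to the list.
before_listing = {"example"}
after_listing = {"example", "usercontent.example"}
```

Before listing, `alice.usercontent.example` can set a cookie for all of `usercontent.example`; after listing, the same Set-Cookie is dropped, which is exactly the isolation (and the login inconvenience) described above.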
Cross domain identity management is a little extra work, but it's far from a difficult problem. I understand the objection to needing to do it when a shared cookie is so easy, but if you want subdomains to be protected from each other because they do not have shared responsibility for each other then it makes sense in terms of privacy & security that they don't automatically share identity tokens and other client-side data.
In another comment in this thread, it was confirmed that these PR host names are only generated from branches internal to Immich or labels applied by maintainers, and that this does not automatically happen for arbitrary PRs submitted by external parties. So this isn’t the use case for the public suffix list - it is in no way public or externally user-generated.
What would you recommend for this actual use case? Even splitting it off to a separate domain name as they’re planning merely reduces the blast radius of Google’s false positive, but does not eliminate it.
If these are dev subdomains that are actually for internal use only, then a very reliable fix is to put basic auth on them, and give internal staff the user/password. It does not have to be strong, in fact it can be super simple. But it will reliably keep out crawlers, including Google.
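A minimal sketch of that kind of gate, assuming the preview server lets you inspect the raw Authorization header before serving anything. The `check_basic_auth` helper and the credentials are hypothetical, and in practice most reverse proxies can do this with a couple of config lines instead.

```python
import base64
import hmac


def check_basic_auth(authorization_header, expected_user, expected_password):
    """Return True if the header carries the expected HTTP Basic credentials."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        token = authorization_header[len("Basic "):]
        decoded = base64.b64decode(token, validate=True).decode("utf-8")
    except ValueError:  # covers malformed base64 and invalid UTF-8
        return False
    user, sep, password = decoded.partition(":")
    if not sep:
        return False
    # Constant-time comparisons so the check does not leak how much matched.
    user_ok = hmac.compare_digest(user.encode(), expected_user.encode())
    password_ok = hmac.compare_digest(password.encode(), expected_password.encode())
    return user_ok and password_ok


# Hypothetical shared credentials handed out to internal staff.
header = "Basic " + base64.b64encode(b"staff:preview").decode("ascii")
```

Requests without a valid header would get a 401 with a `WWW-Authenticate: Basic` challenge, which crawlers do not get past.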
They didn't say that these are actually for internal use only. They said that they are generated either from maintainers applying labels (as a manual human decision) or from internal PR branches, but they could easily be publicly facing code reviews of internally developed versions, or manually internally approved deployments of externally developed but internally reviewed code.
None of these are the kind of automatic user-generated content that the warning is attempting to detect, I think. And requiring basic auth for everything is quite awkward, especially if the deployment includes API server functionality with bearer token auth combined with unauthenticated endpoints for things like built-in documentation.
How does the PSL make any sense? What stops an attacker from offering free static hosting and then making use of their own service?
I appreciate the issue it tries to solve but it doesn't seem like a sane solution to me.
PSL isn't a list of dangerous sites per-se.
Browsers already do various levels of isolation based on domain / subdomains (e.g. cookies). PSL tells them to treat each subdomain as if it were a top level domain because they are operated by (leased out to) different individuals / entities. With regard to blocking, it just means that if one subdomain is marked bad, it's less likely to contaminate the rest of the domain since they know it's operated by different people.
Marking for cookie isolation makes sense, but could be done more effectively via standardized metadata sent by the first party themselves rather than a centralized list maintained by a third party.
Informing decisions about blocking doesn't make much sense (IMO) because it's little more than a speed bump for an attacker. Certainly every little bit can potentially help but it also introduces a new central authority, presents an additional hurdle for legitimate operators, introduces a number of new failure modes, and in this case seems relatively trivial for a determined attacker to overcome.
This is not about user content, but about their own preview environments! Google decided their preview environments were impersonating... Something? And decided to block the entire domain.
I think this is only true if you host independent entities. If you simply construct deep names about yourself with a demonstrable chain of authority back, I don't think the PSL wants to know. Otherwise there is no hierarchy: the dots are just convenience strings and it's a flat namespace the size of the PSL's length.
Aw. I saw Jothan Frakes and briefly thought my favorite Starfleet first officer's actor had gotten into writing software later in life.
Does Google use this for Safe Browsing though?
Looks like it? https://developers.google.com/safe-browsing/reference/URLs.a...
Oh - of course this is where I find the answer why there's a giant domain list bloating my web bundles (tough-cookie/tldts).
There is no law appointing that organization as a world wide authority on tainted/non tainted sites.
The fact it's used by one or more browsers in that way is a lawsuit waiting to happen.
Because they, the browsers, are pointing a finger to someone else and accusing them of criminal behavior. That is what a normal user understands this warning as.
Turns out they are wrong. And in being wrong they may well have harmed the party they pointed at, in reputation and / or sales.
It's remarkable how short sighted this is, given that the web is so international. It's not a defense to say some third party has a list, and you're not on it so you're dangerous.
Incredible
I love all the theoretical objections to something that has been in use for nearly 20 years.
As far as I know there is currently no international alternative authority for this. So definitely not ideal, but better than not having the warnings.
Yes but that's not a legal argument.
Your Honor, we hurt the plaintiff because it's better than nothing!
True, and agreed that lawsuits are likely. Disagree that it's short-sighted. The legal system hasn't caught up with internet technology and global platforms. Until it does, I think browsers are right to implement this despite legal issues they might face.
In what country hasn't the legal system caught up?
The point I raise is that the internet is international. There are N legal systems that are going to deal with this. And in 99% of them this isn't going to end well for Google if plaintiff can show there are damages to a reasonable degree.
It's bonkers in terms of risk management.
If you want to make this a workable system you have to make it very clear this isn't necessarily dangerous at all, or criminal. And that a third party list was used, in part, to flag it. And even then you're impeding visitors to a website with warnings without any evidence that there is in fact something wrong.
If this happens to a political party hosting blogs, it's hunting season.
I meant that there is no global authority for saying which websites are OK and which ones are not. So not really that the legal systems in specific countries have not caught up.
Lacking a global authority, Google is right to implement a filter themselves. Most people are really really dumb online and if not as clearly "DO NOT ENTER" as now, I don't think the warnings will work. I agree that from a legal standpoint it's super dangerous. Content moderation (which is basically what this is) is an insanely difficult problem for any platform.
The alternative is to not do this.
Never host your test environments as subdomains of your actual production domain. You'll also run into email reputation problems as well as cookie hell. You can get a lot of cookies from the production env if not managed well.
This. I cannot believe the rest of the comments on this are seemingly completely missing the problem here & kneejerk-blaming Google for being an evil corp. This is a real issue & I don't feel like the article from the Immich team acknowledges it. Far too much passing the buck, not enough taking ownership.
It's true that putting locks on your front door will reduce the chance of your house getting robbed, but if you do get robbed, the fact that your front door wasn't locked does not in any way absolve the thief for his conduct.
Similarly, if an organization deploys a public system that engages in libel and tortious interference, the fact that jumping through technical hoops might make it less likely to be affected by that system does not in any way absolve the organization for operating it carelessly in the first place.
Just because there are steps you can take to lessen the impact of bad behavior does not mean that the behavior itself isn't bad. You shouldn't have to restrict how you use your own domains to avoid someone else publishing false information about your site. Google should be responsible for mitigating false positives, not the website owners affected by them.
> mitigating false positives
First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
I think there's likely a conversation to be had about messaging - Chrome's warning page seems a little scarier than it should be, Firefox's is more measured in its messaging. But in terms of the API service Google are providing here this is absolutely not a false positive.
The rest of your comment seems to be an analogy about people not being responsible for protecting their home or something, I'm not quite sure. If you leave your apartment unlocked when you go out & a thief steals your housemate's laptop, is your housemate required to exclusively focus on the thief or should they be permitted to request you to be more diligent about locking doors?
> First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
Where are you getting that from? I don't see any evidence that there actually was any malicious activity going on on the Immich domain.
> But in terms of the API service Google are providing here this is absolutely not a false positive.
Google is applying heuristics derived from statistical correlations to classify sites. When a statistical indicator is present, but its target variable is not present, that is the very definition of a false positive.
Just because their verbiage uses uncertainty qualifiers like "may" or "might" doesn't change the fact that they are materially interfering with a third party's activities based on presumptive inferences that have not been validated -- and in fact seem to be invalid -- in this particular case.
> If you leave your apartment unlocked when you go out & a thief steals your housemate's laptop, is your housemate required to exclusively focus on the thief or should they be permitted to request you to be more diligent about locking doors?
One has nothing to do with the other. The fact that you didn't lock your door does not legitimize the thief's behavior. Google's behavior is still improper here, even if website operators have the option of investing additional time, effort, or money to reduce the likelihood of being misclassified by Google.
> its target variable is not present, that is the very definition of a false positive
The target variable is user hosted content on subdomains of a domain not listed in Mozilla's public suffix list. Firefox & Chrome apply a much stricter set of security settings for domains on that list, due to the inherent dangers of multiuser domains. That variable is present, Immich have acknowledged it & are migrating to a new domain (which they will hopefully add to Mozilla's list).
> The fact that you didn't lock your door does not legitimize the thief's behavior. Google's behavior is still improper here
I made no claims about legitimising the thief's behaviour - only that leaving your door unlocked was negligent from the perspective of your housemate. That doesn't absolve the thief. Just as any malicious actor trying to compromise Immich users would still be the primary offender here, but that doesn't absolve Immich of a responsibility to take application security seriously.
And I don't really understand where Google fits in your analogy? Is Google the thief? It seems like a confusing analogy.
> The target variable is user hosted content on subdomains of a domain not listed in Mozilla's public suffix list.
No, that's the indicator. The target variable is "malicious website".
> First & foremost I really need to emphasise that, despite the misleading article title, this was not a false positive. Google flagged this domain for legitimate reasons.
Judging by what a person from the Immich team said, that does not seem to be true?
> the whole system only works for PRs from internal branches - https://news.ycombinator.com/item?id=45681230
So unless one of the developers in the team published something malicious through that system, it seems Google did not have a legitimate reason for flagging it.
> unless one of the developers in the team published something malicious through that system
If that happened we'd have much bigger problems than Google's flagging.
Anyone can open a PR. Deploys are triggered by an Immich collaborator labelling the PR, but it doesn't require them to review or approve the code being deployed.
As I've mentioned in several other comments in this thread by now: The whole preview functionality only works for internal PRs, untrusted ones would never even make it to deployment.
Yes, but unless that PR contains malicious code, the domain shouldn't be marked as such. You should assume good faith, not the other way around.
> Google flagged this domain for legitimate reasons.
No they didn't.
Do you know the legitimate reasons?
Because the article seems to only ever get an excuse from Google that is easy to dismiss because most sites do something similar.
The legitimate reason is that the domain is correctly classified as having user generated active content, because the Immich GitHub repo allows anyone to submit arbitrary code via PR, and PRs can be autodeployed to this domain without passing review or approval.
Domains with user generated active content should typically be listed on Mozilla's Public Suffix list, which Firefox & Chrome both check & automatically apply stricter security settings to, to protect users.
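For anyone unfamiliar with what being on the list changes, here is a rough sketch of the lookup. Everything in it is an assumption for illustration: the `SUFFIXES` set is a tiny hand-written stand-in for the real list, the function name is made up, and the real matching algorithm also has wildcard and exception rules that this ignores. The idea is that the "site" is the matched suffix plus one more label, so listing a domain moves the site boundary down to each subdomain.

```python
from typing import Optional

# Hypothetical, hand-written subset of suffix rules. The real Public Suffix
# List is far larger and includes wildcard/exception rules not handled here.
SUFFIXES = frozenset({"com", "uk", "co.uk", "github.io"})


def registrable_domain(host: str, suffixes=SUFFIXES) -> Optional[str]:
    """Return the matched public suffix plus one label, or None if no match."""
    labels = host.lower().rstrip(".").split(".")
    # Scan from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no rule in this toy list matches


# Two subdomains of an unlisted domain collapse into one site...
print(registrable_domain("pr-1.example.com"), registrable_domain("pr-2.example.com"))
# ...while subdomains of a listed suffix are treated as separate sites.
print(registrable_domain("alice.github.io"), registrable_domain("bob.github.io"))
```

With this toy list, the two `example.com` previews share a site, while `alice.github.io` and `bob.github.io` do not.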
> correctly classified as having user generated active content
No it's not
> PRs can be autodeployed to this domain without passing review or approval.
No they can't
There is no untrusted/user content on these domains.
> Google flagged this domain for legitimate reasons.
Why would it flag a domain rather than a subdomain?
Which subdomain?
The one that contained malicious code, if there was any.
Both things can be problems.
1. You should host dev stuff on separate domains.
2. Google shouldn't be blocking your preview environments.
A safe browsing service is not a terrible idea (which is why both Safari & Firefox use Google for this) & while I hate that Google has a monopoly here, I do think a safe browsing service should absolutely block your preview environments if those environments have potential dangers for visitors to them & are accessible to the public.
However, why does it work in such a way that it blocks the whole domain and not just the subdomains?
Is it far fetched that the people controlling a subdomain may not be the same that control the domain?
Which subdomains?
To be clear, the issue here is that some subdomains pose a risk to the overall domain - visiting any increases your risk from others. It's also related to a GitHub workflow that auto-generates new subdomains on demand, so there's no possibility to have a fixed list of known subdomains since new ones are constantly being created.
That’s what the Public Suffix List is for
It is a terrible idea when what is "safe" is determined arbitrarily by a private corporation that is perhaps the biggest source of malicious behavior on the web.
Yes they could do better, but who appointed Google "chief of web security"? Google can eff right off.
Yep. Still I feel bad for them.
I think my comment came across a bit harsh - the Immich team are brilliant. I've hosted it for a long time & couldn't be happier & I think my criticisms of the tone of the article are likely a case of ignorance rather than any kind of laziness or dismissiveness.
It's also in general a thankless job maintaining any open-source project, especially one of this scale, so a certain level of kneejerk cynical dismissiveness around stuff like this is expected & very forgivable.
Just really hope the ignorance / knowledge-gap can be closed off though, & perhaps some corrections to certain statements published eventually.
There's quite a few comments of people having this happen to them when they self-host Immich, the issue you point out seems minor in comparison.
I think immich.app is the production domain, not cloud?
.cloud is used to host the map embedded in their webapp.
In fairness, in my local testing so far, it appears to be an entirely unauthenticated/credential-less service so there's no risk to sessions right now for this particular use-case. That leaves the only risk-factors being phishing & deploy environment credentials.
Happened to me last week. One morning we wake up and the whole company website does not work.
No advance notice with some time to fix any possible problem, just blocked.
It gave a very bad impression to our clients and users, and we had to explain a false positive from Google's detection.
The culprit, according to google search console, was a double redirect on our web email domain (/ -> inbox -> login).
After just moving the webmail to another domain, removing one of the redirections just in case, and asking politely 4 times to be unblocked, it took about 12 hours. And no real recourse, feedback or anything about when it's going to be solved. And no responsibility.
The worst part is the feeling of not being in control of your own business, and depending on a third party which is not related to us at all, and which made a huge mistake, to let our clients use our platform.
File a small claim for damages up to 10,000 to 20,000 USD depending on your local statutes.
It’s actually pretty quick and easy. They cannot defend themselves with lawyers, so a director usually has to show up.
It would be glorious if everybody unjustly screwed by Google did that. Barring antitrust enforcement, this may be the only way to force them to behave.
it wouldn't work. they'd hire some minimum wage person to go to all of them and just read the terms and conditions you agreed to that include language about arbitration or whatever
Terms of service, written by a corporation, do not overrule the law, of a country.
Especially not when the plaintiff isn't even a user of the service.
How did they agree to those terms?
Probably includes something insane like "By allowing your website to be crawled by google spiders, you agree to the following terms...."
Ok, by not objecting within 5 seconds you hereby agree to let me shoot you in the head.
In all US states corporations may be represented by lawyers in small claims cases. The actual difference is that in higher courts corporations usually must be represented by lawyers whereas many states allow normal employees to represent corporations when defending small claims cases, but none require it.
This is not accurate. I filed a claim against Bungalow in Oregon. They petitioned the judge to allow their in house attorney I was dealing with to represent them. The judge denied the request citing the Oregon statute that attorneys may not participate in small claims proceedings. Bungalow flew out their director of some division who was ill prepared.
Slam dunk. Took all of 6-8 hours of my time end to end. The claim was a single page document. Got the max award allowable. Would have got more had it been California.
55.090 Appearance by parties and attorneys; witnesses. (1) Except as may otherwise be provided by ORS 55.040, no attorney at law nor any person other than the plaintiff and defendant shall become involved in or in any manner interfere with the prosecution or defense of the litigation in the department without the consent of the justice of the justice court, nor shall it be necessary to summon witnesses.
I’m guessing you got lucky and most justices consent?
Why would you guess that? Most justices concern themselves with statute.
This is just so inaccurate, at least for California.
Not to mention that they have general counsel, who are lawyers but also just employees.
I've been thinking for a while that a coordinated and massive action against a specific company by people all claiming damages in small claims court would be a very effective way of bringing that company to heel.
I wonder how that will work with mandatory arbitration clauses. Guess you don't know until you try.
Valve tried this. But there's no class action arbitration. Meaning that instead of a single class action suit, they had thousands of individual arbitration cases and they were actually begging people to sue them instead. So we could just do that. If they want mandatory arbitration they can have mandatory arbitration. From half of us, just in case it doesn't work.
Swimmingly. It apparently works swimmingly.[0]
Another idea that's worth investigating are coordinated payment strikes on leveraged companies that offer monthly services like telco companies. A bunch of their customers going "Oops, guess I can't afford to pay this month, gonna have to eat that 2% late fee next month, or maybe the month after that, or maybe the month after that" on a service that won't be disconnected in the first month could absolutely crush a company that requires that monthly income to pay their debt.
[0] https://jacobin.com/2022/05/mass-arbitration-mandatory-agree...
I was under the impression that the Supreme Court had ruled that mandatory arbitration clauses were indeed mandatory. Meaning, if you are subject to a mandatory arbitration clause in some contract, it removes ALL ability for a plaintiff to sue a company.
But, good news, it seems like they are walking back on that. They recently ruled that lower courts must "pause" a suit and the suit can resume if an agreement is not made through arbitration.
https://www.bressler.com/news-supreme-court-clarifies-mandat...
Do small claims apply to things like this where damages are indirect?
I believe so. For me it was helpful to visualize getting up and convincing the judge of the damages.
I’d run a PnL, get average daily income from visitors, then claim that loss as damages. In court I’d bring a simple spreadsheet showing the hole in income as evidence of damages.
If there were contractors to help get the site back up I’d claim their payments as damages and include their invoices as evidence.
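As a rough sketch of that spreadsheet arithmetic (every figure below is made up, and the function name is just for illustration): take the average daily income before the block, sum the shortfall for each blocked day, then add the contractor invoices.

```python
def estimate_damages(baseline_daily_income, income_during_block, contractor_invoices=()):
    """Sum the shortfall against the pre-block daily average, plus recovery costs."""
    average_daily = sum(baseline_daily_income) / len(baseline_daily_income)
    # Count only shortfalls; a day that beat the average does not offset the rest.
    lost_income = sum(max(average_daily - actual, 0.0) for actual in income_during_block)
    return lost_income + sum(contractor_invoices)


# Hypothetical numbers: three normal days, two blocked days, one invoice.
total = estimate_damages(
    baseline_daily_income=[1000, 1200, 1100],
    income_during_block=[100, 300],
    contractor_invoices=[500],
)
print(total)
```

Here the average is 1100 per day, the two blocked days fall short by 1000 and 800, and the invoice adds 500.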
And now your Gmail account has been deleted as well as any other accounts you had with Google
That's okay, you have backup of your data, and you don't really depend on your Gmail account for anything important.
I’ve probably got about a thousand accounts that use a Gmail account as the associated email / username. I doubt this is uncommon compared to the number of people with custom domains.
The problem here wouldn't be the data but all the people whose only (or at least primary) way to reach you is the Gmail address.
So what? Why would you want to continue to use the services of a company you had to sue? That’s kind of a “burning the bridges” moment.
The whole problem is vendor lock in. Changing your email address if you’ve had it long is not straightforward or easy.
> The culprit, according to google search console, was a double redirect on our web email domain (/ -> inbox -> login).
I find it hard to believe that the double redirect itself tripped it: multiple redirects in a row is completely normal—discouraged in general because it hurts performance, but you encounter them all the time. For example, http://foo.example → https://foo.example → https://www.foo.example (http → https, then add or remove www subdomain) is the recommended pattern. And site root to app path to login page is also pretty common. This then leads me to the conclusion that they’re not disclosing what actually tripped it. Maybe multiple redirects contributed to it, a bad learned behaviour in an inscrutable machine learning model perhaps, but it alone is utterly innocuous. There’s something else to it.
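To make the hop count concrete without touching the network, here is a small sketch that walks a redirect table. The table, the function name and the hop limit are all assumptions for illustration; `foo.example` is the placeholder host from above.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow a URL through a redirect table and return every URL visited."""
    chain = [start]
    # Stop when the current URL has no redirect, or when the hop limit is hit
    # (which guards against redirect loops).
    while chain[-1] in redirects and len(chain) <= max_hops:
        chain.append(redirects[chain[-1]])
    return chain


# Hypothetical redirect table for the http -> https -> www pattern.
REDIRECTS = {
    "http://foo.example/": "https://foo.example/",
    "https://foo.example/": "https://www.foo.example/",
}

chain = redirect_chain("http://foo.example/", REDIRECTS)
print(len(chain) - 1, "redirects:", " -> ".join(chain))
```

That ordinary setup is already two redirects in a row, the same count as / -> inbox -> login.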
Want to see how often Microsoft accounts redirect you? I'd love to see Google block all of Microsoft, but of course that will never happen, because these tech giants are effectively a cartel looking out for each other. At least in comparison to users and smaller businesses.
The reason Google doesn’t block Microsoft isn’t that they’re “looking out for Microsoft.” They’re looking out for themselves by being aware that blocking something that millions of people use would be bad for business.
So why isn't blocking something that is starred 82k times on GitHub bad for business.
I forget. How much do users pay per star again?
That's peanuts compared to Microsoft's userbase
Same difference.
I suspect you're right... The problem is, and I've experienced this with many big tech companies, you never really get any explanation. You report an issue, and then, magically, it's "fixed," with no further communication.
This looks like the same suicide inducing type of crap by google that previously only android devs on playstore were subject to.
I'm permanently banned from the Play Store because 10+ years ago I made a third-party Omegle client, called it Yo-megle (neither Omegle nor Yo-megle still exist now), got a bunch of downloads and good ratings, then about 2 years later got a message from Google saying I was banned for violating trademark law. No actual legal action, just a message from Google. I suppose I'm lucky they didn't delete my entire Google account.
I'm beginning to seriously think we need a new internet, another protocol, other browsers just to break up the insane monopolies that have formed, because the way things are going, soon all discourse will be censored and competitors will be blocked.
We need something that's good for small and medium businesses again, local news and get an actual marketplace going - you know what the internet actually promised.
Anyone working on something like this?
We have a “new internet”. We have the indie web, VPNs, websites not behind Cloudflare, other browsers. You won’t have a large audience, but a new protocol won't fix that.
Also, plenty of small and medium businesses are doing fine on the internet. You only hear about ones with problems like this. And if these problems become more frequent and public, Google will put more effort into fixing them.
I think the most practical thing we can do is support people and companies who fall through the cracks, by giving them information to understand their situation and recover, and by promoting them.
"Google will put more effort into fixing them"
Why would they do that? Do they lose money from these people? Why would they care? They're a monopoly; they don't need to care.
Perhaps we need a different "type" of internet. I don't have the expertise to even explain what this would look like, but I know that if politics, religion, junk science and a hundred other influences have anything to do with it, it will eventually become too stupid to use.
Making a "smart person only" Internet is a social problem, not a technology problem.
We had a "smart person only internet". Then it became financially prudent to make it an "everyone internet", then we had the dot com boom, Apple, Google, etc bloom from that.
We _still_ have a "smart person only internet" really, it's just now used mostly for drug and weapon sales ( Tor )
Smart people want to dominate the stupids.
For some group of smart people, there will be a group of smarter people who want to dominate the people they designate "the stupids".
The internet was a technological solution to a social problem. It introduced other social problems, although arguably these to your point are old social problems in a new arena.
But there may be yet another technological solution to the old social problems of monopolism, despotic centralized control, and fraud.
.... I did say "may".
Everybody wants to dominate others using their strongest ability: smart, rich, strong, popular, fast, etc.
The community around NOSTR are basically building a kind of semantic web, where users identities are verified via their public key, data is routed through content agnostic relays, and trustworthiness is verified by peer recommendation.
They are currently experimenting with replicating many types of services which are currently websites as protocols with data types, with the goal being that all of these services can share available data with each other openly.
It's definitely more of a "bazaar" model than a "cathedral" model, with many open questions, and it's also tough to get a good overview of what is really going on there. But at least it's an attempt.
Stop trying to look for technological answers to political problems. We already have a way to avoid excessive accumulation of power by private entities, it's called "anti-trust laws" (heck, "laws" in general).
Any new protocol not only has to overcome the huge incumbent that is the web, it has to do so grassroots against the power of global capital (trillions of dollars of it). Of course, it also has to work in the first place and not be captured and centralised like another certain open and decentralised protocol has (i.e., the Web).
Is that easier than the states doing their jobs and writing a couple pages of text?
States are made of people, both at the decision-making level and at street level. Many anti-trust laws were made when the decision-makers were not closely tied to the interests involved; nowadays this seems to be changing. At no point, I think, did people at street level ever understand the actual implications.
A structural solution is to educate and lift the whole population to better understand the implications of their choices.
A tactic solution is to try to limit the collusion of decision people and private entities, but this does not seem to go extremely well.
An "evolutionary" solution (that just happens) used to be to have a war - that would push a lot of people to look for efficiency rather than for some interests. But this is made more complex by nukes.
I don't really see how anti-trust would address something like Google Chrome's safe browsing infrastructure.
The problem is that the divide of alignment of interests there is between new, small companies and users. New companies want to put up a website without tripping over one of the thousand unwritten rules of "How to not look like a phishing site or malware depot" (many of which are unwritten because protecting users and exploiting users is a cat-and-mouse game)... And users don't want to get owned.
Shard Chrome off from Google and it still has incentives to protect users at the cost of new companies' ease of joining the global network as a peer citizen. It may have less signal as a result of a curtailed visibility on the state of millions of pages, but the consequence of that is that it would offer worse safe browsing protection and more users would get owned as a result.
Perhaps the real issue is that (not unlike email) joining the web as a peer citizen has just plain gotten harder in the era of bad actors exploiting the infrastructure to cause harm to people.
Like... You know what never has these problems? My blog. It's a static-site-generated collection of plain HTML that updates once in a blue moon via scp. I'm not worried about Google's safe browsing infrastructure, because I never look like a malware site. And if I did trip over one of the unwritten rules (or if attackers figured out how to weaponize something personal-blog-shaped)? The needs of the many justify Chrome warning people before going to my now-shady site.
> The problem is that the divide of alignment of interests there is between new, small companies and users. New companies want to put up a website without tripping over one of the thousand unwritten rules of "How to not look like a phishing site or malware depot" (many of which are unwritten because protecting users and exploiting users is a cat-and-mouse game)... And users don't want to get owned
Some candidate language:
- Monopolistic companies may not actively impose restrictions which harm others (includes businesses)
or
- Some restrictions are allowed, but the company must respond to an appeal of restrictions within X minutes; Appeals to the company can themselves be appealed to a governmental independent board which binds the company with no further review permitted; All delays and unreasonable responses incur punitive penalties as judged by the board; All penalties must be paid immediately
or
- If an action taken unilaterally by a company 1) harms someone AND 2) is automated: Then, that automation must be immediately, totally, and unconditionally reversed upon the unilateral request of the victim. The company may reinstate the action upon the sworn statement of an employee that they have made the decision as a human, and agree to be accountable for the decision. The decision must then follow the above appeals process.
or
- No monopolies allowed
> Monopolistic companies may not actively impose restrictions which harm others (includes businesses)
That's not generally how monopoly is interpreted in the US (although jurisprudence on this may be shifting). In general, the litmus test is consumer harm. A company is allowed to control 99% of the market if they do it by providing a better experience to consumers than other companies can; that's just "being successful." Microsoft ran afoul of antitrust because their browser sucked and embedding it in the OS made the OS suck too; if they hadn't tried to parlay one product into the other they would be unlikely to have run afoul of US antitrust law, and they haven't run afoul of it over the fact that 70-90% of x86 architecture PCs run Windows.
> Some restrictions are allowed, but the company must respond to an appeal of restrictions within X minutes; Appeals to the company can themselves be appealed to a governmental independent board which binds the company with no further review permitted; All delays and unreasonable responses incur punitive penalties as judged by the board; All penalties must be paid immediately
There may be meat on those bones (a general law restricting how browsers may operate in terms of rendering user content). Risky because it would codify into law a lot of ideas that are merely technical specifications (you can look to other industries to see the consequences of that, like how "five-over-ones" are cropping up in cities all over the US because they satisfy a pretty uniform fire and structural safety building code to the letter). But this could be done without invoking monopoly protection.
> If an action taken unilaterally by a company 1) harms someone AND 2) is automated: Then, that automation must be immediately, totally, and unconditionally reversed upon the unilateral request of the victim.
Too broad. It harms me when Google blocks my malware distribution service because I'm interested in getting malware on your machine; I really want your Bitcoin wallet passwords, you see. ;)
Most importantly: this whole topic is independent of monopolies. We could cut Chrome out of Google tomorrow and the exact same issues with safe browsing impeding new sites with malware-ish shapes would exist (with the only change probably being the false positive rate would go up, since a Chrome cut off from Google would have to build out its detection and reporting logic from scratch without relying on the search crawler DB). More importantly, a user can install another browser that doesn't have site protection today (or, if I understand correctly, switch it off). The reason this is an issue is that users like Chrome and are free to use it and tend to find site protection useful (or at least "not a burden to them") and that's not something Google imposed on the industry, it's a consequence of free user choice.
> Too broad. It harms me when Google blocks my malware distribution service because I'm interested in getting malware on your machine; I really want your Bitcoin wallet passwords, you see. ;)
That's okay, a random company failing to protect users from harm is still better than harming an innocent person by accident. They already fail in many cases, obviously we accept a failure rate above 0%. You also skipped over the rest of that paragraph.
> users like Chrome and are free to use it and tend to find site protection useful (or at least "not a burden to them")
That's okay, Google can abide by the proposal I set forth avoiding automated mistaken harms to people. If they want to build this system that can do great harms to people, they need to first and foremost build in safety nets to address those harms they cause, and only then focus on reducing false negatives.
I think there's an unevaluated tension in goals between keeping users safe from malware here and making it easy for new sites to reach people, regardless of whether those sites display patterns consistent with malware distributors.
I don't think we can easily discard the first in favor of the second. Not nearly as categorically as is done here. Those "false negatives" mean users lose things (bank accounts, privacy, access to their computer) through no fault of their own. We should pause and consider that before weeping and rending our garments that yet another hosting provider solution had a bad day.
You've stopped considering monopoly and correctly considered that the real issue is safe browsing, as a feature, is useful to users and disruptive to new business models. But that's independent of Google; that's the nature of sharing a network between actors that want to provide useful services to people and actors that want to cause harm. If I build a browser today, from scratch, that included safe browsing we'd be in the same place and there'd be no Google in the story.
It's very, very hard to overcome the gravitational forces which encourage centralization, and doing so requires rooting the different communities that you want to exist in their own different communities of people. It's a political governance problem, not a technical one.
This is the key idea.
Companies have economy of scale (Google, for instance, is running dozens to hundreds of web apps off of one well-maintained fabric) and the ability to force consolidation of labor behind a few ideas by controlling salaries so that the technically hard, detailed, or boring problems actually get solved. Open source volunteer projects rarely have either of those benefits.
In theory, you could compete with Google via
- Well-defined protocols
- That a handful of projects implement (because if it's too many, you split the available talent pool and end up with e.g. seven mediocre photo storage apps that are thin wrappers around a folder instead of one Google Photos with AI image search capability).
- Which solve very technically hard, detailed, or boring technical problems (AI image search is an actual game-changer feature; the difference between "Where is that one photo I took of my dog? I think it was Christmas. Which Christmas, hell I don't know" and "Show me every photo of my dog, no not that dog, the other dog").
I'd even risk putting up bullet point four: "And be willing to provide solutions for problems other people don't want solved without those other people working to torpedo your volunteer project" (there are lots of folks who think AI image detection is de-facto evil and nobody should be working on it, and any open source photo app they can control the fate of will fall short of Google's offering for end-users).
You make it seem like the problem is of technical nature (instead of regulatory or other). Would you mind explaining why?
Technical alternatives already exist, see for example GNUnet.
The problem is that as soon as some technology gains traction, it catches the attention of businesses, and that is where the slow but steady enshittification process begins. Not that business necessarily equals enshittification, but in a world dominated by capitalism without borders, sooner or later someone will break some unwritten rules and others will have to follow to remain competitive, until that new technology becomes a new web, and we'll be back to square one. To me the problem isn't technical, and neither is its solution.
I'm interested to see how this will work with something like Mastodon.
Since Mastodon is, fundamentally, a protocol and reference implementation, people can come up with their own enshittified nodes or clients... And then the rest of the ecosystem can respond by just ignoring that work.
Yes, technically Truth Social is a Mastodon node. My Mastodon node doesn't have to care.
How about the Invisible Internet Project, https://geti2p.net?
IPFS has been doing some great work around decentralization that actually scales (Netflix uses it internally to speed up container delivery), but a) it's only good for static content, b) things still need friendly URLs, and c) once it becomes the mainstream, bad actors will find a way to ruin it anyway.
These apply to a lot of other decentralized systems too.
In no way does IPFS "actually scale" while it takes two minutes (120 seconds) to find an object.
This is not a technical problem. You will not solve it with purely technical solutions.
It won't get anywhere unless it addresses the issue of spam, scammers, phishing etc. The whole purpose of Google Safe Browsing is to make life harder for scammers.
How does the Internet address that?
True, but Google already censors their search results to push certain imperial agendas so I'm not trusting them in the long run.
I'm not sure, but it's on my mind.
I own what I think are the key protocols for the future of browsers and the web, and nobody knows it yet. I'm not committed to forking the web by any means, but I do think I have a once-in-a-generation opportunity to remake the system if I were determined to and knew how to remake it into something better.
If you want to talk more, reach out!
Intriguing comment, but your username does not inspire confidence.
Lol I get that from time to time, though I don't care much. I've always had the same username. I have the same username everywhere. I'm Conrad.
I do think I invite people to disrespect me a little though. It ensures that I have to work harder and succeed on the merit of my work.
I'm afraid this can't be built on the current net topology which is owned by the Stupid Money Govporation and inherently allows for roadblocks in the flow of information. Only a mesh could solve that.
But the Stupid Money Govporation must be dethroned first, and I honestly don't see how that could happen without the help of an ELE like a good asteroid impact.
It will take the same amount of time or less to get to where we are with the current Web.
What we have is the best sim env to see how stuff shapes up. So fixing it should be the aim; avoiding it will get us into similar spirals. We'll just go in circles.
Having a decade of fresh air is also a good incentive regardless of how it ends
I don't know, it is a lot of effort for a decade of fresh air. Then you will notice the same policies being implemented, since they will take as reference how people solved it in the past.
Have you talked to your lawyer? Making Google pay for their carelessness is the ONLY way to get them to care.
The one thing I never understood about these warnings is how they don't run afoul of libel laws. They are directly calling you a scammer and "attacker". The same for Microsoft with their unknown executables.
They used to be more generic, saying "We don't know if it's safe", but now they are quite assertive in stating you are indeed an attacker.
> They are directly calling you a scammer and "attacker".
No they're not. The word "scammer" does not appear. They're saying attackers on the site and they use the word "might".
This includes third-party hackers who have compromised the site.
They never say the owner of the site is the attacker.
I'm quite sure their lawyers have vetted the language very carefully.
"The people living at this address might be pedophiles and sexual predators. Not saying that they are, but if your children are in the vicinity, I strongly suggest you get them back to safety."
I think that might count as libel.
i think it's more akin to "people may have broken in and taken over this house, and within the house there may be sexual predators"
Still asserts that in that house there may be sexual predators. If I lived in that house I wouldn't be happy, and I would want a way of clearing the accusations and proving that there are indeed no sexual predators in my house quicksmart before other people start avoiding it.
You can’t possibly use the “they use the word ‘might’” argument and not mention the death red screen those words are printed over. If you are referring to abidance to the law, you are technically right. If we remove the human factor, you technically are.
> If you are referring to abidance to the law, you are technically right.
Yes, the question was literally about the law.
I wasn't trying to say anything else. I was answering the commenter's legal question.
I don't know what you are trying to imply.
> The one thing I never understood about these warnings is how they don't run afoul of libel laws.
I’m not a lawyer, but this hasn’t ever been taken to court, has it? It might qualify as libel.
I know of no such cases, and would love to know if someone finds one.
I worked for a company who had this happen to an internal development domain, not exposed to the public internet. (We were doing security research on our own software, so we had a pentest payload hosted on one of those domains as part of a reproduction case for a vulnerability we were developing a fix for.)
Our lawyers spoke to Google's lawyers privately, and our domains got added to a whitelist at Google.
you only sue somebody poorer than you
It depends, if it's a clear-cut case, then in jurisdictions with a functioning legal system it can be feasible to sue.
Likewise, if it's a fuckup that just needs to be put in front of someone who cares, a lawsuit is actually a surprisingly effective way of doing that. This moves your problem from "annoying customer support interaction that's best dealt with by stonewalling" into "legal says we HAVE to fix this".
Imagine if you bought a plate at Walmart and any time you put food you bought elsewhere on it, it turned red and started playing a warning about how that food will probably kill you because it wasn't Certified Walmart Fresh™
Now imagine it goes one step further, and when you go to eat the food anyway, your Walmart fork retracts into its handle for your safety, of course.
No brand or food supplier would put up with it.
That's what it's like trying to visit or run non-blessed websites and software coming from Google, Microsoft, etc on your own hardware that you "own".
This is the future. Except you don't buy anything, you rent the permission to use it. People from Walmart can brick your carrots remotely even when you don't use this plate, for your safety ofc
> The one thing I never understood about these warnings is how they don't run afoul of libel laws. They are directly calling you a scammer and "attacker"
Being wrong doesn't count as libel.
If a company has a detection tool, makes reasonable efforts to make sure it is accurate, and isn't being malicious, you'll have a hard time making a libel case
There is a truth defence to libel in the USA but there is no good faith defence. Think about it like a traffic accident, you may not have intended to drive into the other car but you still caused damage. Just because you meant well doesn't absolve you from paying for the damages.
This is tricky to get right.
If the false positive rate is consistently 0.0%, that is a surefire sign that the detector is not effective enough to be useful.
If a false positive is libel, then any useful malware detector would occasionally do libel. Since libel carries enormous financial consequences, nobody would make a useful malware detector.
I am skeptical that changing the wording in the warning resolves the fundamental tension here. Suppose we tone it down: "This executable has traits similar to known malware." "This website might be operated by attackers."
Would companies affected by these labels be satisfied by this verbiage? How do we balance this against users' likelihood of ignoring the warning in the face of real malware?
The problem is that it's so one sided. They do what they want with no effort to avoid collateral damage and there's nothing we can do about it.
They could at least send a warning email to the RFC2142 abuse@ or hostmaster@ address with a warning and some instructions on a process for having the mistake reviewed.
Spamhaus has been sued—multiple times, I believe—for publishing DNS-based lists used to block email from known spammers.
For instance: https://reason.com/volokh/2020/07/27/injunction-in-libel-cas... (That was a default judgment, though, which means Spamhaus didn't show up, probably due to jurisdictional questions.)
The first step in filing a libel lawsuit is demanding a retraction from the publisher. I would imagine Google's lawyers respond pretty quickly to those, which is why SafeBrowsing hasn't been similarly challenged.
This may not be a huge issue depending on mitigating controls but are they saying that anyone can submit a PR (containing anything) to Immich, tag the pr with `preview` and have the contents of that PR hosted on https://pr-<num>.preview.internal.immich.cloud?
Doesn't that effectively let anyone host anything there?
I think only collaborators can add labels on github, so not quite. Does seem a bit hazardous though (you could submit a legit PR, get the label, and then commit whatever you want?).
Exposure also extends not just to the owner of the PR but anyone with write access to the branch from which it was submitted. GitHub pushes are ssh-authenticated and often automated in many workflows.
So basically like https://docs.google.com/ ?
Yes, except on Google Docs you can't make the document steal credentials or download malware by simply clicking on the link.
It's more like sites.google.com.
That was my first thought - have the preview URLs possibly actually been abused through GitHub?
No, it doesn't work at all for PRs from forks.
Excellent idea for cost-free phishing.
Insane that one company can dictate what websites you're allowed to visit. Telling you what apps you can run wasn't far enough.
US congress not functioning for over a decade causes a few problems.
It's the result of failures across the web, really. Most browsers started using Google's phishing site index because they didn't want to maintain one themselves but wanted the phishing resistance Google Chrome has. Microsoft has SmartScreen, but that's just the same risk model but hosted on Azure.
Google's eternal vagueness is infuriating but in this case the whole setup is a disaster waiting to happen. Google's accidental fuck-up just prevented "someone hacked my server after I clicked on pr-xxxx.imiche.app" because apparently the domain's security was set up to allow for that.
You can turn off safe browsing if you don't want these warnings. Google will only stop you from visiting sites if you keep the "allow Google to stop me from visiting some sites" checkbox enabled.
I really don't know how they got nerds to think scummy advertising is cool. If you think about it, the thing they make money on - no user actually wants ads or wants to see them, ever. Somehow Google has some sort of nerd cult where people think it's cool to join such an unethical company.
If you ask, the leaders in that area of Google will tell you something like "we're actually HELPING users because we're giving them targeted ads that are for the things they're looking for at the time they're looking for it, which only makes things for the user better." Then you show them a picture of YouTube ads or something and it transitions to "well, look, we gotta pay for this somehow, and at least it's free, and isn't free information for all really great?"
Turns out it's cool to make lots of money
unfortunately nobody wants to sacrifice anything nowadays so everyone will keep using google, and microsoft, and tiktok and meta and blah blah
It's super simple. Check out all the Fediverse alternatives. How many people that talk a big game actually financially support those services? 2% maybe, on the high end.
Things cost money, and at a large scale, there's either capitalism, or communism.
> there's either capitalism, or communism
Can you point them out on the map?
Absolutely fuck Google
[flagged]
Google's services, especially their free services, are never really free. It's just that the price tag is so well hidden that ordinary users really believe this. But the HN audience is more technical than that and they see through the smokescreen.
Except for those that are making money off ads directly or indirectly, and who believe in their god given right to my attention and my data.
> I'm increasingly blown away by takes on here that are so dramatic and militant about things that barely even register to most people.
Things 'barely even registering to most people' is not as strong a position as you may think it is. Oxygen barely registers to most people. But take it away and they register it just fine (for a short while). The 'regular' people that you know have been steadily conditioned to an ever worsening experience to the point that they barely recognize the websites they visit when seeing the web with an adblocker for the first time.
> It's just that the price tag is so well hidden that ordinary users really believe this.
And if they die believing that, what price did they really pay? I don't think the difference mostly comes down to a lack of knowledge or understanding, but more a difference of care or assigned value. There are a lot of smart people on HN, but with that often comes exaggerated anxieties and paranoias. If most people don't give a crap about giving their data to Google or allowing the big bad advertisements to penetrate their feeble minds or whatever, vociferously beating that drum just amounts to old-man-yelling-at-cloud-esque FUD.
> Things 'barely even registering to most people' is not as strong a position as you may think it is.
I understand that logically that is neither here nor there, it was more just an expression of exasperation. It's kind of like how I'm equally blown away by how much energy some people put into anti-abortion laws. It's like, ok, everyone can have their opinions, and there's plenty of reasonable discussion to be had, but to put so much negative energy into something that's like, is this really the battle that's worth this much outrage right now? There are literally genocides and violent deportations going on around us. Google are not the bad guys.
Also, I don't use any kind of ad blocker. There are definitely lots of ad-infested unusable experiences out there but Google products are generally among the classiest and most unobtrusive.
The people that put effort into anti-abortion laws are usually trying to force their view of how other people should live onto those other people.
I block ads out of my life because I am easily distracted and have seen the internet go from a great place to a billboard that continuously screams at me for my attention. It's pure self-preservation, I don't begrudge you your 45 minutes of advertising time per day at all.
They created the largest spying instrument in the world that creates hidden profiles (that can never be deleted) documenting web activity, psychological state, medications, etc, etc for billions of people - and have been caught multiple times sharing data with governments (they're probably compromised internally anyway). I would categorize that as unethical. But yeah, you can cheer for the scraps they throw out.
>about things that barely even register to most people.
News flash: This whole website is about things that don't register to most people. It's called hacker news FFS.
In any case, I think a trillion dollar company probably doesn't need defending. They can easily tweak their algorithm to bury this type of stuff; after all this opinion is probably not "relevant" or "useful" to most people.
As of today, only Google Maps has no real competitor on Android. Otherwise, it is possible to drop Google and even get better services. Brands are difficult to compete with.
Try Mapy. Outperforms Google maps any day.
You're right but I hate that you're right. The only part I disagree with is
>I think they all are pretty happy with the deal and would not switch to a paid ad-free version.
If they were given a low friction option to pay the advertise price for these services I think a lot would choose it. Advertisement pays almost nothing per person. Almost every person could pay more than the cost to serve them an ad. To use a service ad free for a year would cost less than $1 per user. This differs on the platform obviously with stuff like youtube being far more expensive but for day to day stuff the cost is low.
The open internet is done. Monopolies control everything.
We have an iOS app in the store for 3 years and out of the blue apple is demanding we provide new licenses that don’t exist and threaten to kick our app out. Nothing changed in 3 years.
Getting sick of these companies able to have this level of control over everything, you can’t even self host anymore apparently.
> We have an iOS app in the store for 3 years and out of the blue apple is demanding we provide new licenses that don’t exist and threaten to kick our app out.
Crazy! If you can elaborate here, please do.
Story of when it happened to my company: https://news.ycombinator.com/item?id=25802366
I'm fighting this right now on my own domain. Google marked my family Immich instance as dangerous, essentially blocking access from Chrome to all services hosted on the same domain.
I know that I can bypass the warning, but the photo album I sent to my mother-in-law is now effectively inaccessible.
Unless I missed something in the article this seems like a different issue. The article is specifically about the domain "immich.cloud". If you're using your own domain, I'd check to ensure it hasn't been actually compromised by a bonnet or similar in some way you haven't noticed.
It may well be a false positive of Google's heuristics but home server security can be challenging - I would look at ruling out the possibility of it being real first.
It certainly sounds like a separate root issue to this article, even if the end result looks the same.
*botnet
Just in case you're not sure how to deal with it, you need to request a review via the Google Search Console. You'll need a Google account and you have to verify ownership of the domain via DNS (if you want to appeal the whole domain). After that, you can log into the Google Search Console and you can find "Security Issues" under the "Security & Manual Actions" section.
That area will show you the exact URLs that got you put on the block list. You can request a review from there. They'll send you an email after they review the block.
Hopefully that'll save you from trying to hunt down non-existent malware on a half dozen self-hosted services like I ended up doing.
It's a bit ironic that a user installing immich to escape Google's grip ends up having to create again a Google account to be able to remove their Google account.
Indeed. Thankfully, this isn't the first time Google has caused an issue like this, so I'm familiar with the appeal process.
Reviews via Google Search Console are pointless because they won't stop the same automated process from flagging the domain again. Save your time and get your lawyer to draft a friendly letter instead.
Since other browsers, like Firefox, also use the Google Safe Browsing list, they are affected as well.
No later than last weekend I was contemplating migrating my family pictures to a self-hosted Immich instance...
I guess a workaround for Google's crap would be to put an htpasswd/basic auth in front of Immich, blocking Google from getting to the content and flagging it.
Add a custom "welcome message" in Server Settings (https://my.immich.app/admin/system-settings?isOpen=server) to make your login page look different compared to all other default Immich login pages. This is probably the easiest non-intrusive tweak to work around the repeated flagging by Safe Browsing, still no 100% guarantee. I agree that strict access blocking (with extra auth or IP ACL) can work better. Though I've seen in this thread https://news.ycombinator.com/item?id=45676712 and over the Internet that purely internal/private domains get flagged too. Can it be some Chrome + G Safe Browsing integration, e.g. reporting hashes of visited pages?
Btw, folks in the Jellyfin thread tried blocking specifically Google bot / IP ranges (ASNs?) https://github.com/jellyfin/jellyfin-web/issues/4076#issueco... with varying success.
And go through your domain registration/re-review in G Search Console of course.
Thank you for the "welcome message" suggestion! I'll implement that in the hope it may help in the future.
Immich is a great software package, and I recommend it. Sadly, Google can still flag sites based on domain name patterns, blocking content behind auth or even on your LAN.
That probably wouldn't work, I get hit with Chrome's red screen of annoyance regularly with stuff only reachable on my LAN. I suspect the trigger is that the URLs are like [product name].home.[mydomain.com].
I'm actually already avoiding this issue but for another reason: hackers will scan subdomains matching known products with known vulnerabilities, so hosting a Wordpress behind "wordpress.domain.tld" will get you way more ill-intentioned requests than "tbyehl.domain.tld".
Thus if I started hosting my Immich instance, I would probably put it behind "pxl.domain.tld" or something like that.
Not a guarantee to pass the Google purity test, but, according to some reports, it would avoid raising some red flags.
Out of curiosity, is your Immich instance published as https://immich.example.com ?
Yes, it's on the "immich" subdomain. This has crossed my mind as a potential triggering cause, as has the default login page.
Update: my appeal of the false positive has been accepted by Google and my domain is now unblocked.
Be sure to see the team's whole list of Cursed Knowledge. https://immich.app/cursed-knowledge
I love Immich & greatly appreciate the amazing work the team put into maintaining it, but between the OP & this "Cursed Knowledge" page, the apparent team culture of shouting from the rooftops complaints that expose their own ignorance about technology is a little concerning to be honest.
I've now read the entire Cursed Knowledge list & - while I found some of them to be invaluable insights & absolutely love the idea of projects maintaining a public list of this nature to educate - there are quite a few red flags in this particular list.
Before mentioning them: some excellent & valuable, genuinely cursed items: Postgres NOTIFY (albeit adapter-specific), npm scripts, bcrypt string lengths & especially the horrifically cursed Cloudflare fetch: all great knowledge. But...
> Secure contexts are cursed
> GPS sharing on mobile is cursed
These are extremely sane security features. Do we think keeping users secure is cursed? It honestly seems crazy to me for them to have published these items in the list with a straight face.
> PostgreSQL parameters are cursed
Wherein their definition of "cursed" is that PG doesn't support running SQL queries with more than 65535 separate parameters! It seems to me that any sane engineer would expect the limit to be lower than that. The suggestion that making an SQL query with that many parameters is normal seems problematic.
> JavaScript Date objects are cursed
Javascript is zero-indexed by convention. This one's not a huge red flag but it is pretty funny for a programmer to find this problematic.
> Carriage returns in bash scripts are cursed
Non-default local git settings can break your local git repo. This isn't anything to do with bash & everyone knows git has footguns.
> Carriage returns in bash scripts are cursed
Also the full story here seemed to be
1. Person installs git on Windows with autocrlf enabled, automatically converting all LF to CRLF (very cursed in itself in my opinion).
2. Does their thing with git on the Windows' side (clone, checkout, whatever).
3. Then runs the checked out (and now broken due to autocrlf) code on Linux instead of Windows via WSL.
The biggest footgun here is autocrlf but I don't see how this whole situation is the problem of any Linux tooling.
This is imo ultimately a problem with git.
If git didn't have this setting, then after checking out a bash file with LFs in it, there are many Windows editors that would not be able to edit that file properly. That's a limitation of those editors & nobody should be using those pieces of software to edit bash files. This is a problem that is entirely out of scope for a VCS & not something Git should ever have tried to solve.
In fact, having git solve this disincentivizes Windows editors from solving it correctly.
> I don't see how this whole situation is the problem of any Linux tooling
Well, bash could also handle crlf nicely. There's no gain from interpreting cr as a non-space character.
(The same is valid for every language out there and all the spacey things, like zero-width space, non-breaking space, and vertical tabs.)
You will have the same problem if you build a Linux container image using scripts that were checked out on the windows host machine. What's even more devious is that some editors (at least VS Code) will automatically save .sh files with LF line endings on Windows, so the problem doesn't appear for the original author, only someone who clones the repo later. I spent probably half a day troubleshooting this a while back. IMO it's not the fault of any one tool, it's just a thing that most people will never think about until it bites them.
TL;DR - if your repo will contain bash scripts, use .gitattributes to make sure they have LF line endings.
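For what it's worth, the .gitattributes rule for this is usually a line like `*.sh text eol=lf`. And as a rough sketch of how you might catch the problem in CI before it bites someone, here is a small Python check that looks for carriage returns in a script's bytes. The function name `find_crlf_lines` and the sample scripts are made up for illustration, not taken from any project.

```python
def find_crlf_lines(data: bytes) -> list[int]:
    """Return the 1-based numbers of lines that end in CRLF instead of LF."""
    bad = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        # After splitting on LF, a CRLF line still carries its trailing CR.
        if line.endswith(b"\r"):
            bad.append(number)
    return bad


# A script as it might look after a checkout with autocrlf on Windows.
broken = b"#!/bin/bash\r\necho hello\r\n"
# The same script with the LF endings that bash expects.
clean = b"#!/bin/bash\necho hello\n"

print(find_crlf_lines(broken))  # [1, 2]
print(find_crlf_lines(clean))   # []
```

Reading the file in binary mode matters here: text mode in Python normalises line endings by default, which would hide exactly the bytes you are looking for.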
The biggest mistake was running Linux programs over files created by Windows programs. Anything you move between those worlds is suspect.
It wouldn't be a problem if git didn't try to magic away the difference.
The Date complaint is
> JavaScript date objects are 1 indexed for years and days, but 0 indexed for months.
This mix of 0 and 1 indexing in calendar APIs goes back a long way. I first remember it coming from Java but I dimly recall Java was copying a Taligent Calendar API.
You're taking the word cursed way too seriously
This is just a list of things that can catch devs off guard
I guess you're right - I find the tone off but it's not egregious & it is mostly a very useful list.
Some of these seem less cursed, and more just security design?
>Some phones will silently strip GPS data from images when apps without location permission try to access them.
That strikes me as the right thing to do?
Huh. Maybe? I don't want that information available to apps to spy on me. But I do want full file contents available to some of them.
And wait. Uh oh. Does this mean my Syncthing-Fork app (which itself would never strike me as needing location services) might have my phone's images' location be stripped before making their way to my backup system?
EDIT: To answer my last question: My images transferred via Syncthing-Fork on a GrapheneOS device to another PC running Fedora Atomic have persisted the GPS data as verified by exiftool. Location permissions have not been granted to Syncthing-Fork.
Happy I didn't lose that data. But it would appear that permission to your photo files may expose your GPS locations regardless of the location permission.
With the Nextcloud app I remember having to enable full file permissions to preserve the GPS data of auto-uploaded photos a couple of years ago. Which I only discovered some months after these security changes went into effect on my phone. That was fun. I think Android 10 or 11 introduced it.
Looking now I can't even find that setting anymore on my current phone. But the photos still do have the GPS data intact.
I think the “cursed” part (from the developers point of view) is that some phones do that, some don’t, and if you don’t have both kinds available during testing, you might miss something?
> That strikes me as the right thing to do
Yep, and it's there for very good reasons. However if you don't know about it, it can be quite surprising and challenging to debug.
Also it's annoying when your phone's permissions optimiser runs and removes the location permissions from e.g. Google Photos, and you realise a few months later that your photos no longer have their location.
There is never a good reason to permanently modify my files, if that is what is going on here. Seems like I wouldn't be able to search my photos by location reliably if that data was stripped from them.
Nothing is "permanently modifying your files".
What happens is that when an application without location permissions tries to get photos, the corresponding OS calls strip the geo location data when passing them. The original photos still have it, but the application doesn't, because it doesn't have access to your location.
This was done because most people didn't know that photos contain their location, and people got burned by stalkers and scammers.
It's not if it silently alters the file. i do want GPS data for geolocation, so that when i import the images in the right places they are already placed where they should be on the map
IMO, the problem is that it fails silently.
Every kind of permission should fail the same way, informing the user about the failure, and asking if the user wants to give the permission, deny the access, or use dummy values. If there's more than one permission needed for an operation, you should be able to deny them all, or use any combination of allowing or using dummy values.
And permissions should also not be so wide. You should be able to give permission to the GPS data in pictures you consciously took without giving permission to track your position whenever.
I think the bad part is that the users are often unaware. Stripping the data by default makes sense but there should be an easy option not to.
Try to get an iPhone user to send you an original copy of a photo with all metadata. Even if they want to do it most of them don't know how.
How does it make sense?
This kind of makes me wish CURSED.md was a standard file in projects. So much hard-earned knowledge could be shared.
You know you can just start doing that in your projects. That's how practice often becomes standard.
The Postgres query parameters one is funny. 65k parameters is not enough for you?!
As it says, bulk inserts with large datasets can fail. Inserting a few thousand rows into a table with 30 columns will hit the limit. You might run into this if you were synchronising data between systems or running big batch jobs.
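To put numbers on that, here is a minimal Python sketch of the arithmetic, assuming the 65535 limit quoted above and one bind parameter per column per row. The helper names `max_rows_per_statement` and `batches` are hypothetical, and nothing here talks to a database.

```python
from typing import Iterator, Sequence

# Parameter limit per statement, as quoted in the thread.
MAX_PARAMS = 65535


def max_rows_per_statement(columns: int, limit: int = MAX_PARAMS) -> int:
    """How many rows fit in one multi-row INSERT with one parameter per value."""
    if columns <= 0 or columns > limit:
        raise ValueError("columns must be between 1 and the parameter limit")
    return limit // columns


def batches(rows: Sequence[tuple], columns: int,
            limit: int = MAX_PARAMS) -> Iterator[Sequence[tuple]]:
    """Yield slices of rows that each stay under the parameter limit."""
    size = max_rows_per_statement(columns, limit)
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# 30 columns leaves room for only 2184 rows per statement.
print(max_rows_per_statement(30))  # 2184

rows = [tuple(range(30))] * 5000
print([len(batch) for batch in batches(rows, 30)])  # [2184, 2184, 632]
```

So "a few thousand rows" of a 30-column table is already over the line, and the usual workaround is to split the insert into batches like this.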
Sqlite used to have a limit of 999 query parameters, which was much easier to hit. It's now a roomy 32k.
Right, for postgres I would use unnest for inserting a non-static amount of rows.
In the past I've used batches of data, inserted into a separate table with all the constraints turned off and using UNNEST, and then inserted into the final table once it was done. We ended up both batching the data and using UNNEST because it was faster but it still let us resume midway through.
We probably should have been partitioning the data instead of inserting it twice, but I never got around to fixing that.
COPY is likely a better option if you have access to the host, or provider-specific extensions like aws_s3 if you have those. I'm sure a data engineer would be able to suggest a better ETL architecture than "shove everything into postgres", too.
Was MERGE too slow/expensive? We tend to MERGE from staging or temporary tables when we sync big data sets. If we were on postgres I think we'd use ... ON CONFLICT, but MERGE does work.
COPY is often a usable alternative.
> PostgreSQL USER is cursed
> The USER keyword in PostgreSQL is cursed because you can select from it like a table, which leads to confusion if you have a table name user as well.
is even funnier :D
SQL's "feature" of having table and field names in the same syntactic namespace as an ever expanding set of english language keywords is the original eldritch curse behind it all.
> JavaScript date objects are 1 indexed for years and days, but 0 indexed for months.
I don't disagree that months should be 1-indexed, but I would not make that assumption solely based on days/years being 1-indexed, since 0-indexing those would be psychotic.
The only reason I can think of to 0-index months is so you can do monthName[date.getMonth()] instead of monthName[date.getMonth() - 1].
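A small sketch of that trade-off, written in Python purely for illustration (the function names are made up, and this is not how any JavaScript engine implements Date). A zero-based month indexes straight into a names array, while a one-based month needs the "- 1" and, in a language with negative indexing, a range check as well.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def name_from_zero_based(month_index: int) -> str:
    """Month as a 0..11 index, the way the thread describes getMonth()."""
    if not 0 <= month_index <= 11:
        raise ValueError("month index must be in 0..11")
    return MONTH_NAMES[month_index]


def name_from_one_based(month_number: int) -> str:
    """Month as written on a calendar, 1..12, so the lookup needs '- 1'."""
    if not 1 <= month_number <= 12:
        # Without this check, 0 - 1 == -1 would silently return "December".
        raise ValueError("month number must be in 1..12")
    return MONTH_NAMES[month_number - 1]


print(name_from_zero_based(0))  # January
print(name_from_one_based(1))   # January
```

Python's own `calendar.month_name` goes the other way: it keeps months one-based and pads index 0 with an empty string so that no subtraction is needed.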
I don't think adding counterintuitive behavior to your data to save a "- 1" here and there is a good idea, but I guess this is just legacy from the ancient times.
A [StackOverflow thread](https://stackoverflow.com/a/41992352) about this interface says it was introduced by Java way back in 1995, and copied by the first JavaScript implementation.
(We don't have Markdown formatting here, BTW. But thanks for the heads up, and welcome to YC.)
That would have a better solution in a date.getCurrentMonth(), in my opinion.
Temporal[0] is coming which solves many many many issues with JS Date, 1-based months[1] included!
Can't wait for it to be stable and widely available, it's just too good.
> month values start at 1, which is different from legacy Date where months are represented by zero-based indices (0 to 11)
[0] https://tc39.es/proposal-temporal/docs/
[1] https://tc39.es/proposal-temporal/docs/plaindate.html#month
Why so? Months in written form also start with 1, same as days/years, so it would make sense to match all of them.
For example, the first day of the first month of the first year is 1.1.1 AD (at least for Gregorian calendar), so we could just go with 0-indexed 0.0.0 AD.
Hum...
Dark-grey text on black is cursed. (Their light theme is readable.)
Also, you can do bulk inserts in postgres using arrays. Take a look at unnest. Standard bulk inserts are cursed in every database, I'm with the devs here that it's not worth fixing them in postgres just for compatibility.
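Roughly what the unnest approach looks like from the client side, as a Python sketch. The idea is to send one array per column instead of one parameter per value, so the parameter count equals the number of columns no matter how many rows there are. The helper `build_unnest_insert`, the table and column names, and the `%s` placeholder style (as used by psycopg-like drivers) are all assumptions for illustration; the sketch only builds the statement and its parameters and does not connect to a database.

```python
from typing import Sequence


def build_unnest_insert(table: str,
                        columns: Sequence[tuple[str, str]],
                        rows: Sequence[tuple]) -> tuple[str, list[list]]:
    """Build an INSERT ... SELECT FROM unnest(...) with one array per column.

    columns is a sequence of (column name, PostgreSQL element type) pairs.
    Identifiers are interpolated as-is, so they must come from trusted code.
    """
    names = ", ".join(name for name, _ in columns)
    arrays = ", ".join(f"%s::{pg_type}[]" for _, pg_type in columns)
    sql = f"INSERT INTO {table} ({names}) SELECT * FROM unnest({arrays})"
    # Transpose row tuples into one list per column.
    params = [list(col) for col in zip(*rows)] if rows else [[] for _ in columns]
    return sql, params


sql, params = build_unnest_insert(
    "assets",                           # hypothetical table
    [("id", "int"), ("name", "text")],  # hypothetical columns
    [(1, "a.jpg"), (2, "b.jpg"), (3, "c.jpg")],
)
print(sql)
print(len(params))  # 2 parameters, regardless of the number of rows
```

With a driver that adapts Python lists to PostgreSQL arrays, the result would be passed as something like `cursor.execute(sql, params)`.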
Saw the long passwords are cursed one. Reminded me of ancient DES unix passwords only reading the first eight characters. What's old is new again...
A friend / client of mine used some kind of WordPress type of hosting service with a simple redirect. The host got on the bad sites list.
This also polluted their own domain, even when the redirect was removed, and had the odd side effect that Google would no longer accept email from them. We requested a review and passed it, but the email blacklist appears to be permanent. (I already checked and there are no spam problems with the domain.)
We registered a new domain. Google’s behaviour here incidentally just incentivises bulk registering throwaway domains, which doesn’t make anything any better.
Wow. That scares me. I've been using my own domain that got (wrongly) blacklisted this week for 25 years and can't imagine having email impacted.
My general policy now is to confine important email to a domain with a very, very basic website whose hosting you rigidly control, and to keep only static sites on it.
And avoid using subdomains.
Us nerds *really* need to come together in creating a publicly owned browser (non chromium)
Surely among us devs, as we realize app stores are increasingly hostile, that the open web is worth fighting for, and that we have the numbers to build solutions?
Uh… we are. Servo and Ladybird. It’s a shit tonne of work.
Firefox should be on that list. It's clearly a lot closer in functionality to Chrome/Chromium than Servo or Ladybird, so it's easier to switch to it. I like that Servo and Ladybird exist and are developing well, but there's no need to pretend that they're the only available alternatives.
And also, it's very feasible to contribute to Firefox. And through it, to Zen Browser, Librewolf, etc. as well.
Majority of users are on mobile now, and Firefox mobile sucks ass. I cannot bring myself to use it. Simple things like clicking the home button should take you to homepage, but Firefox opens a new tab. It's so stupid.
I use Firefox Mobile Nightly on Android and appreciate it for the dark mode extension and ad blocking. There are some issues but the benefits outweigh them for me.
I don't even have a Home button that I can see, I must have turned it off in settings? I describe my tab count using scientific notation, though, so I'd be a "new tab" guy, anyway. But I'd also be a proponent of it being configurable.
i think it's great and syncs well with my computer's firefox. i think there should be a setting to choose how to open homepage but i don't mind the extra tabs really.
Firefox enables Google's "safe browsing" aka global internet censorship list by default.
If you knew how the Mozilla corporation was governed, then you would not think that Firefox should be on the list.
How is it governed?
Funded to the tune of a half billion dollars a year by Google to pretend there's no monopoly, and multiple announcements of them trying to reimagine themselves as an ad-company. They're the best of a bad bunch but they are definitely still part of a bad bunch
Your second point, as well as their so much criticised, especially on HN, attempts at diversification, are trying to fight your first point.
Because they're so reliant on Google funding, they're trying to do whatever they can to find alternative revenue streams. Damned if you do, damned if you don't, especially for the HN crowd.
"Fighting" it in this way completely misses the point of why the first point is a problem.
> It’s a shit tonne of work.
[Sam didn't like that.]
This is #1 on HN for a while now and I suspect it's because many of us are nervous about it happening to us (or have already had our own homelab domains flagged!).
So is there someone from Google around who can send this along to the right team to ensure whatever heuristic has gone wrong here is fixed for good?
I doubt Google the corporation cares one bit, and any individual employees who do care would likely struggle against the system to cause significant change.
The best we all can do is to stop using Google products and encourage our friends and family to do likewise. Make sure in our own work that we don't force others to rely on Google either.
We really need an internet Bill of Rights. Google has too much power to delete your company from existence with no due process or recourse.
If any company controls some (high) percentage of a particular market, say web browsers, search, or e-commerce, or social media, the public's equal access should start to look more like a right and less like an at-will contract.
30 years ago, if a shop had a falling out with the landlord, it could move to the next building over and resume business. Now if you annoy eBay, Amazon or Walmart, you're locked out nationwide. If you're an Uber, Lyft, or Doordash (etc) gig worker and their bots decide they don't like you anymore, then sayonara sucker! Your account has been disabled, have a nice day and don't reapply.
Our regulatory structure and economies of scale encourage consolidation and scale and grant access to this market to these businesses, but we aren't protecting the now powerless individuals and small businesses who are randomly and needlessly tossed out with nobody to answer their pleas of desperation, no explanation of rules broken, and no opportunity to appeal with transparency.
It's a sorry state of affairs at the moment.
I know someone with a small business that applied for Venmo Business account (which is the main payment method in their community industry) and Venmo refused to open the account and didn't provide any reason as to why saying that they have the right to choose to refuse providing the service, which they do. But all the competitors of that business in the area do have a Venmo and take payment this way so it is basically a revenue loss for that person.
It's a bit frustrating when a company becomes a major player in an industry and can have a life and death sentence on other businesses.
There are alternative payment methods, but people are used to paying a certain way in that industry/area; similarly, there are other browsers, but people are used to Chrome.
Same thing with Paypal - I opened a business account, was able to do one transaction and was shut down for fraud. I tested a donation to myself. Under $10. Lifetime ban.
fuck paypal
That’s not unique to PayPal. Pretty much any payment processor that detects a proprietor paying themselves is going to throw up a red flag for circular cash flow fraud and close the account. Bank-operated payment processors are often slower to catch it, but they will also boot you for this.
With real payment processors you also just call on the phone and they fix it. That's not a real problem. We do test orders on many go-lives per year and never see this. Yes, there are sandboxes, but you always gotta test real transactions by the end.
Ok - but maybe tell me that? They won't officially tell me anything or let me talk to a person. Form email every time.
At the time I didn't have a better way to test that my form worked.
Here clown world is going strong too. My bank bans me every time I pay taxes.
My bank displays me a popup warning me to check who I'm sending money to every time I make a transfer. If I've made that same transfer before, after showing that, it's also telling me that it won't ask for 2FA for this transfer, because I've made it so many times before.
High quality or even medium quality software and UX is getting harder and harder to find.
I sold some camera equipment on eBay once. PayPal flagged my account as fraudulent, asked for a receipt for the equipment which I did not have (I bought it years before), so they banned my account indefinitely.
Randomly, years later, they turned it back on. Thanks, I guess?
They want to make money off your transactions again I guess. Must get more from Ebay
Fuck PayPal.
Fwiw Venmo is run by the same thugs who run PayPal. So go figure.
Yup. Both bad companies IMO
Force interoperability. In 2009 I could run Pidgin and load messages from AIM, FB Messages, Yahoo... Where did that go?
I suspect the EU will be the first region to push the big tech companies on this.
Or enforce antitrust.
As firearm enthusiasts like to say, "Enforce the laws we already have".
We need to fix the jurisprudence around anti-trust.
> No person engaged in commerce or in any activity affecting commerce shall acquire, directly or indirectly, the whole or any part of the stock or other share capital and no person subject to the jurisdiction of the Federal Trade Commission shall acquire the whole or any part of the assets of another person engaged also in commerce or in any activity affecting commerce, where in any line of commerce or in any activity affecting commerce in any section of the country, the effect of such acquisition may be substantially to lessen competition, or to tend to create a monopoly.
Taken at face value, that would forbid companies from buying any large competitors unless the competitor is already failing. Somehow that got watered down into almost nothing.
The issue is that current law around monopolies defines them from the wrong angle.
Instead of taking a consumer-centric / competition perspective, they should be defined in terms of market share (with markets broadly defined from a consumer perspective).
>10% = some minimal interoperability and reporting requirements
>25% = serious interoperability requirements
>35% = severe and audited interoperability requirements, with a method for gaps to be proposed by competitors, with the end goal of making increasing market share past this point difficult
Close the "but it's free to consumers (because we monetize them in other ways)" loophole that every 90s+ internet business used: instead focus on ensuring competition as measured by market share.
well exactly, Verizon, Amazon, etc all LOVE more regulation. they have armies of lawyers who not only help in constructing the laws, they help pass, lobby and implement them. then the same law firms help amazon, verizon, etc execute it.
It's regulatory capture
now a small competitor wants to do something like get into the wifi game and they're looking at huge fixed fees to get started.
I think 00s+ tech history has demonstrated that the free market is no longer sufficient to promote healthy competition.
Partly a consequence of the biggest tech firms getting bigger.
And partly because of newfound technical ability to achieve mass lock-in (e.g. vendor-controlled encryption, TPMs, vertical integration in platforms, first-party app stores, etc).
The 'but regulatory capture' counter argument rings hollow when the government has given the market a lighter monopoly regulatory touch... and we've ended up with a more concentrated, less competitive market than when it was more heavily regulated.
Fixed fees are nothing.
If you want to make electronics with any complexity, you'll suddenly discover that you need to pay patent fees. And those come as a fixed share of your revenue. Add enough complexity and you can easily be required to pay more than 100% of your revenue in fees.
Looks like market share is not a concern anymore: when one participant adopts a dark pattern, others follow, because then the consumer has nowhere to go. What did Orwell call it, collectivist oligarchy?
In 2025 you can use Beeper (or run your own local Matrix server with the opensource bridges) and get the same result with WhatsApp, Signal, Telegram, Discord, Google Messages, etc. etc.
You'd have to break most of those platforms' TOS to do so.
That's always been the case. Jailbreaking your phone is also breaking TOS. Sideloading apps on iPhone by using the developer features is breaking TOS. Almost anything that gives a corporation less money or control over you is against that corporation's TOS. That's not the law, though, and we need to grow a collective spine.
... It's a lot easier to have a spine about risking getting banned from a service if getting banned from that service wouldn't destroy your life.
Was Pidgin TOS-compliant back in the day? I'm a young whippersnapper, so I don't have experience with it myself.
Well it did have to change its name from GAIM to Pidgin at some point because it infringed on "AIM" by AOL. And whether or not Pidgin was fully "TOS-compliant" (which it might have been depending on the service we'd be looking at) is not as relevant as whether these terms would have been actually legally enforceable or not.
That was due to a trademark violation and nothing to do with TOS.
We (Pidgin/Gaim/Finch/libpurple) have never been TOS compliant.
isn't beeper non-free? there aren't that many decent matrix bridges.
All the Beeper bridges are open source and self hostable: https://github.com/beeper
The project is still alive and we're trying to finish our next major version to be able to better support modern protocols and features.
We do monthly updates on the status of the project that we call State of the Bird and they can be found here https://discourse.imfreedom.org/tag/state-of-the-bird.
Remind me (in a millennium or two) when you can finally do XMPP MAM + message carbons. Until then: lol
> I suspect the EU will be the first region to push the big tech companies on this.
Supposedly, DMA should enforce this already.
https://www.socialmediatoday.com/news/meta-announces-next-st...
Haven't heard much about it lately though.
Your Pidgin example isn't even real interoperability - you still needed real AIM, FB and Yahoo accounts for that.
> 2009 I could run Pidgin and load messages from AIM, FB Messages, Yahoo... Where did that go?
https://www.youtube.com/watch?v=mBcY3W5WgNU
But seriously; the internet is now overrun with AI Slop, Spam, and automated traffic. To try to do something about it requires curation, somebody needs to decide what is junk, which is completely antithetical to open protocols. This problem is structurally unsolvable, there is no solution, there's either a useless open internet or a useful closed one. The internet is voting with Cloudflare, Discord, Facebook, to be useful, not open. The alternative is trying to figure out how to run a decentralized dictatorship that only allows good things to happen; a delusion.
The only other solution is accountability, a presence tied to your physical identity; so that an attacker cannot just create 100,000 identities from 25,000 IP addresses and smash your small forum with them. That's an even less popular idea, even though it would make open systems actually possible. Building your own search engine or video platform would be super easy, barely an inconvenience. No need for Cloudflare if the police know who every visitor is. No need for a spam filter, if the government can enforce laws perfectly.
Take a look at email, the mother of all open protocols (older than HTTP). What happened? Radical recentralization to companies that had effective spam management, and now we on HN complain we can't break through, someone needs to do something about that centralization, so that we can go back to square one where people get spammed to death again, which will inevitably repeat the discretion required -> who has the best discretion -> flee there cycle. Go figure.
I run an email server with no specific spam filter. Sometimes I get spam. Then I add a filter on my end to delete it and move on. It's nowhere near as bad as people proclaim. Neither is deliverability, for that matter, even after I forgot to set an SPF record and some random internet server sent a bunch of spam on my behalf (which I know because I got the bounces).
You have a dirt path to your house and are therefore convinced the interstate highway system should allow direct residential driveways.
Gmail processes 376 billion emails per day. At that volume, even 0.1% spam getting through is 376 million messages. However, we're not talking about 0.1%, but 45.6% of email being spam globally. For Gmail, that's 171 billion spam messages daily. Congrats that your private server works at your scale. It's completely irrelevant, and only works because bad actors don't care about it.
Imagine though, if we even accepted spam culturally and handled it individually, as per your solution. That would mean spam can get through with brute force, which it can't right now, meaning that 45.6% would probably explode closer to 90%, 95%, or more overnight. It's only manageable at 45.6% for you because Gmail's spam filters are working overtime harming the economics.
Why should curation be centralized? We do not need a "decentralized dictatorship" (what would that even be? that's antithetical) and we certainly do not need a centralized one. It seems crazy that your solutions to AI, spam, and "automated traffic" (I don't know what that is, I assume web crawlers and such) is that the police control every single transaction.
First off, we can simply let the user, or client software, choose. Why should we let centralized servers do that by default?
At scale, DNS is somewhat centralized but authorities are disconnected from internet providers and web browsers. They're the best actors to regulate this.
For mail, couldn't we come up with a mail-DNS, that authenticates senders? There could be different limits based on whether you are an individual or a company, and whether you're sending 10'000 emails or just 100.
Regardless of whether these are good solutions -- why jump to extreme ones? "TINA" is not a helpful argument, it's a slogan.
> For mail, couldn't we come up with a mail-DNS, that authenticates senders?
So RFC 7672? https://datatracker.ietf.org/doc/html/rfc7672
I have no knowledge of DANE but its reliance on DNSSEC makes me worried that it would be difficult for people to adopt it.
Also, I think it solves a different problem: it prevents spoofing/MITM but what about legitimate certificates? We would still need CAs that actually curate their customers and hold them accountable. And we would need email servers/clients to differentiate between strict CAs and ones that are used solely for encryption purposes.
I don't know that DNS should be applied to emails as is anyway but I find it could force spammers to operate with publicly available information which would make holding them accountable easier.
> I have no knowledge of DANE but its reliance on DNSSEC makes me worried that it would be difficult for people to adopt it.
It's not hard to set up DNSSEC as long as your DNS server software supports it, and most people don't run their own authoritative DNS servers anyway.
Uh huh.
https://ianix.com/pub/dnssec-outages.html
So the solution to AI slop and spam is end of anonymity and total state control of the internet? Talk about the cure being worse than the disease.
The issues with today's internet stem specifically from the centralisation of power in the hands of Google, Apple and the social networks.
Bad search results? Blame Google's monopoly incentivising them intentionally making their results worse.
Difficulty promoting or finding events? Blame Facebook's real revenue model - preventing one to many communications by default and charging for exceptions.
AI overrun with slop? Blame OpenAI and Facebook, both of whom are actively promoting and profiting from the creation of slop.
Automated traffic slowing down sites? It's often the AI companies indexing and reindexing hundreds of times.
Spam? Not a huge issue for anyone that I'm aware of.
The closed internet platforms are the problem. Forcing them to relinquish control over handsets, data and our interpersonal connections is the solution. It will be legislative, or it will be torches to the data centres, likely both. But it is coming.
I completely disagree.
> The issues with today's internet stem specifically from the centralisation of power in the hands of Google and the social networks.
> Bad search results? Blame Google's monopoly incentivising them intentionally making their results worse.
> Difficulty promoting or finding events? Blame Facebook's real revenue model - preventing one to many communications by default and charging for exceptions.
You're misdiagnosing what happened here. These aren't diseases. These are symptoms that the more open internet, that we had in the early 2000s, completely failed at scale. The disease was the predictable failure of an open system to self-moderate, the symptoms the centralization that followed. You're mistaking effect for cause.
People started using Google, because it was the only tool good enough at digging through manure. Facebook started charging for mass communication, because otherwise, everyone has an excuse why they need to use it. Cloudflare became popular, because the internet didn't care when 40% of traffic was bots, half of them malicious, before AI was even on the scene. And so on.
The open system failed, and was becoming unusable. Big Tech arrived offering proprietary solutions as CPR. They didn't cause the death.
> To try to do something about it requires curation, somebody needs to decide what is junk, which is completely antithetical to open protocols.
The contra-example, of course, is email. SpamAssassin figured this out 24 years (!) ago. There is zero reason you couldn't apply similar heuristics to detect AI-slop or whatever particular kind of content you don't want to accept.
> Radical recentralization to companies that had effective spam management
Only for the lazy.
A. SpamAssassin has never been tested at Gmail scale, and would likely fail in such a scenario.
B. SpamAssassin is benefiting from centralized players, like Gmail, harming spam's economics. You're a free rider from the onslaught that would occur if spamming actually worked. Spam is at 45.6% of email globally with aggressive spam filters, but could easily double, triple, quadruple in volume if filters started failing even moderately. Weaker filters, and we'll start seeing the email DDoS for the first time.
C. Heuristics on AI Content? What are you going to do, run an "AI Detector" model on a GPU for every incoming email? 376 billion of them every day to Gmail alone? This only makes the email DDoS even more likely.
D. Lazy = 99%+ of global computer users - and that's changing as soon as everyone becomes their own paramedic. If you can't convince most people to learn how to save other people's lives, and probably didn't bother yourself, despite it being disproportionately more important, you're never teaching them technical literacy.
I think you misunderstand what I'm getting at. SpamAssassin is older than Gmail. It's an old example, much newer and better spam-filtering-at-scale solutions exist (although SA is still maintained). Trying to claim that only the big boys can filter spam is an uninformed opinion.
No, you don't need an AI model to detect AI content (lmao). Heuristics already exist, and you see people mention them online all the time -- excessive use of lists, em dashes, common phrases, etc. Yes, a basic text heuristic scorer from the 1980s can pick these up without much difficulty. The magic of auto-learning heuristics (which have also existed since the 1980s, and performed fine at scale with less processing power than your smartwatch) is you can train them on whatever content you don't want to receive: marketing, political content, etc. You can absolutely apply this to whatever content suits your fancy, and it doesn't really take any more effort than moving messages you want filtered out to a Junk folder or similar.
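To make "auto-learning heuristics" concrete, here is a rough sketch of one such technique, a naive Bayes word scorer. The comment above does not name a specific algorithm, so this choice is an assumption, and the class name `JunkScorer`, the labels, and the training phrases are all made up. It illustrates the idea of training on whatever you file as junk; it is not how any particular mail filter is implemented.

```python
import math
import re
from collections import Counter


class JunkScorer:
    """Tiny naive Bayes text scorer trained on 'junk' and 'keep' examples."""

    def __init__(self):
        self.counts = {"junk": Counter(), "keep": Counter()}
        self.docs = {"junk": 0, "keep": 0}

    @staticmethod
    def tokens(text):
        # Lowercased word-like tokens; a real filter would use richer features.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record one example; label is 'junk' or 'keep'."""
        self.counts[label].update(self.tokens(text))
        self.docs[label] += 1

    def junk_probability(self, text):
        """Return the estimated probability (0..1) that text is junk."""
        vocab = set(self.counts["junk"]) | set(self.counts["keep"])
        total_docs = self.docs["junk"] + self.docs["keep"]
        scores = {}
        for label in ("junk", "keep"):
            total = sum(self.counts[label].values())
            # Log prior and log likelihoods, both with add-one smoothing.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for tok in self.tokens(text):
                seen = self.counts[label][tok]
                score += math.log((seen + 1) / (total + len(vocab) + 1))
            scores[label] = score
        # Turn the two log scores into a probability; clamp to avoid overflow.
        diff = max(min(scores["keep"] - scores["junk"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


scorer = JunkScorer()
scorer.train("limited time offer click now", "junk")
scorer.train("click here to claim your offer", "junk")
scorer.train("meeting notes attached for review", "keep")
scorer.train("lunch on friday", "keep")
```

Moving a message to the Junk folder would correspond to one more `train(text, "junk")` call, so the scorer keeps adapting to whatever the user rejects.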
They're too busy trying to strip encryption to do anything
It’s almost as if those companies have country like powers.
Maybe they should be subject to same limitations like First Amendment etc.
The solution is just to enforce the anti-trust act as it is written.
As long as wealth can be transduced into political power, that boat is beached as
FWIW in some jurisdictions you might be able to sue them for tortious interference, which basically means they went out of their way to hurt your business.
I see a lot of comments here about using some browser that will allow ME to see sites I want to see, but I did not see a lot about how do I protect my site or sites of clients from being subjected to this. Is there anything proactive that can be done? A set of checks almost like regression testing? I understand it can be a bit like virus builders using anti virus to test their next virus. But is there a set of best practices that could give you higher probability of not being blocked?
> how do I protect my site or sites of clients from being subjected to this. Is there anything proactive that can be done?
Some steps to prevent this happening to you:
1. Host only code you own & control on your own domain. Unless...
2. If you have a use-case for allowing arbitrary users to publish & host arbitrary code on a domain you own (or subdomains of), then ensure that domain is a separate dedicated one to the ones you use for your own owned code, that can't be confused with your own owned hosted content.
3. If you're allowing arbitrary members of the public to publish arbitrary code for preview/testing purposes on a domain you own - have the same separation in place for that domain as mentioned above.
4. If you have either of the above two use-cases, publish that separated domain on the Mozilla Public Suffix list https://publicsuffix.org/
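Why step 4 matters can be sketched in a few lines: software that consults the list treats "public suffix plus one label" as the site boundary. The suffix set below is a hypothetical hand-picked excerpt (`example-host.dev` is a made-up hosting domain), and the sketch ignores the real list's wildcard and exception rules as well as its default rule for unlisted TLDs.

```python
# Hypothetical excerpt. The real list is far longer and also has wildcard
# (*.) and exception (!) rules that this sketch does not handle.
SUFFIXES = {"com", "uk", "co.uk", "dev", "example-host.dev"}


def registrable_domain(hostname, suffixes=SUFFIXES):
    """Return the longest matching public suffix plus one label, or None."""
    labels = hostname.lower().rstrip(".").split(".")
    # Candidates run from longest to shortest, so the first hit is the
    # longest matching suffix.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in suffixes:
            if i == 0:
                return None  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # unlisted TLD; the real list falls back to a default rule


# With the hosting domain listed, each user subdomain is its own site.
listed_alice = registrable_domain("alice.example-host.dev")
listed_mallory = registrable_domain("mallory.example-host.dev")

# Without it, both collapse into the same site as the operator's domain.
unlisted_alice = registrable_domain("alice.example-host.dev", {"dev"})
unlisted_mallory = registrable_domain("mallory.example-host.dev", {"dev"})
```

In the listed case the two subdomains come out as different sites; in the unlisted case both come out as `example-host.dev`.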
That would protect your domains from being poisoned by arbitrary publishing, but wouldn't it risk all your users being affected by one user publishing?
Allowing user publishing is an inherent risk - these are good mitigations but nothing will ever be bulletproof.
The main issue is protecting innocent users from themselves - that's a hard one to generalise solutions to & really depends on your publishing workflows.
Beyond that, the last item (Public Suffix list) comes with some decent additional mitigations as an upside - the main one being that Firefox & Chrome both enable more restrictive cookie settings while browsing any domains listed in the public suffix list.
---
All that said - the question asked in the comment at the top of the thread wasn't about protecting users from security risk, but protecting the domain from being flagged by Google. The above steps should at least do that pretty reliably, barring an actual legitimate hack occurring.
Thank you for your thoughtful and helpful reply.
> Is there anything proactive that can be done?
Befriend a lawyer that will agree to send a letter to Google on your behalf in case it happens.
A good takeaway is to separate different domains for different purposes.
I had previously been tossing up the pros/cons of this (such as teaching the user to accept millions of arbitrary TLDs as official), but I think this article (and other considerations) have solidified it for me.
For example
www.contoso.com (public)
www.contoso.blog (public with user comments)
contoso.net (internal)
staging.contoso.dev (dev/zero trust endpoints)
raging-lemur-a012afb4.contoso.build (snapshots)
The biggest con of this is that to a user it will seem much more like phishing.
It happened to me a while ago that I suddenly got emails from "githubnext.com". Well, I know Github and I know that it's hosted at "github.com". So, to me, that was quite obviously phishing/spam.
Turns out it was real...
This is such a difficult problem. You should be able to buy a “season pass” for $500/year or something that stops anyone from registering adjacent TLDs.
And new TLDs are coming out every day which means that I could probably go buy microsoft.anime if I wanted it.
This is what trademarks are supposed to do, but it’s reactive and not proactive.
PayPal is a real star when it comes to vague, fake-sounding, official domains.
Real users don't care much about phishing as long as you got redirected from the main domain, though. github.io has been accepted for a long time, and githubusercontent.com is invisible 99% of the time. Plus, if your regular users are not developers and still end up on your dev/staging domains, they're bound to be confused regardless.
Good
The same thing happened to me earlier this year with a self-hosted instance of Umami Analytics.
https://news.ycombinator.com/item?id=42779544#42783321
Unironically, including a threat of legal action in my appeal on the Google Search Console was what stopped our instance getting flagged in the end.
Could you provide your text? Having same issue for years https://news.ycombinator.com/item?id=45678095
Maybe a dumb question but what constitutes user-hosted-content?
Is a notion page, github repo, or google doc that has user submitted content that can be publicly shared also user-hosted?
IMO Google should not be able to use definitive language "Dangerous website" if its automated process is not definitive/accurate. A false flag can erode customer trust.
A website where a user can upload "active code".
The definition of "active code" is broad & sometimes debatable - e.g. do old MySpace websites count - but broadly speaking the best way of thinking about it is in terms of threat model, & the main two there are:
- credential leakage
- phishing
The first is fairly narrow & pertains to uploading server side code or client javascript. If Alice hosts a login page on alice.immich.cloud that contains some session handling bugs in her code, Mallory can add some code to mallory.immich.cloud to read cookies set on *.immich.cloud to compromise Alice's logins.
The second is much broader as it's mostly about plausible visual impersonation, so it will also cover cases where users can only upload CSS or HTML.
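The credential-leakage case can be sketched with a simplified version of the cookie domain rules from RFC 6265. Assumptions: the function names and the suffix sets passed in are hypothetical (they are not a statement about what the real Public Suffix List contains), IP-address hosts are ignored, and the special case where the host itself equals the public suffix is collapsed into a plain rejection.

```python
def domain_match(host, cookie_domain):
    """RFC 6265-style match: identical, or host ends with '.' + domain."""
    host = host.lower()
    cookie_domain = cookie_domain.lower().lstrip(".")
    return host == cookie_domain or host.endswith("." + cookie_domain)


def may_set_cookie(setting_host, cookie_domain, public_suffixes):
    """Would a browser accept Domain=cookie_domain from setting_host?"""
    cookie_domain = cookie_domain.lower().lstrip(".")
    if not domain_match(setting_host, cookie_domain):
        return False  # a host cannot set cookies for unrelated domains
    # Cookies scoped to a public suffix are refused (simplified here).
    return cookie_domain not in public_suffixes


# Hypothetical case 1: only "cloud" is a public suffix.
only_tld = {"cloud"}
alice_can_set_wide = may_set_cookie("alice.immich.cloud", "immich.cloud", only_tld)
mallory_receives = domain_match("mallory.immich.cloud", "immich.cloud")

# Hypothetical case 2: the shared parent is also treated as a public suffix.
with_parent = {"cloud", "immich.cloud"}
alice_blocked = not may_set_cookie("alice.immich.cloud", "immich.cloud", with_parent)
```

In the first case a cookie Alice's page scopes to the shared parent is also sent to Mallory's subdomain; in the second the browser would refuse that wide scope, so each subdomain keeps its cookies to itself.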
Specifically in this case what Immich is doing here is extremely dangerous & this post from them - while I'll give them the benefit of the doubt on being ignorant - is misinformation.
It may be dangerous but it is an established pattern. There are many cases (like Cloudflare Pages) of others doing the same, hosting strangers' sites on subdomains of a dedicated domain (pages.dev for Cloudflare, immich.cloud for Immich).
By preventing newcomers from using this pattern, Google's system is flawed, severely stifling competition.
Of course, this is perfectly fine for Google.
It is but this established pattern is well standardised & documented by the public suffix list project. There's generally two conventions followed for this pattern:
1. Use a separate dedicated domain (Immich didn't do this - they're now switching to one in response to this)
2. List the separate dedicated domain in the public suffix list. As far as I can tell Immich haven't mentioned this.
> what Immich is doing here is extremely dangerous
You fully misunderstand what content is hosted on these sites. It's only builds from internal branches by the core team, there is no path for "external user" content to land on this domain.
It's builds from PRs that can be submitted by anyone with a GitHub account
Dude, I built it, surely I'd know how it works...
>> Unfortunately, Google seems to have the ability to arbitrarily flag any domain and make it immediately unaccessible to users. I'm not sure what, if anything, can be done when this happens, except constantly request another review from the all mighty Google.
Perhaps a complaint to the FTC for abusing the monopoly and lack of due process to harm legitimate business? Or DG COMP (in the EU).
Gathering evidence of harm and seeking alliances with other open-source projects could build momentum.
Looking forward to Louis Rossmann's reaction. Wouldn't be surprised if this leads to a lawsuit over monopolistic behavior - this is clearly abusing their dominant position in the browser space to eliminate competitors in photos sharing.
Who is that and why is his reaction relevant?
He's a right-to-repair activist Youtuber who is quite involved in GrayJay, another app made by this company, which is a video player client for other platforms like YouTube.
I'm not sure why his reaction would be relevant, though. It'll just be another rant about how Google has too much control like he's done in the past. He may be right, but there's nothing new to say.
He wasn't just involved with GrayJay, he's actually a member of FUTO - the company behind Immich and GrayJay. Now read grandparent comment one more time:
> Wouldn't be surprised if this leads to a lawsuit over monopolistic behavior
His reaction also matters because he's basically the public face for the company on YouTube and has a huge following. You've probably seen a bunch of social media accounts with the "clippy" character as their avatar. That's a movement started by Louis Rossmann.
Seems that Rossmann left FUTO in February and started his own foundation in March
I write a couple of libraries for creating GOV.UK services and Google has flagged one of them as dangerous. I've appealed the decision several times but it's like screaming into a void.
https://govuk-components.netlify.app/
I use Google Workspace for my company email, so that's the only way for me to get in contact with a human, but they refuse to go off script and won't help me contact the actual department responsible in any way.
It's now on a proper domain, https://govuk-components.x-govuk.org/ - but other than moving, there's still not much anyone can do if they're incorrectly targeted.
Google is not the only one marking subdomains under netlify.app dangerous. For a good reason though, there's a lot of garbage hosted there. Netlify also doesn't do a good enough job of taking down garbage.
Given the scale of Google, and the nerdiness required to run Immich, I bet it's just an accident. Nevertheless, I'm very curious as to how senior Google staff look at Immich. Are they actually registering signals that people use immich-go to empty their Google Photos accounts? Do they see this as something potentially dangerous to their business in the long term?
The nerdsphere has been buzzing with Immich for some time now (I started using it a month back and it lives up to its reputation!), and I assume a lot of Googlers are in that sphere (but not necessarily pro-Google/anti-Immich of course). So I bet they at least know of it. But do they talk about it?
I love Immich but the entire design and interface is so clearly straight up copied from Google photos. It makes me a bit nervous about their exposure, legally.
This seems related to another hosting site that got caught out by this recently:
https://news.ycombinator.com/item?id=45538760
Not quite the same (other than being an abuse of the same monopoly) since this one is explicitly pointing to first-party content, not user content.
I think the other very interesting thing in the reddit thread[0] for this is that if you do well-known-domain.yourdomain.tld then you're likely to get whacked by this too. It makes sense I guess. Lots of people are probably clicking gmail.shady.info and getting phished.
0: https://old.reddit.com/r/immich/comments/1oby8fq/immich_is_a...
So we can't use photos or immich or images or pics as a sub-domain, but anything nondescript will be considered obfuscated and malicious. Awesome!
Can I use this space to comment on how amazing Immich is? I self host lots of stuff, and there’s this one tier above everything else that’s currently, and exclusively, held by Home Assistant and Immich. It is actually _better_ than Google photos (if you keep your db and thumbs on ssd, and run the top model for image search). You give up nothing, and own all your data.
I migrated over from google photos 2 years ago. It has been nothing but amazing. No wonder google has it in its crosshairs.
Don't they block NextCloud sync in the Play Store, for similar reasons?
yeah same, I'm in the process of migrating so I have both google photo and immich, and honestly immich is just as good.
I actually find the semantic search of immich slightly better.
What model do you recommend for image search?
Not OP, but CLIP from OpenAI (2021) seems pretty standard and gives great results at least in English (not so good in rarer languages).
https://opencv.org/blog/clip/
Essentially CLIP lets you encode both text and images in the same vector space.
It is really easy and pretty fast to generate embeddings too. It took less than an hour on Google Colab.
I made a quick and dirty Flask app that lets me query my own collection of pictures and returns the most relevant ones via cosine similarity.
You can query pretty much anything on CLIP (metaphors, lightning, object, time, location etc).
From what I understand many photo apps offer CLIP embedding search these days including Immich - https://meichthys.github.io/foss_photo_libraries/
Alternatives could be something like BLIP.
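To make the cosine-similarity lookup described above concrete, here is a minimal sketch. It is not the Flask app mentioned: the three-dimensional vectors are made-up stand-ins for real CLIP embeddings (which have hundreds of dimensions), and `rank_by_cosine` is a hypothetical helper name.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank_by_cosine(query_vec, image_vecs, top_k=3):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec))
              for name, vec in image_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data: in a real setup these would come from a CLIP image encoder
# and the query vector from the matching CLIP text encoder.
image_vecs = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}
query_vec = [0.8, 0.2, 0.1]

results = rank_by_cosine(query_vec, image_vecs, top_k=2)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Because text and images share one vector space, the same ranking function works whether the query vector came from a sentence or from another photo.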
This is what I use:
ViT-SO400M-16-SigLIP2-384__webli
I think I found it because it was recommended by Immich as the best, but it still only took a day or two to run against my 5 thousand assets. I’ve tested it against whatever Google is using (I keep a part of my library on Google Photos), and it’s far better.
I wonder when google.com will be flagged with all the phishing happening on sites.google.com.
Not to mention the phishing in the sponsored results on google.com proper.
I’m also self hosting Gitea and Portainer and I’m hitting this issue every few weeks. I appeal, they remove the warning, and after a week it's back. This has been going on for at least 4 years. I have more than 20 appeals, all of which successfully removed the warning. Ridiculous. I heard legal action is the best option now, any other ideas?
Safe Browsing collects a lot of data, such as hashes of URLs (URLs can be easily decoded by comparison) and probably other interactions with web like downloads.
But how effective is it in malware detection?
The benefits seem to me dubious. It looks like a feature offered to collect browsing data, useful to maybe 1% in special situations.
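For anyone wondering what "hashes of URLs" means in practice, here is a rough sketch of a local hash-prefix lookup, and of why a hash can be matched back to a URL by comparison. It assumes SHA-256 with a 4-byte prefix, which is how the Safe Browsing update protocol is publicly documented; the URL canonicalization and host/path expansion steps are left out, and `hash_prefix`, `needs_full_check` and the example URLs are hypothetical.

```python
import hashlib

PREFIX_LEN = 4  # bytes; assumed prefix length for this sketch


def hash_prefix(url_expression, length=PREFIX_LEN):
    """First `length` bytes of the SHA-256 digest of a URL expression."""
    digest = hashlib.sha256(url_expression.encode("utf-8")).digest()
    return digest[:length]


def needs_full_check(url_expression, local_prefixes):
    """True if the prefix is in the local list, i.e. the client would
    have to ask the server for full hashes before deciding."""
    return hash_prefix(url_expression) in local_prefixes


# A made-up local list containing one known-bad entry.
local_prefixes = {hash_prefix("phish.example/login")}

print(needs_full_check("phish.example/login", local_prefixes))
print(needs_full_check("photos.example.com/", local_prefixes))

# "Decoding by comparison": anyone holding a candidate URL can hash it
# and compare, so a hash hides little from a party that can guess URLs.
observed = hash_prefix("phish.example/login")
candidates = ["photos.example.com/", "phish.example/login"]
recovered = [u for u in candidates if hash_prefix(u) == observed]
```

A prefix match is not a verdict on its own; it only says a full-hash comparison is needed.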
It's the only thing that has reasonable coverage to effectively block a phishing attack or malware distribution. It can certainly do other things like collecting browsing data, but it does get rid of long-lasting persistent garbage hosted at some bulletproof hosts.
100% agreed. Adblock does this better and doesn’t randomly block image sharing websites
If you block those internal subdomains from search with robots.txt, does Google still whine?
I’ve heard anecdotes of people using an entirely internal domain like “plex.example.com” and, even though it’s never exposed to the public internet, Google flags it as impersonating Plex. Google will sometimes block a domain based only on its name, if they think the name is impersonating another service.
It's unclear exactly what conditions cause a site to get blocked by safe browsing. My nextcloud.something.tld domain has never been flagged, but I’ve seen support threads of other people having issues and the domain name is the best guess.
I'm almost positive GMail scanning messages is one cause. My domain got put on the list for a URL that would have been unknowable to anyone but GMail and my sister who I invited to a shared Immich album. It was a URL like this that got emailed directly to 1 person:
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
Then suddenly the domain is banned even though there was never a way to discover that URL besides GMail scanning messages. In my case, the server is public so my siblings can access it, but there's nothing stopping Google from banning domains for internal sites that show up in emails they wrongly classify as phishing.
Think of how Google and Microsoft destroyed self hosted email with their spam filters. Now imagine that happening to all self hosted services via abuse of the safe browsing block lists.
if it was just the domain, remember that there is a Cert Transparency log for all TLS certs issued nowadays by valid CAs, which is probably what Google is also using to discover new active domains
It doesn’t seem like email scanning is necessary to explain this. It appears that simply having a “bad” subdomain can trigger this. Obviously this heuristic isn’t working well, but you can see the naive logic of it: anything with the subdomain “apple” might be trying to impersonate Apple, so let’s flag it. This has happened to me on internal domains on my home network that I've exposed to no one. This also has been reported at the jellyfin project: https://github.com/jellyfin/jellyfin-web/issues/4076
In my case though, the Google Search Console explicitly listed the exact URL for a newly created shared folder as the cause.
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
That's not going to be gleaned from a CT log or guessed randomly. The URL was only transmitted once to one person via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail.
Not quite. Presumably the recipient clicked the link, at which point their browser knows it and, depending on browser and settings, may submit it to Google to check if it's "safe": https://support.google.com/chrome/answer/9890866#zippy=%2Cen...
Good point. Thank you.
I've read almost everything linked in this post and on Reddit and, with what you pointed out considered, I'd say the most likely thing that got my domain flagged is having a redirect to a default styled login page.
The thing that really frustrates me if that's the case is that it has a large impact on non-customized self-hosted services and Google makes no effort to avoid the false positives. Something as simple as guidance for self-hosted apps to have a custom login screen to differentiate from each other would make a huge difference.
Of course, it's beneficial to Google if they can make self-hosting as difficult as possible, so there's no incentive to fix things like this.
Well, that's potentially horrifying. I would love for someone to attempt this in as controlled of a manner as possible. I would assume it's possible for anyone using Google DNS servers to also trigger some type of metadata inspection resulting in this type of situation as well.
Also - when you say banned, you're speaking of the "red screen of death" right? Not a broader ban from the domain using Google Workspace services, yeah?
> Also - when you say banned, you're speaking of the "red screen of death" right?
Yes.
> I would love for someone to attempt this in as controlled of a manner as possible.
I'm pretty confident they scanned a URL in GMail to trigger the blocking of my domain. If they've done something as stupid as tying GMail phishing detection heuristics into the safe browsing block list, you might be able to generate a bunch of phishy looking emails with direct links to someone's login page to trigger the "red screen of death".
This reminds me of another post where a scammer sent a Gmail message containing a https://site.google.com/xxx link to trick users into clicking, but Gmail didn't detect the risk.
Chrome sends visited urls to Google (ymmv depending on settings and consents you have given)
Yes, my family Immich instance is blocked from indexing both via headers and robots.txt, yet it's still flagged by Google as dangerous.
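For context, robots.txt only tells well-behaved crawlers what they may fetch; it is not an input a blocklist has to respect. A small sketch using Python's standard `urllib.robotparser`, with a made-up robots.txt and URL:

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt that asks every crawler to stay away.
robots_txt = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# This only answers "may this crawler fetch this URL?".
crawl_allowed = parser.can_fetch(
    "Googlebot", "https://photos.example.com/albums/123"
)
print(crawl_allowed)
```

So blocking indexing and being evaluated by Safe Browsing are separate questions, which is consistent with the report above.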
I'm kind of curious, do you have your own domain for immich or is this part of a malware-flagged subdomain issue? It's kind of wild to me that Google would flag all instances of a particular piece of self-hosted software as malicious.
G would flag _some_ instances.
Possible scenario:
- A self-hosted project has a demo instance with a default login page (demo.immich.app, demo.jellyfin.org, demo1.nextcloud.com) that is classified as "primary" by google's algorithms
- Any self-hosted instance with the same login page (branding, title, logo, meta html) becomes a candidate for deceptive/phishing by their algorithm. And immich.cloud has a lot of preview envs falling in that category.
BUT in Immich's case its _demo_ login page has its own big banner, so it is already quite different from others. Maybe there's no "original" at all. The algorithm/AI just got lost among thousands of identical-looking login pages and now considers every other instance as deceptive...
I have my own domain, and Immich is hosted on an "immich" subdomain.
I see, thank you for clarifying.
I'm guessing Google's phishing analysis must be going off the rails seeing all of these login prompts saying "immich" when there's an actual immich cloud product online.
If I were tasked with automatically finding phishing pages, I too would struggle to find a solution to differentiate open-source, self-hosted software from phishing pages.
I find it curious that this is happening to Immich so often while none of my own self-hosted services have ever had this problem, though. Maybe this is why so many self-hosted tools have you configure a name/descriptor/title/whatever for your instance, so they can say "log in to <my amazing photo site>" rather than "log in to Product"? Not that Immich doesn't offer such a setting.
Tangential to the flagging issue, but is there any documentation on how Immich is doing the PR site generation feature? That seems pretty cool, and I'd be curious to learn more.
It's open source, you can find this trivially yourself in less than a minute.
https://github.com/immich-app/devtools/tree/a9257b33b5fb2d30...
If anyone's got questions about this setup I'd be happy to chat about it!
I’m curious about basically all of it. It seems like such a powerful tool.
I seem to have irritated the parallel commenters tremendously by asking, but it seemed implausible I’d understand the design considerations by just skimming the CI config.
Top of mind would be:
1. How do y'all think about mitigating the risk of somebody launching malicious or spammy PR sites? Is there a limiting factor on whose PRs trigger a launch?
2. Have you seen resource constraint issues or impact to how PRs are used by devs? It seems like Immich is popular enough that it could easily have a ton of inflight PR dev (and thus a ton of parallel PR instances eating resources)
3. Did you borrow this pattern from elsewhere / do you think the current implementation of CI hooks into k8s would be generalizable? I’ve seen this kind of PR preview functionality in other repos that build assets (like CLI tools) or static content (like docs sites), but I think this is the first time I’ve seen it for something that’s a networked service.
1. It only works at all for internal PRs, not for forks. That is a limitation we'd like to lift if we could figure out a way to do it safely though.
2. It's running on a pretty big machine, so I haven't seen it approach any limits yet. We also only create an instance when requested (with a PR label).
3. I've of course been inspired by other examples, but I think the current pattern is mostly my own, if largely just one of the core uses of the flux-operator ResourceSet APIs [1]. It's absolutely generalizable - the main 'loop' [2] just templates whatever Kubernetes resources based on the existence of a PR, you could put absolutely anything in there.
[1]: https://fluxcd.control-plane.io/operator/resourcesets/github...
[2]: https://github.com/immich-app/devtools/blob/main/kubernetes/...
Wow. What a rude way to answer.
Sometimes it is also rude to ask without first looking in the obvious place yourself. It signals that "my" time is more precious than "your" time, so I let someone else do the check for me.
I think we might have hit the inflection point where being rude is more polite. It's not that I want people to be rude to me, it's that I don't want to talk to AI when I intend to be talking to a person, and anyone engaging with me via AI is infinitely more disrespectful than any curse word or rudeness.
These days, when I get a capitalized, grammatically correct sentence — and proper punctuation to boot, there is an unfortunate chance it was written using an AI and I am not engaging fully with a human.
its when my covnersation partner makes human mistakes, like not capitalizing things, or when they tell me i'm a bonehead, that i know i'm talking to a real human not a bot. it makes me feel happier and more respected. i want to interact with humans dammit, and at this point rude people are more likely to be human than polite ones on the internet.
i know you can prompt AIs to make releaistic mistakes too, the arms race truly never ends
Pretty sure Immich is on github, so I assume they have a workflow for it, but in case you're interested in this concept in general, gitlab has first-class support for this which I've been using for years: https://docs.gitlab.com/ci/review_apps/ . Very cool and handy stuff.
This happened to one of our documentation sites. My co-workers all saw it before I did, because Brave (my daily driver) wasn't showing it. I'm not sure if Brave is more relaxed in determining when a site is "dangerous" but I was glad not to be seeing it, because it was a false positive.
Ran a clickbait site, and got flagged for using a bunch of 302 redirects instead of 301s. Went from almost 500k uniques a month to 1k.
During the appeal it was reviewed from India, and I had been using geoblocking. This caused my appeal to be denied.
I ended up deploying to a new domain and starting over.
Never caught back up.
Congrats on this great choice of business endeavor
I'm sure it was a simple mistake. The fact that Immich competes with Google Photos has nothing to do with it.
Regarding how Google safe browsing actually works under the hood, here is a good writeup from Chromium team:
https://blog.chromium.org/2021/07/m92-faster-and-more-effici...
Not sure if this is exactly the scenario from the discussed article but it's interesting to understand it nonetheless.
TL;DR the browser regularly downloads a dump of color profile fingerprints of known bad websites. Then when you load whatever website, it calculates the color profile fingerprint of it as well, and looks for matches.
(This could be outdated and there are probably many other signals.)
I can't imagine that lasted more than 30 seconds after they made a public blog post about how they were doing it.
I had this same problem with my self-hosted Home Assistant deployment, where Google marked the entire domain as phishing because it contains a login page that looks like other self-hosted Home Assistant deployments.
Fortunately, I expose it to the internet on its own domain despite running through the same reverse proxy as other projects. It would have sucked if this had happened to a domain used for anything else, since the appeal process is completely opaque.
This can happen to everyone. It happened to Amazon.de's Cloudfront endpoint a week ago. Most people didn't notice because Chrome doesn't look at the intermediate bits in the resolver chain, but DNS providers using Safe Browsing blocked it.
https://github.com/nextdns/metadata/issues/1425
Yes, this is not a new problem: web browsers have taken on the role of internet police, but they only care about their own judgement and don't afford website operators any due process or recourse. And by web browsers I mean Google because of course everyone just defers to them. "File a complaint with /dev/null" might be how Google operates their own properties but this should not be acceptable for the web as a whole. Google and those integrating their "solutions" need to be held accountable for the damage they cause.
Them maintaining a page of gotchas is a really cool idea - https://immich.app/cursed-knowledge
> There is a user in the JavaScript community who goes around adding "backwards compatibility" to projects. They do this by adding 50 extra package dependencies to your project, which are maintained by them.
This is a spicy one, would love to know more.
It links to a commit; the removed deps are by GitHub user ljharb.
This is crazy. It also happened to the SOGo webmailer, both standalone and bundled with the mailcow: dockerized stack. They implemented a slight workaround where URLs are encrypted so that pattern detection doesn't flag them as "deceiving".
There has been no response from Google about this. I had my instance flagged 3 times on 2 different domains including all subdomains, displaying a nice red banner on a representative business website. Cool stuff!
Google often marks my homelab domains as dangerous which all point to an A record that is in the private IP space, completely inaccessible to the internet.
Makes precisely zero sense.
The .internal.immich.cloud sites do not have matching certs!
Navigating to https://main.preview.internal.immich.cloud, I'm right away informed by the browser that the connection is not secure due to an issue with the certificate. The problem is that it has the following CN (common name): main.preview.internal.immich.build. The list of alternative names also contains that same domain name. It does not match the site: the certificate's TLD .build is different from the site's .cloud!
I don't see the same problem on external sites like tiles.immich.cloud. That has a CN=immich.cloud with tiles.immich.cloud as an alternative.
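The mismatch described above comes down to a name comparison. A simplified sketch (exact match plus a single leftmost `*` label, which is an assumption that skips the finer points of real certificate validation), using the names quoted in this comment; `hostname_matches` and `cert_covers` are hypothetical helpers:

```python
def hostname_matches(hostname, pattern):
    """Simplified check: exact match, or '*' standing in for exactly
    one leftmost label. Real validators apply more rules than this."""
    host_labels = hostname.lower().rstrip(".").split(".")
    pat_labels = pattern.lower().rstrip(".").split(".")
    if len(host_labels) != len(pat_labels):
        return False
    if pat_labels[0] == "*":
        return host_labels[1:] == pat_labels[1:]
    return host_labels == pat_labels


def cert_covers(hostname, subject_alt_names):
    """True if any name on the certificate matches the hostname."""
    return any(hostname_matches(hostname, san) for san in subject_alt_names)


# The preview site: certificate issued for .build, site served on .cloud.
print(cert_covers("main.preview.internal.immich.cloud",
                  ["main.preview.internal.immich.build"]))

# The external site: the hostname is listed as an alternative name.
print(cert_covers("tiles.immich.cloud",
                  ["immich.cloud", "tiles.immich.cloud"]))
```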
We've already moved them to immich.build
A similar issue happened to us at APKMirror last week. https://x.com/ArtemR/status/1979428936267501626.
We still don't know what caused it because it happened to the Cloudflare R2 subdomain, and none of the Search Console verification methods work with R2. It also means it's impossible to request verification.
This happened to me, I hosted a Wordpress site and it got 0'day'd (this was probably 8 years ago). Google spotted the list of insane pornographic URLs and banned it. You might want to verify nothing is compromised.
They have to fix their SSL certs. "Kubernetes Ingress Controller Fake Certificate" ain't gonna cut it.
Sounds like you're hitting an address that isn't backed by any service, not sure what the issue is.
When power is concentrated in one pair of hands, those hands will always become the hands of a dictator.
> YAML whitespace is cursed
YAML itself is cursed: https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-fr...
"might trick you into installing unsafe software"
Something Google actively facilitates with the ads they serve.
First thing I do when I start to use a browser for the first time is making sure 'Google Safe Browsing' feature is disabled. I don't need yet another annoyance while I browse the web, especially when it's from Google.
> The most alarming thing was realizing that a single flagged subdomain would apparently invalidate the entire domain.
Correct. It works this way because in general the domain has the rights over routing all the subdomains. Which means if you were a spammer, and doing something untoward on a subdomain only invalidated the subdomain, it would be the easiest game in the world to play.
malware1.malicious.com
malware2.malicious.com
... Etc.
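A rough sketch of that grouping logic, and of how a Public Suffix List entry changes it. The suffix set here is a tiny illustrative subset, not the real list, and wildcard and exception rules are ignored; `registrable_domain` is a hypothetical helper.

```python
# Tiny illustrative subset; the real Public Suffix List is far larger.
PUBLIC_SUFFIXES = {"com", "co.uk", "github.io"}


def registrable_domain(hostname):
    """Return the public suffix plus one label, or None if the hostname
    is itself a suffix or no known suffix matches. (The real list has a
    default rule for unknown TLDs; this sketch does not.)"""
    labels = hostname.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first.
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the hostname is a public suffix itself
            return ".".join(labels[i - 1:])
    return None


# Both hosts collapse to the same site, so one flag can cover both.
print(registrable_domain("malware1.malicious.com"))
print(registrable_domain("malware2.malicious.com"))

# Under a listed suffix, each user's subdomain is its own site.
print(registrable_domain("alice.github.io"))
print(registrable_domain("bob.github.io"))
```

This is why hosts that serve user content on subdomains have a reason to get their suffix listed: it moves the boundary of "the whole site" down one level.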
google: we make going to the DMV look delightful by comparison!
They are not the government and should not have this vast, unaccountable monopoly power with no customer service.
the government probably shouldn't either?
At least the government is normally elected.
Most of it kind of isn't. When was the last election for FCC commissioners or US Attorney General or federal district court judges?
The government tends to go out of its way to have accountability and customer service.
Honestly, where do people live that the DMV (or equivalent - in some states it is split or otherwise named) is a pain? Every time I've ever been it has been "show up, take a number, wait 5 minutes, get served" - and that's assuming website self-service doesn't suffice.
I’d say this is a clear slight from Google, using their Chrome browser because something or someone is inconveniencing another part of their business, google cloud / google photos.
They did a similar thing with the uBlock Origin extension, flagging it with “this extension might be slowing down your browser” in a big red banner in the last few months of Manifest V2 on Chrome. That came after you already had to load the extension into Chrome yourself, because they took it off the extension store for inhibiting their ad business.
Google is a massive monopolistic company who will pull strings on one side of their business to help another.
With only Firefox not being based on Chromium and still having Manifest V2, the future (5 to 10 years from now) looks bleak. With only 1 browser like this, web devs can phase it out slowly by not taking it into consideration when coding, or Firefox could enshittify to such an extent because of their Manifest V2 monopoly that even that won't make it worth it anymore.
Oh, and for those not in the know: the manifest is a JSON file, manifest.json, in which a browser extension declares what it can and can't access or modify. The “upgrade” from Manifest V2 to V3 has made it near impossible for adblockers to block ads.
There's a reason GitHub use github.io for user content.
They're using a different TLD (.cloud / .app). But IIRC, GH changed to avoid cookies leaking with user created JS running at their main domain.
Either they have an open redirect being misused, or their domains are being used to host phish content.
This is the way of things.
This has been a known problem for quite some time, and the only solution is to use a separate domain. It has existed for so long that at this point we as users adapt to it rather than still expecting Google to fix it.
From their perspective, a few false positives are a tiny fraction of the total number of actual malicious websites blocked.
I am confused if the term "self-hosted" means the same thing to them as it means to me, not sure if I'm following.
Curious if anyone has had an instance where this blocking mechanism saved them. I can’t remember a single instance in the last 10 years.
I've had it work for me several times. Most of the time following links/redirects from search engines, ironically a few times from Google itself. Not that I was going to enter anything (the phishing attempts themselves were quite amateurish) but they do help in some rare cases.
When I worked customer service, these phishing blocks worked wonders preventing people from logging in to your-secure-webmail.jobz. People would be filling in phishing forms days after sending out warnings on all official channels. Once Google's algorithm kicked in, the attackers finally needed to switch domains and re-do their phishing attempts.
Your parents probably have
I tried to submit this, but the direct link here is probably better than the Reddit thread I linked to:
https://old.reddit.com/r/immich/comments/1oby8fq/immich_is_a...
I had my personal domain I use for self-hosting flagged. I've had the domain for 25 years and it's never had a hint of spam, phishing, or even unintentional issues like compromised sites / services.
It's impossible to know what Google's black box is doing, but, in my case, I suspect my flagging was the result of failing to use a large email provider. I use MXRoute for locally hosted services and network devices because they do a better job of giving me simple, hard limits for sending accounts. That way if anything I have ever gets compromised, the damage in terms of spam will be limited to (ex) 10 messages every 24h.
I invited my sister to a shared Immich album a couple days ago, so I'm guessing that GMail scanned the email notifying her, used the contents + some kind of not-google-or-microsoft sender penalty, and flagged the message as potential spam or phishing. From there, I'd assume the linked domain gets pushed into another system that eventually decides they should blacklist the whole domain.
The thing that really pisses me off is that I just received an email in reply to my request for review and the whole thing is a gas-lighting extravaganza. Google systems indicate your domain no longer contains harmful links or downloads. Keep yourself safe in the future by blah blah blah blah.
Umm. No! It's actually Google's crappy, non-deterministic, careless detection that's flagging my legitimate resources as malicious. Then I have to spend my time running it down and double checking everything before submitting a request to have the false positive mistake on Google's end fixed.
Convince me that Google won't abuse this to make self hosting unbearable.
> I suspect my flagging was the result of failing to use a large email provider.
This seems like the flagging was a result of the same login page detection that the Immich blog post is referencing? What makes you think it's tied to self-hosted email?
I'm not using self hosted email. My theory is that Google treats smaller mail providers as less trustworthy and that increases the odds of having messages flagged for phishing.
In my case, the Google Search Console explicitly listed the exact URL for a newly created shared album as the cause.
https://photos.example.com/albums/xxxxxxxx-xxxx-xxxx-xxxx-xx...
I wish I would have taken a screenshot. That URL is not going to be guessed randomly and the URL was only transmitted once to one person via e-mail. The sending was done via MXRoute and the recipient was using GMail (legacy Workspace).
The only possible way for Google to have gotten that URL to start the process would have been by scanning the recipient's e-mail. What I was trying to say is that the only way it makes sense to me is if Google via GMail categorized that email as phishing and that kicked off the process to add my domain to the block list.
So, if email categorization / filtering is being used as a heuristic for discovering URLs for the block list, it's possible Google's discriminating against domains that use smaller email hosts that Google doesn't trust as much as themselves, Microsoft, etc..
All around it sucks and Google shouldn't be allowed to use non-deterministic guesswork to put domains on a block list that has a significant negative impact. If they want to operate a clown show like that, they should at least be liable for the outcomes IMO.
I'm in a similar boat. Google's false flag is causing issues for my family members who use Chrome, even for internal services that aren't publicly exposed, just because they're on related subdomains.
It's scary how much control Google has over which content people can access on the web - or even on their local network!
It's a good opportunity to recommend Firefox when you can show a clear abuse of position
Firefox uses the same list.
Wonder if there would be any way to redress this in small claims court.
This is another case where it's highly important to "plant your flag" [1] and set up all those services like Search Console, even if you don't plan to use them. Not only can this sort of thing happen, but bad-guys can find crafty ways of hijacking your search console account if you're not super vigilant.
Google Postmaster Console [2] is another one everybody should set up on every domain, even if you don't use gmail. And Google Ads, even if you don't run ads.
I also recommend that people set up Bing search console [3] and some service to monitor DMARC reports.
It's unfortunate that so much of the internet has coalesced around a few private companies, but it's undeniably important to "keep them happy" to make sure your domain's reputation isn't randomly ruined.
[1] https://krebsonsecurity.com/2020/08/why-where-you-should-you...
[2] https://postmaster.google.com/
[3] https://www.bing.com/webmasters/
It does seem kind of stupid to (apparently) not have google search console, or even a google account according to them, for your business. I don't like Google being in control of so much of the internet - but they are, and it won't do us any good to shout into the void about it when our domain and livelihood is on the line.
Simply opening a case saying that this is our website not impersonating anyone else is unlikely to get anything resolved.
Just because it's your website, and you're not a bad agent doesn't prove that no part of the site is under the control of a bad agent, and that your site isn't accidentally hosting something malicious somewhere, or have some UI that is exploitable for cross-site scripting or whatever.
Sure, but why does Google approve our review over and over again without us making any changes or modifications to the flagged sites/urls? It's a vanilla Immich deployment with docker containers from GitHub pushed there by the core team.
There is no reason why a browser should __be__ a contentfilter.
Instead, you should be able to install a preferred contentfilter into your browser.
I have no idea what immich is or what this post says, but I LOVE that this company has a collection of posts called, “Cursed Knowledge.”
I believe that Jellyfin, Immich, and NextCloud login pages are automatically flagged as dangerous by Google. What's more, I suspect that Google is somehow collecting data from its browser - Chrome.
Google flagged my domain as dangerous once. I do host Jellyfin, Immich, and NextCloud. I run an IP whitelist on the router. All packets from IPs that are not whitelisted are dropped. There are no links to my domain on the internet. At any time, there are 2-3 IPs belonging to me and my family that can load the website. I never whitelisted Google IPs.
How on earth did Google manage to determine that my domain is dangerous?
I don't think I ever saw a legitimate warning, EVER. I push past SSL warnings EVERY DAY to manage infra.
This happened to amazon.de last week. It was resolved quickly.
Google shouldn’t be a single chokepoint for web censorship.
My local SABNZBD instance (not even accessible from the internet) was marked as a malicious site too.
Is there any linkage to the semifactoid that immich Web gui looks very like Google Photos or is that just one of the coincidences?
Not a coincidence, Immich was started as a personal replacement for Google Photos.
The coincidence here would be google flagging it as malware, not the origin story of the look and feel.
Oh my bad, I severely misinterpreted your comment.
I’m launching a web version for an online game. What to do to prevent this from happening?
Install your non-self generated SSL certificate correctly, and make sure users can't upload arbitrary content to your domain.
F you, Google! Thank goodness I severed that relationship years ago. With so many other great (and ethically superior) products out there to choose from, you'd have to be a true masochist to intentionally throw yourself into their pool of shit.
I don't want Google to abuse the world wide web. It is time for real change - a world without Google. A world with less Evil.
This just makes me feel more loyalty towards Immich and disgust towards Google Photos.
At this point I would rather use an analog camera with photo albums than Google Photos.
I've rarely seen a HN comment section this overwhelmingly wrong on a technical topic. This community is usually better than this.
Google is an evil company I want the web to be free of, I resent that even Firefox & Safari use this safe browsing service. Immich is a phenomenal piece of software - I've hosted it myself & sung its praises on HN in the past.
But putting aside David vs Goliath biases here, Google is 100% correct here & what Immich are doing is extremely dangerous. The fact they don't acknowledge that in the blog post shows a security knowledge gap that I'm really hoping is closed over the course of remediating this.
I don't think the Immich team mean any harm but as it currently stands the OP constitutes misinformation.
> what Immich are doing is extremely dangerous
I've read the article and don't see anything dangerous, much less extremely so. Care to explain?
They're auto-deploying PRs to a subdomain of a domain that they also use for production traffic. This allows any member of the public with a GitHub account to deploy any arbitrary code to that subdomain without any review or approval from the Immich team. That's bad for two reasons:
1. PR deploys on public repos are inherently tricky as code gains access to the server environment, so you need to be diligent about segregating secrets for pr deployments from production secret management. That diligence is a complex & continuous undertaking, especially for an open source project.
2. Anyone with a GitHub account can use your domain for phishing scams or impersonation.
The second issue is why they're flagged by Google (the first issue may be higher risk to the Immich project but it's out of scope for Google's safe browsing service).
To be clear: this isn't about people running their own immich instance. This is about members of the public having the ability to deploy arbitrary code without review.
---
The article from the Immich team does mention they're switching to using a non-production domain (immich.build) for their PR builds which does indicate to me they somewhat understand the issue (though they've explained it badly in the article), but they don't seem to understand the significance or scope.
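To make the "same domain vs. separate domain" point concrete, here is a rough sketch of how a hostname collapses to its registrable domain. The suffix set is a tiny stand-in for the real Public Suffix List (no wildcard or exception rules), and the preview hostnames are hypothetical; only tiles.immich.cloud and immich.build come from the discussion.

```python
# Tiny stand-in for the Public Suffix List; the real list is far larger
# and also has wildcard and exception rules that are ignored here.
PUBLIC_SUFFIXES = {"cloud", "build", "com", "co.uk"}

def registrable_domain(hostname: str) -> str:
    """Return the longest matching public suffix plus one more label."""
    labels = hostname.lower().rstrip(".").split(".")
    # Scan candidate suffixes from longest to shortest.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                return suffix  # the hostname is itself a public suffix
            return ".".join(labels[i - 1:])
    return ".".join(labels)  # no known suffix: treat the whole name as the site

# Hypothetical preview hostnames for illustration.
old_preview = registrable_domain("pr-123.preview.immich.cloud")
tiles = registrable_domain("tiles.immich.cloud")
new_preview = registrable_domain("pr-123.preview.immich.build")
```

Here `old_preview` and `tiles` both come out as `immich.cloud`, while `new_preview` is `immich.build`. If anything downstream treats reputation per registrable domain (an assumption, I don't know how Safe Browsing actually scores sites), moving previews to a separate domain keeps them out of the same bucket as production traffic.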
> This allows any member of the public with a GitHub account to deploy any arbitrary code to that subdomain without any review or approval from the Immich team.
This part is not correct: the "preview" label can be set only by collaborators.
> a subdomain of a domain that they also use for production traffic
To clarify this part: the only production traffic that immich.cloud serves are static map tiles (tiles.immich.cloud)
Overall, I share your concerns, and as you already mentioned, a dedicated "immich.build" domain is the way to go.
> This part is not correct: the "preview" label can be set only by collaborators.
That's good & is a decent starting point. A decent second step might be to have the Github Actions workflow also check the approval status of the PR before deploying (requiring all collaborators to be constantly aware that the risk of applying a label is similar to that of an approval seems less viable)
The workflow is fundamentally unable to deploy a PR from a fork, it only works for internal branches, as it relies on the container image being pushed somewhere which needs secrets available in the CI workflow.
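The three conditions discussed in this subthread (collaborator-applied label, approval, no forks) could be combined into a single gate roughly like this. It is a sketch, not the actual Immich workflow: the `PullRequest` fields and repository names are hypothetical, and a real workflow would read them from the CI event payload.

```python
from dataclasses import dataclass, field

@dataclass
class PullRequest:
    # All fields are hypothetical stand-ins for data from the CI event payload.
    head_repo: str                      # repository the PR branch lives in
    base_repo: str                      # repository the PR targets
    labels: set = field(default_factory=set)
    approved: bool = False              # at least one approving review

def should_deploy_preview(pr: PullRequest) -> bool:
    """Deploy only labeled, approved PRs whose branch lives in the base repo."""
    from_fork = pr.head_repo != pr.base_repo
    return (not from_fork) and ("preview" in pr.labels) and pr.approved
```

Checking approval in the gate itself means a label applied by mistake is not enough on its own to publish unreviewed code.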
What scumbags they are, really.
And yet if you start typing 192 in chrome, first suggested url is 192.168.l00.1
sad
If there are any googlers here, I'd like to report an even more dangerous website. As much as 30-50% of the traffic to it relates to malware or scams, and it has gone unpunished for a very long time.
The address appears to be adsense.google.com.
Also YouTube.com serves a lot of scam advertisements. They should block that too.
I think google is crumbling under the weight of their size. They are no longer able to process the requested commercials with due diligence.
Nah, they just don't give a fuck. Never have
I see the same scam/deepfake ad(s) pretty much persistently. Maybe they actually differ slightly (they are AI gen mostly), but it's pretty obvious what they are, and I'm sure they get flagged a lot.
They just need to introduce a basic deposit to post ads, and you lose it if you put up a scam ad. Would soon pay for the staff needed to police it, and prevent scammers from bypassing admin by trivially creating new accounts.
That's probably a good idea. They can also earn interest on the deposit. (Not that they need the money).
I used to flag obvious scam adverts. A bunch of times I'd even get an email response a few weeks later saying it was taken down. But then I'd see it again (maybe slightly different or by a "different" advertiser, who knows). It's whack-a-mole.
The reality is that google profits from scam adverts, so they don't proactively do anything about it and hide behind the "at our scale, we can't effectively do anything about it" argument. Which is complete horseshit because if you can't prevent obvious scams on your platform, you don't deserve to have a platform. Google doesn't have to be running at their scale. "We would make less money" is not a valid excuse. We'd all make more money if we could ignore laws and let people be scammed or taken advantage of.
There's plenty of ways they could solve it, but they choose not to. IMHO this should be a criminal offence and google executives should be harshly punished. It's also why I have a rather negative view of googlers, since they wilfully perpetuate this stuff by working on adtech while nothing is being done about the normal everyday people getting scammed each day. It's only getting worse with AI, but I've been seeing it for years.
Did they ever? They used to only allow text ads, which reduced malware compared to serving random JavaScript. But did they ever vet the ad's content?
> They are no longer able to process the requested commercials with due diligence
no longer able? or no longer willing to, because it impacts their bottom line?
They can afford to hire thousands of people to swiftly identify scams and take punitive action. And pay them well.
They can, but as long as regulators let them get away with it, they will just pocket the money instead. Google are, imho, an evil company.
Use their Recaptcha to let users identify scam ads instead of cars and traffic lights.
What I really don't understand: at least here in Europe, the advertising partner (AdSense) must investigate, at least minimally, whether the advertising is illegal or fraudulent. I understand that sites.google etc. are under "safe harbor", but that's not the point with AdSense, since people from Google "click" the publish button and also get money to publish that ad.
I have reported over a dozen ads to AdSense (Europe) because of them being outright scams (e.g. on weather apps, an AdSense banner claiming "There is a new upgrade to this program, click here to download it"). Google has invariably closed my reports claiming that they do not find any violation of the AdSense policies.
Same thing with Instagram, they accept all scam ads.
Google and Meta are trillion dollar criminal enterprises. The lion's share of their income comes from fraud and scams, with real victims having their lives destroyed. That is the sad truth, no matter how good and important some of their services are. They will never stop their principal source of income.
They’re far too embedded politically to ever face consequence too. I hope someday we can get a serious anti-corruption candidate.
Do you report those only to Google, or also to your local watchdog/police/commerce regulator?
The law is only for plebs like you and me. Companies get a pass.
I'm still amazed how deploying spyware would've rightfully landed you in jail a couple decades back, but do the same thing on the web under the justification of advertising/marketing and suddenly it's ok.
>Companies get a pass.
I'm pretty sure that if Springer were to make a fraudulent ad, they would instantly be slapped with a lawsuit and face public outcry.
Springer itself is nothing but scam.
True, but at least the ads are not ;)
Which one of the two Springer-s? ;-)
sites.google.com
The same outfit is running a domain called blogger.
Reminds me of MS blocking a website of mine for dangerous script. The offending thing i did was use document.write to put copyright 2025 (with the current year) at the end of static pages.
My work's email filter regularly flags links to JIRA and github as dangerous. It stopped being even ironically amusing after a while.
Microsoft's own Outlook.com flags Windows Insider emails coming from a .microsoft.com domain as junk even after marking the domain as "no junk". They know themselves well.
Frequent frustration past week for me:
The integrated button to join a Microsoft Teams meeting directly from my Microsoft Outlook Calendar doesn't work because Microsoft needs to scan the link from Microsoft to Microsoft for malware before proceeding, and the malware scanning service has temporary downtime and serves me a static page saying "The content you are accessing cannot currently be verified".
I feel like the GitHub one might be okay since a lot of malware binaries are hosted there still.
To be fair, that's a legally invalid copyright notice.
sites.google.com is widely abused, but so is practically any site which allows users to host content of their choice and make it publicly available. Where Google can be different is that they famously refuse to do work which they cannot automate, and probably they cannot (or don't want to) automate detection/blocking of spam/phishing hosted on sites.google.com and processing of abuse reports.
The nerve of letting everyone run a phishing campaign on sites.google.com but marking a perfectly safe website as malicious.
Enshittification ensues.
Yeah - that website keeps on spamming me down with useless stuff.
I was able to block most of this via ublock origin but Google disabled this - can not download it from here anymore:
https://chromewebstore.google.com/detail/ublock-origin/cjpal...
Funniest nonsense "explanation":
"This extension is no longer available because it doesn't follow best practices for Chrome extensions."
In reality Google killed it because it threatens their greed income. Ads, ads and more ads.
Use Firefox.
Use one of the forks. librewolf, waterfox, zen. Firefox itself lost trust when Mozilla tried to push the new Terms of Use earlier this year. That was so aggressively user-hostile that nobody should trust Mozilla ever again. Using a fork puts an insulation layer between you and Mozilla.
Librewolf is just a directly de-mozillaed and privacy-enhanced Firefox, similar to Ungoogled Chromium. I've been trying to get in the habit of using Zen Browser, which has a bunch of UI changes.
> Firefox itself lost trust when Mozilla tried to push the new Terms of Use earlier this year.
Those terms of use aren't in place any longer. I'm surprised that listening to the users is viewed as something bad.
This. Their devs and reactivity to their user base kept my trust.
Their marketing and legal departments lost it long before the terms of service debacle.
Rolling back a change that causes loss of user trust does not automatically restore that trust. It takes time and ongoing public commitment to regain that trust.
Allowing that ToS change is what put them on the spyware list, not rolling it back.
The problem is that all those forks are beholden to Mozilla's corporate interests the same way the chromium derivatives are beholden to Google's corporate interests. What we need is one of the newer independent engines to mature - libweb, servo or blitz.
How are they beholden? In the sense that it's hard to provide engine updates without the funding of goog?
edit: also, by "libweb", did you mean "ladybird"?
You can read this as, "I want Mozilla to spend millions developing a competitive Chrome alternative, but I want it for free and aligned with all my personal nitpicks".
Typical freeloader behaviour, moans about free software politics but won't contribute anything themselves.
No they're not. They can pull what they like and not pull what they don't.
Librewolf is trying to be de-Mozillaed, privacy-enhanced Firefox, so it'll probably take whatever not-overtly-spyware patches Mozilla adds. Some others, like Waterfox and Pale Moon, are more selective.
Apparently the "best practise" is using Manifest V3 versus V2.
Reading a bit online (not having any personal/deep knowledge) it seems the original extension also downloaded updates from a private (the developer's) server, while that is no longer allowed - they now need to update via the chrome extension store, which also means waiting for code review/approval from google.
I can see the security angle there, it is just awkward how much of a vested interest google has in the whole topic. Ad-blocking is already a grey area (legally), and there is a cat-and-mouse game between blockers and advertisers; it's hard to believe there is only security best-practise going on here.
You know what? I don't even mind them killing it, because of course there are a whole pile of items under the anti-trust label that google is doing so why not one more. But what I do take issue with is the gaslighting, their attempt to make the users believe that this is in the users' interests, rather than in google's interests.
If we had functional anti-trust laws then this company would have been broken up long ago, Alphabet or not. But they keep doing these things because we - collectively - let them.
Why would a monopoly care about users' interests?
I know they won't. But we have all the tools to force them to care. We just don't use the tools effectively, and between that and lobbying they get a free pass to pretty much do as they please.
DNS level blockers like NextDNS are much easier to use and work for the entire device.
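For those who haven't used one, the general idea of DNS-level blocking can be sketched like this. It is not NextDNS's implementation: the blocklist entries are hypothetical and `upstream` is a placeholder for whatever resolver normally answers the query.

```python
# Hypothetical blocklist entries; real services ship much larger curated lists.
BLOCKLIST = {"ads.example.com", "tracker.example.net"}

def is_blocked(hostname: str) -> bool:
    """Block a name if it, or any parent domain of it, is on the blocklist."""
    labels = hostname.lower().rstrip(".").split(".")
    return any(".".join(labels[i:]) in BLOCKLIST for i in range(len(labels)))

def resolve(hostname: str, upstream) -> str:
    """Answer blocked names with 0.0.0.0, otherwise defer to the upstream resolver."""
    return "0.0.0.0" if is_blocked(hostname) else upstream(hostname)
```

Since the decision is made per hostname, this covers every app on the device, but it cannot remove ads that are served from the same hostname as the content itself.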
Yes, the irony of Google flagging other sites as malware is not lost on me.
I heard the CEO has a Hitler bedspread and Mussolini tattoo on the far-right of his right buttock.
[flagged]
As someone who doesn't like Google and absolutely thinks they need to be broken up, no probably not. Google's algorithms around security are so incompetent and useless that stupidity is far more likely than malice here.
Callous disregard for the wellbeing of others is not stupidity, especially when demonstrated by a company ostensibly full of very intelligent people. This behavior - in particular, implementing an overly eager mechanism for damaging the reputation of other people - is simply malicious.
Incompetently or "coincidentally" abusing your monopoly in a way that "happens" to suppress competitors (while whitelisting your own sites) probably won't fly in court. Unless you buy the judge of course.
Intent does not always matter to the law ... and if a C&D is sent, doesn't that imply that intent is subsequently present?
Defamation laws could also apply independently of monopoly laws.
I don't see how this is an issue. To me, this does seem at least confusing, but possibly dangerous.
If you have internal auth testing domains at the same place as user generated content, what's to stop somebody thinking a user-generated page is a legit page when it asks you to log in or something?
To me this seems like a reasonable flag.
There is no user generated content involved here.