Show HN: Privaxy – Adblocking / tracker blocking by MITMing HTTPS traffic

86 points by pierrebarre 3 years ago

teddyh 3 years ago

I fear that MITMing ads is a dead end:

1. IIUC, when SNI is encrypted (in TLS 1.3?) almost everything is out the window.

2. Local devices can do DNS over HTTPS (DoH) and DNS over QUIC (DoQ) to look up their stuff, so DNS-based blocking will soon be obsolete.

3. The browser itself is controlled by the biggest ad-vendor around (Google), so you’ll probably get no help there.

The only solutions are:

A. Use browsers not controlled by Google (i.e. not any Chrome fork either).

B. Use only apps and devices locally which do not display ads. (This is, in a way, a generalization of A.)

C. Legislate away the business models of ads and the media and “smart” devices which use ads.

(A very similar argument can be made for user tracking and telemetry.)

mhils 3 years ago

I wouldn't write it off - one possible trick here is to also MITM the DoH/DoQ server and disable ECH by removing the relevant records from the DNS response. We've just added DNS support to mitmproxy and this is a natural follow-up. :)
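A rough illustration of the record-stripping idea, not mitmproxy's actual code: in an HTTPS/SVCB record (RFC 9460) the ECH config travels as SvcParam key 5, so an intercepting resolver could drop just that parameter. The function name `strip_ech` is hypothetical, and the sketch assumes the SvcParams portion of the RDATA has already been extracted. It does not handle a `mandatory` parameter that lists `ech`, nor DNSSEC-signed responses.

```python
import struct

ECH_KEY = 5  # SvcParamKey "ech" as registered for RFC 9460 records


def strip_ech(svc_params: bytes) -> bytes:
    """Return the SvcParams wire data with any 'ech' parameter removed.

    Each parameter is encoded as: 2-byte key, 2-byte length, value bytes.
    """
    kept = bytearray()
    pos = 0
    while pos < len(svc_params):
        if pos + 4 > len(svc_params):
            raise ValueError("truncated SvcParam header")
        key, length = struct.unpack_from("!HH", svc_params, pos)
        end = pos + 4 + length
        if end > len(svc_params):
            raise ValueError("truncated SvcParam value")
        if key != ECH_KEY:
            kept += svc_params[pos:end]  # copy every parameter except ech
        pos = end
    return bytes(kept)


# Made-up sample data: an alpn parameter (key 1) followed by a dummy ech blob.
sample_alpn = struct.pack("!HH", 1, 3) + b"\x02h2"
sample_ech = struct.pack("!HH", ECH_KEY, 4) + b"\xde\xad\xbe\xef"
stripped = strip_ech(sample_alpn + sample_ech)
```

Without the `ech` parameter a client has no ECH config to use, so it would presumably fall back to a plaintext SNI, which is the behaviour the comment is relying on.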
- pdimitar 3 years ago
  
  Oh? Do you guys have a blog writeup? I would LOVE to read more about this! I want to eliminate the small amount of ads that make it through my PiHole.
  
  mhils 3 years ago
  
  Not yet, but you might be lucky soon. We have an RSS feed on mitmproxy.org and a Twitter account. :-)
XorNot 3 years ago

Anyone who cares about ad blocking should not be using any Chromium based browser at this point, but isn't this the sort of tool you'd use at a network or virtual network level?
There's no reason to let applications on your device bypass your own network settings - and this is something we probably need to start accommodating in Linux distros to start with (specifically: disabling all the weaponized E2E encryption that vendors are using, and forcibly MITM'ing it with keys under the user's control).
Network-namespaces should make this eminently possible - launch the user's entire environment into a network namespace which can only speak to "user rights" networking stack.
randomhodler84 3 years ago

Run your own DoH filtering DNS server, I set this up a few months ago. DNS blocking is not obsoleted by transport encryption.
- aesh2Xa1 3 years ago
  
  OP is stating that "apps and devices" may circumvent DNS blocking by resorting to DoH. You can run your own DoH server, and you can even advertise it via your DHCP server, but clients ("apps and devices") do not need to accept the supplied servers for their own configuration.
  
  randomhodler84 3 years ago
  
  A lot of things are possible, but are they done?
  I am yet to hear of any examples of hardcoded DNS servers. I believe this to be too fragile to implement.
  
  autoexec 3 years ago
  
  They don't even have to be hardcoded, they just have to ignore anything you specify or not give you any option to specify your own. As long as a device manufacturer can push updates to your device (even by IP address) they can regularly update their chosen DNS servers when needed. Honestly though, for many devices I doubt they'd even bother. Companies seem to have little trouble taking the position that if your device is more than a few years old you're insane for expecting them to still support it and you should have already thrown it away and bought another one.
  
  random1temp2 3 years ago
  
  League of Legends hardcodes 8.8.8.8.
  
  randomhodler84 3 years ago
  
  Thank you for the example. That is probably plain 53/udp, in which case one can set up a NAT rule to redirect all outgoing port 53 traffic to the local filtering DNS resolver.
- teddyh 3 years ago
  
  How do you force applications to use this server? I mean, even if you MITM the connection to the application’s preferred DoH server, the application probably checks the certificate of the DoH server and refuses to work at all if it can’t get a verified connection.
  
  randomhodler84 3 years ago
  
  You don’t mitm the DoH, you substitute it with your own server.
  I have yet to see DNS/DoH “pinning”, and apps (browsers) will let you override it. Embedding DNS entries in apps is a bad idea (as opposed to cert pinning, which is about fixed trust, and a good idea). Given that sometimes this is going to be blocked, even if they did it would fall back to the host resolver.
- atommclain 3 years ago
  
  Very curious about how you went about this as I would like to do the same.
  
  randomhodler84 3 years ago
  
  Many options, take a look at https://wiki.archlinux.org/title/DNS_over_HTTPS_servers
  Update the DHCP on your router, all done.
ThePhysicist 3 years ago

In my understanding ECH/ESNI shouldn't be an issue in this setup as long as the browser issues a domain-specific CONNECT request (i.e. "CONNECT google.com" instead of "CONNECT 24.154.13.11"). I think even with ECH enabled you should be able to impersonate the web server if you have a valid root CA certificate in the browser's trust store. Remember, you're not performing "hostile" MITM-ing, but explicitly configuring a proxy and root certificate in your browser. DNS shouldn't be an issue either as the browser leaves domain resolution to the proxy.
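To make the CONNECT point concrete, here is a minimal sketch of how an explicitly configured proxy could read the target from the request line and tell a hostname from a bare IP literal. `parse_connect` is a hypothetical helper, not taken from Privaxy, and it assumes a well-formed `CONNECT host:port HTTP/1.1` line.

```python
import ipaddress


def parse_connect(request_line: str) -> tuple[str, int, bool]:
    """Parse an HTTP CONNECT request line.

    Returns (host, port, is_hostname). is_hostname is False when the
    target is an IP literal, in which case the proxy learns no domain name.
    """
    method, target, _version = request_line.split()
    if method.upper() != "CONNECT":
        raise ValueError("not a CONNECT request")
    host, _, port = target.rpartition(":")
    host = host.strip("[]")  # IPv6 literals are written as [addr]:port
    try:
        ipaddress.ip_address(host)
        is_hostname = False
    except ValueError:
        is_hostname = True
    return host, int(port), is_hostname


by_name = parse_connect("CONNECT google.com:443 HTTP/1.1")
by_ip = parse_connect("CONNECT 24.154.13.11:443 HTTP/1.1")
```

In the hostname case the proxy already knows which site to impersonate before any TLS handshake happens, so whether the ClientHello hides the SNI would not matter to it.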
- teddyh 3 years ago
  
  This is, of course, assuming that you can trust the browser to obey its proxy settings. (And proxy settings do not apply at all to local “smart” devices.)
baxuz 3 years ago

Does AdGuard also use this approach?
https://kb.adguard.com/en/general/how-malware-protection-wor...
fomine3 3 years ago

4. Deliver ads from same host as content (like Twitter, YouTube)
trasz 3 years ago

D. Create a whole bunch of VMs with browsers and “fake users” to DoS the whole ad-based business model.
- trasz 3 years ago
  
  (Note: DoS not literally, of course, but by feeding it so many fake metrics that the signal - metrics for actual, live users - gets lost in the noise.)
  Or perhaps forget about VMs, and have some kind of browser plugin that performs "fake browsing" - to throw off analytics - with your real cookies, but hidden from view, so it's not annoying to the live user browsing the web in the usual manner.
- teddyh 3 years ago
  
  Some people might do this, but it will never be enough people to even register on the scales of the ad-funded businesses.

adamzochowski 3 years ago

There was a proxy, Proxomitron, in the early 2000s that allowed you to change the HTML/JS as it went through the proxy. People used it for adblocking and removing page annoyances, like removing sounds / animated GIFs / etc. Here is a list of random old filters people had built at one time: https://proxomitron.info/45/help/Default-Web-Filters.html

dredmorbius 3 years ago

There were numerous tools like this.
Privoxy, dansguardian, Squid (AFAIR), and others.
The notion that SSL/TLS means that ONLY the webserver origin and web browser client are permitted to see or mitigate content ... is itself harmful. Trusted proxies under your control do have a place, though yes, that introduces new points of contention as well.
- captn3m0 3 years ago
  
  I used to swear by Privoxy till the internet realised HTTPS was actually important and it stopped working everywhere.
  
  dredmorbius 3 years ago
  
  Largely the same. Privaxy actually looks pretty sweet in that regard.

mhils 3 years ago

This approach is a natural escalation step as DNS-based blocking is getting increasingly difficult. But it's not without its drawbacks. For example, browsers tend to have by far the best TLS implementations. By MITMing yourself, you essentially trust the proxy's TLS implementation instead, which will receive much less scrutiny. There's a lot of precedent for TLS vulnerabilities introduced by middleboxes. If browser extensions are possible they should be preferred. But the author does have a point that this can't be taken for granted anymore!

randomhodler84 3 years ago

Why is DNS based blocking getting difficult? You run a bind server and tell it what it can and cannot resolve. It can even listen on DoH so you get transport security between peer and local dns server.
- rsync 3 years ago
  
  Your browser (or your tv) can just skip your entire dns infra and make its own lookups over https- which you won’t see.
  That’s the evil genius of doh- you can’t block 443 and their “dns server” could be the same hostname as the site you visit … and now we’re discussing mitm’ing ourselves…
  Sigh.
  
  randomhodler84 3 years ago
  
  Could, but do? I have never seen DNS or DOH pinning. Seems fragile. Would likely fall back to host resolver anyway.
- mhils 3 years ago
  
  AdTech increasingly uses CNAME cloaking-style tricks to evade DNS blocking. Some of those tricks are detectable, but DNS blocking will inevitably fail once ads are served from the first party domain. It's still rare, but simple CNAME cloaks specifically have seen an uptick in the last few years.
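A small sketch of why simple CNAME cloaks are detectable while true first-party serving is not: a filtering resolver can check every name in the CNAME chain against its blocklist, not just the name that was queried. The function names and the `.example` domains are made up for illustration, and the chain is assumed to have been resolved already.

```python
def is_blocked(name: str, blocklist: set[str]) -> bool:
    """True if the name or any parent domain of it is on the blocklist."""
    labels = name.rstrip(".").lower().split(".")
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels)))


def should_block(qname: str, cname_chain: list[str], blocklist: set[str]) -> bool:
    """Check the queried name and every CNAME target it resolves through."""
    return any(is_blocked(name, blocklist) for name in [qname, *cname_chain])


blocklist = {"tracker.example"}

# A first-party-looking subdomain that is really an alias for a tracker.
cloaked = should_block("metrics.shop.example", ["abc.tracker.example."], blocklist)

# Ads served straight from the first-party host leave nothing to match on.
first_party = should_block("shop.example", [], blocklist)
```

The second case is the failure mode described above: once the ad comes from the same hostname as the content, name-based filtering has no signal left.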
trasz 3 years ago

A TLS proxy is something that’s trivially easy to sandbox; a browser is the exact opposite.

Saint_Genet 3 years ago

Used to run privoxy back in the day, but stopped when adblock extensions came along. It was simply more convenient to manage adblocking from the browser rather than figuring out regexps to put in its config. Also, it didn't do https.

geoffeg 3 years ago

I've really wanted a server-side uBlock Origin like this for a while now for devices that can't run uBlock (mobile, etc) or where uBlock is limited in functionality (Chrome). This looks like a great start.

itintheory 3 years ago

In case you weren't aware, Firefox on Android can run uBlock Origin without root or any other modifications. This proxy would be nice to have for system-level ad blocking though!

cal85 3 years ago

What are the potential benefits of a ‘MITM’ approach, compared to other approaches like acting as DNS (like pihole)?

Edit: I should have read the About section more carefully:

> Privaxy is also way more capable than DNS-based blockers as it is able to operate directly on URLs and to inject resources into web pages.

Makes sense. So it potentially has the fine-grained control of a browser-based blocker but also has good performance like a pihole. Sounds compelling. Now I’m interested to know why it’s not been done this way before? Is it just a hard problem to solve, and no one has attempted it yet?

randomhodler84 3 years ago

It’s been done for years and years but it’s considered a very bad idea these days. MITMing HTTPS sessions is a trivial problem today. It’s just a bad idea as it breaks the entire trust model of the internet.
Most commercial firewalls for the last decade plus have such features.

2Gkashmiri 3 years ago

Why build something fresh and not join forces with pihole? Reinventing the wheel for a niche function doesn't get traction much.

I don't know the reason why the devs of this project think they need to start afresh, there are already tools like Firefox + uBlock Origin + pihole which should solve most if not all of the problems. Why not incorporate the defining feature into pihole so that people don't have to add more complexity?

Do I switch off my pihole and set this up?

autoexec 3 years ago

> Why build something fresh and not join forces with pihole? Reinventing the wheel for a niche function doesn't get traction much.
What harm does it do? Sure, some combination of 3 or more other things might give you most of the same functionality but why shouldn't people have the option to choose whichever works best for them? Even if the capabilities were 100% identical it's still worth it because it gives you an option if the thing you're using goes evil or stops updating or turns out to contain a vulnerability that takes months to fix etc.
Even better, it could lead to innovation. Maybe Privaxy does something better than pihole does, or has some nice feature they don't and pihole sees it, loves the idea, and makes that improvement or adds that feature too and suddenly everybody is better off. Maybe just having competition helps improve things.
I'm really struggling to understand how anyone loses here, or why it's preferable to have our options limited.
dredmorbius 3 years ago

Does PiHole do anything other than DNS-based blocking?
- mindslight 3 years ago
  
  It would actually be pretty sweet if something like PiHole bundled and incorporated something like this as a configuration option, to deal with sites where DNS-only blocking didn't work.
- 2Gkashmiri 3 years ago
  
  I don't know. My point is the "fragmentation" thing.
  
  dredmorbius 3 years ago
  
  As Privaxy includes blocklists, I'd argue that it is a superset of PiHole functionality. DNS blocklisting is actually pretty straightforward, and there are many tools which do it. PiHole is only one.
  That said, which would be better suited to incorporate the other is an interesting question.

randomhodler84 3 years ago

I said it before and I will say it again, MITM for ad blocking is not a way forward.

Cert pinning defeats this on 99% of consumer devices and introduces a security hole in the browser by subverting the trust model. Unless the proxy is doing 100% of the same thing the browser is doing, and it isn’t, you are weakening browser security too.

Instrument the endpoint (browser plug-in) or control name resolution (filtering DNS server that uses DoH to prevent upstream filtering).

pkulak 3 years ago

It's not about this being some end-all solution, it's about it being an option. Personally, I love it. I used to use Privoxy, back when nothing was encrypted, and it was wonderful. A central place to store all my ad-blocking config that could be connected to at will by most devices on my network. I mostly have that now with DNS blocking, but once ad networks stop putting ads on separate domains, that's done.
Keep in mind that ad-blocking browser plugins aren't exactly secure either. They have access, not only to every network request, but every keystroke, mouse wiggle, etc. And all it takes to all fall down is for whoever is maintaining it to cash out and sell to a bad actor: you'll helpfully be automatically updated to the new, state-owned version.
gumby 3 years ago

The problem with browser plug ins is that they only work in browsers. I read most html or other "web pages" in programs other than browsers (mail client, RSS readers, Electron apps, etc)

sidpatil 3 years ago

Not to be confused with Privoxy: https://www.privoxy.org/

dredmorbius 3 years ago

My understanding is that Privoxy either cannot deal with SSL/TLS traffic, or deals very poorly with it.
The FAQ doesn't seem to discuss the issue at all, which is not a good sign:
https://www.privoxy.org/faq/index.html

hereme888 3 years ago

Some people seem to be saying that apps and devices bypass your DNS settings.

If I set NextDNS with DoT in my Android under the "private DNS" setting, and turn on the NextDNS setting with DNS rebinding protection, would the phone and some apps still find a way around it?

I also use NetGuard, but it's more cumbersome and doesn't allow DoT.

mhio 3 years ago

It's possible. Applications don't have to rely on the OS provided mechanisms to lookup names, or even rely on DNS to get an IP for something.
Chromium contains its own DNS resolver so connects directly to a DNS server rather than use the OS, but it would normally default to your OS settings (and only use DoH when they find a matching entry in their list of DoH providers).
Desktop Firefox is an example of an app that defaults to DoH from 1.1.1.1 (in some places).

bilekas 3 years ago

> Privaxy is also way more capable than DNS-based blockers as it is able to operate directly on URLs and to inject resources into web pages.

I'm not sure I understand why it would be more capable than a DNS blocker ?

If it's just because you can inject into the traffic that's comparing apples and oranges ? Or am I missing something ?

captn3m0 3 years ago

Let’s say a text based ad shows up in a div with the id “advert”.
A DNS based blocker will not be able to block it, but an extension or a proxy based blocker that looks at the HTML content will be able to block it.
So yeah, inject as well as modify the HTML directly.
It could do things like shimming advertising libraries as well as defanging them, potentially.
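As a toy version of the "div with the id advert" case, a content-aware proxy could drop that element while streaming the page through. This is only a sketch built on the standard library's `html.parser`, not how Privaxy or any extension actually does it. `AdStripper` and `strip_element` are hypothetical names, and the code assumes reasonably well-nested markup and silently drops comments and doctype declarations.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class AdStripper(HTMLParser):
    """Re-emit HTML while skipping the element with a given id and its children."""

    def __init__(self, blocked_id: str):
        super().__init__(convert_charrefs=False)  # keep entities as written
        self.blocked_id = blocked_id
        self.out: list[str] = []
        self.skip_depth = 0  # > 0 while inside the blocked element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth += 1
            return
        if dict(attrs).get("id") == self.blocked_id:
            if tag not in VOID_TAGS:
                self.skip_depth = 1
            return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the nesting depth.
        if not self.skip_depth and dict(attrs).get("id") != self.blocked_id:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID_TAGS:
                self.skip_depth -= 1
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")


def strip_element(html: str, blocked_id: str) -> str:
    parser = AdStripper(blocked_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


page = ('<body><p>Hello &amp; welcome</p>'
        '<div id="advert"><div><img src="a.png">Buy!</div></div>'
        '<p>Bye</p></body>')
cleaned = strip_element(page, "advert")
```

A DNS blocker never sees this markup at all, which is the gap the comment is pointing at.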
- sumtechguy 3 years ago
  
  To add to that. DNS block is basically 'built in' for this type of filtering as you can just make your filter strings your list of DNS sites. It does have the downside that not everything is http. That is where a real DNS filter comes into play with known malicious endpoints. So a combination is very nice to have.
- bilekas 3 years ago
  
  Okay, that makes a bit more sense now actually!
Septem9er 3 years ago

Simply because it isn't always enough to look at the domain to decide if it should be filtered (for serving ads or whatever). That's one reason why DNS blockers can filter less effectively than e.g. browser addons.
So yes, the reason is exactly as stated in the quote. It is more capable because it can operate on URLs and on the resources of the website directly.
ThePhysicist 3 years ago

Because you can modify HTML and other resources on the fly, i.e. you can remove tracker scripts before they would even be able to send stuff to a third party.

ThePhysicist 3 years ago

I really like this, built something similar in Golang a while ago (not open-source for various reasons). In general it's a good approach I think, you can also inject JS that can do additional stuff in the browser to suppress tracking/ads.

pkulak 3 years ago

What does it mean when:

"The service may not tolerate TLS interception."

I figured the proxy would be making the request entirely independently. How would an external entity even know the data was later being passed on?

mhio 3 years ago

TLS connections are tunnelled through proxies directly to the endpoint (HTTP CONNECT method) rather than the "client request to proxy" followed by "proxy request to endpoint" method of proxying.
This remote interception then involves turning a CONNECT back into the classic proxy connection. First a TLS session from your client to the proxy, then a TLS session from the proxy to the real endpoint.
The proxy needs to present itself to the client as valid for the real endpoint of the TLS connection. This is usually done by adding your own CA into the client's trust store so you can sign any certificates required for the client -> proxy half. As you note, the connection from proxy -> endpoint is normally the easy part of that as it works like a normal client.
Two examples of not "tolerating" that interception are certificate pinning and client certificates.
Certificate pinning - The client validates extra information about the presented certificate beyond CA trust. Usually the x509 SHA-256 digest presented to the client. In this case the external entity doesn't enforce anything, you could modify the client to work.
Client certificates - Client cert authentication includes verification of the server certificate, so the forged proxy certificate will not be valid for the client cert. They are a pair. This would require a forged client cert for client -> proxy. Then the real client cert for proxy -> endpoint half.
So it's more about convincing the client to tolerate the interception than the external endpoint.
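For the certificate pinning case, the check could look roughly like the following sketch. It hashes the whole DER-encoded certificate with SHA-256, as described above (some clients pin the public key instead). The function names and the byte strings are placeholders, not real certificates.

```python
import hashlib


def cert_fingerprint(der_cert: bytes) -> str:
    """SHA-256 digest of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned: set[str]) -> bool:
    """True if the presented certificate is one the client shipped a pin for."""
    return cert_fingerprint(der_cert) in pinned


# Placeholder bytes standing in for DER certificates.
origin_cert = b"certificate served by the real endpoint"
proxy_cert = b"certificate minted by the intercepting proxy's CA"

# The app ships with the origin's fingerprint baked in.
pins = {cert_fingerprint(origin_cert)}

direct_ok = matches_pin(origin_cert, pins)
intercepted_ok = matches_pin(proxy_cert, pins)
```

The proxy's certificate can chain to a CA the client trusts and still fail this comparison, which is why adding a root certificate alone does not get past a pinned app.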
- pkulak 3 years ago
  
  Interesting. Thanks!

idrock 3 years ago

I used to deploy privoxy everywhere and loved the ability to intercept and script just about everything... will def check it out.