Why regex? Why AdGuard Home instead of Pi Hole?¶
What's going on?¶
The plan in 2021 is to evolve the project so that it relies more heavily on regex blocklists, which are much "cleaner," easier to maintain, and more effective than domain-only blocklists.
This is for a number of reasons. The major one is that you don't need a new rule for every new subdomain you want to block. Another is that you can block tracking subdomains across the board without a separate rule for every apex domain.
So you can block "ads.example.com" no matter what "example.com" actually is. If you go on a new site none of us have seen before that serves ads from an "ads." subdomain - it's automatically blocked. If a newly registered domain we've yet to encounter has a hidden third party tracker on the subdomain "pixel." it will be blocked. And so on.
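As a rough illustration of the idea above, here is a minimal Python sketch of prefix-style regex blocking. The two patterns and the `is_blocked` helper are hypothetical examples written for this page, not rules taken from the project's lists. In an AdGuard Home filter list a regex rule is written between slashes, so the first pattern would look like `/^ads\./`.

```python
import re

# Hypothetical patterns for illustration only; not the project's real rules.
BLOCK_PATTERNS = [
    re.compile(r"^ads\."),    # any hostname whose first label is "ads"
    re.compile(r"^pixel\."),  # any hostname whose first label is "pixel"
]


def is_blocked(hostname: str) -> bool:
    """Return True if any block pattern matches the queried hostname."""
    return any(p.search(hostname) for p in BLOCK_PATTERNS)


if __name__ == "__main__":
    # The apex domain does not matter: only the leading label is checked.
    for host in ["ads.example.com", "pixel.brand-new-site.net", "www.example.com"]:
        print(host, "blocked" if is_blocked(host) else "allowed")
```

Two short patterns cover every site that uses those subdomain names, including sites nobody has added to a list yet.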
It also makes things easier for us to manage. Instead of lists with tens of thousands of domains, we can achieve the same goal, but better, with just a few lines. For example, all of Facebook can be blocked in about 10 lines of regex, compared with almost 2,000 lines of individual subdomains. We and the community can see what's already blocked and submit updates far more easily this way, which makes it much easier to remove inactive domains, fix rules that break legitimate sites, and add new ad and tracker domains.
And finally, it's easier on your hardware. If you're running this on a low-power single-board computer such as a Raspberry Pi, you will see quicker domain resolution when that device doesn't have to go through a combined list of tens of thousands of domains.
The old way still has some benefits¶
With all that having been said, we will not phase out pure domain lists entirely.
The primary advantage of a domain blocklist is that it can be more fine grained than a regex blocklist. A regex blocklist is effectively a script that says "block anything matching this pattern." When many ads and trackers online follow identical domain patterns this is great for us.
But what if, as is the case with many smart TVs, you have to be careful to block only very specific domains, because blocking anything more prevents legitimate features from working?
We also can't catch every tracker with regex, because some sites, Netflix for example, use unique subdomains to serve ads and trackers.
Finally, it is possible for regex to cause false positives. It is rare so long as the regex isn't too broad about what it's trying to block, but it is still possible. That said, false positives also happen with domain lists because human error is always going to get in the way.
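To make "too broad" concrete, here is a small Python sketch comparing an unanchored pattern with an anchored one. Both patterns and the `matches` helper are hypothetical examples, not rules from the project's lists.

```python
import re

# Hypothetical patterns for illustration only.
too_broad = re.compile(r"ads\.")  # matches "ads." anywhere in the hostname
anchored = re.compile(r"^ads\.")  # matches only when the first label is exactly "ads"


def matches(pattern: re.Pattern, hostname: str) -> bool:
    """Return True if the pattern matches anywhere in the hostname."""
    return pattern.search(hostname) is not None


if __name__ == "__main__":
    host = "downloads.example.com"  # an innocent hostname that ends its first label in "ads"
    print("too broad:", matches(too_broad, host))
    print("anchored: ", matches(anchored, host))
```

The unanchored pattern catches "downloads.example.com" because "downloads." ends in "ads.", while the anchored version leaves it alone and still blocks "ads.example.com".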
But the other side of the coin is that false positives are actually much easier to manage with regex. It is humanly impossible to read every single domain in your collection of blocklists and find all the false positives. You can create your own local whitelist as you notice them, and if you're nice enough to make a PR on GitHub explaining the false positive, it'll get taken off that one list. But it could pop up on another, and many lists share the same false positives.
With regex we can create lists of "innocent" domains, also using regex, and this whitelist can be added and auto-updated just like a blocklist. Such lists overrule all blocklists. So by maintaining a simple list of false positives and distributing that first before the regex blocklists, not only do we minimise false positives, but every false positive we add to the whitelist also protects everyone who uses it from false positives across all blocklists.
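The "whitelist overrules blocklist" behaviour can be sketched in Python as a simple order of checks. This is only a model of the precedence described above, not AdGuard Home's actual implementation, and the hostnames, patterns, and `decide` helper are all hypothetical.

```python
import re

# Hypothetical lists for illustration only.
ALLOW_PATTERNS = [
    re.compile(r"^ads\.legit-shop\.example$"),  # a known false positive
]
BLOCK_PATTERNS = [
    re.compile(r"^ads\."),  # broad rule from some blocklist
]


def decide(hostname: str) -> str:
    """Check the whitelist first so it overrules every blocklist."""
    if any(p.search(hostname) for p in ALLOW_PATTERNS):
        return "allow"
    if any(p.search(hostname) for p in BLOCK_PATTERNS):
        return "block"
    return "allow"  # no rule matched, so the query resolves normally


if __name__ == "__main__":
    for host in ["ads.legit-shop.example", "ads.other.example", "www.other.example"]:
        print(host, decide(host))
```

Because the whitelist is consulted first, one whitelist entry protects the hostname no matter how many blocklists would otherwise match it.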
So can't I just do this on my Pi Hole?¶
Sadly, at the time of writing, no. You can use a Python script to add regex to Pi Hole, but this is an unsupported, hacky method that could break in any future update and is not user friendly for less technical individuals.
We no longer recommend Pi Hole and instead recommend AdGuard Home. Much like Pi Hole, it is an open source DNS sinkhole that can run on a Raspberry Pi. It has a more polished interface and, more importantly, it supports regex blocklists and whitelists.
It also has built-in support for encrypted upstream DNS, which Pi Hole does not. With Pi Hole you must again chain it to third party software in a hacky way to make up for this lack of functionality, whereas AdGuard Home has support for encrypted DNS protocols such as DNS-over-HTTPS and DNS-over-TLS integrated into the GUI.
Installing is easy. If you want the simplest method, boot up your Pi with Ubuntu Server 20.04.2 LTS and type the following into the terminal:
sudo snap install adguard-home
It'll install itself in a couple of minutes. After that, get the IP of your Pi, as with Pi Hole, and type it into your browser along with the port 3000. For instance, if your Pi is at 192.168.1.32 you would enter "192.168.1.32:3000" into your browser. Follow the simple instructions (I recommend keeping all ports at the default settings), set a username and password, and you are in. After initial setup you no longer need to add ":3000". Simply visit that IP from any device on your local network and you will be in the control panel.
The main downside I have noticed in terms of ease of setup is that, unlike Pi Hole, it does not set up a static IP for the Pi. You will need to make sure the Pi's IP is static yourself. This is normally very easy once you're in the router settings, and you are already changing those to set the DNS server to the internal IP of your Pi (in our example above, 192.168.1.32).
There is a more detailed tutorial coming soon.
P.S. If you are more technically inclined you can also install AdGuard Home as a Docker image or build it from source.