Intelligent Tool for Domain Blacklisting


Digital defense for the Internet-connected computer. Blacklisting domains by loopback entries in the 'hosts' file is an effective weapon against unwanted content and threats. This utility assists with that good practice.



Introduction

The hosts configuration file is the original way for an Internet-capable device to resolve a host name (server, "website") to a technical IP address. However, once the "web" had evolved into the giant omniscient garbage dump we know today, IP search by way of a local telephone book became a little cumbersome... and was therefore supplemented by the so-called DNS service.

Supplemented, but not replaced. On a reasonably configured machine, local address resolution has priority over DNS. Before a query is sent out to external DNS server(s), there is always a search through the local hosts file. If the domain in question is found in the hosts file, the system uses the IP stated there, and no further request is made to an external name server. This virtuous standard behaviour enables some interesting applications:
Here, we will focus mainly on the last point. By way of a so-called loopback entry in the hosts file, connections to malicious servers of a known name can be blocked effectively (aka 'domain blacklisting'). Even better, the lock applies to all programmes on the local machine, not only the browser, and cannot be circumvented on a non-corrupted system without elevated rights. A win-win-win situation for competent users:
Although it would also be possible to set up a blacklist in many routers, this turns out to be quite impractical if we want to set up or remove locks more flexibly. Router firewalls provide good protection against incoming threats by closing ports and applying filter rules. The hosts file, on the other hand, can block unwanted connections to servers in the wild, and it is a simple text file located on the endpoint device, which can be maintained using standard tools.
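
The lookup order described above can be illustrated with a small Python sketch. It is not how an operating system resolver is actually implemented; `parse_hosts` and `resolve` are hypothetical helper names, the DNS step is represented by a caller-supplied function, and "first entry wins" is a simplification made for this example.

```python
from typing import Callable, Dict, Optional


def parse_hosts(text: str) -> Dict[str, str]:
    """Map each host name in hosts-file text to its IP (first entry wins here)."""
    table: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and padding
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        for name in names:
            table.setdefault(name.lower(), ip)
    return table


def resolve(name: str, hosts: Dict[str, str],
            dns_lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Look in the local table first; ask DNS only if nothing is found."""
    local = hosts.get(name.lower())
    if local is not None:
        return local            # no external request is made
    return dns_lookup(name)
```

With a loopback entry such as `127.0.0.1 ads.example.com` in the table, `resolve` returns `127.0.0.1` and the DNS function is never called, which is exactly the blocking effect.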

If you prefer a mix of blocking lists and self-created domain blocks, the usual text editor is not very helpful in the long run. It cannot check lines and does not help with compact formatting. A hosts file full of duplicates, outdated entries and superfluous whitespace not only becomes more and more confusing; its access times will also degrade.
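
To make the clean-up task concrete, here is a minimal Python sketch that removes duplicates and surplus whitespace from hosts-file text. It is only an illustration, not the code of hostsupdate; the function name `tidy_hosts` is hypothetical, and the sketch simply discards comments and malformed lines, which a real tool would probably want to preserve or report.

```python
def tidy_hosts(text: str) -> str:
    """Return hosts-file text with one 'IP name' pair per line, no duplicates."""
    seen = set()
    out = []
    for line in text.splitlines():
        body = line.split("#", 1)[0].split()   # strip comment, split on any whitespace
        if len(body) < 2:
            continue                           # blank, comment-only or malformed line
        ip, names = body[0], body[1:]
        for name in names:
            key = (ip, name.lower())           # host names are case-insensitive
            if key in seen:
                continue                       # duplicate entry
            seen.add(key)
            out.append(f"{ip} {name.lower()}")
    return "\n".join(out) + "\n"
```

Keying on the pair of IP and name (rather than the name alone) keeps legitimate double entries such as an IPv4 and an IPv6 line for `localhost`.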

While various tools claim to help with managing the hosts file, it seems that most of them come packed with fussy features, an opaque code base, insane dependencies, and sometimes quite suspicious background activities. Clearly more risk than fun...!

That's why, several years ago, I wrote a little programme to address the most annoying maintenance tasks regarding hosts file blacklisting.
The current version can do:



Details

The command line utility hostsupdate gets by with a minimum of interaction, since it uses the system clipboard to accept domain names or IPs for blacklisting. The programme prints brief status messages to the console screen.
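
Text copied to the clipboard is rarely a clean host name; often it is a full URL from the address bar. The following Python sketch shows one way such input could be reduced to a bare host name or IP. It is an assumption about the kind of normalisation needed, not a description of what hostsupdate does internally, and `host_from_text` is a hypothetical name.

```python
from urllib.parse import urlsplit


def host_from_text(text: str) -> str:
    """Extract a bare host name (or IP) from a pasted URL or name."""
    text = text.strip()
    if "://" not in text and not text.startswith("//"):
        text = "//" + text        # let urlsplit treat the start as a network location
    host = urlsplit(text).hostname  # lower-cased, without port, path or query
    return host or ""
```

For example, `host_from_text("https://Ads.Example.com:8443/banner?id=1")` yields `ads.example.com`, which is the form a hosts file entry needs.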

Functional tenets:

System requirements

Storage: The program hostsupdate is a compact non-GUI executable for command line/console use.
It is active only on demand and does not plant anything in any "registry". When running, the programme occupies a few megabytes of RAM, since the whole contents of the hosts file are buffered in memory for fast search-and-modify operations.
When not in use, the programme will not consume anything other than disk space. In particular, there is no background process to monitor the hosts file.

Windows PC: The console executable only depends on a few standard APIs mainly to access the Windows Clipboard. Of course, it does not depend on ".NET" or other bloated frameworks. I usually release Windows executables for 32 bits, which can run in compatibility mode under current 64 bit Windows. Special support for newer versions of Windows is not planned.

Linux PC: The executable is compiled as a standalone (static) console program for 32 bit Linux for maximum compatibility. Of course, this comparatively simple programme could be built from the sources for 64 bit as well, but this brings no significant benefit for this application.
Note: The Linux variant uses the external utility xsel to access the clipboard. If it does not already exist on the system, it can be installed from standard repositories at any time (sudo apt-get install xsel).
For the DNS Check feature, hostsupdate needs access to the nslookup network tool. This should be available on virtually any Linux or Windows machine. No other dependencies!

Compiling: Current hostsupdate is written in Standard C (C99) using only free software tools (Geany with gcc 4.x and MinGW-32 on Windows). The code has become even more robust, faster and, above all, more platform-independent. I always have an open ear for bug reports and suggestions.

Installation

Actually there is nothing to 'install' with hostsupdate. No installer app, no registry crap. Just extract the executable to a folder in user space, then grant execution rights and write permission (for the backup files) within that folder.




Application

The hostsupdate tool is supposed to be largely self-explanatory. Only some basic notes:



Experience (and experiments...)

So, what should be blacklisted?

Everything that bothers you, I would say. That's a personal decision. I would not dare to prescribe to anyone how to live their digital lives. The strategies and methods described below have proven quite effective for me on personal computers, for many years now, along with general measures of online security.
For some of you, these instructions may seem too trivial; others may find them quite "hacking". Well, I just share some knowledge in the hope that it could be useful.


Example 1: Use the browser address bar

A formerly "hot" commercial, download or publishing page has turned into a pain in the ass with repeated messages such as "turn off adblocker", "would you idiot please activate javascript", and the usual whinging about limited functionality, acceptance of terms of use, starvation of managers, etc.

Generally, one should not give in to such blackmail attempts and should leave the site alone. Enabling all the script stuff will make the page work for a while, but seconds later you might be bothered by unsolicited info about online casinos, sports cars or pharmaceutical stuff.
Hey, I have nothing against target group oriented marketing! I always laugh up my sleeve every time that advertising appears to completely misjudge my needs. It demonstrates how little "they" actually know. It is a confirmation that all those regular measures against cookies, tracking and fingerprinting, as well as the random spread of well-dosed disinformation, are not completely ineffective!

But the fun is over the moment aggressive scripts try to steal bandwidth and attention, kick processor load through the roof, waste my lifetime and pose potential security risks. This no longer has anything to do with a "fair advertising deal" and cleanly programmed page functionality. This is a pest that must be repelled!
So, the next time unsolicited windows or tabs open, let's have a look at the corresponding address line. Usually, the advertising junk is delivered by third-party servers which can be easily distinguished from the calling site. We may block these domains without losing functionality on the site of interest. Just drag the questionable server name to the blacklist! With hostsupdate, this takes just a few mouse clicks.




Example 2: HTTP protocols

A hot Add-on for the Firefox browser was once "Live HTTP headers". It provided a detailed real-time listing of all the individual server requests made in the course of "regular browsing", including quite useful information. For example, if the site operator has forgotten to provide a download button for some interesting media content on the site...
And the protocols show clearly that most commercial sites, pretending to be "user-friendly", immediately generate dozens of redirects to advertising, tracking, pixel, click and content servers. Even if we haven't navigated very far on the site and certainly did not authorise all that stuff! Except for the content servers (which are often cloud services that provide multimedia content, identifiable by the "cdn" in the domain name) and CAs (certificate servers for https connections), the vast majority of the third parties involved work to our clear disadvantage!

Unfortunately, the 'Mozilla Firefox' developers have made a lot of wrong choices, and moral depravity seems to have transformed this once-promising browser into a resource-hungry and privacy-mocking monster.
Many Add-ons which security-conscious surfers had relied on for years no longer worked from FF version 55 onwards, and new versions of those Add-ons have become next to unusable due to borderline "improvements" to the user interface (sad example: "NoScript").

For the time being, we may resort to the built-in 'browser console' (Ctrl-Shift-J). It slows the whole shit down, but also delivers comprehensive info on conspicuous HTTP requests. Some of them should be blacklisted. Anyway, take a closer look.




Example 3: Keep "Google" under control (sort of)

Sometimes, this Google strikes me as the ideal combination of Scientology and some shady police organisation... On the one hand, we always hear the almost religious sermon of salvation by modern IT, AI, Big Data, blah-blah... On the other hand, there is ruthless infiltration of almost all areas of economy and society by their agents. The fascination of technology and totalitarianism persists. Mr Google has the fattest data centers in the civil sector, posing as the jack-of-all-trades. Best search engine with super objective results; soft-washed video portal with most advanced censorship mechanisms; email with direct NSA support; drones; "autonomous" vehicles; up-to-date surveillance; military-industrial complex...
Not to forget all these attractive offers for the mercenary wannabe web designers and webmasters! Push your patched-crap site with some glamour, comfort and commercial clicks. Fuck your visitors right in the ass without asking by integrating Google AdSense, Analytics, Maps, Fonts, DoubleClick, etc. The market leader knows best how to boost lame content, and how to harass, track and manipulate visitors every now and then. Moreover, lots of pointless but remunerative traffic is generated. That's the web of today. Nobody seems to care anymore about data economy, energy consumption, protection of climate and resources... or at least (giggle): what about interesting content and attractive offers...?

Well, you might be one of the few who have neither a hollow head nor a heart of stone. Who still have "something to hide" (such as privacy, secrets, dignity...). The good news is that most of this impertinent spying on web users can be sabotaged and restricted by technical countermeasures, at least to some extent!

For example, the Google challenge. The following list contains some servers operated by the Google syndicate (as of 11/2020), which are usually called without explicit user consent from so-called "partner pages". Even a strict browser configuration is next to pointless as soon as JavaScript kicks in...
However, the hosts file blacklisting won't fail. Feel free to copy the lines below to the clipboard and adopt them by way of hostsupdate. But beware: most likely, some things won't work anymore. That's the point where we will, or will not, have to compromise. It's your decision!

127.0.0.1 adwords.google.com
127.0.0.1 googlesyndication.com
127.0.0.1 pagead.googlesyndication.com
127.0.0.1 pagead2.googlesyndication.com
127.0.0.1 www.google-analytics.com
127.0.0.1 www.googleadservices.com
127.0.0.1 googleadservices.com
127.0.0.1 services.google.com
127.0.0.1 adservices.google.com
127.0.0.1 adservice.google.com
127.0.0.1 adservice.google.de
127.0.0.1 partner.googleadservices.com
127.0.0.1 google.begin2search.com
127.0.0.1 safebrowsing.clients.google.com
127.0.0.1 googleads.g.doubleclick.net
127.0.0.1 safebrowsing-cache.google.com
127.0.0.1 themes.googleusercontent.com
127.0.0.1 imageads.googleadservices.com
127.0.0.1 imageads1.googleadservices.com
127.0.0.1 imageads2.googleadservices.com
127.0.0.1 imageads3.googleadservices.com
127.0.0.1 imageads4.googleadservices.com
127.0.0.1 imageads5.googleadservices.com

127.0.0.1 imageads6.googleadservices.com
127.0.0.1 imageads7.googleadservices.com
127.0.0.1 imageads8.googleadservices.com
127.0.0.1 imageads9.googleadservices.com
127.0.0.1 pubads.g.doubleclick.net
127.0.0.1 securepubads.g.doubleclick.net
127.0.0.1 google-analytics.com
127.0.0.1 clientmetrics-pa.googleapis.com
127.0.0.1 ssl.google-analytics.com
127.0.0.1 maps.googleapis.com
127.0.0.1 video-stats.video.google.com
127.0.0.1 afs.googleadservices.com
127.0.0.1 www.appliedsemantics.com
127.0.0.1 sb-ssl.google.com
127.0.0.1 google.tucows.com
127.0.0.1 apis.google.com
127.0.0.1 ajax.googleapis.com
127.0.0.1 fonts.gstatic.com
127.0.0.1 fonts.googleapis.com
127.0.0.1 www.googletagmanager.com
127.0.0.1 tpc.googlesyndication.com
127.0.0.1 googletagmanager.com
127.0.0.1 www.googletagservices.com
127.0.0.1 googletagservices.com
127.0.0.1 clients1.google.com
127.0.0.1 plus.google.com
127.0.0.1 cse.google.com

127.0.0.1 consent.google.com
127.0.0.1 analytics.google.com
127.0.0.1 smartlock.google.com
127.0.0.1 ade.googlesyndication.com
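
Adopting such a list means adding only those names that are not yet present in the hosts file. The Python sketch below illustrates that merge step under simple assumptions: both inputs consist of "IP name" lines as shown above, and every new entry is written with the loopback address. It is not the code of hostsupdate, and `merge_blocklist` is a hypothetical name.

```python
def merge_blocklist(existing: str, candidates: str, ip: str = "127.0.0.1") -> str:
    """Append loopback entries for candidate names not yet in `existing`."""

    def names(text):
        # Yield every host name from "IP name [name ...]" lines, ignoring comments.
        for line in text.splitlines():
            fields = line.split("#", 1)[0].split()
            for name in fields[1:]:
                yield name.lower()

    known = set(names(existing))
    added = []
    for name in names(candidates):
        if name not in known:          # skip names already blocked or repeated
            known.add(name)
            added.append(f"{ip} {name}")
    if not added:
        return existing                # nothing new, leave the file untouched
    sep = "" if not existing or existing.endswith("\n") else "\n"
    return existing + sep + "\n".join(added) + "\n"
```

Because names are tracked in a set, a list that repeats a server several times still produces only one entry per name.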

Example 4: Keep bitchy software under control

If we suspect that a certain program is repeatedly "calling home" without our consent, we should find out which servers it tries to connect to. Even on Windows, this can be done with the help of a standard network monitoring tool like netstat.
The same principle applies as with the threats mentioned above. Once we know the domain name of an unwanted contact, or the name behind its IP (the hosts file only maps names), we can block it by a dedicated loopback entry in the hosts file. It's always worth a try. Working with such a tamed software product might become a little bumpy, yet much more private.
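
As a rough illustration of the netstat approach, the following Python sketch picks the remote IPv4 addresses out of numeric netstat output (for example from `netstat -n`). It assumes English output in which the local address precedes the foreign address and the state reads ESTABLISHED; localized systems and IPv6 connections would need extra handling. The function name `remote_addresses` is hypothetical.

```python
import re

# Matches an IPv4 address followed by ":port", e.g. "203.0.113.7:443".
_ENDPOINT = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d+)\b")


def remote_addresses(netstat_output: str) -> list:
    """Collect the remote IPv4 addresses of established connections."""
    remotes = []
    for line in netstat_output.splitlines():
        if "ESTABLISHED" not in line:
            continue
        endpoints = _ENDPOINT.findall(line)
        if len(endpoints) < 2:
            continue                       # IPv6 or unexpected layout
        ip = endpoints[1][0]               # second endpoint is the remote side
        if not ip.startswith("127.") and ip not in remotes:
            remotes.append(ip)             # ignore loopback traffic and repeats
    return remotes
```

The addresses found this way still have to be traced back to host names before a loopback entry can take effect, since the hosts file works on names.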




License

The hostsupdate utility programme is published under the MIT License.

My motivation is to publish projects like this as a contribution to, and a proposal for, a more humane and transparent technology.

Feedback and/or donations are very welcome...!



Download





11/2020