Anonip is a tool to anonymize IP addresses in log files.
It masks the last bits of IPv4 and IPv6 addresses. That way most of the relevant information is preserved, while the IP address no longer identifies a particular individual.
The log entries are piped directly from Apache to Anonip, so the unmasked IP addresses are never written to any file.
With the help of cat, it is also possible to rewrite existing log files.
For bug reports and feedback: anonip@privacyfoundation.ch (PGP key)
Current version: anonip 0.5 (11/9/2014)
New version from Digitale Gesellschaft: anonip_1.0 (20.4.2019)
-h|--help            help
-d|--debug           enable debug output
--ipv4mask n         mask the last n bits of IPv4 addresses
                     (Default: 12)
--ipv6mask n         mask the last n bits of IPv6 addresses
                     (Default: 84)
--increment n        increment the IP addresses by n (Default: 0)
--output file        file the logs are written to (Default: STDOUT)
--column n[,n,...]   column(s) containing the IP address
                     (Default: 1)
--replace str        replacement string used if the IP address is not recognized
                     (Default: empty, the column will not be changed;
                     Example: 0.0.0.0)
--user user          change the user ID
--group group        change the group ID
--umask n            set the umask
In the Apache configuration (or that of the vhost), the log output needs to
be piped to anonip:
CustomLog "|/path/to/anonip.py [OPTIONS] --output /path/to/log" combined
That's it! All IP addresses in the log will now be masked.
Alternative:
cat /path/to/orig_log | /path/to/anonip.py [OPTIONS] --output /path/to/log
At a time when mass data collection by certain companies and organisations is becoming more and more obvious, it is crucial to realize that we, too, maintain unnecessarily large data collections.
Take the admins of web servers, for example. The log files contain the IP addresses of all visitors in cleartext, and all of a sudden we possess a huge collection of personal data.
Anonip tries to avoid exactly that, without losing the undisputed benefit of those log files.
By masking the last bits of the IP addresses, we are still able to distinguish them to a certain degree. Compared with removing the IP addresses entirely, we can still do rough geolocation as well as a reverse DNS lookup. But the otherwise distinct IP addresses no longer identify a particular individual.
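
To get a feeling for how many addresses collapse into one masked value, the following Python sketch may help. It is only an illustration built on the standard ipaddress module, not part of Anonip; the function name masked_network is hypothetical, and the default mask sizes of 12 and 84 bits are taken from the option list.

```python
import ipaddress


def masked_network(address, ipv4_bits=12, ipv6_bits=84):
    """Return the network of all addresses sharing the same masked value."""
    ip = ipaddress.ip_address(address)
    bits = ipv4_bits if ip.version == 4 else ipv6_bits
    prefix_len = ip.max_prefixlen - bits  # 32 - 12 = 20, 128 - 84 = 44
    # strict=False lets ip_network zero the host bits for us
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


net = masked_network("192.168.100.200")
print(net.network_address)  # 192.168.96.0
print(net.num_addresses)    # 4096
```

With 12 masked bits, 4096 different IPv4 addresses end up with the same logged value, while the remaining 20-bit prefix is still usually enough for a rough geolocation.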