A more privacy-friendly blog
Vincent Bernat
When I started this blog, I embraced some free services, like Disqus or Google Analytics. These services are quite invasive to users’ privacy. Over the years, I have tried to correct this to reach a point where I do not rely on any “privacy-hostile” services.
Analytics
- Before: Google Analytics
- After: nothing
Google Analytics is a ubiquitous way to get powerful analytics for free. It’s also a great way to provide data about your visitors to Google, also for free. There are self-hosted alternatives like Matomo, GoatCounter, or Plausible.
I opted for a simpler solution: no analytics. It also enables me to think that my blog attracts thousands of visitors every day.
Update (2019-02)
As for server-side logs, IP addresses are anonymized using ipscrub, a module for nginx. However, non-HTML assets are served through Amazon CloudFront.1
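To give an idea of what this kind of anonymization involves, here is a minimal Python sketch. It is not ipscrub's code: the function name, the salt handling, and the truncation length are assumptions made for illustration. The idea is to log a salted hash instead of the address, and to throw the salt away regularly so that hashes cannot be linked from one period to the next.

```python
import base64
import hashlib
import secrets


def scrub_ip(ip: str, salt: bytes, length: int = 6) -> str:
    """Return a short, salted hash standing in for an IP address."""
    digest = hashlib.sha256(salt + ip.encode("ascii")).digest()
    # Truncate: the log only needs to tell visitors apart within a period.
    return base64.b64encode(digest).decode("ascii")[:length]


# Hypothetical setup: a random salt kept in memory and replaced periodically.
salt = secrets.token_bytes(16)
print(scrub_ip("203.0.113.7", salt))
```

The same address gives the same value while the salt is unchanged, so requests can still be grouped per visitor in the logs for that period.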
Fonts
- Before: Google Fonts
- After: self-hosted
Google Fonts is a very popular font library and hosting service, which relies on the generic Google Privacy Policy. The google-webfonts-helper service makes it easy to self-host any font from Google Fonts. Moreover, with help from pyftsubset, I only include the characters used in this blog. The font files are lighter and more complete: no problem spelling “Antonín Dvořák.”
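A minimal sketch of how the character list could be built, assuming the pages are available as HTML strings. The helper names and the sample page are hypothetical. Only the `--unicodes` option is taken from pyftsubset itself, which accepts a comma-separated list of hexadecimal code points, optionally prefixed with `U+`.

```python
import html
import re


def used_codepoints(pages):
    """Collect the code points of every character found in the pages."""
    chars = set()
    for page in pages:
        # Crude tag stripping, good enough for a sketch.
        text = html.unescape(re.sub(r"<[^>]+>", "", page))
        chars.update(text)
    return sorted(ord(c) for c in chars)


def unicodes_option(codepoints):
    """Format code points for pyftsubset's --unicodes option."""
    return "--unicodes=" + ",".join(f"U+{cp:04X}" for cp in codepoints)


pages = ["<p>Anton\u00edn Dvo\u0159\u00e1k</p>"]  # hypothetical page content
print(unicodes_option(used_codepoints(pages)))
```

The resulting option can then be passed to pyftsubset along with the font file to subset.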
Videos
- Before: YouTube
- After: self-hosted
Some articles are supported by a video, like “OPL2LPT: an AdLib sound card for the parallel port.” In the past, I was using YouTube, mostly because it was the only free platform with an option to disable ads. Streaming on-demand videos is usually deemed quite difficult. For example, if you just use the <video> tag, you may push a video that is too big for people with a slow connection. However, it is not that hard: hls.js enables us to deliver video sliced in segments available at different bitrates. Users with JavaScript disabled still get a progressive version of medium quality.
In “Self-hosted videos with HLS,” I explain this approach in more detail.
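To illustrate the “different bitrates” part: with HLS, the available renditions are listed in a master playlist and the player picks one according to the bandwidth it observes. Here is a minimal sketch that generates such a playlist. The renditions, bitrates, and file names are made up for the example and are not the ones used on this blog.

```python
def master_playlist(variants):
    """Build an HLS master playlist listing one stream per bitrate."""
    lines = ["#EXTM3U"]
    for v in variants:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={v['bandwidth']},"
            f"RESOLUTION={v['width']}x{v['height']}"
        )
        # The line after the tag is the URI of that rendition's playlist.
        lines.append(v["uri"])
    return "\n".join(lines) + "\n"


# Hypothetical renditions; bitrates, sizes and file names are made up.
variants = [
    {"bandwidth": 800_000, "width": 640, "height": 360, "uri": "360p.m3u8"},
    {"bandwidth": 2_500_000, "width": 1280, "height": 720, "uri": "720p.m3u8"},
]
print(master_playlist(variants), end="")
```

Each referenced playlist would in turn list the video segments for that bitrate.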
Comments
Disqus is a popular comment solution for static websites. They were recently acquired by Zeta Global, a marketing company, and their business model is supported only by ads. On the technical side, Disqus also loads several hundred kilobytes of resources. Therefore, many websites load Disqus on demand. That’s what I did. This doesn’t solve the privacy problem and I had the feeling that people were less eager to leave a comment if they had to take an additional action.
Update (2019-01)
A year later, I can confirm the number of comments has significantly increased after removing this additional step. Between 2011 and 2015, the site harvested about 140 comments. In 2016, Disqus was no longer loaded automatically and the number of comments was halved. In 2018, after switching to Isso and automatic loading, there were 158 comments.
For some time, I thought about implementing my own comment system around Atom feeds. Each page would get its feed of comments. A piece of JavaScript would turn these feeds into HTML and comments could still be read without JavaScript, thanks to the default rendering provided by browsers. People could also subscribe to these feeds: no need for mail notifications! The feeds would be served as static files and updated on new comments by a small piece of server-side code. Again, this could work without JavaScript.
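The server-side part of this idea could look like the following sketch, which renders the comments of one page as an Atom feed with the standard library. Nothing like this was actually built: the function name, the comment fields (`id`, `author`, `date`, `text`), and the URLs are all hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"


def tag(name):
    """Qualify an element name with the Atom namespace."""
    return f"{{{ATOM}}}{name}"


def comments_feed(page_url, title, comments):
    """Render the comments of one page as an Atom feed (a string)."""
    ET.register_namespace("", ATOM)
    feed = ET.Element(tag("feed"))
    ET.SubElement(feed, tag("id")).text = page_url
    ET.SubElement(feed, tag("title")).text = title
    # Dates are assumed to be RFC 3339 strings in UTC, which sort as text.
    latest = max((c["date"] for c in comments), default="1970-01-01T00:00:00Z")
    ET.SubElement(feed, tag("updated")).text = latest
    for c in comments:
        entry = ET.SubElement(feed, tag("entry"))
        ET.SubElement(entry, tag("id")).text = f"{page_url}#comment-{c['id']}"
        ET.SubElement(entry, tag("title")).text = f"Comment by {c['author']}"
        ET.SubElement(entry, tag("updated")).text = c["date"]
        author = ET.SubElement(entry, tag("author"))
        ET.SubElement(author, tag("name")).text = c["author"]
        ET.SubElement(entry, tag("content"), type="text").text = c["text"]
    return ET.tostring(feed, encoding="unicode")


# Hypothetical comment on a hypothetical page.
comments = [
    {"id": 1, "author": "Alice", "date": "2019-01-02T10:00:00Z", "text": "Nice post!"},
]
print(comments_feed("https://example.org/post", "Comments", comments))
```

The output would be written to a static file next to the page and regenerated each time a comment is accepted.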
I still think this is a great idea. But I didn’t feel like developing and maintaining a new comment system. There are several self-hosted alternatives, notably Isso and Commento. Isso has a few more features, notably an imperfect import from Disqus. Both are struggling with maintenance and are trying to become sustainable with a paid hosted version.2 Commento is more privacy-friendly as it doesn’t use cookies at all. However, cookies from Isso are not essential and can be filtered with nginx:
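# Do not pass the cookies set by Isso to clients
proxy_hide_header Set-Cookie;
proxy_hide_header X-Set-Cookie;
# Do not let Set-Cookie influence how nginx handles the response (e.g. caching)
proxy_ignore_headers Set-Cookie;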
In Isso, there are currently no mail notifications, but I have added an Atom feed for each comment thread.
Update (2019-01)
Mail notifications were recently added and I have just enabled them here. As absolutely nobody ever used the Atom feeds, I have removed them.
Another option would have been to not provide comments anymore. However, I have received some great contributions as comments and I also think they can work as a kind of peer review for blog articles: they are a weak guarantee that the content is not wrong.
Search engine
- Before: Google Search
- After: DuckDuckGo
A way to provide a search engine for a personal blog is to provide a form for a public search engine, like Google. That’s what I did. I also slapped some JavaScript on top of that so that it did not look like Google.
The solution here is easy: switch to DuckDuckGo, which lets you customize the search experience a bit:
<form id="lf-search" action="https://duckduckgo.com/">
  <input type="hidden" name="kf" value="-1">
  <input type="hidden" name="kaf" value="1">
  <input type="hidden" name="k1" value="-1">
  <input type="hidden" name="sites" value="vincent.bernat.ch/en">
  <input type="submit" value="">
  <input type="text" name="q" value="" autocomplete="off" aria-label="Search">
</form>
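Since the form has no method attribute, the browser submits it as a GET request and the fields end up in the query string. As a rough illustration, the URL requested for a given query can be rebuilt as follows. The `search_url` helper is hypothetical, and the parameter names and values are simply copied from the form.

```python
from urllib.parse import urlencode


def search_url(query, site="vincent.bernat.ch/en"):
    """Build the URL requested when the search form above is submitted."""
    params = {
        "kf": "-1",
        "kaf": "1",
        "k1": "-1",
        "sites": site,
        "q": query,
    }
    return "https://duckduckgo.com/?" + urlencode(params)


print(search_url("hls video"))
```

No JavaScript is involved: the customization is carried entirely by the URL parameters.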
The JavaScript part is also removed as DuckDuckGo doesn’t provide an API. As it is unlikely that more than three people will use the search engine in a year, it seems a good idea not to spend too much time on this non-essential feature.
Update (2023-07)
As an alternative, Pagefind is a search engine tailored for static websites and relying on JavaScript. In my case, I don’t think this is worth the time and I will stick with DuckDuckGo.
Newsletter
- Before: RSS feed
- After: RSS feed but also a MailChimp newsletter
Nowadays, RSS feeds are far less popular than they were before. I am still baffled as to why a technical audience wouldn’t use RSS, but some readers prefer to receive updates by mail.
MailChimp is a common solution to send newsletters. It provides a simple integration with RSS feeds to trigger a mail each time new items are added to the feed. From a privacy point of view, MailChimp seems a good citizen: data collection is mainly limited to what is needed to operate the service. Privacy-conscious users can still avoid this service and use the RSS feed.
Update (2019-12)
I have removed the newsletter. There were not many subscribers (around 40) and I felt bad about advertising such a service. Instead, I have added links to RSS-to-email services.
Less JavaScript
- Before: third-party JavaScript code
- After: self-hosted JavaScript code
Many privacy-conscious people are disabling JavaScript or using extensions like uMatrix or NoScript. Except for comments, I was using JavaScript only for non-essential stuff:
- rendering mathematical content—like in “TLS computational DoS mitigation;”
- moving footnotes as sidenotes when the screen is large enough;3
- enhancing videos to use HLS—see “Self-hosted videos with HLS;” and
- enhancing photo galleries with a lightbox—see “Debian on ThinkPad Edge 11.”
For mathematical formulae, I have switched from MathJax to KaTeX. The latter is faster and enables server-side rendering: it produces the same output regardless of the browser. Therefore, the client-side JavaScript is not needed anymore.
For sidenotes, I have turned the JavaScript code doing the transformation into Python code, with pyquery. No more client-side JavaScript for this aspect either.
The remaining code is still here but is self-hosted.
Memento: CSP
The HTTP Content-Security-Policy header controls the resources that a user agent is allowed to load for a given page. It is a safeguard and a memento for the external resources a site will use. Mine is moderately complex and shows what to expect from a privacy point of view:
Content-Security-Policy:
  default-src 'self' blob:;
  script-src 'self' blob: d2pzklc15kok91.cloudfront.net 'sha256-Yv7kZY+BkmpZYTujeN0YNmI0uRKpS5CY7E4enn1TRL0=';
  style-src 'self' 'unsafe-inline' data: d2pzklc15kok91.cloudfront.net;
  font-src 'self' data: d2pzklc15kok91.cloudfront.net;
  object-src 'self' d2pzklc15kok91.cloudfront.net media.bernat.ch;
  img-src 'self' data: d2pzklc15kok91.cloudfront.net;
  frame-src d2pzklc15kok91.cloudfront.net media.bernat.ch;
  worker-src blob:;
  media-src 'self' blob: about: media.bernat.ch d2pzklc15kok91.cloudfront.net;
  connect-src 'self' media.bernat.ch comments.luffy.cx;
  base-uri 'none';
  frame-ancestors 'none';
  form-action duckduckgo.com;
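Reading such a header by eye gets tedious. As a small aid, here is a sketch that splits a policy into directives and lists the external hosts it allows. Both helper functions are hypothetical, the parsing is deliberately naive, and the sample is a shortened version of the policy above.

```python
def parse_csp(header_value):
    """Split a Content-Security-Policy value into {directive: [sources]}."""
    policy = {}
    for part in header_value.split(";"):
        tokens = part.split()
        if tokens:
            # First token is the directive name, the rest are its sources.
            policy[tokens[0]] = tokens[1:]
    return policy


def external_hosts(policy):
    """Return the host names a policy allows, ignoring keywords and schemes."""
    hosts = set()
    for sources in policy.values():
        for source in sources:
            # Keywords are quoted ('self', 'none', hashes); schemes end with ":".
            if not source.startswith("'") and not source.endswith(":"):
                hosts.add(source)
    return sorted(hosts)


# A shortened version of the policy above.
csp = (
    "default-src 'self' blob:; "
    "img-src 'self' data: d2pzklc15kok91.cloudfront.net; "
    "connect-src 'self' media.bernat.ch comments.luffy.cx; "
    "form-action duckduckgo.com;"
)
print(external_hosts(parse_csp(csp)))
```

Run against the full policy, the list of hosts is the privacy-relevant part: everything else is served from the site itself.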
I am quite happy to have reached this result. 😊
1. I don’t have an issue with using a CDN like CloudFront: it is a paid service and Amazon is not in the business of tracking users.
2. For Isso, look at comment.sh. For Commento, look at commento.io.
3. You may have noticed I am a footnote sicko and use them all the time for pointless stuff.