
The Beginner’s Guide To Online Privacy

Published Feb 05, 2018. Last updated Aug 04, 2018.
We are living in remarkable times. We can take pictures of places and people we like by pressing a button on our phones; shop from our homes for literally anything from needles to cars; reach hundreds of thousands of people through social and blogging platforms, and consume information on any topic in any volume at any time of the day.

To a person from 30 years ago, this might sound very futuristic. For us it’s just how things are. Common sense.

But all that comes at a price. And that price is our privacy.

Why stay private?

“I am a law-abiding citizen, I have nothing to hide.”

Or this one:

“Why hide in a globally connected world?”

Many people think this way, which is understandable and is absolutely normal. We expect that some companies know a lot of info about us (mainly because we provide it to them ourselves), but it might be a huge surprise that other companies collect far more sensitive information about us that we might not want to share with anyone.

To make matters worse, advances in Artificial Intelligence in recent years enable companies to find very interesting patterns and create fine-grained physiological and psychological profiles of people based on their online behavior. There was a case in 2012 when a company knew a girl was pregnant even before her family did. Now imagine what can be done with AI and lots of data about people today.

Hopefully, by the end of this article, you’ll reconsider your thoughts about online privacy. Before moving on, take a moment and consider how you would feel if you had to share the following information about yourself with a group of 200 strangers:

  • where you are located (geographically)
  • when you surf the Internet and for how long
  • which sites you visit each day
  • what illnesses (if any) you have that you searched online
  • what types of products you buy online
  • what devices you use to connect to the Internet
  • what type of content you prefer to read
  • what type of food you prefer to eat
  • what your political views are

The list can continue, but let’s stop here. You probably wouldn’t share all this information with your friends, not to mention strangers.

However, the reality is that today many people are already sharing such data about themselves, unwillingly and unknowingly, with “strangers” in companies that collect this data to benefit from it.

Your “personal anonymous profile”

Even if the majority of those companies who collect all that data about you do not know your real name, it’s not that important for them. It’s not your name that interests them, but rather your behavior and preferences. If they don’t have your name, they’ll just label you with an ID in their system.

However, some companies do know your name and even your social security number, even though you didn’t explicitly share it with them.

The paradox is that we “share” most of that data, in our ignorance about what type of information is easily obtainable about us when we navigate the Internet.

There is so much to privacy that I’m afraid it’s impossible to fully protect ourselves on the Internet from the eyes of amoral corporations, but we can minimize this risk. I invite you to find out how this can be done.

The Pyramid of Privacy

I would like to visually demonstrate what can protect your privacy and how effectively it can do that.

In order of significance, from bottom to top:

1. Operating System


Without a solid foundation, you won’t be able to build anything useful. It turns out that even the choice of the operating system that people use can pose a risk to their privacy.

The Risk

If you are a Windows 10 user, then I have some bad news for you, because:

Solutions

One possible solution here is to switch to another operating system like Linux or macOS. Using macOS means buying a Mac, but you can install a Linux distribution of your choice on almost any computer.

And in case you have heard scary tales about Linux, just check it out yourself. Here you can find a list of the most popular distributions, see what they look like, and download and install them. Or, in case you don’t know where to start, just go with Ubuntu.

Still don’t want to switch from Windows? Then check out W10Privacy — a tool to help you disable some tracking settings in Windows.

2. Networking Layer


Now, once you at least have a chance to be anonymous and not have a unique ID stuck to your computer that you can’t get rid of, let’s talk about connecting to the Internet.

Have you ever thought about how the Internet works? The navigation process is complex, but at the same time it reflects the power of engineering. However, I won’t dive right now into the internals of how it works, but will focus on privacy-related topics that you must have heard about before: IP and VPN.

The Risk

As in the real world, each device connected to the Internet communicates through an address, the IP address, which is visible to any site you visit. Therefore, no matter what you do to hide your data and preferences, you can still be identified by the address through which your computer is connected to the Internet.

That’s exactly why you see ads in your native language from the country you live in, even if you navigate to a foreign website.

That’s also the method by which some sites restrict access to visitors from specific countries. Here you can see where your IP address points on the world map.

Solutions

  1. Virtual Private Networks (VPNs)
  2. WebRTC IP Leak Test

Let’s discuss them one by one.

1. Virtual Private Networks

You can’t just hide your IP address, as you won’t be able to navigate the Internet. However, you can pretend you have a different IP address than your real one. This is where the Virtual Private Networks come into play.

“A virtual private network (VPN) extends a private network across a public network, and enables users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network.” (Source: Wikipedia)

There are more than 150 VPN service providers available worldwide and choosing the right one may be tough, as each provider has their own features and limitations.

There are, however, a few critical things to take into consideration when choosing one and, surprisingly, they relate to some “eyes.”

Five Eyes, Nine Eyes, Fourteen Eyes

All these are global alliances with the goal of mass surveillance. They cooperatively collect, analyze and share data about citizens from different parts of the world. This started after World War II, and now countries spy on each other’s citizens and share intelligence on people’s online activity, received/sent emails, Facebook posts and more.

The countries that make up these groups are:

Five Eyes:

  1. Australia
  2. Canada
  3. New Zealand
  4. United Kingdom
  5. United States

Nine Eyes (all of the above plus):

  6. Denmark
  7. France
  8. Netherlands
  9. Norway

Fourteen Eyes (all of the above plus):

  10. Belgium
  11. Germany
  12. Italy
  13. Spain
  14. Sweden

To keep it short, choosing a VPN provider based in one of these countries does not guarantee you privacy, as some entities (such as the NSA) from the same or even different countries can force VPN providers (and basically any online service) to hand over their data.

There is a nice list of over 150 VPN providers with all their features and limitations on thatoneprivacysite.net. Take some time to read and analyze what VPN fits you best. Then I would recommend that you use it for 1 month before buying a long-term subscription to see how it goes.

2. WebRTC IP Leak Test (even with VPN you may be visible)

Hold on! Even behind a VPN and with an encrypted DNS service you may still leak your IP address. And why should things be so complicated?

Technology is always improving, and with every new thing that is being developed, there are either bugs or simply ways to exploit some features to obtain the required results. So it is with WebRTC, a real-time communication technology that is built into browsers and exposed through JavaScript, and that can leak your actual IP address from behind your VPN. Check it out on privacytools.io and if you see any IP addresses identified, check out this section on the same privacytools.io and go through the steps enumerated there. Don’t forget to check again if WebRTC leaks your IP address!

3. The Browser


Let’s discuss the surfing boards that we use to navigate in the digital cosmos of the Internet.

What browser is better?

  • Internet Explorer! (said nobody)
  • Edge (*…whispered somebody…*)
  • Opera! (said a couple of people)
  • Safari! (said a bunch of people that have all the Apple products of the newest version the first day they appear)
  • Tor! (shouted an anonymous group from somewhere)
  • Яндекс Браузер! (said a group of Russian speaking people)
  • Chrome!!! (cried a crowd for whom Google probably has their digital version of themselves)
  • Firefox!!! (cried another crowd with posters on privacy)
  • Brave! (said somebody, but it wasn’t clear whether they referred to a browser, or just to be brave in today’s world?)

There are several dozen of them, a list of which you can find on Wikipedia, but this doesn’t answer the above question…

The Risk

Each of the aforementioned browsers is a complex piece of software that provides you access to the Internet. While surfing the World Wide Web, your browser interacts with other computers, exposing some information about itself to any site it visits. And this is where it gets complicated, as a combination of various browser settings can create your unique Device Fingerprint.


Source: commons.wikimedia.org

Wait, what? A fingerprint?

“A device fingerprint is information collected about a remote computing device for the purpose of identification. Fingerprints can be used to fully or partially identify individual users or devices even when cookies are turned off.” (Source: Wikipedia)

So, the bad news is that while surfing the Internet, you literally leave your digital fingerprints on each site you visit.

The good news? Your device fingerprint doesn’t have to be unique if you change your settings to expose as little data as necessary to navigate.

This is possible due to the fact that your device’s fingerprint is not a single piece of information, but is rather a set of different settings (e.g. your screen size, browser type, browser version, installed fonts, installed add-ons, etc.) that together can uniquely identify your browser.
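To make the idea concrete, here is a minimal Python sketch of how a handful of individually harmless settings could be combined into one identifier. The attribute names and values are made up for illustration, and real fingerprinting scripts collect more signals and use their own schemes, so treat this as an assumption-laden toy rather than how any particular tracker works.

```python
import hashlib
import json


def fingerprint(settings: dict) -> str:
    """Combine browser settings into a single short identifier."""
    # Sort keys so the same settings always serialize the same way.
    canonical = json.dumps(settings, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values, chosen only for illustration.
browser_a = {
    "screen": "1920x1080",
    "browser": "Firefox 58",
    "fonts": ["Arial", "DejaVu Sans", "Ubuntu"],
    "timezone": "UTC+2",
}
# The same machine with one extra font installed.
browser_b = dict(browser_a, fonts=["Arial", "DejaVu Sans", "Ubuntu", "Comic Neue"])

print(fingerprint(browser_a))
print(fingerprint(browser_b))  # a different identifier
```

A single unusual setting is enough to change the identifier, which is why the goal is to keep every exposed value as ordinary as possible.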

Remember the lady in red from The Matrix? She stands out because she has a very distinctive appearance in comparison to others around her. So it is with your browser — the more distinctive features it has, the easier it is to spot in the crowd.

But if you dressed her in a black jacket and a white shirt, like the people around her, she wouldn’t stand out much.

There are more than a dozen pieces of information that your browser exposes about its settings, and our job is to make them as “common” as possible.

Want to see what your device fingerprint is? Check out:

  1. panopticlick.eff.org
  2. amiunique.org

If you choose Panopticlick, you’ll see something like this:

In the “Browser Characteristic” column, you can find what type of information is being collected. Based on this information, your browser can be identified. Another interesting column is “one in x browsers have this value,” which shows how rare your value is for that characteristic and is therefore a measure of its entropy. The smaller the number there the better, as it means that there are many other browsers with this exact setting.
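A “one in x” figure can be converted into bits of identifying information with a base-2 logarithm. The sketch below uses invented sample values and assumes the characteristics are independent of each other, which is rarely exactly true in practice, so the total is only a rough estimate.

```python
import math


def bits_of_information(one_in_x: float) -> float:
    """Convert a 'one in x browsers' rarity into bits of identifying information."""
    return math.log2(one_in_x)


# Hypothetical 'one in x' values for three characteristics.
characteristics = {"user_agent": 1024, "screen_size": 16, "timezone": 8}

total_bits = sum(bits_of_information(x) for x in characteristics.values())

# Under the independence assumption the bits add up: 10 + 4 + 3 = 17,
# which corresponds to roughly one browser in 2**17.
print(total_bits, 2 ** total_bits)
```

This is why several moderately common values can still add up to a fingerprint that very few browsers share.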

Also, above the table you can see how unique you are. The image above represents the results of the test run from my Chrome browser, which is not configured for keeping me private.

After tweaking some settings and installing some add-ons, here’s what you can achieve (this one’s from my Firefox browser, which I use on a daily basis):

Only 1 in 75,604 browsers from Panopticlick’s dataset have the same fingerprint as mine, which is much better (but not ideal).

Solution

The first thing is to select a browser. From a privacy perspective there are several of them that are widely recommended above others. Namely these are:

1. Tor Browser
Comes with pre-installed privacy add-ons, encryption and an advanced proxy. This one you can pretty much use as it comes out of the box.

2. Firefox  
Tweak the default configuration and install some privacy add-ons and you’re good.

3. Brave  
Automatically blocks ads and trackers, making your navigation faster and safer.

Configure your browser for increased privacy
There are 2 options here:

  1. The easy path would be to follow the instructions here (only valid for Firefox, but you can search for similar settings in Chrome under “about:flags”).
  2. If you’d like to have more flexibility and the possibility to have your privacy settings importable/exportable, check out the ghacks-user.js project on GitHub (also only for Firefox). It’s more comprehensive and requires some setup, but it’s worth it.

Set up additional add-ons for increased privacy
Read about this below.

4. Cookies

You have probably heard about cookies on the web and that they are something not very good (otherwise why would sites inform you about their usage of cookies when you navigate to one of their pages?)

The reality is that cookies are nothing but a tool, and only some uses of this tool are questionable from a privacy standpoint.

So, cookies are small strings of text that a site can store in your browser. They cannot install anything (they are just text) and are visible only to the site that stored them (so that no site can see all of your cookies for 20 other sites you’ve visited).

Moreover, cookies are sent with each request to the site that set them, and this is what makes them a potential threat to privacy.

Let’s take a simple example: suppose you visit a site that has light and dark themes. The default one is the light theme, but you’ve selected the dark one. Any time you visit that site, even if you don’t log in or register, it displays the dark theme.

In this case, the site could have saved the cookie theme=dark in your browser, and whenever you load that site, this cookie is sent to the server, which then serves the corresponding .css file with the dark theme.
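The theme example can be sketched with Python’s standard http.cookies module. The browser side is only simulated here, and the stylesheet file names are hypothetical.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie header that stores the preference.
response_cookie = SimpleCookie()
response_cookie["theme"] = "dark"
set_cookie_header = response_cookie.output()
print(set_cookie_header)  # Set-Cookie: theme=dark

# Browser side (simulated): on later visits the browser sends the value
# back in the Cookie request header, and the server parses it.
request_cookie = SimpleCookie()
request_cookie.load("theme=dark")
theme = request_cookie["theme"].value

# Hypothetical file names for the two stylesheets.
stylesheet = "dark.css" if theme == "dark" else "light.css"
print(stylesheet)  # dark.css
```

Nothing in this exchange identifies the visitor; the cookie only carries a preference.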

The fact that you are constantly logged in on sites when you open them even after rebooting your computer is also possible due to cookies storing the data about your session.

The Risk

Now that was an innocent example, and it’s probably not very clear how one could benefit from these cookies. So let’s look at another, more specific example of how cookies can infringe on our privacy:

  1. The User decides to visit siteA.com.
  2. SiteA.com, in order to make some money, shows ads from siteB.com, by placing a specific piece of code within its own pages.
  3. When siteA.com receives the request from User, it sends to him/her the HTML code of the page the User requested, which, in this case, contains an <iframe> HTML tag. This tag loads another page, the ad page, within the current one.
  4. When the User’s browser receives the HTML code from siteA.com, it starts rendering the page and making subsequent requests to get everything that is needed to properly load the page. Thus, the browser will make a request to retrieve the ad from siteB.com, sending the cookies related to siteB.com. But because there are no cookies yet in User’s browser, siteB.com instructs the browser to store the cookie with key __uId and the value abc1.
    At the same time, siteB.com creates a profile in their database with abc1 ID, which will collect all the data about our User. It does so with the help of the Referer header, which contains the URL that initiated the request. In this case, the Referer Header would have the value www.siteA.com.
  5. After some time (or right after siteA.com) the User navigates to siteC.com.
  6. SiteC.com, which is completely unrelated to siteA.com, shows ads from the same advertising company (siteB.com).
  7. When the browser receives the HTML code for siteC.com and makes a request to retrieve the ad from siteB.com, this time it automatically sends the cookie __uId=abc1 to siteB.com, which is User’s unique identifier. This, together with the Referer Header that now contains the value www.siteC.com, tells siteB.com that the User is already in their database. So they update his/her profile with the latest visited website, which is siteC.com.

And thus, bit by bit, advertising companies collect tons of data about people’s online activities.
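The steps above can be simulated in a few lines of Python. This is a toy model, not real ad-server code: the AdServer class, the cookie jar dictionary and the way identifiers are generated are all assumptions made for illustration.

```python
import itertools
from collections import defaultdict
from typing import Dict, List, Optional


class AdServer:
    """Toy stand-in for siteB.com; names and ID format are illustrative."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self.profiles: Dict[str, List[str]] = defaultdict(list)

    def serve_ad(self, cookie: Optional[str], referer: str) -> str:
        """Record the visit and return the cookie value the browser should keep."""
        if cookie is None:
            # First contact: issue a new unique identifier.
            cookie = f"abc{next(self._counter)}"
        self.profiles[cookie].append(referer)
        return cookie


ad_server = AdServer()
browser_cookie_jar: Dict[str, str] = {}  # cookies keyed by the site that set them

for visited_site in ["www.siteA.com", "www.siteC.com"]:
    # The embedded ad triggers a request to siteB.com with its own cookie (if any).
    sent = browser_cookie_jar.get("siteB.com")
    browser_cookie_jar["siteB.com"] = ad_server.serve_ad(sent, referer=visited_site)

print(dict(ad_server.profiles))
# {'abc1': ['www.siteA.com', 'www.siteC.com']}
```

Neither siteA.com nor siteC.com ever sees the identifier; only the party embedded on both pages can link the two visits.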

Solution

Here you have 3 possible options:

  • Completely disable cookies (but this will break some sites and they won’t work). This can be done in browser settings;
  • Limit the cookies to “first party,” which means only the site you are currently navigating to will be able to write cookies into your browser and no other “third-party” cookies from ad companies will be used (this may still break some sites, but very few). This also can be done in browser settings;
  • (Recommended approach) Install an add-on that will handle cookies, with custom rules defined per each site (this approach requires some setup, but is the most flexible).

You can find some add-on recommendations at the end of the article.

5. Scripts


Sadly, the saying “with great power comes great responsibility” is not very popular among today’s corporations…

JavaScript is the quintessential building block of websites as it offers many possibilities to do various things. You can build games, engaging interactions, animations, and a myriad of other cool stuff on a web page.

JavaScript can also get your screen size, battery charge level (in case of laptops), list of installed plugins in your browser and other information that can be used to uniquely identify you.

The Risk

So what happens on many sites that you visit? In order to make money, site owners put scripts of ad companies on their sites and once you load the page, the browser loads third-party scripts as well. Those scripts then extract the potentially identifying information about you and send it over to ad companies along with your actions on the page you loaded.

Of the many types of information that JavaScript can get about your browser, its Canvas Fingerprint is by far the most powerful, as it provides the most entropy. It mainly does so because several factors that can greatly vary, like your GPU, graphics drivers, OS and browser, all contribute to its creation.

Thus, bit by bit, companies collect information about your actions, sites you visit, and your clicks, and create your digital profile based on which tailored ads are served to you. This is further adjusted by your continuous actions on the Internet.

Solution

Disabling JavaScript is not a solution, as you won’t be able to use half of the sites on the Internet. However, what you can do is block scripts from specific vendors so that they won’t load with the page, and block requests to ad companies that carry potentially identifiable information. Several add-ons for this are described in part 6 below.

6. Miscellaneous and Add-ons

The above steps are fundamental for online privacy, but, unfortunately, they are not sufficient — there are still enough bits of information that sites can collect and use to construct someone’s digital profile.

In this section, you will find a list of basic add-ons for Firefox Quantum (version ≥ 57) that will let you browse safely. In case you’ve chosen another browser, you can search for alternatives in their corresponding add-on lists.

Please note that this is by no means an exhaustive list, so feel free to add your suggestions in the comments. Also, some features might be present in more than one add-on, which for most cases won’t cause any conflicts, but just be aware that sometimes things might not work. That’s why I would recommend installing them one by one and loading several pages to test whether everything works as expected.

So, here we go:

1. Cookies

There are plenty of add-ons in this category. I personally use Cookie AutoDelete, but you might like something else. Just activate it to delete the cookies either on closing the browser, or once a predefined period of time has elapsed.

Thus, sites and ad providers won’t be able to track you easily with cookies, as for them you’ll be like a new visitor each time you visit a site. The other side of the “privacy blade” is that you’ll have to log in each time you open the browser, because the session cookies will also be deleted.

It might be a bit annoying, but hey, nobody told you it would be daisies.

2. Script Blockers

There are several popular add-ons to block unnecessary tracking scripts (in no particular order): uMatrix, NoScript, uBlock Origin, AdBlock, and others.

The first two provide you with more flexibility, but require some learning and setup. By default they “break” lots of sites, as they simply block all scripts and you need to define some rules regarding what to allow and what to block. Personally, I used both NoScript and uMatrix but prefer uMatrix (currently using it).

uBlock Origin and AdBlock are best if you don’t want to spend time learning how they work and just want to start navigating more securely. These work out of the box, but sometimes may provide less privacy than uMatrix or NoScript.

3. User Agent

There are also plenty of add-ons for switching the User Agent header (that’s the information about what Operating System and Browser you are using).

The problem is that there are so many OS and browser versions, that this header alone can be a useful source of information for those who want to identify you.


Source: amiunique.org

The purple segments on the OS chart are iOS versions. According to this chart, the most common operating system is Windows 7. The situation is not that “common” on the browser side, as the vendors are churning out new versions all the time:


Source: amiunique.org

On the left you can see the distribution of Firefox versions among people in the amiunique.org dataset, and on the right the distribution of Chrome versions.

I don’t have a strong favorite for this feature. Currently I use User Agent Switcher, as it lets you set your own custom User Agent header value and also has a random mode that switches between different user agents over time.

4. Encrypted Browsing

Have you noticed that some URLs start with http:// and some start with https://? HTTP stands for HyperText Transfer Protocol, which is the protocol that defines how browsers and web servers communicate.

Add Secure at the end and you’ll know what HTTPS stands for. When you access a site that starts with https://, the contents of the requests you make are encrypted, making it very hard to understand what you are sending even if someone intercepts the request.

Sadly, not all sites implement automatic redirection of HTTP links to HTTPS links, making your online navigation visible to those who may intercept your traffic.

Luckily, HTTPS Everywhere solves that problem and automatically redirects you to the encrypted version of the sites (if these exist).
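The basic idea of upgrading a link can be sketched as follows. This is not how HTTPS Everywhere is implemented; the host list and the function name are hypothetical, and the sketch simply assumes we already know which hosts offer an encrypted version.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical list of hosts known to support HTTPS.
HTTPS_CAPABLE_HOSTS = {"example.com", "www.example.com"}


def upgrade_to_https(url: str) -> str:
    """Rewrite http:// to https:// when the host is known to support it."""
    parts = urlsplit(url)
    if parts.scheme == "http" and parts.hostname in HTTPS_CAPABLE_HOSTS:
        return urlunsplit(parts._replace(scheme="https"))
    # Unknown host or already encrypted: leave the URL untouched.
    return url


print(upgrade_to_https("http://example.com/page?x=1"))  # https://example.com/page?x=1
print(upgrade_to_https("http://unknown.test/"))         # http://unknown.test/
```

Leaving unknown hosts untouched matters, because blindly rewriting every link would break sites that have no encrypted version.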

5. Canvas Fingerprinting

There are two solutions to approach canvas fingerprinting:

  1. Block any attempt to access this API.
  2. Alter the resulting fingerprint each time it is accessed.

In the long run, the first option is best, as it doesn’t provide additional information. However, because few people are aware of it and have chosen to block the canvas API, the lack of a canvas fingerprint in your browser is in itself a source of identifying information about you.

The other option is to alter the canvas fingerprint and occasionally change it, so that each time you will have a different fingerprint, as though different people were navigating.

For this purpose CanvasBlocker works pretty well. It supports both options described above, and it’s up to you to decide which one to choose.
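The second option can be illustrated with a small Python sketch. It does not reflect how CanvasBlocker works internally; the pixel data is synthetic and the noise strategy (flipping the lowest bit of a few values) is an assumption chosen because it is easy to follow.

```python
import hashlib
import random
from typing import List


def canvas_hash(pixels: List[int]) -> str:
    """Hash the raw pixel values, as a fingerprinting script might."""
    return hashlib.sha256(bytes(pixels)).hexdigest()[:12]


def add_noise(pixels: List[int], rng: random.Random) -> List[int]:
    """Flip the lowest bit of a few randomly chosen pixel values."""
    noisy = list(pixels)
    for index in rng.sample(range(len(noisy)), k=3):
        noisy[index] ^= 1
    return noisy


# Synthetic stand-in for the pixels a canvas would render.
rendered = [(i * 37) % 256 for i in range(64)]

print(canvas_hash(rendered))                                # the stable fingerprint
print(canvas_hash(add_noise(rendered, random.Random())))    # changes from run to run
```

The change to the image is invisible to the eye, yet the hash no longer matches the one recorded on a previous visit.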

6. Referer Header (without double ‘r’)

This header is sent with each request, indicating from where that request came (which site has referred the page you’re requesting). It can be used to track your online navigation and to see which sites you access from which sites.

But it can be altered so that your online navigation routes stay for your eyes only (and for whoever else uses your computer).

If you use the aforementioned uMatrix add-on, it comes with Referer Header spoofing. Otherwise, just search for “referer spoof” among the add-ons of your browser and choose one.
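As a rough sketch of what such a rule might do, the function below drops the header for cross-site requests and trims it to the bare origin otherwise. The policy and the function name are assumptions made for illustration; actual add-ons offer their own, often configurable, behavior.

```python
from typing import Optional
from urllib.parse import urlsplit


def trimmed_referer(referer: str, target_url: str) -> Optional[str]:
    """Return the Referer value to send, or None to omit the header."""
    source = urlsplit(referer)
    target = urlsplit(target_url)
    if source.netloc != target.netloc:
        # Cross-site request: reveal nothing about where we came from.
        return None
    # Same site: keep only scheme and host, drop path and query.
    return f"{source.scheme}://{source.netloc}/"


print(trimmed_referer("https://shop.example/cart?item=42", "https://ads.example/pixel"))
print(trimmed_referer("https://shop.example/cart?item=42", "https://shop.example/checkout"))
```

The first call prints None and the second prints https://shop.example/, so a third party learns nothing and the site itself learns only that you stayed on it.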

7. Link Cleaner

The Referer Header is an advanced way to get to know where the person came from, but one of the most common approaches to tracking your engagement is Query String parameters. These parameters are parts of URLs that come after the ? character and they hold various types of data.

Take this link: http://meyerweb.com/eric/thoughts/2017/03/07/welcome-to-the-grid/?utm_source=frontendfocus&utm_medium=email&page=2

The key=value pairs to the right of the ? character are query parameters. When you click on such a link, the values of query parameters are sent to the server.

Have you ever wondered what utm_medium and the other utm_* parameters in URLs mean? These are related to Google Analytics.

Not all query parameters are infringing on privacy. Some of them are necessary for the site to work properly (e.g. the page parameter).

You can use Link Cleaner which will remove most of the query parameters used for tracking.
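The cleaning step itself is simple enough to sketch with Python’s standard urllib.parse module. This is not Link Cleaner’s code; the function name is hypothetical and the sketch only filters parameters whose names start with utm_.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_tracking_params(url: str) -> str:
    """Remove utm_* query parameters and keep everything else."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


link = (
    "http://meyerweb.com/eric/thoughts/2017/03/07/welcome-to-the-grid/"
    "?utm_source=frontendfocus&utm_medium=email&page=2"
)
print(strip_tracking_params(link))
# http://meyerweb.com/eric/thoughts/2017/03/07/welcome-to-the-grid/?page=2
```

The page parameter survives, so the site still works, while the campaign labels never reach the server.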

The Next Steps


I tried to cover the main aspects of online privacy that should lay a solid foundation for further investigation and learning about this topic.

But there is so much to privacy that it’s hard to cover everything in one article, and there are things that haven’t been mentioned. Feel free to add them in comments so that those people who want to learn even more will have the chance to do it!

Also, I have mainly focused on safe and private browsing on the Internet, but privacy should be a thing to consider for each of the online services we use as well, including email, file sharing and other services we use on a daily basis.

And remember: It’s one thing to consciously share our information with others and it is completely different to have sensitive information collected without our knowledge and consent.

Stay private!

PrivacyTools — a comprehensive resource on privacy. Contains links and recommendations to service providers as well.

BrowserLeaks: analyzes your browser along several dimensions related to privacy, including IP addresses, canvas fingerprinting, Flash and more.

Panopticlick: checks how safe your browser is against tracking and comes with a report on the things that reveal the most information about you.

AmIUnique — an alternative to Panopticlick. Has some general statistics about their dataset as well.

Firefox Hardware Report — a weekly report of the hardware used by a representative sample of the population of the Internet.

Screen Resolution tracking — an interesting thread on how the browser/screen size can let you down.

Firefox getting smarter about Third-Party Cookies.


This post was originally published on Medium.

Discover and read more posts from Iulian Gulea