Since 2010, more and more online news resources have been using paywalls to generate revenue from their news, articles and content. This is an indisputable right of the owners. But what about end users? Do users embrace these paywalls? Are the paywalls working correctly?
From the end user's perspective, there is a contradiction between paying regularly for multiple news resources and the need for free access to verified information and quality content. Online news resources such as The New York Times, The Wall Street Journal, Financial Times, The Economist, Bloomberg and Medium are visited frequently for just a few articles, and each of these resources has its own membership options or subscription packages. Few people have the time or budget to subscribe to many news resources at once. As a result, people regularly try to bypass paywalls.
Currently, there are several methods for opening the doors of paywalls; some still work and some do not, owing to countermeasures taken by the news resources. To make matters worse, some of these techniques endanger the end user's security and privacy because they route all of the browser's traffic through third-party servers.
Assessment of the Current Techniques
Let's have a quick review of the current circumvention techniques. Note that none of them works on every paywall.
Using a browser plugin that routes your traffic through another server is a very suspicious move for vigilant nerds. The browsers' incognito modes do not work on all resources. Deleting cookies does not work on all paywalls either. Controlling the referer information in the browser is, again, not applicable to every paywall, and the settings are painful for most users. Some paywalls, like Financial Times, open their doors when you arrive from their Facebook page (Facebook redirection). Removing control codes and manipulating the JavaScript on a paywall's pages via the browser's F12 developer tools requires technical know-how that most users do not have. Reading, annotating and archiving services like Outline and Archive.is are useful, but they always run on a separate page and require an everlasting copy-paste operation.
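To make the referer technique concrete, here is a minimal sketch using Python's standard library. The article URL is hypothetical, and whether a given paywall actually honors a Facebook referer varies by site, as noted above:

```python
from urllib.request import Request

# Hypothetical article URL, for illustration only.
ARTICLE_URL = "https://www.example-news-site.com/some-article"

# Pretend we arrived via the publisher's Facebook page; some paywalls
# reportedly relax their checks for social-media referrals.
req = Request(ARTICLE_URL, headers={"Referer": "https://www.facebook.com/"})

# urlopen(req) would then fetch the page with the spoofed referer attached.
```

Browser extensions that "control referer information" do essentially this, but for every request the browser makes.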
The New Easiest Way
While people find ways to get around the paywalls, news resources try to render those ways useless by tightening their paywalls with extra controls. Yet the paywalls still fail because the measures are insufficient. Could there be an unhealable Achilles' heel? Why not!
After analysing the major paywalls and the current bypass techniques above, I found the new easiest way to circumvent the paywalls: the search engine crawlers, that is, bots like Googlebot, MSNBot, etc.
Soon after Google's policy, most of the paywalls started providing a limited number of free clicks in order to sustain a good user experience for new potential subscribers. Yet this logic seems to rely only on Googlebot's user-agent string. That means that if you can change your user-agent to match Google's crawlers, you instantly get full access to the paywalled content. What is even funnier is that changing the user-agent in a browser is like shelling peas: the process can be automated with various user-agent switcher browser extensions, which anyone can use without hassle. This method works on almost every paywall.
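A user-agent switcher extension does nothing more than rewrite one request header. The same trick can be sketched in a few lines of standard-library Python (the article URL is hypothetical; the user-agent string is the one Google publishes for Googlebot and may change over time):

```python
from urllib.request import Request

# Googlebot's advertised desktop user-agent string.
GOOGLEBOT_UA = (
    "Mozilla/5.0 (compatible; Googlebot/2.1; "
    "+http://www.google.com/bot.html)"
)

# Hypothetical paywalled article URL, for illustration only.
req = Request(
    "https://www.example-news-site.com/paywalled-article",
    headers={"User-Agent": GOOGLEBOT_UA},
)

# urlopen(req) would fetch the page as if Googlebot had requested it --
# which is exactly what a user-agent switcher extension automates.
```

If the paywall checks nothing beyond this header, the full article comes back.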
What you will notice is that the paywalls do not block the search engine crawlers from seeing their content; they only block humans. The paywalls cannot bear the consequences of not being indexed by the search engines: if they want Google to index them entirely, they must open their doors to Googlebot. By doing so, the paywalls make their content searchable by humans through the search engines. Most paywalls then show a few free articles for a better user experience and redirect readers to their membership options. But in the meantime, you can read every article simply by changing your user-agent to a search engine crawler's.
Actually, a reverse DNS lookup on the IP address behind an incoming crawler user-agent can be used as a countermeasure against spoofed bot user-agents. Alternatively, if you have the IP ranges of the search engine crawlers, you can block spoofed bots that way. Yet none of the paywalls seems to have taken these simple measures, and until they apply one of them, this method is unbeatable. I would bet that the big news resources already know about these measures. So why can I still reach all paywalled content so easily?
Closing Words
In this post, I deliberately kept away from deeper issues such as the philosophy of the free movement of information or its commercialization. If you benefit significantly from news sources, you should pay for them. But for many people it is economically unaffordable to pay for even a few of the increasingly diversified news resources. At this point, I would suggest that the news resources consolidate their subscriptions in a joint collaboration and offer a new type of bundled membership that is valid across multiple paywalls at the same time. Otherwise, they will have to block access to their content with solid measures.