Advertising pays much of the budget for most online publishers, making the growth of ad blockers an existential threat. As such, ad blocking has set off a software-based arms race, with publishers finding software solutions that keep ads appearing, or entreat people using ad blocking software to white-list them. Ad blockers readily respond with modified software that targets these specific responses, triggering the publishers to try again.
Some academics have recently stepped into the middle of this arms race, performing an analysis that allows them to identify the specific methods used by publishers to avoid having ads blocked. And the team has gone on to try a couple of different approaches, both of which modify a web page's contents to keep the anti-ad-blocking software from having an effect.
Outside of the economics of it all, there's an interesting computer science problem here. The code on the web page is attempting to identify software present on a user's browser. How do you recognize when that's happening, and how can you possibly intervene?
The ad-blocking wars
The approach the researchers took involved following code execution as a browser loaded and displayed the page. This was done with a modified version of Google's V8 JavaScript engine, one that allowed the team to extract information about the downloaded code that was being processed and executed as the web page loaded. By doing this with and without an ad blocker installed, they were able to identify differences in the code that was executed when ads were displayed or blocked.
As they note, typical anti-ad-blocking code might wait for the page to load, and then check on the size of an element that's meant to contain an ad. If the ad isn't loaded, this area will never get defined, and its size will end up either being undefined or zero. This allows the code to perform some other action, like putting up an alternative ad or displaying a dialog to ask for the ad-blocking software to be disabled.
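The kind of check described above can be sketched in a few lines of JavaScript. This is only an illustration, not code from the paper or from any real site: the function names and the returned action strings are hypothetical, and plain objects stand in for page elements so the sketch runs outside a browser. `offsetHeight` is the standard DOM property a script might read to get an element's rendered height.

```javascript
// Hypothetical sketch of an anti-ad-blocking check.
// `adContainer` stands in for the page element that is meant to hold an ad.
function looksBlocked(adContainer) {
  // If the ad never loaded, the container is missing, or its height is
  // undefined or zero.
  if (!adContainer) return true;
  const height = adContainer.offsetHeight;
  return height === undefined || height === 0;
}

// Decide what the page does next (the action names are made up).
function respondToAdBlocker(adContainer) {
  return looksBlocked(adContainer) ? "show-whitelist-request" : "do-nothing";
}

// Plain objects play the role of DOM elements here.
const loadedAd = { offsetHeight: 250 };
const blockedAd = { offsetHeight: 0 };

console.log(respondToAdBlocker(loadedAd));
console.log(respondToAdBlocker(blockedAd));
```

A real page would run a check like this only after the page has finished loading, since the ad container has no meaningful size before then.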
By following code traces, the authors could look for conditional tests—things like "is the size of this element 0?"—followed by execution of different code depending on whether an ad blocker is present. By examining the code at that location, they could determine which condition was being tested for.
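As a rough illustration of that comparison, the sketch below diffs two simplified traces. The trace format (a list of `{ site, taken }` records, one per evaluated condition) and the function names are assumptions made for this example; the researchers' modified V8 engine records far more detail than this.

```javascript
// Hypothetical trace format: one record per evaluated condition, where `site`
// identifies the location in the code and `taken` is the condition's outcome.
function branchOutcomes(trace) {
  const outcomes = new Map();
  for (const { site, taken } of trace) {
    if (!outcomes.has(site)) outcomes.set(site, new Set());
    outcomes.get(site).add(taken);
  }
  return outcomes;
}

// Report conditions that always went one way without an ad blocker and
// always went the other way with one.
function divergentBranches(traceWithoutBlocker, traceWithBlocker) {
  const without = branchOutcomes(traceWithoutBlocker);
  const withBlocker = branchOutcomes(traceWithBlocker);
  const divergent = [];
  for (const [site, a] of without) {
    const b = withBlocker.get(site);
    if (!b) continue; // condition never evaluated in the other run
    if (a.size === 1 && b.size === 1 && [...a][0] !== [...b][0]) {
      divergent.push(site);
    }
  }
  return divergent;
}

const noBlocker = [
  { site: "main.js:10", taken: true },
  { site: "ads.js:42", taken: false },
];
const blocker = [
  { site: "main.js:10", taken: true },
  { site: "ads.js:42", taken: true },
];

console.log(divergentBranches(noBlocker, blocker));
```

Conditions that behave identically in both runs, like the `main.js:10` entry here, drop out, leaving only the candidates for an ad-blocker test.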
On its own, this provided an indication of just how prevalent anti-ad-blocking software is. The authors claim to have found an anti-ad-blocking response on over 30 percent of the Alexa Top-10,000 websites, but it's somewhat more complicated than that. In many cases, ad-blocking software was detected, but there was no visible response; the software simply logged the presence of the ad blocker, often through Google Analytics.
Setting the researchers' detection system loose on web pages that normally don't show ads indicated it didn't produce any false positive identifications. And a test of over 400 sites known to use anti-ad-blocking software showed that it was over 85 percent accurate at identifying them.
The false negatives came about for a variety of reasons. One of these is simply that JavaScript has a variety of mechanisms by which programmers can test for specific conditions, and the team didn't trigger their analysis on all of them. The second is just random variability; each page was loaded six times, three each with and without ad blocking. Random differences among these, like slower or faster loading of some page components, could obscure the tests for the presence of anti-ad-blockers. There was at least one approach that the software missed entirely: it loaded a warning message about ad blocking, then tried to load an ad on top of it; if the ad was blocked, the warning remained visible.
Intervention
With that success in hand, the authors decided to enter the arms race on the side of the ad blockers. Since they knew what condition was being tested to determine whether an ad blocker was being used, they could intervene in the page's JavaScript in a way that forced it to execute the ad-blocker-free branch of the code. This is relatively simple on the code side: rewrite the JavaScript so all the relevant branches do the same thing. Rewriting, however, required the installation of specially modified proxy software on the same computer, and redirecting all the browser's requests so they went through this software.
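To make the rewriting idea concrete, here is a deliberately naive sketch. It is not the researchers' proxy: the page script, the function name `neutralizeCondition`, and the plain text substitution are all assumptions for illustration, and the sketch forces the identified condition to a constant rather than editing the branch bodies. A real rewriter would presumably work on parsed code rather than raw text.

```javascript
// Hypothetical page script containing an anti-ad-blocking check.
const pageScript = `
  function onLoad(adHeight) {
    if (adHeight === 0) {
      return "show-warning";
    } else {
      return "show-content";
    }
  }
  return onLoad;
`;

// Naive text rewrite: replace the identified condition with a constant so the
// ad-blocker branch can never run.
function neutralizeCondition(source, conditionText) {
  return source.split(`if (${conditionText})`).join("if (false)");
}

const rewritten = neutralizeCondition(pageScript, "adHeight === 0");

// Evaluate the rewritten script and call it as if the ad had been blocked.
const onLoad = new Function(rewritten)();
console.log(onLoad(0));
```

Even though the ad height is reported as zero, the rewritten script follows the same path it would take with no ad blocker installed.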
This approach had a success rate of over 80 percent on the websites it was tested with. And, despite the potentially significant mangling of the underlying code, only one site showed a visual defect.
An alternative approach they tried was somewhat more precise. Since they could identify the condition that was being tested for, they could modify the variables used by the site so that the condition would always evaluate as if an ad blocker was not present. This only requires a browser extension. And, in the 15 websites it was tested on, it worked every time.
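A minimal sketch of this second idea, under the assumption that the site's test reads an element's height: the intervention overrides the value the test sees rather than touching the test itself. The names `adBlockerDetected` and `maskAsLoaded` are hypothetical, and a plain object stands in for the page element so the example runs outside a browser; the researchers' extension is not shown in the article.

```javascript
// Stand-in for the page element that the anti-ad-blocking script inspects.
const adContainer = { offsetHeight: 0 };

// Hypothetical version of the site's check: a zero or missing height is
// treated as evidence of an ad blocker.
function adBlockerDetected(element) {
  return !element.offsetHeight;
}

// Intervention: redefine the property so every read reports a non-zero
// height, whatever actually happened to the ad.
function maskAsLoaded(element, fakeHeight) {
  Object.defineProperty(element, "offsetHeight", {
    get: () => fakeHeight,
    configurable: true,
  });
}

const before = adBlockerDetected(adContainer);
maskAsLoaded(adContainer, 250);
const after = adBlockerDetected(adContainer);

console.log(before, after);
```

The site's code is left untouched, which is why this approach is less likely to break unrelated parts of the page than wholesale rewriting.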
Motivation?
The authors are very up-front about their motivation for this work: "We want to develop a comprehensive understanding of anti-adblockers, with the ultimate aim of enabling adblockers to be resistant against anti-adblockers." They cite user privacy and security as the reason for choosing a side in the arms race, but it's not clear that their approach makes much sense in this regard. Running everything through a modified proxy and manipulating page-wide variables would both seem to create a whole host of privacy and security risks on their own. In addition, it's not clear how blocking the mere logging of the existence of an ad blocker, which their software would do, helps anyone.
And they admit that, as soon as publishers are aware of the methods they use to test for anti-ad-blocking software, workarounds will be possible. This could be as simple as finding a means of searching for an ad blocker that won't be picked up by the researchers' approach. Or it could involve intermingling the code for the ad-blocking test with code that's essential for the page to work. Or re-using the variable that's manipulated by the researchers' software. Any of these, and presumably other approaches, would work.
Finally, the researchers seem to be actively avoiding considering the consequences. Part of their introduction states flatly that "Adblocking results in billions of dollars worth of lost advertising revenue for online publishers." And their own analysis confirms that the majority of the sites running anti-ad-blocking software are producing news. If they're aware that the success of their goals will involve crippling a lot of news sources, it's not apparent from this paper.
Network and Distributed Systems Security 2018, 2017. DOI: 10.14722/ndss.2018.23331.
Ars Technica