This project studies advertising and tracking on the web. Specifically, we focus on the arms race between publishers and advertisers, who employ advertising and tracking services (ATS), and adblockers, which block them. Our work highlights the manual effort that the adblocking community invests in this arms race, and aims to automate that work and remove the resulting bottlenecks.
Our first paper, CV-INSPECTOR (NDSS 2021), studies an entirely new ecosystem of circumvention (CV) services that has recently emerged. These services aim to bypass adblockers by obfuscating site content, making it difficult for adblocking filter lists to distinguish between ads and functional content. In this paper, we investigate recent anti-circumvention efforts by the adblocking community that leverage custom filter lists. In particular, we analyze the anti-circumvention filter list (ACVL), which supports advanced filter rules with enriched syntax and capabilities designed specifically to counter circumvention. We show that keeping ACVL rules up-to-date requires expert list curators to continuously monitor sites known to employ CV services and to discover new such sites in the wild; both tasks require considerable manual effort. To help automate and scale ACVL curation, we develop CV-INSPECTOR, a machine learning approach for automatically detecting adblock circumvention using differential execution analysis. We show that CV-INSPECTOR achieves 93% accuracy in detecting sites that successfully circumvent adblockers. We deploy CV-INSPECTOR on the top-20K sites to discover sites that employ circumvention in the wild. We further apply CV-INSPECTOR to a list of sites that are known to utilize circumvention and are closely monitored by ACVL authors. We demonstrate that CV-INSPECTOR reduces the human labeling effort by 98%, which removes a major bottleneck for ACVL authors. Our work is the first large-scale study of the state of the adblock circumvention arms race, and makes an important step towards automating anti-CV efforts.
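To illustrate the general idea behind differential analysis, the sketch below compares the hostnames contacted by the same page when loaded without and with an adblocker, and turns the difference into a few numeric features that a classifier could consume. This is only an illustration under assumptions: the feature names, the `diff_features` helper, and the example URLs are hypothetical and are not CV-INSPECTOR's actual feature set or code.

```python
from urllib.parse import urlsplit


def hostnames(urls):
    """Return the set of hostnames that appear in a list of request URLs."""
    hosts = set()
    for url in urls:
        host = urlsplit(url).hostname
        if host:
            hosts.add(host)
    return hosts


def diff_features(no_adblock_urls, adblock_urls):
    """Hypothetical differential features between two loads of the same page.

    no_adblock_urls: requests observed without an adblocker.
    adblock_urls: requests observed with an adblocker enabled.
    """
    base = hostnames(no_adblock_urls)
    treated = hostnames(adblock_urls)
    return {
        # Hosts that disappear once the adblocker is on (likely blocked).
        "hosts_only_without_adblock": len(base - treated),
        # Hosts that appear only with the adblocker on (possibly a fallback path).
        "hosts_only_with_adblock": len(treated - base),
        # Hosts contacted in both loads.
        "hosts_in_both": len(base & treated),
    }


if __name__ == "__main__":
    without = ["https://site.example/index.html", "https://ads.example/banner.js"]
    with_blocker = ["https://site.example/index.html", "https://x1y2.example/a.js"]
    print(diff_features(without, with_blocker))
```

A real pipeline would collect many more signals than request hostnames and would train a model on labeled sites; the point here is only the with-versus-without comparison.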
Our second work, AutoFR, aims to automate the process of filter rule generation from scratch. Adblocking relies on filter lists, which are manually curated and maintained by a small community of filter list authors. This manual process is laborious and does not scale well to a large number of sites or over time. We introduce AutoFR, a reinforcement learning framework to fully automate the process of filter rule creation and evaluation. We design an algorithm based on multi-armed bandits to generate filter rules while controlling the trade-off between blocking ads and avoiding breakage. We evaluate the efficiency and effectiveness of our AutoFR implementation on thousands of sites. AutoFR is efficient: it takes only a few minutes to generate filter rules for a site. AutoFR is also effective: it generates filter rules that block 86% of the ads, compared to 87% for EasyList, while achieving comparable visual breakage. The filter rules generated by AutoFR generalize well to new and unseen sites. We envision AutoFR assisting the adblocking community in automated filter rule generation at scale.
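As a rough illustration of how a multi-armed bandit can trade off ad blocking against breakage, the sketch below runs the standard UCB1 strategy over a handful of candidate filter rules in a simulated environment. Everything specific here is an assumption made for illustration: the candidate rules, their made-up blocking and breakage numbers, the `reward` function with its tolerance parameter `w`, and the use of UCB1 itself. It is not AutoFR's actual algorithm or implementation.

```python
import math
import random

# Hypothetical candidate rules with made-up (ads_blocked, breakage) fractions.
CANDIDATES = {
    "||ads.example.com^": (0.8, 0.0),
    "||cdn.example.com^": (0.9, 0.6),
    "||analytics.example.com^": (0.1, 0.0),
}


def reward(ads_blocked, breakage, w=0.9):
    """Assumed reward: ads blocked if breakage is within tolerance 1 - w, else -1."""
    return ads_blocked if breakage <= 1.0 - w else -1.0


def run_ucb1(candidates, rounds=200, w=0.9, seed=0):
    """Run UCB1 over candidate rules; return (mean reward, pull count) per rule."""
    rng = random.Random(seed)
    arms = list(candidates)
    counts = dict.fromkeys(arms, 0)
    means = dict.fromkeys(arms, 0.0)
    for t in range(1, rounds + 1):
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            # Try every rule once before using confidence bounds.
            arm = untried[0]
        else:
            # Pick the rule with the highest upper confidence bound.
            arm = max(
                arms,
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        ads, breakage = candidates[arm]
        # Simulate a noisy measurement of the fraction of ads blocked.
        noisy_ads = min(1.0, max(0.0, ads + rng.gauss(0, 0.05)))
        r = reward(noisy_ads, breakage, w)
        counts[arm] += 1
        # Incremental update of the running mean reward.
        means[arm] += (r - means[arm]) / counts[arm]
    return means, counts


if __name__ == "__main__":
    means, counts = run_ucb1(CANDIDATES)
    best = max(means, key=means.get)
    print("best rule:", best)
    print("pull counts:", counts)
```

In this toy setup the rule that blocks the most ads also breaks the page beyond the tolerance, so it is penalized and the bandit settles on the rule that blocks slightly fewer ads without breakage. Raising or lowering `w` changes how much breakage is tolerated.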
Papers
H. Le, S. Elmalaki, A. Markopoulou, Z. Shafiq, “AutoFR: Automated Filter Rule Generation for Adblocking,” to appear in Proceedings of the 32nd USENIX Conference on Security Symposium (SEC) 2023. August 2023, Anaheim, CA. [ USENIX Paper | Extended Version | AutoFR GitHub | Dataset | Ad-Filtering Dev Summit 2022 | Adblocker Dev Summit 2021 ]
H. Le, A. Markopoulou, Z. Shafiq, “CV-INSPECTOR: Towards Automating Detection of Adblock Circumvention,” Proceedings of Network and Distributed System Security Symposium (NDSS) 2021. February 2021. [ CV-Inspector GitHub | Datasets | CV-Inspector Amazon Machine Image | NDSS Slides ] [Adblocker Dev Summit 2020]
Team
- Hieu Le (UC Irvine)
- Salma Elmalaki (UC Irvine)
- Athina Markopoulou (UC Irvine)
- Zubair Shafiq (UC Davis)
Contact: hieul@uci.edu