Large-scale Detection of Cookie Paywalls

dc.contributor.author: Thunberg, Adam
dc.contributor.author: Wallgren, Oskar
dc.contributor.department: Chalmers tekniska högskola / Institutionen för data och informationsteknik [sv]
dc.contributor.department: Chalmers University of Technology / Department of Computer Science and Engineering [en]
dc.contributor.examiner: Gulisano, Vincenzo
dc.contributor.supervisor: Morel, Victor
dc.date.accessioned: 2023-12-22T10:29:03Z
dc.date.available: 2023-12-22T10:29:03Z
dc.date.issued: 2023
dc.date.submitted: 2023
dc.description.abstract: Targeted advertising has become standard practice for websites that offer free content, at the expense of tracking visitors and selling their data to advertisers. Alternatives exist, such as paywalls that block content until a fee has been paid. A new type of paywall, termed a cookie paywall, was recognized in 2022; it differs from a regular paywall by offering two ways to access content: agree to be tracked by consenting to cookies, or pay a fee. The ePrivacy Directive of the European Union states that “A user cannot be denied access to a website for the purpose of declining cookies”, a provision that cookie paywalls clearly violate. Despite this, they have been declared allowed, or described as a legal gray area, in countries such as Austria and France, respectively. In the only known previous study on cookie paywalls, 2800 websites based in Central Europe were manually browsed and analyzed, and 13 of them were reported to employ cookie paywalls. Our goal for this thesis was to study the prevalence of cookie paywalls on a large scale. We built a scalable web crawler that performed cookie paywall detection on the 1 million most popular websites. We found 431 cookie paywalls, most of which were located in Germany, a country where cookie paywalls are prohibited by law. The runners-up were France and Italy, where 42 and 27 cookie paywalls were found, respectively. Only 8 were found outside of Europe. The price across all known cookie paywalls averaged €3.34 per month, ranging from €0.75 to €49 per month. In scalability tests, the crawler cluster achieved a linear increase in performance as more crawlers were added, reaching a throughput of 7.46 pages/second with 32 crawlers. Although the crawler found hundreds of cookie paywalls, its detection algorithm was likely biased because it was built by analyzing websites only in Central Europe, which leaves room for improvement.
Repeating similar analyses in the future would also make it possible to determine a trend in cookie paywall prevalence on the Web.
dc.identifier.coursecode: DATX05
dc.identifier.uri: http://hdl.handle.net/20.500.12380/307478
dc.language.iso: eng
dc.setspec.uppsok: Technology
dc.subject: cookies
dc.subject: crawling
dc.subject: paywalls
dc.subject: web privacy
dc.title: Large-scale Detection of Cookie Paywalls
dc.type.degree: Examensarbete för masterexamen [sv]
dc.type.degree: Master's Thesis [en]
dc.type.uppsok: H
local.programme: Computer systems and networks (MPCSN), MSc

Download

Original bundle

Name: CSE 23-95 AT OW.pdf
Size: 3.57 MB
Format: Adobe Portable Document Format
