HLISA: towards a more reliable measurement tool

Daniel Goßen, H.L. Jonker, Stefan Karsch, Benjamin Krumnow*, David Roefs

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference article in proceeding › Academic › peer-review

Abstract

Automated browsers (web bots) are an invaluable tool for studying the web. However, research has shown that web bots can be distinguished from regular browsers and that they may be served different content as a consequence. This undermines their utility as a measurement tool. So far, three aspects have been used to detect web bots: the browser fingerprint, the order of site traversal, and aspects of page interaction.
While site traversal depends on the study being executed, the other two aspects can be controlled in a generic fashion. Whereas the identifiability of web bot fingerprints has been studied in the past, how to alter the fingerprint has received less attention. In this paper, we first study which method of altering the fingerprint incurs the fewest side effects. Secondly, we provide an initial investigation of how the interaction API of Selenium differs from human interaction. We incorporate the latter results into HLISA, an API that simulates human-like interaction. Finally, we discuss the arms race between simulators and detectors and find that, conceptually, detecting HLISA requires modelling human interaction.
Original language: English
Title of host publication: IMC '21
Subtitle of host publication: Proceedings of the 21st ACM Internet Measurement Conference
Pages: 380–389
Number of pages: 10
Publication status: Published - Nov 2021
Event: The 21st ACM Internet Measurement Conference - Online, ACM, New York, United States
Duration: 2 Nov 2021 to 4 Nov 2021
https://conferences.sigcomm.org/imc/2021/

Conference

Conference: The 21st ACM Internet Measurement Conference
Abbreviated title: IMC '21
Country/Territory: United States
City: New York
Period: 2/11/21 to 4/11/21
