Utilizing Source Information to Detect and Prevent Online-Based Fraud
- Author(s): Zhang, Qing
- et al.
Every day, organizations release new services and generate data that we access on the Web in the form of digital goods. Digital goods are targeted by fraudsters who seek monetary gain through malicious activities. These activities include stealing sensitive data, falsifying Web traffic, and spamming the Web, which can cost billions of dollars a year in lost revenue. This dissertation looks at three systems that validate or safeguard digital goods. First, I present Sifter, a system that assesses the quality of digital goods in the form of Web traffic. Sifter works by instrumenting duplicate servers that receive traffic from separate sources. This instrumentation examines various metrics, including mouse activity, click-through rate, user-agent strings, timing metrics, and blacklists. Sifter compares these metrics across traffic providers that vary in reputation. As a result, Sifter spots significant differences in the quality of traffic sources and sheds light on the savings offered by low-cost, bulk traffic digital goods. I then move from detecting fraudulent digital goods to Neon, a framework that facilitates better data policy management for derived digital goods. Neon tracks every byte on a server by tinting it at the virtualization layer. As a result, the flow of every byte in the system can be tracked without modification to guest operating systems or user applications. Neon then applies different management policies to tinted bytes, which can be used to control the flow of digital goods within an organization. Last, I present DSpin, which addresses the case where an organization must detect fraudulent digital goods outside of its network. Specifically, DSpin detects spinning on the Web, where a spun article has words replaced and content restructured so that it keeps a similar meaning but has a different appearance.
DSpin operates by identifying immutable words in the article that remain unchanged after spinning and comparing them between articles. I show that DSpin is an effective means of detecting spun digital goods. Each of these three systems detects and prevents fraudster attacks by examining source information. Indeed, I show that source information can be pivotal in safeguarding and validating digital goods.
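
As a rough illustration of the immutable-word comparison described for DSpin, the Python sketch below treats any word that does not appear in a synonym list as immutable and then compares the immutable sets of two articles. The synonym list `SPINNABLE`, the function names, the sample sentences, and the choice of Jaccard similarity as the comparison measure are all assumptions made for illustration; they are not details taken from the dissertation.

```python
import re

# Hypothetical synonym list: words a spinning tool could swap out.
# A real list would be far larger; this one exists only for the example.
SPINNABLE = {"buy", "purchase", "cheap", "inexpensive", "today", "now"}


def immutables(text, spinnable_words):
    """Return the set of lowercase words in `text` that are not spinnable."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {token for token in tokens if token not in spinnable_words}


def immutable_similarity(text_a, text_b, spinnable_words):
    """Jaccard similarity of the two articles' immutable word sets.

    Jaccard is an assumed comparison measure, chosen for simplicity.
    Returns 0.0 when neither article has any immutable words.
    """
    a = immutables(text_a, spinnable_words)
    b = immutables(text_b, spinnable_words)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


original = "Buy cheap Acme widgets from Springfield today"
spun = "Purchase inexpensive Acme widgets from Springfield now"
unrelated = "Zenith gadgets ship from Shelbyville"

spun_score = immutable_similarity(original, spun, SPINNABLE)            # 1.0
unrelated_score = immutable_similarity(original, unrelated, SPINNABLE)  # 0.125
```

In this toy case the spun sentence keeps exactly the same immutable words as the original, so the score is 1.0, while the unrelated sentence shares only "from" and scores 0.125. A practical detector would presumably also need a threshold for deciding when two articles count as spun copies of each other.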