Advertising attribution methods

To discuss ad cheating, you first need to understand how ad attribution works, because cheating methods are built around the attribution logic. There are various attribution schemes; here we focus on app advertising, the overseas mobile ecosystem, and third-party attribution. The overseas mobile advertising ecosystem has mature, credible third-party attribution platforms such as AppsFlyer, Adjust, and Kochava. Their core logic is the same: the last-touch attribution model, i.e. “Last Click”.

image

After a media source displays an ad, if the user clicks it, the media source passes its own identifying information, the user’s device information (chiefly IDFA/IMEI), a timestamp, the network status, and other click data to the third-party attribution platform through an HTTP 302 redirect: the click first lands on the attribution platform’s backend and is then redirected to Google Play or the App Store. At this point, the third-party attribution platform has no information about the ad’s impressions.

After the app is activated, its information is sent back to the third-party attribution platform, either by the integrated attribution SDK or through a server-to-server (S2S) integration. The attribution platform then searches its database for click records that match the app package name and the user ID, and, following the last-click rule, attributes the activation to the media source and ad behind the most recent matching click. This completes the attribution process.
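The matching step described above can be sketched roughly as follows. This is a minimal illustration, not any platform’s actual implementation: the `Click` and `Install` records, the function name `attribute_last_click`, and the 7-day window are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed attribution window; real platforms let advertisers configure this.
ATTRIBUTION_WINDOW_SECONDS = 7 * 24 * 3600


@dataclass
class Click:
    media_source: str
    device_id: str      # e.g. IDFA/IMEI
    package_name: str
    timestamp: int      # Unix seconds


@dataclass
class Install:
    device_id: str
    package_name: str
    timestamp: int      # Unix seconds of first launch (activation)


def attribute_last_click(install: Install, clicks: List[Click],
                         window: int = ATTRIBUTION_WINDOW_SECONDS) -> Optional[Click]:
    """Return the most recent matching click, or None (organic install)."""
    candidates = [
        c for c in clicks
        if c.device_id == install.device_id
        and c.package_name == install.package_name
        and 0 <= install.timestamp - c.timestamp <= window
    ]
    # Last Click: the latest click before activation wins.
    return max(candidates, key=lambda c: c.timestamp, default=None)
```

Because only the latest matching click counts, any party that can insert a click shortly before activation takes the credit, which is the weakness the cheating methods below exploit.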

Classification of Cheating

Across all types of ad cheating, cheaters fake one or both of the two kinds of “signals” used in attribution: ad interactions (e.g., view or click, corresponding to position 2 in attribution) and app activity (e.g., install, session, or event, corresponding to position 3 in attribution). On this basis, we classify cheating into faking ad interactions and faking in-app user activity. The former is called Spoofed Attribution and the latter Spoofed Users.

image

  • Type 1 is entirely real traffic, i.e., real user interactions with the app that were driven by the ad.
  • Type 2 is faked attribution, where the cheater fakes an ad interaction for a real user. The purpose is to steal credit for the user’s organic interaction with the app or for the effect of a genuine ad. This type of faking is also known as “poaching” or “traffic theft”.
  • Types 3 and 4 are fake users. This type of cheating focuses on simulating users’ in-app activity. By faking app installs and events for non-existent users, cheaters steal advertising budgets that pay per app conversion. “Plug-ins”, “virtual bots”, and any other technique involving “fake users” fall into this category.

Fake Attribution

Fake attribution, also known as Attribution Fraud, Spoofed Attribution, Attribution Cheating, and Attribution Robbery, exploits loopholes in the attribution logic: by sending fake impressions or clicks, the cheater hijacks conversions generated by real users.

Click Fraud (Click Spamming)

Click Fraud, also called Click Stuffing or Click Flooding (and, in literal translation of the Chinese terms, Big Click, Pre-Click, or Crash Bank), means faking a massive number of ad impressions or clicks and then waiting for users to install the app on their own. Under the Last Click principle, an install that happens within N days of a click is credited to the channel that delivered the click, so the cheater captures attribution that belongs to other channels or to organic installs.

The fraudulent app may perform clicks while the user is using it, or while it is running in the background (e.g., as a launcher or battery saver). The app may even report impressions as clicks to present fake ad interactions, all without the user’s intention or knowledge.

image

Forms of click fraud:

  • Ad stacking: multiple ads are layered in a single ad slot, with only the top ad visible. Every ad in the stack is billed for the impression or click on that slot, so fraudsters who place stacked ads into programmatic campaigns earn revenue for ads the user never saw. The app silently loads and clicks the ads in the background.

image

  • Views as clicks, or “pre-caching”: impressions are reported to the attribution platform as clicks, or the click is sent before the ad is even displayed.
  • Server-to-server clicks: the fraudster obtains traffic from an ad exchange (ADX) and has its servers send fabricated click events directly to the third-party attribution platform.

These forms share a common characteristic: the user does not actually intend to interact with the ad and is not interested in downloading the advertised app.

Dependent conditions

  • Rich ad inventory: because click fraud is mainly about stealing organic traffic, the cheater needs ad offers for apps that have a relatively large number of organic downloads.
  • A massive number of devices and a large volume of traffic, in order to reach active devices.

Identification method

  • CTIT (click-to-install time) distribution skewed toward long values
  • low click-to-install conversion rate
  • high multi-touch contributor rate
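As a rough sketch, the first two indicators could be computed per channel as shown below. The function name and both thresholds (`min_cvr`, `max_long_share`) are hypothetical values chosen for illustration; the 24-hour cutoff for a “long” CTIT follows the rule of thumb given later in this article.

```python
from typing import Dict, List

LONG_CTIT_SECONDS = 24 * 3600  # "long" CTIT: 24 hours and beyond


def click_flood_signals(clicks: int, ctits_seconds: List[float],
                        min_cvr: float = 0.001,
                        max_long_share: float = 0.3) -> Dict[str, object]:
    """Summarize one channel; ctits_seconds holds one CTIT per attributed install."""
    installs = len(ctits_seconds)
    # Click-to-install conversion rate of the channel.
    cvr = installs / clicks if clicks else 0.0
    # Share of installs whose CTIT is long.
    long_count = sum(1 for t in ctits_seconds if t >= LONG_CTIT_SECONDS)
    long_share = long_count / installs if installs else 0.0
    return {
        "cvr": cvr,
        "long_ctit_share": long_share,
        # Hypothetical rule: flag only when both indicators point the same way.
        "suspicious": cvr < min_cvr and long_share > max_long_share,
    }
```

A channel that sends 100,000 clicks for a handful of installs, most of them arriving more than a day after the click, would be flagged; the thresholds themselves would have to be tuned on real campaign data.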

image

Click Injection

Click Injection, also known as Click Hijacking, Install Hijacking, or Small Click, is when a cheater “listens” for the installation broadcasts of other apps through an app already installed on the user’s device. When a new app is installed on the device, the cheater’s app is notified and sends a fake click, exploiting the attribution model to hijack the install before it is completed. It is characterized by a very short click-to-install time, with the app store recording a download time that is earlier than the ad click.

image

If we know when an app is being downloaded or installed and send a “click” to the third-party attribution platform at that moment, the click is very likely to win attribution under the Last Click principle, because it is the click closest in time to the app’s activation. Android provides exactly such a mechanism: when an app is installed, the system broadcasts an installation message (android.intent.action.PACKAGE_ADDED) to apps that have registered a receiver for it in their manifest file (AndroidManifest.xml). Once the ad SDK has the installation information (chiefly the package name of the app), it fetches the matching ad from the ad backend by package name and passes the user device information and media information to the third-party attribution platform as a “virtual click”.

Dependent conditions

  • Rich ad inventory: after receiving the installation broadcast, the SDK requests the matching ad from the ad backend by package name in real time and then sends the “simulated click”. Without a matching ad, it would not even know which ad click to send to the third-party attribution platform.
  • The ability to receive the system’s app installation broadcasts (or to learn of Google Play download events), which is how the cheater knows when an app is installed. The affiliate SDK’s traffic coverage should also be wide so that more installs can be grabbed. This practice became so rampant that some small ad affiliates only needed publishers to integrate their SDK, without displaying any ads, to earn revenue.

Identification method

  • CTIT (click-to-install time) distribution skewed toward very short values
  • high click-to-install conversion rate

image

Our filtering method varies slightly depending on the source of the installation.

  • Google Play and Huawei: the Google and Huawei referrer APIs provide timestamps (such as install_begin) that can be used to check whether click hijacking has occurred. The SDK also collects the install_finish_time timestamp as a second layer of filtering.
  • Installs from other channels: installs from outside Google Play and Huawei AppGallery have no referrer API and therefore no install_begin timestamp. To filter such installs, we rely on the install_finish_time timestamp: clicks received after install_finish_time are considered fraudulent and rejected.
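The two filtering layers above can be sketched as a single check. This is only an illustration: the function name is hypothetical, timestamps are assumed to be Unix seconds, and rejecting clicks that arrive after install_begin is our reading of how the referrer timestamp would be used, not a documented rule.

```python
from typing import Optional


def is_injected_click(click_ts: int, install_finish_ts: int,
                      install_begin_ts: Optional[int] = None) -> bool:
    """Return True if the click should be rejected as click injection."""
    # Layer 1 (stores with a referrer API): a click that arrives after the
    # download has already begun cannot have caused it (assumed rule).
    if install_begin_ts is not None and click_ts > install_begin_ts:
        return True
    # Layer 2 (all sources): a click received after the install finished
    # is considered fraudulent.
    return click_ts > install_finish_ts
```

For installs from other channels, `install_begin_ts` is simply left as `None`, so only the install_finish_time layer applies.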

Fake Users

Fake users show up as fake in-app activity. We have been able to detect emulators, device farms, and SDK spoofing. In the first cases of fake users we identified, fraudsters were running emulators on cloud computing services to mimic real users’ use of Android apps. We also identified iOS device farms in Southeast Asian countries that generated fake app activity with real devices and real people.

Simulators (Bots)

Simulator (bot) cheating uses automated scripts or computer programs to simulate real users’ clicks, downloads, installs, and even in-app behavior. Disguised as real users, the bots drain advertisers’ CPI/CPA budgets.

image

Typical features are an abnormal IP distribution, a high new-device rate, abnormal user behavior, and an abnormal distribution of device models, OS versions, and install times.

Device Farms

A device farm is an operation in which cheaters acquire a large number of real devices to perform ad clicks, downloads, installs, and in-app behavior, and hide the devices’ identity by, for example, modifying their advertising IDs.

image

Device farmers use a variety of tactics to hide their activity, including hiding behind new IP addresses, using a variety of devices with Limit Ad Tracking enabled, or resorting to DeviceID reset fraud (resetting the DeviceID with each installation). When carried out on a large scale, this fraud is also known as a DeviceID Reset Marathon.

image

Typical features are an abnormal IP distribution, a high new-device rate, abnormal user behavior, and an abnormal distribution of device models and OS versions.

SDK Spoofing

SDK Spoofing uses data from real devices to send fake clicks and installs, without any actual installs taking place, in order to consume advertisers’ budgets. The cheater performs a “man-in-the-middle attack” to break the communication protocol of the third-party SDK, including its encryption and hash signature, which in turn has led to an arms race between cheaters and security researchers.

Its typical feature is a mismatch between the advertiser’s backend data and the third-party data.
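One simple way to surface such a mismatch is to reconcile the installs reported by the attribution platform against the users the advertiser’s own backend has actually seen. The sketch below assumes both sides can be keyed by a shared device or user ID; the function names are hypothetical.

```python
from typing import Set


def unmatched_installs(attributed_ids: Set[str], backend_ids: Set[str]) -> Set[str]:
    """IDs the attribution platform reported but the advertiser backend never saw."""
    return attributed_ids - backend_ids


def mismatch_rate(attributed_ids: Set[str], backend_ids: Set[str]) -> float:
    """Share of attributed installs that have no counterpart in the backend."""
    if not attributed_ids:
        return 0.0
    return len(unmatched_installs(attributed_ids, backend_ids)) / len(attributed_ids)
```

A channel whose attributed installs rarely appear in the backend (no registrations, no logins) deserves a closer look, although some gap is normal because not every install leads to backend activity.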

Anti-cheat methods

Anonymous IPs

Anonymous IP filters protect the integrity of app tracking data against fraudulent install activity coming from VPNs, Tor exit nodes, or data centers. Some fraudsters use emulation software to fake installs and place the fraudulent conversions in high-value markets for profit; these are the fraudsters that anonymous IP filters target.
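A minimal sketch of such a filter, using Python’s standard `ipaddress` module, is shown below. The network list is a placeholder built from documentation-only address ranges; in practice it would be loaded from a maintained list of VPN, Tor, and data-center ranges.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), standing in for a real
# list of VPN, Tor exit node, and data-center networks.
ANONYMOUS_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in ("192.0.2.0/24", "198.51.100.0/24")
]


def is_anonymous_ip(ip: str) -> bool:
    """Return True if the install's IP falls inside a listed anonymous network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ANONYMOUS_NETWORKS)
```

Installs for which `is_anonymous_ip` returns True could then be rejected or sent for review, depending on how strict the advertiser wants the filter to be.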

Click to install time

Click to install time (CTIT) is the time between two timestamps in a user’s journey: their initial ad interaction and their first app launch. CTIT values are analyzed as a distribution, which is typically modeled as a gamma distribution.

image

CTIT can be used to identify different cases of click-based fraud:

  • Short CTIT (less than 10 seconds): possible install hijacking (click injection)
  • Long CTIT (24 hours and beyond): possible click flooding (big click)
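The two thresholds above translate directly into a per-install check. The sketch below assumes timestamps in Unix seconds; the function name, the label strings, and the handling of a negative CTIT are illustrative choices rather than part of any platform’s API.

```python
SHORT_CTIT_SECONDS = 10          # less than 10 seconds
LONG_CTIT_SECONDS = 24 * 3600    # 24 hours and beyond


def classify_ctit(click_ts: int, first_launch_ts: int) -> str:
    """Label one install by its click-to-install time."""
    ctit = first_launch_ts - click_ts
    if ctit < 0:
        # Click recorded after the first launch: treated as invalid here (assumption).
        return "invalid"
    if ctit < SHORT_CTIT_SECONDS:
        return "possible_install_hijacking"
    if ctit >= LONG_CTIT_SECONDS:
        return "possible_click_flooding"
    return "normal"
```

A single install with an unusual CTIT proves little; the labels become meaningful when their share per channel is compared with the normal distribution.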

New device rate

The new device rate (NDR) is the percentage of devices downloading the advertiser’s app that are new, i.e., whose device ID has not been measured before.

It is of course normal to see new devices, since new users install the app and existing users change devices. However, advertisers should keep a close eye on what new device rate is acceptable for their campaigns, because the rate is determined by newly measured device IDs. As such, it can be manipulated by device ID reset fraud, which is common in device farms.
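Computed per campaign or channel, the rate is a simple ratio. The sketch below assumes the platform keeps a set of previously seen device IDs; the function and parameter names are hypothetical.

```python
from typing import Iterable, Set


def new_device_rate(install_device_ids: Iterable[str],
                    known_device_ids: Set[str]) -> float:
    """Fraction of installs whose device ID has never been seen before."""
    ids = list(install_device_ids)
    if not ids:
        return 0.0
    new_count = sum(1 for device_id in ids if device_id not in known_device_ids)
    return new_count / len(ids)
```

Because a reset device ID always looks new, a channel whose NDR sits far above the advertiser’s usual baseline is a candidate for device ID reset fraud.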

image

Sensors

Device sensors can collect hundreds of indicators, ranging from the device’s battery level to its tilt angle, and can be used for biometric behavior analysis.

image

These metrics help create a profile for each installation: the device and user behavior behind each install is analyzed and compared with the normal patterns measured from real users.

Limit ad tracking

Limit ad tracking (LAT) is a privacy feature that allows users to limit the data advertisers receive about their device-generated activity. When a user enables LAT, advertisers and their measurement solutions receive a blank device ID, rather than a device-specific ID.

This metric is only relevant for Google and iOS advertising identifiers; Amazon, Xiaomi, and others use other identifiers.

Conversion rates

Conversion rates describe the conversion of one action to another, which could mean an ad impression converting to a click, a click converting to an install, or an install converting to an active user. Knowing the expected conversion rate at each point in the user’s journey helps advertisers prevent fraud from slipping in.

A conversion rate that is too good to be true should be treated as a sign of possible cheating.
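As an illustration, the three funnel rates mentioned above can be computed and compared against the advertiser’s own expectations. The function names and the idea of a per-step ceiling are assumptions made for this sketch; the ceilings themselves must come from the advertiser’s historical data.

```python
from typing import Dict, List


def funnel_rates(impressions: int, clicks: int,
                 installs: int, active_users: int) -> Dict[str, float]:
    """Conversion rate of each step in the impression-to-active-user funnel."""
    def rate(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else 0.0

    return {
        "click_through": rate(clicks, impressions),
        "click_to_install": rate(installs, clicks),
        "install_to_active": rate(active_users, installs),
    }


def rates_above_ceiling(rates: Dict[str, float],
                        ceilings: Dict[str, float]) -> List[str]:
    """Names of the funnel steps whose rate exceeds the expected maximum."""
    return [name for name, value in rates.items()
            if value > ceilings.get(name, 1.0)]
```

For example, a channel where 75% of clicks turn into installs would be flagged against a ceiling of 30% for that step.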

Artificial Intelligence

Artificial intelligence has become a common tool for fraud detection because it allows fraud-identification logic to be applied at large scale. It helps flag cases at a scale that humans cannot track.

Machine learning algorithms (e.g., Bayesian networks) combined with large mobile attribution databases can provide an efficient and accurate fraud detection solution.
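To give a feel for the Bayesian idea, the sketch below combines independent fraud signals with Bayes’ rule, which corresponds to a naive Bayes model, the simplest form of Bayesian network. It is not a production model: the function name is hypothetical, and the prior and likelihoods in the usage example are made-up numbers that would in practice be estimated from labeled attribution data.

```python
from typing import Dict, Iterable, Tuple


def fraud_posterior(prior: float,
                    likelihoods: Dict[str, Tuple[float, float]],
                    observed: Iterable[str]) -> float:
    """P(fraud | observed signals), assuming signals are conditionally independent.

    likelihoods maps signal name -> (P(signal | fraud), P(signal | legitimate)).
    """
    p_fraud = prior
    p_legit = 1.0 - prior
    for signal in observed:
        p_given_fraud, p_given_legit = likelihoods[signal]
        p_fraud *= p_given_fraud
        p_legit *= p_given_legit
    return p_fraud / (p_fraud + p_legit)


# Made-up illustrative numbers, not measured values.
example_likelihoods = {
    "short_ctit": (0.8, 0.05),
    "anonymous_ip": (0.6, 0.1),
}
score = fraud_posterior(0.1, example_likelihoods, ["short_ctit", "anonymous_ip"])
```

Each additional signal that is much more likely under fraud than under legitimate traffic pushes the posterior up, which is how weak individual indicators such as CTIT, IP type, and new device rate can be combined into one score.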