Web Fraud - Account Harvesting

Original Source: [splunk source]
Name:Web Fraud - Account Harvesting
id:bf1d7b5c-df2f-4249-a401-c09fdc221ddf
version:3
date:2024-10-17
author:Jim Apger, Splunk
status:deprecated
type:TTP
Description:This search is used to identify the creation of multiple user accounts using the same email domain name.
Data_source:
search:`stream_http` http_content_type=text* uri="/magento2/customer/account/loginPost/"
| rex field=cookie "form_key=(?<SessionID>\w+)"
| rex field=form_data "login\[username\]=(?<Username>[^&|^$]+)"
| search Username=*
| rex field=Username "@(?<email_domain>.*)"
| stats dc(Username) as UniqueUsernames list(Username) as src_user by email_domain
| where UniqueUsernames> 25
| `web_fraud___account_harvesting_filter`


how_to_implement:We start with a dataset that provides visibility into the email address used for the account creation. In this example, we are narrowing our search down to the single web page that hosts the Magento2 e-commerce platform (via URI) used for account creation, the single http content-type to grab only the user's clicks, and the http field that provides the username (form_data), for performance reasons. After we have the username and email domain, we look for numerous account creations per email domain. Common data sources used for this detection are customized Apache logs or Splunk Stream.
known_false_positives:As is common with many fraud-related searches, we are usually looking to attribute risk or synthesize relevant context with loosely written detections that simply detect anamolous behavior. This search will need to be customized to fit your environment&#151;improving its fidelity by counting based on something much more specific, such as a device ID that may be present in your dataset. Consideration for whether the large number of registrations are occuring from a first-time seen domain may also be important. Extending the search window to look further back in time, or even calculating the average per hour/day for each email domain to look for an anomalous spikes, will improve this search. You can also use Shannon entropy or Levenshtein Distance (both courtesy of URL Toolbox) to consider the randomness or similarity of the email name or email domain, as the names are often machine-generated.
References:
  -https://splunkbase.splunk.com/app/2734/
  -https://splunkbase.splunk.com/app/1809/
drilldown_searches:
  :
tags:
  analytic_story:
    - 'Web Fraud Detection'
  asset_type:Account
  confidence:50
  impact:50
  message:tbd
  mitre_attack_id:
    - 'T1136'
  observable:
    name:'src_user'
    type:'User'
    - role:
      - 'Victim'
  product:
    - 'Splunk Enterprise'
    - 'Splunk Enterprise Security'
    - 'Splunk Cloud'
  required_fields:
    - '_time'
    - 'http_content_type'
    - 'uri'
    - 'cookie'
  risk_score:25
  security_domain:threat

tests:
  :
manual_test:None

Related Analytic Stories


Web Fraud Detection