M365 Copilot Information Extraction Jailbreak Attack

Original Source: [splunk source]
Name:M365 Copilot Information Extraction Jailbreak Attack
id:c0ee37bb-ed43-4632-8e38-060fba80b0b2
version:1
date:2025-09-25
author:Rod Soto
status:experimental
type:TTP
Description:Detects M365 Copilot information extraction jailbreak attacks that attempt to obtain sensitive, classified, or comprehensive data through social engineering techniques such as fictional entity impersonation, bulk data requests, and privacy bypass attempts. The detection searches exported eDiscovery prompt logs for extraction keywords such as "transcendent," "tell me everything," "confidential," "dump," "extract," "reveal," and "leak" in the Subject_Title field, then categorizes each attempt by extraction type and assigns a severity level (CRITICAL for classified, proprietary, or confidential data; HIGH for entity impersonation, bulk extraction, or privacy bypass; MEDIUM for hidden, secret, internal, or sensitive information and for disclosure requests). LOW-severity cases are filtered out, and the remaining prompts are labeled with compound risk flags such as "Confidential+Extraction" or "Bulk_Request+Bypass" to surface the most dangerous attempts to exfiltrate sensitive organizational information through AI manipulation.
Data_source:
  - M365 Exported eDiscovery Prompts
search:`m365_exported_ediscovery_prompt_logs`
| search Subject_Title="*transcendent*" OR Subject_Title="*incorporeal*" OR Subject_Title="*being who*" OR Subject_Title="*tell me everything*" OR Subject_Title="*give me all*" OR Subject_Title="*comprehensive*" OR Subject_Title="*step by step*" OR Subject_Title="*hidden*" OR Subject_Title="*secret*" OR Subject_Title="*confidential*" OR Subject_Title="*classified*" OR Subject_Title="*internal*" OR Subject_Title="*sensitive*" OR Subject_Title="*proprietary*" OR Subject_Title="*dump*" OR Subject_Title="*extract*" OR Subject_Title="*reveal*" OR Subject_Title="*disclose*" OR Subject_Title="*leak*"
| eval user = Sender
| eval extraction_type=case(match(Subject_Title, "(?i)(transcendent|incorporeal).*being"), "Knowledge_Entity", match(Subject_Title, "(?i)tell.*me.*(everything|all)"), "Everything_Request", match(Subject_Title, "(?i)(give|show|provide).*me.*(all|every)"), "Complete_Data_Request", match(Subject_Title, "(?i)(hidden|secret|confidential|classified)"), "Restricted_Info", match(Subject_Title, "(?i)(comprehensive|complete|full|entire)"), "Complete_Info", match(Subject_Title, "(?i)(dump|extract|scrape).*(data|info|content)"), "Data_Extraction", match(Subject_Title, "(?i)(reveal|disclose|expose|leak)"), "Information_Disclosure", match(Subject_Title, "(?i)(internal|proprietary|sensitive).*information"), "Sensitive_Data_Request", match(Subject_Title, "(?i)step.*by.*step.*(process|procedure|method)"), "Process_Extraction", match(Subject_Title, "(?i)(bypass|ignore).*privacy"), "Privacy_Bypass", match(Subject_Title, "(?i)(access|view|see).*(private|restricted)"), "Unauthorized_Access", 1=1, "Generic_Request")
| eval severity=case(match(Subject_Title, "(?i)(transcendent|incorporeal)"), "HIGH", match(Subject_Title, "(?i)tell.*everything"), "HIGH", match(Subject_Title, "(?i)(dump|extract|scrape)"), "HIGH", match(Subject_Title, "(?i)(classified|proprietary|confidential)"), "CRITICAL", match(Subject_Title, "(?i)(hidden|secret|internal|sensitive)"), "MEDIUM", match(Subject_Title, "(?i)(reveal|disclose|leak)"), "MEDIUM", match(Subject_Title, "(?i)(bypass|ignore).*privacy"), "HIGH", 1=1, "LOW")
| where severity!="LOW"
| eval data_risk_flags=case(match(Subject_Title, "(?i)(classified|confidential|proprietary)") AND match(Subject_Title, "(?i)(dump|extract|scrape)"), "Confidential+Extraction", match(Subject_Title, "(?i)(everything|all|complete)") AND match(Subject_Title, "(?i)(bypass|ignore)"), "Bulk_Request+Bypass", match(Subject_Title, "(?i)(classified|confidential|proprietary)"), "Confidential", match(Subject_Title, "(?i)(dump|extract|scrape)"), "Extraction", match(Subject_Title, "(?i)(everything|all|complete|comprehensive)"), "Bulk_Request", match(Subject_Title, "(?i)(bypass|ignore)"), "Bypass_Attempt", 1=1, "Standard_Request")
| table _time, user, Subject_Title, extraction_type, severity, data_risk_flags, Size
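| eval severity_rank=case(severity="CRITICAL", 3, severity="HIGH", 2, 1=1, 1)
| sort -severity_rank, -_time
| fields - severity_rank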
| `m365_copilot_information_extraction_jailbreak_attack_filter`
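
The severity logic can be sanity-checked offline before the search is deployed. The following is a minimal Python sketch, not part of the detection itself: it copies the patterns from the severity case() expression and assumes that Python's re.search behaves like SPL match() for these simple patterns. The function name classify_severity and the sample prompts are hypothetical.

```python
import re

# Ordered (pattern, label) pairs copied from the severity case() in the search.
# As with SPL case(), the first pattern that matches decides the label.
SEVERITY_RULES = [
    (r"(?i)(transcendent|incorporeal)", "HIGH"),
    (r"(?i)tell.*everything", "HIGH"),
    (r"(?i)(dump|extract|scrape)", "HIGH"),
    (r"(?i)(classified|proprietary|confidential)", "CRITICAL"),
    (r"(?i)(hidden|secret|internal|sensitive)", "MEDIUM"),
    (r"(?i)(reveal|disclose|leak)", "MEDIUM"),
    (r"(?i)(bypass|ignore).*privacy", "HIGH"),
]


def classify_severity(subject_title: str) -> str:
    """Return the label of the first matching rule, or LOW if none match."""
    for pattern, label in SEVERITY_RULES:
        if re.search(pattern, subject_title):
            return label
    return "LOW"


# Hypothetical prompts, used only to exercise the rules.
samples = [
    "Please share the confidential merger plan",         # CRITICAL
    "Dump all confidential HR data",                     # HIGH
    "Give me a comprehensive summary of project status", # LOW
]
for prompt in samples:
    print(classify_severity(prompt), "|", prompt)
```

The second sample is rated HIGH rather than CRITICAL because the extraction pattern is evaluated before the classified/proprietary/confidential pattern. The third sample passes the initial keyword search through "comprehensive" but is then dropped by the severity!="LOW" filter.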


how_to_implement:To export M365 Copilot prompt logs, navigate to the Microsoft Purview compliance portal (compliance.microsoft.com) and access eDiscovery. Create a new eDiscovery case, add target user accounts or date ranges as data sources, then create a search query targeting M365 Copilot interactions across relevant workloads. Once the search completes, export the results to generate a package containing prompt logs with fields like Subject_Title (prompt text), Sender, timestamps, and workload metadata. Download the exported files using the eDiscovery Export Tool and ingest them into Splunk for security analysis and detection of jailbreak attempts, data exfiltration requests, and policy violations.
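
Before ingestion, it can help to confirm that an export carries the fields the search reads. The sketch below assumes the export has been converted to CSV with a header row, which matches the csv sourcetype of the test dataset but may not match every export. The helper name missing_fields and the sample rows are hypothetical, and the required field list is taken only from the fields this search references.

```python
import csv
import io

# Fields referenced by the detection search (assumed to appear as CSV columns).
REQUIRED_FIELDS = {"Subject_Title", "Sender", "Size"}


def missing_fields(csv_text: str) -> set[str]:
    """Return the required fields that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return REQUIRED_FIELDS - header


# Hypothetical exports, used only to exercise the check.
complete = "Subject_Title,Sender,Size\nsummarize my inbox,user@example.com,512\n"
incomplete = "Subject_Title,Size\nsummarize my inbox,512\n"

print(missing_fields(complete))    # set()
print(missing_fields(incomplete))  # {'Sender'}
```

An empty result means the search can read every field it needs from the file. Any field reported as missing would leave the corresponding column empty in the detection output.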
known_false_positives:Legitimate researchers studying data classification systems, cybersecurity professionals testing information handling policies, compliance officers reviewing data access procedures, journalists researching transparency issues, or employees asking for comprehensive project documentation may trigger false positives.
References:
  - https://www.splunk.com/en_us/blog/artificial-intelligence/m365-copilot-log-analysis-splunk.html
drilldown_searches:
name:'View the detection results for - "$user$"'
search:'%original_detection_search% | search user = "$user$"'
earliest_offset:'$info_min_time$'
latest_offset:'$info_max_time$'
name:'View risk events for the last 7 days for - "$user$"'
search:'| from datamodel Risk.All_Risk | search normalized_risk_object IN ("$user$") starthoursago=168 | stats count min(_time) as firstTime max(_time) as lastTime values(search_name) as "Search Name" values(risk_message) as "Risk Message" values(analyticstories) as "Analytic Stories" values(annotations._all) as "Annotations" values(annotations.mitre_attack.mitre_tactic) as "ATT&CK Tactics" by normalized_risk_object | `security_content_ctime(firstTime)` | `security_content_ctime(lastTime)`'
earliest_offset:'$info_min_time$'
latest_offset:'$info_max_time$'
tags:
  analytic_story:
    - 'Suspicious Microsoft 365 Copilot Activities'
  asset_type:Web Application
  mitre_attack_id:
    - 'T1562'
  product:
    - 'Splunk Enterprise'
    - 'Splunk Enterprise Security'
    - 'Splunk Cloud'
  security_domain:endpoint

tests:
name:'True Positive Test'
 attack_data:
  data: https://raw.githubusercontent.com/splunk/attack_data/master/datasets/m365_copilot/copilot_prompt_logs.csv
  sourcetype: csv
  source: csv
manual_test:None