Name:Kubernetes Process with Resource Ratio Anomalies
id:0d42b295-0f1f-4183-b75e-377975f47c65
version:4
date:2024-10-17
author:Matthew Moore, Splunk
status:experimental
type:Anomaly
Description:The following analytic detects anomalous changes in resource utilization ratios for processes running on a Kubernetes node. It leverages process metrics collected via an OTEL collector and hostmetrics receiver, analyzed through Splunk Observability Cloud. The detection uses a lookup table containing average and standard deviation values for various resource ratios (e.g., CPU:memory, CPU:disk operations). Significant deviations from these baselines may indicate compromised processes, malicious activity, or misconfigurations. If confirmed malicious, this could signify a security breach, allowing attackers to manipulate workloads, potentially leading to data exfiltration or service disruption.
Data_source:
how_to_implement:To implement this detection, follow these steps:
* Deploy the OpenTelemetry Collector (OTEL) to your Kubernetes cluster.
* Enable the hostmetrics/process receiver in the OTEL configuration.
* Ensure that the process metrics, specifically process.cpu.utilization and process.memory.utilization, are enabled.
* Install the Splunk Infrastructure Monitoring (SIM) add-on. (ref: https://splunkbase.splunk.com/app/5247)
* Configure the SIM add-on with your Observability Cloud Organization ID and Access Token.
* Set up the SIM modular input to ingest Process Metrics. Name this input "sim_process_metrics_to_metrics_index".
* In the SIM configuration, set the Organization ID to your Observability Cloud Organization ID.
* Set the SignalFlow program to the following: data('process.threads').publish(label='A'); data('process.cpu.utilization').publish(label='B'); data('process.cpu.time').publish(label='C'); data('process.disk.io').publish(label='D'); data('process.memory.usage').publish(label='E'); data('process.memory.virtual').publish(label='F'); data('process.memory.utilization').publish(label='G'); data('process.cpu.utilization').publish(label='H'); data('process.disk.operations').publish(label='I'); data('process.handles').publish(label='J'); data('process.threads').publish(label='K')
* Set the Metric Resolution to 10000.
* Leave all other settings at their default values.
* Run the Search Baseline Of Kubernetes Container Network IO Ratio
known_false_positives:unknown
References:
- https://github.com/signalfx/splunk-otel-collector-chart
drilldown_searches:
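The SignalFlow program in the implementation steps is a semicolon-separated chain of `data(...).publish(label=...)` statements, one per metric. A minimal Python sketch of how such a string could be assembled is shown here. The helper `build_signalflow` and its `dedupe` option are hypothetical and not part of the SIM add-on. The metric list is copied from the step above, including its repeated entries (`process.threads` and `process.cpu.utilization` each appear twice). The source does not say whether those repeats are required, so deduplication is off by default.

```python
from string import ascii_uppercase

# Metric names copied, in order, from the SignalFlow program in the steps above.
METRICS = [
    "process.threads",
    "process.cpu.utilization",
    "process.cpu.time",
    "process.disk.io",
    "process.memory.usage",
    "process.memory.virtual",
    "process.memory.utilization",
    "process.cpu.utilization",
    "process.disk.operations",
    "process.handles",
    "process.threads",
]


def build_signalflow(metrics, dedupe=False):
    """Hypothetical helper: join one publish statement per metric.

    Labels are assigned alphabetically (A, B, C, ...) in list order.
    With dedupe=True, repeated metric names are dropped, keeping first use.
    """
    if dedupe:
        metrics = list(dict.fromkeys(metrics))  # preserves original order
    if len(metrics) > len(ascii_uppercase):
        raise ValueError("more metrics than single-letter labels")
    statements = [
        f"data('{name}').publish(label='{label}')"
        for name, label in zip(metrics, ascii_uppercase)
    ]
    return "; ".join(statements)


if __name__ == "__main__":
    print(build_signalflow(METRICS))
```

Calling `build_signalflow(METRICS)` yields the same eleven statements listed in the step, labelled A through K. Passing `dedupe=True` yields nine statements, which may or may not suit the downstream search.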
tags:
analytic_story: - 'Abnormal Kubernetes Behavior using Splunk Infrastructure Monitoring'
asset_type:Kubernetes
confidence:50
impact:50
message:Kubernetes Process with Resource Ratio Anomalies on host $host$
mitre_attack_id: - 'T1204'
observable: name:'host' type:'Hostname' - role: - 'Victim'
product: - 'Splunk Enterprise' - 'Splunk Enterprise Security' - 'Splunk Cloud'
required_fields: - 'process.*' - 'host.name' - 'k8s.cluster.name' - 'k8s.node.name' - 'process.executable.name'
risk_score:25
security_domain:network
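The Description says the detection compares resource ratios against a lookup of average and standard deviation values. The following Python sketch illustrates that general idea only; it is not the analytic's actual search. The `Baseline` type, the function names, and the threshold of 4 standard deviations are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Baseline:
    """Hypothetical baseline row: mean and standard deviation of one ratio."""
    avg: float
    stdev: float


def ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator/denominator, or None when the ratio is undefined."""
    if denominator == 0:
        return None
    return numerator / denominator


def is_anomalous(value: Optional[float], baseline: Baseline,
                 threshold: float = 4.0) -> bool:
    """Flag a ratio that lies more than `threshold` standard deviations
    from the baseline average. The default threshold is an assumption."""
    if value is None or baseline.stdev == 0:
        return False  # nothing meaningful to compare against
    return abs(value - baseline.avg) > threshold * baseline.stdev


if __name__ == "__main__":
    # Illustrative numbers only, e.g. a CPU:memory utilization ratio.
    cpu_mem_baseline = Baseline(avg=2.0, stdev=0.5)
    observed = ratio(0.9, 0.1)
    print(is_anomalous(observed, cpu_mem_baseline))
```

A two-sided check is used here so that both unusually high and unusually low ratios are flagged; a one-sided check would be an equally plausible design.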