Backend Analysis of ML in SIEM Tools #5

Satish-970 · 2025-09-27T18:15:29Z

Satish-970
Sep 27, 2025
Maintainer

Team, as part of our project, we need a clear understanding of how existing SIEM platforms implement their ML pipelines. Please focus on:

1. Analyzing ML Models in Other SIEMs

Identify the types of ML models used (anomaly detection, classification, time-series/sequence analysis).

Understand the algorithms behind these models (e.g., statistical methods, clustering, regression, neural networks).

2. Understanding Data Handling and Processing

Examine how these SIEM tools handle large-scale log and metric data.

Investigate techniques like batch processing, streaming pipelines, sliding windows, and distributed computation.

3. Documenting Insights

Prepare a concise report describing each ML model’s backend workflow.

Highlight strengths, weaknesses, and scalability approaches.
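To make the "sliding windows" item in point 2 concrete, here is a minimal sketch of a time-based sliding-window counter. It is not taken from any particular SIEM; the class name `SlidingWindowCounter`, the 60-second window, and the sample timestamps are all hypothetical, and it assumes events arrive in non-decreasing timestamp order.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events seen within the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def add(self, timestamp: float) -> int:
        """Record an event and return how many events are still in the window."""
        self._timestamps.append(timestamp)
        # Evict events that have aged out (assumes non-decreasing timestamps).
        cutoff = timestamp - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)


# Hypothetical event timestamps, in seconds.
counter = SlidingWindowCounter(window_seconds=60)
counts = [counter.add(t) for t in (0, 10, 20, 65, 130)]
print(counts)
```

A streaming pipeline would typically keep one such counter per key (user, source IP, host) and feed the counts to a model as features; a batch job would compute the same counts over stored logs after the fact.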

Objective: This analysis will inform the design of our own from-scratch hybrid ML pipeline, ensuring we adopt best practices and optimize for performance.

GOOD LUCK

allenjose24 · 2025-09-27T18:21:10Z

allenjose24
Sep 27, 2025
Maintainer

Sounds good. I’ll dig into how different SIEMs are actually applying ML — what types of models they use (like anomaly detection, classification, or time-series analysis) and the algorithms behind them. I’ll also look at how they handle large volumes of log data, whether through streaming, batch jobs, or distributed setups.

Once I’ve gone through this, I’ll put together a short report that covers the workflows, the strengths/weaknesses I notice, and what we can borrow for our own pipeline. This should give us a clearer picture before we start shaping our hybrid ML design.


nikhilreddy1832 · 2025-10-07T16:26:12Z

nikhilreddy1832
Oct 7, 2025
Collaborator

Insights from Prototype 1:

1) The model found 4 main patterns (topics) in the data:
Login activity on AD Server

Firewall traffic (allow/deny and port scans)

File access on endpoint systems

Web requests on the server

2) Some log entries looked suspicious, like:

Many failed logins before success → possible brute-force attempt

Privilege escalation by admin → unusual system behavior

File delete/write actions on confidential files → insider risk

Access to /../etc/passwd → path traversal attempt against the web server
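As an illustration of the first pattern (many failed logins before a success), a simple rule-based check could look like the sketch below. This is not how Prototype 1 finds these entries; the function name `find_bruteforce_candidates`, the `(user, outcome)` event shape, the threshold of 5 failures, and the sample events are all assumptions made for the example.

```python
from collections import defaultdict


def find_bruteforce_candidates(events, min_failures=5):
    """Return (user, failure_count) for each successful login that was
    preceded by at least `min_failures` consecutive failures for that user."""
    streak = defaultdict(int)  # consecutive failures per user
    flagged = []
    for user, outcome in events:
        if outcome == "failure":
            streak[user] += 1
        elif outcome == "success":
            if streak[user] >= min_failures:
                flagged.append((user, streak[user]))
            streak[user] = 0  # a success resets the streak
    return flagged


# Hypothetical, already-parsed login events in chronological order.
events = (
    [("alice", "failure")] * 6
    + [("alice", "success"), ("bob", "failure"), ("bob", "success")]
)
bruteforce_hits = find_bruteforce_candidates(events)
print(bruteforce_hits)
```

A rule like this can serve as a baseline to compare against the model's anomaly scores for the same entries.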

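3) Perplexity score: 22.13. Lower is better, but the value is only meaningful relative to other models evaluated on the same data, so this should be read as a baseline rather than as proof of a good fit.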
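For context on the perplexity figure: perplexity is commonly defined as the exponential of the negative average log-likelihood per token, so a model that is uniformly unsure among k options scores exactly k. The sketch below shows that definition only; whether Prototype 1 computes its score this way (libraries differ in log base and normalization) is an assumption, and the `perplexity` helper and sample values are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Exponential of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# A model that assigns probability 1/4 to every token has perplexity 4.
uniform_over_4 = [math.log(0.25)] * 100
print(perplexity(uniform_over_4))
```

Because the scale depends on vocabulary size and preprocessing, the number is most useful for comparing runs on the same held-out logs, for example when varying the number of topics.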

4) Most logs scored as normal, but a few had high anomaly scores (> 0.8), indicating possible threats.
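The thresholding step described in point 4 can be sketched as follows. The log IDs and scores are made-up placeholders rather than prototype output, and `flag_anomalies` is a hypothetical helper; only the 0.8 cutoff comes from the result above.

```python
def flag_anomalies(scores, threshold=0.8):
    """Return (log_id, score) pairs strictly above the threshold, highest first."""
    hits = [(log_id, s) for log_id, s in scores.items() if s > threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Placeholder scores for illustration only.
sample_scores = {"log-001": 0.12, "log-002": 0.91, "log-003": 0.80, "log-004": 0.85}
high_risk = flag_anomalies(sample_scores)
print(high_risk)
```

A fixed cutoff is easy to explain but sensitive to how the scores are scaled; flagging the top percentile is one alternative worth comparing.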

Visualizations:

How logs are grouped into different topics

Which entries have higher anomaly scores

This project helped me understand how machine learning can detect security threats automatically.
It also shows how analyzing system logs can reveal attacks or unusual activity early.

