Detect, Alert, Search: The SIEM You Already Have on AWS

The question came up in a security assessment. "Do you have a SIEM?"

The honest answer was no. There was no budget for one either. A commercial SIEM meant a license, an ingestion bill that grew with every log source, and someone to run it. None of that was happening this quarter.

But the question behind the question was fair. Can you see what is happening in your AWS accounts? Would you know if a credential was being used from somewhere it should not be? Could you go back and check?

That is what a SIEM is for. And the account already had most of the parts to answer those questions. They just were not connected.

You do not buy a SIEM to get detection and searchable history on AWS. You connect four services you may already be paying for.

If you read the last article, this is the other half of it. The guardrails there kept an attacker from deleting CloudTrail or switching off GuardDuty. Here we use the logs and findings those guardrails protect.

What a SIEM Actually Does

Strip away the marketing and a SIEM does three things that matter here.

It collects signals from many sources and puts them in one format. It alerts you in real time when something important happens. And it keeps history you can search later, when you need to investigate.

Collect, alert, search. That is the job.

What we are not rebuilding is the rest of the brochure: a correlation engine that stitches twenty events into one story, user behavior analytics, prebuilt compliance dashboards, a case management workflow. Those are real features. Most teams asking "do we have a SIEM?" do not need them yet. They need to know when something fires, and be able to look back.

The Architecture

Four services, each doing one job.

GuardDuty is the sensor. It watches CloudTrail, VPC flow logs, and DNS queries, and raises findings when it sees credential abuse, cryptomining, or traffic to known-bad infrastructure. You do not write rules for it. You turn it on.

Security Hub is the part that makes this a SIEM instead of a pile of consoles. It ingests findings from GuardDuty, Inspector, IAM Access Analyzer, Macie, and its own security checks, and normalizes all of them into one schema, the AWS Security Finding Format (ASFF). That normalization is work a SIEM would otherwise charge you for.

EventBridge is the router. Security Hub emits an event every time it imports a finding. A rule decides which of those events are worth waking someone up for.

Lambda is the processor. It reads the finding, pulls out what a human needs, sends it to Slack, and writes the raw record to S3.

GuardDuty ─┐
Inspector ─┤
Config    ─┼─► Security Hub ──► EventBridge ──► Lambda ─┬─► Slack  (real-time alert)
Access    ─┤   normalize        filter by      format   │
Analyzer  ─┘   (ASFF)           severity                └─► S3 ──► Athena  (searchable history)

S3 plus Athena is the half people skip, and it is the half that earns the word SIEM. Security Hub keeps findings for about 90 days. The S3 archive is your real history, and Athena lets you query it with SQL when you need to answer "has this happened before?"

Setting It Up

The order matters. Sensor first, then aggregator, then the pipe.

If GuardDuty or Security Hub is already on in your account, skip ahead. Most of these steps are idempotent. The point is not turning things on, it is connecting what is already running.

Step 1. Turn on GuardDuty. This is the detection sensor. Without it, Security Hub has thin threat signal.

aws guardduty create-detector --enable --region us-east-1

The command returns a detector ID. Save it. You will use it to send test findings later.

Step 2. Enable Security Hub with its default standards. This turns on the aggregator and a baseline set of checks, and it wires GuardDuty findings in automatically.

aws securityhub enable-security-hub --enable-default-standards --region us-east-1

Give it a few hours. Findings do not appear instantly. Security Hub runs its checks and pulls in GuardDuty on its own schedule.

Step 3. Create the archive bucket, with public access blocked and versioning on. This is where the searchable history lives, so it should outlive any single finding.

BUCKET_NAME="your-account-id-siem-findings"

aws s3api create-bucket --bucket "$BUCKET_NAME" --region us-east-1

aws s3api put-public-access-block --bucket "$BUCKET_NAME" \
  --public-access-block-configuration \
  BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true

aws s3api put-bucket-versioning --bucket "$BUCKET_NAME" \
  --versioning-configuration Status=Enabled
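
The same three calls can be made from Python. This is a minimal sketch, assuming boto3 is installed and credentials are configured; the function name create_archive_bucket is hypothetical and not part of the companion repo. It also covers one detail the CLI example sidesteps by using us-east-1: every other region needs an explicit LocationConstraint, while us-east-1 must not be given one.

```python
def create_archive_bucket(s3, bucket_name, region="us-east-1"):
    """Create a private, versioned bucket using a boto3 S3 client.

    `s3` is expected to be boto3.client("s3", region_name=region).
    The function name and signature are illustrative assumptions.
    """
    params = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed as a
    # LocationConstraint; any other region has to be named explicitly.
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    s3.create_bucket(**params)

    # Same four flags as the put-public-access-block CLI call.
    s3.put_public_access_block(
        Bucket=bucket_name,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )

    # Versioning keeps earlier copies when a finding is rewritten.
    s3.put_bucket_versioning(
        Bucket=bucket_name,
        VersioningConfiguration={"Status": "Enabled"},
    )


# Usage (hypothetical bucket name, needs real AWS credentials):
#   import boto3
#   create_archive_bucket(boto3.client("s3", region_name="eu-west-1"),
#                         "123456789012-siem-findings", "eu-west-1")
```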

Step 4. Wire the EventBridge rule. The piece that decides what is worth an alert is the event pattern. This is the one to get right.

{
  "source": ["aws.securityhub"],
  "detail-type": ["Security Hub Findings - Imported"],
  "detail": {
    "findings": {
      "Severity": { "Label": ["HIGH", "CRITICAL"] },
      "Workflow": { "Status": ["NEW"] },
      "RecordState": ["ACTIVE"]
    }
  }
}

Read it from the inside out. RecordState ACTIVE ignores findings Security Hub has already archived. Workflow.Status NEW skips anything someone has already triaged or suppressed. Severity.Label HIGH and CRITICAL is the filter that keeps this from becoming noise on day one. Inside an attribute the values are OR, across attributes they are AND. So this matches a new, active, high-or-critical finding, and nothing else.
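
To make the OR-inside, AND-across rule concrete, here is a small Python sketch that applies the same logic to a single finding. It is a simplified stand-in written for illustration, not EventBridge's actual matching engine: it handles only exact-value lists and nested objects, and the sample findings are made up.

```python
# The "findings" part of the event pattern above.
PATTERN = {
    "Severity": {"Label": ["HIGH", "CRITICAL"]},
    "Workflow": {"Status": ["NEW"]},
    "RecordState": ["ACTIVE"],
}


def matches(pattern, finding):
    """Return True if `finding` satisfies every attribute in `pattern`."""
    for key, expected in pattern.items():
        if key not in finding:
            return False
        actual = finding[key]
        if isinstance(expected, dict):
            # Nested attribute: recurse. Every nested key must match (AND).
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf: a list of allowed values. Any one is enough (OR).
            return False
    return True


new_critical = {"Severity": {"Label": "CRITICAL"},
                "Workflow": {"Status": "NEW"}, "RecordState": "ACTIVE"}
triaged_high = {"Severity": {"Label": "HIGH"},
                "Workflow": {"Status": "NOTIFIED"}, "RecordState": "ACTIVE"}
new_medium = {"Severity": {"Label": "MEDIUM"},
              "Workflow": {"Status": "NEW"}, "RecordState": "ACTIVE"}

print(matches(PATTERN, new_critical))  # True
print(matches(PATTERN, triaged_high))  # False: already triaged
print(matches(PATTERN, new_medium))    # False: below the severity cut
```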

That last filter is a decision, not a default. Start narrow.

The Lambda

The function does the unglamorous work. Turn a finding into a message a person can act on, and keep a copy.

import json, os, urllib.request
import boto3

s3      = boto3.client("s3")
secrets = boto3.client("secretsmanager")

# Resolved once per cold start, not on every invocation.
BUCKET  = os.environ["ARCHIVE_BUCKET"]
WEBHOOK = secrets.get_secret_value(
    SecretId=os.environ["WEBHOOK_SECRET_ARN"])["SecretString"]


def handler(event, _context):
    # One event can carry several findings. Archive first so the record
    # is in S3 even if the Slack call fails.
    for finding in event["detail"]["findings"]:
        archive(finding)
        notify(finding)
    return {"statusCode": 200, "body": f"Processed {len(event['detail']['findings'])} findings"}


def archive(finding):
    # Finding IDs are ARN-like; keep only the last path segment.
    fid = finding["Id"].rsplit("/", 1)[-1]
    # Key is date/account/id, so a redelivered finding lands on the same object.
    key = f"findings/{finding['CreatedAt'][:10]}/{finding['AwsAccountId']}/{fid}.json"
    # json.dumps emits a single line, which is what the Athena JSON SerDe expects.
    s3.put_object(Bucket=BUCKET, Key=key, Body=json.dumps(finding).encode())


def notify(finding):
    text = (f":rotating_light: *{finding['Severity']['Label']}*  "
            f"{finding['Title']}\n"
            f"account `{finding['AwsAccountId']}`  region `{finding['Region']}`")
    req = urllib.request.Request(
        WEBHOOK,
        data=json.dumps({"text": text}).encode(),
        headers={"Content-Type": "application/json"})
    # urlopen raises on HTTP errors and timeouts, which fails the invocation.
    urllib.request.urlopen(req, timeout=5)

A few decisions worth calling out. One EventBridge event can carry several findings, so the handler loops instead of assuming one. The function returns a short summary on success and lets exceptions propagate, so failures are explicit. EventBridge invokes Lambda asynchronously, and if the function throws, Lambda retries the invocation twice by default. EventBridge's own retry policy (up to 185 attempts over 24 hours by default) covers only failed deliveries to the target, not errors raised inside the function. That means a brief Slack outage will not lose the alert, and because archive() runs before notify(), the finding is already in S3 either way. For longer outages, add a dead-letter queue or an on-failure destination. The S3 key is built from the finding date, account, and ID, so a redelivery of the same finding overwrites instead of duplicating. The webhook URL is not in the code. It comes from Secrets Manager, because a webhook in source control is a credential leak waiting to happen. And the function uses urllib from the standard library, not requests, so there is nothing to package. It deploys as a single file.
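
The overwrite-on-redelivery property is easy to check in isolation. The sketch below pulls the key logic out of archive() into a standalone helper; the name archive_key and the sample finding are hypothetical, and the finding ID only imitates the ARN-like shape the code expects.

```python
def archive_key(finding):
    """Build the S3 object key the same way archive() does."""
    finding_id = finding["Id"].rsplit("/", 1)[-1]   # last ARN path segment
    day = finding["CreatedAt"][:10]                 # YYYY-MM-DD prefix
    return f"findings/{day}/{finding['AwsAccountId']}/{finding_id}.json"


# Made-up finding with only the fields the key depends on.
sample = {
    "Id": "arn:aws:guardduty:us-east-1:111122223333:detector/abc123/finding/def456",
    "CreatedAt": "2024-05-01T12:34:56.789Z",
    "AwsAccountId": "111122223333",
}

first = archive_key(sample)
# A redelivery or a later update changes other fields, not the key inputs.
redelivered = archive_key({**sample, "UpdatedAt": "2024-05-02T08:00:00.000Z"})

print(first)                 # findings/2024-05-01/111122223333/def456.json
print(first == redelivered)  # True
```

With versioning enabled on the bucket, the overwritten copy is kept as a noncurrent version, while Athena reads only the current one.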

The full version, with the IAM role scoped to one bucket prefix and the EventBridge wiring, is in the repo. Link is at the end.

Querying the Archive with Athena

Once findings land in S3, Athena turns the bucket into a searchable database. Create the table once:

CREATE EXTERNAL TABLE siem_findings (
  Id             STRING,
  AwsAccountId   STRING,
  Region         STRING,
  Title          STRING,
  Description    STRING,
  CreatedAt      STRING,
  Severity       STRUCT<Label:STRING, Normalized:INT>,
  Workflow       STRUCT<Status:STRING>,
  ProductName    STRING,
  Resources      ARRAY<STRUCT<Type:STRING, Id:STRING, Region:STRING>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://your-account-id-siem-findings/findings/'
;

Then query as needed. Example — all critical findings in the last 30 days:

SELECT CreatedAt, Title, AwsAccountId, Region
FROM siem_findings
WHERE Severity.Label = 'CRITICAL'
  AND CreatedAt > date_format(current_date - interval '30' day, '%Y-%m-%d')
ORDER BY CreatedAt DESC;

This is the piece that earns the word SIEM. Not a live dashboard, but on-demand investigation when an incident requires context.
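
One detail in that query is worth spelling out: CreatedAt is a STRING column, and the date filter still works because ISO 8601 timestamps sort as text in chronological order. The Python sketch below mirrors the comparison with a fixed "today" so the result is reproducible. It assumes every timestamp is in the same UTC format; findings that carry a different offset would need real timestamp parsing. The helper names are hypothetical.

```python
from datetime import date, timedelta


def cutoff_string(today, days=30):
    """Mirror date_format(current_date - interval '30' day, '%Y-%m-%d')."""
    return (today - timedelta(days=days)).strftime("%Y-%m-%d")


def is_recent(created_at, today, days=30):
    # Plain string comparison, the same as the WHERE clause.
    return created_at > cutoff_string(today, days)


today = date(2024, 6, 15)
print(cutoff_string(today))                          # 2024-05-16
print(is_recent("2024-06-01T09:00:00.000Z", today))  # True
print(is_recent("2024-05-16T00:00:01.000Z", today))  # True
print(is_recent("2024-05-15T23:59:59.999Z", today))  # False
```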

The Findings I Route First

Turning on every finding at once is how a SIEM becomes a channel everyone mutes. Start with the ones that usually mean someone is already inside.

From GuardDuty: UnauthorizedAccess on IAM credentials, CryptoCurrency activity, Backdoor and Trojan types, and any use of the root account. From Security Hub's own checks: root account in use, S3 buckets open to the public, console users without MFA, security groups open to 0.0.0.0/0 on admin ports.

That is a short list on purpose. You widen the EventBridge pattern later, by adding MEDIUM to the severity filter, once you trust the signal and have somewhere to put it. Widening is a one-line change. Earning back a team's attention after you have flooded them is not.
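
If the pattern is generated rather than hand-edited, the one-line change becomes a single argument. A minimal sketch, assuming you build the rule from Python; the function name build_event_pattern is hypothetical, and the pattern itself is the one from Step 4.

```python
import json


def build_event_pattern(severities=("HIGH", "CRITICAL")):
    """Return the Step 4 event pattern with a configurable severity list."""
    return {
        "source": ["aws.securityhub"],
        "detail-type": ["Security Hub Findings - Imported"],
        "detail": {
            "findings": {
                "Severity": {"Label": list(severities)},
                "Workflow": {"Status": ["NEW"]},
                "RecordState": ["ACTIVE"],
            }
        },
    }


narrow = build_event_pattern()                             # day-one filter
wide = build_event_pattern(("MEDIUM", "HIGH", "CRITICAL"))  # after you trust it

# The JSON string is what an EventBridge rule takes as its event pattern.
print(json.dumps(wide, indent=2))
```

Everything except the severity list stays identical between the two, which keeps the widening step reviewable as a one-value diff.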

What This Is Not

This is honest detection and searchable history. It is not an enterprise SIEM, and pretending otherwise will get someone hurt.

It does not correlate. It routes one finding at a time. It will not connect a failed login, a new access key, and a data transfer into a single story. AWS has a reference pattern for that with EventBridge, Lambda, and DynamoDB, but that is a build, not a checkbox.

Athena is query on demand, not a live dashboard. You write SQL when you investigate. If you want charts, that is QuickSight, and QuickSight costs.

It is single account as written. Multi-account means setting up Security Hub and GuardDuty with a delegated administrator and aggregating findings into one account. Same pattern, more setup.

And "without buying anything" means no third-party license and no new product. It does not mean free. Security Hub bills per finding ingested and per check, GuardDuty bills on the events and volume it analyzes, Athena bills per terabyte scanned. For a small to mid AWS footprint this lands in the tens of dollars a month. A commercial SIEM for the same footprint starts in the thousands. That gap is the whole point.

To put numbers on it: a small account generating 100 findings per day and running 200 security checks hits roughly $4/month for Security Hub ($0.00003/finding + $0.0010/check), $10–15/month for GuardDuty (CloudTrail and VPC flow analysis on modest traffic), and pennies for Athena queries against a few gigabytes in S3. Call it $15–25/month total. A commercial SIEM ingesting the same data starts at $1,000/month before you count the engineer running it.
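
The Security Hub figure can be reproduced with a few lines of arithmetic. The two unit prices are the ones quoted above. How often each check is evaluated in a month is not stated in this article, so the value of 20 evaluations per check is an assumption, chosen only because it lands near the $4 figure. Check current AWS pricing before relying on any of these numbers.

```python
FINDING_PRICE = 0.00003  # USD per finding ingested (figure quoted in the text)
CHECK_PRICE = 0.0010     # USD per check evaluation (figure quoted in the text)


def security_hub_monthly(findings_per_day, checks, evaluations_per_check, days=30):
    """Estimate a monthly Security Hub bill from the two unit prices."""
    findings_cost = findings_per_day * days * FINDING_PRICE
    checks_cost = checks * evaluations_per_check * CHECK_PRICE
    return findings_cost + checks_cost


# ASSUMPTION: 20 evaluations per check per month (not from the article).
estimate = security_hub_monthly(100, 200, evaluations_per_check=20)
print(f"Security Hub: ${estimate:.2f}/month")  # Security Hub: $4.09/month
```

Almost all of that is check evaluations; the 3,000 findings contribute about nine cents.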

When This Is Enough, and When It Is Not

It is enough when you have a small to mid AWS footprint, you need to know when something fires, you need to be able to look back, and no auditor is requiring a specific named product.

It is not enough when a compliance framework names a SIEM you have to use, when you need to correlate AWS with on-prem or another cloud, when you need monitored response around the clock, or when your finding volume is high enough to need real tuning and a team to do it.

"We don't have a SIEM" and "we have no visibility" are two different problems. Only one of them needs a budget.

Detection is the easy half. The next article is the hard half: what happens after the alert fires, when the incident response playbook assumes a team, centralized logs, and time you do not have.

Open Security Hub in your account right now. If there are findings sitting there with no one watching, you already have the signal. The only missing piece is the wire.

The deployable template, the Lambda, and a test script that fires a sample finding through the whole pipeline are in a companion repo: https://github.com/Kiruma/aws-lightweight-siem

Tags: AWS, Cloud Security, Security Hub, GuardDuty, SIEM
