HubbleStack Hubble — Cloud Security Compliance
Contributor to HubbleStack Hubble — an open-source security compliance framework deployed on cloud Linux machines for profile-based auditing, real-time security event detection, and centralized reporting. Re-architected the security rules module, removed the SaltStack dependency, introduced engineering best practices, and built the data pipeline for centralized log collection via Splunk and Databricks.
Category
Enterprise
Year
2022
Status
Shipped
What Is HubbleStack Hubble
HubbleStack Hubble is an open-source, modular security compliance framework built in Python. It installs as a lightweight agent on cloud Linux machines and provides on-demand, profile-based auditing, real-time security event detection, alerting, and reporting.
Hubble’s core modules:
- Nova (Audit) — Profile-based security auditing. Runs configurable audit profiles against the host and reports compliance status. Checks things like file permissions, package versions, service states, kernel parameters, and CIS benchmark controls.
- Nebula (Osquery) — Leverages osquery to gather system-level statistics and telemetry — running processes, open ports, installed packages, user accounts, network connections, scheduled jobs, and more.
- Pulsar (FIM) — File integrity monitoring. Watches critical system files and directories for unauthorized changes and raises real-time alerts.
- Quasar (Reporting) — Collects results from Nova, Nebula, and Pulsar and ships them to centralized logging destinations (Splunk, Logstash, etc.) via configurable returners.
The framework was originally built on top of SaltStack, using Salt’s module system, grains, and file client for configuration management and remote execution. Hubble agents run as a daemon on each machine, executing audits and scans on a configurable schedule, and pushing results to a centralized collection point.
My Role
I am a contributor to this open-source project — not the original creator. I worked on Hubble as part of a cloud security team, where it was deployed across a large fleet of Linux machines in AWS. My contributions focused on re-architecting a core module, removing technical debt, improving engineering quality, and building the data pipeline that made Hubble’s output actionable at scale.
What I Did
Re-Architected the Security Rules Module:
- Hubble’s audit module (Nova) allowed teams to write security rules — compliance checks that run against hosts and report pass/fail status. The original implementation tightly coupled rule definitions with their execution logic, making it difficult to add new rule types without modifying core code
- Re-architected the module to cleanly separate rule definitions (declarative profiles in YAML) from the execution engine. New rules could be added by writing a profile without touching the audit engine code
- Introduced a pluggable comparator system — rule authors define what to check (a file permission, a config value, a running process) and how to compare (equals, greater than, regex match, contains). The engine handles execution, error handling, and result formatting
- This made it significantly easier for security teams to write and maintain rules without needing deep knowledge of Hubble’s internals
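The separation described above can be sketched in a few lines. This is an illustrative registry, not Hubble's actual API — the function and key names are assumptions — but it shows the shape of the design: profiles declare *what* to compare and *how*, and the engine dispatches to a registered comparison function.

```python
# Minimal sketch of a pluggable comparator registry (illustrative names,
# not Hubble's real internals).
import re

COMPARATORS = {}

def comparator(name):
    """Decorator registering a comparison function under a profile-facing name."""
    def register(fn):
        COMPARATORS[name] = fn
        return fn
    return register

@comparator("equals")
def _equals(actual, expected):
    return actual == expected

@comparator("regex_match")
def _regex_match(actual, expected):
    return re.search(expected, str(actual)) is not None

def evaluate(check, actual):
    """Apply one declarative check to a value collected from the host."""
    passed = COMPARATORS[check["comparator"]](actual, check["expected"])
    return {"check": check["name"],
            "status": "pass" if passed else "fail",
            "expected": check["expected"], "actual": actual}
```

Under this scheme a check parsed from a YAML profile is plain data — e.g. `{"name": "shadow_perms", "comparator": "equals", "expected": "0640"}` — and adding a new rule type means registering one function rather than editing the engine.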
Removed SaltStack Dependency:
- Hubble was originally built on SaltStack, inheriting Salt’s module loader, grains system, file client, and configuration management. This created a heavy dependency — Salt is a large framework, and Hubble only used a fraction of it
- Worked on removing the Salt dependency from key modules, replacing Salt’s loader with Hubble’s own lightweight module loading system
- This reduced the installation footprint, eliminated Salt-related version conflicts and security vulnerabilities, and simplified deployment. Hubble could now run as a standalone agent without requiring a Salt master or minion infrastructure
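A lightweight module loader of the kind that can stand in for Salt's `__salt__` loader is straightforward with the standard library. This is an assumed design for illustration, not Hubble's actual loader: scan a package, import each module, and expose its public functions under `"module.function"` keys.

```python
# Sketch of a minimal replacement for Salt's module loader
# (function and package names are assumptions, not Hubble's real code).
import importlib
import pkgutil

def load_modules(package):
    """Map 'modname.funcname' -> callable for every public function in a package."""
    funcs = {}
    pkg = importlib.import_module(package)
    for info in pkgutil.iter_modules(pkg.__path__):
        mod = importlib.import_module(f"{package}.{info.name}")
        for attr in dir(mod):          # coarse scan; fine for a sketch
            obj = getattr(mod, attr)
            if callable(obj) and not attr.startswith("_"):
                funcs[f"{info.name}.{attr}"] = obj
    return funcs
```

The appeal of this approach is that modules stay plain Python files with no framework base class, so the agent needs no Salt master or minion machinery to discover them.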
Engineering Best Practices:
- Introduced unit testing for the audit modules — the project had minimal test coverage when I started contributing. Wrote tests for the rule evaluation engine, comparators, and profile parsing logic
- Added proper error handling and logging throughout the modules I worked on — previously, failures in rule execution could silently skip checks without reporting them as errors
- Improved code structure — broke monolithic modules into smaller, testable components with clear interfaces
- Added pre-commit hooks and linting configuration to catch issues before they reached code review
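The comparator logic lends itself to small, fast unit tests of the following shape. The functions here are defined inline so the sketch is self-contained; the real tests exercised Hubble's own modules.

```python
# Illustrative pytest-style tests for comparator edge cases
# (inline hubble-like comparators, not the project's actual code).
import re

def regex_match(actual, expected):
    # re.MULTILINE so a pattern can anchor on any line of command output
    return re.search(expected, str(actual), re.MULTILINE) is not None

def numeric_gte(actual, expected):
    # Coerce mixed-type values ("10" vs 10) before comparing
    return float(actual) >= float(expected)

def test_regex_matches_any_line_of_multiline_output():
    output = "PermitRootLogin no\nPasswordAuthentication no"
    assert regex_match(output, r"^PasswordAuthentication no$")

def test_numeric_compare_handles_string_values():
    assert numeric_gte("10", 9)
    assert not numeric_gte("8", 9)
```

Because comparators are pure functions, these tests need no mocking and run in milliseconds, which made them a natural first target for raising coverage.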
Bug Fixes:
- Fixed issues in the audit execution path where certain rule types would fail silently on specific OS configurations
- Fixed edge cases in the comparator logic — regex comparisons on multi-line output, numeric comparisons with mixed-type values, and handling of missing/null data from system queries
- Fixed issues with the daemon’s scheduling system where audit runs could overlap if a previous run took longer than expected
Centralized Log Collection & Data Pipeline
Hubble agents across the fleet produce a massive volume of audit results, compliance reports, osquery telemetry, and file integrity alerts. Making this data useful required a robust collection and analysis pipeline.
Log Collection Architecture:
- Each Hubble agent ships results to a centralized log collector using Hubble’s returner system
- Results are structured JSON — every audit check includes the host identity, timestamp, profile name, check name, pass/fail status, actual vs expected values, and severity
- The log collector aggregates results from all agents across the fleet
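A single audit result record has roughly this shape (field names and values here are illustrative, not Hubble's exact schema):

```json
{
  "host": "ip-10-0-12-34.ec2.internal",
  "timestamp": "2022-06-01T12:00:00Z",
  "profile": "cis_ubuntu_20_04",
  "check": "ensure_permissions_on_etc_shadow",
  "status": "fail",
  "expected": "0640",
  "actual": "0644",
  "severity": "high"
}
```

Keeping every record self-describing like this is what lets downstream consumers (Splunk searches, Databricks jobs) slice by host, profile, check, or severity without joining against agent-side state.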
Splunk Integration:
- Hubble audit results and security events are ingested into Splunk for real-time monitoring, alerting, and dashboarding
- Built Splunk dashboards for: fleet-wide compliance posture (percentage of hosts passing each CIS benchmark control), trending compliance over time, hosts with the most failures, and new failures (regressions)
- Configured Splunk alerts for critical security events — a host failing a high-severity check, file integrity violations on sensitive system files, unexpected processes or open ports detected by osquery
- Security analysts could drill down from a fleet-wide view to a specific host’s full audit history
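On the shipping side, delivering a result to Splunk's HTTP Event Collector (HEC) amounts to wrapping the record in HEC's envelope and POSTing it. The URL, token, index, and sourcetype below are placeholders — Hubble's real Splunk returner adds batching, retries, and richer configuration.

```python
# Sketch of a Splunk HEC returner (placeholder endpoint/token/index;
# not Hubble's actual returner code).
import json
import urllib.request

def build_hec_payload(event, index="hubble"):
    """Wrap one structured audit result in Splunk's HEC envelope."""
    return {"index": index, "sourcetype": "hubble:audit", "event": event}

def splunk_return(event, hec_url, token):
    """POST a single event, e.g. to https://splunk:8088/services/collector/event."""
    data = json.dumps(build_hec_payload(event)).encode("utf-8")
    req = urllib.request.Request(
        hec_url, data=data,
        headers={"Authorization": f"Splunk {token}",
                 "Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:   # network call
        return resp.status
```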
Databricks Pipeline:
- For deeper analysis and long-term trend tracking, Hubble data flowed into Databricks via a data pipeline
- Raw audit results stored in a data lake for historical analysis — compliance trends over months, correlation between configuration changes and security incidents, identification of systemic issues across the fleet
- Databricks notebooks for security team analysis — which security controls fail most often, which teams/regions have the lowest compliance, and what is the average time-to-remediation after a failure is detected
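The kind of aggregation those notebooks run can be sketched in plain Python. In practice this was Spark over the data lake; the records and field names here are illustrative.

```python
# Plain-Python sketch of a notebook-style aggregation:
# per-check failure rate across the fleet (illustrative schema).
from collections import Counter

def failure_rates(results):
    """Map each check name to the fraction of results that failed."""
    totals, fails = Counter(), Counter()
    for r in results:
        totals[r["check"]] += 1
        if r["status"] == "fail":
            fails[r["check"]] += 1
    return {check: fails[check] / totals[check] for check in totals}
```

Ranking the output of a query like this is what surfaces "which controls fail most often" — the systemic issues worth fixing at the image or configuration-management layer rather than host by host.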
Technical Challenges
- Salt dependency removal complexity — Salt’s module loader was deeply intertwined with Hubble’s code. Modules used Salt grains for host identification, Salt’s file client for profile distribution, and Salt’s configuration system for settings. Removing Salt required building lightweight replacements for each of these while maintaining backward compatibility with existing audit profiles. This was an incremental process — replacing one Salt dependency at a time, testing thoroughly at each step.
- Rule engine extensibility — The re-architected rule engine needed to support existing profiles without breaking changes while also being extensible for new rule types. Designed the comparator system to be backward-compatible — old-style rules still worked, but new rules could use the cleaner declarative syntax. Migration was gradual, not a flag-day switch.
- Testing a system-level tool — Hubble checks system state (file permissions, running services, kernel parameters). Unit testing these checks requires mocking system calls extensively. Built a test harness that could simulate various OS states so rule evaluation could be tested without running on actual hosts. Integration tests ran in Docker containers representing different OS configurations.
- Log volume at scale — Hundreds of machines each running dozens of audit checks on a schedule generate enormous log volume. The Splunk ingestion pipeline needed careful tuning — index sizing, source type configuration, and retention policies to keep costs manageable while retaining enough history for trend analysis. Databricks handled the long-term archival and heavy analytical queries that would be too expensive to run in Splunk.
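Simulating OS state boils down to patching the system calls a check touches. The check and expected mode below are illustrative, not Hubble's actual rule code, but they show how a permission check can be unit-tested with no real host.

```python
# Sketch of mocking system state for a file-permission check
# (illustrative check function, not Hubble's real code).
import os
import stat
from unittest import mock

def check_mode(path, expected_octal):
    """Pass if the file's permission bits match the expected octal string."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return format(mode, "04o") == expected_octal

def test_shadow_permissions_without_a_real_host():
    fake = mock.Mock(st_mode=0o100640)   # regular file with mode 0640
    with mock.patch("os.stat", return_value=fake):
        assert check_mode("/etc/shadow", "0640")
        assert not check_mode("/etc/shadow", "0600")
```

A harness built from fixtures like this can replay many OS states quickly; the Docker-based integration tests then confirm behavior against real filesystems.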
Architecture
- Hubble Agent — Python daemon installed on each Linux machine. Runs audit profiles (Nova), osquery queries (Nebula), and file integrity monitoring (Pulsar) on configurable schedules. Ships results via returners.
- Audit Engine (Nova) — Re-architected module with pluggable comparators. Reads YAML audit profiles, executes checks against the host, and produces structured JSON results with pass/fail status and evidence.
- Osquery Integration (Nebula) — Executes osquery packs to gather system telemetry — processes, ports, packages, users, network connections.
- File Integrity (Pulsar) — Monitors critical files and directories using inotify. Raises real-time alerts on unauthorized changes.
- Log Collector — Aggregates structured JSON results from all agents across the fleet.
- Splunk — Real-time ingestion of audit results and security events. Dashboards for compliance posture, trending, and per-host drill-down. Alerts for critical violations.
- Databricks — Data pipeline for long-term storage and deep analysis. Historical compliance trends, fleet-wide pattern detection, and analytical notebooks for the security team.
- Infrastructure — Agents deployed across AWS Linux fleet. Centralized collection and pipeline infrastructure on AWS.
Results & Impact
- Re-architected rule engine — security teams could write new audit rules in YAML without modifying Hubble’s core code, dramatically increasing the speed of new compliance check development
- Salt dependency removal — lighter installation footprint, fewer security vulnerabilities from transitive dependencies, and simpler deployment without Salt infrastructure
- Unit test coverage — introduced testing discipline to a project that had minimal coverage, catching regressions and enabling confident refactoring
- Fleet-wide compliance visibility — Splunk dashboards gave the security team real-time, fleet-wide compliance posture for the first time, replacing manual spot-checks with continuous monitoring
- Long-term trend analysis — Databricks pipeline enabled historical compliance analysis and identification of systemic security patterns across hundreds of machines
- Open-source contributor — all contributions merged upstream into the HubbleStack project, benefiting the broader community
Stack Deep Dive
- Python for the Hubble agent and all modules — audit engine, comparators, returners, and daemon
- SaltStack (partially removed) — original framework dependency, incrementally replaced with lightweight alternatives
- Osquery for system-level telemetry collection — processes, ports, packages, network state
- Splunk for real-time log ingestion, dashboarding, alerting, and compliance monitoring
- Databricks for long-term data pipeline, historical analysis, and security analytics notebooks
- AWS for fleet infrastructure — Linux machines running Hubble agents across multiple accounts and regions
- YAML for declarative audit profile definitions — the interface between security teams and the audit engine
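A profile in the declarative style described above might look like this — rule, module, and comparator names here are hypothetical, not a real Hubble profile:

```yaml
# Hypothetical audit profile sketch (illustrative names only)
cis_shadow_permissions:
  description: Ensure permissions on /etc/shadow are configured
  severity: high
  module: stat
  items:
    - name: /etc/shadow
      comparator:
        type: file_permission
        match: "0640"
```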
Interested in working together?
Get in Touch →