Anvilogic on Snowflake Architecture
Anvilogic implementation on Snowflake (AWS, GCP, Azure).
Below is the generic architecture diagram showing how Anvilogic works on top of Snowflake.
This supports Snowflake on Azure, AWS, and GCP.
Overall Diagram:
ETL Parsing & Normalization Process
Snowflake is configured in your existing IaaS environment and is available on AWS, GCP, and Azure.
You can also have a separate Snowflake account per environment if you are a multi-IaaS organization.
Data that originates in IaaS and can be written to cloud storage does not require a streaming tool; it can be onboarded to Snowflake directly.
Datasets that come from assets hosted within a data center or not in a public IaaS environment will require a solution to route that data to Snowflake.
Data streaming tools (ex. Cribl, Fluentbit, Apache NiFi) can be used to send on-prem logs directly to Snowflake.
You must have a data transport/streaming tool to send this data to IaaS storage or to Anvilogic pipelines for ingestion.
Forwarding agents installed on endpoints must also be reconfigured to send to the streaming tool for ingestion into Snowflake.
Neither Snowflake nor Anvilogic provides data streaming or endpoint agent technology.
Configure your security tools and appliances to log to a cloud storage service such as Amazon S3, Azure Blob Storage, or Google Cloud Storage. Snowpipe then picks up the files and ingests them into Snowflake.
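To make the cloud-storage path more concrete, here is a minimal sketch that renders the kind of Snowpipe definition that could load staged JSON log files. The pipe, table, and stage names are hypothetical. The sketch assumes the stage already points at your bucket or container, that the target table has a single VARIANT column, and that storage event notifications are configured so auto-ingest can trigger. It is not Anvilogic's onboarding configuration.

```python
def render_snowpipe_ddl(pipe: str, table: str, stage: str) -> str:
    """Render a CREATE PIPE statement that auto-ingests JSON files from a stage."""
    return (
        f"CREATE PIPE {pipe} AUTO_INGEST = TRUE AS\n"
        f"  COPY INTO {table}\n"
        f"  FROM @{stage}\n"
        f"  FILE_FORMAT = (TYPE = 'JSON');"
    )


# Hypothetical object names, for illustration only.
ddl = render_snowpipe_ddl(
    pipe="security_logs_pipe",
    table="raw_security_logs",
    stage="security_logs_stage",
)
print(ddl)
```

The rendered statement would be run once by a Snowflake role allowed to create pipes. After that, new files arriving in the stage location are loaded without a separate streaming tool.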
All communication (search and detection use case deployments) is done over REST APIs using HTTPS on port 443 with TLS 1.2 or later.
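As an illustration of the TLS 1.2+ requirement, the sketch below builds a client-side TLS context with Python's standard library that refuses anything older than TLS 1.2. The function name is hypothetical; this shows how a client of an HTTPS API could enforce the same floor, not how Anvilogic implements it.

```python
import ssl


def build_tls_context() -> ssl.SSLContext:
    """Create a client context that verifies certificates and refuses TLS < 1.2."""
    context = ssl.create_default_context()
    # Reject TLS 1.0 and 1.1 handshakes outright.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = build_tls_context()
print(context.minimum_version.name)
```

The context can be passed to `urllib.request.urlopen(url, context=context)` or `http.client.HTTPSConnection(host, 443, context=context)` when calling an HTTPS endpoint.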
Yes, if you have a streaming tool (ex. Cribl, Fluentbit, Apache NiFi) you can send custom data sources directly to Anvilogic’s ingestion pipeline. Anvilogic also has out-of-the-box support for some raw data ingestion sources in our integrations armory.
This sends data to Anvilogic's S3 storage service, which holds it temporarily while it is processed into Snowflake.
Yes, Anvilogic helps with all of the parsing and normalization of security-relevant data into the Anvilogic schema. We have onboarding templates and configs that help ensure the data you are bringing into Snowflake is properly formatted to execute detections and perform triage, hunting, and response.
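To show what parsing and normalization means in practice, here is a small sketch that maps a raw Windows-style event onto a common set of column names. The target field names (`host`, `user`, `event_id`, `event_time`) and the mapping table are assumptions made for illustration; they are not the actual Anvilogic schema or onboarding templates.

```python
from datetime import datetime, timezone

# Hypothetical mapping from vendor field names to normalized column names.
FIELD_MAP = {
    "Computer": "host",
    "TargetUserName": "user",
    "EventID": "event_id",
}


def normalize_event(raw: dict) -> dict:
    """Rename vendor fields and convert the timestamp to UTC ISO 8601."""
    normalized = {target: raw.get(source) for source, target in FIELD_MAP.items()}
    # The vendor timestamp is assumed to be ISO 8601 with a trailing "Z".
    ts = datetime.fromisoformat(raw["TimeCreated"].replace("Z", "+00:00"))
    normalized["event_time"] = ts.astimezone(timezone.utc).isoformat()
    return normalized


event = normalize_event(
    {
        "EventID": 4624,
        "Computer": "ws-01",
        "TargetUserName": "alice",
        "TimeCreated": "2024-05-01T12:30:00Z",
    }
)
print(event)
```

Once every feed is normalized to the same column names, a single detection can run across feeds without per-vendor field handling.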
Yes, if you have third-party intel or CMDB tools that are required for detection enrichment, they can be called via REST API and their data transported into a Snowflake table.
Anvilogic detections can then use those enrichment tables to enrich detections before they are stored in the Alert lake (upstream of SOAR).
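The enrichment step can be pictured as a lookup join between a detection result and an enrichment table. The sketch below uses an in-memory dictionary as a stand-in for a CMDB table in Snowflake; the field names (`host`, `owner`, `criticality`) are hypothetical.

```python
def enrich_alert(alert: dict, cmdb: dict) -> dict:
    """Attach asset context from a CMDB lookup to an alert, keyed on host."""
    asset = cmdb.get(alert["host"], {})
    return {
        **alert,
        "asset_owner": asset.get("owner"),
        "asset_criticality": asset.get("criticality"),
    }


# Hypothetical CMDB rows, keyed by hostname.
cmdb_table = {"ws-01": {"owner": "finance", "criticality": "high"}}

enriched = enrich_alert({"host": "ws-01", "rule": "suspicious_logon"}, cmdb_table)
print(enriched)
```

In a warehouse, the same idea would typically be a left join from the detection output to the enrichment table, so alerts for unknown hosts are kept with empty context rather than dropped.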
Yes, Anvilogic provides out-of-the-box integrations for common vendor alerts and data collection for specific SaaS security tools (ex. CrowdStrike FDR).
Tools not listed in our integration marketplace can be sent through the Custom Data Integration pipeline as a self-service option.
Raw data sources are events and telemetry generated by endpoints, tools, and appliances (ex. Windows Event logs, EDR logs).
Alert data consists of curated signals from security tools (ex. Proofpoint alerts, anti-virus alerts) that the vendor has already identified as suspicious or malicious.
Yes, Anvilogic requires two warehouses to run:
Ad-hoc Warehouse - Compute for queries that support search, hunt, and IR
Detect Warehouse - Runs 24/7, executing scheduled tasks (detections) on a cron schedule
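As a rough picture of what a scheduled query bound to a dedicated warehouse looks like in Snowflake, the sketch below renders a `CREATE TASK` statement with a cron schedule. The task name, warehouse name, and query are hypothetical, and this is only one way to schedule work in Snowflake; it is not a description of how Anvilogic deploys detections internally.

```python
def render_task_ddl(name: str, warehouse: str, cron: str, query: str) -> str:
    """Render a Snowflake CREATE TASK statement that runs a query on a cron schedule."""
    return (
        f"CREATE TASK {name}\n"
        f"  WAREHOUSE = {warehouse}\n"
        f"  SCHEDULE = 'USING CRON {cron} UTC'\n"
        f"AS\n"
        f"  {query};"
    )


# Hypothetical names; the query stands in for a detection.
task_ddl = render_task_ddl(
    name="detect_suspicious_logons",
    warehouse="detect_wh",
    cron="*/15 * * * *",
    query="INSERT INTO alerts SELECT * FROM logons WHERE failed_count > 10",
)
print(task_ddl)
```

Keeping scheduled work on its own warehouse means ad-hoc hunting queries do not compete with detections for compute. Note that Snowflake tasks are created in a suspended state and need `ALTER TASK ... RESUME` before they run.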
Yes, Anvilogic can integrate with most SOARs via REST API through either a push or a pull method.
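The push method can be sketched as an HTTPS POST of an alert to the SOAR's intake endpoint. The URL, token, and payload fields below are placeholders, not a real SOAR or Anvilogic API. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_push_request(url: str, alert: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST that pushes one alert to a SOAR endpoint."""
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder endpoint and credentials, for illustration only.
request = build_push_request(
    "https://soar.example.com/api/alerts",
    {"rule": "suspicious_logon", "host": "ws-01", "severity": "high"},
    token="example-token",
)
print(request.get_method(), request.full_url)
```

Sending it would be a call to `urllib.request.urlopen(request)`. In the pull method the direction is reversed: the SOAR periodically issues a GET against an alerts endpoint and fetches whatever is new since its last poll.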
Yes, Anvilogic has a search user interface (UI) that makes it easy to query data inside a Snowflake database.
In addition, Anvilogic makes it easy to build repeatable detections that can execute on top of Snowflake using a low-code UI builder.
Yes, Anvilogic has a data model and offers parsing and normalization code for any security data set that you want to use within the platform.
Yes, we can also work with OCSF data, and each data feed can be modified and controlled to fit your needs.
Yes, Anvilogic can onboard IOCs from your third-party threat intel tools (ex. ThreatConnect) and use that data to create new detections, conduct ongoing exposure checks across your data feeds, or enrich your alert output for triage analysts.
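An exposure check of this kind amounts to matching indicator values against fields in your event data. The sketch below does this in memory with hypothetical field names (`dest_ip`, `domain`) and made-up indicator values; in practice the IOCs and events would be Snowflake tables and the match would be a join.

```python
def find_ioc_hits(events: list, iocs: set) -> list:
    """Return the events whose destination IP or domain matches a known IOC."""
    return [
        event
        for event in events
        if event.get("dest_ip") in iocs or event.get("domain") in iocs
    ]


# Made-up indicators (documentation IP range and example domain).
iocs = {"203.0.113.7", "bad.example.com"}
events = [
    {"host": "ws-01", "dest_ip": "203.0.113.7"},
    {"host": "ws-02", "dest_ip": "198.51.100.4", "domain": "ok.example.org"},
    {"host": "ws-03", "domain": "bad.example.com"},
]

hits = find_ioc_hits(events, iocs)
print([hit["host"] for hit in hits])
```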