Welcome to Anvilogic

What is Anvilogic?

Anvilogic is an AI SOC solution and multi-data platform that enables detection engineers and threat hunters to detect, hunt, and investigate seamlessly across disparate data lakes and SIEMs without the need to centralize data, learn new languages or deploy new sensors.

Anvilogic empowers enterprise SOCs to rapidly mature their detection programs with a dual approach: instantly deployable, curated detections and a powerful low-code builder for crafting correlated custom alerts. With thousands of expert-built detections ready to deploy in a single click, teams can accelerate threat coverage from day one. Anvilogic’s platform also features automated workflows and AI-driven insights for tuning, triage, maintenance, and critical alert escalation—helping SOCs hunt threats with greater speed and precision. Real-time SOC maturity scoring gives teams continuous visibility into their detection posture, mapped against their most critical threats.

Onboarding guide

Congratulations and welcome to Anvilogic!

This guide will help you log in, complete the guided onboarding to set threat priorities and integrate a data repository, get data in, and deploy detections.

Onboarding workflow

The following flowchart summarizes the tasks you will complete to get started.

Anvilogic onboarding tasks

Get started

The Anvilogic platform is supported on the latest versions of Google Chrome and Mozilla Firefox.

If you're ready, log in and set your password to get started with your Anvilogic onboarding.

Need help?

If you run into any issues, see Get help for information about how you can contact us.


Download and install the Anvilogic App for Splunk

Integrate Splunk with the Anvilogic platform using the Anvilogic App for Splunk.

The Anvilogic App for Splunk provides triage, allow list and suppressions management, and analytics used by the data feed and productivity scores on the maturity score pages.

You can also enable automated threat detection in the Anvilogic App for Splunk, which is required to generate tuning insights and some hunting insights.

Snowflake-only customers can get tuning insights without the Anvilogic App for Splunk.

I am a Splunk user

If you are already using Splunk Enterprise or Splunk Cloud Platform, follow the instructions in the documentation to download and install the Anvilogic App for Splunk.

Next step

Select one of the following to continue:

  • Splunk Enterprise

  • Splunk Cloud Platform

I don't have Splunk

If you don't have Splunk, and you want the capabilities provided by the Anvilogic App for Splunk, Anvilogic will provision a Splunk instance for you and manage the installation and upgrade of the Anvilogic App for Splunk.

Next step

After the Anvilogic platform is connected to a hosted Splunk instance, review data feeds.


Install the Anvilogic App for Splunk

Install the Anvilogic App for Splunk in your Splunk Enterprise environment.

Follow the instructions in the Splunk documentation to install the Anvilogic App for Splunk in your environment:

  • If you have a distributed Splunk Enterprise deployment, use the deployer to install the app on your search heads. See Install an add-on in a distributed Splunk Enterprise deployment in the Splunk Supported Add-ons manual.

  • If you have a single-instance Splunk Enterprise deployment, install the app on the search head. See Install an add-on in a single-instance Splunk Enterprise deployment in the Splunk Supported Add-ons manual.

You must restart Splunk to complete the installation.

Splunk Cloud Platform

High-level steps for downloading and installing the Anvilogic App for Splunk on Splunk Cloud Platform.

Perform the following tasks to download and install the Anvilogic App for Splunk on Splunk Cloud Platform:

  1. Verify requirements

  2. Install the Anvilogic App for Splunk

Next step

Define your company's threat profile

After you log in, use the guided onboarding experience to define your company's threat profile.

Use the guided onboarding to define your company's threat profile and make the Anvilogic platform work according to your needs and priorities.

Benefits of setting threat priorities

Anvilogic provides prioritized content recommendations based on the following factors:

  • Your threat priorities

  • Market and industry trends

  • Your trusted group activity

  • Popular search terms

  • Activity from organizations similar to you

Gather your organization's specific threat priorities to help Anvilogic recommend use cases specific to your organization rather than generic recommendations based on external factors.

Create the Anvilogic indexes

Create the required custom indexes on the Splunk platform.

The Anvilogic App for Splunk requires custom Splunk indexes, used by the HTTP Event Collector (HEC) collector command, for auditing, metrics, and reporting:

  1. Create an events index named <your-org-name>_anvilogic for storing Anvilogic rule output and auditing the app. See Create events indexes in the Splunk Enterprise Managing Indexers and Clusters of Indexers manual.

  2. Create a metrics index named <your-org-name>_anvilogic_metrics for storing the output of baselining rules. See Create metrics indexes in the Splunk Enterprise Managing Indexers and Clusters of Indexers manual.
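If you prefer to create the indexes from a script rather than through Splunk Web, a minimal sketch using Splunk's standard index-management REST endpoint could look like the following. The hostname, credentials, and the acme org prefix are placeholders, and this is ordinary Splunk administration rather than an Anvilogic-specific API:

    import requests

    SPLUNK_MGMT = "https://splunk.example.com:8089"  # Splunk management port (placeholder host)
    AUTH = ("admin", "changeme")                      # placeholder credentials
    ORG = "acme"                                      # replace with your org name prefix

    # Events index for Anvilogic rule output and app auditing
    requests.post(
        f"{SPLUNK_MGMT}/services/data/indexes",
        data={"name": f"{ORG}_anvilogic"},
        auth=AUTH,
        verify=False,  # point this at your CA bundle in production
    ).raise_for_status()

    # Metrics index for the output of baselining rules
    requests.post(
        f"{SPLUNK_MGMT}/services/data/indexes",
        data={"name": f"{ORG}_anvilogic_metrics", "datatype": "metric"},
        auth=AUTH,
        verify=False,
    ).raise_for_status()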

    What's in a threat profile?

    To build your company profile, provide the following information. This information helps to filter the MITRE techniques most applicable to you, so that the most relevant recommended content is generated.

    • Region: Select the geographical region in which your company operates. If you operate in multiple regions, select Global.

    • Industry: Select the industry vertical that best represents your company. You can select more than one industry.

    • Infrastructure: Select the infrastructure used within your organization. Select as many as apply to your organization.

    Revisit your threat priorities

    As your organization matures over time, you can revisit and update your threat profile to accommodate changes to your infrastructure, including platforms, threat groups, techniques, and data categories.

    Next step

    After you define your threat profile, Select your data repository and get data in.


    Additional tasks

    As an admin user, grant additional users access to the Anvilogic platform, or set up more secure authentication settings.

    Create users and assign privileges

    See Add a new user for instructions on how to add users who can access the Anvilogic platform. When you add a new user, you assign them roles which grant certain privileges to the user when they access the platform. See User roles and privileges (RBAC) for a full list of platform roles.

    Configure additional authentication settings

    You can configure additional authentication settings for access to the Anvilogic platform, such as multi-factor authentication (MFA) or single sign-on (SSO). See Configure MFA with Duo and Configure SSO for more information.

    Review and deploy recommended content

    Review and deploy a variety of detections on the Anvilogic platform.

    The Anvilogic platform generates recommended content for you to deploy based on your threat priorities and good quality data feeds.

    Where can I view recommended content?

    You can view recommended content on the Home page and in the Armory, which shows you all available detections not yet deployed in your system.

    Connect to the Anvilogic platform

    After you install the Anvilogic App for Splunk, you must configure the app to connect to the Anvilogic platform.

    Connect the app to the Anvilogic platform

    Perform the following steps to complete your initial configuration and connect the Anvilogic App for Splunk to the Anvilogic platform:

    You must have the avl_admin role to edit the app configuration page.

    Integrate Splunk as your data repository

    Integrate the Anvilogic platform with your Splunk Enterprise or Splunk Cloud Platform instance.

    After defining your company profile in the guided onboarding, select Splunk as the data logging platform.

    You must have admin privileges in Splunk in order to complete the integration.

    Fluentd

    The following page will help you understand how you can use Fluentd to send data to Anvilogic to ingest into Snowflake.

    What is Fluentd?

    Fluentd is an open source streaming tool that can be used to send data to Snowflake to leverage with Anvilogic.

    You can leverage the below configs as templates for how to stream common security data sets to Anvilogic's data onboarding pipeline.

    Remember: Anvilogic helps to parse and normalize this data to our schemas automatically once the data has been sent to our pipeline.

    FluentBit

    The following page will help you understand how you can use FluentBit to send data to Anvilogic to ingest into Snowflake.

    What is FluentBit?

    FluentBit is an open source streaming tool that can be used to send data to Snowflake to leverage with Anvilogic.

    You can leverage the below configs as templates for how to stream common security data sets to Anvilogic's data onboarding pipeline.

    Remember: Anvilogic helps to parse and normalize this data to our schemas automatically once the data has been sent to our pipeline.

    Data Type Examples:

    • Linux data

    • Syslog data

    • Windows data

    Splunk Enterprise

    High-level steps for downloading and installing the Anvilogic App for Splunk on Splunk Enterprise.

    Perform the following tasks to download and install the Anvilogic App for Splunk on Splunk Enterprise:

    1. Verify requirements

    2. Download the Anvilogic App for Splunk

    3. Install the Anvilogic App for Splunk

    (Optional) Upload your existing detections

    Upload your existing detections using a CSV file.

    This is an optional step; if you choose not to do it now, you can come back and do it later at any time.

    If you have existing detections, you can export them to a CSV file, then import the CSV into Anvilogic. Doing this helps you get an idea of what your MITRE coverage looks like, so you can address and strengthen the areas where you need additional coverage.

    The CSV file must have the title, description, and search of the existing detection. See Import existing rules for instructions to import the CSV file into Anvilogic. This document also describes how to properly format the CSV file when you create it.


    Download and install the Anvilogic App for Splunk

  • In Splunk Web, select Apps > Anvilogic to access the Anvilogic App for Splunk.

  • If this is your first time installing the Anvilogic App for Splunk, you are prompted to set up the app. Click Continue to app setup page. To access the app configuration settings after the initial configuration, go to Settings > App Configuration.

  • Complete the general settings.

    1. On the Anvilogic platform, select Settings > Generate API Key. Copy the generated API key.

    2. Navigate to the Anvilogic App for Splunk.

    3. Select Settings > App Configuration.

    4. Click and expand the General Settings section.

    5. Click and expand the API Settings section.

    6. Paste the API key you copied earlier into the API Key field.

  • If your network requires a proxy to connect to Anvilogic, configure the proxy settings in the Anvilogic App for Splunk configuration page.

  • Click Save.

  • Verify the connection

    In your Splunk instance, run the following Splunk search to verify your app's connection with the Anvilogic platform:

        | avlmanage command=check_app_health

    You can view your connection status along with other system health information in the Health Monitoring dashboard in the Anvilogic App for Splunk.

    Next step

    Review data feeds.

    Deploy recommended content

    The following defines additional types of recommended content on the Anvilogic platform and how you can deploy them.

    Threat identifiers

    Recommended threat identifiers can be viewed on the Home page and in the Armory. See Deploy a recommended Snowflake threat identifier for an example of how to deploy a recommended threat identifier from the Home page.

    Trending topics

    Trending topics are in-product versions of the Forge Threat Detection Report emails sent to existing customers. Trending topics can be found on the Home page and the Armory. See Deploy a trending topic for an example of how to deploy all the content in a trending topic.

    Detection packs

    Detection packs are collections of threat identifiers, threat scenarios, and macros that address a specific security issue. Detection packs can be viewed in the Armory. See Deploy a detection pack for an example of how to deploy all the content in a detection pack.

    Next steps

    Perform Additional tasks to set up user access and authentication.


    Verify requirements

    Verify the requirements on this page before you download and install the Anvilogic App for Splunk.

    Supported versions

    You can integrate the Anvilogic platform with Splunk Enterprise versions 9.0.x and 8.0 - 8.3.x.

    Splunk Enterprise Security (ES) versions 5.0 - 7.0.x are supported.

    Where to install the app

    Install the Anvilogic App for Splunk on your Splunk search head. The server where you install the Anvilogic App for Splunk must meet the following requirements:

    • The server must be able to connect to https://secure.anvilogic.com over port 443. This is required to download Splunk code and rules metadata.

    • The server must be able to connect to https://eoi-files.anvilogic.com over port 443.

    • The server must be able to connect to https://databus.anvilogic.com over port 443 to send events for third party vendor alert integrations.

    If you have multiple Splunk Enterprise instances, install the Anvilogic App for Splunk in only one of those environments.

    Splunk Enterprise deployment considerations

    For performance considerations, review the following factors in your Splunk Enterprise deployment:

    • The number of concurrent users.

    • The number of concurrent searches.

    • The types of searches used.

    See How concurrent users and searches impact performance in the Splunk Enterprise Capacity Planning Manual.

    When you deploy threat identifiers on the Anvilogic platform, saved searches are created in your Splunk deployment. You can use cron scheduler recommendations on the Anvilogic platform to manage the load on your Splunk deployment.

    Splunk Enterprise resource and hardware considerations

    Resource and hardware considerations for the Anvilogic App for Splunk match the recommendations for your Splunk Enterprise deployment. See Reference hardware in the Splunk Enterprise Capacity Planning Manual.


    Integrate Snowflake as your data repository

    Integrate the Anvilogic platform with Snowflake.

    Choose Snowflake

    After defining your company profile in the guided onboarding, select Snowflake as the data logging platform.

    You must have admin privileges in Snowflake in order to complete the integration.

    Connect Snowflake to the Anvilogic platform

    Perform the following steps to complete the integration with Snowflake:

    1. Input your Snowflake account identifier to establish a connection between your Snowflake instance and the Anvilogic platform.

    2. Click Copy Code, then click Go to Snowflake to go to your Snowflake instance and run the copied SQL commands. This set of SQL commands creates the necessary Snowflake components, the anvilogic_service Snowflake user used by the Anvilogic platform, and assigns the necessary permissions to the anvilogic_admin role for the anvilogic_service user.

    3. Perform the following tasks in your Snowflake instance:

    Next step

    After you have defined your company's threat profile and connected Snowflake as a data repository, it's time to get data into Snowflake.

    Log Analytics Cross-Tenant Search

    Learn how to configure Azure Lighthouse to enable cross-tenant searches in Microsoft Log Analytics.

    In order to execute cross-tenant queries against a Microsoft Azure Log Analytics Workspace, the proper permissions first need to be configured. This can be done using Azure Lighthouse, a free service that assists customers in managing multiple Azure tenants. In this case, it is used to assign role-based access control (RBAC) permissions to grant service principals permissions across tenants.

    What follows are the instructions to set up Azure Lighthouse to enable the Anvilogic Azure integration to query across Log Analytics Workspaces in different Azure tenants.

    Terminology

    • Provider - The tenant that is providing the service (in which the Anvilogic ADX cluster was deployed).

    • Customer - The tenant that the provider needs access to. This contains the Log Analytics Workspaces that will be searched.

    There is only one provider, but there can be many customers.

    Other Considerations

    At the moment, Microsoft does not support resource-level permissions. Their guidance is to place active DENY permissions for the Anvilogic service principal on any resources in the Customer Resource Group that you don't want it to be able to access.

    Alternatively, you can move the Log Analytics Workspace to its own resource group using the Azure Resource Mover. This is a non-destructive change and would not impact the workspace (that is, it can be done while in production).

    Microsoft also recommends that customers have only one Log Analytics Workspace per region. If customers are using multiple, that is an anti-pattern from Microsoft's perspective. For more information, see https://learn.microsoft.com/en-us/azure/azure-monitor/logs/workspace-design.

    Select your data repository and get data in

    Select the data repository where you store your logs.

    After defining your company profile in the guided onboarding, select a data repository:

    Select a data logging platform

    Next step

    Follow the instructions for your data repository:

    • Integrate Splunk as your data repository

    • Integrate Snowflake as your data repository

    Install the Anvilogic App for Splunk

    The process to get the Anvilogic App for Splunk differs depending on whether you are using Splunk Cloud Platform Classic Experience or Splunk Cloud Platform Victoria Experience.

    Splunk Cloud Platform Classic Experience

    File a service ticket to have Splunk install the Anvilogic App for Splunk for you.

    Splunk Cloud Platform Victoria Experience

    Follow the instructions in Install a public app from Splunkbase in the Splunk Cloud Platform Admin Manual to install the Anvilogic App for Splunk.

    Next step

    Create the Anvilogic indexes.

    Download the Anvilogic App for Splunk

    This page provides instructions for downloading the Anvilogic App for Splunk.

    Download instructions

    Perform the following steps to download the Anvilogic App for Splunk:

    1. Access Splunkbase.

    Configure the HEC collector commands

    Create a HEC token that can write to the custom indexes you just created.

    The Anvilogic App for Splunk contains a custom Splunk command that uses the HTTP Event Collector (HEC) to send results from threat identifiers into the events of interest index. This command is critical to the framework's ability to store events for advanced correlation, and it manages auditing on all objects.

    More information on the HEC and how to set it up can be found in Configure HTTP Event Collector on Splunk Enterprise in the Splunk Enterprise Getting Data In manual.

    Perform the following steps to create inputs on a single search head. Some steps may vary if you are managing a search head cluster.

    1. In Splunk Web, select Settings > Data inputs.

    2. Select HTTP Event Collector > New Token.

    Verify requirements

    Verify the requirements on this page before you download and install the Anvilogic App for Splunk.

    Supported versions

    You can integrate the Anvilogic platform with Splunk Cloud Platform versions 8.0.x and higher. Splunk Enterprise Security (ES) versions: 5.0 - 7.0.x are supported.

    If you are using the Splunk Cloud Platform Classic experience, you won't be able to accept tuning insights.

    See Splunk Cloud Platform Service Details for more information about the differences between Splunk Cloud Platform Classic Experience and Splunk Cloud Platform Victoria Experience.

    Log in and set your password

    Log in for the first time and set your password on the Anvilogic platform.

    Your welcome packet email from Anvilogic contains a link to log in to the Anvilogic platform for the first time. You must change your password the first time you log in to the Anvilogic platform.

    1. Make sure you are a user with administrator privileges on the Anvilogic platform.

    2. Click Set Password in the welcome packet email. You are directed to the set password page.

    3. Enter the email address with which you have registered. This email address must match the email address that the welcome email was sent to.

  • Fill in relevant information:

    • Specify a name of avl_hec_token.

    • Leave the Source Name Override blank.

    • Enter HEC Input for Anvilogic Detection Framework as the description.

    • Leave the Output Group as none.

    • Leave the Enable indexer acknowledgement box unchecked.

  • Click Next to configure the input settings:

    • Source type = Automatic

    • App Context = Anvilogic (anvilogic)

    • index = anvilogic AND index = anvilogic_metrics

    • Default Index = anvilogic

  • Click Review, then click Submit.

  • Copy the token value.

  • Perform the following steps to update the global settings and enable the tokens:

    1. In Splunk Web, select Settings > Data inputs.

    2. Select HTTP Event Collector > Global Settings.

    3. Ensure the following settings are configured:

      • All Tokens: Enabled

      • Enable SSL: checked

      • HTTP Port Number: 8088 (default)
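    Once the token is enabled, a quick way to smoke-test it outside of any Anvilogic workflow is to post a test event to the HEC endpoint with a short Python script. The hostname and token below are placeholders, and the index value mirrors the token settings above:

        import requests

        HEC_URL = "https://splunk.example.com:8088/services/collector/event"  # placeholder host
        HEC_TOKEN = "<value of avl_hec_token>"                                 # token copied earlier

        resp = requests.post(
            HEC_URL,
            headers={"Authorization": f"Splunk {HEC_TOKEN}"},
            json={"index": "anvilogic", "sourcetype": "_json", "event": {"message": "HEC smoke test"}},
            verify=False,  # point this at your CA bundle in production
        )
        # A healthy token returns HTTP 200 with {"text": "Success", "code": 0}
        print(resp.status_code, resp.text)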

    Next step

    Connect to the Anvilogic platform.

    Allow IPs

    If you are installing the Anvilogic App for Splunk on Splunk Enterprise Security (ES) search heads in Splunk Cloud Platform, and you also have search heads that are not on Splunk ES, you must allow all IPs to send to the Splunk Cloud HTTP Event Collector (HEC) endpoint on port 443, because Splunk Cloud Platform does not assign static IPs to its search heads.

    This setting requires an HEC token for authentication and is often used to send data to Splunk Cloud Platform from multiple devices with dynamic IPs, such as mobile devices. See Configure IP allow lists for Splunk Cloud Platform in the Splunk Cloud Platform Admin Config Service Manual for instructions.

    Remove the app from dual environments

    If your environment includes Splunk ES running on Splunk Cloud Platform Victoria and Splunk Enterprise, the Anvilogic App for Splunk is installed in both environments. You must submit a support ticket with Splunk Support to remove the Anvilogic App for Splunk from one of those environments.

    Next step

    After verifying the requirements, Install the Anvilogic App for Splunk.


    Monte Copilot & AI privacy and controls

    Frequently asked questions around privacy and security controls for Monte Copilot and AI used within the Anvilogic platform.

    Is any customer data used to train the AI models in either AI Insights or Copilot?

    Anvilogic agrees that it shall not use, process, or allow access to sensitive or raw Customer Data for the purpose of training, developing, or refining artificial intelligence (AI) models, or any other automated decision-making technologies.

    Is any of the data in the conversations with Monte Copilot being used for model training?

    No. Any data sent to Monte Copilot is not used to train the models. Customer feedback when using Monte Copilot in the form of a thumbs up/down will be used to tune the responses using prompt engineering.

    Does Monte Copilot use a public or private model?

    Monte Copilot uses OpenAI-hosted models.

    Is PII, PHI and/or PCI data going to be used by Monte Copilot?

    Though Monte Copilot may capture PII data submitted by users during a conversation, this PII data is removed and not processed by the LLM models.

    Do customers have the option to opt out of Monte Copilot?

    Yes, all generative AI capabilities are part of add-ons. A customer has to purchase these add-ons to use the capabilities.

    Do you inventory all your AI models, and do you regularly reassess them for risk and compliance?

    Weekly and Monthly output reviews are conducted on models. Models are stored in either UDFs or Sagemaker endpoints. Risks are identified during output reviews and remediated.

    Do you prevent and/or identify and mitigate false positives, data loss prevention and unintended consequences of the AI's outputs?

    We use regular expressions to mitigate clear false positives. If there is a problem with the model, we generate a new model that addresses the perceived shortcomings.

    Do you provide training and support to your employees who are using generative AI?

    Due to the continuous evolution of AI, the team's training is a hands-on approach through weekly meetings to discuss our research and implementation of generative AI-based technologies. In these discussions, we cover the latest information about best practices and knowledge sharing regarding technology and software libraries related to generative AI.

    What customer data, if any, does the solution require for training and/or maintenance?

    None.

    Does the Monte Copilot solution leverage protected attributes such as gender, race, age, disability, spoken language, mental health, or marital status, and proxy features of protected attributes, such as ZIP code?

    No.

    Are there roles to control which users can use Monte Copilot?

    Not yet, but RBAC controlling which team members can use Monte Copilot will be implemented before our generally available (GA) release.

    Where does the data from questions and answers get stored? For how long are Q&A stored? Can Q&A be deleted?

    Questions and answers are stored within an Anvilogic owned database hosted on Anvilogic's AWS instance.

    By default, Q&A are stored for 30 days and then rolled off, unless feedback (thumbs up or thumbs down) was provided.

    This data can be deleted by an Anvilogic engineer with access if required.

    Open a new worksheet.
  • Change the role from PUBLIC to ACCOUNTADMIN.

  • Paste the copied SQL commands into the new worksheet.

  • Click the All Queries checkbox to run all the commands.

  • Click Run.

  • Look for the Statement executed successfully message.

  • Return to the Anvilogic platform, then click Next.

  • Click Copy Code, then click Go to Snowflake to go to your Snowflake instance and run the copied SQL commands. This set of SQL commands creates the S3 storage integration and allows access to the anvilogic_service user so that a connection to your managed S3 bucket where Snowflake retrieves the data can be made.

  • Perform the following tasks in your Snowflake instance:

    1. Open a new worksheet.

    2. Change the role from PUBLIC to ACCOUNTADMIN.

    3. Paste the copied SQL commands into the new worksheet.

    4. Click the All Queries checkbox to run all the commands.

    5. Click Run.

    6. Look for the Statement executed successfully message.

  • Return to the Anvilogic platform, then click Add.
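    If you want to confirm that the copied SQL created the expected objects before returning to the Anvilogic platform, a minimal sketch with the Snowflake Python connector could look like this. The account identifier and credentials are placeholders; the anvilogic_service user and anvilogic_admin role names come from the steps above:

        import snowflake.connector

        conn = snowflake.connector.connect(
            account="xy12345.us-east-1",   # your Snowflake account identifier (placeholder)
            user="<your admin user>",
            password="<your password>",
            role="ACCOUNTADMIN",
        )
        cur = conn.cursor()

        # Confirm the service user and role created by the first set of SQL commands
        cur.execute("SHOW USERS LIKE 'anvilogic_service'")
        print(cur.fetchall())
        cur.execute("SHOW ROLES LIKE 'anvilogic_admin'")
        print(cur.fetchall())

        # Confirm the S3 storage integration created by the second set of SQL commands
        cur.execute("SHOW STORAGE INTEGRATIONS")
        print(cur.fetchall())

        cur.close()
        conn.close()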

    Click Login and log in with your Splunk account.
  • Type Anvilogic in the Search for apps field. Click on Anvilogic App for Splunk in the results.

  • Click Download.

  • Troubleshooting download permissions

    If you don't have the permissions to download the app, you will see Download Restricted when you try to download the app.

    If this happens, you must provide Anvilogic with your Splunk.com or Splunkbase username to satisfy Splunk's access control requirements. You must provide this username for each user who requires download access for the Anvilogic App for Splunk.

    Perform the following tasks to find your Splunkbase username:

    1. Make sure you are logged in to Splunkbase.

    2. Click your user profile photo or avatar, then select My Profile.

    3. Find your username at the top of the screen, such as [email protected] in the following example:

    Download the Anvilogic App for Splunk from the Anvilogic platform

    If needed, you can download the Anvilogic App for Splunk from the Anvilogic platform:

    1. In the Anvilogic platform, click Settings.

    2. In the Anvilogic Splunk App field, click Download.

    The downloaded file is an SPL (Splunk application package) file that can be installed in your Splunk environment.

  • Enter a password meeting the password requirements.

  • Re-enter the password for confirmation.

  • Review the Master service agreement and the privacy policy and click on the check box indicating your consent.

  • Click Submit.

  • After you log in, you will see the first screen of the guided onboarding.

    Guided onboarding welcome screen.

    Click Let's Start to begin.

    Windows data

    This page helps customers use the Forward Events integration in their Anvilogic account with FluentBit.

    Pre-Reqs

    • Anvilogic account

    • Snowflake data repository connected to your Anvilogic account

    • AWS CLI installed

    • FluentBit installed

    Setting up FluentBit Config

    1. Anvilogic will provide an S3 bucket and the corresponding access keys/IDs (note these change for each integration) when you create a forward events integration in your Anvilogic deployment.

    2. Follow the AWS CLI installation steps. Once the installation is complete, run aws configure and paste in the access key and ID provided. Then validate that the credentials file has been created, usually at C:\Users\YourUsername\.aws\credentials.

    Once you have pasted the example config into your fluent-bit.conf file (typically located at C:\Program Files\fluent-bit\conf), note the following:

    • NOTE: You can also edit or add any of your own custom parsers for logs by editing the parser.conf file in the same conf directory.

    • Once you have edited your fluent-bit.conf, please restart the fluentBit application

    1. You can now confirm that data has landed in your snowflake account.

    Please update the input section of this example config to fit your exact needs.
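    One way to check that FluentBit is actually delivering objects to the provided bucket, before verifying in Snowflake, is a short boto3 listing. The bucket and prefix below mirror the example config on this page and will differ for your integration; boto3 reads the same credentials file referenced by AWS_SHARED_CREDENTIALS_FILE:

        import boto3

        s3 = boto3.client("s3", region_name="us-east-1")  # region from the example config

        resp = s3.list_objects_v2(
            Bucket="avl-raw-prod-s3-221-24243202",  # placeholder: bucket provided by Anvilogic
            Prefix="sdi_custom_data-1/",            # placeholder: path provided by Anvilogic
            MaxKeys=10,
        )
        for obj in resp.get("Contents", []):
            print(obj["Key"], obj["Size"], obj["LastModified"])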

    Get data into Snowflake

    Get your data into Snowflake, where it can be used to generate detections on the Anvilogic platform.

    Assumptions

    This document assumes you have completed the guided onboarding:

    • You have defined your company threat profile.

    • You have integrated Snowflake as your data repository

    Before you continue, make sure you are a user with administrator privileges on the Anvilogic platform.

    Data onboarding summary

    The following flowchart summarizes the process for getting your data into Snowflake.

    Data onboarding steps

    Pick one of the following next steps, depending on your infrastructure:

    Self-managed pipelines

    Before you begin, make sure you read Best practices for Snowflake. This document contains important information for optimizing your data onboarding for the best performance.

    After you review the best practices, see Snowflake data ingestion for supported data sources and onboarding instructions for each data source.

    Anvilogic-managed pipelines

    See Snowflake data ingestion for a list of supported data sources. Click on the name of a data source and follow the instructions to get the data into Snowflake. Anvilogic manages the pipelines for these data sources once you have the data source integrated.

    If you have a data source that is not listed here, use Snowflake custom data to get your data in. Cribl Stream is the recommended way to get your data sources into Snowflake. If you don't use Cribl Stream, you can use your own pipelines to Snowflake.

    Next step

    Review data feeds.

    Hybrid - Anvilogic on Splunk & Azure Architecture

    Anvilogic implementation with Splunk & Azure.

    Below is the generic architecture diagram for how Anvilogic works on top of a hybrid data environment like Azure & Splunk.

    • This supports Azure on Data Explorer, Log Analytics, Fabric, and Sentinel.

    • This supports Splunk on Splunk Cloud, Splunk Enterprise on-premise, and Splunk Enterprise Security (ES)

    Diagram:

    PDF Download:

    Hybrid FAQ

    What is the EOI routing pipeline?

    With a multi-platform SIEM, you need to select a primary location to store all of your alerts; in the Anvilogic platform, this is called your “Events of Interest (EOI)” store.

    You select which logging platform you want to contain your consolidated EOIs from all detection inputs, and the EOI routing pipeline ensures all alerts (regardless of where they originate) get routed to the correct destination for correlation opportunities across your data repositories.

    In this example, Splunk was selected to be the primary EOI data repo, which means all Snowflake alerts get routed to the Splunk index. If Snowflake was selected, then all Splunk alerts would get routed to the Snowflake alert table.

    Anvilogic will also store a copy of all alerts generated in the platform Alert Lake, which is used for AI-Insights (ex. Tuning, Health, and Hunting escalations).

    Frequently Asked Questions (FAQs)

    Hybrid - Anvilogic on Splunk & Snowflake Architecture

    Anvilogic implementation with Splunk & Snowflake.

    Architecture Diagram

    Below is the generic architecture diagram for how Anvilogic works on top of a hybrid data environment like Snowflake & Splunk.

    • This supports Snowflake on Azure, AWS, and GCP.

    • This supports Splunk on Splunk Cloud, Splunk Enterprise on-premise, and Splunk Enterprise Security (ES)

    Diagram:

    PDF Download:

    Hybrid FAQ

    What is the EOI routing pipeline?

    With a multi-platform SIEM, you need to select a primary location to store all of your alerts; in the Anvilogic platform, this is called your “Events of Interest (EOI)” store.

    You select which logging platform you want to contain your consolidated EOIs from all detection inputs, and the EOI routing pipeline ensures all alerts (regardless of where they originate) get routed to the correct destination for correlation opportunities across your data repositories.

    In this example, Splunk was selected to be the primary EOI data repo, which means all Snowflake alerts get routed to the Splunk index. If Snowflake was selected, then all Splunk alerts would get routed to the Snowflake alert table.

    Anvilogic will also store a copy of all alerts generated in the platform Alert Lake, which is used for AI-Insights (ex. Tuning, Health, and Hunting escalations).

    Frequently Asked Questions (FAQs)

    Anvilogic on Splunk Architecture

    Anvilogic implementation with Splunk (Cloud & Splunk on-premise).

    Architecture Diagram

    Below is the generic architecture diagram for how Anvilogic works on top of Splunk.

    This supports both Splunk Cloud (Classic & Victoria) and Splunk on-premise.

    Reference Architectures

    The following is Anvilogic's reference architecture to support your environment.

    High Level Architecture

    Anvilogic on Splunk Detailed Architecture

  • Once that has been validated, we need to create a system variable so that FluentBit can read and use these credentials. To do so:

    1. Open the Start Menu and search for “Environment Variables.”

    2. Select Edit the system environment variables.

    3. In the System Properties window, click the Environment Variables button.

    4. Under System variables, click New.

    5. Enter the following:

      1. Variable name: AWS_SHARED_CREDENTIALS_FILE

      2. Variable value: C:\Users\YourUsername\.aws\credentials

    6. Next we need to configure FluentBit to read our logs and send them to S3. In this example, we will be ingesting the Windows event logs. You can change which channels are collected by simply adding or removing them.

      1. Please note, the bucket will be the bucket name/path.

        1. This could mean that it is sdi_customer_data-1 or -2 or -3.

    [INPUT]
        Name         winlog
        Channels     Security, Application, System
        Interval_Sec 1
    
    [OUTPUT]
        Name              s3
        Match             *
        bucket            avl-raw-prod-s3-221-24243202/sdi_custom_data-1
        region            us-east-1
        use_put_object    On
        Store_dir         C:\Windows\Temp\fluent-bit\s3
        s3_key_format     /$TAG/%Y/%m/%d/%H-%M-%S
    Diagram:
    Anvilogic on Splunk (Cloud and On-Premise)

    PDF Download:

    Frequently Asked Questions (FAQs)

    How does Anvilogic get installed for Splunk?

    The Anvilogic App for Splunk gets installed on your Search head (single or clustered). It is approved for Splunk on-premises and Splunk Cloud (both Victoria and Classic).

    Detections run as saved searches in the Anvilogic (AVL) app on a cron schedule, and results go into the Anvilogic index.

    How does the Anvilogic SaaS platform communicate with Splunk?

    All communication (detection use case deployments) is done over REST API using HTTPS/443 with TLS v1.2+.
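    As an illustration of that transport requirement only (this is not an Anvilogic client), a Python HTTPS client that refuses anything below TLS 1.2 might look like the following; the URL is one of the Anvilogic endpoints mentioned elsewhere in this guide and is used purely as an example:

        import ssl
        import requests
        from requests.adapters import HTTPAdapter

        class TLS12Adapter(HTTPAdapter):
            """Transport adapter that enforces TLS 1.2 or newer."""
            def init_poolmanager(self, *args, **kwargs):
                ctx = ssl.create_default_context()
                ctx.minimum_version = ssl.TLSVersion.TLSv1_2
                kwargs["ssl_context"] = ctx
                return super().init_poolmanager(*args, **kwargs)

        session = requests.Session()
        session.mount("https://", TLS12Adapter())
        resp = session.get("https://secure.anvilogic.com", timeout=10)
        print(resp.status_code)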

    Can you help bring raw log data into Splunk for us?

    No, Anvilogic does not provide a connector service or forwarding agent to help bring security logs into Splunk. Anvilogic only supports raw data ingestion for Snowflake.

    Can you help bring alert data into Splunk for us?

    Yes, Anvilogic can help retrieve alerts/signals from SaaS security tools (ex. Proofpoint, Wiz, Crowdstrike, etc.) and can ingest those into the Anvilogic index for correlation.

    Does Anvilogic require Splunk Enterprise Security (ES)?

    No, Anvilogic does not require Splunk ES to operate. However, Anvilogic can integrate with the existing ES framework if required.

    Anvilogic does have a native triage capability that can replace certain ES components if required.

    What data will Anvilogic have access to?

    The Anvilogic Splunk app should be installed on a search head that has access to security data. This will allow the detection team to build and deploy detections to the search heads that have access to the indexed data.

    All RBAC controls are still maintained by your existing Splunk admins.

    What is the Anvilogic Index?

    Anvilogic Index will store the output from all detections that are running within the Anvilogic Splunk app.

    This is a fully normalized set of signals that we call “events of interest” that can be used to escalate activity to your SOAR or can be used as a hunting index to create Threat Scenario correlations.

    Do you collect the alerts stored in the Anvilogic index?

    Not by default. Alerts are stored inside of your Splunk index you specify during the Splunk app setup.

    The Anvilogic AI-Insights (ex. Hunting, Tuning, Health) package requires a copy of these events to be collected and stored by Anvilogic. If enabled, a copy of those events will be collected into Anvilogic.

    Do you provide parsers for un-normalized data?

    Yes. Anvilogic does not require any Splunk add-on to function; we provide hundreds of out-of-the-box parsers that can be used to normalize your security data inside of Splunk.

    Do you integrate with SOAR?

    Yes, Anvilogic can integrate with most SOARs via REST API through either a push or a pull method.

    Does Anvilogic have a triage capability in Splunk?

    Yes, our Anvilogic app for Splunk has built in triage and allowlisting capabilities to make it easy to investigate alerts that are being generated.

    We can also easily integrate with any downstream SOAR platform you are using.

    Reference Architecture - Splunk - June 24 2024.pdf (5MB)

    Anvilogic on Azure

    Anvilogic implementation with Azure (Data Explorer, Log Analytics, and Fabric).

    Architecture Diagram

    Below is the generic architecture diagram for how Anvilogic works on top of Azure.

    This supports Azure Log Analytics, Azure Data Explorer (ADX), and Fabric workspaces.

    We support querying a Log Analytics Workspace in a different tenant than the Anvilogic Azure Data Explorer Cluster.

    • In order to execute cross-tenant queries against a Microsoft Azure Log Analytics Workspace, the proper permissions first need to be configured.

    • This can be done using Azure Lighthouse, a free service that assists customers in managing multiple Azure tenants.

    Questions around cost? Review Azure Costs Estimates.

    Diagram:

    PDF Download:

    Frequently Asked Questions (FAQs)

    What gets installed in my Azure environment?

    The following infrastructure will be created in the resource group you create for Anvilogic. We use an Azure ARM template to deploy the infrastructure.

    • User managed identity (gives permissions to access the key vault and ADX tables)

    • Azure Key Vault

    • ADX cluster, database, and tables

    • Azure Container App environment, jobs, and instance

    • Log Analytics workspace for the container app

    • Azure Container App registry & cache

    Does Anvilogic's integration incur any Azure costs?

    Yes, there are costs associated with running the Anvilogic Resource Group in Azure, specifically on the Azure Data Explorer hosting cluster. These costs depend on the compute required to execute detections within your environment.

    The average costs of running Anvilogic's required infrastructure in Azure range from $6,000 to $25,000 per year, depending on how many detections you will be running.

    What costs money?

    During the setup process, a VM is created that manages the Data Explorer cluster. The default size of that VM in our automated installation is Standard_E8ads_v5 (Medium, 8 vCPUs).

    Calculate Costs:

    • Visit the Azure Pricing Page and type "azure data explorer" under products.

    • In the Instance section, type "E8ads".

    • Refer to Azure Costs Estimates for more details.

    What permissions do I need to create/use to install Anvilogic’s Azure integration?

    You will be creating the following:

    1. Create a new App registration for Anvilogic

    2. Create a new secret in the new App that was created in Step 1

    3. Create an Anvilogic resource group

    4. Go through our integration set up on the Anvilogic Platform

    Can you query Data Explorer, Log Analytics, and Fabric?

    Yes, Anvilogic supports searching and running detections against any data source inside of an Azure LA, ADX cluster, or Fabric workspace.

    You will need to give the Anvilogic app service principal permissions to query any of the ADX, LA clusters, or Fabric workspaces you want Anvilogic searches and detections to use.

    For Microsoft Fabric you need to create a workspace, leverage an event stream under real time intelligence, and the destination from the event stream MUST be a KQL Database.

    The cluster command will then be able to query the KQL database.

    Querying a Log Analytics Workspace in a different tenant than the Anvilogic Azure Data Explorer cluster requires additional configuration. See Log Analytics Cross-Tenant Search.

    How does the Anvilogic platform query our LA, ADX, or Fabric Clusters?

    We connect into your ADX cluster and then use the Microsoft cluster command to initiate a query to any other LA, ADX cluster, or Fabric workspace that our app service principal has access to.

    For Microsoft Fabric you need to create a workspace, leverage an event stream under real time intelligence, and the destination from the event stream MUST be a KQL Database.

    The cluster command will then be able to query the KQL database.

    Querying a Log Analytics Workspace in a different tenant than the Anvilogic Azure Data Explorer cluster requires additional configuration. See Log Analytics Cross-Tenant Search.

    What if Azure isn't my primary SIEM and I have a hybrid set up?

    Since Anvilogic supports multiple SIEM/Data Lakes, you can configure all of the events of interest (EOIs) generated from detection queries to also write a copy back to your primary Alert Lake or EOI data store. That can be located in any of the other supported platforms (ex. Splunk, Snowflake).

    For example, if Splunk is your primary SIEM, then you can configure all of your Azure detection results to also send a copy of the event of interest (EOI) back to the Anvilogic index in Splunk. The Anvilogic platform handles all of this EOI routing for you.

    Can you help bring alert data into Azure for us?

    Yes, Anvilogic can help retrieve alerts/signals from SaaS security tools (ex. Proofpoint, Wiz, Crowdstrike, etc.) and can ingest those into the Anvilogic table in ADX for correlation.

    Can you help bring raw data into Azure for us?

    No, Anvilogic does not support raw data ingestion into ADX, LA, or Fabric. Data must already be present in those environments.

    Anvilogic only supports raw data ingestion for Snowflake.

    Do you provide parsers for un-normalized data?

    Yes, we provide hundreds of out-of-the-box parsers that can be used to normalize your security data inside of ADX, LA, or Fabric.

    What is the Anvilogic Alert Table in ADX?

    Anvilogic Alert table in ADX will store the output from all detections that are running within the App container environment.

    This is a fully normalized set of signals that we call “events of interest” that can be used to escalate activity to your SOAR or can be used as a hunting index to create Threat Scenario correlations.

    Do you collect the alerts stored in the Anvilogic Alert Table in ADX?

    Alerts are stored inside of the ADX table that you specify during the setup.

    The Anvilogic AI-Insights (ex. Hunting, Tuning, Health) package requires a copy of these events to be collected and stored by Anvilogic. If enabled, a copy of those events will be collected into Anvilogic.

    Do you integrate with SOAR?

    Yes, Anvilogic can integrate with most SOARs via REST API through either a push or a pull method.

    Anvilogic on Snowflake Architecture

    Anvilogic implementation on Snowflake (AWS, GCP, Azure).

    Architecture Diagram

    Below is the generic architecture diagram for how Anvilogic works on top of Snowflake.

    This supports Snowflake on Azure, AWS, and GCP.

    Overall Diagram:

    ETL Parsing & Normalization Process

    PDF Download:

    Frequently Asked Questions (FAQs)

    Does it matter which public cloud I own?

    Snowflake will be configured in the IaaS environment that you have and is available across AWS, GCP, and Azure.

    You can also have a separate Snowflake account per environment if you are a multi-IaaS organization.

    Data that already originates in IaaS that can be sent to cloud storage does not require a streaming tool and can be onboarded to Snowflake directly.

    How do I onboard logs from data center assets?

    Datasets that come from assets hosted within a data center or not in a public IaaS environment will require a solution to route that data to Snowflake.

    Data streaming tools (ex. Cribl, Fluentbit, Apache NiFi) can be used to send on-prem. logs directly to Snowflake.

    It is a requirement that you have a data transport/streaming tool to send data to IaaS storage or Anvilogic pipelines for ingestion.

    Forwarding agents that are installed on endpoints also need to be re-configured to send to the streaming tools for ingestion into Snowflake.

    Snowflake and/or Anvilogic does not provide any data streaming or endpoint agent technology.

    How do you get data that originates in public cloud into Snowflake?

    Configure your security tools & appliances to log to cloud storage services like S3, Blob storage, or GCP storage - Snowpipe then picks it up and ingests into Snowflake.

    How does the Anvilogic SaaS platform communicate with the Snowflake database that gets installed in your Snowflake account?

    All communication (search and detection use case deployments) is done over REST API using HTTPS/443 with TLS v1.2+.

    Can Anvilogic help with getting raw data into Snowflake?

    Yes, if you have a streaming tool (ex. Cribl, Fluentbit, Apache NiFi) you can send custom data sources directly to Anvilogic’s ingestion pipeline. Anvilogic also has some out-of-the-box support for raw data ingestion sources in our integrations armory.

    This will send data to our S3 storage service, which temporarily stores data to process it into Snowflake.

    Does Anvilogic help with parsing and normalization of raw data into Snowflake?

    Yes, Anvilogic helps with all of the parsing and normalization of security relevant data into the Anvilogic schema. We have onboarding templates and configs that will help ensure the data you are bringing into Snowflake is properly formatted to execute detections and perform triage, hunting, and response.

    Can Anvilogic help getting enrichment data into Snowflake?

    Yes, if you have third party Intel or CMDB tools that are required to be used within detection enrichment, those can be called via REST API and transported into a Snowflake table.

    Anvilogic detections can then leverage those enrichment tables to enrich detections before those detections are stored in the Alert lake (upstream of SOAR).

    Does Anvilogic have out-of-the-box integrations for specific vendors alert sources?

    Yes, Anvilogic can provide out of the box integrations for common vendor alerts and data collection for specific SaaS Security tools (ex. Crowdstrike FDR).

    Tools not listed in our integration marketplace can be sent through the Custom Data Integration pipeline as a self service option.

    What is the difference between raw data vs Alert data?

    Raw data sources are events/telemetry that is generated from endpoints/tools/appliances (ex. Windows Event logs, EDR logs).

    Alerts data is curated signals from security tools (ex. Proofpoint alerts, Anti-virus alerts, etc.) that has already been identified to be suspicious or malicious by the vendor.

    Do you use Snowflake Warehouses?

    Yes, Anvilogic requires 2 warehouses to run.

    • Ad-hoc Warehouse - Compute for queries to assist search, hunt, and IR

    • Detect Warehouse - Run 24/7 executing scheduled tasks (detections) on a cron

    Do you integrate with SOAR?

    Yes, Anvilogic can integrate with most SOARs via REST API through either a push or a pull method.

    Does Anvilogic have a search user interface (UI) for Snowflake?

    Yes, Anvilogic has a search user interface (UI) to make it easy to query data that is inside of a Snowflake database.

    In addition, Anvilogic makes it easy to build repeatable detections that can execute on top of Snowflake using a low-code UI builder.

    Does Anvilogic have a data model? Does it work with OCSF?

    Yes, Anvilogic has a data model and offers parsing and normalization code for any security data set that you want to use within the platform.

    Yes, we can also work with OCSF data, and each data feed can be modified/controlled to customize to your needs.

    Does Anvilogic support IOC collection & searching?

    Yes, Anvilogic can onboard IOCs from your third party threat intel tools (ex. Threat Connect) and use that data to create new detections, conduct ongoing exposure checks across your data feeds, or use it to enrich your alert output for triage analysts.

    Linux data

    This page helps customers use the Forward Events integration in their Anvilogic account with FluentBit.

    Pre-Reqs

    • Anvilogic account

    • Snowflake data repository connected to your Anvilogic account

    • FluentBit installed

    Setting up FluentBit Config

    1. Anvilogic will provide an S3 bucket and the corresponding access keys/IDs (note these change for each integration) when you create a forward events integration in your Anvilogic deployment.

    2. Create a credentials file on the machine that FluentBit can read from, for example /home/<username>/creds. Inside the file, paste the following config with your specific access key/ID:

    1. Since our credentials are already updated in the /home/<username>/creds file, we need to configure the service config file for Fluent Bit and set the path to this credential file (see image for reference). To do that, fire up your favorite text editor and edit the fluent-bit.service file located at /usr/lib/systemd/system/fluent-bit.service.

      1. Environment="AWS_SHARED_CREDENTIALS_FILE=/home/<username>/creds"

    1. Then run the following commands in a terminal window

      1. sudo systemctl daemon-reload

      2. sudo systemctl start fluent-bit

    Once you have pasted the example config into your fluent-bit.conf file (typically located at /etc/fluent-bit/fluent-bit.conf), note the following:

    • NOTE: You can also edit or add any of your own custom parsers for logs by editing the parser.conf file at /etc/fluent-bit/

    • Once you have edited your fluent-bit.conf, please restart the fluentBit service sudo systemctl restart fluent-bit

      • You can validate that your config is working by heading to /tmp/fluent-bit/s3/ and looking inside that folder.

    1. You can now confirm that data has landed in your snowflake account.

    Please update the input section of this example config to fit your exact needs.
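    To confirm that forwarded data has landed, you can also query Snowflake directly. A minimal sketch with the Snowflake Python connector follows; the account, credentials, warehouse, and the database and table names are placeholders that depend on how your integration was provisioned:

        import snowflake.connector

        conn = snowflake.connector.connect(
            account="xy12345.us-east-1",     # placeholder account identifier
            user="<your user>",
            password="<your password>",
            warehouse="<your warehouse>",
        )
        cur = conn.cursor()

        # List tables in the landing database to find where forwarded events arrive
        cur.execute("SHOW TABLES IN DATABASE ANVILOGIC")  # hypothetical database name
        for row in cur.fetchall():
            print(row)

        # Spot-check that rows exist in a landing table (hypothetical name)
        cur.execute("SELECT COUNT(*) FROM ANVILOGIC.RAW.SDI_CUSTOM_DATA")
        print(cur.fetchone())

        cur.close()
        conn.close()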

    Anvilogic on Databricks Architecture

    Anvilogic implementation on Databricks (AWS, Azure, GCP).

    Architecture Diagram

    Below is the generic architecture diagram for how Anvilogic works on top of Databricks.

    This supports Databricks on AWS, Azure, and GCP.

    Review data feeds

    Review the category mappings and quality of your data feeds.

    Your data feeds are automatically categorized and synchronized to the Anvilogic platform every 7 days. When you add a data feed, you can view it on the Data Feeds page within 7 days.

    Review the data feed category

    Verify the category of your data feeds matches what you expect, as this affects your MITRE coverage. Select Maturity Score > Data Feeds from the navigation bar, then review the categories for each data feed:

    To change or add categories to a data feed:



    Next, we need to configure Fluent Bit to read our logs and send them to S3. In this example, we read apache2 access logs and send them to S3.
    FluentBit installed
    [default]
    aws_access_key_id = AKIAIOSFODNN7EXAMPLE
    aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    [INPUT]
        Name              tail
        Tag               apache
        Path              /var/log/apache2/access.log
        Parser            apache2
        Mem_Buf_Limit     50MB
    
    [OUTPUT]
        Name              s3
        Match             *
        bucket            avl-raw-prod-s3-111-12345678/sdi_custom_data-0
        region            us-east-1
        use_put_object    On
        Store_dir         /tmp/fluent-bit/s3
        s3_key_format     /$TAG/%Y/%m/%d/%H/%M/%S

    Diagram:

    Anvilogic on Databricks (AWS, GCP, or Azure)

    ETL Parsing & Normalization Process

    ETL Process for moving data from Raw format to Gold.

    PDF Download: Databricks-Reference-Architecture.pdf

    Frequently Asked Questions (FAQs)

    Does it matter which public cloud I own?

    Databricks will be configured in the IaaS environment that you have and is available across AWS, Azure, and GCP.

    Data that already originates in IaaS that can be sent to cloud storage does not require a streaming tool and can be onboarded to Databricks directly.

    What type of Databricks Compute is required?

    Anvilogic requires two types of compute warehouses to run:

    1. SQL Warehouse - Compute for ad-hoc queries to assist search, hunt, and IR

    2. All-Purpose/Job Compute - Used to execute scheduled detection workflow jobs and for our low-code detection builder

    For both, you have the option to use either serverless or classic compute, but serverless is the default and highly recommended for improved performance, scalability, and cost.

    How do detection use cases execute in Databricks?

    Detections execute as jobs within a Databricks Workflow. Rules built on the Anvilogic platform are converted from a user-friendly SQL builder into PySpark functions that run on a defined schedule.
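    As a rough illustration of what such a scheduled job can look like, the following PySpark sketch filters a 15-minute window of events and writes matches to an output table. The catalog, table, and column names are placeholders, not Anvilogic's actual schema or generated code.

    # Illustrative PySpark detection job (placeholder table and column names).
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    # Read the last 15 minutes of endpoint events from a hypothetical gold table.
    events = (
        spark.read.table("security.gold.endpoint_process")
        .where(F.col("event_time") >= F.expr("current_timestamp() - INTERVAL 15 MINUTES"))
    )

    # Example detection logic: flag encoded PowerShell command lines.
    hits = events.where(
        F.lower(F.col("process_name")).contains("powershell")
        & F.lower(F.col("command_line")).contains("-enc")
    )

    # Append matches to a hypothetical alert output table for triage.
    hits.write.mode("append").saveAsTable("security.gold.detection_alerts")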

    How do I onboard logs from data center assets?

    Datasets that come from assets hosted within a data center or not in a public IaaS environment will require a solution to route that data to Databricks.

    Data streaming tools (ex. Cribl, Fluentbit, Apache NiFi) can be used to send on-premises logs directly to Databricks.

    It is a requirement that you have a data transport/streaming tool to send data to IaaS storage or Anvilogic pipelines for ingestion.

    Forwarding agents that are installed on endpoints also need to be re-configured to send to the streaming tools for ingestion into Databricks.

    Databricks and/or Anvilogic does not provide any data streaming or endpoint agent technology.

    How do you get data that originates in public cloud into Databricks?

    Python Notebooks are used to collect data from storage and transform raw events into the AVL detection schema, preferably using Lakeflow Pipelines (formerly known as Delta Live Tables).
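    As a minimal sketch of this pattern, the following Lakeflow (Delta Live Tables) snippet uses Auto Loader to land raw events from cloud storage into a Bronze table with a time and raw column. The storage path and table name are illustrative placeholders.

    # Illustrative Lakeflow / Delta Live Tables ingestion step (placeholder names).
    import dlt
    from pyspark.sql import functions as F

    @dlt.table(name="bronze_custom_events", comment="Unparsed raw events (time, raw)")
    def bronze_custom_events():
        return (
            spark.readStream.format("cloudFiles")          # Auto Loader
            .option("cloudFiles.format", "text")           # keep the payload unparsed
            .load("s3://<your-raw-bucket>/sdi_custom_data/")
            .select(F.current_timestamp().alias("time"), F.col("value").alias("raw"))
        )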

    Can Anvilogic help with getting raw data into Databricks?

    Yes, if you have a streaming tool (ex. Cribl, Fluentbit, Apache NiFi) you can send custom data sources directly to your primary storage servers (ex. S3, Blob, etc.) and Anvilogic can orchestrate the ETL process into the correct schema and tables required for detection purposes.

    Does Anvilogic help with parsing and normalization of raw data into Databricks?

    Yes, Anvilogic helps with all of the parsing and normalization of security relevant data into the Anvilogic schema.

    We have onboarding templates and configs that will help ensure the data you are bringing into Databricks is properly formatted to execute detections and perform triage, hunting, and response.

    All data parsing, normalization, and enrichment is done in the Python Notebook section of the diagram above.

    What is the difference between Bronze, Silver, and Gold tables?

    Anvilogic leverages Databricks Lakeflow Pipelines to assist in the ETL process of parsing, normalization and enrichment.

    • Bronze Tables - Unparsed and unstructured data, usually in 2 columns (time and raw).

    • Silver Tables - Parsed and structured data; this is usually where raw data is separated into multiple columns (normalization and enrichment can also occur here).

    • Gold Tables - Normalized and enriched security data feeds that have been organized into tables based on their security domain (ex. Endpoint, Cloud, Network, etc).

    Each feed can be customized based on your organization's preferences.
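    A minimal sketch of the Bronze/Silver/Gold layering described above, using Lakeflow (Delta Live Tables), follows. The parsing schema, table names, and the Gold filter are illustrative placeholders; real feeds use Anvilogic's onboarding templates and schema.

    # Illustrative Silver and Gold steps building on a Bronze table with (time, raw) columns.
    import dlt
    from pyspark.sql import functions as F

    @dlt.table(name="silver_custom_events", comment="Parsed and structured events")
    def silver_custom_events():
        bronze = dlt.read_stream("bronze_custom_events")   # assumed columns: time, raw
        parsed = bronze.withColumn(
            "payload", F.from_json("raw", "host STRING, user STRING, action STRING")
        )
        return parsed.select("time", "payload.*")

    @dlt.table(name="gold_endpoint", comment="Normalized endpoint security feed")
    def gold_endpoint():
        # Organize normalized events into a security-domain table (ex. Endpoint).
        return dlt.read_stream("silver_custom_events").where(F.col("action").isNotNull())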

    Does Anvilogic have out-of-the-box integrations for specific vendors alert sources?

    Yes, Anvilogic can provide out of the box integrations for common vendor alerts and data collection for specific SaaS Security tools (ex. Crowdstrike FDR).

    Tools not listed in our integration marketplace can be sent through the Custom Data Integration pipeline as a self service option.

    What is the difference between raw data and alert data?

    Raw data sources are events/telemetry that is generated from endpoints/tools/appliances (ex. Windows Event logs, EDR logs).

    Alerts data is curated signals from security tools (ex. Proofpoint alerts, Anti-virus alerts, etc.) that has already been identified to be suspicious or malicious by the vendor.

    Do you integrate with SOAR?

    Yes, Anvilogic can integrate with most SOARs via REST API through either a push or a pull method.
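    As an illustration of the push method, the sketch below forwards an alert to a SOAR's REST endpoint with Python. The URL, token, and payload fields are hypothetical; use your SOAR vendor's documented API.

    # Illustrative push-style SOAR integration (hypothetical endpoint and payload).
    import requests

    SOAR_URL = "https://soar.example.com/api/events"   # hypothetical endpoint
    API_TOKEN = "<your-api-token>"

    alert = {
        "title": "Encoded PowerShell execution detected",
        "severity": "high",
        "source": "Anvilogic",
        "entity": "host-1234",
    }

    resp = requests.post(
        SOAR_URL,
        json=alert,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    print("SOAR accepted the alert:", resp.status_code)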

    Does Anvilogic have a search user interface (UI) for Databricks?

    Yes, Anvilogic has a search user interface (UI) to make it easy to query data that is inside of a Databricks catalog.

    In addition, Anvilogic makes it easy to build repeatable detections that can execute on top of Databricks using a low-code UI builder.

    Does Anvilogic have a data model? Does it work with OCSF?

    Yes, Anvilogic has a data model and offers parsing and normalization code for any security data set that you want to use within the platform.

    Yes, we can also work with OCSF data, and each data feed can be modified/controlled to customize to your needs.

    Does Anvilogic support IOC collection & searching?

    Yes, Anvilogic can onboard IOCs from your third party threat intel tools (ex. Threat Connect) and use that data to create new detections, conduct ongoing exposure checks across your data feeds, or use it to enrich your alert output for triage analysts.

    Databricks Technology Partner and Built-On Partner Badges

    1. Click on the name of the data feed.

    2. Click Tags.

    3. In the Data Categories field, enter the data categories you want associated with this data feed.

    4. Click Update when you are finished.

    Review the data feed quality

    Select Maturity Score () > Data Feeds from the navigation bar, then review the quality for each data feed:

    An initial quality feed assessment is made by the Anvilogic platform for any new data feed added to the Anvilogic platform.

    Perform your own evaluation of the timeliness, logging level, field extraction, and monitoring scope for each data feed so you can assign a proper data feed quality. Feed quality is important because only Good quality feeds are used to generate recommendations on the Anvilogic platform.

    To manually change the quality of a data feed:

    1. Click on the name of the data feed.

    2. Select one of the qualities from the Feed Quality dropdown.

    3. Click Update when you are finished.

    Auto-compute feed qualities are available for Windows event logs in Splunk. See Data feed quality auto computation.

    Next step

    Review and deploy recommended content.

    Azure Costs Estimates

    Unified Detect for Azure supports Azure Log Analytics, Azure Data Explorer (ADX), and Microsoft Fabric.

    Installing Anvilogic's UD for Azure creates a new Azure Data Explorer cluster in your environment that is used to manage objects to run the Unified Detect framework.

    During the setup process, a VM is created that manages the Data Explorer cluster. The default size selected by our automated installation of that VM is a Standard_E2ads_v5 (Medium 8vCPUs) in a production cluster with SLA. This can be changed at any time if the number of detections you have running requires more compute resources.

    Review your billing configuration for the ADX pricing tiers that control cluster management, so that scaling behaves as expected and the Anvilogic service is not terminated.

    See Azure Data Explorer's default concurrency and the Microsoft Azure pricing calculator.

    Estimated cluster sizes

    The table below assumes each deployed job run averages 1 minute and every rule deployed has the specified job run frequency. In reality, you could have a mix of how long jobs take to run and how often they run. The table below is a guideline for estimating capacity, and is based on Azure Data Explorer's default concurrency limits, which equal the number of cores multiplied by 10. For example, a Standard_E8ads_v5 cluster with 8 cores has a default concurrency limit of 80.

    Three concurrent job runs are reserved for ad-hoc jobs executed from the Azure TI Builder view when creating or editing a threat identifier. The remaining concurrency is available for deployed rules.

    Other KQL queries being run outside of Azure UD also contribute towards this search concurrency and can cause throttled jobs if the cluster is operating near full utilization.

    Cluster size          Azure ADX concurrency limit    Job run frequency (in minutes)    Deployed rules limit
    Standard_E2ads_v5     20                             5                                 80
    Standard_E2ads_v5     20                             15                                240
    Standard_E2ads_v5     20                             30                                480
    Standard_E2ads_v5     20                             60                                960
    Standard_E4ads_v5     40                             5                                 180
    Standard_E4ads_v5     40                             15                                540
    Standard_E4ads_v5     40                             30                                1,080
    Standard_E4ads_v5     40                             60                                2,160
    Standard_E8ads_v5     80                             5                                 380
    Standard_E8ads_v5     80                             15                                1,140
    Standard_E8ads_v5     80                             30                                2,280
    Standard_E8ads_v5     80                             60                                4,560
    Standard_E16ads_v5    160                            5                                 780
    Standard_E16ads_v5    160                            15                                2,340
    Standard_E16ads_v5    160                            30                                4,680
    Standard_E16ads_v5    160                            60                                9,360
    Standard_D32d_v4      320                            5                                 1,580
    Standard_D32d_v4      320                            15                                4,740
    Standard_D32d_v4      320                            30                                9,480
    Standard_D32d_v4      320                            60                                18,960

    Cluster size costs

    The table shows the estimated monthly cost for various cluster sizes.

    The estimated monthly and annual costs do not include additional storage costs. To determine the additional storage costs, use the Microsoft Azure pricing calculator in the Microsoft documentation.

    Cluster size          Number of cores    Estimated monthly cost    Estimated annual cost
    Standard_E2ads_v5     2                  $512                      $6,144
    Standard_E4ads_v5     4                  $1,024                    $12,288
    Standard_E8ads_v5     8                  $2,050                    $24,600
    Standard_E16ads_v5    16                 $4,099                    $49,188
    Standard_D32d_v4      32                 $7,781                    $93,372

    Assign the avl_admin role

    Assign the avl_admin role to your admin users.

    Use Splunk Web to assign the avl_admin role to app administrators. See Create and manage roles with Splunk Web in the Securing Splunk Enterprise manual for instructions.

    Assign desired roles directly to each user. Don't inherit user roles through another role.

    Customize roles

    The following roles are available on the Anvilogic App for Splunk. See Summary of roles and privileges for a summary of the privileges provided by each role.

    • avl_admin

    • avl_senior_developer

    • avl_developer

    • avl_senior_triage

    • avl_triage

    • avl_readonly

    You can customize the avl_senior_developer, avl_developer, avl_senior_triage, and avl_triage roles. The avl_admin and avl_readonly roles can't be modified.

    For example, perform the following tasks to customize the capabilities allowed or restricted by the AVL Senior Developer role:

    1. In the Anvilogic App for Splunk, select Settings > App Configuration.

    2. Click User Settings to expand the section.

    3. Click Customize AVL Senior Developer Role to expand the section for that role.

    4. Deselect any capabilities you want to remove for this role, or select a capability to add it to the role.

    5. Click Save.

    Summary of roles and privileges

    The following table lists the roles in the Anvilogic App for Splunk and the privileges granted by each role. You can customize the privileges enabled for each role as desired.

    Privilege
    AVL Senior Developer
    AVL Developer
    AVL Senior Triage
    AVL Triage

    Next step

    Configure the HEC collector commands.

    Syslog data

    This page helps customers leverage the Forward Events integration in their Anvilogic account using Fluent Bit.

    Pre-Reqs

    • Anvilogic account

    • Snowflake data repository connected to your Anvilogic account

    Setting up FluentBit Config

    1. Anvilogic provides an S3 bucket and the corresponding access key ID and secret access key (note that these change for each integration) when you create a Forward Events integration in your Anvilogic deployment.

    2. Create a credentials file that Fluent Bit can read on the machine where it runs, for example /home/<username>/creds. Inside the file, paste the following configuration with your specific access key ID and secret access key.

    3. With the credentials in place in /home/<username>/creds, point the Fluent Bit service at this credentials file. Open the fluent-bit.service unit file located at /usr/lib/systemd/system/fluent-bit.service in a text editor and add the following line:

       1. Environment="AWS_SHARED_CREDENTIALS_FILE=/home/<username>/creds"

    4. Then run the following commands in a terminal window:

       1. sudo systemctl daemon-reload

       2. sudo systemctl start fluent-bit

    5. Paste the example configuration below into your fluent-bit.conf file (typically located at /etc/fluent-bit/fluent-bit.conf).

    • NOTE: You can also edit or add your own custom parsers for logs by editing the parser.conf file in /etc/fluent-bit/.

    • After you have edited your fluent-bit.conf, restart the Fluent Bit service: sudo systemctl restart fluent-bit

      • You can validate that your configuration is working by looking inside the /tmp/fluent-bit/s3/ folder.

    6. You can now confirm that data has landed in your Snowflake account.

    Please update the INPUT section of this example configuration to fit your exact needs.


    avl_remove_al_rule_entry

    ✓

    ✓

    ✓

    avl_modify_al_rule_entry

    ✓

    ✓

    ✓

    ✓

    avl_add_al_global_entry

    ✓

    ✓

    ✓

    avl_remove_al_global_entry

    ✓

    ✓

    ✓

    avl_modify_al_global_entry

    ✓

    ✓

    ✓

    avl_manage_rule_al

    ✓

    ✓

    ✓

    avl_manage_global_al

    ✓

    ✓

    ✓

    Triage privileges

    avl_change_first_alert_status

    ✓

    avl_change_all_alert_status

    ✓

    ✓

    ✓

    avl_change_alert_status_to_new

    ✓

    ✓

    ✓

    avl_bulk_alert_status

    ✓

    ✓

    ✓

    avl_add_observation

    ✓

    ✓

    ✓

    ✓

    avl_remove_observation

    ✓

    ✓

    ✓

    avl_rate_rule

    ✓

    ✓

    ✓

    ✓

    avl_add_rule_feedback

    ✓

    ✓

    ✓

    ✓

    avl_create_case

    ✓

    ✓

    ✓

    ✓

    avl_suppress_alert

    ✓

    ✓

    ✓

    ✓

    avl_suppress_global_alert

    ✓

    ✓

    ✓

    ✓

    Content deployment privileges

    avl_deploy_content

    ✓

    avl_write_hec

    ✓

    ✓

    ✓

    avl_post_rest_platform

    ✓

    ✓

    avl_post_rest

    ✓

    ✓

    ✓

    avl_get_rest

    ✓

    ✓

    ✓

    ✓

    avl_rest_config_access_get

    ✓

    ✓

    ✓

    ✓

    avl_rest_config_access_post

    Allowlist privileges

    avl_add_al_rule_entry

    ✓

    ✓

    ✓

    ✓

    Next, we need to configure Fluent Bit to receive our logs and send them to S3. In this example, we receive logs via syslog and send them to S3.
    FluentBit installed
    [default]
    aws_access_key_id = AKIAIOSFODNN7EXAMPLE
    aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
    [INPUT]
        Name              syslog
        Mode              udp
        Listen            0.0.0.0
        Port              1515
        Parser            syslog-rfc3164
        Mem_Buf_Limit     10MB
    
    [OUTPUT]
        Name              s3
        Match             *
        bucket            avl-raw-prod-s3-221-24243202/sdi_custom_data-0
        region            us-east-1
        use_put_object    On
        Store_dir         /tmp/fluent-bit/s3
        s3_key_format     /$TAG/%Y/%m/%d/%H/%M/%S
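    To verify that the syslog input above is receiving events, you can send a test RFC 3164 message to the configured UDP port. The sketch below is minimal and assumes Fluent Bit is listening locally on port 1515; the hostname and tag are illustrative.

    # Send one test RFC 3164 syslog message over UDP to Fluent Bit (illustrative values).
    import socket
    import time

    PRI = 13  # facility=user (1) * 8 + severity=notice (5)
    timestamp = time.strftime("%b %d %H:%M:%S")
    message = f"<{PRI}>{timestamp} test-host anvilogic-test: hello from the syslog test sender"

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(message.encode("utf-8"), ("127.0.0.1", 1515))
    sock.close()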

    AI security controls

    This page summarizes the AI security controls and measures in place on the Anvilogic platform.

    Controls

    The table summarizes the security controls in place for AI on the Anvilogic platform.

    Control Category

    Controls Applied

    Context is established and understood.

    Intended purposes, potentially beneficial uses, context-specific laws, norms and expectations, and prospective settings in which the AI system will be deployed are understood and documented. Considerations include the specific set or types of users along with their expectations; potential positive and negative impacts of system uses to individuals, communities, organizations, society, and the planet; assumptions and related limitations about AI system purposes, uses, and risks across the development or product AI lifecycle; and related test, evaluation, verification, and validation (TEVV) and system metrics.


    The organization’s mission and relevant goals for AI technology are understood and documented.


    The business value or context of business use has been clearly defined or– in the case of assessing existing AI systems– re-evaluated.


    Organizational risk tolerances are determined and documented.


    System requirements (e.g., “the system shall respect the privacy of its users”) are elicited from and understood by relevant AI actors. Design decisions take socio-technical implications into account to address AI risks.

    Measure

    The Measure function employs quantitative, qualitative, or mixed-method tools, techniques, and methodologies to analyze, assess, benchmark, and monitor AI risk and related impacts. It uses knowledge relevant to AI risks identified in the MAP function.


    Categorization of the AI system is performed.

    The specific tasks and methods used to implement the tasks that the AI system will support are defined (e.g., classifiers, generative models, recommenders).


    Scientific integrity and test, evaluation, verification, and validation (TEVV) considerations are identified and documented, including those related to experimental design, data collection and selection (e.g., availability, representativeness, suitability), system trustworthiness, and construct validation.

    AI capabilities, targeted usage, goals, and expected benefits and costs compared with appropriate benchmarks are understood.

    Potential benefits of intended AI system functionality and performance are examined and documented.


    Potential costs, including non-monetary costs, which result from expected or realized AI errors or system functionality and trustworthiness– as connected to organizational risk tolerance– are examined and documented.


    Targeted application scope is specified and documented based on the system’s capability, established context, and AI system categorization.


    Processes for operator and practitioner proficiency with AI system performance and trustworthiness– and relevant technical standards and certifications– are defined, assessed, and documented.


    Processes for human oversight are defined, assessed, and documented in accordance with organizational policies.

    Risks and benefits are mapped for all components of the AI system including third-party software and data.

    Internal risk controls for components of the AI system, including third-party AI technologies, are identified and documented.

    Impacts to individuals, groups, communities, organizations, and society are characterized.

    Likelihood and magnitude of each identified impact (both potentially beneficial and harmful) based on expected use, past uses of AI systems in similar contexts, public incident reports, feedback from those external to the team that developed or deployed the AI system, or other data are identified and documented.


    Practices and personnel for supporting regular engagement with relevant AI actors and integrating feedback about positive, negative, and unanticipated impacts are in place and documented.

    Manage deployment environment governance.

    When developing contracts for AI system products or services, consider deployment environment security requirements.

    Ensure a robust deployment environment architecture.

    Establish security protections for the boundaries between the IT environment and the AI system.


    Identify and protect all proprietary data sources the organization will use in AI model training or fine-tuning. Examine the list of data sources, when available, for models trained by others.

    Harden deployment environment configurations.

    Apply existing security best practices to the deployment environment. This includes sandboxing the environment running ML models within hardened containers or virtual machines (VMs), monitoring the network, configuring firewalls with allow lists, and other best practices for cloud deployments.


    Review hardware vendor guidance and notifications (e.g., for GPUs, CPUs, memory) and apply software patches and updates to minimize the risk of exploitation of vulnerabilities, preferably via the Common Security Advisory Framework (CSAF).


    Secure sensitive AI information (e.g., AI model weights, outputs, and logs) by encrypting the data at rest, and store encryption keys in a hardware security module (HSM) for later on-demand decryption.


    Implement strong authentication mechanisms, access controls, and secure communication protocols, such as by using the latest version of Transport Layer Security (TLS) to encrypt data in transit.


    Ensure the use of phishing-resistant multifactor authentication (MFA) for access to information and services. [2] Monitor for and respond to fraudulent authentication attempts.

    Protect deployment networks from threats.

    Use well-tested, high-performing cybersecurity solutions to identify attempts to gain unauthorized access efficiently and enhance the speed and accuracy of incident assessments.


    Integrate an incident detection system to help prioritize incidents. Also integrate a means to immediately block access by users suspected of being malicious or to disconnect all inbound connections to the AI models and systems in case of a major incident when a quick response is warranted.

    Continuously protect the AI system.

    Models are software and, like all other software, may have vulnerabilities, other weaknesses, or malicious code or properties. Continuously monitor the AI system.

    Validate the AI system before and during use.

    Store all forms of code (e.g., source code, executable code, infrastructure as code) and artifacts (e.g., models, parameters, configurations, data, tests) in a version control system with proper access controls to ensure only validated code is used and any changes are tracked.

    Secure exposed APIs.

    If the AI system exposes application programming interfaces (APIs), secure them by implementing authentication and authorization mechanisms for API access. Use secure protocols, such as HTTPS with encryption and authentication.

    Enforce strict access controls.

    Prevent unauthorized access or tampering with the AI model. Apply role-based access controls (RBAC), or preferably attribute-based access controls (ABAC) where feasible, to limit access to authorized personnel only. Distinguish between users and administrators. Require MFA and privileged access workstations (PAWs) for administrative access.

    Ensure user awareness and training.

    Educate users, administrators, and developers about security best practices, such as strong password management, phishing prevention, and secure data handling. Promote a security-aware culture to minimize the risk of human error. If possible, use a credential management system to limit, manage, and monitor credential use to minimize risks further.

    Conduct audits and penetration testing.

    Engage external security experts to conduct audits and penetration testing on ready-to-deploy AI systems.

    Implement robust logging and monitoring.

    Establish alert systems to notify administrators of potential oracle-style adversarial compromise attempts, security breaches, or anomalies. Timely detection and response to cyber incidents are critical in safeguarding AI systems.

    Measure

    Measure Subcategories

    MEASURE 1: Appropriate methods and metrics are identified and applied.

    MEASURE 1.1: Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for implementation starting with the most significant AI risks. The risks or trustworthiness characteristics that will not– or cannot– be measured are properly documented.


    MEASURE 1.2: Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated, including reports of errors and potential impacts on affected communities.


    MEASURE 1.3: Internal experts who did not serve as front-line developers for the system and/or independent assessors are involved in regular assessments and updates. Domain experts, users, AI actors external to the team that developed or deployed the AI system, and affected communities are consulted in support of assessments as necessary per organizational risk tolerance.

    MEASURE 2: AI systems are evaluated for trustworthy characteristics.

    MEASURE 2.1: Test sets, metrics, and details about the tools used during test, evaluation, verification, and validation (TEVV) are documented.


    MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population.


    MEASURE 2.3: AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment setting(s). Measures are documented.


    MEASURE 2.4: The functionality and behavior of the AI system and its components– as identified in the MAP function– are monitored when in production.


    MEASURE 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of the generalizability beyond the conditions under which the technology was developed are documented.


    MEASURE 2.6: The AI system is evaluated regularly for safety risks– as identified in the MAP function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and it can fail safely, particularly if made to operate beyond its knowledge limits. Safety metrics reflect system reliability and robustness, real-time monitoring, and response times for AI system failures.


    MEASURE 2.7: AI system security and resilience– as identified in the MAP function– are evaluated and documented.


    MEASURE 2.8: Risks associated with transparency and accountability– as identified in the MAP function– are examined and documented.


    MEASURE 2.9: The AI model is explained, validated, and documented, and AI system output is interpreted within its context as identified in the MAP function– to inform responsible use and governance.


    MEASURE 2.10: Privacy risk of the AI system– as identified in the MAP function– is examined and documented.


    MEASURE 2.11: Fairness and bias– as identified in the MAP function– are evaluated and results are documented.


    MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities– as identified in the MAP function– are assessed and documented.


    MEASURE 2.13: Effectiveness of the employed test, evaluation, verification, and validation (TEVV) metrics and processes in the MEASURE function are evaluated and documented.

    MEASURE 3: Mechanisms for tracking identified AI risks over time are in place.

    MEASURE 3.1: Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts.


    MEASURE 3.2: Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available.


    MEASURE 3.3: Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated into AI system evaluation metrics.

    MEASURE 4: Feedback about efficacy of measurement is gathered and assessed.

    MEASURE 4.1: Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users. Approaches are documented.


    MEASURE 4.2: Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended. Results are documented.


    MEASURE 4.3: Measurable performance improvements or declines based on consultations with relevant AI actors, including affected communities, and field data about context relevant risks and trustworthiness characteristics are identified and documented.
