Scanning in SCMs

UnifAI integrates with your GitHub repositories to automatically enforce security policies and deliver remediation fixes. Connect your repository, define your policies, and UnifAI will identify violations and open a pull request for your team to review and merge.

UnifAI works through a scan script that runs on your repository. The script is SCM-agnostic. This page covers GitHub; the supported source code managers are:

  • GitHub Enterprise (self-hosted)

  • GitHub SaaS (GitHub.com)

  • GitLab Enterprise (self-hosted)

  • GitLab SaaS (GitLab.com)

  • Bitbucket Data Center (on-premises)

  • Bitbucket Cloud

Scanning a GitHub Repository

How the Scan Works

When you trigger a scan, UnifAI performs the following steps:

| Step | Action | Result |
| --- | --- | --- |
| 1 | Clone repository | UnifAI clones your repository to the scan environment. |
| 2 | Generate fixes | UnifAI generates remediation code for every violation found. |
| 3 | Create remediation branch | A new branch is created in your repository to hold the fixes. |
| 4 | Open pull request | A pull request opens with the full violations report and all code changes. |

As you merge remediation pull requests and run subsequent scans, the violation count decreases. You can track this reduction across scan iterations in the violations report.

Choosing a Deployment Path

Select the path that matches how your organization hosts GitHub:

| Path | When to Use | What Gets Deployed |
| --- | --- | --- |
| Path A: GitHub Enterprise | Your organization runs GitHub on its own infrastructure (self-hosted). | The Lineaje MCP server is deployed on a VM inside your environment. Source code never leaves your network. |
| Path B: GitHub SaaS | Your organization uses GitHub.com (cloud-hosted). | No local MCP server is required. The scan script connects directly to the Lineaje SaaS MCP endpoint. |

Both paths produce the same output: a pull request on your repository containing remediation fixes for all detected violations, along with a violations report showing the policy name, file, line numbers changed, and what was remediated.

What You Need

To scan any repository, you provide four inputs:

  • Repository name

  • Branch name

  • SCM token (to access the repository)

  • Device Code — passed as a parameter and never stored
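
The four inputs can be grouped in one small structure before they are handed to a scan script. The sketch below is illustrative only: the class and field names are hypothetical and are not part of any Lineaje API. It keeps the token and device code out of the object's printed form, in keeping with the device code being passed as a parameter rather than stored.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScanInputs:
    """Hypothetical container for the four scan inputs (not a Lineaje API)."""

    repository: str
    branch: str
    # repr=False keeps secrets out of logs and tracebacks.
    scm_token: str = field(repr=False)
    device_code: str = field(repr=False)

    def __post_init__(self) -> None:
        # Reject empty or whitespace-only values early.
        for name in ("repository", "branch", "scm_token", "device_code"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")
```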

Language Model

When a remediation fix is applied, the IDE or CLI agent (for example, Cursor) consumes tokens to generate code from the remediation output. This averages approximately 600 tokens per violation. Token consumption is handled by your IDE agent and is not controlled by Lineaje.
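
For rough planning, the arithmetic above can be sketched as follows. The 600-token figure is the approximate average stated on this page; the function name and the example violation count are illustrative assumptions, and actual consumption depends on your IDE agent.

```python
# Approximate average from this page; real usage varies by agent and violation.
AVG_TOKENS_PER_VIOLATION = 600


def estimate_remediation_tokens(violation_count: int) -> int:
    """Return a rough token estimate for remediating the given violations."""
    if violation_count < 0:
        raise ValueError("violation_count must be non-negative")
    return violation_count * AVG_TOKENS_PER_VIOLATION


if __name__ == "__main__":
    # Hypothetical scan that reports 25 violations.
    print(estimate_remediation_tokens(25))  # 15000
```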

To use a model other than the default, configure it in the MCP server settings. Only models that Lineaje has tested and certified are supported. Supported models include Claude Sonnet 4.5, Claude Sonnet 4.6, and Claude Opus 4.6 (supported on SaaS). On-premises customers can also use these models, but must configure them through AWS Bedrock.

Path A: GitHub Enterprise (On-Premises)

Use this path if your organization hosts GitHub on its own infrastructure. The MCP server deploys inside your environment on a virtual machine (VM). Your source code is scanned locally and never sent to Lineaje.

Before You Start

  • A GPU is required for the vLLM setup. Lineaje hosts an open-source LLM/SLM.

  • Outbound network access from the VM on port 443 to Lineaje SaaS is required.

  • The Lineaje MCP server package and deployment script must be available on the VM.

MCP Server Specifications (EC2)

The MCP server can be hosted on EC2 with the following specification:

| Component | ECS MCP Server |
| --- | --- |
| Compute | 4 vCPU |
| Memory | 8 GB RAM |
| OS | Linux |
| Architecture | x86_64 |
| Storage Type | Amazon EFS |
| Cache | Redis (SSL Enabled) |

On-Premises VM Specifications (vLLM)

For the on-premises solution, customers deploy an open-source model on the VM. vLLM must be installed on the VM with the following minimum requirements:

| Property | Value |
| --- | --- |
| Instance type | g6.12xlarge |
| Family | g6 |
| vCPUs | 48 |
| OS | Ubuntu 24.04 (Noble) |
| Storage | 1 TB |
| GPU | 4× NVIDIA L4 (24 GB each = 96 GB total) |
| GPU VRAM | 96 GB total |

vLLM OS Prerequisites

Supported OS: Ubuntu 22.04 or 24.04 (recommended).

  • NVIDIA Driver: 525 or later

  • CUDA: 11.8 or later

  • Docker: 24 or later

  • Python: 3.10 or later (for Ansible)

  • Storage: Minimum 200 GB for models and Docker images
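
The version minimums above can be checked with a simple comparison. The following is a minimal sketch, not a Lineaje tool: the dictionary keys and function names are hypothetical, and how you collect the installed versions on the VM is left to you.

```python
from __future__ import annotations

# Minimum versions listed on this page.
MINIMUM_VERSIONS = {
    "nvidia_driver": "525",
    "cuda": "11.8",
    "docker": "24",
    "python": "3.10",
}


def parse_version(text: str) -> tuple[int, ...]:
    """Convert '24.0.7' to (24, 0, 7), stopping at the first non-numeric part."""
    parts = []
    for piece in text.strip().split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def unmet_prerequisites(installed: dict[str, str]) -> list[str]:
    """Return the components that are missing or older than the minimum."""
    unmet = []
    for name, minimum in MINIMUM_VERSIONS.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(minimum):
            unmet.append(name)
    return unmet
```

Versions are compared as integer tuples rather than strings, so that 3.10 is correctly treated as newer than 3.9.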

Scan Features

Path A: GitHub Enterprise (On-Premises) supports two scan types:

  • Full repository scan — Scans the entire repository using UnifAI and opens a pull request with the Lineaje violations report in the description along with all file changes.

  • Pull request scan — Scans a specific pull request using UnifAI and opens a remediation pull request with the Lineaje violations report in the description along with all file changes.

Setup

Step 1: Create a Virtual Environment

Before running the project on a VM or local machine, set up a Python virtual environment.
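
One way to create the environment is Python's standard `venv` module, which does the same work as running `python3 -m venv` from a shell. This is a sketch under assumptions: the `.venv` directory name and the helper function are illustrative, not names required by the Lineaje project.

```python
import venv
from pathlib import Path


def create_virtualenv(env_dir: str = ".venv", with_pip: bool = True) -> Path:
    """Create a virtual environment and return the path to its pyvenv.cfg."""
    # with_pip=True bootstraps pip so dependencies can be installed next.
    venv.create(env_dir, with_pip=with_pip)
    return Path(env_dir) / "pyvenv.cfg"


if __name__ == "__main__":
    print(create_virtualenv())
```

After the environment exists, activate it in your shell before installing dependencies.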

Step 2: Install Dependencies

Step 3: Configure Environment Variables

Create a .env file in your project root and set the following variables:

| Variable | Description |
| --- | --- |
| LOCAL_PATH | Path to the directory to scan |
| MCP_SERVER_URL | http://127.0.0.1:8000/mcp (local) or ${PUBLIC_VM_URL}/mcp (VM) |
| DEVICE_CODE | Device code obtained from the SBOM360 Integrations page (see Step 7: Obtain a Device Code) |
| SOURCE_CODE_REPO | URL of the repository to scan |
| REPO_BRANCH | Branch to scan |
| LLM_API_KEY | API key for the LLM provider used during scanning |
| LLM_MODEL | Model identifier (e.g., anthropic/claude-sonnet-4.6) |
| LLM_API_URL | LLM API endpoint URL |
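
Before running a scan, it can help to confirm that every variable in the table is set. The sketch below parses simple `KEY=value` lines using only the standard library. The helper names are hypothetical, and the parser is an assumption about the file's layout; it does not handle every .env syntax, such as multi-line values.

```python
from __future__ import annotations

# Variable names taken from the table above.
REQUIRED_VARS = (
    "LOCAL_PATH",
    "MCP_SERVER_URL",
    "DEVICE_CODE",
    "SOURCE_CODE_REPO",
    "REPO_BRANCH",
    "LLM_API_KEY",
    "LLM_MODEL",
    "LLM_API_URL",
)


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_vars(env: dict[str, str]) -> list[str]:
    """Return required variables that are absent or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

The same variable names are used for Path B, so the check applies there as well.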

Example .env file:

Step 4: Deploy the Lineaje MCP Server

  • Deploy the Lineaje MCP server on your VM or local machine.

  • Clone the Lineaje MCP Server repository to the VM or local machine.

    • If running locally, MCP_SERVER_URL is http://127.0.0.1:8000/mcp. If running on a different port, update the port number accordingly.

    • If the MCP server is deployed on a VM, set MCP_SERVER_URL to ${PUBLIC_VM_URL}/mcp.
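
The two cases above reduce to a small rule for building MCP_SERVER_URL. This sketch assumes a hypothetical helper and an example VM address; only the `/mcp` path and the local default of port 8000 come from this page.

```python
from typing import Optional


def mcp_server_url(public_vm_url: Optional[str] = None, local_port: int = 8000) -> str:
    """Build MCP_SERVER_URL for a VM deployment or a local server."""
    if public_vm_url:
        # Avoid a double slash if the VM URL already ends with one.
        return public_vm_url.rstrip("/") + "/mcp"
    return f"http://127.0.0.1:{local_port}/mcp"


if __name__ == "__main__":
    print(mcp_server_url())                            # http://127.0.0.1:8000/mcp
    print(mcp_server_url(local_port=9000))             # http://127.0.0.1:9000/mcp
    print(mcp_server_url("https://vm.example.com/"))   # https://vm.example.com/mcp
```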

Step 5: Authenticate

  • Start the MCP server by running: python3 mcp_server.py

  • In the IDE, click the authentication option that appears.

  • The identity provider (IDP) page opens. Authenticate with your credentials.

  • After successful authentication, an option to open the IDE (for example, Cursor) is displayed.

Step 6: Obtain a GitHub Personal Access Token (PAT)

Follow these steps to generate a PAT for repository access:

  1. In GitHub, go to Settings → Developer Settings → Personal Access Tokens → Tokens (Classic).

  2. Click Generate new token (classic) and authenticate with your credentials.

  3. Enter a name in the Note field for your own reference.

  4. Set an expiration date according to your organization’s policies.

  5. Under Select scopes, grant access to repo and project only.

  6. Click Generate token. Copy the token immediately and store it in your .env file — it will not be shown again.
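
To confirm that a classic token carries the intended scopes, you can inspect the `X-OAuth-Scopes` response header that the GitHub REST API returns for classic tokens. The sketch below only interprets that header value; the function name is hypothetical, and how you make the request (and against which API host on GitHub Enterprise) is left as an assumption.

```python
from __future__ import annotations

# Scopes this page asks you to grant.
REQUIRED_SCOPES = {"repo", "project"}


def scope_report(x_oauth_scopes: str) -> tuple[list[str], list[str]]:
    """Return (missing, extra) scopes from an X-OAuth-Scopes header value."""
    granted = {s.strip() for s in x_oauth_scopes.split(",") if s.strip()}
    missing = sorted(REQUIRED_SCOPES - granted)
    extra = sorted(granted - REQUIRED_SCOPES)
    return missing, extra


if __name__ == "__main__":
    # Hypothetical header value for a token with one unneeded scope.
    print(scope_report("repo, project, workflow"))  # ([], ['workflow'])
```

An empty first list means both required scopes are present; a non-empty second list flags scopes beyond "repo and project only".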

Step 7: Obtain a Device Code

Obtain the device code from the SBOM360 Integrations page:

  1. Log in at app.veedna.com and navigate to SBOM360 → Integrations → Deploy Private Cloud → Download CLI.

  2. Click the link in the box.

  3. Click Verify Device. A Lineaje web page opens asking you to authenticate.

  4. Authenticate and click Yes. You are returned to the Integrations page.

  5. Copy the device code displayed on the Integrations page.

Running the Scan

Full Repository Scan

The scan script (repo_scan.py) is located in the Lineaje MCP server codebase. Run the following command, replacing the placeholder values with your own:

Pull Request Scan

The PR scan script (pr_scan.py) is located in the Lineaje MCP server codebase. Run the following command, replacing the placeholder values with your own:

After the scan completes, review the pull request that UnifAI opens on your repository. See Reviewing the Pull Request for details. The remediation branch is named:

Path B: GitHub SaaS

Use this path if your organization uses GitHub.com. You do not deploy a local MCP server; the scan script connects directly to the Lineaje SaaS MCP endpoint. The scan itself still runs on a runner in your environment. Only the MCP server location changes.

Before You Start

  • A GitHub.com repository

  • A GitHub Personal Access Token (PAT) with access to the repository

Scan Features

Path B supports two scan types:

  • Full repository scan — Scans the entire repository using UnifAI and opens a pull request with the Lineaje violations report in the description along with all file changes.

  • Pull request scan — Scans a specific pull request using UnifAI and opens a remediation pull request with the Lineaje violations report in the description along with all file changes.

Setup

Step 1: Configure Environment Variables

Create a .env file in your project root and set the following variables:

| Variable | Description |
| --- | --- |
| LOCAL_PATH | Path to the directory to scan |
| MCP_SERVER_URL | https://mcp.v2.prod.veedna.com/mcp |
| DEVICE_CODE | Device code obtained from the SBOM360 Integrations page (see Step 3) |
| SOURCE_CODE_REPO | URL of the repository to scan |
| REPO_BRANCH | Branch to scan |
| LLM_API_KEY | API key for the LLM provider used during scanning |
| LLM_MODEL | Model identifier (e.g., anthropic/claude-sonnet-4.6) |
| LLM_API_URL | LLM API endpoint URL |

Example .env file:

Step 2: Obtain a GitHub Personal Access Token (PAT)

Follow these steps to generate a PAT for repository access:

  1. In GitHub, go to Settings → Developer Settings → Personal Access Tokens → Tokens (Classic).

  2. Click Generate new token (classic) and authenticate with your credentials.

  3. Enter a name in the Note field for your own reference.

  4. Set an expiration date according to your organization’s policies.

  5. Under Select scopes, grant access to repo and project only.

  6. Click Generate token. Copy the token immediately and store it in your .env file; it will not be shown again.

Step 3: Obtain a Device Code

The device code is obtained from the SBOM360 Integrations page:

  1. Log in at app.veedna.com and navigate to SBOM360 → Integrations → Deploy Private Cloud → Download CLI.

  2. Click the link in the box.

  3. Click Verify Device. A Lineaje web page opens asking you to authenticate.

  4. Authenticate and click Yes. You are returned to the Integrations page.

  5. Copy the device code displayed on the Integrations page.

Running the Scan

Full Repository Scan (Public Repositories)

The scan script (repo_scan.py) is located in the Lineaje MCP server codebase. Run the following command, replacing the placeholder values with your own:

Pull Request Scan

The PR scan script (pr_scan.py) is located in the Lineaje MCP server codebase. Run the following command, replacing the placeholder values with your own:

After the scan completes, review the pull request that UnifAI opens on your repository. See Reviewing the Pull Request for details. The remediation branch is named:

Reviewing the Pull Request

When the scan detects violations, UnifAI creates a new remediation branch and opens a pull request against the branch you scanned. The pull request contains a violations report and all code changes needed to resolve the identified issues.

The violations report in the pull request description lists the following for every fix:

| Column | Description |
| --- | --- |
| Policy Name | The UnifAI policy that the code violated. |
| File | The file where the violation was found. |
| Lines Changed | The specific line numbers added, modified, or removed to fix the violation. |
| Remediation | A description of what was changed and why. |

To resolve the violations, review the changes in the pull request and merge it into your branch.
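
When reviewing a long report, it can be useful to summarize the fixes by policy. The sketch below models one report row with fields that mirror the four columns described above. The class, the function, and the sample policy names are hypothetical, and the report's actual machine-readable form (if any) is not specified on this page.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportRow:
    """One fix from the violations report (field names mirror the columns)."""

    policy_name: str
    file: str
    lines_changed: str
    remediation: str


def fixes_per_policy(rows: list[ReportRow]) -> Counter:
    """Count how many fixes each policy produced."""
    return Counter(row.policy_name for row in rows)


if __name__ == "__main__":
    # Hypothetical rows for illustration only.
    rows = [
        ReportRow("no-hardcoded-secrets", "app/config.py", "12-14", "Moved key to env var"),
        ReportRow("no-hardcoded-secrets", "app/db.py", "8", "Moved password to env var"),
        ReportRow("pin-dependencies", "requirements.txt", "3", "Pinned version"),
    ]
    for policy, count in fixes_per_policy(rows).most_common():
        print(policy, count)
```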

The pull request is opened by the account whose PAT was used to run the scan. Ensure that token has permission to create branches and open pull requests on the repository.

Performance and Scope

The time to complete a scan depends on the size of the repository. Use the guidance below to estimate scan duration.

| Repository Size | Guidance |
| --- | --- |
| Small / Medium | Supported today. Recommended starting point. |
| Large (10,000+ files) | Coming in the next release. |
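
To see which row applies to a repository, you can count the files in a local checkout. This is a rough sketch: the 10,000-file threshold comes from the table above, but the function names are hypothetical, and skipping the `.git` directory is an assumption, since this page does not say exactly how files are counted.

```python
import os

# Threshold for "Large" taken from the table above.
LARGE_REPO_FILE_THRESHOLD = 10_000


def count_files(root: str) -> int:
    """Count files under root, skipping the .git directory (an assumption)."""
    total = 0
    for _dirpath, dirnames, filenames in os.walk(root):
        # Prune .git in place so os.walk does not descend into it.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        total += len(filenames)
    return total


def size_category(file_count: int) -> str:
    """Map a file count to the size rows used in the table."""
    if file_count >= LARGE_REPO_FILE_THRESHOLD:
        return "large"
    return "small/medium"


if __name__ == "__main__":
    n = count_files(".")
    print(n, size_category(n))
```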

Up to 5 MCP server instances are supported concurrently.
