Installation Guide

GLOSI Setup

Introduction

This document provides a comprehensive guide to deploying and operating GLOSI (Global Software Inventory), a containerized solution developed by Lineaje for analyzing software packages and vulnerabilities.

Published weekly as a pre-built Docker image to Lineaje's Elastic Container Registry (ECR), GLOSI comes bundled with all required services and data, enabling users to spin up a fully functional environment with minimal setup. The system integrates a robust search backend powered by Elasticsearch, a FastAPI-driven data service, an orchestration layer for business logic, and a modern UI for user interaction.

This guide walks through how to access the image, deploy the services using Docker Compose, understand the architecture, and manage or troubleshoot the system. Whether you're running it on a single virtual machine or distributing services across multiple nodes, this document covers the configuration, health checks, networking, and recommended hardware requirements.

GLOSI Docker Image

The image is published to the Lineaje Elastic Container Registry every week. It contains the full dataset, including package and vulnerability data in compressed format, along with the code dependencies required to access and synchronise that data automatically. The image also bundles a public Elasticsearch server image that can be used to power the rest of the services.

Deployment Steps

Pre-requisite

1. You should have the tarball containing this installation guide, the API documentation, and a Resource folder with the docker-compose.yaml file, which references the GLOSI image hosted in Lineaje ECR. The tarball can also be found here.

2. You should have access to the Lineaje ECR hosting the GLOSI Docker image. If you don't have access, please reach out to Lineaje Support at [email protected] to request ECR access, including your AWS Account ID in the request. Once ECR access is granted for your AWS account, you can proceed with the next steps.

3. Docker installed on the host VM.

4. AWS CLI installed on the host VM.

5. For automated installation, the services running on the GLOSI VM should have access to the gold.lineaje.com domain; whitelist this domain if outbound access is restricted.

Deployment

1. Add a profile to ~/.aws/config with the information below:

[profile LineajeGlosiRole]
role_arn = arn:aws:iam::739923383310:role/lineajeGlosiPullRole
credential_source = Ec2InstanceMetadata
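
Before running the ECR login, you can optionally verify from the VM that the role can be assumed (this assumes the VM has an instance profile permitted to assume lineajeGlosiPullRole):

aws sts get-caller-identity --profile LineajeGlosiRole

The returned Arn should reference the assumed lineajeGlosiPullRole.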

2. Log in to AWS ECR using the command below. The GLOSI repository ID can be found in the docker-compose.yaml file under the Resource folder of the tarball.

aws ecr get-login-password --region us-east-1 --profile LineajeGlosiRole | sudo docker login --username AWS --password-stdin 739923383310.dkr.ecr.us-east-1.amazonaws.com

3. Pull the image to your host VM using the command below:

docker pull 739923383310.dkr.ecr.us-east-1.amazonaws.com/glosi:latest
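
Optionally verify that the image is now available locally; the list should show the latest tag pulled above:

docker images 739923383310.dkr.ecr.us-east-1.amazonaws.com/glosi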

4. Use the command below to deploy all services with docker-compose.yaml and the image pulled in the previous step. This deploys all services on a single VM.

sudo docker compose -f <docker-compose.yaml> up -d
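
Once the stack is up, you can confirm that all containers started and that Elasticsearch has passed its healthcheck (service and container names are defined in the docker-compose.yaml shipped in the Resource folder):

sudo docker compose -f <docker-compose.yaml> ps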

Optional: Custom Deployment

ElasticSearch

Description

To run Elasticsearch separately, the elasticsearch container can be spawned on its own VM, with a port exposed so that the data-service can reach it. With the configuration below, the container runs on the recommended port 9200, which can be changed as required.

Configuration

elasticsearch:
    image: 216394054222.dkr.ecr.us-east-1.amazonaws.com/glosi:latest
    container_name: elasticsearch
    command: ["/bin/sh", "-c", "/usr/share/elasticsearch/bin/elasticsearch"]
    user: "1001:1001"  # Run as UID/GID defined in Dockerfile
    environment:      
      - discovery.type=single-node
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    volumes:
      - ./elasticsearch/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml
      - elasticsearch_data:/usr/share/elasticsearch/data
    ports:
      - "9200:9200"
    restart: "no"
    healthcheck:
      test: ["CMD-SHELL", "curl --silent --fail http://elasticsearch:9200/_cluster/health || exit 1"]
      interval: 5s
      retries: 10
      timeout: 3s
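
Once the container is running, a quick way to confirm that Elasticsearch is reachable from another VM is a cluster-health call (replace <elasticsearch-vm-ip> with the address of the VM running this container; 9200 matches the port mapping above):

curl --silent http://<elasticsearch-vm-ip>:9200/_cluster/health?pretty

A "green" or "yellow" status indicates the node is up and serving requests.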

Data Service

Description

Acts as a data access API built with FastAPI and connects to Elasticsearch, converting client requests into Elasticsearch queries. The container runs on the recommended port 8000, which can be customised with the configuration below. The service also exposes a Swagger docs endpoint at http://<data-service:VM_IP>:<port:8000>/docs.

Configuration

data-service:
    image: 216394054222.dkr.ecr.us-east-1.amazonaws.com/glosi:latest
    container_name: data-service
    working_dir: /app/data-service
    command: ["gunicorn", "main:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8000", "--timeout", "2", "--graceful-timeout", "15", "--keep-alive", "10"]    
    environment:
      - PATH=/app/data-service/venv/bin:$PATH
      - ELASTIC_AUTH_METHOD=NO_AUTH
      - OPENSEARCH_HOST=elasticsearch   
      - OPENSEARCH_PORT=9200
      - USER_SSL=False
      - VERIFY_CERT=False
    depends_on:
      elasticsearch:
        condition: service_healthy
    ports:
      - "8000:8000"
    restart: always
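
If Elasticsearch runs on a separate VM, set OPENSEARCH_HOST to that VM's address instead of the compose service name. A quick smoke test for the data-service itself is to request the Swagger docs endpoint mentioned above (replace <data-service-vm-ip> as appropriate, or use localhost on the same host):

curl --silent -o /dev/null -w "%{http_code}\n" http://<data-service-vm-ip>:8000/docs

A 200 response indicates the API is up and serving.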

GLOSI UI

Description

The frontend through which users interact with the service in their browsers. It calls the backend APIs and displays data visually. It runs on the recommended port 3000, which can be customised in the container definition below.

glosiui:
    image: 216394054222.dkr.ecr.us-east-1.amazonaws.com/glosi:latest
    container_name: risklensui
    working_dir: /app/risklensui
    command: ["npm", "run", "start"]    
    ports:
      - "3000:3000"
    restart: always
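
The published port can be remapped on the host side without touching the application; a minimal sketch, assuming the UI process keeps listening on 3000 inside the container (host port 3001 is only an example):

    ports:
      - "3001:3000"   # host port 3001 -> container port 3000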

Glosi Orchestration Service

The orchestration service handles workflows and connects UI actions with the data-service, adding logic, routing, and validation. It runs on the recommended port 8500, which can be customised in the container definition below. The Elasticsearch host and port must be provided here as well, because this service talks to Elasticsearch directly for data synchronisation.

orchestration-service:
    image: 216394054222.dkr.ecr.us-east-1.amazonaws.com/glosi:latest
    container_name: orchestration-service
    working_dir: /app/glosi-orchestration-service
    command: ["gunicorn", "main:app", "-w", "4", "-k", "uvicorn.workers.UvicornWorker", "-b", "0.0.0.0:8500", "--timeout", "2", "--graceful-timeout", "15", "--keep-alive", "10"]    
    environment:
      - PATH=/app/glosi-orchestration-service/venv/bin:$PATH
      - DATA_SERVICE_HOST=data-service
      - DATA_SERVICE_PORT=8000
      - ELASTIC_AUTH_METHOD=NO_AUTH
      - OPENSEARCH_HOST=elasticsearch   
      - OPENSEARCH_PORT=9200
      - USER_SSL=False
      - VERIFY_CERT=False
      - OSS_TENANT_ID=vdna_994mgmr65tculnfy
    depends_on:
      - data-service
    ports:
      - "8500:8500"
    restart: always
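
When the services are distributed across VMs, the compose service names (data-service, elasticsearch) will not resolve from this container, so the host variables need to point at the corresponding VM addresses instead. A minimal sketch of the overrides (the IP placeholders are examples, not actual values):

    environment:
      - DATA_SERVICE_HOST=<data-service-vm-ip>   # instead of the compose service name
      - OPENSEARCH_HOST=<elasticsearch-vm-ip>    # instead of "elasticsearch"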

Machine Configuration

All containers can be set up on one single VM or in a distributed fashion. If all applications are set up on a single VM, the configuration below is recommended.

Deployment Type | Services      | RAM  | vCPU | Disk Size
SingleNode      | data-service  | 32GB | 8    | 300GB

Data Persistence, Networking, and Volumes

Volumes ensure data inside containers isn’t lost when containers restart or rebuild. For instance, the elasticsearch_data volume stores all search and analytics data.

Volumes used:

· elasticsearch_data: Stores the data generated by Elasticsearch after the package and vulnerability index data is synchronised
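
The exact volume name on the host is prefixed with the compose project name; it can be listed and inspected directly, for example to confirm where the index data lives before an upgrade:

docker volume ls
docker volume inspect <project-name>_elasticsearch_data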

Networking

Docker Compose automatically creates an internal bridge network. Containers communicate using service names. This abstraction avoids hardcoding IPs and simplifies scaling or relocating services.

Healthchecks ensure each service is only marked healthy once it responds to a specific URL or command, improving reliability.
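
Both mechanisms can be checked from the host: Docker records the healthcheck result per container, and service-name resolution can be exercised from inside any container on the compose network (this assumes curl is available inside the data-service container):

docker inspect --format '{{.State.Health.Status}}' elasticsearch
docker exec data-service curl --silent http://elasticsearch:9200/_cluster/health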

Interactions Between Services

Here’s how the services communicate:

1. Frontend Interaction

The user opens the UI at http://localhost:3000. The UI sends API requests (such as search queries) to the orchestration-service at http://localhost:8500.

2. Orchestration-Service

Validates the request and forwards it to the appropriate backend (usually the data-service).

3. Data-Service

Receives requests from orchestration-service. Transforms them into Elasticsearch-compatible queries and sends them to Elasticsearch.

4. Elasticsearch

Performs search/indexing operations and returns results to the data-service.

5. Response Propagation

The data-service formats the Elasticsearch results and returns them to the orchestration-service, which may further process or log them.

6. Frontend Rendering

The risklensui receives the final result and displays it to the user.

The architecture follows this loop:

UI ➝ Orchestration ➝ Data-service ➝ Elasticsearch ➝ Data-service ➝ Orchestration ➝ UI

Compose starts the services in this dependency chain: the data-service waits for Elasticsearch to report healthy, and the orchestration-service waits for the data-service to start.

Access URLs

· Frontend: http://<risklens-ui:localhost>:<risklens-port:3000> i.e. http://localhost:3000

· Data-Service Swagger Docs: http://<data-service:localhost>:<data-service-port:8000>/docs i.e. http://localhost:8000/docs

· Glosi Orchestration Service Swagger Docs: http://<glosi-orchestration-service:localhost>:<glosi-orchestration-service-port:8500>/docs i.e. http://localhost:8500/docs

Maintenance and Troubleshooting

An admin panel is provided to check the data index status.

Maintenance Tips

· Clean up unused resources:

docker system prune -a

· Restart a specific service:

docker-compose restart data-service

Troubleshooting

· Check container status: docker ps

· Access container shell: docker exec -it <container-name> /bin/sh

· Watch logs for errors (see the command below).
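
For example, to tail recent logs for a single container:

docker logs --tail 100 -f <container-name>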
