AI Assistant for Corporate Report Analysis

The Problem

In companies that generate large volumes of data daily (sales reports, customer feedback, operational metrics, system logs), teams often spend hours manually reading and interpreting documents to extract trends, anomalies, and actionable insights.

The central question: how can raw data be turned into structured analysis automatically and at scale?

This project explores a generic architecture for this type of solution, applicable to various domains (e-commerce, SaaS, operations, etc.).

Architectural Decisions

Why Vertex AI instead of OpenAI?

Choosing Vertex AI makes sense in scenarios where:

Sensitive data must stay within the GCP ecosystem
The data already lives in BigQuery, so no additional ETL is needed
SLAs and corporate compliance are requirements

Synchronous vs Asynchronous Processing

A synchronous approach, where the user submits a request and waits on screen for the result, works for low volumes. As demand grows, moving to an asynchronous flow on Pub/Sub is a natural evolution:

User requests analysis
Request is queued in Pub/Sub
A worker processes with the AI model
Result is persisted and the user is notified

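Because the user-facing call returns as soon as the request is queued, long analyses no longer cause timeouts, and several workers can process requests in parallel.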
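
The queued flow described above can be sketched in plain Java. This is a minimal illustration, not a Pub/Sub integration: an in-memory BlockingQueue stands in for the Pub/Sub topic, a map stands in for the result store, and polling stands in for user notification. The names AnalysisRequest and AsyncAnalysisPipeline are hypothetical, and the analyzer function is a placeholder for the call to the AI model.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** Hypothetical message: a request id plus the report text to analyze. */
record AnalysisRequest(String requestId, String reportText) {}

/** In-memory stand-in for the queue/worker flow (assumption: not real Pub/Sub). */
class AsyncAnalysisPipeline {
    private final BlockingQueue<AnalysisRequest> queue = new LinkedBlockingQueue<>();
    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final ExecutorService workers;

    AsyncAnalysisPipeline(int workerCount, Function<String, String> analyzer) {
        // Daemon threads so a forgotten shutdown() does not keep the JVM alive.
        workers = Executors.newFixedThreadPool(workerCount, runnable -> {
            Thread thread = new Thread(runnable);
            thread.setDaemon(true);
            return thread;
        });
        for (int i = 0; i < workerCount; i++) {
            workers.execute(() -> {
                try {
                    while (true) {
                        AnalysisRequest request = queue.take(); // blocks until a request arrives
                        String result;
                        try {
                            result = analyzer.apply(request.reportText());
                        } catch (RuntimeException e) {
                            result = "ERROR: " + e.getMessage(); // keep the worker alive
                        }
                        results.put(request.requestId(), result); // persist the result
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // shutdown requested
                }
            });
        }
    }

    /** Queues the request and returns immediately; the caller does not wait. */
    void submit(AnalysisRequest request) {
        queue.add(request);
    }

    /** Simplified notification: polls the result store until the timeout expires. */
    String awaitResult(String requestId, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        try {
            while (System.currentTimeMillis() < deadline) {
                String result = results.get(requestId);
                if (result != null) {
                    return result;
                }
                Thread.sleep(10);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return null; // no result within the timeout
    }

    void shutdown() {
        workers.shutdownNow();
    }
}
```

In a real deployment the queue would be a Pub/Sub topic with a subscriber per worker, and the result map would be a durable store; the shape of the flow (submit, process in a worker, persist, notify) is what the sketch is meant to show.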

Caching Strategy

Two-layer caching is usually decisive for cost viability:

Data Cache: frequent BigQuery queries cached to avoid reprocessing
Analysis Cache: identical inputs return the previous result

In workloads where many requests repeat, this can reduce AI API call costs by more than 60%; the actual saving depends on the cache hit rate.
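
As an illustration of the analysis cache, the sketch below keys results by a SHA-256 hash of the input, so identical inputs return the stored result without calling the model again. The class name AnalysisCache and the model function are hypothetical placeholders, and an in-memory map is assumed; expiry and eviction are left out.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical analysis cache: identical inputs return the previous result. */
class AnalysisCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> model; // placeholder for the AI call
    private final AtomicInteger modelCalls = new AtomicInteger();

    AnalysisCache(Function<String, String> model) {
        this.model = model;
    }

    /** Returns the cached analysis for this input, calling the model only on a miss. */
    String analyze(String input) {
        String key = sha256(input);
        return cache.computeIfAbsent(key, ignored -> {
            modelCalls.incrementAndGet();
            return model.apply(input);
        });
    }

    /** Number of times the model was actually invoked (cache misses). */
    int modelCalls() {
        return modelCalls.get();
    }

    /** Hex-encoded SHA-256 of the input, used as a fixed-size cache key. */
    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

If every model call costs roughly the same, a cache hit rate of h removes a fraction h of the model spend, so a 60% saving corresponds to about six in ten requests being served from cache. Note that ConcurrentHashMap.computeIfAbsent can block other updates to the map while the mapping function runs, so a production version would likely use a dedicated cache with expiry instead.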

Prompt Engineering

The quality of the analysis depends directly on prompt engineering. Useful patterns:

Structured Context: data in JSON with defined schema
Decision-oriented Instructions: instead of "summarize", ask the model to "identify anomalies and suggest actions"
Output Constraints: expected response format for consistent parsing
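
The three patterns above can be combined in a small prompt builder. This is a sketch under assumptions: the class name AnalysisPromptBuilder, the prompt wording, and the output field names passed in by the caller are all hypothetical, and the data is assumed to arrive already serialized as JSON.

```java
import java.util.List;

/** Hypothetical builder that assembles a prompt from the three patterns. */
class AnalysisPromptBuilder {

    /**
     * @param dataJson       report data already serialized as JSON (structured context)
     * @param requiredFields field names the response must contain (output constraints)
     */
    static String build(String dataJson, List<String> requiredFields) {
        return String.join("\n",
            "You are an analyst reviewing corporate report data.",
            "",
            // Structured context: the data is passed as JSON, not free text.
            "DATA (JSON):",
            dataJson,
            "",
            // Decision-oriented instruction rather than a generic "summarize".
            "TASK:",
            "Identify anomalies in the data and suggest concrete actions for each one.",
            "",
            // Output constraints: a fixed shape so the response can be parsed.
            "OUTPUT:",
            "Respond with a single JSON object containing exactly these fields: "
                + String.join(", ", requiredFields) + ".",
            "Do not include any text outside the JSON object.");
    }
}
```

For example, AnalysisPromptBuilder.build(salesJson, List.of("anomalies", "actions")) produces a prompt whose response can be checked for the two named fields before it is shown to a user.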

Technical Stack

Backend: Java 17 with Spring Boot
AI: Vertex AI (Gemini) for analysis generation
Data: BigQuery for storage and analytical queries
Messaging: Pub/Sub for asynchronous processing
CI/CD: Jenkins (or equivalent) with automated pipelines
Monitoring: Cloud Logging and alerts

Lessons Learned

Generative AI in production requires guardrails. Output validation, rate limiting, and fallbacks are essential.
Caching is not premature optimization when using AI. It is a requirement for financial viability.
Prompt engineering is software engineering. Prompts should be versioned, tested, and have quality metrics.
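
The guardrails named in the first lesson (output validation, rate limiting, and fallbacks) can be sketched as a thin wrapper around the model call. Everything here is an illustrative assumption: the class name GuardedAnalyzer, the fixed-window rate limit, the fallback payload, and the deliberately shallow validation, which only checks that the output looks like a JSON object and mentions the required field names. A production version would parse the JSON properly.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical wrapper adding rate limiting, output validation, and a fallback. */
class GuardedAnalyzer {
    static final String FALLBACK = "{\"status\": \"unavailable\"}"; // assumed payload

    private final Function<String, String> model; // placeholder for the AI call
    private final List<String> requiredFields;
    private final int maxCallsPerWindow;
    private final long windowMillis;
    private long windowStart;
    private int callsInWindow;

    GuardedAnalyzer(Function<String, String> model, List<String> requiredFields,
                    int maxCallsPerWindow, long windowMillis) {
        this.model = model;
        this.requiredFields = requiredFields;
        this.maxCallsPerWindow = maxCallsPerWindow;
        this.windowMillis = windowMillis;
    }

    /** Fixed-window rate limiter: at most maxCallsPerWindow calls per window. */
    synchronized boolean tryAcquire(long nowMillis) {
        if (nowMillis - windowStart >= windowMillis) {
            windowStart = nowMillis; // start a new window
            callsInWindow = 0;
        }
        if (callsInWindow >= maxCallsPerWindow) {
            return false;
        }
        callsInWindow++;
        return true;
    }

    /** Shallow check: looks like a JSON object and names every required field. */
    boolean isValid(String output) {
        if (output == null) {
            return false;
        }
        String trimmed = output.trim();
        if (!trimmed.startsWith("{") || !trimmed.endsWith("}")) {
            return false;
        }
        for (String field : requiredFields) {
            if (!trimmed.contains("\"" + field + "\"")) {
                return false;
            }
        }
        return true;
    }

    /** Returns the model output if allowed and valid, otherwise the fallback. */
    String analyze(String prompt) {
        if (!tryAcquire(System.currentTimeMillis())) {
            return FALLBACK; // rate limit exceeded
        }
        try {
            String output = model.apply(prompt);
            return isValid(output) ? output : FALLBACK;
        } catch (RuntimeException e) {
            return FALLBACK; // model call failed
        }
    }
}
```

The fallback here is a fixed payload; depending on the product it could instead be a cached earlier analysis or a message asking the user to retry.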