
AI Assistant for Corporate Report Analysis

A conceptual AI module for automatically summarizing reports in high-volume environments, exploring Vertex AI, BigQuery, and an asynchronous architecture.

Java · Spring Boot · Vertex AI · BigQuery · Pub/Sub

The Problem

In companies that generate large volumes of data daily — sales reports, customer feedback, operational metrics, system logs — teams often spend hours manually reading and interpreting documents to extract trends, anomalies, and actionable insights.

The central question: how can raw data be transformed into structured analysis automatically and at scale?

This project explores a generic architecture for this type of solution, applicable to various domains (e-commerce, SaaS, operations, etc.).

Architectural Decisions

Why Vertex AI instead of OpenAI?

Choosing Vertex AI makes sense in scenarios where:

  • Sensitive data needs to stay within the GCP ecosystem
  • Infrastructure already exists in BigQuery, eliminating additional ETL
  • SLA and corporate compliance are requirements

Synchronous vs Asynchronous Processing

A synchronous approach — user requests and waits on screen — works for low volumes. With increasing demand, migrating to Pub/Sub is a natural evolution:

  1. User requests analysis
  2. Request is queued in Pub/Sub
  3. A worker processes with the AI model
  4. Result is persisted and the user is notified

This reduces timeouts and allows for parallel processing.
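The four steps above can be sketched in-process with a queue and a worker pool. This is a minimal illustration, not the production design: a `BlockingQueue` stands in for the Pub/Sub topic, the worker plays the role of the subscriber, and `callModel` is a stub where the Vertex AI call would go.

```java
import java.util.Map;
import java.util.concurrent.*;

// In-process sketch of the queue-worker flow. In production, submit()
// would publish to a Pub/Sub topic and processNext() would run inside
// a subscriber; here a BlockingQueue stands in for the broker.
public class AnalysisPipeline {
    public record AnalysisRequest(String id, String payload) {}

    private final BlockingQueue<AnalysisRequest> queue = new LinkedBlockingQueue<>();
    private final Map<String, String> results = new ConcurrentHashMap<>();
    private final ExecutorService workers = Executors.newFixedThreadPool(4);

    // Steps 1-2: the request is enqueued immediately; the user never blocks.
    public void submit(AnalysisRequest request) {
        queue.add(request);
        workers.submit(this::processNext);
    }

    // Steps 3-4: a worker pulls the request, calls the model, persists the result.
    private void processNext() {
        AnalysisRequest req = queue.poll();
        if (req == null) return;
        String analysis = callModel(req.payload()); // placeholder for the Vertex AI call
        results.put(req.id(), analysis);            // persistence + user notification
    }

    private String callModel(String payload) {
        return "analysis of: " + payload; // stub; real code calls Gemini via Vertex AI
    }

    public String resultFor(String id) { return results.get(id); }

    public void shutdown() throws InterruptedException {
        workers.shutdown();
        workers.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

Because each step is decoupled, increasing throughput is a matter of adding workers (or subscriber instances), and a slow model call never holds a user-facing request open.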

Caching Strategy

Two-layer caching is usually decisive for cost viability:

  • Data Cache: frequent BigQuery queries cached to avoid reprocessing
  • Analysis Cache: identical inputs return the previous result

In real-world scenarios, this can reduce AI API call costs by more than 60%.
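The analysis cache can be sketched as a map keyed by a hash of the normalized input, so that identical requests skip the model call entirely. The key scheme and the `Function`-based model hook are illustrative choices; the data cache in front of BigQuery follows the same pattern with the query text as the key.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Analysis-cache sketch: the expensive model call only runs on a cache miss.
public class AnalysisCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    public String analyze(String input, Function<String, String> model) {
        return cache.computeIfAbsent(keyFor(input), k -> model.apply(input));
    }

    // SHA-256 of the normalized input keeps the key compact and collision-safe.
    static String keyFor(String input) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(input.strip().getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real deployment the `ConcurrentHashMap` would be replaced by a cache with eviction and TTLs (e.g. Caffeine or Redis), since analyses go stale as new data arrives.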

Prompt Engineering

The quality of the analysis depends directly on prompt engineering. Useful patterns:

  • Structured Context: data in JSON with defined schema
  • Decision-oriented Instructions: instead of "summarize", ask to "identify anomalies and suggest actions"
  • Output Constraints: expected response format for consistent parsing
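The three patterns above can be combined into a single prompt template. The schema fields, instruction wording, and output format below are illustrative, not a fixed contract:

```java
// Prompt template sketch applying the three patterns: structured JSON
// context, decision-oriented instructions, and an explicit output format.
public class AnalysisPrompt {
    public static String build(String metricsJson) {
        return """
            You are an analyst for corporate reports.

            Data (JSON array, schema: {metric, value, period}):
            %s

            Instructions:
            1. Identify anomalies versus the previous period.
            2. Suggest one concrete action per anomaly.

            Respond ONLY with JSON in the form:
            {"anomalies": [...], "actions": [...]}
            """.formatted(metricsJson);
    }
}
```

Keeping the template in code (rather than inline strings scattered across services) also makes it straightforward to version and test, as noted in the lessons below.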

Technical Stack

  • Backend: Java 17 with Spring Boot
  • AI: Vertex AI (Gemini) for analysis generation
  • Data: BigQuery for storage and analytical queries
  • Messaging: Pub/Sub for asynchronous processing
  • CI/CD: Jenkins (or equivalent) with automated pipelines
  • Monitoring: Cloud Logging and alerts

Lessons Learned

  1. Generative AI in production requires guardrails. Output validation, rate limiting, and fallbacks are essential.
  2. Caching is not premature optimization when using AI. It is a requirement for financial viability.
  3. Prompt engineering is software engineering. Prompts should be versioned, tested, and have quality metrics.