AI Assistant for Corporate Report Analysis
A conceptual design for an AI module that automatically summarizes reports in high-volume environments, built around Vertex AI, BigQuery, and an asynchronous architecture.
The Problem
In companies that generate large volumes of data every day (sales reports, customer feedback, operational metrics, system logs), teams often spend hours manually reading and interpreting documents to extract trends, anomalies, and actionable insights.
The central question: how can raw data be turned into structured analysis automatically and at scale?
This project explores a generic architecture for this type of solution, applicable to various domains (e-commerce, SaaS, operations, etc.).
Architectural Decisions
Why Vertex AI instead of OpenAI?
Choosing Vertex AI makes sense in scenarios where:
- Sensitive data needs to stay within the GCP ecosystem
- The data already lives in BigQuery, so no additional ETL is needed
- SLAs and corporate compliance are hard requirements
Synchronous vs Asynchronous Processing
A synchronous approach, where the user submits a request and waits on screen for the result, works for low volumes. As demand grows, moving to a queue-based flow with Pub/Sub is a natural evolution:
- User requests analysis
- Request is queued in Pub/Sub
- A worker processes with the AI model
- Result is persisted and the user is notified
This avoids request timeouts on long-running analyses and lets multiple workers process requests in parallel.
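The four steps above can be sketched in plain Java. This is a minimal illustration, not a Pub/Sub integration: an in-memory `BlockingQueue` stands in for the Pub/Sub topic, a `ConcurrentHashMap` stands in for persistence, and the `analyze` function stands in for the Vertex AI call. The names `AsyncAnalysisSketch`, `AnalysisRequest`, `submit`, and `storedResult` are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** In-memory stand-in for the queue-based flow; all names are hypothetical. */
final class AsyncAnalysisSketch implements AutoCloseable {

    /** A queued request: an id, the report text, and a future used to notify the caller. */
    record AnalysisRequest(String id, String reportText, CompletableFuture<String> done) {}

    private final BlockingQueue<AnalysisRequest> queue = new LinkedBlockingQueue<>(); // stand-in for Pub/Sub
    private final Map<String, String> results = new ConcurrentHashMap<>();           // stand-in for persistence
    private final ExecutorService workers;

    AsyncAnalysisSketch(int workerCount, Function<String, String> analyze) {
        workers = Executors.newFixedThreadPool(workerCount);
        for (int i = 0; i < workerCount; i++) {
            workers.execute(() -> {
                try {
                    while (true) {
                        AnalysisRequest req = queue.take();      // a worker picks up the next request
                        try {
                            String analysis = analyze.apply(req.reportText());
                            results.put(req.id(), analysis);     // persist the result
                            req.done().complete(analysis);       // notify the requester
                        } catch (RuntimeException e) {
                            req.done().completeExceptionally(e); // surface failures instead of hanging
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();          // shutdown requested
                }
            });
        }
    }

    /** Accepts a request and enqueues it without blocking the caller. */
    CompletableFuture<String> submit(String id, String reportText) {
        CompletableFuture<String> done = new CompletableFuture<>();
        queue.add(new AnalysisRequest(id, reportText, done));
        return done;
    }

    String storedResult(String id) {
        return results.get(id);
    }

    @Override
    public void close() {
        workers.shutdownNow();
    }
}
```

A caller would invoke `submit(...)` and attach a callback to the returned future rather than blocking on it. In a real deployment the queue and the notification would cross process boundaries, so the future would be replaced by a stored status plus a push or polling mechanism.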
Caching Strategy
Two layers of caching usually decide whether the solution is financially viable:
- Data Cache: results of frequent BigQuery queries are cached so the same query is not re-run
- Analysis Cache: identical inputs return the previously generated result without calling the model again
How much this saves depends on how often inputs repeat. In real-world scenarios, it can reduce AI API call costs by more than 60%.
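As a rough illustration of the analysis cache, the sketch below keys stored results by a SHA-256 hash of the input. It makes several assumptions that are not part of the original design: the class name `AnalysisCache`, the in-memory map (a shared store would be needed across instances), and the choice to include a prompt version in the key so that changing the prompt invalidates old results. The `model` function is a stand-in for the AI call.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Analysis-cache sketch: identical inputs return the stored result. Names are hypothetical. */
final class AnalysisCache {
    private final Map<String, String> byInputHash = new ConcurrentHashMap<>();
    private final AtomicInteger modelCalls = new AtomicInteger();
    private final Function<String, String> model; // stand-in for the AI call

    AnalysisCache(Function<String, String> model) {
        this.model = model;
    }

    /** Returns the cached analysis for this (promptVersion, input) pair, calling the model only on a miss. */
    String analyze(String promptVersion, String input) {
        String key = sha256(promptVersion + "\n" + input);
        // Note: computeIfAbsent holds a map bin lock while the model runs; acceptable for a sketch only.
        return byInputHash.computeIfAbsent(key, k -> {
            modelCalls.incrementAndGet();
            return model.apply(input);
        });
    }

    /** Number of times the model was actually invoked (cache misses). */
    int modelCallCount() {
        return modelCalls.get();
    }

    static String sha256(String s) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            return HexFormat.of().formatHex(md.digest(s.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

If a cache hit is close to free, the share of model calls avoided is roughly the cache hit rate, so the savings depend directly on how repetitive the workload is.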
Prompt Engineering
The quality of the analysis depends directly on prompt engineering. Useful patterns:
- Structured Context: data passed as JSON with a defined schema
- Decision-oriented Instructions: instead of "summarize", ask the model to "identify anomalies and suggest actions"
- Output Constraints: a fixed response format so that results can be parsed consistently
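The three patterns above can be combined in a single prompt template. The sketch below is illustrative only: the class name `PromptSketch`, the version label, the field names (`metric`, `period`, `value`, `previous_value`), and the output shape are all assumptions chosen for the example, not a schema from this project.

```java
/** Prompt template sketch combining the three patterns; all names and fields are hypothetical. */
final class PromptSketch {
    /** A version label so prompt changes can be tracked and tested. */
    static final String VERSION = "anomaly-report-v1";

    /** Builds the prompt from report data that is already serialized as JSON. */
    static String build(String metricsJson) {
        return String.join("\n",
            "You are analyzing operational report data.",
            "",
            // Structured context: the data is JSON and its schema is stated explicitly.
            "DATA (JSON array of objects with fields: metric, period, value, previous_value):",
            metricsJson,
            "",
            // Decision-oriented instruction rather than a plain "summarize".
            "TASK: Identify anomalies in the data and suggest one concrete action for each.",
            "",
            // Output constraint: a fixed shape that downstream code can parse.
            "OUTPUT: Respond with JSON only, matching:",
            "{\"anomalies\": [{\"metric\": string, \"reason\": string, \"action\": string}]}");
    }
}
```

Keeping the template in code under a version label makes it possible to diff prompt changes and to run the same test inputs against each version.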
Technical Stack
- Backend: Java 17 with Spring Boot
- AI: Vertex AI (Gemini) for analysis generation
- Data: BigQuery for storage and analytical queries
- Messaging: Pub/Sub for asynchronous processing
- CI/CD: Jenkins (or equivalent) with automated pipelines
- Monitoring: Cloud Logging and alerts
Lessons Learned
- Generative AI in production requires guardrails. Output validation, rate limiting, and fallbacks are essential.
- Caching is not premature optimization when using AI. It is a requirement for financial viability.
- Prompt engineering is software engineering. Prompts should be versioned, tested, and have quality metrics.
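To make the guardrail point concrete, here is a minimal sketch of output validation with a bounded retry and a fallback. It is deliberately naive: the validity check only inspects the shape of the text, whereas a production version would parse the response with a JSON library and validate it against a schema. The class name `GuardedAnalysis`, the size limit, the expected `anomalies` key, and the fallback payload are assumptions for illustration. Rate limiting is not covered here.

```java
import java.util.function.Supplier;

/** Guardrail sketch: validate model output, retry a bounded number of times, then fall back. */
final class GuardedAnalysis {
    static final int MAX_CHARS = 20_000; // hypothetical upper bound on response size
    static final String FALLBACK = "{\"anomalies\": [], \"status\": \"analysis_unavailable\"}";

    /** Shape-only check: non-empty, bounded in size, looks like a JSON object with the expected key. */
    static boolean looksValid(String output) {
        if (output == null) {
            return false;
        }
        String t = output.strip();
        return !t.isEmpty()
            && t.length() <= MAX_CHARS
            && t.startsWith("{")
            && t.endsWith("}")
            && t.contains("\"anomalies\"");
    }

    /** Calls the model up to maxAttempts times and returns FALLBACK if no attempt yields valid output. */
    static String analyzeWithFallback(Supplier<String> modelCall, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                String output = modelCall.get();
                if (looksValid(output)) {
                    return output.strip();
                }
            } catch (RuntimeException e) {
                // A failed call is treated like an invalid output: try again until attempts run out.
            }
        }
        return FALLBACK;
    }
}
```

Returning a well-formed fallback keeps downstream parsing code on a single path, and the explicit status field lets the caller tell a real "no anomalies" result from an unavailable analysis.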