Summary

A biotechnology company is pioneering a platform for the intracellular delivery of therapeutic payloads. It leverages Protein Nanoparticles (PNPs), Targeted Lipid Nanoparticles (tLNPs), and Antibody Oligonucleotide Conjugates (AOCs) to develop next-generation therapeutics that can reach previously inaccessible targets within cells. The company engaged Tennex to develop an AI-powered document enrichment pipeline on AWS so that its science teams can quickly surface actionable insights.

The Challenge 

The company's research teams needed to process and analyze large volumes of diverse scientific documents, including PDFs, Excel spreadsheets, PowerPoint presentations, images, and JSON data. The existing manual review process was time-consuming and inconsistent, which made it difficult to extract meaningful insights and generate structured scientific reports at scale. The teams required an automated pipeline that could enrich documents with AI-driven summarization and semantic analysis to accelerate their research workflows.

“With Tennex's event-driven document enrichment pipeline powered by Bedrock, we're processing data faster than ever and accelerating time to value.”

- Director, Platform

The Solution

Tennex designed and built a fully serverless AI/ML document enrichment and report generation pipeline on AWS. The solution uses AWS Step Functions to orchestrate a multi-stage workflow that ingests documents from Amazon S3, processes them through AWS Lambda (ARM64) and Amazon ECS Fargate containers, and enriches them using LLM summarization powered by Anthropic Claude and vector embeddings via Amazon Bedrock Nova v2 Multimodal. The pipeline generates structured scientific reports in PDF and DOCX formats.
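To make the orchestration described above more concrete, here is a minimal, dependency-free TypeScript sketch of how such a workflow could be expressed as an Amazon States Language definition. It is not the pipeline Tennex built. The state names, function and cluster names, the file-extension mapping, and the rule that sends PDFs and presentations to Fargate while other types go to Lambda are all assumptions made for illustration. Only the two `Resource` integration strings are standard Step Functions identifiers.

```typescript
// Hypothetical sketch: state names, resource names, and the routing rule are
// assumptions, not details taken from the pipeline described above.
type DocKind = "pdf" | "spreadsheet" | "presentation" | "image" | "json";

interface ChoiceRule { Variable: string; StringEquals: DocKind; Next: string }
interface State {
  Type: "Task" | "Choice";
  Resource?: string;
  Parameters?: Record<string, unknown>;
  Choices?: ChoiceRule[];
  Default?: string;
  Next?: string;
  End?: boolean;
}
interface StateMachine { Comment: string; StartAt: string; States: Record<string, State> }

// Assumed mapping from file extension to document kind.
const EXTENSIONS = new Map<string, DocKind>([
  ["pdf", "pdf"], ["xlsx", "spreadsheet"], ["xls", "spreadsheet"],
  ["pptx", "presentation"], ["ppt", "presentation"],
  ["png", "image"], ["jpg", "image"], ["jpeg", "image"], ["json", "json"],
]);

// Classify an S3 object key by extension; undefined means "unsupported".
function classifyDocument(key: string): DocKind | undefined {
  const ext = key.toLowerCase().split(".").pop() ?? "";
  return EXTENSIONS.get(ext);
}

// Build a Lambda invoke task; without `next` the task ends the workflow.
const lambdaTask = (fn: string, next?: string): State => ({
  Type: "Task",
  Resource: "arn:aws:states:::lambda:invoke",
  Parameters: { FunctionName: fn, "Payload.$": "$" },
  ...(next ? { Next: next } : { End: true }),
});

const enrichmentMachine: StateMachine = {
  Comment: "Illustrative document enrichment workflow",
  StartAt: "RouteByKind",
  States: {
    RouteByKind: {
      Type: "Choice",
      Choices: [
        { Variable: "$.kind", StringEquals: "pdf", Next: "ExtractOnFargate" },
        { Variable: "$.kind", StringEquals: "presentation", Next: "ExtractOnFargate" },
      ],
      Default: "ExtractInLambda",
    },
    ExtractInLambda: lambdaTask("extract-fn", "Summarize"),
    ExtractOnFargate: {
      Type: "Task",
      // A real Fargate task would also need a NetworkConfiguration; omitted here.
      Resource: "arn:aws:states:::ecs:runTask.sync",
      Parameters: { LaunchType: "FARGATE", Cluster: "enrichment-cluster", TaskDefinition: "extract-task" },
      Next: "Summarize",
    },
    Summarize: lambdaTask("summarize-fn", "Embed"),
    Embed: lambdaTask("embed-fn", "GenerateReport"),
    GenerateReport: lambdaTask("report-fn"),
  },
};

// Follow transitions for one document kind and return the visited state names.
function tracePath(machine: StateMachine, kind: DocKind): string[] {
  const path: string[] = [];
  let name: string | undefined = machine.StartAt;
  while (name !== undefined) {
    const state: State | undefined = machine.States[name];
    if (!state) throw new Error(`Unknown state: ${name}`);
    path.push(name);
    if (state.Type === "Choice") {
      const hit = state.Choices?.find((c) => c.StringEquals === kind);
      name = hit?.Next ?? state.Default;
    } else {
      name = state.End ? undefined : state.Next;
    }
  }
  return path;
}
```

Calling `JSON.stringify(enrichmentMachine)` yields a definition document in the shape Step Functions expects, and `tracePath` gives a quick way to check routing for a given document kind before deploying anything.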

All infrastructure is deployed as code using AWS CDK v2 in TypeScript, with credentials managed through AWS Secrets Manager. The enrichment agent uses 24 calibrated prompt templates to ensure consistent, high-quality output across all document types.
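One way to keep output consistent across document types is to key prompt templates by document kind and task, then fill in named placeholders at run time. The TypeScript sketch below illustrates that idea only. The template IDs, wording, `{{placeholder}}` syntax, and the two entries shown are hypothetical, and nothing here reflects the actual 24 templates or how they are organized.

```typescript
// Hypothetical sketch: template ids, text, and placeholder syntax are assumptions.
interface PromptTemplate {
  id: string;
  kind: string; // document kind, e.g. "pdf"
  task: string; // enrichment task, e.g. "summarize"
  template: string;
}

// Two illustrative entries; a real registry would hold one per kind/task pair.
const TEMPLATES: PromptTemplate[] = [
  {
    id: "pdf-summarize",
    kind: "pdf",
    task: "summarize",
    template: "Summarize the document titled {{title}} in {{maxWords}} words or fewer.",
  },
  {
    id: "spreadsheet-summarize",
    kind: "spreadsheet",
    task: "summarize",
    template: "Describe the key columns and trends in the sheet {{title}}.",
  },
];

// Pick the single template registered for a kind/task pair, or fail loudly.
function selectTemplate(all: PromptTemplate[], kind: string, task: string): PromptTemplate {
  const found = all.find((t) => t.kind === kind && t.task === task);
  if (!found) throw new Error(`No template for kind=${kind}, task=${task}`);
  return found;
}

// Replace every {{name}} placeholder; a missing variable is an error rather
// than a silent blank, so malformed prompts never reach the model.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, name) ? vars[name] : undefined;
    if (value === undefined) throw new Error(`Missing template variable: ${name}`);
    return value;
  });
}
```

For example, `renderPrompt(selectTemplate(TEMPLATES, "pdf", "summarize").template, { title: "Assay report", maxWords: "200" })` produces a fully populated prompt string, while an unregistered kind/task pair or a missing variable throws instead of degrading quietly.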