GrowNative

AI Implementation & Guardrails

A transparent overview of the Generative AI models, workflows, and safety protocols powering the GrowNative platform.
1

Narrative Engine

Gemini 2.0 Flash (Orchestrator) • Gemini 3.0 Pro (The Brain)
HLM Methodology

**High-Density Linguistic & Multimodal Logic (HLM)** packs every prompt with specific cultural, pedagogical, and visual data so that the generated story stays faithful to its intended context.

1. High-Density Cultural Anchoring

Injects "Technical Anchors" (e.g., exact weave of a Kasavu Mundu) to prevent fallback to Western tropes.

2. Linguistic Scaffolding

Enforces literacy standards (e.g., Level 1 SVO syntax) and extracts key vocabulary for "Pedagogical Audit".

3. Multimodal Stability

Treats image and text as a unified "State", cross-checking visual output against the Linguistic Anchor.

Safety Guardrails
Strict JSON Schema

Constrains the model to a fixed output structure, which ensures structural integrity and leaves less room for off-script content.

Ex: Returns pure JSON, no "Here is your story" chat filler.
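A minimal sketch of what a strict structural check could look like, using only the Python standard library. The field names (`title`, `pages`, `text`, `vocabulary`) are assumptions made for illustration; the actual GrowNative schema is not described on this page.

```python
import json

# Hypothetical story schema: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "pages": list}
PAGE_FIELDS = {"text": str, "vocabulary": list}


def parse_story(raw: str) -> dict:
    """Parse model output and reject anything outside the expected JSON shape."""
    try:
        # Fails on chat filler such as "Here is your story: {...}".
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not pure JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    for name, kind in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), kind):
            raise ValueError(f"missing or mistyped field: {name}")
    for index, page in enumerate(data["pages"]):
        if not isinstance(page, dict):
            raise ValueError(f"page {index} must be an object")
        for name, kind in PAGE_FIELDS.items():
            if not isinstance(page.get(name), kind):
                raise ValueError(f"page {index}: missing or mistyped field: {name}")
    return data
```

A check like this only guarantees shape. Whether the content is accurate or appropriate still depends on the audit steps described elsewhere on this page.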
Phase 4: Auto-Audit

A dedicated LLM pass validates 'Kid-Safety' before showing output.

Ex: Flagged "scary clown" → regenerated as "silly juggler".
Topic Filtering

Pre-validation rejects harmful topics immediately.

Ex: Rejects "horror" or "violence" prompts instantly.
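As a rough illustration, pre-validation can be a cheap keyword check that runs before any model call. The sketch below assumes a simple whole-word blocklist; the list contents beyond "horror" and "violence" and the function name `prevalidate` are hypothetical, and a production filter would likely need more than keyword matching.

```python
import re
from typing import Optional, Tuple

# Only "horror" and "violence" are named in the text; extend as needed.
BLOCKED_TOPICS = ("horror", "violence")

# Match whole words only, case-insensitively.
_BLOCKED_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(topic) for topic in BLOCKED_TOPICS) + r")\b",
    re.IGNORECASE,
)


def prevalidate(prompt: str) -> Tuple[bool, Optional[str]]:
    """Return (allowed, matched_topic) without calling any model."""
    match = _BLOCKED_PATTERN.search(prompt)
    if match:
        return False, match.group(1).lower()
    return True, None
```

Returning the matched topic alongside the verdict makes it possible to show the user why a prompt was rejected.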
The "Cultural Oracle" in Action
Powered by Gemini 3.0

Same Prompt, Different Worlds.
Our HLM Logic adapts Visual DNA (clothing, environment, lighting) to match the cultural context without changing the core narrative.

Case Study A: Football
Prompt: "Four boys playing football"

Context: Tamil (South India)
Key Details: Dusty red earth, simple cotton attire, warm "golden hour" lighting.

Context: Chinese (Urban Park)
Key Details: Lush green park, layered modern casuals, cooler urban lighting.
HLM Logic vs. Standard "One-Shot" Prompting
| Feature | Standard Prompting | HLM Logic (GrowNative) |
| --- | --- | --- |
| Identity | High risk of character & cultural drift. | Visual DNA Lock: High-contrast technical anchors ensure 100% continuity. |
| Context | Single-turn, forgets previous pages. | Sequential Memory: Maintains state through the SAGA framework. |
| Pedagogy | "Write a kids' story" (Unpredictable). | Standardized Rubrics: Strictly follows Reading Level 1-8 constraints. |
| Speed | Slow per-frame generation. | Context Caching: Reduces latency by 30-50% via cached cultural data. |

2

Gemini 3.0 Multimodal SAGA

Gemini 3.0 (Director) • Imagen 3 (High-Fidelity Renderer)
SAGA & Visual DNA

**S**tate-**A**ware **G**eneration of **A**ssets ensures that characters remain consistent across the entire story.

Visual DNA Locking

We extract a "Genetic Code" for characters (hair, clothes, size) and inject it into every prompt.

Ex: The tag "Ananya [DNA: Blue Silk Top]" travels with every prompt, so Imagen 3 always renders her the same way.
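The idea of extracting a character's traits once and injecting them into every prompt can be sketched as follows. This is an illustration under assumptions: the `VisualDNA` class, its fields, the tag format, and `build_prompt` are hypothetical, and the hair and size values are invented for the example (only "Blue Silk Top" comes from the text above).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)  # frozen: traits cannot be changed after extraction
class VisualDNA:
    name: str
    hair: str
    clothes: str
    size: str

    def tag(self) -> str:
        """Render the traits as a compact tag to embed in a prompt."""
        return f"{self.name} [DNA: {self.hair}; {self.clothes}; {self.size}]"


def build_prompt(scene: str, characters: List[VisualDNA]) -> str:
    """Prefix a scene description with the locked traits of every character."""
    sheet = " | ".join(character.tag() for character in characters)
    return f"Characters: {sheet}\nScene: {scene}"


# Example values; hair and size are made up for illustration.
ananya = VisualDNA("Ananya", "long black braid", "blue silk top", "small child")
page_1 = build_prompt("Ananya waters the plants.", [ananya])
page_2 = build_prompt("Ananya reads under a tree.", [ananya])
```

Because the same frozen object feeds every page, the trait text in `page_1` and `page_2` is identical by construction.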

Seed Control

Locks random seeds to maintain stylistic consistency (Pixar-esque 3D).

Visual Guardrails
Prompt Rewriting Layer

Rewrites image prompts before rendering so that the resulting imagery stays age-appropriate.

Ex: "Scary woods" → rewritten to "Mysterious, foggy forest".
Vision Audit

A post-generation check uses Gemini Vision to verify 'Kid-Safety' before display.

Ex: Rejects images with scary faces or unsafe objects.
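A post-generation check of this kind usually sits inside a generate, audit, retry loop. The sketch below shows only that control flow: `generate` and `audit` are placeholder callables standing in for the image model and the vision check, and the retry limit of 3 is an assumption rather than a documented setting.

```python
from typing import Callable, Optional


def generate_with_audit(
    prompt: str,
    generate: Callable[[str], bytes],
    audit: Callable[[bytes], bool],
    max_attempts: int = 3,
) -> Optional[bytes]:
    """Return the first image that passes the audit, or None if none does."""
    for _ in range(max_attempts):
        image = generate(prompt)
        if audit(image):
            return image
    # Nothing passed: the caller can route the page to human review.
    return None
```

Returning `None` instead of the last rejected image ensures that an unaudited image is never displayed by accident.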
Dynamic Crop Control

Ensures no inappropriate framing or focus.

Ex: Centers the character's face/action, avoiding awkward crops.
SAGA (Stateful) vs. RAG (Retrieval)
| Feature | RAG (Retrieval-Augmented) | SAGA (Sequential Agentic) |
| --- | --- | --- |
| Goal | Grounding a single response in external facts. | Maintaining Stateful Continuity across a sequence. |
| Data Handling | Retrieves static documents from Vector DB. | Extracts & passes Visual DNA + Narrative State. |
| Context | "Look up this fact to answer." | "Remember how the character looked in the last frame." |

Why SAGA for GrowNative?
Elimination of 'Visual Drift'

Unlike RAG, SAGA extracts a specific Physical ID (Visual DNA) from the first frame and locks it as an immutable constraint.

Narrative Staging

SAGA uses a Shot Plan. If Kayal is on the left on Page 1, the 'Director Agent' knows where she should be on Page 2.

Multimodal QA

Integrates a Vision Audit loop. It doesn't just generate; it 'sees' the output and compares it to the Visual DNA.

Lower Latency via Caching

Combines SAGA with Context Caching to store 'Rules of the World', making sequential generation 30-50% faster.

3

Gemini 3.0 Audio Orchestration

Gemini 3.0 (Scripting) • Gemini 2.0 Flash (Speech)
RESPONSIBILITY

The Hybrid Audio Engine uses **Gemini 2.0 Flash** for lifelike speech sequencing, or falls back to browser TTS with AI-directed prosody.

UX GUARDRAILS

Sentence Pacing Engine

Custom regex logic splits text and inserts 600ms pauses for comprehension.

Ex: "The cat sat... [pause] ...she looked up."
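The pacing step can be sketched as a regex split followed by interleaving pauses. This is a minimal illustration, not the platform's actual code: the 600 ms value comes from the text above, while the boundary pattern and the shape of the returned plan are assumptions.

```python
import re

PAUSE_MS = 600  # pause length stated in the text

# Split on whitespace that follows sentence-ending punctuation.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def pace(text: str) -> list:
    """Return a playback plan: sentences interleaved with pause durations."""
    sentences = [s for s in _SENTENCE_BOUNDARY.split(text.strip()) if s]
    plan = []
    for index, sentence in enumerate(sentences):
        if index > 0:
            plan.append({"pause_ms": PAUSE_MS})
        plan.append({"speak": sentence})
    return plan
```

A pattern this simple also splits after abbreviations such as "Dr.", so real story text may need a small exception list.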

Rate/Pitch Modulation

Adjusts browser TTS to sound friendly and less robotic.

Ex: Pitch lowered 10% to sound warmer.

Accessibility Coloring

Confetti and text use strictly enforced 'Brand Colors'.

Ex: Yellow text gets a black shadow so it stays legible on white backgrounds.
4

Human-in-the-Loop (HITL)

Human Logic Model (HLM) • Gemini 3.0 Safety Auditor

AI is powerful, but not perfect. We implement strict human oversight layers:

Revision Queue & HLM

Stories flagged by automated audits are quarantined for Human Review.

Ex: "High Uncertainty" stories wait for admin approval.

Satisfaction Checklist

Admins must explicitly verify safety before publishing.

Ex: "Publish" button disabled until "Kid Safety" checked.
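The gating rule behind a disabled "Publish" button can be expressed as a small predicate. The sketch below assumes a `Review` record with a dictionary of checklist items; the class, its field names, and the `kid_safety` key are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Checklist items an admin must tick before publishing (assumed key name).
REQUIRED_CHECKS = ("kid_safety",)


@dataclass
class Review:
    story_id: str
    checks: Dict[str, bool] = field(default_factory=dict)

    def can_publish(self) -> bool:
        """True only when every required item was explicitly ticked."""
        return all(self.checks.get(name) is True for name in REQUIRED_CHECKS)


review = Review(story_id="story-001")
publish_enabled_before = review.can_publish()  # nothing ticked yet
review.checks["kid_safety"] = True
publish_enabled_after = review.can_publish()
```

Treating a missing checklist entry the same as an unticked one keeps the default state safe: a story cannot be published simply because a check was never recorded.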

Parent Controls

Parents can lock views or reset profiles manually.

Ex: Parent can wipe a specific category instantly.

Privacy & Data Protection

Local-First Architecture for Zero-PII
Zero Cloud PII

User profiles & data live entirely in your browser's localStorage.

Ex: We have no user database. Data stays on your device.
Ephemeral AI (Gemini 3.0)

Conversational states are processed and forgotten instantly.

Ex: Google forgets the story context immediately after generation.
Parent Sovereignty

One-click 'Reset Local Data' wipes all traces instantly.

Ex: The 'Reset' button permanently deletes the app's key from your browser's local storage.

The "Family" Win

More than just an app. A bridge for connection.

"We don't just generate content; we generate conversations."

Engagement • Learning • Bonding
👨‍👩‍👧 Quality Family Time

Turns passive screen time into active co-reading. Parents and kids explore Cultural Nuances together, sparking questions like "Is that how grandma's house looks?"

🧠 "Smart" Parenting Tools
Gemini 3.0 Linguistics

Word Builder & Grammar References give parents the "Teacher's Key". You don't need to be a linguist to help your child learn native vocabulary.

✨ Emotional Engagement

When kids see characters that look like them in environments they recognize, engagement skyrockets. Identity = Retention.