AI Features & Privacy

Learn how AssisT handles AI processing with local LLMs and optional cloud APIs while keeping your data private.

Overview

AssisT uses a privacy-first hybrid AI system that gives you four ways to use AI features, from completely offline to cloud-powered. Your data stays on your device by default.

Four AI Modes

AssisT offers flexible AI processing with four distinct modes:

| Mode | Privacy | Cost | Performance | Requirements |
|------|---------|------|-------------|--------------|
| Off | N/A | Free | Features disabled | None |
| Local AI (Ollama) | 100% Private | Free | Good (hardware-dependent) | Ollama installed |
| Cloud AI (API) | Your API only | Pay-per-use | Excellent | API key |
| Gemini Nano | 100% Private | Free | Fast, on-device | Chrome 128+ |

You can switch between modes at any time in AssisT settings using the radio-button mode selector.

Key Principles

  • Your Choice: Pick the AI mode that matches your privacy and performance needs
  • No Data Collection: We never see, store, or transmit your data
  • Bring Your Own Key: Cloud mode uses your own API keys, not ours
  • Graceful Fallback: Features work even without AI (with reduced functionality)

Local AI with Ollama

AssisT integrates with Ollama, a free, open-source tool that runs AI models directly on your computer.

Why Local AI?

| Benefit | Description |
|---------|-------------|
| Privacy | Data never leaves your device |
| Compliance | Safe for GDPR, FERPA, and HIPAA environments |
| No Cost | No API fees or subscriptions |
| Offline | Works without internet connection |
| Speed | No network latency for requests |

Supported Models

AssisT automatically detects and uses available Ollama models:

| Model | Size | Best For |
|-------|------|----------|
| phi3:mini | 2GB | Fast responses, basic tasks |
| llama3.2 | 5GB | Balanced performance |
| mistral | 4GB | Complex analysis, detailed responses |
| llava | 4GB | Image understanding (vision) |
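
Detection like this is possible because a running Ollama instance lists its installed models over its local REST API at GET /api/tags. The sketch below shows the idea; the function name is illustrative, not AssisT's actual code.

```ts
// Hedged sketch: detecting installed Ollama models via the local REST API.
// GET http://localhost:11434/api/tags returns { models: [{ name, ... }] }.
async function detectOllamaModels(): Promise<string[]> {
  try {
    const res = await fetch("http://localhost:11434/api/tags");
    if (!res.ok) return [];
    const data = (await res.json()) as { models: { name: string }[] };
    return data.models.map((m) => m.name); // e.g. ["phi3:mini", "llama3.2"]
  } catch {
    return []; // Ollama not installed or not running
  }
}
```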

Installing Ollama

  1. Download Ollama from ollama.ai
  2. Install and run Ollama on your computer
  3. AssisT will automatically detect it

Installing Models

Once Ollama is running, you can install models directly from AssisT:

  1. Open AssisT settings
  2. Go to AI Settings > Local Models
  3. Click Install next to your preferred model
  4. Wait for the download to complete
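
Behind a button like this, Ollama's local API exposes a pull endpoint. A minimal sketch, assuming you want to wait for the download to finish (the function name is illustrative):

```ts
// Hedged sketch: installing a model through Ollama's local pull endpoint.
// With stream: false, the request resolves once the download completes.
async function installModel(name: string): Promise<void> {
  const res = await fetch("http://localhost:11434/api/pull", {
    method: "POST",
    body: JSON.stringify({ name, stream: false }), // e.g. name = "phi3:mini"
  });
  if (!res.ok) throw new Error(`Model pull failed: ${res.status}`);
}
```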

Recommended Model Sets:

  • Minimal (2GB): phi3:mini - Fast responses for basic tasks
  • Balanced (5GB): phi3:mini + llama3.2 - Good for most users
  • Full (10GB): All models including vision - Complete AI capabilities

AssisT will recommend a model set based on your system’s available memory.

How Local AI Works

Your Browser (AssisT)
        ↓  Message Bridge
Ollama (localhost:11434)
        ↓  AI Response
Back to AssisT

All communication happens locally on your machine. Nothing is sent to external servers.
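
In practice, that round trip is a plain HTTP request to Ollama's generate endpoint on localhost. A minimal sketch, assuming phi3:mini is installed (the function name is illustrative):

```ts
// Hedged sketch: one local request/response cycle with Ollama.
// Nothing here leaves the machine; localhost:11434 is Ollama's default port.
async function askLocalModel(prompt: string): Promise<string> {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    body: JSON.stringify({ model: "phi3:mini", prompt, stream: false }),
  });
  const data = (await res.json()) as { response: string };
  return data.response;
}
```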

Cloud Providers (Optional)

For users who want more powerful AI capabilities, AssisT supports multiple cloud providers through API keys you provide.

Supported Providers

| Provider | Strengths | Best For |
|----------|-----------|----------|
| Anthropic (Claude) | Coding, academic writing, analysis | Text simplification, tutoring |
| OpenAI (ChatGPT) | Creative, conversational | Brainstorming, general tasks |
| Google (Gemini) | Multimodal, visual, factual | Image understanding |
| Perplexity | Real-time web, citations | Research, fact-checking |

Bringing Your Own API Key

  1. Get an API key from your preferred provider (Anthropic, OpenAI, Google, or Perplexity)
  2. Open AssisT settings
  3. Go to AI Settings > Cloud Providers
  4. Select your provider and enter your API key
  5. Choose your preferred model
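
Once a key is configured, cloud requests go straight from your browser to the provider. As an illustration, here is a minimal sketch of a direct call to Anthropic's Messages API (the model ID matches the Claude table below; the function name is ours, not AssisT's):

```ts
// Hedged sketch: calling Anthropic's Messages API with a key you provide.
// The request goes directly to the provider; no intermediary servers.
async function askClaude(apiKey: string, prompt: string): Promise<string> {
  const res = await fetch("https://api.anthropic.com/v1/messages", {
    method: "POST",
    headers: {
      "x-api-key": apiKey,               // your key, stored locally
      "anthropic-version": "2023-06-01",
      "content-type": "application/json",
    },
    body: JSON.stringify({
      model: "claude-sonnet-4-5",        // model ID from the Claude table below
      max_tokens: 1024,
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = (await res.json()) as { content: { type: string; text: string }[] };
  return data.content[0].text;           // first text block of the reply
}
```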

Cost vs Quality

| Model Type | Cost | Best For |
|------------|------|----------|
| Fast (Haiku, GPT-4o-mini, Flash) | Cheaper per token | Simple tasks, high volume |
| Balanced (Sonnet, GPT-4o, Pro) | Moderate | Most use cases |
| Quality (Opus, GPT-4) | Higher per token | Complex tasks, accuracy critical |

Tip: Start with faster models for simple tasks. Use larger models when you need more nuanced or accurate responses.

API Key Security

  • Your API keys are stored locally in Chrome’s secure storage
  • They are never sent to Fiavaion servers
  • They are transmitted only to the provider, and only when you use cloud features
  • You can remove them at any time from settings
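
As a sketch of what “stored locally” can look like, an extension might keep keys in chrome.storage.local, which stays on this machine (unlike chrome.storage.sync). The key names here are illustrative, not AssisT's actual schema.

```ts
// Hedged sketch: local-only API key storage in a Chrome extension (MV3).
async function saveApiKey(provider: string, key: string): Promise<void> {
  await chrome.storage.local.set({ [`apiKey:${provider}`]: key });
}

async function loadApiKey(provider: string): Promise<string | undefined> {
  const items = await chrome.storage.local.get(`apiKey:${provider}`);
  return items[`apiKey:${provider}`] as string | undefined;
}
```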

Claude 4.5/4.6 Models (Anthropic)

When using Cloud AI mode with an Anthropic API key, AssisT supports the latest Claude models for powerful language understanding and generation:

| Model | Model ID | Best For | Input Cost | Output Cost |
|-------|----------|----------|------------|-------------|
| Haiku 4.5 | claude-haiku-4-5 | Quick answers, simple tasks, high volume | $0.001/1K tokens | $0.005/1K tokens |
| Sonnet 4.5 | claude-sonnet-4-5 | Everyday tasks, balanced performance (recommended) | $0.003/1K tokens | $0.015/1K tokens |
| Opus 4.6 | claude-opus-4-6 | Complex analysis, critical work, highest quality | $0.015/1K tokens | $0.075/1K tokens |

Cost Example: A typical 500-word document summary using Sonnet 4.5 costs approximately $0.002-0.004 per request.
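
That estimate checks out with back-of-envelope math (token counts are rough; about 1.3 tokens per English word is a common rule of thumb):

```ts
// Hedged sanity check of the cost example above, using the Sonnet 4.5
// rates from the table. Token counts are rough estimates, not exact.
const inputTokens = 500 * 1.3;   // ~650 tokens for a 500-word document
const outputTokens = 100 * 1.3;  // ~130 tokens for a ~100-word summary
const cost =
  (inputTokens / 1000) * 0.003 +  // input at $0.003/1K tokens
  (outputTokens / 1000) * 0.015;  // output at $0.015/1K tokens
console.log(cost.toFixed(4));     // ≈ 0.0039, inside the $0.002-0.004 range
```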

Recommendation: Start with Sonnet 4.5 for the best balance of quality and cost. Use Haiku 4.5 for simple, high-volume tasks. Reserve Opus 4.6 for complex analysis where accuracy is critical.

Feature-Specific Defaults:

  • Summarization: Haiku 4.5 (fast, sufficient for most summaries)
  • Text Simplification: Sonnet 4.5 (better comprehension and clarity)
  • Assignment Breakdown: Sonnet 4.5 (detailed task analysis)
  • Socratic Tutor: Opus 4.6 (complex reasoning and questioning)
  • Citation Analysis: Sonnet 4.5 (balanced accuracy and speed)
  • Multi-Document Compare: Opus 4.6 (handles complexity well)
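
A natural way to express these defaults is a simple lookup table. The sketch below is illustrative, not AssisT's internals:

```ts
// Hedged sketch: the feature-specific defaults above as a lookup table,
// mapped to the Claude model IDs from the table earlier in this section.
const claudeDefaults: Record<string, string> = {
  summarization: "claude-haiku-4-5",
  textSimplification: "claude-sonnet-4-5",
  assignmentBreakdown: "claude-sonnet-4-5",
  socraticTutor: "claude-opus-4-6",
  citationAnalysis: "claude-sonnet-4-5",
  multiDocumentCompare: "claude-opus-4-6",
};
```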

Gemini 2.0 Models

AssisT supports Google’s Gemini models, including the latest Gemini 2.0, when using Cloud AI mode with a Google API key:

| Model | Description | Best For |
|-------|-------------|----------|
| Gemini 1.5 Flash | Fast, efficient, affordable | Quick tasks, high volume |
| Gemini 1.5 Pro | Balanced performance and quality | General use, complex tasks |
| Gemini 2.0 Flash Experimental | Latest experimental model | Cutting-edge features, testing |

These models offer improved reasoning, multimodal understanding, and longer context windows compared to earlier versions.

Gemini Nano (Chrome Built-In AI)

NEW: AssisT now supports Chrome’s built-in Gemini Nano model for completely private, on-device AI processing without installing anything.

What is Gemini Nano?

Gemini Nano is Google’s smallest AI model, built directly into Chrome 128 and later. It runs entirely on your device using Chrome’s Prompt API (window.ai).
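
For the curious, here is a minimal availability-and-prompt sketch against the window.ai surface mentioned above. The Prompt API has changed shape across Chrome releases, so treat this as illustrative rather than authoritative:

```ts
// Hedged sketch of the Chrome Prompt API (window.ai); the exact surface
// varies by Chrome release, so this follows one early shape.
declare const window: any; // Prompt API types are not standardized yet

async function askGeminiNano(prompt: string): Promise<string | null> {
  const model = window.ai?.languageModel;
  if (!model) return null;                  // flag off or Chrome too old
  const caps = await model.capabilities();
  if (caps.available === "no") return null; // device not supported
  const session = await model.create();     // may trigger a background download
  return session.prompt(prompt);            // runs entirely on-device
}
```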

Benefits

  • Zero Setup: No Ollama installation required
  • 100% Private: All processing happens locally in Chrome
  • No API Costs: Completely free to use
  • Fast: Optimized for on-device performance
  • Offline: Works without internet connection

Requirements

To use Gemini Nano mode, you need:

  1. Chrome 128 or later (Canary, Dev, Beta, or Stable)
  2. Feature flag enabled: Visit chrome://flags/#optimization-guide-on-device-model and set to “Enabled”
  3. Model download: Chrome downloads the model automatically on first use (happens in background)

How to Enable

  1. Open AssisT settings
  2. Go to AI Settings
  3. Select Gemini Nano mode
  4. AssisT will check availability and show status

Status Indicators

| Status | Meaning | What to Do |
|--------|---------|------------|
| Ready | Model downloaded and available | Start using AI features |
| Needs Download | Model downloading in background | Wait a few minutes, reload |
| Not Supported | Chrome version too old or device incompatible | Update Chrome or use different mode |
| Unavailable | Feature flag not enabled | Enable flag at chrome://flags |

Gemini Nano vs Ollama

| Feature | Gemini Nano | Ollama |
|---------|-------------|--------|
| Setup | Chrome flag only | Install separate app |
| Model Size | ~2GB (built into Chrome) | 2GB-7GB per model |
| Model Choice | Single model (Google’s) | Many models available |
| Performance | Good for basic tasks | Better for complex tasks |
| Customization | Limited | Full control |

Recommendation: Try Gemini Nano first for simplicity. Switch to Ollama if you need more powerful models or specific capabilities.

How the AI Mode System Works

AssisT routes requests based on your selected AI mode:

Feature Request
      ↓
Check Selected AI Mode
      ├─ OFF Mode ──────────────→ Fallback behavior
      └─ AI Enabled Mode
            ├─ Local AI Mode
            │     ├─ Ollama Mode ──→ Use Ollama (private)
            │     └─ Gemini Nano ──→ Use Chrome built-in AI (private)
            └─ Cloud AI Mode
                  ├─ API Key ──────→ Use cloud provider (your key)
                  └─ No API key ───→ Fallback behavior
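
In code, that routing is essentially one switch over the selected mode. A sketch reusing the illustrative functions from earlier sections (mode names are ours, not AssisT's):

```ts
// Hedged sketch of the routing above. askLocalModel, askGeminiNano, and
// askClaude are the illustrative functions sketched in earlier sections.
type AiMode = "off" | "ollama" | "gemini-nano" | "cloud";

const firstParagraph = (text: string) => text.split(/\n\s*\n/)[0]; // OFF fallback

async function route(mode: AiMode, apiKey: string | undefined, prompt: string): Promise<string> {
  switch (mode) {
    case "off":
      return firstParagraph(prompt);        // features degrade gracefully
    case "ollama":
      return askLocalModel(prompt);         // private, localhost:11434
    case "gemini-nano":
      return (await askGeminiNano(prompt)) ?? firstParagraph(prompt);
    case "cloud":
      return apiKey ? askClaude(apiKey, prompt) : firstParagraph(prompt);
  }
}
```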

Task-Based Model Selection

Different features use the best available model for optimal results:

| Feature | Gemini Nano | Ollama (Local) | Recommended Cloud |
|---------|-------------|----------------|-------------------|
| Summarization | ✅ Good | phi3:mini, llama3.2 | Any fast model |
| Text Simplification | ✅ Good | llama3.2, mistral | Anthropic (clarity) |
| Socratic Tutor | ⚠️ Basic | mistral | Anthropic (reasoning) |
| Assignment Breakdown | ✅ Good | llama3.2, mistral | Claude, GPT-4 |
| Image Understanding | ❌ Not supported | llava | Gemini or GPT-4o |
| Research & Citations | ❌ Not supported | ❌ No web access | Perplexity (web access) |

Fallback Behaviors

When AI isn’t available, features gracefully degrade:

| Feature | Fallback Behavior |
|---------|-------------------|
| Summarize | Shows first paragraph |
| Simplify | Feature disabled with message |
| Image Describe | Requires vision model |
| TTS Prosody | Uses neutral tone |
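
For example, the Summarize fallback can be as simple as a try/catch around the AI call (a sketch using the illustrative askLocalModel from earlier):

```ts
// Hedged sketch: graceful degradation for Summarize. If no AI backend
// responds, fall back to showing the first paragraph.
async function summarize(text: string): Promise<string> {
  try {
    return await askLocalModel(`Summarize this:\n\n${text}`);
  } catch {
    return text.split(/\n\s*\n/)[0]; // fallback: first paragraph
  }
}
```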

Privacy Guarantees

What We Never Do

  • Collect or store your data
  • Send data to our servers
  • Track your AI usage
  • Share information with third parties

What Stays Local

  • All text you process
  • Documents you summarize
  • Images you analyze
  • Conversation history

GDPR/FERPA/HIPAA Compliance

Because AssisT’s local modes (Local AI and Gemini Nano) process everything on your device:

  • GDPR: No personal data is transmitted
  • FERPA: Student data stays on the device
  • HIPAA: Patient information never leaves the browser

This makes AssisT safe for educational institutions and healthcare settings.

Performance Tips

For Best Local AI Performance

  1. Use an SSD: Faster model loading
  2. 8GB+ RAM/VRAM: Required for larger models
  3. Keep Ollama Running: Faster first response
  4. Choose Appropriate Models: Match model size to your hardware

Why Memory Matters

  • More VRAM = Better Models: With more video memory (or unified memory on Apple Silicon), you can run larger, more capable models
  • More Memory = Longer Context: Additional memory allows longer context windows—the AI can “remember” more of your document
  • Longer Context = Fewer Hallucinations: When AI sees more context, it makes fewer mistakes because it has more information to work with

Memory Types

| Type | What Matters | Notes |
|------|--------------|-------|
| Dedicated GPU | VRAM (8GB good, 12GB+ great) | NVIDIA/AMD graphics cards |
| Apple Silicon | Unified memory (16GB good, 32GB+ excellent) | M1/M2/M3/M4 Macs |
| CPU-only | System RAM (16GB min, 32GB recommended) | Slower but works |

Recommended Setups

| Setup | RAM/VRAM | Storage | Models |
|-------|----------|---------|--------|
| Minimal | 8GB | 4GB free | phi3:mini |
| Standard | 16GB | 8GB free | phi3:mini + llama3.2 |
| Full | 32GB+ | 15GB free | All models + longer context |

Troubleshooting

Ollama Not Detected

  1. Ensure Ollama is installed and running
  2. Check that it’s accessible at localhost:11434
  3. Restart Ollama if needed
  4. Refresh the AssisT extension
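
To verify step 2, a quick reachability check works; a running Ollama instance answers plain HTTP on its default port (a sketch, not AssisT code):

```ts
// Hedged sketch: checking whether Ollama is reachable on its default port.
async function ollamaReachable(): Promise<boolean> {
  try {
    const res = await fetch("http://localhost:11434/");
    return res.ok; // a running instance replies (typically "Ollama is running")
  } catch {
    return false;  // connection refused: Ollama not running
  }
}
```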

Slow Responses

  1. Try a smaller model (phi3:mini is fastest)
  2. Ensure Ollama isn’t processing other requests
  3. Check your system’s available memory
  4. Close other resource-intensive applications

Model Download Failed

  1. Check your internet connection
  2. Ensure enough disk space is available
  3. Try downloading a smaller model first
  4. Restart Ollama and try again