
LLM Providers & AI Services

The portfolio standardizes on a unified ai-providers service in every AI-enabled app — a single TypeScript/Python/PHP module that exposes a uniform chat()/embed() interface and routes to whichever provider's SDK is configured. This page covers each provider's role, the SDKs in use, and what models are wired up.

Per Integrations_Audit.md (verified 2026-03-20): every app has all six primary AI provider SDKs installed and a unified ai-providers service created, with unit tests.


Anthropic Claude

What it is. Anthropic's frontier LLM family. Strong on reasoning, tool use, long-context (Sonnet 4.7 advertises 1M tokens), and code.

Models seen in app code:
  • claude-sonnet-4-6, claude-sonnet-4-7 — production default in newer apps.
  • claude-3-5-sonnet, claude-3-haiku — older apps still on these.
  • claude-opus-4-7 — for the heaviest reasoning tasks.

Common roles:
  • Primary chat brain — Automotive-Repair-Diagnosis-AI, Ecom-Sales / GogreenSellerAI (with per-feature selection), Tutor_AI, GoGreen-Workflow-Hub.
  • LLM-as-reranker — Ecom-Sales / GogreenSellerAI (see AI-Concepts.md).
  • Tool-use agent — Tutor_AI (10 tools), AI-Wordpress (8 tools), Recruiting_AI.
  • HyDE generator — many apps; Claude Haiku 3.5 is cheap enough to produce hypothetical documents.
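The HyDE-generator role is a single cheap LLM call: generate a hypothetical answer, then embed that instead of the raw query. A minimal Python sketch — `llm` is a stand-in callable, not the actual service code; the real apps route this through the unified ai-providers service to Claude Haiku:

```python
def hyde_document(query: str, llm) -> str:
    """Generate a hypothetical answer document for `query`.

    The returned text is embedded in place of the raw query, which
    usually lands closer to real answer passages in vector space.
    `llm` is any callable mapping a prompt string to a completion.
    """
    prompt = (
        "Write a short, factual passage that directly answers the "
        "question below. Do not mention that it is hypothetical.\n\n"
        f"Question: {query}"
    )
    return llm(prompt)

# Usage with a stub (a real app would wrap an Anthropic SDK call here):
fake_llm = lambda p: "Brake squeal is usually caused by worn pads or glazing."
doc = hyde_document("Why do my brakes squeal?", fake_llm)
```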

SDK: @anthropic-ai/sdk (Node), anthropic (Python ^0.42.0+), wp_remote_post() (WordPress raw HTTP).

Used in: All 35+ AI apps per the matrix.


OpenAI

What it is. GPT models, embeddings, Whisper (STT), Realtime API (low-latency speech), DALL-E 3 (image generation).

Models seen:
  • gpt-4o, gpt-4o-mini — chat. gpt-4o-mini is the portfolio's default for "small task" calls (HyDE, classifiers, query rewriting).
  • text-embedding-3-small (1536 dim) — default embedding model across 17+ apps.
  • text-embedding-3-large (3072 dim) — GoGreen-DOC-AI only.
  • text-embedding-ada-002 — legacy, no app currently on it.
  • whisper-1 — speech-to-text. Used in FamilyChat, Ecom-Sales, MangyDogCoffee, Boomer_AI, NaggingWifeAI, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Tutor_AI.
  • dall-e-3 — image gen. Ecom-Sales is the heavy user.

OpenAI Realtime API

What it is. A WebSocket-based bidirectional audio-in / audio-out streaming API. The model "speaks" while the user is still speaking; sub-second turn-taking. Massively reduces latency vs. the older STT → LLM → TTS pipeline.

Voice-first apps using Realtime: MangyDogCoffee, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Boomer_AI, NaggingWifeAI, Tutor_AI.

OpenAI Embeddings

See Vector-Databases.md and AI-Concepts.md. text-embedding-3-small is the portfolio's standard; -large is the high-recall option.
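Whichever embedding model is used, retrieval compares vectors the same way. A dependency-free cosine-similarity sketch (in practice the vector database does this; toy 3-dim vectors stand in for the real 1536-dim output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Identical direction scores 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # → 1.0
```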

SDK: openai (Node and Python).


xAI Grok

What it is. xAI's family of LLMs (Grok-2, Grok-3, etc.), available via an OpenAI-compatible API (https://api.x.ai/v1).

Role. Used as a tertiary fallback / alternative voice in multi-provider routing. Some apps use Grok for "personality" content where its less-filtered tone is desired.

Used in: All 35+ AI apps per the matrix (config + SDK installed); active routing varies.

SDK: Same openai Node/Python SDK pointed at the xAI base URL.
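Because the API is OpenAI-compatible, the switch is a constructor-level config change. A sketch of the client setup in the Python SDK — the model name is illustrative, not confirmed from app code, and the request itself is left commented out:

```python
from openai import OpenAI

# Same SDK as OpenAI — only the base URL and key change.
grok = OpenAI(
    base_url="https://api.x.ai/v1",
    api_key="<value of XAI_API_KEY>",  # see API_Keys_and_Secrets_Master.md
)
# resp = grok.chat.completions.create(
#     model="grok-2",  # illustrative model name
#     messages=[{"role": "user", "content": "Hello"}],
# )
```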


Google Gemini

What it is. Google's frontier LLM family. Strong long-context (1M tokens on 1.5 Pro / 2.0). gemini-2.0-flash is the portfolio default — fast and cheap.

SDK: @google/generative-ai (Node), google-generativeai (Python).

Used in: All 35+ apps.


HuggingFace

What it is. Hub of open-source models — embeddings (sentence-transformers), LLMs (Llama, Mistral, Qwen), classifiers, encoders. Inference endpoints + the Inference API.

Role in portfolio. Configured everywhere as a fallback embedding provider and for specialty models that aren't on OpenAI/Anthropic. Most apps don't actively call it during normal request paths.

Tokens. Two HF tokens are used:
  • Primary: hf_EYJ... (all apps except SellMe_PRT-Docker)
  • Alternate: hf_RVI... (SellMe_PRT-Docker only — documented in API_Keys_and_Secrets_Master.md)

SDK: @huggingface/inference (Node), huggingface_hub (Python).


ElevenLabs (TTS)

What it is. High-quality text-to-speech. Multi-voice library + voice cloning. The portfolio default for synthesized voice output where Realtime API isn't streaming directly.

Used in voice apps: AI-Wordpress (Rachel voice), AscendOne, Boomer_AI, Ecom-Sales, NaggingWifeAI, WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.

SDK: elevenlabs (Node and Python).


Stability AI

What it is. Image generation (Stable Diffusion XL, SD3, etc.). Used as an alternative to OpenAI DALL-E 3.

Used in: Ecom-Sales (product imagery generation).


Picovoice

What it is. On-device wake-word detection ("Hey Jarvis"), voice-activity detection, speech-to-intent. Runs locally — no cloud round-trip — which is why it's the choice for always-listening apps.

Used in: WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.


Vercel AI SDK

What it is. TypeScript framework that abstracts every LLM provider behind a single streamText({model, messages, tools}) interface. Streaming-first, tool-use-native, React Server Components-friendly. Sits one layer above the raw SDKs.

Used in: GoGreen-Workflow-Hub.


LangChain

What it is. Python (and TypeScript) framework for chaining LLM calls, tools, and retrievers. Heavy abstractions, large API surface — useful when you want pre-built RAG/agent patterns out of the box.

Used in:
  • Automotive-Repair-Diagnosis-AI — langchain ^0.3.14, RAG framework over pgvector.
  • GoGreen-DOC-AI — used as the middle tier of the 3-tier Graph RAG.
  • GoGreenMarketing — RAG with HyDE + query rewriting.

Tradeoffs. Quick to start, but can be hard to debug. Newer apps prefer the Vercel AI SDK or hand-rolled loops.


LangSmith

What it is. LangChain's observability layer — traces every LLM call, tool invocation, and prompt; supports eval datasets. The closest thing to "Datadog for LLMs."

Used in: GoGreen-SmartForms.


LlamaIndex (not currently used)

What it is. Data-framework competitor to LangChain — leaner, focused on RAG pipelines. Strong on document loaders and ingestion.

Why not used. Where the portfolio needs heavy ingestion (GoGreen-DOC-AI), LangChain + Docling won. LlamaIndex remains a viable alternative if a future app needs more sophisticated index types.


OpenRouter / Ollama / Groq / Mistral / DeepSeek

Mentioned in OpenSentinel. OpenSentinel is the only app that talks to 9 providers via its routing layer:
  • OpenAI, Claude, Grok, Gemini — the "big four."
  • Groq — Llama / Mixtral inference at ~500 t/s.
  • Mistral — direct API for the Mistral models.
  • OpenRouter — gateway that fronts hundreds of models from one API.
  • Ollama — local inference on the host machine (no cloud).
  • A custom provider slot for whatever's next.

For the rest of the portfolio, these are not configured.


The unified ai-providers service pattern

What it is. Most apps ship a file like src/services/ai/ai-providers.ts (or src/lib/ai-providers.ts, or AiProviderService.php) that:

  1. Reads OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, GEMINI_API_KEY, HUGGINGFACE_TOKEN, ELEVENLABS_API_KEY from env.
  2. Exposes chat({provider, model, messages, tools}) and embed({provider, text}).
  3. Falls back to the next provider on rate-limit / error.
  4. Supports per-feature provider selection (e.g. embeddings always go to OpenAI; chat is configurable per-tenant).

Why it matters. No vendor lock-in. Switching the default chat provider is a one-line config change. Per-tenant routing for cost optimization.
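The fallback behavior in step 3 can be sketched without any SDK at all. Here the provider callables are stubs and the names/signature are assumptions for illustration, not the actual service code — the real implementations wrap the Anthropic/OpenAI/xAI SDKs:

```python
class ProviderError(Exception):
    """Raised by a provider wrapper on rate limit or API failure."""

def chat(messages, providers, order=("anthropic", "openai", "grok")):
    """Try each configured provider in order; fall back on failure.

    `providers` maps a provider name to a callable that takes `messages`
    and returns the assistant's reply text. Unconfigured names are skipped.
    Returns (provider_used, reply).
    """
    last_err = None
    for name in order:
        fn = providers.get(name)
        if fn is None:
            continue  # provider not configured for this app
        try:
            return name, fn(messages)
        except ProviderError as err:
            last_err = err  # rate-limited or down — try the next one
    raise RuntimeError(f"all providers failed: {last_err}")

# Usage with stubs: the primary "fails", so the call falls through to OpenAI.
def flaky(_msgs):
    raise ProviderError("429 rate limited")

used, reply = chat(
    [{"role": "user", "content": "hi"}],
    {"anthropic": flaky, "openai": lambda m: "hello from fallback"},
)
# used == "openai", reply == "hello from fallback"
```

Per-feature selection (step 4) is then just choosing a different `order` per call site, e.g. pinning embeddings to a single provider.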


Voice / Speech stack at a glance

Component              Default in portfolio    Alternatives
Wake-word              Picovoice (on-device)   none
STT                    OpenAI Whisper API      Picovoice STT, Web Audio API
Realtime conversation  OpenAI Realtime API     none
TTS                    ElevenLabs              OpenAI tts-1, Google Cloud TTS
Speech-to-intent       Picovoice               LLM with tool use

Image generation stack

Component       Default                        Alternatives
Text → image    DALL-E 3 (Ecom-Sales primary)  Stability AI (also Ecom-Sales), Midjourney (not used)
Image editing   None in portfolio              DALL-E 3 edit, Replicate

Key reference documents

  • API_Keys_and_Secrets_Master.md at the portfolio root is the single source of truth for which keys belong to which app. Always check it before editing any .env.
  • Integrations_Audit.md has the full per-app provider matrix (last verified 2026-03-20).