LLM Providers & AI Services
The portfolio standardizes on a unified ai-providers service in every AI-enabled app — a single TypeScript/Python/PHP module that exposes a uniform chat()/embed() interface and routes to whichever provider's SDK is configured. This page covers each provider's role, the SDKs in use, and what models are wired up.
Per Integrations_Audit.md (verified 2026-03-20): every app has all six primary AI provider SDKs installed and a unified ai-providers service created, with unit tests.
Anthropic Claude
What it is. Anthropic's frontier LLM family. Strong on reasoning, tool use, long-context (Sonnet 4.7 advertises 1M tokens), and code.
Models seen in app code:
- claude-sonnet-4-6, claude-sonnet-4-7 — production default in newer apps.
- claude-3-5-sonnet, claude-3-haiku — older apps still on these.
- claude-opus-4-7 — for the heaviest reasoning tasks.
Common roles:
- Primary chat brain: Automotive-Repair-Diagnosis-AI, Ecom-Sales / GogreenSellerAI (with per-feature selection), Tutor_AI, GoGreen-Workflow-Hub.
- LLM-as-reranker: Ecom-Sales / GogreenSellerAI (see AI-Concepts.md).
- Tool-use agent: Tutor_AI (10 tools), AI-Wordpress (8 tools), Recruiting_AI.
- HyDE generator: many apps; Claude Haiku 3.5 is cheap enough to produce hypothetical documents.
SDK: @anthropic-ai/sdk (Node), anthropic (Python, ^0.42.0), wp_remote_post() (WordPress raw HTTP).
Used in: All 35+ AI apps per the matrix.
OpenAI
What it is. GPT models, embeddings, Whisper (STT), Realtime API (low-latency speech), DALL-E 3 (image generation).
Models seen:
- gpt-4o, gpt-4o-mini — chat. gpt-4o-mini is the portfolio's default for "small task" calls (HyDE, classifiers, query rewriting).
- text-embedding-3-small (1536 dim) — default embedding model across 17+ apps.
- text-embedding-3-large (3072 dim) — GoGreen-DOC-AI only.
- text-embedding-ada-002 — legacy, no app currently on it.
- whisper-1 — speech-to-text. Used in FamilyChat, Ecom-Sales, MangyDogCoffee, Boomer_AI, NaggingWifeAI, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Tutor_AI.
- dall-e-3 — image gen. Ecom-Sales is the heavy user.
OpenAI Realtime API
What it is. A WebSocket-based bidirectional audio-in / audio-out streaming API. The model can "speak" while the user is still speaking, with sub-second turn-taking. This cuts latency substantially compared with the older STT → LLM → TTS pipeline, in which each stage waits for the previous one to finish.
Voice-first apps using Realtime: MangyDogCoffee, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Boomer_AI, NaggingWifeAI, Tutor_AI.
OpenAI Embeddings
See Vector-Databases.md and AI-Concepts.md — text-embedding-3-small is the portfolio's standard, -large is the high-recall option.
SDK: openai (Node and Python).
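Because the two standard models produce vectors of different widths, a store sized for one cannot accept the other. As a minimal sketch (the names `EMBEDDING_DIMS` and `assertDimension` are illustrative and not taken from any app), a guard like this could run before a vector is written to a fixed-width column:

```typescript
// Dimensions as quoted on this page; the helper names are hypothetical.
const EMBEDDING_DIMS = {
  "text-embedding-3-small": 1536,
  "text-embedding-3-large": 3072,
} as const;

type EmbeddingModel = keyof typeof EMBEDDING_DIMS;

// Throws if a vector does not have the width expected for its model.
function assertDimension(model: EmbeddingModel, vector: number[]): void {
  const expected = EMBEDDING_DIMS[model];
  if (vector.length !== expected) {
    throw new Error(`${model} expects ${expected} dims, got ${vector.length}`);
  }
}
```

Calling `assertDimension("text-embedding-3-large", vector)` on a 1536-wide vector throws, which surfaces a model mix-up at write time instead of at query time.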
xAI Grok
What it is. xAI's family of LLMs (Grok-2, Grok-3 etc.). Available via OpenAI-compatible API (https://api.x.ai/v1).
Role. Used as a tertiary fallback / alternative voice in multi-provider routing. Some apps use Grok for "personality" content where its less-filtered tone is desired.
Used in: All 35+ AI apps per the matrix (config + SDK installed); active routing varies.
SDK: Same openai Node/Python SDK pointed at the xAI base URL.
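With the openai Node SDK, pointing at xAI amounts to passing a different `baseURL` and `apiKey` to the client constructor. The sketch below shows the same idea without the SDK, as a plain request builder, so the only provider-specific parts are visible. The function name `buildChatRequest`, the types, and the `grok-3` model id in the usage note are assumptions for illustration.

```typescript
// Minimal message shape accepted by OpenAI-compatible chat endpoints.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Only the base URL (and the key) differ between the two providers.
const OPENAI_BASE_URL = "https://api.openai.com/v1";
const XAI_BASE_URL = "https://api.x.ai/v1";

// Builds, but does not send, a chat-completions request for any
// OpenAI-compatible base URL. Pass the result to fetch(url, init).
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  model: string,
  messages: ChatMessage[],
): PreparedRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

For example, `buildChatRequest(XAI_BASE_URL, key, "grok-3", messages)` and `buildChatRequest(OPENAI_BASE_URL, key, "gpt-4o-mini", messages)` differ only in URL, key, and model name, which is why one SDK can serve both.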
Google Gemini
What it is. Google's frontier LLM family. Strong long-context (1M tokens on 1.5 Pro / 2.0). gemini-2.0-flash is the portfolio default — fast and cheap.
SDK: @google/generative-ai (Node), google-generativeai (Python).
Used in: All 35+ apps.
HuggingFace
What it is. Hub of open-source models — embeddings (sentence-transformers), LLMs (Llama, Mistral, Qwen), classifiers, encoders. Inference endpoints + the Inference API.
Role in portfolio. Configured everywhere as a fallback embedding provider and for specialty models that aren't on OpenAI/Anthropic. Most apps don't actively call it during normal request paths.
Tokens. Two HF tokens are used:
- Primary: hf_EYJ... (all apps except SellMe_PRT-Docker)
- Alternate: hf_RVI... (SellMe_PRT-Docker only — documented in API_Keys_and_Secrets_Master.md)
SDK: @huggingface/inference (Node), huggingface_hub (Python).
ElevenLabs (TTS)
What it is. High-quality text-to-speech. Multi-voice library + voice cloning. The portfolio default for synthesized voice output where Realtime API isn't streaming directly.
Used in voice apps: AI-Wordpress (Rachel voice), AscendOne, Boomer_AI, Ecom-Sales, NaggingWifeAI, WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.
SDK: elevenlabs (Node and Python).
Stability AI
What it is. Image generation (Stable Diffusion XL, SD3, etc.). Used as an alternative to OpenAI DALL-E 3.
Used in: Ecom-Sales (product imagery generation).
Picovoice
What it is. On-device wake-word detection ("Hey Jarvis"), voice-activity detection, speech-to-intent. Runs locally — no cloud round-trip — which is why it's the choice for always-listening apps.
Used in: WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.
Vercel AI SDK
What it is. TypeScript framework that abstracts every LLM provider behind a single streamText({model, messages, tools}) interface. Streaming-first, tool-use-native, React Server Components-friendly. Sits one layer above the raw SDKs.
Used in: GoGreen-Workflow-Hub.
LangChain
What it is. Python (and TypeScript) framework for chaining LLM calls, tools, and retrievers. Heavy abstractions, large API surface — useful when you want pre-built RAG/agent patterns out of the box.
Used in:
- Automotive-Repair-Diagnosis-AI — langchain ^0.3.14, RAG framework over pgvector.
- GoGreen-DOC-AI — used as the middle tier of the 3-tier Graph RAG.
- GoGreenMarketing — RAG with HyDE + query rewriting.
Tradeoffs. Quick to start, can be hard to debug. Newer apps prefer the Vercel AI SDK or hand-rolled loops.
LangSmith
What it is. LangChain's observability layer — traces every LLM call, tool invocation, and prompt; supports eval datasets. The closest thing to "Datadog for LLMs."
Used in: GoGreen-SmartForms.
LlamaIndex (not currently used)
What it is. Data-framework competitor to LangChain — leaner, focused on RAG pipelines. Strong on document loaders and ingestion.
Why not used. Where the portfolio needs heavy ingestion (GoGreen-DOC-AI), LangChain + Docling won. LlamaIndex remains a viable alternative if a future app needs more sophisticated index types.
OpenRouter / Ollama / Groq / Mistral / DeepSeek
Mentioned in OpenSentinel. OpenSentinel is the only app that talks to 9 providers via its routing layer:
- OpenAI, Claude, Grok, Gemini: the "big four."
- Groq: Llama / Mixtral inference at ~500 t/s.
- Mistral: direct API for the Mistral models.
- OpenRouter: gateway that fronts hundreds of models from one API.
- Ollama: local inference on the host machine (no cloud).
- A custom provider slot for whatever's next.
For the rest of the portfolio, these are not configured.
The unified ai-providers service pattern
What it is. Most apps ship a file like src/services/ai/ai-providers.ts (or src/lib/ai-providers.ts, or AiProviderService.php) that:
- Reads OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, GEMINI_API_KEY, HUGGINGFACE_TOKEN, ELEVENLABS_API_KEY from env.
- Exposes chat({provider, model, messages, tools}) and embed({provider, text}).
- Falls back to the next provider on rate-limit / error.
- Supports per-feature provider selection (e.g. embeddings always go to OpenAI; chat is configurable per-tenant).
Why it matters. No vendor lock-in. Switching the default chat provider is a one-line config change. Per-tenant routing for cost optimization.
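The fallback behaviour can be sketched roughly as follows. This is not the code from any app: the adapter type, the default order, and the rule that only HTTP 429 and 5xx trigger a fallback are assumptions made for illustration, and it assumes SDK errors carry a numeric `status` field. Only the env var names come from the list above. Tools and `embed()` are omitted for brevity.

```typescript
type Provider = "openai" | "anthropic" | "xai" | "gemini";

interface ChatRequest {
  provider?: Provider;
  model?: string;
  messages: { role: string; content: string }[];
}

// One adapter per provider wraps that provider's SDK behind the same signature.
type ChatAdapter = (req: ChatRequest) => Promise<string>;
type Env = Record<string, string | undefined>;

const KEY_FOR: Record<Provider, string> = {
  openai: "OPENAI_API_KEY",
  anthropic: "ANTHROPIC_API_KEY",
  xai: "XAI_API_KEY",
  gemini: "GEMINI_API_KEY",
};

// Assumed default order; a real app would read this from config.
const DEFAULT_ORDER: Provider[] = ["anthropic", "openai", "xai", "gemini"];

// Preferred provider first, then the defaults, keeping only providers
// whose API key is present (and non-empty) in env.
function providerOrder(env: Env, preferred?: Provider): Provider[] {
  const order = preferred
    ? [preferred, ...DEFAULT_ORDER.filter((p) => p !== preferred)]
    : [...DEFAULT_ORDER];
  return order.filter((p) => Boolean(env[KEY_FOR[p]]));
}

// Treats rate limits (429) and server errors (5xx) as reasons to fall back.
function isRetryable(err: unknown): boolean {
  const status = (err as { status?: unknown } | null)?.status;
  return typeof status === "number" && (status === 429 || status >= 500);
}

async function chat(
  req: ChatRequest,
  adapters: Partial<Record<Provider, ChatAdapter>>,
  env: Env,
): Promise<string> {
  let lastError: unknown = new Error("no provider configured");
  for (const provider of providerOrder(env, req.provider)) {
    const adapter = adapters[provider];
    if (!adapter) continue;
    // A model name only makes sense for the provider it was requested for.
    const attempt: ChatRequest = { messages: req.messages, provider };
    if (provider === req.provider && req.model) attempt.model = req.model;
    try {
      return await adapter(attempt);
    } catch (err) {
      if (!isRetryable(err)) throw err; // e.g. a 400: retrying elsewhere will not help
      lastError = err; // rate-limited or server error: try the next provider
    }
  }
  throw lastError;
}
```

In a Node app the `env` argument would typically be `process.env`. Passing it in explicitly keeps the routing logic testable, and changing `DEFAULT_ORDER` is the "one-line config change" that switches the default chat provider.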
Voice / Speech stack at a glance
| Component | Default in portfolio | Alternatives |
|---|---|---|
| Wake-word | Picovoice (on-device) | none |
| STT | OpenAI Whisper API | Picovoice STT, Web Speech API |
| Realtime conversation | OpenAI Realtime API | none |
| TTS | ElevenLabs | OpenAI tts-1, Google Cloud TTS |
| Speech-to-intent | Picovoice | LLM with tool use |
Image generation stack
| Component | Default | Alternatives |
|---|---|---|
| Text → image | DALL-E 3 (Ecom-Sales primary) | Stability AI (also Ecom-Sales), Midjourney (not used) |
| Image editing | None in portfolio | DALL-E 3 edit, Replicate |
Key reference documents
- API_Keys_and_Secrets_Master.md at the portfolio root is the single source of truth for which keys belong to which app. Always check it before editing any .env.
- Integrations_Audit.md has the full per-app provider matrix (last verified 2026-03-20).