LLM Providers & AI Services¶

The portfolio standardizes on a unified ai-providers service in every AI-enabled app — a single TypeScript/Python/PHP module that exposes a uniform chat()/embed() interface and routes to whichever provider's SDK is configured. This page covers each provider's role, the SDKs in use, and what models are wired up.

Per Integrations_Audit.md (verified 2026-03-20): every app has all six primary AI provider SDKs installed and a unified ai-providers service in place, with unit tests.

Anthropic Claude¶

What it is. Anthropic's frontier LLM family. Strong on reasoning, tool use, long-context (Sonnet 4.7 advertises 1M tokens), and code.

Models seen in app code:

- claude-sonnet-4-6, claude-sonnet-4-7: production default in newer apps.
- claude-3-5-sonnet, claude-3-haiku: older apps still on these.
- claude-opus-4-7: for the heaviest reasoning tasks.

Common roles:

- Primary chat brain: Automotive-Repair-Diagnosis-AI, Ecom-Sales / GogreenSellerAI (with per-feature selection), Tutor_AI, GoGreen-Workflow-Hub.
- LLM-as-reranker: Ecom-Sales / GogreenSellerAI (see AI-Concepts.md).
- Tool-use agent: Tutor_AI (10 tools), AI-Wordpress (8 tools), Recruiting_AI.
- HyDE generator: many apps; Claude Haiku 3.5 is cheap enough to produce hypothetical documents.
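The HyDE role listed above can be sketched in a few lines: instead of embedding the user's query directly, a cheap model first writes a hypothetical answer, and that passage is embedded and used for retrieval. This is a minimal sketch, not code from any app. The `generate`, `embed`, and `search` callables and the prompt wording are assumptions standing in for the LLM call, the embedding call, and the vector-store lookup.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def hyde_search(
    query: str,
    generate: Callable[[str], str],            # hypothetical: prompt -> LLM text
    embed: Callable[[str], Vector],            # hypothetical: text -> embedding
    search: Callable[[Vector, int], List[str]],  # hypothetical: vector, k -> doc ids
    top_k: int = 5,
) -> List[str]:
    """Retrieve documents by embedding a hypothetical answer, not the raw query."""
    # Assumed prompt wording; real prompts are app-specific.
    prompt = f"Write a short passage that answers the question.\n\nQuestion: {query}"
    hypothetical_doc = generate(prompt)
    # The hypothetical passage, not the query, is what gets embedded.
    query_vector = embed(hypothetical_doc)
    return search(query_vector, top_k)
```

Because the three stages are injected, the same function works with any chat and embedding provider the app has configured.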

SDK: @anthropic-ai/sdk (Node), anthropic (Python ^0.42.0+), wp_remote_post() (WordPress raw HTTP).

Used in: All 35+ AI apps per the matrix.

OpenAI¶

What it is. GPT models, embeddings, Whisper (STT), Realtime API (low-latency speech), DALL-E 3 (image generation).

Models seen:

- gpt-4o, gpt-4o-mini: chat. gpt-4o-mini is the portfolio's default for "small task" calls (HyDE, classifiers, query rewriting).
- text-embedding-3-small (1536 dim): default embedding model across 17+ apps.
- text-embedding-3-large (3072 dim): GoGreen-DOC-AI only.
- text-embedding-ada-002: legacy, no app currently on it.
- whisper-1: speech-to-text. Used in FamilyChat, Ecom-Sales, MangyDogCoffee, Boomer_AI, NaggingWifeAI, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Tutor_AI.
- dall-e-3: image gen. Ecom-Sales is the heavy user.

OpenAI Realtime API¶

What it is. A WebSocket-based, bidirectional audio-in / audio-out streaming API. The model can "speak" while the user is still speaking, which gives sub-second turn-taking. It cuts latency substantially compared with the older STT → LLM → TTS pipeline, where each stage waits for the previous one to finish.
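To see where the latency of the older pipeline comes from, it helps to write the cascade out. The sketch below runs one turn through three stages and records the time spent in each; the total is their sum, because no stage can start until the previous one returns. The `stt`, `llm`, and `tts` callables are hypothetical placeholders, and real implementations may stream partial results, which this sketch ignores.

```python
import time
from typing import Callable, Dict, Tuple


def cascaded_turn(
    audio: bytes,
    stt: Callable[[bytes], str],   # hypothetical speech-to-text stage
    llm: Callable[[str], str],     # hypothetical chat stage
    tts: Callable[[str], bytes],   # hypothetical text-to-speech stage
) -> Tuple[bytes, Dict[str, float]]:
    """Run one STT -> LLM -> TTS turn and record seconds spent per stage."""
    timings: Dict[str, float] = {}

    start = time.perf_counter()
    transcript = stt(audio)
    timings["stt"] = time.perf_counter() - start

    start = time.perf_counter()
    reply_text = llm(transcript)
    timings["llm"] = time.perf_counter() - start

    start = time.perf_counter()
    reply_audio = tts(reply_text)
    timings["tts"] = time.perf_counter() - start

    # Stages run strictly one after another, so the user waits for the sum.
    timings["total"] = timings["stt"] + timings["llm"] + timings["tts"]
    return reply_audio, timings
```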

Voice-first apps using Realtime: MangyDogCoffee, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeACar, SellMe_PRT, Boomer_AI, NaggingWifeAI, Tutor_AI.

OpenAI Embeddings¶

See Vector-Databases.md and AI-Concepts.md. text-embedding-3-small is the portfolio's standard; text-embedding-3-large is the high-recall option.
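Since the two models emit vectors of different lengths (1536 vs. 3072), a vector index built for one cannot accept output from the other. A small guard before any insert or query catches a mismatch early, for example when a fallback provider answers in place of the default. The sketch below is illustrative only; `check_embedding` is a hypothetical helper, not a function from the portfolio code.

```python
from typing import Sequence

# Vector sizes for the two OpenAI embedding models named in this section.
EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
}


def check_embedding(model: str, vector: Sequence[float]) -> Sequence[float]:
    """Return the vector unchanged, or raise ValueError on a size mismatch."""
    expected = EMBEDDING_DIMS.get(model)
    if expected is None:
        raise ValueError(f"unknown embedding model: {model}")
    if len(vector) != expected:
        raise ValueError(
            f"{model} should produce {expected} dims, got {len(vector)}"
        )
    return vector
```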

SDK: openai (Node and Python).

xAI Grok¶

What it is. xAI's family of LLMs (Grok-2, Grok-3, etc.). Available via an OpenAI-compatible API (https://api.x.ai/v1).

Role. Used as a tertiary fallback / alternative voice in multi-provider routing. Some apps use Grok for "personality" content where its less-filtered tone is desired.

Used in: All 35+ AI apps per the matrix (config + SDK installed); active routing varies.

SDK: Same openai Node/Python SDK pointed at the xAI base URL.
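Reusing the OpenAI SDK for Grok comes down to passing a different base URL and API key to the client constructor. The following is a minimal sketch under assumptions: the provider ids (`"openai"`, `"xai"`) and the helper names are hypothetical, and the SDK import is deferred so the module loads even where `openai` is not installed.

```python
import os
from typing import Dict, Optional

# None means "use the SDK's default endpoint" (OpenAI itself).
OPENAI_COMPATIBLE_BASE_URLS: Dict[str, Optional[str]] = {
    "openai": None,
    "xai": "https://api.x.ai/v1",
}
API_KEY_ENV_VARS = {"openai": "OPENAI_API_KEY", "xai": "XAI_API_KEY"}


def client_kwargs(provider: str) -> Dict[str, Optional[str]]:
    """Build constructor arguments for openai.OpenAI for the given provider."""
    kwargs: Dict[str, Optional[str]] = {
        "api_key": os.environ.get(API_KEY_ENV_VARS[provider])
    }
    base_url = OPENAI_COMPATIBLE_BASE_URLS[provider]
    if base_url is not None:
        kwargs["base_url"] = base_url
    return kwargs


def make_client(provider: str):
    """Create the SDK client; requires the `openai` package (v1 or later)."""
    from openai import OpenAI  # imported lazily on purpose

    return OpenAI(**client_kwargs(provider))
```

With this shape, `make_client("xai")` and `make_client("openai")` return the same client type, so the calling code does not need a Grok-specific branch.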

Google Gemini¶

What it is. Google's frontier LLM family. Strong long-context (1M tokens on 1.5 Pro / 2.0). gemini-2.0-flash is the portfolio default — fast and cheap.

SDK: @google/generative-ai (Node), google-generativeai (Python).

Used in: All 35+ apps.

HuggingFace¶

What it is. Hub of open-source models — embeddings (sentence-transformers), LLMs (Llama, Mistral, Qwen), classifiers, encoders. Inference endpoints + the Inference API.

Role in portfolio. Configured everywhere as a fallback embedding provider and for specialty models that aren't on OpenAI/Anthropic. Most apps don't actively call it during normal request paths.

Tokens. Two HF tokens are used:

- Primary: hf_EYJ... (all apps except SellMe_PRT-Docker)
- Alternate: hf_RVI... (SellMe_PRT-Docker only; documented in API_Keys_and_Secrets_Master.md)

SDK: @huggingface/inference (Node), huggingface_hub (Python).

ElevenLabs (TTS)¶

What it is. High-quality text-to-speech. Multi-voice library + voice cloning. The portfolio default for synthesized voice output where Realtime API isn't streaming directly.

Used in voice apps: AI-Wordpress (Rachel voice), AscendOne, Boomer_AI, Ecom-Sales, NaggingWifeAI, WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.

SDK: elevenlabs (Node and Python).

Stability AI¶

What it is. Image generation (Stable Diffusion XL, SD3, etc.). Used as an alternative to OpenAI DALL-E 3.

Used in: Ecom-Sales (product imagery generation).

Picovoice¶

What it is. On-device wake-word detection ("Hey Jarvis"), voice-activity detection, speech-to-intent. Runs locally — no cloud round-trip — which is why it's the choice for always-listening apps.
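An always-listening app typically uses the wake-word detector as a local gate: audio frames are discarded on-device until the keyword fires, and only the audio after that point is forwarded to a cloud service. The sketch below assumes a detector whose `process()` returns a keyword index of 0 or higher on a match and -1 otherwise, which is my understanding of how Picovoice's Porcupine Python binding behaves. The `WakeWordDetector` protocol and function name are hypothetical.

```python
from typing import Iterable, Iterator, Protocol, Sequence


class WakeWordDetector(Protocol):
    """Assumed interface: -1 means no keyword, >= 0 is the keyword index."""

    def process(self, pcm: Sequence[int]) -> int:
        ...


def frames_after_wake_word(
    detector: WakeWordDetector, frames: Iterable[Sequence[int]]
) -> Iterator[Sequence[int]]:
    """Drop audio frames until the wake word fires, then yield the rest."""
    awake = False
    for frame in frames:
        if awake:
            # Only post-wake audio ever leaves this function.
            yield frame
        elif detector.process(frame) >= 0:
            awake = True
```

The frame that triggers the detector is consumed rather than forwarded, so downstream STT sees only what was said after the wake word.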

Used in: WP-Plugin, Jarvis_Electron, Jarvis_Rust_Tauri.

Vercel AI SDK¶

What it is. TypeScript framework that abstracts every LLM provider behind a single streamText({model, messages, tools}) interface. Streaming-first, tool-use-native, React Server Components-friendly. Sits one layer above the raw SDKs.

Used in: GoGreen-Workflow-Hub.

LangChain¶

What it is. Python (and TypeScript) framework for chaining LLM calls, tools, and retrievers. Heavy abstractions, large API surface — useful when you want pre-built RAG/agent patterns out of the box.

Used in:

- Automotive-Repair-Diagnosis-AI: langchain ^0.3.14, RAG framework over pgvector.
- GoGreen-DOC-AI: used as the middle tier of the 3-tier Graph RAG.
- GoGreenMarketing: RAG with HyDE + query rewriting.

Tradeoffs. Quick to start, can be hard to debug. Newer apps prefer the Vercel AI SDK or hand-rolled loops.

LangSmith¶

What it is. LangChain's observability layer — traces every LLM call, tool invocation, and prompt; supports eval datasets. The closest thing to "Datadog for LLMs."

Used in: GoGreen-SmartForms.

LlamaIndex (not currently used)¶

What it is. Data-framework competitor to LangChain — leaner, focused on RAG pipelines. Strong on document loaders and ingestion.

Why not used. Where the portfolio needs heavy ingestion (GoGreen-DOC-AI), LangChain + Docling won. LlamaIndex remains a viable alternative if a future app needs more sophisticated index types.

OpenRouter / Ollama / Groq / Mistral / DeepSeek¶

These appear only in OpenSentinel, the one app whose routing layer talks to 9 providers:

- OpenAI, Claude, Grok, Gemini: the "big four."
- Groq: Llama / Mixtral inference at ~500 tokens/s.
- Mistral: direct API for the Mistral models.
- OpenRouter: gateway that fronts hundreds of models from one API.
- Ollama: local inference on the host machine (no cloud).
- A custom provider slot for whatever's next.

For the rest of the portfolio, these are not configured.

The unified ai-providers service pattern¶

What it is. Most apps ship a file like src/services/ai/ai-providers.ts (or src/lib/ai-providers.ts, or AiProviderService.php) that:

- Reads OPENAI_API_KEY, ANTHROPIC_API_KEY, XAI_API_KEY, GEMINI_API_KEY, HUGGINGFACE_TOKEN, ELEVENLABS_API_KEY from env.
- Exposes chat({provider, model, messages, tools}) and embed({provider, text}).
- Falls back to the next provider on rate-limit / error.
- Supports per-feature provider selection (e.g. embeddings always go to OpenAI; chat is configurable per-tenant).
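The env-reading and fallback behaviour described above can be sketched as follows. This is an illustration under assumptions, not the contents of any app's ai-providers file: the provider ids, the `ChatFn` adapter signature, and the `AllProvidersFailed` error are hypothetical, and only the four chat-capable keys are included. A real service would likely fall back only on rate-limit or transient errors instead of catching every exception.

```python
import os
from typing import Callable, Dict, List, Mapping, Optional, Sequence

Message = Dict[str, str]
ChatFn = Callable[[str, List[Message]], str]  # (model, messages) -> reply text

# Env var names from this page, keyed by hypothetical provider ids.
KEY_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "xai": "XAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
}


class AllProvidersFailed(RuntimeError):
    """Raised when every provider in the fallback order has failed."""


def configured_providers(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """List providers whose API key is present and non-empty."""
    env = os.environ if env is None else env
    return [p for p, var in KEY_ENV_VARS.items() if env.get(var)]


def chat(
    messages: List[Message],
    models: Dict[str, str],
    adapters: Dict[str, ChatFn],
    order: Sequence[str],
) -> str:
    """Try each provider in order; return the first successful reply."""
    errors: List[str] = []
    for provider in order:
        adapter = adapters.get(provider)
        if adapter is None:
            continue  # provider not wired up in this app
        try:
            return adapter(models[provider], messages)
        except Exception as exc:  # simplification: see note in the lead-in
            errors.append(f"{provider}: {exc}")
    raise AllProvidersFailed("; ".join(errors) or "no provider configured")
```

Each adapter wraps one vendor SDK call behind the same signature, which is what lets the loop treat providers interchangeably.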

Why it matters. No vendor lock-in. Switching the default chat provider is a one-line config change. Per-tenant routing for cost optimization.
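One way to picture the "one-line config change" is a routing table that maps each feature to a provider, with tenant overrides layered on top. The sketch below is hypothetical throughout: the default of `"anthropic"` for chat, the `PINNED_FEATURES` set, and the function name are assumptions chosen to mirror the rule that embeddings always go to OpenAI while chat is configurable per tenant.

```python
from typing import Dict, Optional

# Hypothetical defaults; changing the chat provider is a one-line edit here.
DEFAULT_ROUTES: Dict[str, str] = {
    "chat": "anthropic",
    "embeddings": "openai",
}

# Features a tenant is not allowed to override.
PINNED_FEATURES = {"embeddings"}


def resolve_provider(
    feature: str, tenant_overrides: Optional[Dict[str, str]] = None
) -> str:
    """Pick the provider for a feature, honouring tenant overrides if allowed."""
    if feature not in DEFAULT_ROUTES:
        raise KeyError(f"unknown feature: {feature}")
    if feature in PINNED_FEATURES:
        return DEFAULT_ROUTES[feature]
    return (tenant_overrides or {}).get(feature, DEFAULT_ROUTES[feature])
```

Pinning embeddings keeps every tenant's vectors in the same embedding space, while chat can be routed to a cheaper provider per tenant.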

Voice / Speech stack at a glance¶

| Component | Default in portfolio | Alternatives |
| --- | --- | --- |
| Wake-word | Picovoice (on-device) | none |
| STT | OpenAI Whisper API | Picovoice STT, Web Audio API |
| Realtime conversation | OpenAI Realtime API | none |
| TTS | ElevenLabs | OpenAI tts-1, Google Cloud TTS |
| Speech-to-intent | Picovoice | LLM with tool use |

Image generation stack¶

| Component | Default | Alternatives |
| --- | --- | --- |
| Text → image | DALL-E 3 (Ecom-Sales primary) | Stability AI (also Ecom-Sales), Midjourney (not used) |
| Image editing | None in portfolio | DALL-E 3 edit, Replicate |

Key reference documents¶

- API_Keys_and_Secrets_Master.md at the portfolio root is the single source of truth for which keys belong to which app. Always check it before editing any .env.
- Integrations_Audit.md has the full per-app provider matrix (last verified 2026-03-20).