Databases¶
This page covers persistent stores: relational, document, graph, time-series, and search engines. For embeddings see Vector-Databases.md. For ephemeral / queue-style stores see Caching-Queues-Realtime.md. For object/blob storage see Object-Storage.md.
Portfolio map¶
| Engine | Apps | Role |
|---|---|---|
| PostgreSQL | 24+ | Default relational DB. 16 of those use the pgvector extension. |
| MySQL / MariaDB | 4 | Legacy / WordPress / PHP apps. |
| SQLite | 4 | Embedded — Sell-Me family for single-binary deployment. |
| Neo4j | 1–2 | Graph: OpenSentinel (OSINT), Automotive (knowledge graph). |
| InfluxDB | 1 | Time-series: Automotive-Repair-Diagnosis-AI sensor data. |
| ClickHouse | 1 | Columnar OLAP: GoGreen-Workflow-Hub. |
| MongoDB | 0 | Not used. |
| Elasticsearch / OpenSearch | 0 | Not used (BM25 is done in pgvector or app code). |
PostgreSQL¶
What it is. Open-source relational database. ACID, strong SQL standard, JSONB, full-text search (tsvector), array types, partial indexes, materialized views, logical replication.
Why we use it. It does too many things well — transactional CRUD, JSON storage, full-text search, vectors (via pgvector), simple analytics. One database = one operational story.
Versions seen. 15-alpine and 16 (pgvector image: pgvector/pgvector:pg15 and pg16, sometimes pg17).
Used in: Boomer_AI, Ecom-Sales, EverythingBeer, FamilyChat, GoGreen-AI-Concierge, GoGreen-DOC-AI, GoGreen-SmartForms, GoGreen-Workflow-Hub, GoGreenMarketing, GoGreenPaperlessInitiative, GoGreenSourcingAI, GogreenSellerAI, MangyDogCoffee, MyPollingApp, NaggingWifeAI, OpenSentinel, PolyMarketAI, Recruiting_AI, Sales_AI_App, Salon-Digital-Assistant, SCO-Digital-Assistant, SellMeAPen_CLCD-1, TimeSheetAI, Tutor_AI, Automotive-Repair-Diagnosis-AI, ChoreAndMoreTracker, akt-giving-garden.
Notable patterns:
- pgvector extension (16 apps) — see Vector-Databases.md.
- Row-Level Security (RLS) — ChoreAndMoreTracker uses RLS for multi-tenant isolation in Postgres itself.
- PgBouncer — connection pooling for serverless (TimeSheetAI uses it for NextAuth + Prisma).
- asyncpg — Python async driver used in Automotive, GoGreen-DOC-AI, GoGreen-SmartForms.
- pg + Prisma / Drizzle — TypeScript apps.
MySQL / MariaDB¶
What it is. The other major open-source relational DB. Looser type system than Postgres but ubiquitous in the PHP/WordPress world.
Used in:
- AI-Wordpress_Business-Directory — MySQL 8.0, custom ggd_* tables alongside WordPress core.
- Maximus, PRT — MySQL 8.0, three Laravel apps sharing one schema.
- Realestate-all-docker — MySQL 8.0, Laravel ORM + JSON-column-as-vector RAG.
- Voting_NewAndImproved — MariaDB 10.4, embeddings stored in a LONGTEXT/JSON column.
- WP-Plugin — WordPress's MySQL with custom indexed keyword tables for BM25 RAG.
MariaDB note. Voting_NewAndImproved is the only MariaDB instance — fully MySQL-compatible at the app level.
phpMyAdmin is run as a Docker service (port 8480, 8801, etc.) for these apps to give the team a web SQL UI.
SQLite¶
What it is. Single-file embedded database. No server, no port, no auth — just a file. ACID via WAL mode.
Why we use it. When the deployment unit is a single binary or a small Docker image and you don't need concurrency. Vector embeddings stored as JSON in a BLOB column work fine for thousands of vectors.
Used in: - SCO-Digital-Assistant — SQLite + JSON embeddings + JS cosine similarity. - SellMeACar-Docker — same pattern. - SellMe_PRT-Docker — same pattern. - PolyMarketAI — SQLite + scikit-learn TF-IDF.
Tradeoffs. No multi-writer concurrency (one process at a time). File-based backups are trivial; multi-machine replication is not.
Neo4j¶
What it is. Native graph database. Cypher query language. Strong on multi-hop traversal — "friends of friends," "documents that reference documents that reference X," "ownership chains."
Used in:
- OpenSentinel — neo4j:5-community. OSINT knowledge graph: people ↔ accounts ↔ infrastructure ↔ events. Optional / profile-gated; only loaded when graph queries are needed.
- Automotive-Repair-Diagnosis-AI — neo4j ^5.27.0. Vehicle / part / fault knowledge graph for the diagnostic engine.
Tradeoffs. Excellent for graph queries, awkward for everything else. Don't replace Postgres with it — sit it next to Postgres.
InfluxDB¶
What it is. Time-series database. Built for high-cardinality, time-stamped data — metrics, sensor readings, telemetry.
Used in: Automotive-Repair-Diagnosis-AI — vehicle sensor data over time (RPM, fuel trim, O2 readings, OBD-II PIDs).
ClickHouse¶
What it is. Columnar OLAP database. Petabyte-scale aggregations in seconds. Not a transactional DB.
Used in: GoGreen-Workflow-Hub — workflow execution event analytics.
MongoDB (not used)¶
What it is. Document database. Schema-less JSON-like documents, flexible querying.
Why not used. Postgres JSONB covers the document use cases without giving up SQL or transactions. The portfolio explicitly chose Postgres-monolith over polyglot persistence.
Elasticsearch / OpenSearch (not used)¶
What it is. Search-first engines. Lucene-based, BM25 by default, dense_vector field for ANN, aggregations, faceting.
Why not used. When BM25 is needed (WP-Plugin, OpenSentinel, GoGreen-DOC-AI), it's implemented in Postgres tsvector or in app code. The portfolio doesn't have a corpus large enough to need a dedicated search cluster.
Meilisearch / Typesense / Algolia (not used)¶
Out-of-the-box hosted search-as-API products. The portfolio's needs have been met by pgvector + BM25 in Postgres.
DuckDB (not used)¶
What it is. SQLite-style embedded analytical database. Reads Parquet / CSV / Arrow directly. Great for ad-hoc analytics in a notebook.
Why not used. The portfolio doesn't run notebook-style analytics; the same ad-hoc queries happen against Postgres directly.
Migrations¶
| Tool | Apps | Notes |
|---|---|---|
| Prisma Migrate | 17+ | Default for TS apps. |
| Drizzle Kit | 3 | OpenSentinel, GoGreen-AI-Concierge, GoGreen-Workflow-Hub. |
| Alembic | 3 | Python — Automotive, SmartForms, DOC-AI. |
| Laravel Migrations | 3 | Maximus, PRT, Voting. |
| sqlx | 1 | ChoreAndMoreTracker (Rust). |
Backup posture¶
- Postgres / MySQL —
pg_dump/mysqldumpto MinIO or to local disk on the IONOS VPS. - SQLite — file copy is the backup. WAL means stop-the-world isn't required.
- MinIO — versioning enabled in production deployments.
- Documented in Integrations_Audit.md and per-app CLAUDE.md — full DR is per-app, not centralized.
Decision guide¶
Greenfield app, not WordPress? Postgres (+ pgvector if AI).
Already on WordPress / Laravel legacy? MySQL or MariaDB.
Need multi-hop graph queries? Neo4j alongside Postgres.
Time-series / sensor metrics? InfluxDB.
OLAP analytics over events? ClickHouse.
Single-binary deployment? SQLite.
"Should we use MongoDB?" No — Postgres JSONB handles it.
"Should we use Elasticsearch?" No — pgvector + tsvector + BM25-in-app suffice at portfolio scale.