A couple of early-to-mid-stage startups I'm consulting with are asking the same question: their AI/ML team wants production Postgres data, and nobody's quite sure how to give it to them.

I've handled this before for BI teams: read replica with a generous `max_standby_streaming_delay` and `hot_standby_feedback` on, accepting the occasional bloat on the primary. Worked fine. But the AI/ML ask feels different in ways I can't fully articulate yet, which is part of why I'm asking.

A few things I'm trying to calibrate:

- Where does the agent actually connect? Primary with RLS, read replica, warehouse (Snowflake/BigQuery/Redshift), lakehouse (Iceberg/Delta on S3), or something else?
- If you're not doing this, is it compliance, cost fear, bad experiences (runaway queries, PII in prompts), or something else?
- And the one I'm most curious about: does this actually feel different from giving BI tools DB access, or is it the same problem wearing new clothes?

Not looking for product recommendations. Trying...