PromptFork

Generate an opinionated Cursor ruleset for a Python data-engineering repo

Produces a complete Cursor rules file for a Python data-engineering codebase. It covers DAG conventions, testing, type hints, the SQL/Python boundary, and env handling, and justifies each rule so the agent respects your pipelines.

Prompt
You are a senior data engineer who configures AI assistants to respect pipeline conventions. Generate a complete, opinionated Cursor rules file for a Python data-engineering repo.

Repo context:
- Orchestrator: [Airflow / Prefect / Dagster]
- dbt: [yes — dbt-core / no]
- Compute: [pandas / Polars / PySpark / DuckDB]
- Warehouse: [Snowflake / BigQuery / Redshift / Postgres]
- Python version: [3.11 / 3.12], dependency manager: [uv / Poetry / pip-tools]
- Layout: [DESCRIBE — e.g. 'dags/, dbt/models/, jobs/, tests/']

Generate a single .cursorrules file specific to THIS repo. Cover at minimum:

1. Architecture map — DAGs vs jobs vs dbt models, what runs where, what triggers what.
2. DAG conventions — idempotency is mandatory, tasks are pure, no top-level side effects, deterministic task IDs, partitioning and backfill expectations.
3. Python rules — full type hints on all public functions, dataclasses or Pydantic for configs, no bare except, no print in jobs (use the orchestrator's logger).
4. SQL / dbt boundaries — transformation logic lives in dbt or SQL where practical; Python only for what SQL cannot do; never inline raw SQL that duplicates a dbt model.
5. Testing — every transformation needs at least a schema test and a unit or sample test; DAGs have a DAG-integrity test; name the test runner.
6. Data safety — never hardcode credentials, no SELECT *, explicit column lists, append-over-destructive-overwrite unless flagged, partition-column discipline.
7. Env and config — settings via env vars and a config layer, never magic strings, the local .env pattern.
8. PR rules — a change to a model or DAG requires updating its tests; note the review checklist.

Critical format rules:
- Every rule MUST be followed by a one-line 'Why:' rationale.
- Be concrete to THIS orchestrator and warehouse. Ban wrong patterns by name (e.g. no mutable global state in a DAG file).
- Keep it under roughly 120 lines. No filler.

Output the full .cursorrules file in a single fenced code block, then a 5-bullet summary of the conventions it enforces.

Success signal: the output is good only if every rule has a rationale, idempotency and data-safety rules are explicit, and the SQL/Python boundary is clearly drawn for this stack.

Use case

Use when you want Cursor to edit Airflow, Prefect, or Dagster DAGs and dbt models without breaking orchestration, partitioning, or test conventions.
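To make the idempotency, typed-config, and explicit-column rules concrete, here is a minimal sketch of the kind of task code the generated ruleset is meant to steer Cursor toward. It is deliberately orchestrator-free and uses the standard library's `sqlite3` as a stand-in warehouse. The names `LoadConfig`, `load_partition`, `ORDERS_TABLE`, `ORDERS_PARTITION_COL`, and the `orders_daily` table are all hypothetical. A partition-scoped replace is only one common way to get idempotency, and whether it counts as a "destructive overwrite" under your own rules is a decision for your repo.

```python
import os
import sqlite3
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LoadConfig:
    """Settings resolved once from the environment instead of magic strings."""

    target_table: str
    partition_column: str

    @classmethod
    def from_env(cls) -> "LoadConfig":
        # ORDERS_TABLE and ORDERS_PARTITION_COL are hypothetical variable names.
        return cls(
            target_table=os.environ.get("ORDERS_TABLE", "orders_daily"),
            partition_column=os.environ.get("ORDERS_PARTITION_COL", "ds"),
        )


def load_partition(
    conn: sqlite3.Connection,
    cfg: LoadConfig,
    ds: date,
    rows: list[tuple[int, float]],
) -> int:
    """Replace exactly one partition so a retry or backfill leaves the same state."""
    partition = ds.isoformat()
    # Identifiers come from trusted config, never from user input;
    # values always go through bound parameters.
    delete_sql = f"DELETE FROM {cfg.target_table} WHERE {cfg.partition_column} = ?"
    insert_sql = (
        f"INSERT INTO {cfg.target_table} "
        f"(order_id, amount, {cfg.partition_column}) VALUES (?, ?, ?)"
    )
    with conn:  # one transaction: delete and insert commit or roll back together
        conn.execute(delete_sql, (partition,))
        conn.executemany(
            insert_sql,
            [(order_id, amount, partition) for order_id, amount in rows],
        )
    return len(rows)
```

Running `load_partition` twice for the same `ds` leaves one copy of the rows, which is the property a backfill or retry depends on. In a real repo the function body would be wrapped by the orchestrator's task decorator and would log through the orchestrator's logger rather than `print`.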

When to use this

For data-engineering repos with DAGs, dbt, and pandas or Spark jobs. Not for pure web-app or ML-research repos.
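Once the prompt has produced a file, two of its format rules (every rule followed by a 'Why:' rationale, and a length of roughly 120 lines) can be checked mechanically before you commit the result. The sketch below assumes a layout the prompt does not itself specify: each rule is a line starting with `- `, and its rationale is the next non-blank line starting with `Why:`. The names `check_rules` and `RulesReport` are hypothetical, and the "roughly 120" guideline is treated as a hard cap only for illustration.

```python
from dataclasses import dataclass

MAX_LINES = 120  # the prompt says "roughly 120 lines"; a hard cap is an assumption


@dataclass(frozen=True)
class RulesReport:
    line_count: int
    rules_missing_why: list[str]

    @property
    def ok(self) -> bool:
        return self.line_count <= MAX_LINES and not self.rules_missing_why


def check_rules(text: str) -> RulesReport:
    """Flag rules that lack a 'Why:' line directly after them."""
    lines = text.splitlines()
    missing: list[str] = []
    for i, line in enumerate(lines):
        stripped = line.strip()
        if not stripped.startswith("- "):
            continue  # headings, rationale lines, and blanks are not rules
        # The rationale must be the next non-blank line after the rule.
        following = next(
            (nxt.strip() for nxt in lines[i + 1:] if nxt.strip()), ""
        )
        if not following.startswith("Why:"):
            missing.append(stripped)
    return RulesReport(line_count=len(lines), rules_missing_why=missing)
```

A check like this fits in a pre-commit hook or a test in `tests/`; if your generated file marks rules differently (numbered items, headings), adjust the `startswith` test to match.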

Follow-up prompts

  • Add a separate ruleset for dbt model conventions (naming, tests, materializations).
  • Generate the matching AGENTS.md so non-Cursor agents share the same rules.
  • Add a rule block covering secrets handling and the local .env pattern for this repo.
#cursor #cursorrules #python #data-engineering #airflow
Source: promptfork seed
License: CC-BY-4.0
Published: 6/22/2026
