Python Engineer - Data Warehouse Technology
Shopify · salary not specified · Americas · company website · posted May 19, 2026
Job description
Shopify's Data Warehouse Technology (DW Tech) team owns the tooling and AI infrastructure layer that keeps the entire Data Warehouse running — enabling hundreds of data engineers across the organization to build reliable, scalable data products.
We're hiring a Senior or Staff Engineer to take deep ownership of developer tooling, AI integration, and operational excellence across the stack. This is a hands-on technical role with broad scope.
WHAT YOU'LL DO
DATA WAREHOUSE DEVELOPER TOOLING
- Build and maintain developer tooling including batch and streaming data utilities, CI/CD pipelines, validation systems, and deployment automation that make data engineers productive.
- Debug complex issues in internal and external frameworks for data ingestion, streaming and batch processing, and orchestration, finding workarounds and contributing upstream fixes.
- Implement governance and validation rules that automatically prevent data quality issues and architectural violations.
- Manage dependencies and updates across the entire Data Warehouse technology stack.
AI TOOLING & AUTOMATION
- Build and extend the Data Warehouse’s AI tool orchestrator to generate PRs, triage issues, and respond to code reviews.
- Author and maintain the AI agent skills, prompt templates, and convention docs that teach AI coding tools our standards, so contributors get to correct, reviewable code faster.
- Build AI-powered development tools, such as cost optimization for data pipelines and data modeling, with automated benchmarking and reconciliation.
- Design structured context systems that make data warehouse knowledge consumable by AI agents.
- Build and maintain Cloud Build infrastructure for running AI agents (container images, GCP Cloud Functions, Terraform, PubSub event-driven architectures).
CROSS-TEAM COLLABORATION
- Support data engineers across all domains with technical issues, architectural guidance, and tooling improvements.
- Partner with leadership on technology strategy, dependency management, and operational excellence.
- Mentor and knowledge-share with team members and the broader data engineering community.
WHAT YOU BRING
ADVANCED PYTHON PROGRAMMING
- Experience building and owning internal libraries and packages with proper package structure, versioning, virtual environments, and dependency management (uv, pip, pyproject.toml).
- Ability to design Python modules from scratch with typed interfaces, abstract base classes, and comprehensive test suites. You're fluent in pytest fixtures, parametrize, monkeypatch, MagicMock, and test factories.
- Experience debugging framework internals at the code level, not just at the configuration or API surface. You use multiple debugging techniques: pdb, stack traces, reading framework source code, and understanding MRO and super() chains.
- Experience with concurrency patterns (ThreadPoolExecutor, async/await) for parallel API calls and polling loops, and with type annotations as contracts: full use of typing, Pydantic models for validation, and frozen dataclasses for configuration.
CORE TECHNICAL SKILLS
- Framework debugging expertise across systems like dbt, Airflow, Meltano, and Google Cloud services, including the ability to assess impact, create workarounds, and contribute upstream fixes.
- SQL fluency: BigQuery SQL for data modeling, optimization, and debugging.
- Systems thinking to understand complex interactions between multiple frameworks and identify root causes across technology boundaries.
AI & LLM ENGINEERING
- Experience building systems that orchestrate LLM calls (prompt engineering, multi-model evaluation, context management, token optimization).
- Comfort with AI agent patterns: tool use, structured output, human-in-the-loop workflows, conversation continuity.
- Judgment about when to use AI versus deterministic approaches, recognizing that programmatic solutions (shell scripts, BigQuery queries) are more reliable than LLM calls for well-defined tasks.
- Experience evaluating and benchmarking AI output quality (automated test suites, regression testing, cost tracking).
INFRASTRUCTURE & CLOUD
- GCP services: Cloud Functions, Cloud Build, BigQuery, PubSub, Secret Manager, Artifact Registry.
- Terraform and infrastructure-as-code; CI/CD systems (Buildkite, GitHub Actions) and deployment automation.
- Container workflows (Docker/Podman, container images for AI agent execution).
ADAPTABILITY & LEARNING
- Ability to quickly pick up unfamiliar programming, templating, and markup languages (Ruby, Jinja compilation, Terraform HCL).
- Problem-solving mindset that balances pragmatic solutions with long-term architectural health.
BONUS
- Experience building developer tools or internal platforms.
- Contributions to open-source data engineering tools.
- Experience with event-driven architectures (PubSub, webhooks, GitHub Apps).
- Background in data engineering or analytics engineering.
- Experience with coding agent orchestration.
WHY YOU'LL LOVE THIS ROLE
- High impact: Your work directly enables hundreds of data engineers and affects data quality for all of Shopify.
- AI frontier: Build production AI systems that automate real engineering work: not demos or chatbots, but tools that ship code.
- Technical depth: Solve complex, multi-system problems across Python frameworks, cloud infrastructure, and LLM orchestration.
- Small team, big scope: 5 engineers (so far) owning the tooling and AI layer for the entire Data Warehouse.
- Learning environment: Work with cutting-edge AI models and data technologies while contributing to open-source projects.
TEAM CULTURE
DW Tech values pragmatic solutions, continuous learning, and technical excellence. We believe in taking ownership of complex problems, sharing knowledge generously, and building systems that scale with Shopify's growth. We're opinionated about code quality, and the tools we build enforce the same standards we hold ourselves to.