Senior Infrastructure Engineer - Data Platform
Shopify · Salary not specified · Americas · Posted May 29, 2026
Job description
Somewhere in Shopify's data platform, someone's right to be forgotten is waiting to be honored. Somewhere else, a newly created table needs its access controls enforced. The Platform Governance team handles both. We run some of the most compute-intensive batch workloads at the company and build the authorization infrastructure that governs who can access what across our Data Warehouse, Kafka, and a growing list of storage systems. We're looking for a T-shaped individual: deep in large-scale data infrastructure, with the range to bring strong data security practices to every one of those systems.
WHAT YOU'LL BE WORKING ON
You'll own the engineering and operation of systems that keep Shopify's data secure and compliant. That means two things:
Privacy processing at scale — designing and optimizing large-scale batch jobs that handle petabytes of data across massive compute clusters. The problem space is well-defined but the execution challenge is relentless. You'll think carefully about how data is stored, how compute is managed, and how to build pipelines that are correct and efficient enough to run at this scale without eating resources.
Authorization infrastructure — policy engines, row-level security, credential vending, and fine-grained access controls applied consistently across heterogeneous storage systems that each have their own access models. The challenge is designing abstractions that are correct, auditable, and hold up at Shopify's data scale.
You'll move between both depending on where the impact is highest. If you're a data engineer who's been the builder of tools and not just the user, and you want problems that span security, privacy, and raw infrastructure all at once, this is the role.
YOU'LL NEED TO HAVE
- Strong data engineering fundamentals - SQL, Java, and hands-on experience with large-scale compute frameworks (Spark, Flink, or similar)
- Experience designing and optimizing pipelines that process very large datasets
- A real understanding of how data is stored and transformed at scale - not just how to write queries, but how to make them run efficiently
- Comfort working across multiple storage and streaming systems and learning their security surfaces quickly
- An interest in data privacy and responsible data handling: you don't need to know GDPR word for word, but you should care about getting it right
IT'S GREAT IF YOU ALSO HAVE
- Experience building or operating authorization frameworks (RBAC, ABAC, row-level security, credential vending)
- Prior exposure to privacy engineering, data deletion, or compliance-driven data workflows
- Experience with Terraform, policy engines (OPA/Rego, Cedar, or homegrown), or infrastructure-as-code patterns
- Experience managing or optimizing large compute clusters
- Background at a company where "who can access this data" was a hard problem, not a checkbox