Sr. Software Engineer, Site Reliability
Poshmark - Redwood City, CA
Job Description
About Poshmark

Poshmark is the leading fashion marketplace where style comes alive through discovery, self-expression, and human connection. Powered by a vibrant community of 165 million members, Poshmark brings real people and taste to shopping through a social experience shaped by shared discovery. Buying and selling fashion feels simple, joyful, and personal, while every item tells its own story. Poshmark empowers sellers to grow meaningful businesses, keeps fashion in circulation longer, and gives shoppers access to unique and trusted finds, from everyday pieces to one-of-a-kind vintage and luxury.

Job Title: Sr. Software Engineer SRE/DevOps

Poshmark is seeking a highly skilled and collaborative Staff Software Engineer to join our Tools & Infra DevOps (SRE) team. In this role, you will help shape and drive the platform that powers developer velocity, CI/CD workflows, and non-production infrastructure reliability at scale. You will lead key initiatives improving automation, observability, and deployment efficiency - helping engineering teams deliver high-quality software with confidence. This is a high-impact role where you will influence architecture, mentor engineering teams, and define technical standards that elevate our operational excellence across Poshmark.

Responsibilities:
Lead design and development of scalable CI/CD systems enabling efficient and reliable delivery for multiple engineering teams
Architect, automate, and optimize non-production infrastructure to improve developer productivity, reliability, and environment consistency
Implement enhanced observability and monitoring solutions to reduce MTTR and proactively improve system health
Drive DevEx improvements including remote dev environments, build optimizations, and automation workflows
Champion infrastructure-as-code standards (Terraform, Ansible, etc.) for repeatable and secure deployments
Collaborate with cross-functional stakeholders (Security, QA, Data Engineering, Platform teams) to align reliability and operational goals
Lead key explorations of emerging tools and technologies to evolve the platform (e.g., GitHub Actions, ArgoCD, service mesh, logging modernization)
Mentor engineers on DevOps best practices, fostering strong engineering culture, quality, and ownership
Serve as an escalation point during production-like issues in non-prod environments to ensure rapid resolution
Influence roadmap prioritization with data-driven decision making

6-Month Accomplishments:
Deliver meaningful improvements in CI stability and runtime performance
Increase developer feedback velocity by reducing pipeline execution times
Strengthen observability capabilities across non-production environments
Provide architecture guidance and secure automation for key platform enhancements

12+ Month Accomplishments:
Mature the developer experience platform with measurable improvements in engineering throughput
Lead modernization of CI/CD tech stack with self-service capabilities
Drive multi-environment reliability initiatives leading to reduced incidents
Establish best practices and reusable frameworks adopted across teams

Requirements:
5-8+ years of experience in DevOps, SRE, or Infrastructure Engineering roles
7+ years engineering experience with majority focused on DevOps, SRE, or Platform Engineering
Strong Kubernetes expertise (workload orchestration and cluster-level debugging)
Proven experience architecting CI/CD systems at scale (Jenkins/GitHub Actions/Spinnaker)
Hands-on experience with automation tools: Terraform, Helm, Ansible, etc.
Solid understanding of distributed systems, monitoring, and operational excellence
Proficient in Python, Bash, Golang, or Ruby for automation
Strong leadership communication skills with ability to mentor and influence
Track record of driving impactful technical initiatives end-to-end
Created: 2026-03-10