Azure Data Platform Manager

A.C.Coy Company - Pittsburgh, PA

Job Description

LOCAL APPLICANTS ONLY. No 3rd Parties/Sub Vendors.
Location: Hybrid, Pittsburgh, PA
Job Type: Full Time / Permanent
Work Authorization: U.S. Citizens or Green Card Holders Only

OVERVIEW:

The A.C.Coy Company has an immediate opening for a Data Platform Manager. This role is responsible for designing, building, and optimizing enterprise-wide data platforms within the Data Warehouse.

RESPONSIBILITIES:

  • Lead and mentor a team of data engineers, conducting code reviews and ensuring development standards
  • Support troubleshooting and incident management for data-related issues in production
  • Collaborate with business stakeholders, data scientists, and other team members to gather requirements and translate them into technical specifications
  • Lead the design, development, and deployment of scalable, high-performance data pipelines using Azure Databricks, ensuring data integrity, availability, and the efficient extraction, transformation, and loading of data from various sources into the Azure Databricks Data Warehouse
  • Collaborate with data scientists, analysts, and other engineering teams to deliver business-critical insights
  • Optimize pipeline performance, cost, and scalability in the Azure cloud environment
  • Define best practices for data ingestion, processing, storage, and governance
  • Implement data quality checks and validation procedures to ensure the accuracy and integrity of data across various sources, including APIs, databases, and streaming platforms
  • Collaborate with data scientists and analysts to operationalize and deploy machine learning models

Architecture Design:
  • Define the end-to-end Lakehouse architecture using Delta Lake, implementing the medallion architecture (Bronze, Silver, Gold layers) for robust data processing (illustrated in the sketch following the requirements below)
  • Familiarity with data modeling and schema design principles

Pipeline Engineering:
  • Oversee the development of robust, scalable batch and streaming ETL/ELT pipelines with minimal latency using PySpark, Scala, and SQL
  • Implement data transformations, enrichment, and quality checks using PySpark/Scala within the Databricks environment
  • Integrate real-time and batch data sources using Apache Kafka and ADF (see the streaming sketch below)
  • Support large-scale data pipelines using Apache Spark on Databricks, Kafka, Stelo, and Azure Data Factory (ADF)

Data Governance & Security:
  • Implement Unity Catalog for unified governance, data security, fine-grained access control (RBAC), privacy measures, and data lineage tracking

Performance Optimization & Tuning:
  • Tune Spark jobs and Databricks clusters to maximize throughput while maintaining cost efficiency through auto-scaling and cluster policies
  • Expertise in indexing strategies, query optimization, execution plans, and partitioning/sharding

Platform Integration:
  • Orchestrate workflows by integrating Databricks with other Azure services such as Azure Data Factory (ADF), Azure Data Lake Storage (ADLS Gen2), and Azure DevOps for CI/CD pipelines

EDUCATION:

  • Bachelor's degree in Computer Science, Engineering, or a related field

REQUIRED EXPERIENCE:

  • 5-7+ years of hands-on data engineering or architecture experience, with at least 2-4 years specifically focused on Azure Databricks and Azure cloud technologies
  • 2-5 years of experience managing a team of data engineers, data scientists, and/or analysts is preferred
  • Certifications (preferred): Microsoft Certified: Azure Data Engineer Associate (DP-203), Databricks Certified Data Engineer Professional, or Azure Solutions Architect Expert
  • Database Architecture: Proficiency in both relational (SQL) and NoSQL (Document, Key-Value, Graph, Columnar) databases; develop and maintain data models and schemas to support data analysis and reporting requirements
  • Distributed Systems: Knowledge of frameworks such as Apache Hadoop, Spark, or Presto/Trino for optimizing and handling massive data volumes and retrieval mechanisms, ensuring the efficient processing of large datasets
  • Storage Optimization: Understanding of file formats such as Parquet, Avro, or ORC, and of compression techniques
  • Programming Languages: Deep proficiency in Python (specifically PySpark), SQL, PowerShell, and Scala
  • Infrastructure: Hands-on experience with Azure cloud infrastructure, including networking (VNETs), Key Vault, and identity management
  • Big Data Tools: Deep knowledge of Apache Spark runtime internals, MLflow for MLOps, and orchestration tools such as Airflow
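For candidates unfamiliar with the medallion pattern named under Architecture Design, a minimal PySpark/Delta Lake sketch follows. This is an illustration only, not this employer's implementation: the storage paths, the orders dataset, and the column names (order_id, amount, order_ts, region) are hypothetical, and it assumes a Databricks-style notebook where a Delta-enabled SparkSession named spark is already provided.

    from pyspark.sql import functions as F

    # Assumes a Databricks-style notebook where `spark` (a SparkSession with
    # Delta Lake support) is already provided. All paths, the source dataset,
    # and column names below are hypothetical.

    # Bronze: land the raw source data as-is, preserving everything.
    raw = spark.read.json("/mnt/raw/orders/")
    raw.write.format("delta").mode("append").save("/mnt/bronze/orders")

    # Silver: cleanse and validate -- enforce types, apply simple
    # data-quality checks (non-null keys, positive amounts), de-duplicate.
    bronze = spark.read.format("delta").load("/mnt/bronze/orders")
    silver = (
        bronze
        .filter(F.col("order_id").isNotNull() & (F.col("amount") > 0))
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .dropDuplicates(["order_id"])
    )
    silver.write.format("delta").mode("overwrite").save("/mnt/silver/orders")

    # Gold: a business-level aggregate ready for analysts and BI tools.
    gold = (
        silver
        .groupBy(F.window("order_ts", "1 day").alias("day"), "region")
        .agg(F.sum("amount").alias("daily_revenue"))
    )
    gold.write.format("delta").mode("overwrite").save("/mnt/gold/daily_revenue")

Each layer is a separate Delta table, so downstream consumers can pick the refinement level they need, and quality checks are applied once, on the way into Silver.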
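The Kafka integration mentioned under Pipeline Engineering typically amounts to a Structured Streaming read appended continuously into the Bronze layer. Again a hedged sketch under the same assumptions: the broker address, topic name, and checkpoint path are placeholders, and the spark-sql-kafka connector is assumed to be available on the cluster.

    # Assumes the same Databricks-style `spark` session and that the
    # spark-sql-kafka connector is on the cluster. Broker, topic, and
    # paths are placeholders.
    events = (
        spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "broker-1:9092")  # placeholder broker
        .option("subscribe", "orders")                       # placeholder topic
        .option("startingOffsets", "earliest")
        .load()
    )

    # Kafka delivers binary key/value pairs; decode the payload, keep the
    # event timestamp, and append continuously into a Bronze Delta table.
    (
        events
        .selectExpr("CAST(value AS STRING) AS payload", "timestamp")
        .writeStream
        .format("delta")
        .option("checkpointLocation", "/mnt/chk/orders_bronze")
        .outputMode("append")
        .start("/mnt/bronze/orders_stream")
    )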

Created: 2026-05-09
