Job Objective
Design, build, and maintain robust, scalable data architectures that enable advanced analytics, machine learning, and data-driven decision-making across the organization. Own the end-to-end data lifecycle, from ingestion and transformation to storage and access, ensuring high data quality, reliability, and performance. Partner closely with analytics, engineering, and business teams to translate data requirements into efficient, production-grade data solutions that support operational excellence and innovation.
Key Responsibilities
Data Architecture & Pipelines
- Design, develop, and maintain scalable, fault-tolerant data pipelines for structured and unstructured data using modern ETL/ELT frameworks.
- Implement and optimize data models (e.g., dimensional, wide tables, lakehouse patterns) to support analytics, reporting, and machine learning use cases.
- Ensure high standards of data quality, consistency, lineage, and documentation across data platforms.
Platforms & Infrastructure
- Build and operate data solutions using cloud platforms (e.g., AWS, Azure, GCP) and big data technologies (e.g., Spark, distributed databases, data lakes).
- Manage data storage, processing, and orchestration tools to ensure performance, scalability, and cost efficiency.
- Support deployment, monitoring, and troubleshooting of production data pipelines and workflows.
Collaboration & Enablement
- Work closely with data scientists, analysts, and business stakeholders to understand data needs and enable efficient access to trusted datasets.
- Translate analytical and operational requirements into reliable, reusable data assets.
- Support self-service analytics by improving data discoverability and accessibility.
Governance & Reliability
- Implement data governance, security, and access controls in alignment with organizational and regulatory requirements.
- Monitor pipeline health, data freshness, and system performance; proactively resolve issues and improve reliability.
- Contribute to best practices for version control, testing, CI/CD, and infrastructure-as-code for data systems.
Continuous Improvement & Innovation
- Stay current with emerging data engineering tools, architectures, and best practices.
- Evaluate and introduce new technologies to improve data platform capabilities, efficiency, and resilience.
- Drive continuous improvement of data engineering standards and processes across the organization.