About our Client:
Key Responsibilities:
Requirements
- Understand the requirements
- Understand the design specifications
Development
- Ensures Non-Functional Requirements can be met within the proposed solution.
- Develops good relationships with technical stakeholders and peers.
- Documents pipeline implementation in meaningful inline comments and updates design documentation where required.
- Follows Pipeline Development Standards.
- Develops ETL pipelines and transformation logic.
- Organises code reviews and pull requests for their own code.
- Develops components within the ETL frameworks.
- Manages their code in the version management tool (Bitbucket).
- Raises pull requests and ensures timely code merges.
- Follows coding standards
Testing
- Creates test cases and ensures code quality meets expectations.
- Reviews test results and provides updates and reports on any issues.
Technologies
- Ensures all Airflow/Python ETL processes and data are correct and up to date.
- Data warehouse concepts and SQL queries
- Maintains security and access to the warehouse.
- Ensures data quality is in accordance with the SLAs.
- Expert skills in developing Python scripts and using standard libraries.
- Ensures all code is of high quality and adheres to standards and best practices.
- Expands the ETL frameworks in line with the architecture requirements.
- Reviews code changes of peers and ensures all code is of high quality.
Operational
- Performs rostered operational monitoring and system checks (rotational system across the team, generally one week every 3-4 weeks)
- Ensures system health checks are performed and any issues are identified.
- Performs error triage during operational monitoring and manages any operational issue to conclusion
- Ensures regression test cases are developed and available in the Operational Monitoring Dashboard.
- Ensures data quality is maintained at all times.
- Ensures excellent Security processes and standards are in place and well maintained.
Key Experience, Skills and Education:
Must Have
- Must be confident in Python programming and data libraries, e.g. Pandas
- 0-5 years of recent commercial Python experience
- 0-2 years for a Junior Data Engineer; 2+ years for a Data Engineer
- Backend and database interaction
- Proficient in Structured Query Language (SQL)
- Good written and verbal communication
Nice To Have
- Bachelor's or Master's degree in data engineering, data science, information technology, or equivalent
- Working experience in data processing and/or data engineering
- Exposure to Big Data technology, e.g. Spark, Spark Streaming
- Apache Airflow
- Experience with an equivalent programming language (Java, C#, Scala)
- Big Data streaming: Spark Streaming, Kafka Streams, Flink
- Experience working with data lakes, ideally Azure Data Lake
- Experience with the Parquet file format
- Azure Synapse experience is a plus