Job Description
KEY RESPONSIBILITIES
You will:
- Design, develop, and maintain data pipelines and ETL processes using Microsoft Azure services (e.g., Azure Data Factory, Azure Synapse, Azure Databricks, Microsoft Fabric).
- Use Azure storage accounts (e.g., Azure Data Lake Storage Gen2 and Azure Blob Storage) to organize and maintain data pipeline outputs.
- Collaborate with data scientists, data analysts, data architects, and other stakeholders to understand data requirements and deliver high-quality data solutions.
- Optimize data pipelines in the Azure environment for performance, scalability, and reliability.
- Ensure data quality and integrity through data validation techniques and frameworks.
- Develop and maintain documentation for data processes, configurations, and best practices.
- Monitor and troubleshoot data pipeline issues to ensure timely resolution.
- Stay current with industry trends and emerging technologies to ensure our data solutions remain cutting-edge.
- Manage the CI/CD process for deploying and maintaining data solutions.
REQUIREMENTS AND QUALIFICATIONS
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience), with demonstrated high proficiency in programming fundamentals.
- Proven experience (3+ years) as a Data Engineer or in a similar role working with data and ETL processes.
- Strong knowledge of Microsoft Azure services, including Azure Data Factory, Azure Synapse, Azure Databricks, Azure Blob Storage, and Azure Data Lake Storage Gen2.
- Experience writing efficient SQL DML queries against modern RDBMSs (e.g., SQL Server, PostgreSQL).
- Strong understanding of Software Engineering principles and how they apply to Data Engineering (e.g., CI/CD, version control, testing).
- Experience with big data technologies (e.g., Spark).
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
PREFERRED QUALIFICATIONS
- Learning agility
- Technical leadership
- Consulting and managing business needs
- Strong experience in Python is preferred, but experience in other languages (e.g., Scala, Java, or C#) is accepted.
- Experience building Spark applications using PySpark.
- Experience with file formats such as Parquet, Delta, and Avro.
- Experience efficiently querying API endpoints as a data source.
- Understanding of the Azure environment and related concepts such as subscriptions and resource groups.
- Understanding of Git workflows in software development.
- Experience using Azure DevOps pipelines and repositories to deploy and maintain solutions.
- Understanding of Ansible and how to use it in Azure DevOps pipelines.
Work Setup: Hybrid, Day Shift
Work Location: Makati