Python/ETL Data Engineer (AWS Only)
Job Description:
Design and develop data pipelines using AWS Glue, AWS Athena, AWS Data Pipeline, and other ETL tools on AWS
Design and develop monitoring and early-detection processes for data pipeline issues such as missing or lagging data
Extract data from various sources, including relational databases, non-relational databases, and flat files
Transform data to meet business requirements and load it into target data stores
Transform data based on existing data mappings and architecture
Monitor and troubleshoot data pipeline performance issues
Collaborate with other teams, including Data Scientists and Business Analysts, to understand data requirements and implement solutions
Continuously improve data pipeline performance and scalability
Build and maintain REST APIs for data access and integration with external systems
Implement testing and documentation
Requirements:
Strong experience building ETL pipelines in an AWS environment
In-depth knowledge of AWS services, including Redshift, Glue, EMR, and S3
Strong experience with SQL and database design
Experience coding in Python
Experience with REST API development and principles
Knowledge of API testing and documentation tools
Strong understanding of data warehousing and data modeling concepts
Experience with data pipeline monitoring and troubleshooting
Experience with data security and compliance best practices
Strong problem-solving and analytical skills