Key Responsibilities
- Design and Develop Data Solutions
  - Develop and optimize ETL/ELT pipelines on Databricks using Apache Spark.
  - Create scalable data processing solutions to ingest, transform, and analyze large datasets.
- Data Management and Integration
  - Integrate structured and unstructured data sources into Databricks, ensuring data quality and consistency.
  - Implement data lake and data warehouse solutions using Databricks, Delta Lake, and cloud platforms (Azure/AWS/GCP).
- Performance Tuning
  - Optimize Spark jobs and workflows for performance, reliability, and cost efficiency.
  - Monitor, troubleshoot, and fine-tune Databricks clusters and pipelines.
- Collaboration and Support
  - Collaborate with data scientists, analysts, and business teams to deliver insights and solutions.
  - Support the deployment of ML models and advanced analytics workloads on Databricks.
- Documentation and Best Practices
  - Maintain documentation for data pipelines, processes, and Databricks workflows.
  - Enforce best practices for code quality, security, and operational excellence.
Required Skills and Qualifications
- Technical Expertise
  - Hands-on experience with Databricks and Apache Spark for data engineering or analytics.
  - Proficiency in Python and SQL (Scala/Java a plus).
  - Experience with cloud platforms (e.g., Azure Databricks, AWS EMR, or GCP).
  - Strong understanding of Delta Lake architecture, data lakes, and data warehouses.
- Data Engineering Skills
  - Expertise in building and maintaining ETL/ELT workflows for large-scale data processing.
  - Familiarity with workflow orchestration tools such as Airflow or Azure Data Factory.
- Analytical and Problem-Solving Skills
  - Ability to optimize performance and troubleshoot issues in Databricks/Spark environments.
  - Knowledge of big data concepts, distributed systems, and scalable architectures.
- Tools and Frameworks
  - Experience with CI/CD pipelines and version control tools such as Git.
  - Exposure to machine learning (ML) workflows on Databricks is a plus.
Job Type: Full-time
Pay: RM5,000.00 - RM16,000.00 per month