Cloud Engineer

  • Full Time, onsite
  • POWER IT SERVICES
  • Wilayah Persekutuan Kuala Lumpur, Malaysia
Salary undisclosed

Job Description

Data Engineer

Years of Experience – 8-12 years

Key competences:

- Data Engineer / Data Engineering Tech Lead

- Data Engineer with Cloudera and Azure Cloud experience

- Expertise in PySpark, Azure Synapse, Azure Data Factory, Hadoop, and Hive

- Experienced in batch and real-time data integration using Azure Cloud technologies

- Tertiary qualifications in a relevant discipline, with certifications in Microsoft Azure.

- Comprehensive knowledge of public cloud environments and industry trends.

- Significant experience supporting, designing, and developing public cloud solutions via Infrastructure as Code, including Terraform and ARM.

- Extensive DevOps experience.

- The ability to communicate effectively and work collaboratively with diverse team members.

- Demonstrated experience in security hardening and testing.

- Proven ability in creating and updating accurate documentation.

- Excellent verbal and written communication skills.

- Willingness and flexibility to work outside of standard office hours, and on weekends as required.

Specific activities required:

- Lead the implementation of infrastructure via code and provide strategic advice/recommendations for the development and advancement of Microsoft Azure technologies based on previous research on trends in public cloud environments.

- Integrate and automate the delivery of standardized Azure deployments, in conjunction with orchestration products such as Azure DevOps with Terraform, Azure ARM templates, and other modern deployment technologies.

- Act as the escalation point for level-three Azure-related issues, providing technical support and fault resolution, as well as guidance and mentoring of operational run teams, both locally and remotely throughout the organization.

- Ensure business requirements are appropriately gathered and translated into suitable solutions.

- Maintain and deliver all related documentation for the design, development, build, and deployment methods used, ensuring that all applicable code is stored and managed properly in source control.

- Provide guidance and assistance to all support teams.

- Provide complementary support and leadership in hardening and security testing.

ETL - Medallion architecture; common data pipeline development using a framework or metadata; how to handle data quality (lookups, checks); data warehousing concepts (star/snowflake schemas, fact/dimension tables, surrogate key vs. primary key, Slowly Changing Dimensions (SCD) and the SCD Type I vs. Type II difference, normalization types such as 2NF vs. 3NF); how to identify delta records/files
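
For illustration, a minimal PySpark sketch of the SCD Type I vs. Type II difference, assuming a local Spark session and illustrative column names (customer_id, city, valid_from, valid_to, is_current):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.master("local[*]").appName("scd_demo").getOrCreate()

    # Current dimension row and an incoming change (illustrative data).
    dim = spark.createDataFrame(
        [(1, "KL", "2023-01-01", "9999-12-31", True)],
        ["customer_id", "city", "valid_from", "valid_to", "is_current"],
    )
    updates = spark.createDataFrame([(1, "Penang")], ["customer_id", "new_city"])

    # SCD Type I: overwrite the attribute in place; no history is kept.
    scd1 = (
        dim.join(updates, "customer_id", "left")
           .withColumn("city", F.coalesce("new_city", "city"))
           .drop("new_city")
    )

    # SCD Type II: close the current row and append a new versioned row.
    closed = (
        dim.join(updates.select("customer_id"), "customer_id")  # keys with changes
           .withColumn("valid_to", F.lit("2024-01-01"))
           .withColumn("is_current", F.lit(False))
    )
    new_rows = updates.select(
        "customer_id",
        F.col("new_city").alias("city"),
        F.lit("2024-01-01").alias("valid_from"),
        F.lit("9999-12-31").alias("valid_to"),
        F.lit(True).alias("is_current"),
    )
    scd2 = closed.unionByName(new_rows)  # a full pipeline would also keep unchanged rows

    scd1.show()
    scd2.show()

Type I simply overwrites the changed attribute, while Type II preserves history by expiring the old row and inserting a new version with fresh validity dates.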

PySpark - Spark architecture; create a DataFrame from a collection of data; remove duplicate values from a DataFrame; PySpark vs. T-SQL differences, including select, aggregate functions, union, limit, adding a new column to a DataFrame, and filtering; window functions (lead, lag, string replace, substring index); joins; group by having count greater than/lower than; selecting data from multiple tables; CASE WHEN with multiple conditions; performance optimization (how to mitigate shuffle and data skew)
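
For illustration, a minimal PySpark sketch touching several of the topics above (dropDuplicates, a lag window, group-by-having, and a broadcast join to limit shuffle); table and column names are illustrative:

    from pyspark.sql import SparkSession, Window, functions as F

    spark = SparkSession.builder.master("local[*]").appName("pyspark_demo").getOrCreate()

    orders = spark.createDataFrame(
        [(1, "A", 10.0), (1, "A", 10.0), (2, "B", 5.0), (3, "A", 7.5)],
        ["order_id", "customer", "amount"],
    )

    # Remove duplicate rows from a DataFrame.
    deduped = orders.dropDuplicates()

    # Window function: previous amount per customer (lag), ordered by order_id.
    w = Window.partitionBy("customer").orderBy("order_id")
    with_lag = deduped.withColumn("prev_amount", F.lag("amount").over(w))

    # Equivalent of SQL GROUP BY ... HAVING COUNT(*) > 1.
    frequent = deduped.groupBy("customer").count().filter(F.col("count") > 1)

    # Broadcast the small dimension side so the large side is not shuffled for the join.
    customers = spark.createDataFrame(
        [("A", "Retail"), ("B", "Wholesale")], ["customer", "segment"]
    )
    joined = deduped.join(F.broadcast(customers), "customer", "left")

    with_lag.show()
    frequent.show()
    joined.show()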

Hive - managed vs. external tables; create the DDL for an external table; change settings within a Hive session; validate functions such as trim, replace, concat, etc.; how to establish a JDBC connection; the three primary complex datatypes in Hive and how they differ
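
For illustration, a minimal sketch of the Hive topics above, issued through spark.sql on a session with Hive support; the table name and storage path are illustrative:

    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("hive_demo")
        .enableHiveSupport()
        .getOrCreate()
    )

    # External table: Hive owns only the metadata; dropping the table leaves the
    # underlying files in place (dropping a managed table deletes the data too).
    spark.sql("""
        CREATE EXTERNAL TABLE IF NOT EXISTS sales_ext (
            id    INT,
            item  STRING,
            tags  ARRAY<STRING>,                      -- complex type 1: ARRAY
            attrs MAP<STRING, STRING>,                -- complex type 2: MAP
            buyer STRUCT<name: STRING, city: STRING>  -- complex type 3: STRUCT
        )
        ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
        LOCATION '/tmp/external/sales_ext'
    """)

    # Change a setting within the session (in Hive, e.g. SET hive.exec.dynamic.partition=true).
    spark.sql("SET spark.sql.shuffle.partitions=8")

    # Validate string functions such as trim, replace, concat.
    spark.sql("SELECT concat(trim('  ab  '), replace('x-y', '-', '_')) AS s").show()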

Azure Synapse - Azure subscription/service principal/tenant ID; Azure Synapse pipeline development; Azure dedicated SQL pool DDL/DML development; Azure Synapse Spark notebook development
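
For illustration, a minimal sketch of how the tenant ID and a service principal fit together when authenticating with the azure-identity library, requesting a token scoped to Azure SQL/Synapse dedicated SQL pools; all identifiers are placeholders:

    from azure.identity import ClientSecretCredential

    credential = ClientSecretCredential(
        tenant_id="<tenant-id>",        # Azure AD tenant
        client_id="<app-client-id>",    # service principal (app registration)
        client_secret="<client-secret>",
    )

    # Access token usable against an Azure SQL / Synapse dedicated SQL pool endpoint.
    token = credential.get_token("https://database.windows.net/.default")
    print(token.expires_on)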

ADLS Communication - Crisp & confident; video without background animation; validate PySpark or SQL experience using the candidate's notebook via screen sharing

Scheduler - Oozie workflow for high-level orchestration & scheduling; how/where to create a Spark application and how to invoke it; validate event-based triggers, re-run from the point of failure, notifications, and exception handling