Data Engineer

Salary undisclosed

Key Responsibilities

Web Scraping & Data Extraction
- Develop and optimize web scraping pipelines using tools such as Scrapy, Puppeteer, Selenium, or similar frameworks.
- Automate and scale data extraction from multiple sources, ensuring accuracy and compliance.
- Handle CAPTCHA bypassing, rotating proxies, and anti-bot measures to enhance data collection.

ETL (Extract, Transform, Load) Pipelines
- Build and maintain ETL pipelines to process large volumes of data efficiently.
- Implement data transformation, cleaning, validation, and enrichment for AI models.
- Ensure real-time and batch data processing for optimized workflows.

Data Storage & Management
- Design and manage structured and unstructured data storage solutions (SQL, NoSQL, Data Lakes).
- Optimize database queries, indexing, and performance for large-scale datasets.
- Implement data warehousing best practices for analytics and AI applications.

Collaboration & Optimization
- Work closely with the backend development team to improve data access, performance, and efficiency.
- Support AI model training by ensuring high-quality, well-structured datasets.
- Identify bottlenecks and propose solutions to enhance data processing speed.

Minimum Requirements

Technical Skills
- Proven experience in web scraping (Scrapy, Puppeteer, Selenium, or similar).
- Strong understanding of ETL processes and data pipeline architecture.
- Experience with SQL & NoSQL databases (MySQL, PostgreSQL, MongoDB, Elasticsearch, etc.).
- Knowledge of data storage optimization and big data frameworks (Hadoop, Spark, or similar).
- Proficiency in Python (Pandas, NumPy, Airflow, FastAPI, etc.) or another relevant programming language.
- Familiarity with cloud platforms (AWS, GCP, Azure) for data storage and processing.
- Experience with real-time data streaming (Kafka, Flink, RabbitMQ, etc.).
- Hands-on experience with Machine Learning data preprocessing.
- Knowledge of data security, compliance, and privacy best practices.
- Experience with API development for data services.

Soft Skills & Work Approach
- Strong problem-solving and analytical skills.
- Ability to work independently and collaborate with cross-functional teams.
- Good communication skills and documentation abilities.
- Adaptability to a fast-paced, AI-driven environment.