- Company Name
- PETADATA
- Job Title
- AWS Data Engineer
- Job Description
-
Job Title: AWS Data Engineer
Role Summary: Design, build, and maintain scalable data pipelines and data architecture on cloud platforms, primarily AWS, to support analytics, ML, and BI workloads.
Expectations:
- Execute end‑to‑end data engineering solutions, from ingestion to transformation to storage, ensuring performance, reliability, and compliance.
- Lead modernization initiatives, establish best practices, and mentor peers.
Key Responsibilities:
- Develop batch and real‑time pipelines using Spark, Flink, Kafka, and Hadoop.
- Build and manage data lakes, warehouses, and lakehouses (S3, Redshift, Athena, EMR).
- Create and maintain schemas, data models, and metadata; enforce governance, lineage, and security.
- Integrate heterogeneous data sources (databases, APIs, streaming services).
- Optimize storage, indexing, partitioning, and query performance across SQL/NoSQL systems.
- Implement data quality, validation, monitoring, and alerting.
- Automate workflows with Airflow, Prefect, or Luigi; schedule, monitor, and troubleshoot jobs.
- Collaborate with data scientists, analysts, and business stakeholders to translate requirements into data solutions.
- Monitor pipeline health, troubleshoot bottlenecks, and balance cost against performance.
- Define data engineering standards and the roadmap, and drive platform direction.
Required Skills:
- Proficiency in Python and SQL; experience with Scala or Java is optional.
- Hands‑on experience with Spark, Kafka, and Hadoop; familiarity with Flink is sufficient.
- Deep expertise in AWS services: S3, Glue, Redshift, EMR, Athena; familiarity with Azure Data Factory/Synapse or GCP BigQuery/Dataflow is a plus.
- Strong database skills across relational (PostgreSQL, MySQL) and NoSQL (MongoDB, Cassandra) systems, including NoSQL tuning.
- Workflow orchestration proficiency: Airflow, Prefect, Luigi.
- Solid understanding of data modeling, warehousing, and lakehouse concepts.
- Knowledge of data quality, governance, and security, including regulatory compliance (GDPR, HIPAA).
- Excellent analytical, problem‑solving, and communication abilities.
Required Education & Certifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or related field.
- No mandatory certifications, but familiarity with AWS certifications (e.g., Solutions Architect, Data Analytics) is advantageous.