Tekgence Inc

Hadoop Developer

On site

Toronto, Canada

Junior

Freelance

08-10-2025


Skills

Python, Java, Scala, Data Engineering, Encryption, Test, Linux, AWS, AWS Cloud, Hadoop, Spark, PySpark

Job Specifications

Hello,

Please find the Job Description below

AWS Hadoop Developer

Toronto, ON

Role and Responsibilities

Understand requirements from product owners and translate them into requirement and scope documents
Decide on the best fit among the technologies and services that are available in scope
Create solutions for data ingestion and data transformation using Hadoop services like Spark, Spark Streaming, Hive, etc.
Create technical design documents to communicate solutions to the team, and mentor the team to develop the solution
Build the solution with Hadoop services as per design specifications
Assist teams to build test cases and support testing of the solution
Coordinate with upstream, downstream and other supporting teams for production implementation
Provide post-production support for solutions implemented
Develop data engineering frameworks in Spark on the AWS Data Lake platform (a sketch follows this list)
Coordinate with clients, data users and key stakeholders to understand feature requirements and merge them into reusable design patterns
Onboard data using the developed frameworks
Understand the available code in Netezza and Hadoop to design the best way to implement its current features in the AWS Data Lake
Unit test code and aid with QA/SIT/Perf testing
Migrate solutions to the production environment
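For context, a minimal PySpark sketch of the kind of ingestion-and-transformation framework described above: reading raw files from an S3 landing zone, applying basic transformations and writing partitioned Parquet to a curated data lake zone. The bucket names, paths and column names are hypothetical examples, not part of any client environment.

# Minimal illustrative sketch of a Spark ingestion/transformation job on an
# AWS data lake. Bucket names, paths and columns are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("raw-to-curated-ingestion")
    .getOrCreate()
)

# Ingest raw CSV files landed in the "raw" zone of the lake (hypothetical path).
raw_df = (
    spark.read
    .option("header", "true")
    .csv("s3://example-raw-bucket/transactions/2025/10/")
)

# Basic transformation: type casting, a derived date column and de-duplication.
curated_df = (
    raw_df
    .withColumn("amount", F.col("amount").cast("double"))
    .withColumn("txn_date", F.to_date("txn_timestamp"))
    .dropDuplicates(["txn_id"])
)

# Write to the curated zone as partitioned Parquet for downstream consumers.
(
    curated_df.write
    .mode("overwrite")
    .partitionBy("txn_date")
    .parquet("s3://example-curated-bucket/transactions/")
)

spark.stop()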

MUST HAVE skills and experience for this requirement (technical, domain and non-technical, as applicable to the position):

Candidate should have strong working experience with the Hadoop platform.
Strong hands-on experience with Hive and with Spark using Scala.
In-depth knowledge and extensive experience in building batch workloads on Hadoop.
Adept at analyzing and refining requirements and consumption query patterns, and at choosing the right technology fit, such as RDBMS, data lake or data warehouse.
Should have knowledge of analytical data modelling on any RDBMS platform or Hive.
Should have working experience in Pentaho.
Proven practical experience in migrating RDBMS-based data to Hadoop on-prem.
7+ years of data experience on Data Warehouse and Data Lake platforms.
At least 2 years of implementation experience with AWS Data Lake, S3, EMR, Glue, Python, AWS RDS, Amazon Redshift, Amazon Lake Formation, Airflow, data models, etc.
Must have very strong knowledge of PySpark.
Must understand data encryption techniques and be able to implement them (see the sketch after this list).
Must have experience working with Bitbucket, Artifactory and AWS CodePipeline.
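Because the role calls for implementing data encryption in PySpark, here is a minimal illustrative sketch of column-level encryption using a symmetric Fernet key from the cryptography library, applied through a Spark UDF. In a real deployment the key would come from AWS Secrets Manager or KMS; the inline key generation and the sample column values are hypothetical.

# Illustrative column-level encryption in PySpark with a symmetric Fernet key.
# In practice the key would be fetched from a secrets store (AWS Secrets
# Manager / KMS); generating it inline here is purely for the sketch.
from cryptography.fernet import Fernet
from pyspark.sql import SparkSession
from pyspark.sql.functions import udf
from pyspark.sql.types import StringType

spark = SparkSession.builder.appName("pii-encryption-sketch").getOrCreate()

key = Fernet.generate_key()  # hypothetical; normally retrieved, not generated

def encrypt_value(value):
    # Encrypt a single string value; None passes through untouched.
    if value is None:
        return None
    return Fernet(key).encrypt(value.encode("utf-8")).decode("utf-8")

encrypt_udf = udf(encrypt_value, StringType())

df = spark.createDataFrame(
    [("c001", "416-555-0100"), ("c002", "647-555-0199")],
    ["customer_id", "phone_number"],
)

# Replace the sensitive column with its encrypted form before landing the data.
encrypted_df = df.withColumn("phone_number", encrypt_udf("phone_number"))
encrypted_df.show(truncate=False)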
NICE TO HAVE (not mandatory) skills and experience for this requirement:

Hands-on experience working with terabyte/petabyte-scale data and millions of transactions per day.

Skills to develop ETL pipelines using Airflow (see the sketch after this list).
Knowledge of Spark Streaming or any other streaming jobs.
Ability to deploy code using AWS CodePipeline and Bitbucket is an added plus.
Expert in any of the following programming languages: Scala, Java; comfortable working on the Linux platform.
Knowledge of Pythonic pipeline design.
AWS Cloud infrastructure for services like S3, Glue, Secrets Manager, KMS, Lambda, etc.
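As a rough illustration of the Airflow skill above, a minimal DAG sketch (assuming Airflow 2.x) that runs a three-stage Spark ETL pipeline via spark-submit. The DAG id, script paths and schedule are hypothetical; a production setup might use the Spark or EMR provider operators instead of BashOperator.

# Minimal Airflow 2.x DAG sketch for an ETL pipeline that submits Spark jobs.
# DAG id, script paths and schedule are hypothetical examples.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_transactions_etl",
    start_date=datetime(2025, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    ingest_raw = BashOperator(
        task_id="ingest_raw",
        bash_command="spark-submit /opt/jobs/ingest_raw.py {{ ds }}",
    )

    transform_curated = BashOperator(
        task_id="transform_curated",
        bash_command="spark-submit /opt/jobs/transform_curated.py {{ ds }}",
    )

    publish_redshift = BashOperator(
        task_id="publish_redshift",
        bash_command="spark-submit /opt/jobs/publish_redshift.py {{ ds }}",
    )

    # Run the stages in order: raw ingestion, then curation, then publishing.
    ingest_raw >> transform_curated >> publish_redshift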

About the Company

We accelerate business transformation by solving complex technology, business and talent challenges. A digital transformation partner and global consulting firm, we transform clients' business, operating and technology models for the digital era. We have strong industry experience providing change management and business process improvement services, with specialist knowledge in Procurement, Supply Chain and Logistics.