AI & Data
In this age of disruption, organizations need to navigate the future with confidence, making clear, data-driven choices that deliver enterprise value in a dynamic business environment. The AI & Data team leverages the power of data, analytics, robotics, science and cognitive technologies to uncover hidden relationships from vast troves of data, generate insights, and inform decision-making. Together with the Strategy practice, our Strategy & Analytics portfolio helps clients transform their business by architecting organizational intelligence programs and differentiated strategies to win in their chosen markets.
Senior Consultant - AWS PySpark
The position is suited for individuals who can work in a constantly challenging environment and deliver effectively and efficiently. As a Data Engineer, you will be an integral member of our Data & Analytics team, responsible for the design and development of pipelines using cutting-edge technologies.
Key Responsibilities
Data Modeling: Create and maintain conceptual, logical, and physical data models that accurately represent the organization's data assets. These models may include entity-relationship diagrams (ERDs), data flow diagrams, and schema designs.
Requirements Gathering: Collaborate with business analysts, data analysts, and other stakeholders to understand their data needs and requirements. Translate these requirements into data model specifications.
Data Standardization: Define and enforce data standards, naming conventions, and data definitions to ensure consistency and clarity across the organization.
Database Design: Work closely with database administrators (DBAs) to design and optimize database schemas based on the data models. Ensure that databases are efficient, scalable, and well-structured.
Data Integration: Facilitate the integration of data from various sources into the organization's data environment. Ensure that data is transformed and mapped appropriately to meet business needs.
Data Governance: Participate in data governance initiatives by establishing data quality rules, data lineage, and data ownership guidelines. Monitor and enforce data governance policies.
Documentation: Maintain comprehensive documentation of data models, data dictionaries, and metadata. Ensure that all changes to data models are properly documented and communicated.
Performance Optimization: Collaborate with developers and DBAs to optimize database performance. Identify and resolve performance bottlenecks related to data access and storage.
Data Security: Implement data security measures to protect sensitive and confidential data. Define access controls and data encryption strategies as needed.
Data Analysis Support: Support data analysts and data scientists by providing well-structured data models that facilitate data analysis and reporting.
Data Migration: Assist in data migration projects by ensuring that data models are consistent between source and target systems. Verify data integrity during migration.
Professionals should possess a clear understanding of current data models.
Should be able to clearly articulate end-to-end data pipelines, including the analysis performed and the results accomplished.
Implement data pipelines and ETL processes within AWS.
Must possess the ability to construct data pipelines using Glue and allocate cloud resources.
Should be capable of conducting data analysis, executing data quality checks, and rectifying data quality problems in collaboration with business stakeholders or Product Owners.
Required to adhere to leading practices to design pipelines in line with industry standards.
Will be collaborating with key stakeholders to address critical issues as they arise.
Qualifications
3-7 years of hands-on experience with AWS ETL services: Glue, EMR, S3, and Athena
Expert in languages such as Python and PySpark
Advanced knowledge of database management systems and SQL
Excellent problem-solving and analytical skills.
Strong communication and collaboration skills to work with cross-functional teams.
Experience with version control practices and CI/CD processes (Git, Bamboo, etc.)
Understanding of the Financial Services Industry - IMRE (preferred)
Primary Skills:
AWS Glue
Athena
SQL
Python