Job Description for PySpark Jobs at Capgemini
- Must have experience implementing an AWS big data lake using EMR and Spark
- 3 years of experience working with Spark, Hive, and message-queue or pub/sub streaming technologies
- 6 years of experience developing data pipelines using a mix of languages (Python, Scala, SQL, etc.) and open-source frameworks for data ingestion, processing, and analytics
- Experience with open-source big data processing and streaming frameworks such as Apache Spark, Hadoop, and Kafka
- Hands-on experience with newer data technologies such as Spark, Airflow, Apache Druid, Snowflake, or any other OLAP database
- Experience developing and deploying data pipelines and real-time data streams on cloud infrastructure, preferably AWS (a minimal PySpark sketch follows this list)
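As a rough illustration of the pipeline and streaming skills listed above, the sketch below uses PySpark Structured Streaming to consume a Kafka topic and maintain a running count per key. The broker address, topic name, and checkpoint path are placeholder assumptions, and the job presumes the spark-sql-kafka connector is available on the cluster (for example, via --packages on EMR).

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kafka-events-pipeline").getOrCreate()

# Subscribe to a Kafka topic; the broker address and topic name are placeholders.
events = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")
    .option("subscribe", "events")
    .load()
)

# Kafka delivers key/value as raw bytes; cast to strings before aggregating.
counts = (
    events.select(F.col("key").cast("string"), F.col("value").cast("string"))
    .groupBy("key")
    .count()
)

# Emit the running counts to the console every 30 seconds
# (swap the sink for S3/Parquet or a warehouse in a real deployment).
query = (
    counts.writeStream
    .outputMode("complete")
    .format("console")
    .option("checkpointLocation", "/tmp/checkpoints/events")
    .trigger(processingTime="30 seconds")
    .start()
)
query.awaitTermination()
```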
Primary Skills for PySpark Jobs at Capgemini
- Experience using CI/CD pipelines with GitLab
- Experience implementing code-quality checks
- Experience with PEP 8, Pylint, or any other code-quality tool
- Experience with Python plugins and operators, such as Airflow's FTP sensor and Oracle operator (a minimal DAG sketch follows this list)
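To make the plugin/operator requirement concrete, here is a minimal DAG sketch that waits for a file with the FTP sensor and then loads data with the Oracle operator. The connection IDs, paths, table names, and schedule are hypothetical; it assumes Airflow 2.x with the apache-airflow-providers-ftp and apache-airflow-providers-oracle packages installed.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.ftp.sensors.ftp import FTPSensor
from airflow.providers.oracle.operators.oracle import OracleOperator

with DAG(
    dag_id="ftp_to_oracle_example",  # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Block until the expected extract appears on the remote FTP server.
    wait_for_file = FTPSensor(
        task_id="wait_for_file",
        path="/incoming/daily_extract.csv",  # placeholder remote path
        ftp_conn_id="ftp_default",           # connection configured in Airflow
        poke_interval=60,
    )

    # Once the file has arrived, run a load statement against Oracle.
    load_to_oracle = OracleOperator(
        task_id="load_to_oracle",
        oracle_conn_id="oracle_default",     # connection configured in Airflow
        sql="INSERT INTO daily_extract SELECT * FROM staging_daily_extract",
    )

    wait_for_file >> load_to_oracle
```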
To apply for this job, please visit www.capgemini.com.