Cognizant Technology Solutions
Nov 2022 – Present
• Worked on a data migration project for a U.S.-based pharmaceutical client, migrating the existing system to Apache Iceberg using PySpark.
• Developed custom code for ingesting raw data from multiple sources: Oracle (via JDBC), Salesforce APIs, and file-based formats (CSV, text, XML), with data stored in AWS S3.
• Executed data transformations across raw, TL, and ATL layers, creating cross-references (XREFs) and harmonized golden records for critical tables.
• Developed an optimized PySpark solution for writing data from Apache Iceberg tables to Salesforce schemas via APIs, serving downstream applications.
• Analyzed the existing framework and complex SQL code, reimplementing it with dynamic, generic scripts and queries to improve efficiency.
• Improved data quality and ensured seamless data orchestration using the Modak Nabu tool.
• Collaborated with the team to lead critical deployments and resolve challenges in production environments.
• Applied advanced data engineering concepts to refine workflows, boosting pipeline efficiency and enhancing data quality, leading to a 25% reduction in errors.
• Optimized data pipelines to enhance performance and ensure seamless data availability for outbound systems, improving accuracy and reducing processing time by 30%.
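The XREF and golden-record harmonization mentioned above can be illustrated with a minimal, Spark-free sketch. The record shapes, source names, matching key (email), and priority order here are all hypothetical; the production pipeline ran as PySpark jobs over Apache Iceberg tables.

```python
# Minimal sketch of XREF / golden-record harmonization.
# Assumptions: records carry a "source", a "source_id", and a business
# key ("email"); higher-priority sources win field conflicts.

SOURCE_PRIORITY = ["oracle", "salesforce", "file"]  # assumed trust order

def build_xref(records):
    """Map each (source, source_id) pair to a shared golden ID
    derived from a common business key (here: lowercased email)."""
    xref = {}
    for rec in records:
        xref[(rec["source"], rec["source_id"])] = rec["email"].lower()
    return xref

def harmonize(records):
    """Merge records sharing a golden ID into one golden record,
    filling each field from the highest-priority source that has it."""
    by_golden = {}
    for rec in records:
        by_golden.setdefault(rec["email"].lower(), []).append(rec)
    golden = {}
    for gid, recs in by_golden.items():
        recs.sort(key=lambda r: SOURCE_PRIORITY.index(r["source"]))
        merged = {"golden_id": gid}
        for rec in recs:
            for field, value in rec.items():
                if field in ("source", "source_id", "email"):
                    continue
                merged.setdefault(field, value)  # first (highest-priority) wins
        golden[gid] = merged
    return golden

records = [
    {"source": "salesforce", "source_id": "SF-1", "email": "A@x.com", "phone": "555-0100"},
    {"source": "oracle", "source_id": "O-9", "email": "a@x.com", "name": "Alice"},
    {"source": "file", "source_id": "F-3", "email": "a@x.com", "name": "A. Smith", "phone": "555-9999"},
]

xref = build_xref(records)      # XREF: source keys -> golden ID
golden = harmonize(records)     # one golden record per golden ID
```

In a PySpark implementation the same logic would typically be a `groupBy` on the business key followed by a priority-ordered window or aggregation, with the XREF persisted as its own Iceberg table.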
Cognizant Technology Solutions
Feb 2022 – July 2022
• Internship training domain: Big Data and PySpark
• Trained on Hadoop, Hadoop YARN, Pig, Hive, HBase, Apache Spark, Apache Kafka, Apache Flume, Apache Sqoop, Apache NiFi, Data Warehouse Fundamentals, ZooKeeper, Scala, PySpark, and ETL
• Performed different operations on various datasets using Scala and PySpark
• Studied and implemented Apache Spark modules such as Spark SQL, Spark Streaming, and MLlib
• Used Pig and Hive to perform various operations on datasets