Collaborate with product managers, data scientists, data analysts, and engineers to define requirements and data specifications.
Plan, design, build, test and deploy data warehouse and data mart solutions.
Lead small to medium-sized projects, solving data problems through the documentation, design, and creation of ETL jobs and data marts.
Work to increase the usage and value of the data warehouse and ensure the integrity of the data delivered.
Develop and implement standards, and promote their use throughout the warehouse.
Develop, deploy, and maintain data processing pipelines using cloud technologies such as AWS, Kubernetes, Airflow, Redshift, Databricks, and EMR.
Define and manage overall schedule and availability for a variety of data sets.
Work closely with other engineers to enhance infrastructure and improve reliability and efficiency.
Make smart engineering and product decisions based on data analysis and collaboration.
Act as an in-house data expert and make recommendations regarding standards for code quality and timeliness.
Architect cloud-based data pipeline solutions to meet stakeholder needs.
Bachelor’s degree in analytics, engineering, math, computer science, information technology or related discipline.
6+ years of professional experience in the big data space.
6+ years of experience engineering data pipelines on large-scale data sets using big data technologies (e.g., Spark, Flink).
Expert knowledge of complex PySpark, SQL, and dbt development for ETL, with experience processing extremely large datasets.
Expert in applying SCD types on an S3 data lake using Databricks/Delta Lake.
Experience with data modeling principles and data cataloging.
Experience with Airflow or a similar job scheduler.
Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions.
Deep familiarity with AWS services (S3, EventBridge, Kinesis, Glue, EMR, Lambda).
Experience with data warehouse platforms such as Redshift, Databricks, BigQuery, Snowflake.
Ability to quickly learn complex domains and new technologies.
Innately curious and organized, with the drive to analyze data to identify deliverables, anomalies, and gaps, and to propose solutions that address these findings.
Thrives in a fast-paced startup environment.
Experience with customer data platform tools such as Segment.
Experience with data streaming technologies such as Kafka.
Experience using Jira, GitHub, Docker, CodeFresh, Terraform.
Experience contributing to full lifecycle deployments with a focus on testing and quality.
Experience with data quality processes, including data quality checks, validations, and the definition and measurement of data quality metrics.
AWS/Kafka/Databricks or similar certifications.
At GoodRx, pay ranges are determined based on work locations and may vary based on where the successful candidate is hired. The pay ranges below are shown as a guideline, and the successful candidate’s starting pay will be determined based on job-related skills, experience, qualifications, and other relevant business and organizational factors. These pay zones may be modified in the future. Please contact your recruiter for additional information.