Building a next-generation, metadata- and automation-driven data experience for GSK’s scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics”
Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent
Aggressively engineering our data at scale, as one unified asset, to unlock the value of our unique collection of data and predictions in real time
Builds modular code, libraries, and services using modern data engineering tools (e.g., Python/Spark, Kafka, Storm) and orchestration tools (e.g., Google Cloud Workflows, Cloud Composer/Airflow)
Produces well-engineered software, including appropriate automated test suites and technical documentation
Applies platform abstractions consistently to maintain quality and uniformity in logging and lineage
Adheres to the QMS framework and CI/CD best practices
Provides L3 support for existing tools, pipelines, and services
Bachelor's degree and 2+ years of data engineering experience
Cloud experience (e.g., AWS, Google Cloud, Azure, Kubernetes)
Experience in automated testing and design
Experience with DevOps-forward ways of working
Experience with at least one common programming language (e.g., Python, Scala, Java)
Experience with data modelling, database concepts, and SQL