Data Platform Engineer

Site Name: San Francisco, Cambridge 300 Technology Square, London The Stanley Building, Upper Providence, USA – Washington – Seattle
Posted Date: Jul 14 2023

This role can be office-based within the greater Seattle, WA area. The specific location of the Seattle office will be determined in the near future.

At GSK, we want to supercharge our data capability to better understand our patients and accelerate our ability to discover vaccines and medicines. The Onyx Research Data Platform organization represents a major investment by GSK R&D and Digital & Tech, designed to deliver a step-change in our ability to leverage data, knowledge, and prediction to find new medicines.

We are a full-stack shop consisting of product and portfolio leadership, data engineering, infrastructure and DevOps, data / metadata / knowledge platforms, and AI/ML and analysis platforms, all geared toward:

Building a next-generation, metadata- and automation-driven data experience for GSK’s scientists, engineers, and decision-makers, increasing productivity and reducing time spent on “data mechanics.”
Providing best-in-class AI/ML and data analysis environments to accelerate our predictive capabilities and attract top-tier talent.
Aggressively engineering our data at scale, as one unified asset, to unlock the value of our unique collection of data and predictions in real-time.
Automation of end-to-end data flows: faster and more reliable ingestion of high-throughput data in genetics, genomics, and multi-omics, to extract value from investments in new technology (instrument to analysis-ready data in <12h).
Enabling governance by design of external and internal data, with engineered practical solutions for controlled use and monitoring.
Innovative disease-specific and domain-expert-specific data products, to enable computational scientists and their research unit collaborators to get to key insights faster, leading to faster biopharmaceutical development cycles.
Supporting end-to-end code traceability and data provenance: increasing assurance of data integrity through automation and integration.
Improving engineering efficiency: extensible, reusable, scalable, updateable, maintainable, virtualized, traceable data and code, driven by data engineering innovation and better resource utilization.

We are looking for a skilled and experienced Data Platform Engineer II to join our growing team. Data Platform Engineers take full ownership of delivering high-performing, high-impact data platforms as products and services, from a description of a problem that customer Data Engineers are trying to solve all the way through to final delivery (and ongoing monitoring and operations). They are standard bearers for software engineering and quality coding practices within the team and are expected to mentor more junior engineers; they may even coordinate the work of more junior engineers on a large project. They devise useful metrics to ensure their services are meeting customer demand and having an impact, and iterate to deliver and improve on those metrics in an agile fashion.

The Data Platform team builds and manages reusable components and architectures designed to make it both fast and easy to build robust, scalable, production-grade data products and services in the challenging biomedical data space.

A Data Platform Engineer II is a technical individual contributor, building modern, cloud-native systems for standardizing and templatizing data engineering, such as:

Standardized physical storage and search / indexing systems
Schema management (data + metadata + versioning + provenance + governance)
API semantics and ontology management
Standard API architectures
Kafka + standard streaming semantics
Standard components for publishing data to file-based, relational, and other sorts of data stores
Metadata systems
Tooling for QA / evaluation
Etc.

A Data Platform Engineer II knows the metrics desired for their tools and services and iterates to deliver and improve on those metrics in an agile fashion.

Additional responsibilities include:

Given a well-specified data framework problem, implement end-to-end solutions using appropriate programming languages (e.g., Python, Scala, or Go), open-source tools (e.g., Spark, Elasticsearch, …), and cloud vendor-provided tools (e.g., Amazon S3).
Leverage tools provided by Tech (e.g., infrastructure as code, cloud Ops, DevOps, logging / alerting, …) in delivery of solutions.
Write proper documentation in code as well as in wikis/other documentation systems.
Write fantastic code along with proper unit, functional, and integration tests for code and services to ensure quality.
Stay up to date with developments in the open-source community around data engineering, data science, and similar tooling.

Why You?

Basic Qualifications:

We are looking for professionals with these required skills to achieve our goals:

Bachelor’s with 5+ years’ direct industry experience (not including internships), or Master’s with 3 years’ direct industry experience, in computer science with a focus on Data Engineering, DataOps, DevOps, MLOps, or Software Engineering
Experience with common distributed data tools in a production setting (Spark, Kafka, Hive, Presto, etc.)
Experience with specialized data architecture (e.g., data lake, lake house, data fabric, data mesh, optimizing physical layout for access patterns)
Experience with public cloud providers like AWS, Azure and GCP
Experience with search / indexing systems (e.g., Elasticsearch)

Preferred Qualifications:

If you have the following characteristics, it would be a plus:

Experience building and designing a DevOps-first way of working
Demonstrated excellence writing production Python, Java, Scala, Go, and/or C#/C++
Practical experience with agile software development and DevOps-forward ways of working
Demonstrated experience building reusable components on top of the CNCF ecosystem including platforms like Kubernetes (or similar ecosystem)
Metrics-first mindset

#LI-GSK #GSKOnyx #GSKDSDE2022 #GSKDEN2022

Why GSK?

GSK offers a competitive compensation package inclusive of the following: competitive base salary, annual bonus based on company performance, access to healthcare and wellbeing programs, retirement savings program, paid time off, and employee recognition programs which reward exceptional achievements.

The salary range for this role is: $0 to $0

GSK is a global biopharma company with a special purpose – to unite science, technology and talent to get ahead of disease together – so we can positively impact the health of billions of people and deliver stronger, more sustainable shareholder returns – as an organisation where people can thrive. Getting ahead means preventing disease as well as treating it, and we aim to positively impact the health of 2.5 billion people by the end of 2030.

Our success absolutely depends on our people. While getting ahead of disease together is about our ambition for patients and shareholders, it’s also about making GSK a place where people can thrive. We want GSK to be a workplace where everyone can feel a sense of belonging and thrive as set out in our Equal and Inclusive Treatment of Employees policy. We’re committed to being more proactive at all levels so that our workforce reflects the communities we work and hire in, and our GSK leadership reflects our GSK workforce.

If you require an accommodation or other assistance to apply for a job at GSK, please contact the GSK Service Centre at 1-877-694-7547 (US Toll Free) or +1 801 567 5155 (outside US).

GSK is an Equal Opportunity Employer and, in the US, we adhere to Affirmative Action principles. This ensures that all qualified applicants will receive equal consideration for employment without regard to race, color, national origin, religion, sex, pregnancy, marital status, sexual orientation, gender identity/expression, age, disability, genetic information, military service, covered/protected veteran status or any other federal, state or local protected class.

Important notice to Employment Businesses/Agencies

GSK does not accept referrals from employment businesses and/or employment agencies in respect of the vacancies posted on this site. All employment businesses/agencies are required to contact GSK’s commercial and general procurement/human resources department to obtain prior written authorization before referring any candidates to GSK. The obtaining of prior written authorization is a condition precedent to any agreement (verbal or written) between the employment business/agency and GSK. In the absence of such written authorization being obtained, any actions undertaken by the employment business/agency shall be deemed to have been performed without the consent or contractual agreement of GSK. GSK shall therefore not be liable for any fees arising from such actions or any fees arising from any referrals by employment businesses/agencies in respect of the vacancies posted on this site.

Please note that if you are a US Licensed Healthcare Professional or Healthcare Professional as defined by the laws of the state issuing your license, GSK may be required to capture and report expenses GSK incurs, on your behalf, in the event you are afforded an interview for employment. This capture of applicable transfers of value is necessary to ensure GSK’s compliance with all federal and state US Transparency requirements. For more information, please visit GSK’s Transparency Reporting For the Record site.