Company: Kuraray America, Inc.
Location: San Jose
Closing Date: 06/11/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description
Location: San Francisco, Seattle, LA or PST
You will join the Enterprise Data Platform team to enable timely, effective, and safe sharing of data with multiple engineering, operations, and business teams for building world-class data products.
Responsibilities
- Build data ingestion and processing pipelines to enable data analytics and data science use cases in areas of digital commerce, service operations, charging, reliability, finance, capex, warranty, customer service, and others.
- Build a modular set of data services using Python, SQL, AWS Glue, Lambda, API Gateway, Kafka, data build tool (dbt), Apache Spark on EMR, among others.
- Build automated unit and integration testing pipelines using frameworks like PySpark.
- Create and manage CI/CD pipelines with GitLab CI and AWS CodePipeline/CodeDeploy.
- Automate and schedule jobs using Amazon Managed Workflows for Apache Airflow (MWAA).
- Build the ODS and reporting schemas and load the data into AWS Redshift or Snowflake.
- Design and build data quality management services with Apache Deequ and data observability tools like Splunk, DataDog, CloudWatch.
- Provide a variety of query services with REST, Athena/Presto, server-sent events.
- Configure and set up enterprise data lineage, metadata management, and data catalog support using tools like Collibra/Alation.
- Assist data scientists within the data engineering team, as well as other software engineering teams, with data cleansing, wrangling, and feature engineering.
- Ensure green builds for deployment and work with program management and senior leads to burn down planned deliverables in a sprint cycle.
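To give candidates a feel for the data-quality work described above, here is a minimal, illustrative sketch (not part of the posting) of the kind of completeness and uniqueness checks that tools like Apache Deequ automate at scale. It is plain Python over lists of dicts for brevity; in production these checks would run over Spark DataFrames, and the check names and thresholds shown are hypothetical.

```python
from collections import Counter

def completeness(rows, column):
    """Fraction of rows where `column` is present and not None."""
    if not rows:
        return 0.0
    non_null = sum(1 for r in rows if r.get(column) is not None)
    return non_null / len(rows)

def uniqueness(rows, column):
    """Fraction of non-null values in `column` that occur exactly once."""
    values = [r.get(column) for r in rows if r.get(column) is not None]
    if not values:
        return 0.0
    counts = Counter(values)
    unique = sum(1 for c in counts.values() if c == 1)
    return unique / len(values)

def run_checks(rows):
    """Evaluate a Deequ-style suite of checks; returns check name -> pass/fail."""
    return {
        "id_complete": completeness(rows, "id") == 1.0,
        "id_unique": uniqueness(rows, "id") == 1.0,
        "amount_mostly_complete": completeness(rows, "amount") >= 0.9,
    }
```

For example, a batch with a duplicated `id` would fail the `id_unique` check while still passing `id_complete`, which is the kind of signal that feeds the observability tools (Splunk, DataDog, CloudWatch) mentioned above.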
Qualifications
- 5+ years of experience building data and analytics platforms using AWS Cloud, Python, and SQL.
- Knowledge of AWS technologies, specifically MSK, EMR, Athena, Glue, Lambda, and API Gateway, as well as Python and SQL, is a must.
- Knowledge of modern data tools like dbt (data build tool) and Airflow orchestration is highly desired.
- Ability to assist SQL analysts and Tableau developers on business teams in creating the right set of materialized views in a SQL data warehouse like Redshift/Snowflake.
- Knowledge of automation and CI/CD best practices.
- Familiarity with machine learning and data science ecosystems, especially AWS SageMaker and Databricks, is highly preferred.
- Hands-on experience building and maintaining production data applications, including current experience with both relational and distributed columnar data stores.
- Deep experience using SQL, Python, and Spark. Hands-on experience with big data technologies (e.g. Redshift, Athena, Glue, EMR, Kinesis, Step Functions, or equivalents in other web services).
- Familiarity with time-series databases, data streaming applications, Kafka, Flink, and more is a plus.
- Familiarity with modern data science and product analytics tools and techniques such as R, machine learning, and advanced statistics is a plus.