Job Title: Sr. Big Data Engineer
Location: San Francisco, CA (open to remote)
Duration: 6 months, with likely extension (multiple consultants on this team have been there 2+ years)
Interview: 2 rounds (1st round: 1-hour technical video interview; 2nd round: 30-minute personality/fit call)
Job Description:
Looking for a strong Big Data Engineer with experience in Spark, Scala, SQL, and Azure.
The Architecture and Platform Organizations are looking for an experienced Big Data Engineer to build analytics and ML platforms that collect, store, process, and analyze large datasets spread across the organization. The platform will provide frameworks for quickly rolling out new data analysis for data-driven products and microservices.
The platform will also enable machine/deep learning infrastructure that operationalizes data science models for broad consumption. You'll partner end-to-end with Product Managers and Data Scientists to understand customer requirements and design prototypes, bringing ideas into production.
You need to be an expert in design, coding, and scripting. You'll write high-quality code consistent with our standards, create new standards as necessary, and demonstrate correctness with pragmatic automated tests. You'll review other engineers' work to improve quality and engineering practices, and participate in continuing-education programs to grow your skills. You'll serve as a member of an Agile engineering team and participate in the team's workflow.
Responsibilities:
- Develop distributed, scalable, and reliable data pipelines that ingest and process data at scale, in both batch and real time (see the streaming sketch after this list).
- Create Spark jobs on Azure, using Azure tools for scheduling and workflow management.
- Work with the Marketing and Supply Chain teams on a personalization initiative and manage data feeds to/from third-party vendors.
- Engage in batch processing and real-time streaming projects.
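To make the batch/real-time responsibility above concrete, here is a minimal Spark Structured Streaming sketch. The Kafka broker address, topic name, and event schema are hypothetical placeholders, not details of this role.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object OrderEventStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("order-event-stream")
      .getOrCreate()
    import spark.implicits._

    // Read raw events from Kafka (broker and topic are hypothetical).
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker-1:9092")
      .option("subscribe", "order-events")
      .load()
      .selectExpr("CAST(value AS STRING) AS json")

    // Pull one field out of the JSON payload and keep a running count per type.
    val counts = events
      .select(get_json_object($"json", "$.eventType").as("eventType"))
      .groupBy($"eventType")
      .count()

    // Write running counts to the console; a real job would target a
    // durable sink with checkpointing enabled.
    counts.writeStream
      .outputMode("complete")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```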
Qualifications:
- 5-8 years of experience as a Software Engineer.
- Strong knowledge of programming languages and tools including Java, Scala, Spark, SQL, Hive, and Elasticsearch.
- Experience with streaming technologies such as Spark Streaming, Flink, or Apache Beam.
- Experience with Kafka is a plus.
- Working experience with various NoSQL databases such as Cassandra, HBase, MongoDB, and/or Couchbase.
- Prior machine learning or deep learning knowledge is a plus (can be learned on the job).
Key Skills:
- Basic transformations (filter, map, groupBy) and actions (count) using the DataFrame API; see the first sketch below.
- Iterating over Scala collections; see the second sketch below.
- Spark parallelism: partitioned data ingestion from an external RDBMS and narrow, per-partition transformations; see the third sketch below.
- Data warehousing: dimensions and facts, and when to do a full load vs. an incremental load; see the fourth sketch below.
- Basic software engineering principles.
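First sketch: basic DataFrame transformations and actions. A minimal, self-contained example; the sales rows and column names are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object DataFrameBasics {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dataframe-basics")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sales rows, just for illustration.
    val sales = Seq(
      ("north", "widget", 3, 9.99),
      ("south", "widget", 1, 9.99),
      ("north", "gadget", 2, 24.50)
    ).toDF("region", "product", "qty", "price")

    // Transformations are lazy: filter, derive a column, group and aggregate.
    val revenueByRegion = sales
      .filter($"qty" > 1)
      .withColumn("revenue", $"qty" * $"price")
      .groupBy($"region")
      .agg(sum($"revenue").as("total_revenue"))

    // Actions are eager: count and show trigger actual execution.
    println(s"rows: ${revenueByRegion.count()}")
    revenueByRegion.show()

    spark.stop()
  }
}
```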
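Second sketch: idiomatic iteration over plain Scala collections, independent of Spark; the values are arbitrary.

```scala
object CollectionBasics {
  def main(args: Array[String]): Unit = {
    val quantities = List(3, 1, 2, 5)

    // map / filter / foldLeft are the usual ways to iterate and transform.
    val doubled = quantities.map(_ * 2)         // List(6, 2, 4, 10)
    val large   = quantities.filter(_ > 2)      // List(3, 5)
    val total   = quantities.foldLeft(0)(_ + _) // 11

    // A for-comprehension desugars to map/flatMap/withFilter.
    val squares = for (q <- quantities if q > 1) yield (q, q * q)
    // List((3,9), (2,4), (5,25))

    println(s"$doubled $large $total $squares")
  }
}
```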
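Third sketch: parallel ingestion from an external RDBMS via a partitioned JDBC read, followed by a narrow ("local", shuffle-free) transformation. The connection URL, table, column names, and partition bounds are assumptions, not details of this role.

```scala
import org.apache.spark.sql.SparkSession

object JdbcIngest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("jdbc-ingest")
      .getOrCreate()
    import spark.implicits._

    // Partitioned JDBC read: Spark issues numPartitions queries in parallel,
    // each covering a slice of the partition column's range.
    val orders = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://db-host:5432/sales") // hypothetical
      .option("dbtable", "orders")
      .option("user", "etl_user")
      .option("password", sys.env("DB_PASSWORD"))
      .option("partitionColumn", "order_id") // numeric column to split on
      .option("lowerBound", "1")
      .option("upperBound", "10000000")
      .option("numPartitions", "16")         // 16 concurrent connections
      .load()

    // Narrow transformations (filter, select) run per partition, no shuffle.
    val completed = orders
      .filter($"status" === "COMPLETE")
      .select("order_id", "customer_id", "total")

    completed.write.mode("overwrite").parquet("/staging/orders_completed")
    spark.stop()
  }
}
```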
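Fourth sketch: full load of a small dimension table vs. incremental load of a large fact table keyed off a high-water mark. The paths, table names, and watermark value are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object WarehouseLoads {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("warehouse-loads")
      .getOrCreate()

    // Full load: dimensions are usually small, so overwrite them wholesale.
    spark.read.parquet("/staging/dim_product")
      .write.mode("overwrite").parquet("/warehouse/dim_product")

    // Incremental load: facts are large, so append only rows changed since
    // the last run. A real job would read and update this watermark from a
    // control table instead of hard-coding it.
    val lastWatermark = "2024-01-01 00:00:00"
    spark.read.parquet("/staging/fact_sales")
      .filter(col("updated_at") > lit(lastWatermark))
      .write.mode("append").parquet("/warehouse/fact_sales")

    spark.stop()
  }
}
```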