Role Overview
Data Research Engineers at Citadel Securities are responsible for curating large-scale structured and unstructured data and for assimilating data across a multitude of sources. They focus on translating high-level conceptual queries into efficient back-end queries over extremely large data collections. Our data science is supported by state-of-the-art, cloud-based services that scale out. These advanced queries are crucial for enriching alpha signals for our multi-billion-dollar Systematic Business and research function, enabling us to maintain a competitive edge.
Our team of talented Data Research Engineers works hand-in-hand with quantitative research (QR) teams to design the most insightful queries. They also work closely with business operations and QRs to identify and evaluate the most promising datasets from around the world, available in numerous modalities. We are looking for individuals passionate about driving business impact through data!
Responsibilities:
- Act as data owner for several datasets, each with its own unique schema and delivery frequency
- Partner with researchers to produce high-value datasets
- Design, create, automate, and maintain custom data pipelines (a minimal sketch follows this list) that:
  - Extract the raw data
  - Clean and normalize the data
  - Translate high-level market research concepts into scalable processes that further transform the data
  - Handle all edge cases
  - Develop high-performance schemas and queries
  - Set up data-quality checks and alerts to flag bad or anomalous data
  - Load the processed dataset into a format easily consumed by quantitative researchers and models
- Develop tooling to facilitate the development and monitoring of custom data pipelines
- Set up analytical infrastructure to facilitate exploration and visualization of datasets
- Develop large-scale, cloud-based applications delivered as a service
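For concreteness, the sketch below shows one minimal way such a pipeline could be structured, assuming Python and pandas; the file names, column names, and check threshold are hypothetical and purely illustrative, not part of this role description.

```python
# Minimal, hypothetical sketch of a custom data pipeline:
# extract -> clean/normalize -> data-quality checks -> load.
# Dataset, columns, and thresholds are illustrative only.
import pandas as pd


def extract(path: str) -> pd.DataFrame:
    """Extract the raw data from a vendor-delivered CSV file."""
    return pd.read_csv(path)


def clean(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean and normalize: drop duplicates, parse dates, standardize keys."""
    df = raw.drop_duplicates()
    df["as_of_date"] = pd.to_datetime(df["as_of_date"], errors="coerce")
    df["ticker"] = df["ticker"].str.strip().str.upper()
    return df.dropna(subset=["as_of_date", "ticker"])


def check(df: pd.DataFrame) -> None:
    """Data-quality checks: raise (and alert) when the data looks bad."""
    if df.empty:
        raise ValueError("Data check failed: no rows after cleaning")
    null_rate = df["value"].isna().mean()
    if null_rate > 0.05:  # hypothetical tolerance for missing values
        raise ValueError(f"Data check failed: {null_rate:.1%} null values")


def load(df: pd.DataFrame, out_path: str) -> None:
    """Load into a columnar format easily consumed by researchers and models."""
    df.to_parquet(out_path, index=False)


def run_pipeline(in_path: str, out_path: str) -> None:
    df = clean(extract(in_path))
    check(df)
    load(df, out_path)


if __name__ == "__main__":
    run_pipeline("vendor_drop.csv", "curated_dataset.parquet")
```

In practice, steps like these would run under a scheduler, with the data checks wired into alerting so that bad deliveries are caught before researchers consume them.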
Qualifications:
Required:
- Demonstrated experience driving meaningful business outcomes through data engineering
- Prior ownership of a data project, or at least a large part of one
- Strong attention to detail
- Good analytical and quantitative abilities
- Demonstrated ability to quickly learn new technologies and skills
- Experience with ETL development
Preferred:
- Experience with data manipulation, query design, and query optimization
- Experience in time series forecasting and machine learning techniques
- Strong data science skills
- Strong coding skills: proficiency in Python, SQL databases, cloud platforms, schedulers, containers, CI/CD, and software packaging
- Strong ability to communicate with other stakeholders (e.g., data vendors, QRs, etc.)
- Experience with dbt (Data Build Tool)
- Education: BS degree in Computer Science, Mathematics, Computer Engineering, or a related science or engineering discipline