Lead Data Engineer
The Fixed Income Data team is responsible for monetizing data generated by Citi's fixed income businesses and building data analytics tools/services that provide actionable insights with direct impact on revenue.
The Lead Data Engineer will be responsible for designing, implementing, and optimizing distributed data processing jobs to handle large-scale data in Hadoop Distributed File System (HDFS) and S3 Storage using Apache Kafka, Flink Java and Flink SQL, Apache Spark and Python. This role requires a deep understanding of data engineering principles, proficiency in Java and Python, and hands-on experience with the Kafka and S3 ecosystems. The developer will collaborate with data engineers, analysts, and business stakeholders to process and transform data and to drive insights and data-driven decisions.
Responsibilities:
- Subject Matter Expert (SME) in Finance and Risk data with experience in data processing jobs to handle large-scale data in Hadoop Distributed File System (HDFS), S3 Storage using Apache Spark, Apache Kafka Streaming, Apache Flink Java, Flink SQL and Python.
- Good programming skills in Java, SQL and Python
- Distribute data to downstream systems by generating feeds or publishing to Kafka topics
- Required to support situations in which end user consultation is required to identify system function specifications and incorporate them into overall system design and delivery. Additionally, utilize comprehensive knowledge of multiple areas within technology to achieve technological objectives.
- Expected to effectively communicate risks to the business owners so that they can make informed decisions.
- Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
- Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
- Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
- Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
- Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
- Ensure essential procedures are followed and help define operating standards and processes
- Serve as advisor or coach to new or lower level analysts
- Has the ability to operate with a limited level of direct supervision.
- Can exercise independence of judgement and autonomy.
- Acts as SME to senior stakeholders and/or other team members.
Qualifications:
- 8+ years of relevant experience in Hadoop Distributed File System (HDFS) using Apache Spark, Python, Java and SQL
- 2+ years of relevant experience in S3 Storage using Apache Kafka, Flink Java and Flink SQL, including processing data with minimal latency, monitoring and optimizing the performance of Kafka clusters, troubleshooting and resolving issues related to Kafka and data processing, and implementing best practices for Kafka architecture and operations
- Experience in systems analysis and programming of software applications
- Experience in managing and implementing successful projects
- Working knowledge of consulting/project management techniques/methods
- Ability to work under pressure and manage deadlines or unexpected changes in expectations or requirements
- Strong communication skills and attention to detail and accuracy.
- Demonstrated leadership skills.
- Basic knowledge of industry practices and standards
- Consistently demonstrates clear and concise written and verbal communication
Education:
- Bachelor's degree/University degree or equivalent experience
- Prior financial industry experience will be a plus
This job description provides a high-level review of the types of work performed. Other job-related duties may be assigned as required.
Additional Responsibilities:
Data Processing and Transformation:
- Design and implement big data warehouse applications to process and transform large datasets
- Develop ETL pipelines with Apache Kafka, Flink, Spark, and Python for data ingestion, cleaning, aggregation, and transformation.
Data Distribution:
- Send data to downstream systems by generating feeds or publishing to Kafka topics
Performance Optimization:
- Optimize ETL jobs for efficiency, reducing run time and resource usage.
- Fine-tune memory management, caching, and partitioning strategies for optimal performance
Data Engineering with Hadoop, Spark, Kafka, Flink:
- Load data from different sources into S3 Storage, ensuring data accuracy and integrity.
Testing and debugging:
- Troubleshoot and debug Kafka job failures; monitor job logs and the Kafka UI Manager to identify issues.
Coding standard adherence:
- Identify and address coding vulnerabilities. Enforce the coding standard to eliminate code vulnerabilities.
- Adhere to big data best practices, including small-file elimination, Hive SRE scan success, and archival implementation for ideal architecture utilization.
------------------------------------------------------
Job Family Group:
Technology
------------------------------------------------------
Job Family:
Applications Development
------------------------------------------------------
Time Type:
Full time
------------------------------------------------------
Citi is an equal opportunity employer, and qualified candidates will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other characteristic protected by law.
If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.
View Citi’s EEO Policy Statement and the Know Your Rights poster.