Job Title: Lead Data Engineer - Leading Global Analytics Firm - Banking & Finance
Contract Type: Permanent
Location: Shanghai
December 04, 2018

My client, one of the world's leading analytics firms in the finance industry, is currently seeking a Lead Data Engineer. You will be part of a team that works hand-in-hand with strategy consulting teams, adding expertise where a good solution requires wrangling a wide variety of data sources, including high-volume, high-velocity and unstructured data, and applying specialised data science and machine learning techniques. You will be responsible for the following:

  • Working in a fast-paced and expansive environment, build models, coalesce data sources, interpret results, and build services (and occasionally products) that enhance the ability to derive value from data and upgrade decision-making capabilities
  • Act as the key link between IT, business, software engineers, and data scientists, working directly with clients and project teams
  • Architect the move from high-performance SQL data marts for batch analytics to new data stores, cluster-based architectures and streaming analytics, scaling beyond current terabyte-level capabilities
  • Tune high-performance data pipelines to rapidly deploy some of the latest machine learning algorithms/frameworks and other advanced analytical techniques at scale

Requirements:

  • A technical background in computer science, data science, machine learning, artificial intelligence, statistics or another quantitative/computational science
  • Direct experience building and deploying complex production systems that implement modern data science methods robustly and at scale
  • Ability to create and implement data engineering solutions using modern software engineering practices
  • Experience deploying and configuring cloud environments in AWS and Alibaba Cloud, including in-depth experience with Firewall challenges 
  • Expertise working with Pandas, Scikit-Learn, Matplotlib, TensorFlow, Jupyter and other Python data tools
  • Experience with Spark (Scala and PySpark), HDFS, Hive, Kafka and other high-volume data tools
  • Experience with relational databases such as SQL Server, Oracle and Postgres
  • Experience with NoSQL storage tools such as MongoDB, Cassandra, Neo4j and Elasticsearch
  • Fluency with cluster computing environments and their associated technologies
  • Deep understanding of how to balance computational considerations with theoretical properties of potential solutions
  • Demonstrated ability to deliver technical projects with a team, often working under tight time constraints to deliver value
  • Must be fluent in English and Mandarin
