
Senior Data Engineer

This vacancy is no longer active.
Vergent Technology Solutions
Type: Remote work
Job Description
Job Purpose

The Senior Data Engineer will oversee junior data engineering activities and help build the organization's data collection systems and processing pipelines. The role also covers the infrastructure, tools, and frameworks used to deliver end-to-end solutions to business problems through high-performing data infrastructure.

The role is responsible for expanding and optimising the organization's data and data pipeline architecture, and for improving data flow and collection to support data initiatives.

Responsibilities:
  • Provide periodic, timely project updates to scrum masters and project managers
  • Work with the assigned data scientist and lead the promotion of developed models to production
  • Create data pipelines for ML models
  • Deliver projects within the allocated timeline
  • Lead data extraction from source systems
  • Support EDA, data cleaning, and documentation of created models
  • Engage the central team to ensure established best practices are followed
  • Develop pipelines that stay closely aligned with the pipelines already developed for Kenya
  • Engage business stakeholders to draw on domain knowledge when creating fit-for-purpose pipelines
  • Ensure data engineering pipeline code is refactored
  • Build more than one pipeline at a time
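To illustrate the kind of pipeline work the responsibilities above describe, here is a minimal sketch of an extract → clean → feature-build step feeding a downstream ML model. All record fields and function names are hypothetical, not part of the posting.

```python
# Minimal extract -> clean -> feature-build pipeline sketch.
# All names (customer_id, amount, etc.) are hypothetical.

def extract(records):
    """Simulate pulling raw rows from a source system."""
    return list(records)

def clean(rows):
    """Drop rows with missing values and normalise field types."""
    return [
        {"customer_id": int(r["customer_id"]), "amount": float(r["amount"])}
        for r in rows
        if r.get("customer_id") is not None and r.get("amount") is not None
    ]

def build_features(rows):
    """Aggregate cleaned rows into per-customer model features."""
    totals = {}
    for r in rows:
        totals[r["customer_id"]] = totals.get(r["customer_id"], 0.0) + r["amount"]
    return totals

raw = [
    {"customer_id": "1", "amount": "10.5"},
    {"customer_id": "1", "amount": "4.5"},
    {"customer_id": "2", "amount": None},  # dropped by clean()
]
features = build_features(clean(extract(raw)))
print(features)  # {1: 15.0}
```

In a production setting each stage would typically be a separate, independently testable task so it can be scheduled and retried by a workflow manager.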
Qualifications
  • Strong analytic skills related to working with unstructured datasets.
  • Experience building processes supporting data transformation, data structures, metadata, dependency management, and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Must have experience working with data scientists.
  • We are looking for a candidate with 8–10 years of experience. They should have experience with the following software/tools:
  • Big data tools: Hadoop, Spark, Kafka, etc.
  • Relational SQL and NoSQL databases, including Postgres and Cassandra.
  • Data pipeline and workflow management tools: Azkaban, Luigi, Airflow, etc.
  • AWS cloud services: EC2, EMR, RDS, Redshift.
  • Stream-processing systems: Storm, Spark-Streaming, etc.
  • Object-oriented/functional scripting languages: Python, Java, C++, Scala, etc.
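The workflow-management tools listed above (Azkaban, Luigi, Airflow) all schedule tasks according to a dependency graph. A toy illustration of that idea, using only the Python standard library (task names are hypothetical):

```python
from graphlib import TopologicalSorter

# Toy sketch of the dependency scheduling behind workflow tools such as
# Airflow or Luigi: a task runs only after all of its upstreams finish.
dag = {
    "extract": set(),            # no upstream dependencies
    "clean": {"extract"},        # runs after extract
    "load": {"clean"},           # runs after clean
    "report": {"load", "clean"}, # runs after both load and clean
}

order = list(TopologicalSorter(dag).static_order())
print(order)
```

Real schedulers add retries, backfills, and parallel execution on top of this ordering, but the core contract is the same topological sort.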


Send your resume to: [email protected]