- To advance the organization by developing algorithms to build models that uncover connections and make better decisions without human intervention.
- The role of a Lead Data Engineer is to prepare data by fetching it from different channels and standardizing it so that it can be used easily.
- Researching and testing new technologies
- Collaborating with other stakeholders
- Monitoring and overseeing the company's data
- Managing users and user roles
- Leading the team
- Initiating projects and plans
- Detecting, reporting, and correcting errors
- Develops large-scale data structures and pipelines to collect, organize, and standardize data that helps generate insights and address reporting needs.
- Writes ETL processes, designs database systems, and develops tools for real-time and offline analytic processing.
- Collaborates with the data science team to transform data and integrate algorithms and models into automated processes.
- Uses knowledge of Hadoop architecture and HDFS commands, along with experience designing and optimizing queries, to build data pipelines.
- Uses strong programming skills in Python, Java, or another major language to build robust data pipelines and dynamic systems.
- Builds data marts and data models to support Data Science and other internal customers.
- Integrates data from a variety of sources, ensuring that it adheres to data quality and accessibility standards.
- Analyzes current information technology environments to identify and assess critical capabilities and recommend solutions.
- Experiments with available tools and advises on new tools to determine the optimal solution given the requirements dictated by the model or use case.
- Have a degree in computer engineering,
- Expertise with different types of structured and unstructured databases, such as MySQL, Postgres, and MongoDB,
- Know programming languages such as Java, C++, and Python,
- Know Python libraries, especially pandas and NumPy,
- Know cloud infrastructures such as AWS, Azure, and Google Cloud,
- Know Linux shell scripting,
- Expertise with SQL on platforms such as Oracle, Greenplum, and Teradata,
- Work with data streaming frameworks such as Kafka, NiFi, and Spark Streaming,
- Expertise with big data technologies such as HDFS, Hive, Sqoop, Pig, Hadoop, and Spark.
It's always a good idea to include the benefits the company will provide, such as:
- Flexible hours to give you freedom and increase productivity
- Life insurance for you and your family members
- Work remotely in the comfort of your home
- Free gym membership so you can stay in shape
- Fun and energetic weekly team bonding events