Staff Data Engineer

Permanent
Fully Remote

Only accepting applications from: United States

  • Define and drive software architecture development across different engineering teams.
  • Redesign data-pipeline software that processes over 200 million records daily to improve efficiency and responsiveness to user needs.
  • Drive technical direction for data engineering on product teams, ensuring alignment with business goals and fostering best practices in software development.
  • Develop and maintain the data engineering technical strategy and roadmap for key Demandbase software products.
  • Lead the integration of data pipelines and workflows, delivering business outcomes autonomously.
  • Work with engineering managers, peer engineers, and product managers to ensure seamless execution of technical initiatives.
  • Introduce and advocate for best software engineering practices.
  • Act as a mentor and role model, helping to grow and develop engineering talent within the organization.
  • Work closely with product managers to break down product initiatives into deliverable iterations while balancing technical and business needs.
  • Contribute to code reviews, proofs of concept, and complex system designs when needed.
  • Lead design and implementation of robust data solutions and microservices to meet real-time and batch requirements.
  • Lead the development and optimization of large-scale, distributed microservices and APIs for real-time and batch system needs using Scala/Java/Python.
  • Lead the development and scaling of data pipelines using Spark, incorporating NoSQL and relational databases.
  • Lead data aggregation, warehousing, and processing of at least 200 million records daily.
  • Consume and produce data using event-driven systems like Pulsar and Kafka.
  • Lead automation and streamlining of deployments, ensuring efficient and secure cloud-based workflows.
  • Lead the maintenance and creation of GitLab pipelines to automate build, test, and deployment on AWS using GitLab CI/CD.
  • Lead orchestration of data pipelines, including scheduling, monitoring, and managing high volume data workflows using Astronomer deployed via CI/CD.
  • Use Docker for containerization.
  • Lead development and refinement of data models to maximize performance and meet business objectives.
  • Create, maintain, and review data models to suit business requirements while ensuring efficient solutions.

Experience

  • Bachelor’s degree in Computer Science or Computer Engineering.
  • 5 years of experience developing and optimizing large-scale, distributed microservices and APIs using Scala, Java, and/or Python.
  • 5 years of experience using Spark to develop and scale data pipelines that incorporate NoSQL and relational databases.
  • 5 years of experience setting up automated build, test, and deployment pipelines with CI/CD using GitHub, GitLab, and/or Jenkins.
  • 5 years of experience working with Big Data/Cloud technologies.
  • 5 years of experience performing data modeling.
  • 5 years of experience orchestrating data pipelines using tools such as Astronomer or Airflow.
  • Experience using Docker for containerization.
  • Experience with data aggregation, warehousing, and processing of at least 100 million records daily.

Salary and Perks

Pay: $258K

About Demandbase

Software to help B2B marketing and sales teams reach out to prospects
