As a Data Engineer, you'll be at the heart of our infrastructure, designing and building scalable ETL pipelines and analytics databases that power both internal operations and external customer solutions. You'll collaborate with product managers, data scientists, and UX teams to understand requirements and translate them into robust, cloud-deployed systems. This role offers the opportunity to work on large-scale distributed systems while maintaining code quality, implementing observability, and contributing to a culture of continuous improvement.
In this role, you'll:
- Scale systems and redesign existing infrastructure to handle billions of daily requests and process tens of terabytes of data
- Build ETL data pipelines using Databricks, EMR, Athena, and Glue to enable internal and external customer insights
- Develop frameworks for data quality testing and continuous quality assessment of data vendors
- Collaborate with product managers, data scientists, and UX teams to understand requirements and expose metrics effectively
- Write solid, maintainable code in a 100% cloud-deployed infrastructure with self-healing capabilities
- Implement observability and monitoring solutions to track system health and performance
- Participate in design and code review processes while supporting teammates through on-call responsibilities
- Build integrations with various data vendors and develop frameworks to streamline future integrations
We're looking for candidates who have:
- Bachelor's degree in Computer Science, Mathematics, Engineering, Information Management, Information Technology, or a related field (or 5+ years of equivalent experience); a Master's degree with 3+ years of experience is also acceptable
- 4+ years developing large-scale data processing and ETL pipelines
- 3+ years building data pipelines using EMR, Airflow, Athena, Redshift, PostgreSQL, Snowflake, Kinesis, Lambda, or Databricks
- 4+ years building software using Python or SQL
- 3+ years implementing observability and monitoring tools such as Humio, Datadog, Amazon CloudWatch, or AWS CloudTrail
- Strong communication skills with both technical and non-technical stakeholders
- Experience resolving product-critical issues and a willingness to support teams during off-hours
Nice to have:
- Experience with self-healing infrastructure and auto-scaling systems
- Familiarity with blockchain or cryptocurrency data platforms
- Previous exposure to multiple data platforms and vendor ecosystems
Technologies we use:
Databricks, EMR, Athena, Glue, Redshift, PostgreSQL, Snowflake, Kinesis, Lambda, Airflow, Python, SQL, Humio, Datadog, Amazon CloudWatch, AWS CloudTrail
About Chainalysis
Blockchain technology is powering a growing wave of innovation. Businesses and governments around the world are using blockchains to make banking more efficient, connect with their customers, and investigate criminal cases. As adoption of blockchain technology grows, more and more organizations seek access to all this ecosystem has to offer. That’s where Chainalysis comes in. We provide complete knowledge of what’s happening on blockchains through our data, services, and solutions. With Chainalysis, organizations can navigate blockchains safely and with confidence.
You belong here.
At Chainalysis, we believe that diversity of experience and thought makes us stronger. With customers and employees around the world, we are committed to ensuring our team reflects the unique communities around us, and we keep learning by continually revisiting and reevaluating our culture of diversity.
We encourage applicants of every race, ethnicity, gender and gender expression, age, spirituality, ability, and experience. If you need any accommodations to make our interview process more accessible to you due to a disability, don't hesitate to let us know. We can't wait to meet you.

