Worked on Kamke, an in-house distributed nosql database for doing audience analysis. Our team was responsible for all parts of the data pipeline, from initial ingestion of data from multiple sources and large scale map-reduce jobs to transform the data in our internal format. I worked on changes for all parts of the stack with a focus on infrastructure and the low level database portions.
● Owned deployment infrastructure. Reorganized terraform to utilize modules and other best practices, designed and implemented deployment system around Terraform Enterprise, designed and implemented system for devs to easily setup personal stacks. Built deployment and troubleshooting tooling.
● Worked on efficiency/cost saving projects.
● Implemented query language features based on customer feedback,
including aggregation capabilities and new logic for dealing with time/series
data.
● Implemented tracing using OpenTracing and Jaeger, allowing for easier
tracking of long queries and more detailed analysis of where the choke points
were.
● Assisted with implementing roaring bitmap based data structures, including
adding tests and adding additional calculations
Jan 2019 - Oct 2019
● Primary languages used were Go, Python, and Java
● Kamke was written in go and running on AWS.