Austin, Texas Metropolitan Area
ML Jobs Infra
Designed and implemented K8s data plane and networking infrastructure for OCI’s multi-tenant AI training platform.
Implemented OSS mounts in Kubernetes using Rclone as a sidecar and custom FUSE mounts, enabling ML workloads to checkpoint their distributed batch training jobs.
Built custom multi-homed networking for large-scale training and inference workloads using Multus CNI and contributed to the development of a custom IPAM solution to enable scalable, cloud-native pod networking.
ML Pipelines
Tech lead for ML Pipelines; shipped multiple features since its GA.
Designed and implemented a novel integration between our serverless Spark service and ML Pipelines using Kafka events and a cross-tenant rule that protects event delivery. The integration lets customers make their storage of model artifacts transient through a set of ephemeral tokens, bolstering Oracle’s stance on data protection.