My open projects can be found at
Vector datastores are crucial for developing RAG applications with LLMs.
This resppostirory features code and reference architectures for vector search and RAG applications with MongoDB.
I have experimented with various embedding models and LLMs (openAI, Mistral, LLama)
I created these dockerized stacks to make development and running them easier
Hadoop is very particular about DNS records of servers in the cluster. DNS record mis matches can cause runtime errors.
My hadoop DNS checker utility verifies DNS records of cluster machines.
Spark Job Server allows running Spark jobs with low latency.
Submitted multiple patches and pull requests
Contributed performance patch and document patches to HBase - a distributed noSQL database