2020 Goals: Become a Search Expert
This post is in many ways a journal of what I know and what I need to think about over the course of the year, and I plan to update it accordingly.
Below is a timeline of the efforts I've made this year to become a search expert; I'll update it regularly as new items are added.
This year's updates
My Background / Experience
Part 1: Learning about Search
My first exposure to search was at quotecatalog.com, where I helped build a rich database of quotations, people, and titles. The earliest versions of the site had some Sphinx search indices that powered our typeaheads.
I didn't build the first implementation, but I sure had to maintain and fix it once it failed. And I learned a lot! Eventually I started messing with the indexer and was able to add a few more fields. Before we knew it, we were using Sphinx in a few more places.
Periodically we saw huge spikes in our SQL database load, and after some debugging we traced them to Sphinx. Sphinx is fast and powerful, but it requires you to build the index from one big SQL query, especially if you want many fields indexed. After a heroic effort of reading all the documentation and experimenting a bit more, I'd had enough and was ready to move on to greener pastures.
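For what it's worth, Sphinx does offer a way to soften that one-big-query problem: ranged queries split the indexing pass into many small batches. A sketch of what that looks like in `sphinx.conf` (the table and column names here are illustrative, not our actual schema):

```ini
source quotes_src
{
    type            = mysql
    sql_host        = localhost
    sql_user        = sphinx
    sql_pass        = secret
    sql_db          = quotes

    # Instead of one giant SELECT, fetch rows in steps of 1000 so
    # each batch is a small, cheap query against the database.
    sql_query_range = SELECT MIN(id), MAX(id) FROM quotes
    sql_range_step  = 1000
    sql_query       = \
        SELECT id, body, author, title \
        FROM quotes \
        WHERE id BETWEEN $start AND $end
}
```

Sphinx substitutes `$start` and `$end` for each batch, which spreads the load out over time instead of hammering the database all at once.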
It was at that time I began to experiment with Elasticsearch. It had a great community, much nicer documentation, and some really awesome features, so it became a great choice for our next search engine. Through that process I learned quite a bit about Elasticsearch, and I'm most proud of:
- Adding realtime indexing
- Adding disaster recovery with backups
- Implementing a ton of custom search filters (~70)
- Speed (Elasticsearch was fast!)
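The custom filters were the fun part. In Elasticsearch, an analysis chain like that is declared as JSON in the index settings; here is a minimal sketch built as a Python dict (the index and field names are made up for illustration):

```python
# Minimal sketch of Elasticsearch index settings with one custom
# analyzer: lowercase everything, strip English stopwords, and add
# edge n-grams so partial words still match.
# The names ("quotes", "quote_text", "body") are illustrative.
settings = {
    "settings": {
        "analysis": {
            "filter": {
                "edge_2_20": {
                    "type": "edge_ngram",
                    "min_gram": 2,
                    "max_gram": 20,
                }
            },
            "analyzer": {
                "quote_text": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "stop", "edge_2_20"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "body": {"type": "text", "analyzer": "quote_text"}
        }
    },
}

# With the official Python client this would be applied roughly as:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch("http://localhost:9200")
#   es.indices.create(index="quotes", **settings)
```

Multiply this by ~70 filters and analyzers and you get a sense of how much of "search quality" lives in index configuration rather than query code.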
On the tail of building an Elasticsearch-powered API, several realizations changed the way I think about search and its applications and implementation.
Takeaway 1: There is no reason an Elasticsearch cluster can't be your primary datastore, provided you don't drift too far from your original schema.
Takeaway 2: The cleaner and higher quality the data you put into your search index or database, the better your API becomes, and this compounds over time in both directions.
Part 2: Learning about microservices
In my next role (2018) I joined a newly founded crypto exchange. Building an exchange posed challenges my previous role hadn't: latency was crucial, uptime had to be much higher, and precision took on a whole new meaning. Over the next 7 months we focused heavily on:
- Microservices (NodeJS / eventually Go)
- Apache Kafka / Stream Architectures
- High availability (through Docker / eventually Kubernetes)
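The stream pattern that maps most directly onto search is change-data capture: every write is published as an event, and an indexer consumes the stream and flushes batches into the search engine. Here's a toy sketch of the batching side, in Python rather than the NodeJS/Go we actually used, with an in-memory list standing in for a Kafka topic (all names are illustrative):

```python
from typing import Any

# Stand-in for the search index; a real system would issue a
# bulk request to Elasticsearch or similar here.
indexed: list[dict[str, Any]] = []

def flush_to_index(batch: list[dict[str, Any]]) -> None:
    """Pretend to bulk-write a batch of change events to the index."""
    indexed.extend(batch)

def consume(events, batch_size: int = 3) -> None:
    """Drain a stream of change events, flushing in fixed-size
    batches so the index sees a few large writes instead of many
    small ones (the same reason Kafka consumers poll in batches)."""
    batch: list[dict[str, Any]] = []
    for event in events:
        batch.append(event)
        if len(batch) >= batch_size:
            flush_to_index(batch)
            batch = []
    if batch:  # flush the final partial batch
        flush_to_index(batch)

# An in-memory "topic" of seven document upserts:
topic = [{"id": i, "op": "upsert"} for i in range(7)]
consume(topic)  # indexed now holds all seven events
```

Swap the list for a Kafka consumer and `flush_to_index` for a bulk API call and you have the skeleton of a near-realtime indexer.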
By the time I was done there, I had first-hand experience with many of the most popular patterns and techniques for building highly available web applications and trading systems. Messaging systems in particular are really intriguing, especially in their applications to search engines.
Part 3: Not building but designing search (today)
In my current role I am focused solely on UX and search for a much larger search engine than I could have imagined. Much of my technical background is very helpful, but these days I rely much more on the experts to guide our implementations and ideas. However, this role has made me look at many of the interfaces provided by search engines in a whole new light. I've been focused heavily on designing:
- Search facets
- Relevancy and Query Parsing
- Keyword Hits
- ML Algorithms designed to improve search
- Typeaheads and Suggestions
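Two of those interfaces map onto well-known Elasticsearch request shapes: facets are usually `terms` aggregations, and typeaheads are often served by the completion suggester. A sketch of both request bodies, built as Python dicts (the index and field names are hypothetical):

```python
# Facets: a terms aggregation counts matching documents per bucket,
# e.g. "how many results per author". Field names are illustrative;
# "author.keyword" assumes a keyword sub-field on "author".
facet_query = {
    "query": {"match": {"body": "love"}},
    "aggs": {
        "by_author": {"terms": {"field": "author.keyword", "size": 10}}
    },
}

# Typeahead: the completion suggester matches a user's prefix
# against a dedicated field mapped with type "completion".
suggest_query = {
    "suggest": {
        "quote_suggest": {
            "prefix": "lov",
            "completion": {"field": "suggest", "size": 5},
        }
    }
}
```

The design work is everything these snippets leave out: which fields deserve facets, how to rank and dedupe suggestions, and what to show when a prefix matches nothing.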
Takeaway 1: Designing the "simplest" search interface requires a lot of hard work
Calling All Search Experts
Clinton Halpin is a full-stack engineer and product designer based in New York City. You should follow him on Twitter.