Word Embeddings - An Alternative and Efficient Approach to Search for Documents

About the Session

Open source document search engines typically implement search over a document collection using the TF/IDF principle. However, recent developments in the field of NLP have shown positive results from representing text as more concise vectors rather than as a bag of words. These approaches also enrich the information model, for example by capturing analogies and the semantics of words. This talk walks through an end-to-end data workflow that enables such a construct.

The first part of the session describes how a search query is typically processed by default in Lucene-powered search engines today. The concept of TF/IDF is also introduced in this part of the session.
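As a rough illustration of the TF/IDF idea, the sketch below scores documents against a query using a textbook formulation (term count multiplied by a smoothed inverse document frequency). It is not Lucene's actual scoring formula; the function names and the sample documents are made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Return per-document term counts and per-term document frequencies."""
    term_counts = [Counter(tokenize(d)) for d in docs]
    doc_freq = Counter()
    for counts in term_counts:
        doc_freq.update(counts.keys())
    return term_counts, doc_freq


def tf_idf_score(query, counts, doc_freq, n_docs):
    """Sum tf * idf over the query terms that occur in the document."""
    score = 0.0
    for term in tokenize(query):
        tf = counts.get(term, 0)
        if tf == 0:
            continue
        # Textbook smoothed IDF; an assumption, not Lucene's exact formula.
        idf = math.log(n_docs / doc_freq[term]) + 1.0
        score += tf * idf
    return score


def search(query, docs):
    """Return indices of matching documents, best score first."""
    term_counts, doc_freq = build_index(docs)
    scored = [
        (tf_idf_score(query, counts, doc_freq, len(docs)), i)
        for i, counts in enumerate(term_counts)
    ]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [i for score, i in ranked if score > 0]


# Hypothetical sample collection.
DOCS = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock markets fell sharply today",
]
```

Here `search("stock markets", DOCS)` matches only the third document, while `search("feline", DOCS)` returns nothing: a bag-of-words model only matches terms that literally occur in a document, which is the limitation that vector representations aim to address.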

The session then describes the concept of word embeddings using a library such as Facebook's fastText.
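One idea behind fastText is that a word's vector is composed from the vectors of its character n-grams. The sketch below shows only that composition step, under loud assumptions: the n-gram vectors are deterministic pseudo-random stand-ins derived from a hash, not trained embeddings, and none of the function names come from the fastText library.

```python
import hashlib
import math

DIM = 256  # dimensionality of the stand-in vectors (an arbitrary choice)


def char_ngrams(word, min_n=3, max_n=6):
    """Character n-grams of the word wrapped in '<' and '>' boundary markers."""
    wrapped = f"<{word}>"
    grams = set()
    for n in range(min_n, max_n + 1):
        for i in range(len(wrapped) - n + 1):
            grams.add(wrapped[i:i + n])
    return grams


def ngram_vector(gram):
    """Pseudo-random stand-in for a learned n-gram vector (NOT trained)."""
    digest = hashlib.shake_256(gram.encode("utf-8")).digest(DIM)
    # Map each byte (0..255) to a value in [-1, 1].
    return [(b - 127.5) / 127.5 for b in digest]


def word_vector(word):
    """Sum the vectors of all character n-grams of the word."""
    vec = [0.0] * DIM
    for gram in char_ngrams(word):
        for k, x in enumerate(ngram_vector(gram)):
            vec[k] += x
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Because "search" and "searching" share most of their n-grams, `cosine(word_vector("search"), word_vector("searching"))` comes out well above the similarity between "search" and an unrelated word such as "banana", even without training. In a trained model the n-gram vectors are learned from a corpus, which is what lets words with related meanings, rather than just similar spellings, end up close together.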

Subsequently, a representative data pipeline is discussed, showing how an incoming stream of data can be turned into vector representations and made searchable with a turnaround time of a few seconds.
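A minimal sketch of such a pipeline follows, assuming documents arrive one at a time and each is embedded by averaging its word vectors. The three-dimensional vectors, the `VectorIndex` class and the sample stream are all hypothetical; a real deployment would load trained embeddings and would likely use an approximate nearest-neighbour index instead of the brute-force scan shown here.

```python
import math

# Hypothetical hand-picked embeddings, for illustration only.
TOY_VECTORS = {
    "cat": [0.9, 0.1, 0.0],
    "feline": [0.85, 0.15, 0.0],
    "dog": [0.7, 0.3, 0.0],
    "stock": [0.0, 0.1, 0.9],
    "market": [0.0, 0.2, 0.8],
}


def embed(text):
    """Average the vectors of known words; None if no word is known."""
    vecs = [TOY_VECTORS[w] for w in text.lower().split() if w in TOY_VECTORS]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[k] for v in vecs) / len(vecs) for k in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class VectorIndex:
    """In-memory index: a document is searchable as soon as it is added."""

    def __init__(self):
        self._entries = []  # list of (doc_id, vector) pairs

    def add(self, doc_id, text):
        vec = embed(text)
        if vec is not None:
            self._entries.append((doc_id, vec))

    def search(self, query, top_k=3):
        q = embed(query)
        if q is None:
            return []
        scored = sorted(
            ((cosine(q, vec), doc_id) for doc_id, vec in self._entries),
            reverse=True,
        )
        return [doc_id for _, doc_id in scored[:top_k]]


def incoming_stream():
    """Stand-in for a real message queue or change feed."""
    yield "doc-1", "the cat sat on the mat"
    yield "doc-2", "the stock market fell"
    yield "doc-3", "a dog barked"


index = VectorIndex()
for doc_id, text in incoming_stream():
    index.add(doc_id, text)
```

With these toy vectors, `index.search("feline", top_k=1)` returns `["doc-1"]` even though the word "feline" never appears in that document, because the query vector lies closest to the vector of the document about a cat.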

The session closes with a few references to more recent developments in this space.
