Consider a typical user experience on an online shopping platform such as Amazon or eBay. As the user (i.e., a potential customer) starts typing a query to search for an item, the platform provides suggestions to (a) automatically complete the query, and (b) suggest new queries that might be relevant for the user. Thus, query recommendation is central to such e-commerce platforms.

Often, the user does not know, or cannot specify exactly, the product they are interested in. Typically, they either narrow down to a desired product within a few queries or decide to look elsewhere. To enhance the user experience and improve the chances of selling a product, it is critical to predict a query that could lead to a successful purchase decision. For instance, suppose the user enters the query “charger”. The platform does not know what kind of charger the user has in mind, and so should be able to suggest a diverse set of queries such as (a) cell charger, (b) mobile charger, (c) battery charger, (d) portable charger, and (e) laptop charger. The user might be looking for a phone charger, which in turn could mean (a) iphone charger, (b) android charger, etc. Therefore, the system must recommend such queries when the user clicks on phone charger. Again, if the user then clicks on iphone charger, they should be shown queries such as (a) apple iphone 6 charger, (b) apple iphone 11 charger, and (c) cheap iphone 6 charger (e.g., offered by a third-party vendor). This sequence of recommended queries should change drastically if the user initially clicked on laptop charger. In other words, it is important to take into account all the information that unfolds as the user expresses their intent through the queries they type or click on.

State-of-the-art deep learning models such as Transformers and Recurrent Neural Networks (RNNs) are designed to perform sequence (e.g., a query) to sequence transduction. However, for online shopping platforms, we need to recommend the next query based not just on the most recent query but on all the preceding queries in the session. We introduce a Hierarchical Transformer model that teases out the multiresolution structure inherent in temporal data as the user progressively interacts with the system. Our model significantly outperforms both the Transformer and RNN models on real data; for example, we achieve over 25% improvement in the BLEU score compared to RNNs.
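The section above does not spell out the architecture, but the two-level idea can be illustrated with a minimal sketch: a lower-level encoder contextualizes the tokens of each query and pools them into a single query vector, and an upper-level encoder attends over the sequence of query vectors to capture session context. The sketch below is an assumption-laden toy (single-head attention with identity projections, mean pooling, no positional encodings or learned weights), written in plain numpy rather than the authors' actual model.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # Single-head scaled dot-product self-attention with identity
    # Q/K/V projections, for brevity. X has shape (n, d).
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ X

def encode_query(token_embs):
    # Lower level: contextualize tokens within one query, then
    # mean-pool into a single query vector of shape (d,).
    return self_attention(token_embs).mean(axis=0)

def encode_session(queries):
    # Upper level: attend over the sequence of query vectors so each
    # query's representation reflects the whole session so far.
    qvecs = np.stack([encode_query(q) for q in queries])
    return self_attention(qvecs)

# A toy session: three queries with 1, 3, and 2 tokens, embedding dim 8.
rng = np.random.default_rng(0)
session = [rng.normal(size=(n_tokens, 8)) for n_tokens in (1, 3, 2)]
ctx = encode_session(session)
print(ctx.shape)  # one contextualized vector per query
```

The last row of `ctx` would feed a decoder that generates the next recommended query; the hierarchy keeps per-query token structure separate from cross-query session structure, which is what lets earlier queries (e.g., charger, then phone charger) condition later recommendations.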
