KDD 2018 Report

Cover Photo (ExCeL London)

The 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), organized by the Association for Computing Machinery's Special Interest Group on Knowledge Discovery and Data Mining (ACM SIGKDD), was held from 19 to 23 August 2018 at ExCeL London, UK. SIGKDD promotes basic research and development in KDD, the adoption of "standards" in the market in terms of terminology, evaluation, and methodology, and interdisciplinary education among KDD researchers, practitioners, and users. Researchers and practitioners from a wide range of industries attended the conference.

Presentation Talks

Data Science in Retail as a Service


This talk was given by JD.com, the largest retailer in China and the third-largest internet company globally. It covered end-to-end activities in the retail sector, with a strong focus on AI, and promoted Retail-as-a-Service as a way to offer customers a more intimate and personalized shopping experience.

Using multiple sources of customer data, such as online transactions, click-through logs, social media, and offline data, they developed algorithms for behavior description and intention detection. The talk did not, however, give details of the exact approach.

The speakers explained their approach to forecasting customer demand for a product. They covered traditional statistical methods (ARIMA and the Box-Jenkins methodology), machine learning methods (random forest and gradient boosting), and deep learning methods (LSTM and seq2seq). The talk gave a good understanding of the seq2seq deep learning model and how to take advantage of it with very large data sets.

Features are developed to cover all available data dimensions, such as sales data, SKU attributes, time, and location. The challenging part mentioned in the talk is that the time series are highly non-stationary, because customer demand is highly variable. The talk covered a case study on probabilistic prediction for each SKU on each day.
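The talk did not say how the per-SKU probabilistic forecasts were scored. One common choice for quantile forecasts is the pinball (quantile) loss, sketched below as an assumption rather than JD.com's method. The demand figures and the 90th-percentile forecast are made up for illustration.

```python
def pinball_loss(actual, predicted, quantile):
    """Pinball loss for one observation and one quantile forecast."""
    diff = actual - predicted
    # Under-forecasts are weighted by `quantile`, over-forecasts by `1 - quantile`.
    return quantile * diff if diff >= 0 else (quantile - 1) * diff


def mean_pinball_loss(actuals, predictions, quantile):
    """Average pinball loss over a series of daily observations."""
    losses = [pinball_loss(a, p, quantile) for a, p in zip(actuals, predictions)]
    return sum(losses) / len(losses)


# Hypothetical daily demand for one SKU and a 90th-percentile forecast.
daily_demand = [12, 7, 30, 9]
p90_forecast = [15, 10, 20, 10]

loss = mean_pinball_loss(daily_demand, p90_forecast, 0.9)
```

With a high quantile such as 0.9, the single day where demand exceeded the forecast dominates the loss, which is the behavior one wants when stock-outs are costlier than overstock.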

Product recommendation algorithms use content-based methods, collaborative filtering methods (user-user and item-item collaboration, using association rules, probabilistic model-based methods, nearest-neighbor memory-based methods, and matrix factorization), and hybrid methods. The talk also covered the use of multi-armed bandits.
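As an illustration of the item-item, nearest-neighbor flavor of collaborative filtering mentioned above, here is a minimal sketch using cosine similarity. The ratings and the helper names (`item_vector`, `most_similar`) are hypothetical and are not taken from the talk.

```python
from math import sqrt

# Hypothetical user -> {item: rating} data.
ratings = {
    "u1": {"A": 5, "B": 4},
    "u2": {"A": 4, "B": 5, "C": 1},
    "u3": {"B": 1, "C": 5},
}


def item_vector(item):
    """Ratings of one item, keyed by the users who rated it."""
    return {user: r[item] for user, r in ratings.items() if item in r}


def cosine(v1, v2):
    """Cosine similarity between two sparse rating vectors."""
    dot = sum(v1[u] * v2[u] for u in set(v1) & set(v2))
    norm1 = sqrt(sum(x * x for x in v1.values()))
    norm2 = sqrt(sum(x * x for x in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def most_similar(item):
    """Return the other item whose rating vector is closest to `item`."""
    all_items = {i for r in ratings.values() for i in r}
    candidates = sorted(all_items - {item})
    return max(candidates, key=lambda other: cosine(item_vector(item), item_vector(other)))


nearest_to_a = most_similar("A")
```

In a recommender, the nearest items to what a customer already bought would be ranked and shown as suggestions. A production system would typically use matrix factorization or a hybrid model instead of this in-memory loop.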

A deterministic (NP-hard) discrete optimization problem is solved to minimize the number of locally missed orders. Demand estimation with maximum likelihood estimation (MLE) is covered for the replenishment problem. Inventory levels are set with a knapsack formulation, where the goal is to maximize revenue under a capacity constraint.
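The talk did not specify which knapsack variant was used. The sketch below assumes the classic 0/1 knapsack solved by dynamic programming, with hypothetical products, shelf-space requirements, and expected revenues.

```python
def knapsack(items, capacity):
    """0/1 knapsack by dynamic programming.

    items: list of (name, space, revenue) with integer space.
    Returns (best_revenue, chosen_names) for the given capacity.
    """
    # best[c] holds the best (revenue, chosen names) using at most c units of space.
    best = [(0, []) for _ in range(capacity + 1)]
    for name, space, revenue in items:
        # Iterate capacity downwards so each item is used at most once.
        for cap in range(capacity, space - 1, -1):
            candidate = best[cap - space][0] + revenue
            if candidate > best[cap][0]:
                best[cap] = (candidate, best[cap - space][1] + [name])
    return best[capacity]


# Hypothetical products: (name, units of shelf space, expected revenue).
products = [("milk", 3, 40), ("rice", 4, 50), ("tea", 2, 30)]

revenue, stocked = knapsack(products, capacity=5)
```

Here stocking milk and tea fills the five available units and yields more revenue than stocking rice alone, even though rice has the highest individual revenue.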

A case study covered dynamic pricing and product tracking at the 7Fresh grocery store. Individual customers are identified using facial recognition technology inside the store, and each customer's shopping path through the store is tracked to enrich the data set.

Causal inference and counterfactual reasoning


The talk discussed how machine learning methods today focus on correlation analysis and prediction, and why this is insufficient when we need to understand causal mechanisms and design interventions. It covered scenarios where such correlational and predictive analyses can fail, illustrated by a phenomenon known as Simpson's paradox.
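Simpson's paradox is easiest to see with numbers. The sketch below uses illustrative counts (not data from the talk) in which treatment A has the higher success rate within every subgroup, yet treatment B looks better once the subgroups are pooled, because A was given mostly to the harder cases.

```python
# Illustrative (successes, patients) counts for two treatments, split by severity.
trials = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}


def subgroup_rate(treatment, group):
    """Success rate of a treatment within one severity group."""
    successes, total = trials[treatment][group]
    return successes / total


def overall_rate(treatment):
    """Success rate of a treatment with all severity groups pooled."""
    successes = sum(s for s, _ in trials[treatment].values())
    total = sum(n for _, n in trials[treatment].values())
    return successes / total


a_wins_mild = subgroup_rate("A", "mild") > subgroup_rate("B", "mild")
a_wins_severe = subgroup_rate("A", "severe") > subgroup_rate("B", "severe")
b_wins_overall = overall_rate("B") > overall_rate("A")
```

All three comparisons are true at once. A purely correlational analysis of the pooled data would recommend B, while conditioning on severity, the confounder, reverses the conclusion.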

The speakers described the three-layer causal hierarchy: association, intervention, and counterfactual. They covered the concept of auditing the effect of an algorithm and the use of randomized experiments for causal inference. They also presented a structural causal model framework, based on the Markov assumption, for expressing complex causal relationships.

The structural causal model framework was presented as a Microsoft project. Causal inference knowledge is used to rank features or to identify dependency relationships among features.

Paper Sessions

E-tail product return prediction


This talk presented HyperGo, a generic framework for predicting e-tail product returns. It aims to predict the customer's intention to return after he or she has put together the shopping basket. For a given basket, the authors propose a local graph cut algorithm that uses a truncated random walk on the hypergraph to identify similar historical baskets. Based on these baskets, HyperGo estimates the return intention on two levels, basket level and product level, which gives e-tailers detailed information about the reason for a potential return (e.g., duplicate products in different colors). One major benefit of the proposed local algorithm is its time complexity, which depends linearly on the size of the output cluster and polylogarithmically on the volume of the hypergraph. This makes HyperGo particularly suitable for processing large-scale data sets. The experimental results on multiple real-world e-tail data sets demonstrate the effectiveness and efficiency of HyperGo.
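To give a feel for how a truncated random walk can surface similar baskets, here is a simplified sketch. It is not the HyperGo algorithm: it assumes products are vertices and baskets are hyperedges, walks basket to product to basket with uniform choices, and drops any basket whose probability mass falls below a fixed threshold `eps` after each step. All basket contents and parameter values are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical baskets (hyperedges) over products (vertices).
baskets = {
    "b1": {"shirt_red", "shirt_blue"},
    "b2": {"shirt_red", "shirt_blue", "socks"},
    "b3": {"socks", "kettle"},
    "b4": {"kettle", "toaster"},
}


def baskets_by_product(baskets):
    """Invert the basket -> products mapping."""
    index = defaultdict(set)
    for basket, products in baskets.items():
        for product in products:
            index[product].add(basket)
    return index


def truncated_walk(baskets, start, steps=3, eps=0.05):
    """Spread probability mass from `start`, truncating small entries each step."""
    index = baskets_by_product(baskets)
    dist = {start: 1.0}
    for _ in range(steps):
        nxt = defaultdict(float)
        for basket, mass in dist.items():
            products = baskets[basket]
            for product in products:
                share = mass / len(products)  # pick a product uniformly
                neighbours = index[product]
                for neighbour in neighbours:  # then a basket containing it
                    nxt[neighbour] += share / len(neighbours)
        # Truncation keeps the walk local: low-mass baskets are discarded.
        dist = {b: m for b, m in nxt.items() if m >= eps}
    return dist


similar = truncated_walk(baskets, "b1")
```

Starting from `b1`, most of the mass stays on `b1` and `b2`, which share the two shirts, while the distant basket `b4` is truncated away. The truncation step is what keeps the work proportional to the local neighborhood rather than the whole hypergraph.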

Attribute Value Extraction from Product Profiles


This talk is about extracting missing attribute values, that is, finding values that describe an attribute of interest in free-text input. It presented OpenTag, a deep tagging model for this extraction problem, with the following contributions: (1) it formalizes the problem as a sequence tagging task and proposes a joint model that uses recurrent neural networks (specifically, bidirectional LSTM) to capture context and semantics and Conditional Random Fields (CRF) to enforce tagging consistency; (2) it develops a novel attention mechanism to provide interpretable explanations for the model's decisions; (3) it proposes a novel sampling strategy that uses active learning to reduce the burden of human annotation.
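Framing extraction as sequence tagging means the model emits one tag per token, and attribute values are then read off from the tag sequence. The sketch below shows only that final decoding step, assuming a simple B/I/O scheme (begin, inside, outside) that the report does not confirm. The tokens and tags are hypothetical model output for a "flavor" attribute.

```python
def extract_values(tokens, tags):
    """Collect attribute values from per-token B/I/O tags."""
    values, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A new value starts; close any value still open.
            if current:
                values.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open value, ends the current span.
            if current:
                values.append(" ".join(current))
            current = []
    if current:
        values.append(" ".join(current))
    return values


# Hypothetical product title and predicted tags.
tokens = ["beef", "meal", "and", "ranch", "raised", "lamb", "recipe"]
tags = ["B", "O", "O", "B", "I", "I", "O"]

flavors = extract_values(tokens, tags)
```

The CRF layer mentioned above matters for exactly this step: it discourages inconsistent sequences such as an "I" that does not follow a "B" or another "I", which this decoder would otherwise have to discard.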

Learning and Transferring IDs Representation in E-commerce


This talk is about learning representations of IDs, including user ID, item ID, product ID, store ID, brand ID, and category ID. Classical encoding-based methods (like one-hot encoding) are inefficient: they suffer from sparsity because of their high dimensionality, and they cannot reflect the relationships among IDs, whether homogeneous or heterogeneous. Using structural connections among IDs, all types of IDs can be embedded into one low-dimensional semantic space. Subsequently, the learned representations are utilized and transferred in four scenarios: (i) measuring the similarity between items, (ii) transferring from seen items to unseen items, (iii) transferring across different domains, (iv) transferring across different tasks.
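The limitation of one-hot encoding can be shown in a few lines: any two distinct IDs are orthogonal, so their similarity is always zero, whereas dense embeddings can place related items close together. The embedding values below are invented for illustration and are not learned by the method in the paper.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# One-hot encoding: one dimension per ID.
one_hot = {
    "item_1": [1, 0, 0],
    "item_2": [0, 1, 0],
    "item_3": [0, 0, 1],
}

# Hypothetical low-dimensional embeddings: item_1 and item_2 are related.
embedding = {
    "item_1": [0.9, 0.1],
    "item_2": [0.8, 0.2],
    "item_3": [-0.1, 0.95],
}

one_hot_similarity = cosine(one_hot["item_1"], one_hot["item_2"])
related_similarity = cosine(embedding["item_1"], embedding["item_2"])
unrelated_similarity = cosine(embedding["item_1"], embedding["item_3"])
```

The one-hot similarity is zero for every pair, while the embedding similarity separates the related pair from the unrelated one, which is what makes scenario (i), measuring item similarity, possible.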

Interpretable New User Clustering and Churn Prediction


This talk presented a novel order dispatch algorithm for large-scale on-demand ride-hailing platforms. While traditional order dispatch approaches usually focus on immediate customer satisfaction, the proposed algorithm is designed to optimize resource utilization and user experience more efficiently, from a global and more farsighted view.

Order dispatch is modelled as a large-scale sequential decision-making problem, in which a centralized algorithm decides, in a coordinated way, which order to assign to which driver. The problem is solved in a learning and planning manner: (1) in the learning step, demand and supply patterns are summarized from historical data into a spatiotemporal quantization; (2) in the planning step, conducted in real time, each driver-order pair is valued in consideration of both immediate rewards and future gains, and dispatch is then solved using a combinatorial optimization algorithm.
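The planning step can be sketched as follows, under several assumptions that are not from the report: each zone has a learned value, a pair is scored as fare minus pickup cost plus the discounted value of the drop-off zone minus the value of the driver's current zone, and a brute-force search over permutations stands in for a real combinatorial solver such as the Hungarian method. All zone values, fares, and constants are hypothetical.

```python
from itertools import permutations

GAMMA = 0.9        # hypothetical discount factor for future gains
PICKUP_COST = 3.0  # hypothetical cost when driver and pickup zones differ

# Hypothetical zone values, standing in for what the learning step would produce.
zone_value = {"downtown": 8.0, "airport": 5.0, "suburb": 2.0}

drivers = {"d1": "downtown", "d2": "suburb"}
orders = {
    "o1": {"pickup": "downtown", "dropoff": "suburb", "fare": 10.0},
    "o2": {"pickup": "suburb", "dropoff": "airport", "fare": 6.0},
}


def pair_value(driver_zone, order):
    """Immediate reward plus the change in expected future value."""
    pickup_cost = 0.0 if driver_zone == order["pickup"] else PICKUP_COST
    immediate = order["fare"] - pickup_cost
    future_gain = GAMMA * zone_value[order["dropoff"]] - zone_value[driver_zone]
    return immediate + future_gain


def best_dispatch(drivers, orders):
    """Exhaustive assignment; assumes at least as many orders as drivers."""
    driver_ids = sorted(drivers)
    best_total, best_plan = float("-inf"), {}
    for perm in permutations(sorted(orders), len(driver_ids)):
        total = sum(pair_value(drivers[d], orders[o]) for d, o in zip(driver_ids, perm))
        if total > best_total:
            best_total, best_plan = total, dict(zip(driver_ids, perm))
    return best_plan


plan = best_dispatch(drivers, orders)
```

Exhaustive search is only viable for toy inputs. At platform scale the same pair values would be fed to a polynomial-time bipartite matching algorithm.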

Learning Universal User Representations from Multiple E-commerce Tasks


This talk describes how user behavior sequences (e.g., clicks, bookmarks, or purchases of products) are modeled with an LSTM and an attention mechanism that integrate the corresponding content, behavior, and temporal information. The paper proposes the Deep User Perception Network (DUPN), which integrates RNNs, attention, and multi-task learning. RNNs are used as the building block to learn the desired representations from massive user behavior logs. The common e-commerce tasks addressed are click-through rate prediction, learning to rank, price preference prediction, fashion icon following prediction, and shop preference prediction.
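The role of attention over a behavior sequence can be illustrated with a small pooling function. This is a generic dot-product attention sketch, not DUPN's actual attention network: the hidden states stand in for per-step LSTM outputs, and the query vector is a hypothetical context signal.

```python
from math import exp


def softmax(scores):
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each behavior step by its relevance to the query, then sum."""
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * state[d] for w, state in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return weights, pooled


# Hypothetical hidden states for three behaviors (click, bookmark, purchase).
hidden_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 1.0]

weights, user_vector = attention_pool(hidden_states, query)
```

The pooled `user_vector` is a single fixed-size representation of the whole sequence. In a multi-task setup, that one vector would be shared as input to each task-specific prediction head.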

Next-item Recommendation via Discriminatively Exploiting User Behavior


This paper proposes a novel Behavior-Intensive Neural Network (BINN) for next-item recommendation that incorporates both users' historical stable preferences and present consumption motivations. Specifically, BINN contains two main components: Neural Item Embedding and Discriminative Behaviors Learning. First, a novel item embedding method based on user interactions is developed to obtain a unified representation for each item. Then, with the embedded items and the interactive behaviors over item sequences, BINN discriminatively learns the historical preferences and present motivations of the target users.