IJCAI/ECAI 18 Program Schedule

Friday 13 13:30 - 14:30 Invited Talk (A1)

Chair: Peter Stone

Language to Action: Towards Interactive Task Learning with Physical Agents
Joyce Chai

Invited Talk

Friday 13 14:30 - 15:30 Invited Talk (A1)

Chair: Francis Bach

Building Machines that Learn and Think Like People
Josh Tenenbaum

Invited Talk

Monday 16 08:00 - 09:00 Opening (VICTORIA)

Opening

Opening

Monday 16 09:00 - 09:45 Invited Talk (VICTORIA)

Chair: Jeff Rosenschein

Learning World Models: the Next Step Towards AI
Yann LeCun

Invited Talk

Monday 16 10:15 - 11:15 SUR-KR - Survey Track: Knowledge Representation (VICTORIA)

Chair: Natasha Alechina

#5410

Maintenance of Case Bases: Current Algorithms after Fifty Years
Jose M. Juarez, Susan Craw, J. Ricardo Lopez-Delgado, Manuel Campos

Survey Track: Knowledge Representation

Case-Based Reasoning (CBR) learns new knowledge from data and so can cope with changing environments. CBR is very different from model-based systems since it can learn incrementally as new data is available, storing new cases in its case-base. This means that it can benefit from readily available new data, but also that case-base maintenance (CBM) is essential to manage the cases, deleting and compacting the case-base. On the 50th anniversary of CNN (Condensed Nearest Neighbour, considered the first CBM algorithm), new CBM methods are proposed to deal with the new requirements of Big Data scenarios. In this paper, we present an accessible historic perspective of CBM and we classify and analyse the most recent approaches to deal with these requirements.
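As an illustration of what case-base compaction means, here is a minimal sketch of the condensed nearest neighbour idea (CNN in this abstract refers to that rule, not to convolutional networks). It is not taken from the survey: the function names, the squared Euclidean distance, and the toy cases are assumptions made for the example.

```python
def nearest_label(store, point):
    """Return the label of the stored case closest to `point` (squared Euclidean distance)."""
    def dist(case):
        return sum((a - b) ** 2 for a, b in zip(case[0], point))
    return min(store, key=dist)[1]


def condense(cases):
    """Keep a subset of cases that still classifies every original case correctly with 1-NN."""
    store = [cases[0]]              # start from a single stored case
    changed = True
    while changed:                  # repeat passes until a full pass adds nothing
        changed = False
        for case in cases:
            features, label = case
            if case in store:
                continue            # already kept; also guards against endless re-adding
            if nearest_label(store, features) != label:
                store.append(case)  # misclassified by the current store, so keep it
                changed = True
    return store


# Hypothetical one-dimensional case base with two classes.
cases = [((0.0,), "a"), ((0.1,), "a"), ((0.2,), "a"), ((1.0,), "b"), ((1.1,), "b")]
condensed = condense(cases)
```

On this toy data the store shrinks from five cases to two, one per class, while 1-NN over the store still labels all five original cases correctly.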
#5440

Evaluation Techniques and Systems for Answer Set Programming: a Survey
Martin Gebser, Nicola Leone, Marco Maratea, Simona Perri, Francesco Ricca, Torsten Schaub

Survey Track: Knowledge Representation

Answer set programming (ASP) is a prominent knowledge representation and reasoning paradigm that has found both industrial and scientific applications. The success of ASP is due to the combination of two factors: a rich modeling language and the availability of efficient ASP implementations. In this paper we trace the history of ASP systems, describing the key evaluation techniques and their implementation in actual tools.
#5442

Recent Advances in Querying Probabilistic Knowledge Bases
Stefan Borgwardt, İsmail İlkan Ceylan, Thomas Lukasiewicz

Survey Track: Knowledge Representation

We give a survey on recent advances at the forefront of research on probabilistic knowledge bases for representing and querying large-scale automatically extracted data. We concentrate especially on increasing the semantic expressivity of formalisms for representing and querying probabilistic knowledge (i) by giving up the closed-world assumption, (ii) by allowing for commonsense knowledge (and in parallel giving up the tuple-independence assumption), and (iii) by giving up the closed-domain assumption, while preserving some computational properties of query answering in such formalisms.
#5418

Ontology-Based Data Access: A Survey
Guohui Xiao, Diego Calvanese, Roman Kontchakov, Domenico Lembo, Antonella Poggi, Riccardo Rosati, Michael Zakharyaschev

Survey Track: Knowledge Representation

We present the framework of ontology-based data access, a semantic paradigm for providing convenient and user-friendly access to data repositories, which has been actively developed and studied in the past decade. Focusing on relational data sources, we discuss the main ingredients of ontology-based data access, key theoretical results, techniques, applications and future challenges.

Monday 16 10:15 - 11:15 ML-NN1 - Neural Networks (C7)

Chair: Nevin L. Zhang

#4117

Deep Convolutional Neural Networks with Merge-and-Run Mappings
Liming Zhao, Mingjie Li, Depu Meng, Xi Li, Zhaoxiang Zhang, Yueting Zhuang, Zhuowen Tu, Jingdong Wang

Neural Networks

A deep residual network, built by stacking a sequence of residual blocks, is easy to train, because identity mappings skip residual branches and thus improve information flow. To further reduce the training difficulty, we present a simple network architecture, deep merge-and-run neural networks. The novelty lies in a modularized building block, merge-and-run block, which assembles residual branches in parallel through a merge-and-run mapping: average the inputs of these residual branches (Merge), and add the average to the output of each residual branch as the input of the subsequent residual branch (Run), respectively. We show that the merge-and-run mapping is a linear idempotent function in which the transformation matrix is idempotent, and thus improves information flow, making training easy. In comparison with residual networks, our networks enjoy compelling advantages: they contain much shorter paths and the width, i.e., the number of channels, is increased, and the time complexity remains unchanged. We evaluate the performance on the standard recognition tasks. Our approach demonstrates consistent improvements over ResNets with the comparable setup, and achieves competitive results (e.g., 3.06% testing error on CIFAR-10, 17.55% on CIFAR-100, 1.51% on SVHN).
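The idempotence claim can be checked numerically. The sketch below assumes one scalar channel per residual branch, so the merge step becomes a matrix whose entries are all 1/K for K branches; the helper names are hypothetical and the branch functions are stand-ins, not the paper's convolutional branches.

```python
from fractions import Fraction


def merge_matrix(num_branches):
    """Averaging matrix for the merge step: every entry equals 1/num_branches."""
    k = Fraction(1, num_branches)
    return [[k] * num_branches for _ in range(num_branches)]


def matmul(a, b):
    """Plain matrix product for small lists of lists."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def merge_and_run_step(inputs, branches):
    """Merge: average the branch inputs. Run: add that average to each branch output."""
    avg = sum(inputs) / len(inputs)
    return [avg + f(x) for f, x in zip(branches, inputs)]


M = merge_matrix(3)
is_idempotent = matmul(M, M) == M   # applying the averaging twice equals applying it once

# Two toy branches: average of the inputs is 2.0, added to 2*1.0 and to -3.0.
outputs = merge_and_run_step([1.0, 3.0], [lambda x: 2 * x, lambda x: -x])
```

Exact fractions are used so that the equality check on the matrix is not affected by floating-point rounding.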
#1269

Accelerating Convolutional Networks via Global & Dynamic Filter Pruning
Shaohui Lin, Rongrong Ji, Yuchao Li, Yongjian Wu, Feiyue Huang, Baochang Zhang

Neural Networks

Accelerating convolutional neural networks has recently received ever-increasing research focus. Among various approaches proposed in the literature, filter pruning has been regarded as a promising solution, owing to its significant speedup and memory reduction of both the network model and intermediate feature maps. To this end, most approaches tend to prune filters in a layer-wise fixed manner, which is incapable of dynamically recovering previously removed filters or of jointly optimizing the pruned network across layers. In this paper, we propose a novel global & dynamic pruning (GDP) scheme to prune redundant filters for CNN acceleration. In particular, GDP first globally prunes the unsalient filters across all layers by proposing a global discriminative function based on prior knowledge of filters. Second, it dynamically updates the filter saliency all over the pruned sparse network, and then recovers the mistakenly pruned filters, followed by a retraining phase to improve the model accuracy. Specifically, we effectively solve the corresponding non-convex optimization problem of the proposed GDP via stochastic gradient descent with greedy alternative updating. Extensive experiments show that, compared to the state-of-the-art filter pruning methods, the proposed approach achieves superior performance in accelerating several cutting-edge CNNs on the ILSVRC 2012 benchmark.
#196

Dynamically Hierarchy Revolution: DirNet for Compressing Recurrent Neural Network on Mobile Devices
Jie Zhang, Xiaolong Wang, Dawei Li, Yalin Wang

Neural Networks

Recurrent neural networks (RNNs) achieve cutting-edge performance on a variety of problems. However, due to their high computational and memory demands, deploying RNNs on resource constrained mobile devices is a challenging task. To guarantee minimum accuracy loss with higher compression rate and driven by the mobile resource requirement, we introduce a novel model compression approach DirNet based on an optimized fast dictionary learning algorithm, which 1) dynamically mines the dictionary atoms of the projection dictionary matrix within each layer to adjust the compression rate, and 2) adaptively changes the sparsity of sparse codes across the hierarchical layers. Experimental results on a language model and an ASR model trained with a 1000h speech dataset demonstrate that our method significantly outperforms prior approaches. Evaluated on off-the-shelf mobile devices, we are able to reduce the size of original model by eight times with real-time model inference and negligible accuracy loss.
#924

Automatic Gating of Attributes in Deep Structure
Xiaoming Jin, Tao He, Cheng Wan, Lan Yi, Guiguang Ding, Dou Shen

Neural Networks

Deep structure has been widely applied in a large variety of fields for its excellence in representing data. Attributes are a unique type of data descriptions that have been successfully utilized in numerous tasks to enhance performance. However, introducing attributes into deep structure is complicated and challenging, because different layers in deep structure accommodate features of different abstraction levels, while different attributes may naturally represent the data in different abstraction levels. This demands adaptive and joint modeling of attributes and deep structure by carefully examining their relationship. Different from existing works that treat attributes straightforwardly as the same level without considering their abstraction levels, we can make better use of attributes in deep structure by properly connecting them. In this paper, we move forward along this new direction by proposing a deep structure named Attribute Gated Deep Belief Network (AG-DBN) that includes a tunable attribute-layer gating mechanism and automatically learns the best way of connecting attributes to appropriate hidden layers. Experimental results on a manually-labeled subset of ImageNet, a-Yahoo and a-Pascal data sets justify the superiority of AG-DBN against several baselines including CNN model and other AG-DBN variants. Specifically, it outperforms the CNN model, VGG19, by significantly reducing the classification error from 26.70% to 13.56% on a-Pascal.
#1830

Regularizing Deep Neural Networks with an Ensemble-based Decorrelation Method
Shuqin Gu, Yuexian Hou, Lipeng Zhang, Yazhou Zhang

Neural Networks

Although Deep Neural Networks (DNNs) have achieved excellent performance in many tasks, improving the generalization capacity of DNNs still remains a challenge. In this work, we propose a novel regularizer named Ensemble-based Decorrelation Method (EDM), which is motivated by the idea of ensemble learning to improve generalization capacity of DNNs. EDM can be applied to hidden layers in fully connected neural networks or convolutional neural networks. We treat each hidden layer as an ensemble of several base learners through dividing all the hidden units into several non-overlapping groups, and each group will be viewed as a base learner. EDM encourages DNNs to learn more diverse representations by minimizing the covariance between all base learners during the training step. Experimental results on MNIST and CIFAR datasets demonstrate that EDM can effectively reduce the overfitting and improve the generalization capacity of DNNs.

Monday 16 10:15 - 11:15 MAS-AGT1 - Algorithmic Game Theory (C8)

Chair: Jörg Rothe

#2010

Probabilistic Verification for Obviously Strategyproof Mechanisms
Diodato Ferraioli, Carmine Ventre

Algorithmic Game Theory

Obviously strategyproof (OSP) mechanisms maintain the incentive compatibility of agents that are not fully rational. They have been the object of a number of studies since their recent definition. A research agenda, initiated in [Ferraioli and Ventre, 2017], is to find a small set (possibly, the smallest) of conditions that allow the implementation of an OSP mechanism. To this aim, we define a model of probabilistic verification wherein agents are caught misbehaving with a certain probability, and show how OSP mechanisms can implement every social choice function at the cost of either imposing very large fines or verifying a linear number of agents.
#3112

Payoff Control in the Iterated Prisoner's Dilemma
Dong Hao, Kai Li, Tao Zhou

Algorithmic Game Theory

The repeated game has long been the touchstone model for agents’ long-run relationships. Previous results suggest that it is particularly difficult for a repeated game player to exert an autocratic control on the payoffs since they are jointly determined by all participants. This work discovers that the scale of a player’s capability to unilaterally influence the payoffs may have been much underestimated. Under the conventional iterated prisoner’s dilemma, we develop a general framework for controlling the feasible region where the players’ payoff pairs lie. A control strategy player is able to confine the payoff pairs in her objective region, as long as this region has feasible linear boundaries. With this framework, many well-known existing strategies can be categorized and various new strategies with nice properties can be further identified. We show that the control strategies perform well either in a tournament or against a human-like opponent.
#3600

Adversarial Task Assignment
Chen Hajaj, Yevgeniy Vorobeychik

Algorithmic Game Theory

The problem of task assignment to workers is of long-standing fundamental importance. Examples of this include the classical problem of assigning computing tasks to nodes in a distributed computing environment, assigning jobs to robots, and crowdsourcing. Extensive research into this problem generally addresses important issues such as uncertainty and incentives. However, the problem of adversarial tampering with the task assignment process has not received as much attention. We are concerned with a particular adversarial setting in task assignment where an attacker may target a set of workers in order to prevent the tasks assigned to these workers from being completed. For the case when all tasks are homogeneous, we provide an efficient algorithm for computing the optimal assignment. When tasks are heterogeneous, we show that the adversarial assignment problem is NP-Hard, and present an algorithm for solving it approximately. Our theoretical results are accompanied by extensive simulation results showing the effectiveness of our algorithms.
#3740

Tractable (Simple) Contests
Priel Levy, David Sarne, Yonatan Aumann

Algorithmic Game Theory

Much of the work on multi-agent contests is focused on determining the equilibrium behavior of contestants. This capability is essential for the principal for choosing the optimal parameters for the contest (e.g. prize amount). As it turns out, many contests exhibit not one, but many possible equilibria, hence precluding contest design optimization and prediction of contestants' behavior. In this paper we examine a variation of the classic contest that alleviates this problem by having contestants make the decisions sequentially rather than in parallel. We study this model in the setting of a simple contest, wherein contestants only choose whether or not to participate, while their performance level is exogenously set. We show that by switching to the revised mechanism the principal can not only force her most desired pure-strategies based equilibrium to emerge, but also, at times, end up with an equilibrium offering a greater expected profit. Further, we show that in the modified contest the optimal prize can be effectively computed. The theoretical analysis is complemented by comprehensive experiments with people over Amazon Mechanical Turk. Here, we find that the modified mechanism offers great benefit for the principal, both in terms of an increased over-participation in the contest (compared to theoretical expectations) and increased average profit.
#775

Computational Aspects of the Preference Cores of Supermodular Two-Scenario Cooperative Games
Daisuke Hatano, Yuichi Yoshida

Algorithmic Game Theory

In a cooperative game, the utility of a coalition of players is given by the characteristic function, and the goal is to find a stable value division of the total utility to the players. In real-world applications, however, multiple scenarios could exist, each of which determines a characteristic function, and which scenario is more important is unknown. To handle such situations, the notion of multi-scenario cooperative games and several solution concepts have been proposed. However, computing the value divisions in those solution concepts is intractable in general. To resolve this issue, we focus on supermodular two-scenario cooperative games in which the number of scenarios is two and the characteristic functions are supermodular and study the computational aspects of a major solution concept called the preference core. First, we show that we can compute the value division in the preference core of a supermodular two-scenario game in polynomial time. Then, we reveal the relations among preference cores with different parameters. Finally, we provide more efficient algorithms for deciding the non-emptiness of the preference core for several specific supermodular two-scenario cooperative games such as the airport game, multicast tree game, and a special case of the generalized induced subgraph game.

Monday 16 10:15 - 11:15 SGP-GPS - Game Playing and Search (K2)

Chair: Tristan Cazenave

#3618

Back to Basics: Benchmarking Canonical Evolution Strategies for Playing Atari
Patryk Chrabąszcz, Ilya Loshchilov, Frank Hutter

Game Playing and Search

Evolution Strategies (ES) have recently been demonstrated to be a viable alternative to reinforcement learning (RL) algorithms on a set of challenging deep learning problems, including Atari games and MuJoCo humanoid locomotion benchmarks. While the ES algorithms in that work belonged to the specialized class of natural evolution strategies (which resemble approximate gradient RL algorithms, such as REINFORCE), we demonstrate that even a very basic canonical ES algorithm can achieve the same or even better performance. This success of a basic ES algorithm suggests that the state-of-the-art can be advanced further by integrating the many advances made in the field of ES in the last decades. We also demonstrate that ES algorithms have very different performance characteristics than traditional RL algorithms: on some games, they learn to exploit the environment and perform much better while on others they can get stuck in suboptimal local minima. Combining their strengths and weaknesses with those of traditional RL algorithms is therefore likely to lead to new advances in the state-of-the-art for solving RL problems.
#5465

(Journal track) Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents
Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

Game Playing and Search

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games. It supports a variety of different problem settings and it has been receiving increasing attention from the scientific community. In this paper we take a big picture look at how the ALE is being used by the research community. We focus on how diverse the evaluation methodologies in the ALE have become and we highlight some key concerns when evaluating agents in this platform. We use this discussion to present what we consider to be the best practices for future evaluations in the ALE. To further the progress in the field, we also introduce a new version of the ALE that supports multiple game modes and provides a form of stochasticity we call sticky actions.
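The abstract names sticky actions without defining them. The sketch below assumes the commonly described behaviour: at each step, with some probability, the environment repeats the previously executed action instead of the one the agent just requested. The class name, the method name, and the 0.25 default are illustrative assumptions, not the ALE API.

```python
import random


class StickyActions:
    """Hypothetical wrapper that decides which action is actually executed."""

    def __init__(self, stickiness=0.25, seed=None):
        self.stickiness = stickiness    # probability of repeating the previous action
        self.rng = random.Random(seed)  # private generator so runs can be reproduced
        self.previous = None            # nothing to repeat on the first step

    def apply(self, requested):
        """Return the executed action: the previous one if it 'sticks', else the requested one."""
        repeat = self.previous is not None and self.rng.random() < self.stickiness
        executed = self.previous if repeat else requested
        self.previous = executed
        return executed


# Two degenerate settings make the behaviour easy to see.
never_sticky = StickyActions(stickiness=0.0)
always_sticky = StickyActions(stickiness=1.0)
fresh = [never_sticky.apply(a) for a in ["LEFT", "RIGHT", "FIRE"]]
stuck = [always_sticky.apply(a) for a in ["LEFT", "RIGHT", "FIRE"]]
```

With stickiness 0 every requested action is executed unchanged; with stickiness 1 the first action is repeated on every later step. Intermediate values inject stochasticity that an agent cannot exploit by memorising a fixed action sequence.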
#5453

(Journal track) MCTS-Minimax Hybrids with State Evaluations
Hendrik Baier, Mark H. M. Winands

Game Playing and Search

Monte-Carlo Tree Search (MCTS) has been found to show weaker play than minimax-based search in some tactical game domains. In order to combine the tactical strength of minimax and the strategic strength of MCTS, MCTS-minimax hybrids have been proposed in prior work. This article continues this line of research for the case where heuristic state evaluation functions are available. Three different approaches are considered, employing minimax in the rollout phase of MCTS, as a replacement for the rollout phase, and as a node prior to bias move selection. The latter two approaches are newly proposed. Results show that the use of enhanced minimax for computing node priors results in the strongest MCTS-minimax hybrid in the three test domains of Othello, Breakthrough, and Catch the Lion. This hybrid also outperforms enhanced minimax as a standalone player in Breakthrough, demonstrating that at least in this domain, MCTS and minimax can be combined into an algorithm stronger than its parts.
#3932

High-Fidelity Simulated Players for Interactive Narrative Planning
Pengcheng Wang, Jonathan Rowe, Wookhee Min, Bradford Mott, James Lester

Game Playing and Search

Interactive narrative planning offers significant potential for creating adaptive gameplay experiences. While data-driven techniques have been devised that utilize player interaction data to induce policies for interactive narrative planners, they require enormously large gameplay datasets. A promising approach to addressing this challenge is creating simulated players whose behaviors closely approximate those of human players. In this paper, we propose a novel approach to generating high-fidelity simulated players based on deep recurrent highway networks and deep convolutional networks. Empirical results demonstrate that the proposed models significantly outperform the prior state-of-the-art in generating high-fidelity simulated player models that accurately imitate human players’ narrative interactions. Using the high-fidelity simulated player models, we show the advantage of more exploratory reinforcement learning methods for deriving generalizable narrative adaptation policies.
#1130

Knowledge-Guided Agent-Tactic-Aware Learning for StarCraft Micromanagement
Yue Hu, Juntao Li, Xi Li, Gang Pan, Mingliang Xu

Game Playing and Search

As an important and challenging problem in artificial intelligence (AI) game playing, StarCraft micromanagement involves a dynamically adversarial game playing process with complex multi-agent control within a large action space. In this paper, we propose a novel knowledge-guided agent-tactic-aware learning scheme, that is, opponent-guided tactic learning (OGTL), to cope with this micromanagement problem. In principle, the proposed scheme takes a two-stage cascaded learning strategy which is capable of not only transferring the human tactic knowledge from the human-made opponent agents to our AI agents but also improving the adversarial ability. With the power of reinforcement learning, such a knowledge-guided agent-tactic-aware scheme has the ability to guide the AI agents to achieve high winning-rate performances while accelerating the policy exploration process in a tactic-interpretable fashion. Experimental results demonstrate the effectiveness of the proposed scheme against the state-of-the-art approaches in several benchmark combat scenarios.

Monday 16 10:15 - 11:15 NLP-GEN - Natural Language Generation (T2)

Chair: Rui Yan

#572

SentiGAN: Generating Sentimental Texts via Mixture Adversarial Networks
Ke Wang, Xiaojun Wan

Natural Language Generation

Generating texts of different sentiment labels is getting more and more attention in the area of natural language generation. Recently, Generative Adversarial Net (GAN) has shown promising results in text generation. However, the texts generated by GAN usually suffer from the problems of poor quality, lack of diversity and mode collapse. In this paper, we propose a novel framework - SentiGAN, which has multiple generators and one multi-class discriminator, to address the above problems. In our framework, multiple generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision. We propose a penalty based objective in the generators to force each of them to generate diversified examples of a specific sentiment label. Moreover, the use of multiple generators and one multi-class discriminator can make each generator focus on generating its own examples of a specific sentiment label accurately. Experimental results on four datasets demonstrate that our model consistently outperforms several state-of-the-art text generation methods in the sentiment accuracy and quality of generated texts.
#886

Generating Thematic Chinese Poetry using Conditional Variational Autoencoders with Hybrid Decoders
Xiaopeng Yang, Xiaowen Lin, Shunda Suo, Ming Li

Natural Language Generation

Computer poetry generation is our first step towards computer writing. Writing must have a theme. The current approaches of using sequence-to-sequence models with attention often produce non-thematic poems. We present a novel conditional variational autoencoder with a hybrid decoder adding the deconvolutional neural networks to the general recurrent neural networks to fully learn topic information via latent variables. This approach significantly improves the relevance of the generated poems by representing each line of the poem not only in a context-sensitive manner but also in a holistic way that is highly related to the given keyword and the learned topic. A proposed augmented word2vec model further improves the rhythm and symmetry. Tests show that the generated poems by our approach are mostly satisfying with regulated rules and consistent themes, and 73.42% of them receive an Overall score no less than 3 (the highest score is 5).
#1708

Chinese Poetry Generation with a Working Memory Model
Xiaoyuan Yi, Maosong Sun, Ruoyu Li, Zonghan Yang

Natural Language Generation

As an exquisite and concise literary form, poetry is a gem of human culture. Automatic poetry generation is an essential step towards computer creativity. In recent years, several neural models have been designed for this task. However, among lines of a whole poem, the coherence in meaning and topics still remains a big challenge. In this paper, inspired by the theoretical concept in cognitive psychology, we propose a novel Working Memory model for poetry generation. Different from previous methods, our model explicitly maintains topics and informative limited history in a neural memory. During the generation process, our model reads the most relevant parts from memory slots to generate the current line. After each line is generated, it writes the most salient parts of the previous line into memory slots. By dynamic manipulation of the memory, our model keeps a coherent information flow and learns to express each topic flexibly and naturally. We experiment on three different genres of Chinese poetry: quatrain, iambic and chinoiserie lyric. Both automatic and human evaluation results show that our model outperforms current state-of-the-art methods.
#2499

Topic-to-Essay Generation with Neural Networks
Xiaocheng Feng, Ming Liu, Jiahao Liu, Bing Qin, Yibo Sun, Ting Liu

Natural Language Generation

We focus on essay generation, which is a challenging task that generates a paragraph-level text with multiple topics. Progress towards understanding different topics and expressing diversity in this task requires more powerful generators and richer training and evaluation resources. To address this, we develop a multi-topic aware long short-term memory (MTA-LSTM) network. In this model, we maintain a novel multi-topic coverage vector, which learns the weight of each topic and is sequentially updated during the decoding process. Afterwards this vector is fed to an attention model to guide the generator. Moreover, we automatically construct two paragraph-level Chinese essay corpora, 305,000 essay paragraphs and 55,000 question-and-answer pairs. Empirical results show that our approach obtains much better BLEU score compared to various baselines. Furthermore, human judgment shows that MTA-LSTM has the ability to generate essays that are not only coherent but also closely related to the input topics.
#2846

Toward Diverse Text Generation with Inverse Reinforcement Learning
Zhan Shi, Xinchi Chen, Xipeng Qiu, Xuanjing Huang

Natural Language Generation

Text generation is a crucial task in NLP. Recently, several adversarial generative models have been proposed to alleviate the exposure bias problem in text generation. Though these models gain great success, they still suffer from the problems of reward sparsity and mode collapse. In order to address these two problems, in this paper, we employ inverse reinforcement learning (IRL) for text generation. Specifically, the IRL framework learns a reward function on training data, and then an optimal policy to maximize the expected total reward. Similar to the adversarial models, the reward and policy function in IRL are optimized alternately. Our method has two advantages: (1) the reward function can produce more dense reward signals; (2) the generation policy, trained by "entropy regularized" policy gradient, encourages the generation of more diversified texts. Experiment results demonstrate that our proposed method can generate higher quality texts than the previous methods.

Monday 16 10:15 - 11:15 CV-UNS - Computer Vision and Unsupervised Learning (T1)

Chair: Mohamed Amer

#1559

Co-attention CNNs for Unsupervised Object Co-segmentation
Kuang-Jui Hsu, Yen-Yu Lin, Yung-Yu Chuang

Computer Vision and Unsupervised Learning

Object co-segmentation aims to segment the common objects in images. This paper presents a CNN-based method that is unsupervised and end-to-end trainable to better solve this task. Our method is unsupervised in the sense that it does not require any training data in the form of object masks but merely a set of images jointly covering objects of a specific class. Our method comprises two collaborative CNN modules, a feature extractor and a co-attention map generator. The former module extracts the features of the estimated objects and backgrounds, and is derived based on the proposed co-attention loss which minimizes inter-image object discrepancy while maximizing intra-image figure-ground separation. The latter module learns to generate co-attention maps by which the estimated figure-ground segmentation can better fit the former module. Besides the co-attention loss, a mask loss is developed to retain the whole objects and remove noise. Experiments show that our method achieves superior results, even outperforming the state-of-the-art, supervised methods.
#970

Complementary Binary Quantization for Joint Multiple Indexing
Qiang Fu, Xu Han, Xianglong Liu, Jingkuan Song, Cheng Deng

Computer Vision and Unsupervised Learning

Building multiple hash tables has been proven a successful technique for indexing massive databases, which can guarantee a desired level of overall performance. However, existing hash based multi-indexing methods suffer from heavy redundancy, without strong table complementarity and effective hash code learning. To address the problems, this paper proposes a complementary binary quantization (CBQ) method to jointly learn multiple hash tables. It exploits the power of incomplete binary coding based on prototypes to align the original space and the Hamming space, and further utilizes the nature of multi-indexing search to jointly reduce the quantization loss based on the prototype based hash function. Our alternating optimization adaptively discovers the complementary prototype sets and the corresponding code sets of a varying size in an efficient way, which together robustly approximate the data relations. Our method can be naturally generalized to the product space for long hash codes. Extensive experiments carried out on two popular large-scale tasks including Euclidean and semantic nearest neighbor search demonstrate that the proposed CBQ method enjoys strong table complementarity and significantly outperforms the state-of-the-art, with relative performance gains of up to 57.76%.
#931

Cascaded Low Rank and Sparse Representation on Grassmann Manifolds
Boyue Wang, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin

Computer Vision and Unsupervised Learning

Inspired by the success of low rank representation and sparse subspace clustering, researchers have attempted to simultaneously impose low rank and sparse constraints on the affinity matrix to improve the performance. However, it is just a trade-off between these two constraints. In this paper, we propose a novel Cascaded Low Rank and Sparse Representation (CLRSR) method for subspace clustering, which seeks the sparse expression on the former learned low rank latent representation. To make our proposed method suitable to multi-dimension or imageset data, we extend CLRSR onto Grassmann manifolds. An effective solution and its convergence analysis are also provided. The excellent experimental results demonstrate the proposed method is more robust than other state-of-the-art clustering methods on imageset data.
#686

Unpaired Multi-Domain Image Generation via Regularized Conditional GANs
Xudong Mao, Qing Li

Computer Vision and Unsupervised Learning

In this paper, we study the problem of multi-domain image generation, the goal of which is to generate pairs of corresponding images from different domains. With the recent development in generative models, image generation has achieved great progress and has been applied to various computer vision tasks. However, multi-domain image generation may not achieve the desired performance due to the difficulty of learning the correspondence of different domain images, especially when the information of paired samples is not given. To tackle this problem, we propose Regularized Conditional GAN (RegCGAN), which is capable of learning to generate corresponding images in the absence of paired training data. RegCGAN is based on the conditional GAN, and we introduce two regularizers to guide the model to learn the corresponding semantics of different domains. We evaluate the proposed model on several tasks for which paired training data is not given, including the generation of edges and photos, the generation of faces with different attributes, etc. The experimental results show that our model can successfully generate corresponding images for all these tasks, while outperforming the baseline methods. We also introduce an approach of applying RegCGAN to unsupervised domain adaptation.
#287

Self-Representative Manifold Concept Factorization with Adaptive Neighbors for Clustering
Sihan Ma, Lefei Zhang, Wenbin Hu, Yipeng Zhang, Jia Wu, Xuelong Li

Computer Vision and Unsupervised Learning

Matrix Factorization based methods, e.g., the Concept Factorization (CF) and Nonnegative Matrix Factorization (NMF), have been proved to be efficient and effective for data clustering tasks. In recent years, various graph extensions of CF and NMF have been proposed to explore intrinsic geometrical structure of data for the purpose of better clustering performance. However, many methods build the affinity matrix used in the manifold structure directly based on the input data. Therefore, the clustering results are highly sensitive to the input data. To further improve the clustering performance, we propose a novel manifold concept factorization model with adaptive neighbor structure to learn a better affinity matrix and clustering indicator matrix at the same time. Technically, the proposed model constructs the affinity matrix by assigning the adaptive and optimal neighbors to each point based on the local distance of the learned new representation of the original data with itself as a dictionary. Our experimental results present superior performance over the state-of-the-art alternatives on numerous datasets.

Monday 16 10:15 - 11:15 UAI-GPI - Graphical Models, Probabilistic Inference (K11)

Chair: Manfred Jaeger

#3791

Efficient Localized Inference for Large Graphical Models
Jinglin Chen, Jian Peng, Qiang Liu

Graphical Models, Probabilistic Inference

We propose a new localized inference algorithm for answering marginalization queries in large graphical models with the correlation decay property. Given a query variable and a large graphical model, we define a much smaller model in a local region around the query variable in the target model so that the marginal distribution of the query variable can be accurately approximated. We introduce two approximation error bounds based on Dobrushin’s comparison theorem and apply our bounds to derive a greedy expansion algorithm that efficiently guides the selection of neighbor nodes for localized inference. We verify our theoretical bounds on various datasets and demonstrate that our localized inference algorithm can provide fast and accurate approximation for large graphical models.
#1119

Parameterised Queries and Lifted Query Answering
Tanya Braun, Ralf Möller

Graphical Models, Probabilistic Inference

A standard approach for inference in probabilistic formalisms with first-order constructs is lifted variable elimination (LVE) for single queries. To handle multiple queries efficiently, the lifted junction tree algorithm (LJT) employs a first-order cluster representation of a model and LVE as a subroutine. Both algorithms answer conjunctive queries of propositional random variables, shattering the model on the query, which causes unnecessary groundings for conjunctive queries of interchangeable variables. This paper presents parameterised queries as a means to avoid groundings, applying the lifting idea to queries. Parameterised queries enable LVE and LJT to compute answers faster, while compactly representing queries and answers.
#1097

Lifted Filtering via Exchangeable Decomposition
Stefan Lüdtke, Max Schröder, Sebastian Bader, Kristian Kersting, Thomas Kirste

Graphical Models, Probabilistic Inference

We present a model for exact recursive Bayesian filtering based on lifted multiset states. Combining multisets with lifting makes it possible to simultaneously exploit multiple strategies for reducing inference complexity when compared to list-based grounded state representations. The core idea is to borrow the concept of Maximally Parallel Multiset Rewriting Systems and to enhance it by concepts from Rao-Blackwellization and Lifted Inference, giving a representation of state distributions that enables efficient inference. In worlds where the random variables that define the system state are exchangeable -- where the identity of entities does not matter -- it automatically uses a representation that abstracts from ordering (achieving an exponential reduction in complexity) -- and it automatically adapts when observations or system dynamics destroy exchangeability by breaking symmetry.
#3864

Efficient Symbolic Integration for Probabilistic Inference
Samuel Kolb, Martin Mladenov, Scott Sanner, Vaishak Belle, Kristian Kersting

Graphical Models, Probabilistic Inference

Weighted model integration (WMI) extends weighted model counting (WMC) to the integration of functions over mixed discrete-continuous probability spaces. It has shown tremendous promise for solving inference problems in graphical models and probabilistic programs. Yet, state-of-the-art tools for WMI are generally limited either by the range of amenable theories, or in terms of performance. To address both limitations, we propose the use of extended algebraic decision diagrams (XADDs) as a compilation language for WMI. Aside from tackling typical WMI problems, XADDs also enable partial WMI yielding parametrized solutions. To overcome the main roadblock of XADDs -- the computational cost of integration -- we formulate a novel and powerful exact symbolic dynamic programming (SDP) algorithm that seamlessly handles Boolean, integer-valued and real variables, and is able to effectively cache partial computations, unlike its predecessor. Our empirical results demonstrate that these contributions can lead to a significant computational reduction over existing probabilistic inference algorithms.
#3464

Metadata-dependent Infinite Poisson Factorization for Efficiently Modelling Sparse and Large Matrices in Recommendation
Trong Dinh Thac Do, Longbing Cao

Graphical Models, Probabilistic Inference

Matrix Factorization (MF) is widely used in Recommender Systems (RSs) for estimating missing ratings in the rating matrix. MF faces major challenges in handling very sparse and large data. Poisson Factorization (PF), an MF variant, addresses these challenges with high efficiency by only computing on the non-missing elements. However, ignoring the missing elements in computation makes PF weak at, or incapable of, dealing with columns or rows with very few observations (corresponding to sparse items or users). In this work, Metadata-dependent Poisson Factorization (MPF) is proposed to address the user/item sparsity by integrating user/item metadata into PF. MPF adds the metadata-based observed entries to the factorized PF matrices. In addition, similar to MF, choosing a suitable number of latent components for PF is very expensive on very large datasets. Accordingly, we further extend MPF to Metadata-dependent Infinite Poisson Factorization (MIPF), which integrates a Bayesian Nonparametric (BNP) technique to automatically tune the number of latent components. Our empirical results show that, by integrating metadata, MPF/MIPF significantly outperform the state-of-the-art PF models for sparse and large datasets. MIPF also effectively estimates the number of latent components.

Monday 16 10:15 - 11:15 ROB-ROB - Robotics (C2)

Chair: Danica Kragic

#453

Interactive Robot Transition Repair With SMT
Jarrett Holtz, Arjun Guha, Joydeep Biswas

Robotics

Complex robot behaviors are often structured as state machines, where states encapsulate actions and a transition function switches between states. Since transitions depend on physical parameters, when the environment changes, a roboticist has to painstakingly readjust the parameters to work in the new environment. We present interactive SMT-based Robot Transition Repair (SRTR): instead of manually adjusting parameters, we ask the roboticist to identify a few instances where the robot is in a wrong state and what the right state should be. An automated analysis of the transition function 1) identifies adjustable parameters, 2) converts the transition function into a system of logical constraints, and 3) formulates the constraints and user-supplied corrections as a MaxSMT problem that yields new parameter values. We show that SRTR finds new parameters 1) quickly, 2) with few corrections, and 3) that the parameters generalize to new scenarios. We also show that an SRTR-corrected state machine can outperform a more complex, expert-tuned state machine.
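The repair objective described above (satisfy as many user corrections as possible while changing parameters as little as possible) can be illustrated with a toy brute-force search. This is only a sketch and not the SRTR implementation: it replaces the MaxSMT solver with enumeration over candidate values, and the single-threshold transition rule, the function name `repair_threshold`, and the data layout are all hypothetical.

```python
def repair_threshold(current, corrections, candidates):
    """Pick the candidate threshold that satisfies the most corrections.

    Assumption (not from the paper): the transition fires when
    distance < threshold, and each correction is a pair
    (distance, should_fire) supplied by the roboticist.
    Ties are broken in favor of the smallest change from `current`.
    """
    def satisfied(threshold):
        # Count corrections whose desired outcome matches the rule's outcome.
        return sum((d < threshold) == fire for d, fire in corrections)

    return max(candidates, key=lambda t: (satisfied(t), -abs(t - current)))
```

A real MaxSMT formulation would treat the corrections as soft constraints over symbolic parameters instead of enumerating candidates; the enumeration here only makes the objective concrete.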
#1446

Learning Unmanned Aerial Vehicle Control for Autonomous Target Following
Siyi Li, Tianbo Liu, Chi Zhang, Dit-Yan Yeung, Shaojie Shen

Robotics

While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process. However, real-world robotic applications often need a data-efficient learning process with safety-critical constraints. In this paper, we consider the challenging problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire a strategy that combines perception and control, we represent the policy by a convolutional neural network. We develop a hierarchical approach that combines a model-free policy gradient method with a conventional feedback proportional-integral-derivative (PID) controller to enable stable learning without catastrophic failure. The neural network is trained by a combination of supervised learning from raw images and reinforcement learning from games of self-play. We show that the proposed approach can learn a target following policy in a simulator efficiently and the learned behavior can be successfully transferred to the DJI quadrotor platform for real-world UAV control.
#3816

Online, Interactive User Guidance for High-dimensional, Constrained Motion Planning
Fahad Islam, Oren Salzman, Maxim Likhachev

Robotics

We consider the problem of planning a collision-free path for a high-dimensional robot. Specifically, we suggest a planning framework where a motion-planning algorithm can obtain guidance from a user. In contrast to existing approaches that try to speed up planning by incorporating experiences or demonstrations ahead of planning, we suggest seeking user guidance only when the planner identifies that it has ceased to make significant progress towards the goal. Guidance is provided in the form of an intermediate configuration q̂, which is used to bias the planner to go through q̂. We demonstrate our approach for the case where the planning algorithm is Multi-Heuristic A* (MHA*) and the robot is a 34-DOF humanoid. We show that our approach allows computing highly-constrained paths with little domain knowledge. Without our approach, solving such problems requires carefully-crafted domain-dependent heuristics.
#5115

(Sister Conferences Best Papers Track) A Unifying View of Geometry, Semantics, and Data Association in SLAM
Nikolay Atanasov, Sean L. Bowman, Kostas Daniilidis, George J. Pappas

Robotics

Traditional approaches for simultaneous localization and mapping (SLAM) rely on geometric features such as points, lines, and planes to infer the environment structure. They make hard decisions about the (data) association between observed features and mapped landmarks to update the environment model. This paper makes two contributions to the state of the art in SLAM. First, it generalizes the purely geometric model by introducing semantically meaningful objects, represented as structured models of mid-level part features. Second, instead of making hard, potentially wrong associations between semantic features and objects, it shows that SLAM inference can be performed efficiently with probabilistic data association. The approach not only allows building meaningful maps (containing doors, chairs, cars, etc.) but also offers significant advantages in ambiguous environments.
#912

Learning Transferable UAV for Forest Visual Perception
Lyujie Chen, Wufan Wang, Jihong Zhu

Robotics

In this paper, we propose a new pipeline for training a monocular UAV to fly a collision-free trajectory along a dense forest trail. As gathering high-precision images in the real world is expensive and the off-the-shelf dataset has some deficiencies, we collect a new dense forest trail dataset in a variety of simulated environments in Unreal Engine. Then we formulate visual perception of forests as a classification problem. A ResNet-18 model is trained to decide the moving direction frame by frame. To transfer the learned strategy to the real world, we construct a ResNet-18 adaptation model via multi-kernel maximum mean discrepancies to leverage the relevant labelled data and alleviate the discrepancy between simulated and real environments. Simulation and real-world flights with a variety of appearance and environment changes are both tested. The ResNet-18 adaptation and its variant model achieve the best result of 84.08% accuracy in reality.

Monday 16 10:15 - 11:15 ML-KER - Kernel Methods (C3)

Chair: Xinwang Liu

#1209

Fast Factorization-free Kernel Learning for Unlabeled Chunk Data Streams
Yi Wang, Nan Xue, Xin Fan, Jiebo Luo, Risheng Liu, Bin Chen, Haojie Li, Zhongxuan Luo

Kernel Methods

Data stream analysis aims at extracting discriminative information for classification from continuously incoming samples. It is extremely challenging to detect novel data while updating the model in an efficient and stable fashion, especially for chunk data. This paper proposes a fast factorization-free kernel learning method to unify novelty detection and incremental learning for unlabeled chunk data streams in one framework. The proposed method constructs a joint reproducing kernel Hilbert space from known class centers by solving a linear system in kernel space. Naturally, unlabeled data can be detected and classified among multiple classes by a single decision model. Projecting samples into the discriminative feature space turns out to be the product of two small-sized kernel matrices, without needing time-consuming factorizations such as QR decomposition or singular value decomposition. Moreover, the insertion of a novel class can be treated as the addition of a new orthogonal basis to the existing feature space, resulting in fast and stable updating schemes. Both theoretical analysis and experimental validation on real-world datasets demonstrate that the proposed methods learn chunk data streams with significantly lower computational costs and accuracy comparable or superior to the state of the art.
#3107

A Property Testing Framework for the Theoretical Expressivity of Graph Kernels
Nils M. Kriege, Christopher Morris, Anja Rey, Christian Sohler

Kernel Methods

Graph kernels are applied heavily for the classification of structured data. However, their expressivity is assessed almost exclusively from experimental studies and there is no theoretical justification why one kernel is in general preferable over another. We introduce a theoretical framework for investigating the expressive power of graph kernels, which is inspired by concepts from the area of property testing. We introduce the notion of distinguishability of a graph property by a graph kernel. For several established graph kernels we show that they cannot distinguish essential graph properties. In order to overcome this, we consider a kernel based on k-disc frequencies. We show that this efficiently computable kernel can distinguish fundamental graph properties. Finally, we obtain learning guarantees for nearest neighbor classifiers in our framework.
#3781

A Degeneracy Framework for Graph Similarity
Giannis Nikolentzos, Polykarpos Meladianos, Stratis Limnios, Michalis Vazirgiannis

Kernel Methods

The problem of accurately measuring the similarity between graphs is at the core of many applications in a variety of disciplines. Most existing methods for graph similarity focus either on local or on global properties of graphs. However, even if graphs seem very similar from a local or a global perspective, they may exhibit different structure at different scales. In this paper, we present a general framework for graph similarity which takes into account structure at multiple different scales. The proposed framework capitalizes on the well-known k-core decomposition of graphs in order to build a hierarchy of nested subgraphs. We apply the framework to derive variants of four graph kernels, namely graphlet kernel, shortest-path kernel, Weisfeiler-Lehman subtree kernel, and pyramid match graph kernel. The framework is not limited to graph kernels, but can be applied to any graph comparison algorithm. The proposed framework is evaluated on several benchmark datasets for graph classification. In most cases, the core-based kernels achieve significant improvements in terms of classification accuracy over the base kernels, while their time complexity remains very attractive.
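The idea of comparing graphs across a hierarchy of nested k-cores can be sketched as follows. This is a minimal illustration, assuming the core variant of a kernel sums a base kernel over matching k-cores of the two graphs up to the smaller degeneracy; the degree-histogram base kernel and all function names are hypothetical stand-ins, not the kernels evaluated in the paper. Graphs are assumed to be dictionaries mapping each vertex to the set of its neighbors.

```python
from collections import Counter


def core_numbers(adj):
    """Core number of every vertex, found by repeatedly peeling a minimum-degree vertex."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=lambda u: degree[u])
        k = max(k, degree[v])  # core numbers never decrease during peeling
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def k_core(adj, core, k):
    """Subgraph induced by the vertices whose core number is at least k."""
    keep = {v for v in adj if core[v] >= k}
    return {v: adj[v] & keep for v in keep}


def degree_histogram_kernel(g1, g2):
    """Hypothetical base kernel: dot product of the two degree histograms."""
    h1 = Counter(len(nbrs) for nbrs in g1.values())
    h2 = Counter(len(nbrs) for nbrs in g2.values())
    return sum(h1[d] * h2[d] for d in h1)


def core_kernel(g1, g2, base_kernel=degree_histogram_kernel):
    """Sum the base kernel over the k-cores of both graphs for k = 0..min degeneracy."""
    c1, c2 = core_numbers(g1), core_numbers(g2)
    k_max = min(max(c1.values(), default=0), max(c2.values(), default=0))
    return sum(
        base_kernel(k_core(g1, c1, k), k_core(g2, c2, k))
        for k in range(k_max + 1)
    )
```

Because `base_kernel` is a parameter, the same wrapper could in principle be applied to any graph comparison function, which mirrors the abstract's remark that the framework is not limited to graph kernels.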
#575

Fast Cross-Validation
Yong Liu, Hailun Lin, Lizhong Ding, Weiping Wang, Shizhong Liao

Kernel Methods

Cross-validation (CV) is the most widely adopted approach for selecting the optimal model. However, the computation of CV has high complexity because the learner must be trained multiple times, making it impractical for large scale model selection. In this paper, we present an approximate approach to CV based on the theoretical notion of Bouligand influence function (BIF) and the Nyström method for kernel methods. We first establish the relationship between the theoretical notion of BIF and CV, and propose a method to approximate the CV via the Taylor expansion of BIF. Then, we provide a novel computing method to calculate the BIF for general distribution, and evaluate BIF for sample distribution. Finally, we use the Nyström method to accelerate the computation of the BIF matrix, giving the final approximate CV criterion. The proposed approximate CV requires training only once and is suitable for a wide variety of kernel methods. Experimental results on many datasets show that our approximate CV has no statistical discrepancy with the original CV, but can significantly improve the efficiency.
#2383

Beyond Similar and Dissimilar Relations: A Kernel Regression Formulation for Metric Learning
Pengfei Zhu, Ren Qi, Qinghua Hu, Qilong Wang, Changqing Zhang, Liu Yang

Kernel Methods

Most existing metric learning methods focus on learning a similarity or distance measure relying on similar and dissimilar relations between sample pairs. However, pairs of samples cannot be simply identified as similar or dissimilar in many real-world applications, e.g., multi-label learning, label distribution learning or tasks with continuous decision values. To this end, in this paper we propose a novel relation alignment metric learning (RAML) formulation to handle the metric learning problem in those scenarios. Since the relation of two samples can be measured by the difference degree of the decision values, motivated by the consistency of the sample relations in the feature space and decision space, our proposed RAML utilizes the sample relations in the decision space to guide the metric learning in the feature space. Specifically, our RAML method formulates metric learning as a kernel regression problem, which can be efficiently optimized by the standard regression solvers. We carry out several experiments on the single-label classification, multi-label classification, and label distribution learning tasks, to demonstrate that our method achieves favorable performance against the state-of-the-art methods.

Monday 16 11:25 - 12:40 EAR1 - EARLY CAREER 1 (VICTORIA)

Chair: Subbarao Kambhampati

#5449

Towards Human-Engaged AI
Xiaojuan Ma

EARLY CAREER 1

Engagement, the key construct that describes the synergy between human (users) and technology (computing systems), is gaining increasing attention in academia and industry. Human-Engaged AI (HEAI) is an emerging research paradigm that aims to jointly advance the capability and capacity of human and AI technology. In this paper, we first review the key concepts in HEAI and its driving force from the integration of Artificial Intelligence (AI) and Human-Computer Interaction (HCI). Then we present an HEAI framework developed from our own work.
#5492

Probabilistic Machine Learning: Models, Algorithms and a Programming Library
Jun Zhu

EARLY CAREER 1

Probabilistic machine learning provides a suite of powerful tools for modeling uncertainty, performing probabilistic inference, and making predictions or decisions in uncertain environments. In this paper, we present an overview of our recent work on probabilistic machine learning, including the theory of regularized Bayesian inference, Bayesian deep learning, scalable inference algorithms, a probabilistic programming library named ZhuSuan, and applications in representation learning as well as learning from crowds.
#5480

Decision-Making Under Uncertainty in Multi-Agent and Multi-Robot Systems: Planning and Learning
Christopher Amato

EARLY CAREER 1

Multi-agent planning and learning methods are becoming increasingly important in today's interconnected world. Methods for real-world domains, such as robotics, must consider uncertainty and limited communication in order to generate high-quality, robust solutions. This paper discusses our work on developing principled models to represent these problems and planning and learning methods that can scale to realistic multi-agent and multi-robot tasks.

Monday 16 11:25 - 12:50 KR-MAS1 - Knowledge Representation and Agents: Games, Decision, Social Choice (C7)

Chair: Takayuki Ito

#1047

Ceteris paribus majority for social ranking
Adrian Haret, Hossein Khani, Stefano Moretti, Meltem Öztürk

Knowledge Representation and Agents: Games, Decision, Social Choice

We study the problem of finding a social ranking over individuals given a ranking over coalitions formed by them. We investigate the use of a ceteris paribus majority principle as a social ranking solution inspired from the classical axioms of social choice theory. Faced with a Condorcet-like paradox, we analyze the consequences of restricting the domain according to an adapted version of single-peakedness. We conclude with a discussion on different interpretations of incompleteness of the ranking over coalitions and its exploitation for defining new social rankings, providing a new rule as an example.
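A pairwise comparison in the spirit of a ceteris paribus majority principle can be sketched as below. The sketch assumes that individuals i and j are compared by counting the coalitions S containing neither of them for which S ∪ {i} is ranked above S ∪ {j}, and vice versa; this reading, the numeric `score` encoding of the coalition ranking (higher is better, equal means indifferent), and the function name are assumptions made for illustration.

```python
from itertools import combinations


def cp_majority(individuals, score):
    """Return {(i, j): margin} for every pair; margin > 0 means i beats j.

    `score` maps each non-empty coalition (a frozenset) to a number that
    encodes its position in the ranking over coalitions.
    """
    result = {}
    for i, j in combinations(individuals, 2):
        others = [x for x in individuals if x not in (i, j)]
        margin = 0
        for size in range(len(others) + 1):
            for subset in combinations(others, size):
                s = frozenset(subset)
                a, b = score[s | {i}], score[s | {j}]
                # +1 when S+{i} is ranked above S+{j}, -1 when below, 0 when tied.
                margin += (a > b) - (a < b)
        result[(i, j)] = margin
    return result
```

The pairwise margins need not form a transitive order, which is where a Condorcet-like paradox of the kind mentioned in the abstract can arise.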
#2116

An Efficient Algorithm To Compute Distance Between Lexicographic Preference Trees
Minyi Li, Borhan Kazimipour

Knowledge Representation and Agents: Games, Decision, Social Choice

Very often, we have to look into multiple agents' preferences, and compare or aggregate them. In this paper, we consider the well-known model, namely, lexicographic preference trees (LP-trees), for representing agents' preferences in combinatorial domains. We tackle the problem of calculating the dissimilarity/distance between agents' LP-trees. We propose an algorithm LpDis to compute the number of disagreed pairwise preferences between agents by traversing their LP-trees. The proposed algorithm is computationally efficient and allows agents to have different attribute importance structures and preference dependencies.
#2829

Game Description Language and Dynamic Epistemic Logic Compared
Thorsten Engesser, Robert Mattmüller, Bernhard Nebel, Michael Thielscher

Knowledge Representation and Agents: Games, Decision, Social Choice

Several different frameworks have been proposed to model and reason about knowledge in dynamic multi-agent settings, among them the logic-programming-based game description language GDL-III, and dynamic epistemic logic (DEL), based on possible-worlds semantics. GDL-III and DEL have complementary strengths and weaknesses in terms of ease of modeling and simplicity of semantics. In this paper, we formally study the expressiveness of GDL-III vs. DEL. We clarify the commonalities and differences between those languages, demonstrate how to bridge the differences where possible, and identify large fragments of GDL-III and DEL that are equivalent in the sense that they can be used to encode games or planning tasks that admit the same legal action sequences. We prove the latter by providing compilations between those fragments of GDL-III and DEL.
#3489

Goal-Based Collective Decisions: Axiomatics and Computational Complexity
Arianna Novaro, Umberto Grandi, Dominique Longin, Emiliano Lorini

Knowledge Representation and Agents: Games, Decision, Social Choice

We study agents expressing propositional goals over a set of binary issues to reach a collective decision. We adapt properties and rules from the literature on Social Choice Theory to our setting, providing an axiomatic characterisation of a majority rule for goal-based voting. We study the computational complexity of finding the outcome of our rules (i.e., winner determination), showing that it ranges from Nondeterministic Polynomial Time (NP) to Probabilistic Polynomial Time (PP).
#3717

Accountable Approval Sorting
Khaled Belahcene, Yann Chevaleyre, Christophe Labreuche, Nicolas Maudet, Vincent Mousseau, Wassila Ouerdane

Knowledge Representation and Agents: Games, Decision, Social Choice

We consider decision situations in which a set of points of view (voters, criteria) are to sort a set of candidates to ordered categories (Good/Bad). Candidates are judged good, when approved by a sufficient set of points of view; this corresponds to NonCompensatory Sorting. To be accountable, such approval sorting should provide guarantees about the decision process and decisions concerning specific candidates. We formalize accountability using a feasibility problem expressed as a boolean satisfiability formulation. We illustrate different forms of accountability when a committee decides with approval sorting and study the information that should be disclosed by the committee.
#5467

(Journal track) Impossibility in Belief Merging
Amilcar Mata Diaz, Ramon Pino Perez

Knowledge Representation and Agents: Games, Decision, Social Choice

With the aim of studying social properties of belief merging and having a better understanding of impossibility, we extend in three ways the framework of logic-based merging introduced by Konieczny and Pino Perez. First, at the level of representation of the information, we pass from belief bases to complex epistemic states. Second, the profiles are represented as functions of finite societies to the set of epistemic states (a sort of vectors) and not as multisets of epistemic states. Third, we extend the set of rational postulates in order to consider the epistemic versions of the classical postulates of social choice theory: standard domain, Pareto property, independence of irrelevant alternatives and absence of dictator. These epistemic versions of social postulates are given, essentially, in terms of the finite propositional logic. We state some representation theorems for these operators. These extensions and representation theorems allow us to establish an epistemic and very general version of Arrow's impossibility theorem. One of the interesting features of our result is that it holds for different representations of epistemic states; for instance conditionals, ordinal conditional functions and, of course, total preorders.
#1251

A Savage-style Utility Theory for Belief Functions
Chunlai Zhou, Biao Qin, Xiaoyong Du

Knowledge Representation and Agents: Games, Decision, Social Choice

In this paper, we provide an axiomatic justification for decision making with belief functions by studying the belief-function counterpart of Savage's Theorem where the state space is finite and the consequence set is a continuum [l, M] (l<M). We propose six axioms for a preference relation over acts, and then show that this axiomatization admits a definition of qualitative belief functions comparing preferences over events that guarantees the existence of a belief function on the state space. The key axioms are uniformity and an analogue of the independence axiom. The uniformity axiom is used to ensure that all acts with the same maximal and minimal consequences must be equivalent. And our independence axiom shows the existence of a utility function and implies the uniqueness of the belief function on the state space. Moreover, we prove without the independence axiom the neutrality theorem that two acts are indifferent whenever they generate the same belief functions over consequences. At the end of the paper, we compare our approach with other related decision theories for belief functions.

Monday 16 11:25 - 12:50 PS-SEA - Planning and Search (K2)

Chair: Felipe Meneguzzi

#223

Analyzing Tie-Breaking Strategies for the A* Algorithm
Augusto B. Corrêa, André G. Pereira, Marcus Ritt

Planning and Search

For a given state space and admissible heuristic function h there is always a tie-breaking strategy for which A* expands the minimum number of states [Dechter and Pearl, 1985]. We say that these strategies have optimal expansion. Although such a strategy always exists it may depend on the instance, and we currently do not know a tie-breaker that always guarantees optimal expansion. In this paper, we study tie-breaking strategies for A*. We analyze common strategies from the literature and prove that they do not have optimal expansion. We propose a novel tie-breaking strategy using cost adaptation that always has optimal expansion. We experimentally analyze the performance of A* using several tie-breaking strategies on domains from the IPC and zero-cost domains. Our best strategy solves significantly more instances than the standard method in the literature and more than the previous state-of-the-art strategy. Our analysis improves the understanding of how to develop effective tie-breaking strategies and our results also improve the state-of-the-art of tie-breaking strategies for A*.
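To make the role of tie-breaking concrete, here is a minimal A* sketch in which states with equal f-values are ordered by a pluggable tie-breaker. The default prefers the lower h-value, a common strategy in the literature; it is not the cost-adaptation strategy proposed in the paper. The grid helpers `grid_neighbors` and `manhattan` are hypothetical and exist only to provide a small test domain.

```python
import heapq
from itertools import count


def astar(start, goal, neighbors, h, tie_break=lambda g, h_val: h_val):
    """A* that orders equal-f states by tie_break(g, h); returns (cost, expansions)."""
    order = count()  # final FIFO tie-breaker, so states themselves are never compared
    h0 = h(start)
    open_heap = [(h0, tie_break(0, h0), next(order), 0, start)]
    best_g = {start: 0}
    expansions = 0
    while open_heap:
        _, _, _, g, state = heapq.heappop(open_heap)
        if g > best_g.get(state, float("inf")):
            continue  # stale entry: a cheaper path to this state was found later
        if state == goal:
            return g, expansions
        expansions += 1
        for nxt, cost in neighbors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                h_val = h(nxt)
                entry = (new_g + h_val, tie_break(new_g, h_val), next(order), new_g, nxt)
                heapq.heappush(open_heap, entry)
    return None, expansions


def grid_neighbors(state, size=4):
    """Unit-cost 4-connected moves on an obstacle-free size x size grid (toy domain)."""
    x, y = state
    for dx, dy in ((1, 0), (0, 1), (-1, 0), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx < size and 0 <= ny < size:
            yield (nx, ny), 1


def manhattan(state, goal=(3, 3)):
    """Manhattan distance to the goal, admissible on the toy grid."""
    return abs(goal[0] - state[0]) + abs(goal[1] - state[1])
```

On the obstacle-free grid every state on a shortest path has the same f-value, so the number of expansions is determined entirely by the tie-breaker; passing `tie_break=lambda g, h_val: -h_val` (prefer higher h) makes the search expand noticeably more states than the default while returning the same cost.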
#2009

Meta-Level Control of Anytime Algorithms with Online Performance Prediction
Justin Svegliato, Kyle Hollins Wray, Shlomo Zilberstein

Planning and Search

Anytime algorithms enable intelligent systems to trade computation time with solution quality. To exploit this crucial ability in real-time decision-making, the system must decide when to interrupt the anytime algorithm and act on the current solution. Existing meta-level control techniques, however, address this problem by relying on significant offline work that diminishes their practical utility and accuracy. We formally introduce an online performance prediction framework that enables meta-level control to adapt to each instance of a problem without any preprocessing. Using this framework, we then present a meta-level control technique and two stopping conditions. Finally, we show that our approach outperforms existing techniques that require substantial offline work. The result is efficient nonmyopic meta-level control that reduces the overhead and increases the benefits of using anytime algorithms in intelligent systems.
#3028

Effect-Abstraction Based Relaxation for Linear Numeric Planning
Dongxu Li, Enrico Scala, Patrik Haslum, Sergiy Bogomolov

Planning and Search

This paper studies an effect-abstraction based relaxation for reasoning about linear numeric planning problems. The effect-abstraction decomposes non-constant linear numeric effects into actions with conditional effects over additive constant numeric effects. With little effort, on this compiled version, it is possible to use known subgoaling based relaxations and relative heuristics. The combination of these two steps leads to a novel relaxation based heuristic. Theoretically, the relaxation is proved to be tighter than the previous interval based relaxation and to lead to safe-pruning heuristics. Empirically, a heuristic developed on this relaxation leads to substantial improvements for a class of problems that are currently out of the reach of state-of-the-art numeric planners.
#3241

Unchaining the Power of Partial Delete Relaxation, Part II: Finding Plans with Red-Black State Space Search
Maximilian Fickert, Daniel Gnad, Joerg Hoffmann

Planning and Search

Red-black relaxation in classical planning allows interpolation between delete-relaxed and real planning. Yet the traditional use of relaxations to generate heuristics restricts relaxation usage to tractable fragments. How can we actually tap into the red-black relaxation's interpolation power? Prior work has devised red-black state space search (RBS) for intractable red-black planning, and has explored two uses: proving unsolvability and generating seed plans for plan repair. Here, we explore the generation of plans directly through RBS. We design two enhancements to this end: (A) use a known tractable fragment where possible, use RBS for the intractable parts; (B) check RBS state transitions for realizability, spawn relaxation refinements where the check fails. We show the potential merits of both techniques on IPC benchmarks.
#3388

Local Minima, Heavy Tails, and Search Effort for GBFS
Eldan Cohen, J. Christopher Beck

Planning and Search

Problem difficulty for greedy best first search (GBFS) is not entirely understood, though existing work points to deep local minima and poor correlation between the h-values and the distance to goal as factors that have a significant negative effect on the search effort. In this work, we show that there is a very strong exponential correlation between the depth of the single deepest local minimum encountered in a search and the overall search effort. Furthermore, we find that the distribution of local minima depth changes dramatically based on the constrainedness of problems, suggesting an explanation for the previously observed heavy-tailed behavior in GBFS. In combinatorial search, a similar result led to the use of randomized restarts to escape deep subtrees with no solution and corresponding significant speed-ups. We adapt this method and propose a randomized restarting GBFS variant that improves GBFS performance by escaping deep local minima, and does so even in the presence of other, randomization-based, search enhancements.
#3552

LP Heuristics over Conjunctions: Compilation, Convergence, Nogood Learning
Marcel Steinmetz, Joerg Hoffmann

Planning and Search

Two strands of research in classical planning are LP heuristics and conjunctions to improve approximations. Combinations of the two have also been explored. Here, we focus on convergence properties, forcing the LP heuristic to equal the perfect heuristic h* in the limit. We show that, under reasonable assumptions, partial variable merges are strictly dominated by the compilation Pi^C of explicit conjunctions, and that both render the state equation heuristic equal to h* for a suitable set C of conjunctions. We show that consistent potential heuristics can be computed from a variant of Pi^C, and that such heuristics can represent h* for suitable C. As an application of these convergence properties, we consider sound nogood learning in state space search, via refining the set C. We design a suitable refinement method to this end. Experiments on IPC benchmarks show significant performance improvements in several domains.
#4268

Admissible Abstractions for Near-optimal Task and Motion Planning
William Vega-Brown, Nicholas Roy

Planning and Search

We define an admissibility condition for abstractions expressed using angelic semantics and show that these conditions allow us to accelerate planning while preserving the ability to find the optimal motion plan. We then derive admissible abstractions for two motion planning domains with continuous state. We extract upper and lower bounds on the cost of concrete motion plans using local metric and topological properties of the problem domain. These bounds guide the search for a plan while maintaining performance guarantees. We show that abstraction can dramatically reduce the complexity of search relative to a direct motion planner. Using our abstractions, we find near-optimal motion plans in planning problems involving 10^13 states without using a separate task planner.

Monday 16 11:25 - 12:50 NLP-CV1 - Language and Vision (T2)

Chair: Jiajun Zhang

#66

Dilated Convolutional Network with Iterative Optimization for Continuous Sign Language Recognition
Junfu Pu, Wengang Zhou, Houqiang Li

Language and Vision

This paper presents a novel deep neural architecture with an iterative optimization strategy for real-world continuous sign language recognition. Generally, a continuous sign language recognition system consists of a visual input encoder for feature extraction and a sequence learning model to learn the correspondence between the input sequence and the output sentence-level labels. We use a 3D residual convolutional network (3D-ResNet) to extract visual features. After that, a stacked dilated convolutional network with Connectionist Temporal Classification (CTC) is applied for learning the mapping between the sequential features and the text sentence. The deep network is hard to train since the CTC loss has limited contribution to early CNN parameters. To alleviate this problem, we design an iterative optimization strategy to train our architecture. We generate pseudo-labels for video clips from the sequence learning model with CTC, and fine-tune the 3D-ResNet with the supervision of pseudo-labels for a better feature representation. We alternately optimize the feature extractor and the sequence learning model with iterative steps. Experimental results on RWTH-PHOENIX-Weather, a large real-world continuous sign language recognition benchmark, demonstrate the advantages and effectiveness of our proposed method.
#651

Multi-modal Circulant Fusion for Video-to-Language and Backward
Aming Wu, Yahong Han

Language and Vision

Multi-modal fusion has become a focus of modern artificial intelligence research, e.g., from visual content to language and back. Commonly used multi-modal fusion methods mainly include element-wise product, element-wise sum, or even simple concatenation between different types of features, which are somewhat straightforward but lack in-depth analysis. Recent studies have shown that fully exploiting interactions among elements of multi-modal features leads to a further performance gain. In this paper, we put forward a new approach of multi-modal fusion, namely Multi-modal Circulant Fusion (MCF). Particularly, after reshaping feature vectors into circulant matrices, we define two types of interaction operations between vectors and matrices. As each row of the circulant matrix shifts by one element, with the newly defined interaction operations we explore almost all possible interactions between vectors of different modalities. Moreover, as only regular operations are involved and defined a priori, MCF avoids increasing parameters or computational costs for multi-modal fusion. We evaluate MCF with tasks of video captioning and temporal activity localization via language (TALL). Experiments on MSVD and MSRVTT show our method obtains the state-of-the-art for video captioning. For TALL, by plugging in MCF, we achieve a performance gain of roughly 4.2% on TACoS.
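As a rough illustration of the circulant construction mentioned in this abstract, the sketch below builds a circulant matrix from one feature vector and multiplies it with a vector from another modality. The abstract does not specify MCF's two interaction operations, so the plain matrix-vector product used here is only an assumption, and the toy vectors and function names are hypothetical.

```python
def circulant(v):
    """Return the circulant matrix whose i-th row is v cyclically shifted right by i."""
    n = len(v)
    return [[v[(j - i) % n] for j in range(n)] for i in range(n)]

def matvec(m, x):
    """Plain matrix-vector product (an assumed stand-in for an interaction operation)."""
    return [sum(a * b for a, b in zip(row, x)) for row in m]

# Hypothetical toy features for two modalities.
visual = [1.0, 2.0, 3.0]
text = [1.0, 0.0, 0.0]

C = circulant(visual)    # rows: [1, 2, 3], [3, 1, 2], [2, 3, 1]
fused = matvec(C, text)  # every visual[i] * text[j] pair appears in some row's sum
```

Because each row is a shifted copy of the same vector, the product pairs every element of one vector with every element of the other without introducing learned parameters.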
#702

Multi-modal Sentence Summarization with Modality Attention and Image Filtering
Haoran Li, Junnan Zhu, Tianshang Liu, Jiajun Zhang, Chengqing Zong

Language and Vision

In this paper, we introduce a multi-modal sentence summarization task that produces a short summary from a sentence-image pair. This task is more challenging than sentence summarization. It must not only incorporate visual features effectively into a standard text summarization framework, but also avoid noise from the image. To this end, we propose a modality-based attention mechanism to pay different attention to image patches and text units, and we design image filters to selectively use visual information to enhance the semantics of the input sentence. We construct a multimodal sentence summarization dataset and extensive experiments on this dataset demonstrate that our models significantly outperform conventional models which only employ text as input. Further analyses suggest that the sentence summarization task can benefit from visually grounded representations from a variety of aspects.
#520

Cross-media Multi-level Alignment with Relation Attention Network
Jinwei Qi, Yuxin Peng, Yuxin Yuan

Language and Vision

With the rapid growth of multimedia data, such as image and text, it is a highly challenging problem to effectively correlate and retrieve the data of different media types. Naturally, when correlating an image with a textual description, people focus on not only the alignment between discriminative image regions and key words, but also the relations lying in the visual and textual context. Relation understanding is essential for cross-media correlation learning, which is ignored by prior cross-media retrieval works. To address the above issue, we propose Cross-media Relation Attention Network (CRAN) with multi-level alignment. First, we propose a visual-language relation attention model to explore both fine-grained patches and their relations of different media types. We aim to not only exploit cross-media fine-grained local information, but also capture the intrinsic relation information, which can provide complementary hints for correlation learning. Second, we propose cross-media multi-level alignment to explore global, local and relation alignments across different media types, which can mutually reinforce one another to learn more precise cross-media correlation. We conduct experiments on 2 cross-media datasets, and compare with 10 state-of-the-art methods to verify the effectiveness of the proposed approach.
#3030

Multi-task Layout Analysis for Historical Handwritten Documents Using Fully Convolutional Networks
Yue Xu, Fei Yin, Zhaoxiang Zhang, Cheng-Lin Liu

Language and Vision

Layout analysis is a fundamental process in document image analysis and understanding. It consists of several sub-processes such as page segmentation, text line segmentation, baseline detection and so on. In this work, we propose a multi-task layout analysis method that uses a single FCN model to solve the above three problems simultaneously. The FCN is trained to segment the document image into different regions and detect the center line of each text line by classifying pixels into different categories. By supervised learning on document images with pixel-wise labels, the FCN can extract discriminative features and perform pixel-wise classification accurately. After pixel-wise classification, post-processing steps are taken to reduce noise, correct wrong segmentations and find overlapping regions. Experimental results on the public dataset DIVA-HisDB containing challenging medieval manuscripts demonstrate the effectiveness and superiority of the proposed method.
#486

Rethinking Diversified and Discriminative Proposal Generation for Visual Grounding
Zhou Yu, Jun Yu, Chenchao Xiang, Zhou Zhao, Qi Tian, Dacheng Tao

Language and Vision

Visual grounding aims to localize an object in an image referred to by a textual query phrase. Various visual grounding approaches have been proposed, and the problem can be modularized into a general framework: proposal generation, multi-modal feature representation, and proposal ranking. Of these three modules, most existing approaches focus on the latter two, with the importance of proposal generation generally neglected. In this paper, we rethink the problem of what properties make a good proposal generator. We introduce diversity and discrimination simultaneously when generating proposals, and in doing so propose the Diversified and Discriminative Proposal Networks (DDPN) model. Based on the proposals generated by DDPN, we propose a high performance baseline model for visual grounding and evaluate it on four benchmark datasets. Experimental results demonstrate that our model delivers significant improvements on all the tested datasets (e.g., 18.8% improvement on ReferItGame and 8.2% improvement on Flickr30k Entities over the existing state of the art, respectively).
#2550

Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances
Thao Le Minh, Nobuyuki Shimizu, Takashi Miyazaki, Koichi Shinoda

Language and Vision

With the widespread use of intelligent systems, such as smart speakers, addressee recognition has become a concern in human-computer interaction, as more and more people expect such systems to understand complicated social scenes, including those outdoors, in cafeterias, and hospitals. Because previous studies typically focused only on pre-specified tasks with limited conversational situations such as controlling smart homes, we created a mock dataset called Addressee Recognition in Visual Scenes with Utterances (ARVSU) that contains a vast body of image variations in visual scenes with an annotated utterance and a corresponding addressee for each scenario. We also propose a multi-modal deep-learning-based model that takes different human cues, specifically eye gazes and transcripts of an utterance corpus, into account to predict the conversational addressee from a specific speaker's view in various real-life conversational scenarios. To the best of our knowledge, we are the first to introduce an end-to-end deep learning model that combines vision and transcripts of utterance for addressee recognition. As a result, our study suggests that future addressee recognition can reach the ability to understand human intention in many social situations previously unexplored, and our modality dataset is a first step in promoting research in this field.

Monday 16 11:25 - 12:50 CV-REC1 - Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation (T1)

Chair: William K. Cheung

#113

Densely Cascaded Shadow Detection Network via Deeply Supervised Parallel Fusion
Yupei Wang, Xin Zhao, Yin Li, Xuecai Hu, Kaiqi Huang

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Shadow detection is an important and challenging problem in computer vision. Recently, single image shadow detection has achieved major progress with the development of deep convolutional networks. However, existing methods are still vulnerable to background clutter, and often fail to capture the global context of an input image. These global contextual and semantic cues are essential for accurately localizing the shadow regions. Moreover, rich spatial details are required to segment shadow regions with precise shape. To this end, this paper presents a novel model characterized by a deeply supervised parallel fusion (DSPF) network and a densely cascaded learning scheme. The DSPF network achieves a comprehensive fusion of global semantic cues and local spatial details by multiple stacked parallel fusion branches, which are learned in a deeply supervised manner. Moreover, the densely cascaded learning scheme is employed to refine the spatial details. Our method is evaluated on two widely used shadow detection benchmarks. Experimental results show that our method outperforms state-of-the-art methods by a large margin.
#488

Hi-Fi: Hierarchical Feature Integration for Skeleton Detection
Kai Zhao, Wei Shen, Shanghua Gao, Dandan Li, Ming-Ming Cheng

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

In natural images, the scales (thickness) of object skeletons may dramatically vary among objects and object parts. Thus, robust skeleton detection requires powerful multi-scale feature integration ability. To address this issue, we present a new convolutional neural network (CNN) architecture by introducing a novel hierarchical feature integration mechanism, named Hi-Fi, to address the object skeleton detection problem. The proposed CNN-based approach intrinsically captures high-level semantics from deeper layers, as well as low-level details from shallower layers. By hierarchically integrating different CNN feature levels with bidirectional guidance, our approach (1) enables mutual refinement across features of different levels, and (2) possesses the strong ability to capture both rich object context and high-resolution details. Experimental results show that our method significantly outperforms the state-of-the-art methods in terms of effectively fusing features from very different scales, as evidenced by a considerable performance improvement on several benchmarks.
#660

R³Net: Recurrent Residual Refinement Network for Saliency Detection
Zijun Deng, Xiaowei Hu, Lei Zhu, Xuemiao Xu, Jing Qin, Guoqiang Han, Pheng-Ann Heng

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Saliency detection is a fundamental yet challenging task in computer vision, aiming at highlighting the most visually distinctive objects in an image. We propose a novel recurrent residual refinement network (R^3Net) equipped with residual refinement blocks (RRBs) to more accurately detect salient regions of an input image. Our RRBs learn the residual between the intermediate saliency prediction and the ground truth by alternately leveraging the low-level integrated features and the high-level integrated features of a fully convolutional network (FCN). While the low-level integrated features are capable of capturing more saliency details, the high-level integrated features can reduce non-salient regions in the intermediate prediction. Furthermore, the RRBs can obtain complementary saliency information of the intermediate prediction, and add the residual into the intermediate prediction to refine the saliency maps. We evaluate the proposed R^3Net on five widely-used saliency detection benchmarks by comparing it with 16 state-of-the-art saliency detectors. Experimental results show that our network outperforms our competitors on all the benchmark datasets.
#991

IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection
Qiangpeng Yang, Mengli Cheng, Wenmeng Zhou, Yan Chen, Minghui Qiu, Wei Lin

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Incidental scene text detection, especially for multi-oriented text regions, is one of the most challenging tasks in many computer vision applications. Different from the common object detection task, scene text often suffers from a large variance of aspect ratio, scale, and orientation. To solve this problem, we propose a novel end-to-end scene text detector IncepText from an instance-aware segmentation perspective. We design a novel Inception-Text module and introduce deformable PSROI pooling to deal with multi-oriented text detection. Extensive experiments on ICDAR2015, RCTW-17, and MSRA-TD500 datasets demonstrate our method's superiority in terms of both effectiveness and efficiency. Our proposed method achieves a 1st-place result on the ICDAR2015 challenge and state-of-the-art performance on the other datasets. Moreover, we have released our implementation as an OCR product which is available for public access.
#1837

Collaborative Learning for Weakly Supervised Object Detection
Jiajie Wang, Jiangchao Yao, Ya Zhang, Rui Zhang

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Weakly supervised object detection has recently received much attention, since it only requires image-level labels instead of the bounding-box labels consumed in strongly supervised learning. Nevertheless, the savings in labeling expense usually come at the cost of model accuracy. In this paper, we propose a simple but effective weakly supervised collaborative learning framework to resolve this problem, which trains a weakly supervised learner and a strongly supervised learner jointly by enforcing partial feature sharing and prediction consistency. For object detection, taking a WSDDN-like architecture as the weakly supervised detector sub-network and a Faster-RCNN-like architecture as the strongly supervised detector sub-network, we propose an end-to-end Weakly Supervised Collaborative Detection Network. As there is no strong supervision available to train the Faster-RCNN-like sub-network, a new prediction consistency loss is defined to enforce consistency of predictions between the two sub-networks as well as within the Faster-RCNN-like sub-network. At the same time, the two detectors are designed to partially share features to further guarantee the model consistency at the perceptual level. Extensive experiments on the PASCAL VOC 2007 and 2012 data sets demonstrate the effectiveness of the proposed framework.
#102

Deep Joint Semantic-Embedding Hashing
Ning Li, Chao Li, Cheng Deng, Xianglong Liu, Xinbo Gao

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Hashing has been widely deployed for large-scale image retrieval due to its low storage cost and fast query speed. Almost no deep hashing methods sufficiently discover semantic correlation from label information, which makes the learned hash codes less discriminative. In this paper, we propose a novel Deep Joint Semantic-Embedding Hashing (DSEH) approach that contains LabNet and ImgNet. Specifically, LabNet is explored to capture abundant semantic correlation between sample pairs and supervise ImgNet at the semantic level and the hash code level, which is conducive to the generated hash codes being more discriminative and similarity-preserving. Extensive experiments on three benchmark datasets show that the proposed model outperforms the state-of-the-art methods.
#154

Semantic Structure-based Unsupervised Deep Hashing
Erkun Yang, Cheng Deng, Tongliang Liu, Wei Liu, Dacheng Tao

Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation

Hashing is becoming increasingly popular for approximate nearest neighbor searching in massive databases due to its storage and search efficiency. Recent supervised hashing methods, which usually construct semantic similarity matrices to guide hash code learning using label information, have shown promising results. However, it is relatively difficult to capture and utilize the semantic relationships between points in unsupervised settings. To address this problem, we propose a novel unsupervised deep framework called Semantic Structure-based unsupervised Deep Hashing (SSDH). We first empirically study the deep feature statistics, and find that the distribution of the cosine distance for point pairs can be estimated by two half Gaussian distributions. Based on this observation, we construct the semantic structure by considering points with distances obviously smaller than the others as semantically similar and points with distances obviously larger than the others as semantically dissimilar. We then design a deep architecture and a pair-wise loss function to preserve this semantic structure in Hamming space. Extensive experiments show that SSDH significantly outperforms current state-of-the-art methods.

Monday 16 11:25 - 12:50 ML-ONL - Online Learning (K11)

Chair: Arunesh Sinha

#1969

Online Kernel Selection via Incremental Sketched Kernel Alignment
Xiao Zhang, Shizhong Liao

Online Learning

In contrast to offline kernel selection, online kernel selection must rise to the new challenges of passing the training set once, selecting optimal kernels and updating hypotheses at each round, enjoying a sublinear regret bound for online kernel learning, and requiring a constant maintenance time complexity at each round and an efficient overall time complexity integrated with online kernel learning. However, most existing online kernel selection approaches cannot meet the new challenges. To address this issue, we propose a novel online kernel selection approach via the incremental sketched kernel alignment criterion, which meets all the new challenges. We first define the incremental sketched kernel alignment (ISKA) criterion, which estimates the kernel alignment and can be computed incrementally and efficiently. When applying the proposed ISKA criterion to online kernel selection, we adopt the subclass coherence to maintain the hypothesis space, select the optimal kernel at each round using the median of the ISKA criterion estimates, and update the hypothesis following the online gradient descent method. We prove that the ISKA criterion is an unbiased estimate of the maximum mean discrepancy, enjoys the optimal logarithmic regret bound for online kernel learning, and has a constant maintenance time complexity at each round and a logarithmic overall time complexity integrated with online kernel learning. Empirical studies demonstrate that the proposed online kernel selection approach is computationally efficient while maintaining comparable accuracy for online kernel learning.
#2354

Online Deep Learning: Learning Deep Neural Networks on the Fly
Doyen Sahoo, Quang Pham, Jing Lu, Steven C. H. Hoi

Online Learning

Deep Neural Networks (DNNs) are typically trained by backpropagation in a batch setting, requiring the entire training data to be made available prior to the learning task. This is not scalable for many real-world scenarios where new data arrives sequentially in a stream. We aim to address an open challenge of "Online Deep Learning" (ODL) for learning DNNs on the fly in an online setting. Unlike traditional online learning that often optimizes some convex objective function with respect to a shallow model (e.g., a linear/kernel-based hypothesis), ODL is more challenging as the optimization objective is non-convex, and a regular DNN with standard backpropagation does not work well in practice for online settings. We present a new ODL framework that attempts to tackle the challenges by learning DNN models which dynamically adapt depth from a sequence of training data in an online learning setting. Specifically, we propose a novel Hedge Backpropagation (HBP) method for effectively updating the parameters of a DNN online, and validate its efficacy on large data sets (both stationary and concept drifting scenarios).
#1499

Minimizing Adaptive Regret with One Gradient per Iteration
Guanghui Wang, Dakuan Zhao, Lijun Zhang

Online Learning

To cope with non-stationary environments, recent advances in online optimization have introduced the notion of adaptive regret, which measures the performance of an online learner against different comparators within different time intervals. Previous studies have proposed various algorithms to yield low adaptive regret under different scenarios. However, all existing algorithms need to query the gradient of the loss function at least O(log t) times in every iteration t, which hinders their applications to broad domains, especially when the evaluation of gradients is expensive. To address this limitation, we propose a series of computationally efficient algorithms for minimizing the adaptive regret of general convex, strongly convex and exponentially concave functions respectively. The key idea is to replace each loss function with a carefully designed surrogate loss, which bounds the original loss function from below. We show that the proposed algorithms only query the gradient once per iteration, and attain the same theoretical guarantees as previous optimal algorithms. Empirical results demonstrate the efficiency and effectiveness of our methods.
#1421

Efficient Adaptive Online Learning via Frequent Directions
Yuanyu Wan, Nan Wei, Lijun Zhang

Online Learning

By employing time-varying proximal functions, adaptive subgradient methods (ADAGRAD) have improved the regret bound and been widely used in online learning and optimization. However, ADAGRAD with full matrix proximal functions (ADA-FULL) cannot deal with large-scale problems due to the impractical time and space complexities, though it has better performance when gradients are correlated. In this paper, we propose ADA-FD, an efficient variant of ADA-FULL based on a deterministic matrix sketching technique called frequent directions. Following ADA-FULL, we incorporate our ADA-FD into both primal-dual subgradient method and composite mirror descent method to develop two efficient methods. By maintaining and manipulating low-rank matrices, at each iteration, the space complexity is reduced from $O(d^2)$ to $O(\tau d)$ and the time complexity is reduced from $O(d^3)$ to $O(\tau^2d)$, where $d$ is the dimensionality of the data and $\tau \ll d$ is the sketching size. Theoretical analysis reveals that the regret of our methods is close to that of ADA-FULL as long as the outer product matrix of gradients is approximately low-rank. Experimental results show that our ADA-FD is comparable to ADA-FULL and outperforms other state-of-the-art algorithms in online convex optimization as well as in training convolutional neural networks (CNN).
#1511

Bandit Online Learning on Graphs via Adaptive Optimization
Peng Yang, Peilin Zhao, Xin Gao

Online Learning

Traditional online learning on graphs adapts the graph Laplacian into ridge regression, which may not guarantee reasonable accuracy when the data are adversarially generated. To solve this issue, we exploit an adaptive optimization framework for online classification on graphs. The derived model can achieve a min-max regret under an adversarial mechanism of data generation. To take advantage of the informative labels, we propose an adaptive large-margin update rule, which enjoys a lower regret than the algorithms using error-driven update rules. However, this algorithm assumes that the full information label is provided for each node, which is violated in many practical applications where labeling is expensive and the oracle may only tell whether the prediction is correct or not. To address this issue, we propose a bandit online algorithm on graphs. It derives a per-instance confidence region of the prediction, from which the model can be learned adaptively to minimize the online regret. Experiments on benchmark graph datasets show that the proposed bandit algorithm outperforms state-of-the-art competitors, and sometimes even beats the algorithms using full information label feedback.
#4598

Combinatorial Pure Exploration with Continuous and Separable Reward Functions and Its Applications
Weiran Huang, Jungseul Ok, Liang Li, Wei Chen

Online Learning

We study the Combinatorial Pure Exploration problem with Continuous and Separable reward functions (CPE-CS) in the stochastic multi-armed bandit setting. In a CPE-CS instance, we are given several stochastic arms with unknown distributions, as well as a collection of possible decisions. Each decision has a reward according to the distributions of arms. The goal is to identify the decision with the maximum reward, using as few arm samples as possible. The problem generalizes the combinatorial pure exploration problem with linear rewards, which has attracted significant attention in recent years. In this paper, we propose an adaptive learning algorithm for the CPE-CS problem, and analyze its sample complexity. In particular, we introduce a new hardness measure called the consistent optimality hardness, and give both the upper and lower bounds of sample complexity. Moreover, we give examples to demonstrate that our solution has the capacity to deal with non-linear reward functions.
#195

UCBoost: A Boosting Approach to Tame Complexity and Optimality for Stochastic Bandits
Fang Liu, Sinong Wang, Swapna Buccapatnam, Ness Shroff

Online Learning

In this work, we address the open problem of finding low-complexity near-optimal multi-armed bandit algorithms for sequential decision making problems. Existing bandit algorithms are either sub-optimal and computationally simple (e.g., UCB1) or optimal and computationally complex (e.g., kl-UCB). We propose a boosting approach to Upper Confidence Bound based algorithms for stochastic bandits, that we call UCBoost. Specifically, we propose two types of UCBoost algorithms. We show that UCBoost(D) enjoys O(1) complexity for each arm per round as well as regret guarantee that is 1/e-close to that of the kl-UCB algorithm. We propose an approximation-based UCBoost algorithm, UCBoost(epsilon), that enjoys a regret guarantee epsilon-close to that of kl-UCB as well as O(log(1/epsilon)) complexity for each arm per round. Hence, our algorithms provide practitioners a practical way to trade optimality with computational complexity. Finally, we present numerical results which show that UCBoost(epsilon) can achieve the same regret performance as the standard kl-UCB while incurring only 1% of the computational cost of kl-UCB.
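The complexity gap mentioned in the abstract can be illustrated with a minimal sketch, assuming Bernoulli rewards: the UCB1 index has a closed form, while the kl-UCB index has no closed form and is found here by bisection. This is not the UCBoost algorithm itself; the function names and the exploration budget `log(t) / pulls` are illustrative assumptions.

```python
import math


def ucb1_index(mean, pulls, t):
    """Closed-form UCB1 index: constant work per arm per round."""
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def bernoulli_kl(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)
    q = min(max(q, eps), 1.0 - eps)
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def kl_ucb_index(mean, pulls, t, iters=50):
    """kl-UCB index: the largest q >= mean with pulls * KL(mean, q) <= log(t).

    There is no closed form, so we bisect on q; this is the extra cost
    per arm per round compared with UCB1. The budget log(t) is an assumed,
    simplified exploration term.
    """
    budget = math.log(t) / pulls
    lo, hi = mean, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if bernoulli_kl(mean, mid) <= budget:
            lo = mid  # mid is still inside the confidence region
        else:
            hi = mid  # mid is outside the confidence region
    return lo
```

By Pinsker's inequality the kl-UCB index computed this way never exceeds the UCB1 index above, which is the sense in which the KL-based confidence bound is tighter.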

Monday 16 11:25 - 12:50 MUL-SP - Security and Privacy (C2)

Chair: Pradeep Murukannaiah

#2541

GELU-Net: A Globally Encrypted, Locally Unencrypted Deep Neural Network for Privacy-Preserved Learning
Qiao Zhang, Cong Wang, Hongyi Wu, Chunsheng Xin, Tran V. Phuong

Security and Privacy

Privacy is a fundamental challenge for a variety of smart applications that depend on data aggregation and collaborative learning across different entities. In this paper, we propose a novel privacy-preserved architecture where clients can collaboratively train a deep model while preserving the privacy of each client’s data. Our main strategy is to carefully partition a deep neural network to two non-colluding parties. One party performs linear computations on encrypted data utilizing a less complex homomorphic cryptosystem, while the other executes non-polynomial computations in plaintext but in a privacy-preserved manner. We analyze security and compare the communication and computation complexity with the existing approaches. Our extensive experiments on different datasets demonstrate not only stable training without accuracy loss, but also 14 to 35 times speedup compared to the state-of-the-art system.
#3594

Adversarial Regression for Detecting Attacks in Cyber-Physical Systems
Amin Ghafouri, Yevgeniy Vorobeychik, Xenofon Koutsoukos

Security and Privacy

Attacks in cyber-physical systems (CPS) which manipulate sensor readings can cause enormous physical damage if undetected. Detection of attacks on sensors is crucial to mitigate this issue. We study supervised regression as a means to detect anomalous sensor readings, where each sensor's measurement is predicted as a function of other sensors. We show that several common learning approaches in this context are still vulnerable to stealthy attacks, which carefully modify readings of compromised sensors to cause desired damage while remaining undetected. Next, we model the interaction between the CPS defender and attacker as a Stackelberg game in which the defender chooses detection thresholds, while the attacker deploys a stealthy attack in response. We present a heuristic algorithm for finding an approximately optimal threshold for the defender in this game, and show that it increases system resilience to attacks without significantly increasing the false alarm rate.
#85

Cascaded SR-GAN for Scale-Adaptive Low Resolution Person Re-identification
Zheng Wang, Mang Ye, Fan Yang, Xiang Bai, Shin'ichi Satoh

Security and Privacy

Person re-identification (REID) is an important task in video surveillance and forensics applications. Most previous approaches are based on a key assumption that all person images have uniform and sufficiently high resolutions. In practice, varied low resolutions and scale mismatches are always present in open-world REID. We name this problem Scale-Adaptive Low Resolution Person Re-identification (SALR-REID). The most intuitive way to address this problem is to upscale the various low resolutions (not only low, but also at different scales) to a uniform high resolution. SR-GAN is one of the most competitive image super-resolution deep networks, designed with a fixed upscaling factor. However, it is still not suitable for the SALR-REID task, which requires a network that not only synthesizes high-resolution images with different upscaling factors, but also extracts discriminative image features for judging a person’s identity. (1) To promote the ability of scale-adaptive upscaling, we cascade multiple SR-GANs in series. (2) To supplement the ability of image feature representation, we plug in a re-identification network. With a unified formulation, a Cascaded Super-Resolution GAN (CSR-GAN) framework is proposed. Extensive evaluations on two simulated datasets and one public dataset demonstrate the advantages of our method over related state-of-the-art methods.
#1774

Optimal Cruiser-Drone Traffic Enforcement Under Energy Limitation
Ariel Rosenfeld, Oleg Maksimov, Sarit Kraus

Security and Privacy

Drones can assist in mitigating traffic accidents by deterring reckless drivers, leveraging their flexible mobility. In the real world, drones are fundamentally limited by their battery/fuel capacity and have to be replenished during long operations. In this paper, we propose a novel approach where police cruisers act as mobile replenishment providers in addition to their traffic enforcement duties. We propose a binary integer linear program for determining the optimal rendezvous cruiser-drone enforcement policy which guarantees that all drones are replenished on time and minimizes the likelihood of accidents. In an extensive empirical evaluation, we first show that human drivers are expected to react to traffic enforcement drones in a similar fashion to how they react to police cruisers using a first-of-its-kind human study in realistic simulated driving. Then, we show that our proposed approach significantly outperforms the common practice of constructing stationary replenishment installations using both synthetic and real world road networks.
#4431

Generating Adversarial Examples with Adversarial Networks
Chaowei Xiao, Bo Li, Jun-yan Zhu, Warren He, Mingyan Liu, Dawn Song

Security and Privacy

Deep neural networks (DNNs) have been found to be vulnerable to adversarial examples resulting from adding small-magnitude perturbations to inputs. Such adversarial examples can mislead DNNs to produce adversary-selected results. Different attack strategies have been proposed to generate adversarial examples, but how to produce them with high perceptual quality and more efficiently requires more research efforts. In this paper, we propose AdvGAN to generate adversarial examples with generative adversarial networks (GANs), which can learn and approximate the distribution of original instances. For AdvGAN, once the generator is trained, it can generate perturbations efficiently for any instance, so as to potentially accelerate adversarial training as defenses. We apply AdvGAN in both semi-whitebox and black-box attack settings. In semi-whitebox attacks, there is no need to access the original target model after the generator is trained, in contrast to traditional white-box attacks. In black-box attacks, we dynamically train a distilled model for the black-box model and optimize the generator accordingly. Adversarial examples generated by AdvGAN on different target models have high attack success rate under state-of-the-art defenses compared to other attacks. Our attack placed first with 92.76% accuracy on a public MNIST black-box attack challenge.
#2048

Curriculum Adversarial Training
Qi-Zhi Cai, Chang Liu, Dawn Song

Security and Privacy

Recently, deep learning has been applied to many security-sensitive applications, such as facial authentication. The existence of adversarial examples hinders such applications. The state-of-the-art result on defense shows that adversarial training can be applied to train a robust model on MNIST against adversarial examples, but it fails to achieve a high empirical worst-case accuracy on more complex tasks, such as CIFAR-10 and SVHN. In our work, we propose curriculum adversarial training (CAT) to resolve this issue. The basic idea is to develop a curriculum of adversarial examples generated by attacks with a wide range of strengths. With two techniques to mitigate the catastrophic forgetting and the generalization issues, we demonstrate that CAT can improve the prior art's empirical worst-case accuracy by a large margin of 25% on CIFAR-10 and 35% on SVHN. At the same time, the model's performance on non-adversarial inputs is comparable to the state-of-the-art models.
#5107

(Sister Conferences Best Papers Track) Tamper-Proof Privacy Auditing for Artificial Intelligence Systems
Andrew Sutton, Reza Samavi

Security and Privacy

Privacy audit logs are used to capture the actions of participants in a data sharing environment in order for auditors to check compliance with privacy policies. However, collusion may occur between the auditors and participants to obfuscate actions that should be recorded in the audit logs. In this paper, we propose a Linked Data based method of utilizing blockchain technology to create tamper-proof audit logs that provide proof of log manipulation and non-repudiation.

Monday 16 11:25 - 12:50 ML-MMM1 - Multi-Instance, Multi-View, Multi-Label Learning (C3)

Chair: Xin Geng

#1913

Deep Multi-View Concept Learning
Cai Xu, Ziyu Guan, Wei Zhao, Yunfei Niu, Quan Wang, Zhiheng Wang

Multi-Instance, Multi-View, Multi-Label Learning

Multi-view data is common in real-world datasets, where different views describe distinct perspectives. To better summarize the consistent and complementary information in multi-view data, researchers have proposed various multi-view representation learning algorithms, typically based on factorization models. However, most previous methods were focused on shallow factorization models which cannot capture the complex hierarchical information. Although a deep multi-view factorization model has been proposed recently, it fails to explicitly discern consistent and complementary information in multi-view data and does not consider conceptual labels. In this work we present a semi-supervised deep multi-view factorization method, named Deep Multi-view Concept Learning (DMCL). DMCL performs nonnegative factorization of the data hierarchically, and tries to capture semantic structures and explicitly model consistent and complementary information in multi-view data at the highest abstraction level. We develop a block coordinate descent algorithm for DMCL. Experiments conducted on image and document datasets show that DMCL performs well and outperforms baseline methods.
#175

FISH-MML: Fisher-HSIC Multi-View Metric Learning
Changqing Zhang, Yeqinq Liu, Yue Liu, Qinghua Hu, Xinwang Liu, Pengfei Zhu

Multi-Instance, Multi-View, Multi-Label Learning

This work presents a simple yet effective model for multi-view metric learning, which aims to improve the classification of data with multiple views, e.g., multiple modalities or multiple types of features. The intrinsic correlation, namely that different views describe the same set of instances, makes it possible and necessary to jointly learn multiple metrics of different views. Accordingly, we propose a multi-view metric learning method based on Fisher discriminant analysis (FDA) and Hilbert-Schmidt Independence Criteria (HSIC), termed Fisher-HSIC Multi-View Metric Learning (FISH-MML). In our approach, class separability is enforced in the spirit of FDA within each single view, while the consistency among different views is enhanced based on HSIC. Thus, both intra-view class separability and inter-view correlation are addressed in a unified framework. The learned metrics can improve multi-view classification, and experimental results on real-world datasets demonstrate the effectiveness of the proposed method.
#1413

Adaptive Graph Guided Embedding for Multi-label Annotation
Lichen Wang, Zhengming Ding, Yun Fu

Multi-Instance, Multi-View, Multi-Label Learning

Multi-label annotation is challenging since a large amount of well-labeled training data is required to achieve promising performance. However, providing such data is expensive while unlabeled data are widely available. To this end, we propose a novel Adaptive Graph Guided Embedding (AG2E) approach for multi-label annotation in a semi-supervised fashion, which utilizes limited labeled data together with large-scale unlabeled data to facilitate learning performance. Specifically, a multi-label propagation scheme and an effective embedding are jointly learned to seek a latent space where unlabeled instances tend to be well assigned multiple labels. Furthermore, a locality structure regularizer is designed to preserve the intrinsic structure and enhance the multi-label annotation. We evaluate our model in both conventional multi-label learning and zero-shot learning scenarios. Experimental results demonstrate that our approach outperforms other compared state-of-the-art methods.
#2296

Label Enhancement for Label Distribution Learning
Ning Xu, An Tao, Xin Geng

Multi-Instance, Multi-View, Multi-Label Learning

Label distribution is more general than both single-label annotation and multi-label annotation. It covers a certain number of labels, representing the degree to which each label describes the instance. The learning process on the instances labeled by label distributions is called label distribution learning (LDL). Unfortunately, many training sets only contain simple logical labels rather than label distributions due to the difficulty of obtaining the label distributions directly. To solve the problem, one way is to recover the label distributions from the logical labels in the training set via leveraging the topological information of the feature space and the correlation among the labels. Such process of recovering label distributions from logical labels is defined as label enhancement (LE), which reinforces the supervision information in the training sets. This paper proposes a novel LE algorithm called Graph Laplacian Label Enhancement (GLLE). Experimental results on one artificial dataset and fourteen real-world datasets show clear advantages of GLLE over several existing LE algorithms.
#2460

Doubly Aligned Incomplete Multi-view Clustering
Menglei Hu, Songcan Chen

Multi-Instance, Multi-View, Multi-Label Learning

Nowadays, multi-view clustering has attracted more and more attention. To date, almost all the previous studies assume that views are complete. However, in reality, it is often the case that each view may contain some missing instances. Such incompleteness makes it impossible to directly use traditional multi-view clustering methods. In this paper, we propose a Doubly Aligned Incomplete Multi-view Clustering algorithm (DAIMC) based on weighted semi-nonnegative matrix factorization (semi-NMF). Specifically, on the one hand, DAIMC utilizes the given instance alignment information to learn a common latent feature matrix for all the views. On the other hand, DAIMC establishes a consensus basis matrix with the help of L2,1-Norm regularized regression for reducing the influence of missing instances. Consequently, compared with existing methods, besides inheriting the strength of semi-NMF with the ability to handle negative entries, DAIMC has two unique advantages: 1) solving the incomplete view problem by introducing a respective weight matrix for each view, making it able to easily adapt to the case with more than two views; 2) reducing the influence of view incompleteness on clustering by enforcing the basis matrices of individual views to be aligned with the help of regression. Experiments on four real-world datasets demonstrate its advantages.
#1522

Robust Auto-Weighted Multi-View Clustering
Pengzhen Ren, Yun Xiao, Pengfei Xu, Jun Guo, Xiaojiang Chen, Xin Wang, Dingyi Fang

Multi-Instance, Multi-View, Multi-Label Learning

Multi-view clustering has played a vital role in real-world applications. It aims to cluster the data points into different groups by exploring complementary information of multiple views. A major challenge of this problem is how to learn the explicit cluster structure with multiple views when there is considerable noise. To solve this challenging problem, we propose a novel Robust Auto-weighted Multi-view Clustering (RAMC), which aims to learn an optimal graph with exactly k connected components, where k is the number of clusters. The ℓ1-norm is employed for robustness of the proposed algorithm, which we validate in the experiments. The new graph learned by the proposed model approximates the original graphs of each individual view but maintains an explicit cluster structure. With this optimal graph, we can immediately achieve the clustering results without any further post-processing. We conduct extensive experiments to confirm the superiority and robustness of the proposed algorithm.
#1937

Towards Enabling Binary Decomposition for Partial Label Learning
Xuan Wu, Min-Ling Zhang

Multi-Instance, Multi-View, Multi-Label Learning

The task of partial label (PL) learning is to learn a multi-class classifier from training examples each associated with a set of candidate labels, among which only one corresponds to the ground-truth label. It is well known that for inducing a multi-class predictive model, the most straightforward solution is binary decomposition, which works by either a one-vs-rest or a one-vs-one strategy. Nonetheless, since the ground-truth label for each PL training example is concealed in its candidate label set and thus not accessible to the learning algorithm, binary decomposition cannot be directly applied under the partial label learning scenario. In this paper, a novel approach is proposed to solve the partial label learning problem by adapting the popular one-vs-one decomposition strategy. Specifically, one binary classifier is derived for each pair of class labels, where PL training examples with distinct relevancy to the label pair are used to generate the corresponding binary training set. After that, one binary classifier is further derived for each class label by stacking over predictions of existing binary classifiers to improve generalization. Experimental studies on both artificial and real-world PL data sets clearly validate the effectiveness of the proposed binary decomposition approach w.r.t. state-of-the-art partial label learning techniques.
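One possible reading of "distinct relevancy to the label pair" can be sketched as follows: for a pair of labels, keep only the examples whose candidate set contains exactly one of the two. This is an assumption made for illustration, not the rule confirmed by the abstract; the function name and the data layout are hypothetical.

```python
def pairwise_binary_set(examples, label_a, label_b):
    """Build a binary training set for the label pair (label_a, label_b).

    examples: list of (features, candidate_labels) pairs, where
    candidate_labels is a set containing the hidden ground-truth label.
    An example is kept only if exactly one of the two labels is a candidate
    (assumed interpretation of "distinct relevancy").
    """
    binary = []
    for features, candidates in examples:
        has_a = label_a in candidates
        has_b = label_b in candidates
        if has_a and not has_b:
            binary.append((features, +1))  # relevant to label_a only
        elif has_b and not has_a:
            binary.append((features, -1))  # relevant to label_b only
        # examples with both or neither label are ambiguous for this pair
    return binary


# Toy partial-label data: features are placeholders, candidates are label sets.
toy = [
    ("x1", {"cat", "dog"}),
    ("x2", {"cat"}),
    ("x3", {"dog", "bird"}),
    ("x4", {"bird"}),
]
pair_set = pairwise_binary_set(toy, "cat", "dog")
```

On the toy data, only `x2` and `x3` enter the binary set for the pair (cat, dog), since `x1` lists both labels and `x4` lists neither.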

Monday 16 11:25 - 13:05 MAS-AM1 - Auctions and Markets 1 (C8)

Chair: Vangelis Markakis

#1566

On Fair Price Discrimination in Multi-Unit Markets
Michele Flammini, Manuel Mauro, Matteo Tonelli

Auctions and Markets 1

Discriminatory pricing policies, even if at first glance they can be perceived as unfair, are widespread. In fact, pricing differences for the same item among different national markets are common, as are forms of discrimination based on the time of purchase, as in ticket sales. In this work we propose a framework for capturing the setting of "fair" discriminatory pricing and study its application to multi-unit markets, in which many copies of the same item are on sale. Our model is able to incorporate the fundamental discrimination settings proposed in the literature, by expressing individual buyers' constraints for assigning prices by means of a social relationship graph, modeling the information that each buyer can acquire about the prices assigned to the other buyers. After pointing out the positive effects of fair price discrimination, we investigate the computational complexity of maximizing the social welfare and the revenue in these markets, providing hardness and approximation results under various assumptions on the buyers' valuations and on the social graph topology.
#1619

Non-decreasing Payment Rules for Combinatorial Auctions
Vitor Bosshard, Ye Wang, Sven Seuken

Auctions and Markets 1

Combinatorial auctions are used to allocate resources in domains where bidders have complex preferences over bundles of goods. However, the behavior of bidders under different payment rules is not well understood, and there has been limited success in finding Bayes-Nash equilibria of such auctions due to the computational difficulties involved. In this paper, we introduce non-decreasing payment rules. Under such a rule, the payment of a bidder cannot decrease when he increases his bid, which is a natural and desirable property. VCG-nearest, the payment rule most commonly used in practice, violates this property and can thus be manipulated in surprising ways. In contrast, we show that many other payment rules are non-decreasing. We also show that a non-decreasing payment rule imposes a structure on the auction game that enables us to search for an approximate Bayes-Nash equilibrium much more efficiently than in the general case. Finally, we introduce the utility planes BNE algorithm, which exploits this structure and outperforms a state-of-the-art algorithm by multiple orders of magnitude.
#2001

Ex-post IR Dynamic Auctions with Cost-per-Action Payments
Weiran Shen, Zihe Wang, Song Zuo

Auctions and Markets 1

Motivated by online ad auctions, we consider a repeated auction between one seller and many buyers, where each buyer only has an estimation of her value in each period until she actually receives the item in that period. The seller is allowed to conduct a dynamic auction but must guarantee ex-post individual rationality. In this paper, we use a structure that we call credit accounts to enable a general reduction from any incentive compatible and ex-ante individually rational dynamic auction to an approximately incentive compatible and ex-post individually rational dynamic auction with credit accounts. Our reduction obtains stronger individual rationality guarantees at the cost of weaker incentive compatibility. Surprisingly, our reduction works without any common knowledge assumption. Finally, as a complement to our reduction, we prove that there is no non-trivial auction that is exactly incentive compatible and ex-post individually rational under this setting.
#2897

Double Auctions in Markets for Multiple Kinds of Goods
Erel Segal-Halevi, Avinatan Hassidim, Yonatan Aumann

Auctions and Markets 1

Motivated by applications such as stock exchanges and spectrum auctions, there is a growing interest in mechanisms for arranging trade in two-sided markets. However, existing mechanisms are either not truthful, do not guarantee an asymptotically-optimal gain-from-trade, rely on a prior on the traders' valuations, or operate in limited settings such as a single type of good. We extend the random-sampling technique used in earlier works to multi-good markets where traders have gross-substitute valuations. We show a prior-free, truthful and strongly-budget-balanced mechanism which guarantees near-optimal gain from trade when the market sizes of all goods grow to infinity at a similar rate.
#3954

Bidding in Periodic Double Auctions Using Heuristics and Dynamic Monte Carlo Tree Search
Moinul Morshed Porag Chowdhury, Christopher Kiekintveld, Son Tran, William Yeoh

Auctions and Markets 1

In a Periodic Double Auction (PDA), there are multiple discrete trading periods for a single type of good. PDAs are commonly used in real-world energy markets to trade energy in specific time slots to balance demand on the power grid. Strategically, bidding in a PDA is complicated because the bidder must predict and plan for future auctions that may influence the bidding strategy for the current auction. We present a general bidding strategy for PDAs based on forecasting clearing prices and using Monte Carlo Tree Search (MCTS) to plan a bidding strategy across multiple time periods. In addition, we present a fast heuristic strategy that can be used either as a standalone method or as an initial set of bids to seed the MCTS policy. We evaluate our bidding strategies using a PDA simulator based on the wholesale market implemented in the Power Trading Agent Competition (PowerTAC). We demonstrate that our strategies outperform state-of-the-art bidding strategies designed for that competition.
#4045

A Cloaking Mechanism to Mitigate Market Manipulation
Xintong Wang, Yevgeniy Vorobeychik, Michael P. Wellman

Auctions and Markets 1

We propose a cloaking mechanism to deter spoofing, a form of manipulation in financial markets. The mechanism works by symmetrically concealing a specified number of price levels from the inside of the order book. To study the effectiveness of cloaking, we simulate markets populated with background traders and an exploiter, who strategically spoofs to profit. The traders follow two representative bidding strategies: the non-spoofable zero intelligence and the manipulable heuristic belief learning. Through empirical game-theoretic analysis across parametrically different environments, we evaluate surplus accrued by traders, and characterize the conditions under which cloaking mitigates manipulation and benefits market welfare. We further design sophisticated spoofing strategies that probe to reveal cloaked information, and find that the effort and risk exceed the gains.
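The concealment step described above can be sketched as follows, assuming the order book is held as two price-sorted lists and that "inside" means the levels closest to the best bid and best ask. The data layout, function name, and parameter name are hypothetical; the simulated market and trader strategies from the paper are not reproduced.

```python
def cloak_order_book(bids, asks, hidden_levels):
    """Return the publicly visible book after hiding the innermost levels.

    bids: list of (price, quantity), best (highest) price first.
    asks: list of (price, quantity), best (lowest) price first.
    hidden_levels: number of price levels concealed on each side, so the
    concealment is symmetric around the spread.
    """
    visible_bids = bids[hidden_levels:]
    visible_asks = asks[hidden_levels:]
    return visible_bids, visible_asks


# Toy book with three levels per side.
book_bids = [(99, 5), (98, 10), (97, 7)]
book_asks = [(101, 4), (102, 6), (103, 9)]
seen_bids, seen_asks = cloak_order_book(book_bids, book_asks, hidden_levels=1)
```

With one hidden level, observers in this toy example see a best bid of 98 and a best ask of 102, so an order placed at the true inside of the book is not immediately visible to other traders.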
#1297

Optimal Bidding Strategy for Brand Advertising
Takanori Maehara, Atsuhiro Narita, Jun Baba, Takayuki Kawabata

Auctions and Markets 1

Brand advertising is a type of advertising that aims at increasing the awareness of companies or products. This type of advertising is well studied in economic, marketing, and psychological literature; however, there are no studies in the area of computational advertising because the effect of such advertising is difficult to observe. In this study, we consider a real-time bidding strategy for brand advertising. Here, our objective is to maximize the total number of users who remember the advertisement, averaged over time. For this objective, we first introduce a new objective function that captures the cognitive psychological properties of memory retention, and can be optimized efficiently in the online setting (i.e., it is a monotone submodular function). Then, we propose an algorithm for the bid optimization problem with the proposed objective function under the second price mechanism by reducing the problem to the online knapsack constrained monotone submodular maximization problem. We evaluated the proposed objective function and the algorithm on real-world data collected from our system and a questionnaire survey. We observed that our objective function is reasonable in a real-world setting, and the proposed algorithm outperformed the baseline online algorithms.
#1235

On the Complexity of Chore Division
Alireza Farhadi, MohammadTaghi Hajiaghayi

Auctions and Markets 1

We study the proportional chore division problem where a protocol wants to divide an undesirable object, called chore, among n different players. This problem is the dual variant of the cake cutting problem in which we want to allocate a desirable object. In this paper, we show that chore division and cake cutting problems are closely related to each other and provide a tight lower bound for proportional chore division.

Monday 16 14:00 - 14:45 Invited Talk (VICTORIA)

Chair: Shlomo Zilberstein

Interactive, Collaborative Robots: Challenges and Opportunities
Danica Kragic

Invited Talk

Monday 16 14:55 - 16:10 KR-NLP1 - KR and NLP (C7)

Chair: Olaf Hartig

#434

A Deep Modular RNN Approach for Ethos Mining
Rory Duthie, Katarzyna Budzynska

KR and NLP

Automatically recognising and extracting the reasoning expressed in natural language text is extremely demanding and only very recently has there been significant headway. While such argument mining focuses on logos (the content of what is said), evidence has demonstrated that using ethos (the character of the speaker) can sometimes be an even more powerful tool of influence. We study the UK parliamentary debates which furnish a rich source of ethos with linguistic material signalling the ethotic relationships between politicians. We then develop a novel deep modular recurrent neural network (DMRNN) approach and employ proven methods from argument mining and sentiment analysis to create an ethos mining pipeline. Annotation of ethotic statements is reliable and its extraction is robust (macro-F1 = 0.83), while annotation of polarity is perfect and its extraction is solid (macro-F1 = 0.84). By exploring correspondences between ethos in political discourse and major events in the political landscape through ethos analytics, we uncover tantalising evidence that identifying expressions of positive and negative ethotic sentiment is a powerful instrument for understanding the dynamics of governments.
#1328

Constructing Narrative Event Evolutionary Graph for Script Event Prediction
Zhongyang Li, Xiao Ding, Ting Liu

KR and NLP

Script event prediction requires a model to predict the subsequent event given an existing event context. Previous models based on event pairs or event chains cannot make full use of dense event connections, which may limit their capability of event prediction. To remedy this, we propose constructing an event graph to better utilize the event network information for script event prediction. In particular, we first extract narrative event chains from a large news corpus, and then construct a narrative event evolutionary graph (NEEG) based on the extracted chains. NEEG can be seen as a knowledge base that describes event evolutionary principles and patterns. To solve the inference problem on NEEG, we present a scaled graph neural network (SGNN) to model event interactions and learn better event representations. Instead of computing the representations on the whole graph, SGNN processes only the concerned nodes each time, which makes our model feasible for large-scale graphs. By comparing the similarity between input context event representations and candidate event representations, we can choose the most reasonable subsequent event. Experimental results on the widely used New York Times corpus demonstrate that our model significantly outperforms state-of-the-art baseline methods under the standard multiple choice narrative cloze evaluation.
#1920

Learning Conceptual Space Representations of Interrelated Concepts
Zied Bouraoui, Steven Schockaert

KR and NLP

Several recently proposed methods aim to learn conceptual space representations from large text collections. These learned representations associate each object from a given domain of interest with a point in a high-dimensional Euclidean space, but they do not model the concepts from this domain, and can thus not directly be used for categorization and related cognitive tasks. A natural solution is to represent concepts as Gaussians, learned from the representations of their instances, but this can only be reliably done if sufficiently many instances are given, which is often not the case. In this paper, we introduce a Bayesian model which addresses this problem by constructing informative priors from background knowledge about how the concepts of interest are interrelated with each other. We show that this leads to substantially better predictions in a knowledge base completion task.
#2867

Mitigating the Effect of Out-of-Vocabulary Entity Pairs in Matrix Factorization for KB Inference
Prachi Jain, Shikhar Murty, Mausam, Soumen Chakrabarti

KR and NLP

This paper analyzes the varied performance of Matrix Factorization (MF) on the related tasks of relation extraction and knowledge-base completion, which have been unified recently into a single framework of knowledge-base inference (KBI) [Toutanova et al., 2015]. We first propose a new evaluation protocol that makes comparisons between MF and Tensor Factorization (TF) models fair. We find that this results in a steep drop in MF performance. Our analysis attributes this to the high out-of-vocabulary (OOV) rate of entity pairs in test folds of commonly-used datasets. To alleviate this issue, we propose three extensions to MF. Our best model is a TF-augmented MF model. This hybrid model is robust and obtains strong results across various KBI datasets.
#3114

Functional Partitioning of Ontologies for Natural Language Query Completion in Question Answering Systems
Jaydeep Sen, Ashish Mittal, Diptikalyan Saha, Karthik Sankaranarayanan

KR and NLP

Query completion systems are well studied in the context of information retrieval systems that handle keyword queries. However, Natural Language Interface to Databases (NLIDB) systems that focus on syntactically correct and semantically complete queries to obtain high precision answers require a fundamentally different approach to the query completion problem as opposed to IR systems. To the best of our knowledge, we are the first to focus on the problem of query completion for NLIDB systems. In particular, we introduce a novel concept of functional partitioning of an ontology and then design algorithms to intelligently use the components obtained from functional partitioning to extend a state-of-the-art NLIDB system to produce accurate and semantically meaningful query completions in the absence of query logs. We test the proposed query completion framework on multiple benchmark datasets and demonstrate the efficacy of our technique empirically.
#4479

Joint Posterior Revision of NLP Annotations via Ontological Knowledge
Marco Rospocher, Francesco Corcoglioniti

KR and NLP

Different well-established NLP tasks contribute to eliciting the semantics of entities mentioned in natural language text, such as Named Entity Recognition and Classification (NERC) and Entity Linking (EL). However, combining the outcomes of these tasks may result in NLP annotations (such as a NERC organization linked by EL to a person) that are unlikely or contradictory when interpreted in the light of common world knowledge about the entities these annotations refer to. We thus propose a general probabilistic model that explicitly captures the relations between multiple NLP annotations for an entity mention, the ontological entity classes implied by those annotations, and the background ontological knowledge those classes may be consistent with. We use the model to estimate the posterior probability of NLP annotations given their confidences (prior probabilities) and the ontological knowledge, and consequently revise the best annotation choice performed by the NLP tools. In a concrete scenario with two state-of-the-art tools for NERC and EL, we experimentally show on three reference datasets that for these tasks, the joint annotation revision performed by the model consistently improves on the original results of the tools.

Monday 16 14:55 - 16:10 MAS-SGN - Social Choice, Game Theory, and Networks (C8)

Chair: Leila Amgoud

#2665

Opinion Diffusion and Campaigning on Society Graphs
Piotr Faliszewski, Rica Gonen, Martin Koutecký, Nimrod Talmon

Social Choice, Game Theory, and Networks

We study the effects of campaigning, where the society is partitioned into voter clusters and a diffusion process propagates opinions in a network connecting those clusters. Our model is very general and can incorporate many campaigning actions, various partitions of the society into voter clusters, and very general diffusion processes. Perhaps surprisingly, we show that computing the cheapest campaign for rigging a given election can usually be done efficiently, even with arbitrarily-many voters.
#652

Biharmonic Distance Related Centrality for Edges in Weighted Networks
Yuhao Yi, Liren Shan, Huan Li, Zhongzhi Zhang

Social Choice, Game Theory, and Networks

The Kirchhoff index, defined as the sum of effective resistances over all pairs of nodes, is of primary significance in diverse contexts of complex networks. In this paper, we propose to use the rate at which the Kirchhoff index changes with respect to the change of resistance of an edge as a measure of importance for this edge in weighted networks. For an arbitrary edge, we explicitly determine the change of the Kirchhoff index and express it in terms of the biharmonic distance between its end nodes, and thus call this centrality biharmonic distance related centrality (BDRC). We show that BDRC has better discriminating power than commonly used metrics, such as edge betweenness and spanning edge centrality. We give an efficient algorithm that, with high probability, provides an approximation of biharmonic distance for all edges in time nearly linear in the number of edges. Experimental results validate the efficiency and accuracy of the presented algorithm.
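Illustrative sketch (not the paper's nearly-linear-time algorithm): on a small connected graph, the quantities named in the abstract can be computed exactly from the Laplacian pseudoinverse. The code assumes edge weights are conductances, uses the standard identities R(u,v) = L+[u][u] + L+[v][v] - 2 L+[u][v] for effective resistance and the same expression on (L+)^2 for the squared biharmonic distance, and relies on dense exact arithmetic that only suits tiny graphs. All function names are hypothetical.

```python
from fractions import Fraction

def laplacian(n, weighted_edges):
    """Graph Laplacian for undirected edges given as (u, v, conductance)."""
    L = [[Fraction(0)] * n for _ in range(n)]
    for u, v, w in weighted_edges:
        w = Fraction(w)
        L[u][u] += w
        L[v][v] += w
        L[u][v] -= w
        L[v][u] -= w
    return L

def inverse(A):
    """Exact Gauss-Jordan inverse of a nonsingular square matrix of Fractions."""
    n = len(A)
    M = [row[:] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

def pseudoinverse(L):
    """Laplacian pseudoinverse of a connected graph: (L + J/n)^-1 - J/n."""
    n = len(L)
    shift = Fraction(1, n)
    inv = inverse([[L[i][j] + shift for j in range(n)] for i in range(n)])
    return [[inv[i][j] - shift for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def pair_distance(M, u, v):
    """M[u][u] + M[v][v] - 2 M[u][v]: resistance on L+, squared biharmonic on (L+)^2."""
    return M[u][u] + M[v][v] - 2 * M[u][v]

def kirchhoff_index(Lp):
    """Sum of effective resistances over all unordered node pairs."""
    n = len(Lp)
    return sum(pair_distance(Lp, u, v)
               for u in range(n) for v in range(u + 1, n))
```

For the unit-weight path 0-1-2, `kirchhoff_index` gives 4 and the squared biharmonic distance of edge (0, 1) is 2/3. Giving that edge conductance w makes the index 2/w + 2, whose derivative at w = 1 is -2, which equals -n times 2/3 for n = 3. This is consistent with the relation the abstract describes, although the paper's own normalization and sign convention may differ.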
#1073

Reasoning about Consensus when Opinions Diffuse through Majority Dynamics
Vincenzo Auletta, Diodato Ferraioli, Gianluigi Greco

Social Choice, Game Theory, and Networks

Opinion diffusion is studied on social graphs where agents hold binary opinions and where social pressure leads them to conform to the opinion manifested by their neighbors. Within this setting, questions related to whether a minority/majority can spread the opinion it supports to all the other agents are considered. It is shown that, regardless of the graph at hand, there always exists a group formed by half of the agents that can annihilate the opposite opinion. Instead, the influence power of minorities depends on certain features of the underlying graphs, which are NP-hard to identify. Deciding whether the two opinions can coexist in some stable configuration is NP-hard, too.
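Illustrative sketch (not from the paper): one common reading of majority dynamics is a synchronous update in which every agent adopts the opinion held by a strict majority of its neighbors and keeps its own opinion on a tie. The update schedule, the tie rule and the function names below are assumptions; the paper's exact dynamics may differ.

```python
def majority_step(neighbors, opinions):
    """One synchronous round: adopt the strict neighbor majority, keep opinion on ties.

    neighbors: dict agent -> list of adjacent agents
    opinions:  dict agent -> 0 or 1
    """
    updated = {}
    for agent, current in opinions.items():
        ones = sum(opinions[n] for n in neighbors[agent])
        zeros = len(neighbors[agent]) - ones
        if ones > zeros:
            updated[agent] = 1
        elif zeros > ones:
            updated[agent] = 0
        else:
            updated[agent] = current
    return updated

def run_until_stable(neighbors, opinions, max_rounds=100):
    """Iterate rounds until nothing changes, or stop after max_rounds.

    Synchronous updates can oscillate, hence the round limit.
    """
    for _ in range(max_rounds):
        nxt = majority_step(neighbors, opinions)
        if nxt == opinions:
            return opinions
        opinions = nxt
    return opinions
```

On a triangle with opinions 1, 1, 0 the minority agent is converted and consensus on 1 is reached, while on the path 0-1-2-3 with opinions 1, 1, 0, 0 the configuration is already stable, so the two opinions coexist.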
#3060

Path Evaluation and Centralities in Weighted Graphs - An Axiomatic Approach
Jadwiga Sosnowska, Oskar Skibski

Social Choice, Game Theory, and Networks

We study the problem of extending the classic centrality measures to weighted graphs. Unfortunately, in the existing extensions, paths in the graph are evaluated solely based on their weights, which is a restrictive and undesirable assumption for a variety of settings. Given this, we define a notion of the path evaluation function that assesses a path between two nodes by looking not only at the sum of edge weights, but also at the number of intermediaries. Using an axiomatic approach, we propose three classes of path evaluation functions. Building upon this analysis, we present the first systematic study of how classic centrality measures can be extended to weighted graphs while taking into account an arbitrary path evaluation function. As an application, we use the newly-defined measures to identify the most well-linked districts in a sample public transport network.
#3063

Axiomatization of the PageRank Centrality
Tomasz Wąs, Oskar Skibski

Social Choice, Game Theory, and Networks

We propose an axiomatization of PageRank. Specifically, we introduce five simple axioms (Foreseeability, Outgoing Homogeneity, Monotonicity, Merging, and Dummy Node) and show that PageRank is the only centrality measure that satisfies all of them. Our axioms give new conceptual and theoretical underpinnings of PageRank and show how it differs from other centralities.
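Illustrative sketch (not from the paper): for readers who want the measure itself in view, the textbook PageRank can be computed by power iteration as below. The damping factor 0.85, the uniform teleport vector, the dangling-node handling and the function name are assumptions of this sketch; the variant axiomatized in the paper may be parameterized differently.

```python
def pagerank(out_links, damping=0.85, iterations=100):
    """Power-iteration PageRank.

    out_links: dict node -> list of successor nodes; every successor must
    also appear as a key. Nodes without successors spread their rank evenly.
    """
    nodes = sorted(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            targets = out_links[v] if out_links[v] else nodes
            share = damping * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank
```

On a directed 3-cycle every node receives rank 1/3. In the graph a -> c, b -> c, c -> a, node c ranks highest and b, which has no incoming links, ranks lowest.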
#3252

Combining Opinion Pooling and Evidential Updating for Multi-Agent Consensus
Chanelle Lee, Jonathan Lawry, Alan Winfield

Social Choice, Game Theory, and Networks

The evidence available to a multi-agent system can take at least two distinct forms. There can be direct evidence from the environment resulting, for example, from sensor measurements or from running tests or experiments. In addition, agents also gain evidence from other individuals in the population with whom they are interacting. We, therefore, envisage an agent's beliefs as a probability distribution over a set of hypotheses of interest, which are updated either on the basis of direct evidence using Bayesian updating, or by taking account of the probabilities of other agents using opinion pooling. This paper investigates the relationship between these two processes in a multi-agent setting. We consider a possible Bayesian interpretation of probability pooling and then explore properties for pooling operators governing the extent to which direct evidence is diluted, preserved or amplified by the pooling process. We then use simulation experiments to show that pooling operators can provide a mechanism by which a limited amount of direct evidence can be efficiently propagated through a population of agents so that an appropriate consensus is reached. In particular, we explore the convergence properties of a parameterised family of operators with a range of evidence propagation strengths.
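Illustrative sketch (not from the paper): the two processes the abstract contrasts can be written in a few lines. Bayesian updating multiplies a belief by a likelihood and renormalizes; pooling combines the beliefs of several agents. Linear and product pooling are used here only as familiar stand-ins for the parameterised family of operators the paper studies, and all names are hypothetical.

```python
def normalize(values):
    total = sum(values)
    return [v / total for v in values]

def bayes_update(belief, likelihood):
    """Posterior over hypotheses: prior times likelihood, renormalized."""
    return normalize([b * l for b, l in zip(belief, likelihood)])

def linear_pool(beliefs):
    """Equal-weight arithmetic average of the agents' distributions."""
    return [sum(column) / len(beliefs) for column in zip(*beliefs)]

def product_pool(beliefs):
    """Normalized elementwise product of the agents' distributions."""
    pooled = [1.0] * len(beliefs[0])
    for belief in beliefs:
        pooled = [p * b for p, b in zip(pooled, belief)]
    return normalize(pooled)
```

With one informed agent at (0.8, 0.2) and one uninformed agent at (0.5, 0.5), linear pooling dilutes the evidence to (0.65, 0.35) while product pooling preserves it at (0.8, 0.2). Product pooling of two informed agents amplifies it beyond 0.8.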

Monday 16 14:55 - 16:10 PS-ML - Planning and Learning (K2)

Chair: Florent Teichteil-Koenigsbuch

#2142

On Q-learning Convergence for Non-Markov Decision Processes
Sultan Javed Majeed, Marcus Hutter

Planning and Learning

Temporal-difference (TD) learning is an attractive, computationally efficient framework for model-free reinforcement learning. Q-learning is one of the most widely used TD learning techniques that enables an agent to learn the optimal action-value function, i.e. Q-value function. Contrary to its widespread use, Q-learning has only been proven to converge on Markov Decision Processes (MDPs) and Q-uniform abstractions of finite-state MDPs. On the other hand, most real-world problems are inherently non-Markovian: the full true state of the environment is not revealed by recent observations. In this paper, we investigate the behavior of Q-learning when applied to non-MDP and non-ergodic domains which may have infinitely many underlying states. We prove that the convergence guarantee of Q-learning can be extended to a class of such non-MDP problems, in particular, to some non-stationary domains. We show that state-uniformity of the optimal Q-value function is a necessary and sufficient condition for Q-learning to converge even in the case of infinitely many internal states.
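Illustrative sketch (not from the paper): the tabular Q-learning update whose convergence is at issue is Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)). The toy problem below is a hypothetical, fully Markovian two-state chain, chosen only so that the fixed point can be checked by hand. It does not exercise the non-MDP setting the paper analyses, and a constant step size is used because the toy problem is deterministic.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, next_actions,
             alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update in place."""
    # A terminal next state has no actions, so its value defaults to 0.
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    Q[(state, action)] += alpha * (target - Q[(state, action)])

# Hypothetical deterministic transitions:
# (state, action, reward, next_state, actions available in next_state)
TRANSITIONS = [
    (0, "left", 0.5, "end", ()),
    (0, "right", 0.0, 1, ("right",)),
    (1, "right", 1.0, "end", ()),
]

def learn(sweeps=200):
    """Repeatedly sweep over every transition and return the Q-table."""
    Q = defaultdict(float)
    for _ in range(sweeps):
        for s, a, r, s2, actions in TRANSITIONS:
            q_update(Q, s, a, r, s2, actions)
    return Q
```

After enough sweeps the table approaches Q(1, right) = 1, Q(0, right) = 0.9 and Q(0, left) = 0.5, so the greedy choice in state 0 is "right".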
#2441

Organizing Experience: a Deeper Look at Replay Mechanisms for Sample-Based Planning in Continuous State Domains
Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White

Planning and Learning

Model-based strategies for control are critical to obtain sample efficient learning. Dyna is a planning paradigm that naturally interleaves learning and planning, by simulating one-step experience to update the action-value function. This elegant planning strategy has been mostly explored in the tabular setting. The aim of this paper is to revisit sample-based planning, in stochastic and continuous domains with learned models. We first highlight the flexibility afforded by a model over Experience Replay (ER). Replay-based methods can be seen as stochastic planning methods that repeatedly sample from a buffer of recent agent-environment interactions and perform updates to improve data efficiency. We show that a model, as opposed to a replay buffer, is particularly useful for specifying which states to sample from during planning, such as predecessor states that propagate information in reverse from a state more quickly. We introduce a semi-parametric model learning approach, called Reweighted Experience Models (REMs), that makes it simple to sample next states or predecessors. We demonstrate that REM-Dyna exhibits similar advantages over replay-based methods in learning in continuous state problems, and that the performance gap grows when moving to stochastic domains, of increasing size.
#519

Open Loop Execution of Tree-Search Algorithms
Erwan Lecarpentier, Guillaume Infantes, Charles Lesire, Emmanuel Rachelson

Planning and Learning

In the context of tree-search stochastic planning algorithms where a generative model is available, we consider on-line planning algorithms building trees in order to recommend an action. We investigate the question of avoiding re-planning in subsequent decision steps by directly using sub-trees as action recommender. Firstly, we propose a method for open loop control via a new algorithm taking the decision of re-planning or not at each time step based on an analysis of the statistics of the sub-tree. Secondly, we show that the probability of selecting a suboptimal action at any depth of the tree can be upper bounded and converges towards zero. Moreover, this upper bound decays in a logarithmic way between subsequent depths. This leads to a distinction between node-wise optimality and state-wise optimality. Finally, we empirically demonstrate that our method achieves a compromise between loss of performance and computational gain.
#1354

Extracting Action Sequences from Texts Based on Deep Reinforcement Learning
Wenfeng Feng, Hankz Hankui Zhuo, Subbarao Kambhampati

Planning and Learning

Extracting action sequences from texts is challenging, as it requires commonsense inferences based on world knowledge. Although there has been work on extracting action scripts, instructions, navigation actions, etc., they require either the set of candidate actions be provided in advance, or action descriptions are restricted to a specific form, e.g., description templates. In this paper we aim to extract action sequences from texts in free natural language, i.e., without any restricted templates, provided the set of actions is unknown. We propose to extract action sequences from texts based on the deep reinforcement learning framework. Specifically, we view "selecting" or "eliminating" words from texts as "actions", and texts associated with actions as "states". We build Q-networks to learn policies of extracting actions and extract plans from the labeled texts. We demonstrate the effectiveness of our approach on several datasets with comparison to state-of-the-art approaches.
#2152

Bayesian Active Edge Evaluation on Expensive Graphs
Sanjiban Choudhury, Siddhartha Srinivasa, Sebastian Scherer

Planning and Learning

We consider the problem of real-time motion planning that requires evaluating a minimal number of edges on a graph to quickly discover collision-free paths. Evaluating edges is expensive, both for robots with complex geometries like robot arms, and for robots sensing the world online like UAVs. Until now, this challenge has been addressed via laziness, i.e. deferring edge evaluation until absolutely necessary, with the hope that edges turn out to be valid. However, all edges are not alike in value - some have a lot of potentially good paths flowing through them, and some others encode the likelihood of neighbouring edges being valid. This leads to our key insight - instead of passive laziness, we can actively choose edges that reduce the uncertainty about the validity of paths. We show that this is equivalent to the Bayesian active learning paradigm of decision region determination (DRD). However, the DRD problem is not only combinatorially hard but also requires explicit enumeration of all possible worlds. We propose a novel framework that combines two DRD algorithms, DIRECT and BISECT, to overcome both issues. We show that our approach outperforms several state-of-the-art algorithms on a spectrum of planning problems for mobile robots, manipulators and autonomous helicopters.
#4104

Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models
Buser Say, Scott Sanner

Planning and Learning

In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces. In order to directly exploit this transition structure for planning, we present two novel compilations of the learned factored planning problem with BNNs based on reductions to Boolean Satisfiability (FD-SAT-Plan) as well as Binary Linear Programming (FD-BLP-Plan). Experimentally, we show the effectiveness of learning complex transition models with BNNs, and test the runtime efficiency of both encodings on the learned factored planning problem. After this initial investigation, we present an incremental constraint generation algorithm based on generalized landmark constraints to improve the planning accuracy of our encodings. Finally, we show how to extend the best performing encoding (FD-BLP-Plan+) beyond goals to handle factored planning problems with rewards.

Monday 16 14:55 - 16:10 NLP-EMB1 - Word Embeddings (T2)

Chair: Roberto Navigli

#952

Complementary Learning of Word Embeddings
Yan Song, Shuming Shi

Word Embeddings

Continuous bag-of-words (CB) and skip-gram (SG) models are popular approaches to training word embeddings. Conventionally they are two standalone techniques used individually. However, with the same goal of building embeddings by leveraging surrounding words, they are in fact a pair of complementary tasks where the output of one model can be used as input of the other, and vice versa. In this paper, we propose complementary learning of word embeddings based on the CB and SG model. Specifically, one round of learning first integrates the predicted output of an SG model with existing context, then forms an enlarged context as input to the CB model. Final models are obtained through several rounds of parameter updating. Experimental results indicate that our approach can effectively improve the quality of initial embeddings, in terms of intrinsic and extrinsic evaluations.
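Illustrative sketch (not from the paper): the complementarity of the two models is visible in how each builds training examples from the same window. CB predicts the center word from its context, while SG predicts each context word from the center word. The helper names and the symmetric window are assumptions, and the paper's step of enlarging the CB context with SG predictions is not reproduced here.

```python
def context_windows(tokens, window=2):
    """Yield (center word, list of surrounding words) for each position."""
    for i, target in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        yield target, left + right

def cbow_examples(tokens, window=2):
    """CB direction: (context words) -> center word."""
    return [(tuple(ctx), target)
            for target, ctx in context_windows(tokens, window)]

def skipgram_examples(tokens, window=2):
    """SG direction: center word -> one context word per example."""
    return [(target, c)
            for target, ctx in context_windows(tokens, window)
            for c in ctx]
```

Here `tokens` is expected to be a list of strings, for example a tokenized sentence.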
#1736

Approximating Word Ranking and Negative Sampling for Word Embedding
Guibing Guo, Shichang Ouyang, Fajie Yuan, Xingwei Wang

Word Embeddings

CBOW (Continuous Bag-Of-Words) is one of the most commonly used techniques to generate word embeddings in various NLP tasks. However, it fails to reach the optimal performance due to uniform involvement of positive words and a simple sampling distribution of negative words. To resolve these issues, we propose OptRank to optimize word ranking and approximate negative sampling to improve word embeddings. Specifically, we first formalize word embedding as a ranking problem. Then, we weigh the positive words by their ranks such that highly ranked words have more importance, and adopt a dynamic sampling strategy to select informative negative words. In addition, an approximation method is designed to efficiently compute word ranks. Empirical experiments show that OptRank consistently outperforms its counterparts on a benchmark dataset with different sampling scales, especially when the sampled subset is small. The code and datasets can be obtained from https://github.com/ouououououou/OptRank.
#2265

Joint Learning Embeddings for Chinese Words and their Components via Ladder Structured Networks
Yan Song, Shuming Shi, Jing Li

Word Embeddings

The components, such as characters and radicals, of a Chinese word are important sources to help in capturing semantic information of the word. In this paper, we propose a novel framework, namely, ladder structured networks (LSN), which contains three layers representing word, character and radical and learns their embeddings synchronously. LSN captures not only the relations among words, but also the relations among their component characters and radicals, as well as the relations across layers. Each layer in LSN is pluggable so that any particular type of unit (word, character, radical) can be removed and the LSN is thus adjusted for particular types of inputs. In evaluating our framework, we use word similarity as the intrinsic evaluation and part-of-speech tagging and document classification as extrinsic evaluations. Experimental results confirm the validity of our approach and show superiority of our approach over previous work.
#4294

Lifelong Domain Word Embedding via Meta-Learning
Hu Xu, Bing Liu, Lei Shu, Philip S. Yu

Word Embeddings

Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting for domain embedding. That is, when performing the new domain embedding, the system has seen many past domains, and it tries to expand the new in-domain corpus by exploiting the corpora from the past domains via meta-learning. The proposed meta-learner characterizes the similarities of the contexts of the same word in many domain corpora, which helps retrieve relevant data from the past domains to expand the new domain corpus. Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks.
#4362

Biased Random Walk based Social Regularization for Word Embeddings
Ziqian Zeng, Xin Liu, Yangqiu Song

Word Embeddings

Nowadays, people publish a lot of natural language texts on social media. Socialized word embeddings (SWE) have been proposed to deal with two phenomena of language use: everyone has his/her own personal characteristics of language use and socially connected users are likely to use language in similar ways. We observe that the spread of language use is transitive. Namely, one user can affect his/her friends and the friends can also affect their friends. However, SWE modeled the transitivity implicitly. The social regularization in SWE only applies to one-hop neighbors and thus users outside the one-hop social circle will not be affected directly. In this work, we adopt random walk methods to generate paths on the social graph to model the transitivity explicitly. Each user on a path will be affected by his/her adjacent user(s) on the path. Moreover, according to the update mechanism of SWE, the fewer friends a user has, the fewer update opportunities he/she gets. Hence, we propose a biased random walk method to provide these users with more update opportunities. Experiments show that our random walk based social regularizations perform better on sentiment classification.
#1154

Think Globally, Embed Locally --- Locally Linear Meta-embedding of Words
Danushka Bollegala, Kohei Hayashi, Ken-ichi Kawarabayashi

Word Embeddings

Distributed word embeddings have shown superior performances in numerous Natural Language Processing (NLP) tasks. However, their performances vary significantly across different tasks, implying that the word embeddings learnt by those methods capture complementary aspects of lexical semantics. Therefore, we believe that it is important to combine the existing word embeddings to produce more accurate and complete meta-embeddings of words. For this purpose, we propose an unsupervised locally linear meta-embedding learning method that takes pre-trained word embeddings as the input, and produces more accurate meta embeddings. Unlike previously proposed meta-embedding learning methods that learn a global projection over all words in a vocabulary, our proposed method is sensitive to the differences in local neighbourhoods of the individual source word embeddings. Moreover, we show that vector concatenation, a previously proposed highly competitive baseline approach for integrating word embeddings, can be derived as a special case of the proposed method. Experimental results on semantic similarity, word analogy, relation classification, and short-text classification tasks show that our meta-embeddings significantly outperform prior methods in several benchmark datasets, establishing a new state of the art for meta-embeddings.

Monday 16 14:55 - 16:10 CV-IR - Image Retrieval (T1)

Chair: Xiaochun Cao

#1453

Progressive Generative Hashing for Image Retrieval
Yuqing Ma, Yue He, Fan Ding, Sheng Hu, Jun Li, Xianglong Liu

Image Retrieval

Recent years have witnessed the success of the emerging hashing techniques in large-scale image retrieval. Owing to its great learning capacity, deep hashing has become one of the most promising solutions, and achieved attractive performance in practice. However, without semantic label information, unsupervised deep hashing still remains an open question. In this paper, we propose a novel progressive generative hashing (PGH) framework to help learn a discriminative hashing network in an unsupervised way. Very different from existing studies, it first treats the hash codes as a kind of semantic condition for the similar image generation, and simultaneously feeds the original image and its codes into the generative adversarial networks (GANs). The real images together with the synthetic ones can further help train a discriminative hashing network based on a triplet loss. By iteratively inputting the learnt codes into the hash conditioned GANs, we can progressively enable the hashing network to discover the semantic relations. Extensive experiments on the widely-used image datasets demonstrate that PGH significantly outperforms state-of-the-art unsupervised hashing methods.
#629

Redundancy-resistant Generative Hashing for Image Retrieval
Changying Du, Xingyu Xie, Changde Du, Hao Wang

Image Retrieval

By optimizing probability distributions over discrete latent codes, Stochastic Generative Hashing (SGH) bypasses the critical and intractable binary constraints on hash codes. While encouraging results were reported, SGH still suffers from the deficient usage of latent codes, i.e., there often exist many uninformative latent dimensions in the code space, a disadvantage inherited from its auto-encoding variational framework. Motivated by the fact that code redundancy is usually more severe when a more complex decoder network is used, in this paper, we propose a constrained deep generative architecture to simplify the decoder for data reconstruction. Specifically, our new framework forces the latent hashing codes to not only reconstruct data through the generative network but also retain minimal squared L2 difference to the last real-valued network hidden layer. Furthermore, during posterior inference, we propose to regularize the standard auto-encoding objective with an additional term that explicitly accounts for the negative redundancy degree of latent code dimensions. We interpret such modifications as Bayesian posterior regularization and design an adversarial strategy to optimize the generative, the variational, and the redundancy-resisting parameters. Empirical results show that our new method can significantly boost the quality of learned codes and achieve state-of-the-art performance for image retrieval.
#514

Dual Adversarial Networks for Zero-shot Cross-media Retrieval
Jingze Chi, Yuxin Peng

Image Retrieval

Existing cross-media retrieval methods usually require that testing categories remain the same as training categories, which cannot support the retrieval of a growing number of new categories. Inspired by zero-shot learning, this paper proposes zero-shot cross-media retrieval for addressing the above problem, which aims to retrieve data of new categories across different media types. It is challenging that zero-shot cross-media retrieval has to handle not only the inconsistent semantics across new and known categories, but also the heterogeneous distributions across different media types. To address the above challenges, this paper proposes Dual Adversarial Networks for Zero-shot Cross-media Retrieval (DANZCR), which is the first approach to address zero-shot cross-media retrieval to the best of our knowledge. Our DANZCR approach consists of two GANs in a dual structure for common representation generation and original representation reconstruction respectively, which capture the underlying data structures as well as strengthen relations between input data and semantic space to generalize across seen and unseen categories. Our DANZCR approach exploits word embeddings to learn common representations in semantic space via an adversarial learning method, which preserves the inherent cross-media correlation and enhances the knowledge transfer to new categories. Experiments on three widely-used cross-media retrieval datasets show the effectiveness of our approach.
#1340

Tag-based Weakly-supervised Hashing for Image Retrieval
Ziyu Guan, Fei Xie, Wanqing Zhao, Xiaopeng Wang, Long Chen, Wei Zhao, Jinye Peng

Image Retrieval

We are concerned with using user-tagged images to learn proper hashing functions for image retrieval. The benefits are two-fold: (1) we could obtain abundant training data for deep hashing models; (2) tagging data possesses richer semantic information which could help better characterize similarity relationships between images. However, tagging data suffers from noise, vagueness and incompleteness. Different from previous unsupervised or supervised hashing learning, we propose a novel weakly-supervised deep hashing framework which consists of two stages: weakly-supervised pre-training and supervised fine-tuning. The second stage is as usual. In the first stage, rather than performing supervision on tags, the framework introduces a semantic embedding vector (sem-vector) for each image and performs learning of hashing and sem-vectors jointly. By carefully designing the optimization problem, it can well leverage tagging information and image content for hashing learning. The framework is general and does not depend on specific deep hashing methods. Empirical results on real world datasets show that when it is integrated with state-of-the-art deep hashing methods, the performance increases by 8-10%.
#2268

Learning Deep Unsupervised Binary Codes for Image Retrieval
Junjie Chen, William K. Cheung, Anran Wang

Image Retrieval

Hashing is an efficient approximate nearest neighbor search method and has been widely adopted for large-scale multimedia retrieval. While supervised learning is more popular for data-dependent hashing, deep unsupervised hashing methods have recently been developed to learn non-linear transformations for converting multimedia inputs to binary codes. Most existing deep unsupervised hashing methods make use of a quadratic constraint for minimizing the difference between the compact representations and the target binary codes, which inevitably causes severe information loss. In this paper, we propose a novel deep unsupervised method called DeepQuan for hashing. The DeepQuan model utilizes a deep autoencoder network, where the encoder is used to learn compact representations and the decoder is for manifold preservation. In contrast with the existing unsupervised methods, DeepQuan learns the binary codes by minimizing the quantization error through the product quantization technique. Furthermore, a weighted triplet loss is proposed to avoid trivial solutions and poor generalization. Extensive experimental results on standard datasets show that the proposed DeepQuan model outperforms the state-of-the-art unsupervised hashing methods for image retrieval tasks.
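Illustrative sketch (not from the paper): the retrieval side of hashing, common to the papers in this session, reduces to comparing short binary codes by Hamming distance. The sign-threshold binarization below is the simple baseline step rather than DeepQuan's product quantization, and the function names and toy database are hypothetical.

```python
def binarize(vector):
    """Pack the signs of a real-valued vector into an integer code (positive -> 1)."""
    code = 0
    for x in vector:
        code = (code << 1) | (1 if x > 0 else 0)
    return code

def hamming(a, b):
    """Number of differing bits between two integer codes."""
    return bin(a ^ b).count("1")

def retrieve(query_code, database, k=2):
    """Names of the k database items nearest in Hamming distance (ties broken by name)."""
    ranked = sorted(database,
                    key=lambda name: (hamming(query_code, database[name]), name))
    return ranked[:k]
```

Because each comparison is one XOR and a bit count, ranking a large database stays cheap, which is the efficiency argument behind hashing-based retrieval.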
#3101

Hierarchical Graph Structure Learning for Multi-View 3D Model Retrieval
Yuting Su, Wenhui Li, Anan Liu, Weizhi Nie

Image Retrieval

3D model retrieval has been widely utilized in numerous domains, such as computer-aided design, digital entertainment and virtual reality. Recently, many graph-based methods have been proposed to address this task by using multiple views of 3D models. However, these methods are always constrained by the many-to-many graph matching for similarity measure between pair-wise models. In this paper, we propose a hierarchical graph structure learning method (HGS) for 3D model retrieval. The proposed method can decompose the complicated multi-view graph-based similarity measure into multiple single-view graph-based similarity measures. In the bottom hierarchy, we present the method for single-view graph generation and further propose the novel method for similarity measure in single-view graph by leveraging both node-wise context and model-wise context. In the top hierarchy, we fuse the similarities in single-view graphs with respect to different viewpoints to get the multi-view similarity between pair-wise models. In this way, the proposed method can avoid the difficulty in definition and computation in the traditional high-order graph. Moreover, this method is unsupervised and is independent of large-scale 3D dataset for model learning. We conduct extensive evaluation on three popular and challenging datasets. The comparison demonstrates the superiority and effectiveness of the proposed method compared with the state of the art. In particular, this unsupervised method can achieve competitive performance against the most recent supervised & deep learning method.

Monday 16 14:55 - 16:10 MUL-SE - AI and Software Engineering, Program Synthesis (C2)

Chair: Giuseppe de Giacomo

#1569

Code Completion with Neural Attention and Pointer Networks
Jian Li, Yue Wang, Michael R. Lyu, Irwin King

AI and Software Engineering, Program Synthesis

Intelligent code completion has become an essential research task to accelerate modern software development. To facilitate effective code completion for dynamically-typed programming languages, we apply neural language models by learning from large codebases, and develop a tailored attention mechanism for code completion. However, standard neural language models, even with an attention mechanism, cannot correctly predict out-of-vocabulary (OoV) words, which restricts code completion performance. In this paper, inspired by the prevalence of locally repeated terms in program source code, and the recently proposed pointer copy mechanism, we propose a pointer mixture network for better predicting OoV words in code completion. Based on the context, the pointer mixture network learns to either generate a within-vocabulary word through an RNN component, or regenerate an OoV word from local context through a pointer component. Experiments on two benchmarked datasets demonstrate the effectiveness of our attention mechanism and pointer mixture network on the code completion task.
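The generate-or-copy idea in this abstract can be illustrated with a minimal sketch. It assumes a fixed scalar gate that weights a softmax over vocabulary scores against a softmax over positions in the local context; in the paper the gate and both score vectors are learned, and every name and number below is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def pointer_mixture(vocab_scores, vocab, pointer_scores, context_tokens, gate):
    """Blend a vocabulary distribution with a copy distribution.

    gate (assumed given here) is the weight on the generator; 1 - gate goes
    to the pointer, which spreads its mass over context positions.
    """
    p_vocab = softmax(vocab_scores)
    p_ptr = softmax(pointer_scores)
    mixed = {}
    for token, p in zip(vocab, p_vocab):
        mixed[token] = mixed.get(token, 0.0) + gate * p
    for token, p in zip(context_tokens, p_ptr):
        # Repeated context tokens accumulate probability mass.
        mixed[token] = mixed.get(token, 0.0) + (1.0 - gate) * p
    return mixed

# Toy example: "my_buffer" is out of vocabulary but appears twice in context.
vocab = ["self", "return", "<unk>"]
context = ["my_buffer", "self", "my_buffer"]
dist = pointer_mixture([1.0, 0.5, 2.0], vocab, [2.0, 0.1, 2.0], context, gate=0.3)
prediction = max(dist, key=dist.get)
```

With a low gate value the copy distribution dominates, so the locally repeated identifier wins over the generic `<unk>` token.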
#3177

Positive and Unlabeled Learning for Detecting Software Functional Clones with Adversarial Training
Hui-Hui Wei, Ming Li

AI and Software Engineering, Program Synthesis

Software clone detection is an important problem for software maintenance and evolution, and it has attracted a lot of attention. However, existing approaches ignore the fact that people label pairs of code fragments as clones only if they happen to discover them, while a huge number of undiscovered clone pairs and non-clone pairs are left unlabeled. In this paper, we argue that the clone detection task in the real world should be formalized as a Positive-Unlabeled (PU) learning problem, and we address this problem by proposing a novel positive and unlabeled learning approach, namely CDPU, to effectively detect software functional clones, i.e., pieces of code with similar functionality that differ at both the syntactic and lexical levels. Adversarial training is employed to improve the robustness of the learned model to non-clone pairs that look extremely similar but behave differently. Experiments on software clone detection benchmarks indicate that the proposed approach together with adversarial training outperforms the state-of-the-art approaches for software functional clone detection.
#3097

Deontic Sensors
Julian Padget, Marina De Vos, Charlie Ann Page

AI and Software Engineering, Program Synthesis

Normative capabilities in multi-agent systems (MAS) can be represented within agents, separately as institutions, or a blend of the two. This paper addresses how to extend the principles of open MAS to the provision of normative reasoning capabilities, which are currently either embedded in existing MAS platforms - tightly coupled and inaccessible - or not present. We use a resource-oriented architecture (ROA) pattern, which we call deontic sensors, to make normative reasoning part of an open MAS architecture. The pattern specifies how to loosely couple MAS and normative frameworks, such that each is agnostic of the other, while augmenting the brute facts that an agent perceives with institutional facts that capture each institution's interpretation of an agent's action. In consequence, a MAS without normative capabilities can acquire them, and an embedded normative framework can be de-coupled and opened to other MAS platforms. More importantly, the deontic sensor pattern allows normative reasoning to be published as services, opening routes to certification and re-use, creation of (formalized) trust and non-specialist access to "on demand" normative reasoning.
#3178

Cutting the Software Building Efforts in Continuous Integration by Semi-Supervised Online AUC Optimization
Zheng Xie, Ming Li

AI and Software Engineering, Program Synthesis

Continuous Integration (CI) systems aim to provide quick feedback on the success of code changes by rebuilding the entire system whenever code changes are committed. However, building the entire software system is usually resource- and time-consuming. Thus, build outcome prediction is usually employed to distinguish successful builds from failed ones, cutting the building effort on those successful builds that do not result in any immediate action by the developer. Nevertheless, build outcome prediction in CI is challenging since the learner should be able to learn from a stream of build events with and without build outcome labels and provide an immediate prediction on the next build event. Also, the distribution of successful and failed builds is often highly imbalanced. Unfortunately, existing methods fail to address these challenges well. In this paper, we address these challenges by proposing a semi-supervised online AUC optimization method for CI build outcome prediction. Experiments indicate that our method is able to cut the software building effort by effectively identifying the successful builds, and that it outperforms existing methods designed to address only part of these challenges.
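To make the AUC objective concrete, the sketch below computes AUC as the fraction of correctly ordered positive/negative pairs and performs a generic online pairwise hinge update for a linear scorer. It is a fully supervised stand-in, not the semi-supervised method of the paper; the buffers, learning rate, features and labels are all assumptions made for illustration.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def online_pairwise_update(w, x, y, buffers, lr=0.1):
    """One streaming step: pair the new example with stored examples of the
    opposite class and take a hinge-loss gradient step on violated pairs."""
    for x_other in buffers[1 - y]:
        pos, neg = (x, x_other) if y == 1 else (x_other, x)
        if dot(w, pos) - dot(w, neg) < 1.0:  # margin violated
            w = [wi + lr * (pi - ni) for wi, pi, ni in zip(w, pos, neg)]
    buffers[y].append(x)
    return w

# Hypothetical stream of (features, label) build events; label 1 is the rare class.
stream = [([1.0, 0.2], 1), ([0.1, 0.3], 0), ([0.2, 0.1], 0),
          ([0.9, 0.4], 1), ([0.0, 0.5], 0), ([0.3, 0.2], 0)]
w = [0.0, 0.0]
buffers = {0: [], 1: []}
for x, y in stream:
    w = online_pairwise_update(w, x, y, buffers)

final_auc = auc([dot(w, x) for x, _ in stream], [y for _, y in stream])
```

Optimizing pairwise ranking rather than accuracy is what makes the objective insensitive to the class imbalance mentioned in the abstract.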
#5133

(Sister Conferences Best Papers Track) Counterexample-Driven Genetic Programming: Stochastic Synthesis of Provably Correct Programs
Krzysztof Krawiec, Iwo Błądek, Jerry Swan, John H. Drake

AI and Software Engineering, Program Synthesis

Genetic programming is an effective technique for inductive synthesis of programs from tests, i.e. training examples of desired input-output behavior. Programs synthesized in this way are not guaranteed to generalize beyond the training set, which is unacceptable in many applications. We present Counterexample-Driven Genetic Programming (CDGP) that employs evolutionary search to synthesize provably correct programs from formal specifications. CDGP employs a Satisfiability Modulo Theories (SMT) solver to formally verify programs in the evaluation phase. A failed verification produces counterexamples that are in turn used to calculate fitness and thereby drive the search process. When compared with a range of approaches on a suite of state-of-the-art specification-based synthesis benchmarks, CDGP systematically outperforms them, typically synthesizing correct programs faster and using fewer tests.
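The counterexample-driven loop can be sketched as follows. This toy version makes two loud substitutions: an exhaustive check over a small finite domain stands in for the SMT solver, and selection from a fixed candidate pool stands in for evolutionary search. The specification (the maximum of two integers) and all names are hypothetical.

```python
import itertools

# Hypothetical candidate pool standing in for an evolving population.
CANDIDATES = {
    "x": lambda x, y: x,
    "y": lambda x, y: y,
    "x + y": lambda x, y: x + y,
    "x if x > y else y": lambda x, y: x if x > y else y,
}

def spec(x, y, out):
    """Formal specification of max(x, y)."""
    return out >= x and out >= y and out in (x, y)

def verify(program, domain=range(-3, 4)):
    """Return a counterexample input, or None if the program meets the spec
    on the whole domain (a brute-force stand-in for an SMT query)."""
    for x, y in itertools.product(domain, repeat=2):
        if not spec(x, y, program(x, y)):
            return (x, y)
    return None

def fitness(program, tests):
    """Number of collected counterexamples the program handles correctly."""
    return sum(spec(x, y, program(x, y)) for x, y in tests)

def counterexample_driven_search():
    tests = []
    remaining = dict(CANDIDATES)
    while remaining:
        best = max(remaining, key=lambda name: fitness(remaining[name], tests))
        counterexample = verify(remaining[best])
        if counterexample is None:
            return best, tests
        tests.append(counterexample)  # failed verification grows the test set
        del remaining[best]
    return None, tests

solution, collected_tests = counterexample_driven_search()
```

Each failed verification contributes one new test, so the test set stays small and contains only inputs that have actually exposed a faulty candidate.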
#2224

Synthesizing Pattern Programs from Examples
Sunbeom So, Hakjoo Oh

AI and Software Engineering, Program Synthesis

We describe a programming-by-example system that automatically generates pattern programs from examples. Writing pattern programs, which produce various patterns of characters, is one of the most popular programming exercises for entry-level students. However, students often find it difficult to write correct solutions by themselves. In this paper, we present a method for synthesizing pattern programs from examples, allowing students to improve their programming skills efficiently. To that end, we first design a domain-specific language that supports a large class of pattern programs that students struggle with. Next, we develop a synthesis algorithm that efficiently finds a desired program by combining enumerative search, constraint solving, and program analysis. We implemented the algorithm in a tool and evaluated it on 40 exercises gathered from online forums. The experimental results and user study show that our tool can synthesize instructive solutions from 1–3 example patterns in 1.2 seconds on average.

Monday 16 14:55 - 16:10 UAI-GM - Graphical Models (C3)

Chair: Karthika Mohan

#776

Learning with Adaptive Neighbors for Image Clustering
Yang Liu, Quanxue Gao, Zhaohua Yang, Shujian Wang

Graphical Models

Due to their importance and efficiency in learning complex structures hidden in data, graph-based methods have been widely studied and have been successful in unsupervised learning. Generally, most existing graph-based clustering methods require post-processing on the original data graph to extract the clustering indicators. However, there are two drawbacks with these methods: (1) the cluster structures are not explicit in the clustering results; (2) the final clustering performance is sensitive to the construction of the original data graph. To solve these problems, in this paper, a novel learning model is proposed to learn a graph based on the given data graph such that the newly obtained optimal graph is more suitable for the clustering task. We also propose an efficient algorithm to solve the model. Extensive experimental results illustrate that the proposed model outperforms other state-of-the-art clustering algorithms.
#1636

Markov Random Neural Fields for Face Sketch Synthesis
Mingjin Zhang, Nannan Wang, Xinbo Gao, Yunsong Li

Graphical Models

Synthesizing face sketches with both common and specific information from photos has recently attracted considerable attention in digital entertainment. However, existing approaches either make a strict similarity assumption on face sketches and photos, which loses some identity-specific information, or learn a direct mapping from face photos to sketches with a simple neural network, which results in a lack of some common information. In this paper, we propose a novel face sketch synthesis method based on Markov random neural fields, which comprise two structures. In the first structure, we utilize the neural network to learn the non-linear photo-sketch relationship and obtain the identity-specific information of the test photo, such as glasses, hairpins and hairstyles. In the second structure, we choose from the training data the nearest neighbors of the test photo patch and of the sketch pixel synthesized in the first structure, which ensures the common information of Miss or Mr Average. Experimental results on the Chinese University of Hong Kong face sketch database illustrate that our proposed framework can preserve the common structure and capture the characteristic features. Compared with the state-of-the-art methods, our method achieves better results in terms of both quantitative and qualitative experimental evaluations.
#4164

Where Have You Been? Inferring Career Trajectory from Academic Social Network
Kan Wu, Jie Tang, Chenhui Zhang

Graphical Models

A person’s career trajectory is composed of her/his past work or educational affiliations (institutions) at different points in time. Knowing people’s, especially scholars’, career trajectories can help the government make more scientific strategies to allocate resources and attract talent and help companies make smart recruiting plans. It could also help individuals find appropriate co-researchers or job opportunities. The paper focuses on inferring career trajectories in the academic social network. Because about 1/3 of authors have no affiliations in the dataset, we need to infer the missing affiliations at various years. Traditional affiliation/location inferring methods focus on inferring a stationary location (one and only) for a person. Nevertheless, people won’t stay in one place all their lives. We propose a Space-Time Factor Graph Model (STFGM) incorporating spatial and temporal correlations to fulfill the challenging and new task of inferring temporal locations. Experiments show our approach significantly outperforms baselines. Finally, as a case study, we develop several applications based on our approach, which further demonstrate its effectiveness.
#4359

Structured Inference for Recurrent Hidden Semi-Markov Model
Hao Liu, Lirong He, Haoli Bai, Bo Dai, Kun Bai, Zenglin Xu

Graphical Models

Segmentation and labeling for high dimensional time series is an important yet challenging task in a number of applications, such as behavior understanding and medical diagnosis. Recent advances in modeling the nonlinear dynamics in such time series data have suggested incorporating recurrent neural networks into Hidden Markov Models. However, this has made the inference procedure much more complicated, often leading to intractable inference, especially for the discrete variables of segmentation and labeling. To achieve both flexibility and tractability in modeling nonlinear dynamics of discrete variables, we present a structured and stochastic sequential neural network (SSNN), which is composed of a generative network and an inference network. In detail, the generative network aims not only to capture the long-term dependencies but also to model the uncertainty of the segmentation labels via semi-Markov models. More importantly, for efficient and accurate inference, the proposed bi-directional inference network reparameterizes the categorical segmentation with the Gumbel-Softmax approximation and resorts to Stochastic Gradient Variational Bayes. We evaluate the proposed model on a number of tasks, including speech modeling, automatic segmentation and labeling in behavior understanding, and sequential multi-object recognition. Experimental results have demonstrated that our proposed model can achieve significant improvement over the state-of-the-art methods.
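The Gumbel-Softmax approximation mentioned in this abstract can be sketched in a few lines: Gumbel noise is added to the logits and a temperature-scaled softmax yields a relaxed one-hot sample. The logits, temperature and seed below are arbitrary assumptions, and the sketch shows only the sampling step, not the paper's inference network.

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed categorical sample: softmax((logits + g) / T),
    where g is i.i.d. standard Gumbel noise."""
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)       # guard against log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    return softmax(noisy)

rng = random.Random(0)
logits = [2.0, 1.0, 0.0]
sample = gumbel_softmax_sample(logits, temperature=0.5, rng=rng)

# The argmax of repeated samples follows the categorical softmax(logits).
counts = [0, 0, 0]
for _ in range(5000):
    relaxed = gumbel_softmax_sample(logits, 0.5, rng)
    counts[relaxed.index(max(relaxed))] += 1
frequencies = [c / 5000 for c in counts]
```

Lower temperatures push samples closer to one-hot vectors, while higher temperatures give smoother samples; because the sample is a differentiable function of the logits, gradients can flow through it.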
#1627

Patent Litigation Prediction: A Convolutional Tensor Factorization Approach
Qi Liu, Han Wu, Yuyang Ye, Hongke Zhao, Chuanren Liu, Dongfang Du

Graphical Models

Patent litigation is an expensive legal process faced by many companies. To reduce the cost of patent litigation, one effective approach is proactive management based on predictive analysis. However, automatic prediction of patent litigation is still an open problem due to the complexity of lawsuits. In this paper, we propose a data-driven framework, Convolutional Tensor Factorization (CTF), to identify the patents that may cause litigation between two companies. Specifically, CTF is a hybrid modeling approach, where the content features from the patents are represented by the Network embedding-combined Convolutional Neural Network (NCNN) and the lawsuit records of companies are summarized in a tensor. Then, CTF integrates NCNN and tensor factorization to systematically exploit both content information and collaborative information from large amounts of data. Finally, the risky patents are returned by a learning-to-rank strategy. Extensive experimental results on real-world data demonstrate the effectiveness of our framework.
#4153

Building Sparse Deep Feedforward Networks using Tree Receptive Fields
Xiaopeng Li, Zhourong Chen, Nevin L. Zhang

Graphical Models

Sparse connectivity is an important factor behind the success of convolutional neural networks and recurrent neural networks. In this paper, we consider the problem of learning sparse connectivity for feedforward neural networks (FNNs). The key idea is that a unit should be connected to a small number of units at the next level below that are strongly correlated. We use Chow-Liu's algorithm to learn a tree-structured probabilistic model for the units at the current level, use the tree to identify subsets of units that are strongly correlated, and introduce a new unit with receptive field over the subsets. The procedure is repeated on the new units to build multiple layers of hidden units. The resulting model is called a TRF-net. Empirical results show that, when compared to dense FNNs, TRF-net achieves better or comparable classification performance with far fewer parameters and sparser structures. TRF-nets are also more interpretable.
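The tree-learning step named in this abstract can be sketched as a maximum spanning tree over pairwise mutual information, which is the core of the Chow-Liu algorithm. The sketch assumes binary variables and a hypothetical toy dataset; how the paper derives receptive fields from the tree is not shown.

```python
import math
from itertools import combinations

def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two binary columns."""
    n = len(xs)
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_ab = sum(1 for x, y in zip(xs, ys) if x == a and y == b) / n
            p_a = sum(1 for x in xs if x == a) / n
            p_b = sum(1 for y in ys if y == b) / n
            if p_ab > 0:
                mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi

def chow_liu_tree(columns):
    """Edges of a maximum spanning tree weighted by mutual information
    (Kruskal's algorithm with a small union-find)."""
    edges = sorted(
        ((mutual_information(columns[i], columns[j]), i, j)
         for i, j in combinations(range(len(columns)), 2)),
        reverse=True,
    )
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((i, j))
    return tree

# Toy data: units 0 and 1 are identical, units 2 and 3 are identical,
# and the two groups are only weakly related.
a = [0, 0, 0, 0, 1, 1, 1, 1]
c = [0, 0, 0, 1, 0, 1, 1, 1]
tree = chow_liu_tree([a, list(a), c, list(c)])
```

On this data the two strongly correlated pairs are linked first, and a single weak cross-group edge completes the tree.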

Monday 16 14:55 - 16:25 ML-TAM1 - Transfer, Adaptation, Multi-Task Learning 1 (K11)

Chair: Yu-Feng Li

#2483

MUSCAT: Multi-Scale Spatio-Temporal Learning with Application to Climate Modeling
Jianpeng Xu, Xi Liu, Tyler Wilson, Pang-Ning Tan, Pouyan Hatami, Lifeng Luo

Transfer, Adaptation, Multi-Task Learning 1

In climate and environmental sciences, vast amounts of spatio-temporal data have been generated at varying spatial resolutions from satellite observations and computer models. Integrating such diverse sources of data has proven to be useful for building prediction models as the multi-scale data may capture different aspects of the Earth system. In this paper, we present a novel framework called MUSCAT for predictive modeling of multi-scale, spatio-temporal data. MUSCAT performs a joint decomposition of multiple tensors from different spatial scales, taking into account the relationships between the variables. The latent factors derived from the joint tensor decomposition are used to train the spatial and temporal prediction models at different scales for each location. The outputs from this ensemble of spatial and temporal models are aggregated to generate future predictions. An incremental learning algorithm is also proposed to handle the massive size of the tensors. Experimental results on real-world data from the United States Historical Climate Network (USHCN) showed that MUSCAT outperformed other competing methods in more than 70% of the locations.
#2306

Unsupervised Cross-Modality Domain Adaptation of ConvNets for Biomedical Image Segmentations with Adversarial Loss
Qi Dou, Cheng Ouyang, Cheng Chen, Hao Chen, Pheng-Ann Heng

Transfer, Adaptation, Multi-Task Learning 1

Convolutional networks (ConvNets) have achieved great successes in various challenging vision tasks. However, the performance of ConvNets degrades under domain shift. Domain adaptation is both more important and more challenging in the field of biomedical image analysis, where cross-modality data have largely different distributions. Given that annotating medical data is especially expensive, supervised transfer learning approaches are not optimal. In this paper, we propose an unsupervised domain adaptation framework with adversarial learning for cross-modality biomedical image segmentation. Specifically, our model is based on a dilated fully convolutional network for pixel-wise prediction. Moreover, we build a plug-and-play domain adaptation module (DAM) to map the target input to features which are aligned with the source domain feature space. A domain critic module (DCM) is set up for discriminating the feature space of both domains. We optimize the DAM and DCM via an adversarial loss without using any target domain label. Our proposed method is validated by adapting a ConvNet trained with MRI images to unpaired CT data for cardiac structure segmentation, and achieves very promising results.
#2818

Distance Metric Facilitated Transportation between Heterogeneous Domains
Han-Jia Ye, Xiang-Rong Sheng, De-Chuan Zhan, Peng He

Transfer, Adaptation, Multi-Task Learning 1

Lack of training examples is one of the main obstacles for learning systems. Transfer learning aims to extract and utilize useful information from related datasets to assist the current task effectively. Most existing methods restrict task connections to the same feature sets, require aligned examples across domains, or cannot take full advantage of the limited label information. In this paper, we focus on transferring between heterogeneous domains, i.e., those with different feature spaces, and propose the Metric Transportation on HEterogeneous REpresentations (MapHere) approach. In particular, an asymmetric transformation map is first learned to compensate for the cross-domain feature difference based on linkage relationships between objects; then the inner-domain discrepancy is further reduced with learned optimal transportation. Note that both source domain and cross-domain relationships are fully utilized in MapHere, which helps improve the target classification task considerably. Experiments on a synthetic dataset validate the importance of the "metric facilitated" consideration, while results on real-world image and text classification also show the superiority of the proposed MapHere approach.
#4096

Summarizing Source Code with Transferred API Knowledge
Xing Hu, Ge Li, Xin Xia, David Lo, Shuai Lu, Zhi Jin

Transfer, Adaptation, Multi-Task Learning 1

Code summarization, aiming to generate succinct natural language description of source code, is extremely useful for code search and code comprehension. It has played an important role in software maintenance and evolution. Previous approaches generate summaries by retrieving summaries from similar code snippets. However, these approaches heavily rely on whether similar code snippets can be retrieved, how similar the snippets are, and fail to capture the API knowledge in the source code, which carries vital information about the functionality of the source code. In this paper, we propose a novel approach, named TL-CodeSum, which successfully uses API knowledge learned in a different but related task to code summarization. Experiments on large-scale real-world industry Java projects indicate that our approach is effective and outperforms the state-of-the-art in code summarization.
#1674

Multi-Task Clustering with Model Relation Learning
Xiaotong Zhang, Xianchao Zhang, Han Liu, Jiebo Luo

Transfer, Adaptation, Multi-Task Learning 1

Multi-task clustering improves the clustering performance of each task by transferring knowledge among the related tasks. An important aspect of multi-task clustering is to assess the task relatedness. However, to our knowledge, only two previous works have assessed the task relatedness, but they both have limitations. In this paper, we propose a multi-task clustering with model relation learning (MTCMRL) method, which automatically learns the model parameter relatedness between each pair of tasks. The objective function of MTCMRL consists of two parts: (1) within-task clustering: clustering each task by introducing linear regression model into symmetric nonnegative matrix factorization; (2) cross-task relatedness learning: updating the parameter of the linear regression model in each task by learning the model parameter relatedness between the clusters in each pair of tasks. We present an effective alternating algorithm to solve the non-convex optimization problem. Experimental results show the superiority of the proposed method over traditional single-task clustering methods and existing multi-task clustering methods.
#4385

Semi-Supervised Optimal Transport for Heterogeneous Domain Adaptation
Yuguang Yan, Wen Li, Hanrui Wu, Huaqing Min, Mingkui Tan, Qingyao Wu

Transfer, Adaptation, Multi-Task Learning 1

Heterogeneous domain adaptation (HDA) aims to exploit knowledge from a heterogeneous source domain to improve the learning performance in a target domain. Since the feature spaces of the source and target domains are different, the transferring of knowledge is extremely difficult. In this paper, we propose a novel semi-supervised algorithm for HDA by exploiting the theory of optimal transport (OT), a powerful tool originally designed for aligning two different distributions. To match the samples between heterogeneous domains, we propose to preserve the semantic consistency between heterogeneous domains by incorporating label information into the entropic Gromov-Wasserstein discrepancy, which is a metric in OT for different metric spaces, resulting in a new semi-supervised scheme. Via the new scheme, the target and transported source samples with the same label are enforced to follow similar distributions. Lastly, based on the Kullback-Leibler metric, we develop an efficient algorithm to optimize the resultant problem. Comprehensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our proposed method.
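For readers unfamiliar with entropic optimal transport, the sketch below runs plain Sinkhorn iterations between two small discrete distributions with a known cost matrix. This is a deliberately simpler setting than the entropic Gromov-Wasserstein discrepancy used in the paper, which compares distributions living in different metric spaces; the marginals, costs and regularization strength are hypothetical.

```python
import math

def sinkhorn(a, b, cost, reg=0.5, iters=200):
    """Entropy-regularized optimal transport plan between histograms a and b.

    Alternately rescales rows and columns of the Gibbs kernel exp(-cost/reg)
    so that the plan's marginals match a and b.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(iters):
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

# Two source points and two target points; moving mass "across" costs 1.
source = [0.7, 0.3]
target = [0.4, 0.6]
cost = [[0.0, 1.0],
        [1.0, 0.0]]
plan = sinkhorn(source, target, cost)
row_sums = [sum(row) for row in plan]
col_sums = [sum(plan[i][j] for i in range(2)) for j in range(2)]
```

The resulting plan keeps most of the mass on the zero-cost diagonal and moves only the surplus of the first source point across.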
#4027

Social Media based Simulation Models for Understanding Disease Dynamics
Ting Hua, Chandan K Reddy, Lei Zhang, Lijing Wang, Liang Zhao, Chang-Tien Lu, Naren Ramakrishnan

Transfer, Adaptation, Multi-Task Learning 1

In this modern era, infectious diseases, such as H1N1, SARS, and Ebola, are spreading much faster than at any time in history. Efficient approaches are therefore desired to monitor and track the diffusion of these deadly epidemics. Traditional computational epidemiology models are able to capture disease spreading trends through contact networks, but are unable to provide timely updates via real-world data. In contrast, techniques focusing on emerging social media platforms can collect and monitor real-time disease data, but do not provide an understanding of the underlying dynamics of ailment propagation. To achieve efficient and accurate real-time disease prediction, the framework proposed in this paper combines the strengths of social media mining and computational epidemiology. Specifically, individual health status is first learned from users' online posts through Bayesian inference, disease parameters are then extracted for the computational models at population level, and the outputs of the computational epidemiology model are fed back into the social media data based models for further performance improvement. In various experiments, our proposed model outperforms current disease forecasting approaches with better accuracy and more stability.

Monday 16 14:55 - 16:35 EAR2 - Early Career 2 (VICTORIA)

Chair: Makoto Yokoo

#5479

Towards Improving the Expressivity and Scalability of Distributed Constraint Optimization Problems
William Yeoh

Early Career 2

Constraints have long been studied in centralized systems and have proven to be practical and efficient for modeling and solving resource allocation and scheduling problems. Slightly more than a decade ago, researchers proposed the distributed constraint optimization problem (DCOP) formulation, which is well suited for modeling distributed multi-agent coordination problems. In this paper, we highlight some of our recent contributions that are aiming towards improved expressivity of the DCOP model as well as improved scalability of the accompanying algorithms.
#5493

Formal Analysis of Deep Binarized Neural Networks
Nina Narodytska

Early Career 2

Understanding properties of deep neural networks is an important challenge in deep learning. Deep neural networks are among the most successful artificial intelligence technologies and are making an impact in a variety of practical applications. However, many concerns have been raised about the 'magical' power of these networks. It is disturbing that we lack understanding of the decision-making process behind this technology. Therefore, a natural question is whether we can trust decisions that neural networks make. One way to address this issue is to define properties that we want a neural network to satisfy. Verifying whether a neural network fulfills these properties sheds light on the properties of the function that it represents. In this work, we take the verification approach. Our goal is to design a framework for analysis of properties of neural networks. We start by defining a set of interesting properties to analyze. Then we focus on Binarized Neural Networks that can be represented and analyzed using well-developed means of Boolean Satisfiability and Integer Linear Programming. One of our main results is an exact representation of a binarized neural network as a Boolean formula. We also discuss how we can take advantage of the structure of neural networks in the search procedure.
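Why a binarized network admits an exact Boolean representation can be seen on a single neuron. The sketch assumes a simplified neuron with weights and inputs in {-1, +1}, an integer bias and a sign activation (no batch normalization), and checks by enumeration that it equals a cardinality constraint over Boolean literals. This is an illustrative encoding, not necessarily the one used in the paper.

```python
import itertools
import math

def neuron_arithmetic(weights, bias, xs):
    """Binarized neuron: fires iff the weighted sum plus bias is non-negative."""
    return sum(w * x for w, x in zip(weights, xs)) + bias >= 0

def neuron_boolean(weights, bias, xs):
    """Same neuron as a cardinality constraint.

    Literal i is true when input i agrees in sign with weight i. With k true
    literals out of n, the weighted sum equals 2k - n, so the neuron fires
    iff k >= ceil((n - bias) / 2).
    """
    literals = [(x == 1) if w == 1 else (x == -1) for w, x in zip(weights, xs)]
    threshold = math.ceil((len(weights) - bias) / 2)
    return sum(literals) >= threshold

weights = [1, -1, 1, 1, -1]
equivalent = all(
    neuron_arithmetic(weights, bias, xs) == neuron_boolean(weights, bias, xs)
    for bias in range(-6, 7)
    for xs in itertools.product([-1, 1], repeat=len(weights))
)
```

Cardinality constraints of this kind have standard translations to CNF, which is what makes SAT solvers applicable to such networks.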
#5484

Reasoning about NP-complete Constraints
Emmanuel Hebrard

Early Career 2

The concept of local consistency – making global deductions from local infeasibility – is central to constraint programming. When reasoning about NP-complete constraints, however, since achieving a "complete" form of local consistency is often considered too hard, we need other tools to design and analyze propagation algorithms. In this paper, we argue that NP-complete constraints are an essential part of constraint programming, that designing dedicated methods has led to, and will bring, significant breakthroughs, and that we need to carefully investigate methods to deal with a necessarily incomplete inference. In particular, we advocate the use of fixed-parameter tractability and kernelization for this purpose.

Monday 16 16:40 - 17:55 EAR3 - Early Career 3 (VICTORIA)

Chair: Pierre Marquis

#5485

Natural Language Understanding: Instructions for (Present and Future) Use
Roberto Navigli

Early Career 3

In this paper I look at Natural Language Understanding, an area of Natural Language Processing aimed at making sense of text, through the lens of a visionary future: what do we expect a machine should be able to understand? and what are the key dimensions that require the attention of researchers to make this dream come true?
#5491

Improving Data Management using Domain Knowledge
Magdalena Ortiz

Early Career 3

The development of tools and techniques for flexible and reliable data management is a long-standing challenge, ever more pressing in today’s data-rich world. We advocate using domain knowledge expressed in ontologies to tackle it, and summarize some research efforts to this aim that follow two directions. First, we consider the problem of ontology-mediated query answering (OMQA), where queries in a standard database query language are enriched with an ontology expressing background knowledge about the domain of interest, used to retrieve more complete answers when querying incomplete data. We discuss some of our contributions to OMQA, focusing on (i) expressive languages for OMQA, with emphasis on combining the open- and closed-world assumptions to reason about partially complete data; and (ii) OMQA algorithms based on rewriting techniques. The second direction we discuss proposes to use ontologies to manage evolving data. In particular, we use ontologies to model and reason about constraints on datasets, effects of operations that modify data, and the integrity of the data as it evolves.
#5483

Artificial Argumentation for Humans
Serena Villata

Early Career 3

Recent years have seen an increasing interest in the topic of Artificial Intelligence (AI), the challenges it is facing, and the recent advances it has achieved, e.g., intelligent personal assistants. Unlike in the past, when research on AI was mainly confined to research labs, the topic is now attracting interest from a wider audience, including policy-makers, information technology companies, and philosophers. Alas, these advances have also raised a number of concerns about AI’s social, economic, and legal impact. Hence, the definition of design principles and automated methods to support transparent intelligent machine deliberation is highly desirable. Argumentation is important for handling conflicting beliefs, assumptions, opinions, goals, and many other mental attitudes. Argumentation pervades human intelligent behavior, and I believe that it is a mandatory element to conceive autonomous artificial machines that can exploit argumentation models and tools in the cognitive tasks they are required to carry out. Results in this area will allow reducing the gap between humans and machines towards a good AI hybrid society.

Monday 16 16:40 - 18:05 MUL-CAU - Causality (C7)

Chair: Tommie Meyer

#286

Actual Causality in a Logical Setting
Alexander Bochman

Causality

We provide a definition of actual causation in the logical framework of the causal calculus, which is based on a causal version of the well-known NESS (or INUS) condition. We compare our definition with other, mainly counterfactual, approaches on standard examples. On the way, we explore general capabilities of the logical representation for structural equation models of causation and beyond.
#1571

Causal Inference in Time Series via Supervised Learning
Yoichi Chikahara, Akinori Fujino

Causality

Causal inference in time series is an important problem in many fields. Traditional methods use regression models for this problem. The inference accuracies of these methods depend greatly on whether or not the model can be well fitted to the data, and therefore we are required to select an appropriate regression model, which is difficult in practice. This paper proposes a supervised learning framework that utilizes a classifier instead of regression models. We present a feature representation that employs the distance between the conditional distributions given past variable values and show experimentally that the feature representation provides sufficiently different feature vectors for time series with different causal relationships. Furthermore, we extend our framework to multivariate time series and present experimental results where our method outperformed the model-based methods and the supervised learning method for i.i.d. data.
#2249

Counterfactual Resimulation for Causal Analysis of Rule-Based Models
Jonathan Laurent, Jean Yang, Walter Fontana

Causality

Models based on rules that express local and heterogeneous mechanisms of stochastic interactions between structured agents are an important tool for investigating the dynamical behavior of complex systems, especially in molecular biology. Given a simulated trace of events, the challenge is to construct a causal diagram that explains how a phenomenon of interest occurred. Counterfactual analysis can provide distinctive insights, but its standard definition is not applicable in rule-based models because they are not readily expressible in terms of structural equations. We provide a semantics of counterfactual statements that addresses this challenge by sampling counterfactual trajectories that are probabilistically as close to the factual trace as a given intervention permits them to be. We then show how counterfactual dependencies give rise to explanations in terms of relations of enablement and prevention between events.
#2748

From the Periphery to the Core: Information Brokerage in an Evolving Network
Bo Yan, Yiping Liu, Jiamou Liu, Yijin Cai, Hongyi Su, Hong Zheng

Causality

Interpersonal ties are pivotal to individual efficacy, status and performance in an agent society. This paper explores three important and interrelated themes in social network theory: the center/periphery partition of the network; network dynamics; and social integration of newcomers. We tackle the question: How would a newcomer harness information brokerage to integrate into a dynamic network going from periphery to center? We model integration as the interplay between the newcomer and the dynamic network and capture information brokerage using a process of relationship building. We analyze theoretical guarantees for the newcomer to reach the center through tactics, proving that a winning tactic always exists for certain types of network dynamics. We then propose three tactics and show their superior performance over alternative methods on four real-world datasets and four network models. In general, our tactics place the newcomer at the center by adding very few new edges on dynamic networks with ~14000 nodes.
#4010

Scalable Probabilistic Causal Structure Discovery
Dhanya Sridhar, Jay Pujara, Lise Getoor

Causality

Complex causal networks underlie many real-world problems, from the regulatory interactions between genes to the environmental patterns used to understand climate change. Computational methods seek to infer these causal networks using observational data and domain knowledge. In this paper, we identify three key requirements for inferring the structure of causal networks for scientific discovery: (1) robustness to noise in observed measurements; (2) scalability to handle hundreds of variables; and (3) flexibility to encode domain knowledge and other structural constraints. We first formalize the problem of joint probabilistic causal structure discovery. We develop an approach using probabilistic soft logic (PSL) that exploits multiple statistical tests, supports efficient optimization over hundreds of variables, and can easily incorporate structural constraints, including imperfect domain knowledge. We compare our method against multiple well-studied approaches on biological and synthetic datasets, showing improvements of up to 20% in F1-score over the best performing baseline in realistic settings.
#4239

A Graphical Criterion for Effect Identification in Equivalence Classes of Causal Diagrams
Amin Jaber, Jiji Zhang, Elias Bareinboim

Causality

Computing the effects of interventions from observational data is an important task encountered in many data-driven sciences. The problem is addressed by identifying the post-interventional distribution with an expression that involves only quantities estimable from the pre-interventional distribution over observed variables, given some knowledge about the causal structure. In this work, we relax the requirement of having a fully specified causal structure and study the identifiability of effects with a singleton intervention (X), supposing that the structure is known only up to an equivalence class of causal diagrams, which is the output of standard structural learning algorithms (e.g., FCI). We derive a necessary and sufficient graphical criterion for the identifiability of the effect of X on all observed variables. We further establish a sufficient graphical criterion to identify the effect of X on a subset of the observed variables, and prove that it is strictly more powerful than the current state-of-the-art result on this problem.
#4388

On the Conditional Logic of Simulation Models
Duligur Ibeling, Thomas Icard

Causality

We propose analyzing conditional reasoning by appeal to a notion of intervention on a simulation program, formalizing and subsuming a number of approaches to conditional thinking in the recent AI literature. Our main results include a series of axiomatizations, allowing comparison between this framework and existing frameworks (normality-ordering models, causal structural equation models), and a complexity result establishing NP-completeness of the satisfiability problem. Perhaps surprisingly, some of the basic logical principles common to all existing approaches are invalidated in our causal simulation approach. We suggest that this additional flexibility is important in modeling some intuitive examples.

Monday 16 16:40 - 18:05 MAS-SOC - Computational Social Choice (C8)

Chair: Felix Brandt

#1614

Preference Orders on Families of Sets - When Can Impossibility Results Be Avoided?
Jan Maly, Miroslaw Truszczynski, Stefan Woltran

Computational Social Choice

Lifting a preference order on elements of some universe to a preference order on subsets of this universe is often guided by postulated properties the lifted order should have. Well-known impossibility results pose severe limits on when such liftings exist if all non-empty subsets of the universe are to be ordered. The extent to which these negative results carry over to other families of sets is not known. In this paper, we consider families of sets that induce connected subgraphs in graphs. For such families, common in applications, we study whether lifted orders satisfying the well-studied axioms of dominance and (strict) independence exist for every or, in another setting, only for some underlying order on elements (strong and weak orderability). We characterize families that are strongly and weakly orderable under dominance and strict independence, and obtain a tight bound on the class of families that are strongly orderable under dominance and independence.
#2061

Service Exchange Problem
Julien Lesca, Taiki Todo

Computational Social Choice

In this paper, we study the service exchange problem where each agent is willing to provide her service in order to receive in exchange the service of someone else. We assume that an agent's preference depends both on the service that she receives and the person who receives her service. This framework is an extension of the housing market problem to preferences including a degree of externalities. We investigate the complexity of computing an individually rational and Pareto efficient allocation of services to agents for ordinal preferences, and the complexity of computing an allocation which maximizes either the utility sum or the utility of the least served agent for cardinal preferences.
#2086

Computational Social Choice Meets Databases
Benny Kimelfeld, Phokion G. Kolaitis, Julia Stoyanovich

Computational Social Choice

We develop a novel framework that aims to create bridges between the computational social choice and the database management communities. This framework enriches the tasks currently supported in computational social choice with relational database context, thus making it possible to formulate sophisticated queries about voting rules, candidates, voters, issues, and positions. At the conceptual level, we give rigorous semantics to queries in this framework by introducing the notions of necessary answers and possible answers to queries. At the technical level, we embark on an investigation of the computational complexity of the necessary answers. In particular, we establish a number of results about the complexity of the necessary answers of conjunctive queries involving the plurality rule that contrast sharply with earlier results about the complexity of the necessary winners under the plurality rule.
#2922

When Rigging a Tournament, Let Greediness Blind You
Sushmita Gupta, Sanjukta Roy, Saket Saurabh, Meirav Zehavi

Computational Social Choice

A knockout tournament is a standard format of competition, ubiquitous in sports, elections and decision making. Such a competition consists of several rounds. In each round, all players that have not yet been eliminated are paired up into matches. Losers are eliminated, and winners advance to the next round, until only one winner remains. Given that we can correctly predict the outcome of each potential match (modelled by a tournament D), a seeding of the tournament deterministically determines its winner. Having a favorite player v in mind, the Tournament Fixing Problem (TFP) asks whether there exists a seeding that makes v the winner. Aziz et al. [AAAI’14] showed that TFP is NP-hard. They initiated the study of the parameterized complexity of TFP with respect to the feedback arc set number k of D, and gave an XP-algorithm (which is highly inefficient). Recently, Ramanujan and Szeider [AAAI’17] showed that TFP admits an FPT algorithm, running in time 2^{O(k^2 log k)} n^{O(1)}. At the heart of this algorithm is a translation of TFP into an algebraic system of equations, solved in a black box fashion (by an ILP solver). We present a fresh, purely combinatorial greedy solution. We rely on new insights into TFP itself, which also result in the better running time bound of 2^{O(k log k)} n^{O(1)}. While our analysis is intricate, the algorithm itself is surprisingly simple.
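The setting in the abstract above can be illustrated with a small sketch. This is not the authors' greedy algorithm: it only shows how a seeding together with predicted match outcomes fixes the winner, and it checks the Tournament Fixing Problem by brute force over all seedings, which is feasible only for a handful of players. The function names `knockout_winner` and `can_fix` and the `beats` encoding of the tournament D are assumptions made for illustration.

```python
from itertools import permutations


def knockout_winner(seeding, beats):
    """Play a knockout tournament on a fixed seeding.

    seeding: list of players whose length is a power of two; adjacent
             entries meet in the first round.
    beats:   set of (a, b) pairs meaning a is predicted to beat b
             (a hypothetical encoding of the tournament D).
    """
    players = list(seeding)
    while len(players) > 1:
        next_round = []
        for i in range(0, len(players), 2):
            a, b = players[i], players[i + 1]
            # Exactly one of (a, b) and (b, a) is in a tournament.
            next_round.append(a if (a, b) in beats else b)
        players = next_round
    return players[0]


def can_fix(players, beats, favorite):
    """Brute-force TFP check: try every seeding (exponential time)."""
    return any(
        knockout_winner(list(seeding), beats) == favorite
        for seeding in permutations(players)
    )


# Four players: 0 beats 1; 1 beats 2 and 3; 2 beats 0 and 3; 3 beats 0.
beats = {(0, 1), (1, 2), (1, 3), (2, 0), (2, 3), (3, 0)}
print(knockout_winner([0, 1, 2, 3], beats))
print(can_fix([0, 1, 2, 3], beats, 1))
```

In this toy instance player 0 can only win a first-round match against player 1 and then loses the final to player 2, so no seeding makes player 0 the winner, whereas a suitable seeding exists for player 1.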
#3113

Winning a Tournament by Any Means Necessary
Sushmita Gupta, Sanjukta Roy, Saket Saurabh, Meirav Zehavi

Computational Social Choice

In a tournament, $n$ players enter the competition. In each round, they are paired up to compete against each other. Losers are eliminated, while winners proceed to the next round, until only one player (the winner) is left. Given a prediction of the outcome, for every pair of players, of a match between them (modeled by a digraph $D$), the competitive nature of a tournament makes it attractive for manipulators. In the Tournament Fixing (TF) problem, the goal is to decide if we can conduct the competition (by controlling how players are paired up) so that our favorite player $w$ wins. A common form of manipulation is to bribe players to alter the outcome of matches. Kim and Williams [IJCAI 2015] integrated such deceit into TF, and showed that the resulting problem is NP-hard when $\ell<(1-\epsilon)\log n$ alterations are possible (for any fixed $\epsilon>0$). For this problem, our contribution is fourfold. First, we present two operations that "obfuscate deceit": given one solution, they produce another solution. Second, we present a combinatorial result, stating that there is always a solution with all reversals incident to $w$ and "elite players". Third, we give a closed formula for the case where $D$ is a DAG. Finally, we present exact exponential-time and parameterized algorithms for the general case.
#3606

A Structural Approach to Activity Selection
Eduard Eiben, Robert Ganian, Sebastian Ordyniak

Computational Social Choice

The general task of finding an assignment of agents to activities under certain stability and rationality constraints has led to the introduction of two prominent problems in the area of computational social choice: Group Activity Selection (GASP) and Stable Invitations (SIP). Here we introduce and study the Comprehensive Activity Selection Problem, which naturally generalizes both of these problems. In particular, we apply the parameterized complexity paradigm, which has already been successfully employed for SIP and GASP. While previous work has focused strongly on parameters such as solution size or number of activities, here we focus on parameters which capture the complexity of agent-to-agent interactions. Our results include a comprehensive complexity map for CAS under various restrictions on the number of activities in combination with restrictions on the complexity of agent interactions.
#3513

Deep Learning for Multi-Facility Location Mechanism Design
Noah Golowich, Harikrishna Narasimhan, David C. Parkes

Computational Social Choice

Moulin [1980] characterizes the single-facility, deterministic strategy-proof mechanisms for social choice with single-peaked preferences as the set of generalized median rules. In contrast, we have only a limited understanding of multi-facility strategy-proof mechanisms, and recent work has shown negative worst case results for social cost. Our goal is to design strategy-proof, multi-facility mechanisms that minimize expected social cost. We first give a PAC learnability result for the class of multi-facility generalized median rules, and utilize neural networks to learn mechanisms from this class. Even in the absence of characterization results, we develop a computational procedure for learning almost strategy-proof mechanisms that are as good as or better than benchmarks from the literature, such as the best percentile and dictatorial rules.

Monday 16 16:40 - 18:05 ML-LT - Learning Theory (K2)

Chair: Yann Chevaleyre

#366

Differential Equations for Modeling Asynchronous Algorithms
Li He, Qi Meng, Wei Chen, Zhi-Ming Ma, Tie-Yan Liu

Learning Theory

Asynchronous stochastic gradient descent (ASGD) is a popular parallel optimization algorithm in machine learning. Most theoretical analyses of ASGD take a discrete view and prove upper bounds for its convergence rate. However, the discrete view has its intrinsic limitations: there is no characterization of the optimization path and the proof techniques are induction-based and thus usually complicated. Inspired by the recent successful adoption of stochastic differential equations (SDE) in the theoretical analysis of SGD, in this paper, we study the continuous approximation of ASGD by using stochastic differential delay equations (SDDE). We introduce the approximation method and study the approximation error. Then we conduct theoretical analysis on the convergence rate of the ASGD algorithm based on the continuous approximation. Two methods, moment estimation and energy function minimization, can be used to analyze the convergence rates. Moment estimation depends on the specific form of the loss function, while energy function minimization only leverages the convex property of the loss function, and does not depend on its specific form. In addition to the convergence analysis, the continuous view also helps us derive better convergence rates. All of this clearly shows the advantage of taking the continuous view in gradient descent algorithms.
#2040

On the Convergence Properties of a K-step Averaging Stochastic Gradient Descent Algorithm for Nonconvex Optimization
Fan Zhou, Guojing Cong

Learning Theory

We adopt and analyze a synchronous K-step averaging stochastic gradient descent algorithm which we call K-AVG for solving large scale machine learning problems. We establish the convergence results of K-AVG for nonconvex objectives. Our analysis of K-AVG applies to many existing variants of synchronous SGD. We explain why the K-step delay is necessary and leads to better performance than traditional parallel stochastic gradient descent, which is equivalent to K-AVG with $K=1$. We also show that K-AVG scales better with the number of learners than asynchronous stochastic gradient descent (ASGD). Another advantage of K-AVG over ASGD is that it allows larger stepsizes and facilitates faster convergence. On a cluster of 128 GPUs, K-AVG is faster than ASGD implementations and achieves better accuracies and faster convergence for training with the CIFAR-10 dataset.
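The K-step averaging scheme described in the abstract above can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a scalar parameter, a noisy quadratic objective, and learners simulated sequentially rather than in parallel. The names `k_avg` and `noisy_quadratic_grad` and all parameter values are hypothetical.

```python
import random


def k_avg(grad, w0, num_learners, k, step, rounds, seed=0):
    """K-step averaging SGD on a scalar parameter (illustrative only).

    Each round, every learner starts from the shared parameter, takes k
    local SGD steps with its own stochastic gradients, and the local
    results are averaged to form the next shared parameter.
    """
    rng = random.Random(seed)
    w = w0
    for _ in range(rounds):
        local_results = []
        for _ in range(num_learners):
            w_local = w
            for _ in range(k):
                w_local -= step * grad(w_local, rng)
            local_results.append(w_local)
        w = sum(local_results) / num_learners  # synchronous averaging
    return w


def noisy_quadratic_grad(w, rng):
    """Stochastic gradient of f(w) = (w - 3)^2 / 2 with Gaussian noise."""
    return (w - 3.0) + rng.gauss(0.0, 0.1)


w_final = k_avg(noisy_quadratic_grad, w0=0.0, num_learners=4, k=8,
                step=0.1, rounds=50)
print(abs(w_final - 3.0) < 0.5)
```

Setting `k=1` in this sketch averages after every step, which corresponds to the traditional parallel SGD baseline that the abstract identifies with K-AVG at $K=1$.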
#547

Quantum Divide-and-Conquer Anchoring for Separable Non-negative Matrix Factorization
Yuxuan Du, Tongliang Liu, Yinan Li, Runyao Duan, Dacheng Tao

Learning Theory

It is NP-complete to find non-negative factors W and H with fixed rank r from a non-negative matrix X by minimizing ||X - WH^T||^2. Although the separability assumption (all data points are in the conical hull of the extreme rows) enables polynomial-time algorithms, the computational cost is not affordable for big data. This paper investigates how the power of quantum computation can be harnessed to solve the non-negative matrix factorization with the separability assumption (SNMF) by devising a quantum algorithm based on the divide-and-conquer anchoring (DCA) scheme [Zhou et al., 2013]. The design of quantum DCA (QDCA) is challenging. In the divide step, the random projections in DCA are completed by a quantum algorithm for linear operations, which achieves an exponential speedup. We then devise a heuristic post-selection procedure which extracts the information of anchors stored in the quantum states efficiently. Under a plausible assumption, QDCA performs efficiently, achieves a quantum speedup, and is beneficial for high dimensional problems.
#807

A Generic Approach for Accelerating Stochastic Zeroth-Order Convex Optimization
Xiaotian Yu, Irwin King, Michael R. Lyu, Tianbao Yang

Learning Theory

In this paper, we propose a generic approach for accelerating the convergence of existing algorithms to solve the problem of stochastic zeroth-order convex optimization (SZCO). Standard techniques for accelerating the convergence of stochastic zeroth-order algorithms either explore multiple functional evaluations (e.g., two-point evaluations) or exploit global conditions of the problem (e.g., smoothness and strong convexity). Nevertheless, these classic acceleration techniques necessarily restrict the applicability of newly developed algorithms. The key to our proposed generic approach is to explore a local growth condition (also called a local error bound condition) of the objective function in SZCO. The benefits of the proposed acceleration technique are: (i) it is applicable to both settings with one-point evaluation and two-point evaluations; (ii) it does not necessarily require strong convexity or smoothness condition of the objective function; (iii) it yields an improvement on convergence for a broad family of problems. Empirical studies in various settings demonstrate the effectiveness of the proposed acceleration approach.
#938

De-biasing Covariance-Regularized Discriminant Analysis
Haoyi Xiong, Wei Cheng, Yanjie Fu, Wenqing Hu, Jiang Bian, Zhishan Guo

Learning Theory

Fisher's Linear Discriminant Analysis (FLD) is a well-known technique for linear classification, feature extraction and dimension reduction. The empirical FLD relies on two key estimations from the data -- the mean vector for each class and the (inverse) covariance matrix. To improve the accuracy of FLD under the High Dimension Low Sample Size (HDLSS) settings, Covariance-Regularized FLD (CRLD) has been proposed to use shrunken covariance estimators, such as Graphical Lasso, to strike a balance between biases and variances. Though CRLD could obtain better classification accuracy, it usually incurs bias and converges to the optimal result with a slower asymptotic rate. Inspired by the recent progress in de-biased Lasso, we propose a novel FLD classifier, DBLD, which improves classification accuracy of CRLD through de-biasing. Theoretical analysis shows that DBLD possesses better asymptotic properties than CRLD. We conduct experiments on both synthetic datasets and real application datasets to confirm the correctness of our theoretical analysis and demonstrate the superiority of DBLD over classical FLD, CRLD and other downstream competitors under HDLSS settings.
#3236

Interactive Optimal Teaching with Unknown Learners
Francisco S. Melo, Carla Guerra, Manuel Lopes

Learning Theory

This paper introduces a new approach for machine teaching that partly addresses the (unavoidable) mismatch between what the teacher assumes about the learning process of the student and the actual process. We analyze several situations in which such mismatch takes place, including when the student's learning algorithm is known but the corresponding parameters are not, and when the learning algorithm itself is not known. Our analysis is focused on the case of a Bayesian Gaussian learner, and we show that, even in this simple case, the lack of knowledge regarding the student's learning process significantly deteriorates the performance of machine teaching: while perfect knowledge of the student ensures that the target is learned after a finite number of samples, lack of knowledge thereof implies that the student will only learn asymptotically (i.e., after an infinite number of samples). We introduce interactivity as a means to mitigate the impact of imperfect knowledge and show that, by using interactivity, we are able to recover finite learning time, in the best case, or significantly faster convergence, in the worst case. Finally, we discuss the extension of our analysis to a classification problem using linear discriminant analysis, and discuss the implications of our results in single- and multi-student settings.
#4021

Generalization-Aware Structured Regression towards Balancing Bias and Variance
Martin Pavlovski, Fang Zhou, Nino Arsov, Ljupco Kocarev, Zoran Obradovic

Learning Theory

Attaining the proper balance between underfitting and overfitting is one of the central challenges in machine learning. It has been approached mostly by deriving bounds on generalization risks of learning algorithms. Such bounds are, however, rarely controllable. In this study, a novel bias-variance balancing objective function is introduced in order to improve generalization performance. By utilizing distance correlation, this objective function is able to indirectly control a stability-based upper bound on a model's expected true risk. In addition, the Generalization-Aware Collaborative Ensemble Regressor (GLACER) is developed, a model that bags a crowd of structured regression models, while allowing them to collaborate in a fashion that minimizes the proposed objective function. The experimental results on both synthetic and real-world data indicate that such an objective enhances the overall model's predictive performance. When compared against a broad range of both traditional and structured regression models, GLACER was ~10-56% and ~49-99% more accurate for the task of predicting housing prices and hospital readmissions, respectively.

Monday 16 16:40 - 18:05 NLP-DIA1 - Dialogue, Conversation Models (T2)

Chair: Xiaojuan Ma

#2384

Get The Point of My Utterance! Learning Towards Effective Responses with Multi-Head Attention Mechanism
Chongyang Tao, Shen Gao, Mingyue Shang, Wei Wu, Dongyan Zhao, Rui Yan

Dialogue, Conversation Models

The attention mechanism has become a popular and widely used component in sequence-to-sequence models. However, previous research on neural generative dialogue systems always generates universal responses, and the attention distribution learned by the model always attends to the same semantic aspect. To solve this problem, in this paper, we propose a novel Multi-Head Attention Mechanism (MHAM) for generative dialog systems, which aims at capturing multiple semantic aspects from the user utterance. Further, a regularizer is formulated to force different attention heads to concentrate on certain aspects. The proposed mechanism leads to more informative, diverse, and relevant generated responses. Experimental results show that our proposed model outperforms several strong baselines.
#3483

Learning to Converse with Noisy Data: Generation with Calibration
Mingyue Shang, Zhenxin Fu, Nanyun Peng, Yansong Feng, Dongyan Zhao, Rui Yan

Dialogue, Conversation Models

The availability of abundant conversational data on the Internet brought prosperity to the generation-based open domain conversation systems. In the training of the generation models, existing methods generally treat all the training data equivalently. However, the data crawled from the websites may contain a lot of noise. Blindly training with the noisy data could harm the performance of the final generation model. In this paper, we propose a generation with calibration framework that allows high-quality data to have more influence on the generation model and reduces the effect of noisy data. Specifically, for each instance in the training set, we employ a calibration network to produce a quality score for it; the score is then used for the weighted update of the generation model parameters. Experiments show that the calibrated model outperforms baseline methods on both automatic evaluation metrics and human annotations.
#4533

Smarter Response with Proactive Suggestion: A New Generative Neural Conversation Paradigm
Rui Yan, Dongyan Zhao

Dialogue, Conversation Models

Conversational systems are becoming more and more promising by playing an important role in human-computer communications. A conversational system is supposed to be intelligent to enable human-like interactions. The long-term goal of smart human-computer conversations is challenging and heavily driven by data. Thanks to the prosperity of Web 2.0, a large volume of conversational data has become available to establish human-computer conversational systems. Given a human issued message, namely a query, a traditional conversational system would provide a response after proper training of how to respond like humans. In this paper, we propose a new paradigm for neural generative conversations: given the query, a smarter response with a suggestion is provided. We assume that the new conversation mode, which proactively introduces content as next utterances, keeps the user actively engaged. To address the task, we propose a novel integrated model to handle both the response generation and the suggestion generation. From the experimental results, we verify the effectiveness of the new neural generative conversation paradigm.
#2739

Commonsense Knowledge Aware Conversation Generation with Graph Attention
Hao Zhou, Tom Young, Minlie Huang, Haizhou Zhao, Jingfang Xu, Xiaoyan Zhu

Dialogue, Conversation Models

Commonsense knowledge is vital to many natural language processing tasks. In this paper, we present a novel open-domain conversation generation model to demonstrate how large-scale commonsense knowledge can facilitate language understanding and generation. Given a user post, the model retrieves relevant knowledge graphs from a knowledge base and then encodes the graphs with a static graph attention mechanism, which augments the semantic information of the post and thus supports better understanding of the post. Then, during word generation, the model attentively reads the retrieved knowledge graphs and the knowledge triples within each graph to facilitate better generation through a dynamic graph attention mechanism. This is the first attempt that uses large-scale commonsense knowledge in conversation generation. Furthermore, unlike existing models that use knowledge triples (entities) separately and independently, our model treats each knowledge graph as a whole, which encodes more structured, connected semantic information in the graphs. Experiments show that the proposed model can generate more appropriate and informative responses than state-of-the-art baselines.
#788

Submodularity-Inspired Data Selection for Goal-Oriented Chatbot Training Based on Sentence Embeddings
Mladen Dimovski, Claudiu Musat, Vladimir Ilievski, Andreea Hossman, Michael Baeriswyl

Dialogue, Conversation Models

Spoken language understanding (SLU) systems, such as goal-oriented chatbots or personal assistants, rely on an initial natural language understanding (NLU) module to determine the intent and to extract the relevant information from the user queries they take as input. SLU systems usually help users to solve problems in relatively narrow domains and require a large amount of in-domain training data. This leads to significant data availability issues that inhibit the development of successful systems. To alleviate this problem, we propose a technique of data selection in the low-data regime that enables us to train with fewer labeled sentences, and thus lower labelling costs. We propose a submodularity-inspired data ranking function, the ratio-penalty marginal gain, for selecting data points to label based only on the information extracted from the textual embedding space. We show that the distances in the embedding space are a viable source of information that can be used for data selection. Our method outperforms two known active learning techniques and enables cost-efficient training of the NLU unit. Moreover, our proposed selection technique does not need the model to be retrained in between the selection steps, making it time efficient as well.
#4100

Learning Out-of-Vocabulary Words in Intelligent Personal Agents
Avik Ray, Yilin Shen, Hongxia Jin

Dialogue, Conversation Models

Semantic parsers play a vital role in intelligent agents to convert natural language instructions to an actionable logical form representation. However, after deployment, these parsers suffer from poor accuracy on encountering out-of-vocabulary (OOV) words, or a significant accuracy drop on previously supported instructions after retraining. Achieving both goals simultaneously is non-trivial. In this paper, we propose novel neural network-based parsers to learn OOV words: one incorporating a new hybrid paraphrase generation model, and an enhanced sequence-to-sequence model. Extensive experiments on both benchmark and custom datasets show that our new parsers achieve a significant accuracy gain on OOV words and phrases while maintaining accuracy on previously supported instructions.
#3124

One "Ruler" for All Languages: Multi-Lingual Dialogue Evaluation with Adversarial Multi-Task Learning
Xiaowei Tong, Zhenxin Fu, Mingyue Shang, Dongyan Zhao, Rui Yan

Dialogue, Conversation Models

Automatically evaluating the performance of open-domain dialogue systems is a challenging problem. Recent work in neural network-based metrics has shown promising opportunities for automatic dialogue evaluation. However, existing methods mainly focus on monolingual evaluation, in which the trained metric is not flexible enough to transfer across different languages. To address this issue, we propose an adversarial multi-task neural metric (ADVMT) for multi-lingual dialogue evaluation, with shared feature extraction across languages. We evaluate the proposed model in two different languages. Experiments show that the adversarial multi-task neural metric achieves a high correlation with human annotation, which yields better performance than monolingual ones and various existing metrics.

Monday 16 16:40 - 18:05 CV-CLA - Vision and Classification (T1)

Chair: Minnan Luo

#628

Ensemble Soft-Margin Softmax Loss for Image Classification
Xiaobo Wang, Shifeng Zhang, Zhen Lei, Si Liu, Xiaojie Guo, Stan Z. Li

Vision and Classification

Softmax loss is arguably one of the most popular losses to train CNN models for image classification. However, recent works have exposed its limitation on feature discriminability. This paper casts a new viewpoint on the weakness of softmax loss. On the one hand, the CNN features learned using the softmax loss are often inadequately discriminative. We hence introduce a soft-margin softmax function to explicitly encourage the discrimination between different classes. On the other hand, the learned classifier of softmax loss is weak. We propose to assemble multiple such weak classifiers into a strong one, inspired by the recognition that the diversity among weak classifiers is critical to a good ensemble. To achieve the diversity, we adopt the Hilbert-Schmidt Independence Criterion (HSIC). Considering these two aspects in one framework, we design a novel loss, named Ensemble Soft-Margin Softmax (EM-Softmax). Extensive experiments on benchmark datasets are conducted to show the superiority of our design over the baseline softmax loss and several state-of-the-art alternatives.
#1066

Mixed Link Networks
Wenhai Wang, Xiang Li, Tong Lu, Jian Yang

Vision and Classification

By analyzing the equivalence of modern networks, we find that both ResNet and DenseNet are essentially derived from the same "dense topology", and differ only in the form of connection: addition (dubbed "inner link") vs. concatenation (dubbed "outer link"). However, both forms of connection have their own strengths and weaknesses. To combine their advantages and avoid certain limitations on representation learning, we present a highly efficient and modularized Mixed Link Network (MixNet) which is equipped with flexible inner link and outer link modules. Consequently, ResNet, DenseNet and Dual Path Network (DPN) can each be regarded as a special case of MixNet. Furthermore, we demonstrate that MixNets can achieve superior parameter efficiency over the state-of-the-art architectures on many competitive datasets like CIFAR-10/100, SVHN and ImageNet.
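
As a rough illustration of the two connection forms this abstract contrasts, the sketch below applies an additive "inner link" and a concatenating "outer link" to a plain list of numbers. The function names and the toy `double` layer are hypothetical and chosen only for this example; they are not taken from the paper, and a real network would operate on feature maps rather than lists.

```python
from typing import Callable, List

Vector = List[float]


def inner_link(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """ResNet-style connection: add the new features to the input element-wise."""
    fx = f(x)
    if len(fx) != len(x):
        raise ValueError("inner link needs f(x) to have the same width as x")
    return [a + b for a, b in zip(x, fx)]


def outer_link(x: Vector, f: Callable[[Vector], Vector]) -> Vector:
    """DenseNet-style connection: append the new features to the input."""
    return x + f(x)


def double(x: Vector) -> Vector:
    """Hypothetical stand-in for a learned layer."""
    return [2.0 * v for v in x]


x = [1.0, 2.0]
added = inner_link(x, double)    # width stays at 2
stacked = outer_link(x, double)  # width grows to 4
```

The additive form keeps the feature width fixed, while the concatenating form grows it with every layer, which is one way to read the trade-off the abstract refers to.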
#1655

Zero Shot Learning via Low-rank Embedded Semantic AutoEncoder
Yang Liu, Quanxue Gao, Jin Li, Jungong Han, Ling Shao

Vision and Classification

Zero-shot learning (ZSL) has been widely researched and has achieved success in machine learning. Most existing ZSL methods aim to accurately recognize objects of unseen classes by learning a shared mapping from the feature space to a semantic space. However, such methods have not investigated in depth whether the mapping can precisely reconstruct the original visual feature. Motivated by the fact that the data have low intrinsic dimensionality (e.g., they lie in a low-dimensional subspace), we formulate a novel framework named Low-rank Embedded Semantic AutoEncoder (LESAE) to jointly seek a low-rank mapping to link visual features with their semantic representations. Following the encoder-decoder paradigm, the encoder part aims to learn a low-rank mapping from the visual feature to the semantic space, while the decoder part reconstructs the original data with the learned mapping. In addition, a non-greedy iterative algorithm is adopted to solve our model. Extensive experiments on six benchmark datasets demonstrate its superiority over several state-of-the-art algorithms.
#2474

Energy-efficient Amortized Inference with Cascaded Deep Classifiers
Jiaqi Guan, Yang Liu, Qiang Liu, Jian Peng

Vision and Classification

Deep neural networks have been remarkably successful in various AI tasks but often incur high computation and energy costs for energy-constrained applications such as mobile sensing. We address this problem by proposing a novel framework that optimizes the prediction accuracy and energy cost simultaneously, thus enabling an effective cost-accuracy trade-off at test time. In our framework, each data instance is pushed into a cascade of deep neural networks with increasing sizes, and a selection module is used to sequentially determine when a sufficiently accurate classifier can be used for this data instance. The cascade of neural networks and the selection module are jointly trained in an end-to-end fashion by the REINFORCE algorithm to optimize a trade-off between the computational cost and the predictive accuracy. Our method is able to simultaneously improve the accuracy and efficiency by learning to assign easy instances to fast yet sufficiently accurate classifiers to save computation and energy cost, while assigning harder instances to deeper and more powerful classifiers to ensure satisfactory accuracy. Moreover, we demonstrate our method's effectiveness with extensive experiments on CIFAR-10/100, ImageNet32x32 and the original ImageNet dataset.
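
The control flow of such a cascade can be sketched as follows. This is only an illustration under strong assumptions: the learned selection module trained with REINFORCE is replaced by a fixed confidence threshold, and `cascade_predict`, the toy classifiers and the cost values are all hypothetical names and numbers invented for the example.

```python
from typing import Callable, List, Sequence, Tuple

# A classifier maps an input to a list of class probabilities.
Classifier = Callable[[Sequence[float]], List[float]]


def cascade_predict(x: Sequence[float],
                    classifiers: Sequence[Classifier],
                    costs: Sequence[float],
                    threshold: float = 0.9) -> Tuple[int, float, int]:
    """Run classifiers from cheapest to most expensive and stop once confident.

    Returns (predicted class, accumulated cost, index of the classifier used).
    Assumption: a fixed confidence threshold stands in for a learned selector.
    """
    if not classifiers or len(classifiers) != len(costs):
        raise ValueError("need one cost per classifier and at least one classifier")
    total_cost = 0.0
    last = len(classifiers) - 1
    for i, (clf, cost) in enumerate(zip(classifiers, costs)):
        probs = clf(x)
        total_cost += cost
        # Accept the prediction if it is confident, or if no larger model remains.
        if max(probs) >= threshold or i == last:
            return probs.index(max(probs)), total_cost, i
    raise AssertionError("unreachable: the last classifier always returns")


def small_model(x: Sequence[float]) -> List[float]:
    """Toy cheap model: confident only on 'easy' inputs (x[0] > 0.5)."""
    return [0.95, 0.05] if x[0] > 0.5 else [0.55, 0.45]


def large_model(x: Sequence[float]) -> List[float]:
    """Toy expensive model."""
    return [0.2, 0.8]


easy = cascade_predict([0.9], [small_model, large_model], [1.0, 10.0])
hard = cascade_predict([0.1], [small_model, large_model], [1.0, 10.0])
```

In this toy run the easy input stops at the small model, while the hard input pays for both models, which mirrors the cost-accuracy trade-off the abstract describes.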
#3068

HCR-Net: A Hybrid of Classification and Regression Network for Object Pose Estimation
Zairan Wang, Weiming Li, Yueying Kao, Dongqing Zou, Qiang Wang, Minsu Ahn, Sunghoon Hong

Vision and Classification

Object pose estimation from a single image is a fundamental and challenging problem in computer vision and robotics. Generally, current methods treat pose estimation as a classification or a regression problem. However, regression-based methods usually suffer from the issue of imbalanced training data, while classification methods have difficulty discriminating nearby poses. In this paper, a hybrid CNN model, which we call HCR-Net and which integrates both a classification network and a regression network, is proposed to deal with these issues. Our model is inspired by the observation that regression methods can get better accuracy on homogeneously distributed datasets while classification methods are more effective for coarse quantization of the poses even if the dataset is not well balanced. The classification methods and the regression methods essentially complement each other. Thus we integrate both of them into a neural network in a hybrid fashion and train it end-to-end with two novel loss functions. As a result, our method surpasses the state-of-the-art methods, even with imbalanced training data and much less data augmentation. The experimental results on the challenging Pascal3D+ database demonstrate that our method outperforms the state of the art significantly, achieving improvements on ACC and AVP metrics of up to 4% and 6%, respectively.
#3181

Multi-scale and Discriminative Part Detectors Based Features for Multi-label Image Classification
Gong Cheng, Decheng Gao, Yang Liu, Junwei Han

Vision and Classification

Convolutional neural networks (CNNs) have shown their promise for image classification tasks. However, global CNN features still lack geometric invariance for addressing the problem of intra-class variations and so are not optimal for multi-label image classification. This paper proposes a new and effective framework built upon CNNs to learn Multi-scale and Discriminative Part Detectors (MsDPD)-based feature representations for multi-label image classification. Specifically, at each scale level, we (i) first present an entropy-rank based scheme to generate and select a set of discriminative part detectors (DPD), and then (ii) obtain a number of DPD-based convolutional feature maps with each feature map representing the occurrence probability of a particular part detector and learn DPD-based features by using a task-driven pooling scheme. The two steps are formulated into a unified framework by developing a new objective function, which jointly trains part detectors incrementally and integrates the learning of feature representations into the classification task. Finally, the multi-scale features are fused to produce the predictions. Experimental results on PASCAL VOC 2007 and VOC 2012 datasets demonstrate that the proposed method achieves better accuracy when compared with the existing state-of-the-art multi-label classification methods.
#1484

Extracting Privileged Information from Untagged Corpora for Classifier Learning
Yazhou Yao, Jian Zhang, Fumin Shen, Wankou Yang, Xian-Sheng Hua, Zhenmin Tang

Vision and Classification

The performance of data-driven learning approaches is often unsatisfactory when the training data is inadequate either in quantity or quality. Manually labeled privileged information (PI), e.g., attributes, tags or properties, is usually incorporated to improve classifier learning. However, the process of manual labeling is time-consuming and labor-intensive. To address this issue, we propose to enhance classifier learning by extracting PI from untagged corpora, which can effectively eliminate the dependency on manually labeled data. In detail, we treat each selected PI as a subcategory and learn one classifier for each subcategory independently. The classifiers for all subcategories are then integrated together to form a more powerful category classifier. In particular, we propose a new instance-level multi-instance learning (MIL) model to simultaneously select a subset of training images from each subcategory and learn the optimal classifiers based on the selected images. Extensive experiments demonstrate the superiority of our approach.

Monday 16 16:40 - 18:05 ML-RL1 - Reinforcement Learning (K11)

Chair: B. Ravindran

#368

Learning to Design Games: Strategic Environments in Reinforcement Learning
Haifeng Zhang, Jun Wang, Zhiming Zhou, Weinan Zhang, Yin Wen, Yong Yu, Wenxin Li

Reinforcement Learning

In typical reinforcement learning (RL), the environment is assumed given and the goal of the learning is to identify an optimal policy for the agent taking actions through its interactions with the environment. In this paper, we extend this setting by considering that the environment is not given, but controllable and learnable through its interaction with the agent at the same time. This extension is motivated by environment design scenarios in the real world, including game design, shopping space design and traffic signal design. Theoretically, we find a dual Markov decision process (MDP) w.r.t. the environment to that w.r.t. the agent, and derive a policy gradient solution to optimizing the parametrized environment. Furthermore, discontinuous environments are addressed by a proposed general generative framework. Our experiments on a Maze game design task show the effectiveness of the proposed algorithms in generating diverse and challenging Mazes against various agent settings.
#977

Where to Prune: Using LSTM to Guide End-to-end Pruning
Jing Zhong, Guiguang Ding, Yuchen Guo, Jungong Han, Bin Wang

Reinforcement Learning

Recent years have witnessed the great success of convolutional neural networks (CNNs) in many related fields. However, their huge model size and computational complexity make it difficult to deploy CNNs in some scenarios, such as embedded systems with low computation power. To address this issue, many works have proposed pruning filters in CNNs to reduce computation. However, they mainly focus on identifying which filters are unimportant in a layer and then pruning filters layer by layer or globally. In this paper, we argue that the pruning order is also very significant for model pruning. We propose a novel approach to figure out which layers should be pruned in each step. First, we utilize a long short-term memory (LSTM) to learn the hierarchical characteristics of a network and generate a pruning decision for each layer, which is the main difference from previous works. Next, a channel-based method is adopted to evaluate the importance of filters in a to-be-pruned layer, followed by an accelerated recovery step. Experimental results demonstrate that our approach is capable of reducing FLOPs by 70.1% for VGG and 47.5% for ResNet-56 with comparable accuracy. Also, the learning results seem to reveal the sensitivity of each network layer.
#521

Cross-modal Bidirectional Translation via Reinforcement Learning
Jinwei Qi, Yuxin Peng

Reinforcement Learning

The inconsistent distribution and representation of image and text make it quite challenging to measure their similarity and construct correlation between them. Inspired by neural machine translation, which establishes a corresponding relationship between two entirely different languages, we attempt to treat images as a special kind of language to provide visual descriptions, so that translation can be conducted between bilingual pairs of image and text to effectively explore cross-modal correlation. Thus, we propose the Cross-modal Bidirectional Translation (CBT) approach, and further explore the utilization of reinforcement learning to improve the translation process. First, a cross-modal translation mechanism is proposed, where image and text are treated as bilingual pairs, and cross-modal correlation can be effectively captured in both feature spaces of image and text by bidirectional translation training. Second, cross-modal reinforcement learning is proposed to perform a bidirectional game between image and text, which is played as a round to promote the bidirectional translation process. Besides, both inter-modality and intra-modality reward signals can be extracted to provide complementary clues for boosting cross-modal correlation learning. Experiments are conducted to verify the performance of our proposed approach on cross-modal retrieval, compared with 11 state-of-the-art methods on 3 datasets.
#2711

Self-Adaptive Double Bootstrapped DDPG
Zhuobin Zheng, Chun Yuan, Zhihui Lin, Yangyang Cheng, Hanghao Wu

Reinforcement Learning

The Deep Deterministic Policy Gradient (DDPG) algorithm has achieved state-of-the-art performance in high-dimensional continuous control tasks. However, due to the complexity and randomness of the environment, DDPG tends to suffer from inefficient exploration and unstable training. In this work, we propose Self-Adaptive Double Bootstrapped DDPG (SOUP), an algorithm that extends DDPG to a bootstrapped actor-critic architecture. SOUP improves the efficiency of exploration by using multiple actor heads to capture more potential actions and multiple critic heads to evaluate more reasonable Q-values collaboratively. The crux of the double bootstrapped architecture is to tackle the fluctuations in performance caused by multiple heads of spotty capacity varying throughout training. To alleviate the instability, a self-adaptive confidence mechanism is introduced to dynamically adjust the weights of bootstrapped heads and enhance the ensemble performance effectively and efficiently. We demonstrate that SOUP achieves faster learning by at least 45% while improving cumulative reward and stability substantially in comparison to vanilla DDPG on OpenAI Gym's MuJoCo environments.
#3116

A Unified Approach for Multi-step Temporal-Difference Learning with Eligibility Traces in Reinforcement Learning
Long Yang, Minhao Shi, Qian Zheng, Wenjia Meng, Gang Pan

Reinforcement Learning

Recently, a new multi-step temporal-difference learning algorithm, Q(σ), unified n-step Tree-Backup (when σ = 0) and n-step Sarsa (when σ = 1) by introducing a sampling parameter σ. However, similar to other multi-step temporal-difference learning algorithms, Q(σ) requires considerable memory and computation time. Eligibility traces are an important mechanism for transforming off-line updates into efficient on-line ones that consume less memory and computation time. In this paper, we combine the original Q(σ) with eligibility traces and propose a new algorithm, called Qπ(σ,λ), where λ is the trace-decay parameter. This new algorithm unifies Sarsa(λ) (when σ = 1) and Qπ(λ) (when σ = 0). Furthermore, we give an upper error bound of the Qπ(σ,λ) policy evaluation algorithm. We prove that the Qπ(σ,λ) control algorithm converges to the optimal value function exponentially. We also empirically compare it with conventional temporal-difference learning methods. Results show that, with an intermediate value of σ, Qπ(σ,λ) creates a mixture of the existing algorithms which learns the optimal value significantly faster than the extreme ends (σ = 0 or 1).
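
To make the role of the sampling parameter σ concrete, here is a minimal sketch of the one-step Q(σ) backup target as it is usually written: a σ-weighted mix of the sampled next action value (the Sarsa target) and the expectation under the policy (the one-step Tree-Backup target). It does not implement the eligibility-trace algorithm Qπ(σ,λ) proposed in the paper, and the function name and example numbers are hypothetical.

```python
from typing import Sequence


def q_sigma_target(reward: float,
                   gamma: float,
                   sigma: float,
                   q_next: Sequence[float],
                   pi_next: Sequence[float],
                   next_action: int) -> float:
    """One-step Q(sigma) target for a single transition.

    q_next[a]   -- current estimate Q(s', a) for each action a
    pi_next[a]  -- target policy probability pi(a | s')
    next_action -- the action actually sampled in s'
    sigma = 1 gives the Sarsa target, sigma = 0 the expectation-based target.
    """
    if not 0.0 <= sigma <= 1.0:
        raise ValueError("sigma must lie in [0, 1]")
    sampled = q_next[next_action]
    expected = sum(p * q for p, q in zip(pi_next, q_next))
    return reward + gamma * (sigma * sampled + (1.0 - sigma) * expected)


# Hypothetical two-action example.
q_next = [1.0, 3.0]
pi_next = [0.5, 0.5]
sarsa_like = q_sigma_target(1.0, 0.5, 1.0, q_next, pi_next, next_action=1)
tree_like = q_sigma_target(1.0, 0.5, 0.0, q_next, pi_next, next_action=1)
mixed = q_sigma_target(1.0, 0.5, 0.5, q_next, pi_next, next_action=1)
```

With an intermediate σ the target lies between the two extremes, which is the mixture behaviour the abstract reports on.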
#3171

Algorithms or Actions? A Study in Large-Scale Reinforcement Learning
Anderson Rocha Tavares, Sivasubramanian Anbalagan, Leandro Soriano Marcolino, Luiz Chaimowicz

Reinforcement Learning

Large state and action spaces are very challenging for reinforcement learning. However, in many domains there is a set of algorithms available, which estimate the best action given a state. Hence, agents can either directly learn a performance-maximizing mapping from states to actions, or from states to algorithms. We investigate several aspects of this dilemma, showing sufficient conditions for learning over algorithms to outperform learning over actions for a finite number of training iterations. We present synthetic experiments to further study such systems. Finally, we propose a function approximation approach, demonstrating the effectiveness of learning over algorithms in real-time strategy games.
#2254

Multinomial Logit Bandit with Linear Utility Functions
Mingdong Ou, Nan Li, Shenghuo Zhu, Rong Jin

Reinforcement Learning

Multinomial logit bandit is a sequential subset selection problem which arises in many applications. In each round, the player selects a K-cardinality subset from N candidate items, and receives a reward which is governed by a multinomial logit (MNL) choice model considering both item utility and substitution property among items. The player's objective is to dynamically learn the parameters of the MNL model and maximize cumulative reward over a finite horizon T. This problem faces the exploration-exploitation dilemma, and the involved combinatorial nature makes it non-trivial. In recent years, some algorithms have been developed by exploiting specific characteristics of the MNL model, but all of them estimate the parameters of the MNL model separately and incur a regret bound which is not preferred for large candidate set size N. In this paper, we consider the linear utility MNL choice model whose item utilities are represented as linear functions of d-dimensional item features, and propose an algorithm, titled LUMB, to exploit the underlying structure. It is proven that the proposed algorithm achieves regret which is free of candidate set size. Experiments show the superiority of the proposed algorithm.
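
The choice model behind this setting can be sketched in a few lines. The code below computes MNL choice probabilities for an offered subset when each item's utility is a linear function of its features. It assumes the common formulation with an outside "no choice" option of utility 0, which the abstract does not spell out, and it does not implement the LUMB algorithm; all names and numbers are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def linear_utilities(theta: Sequence[float],
                     features: Sequence[Sequence[float]]) -> List[float]:
    """Utility of each offered item as the dot product of theta and its features."""
    return [sum(t * f for t, f in zip(theta, x)) for x in features]


def mnl_choice_probabilities(utilities: Sequence[float]) -> Tuple[List[float], float]:
    """Return (probability of choosing each item, probability of choosing nothing).

    Assumption: an outside option with utility 0, hence the 1.0 in the denominator.
    """
    weights = [math.exp(u) for u in utilities]
    denom = 1.0 + sum(weights)
    return [w / denom for w in weights], 1.0 / denom


def expected_reward(utilities: Sequence[float], rewards: Sequence[float]) -> float:
    """Expected reward of offering this subset under the MNL model."""
    probs, _ = mnl_choice_probabilities(utilities)
    return sum(p * r for p, r in zip(probs, rewards))


# Hypothetical example: two offered items with d = 2 features each.
theta = [1.0, 0.0]
offered = [[0.0, 5.0], [0.0, -2.0]]
utils = linear_utilities(theta, offered)
probs, p_none = mnl_choice_probabilities(utils)
value = expected_reward(utils, rewards=[3.0, 6.0])
```

Because the utilities depend on the items only through the shared d-dimensional parameter vector, the number of unknowns does not grow with the number of candidate items, which is the structure the abstract says the algorithm exploits.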

Monday 16 16:40 - 18:05 HAI-HCC - Human Computation and Crowdsourcing (C2)

Chair: Eugene Vorobeychik

#386

On the Cost Complexity of Crowdsourcing
Yili Fang, Hailong Sun, Pengpeng Chen, Jinpeng Huai

Human Computation and Crowdsourcing

Existing efforts mainly use empirical analysis to evaluate the effectiveness of crowdsourcing methods, which is often unreliable across experimental settings. Consequently, it is of great importance to study theoretical methods. This work, for the first time, defines the cost complexity of crowdsourcing, and presents two theorems to compute the cost complexity. Our theorems provide a general theoretical method to model the trade-off between costs and quality, which can be used to evaluate and design crowdsourcing algorithms, and characterize the complexity of crowdsourcing problems. Moreover, following our theorems, we prove a set of corollaries that can obtain existing theoretical results for special cases. We have verified our work theoretically and empirically.
#2229

A Novel Strategy for Active Task Assignment in Crowd Labeling
Zehong Hu, Jie Zhang

Human Computation and Crowdsourcing

Active learning strategies are often used in crowd labeling to improve task assignment. However, these strategies require prohibitive computation time yet still cannot improve the assignment to the utmost, because they simply evaluate each possible assignment and then greedily select the optimal one. In this paper, we first derive an efficient algorithm for assignment evaluation. Then, to overcome the uncertainty of labels, we develop a novel strategy that modulates the scope of the greedy task assignment with posterior uncertainty and keeps the evaluation optimistic. The experiments on two popular worker models and four MTurk datasets show that our strategy achieves the best performance and highest computation efficiency.
#2274

Simultaneous Clustering and Ranking from Pairwise Comparisons
Jiyi Li, Yukino Baba, Hisashi Kashima

Human Computation and Crowdsourcing

When people make decisions about a number of ideas, designs, or other kinds of objects, a common approach is to organize them into several groups and to prioritize them according to some preference. The grouping task is referred to as clustering and the prioritizing task is called ranking. These tasks are often outsourced with the help of human judgments in the form of pairwise comparisons. Two objects are compared on whether they are similar in the clustering problem, while the object of higher priority is determined in the ranking problem. Our research question in this paper is whether the pairwise comparisons for clustering also help ranking (and vice versa). Instead of solving the two tasks separately, we propose a unified formulation to bridge the two types of pairwise comparisons. Our formulation simultaneously estimates the object embeddings and the preference criterion vector. The experiments using real datasets support our hypothesis; our approach can generate better neighbor and preference estimation results than the approaches that only focus on a single type of pairwise comparisons.
#3189

On the Efficiency of Data Collection for Crowdsourced Classification
Edoardo Manino, Long Tran-Thanh, Nicholas R. Jennings

Human Computation and Crowdsourcing

The quality of crowdsourced data is often highly variable. For this reason, it is common to collect redundant data and use statistical methods to aggregate it. Empirical studies show that the policies we use to collect such data have a strong impact on the accuracy of the system. However, there is little theoretical understanding of this phenomenon. In this paper we provide the first theoretical explanation of the accuracy gap between the most popular collection policies: the non-adaptive uniform allocation, and the adaptive uncertainty sampling and information gain maximisation. To do so, we propose a novel representation of the collection process in terms of random walks. Then, we use this tool to derive lower and upper bounds on the accuracy of the policies. With these bounds, we are able to quantify the advantage that the two adaptive policies have over the non-adaptive one for the first time.
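
One simple way to picture a collection process as a random walk is sketched below. It is an illustration only, not the paper's construction: it assumes binary tasks, a uniform prior, and workers who are all correct with the same known probability, so that each vote moves a task's posterior log-odds up or down by a fixed step. Uncertainty sampling then requests the next label for the task whose log-odds are closest to zero. All function names are hypothetical.

```python
import math
from typing import Sequence


def vote_step(accuracy: float) -> float:
    """Size of the log-odds step caused by one vote from a worker of given accuracy."""
    if not 0.5 < accuracy < 1.0:
        raise ValueError("accuracy must lie strictly between 0.5 and 1")
    return math.log(accuracy / (1.0 - accuracy))


def update_log_odds(log_odds: float, vote: int, accuracy: float) -> float:
    """Move the posterior log-odds up (vote = +1) or down (vote = -1) by one step."""
    if vote not in (1, -1):
        raise ValueError("vote must be +1 or -1")
    return log_odds + vote * vote_step(accuracy)


def most_uncertain_task(log_odds_per_task: Sequence[float]) -> int:
    """Uncertainty sampling: index of the task whose log-odds are closest to zero."""
    return min(range(len(log_odds_per_task)),
               key=lambda i: abs(log_odds_per_task[i]))


# Hypothetical example: three votes on one task from workers with accuracy 0.8.
z = 0.0
for v in (1, 1, -1):
    z = update_log_odds(z, v, accuracy=0.8)

next_task = most_uncertain_task([1.2, -0.1, 0.7])
```

A non-adaptive uniform policy would instead give every task the same number of votes regardless of where its walk currently stands.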
#4057

An Axiomatic View of the Parimutuel Consensus Mechanism
Rupert Freeman, David M. Pennock

Human Computation and Crowdsourcing

We consider an axiomatic view of the Parimutuel Consensus Mechanism defined by Eisenberg and Gale (1959). The parimutuel consensus mechanism can be interpreted as a parimutuel market for wagering with a proxy that bets optimally on behalf of the agents, depending on the bets of the other agents. We show that the parimutuel consensus mechanism uniquely satisfies the desirable properties of Pareto optimality, individual rationality, budget balance, anonymity, sybilproofness and envy-freeness. While the parimutuel consensus mechanism does violate the key property of incentive compatibility, it is incentive compatible in the limit as the number of agents becomes large. Via simulations on real contest data, we show that violations of incentive compatibility are both rare and only minimally beneficial for the participants. This suggests that the parimutuel consensus mechanism is a reasonable mechanism for eliciting information in practice.
#5143

(Sister Conferences Best Papers Track) Evaluating and Complementing Vision-to-Language Technology for People who are Blind with Conversational Crowdsourcing
Elliot Salisbury, Ece Kamar, Meredith Ringel Morris

Human Computation and Crowdsourcing

We study how real-time crowdsourcing can be used both for evaluating the value provided by existing automated approaches and for enabling workflows that provide scalable and useful alt text to blind users. We show that the shortcomings of existing AI image captioning systems frequently hinder a user's understanding of an image they cannot see to a degree that even clarifying conversations with sighted assistants cannot correct. Based on analysis of clarifying conversations collected from our studies, we design experiences that can effectively assist users in a scalable way without the need for real-time interaction. Our results provide lessons and guidelines that the designers of future AI captioning systems can use to improve labeling of social media imagery for blind users.
#5128

(Sister Conferences Best Papers Track) Geolocating Images with Crowdsourcing and Diagramming
Rachel Kohler, John Purviance, Kurt Luther

Human Computation and Crowdsourcing

Many types of investigative work involve verifying the legitimacy of visual evidence by identifying the precise geographic location where a photo or video was taken. Professional geolocation is often a manual, time-consuming process that can involve searching large areas of satellite imagery for potential matches. In this paper, we explore how crowdsourcing can be used to support expert image geolocation. We adapt an expert diagramming technique to overcome spatial reasoning limitations of novice crowds so that they can support an expert's search. In an experiment (n=540), we found that diagrams work significantly better than ground-level photos and allow crowds to reduce a search area by half before any expert intervention. We also discuss hybrid approaches to complex image analysis combining crowds, experts, and computer vision.

Monday 16 16:40 - 18:05 ML-CLU - Clustering (C3)

Chair: Zhao Kang

#3980

A Local Algorithm for Product Return Prediction in E-Commerce
Yada Zhu, Jianbo Li, Jingrui He, Brian L. Quanz, Ajay A. Deshpande

Clustering

With the rapid growth of e-tail, the cost to handle returned online orders also increases significantly and has become a major challenge in the e-commerce industry. Accurate prediction of product returns allows e-tailers to prevent problematic transactions in advance. However, the limited existing work on modeling customer online shopping behaviors and predicting their return actions fails to integrate the rich information in the product purchase and return history (e.g., return history, purchase-no-return behavior, and customer/product similarity). Furthermore, the large-scale data sets involved in this problem, typically consisting of millions of customers and tens of thousands of products, also render existing methods inefficient and ineffective at predicting the product returns. To address these problems, in this paper, we propose to use a weighted hybrid graph to represent the rich information in the product purchase and return history, in order to predict product returns. The proposed graph consists of both customer nodes and product nodes, undirected edges reflecting customer return history and customer/product similarity based on their attributes, as well as directed edges discriminating purchase-no-return and no-purchase actions. Based on this representation, we study a random-walk-based local algorithm for predicting product return propensity for each customer, whose computational complexity depends only on the size of the output cluster rather than the entire graph. Such a property makes the proposed local algorithm particularly suitable for processing the large-scale data sets to predict product returns. To test the performance of the proposed techniques, we evaluate the graph model and algorithm on multiple e-commerce data sets, showing improved performance over state-of-the-art methods.
#1810

Mixture of GANs for Clustering
Yang Yu, Wen-Ji Zhou

Clustering

For data clustering, the Gaussian mixture model (GMM) is a typical method that trains several Gaussian models to capture the data. Each Gaussian model then provides the distribution information of a cluster. For clustering of high-dimensional and complex data, models more flexible than Gaussian models are desired. Recently, generative adversarial networks (GANs) have shown effectiveness in capturing complex data distributions. Therefore, a GAN mixture model (GANMM) would be a promising alternative to GMM. However, we notice that the non-flexibility of the Gaussian model is essential in the expectation-maximization procedure for training GMM. GAN can have much higher flexibility, which disables the commonly employed expectation-maximization procedure, because the maximization cannot change the result of the expectation. In this paper, we propose to use the epsilon-expectation-maximization procedure for training GANMM. The experiments show that the proposed GANMM can have good performance on complex data as well as simple data.
#1846

An Information Theory based Approach to Multisource Clustering
Pierre-Alexandre Murena, Jérémie Sublime, Basarab Matei, Antoine Cornuéjols

Clustering

Clustering is a compression task which consists in grouping similar objects into clusters. In real-life applications, the system may have access to several views of the same data and each view may be processed by a specific clustering algorithm: this framework is called multi-view clustering and can benefit from algorithms capable of exchanging information between the different views. In this paper, we consider this type of unsupervised ensemble learning as a compression problem and develop a theoretical framework based on algorithmic information theory suitable for multi-view clustering and collaborative clustering applications. Using this approach, we propose a new algorithm built on a solid theoretical basis, and test it on several real and artificial data sets.
#3443

High-Order Co-Clustering via Strictly Orthogonal and Symmetric L1-Norm Nonnegative Matrix Tri-Factorization
Kai Liu, Hua Wang

Clustering

Unlike traditional clustering methods that deal with one single type of data, High-Order Co-Clustering (HOCC) aims to cluster multiple types of data simultaneously by utilizing the inter- and/or intra-type relationships across different data types. In existing HOCC methods, data points routinely enter the objective functions with squared residual errors. As a result, outlying data samples can dominate the objective functions, which may lead to incorrect clustering results. Moreover, existing methods usually suffer from soft clustering, where the probabilities to different groups can be very close. In this paper, we propose an L1-norm symmetric nonnegative matrix tri-factorization method to solve the HOCC problem. Due to the orthogonal constraints and the symmetric L1-norm formulation in our new objective, the conventional auxiliary function approach no longer works. Thus we derive the solution algorithm using the alternating direction method of multipliers. Extensive experiments have been conducted on a real-world data set, in which promising empirical results, including less time consumption, a strictly orthogonal membership matrix, lower local minima, etc., have demonstrated the effectiveness of our proposed method.
#5457

(Journal track) Rademacher Complexity Bounds for a Penalized Multi-class Semi-supervised Algorithm
Yury Maximov, Massih-Reza Amini, Zaid Harchaoui

Clustering

We propose Rademacher complexity bounds for multi-class classifiers trained with a two-step semi-supervised model. In the first step, the algorithm partitions the partially labeled data and then identifies dense clusters containing k predominant classes using the labeled training examples such that the proportion of their non-predominant classes is below a fixed threshold that stands for clustering consistency. In the second step, a classifier is trained by minimizing a margin empirical loss over the labeled training set and a penalization term measuring the inability of the learner to predict the k predominant classes of the identified clusters. The resulting data-dependent generalization error bound involves the margin distribution of the classifier, the stability of the clustering technique used in the first step and Rademacher complexity terms corresponding to partially labeled training data. Our theoretical results exhibit convergence rates extending those proposed in the literature for the binary case, and experimental results on different multi-class classification problems show empirical evidence that supports the theory.
#1380

Self-weighted Multiple Kernel Learning for Graph-based Clustering and Semi-supervised Classification
Zhao Kang, Xiao Lu, Jinfeng Yi, Zenglin Xu

Clustering

Multiple kernel learning (MKL) is generally believed to perform better than single kernel methods. However, some empirical studies show that this is not always true: the combination of multiple kernels may yield even worse performance than using a single kernel. There are two possible reasons for the failure: (i) most existing MKL methods assume that the optimal kernel is a linear combination of base kernels, which may not hold true; and (ii) some kernel weights are inappropriately assigned due to noise and carelessly designed algorithms. In this paper, we propose a novel MKL framework by following two intuitive assumptions: (i) each kernel is a perturbation of the consensus kernel; and (ii) the kernel that is close to the consensus kernel should be assigned a large weight. Impressively, the proposed method can automatically assign an appropriate weight to each kernel without introducing additional parameters, as existing methods do. The proposed framework is integrated into a unified framework for graph-based clustering and semi-supervised classification. We have conducted experiments on multiple benchmark datasets and our empirical results verify the superiority of the proposed framework.
#2045

Ranking Preserving Nonnegative Matrix Factorization
Jing Wang, Feng Tian, Weiwei Liu, Xiao Wang, Wenjie Zhang, Kenji Yamanishi

Clustering

Nonnegative matrix factorization (NMF), a well-known technique to find parts-based representations of nonnegative data, has been widely studied. In reality, ordinal relations often exist among data, such as data i being more related to j than to q. Such relative order is naturally available, and more importantly, it truly reflects the latent data structure. Preserving the ordinal relations enables us to find structured representations of data that are faithful to the relative order, so that the learned representations become more discriminative. However, current NMF methods pay no attention to this. In this paper, we make the first attempt towards incorporating the ordinal relations and propose a novel ranking preserving nonnegative matrix factorization (RPNMF) approach, which enforces the learned representations to be ranked according to the relations. We derive iterative updating rules to solve RPNMF's objective function with convergence guaranteed. Experimental results with several datasets for clustering and classification have demonstrated that RPNMF outperforms state-of-the-art methods, not only in terms of accuracy, but also in the interpretation of orderly data structure.

Tuesday 17 08:30 - 09:45 EAR4 - Early Career 4 (VICTORIA)

Chair: Matthijs Spaan

#5447

Improving Reinforcement Learning with Human Input
Matthew E. Taylor

Early Career 4

Reinforcement learning (RL) has had many successes when learning autonomously. This paper and accompanying talk consider how to make use of a non-technical human participant, when available. In particular, we consider the case where a human could 1) provide demonstrations of good behavior, 2) provide online evaluative feedback, or 3) define a curriculum of tasks for the agent to learn on. In all cases, our work has shown such information can be effectively leveraged. After giving a high-level overview of this work, we will highlight a set of open questions and suggest where future work could be usefully focused.
#5490

Partakable Technology
Nardine Osman

Early Career 4

This paper proposes a shift in how technology is currently being developed by giving people, the users, control over their technology. We argue that users should have a say in the behaviour of the technologies that mediate their online interactions and control their private data. We propose 'partakable technologies', technologies where users can come together to discuss and agree on their features and functionalities. To achieve this, we base our proposal on a number of existing technologies in the fields of agreement technologies, natural language processing, normative systems, and formal verification. As an IJCAI early career spotlight paper, the paper provides an overview of the author's expertise in these different areas.
#5496

Solving Games with Structured Strategy Spaces
Albert Xin Jiang

Early Career 4

Tuesday 17 08:30 - 09:55 KR-QUE - Query Answering and Databases (C7)

Chair: Diego Calvanese

#3135

Finite Model Reasoning in Hybrid Classes of Existential Rules
Georg Gottlob, Marco Manna, Andreas Pieris

Query Answering and Databases

Two paradigmatic restrictions that have been studied for ensuring the decidability of query answering under existential rules are guardedness and stickiness. With the aim of consolidating these restrictions, a flexible condition, called tameness, has been proposed a few years ago, which relies on hybrid reasoning, i.e., a combination of forward and backward procedures. The complexity of query answering under this hybrid class of existential rules is by now well-understood. However, the complexity of finite query answering, i.e., query answering under finite models, has remained an open problem. Closing this problem is the main goal of this work.
#4439

Complexity of Approximate Query Answering under Inconsistency in Datalog+/-
Thomas Lukasiewicz, Enrico Malizia, Cristian Molinaro

Query Answering and Databases

Several semantics have been proposed to query inconsistent ontological knowledge bases, including the intersection of repairs and the intersection of closed repairs as two approximate inconsistency-tolerant semantics. In this paper, we analyze the complexity of conjunctive query answering under these two semantics for a wide range of Datalog+/- languages. We consider both the standard setting, where errors may only be in the database, and the generalized setting, where also the rules of a Datalog+/- knowledge base may be erroneous.
#3199

Computing Approximate Query Answers over Inconsistent Knowledge Bases
Sergio Greco, Cristian Molinaro, Irina Trubitsyna

Query Answering and Databases

Consistent query answering is a principled approach for querying inconsistent knowledge bases. It relies on the notion of a "repair", that is, a maximal consistent subset of the facts in the knowledge base. One drawback of this approach is that entire facts are deleted to resolve inconsistency, even if they may still contain useful "reliable" information. To overcome this limitation, we propose a new notion of repair allowing values within facts to be updated for restoring consistency. This more fine-grained repair primitive allows us to preserve more information in the knowledge base. We also introduce the notion of a "universal repair", which is a compact representation of all repairs. Then, we show that consistent query answering in our framework is intractable (coNP-complete). In light of this result, we develop a polynomial time approximation algorithm for computing a sound (but possibly incomplete) set of consistent query answers.
#1068

Query Answering in Propositional Circumscription
Mario Alviano

Query Answering and Databases

Propositional circumscription defines a preference relation over the models of a propositional theory, so that models being subset-minimal on the interpretation of a set of objective atoms are preferred. The complexity of several computational tasks increases by one level in the polynomial hierarchy due to such a preference relation; among them there is query answering, which amounts to deciding whether there is an optimal model satisfying the query. A complete algorithm for query answering is obtained by searching for a model, not necessarily an optimal one, that satisfies the query, and such that no model unsatisfying the query is more preferred. If the query or its complement are among the objective atoms, the algorithm has a simpler behavior, which is also described in the paper. Moreover, an incomplete algorithm is obtained by searching for a model satisfying both the query and an objective atom being unit-implied by the theory extended with the complement of the query. A prototypical implementation is tested on instances from the 2nd International Competition on Computational Models of Argumentation (ICCMA'17).
#3892

Compiling Model Representations for Querying Large ABoxes in Expressive DLs
Labinot Bajraktari, Magdalena Ortiz, Mantas Simkus

Query Answering and Databases

Answering ontology mediated queries (OMQs) has received much attention in the last decade, but the big gap between practicable algorithms for lightweight ontologies, that are supported by implemented reasoners, and purely theoretical algorithms for expressive ontologies that are not amenable to implementation, has only increased. Towards narrowing the gap, we propose an algorithm to compile a representation of sets of models for ALCHI ontologies, which is sufficient for answering any monotone OMQ. Rather than reasoning for specific ABoxes, or being fully data-independent, we use generic descriptions of families of ABoxes, given by what we call profiles. Our model compilation algorithm runs on TBoxes and sets of profiles, and supports the incremental addition of new profiles. To illustrate the potential of our approach for OMQ answering, we implement a rewriting into an extension of Datalog for OMQs comprising reachability queries, and provide some promising evaluation results.
#3448

First-Order Rewritability of Frontier-Guarded Ontology-Mediated Queries
Pablo Barceló, Gerald Berger, Carsten Lutz, Andreas Pieris

Query Answering and Databases

We focus on ontology-mediated queries (OMQs) based on (frontier-)guarded existential rules and (unions of) conjunctive queries, and we investigate the problem of FO-rewritability, i.e., whether an OMQ can be rewritten as a first-order query. We adopt two different approaches. The first approach employs standard two-way alternating parity tree automata. Although it does not lead to a tight complexity bound, it provides a transparent solution based on widely known tools. The second approach relies on a sophisticated automata model, known as cost automata. This allows us to show that our problem is 2EXPTIME-complete. In both approaches, we provide semantic characterizations of FO-rewritability that are of independent interest.
#3220

Consequence-based Reasoning for Description Logics with Disjunction, Inverse Roles, Number Restrictions, and Nominals
David Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks

Query Answering and Databases

We present a consequence-based calculus for concept subsumption and classification in the description logic ALCHOIQ, which extends ALC with role hierarchies, inverse roles, number restrictions, and nominals. By using standard transformations, our calculus extends to SROIQ, which covers all of OWL 2 DL except for datatypes. A key feature of our calculus is its pay-as-you-go behaviour: unlike existing algorithms, our calculus is worst-case optimal for all the well-known proper fragments of ALCHOIQ, albeit not for the full logic.

Tuesday 17 08:30 - 09:55 ML-MMM2 - Multi-Instance, Multi-Label, Multi-View Learning 2 (C8)

Chair: Pengfei Zhu

#155

Grouping Attribute Recognition for Pedestrian with Joint Recurrent Learning
Xin Zhao, Liufang Sang, Guiguang Ding, Yuchen Guo, Xiaoming Jin

Multi-Instance, Multi-Label, Multi-View Learning 2

Pedestrian attribute recognition aims to predict attribute labels of pedestrians from surveillance images, which is a very challenging task for computer vision due to poor imaging quality and small training datasets. It is observed that the semantic pedestrian attributes to be recognised tend to show semantic or visual spatial correlation. Attributes can be grouped by this correlation, yet previous works mostly ignore this phenomenon. Inspired by the strong capability of Recurrent Neural Networks (RNNs) to learn context correlations, this paper proposes an end-to-end Grouping Recurrent Learning (GRL) model that takes advantage of the intra-group mutual exclusion and inter-group correlation to improve the performance of pedestrian attribute recognition. Our GRL method starts with the detection of precise body regions via Body Region Proposal, followed by feature extraction from the detected regions. These features, along with the semantic groups, are fed into an RNN for recurrent grouping attribute recognition, where intra-group correlations can be learned. Extensive empirical evidence shows that our GRL model achieves state-of-the-art results on pedestrian attribute datasets, i.e. the standard PETA and RAP datasets.
#1103

Multi-Label Co-Training
Yuying Xing, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Zili Zhang

Multi-Instance, Multi-Label, Multi-View Learning 2

Multi-label learning aims at assigning a set of appropriate labels to multi-label samples. Although it has been successfully applied in various domains in recent years, most multi-label learning methods require sufficient labeled training samples, because of the large number of possible label sets. Co-training, as an important branch of semi-supervised learning, can leverage unlabeled samples, along with scarce labeled ones, and can potentially help with the large labeled data requirement. However, it is a difficult challenge to combine multi-label learning with co-training. Two distinct issues are associated with the challenge: (i) how to solve the widely-witnessed class-imbalance problem in multi-label learning; and (ii) how to select samples with confidence, and communicate their predicted labels among classifiers for model refinement. To address these issues, we introduce an approach called Multi-Label Co-Training (MLCT). MLCT leverages information concerning the co-occurrence of pairwise labels to address the class-imbalance challenge; it introduces a predictive reliability measure to select samples, and applies label-wise filtering to confidently communicate labels of selected samples among co-training classifiers. MLCT performs favorably against related competitive multi-label learning methods on benchmark datasets and it is also robust to the input parameters.
#2732

Deep Discrete Prototype Multilabel Learning
Xiaobo Shen, Weiwei Liu, Yong Luo, Yew-Soon Ong, Ivor W. Tsang

Multi-Instance, Multi-Label, Multi-View Learning 2

kNN embedding methods, such as the state-of-the-art LM-kNN, have shown impressive results in multi-label learning. Unfortunately, these approaches suffer from expensive computation and memory costs in large-scale settings. To fill this gap, this paper proposes a novel deep prototype compression, i.e., DBPC, for fast multi-label prediction. DBPC compresses the database into a small set of short discrete prototypes, and uses the prototypes for prediction. The benefit of DBPC comes from two aspects: 1) the number of distance comparisons is reduced in the prototype; 2) the distance computation cost is significantly decreased in the reduced space. We propose to jointly learn the deep latent subspace and discrete prototypes within one framework. The encoding and decoding neural networks are employed to make deep discrete prototypes well represent the instances and labels. Extensive experiments on several large-scale datasets demonstrate that DBPC achieves several orders of magnitude lower storage and prediction complexity than state-of-the-art multi-label methods, while achieving competitive accuracy.
#3183

Leveraging Latent Label Distributions for Partial Label Learning
Lei Feng, Bo An

Multi-Instance, Multi-Label, Multi-View Learning 2

In partial label learning, each training example is assigned a set of candidate labels, only one of which is the ground-truth label. Existing partial label learning frameworks either assume each candidate label of equal confidence or consider the ground-truth label as a latent variable hidden in the indiscriminate candidate label set, while the different labeling confidence levels of the candidate labels are regrettably ignored. In this paper, we formalize the different labeling confidence levels as the latent label distributions, and propose a novel unified framework to estimate the latent label distributions while training the model simultaneously. Specifically, we present a biconvex formulation with constrained local consistency and adopt an alternating method to solve this optimization problem. The process of alternating optimization exactly facilitates the mutual adaption of the model training and the constrained label propagation. Extensive experimental results on controlled UCI datasets as well as real-world datasets clearly show the effectiveness of the proposed approach.
#3330

Robust Multi-view Learning via Half-quadratic Minimization
Yonghua Zhu, Xiaofeng Zhu, Wei Zheng

Multi-Instance, Multi-Label, Multi-View Learning 2

Although multi-view clustering is able to use more information than single view clustering, existing multi-view clustering methods still have issues to be addressed, such as initialization sensitivity, the specification of the number of clusters, and the influence of outliers. In this paper, we propose a robust multi-view clustering method to address these issues. Specifically, we first propose a multi-view based sum-of-square error estimation to make the initialization easy and simple, and use a sum-of-norm regularization to automatically learn the number of clusters according to the data distribution. We further employ robust estimators constructed by the half-quadratic theory to avoid the influence of outliers, yielding robust estimates of both the sum-of-square error and the number of clusters. Experimental results on both synthetic and real datasets demonstrate that our method outperforms the state-of-the-art methods.
#1694

Localized Incomplete Multiple Kernel k-means
Xinzhong Zhu, Xinwang Liu, Miaomiao Li, En Zhu, Li Liu, Zhiping Cai, Jianping Yin, Wen Gao

Multi-Instance, Multi-Label, Multi-View Learning 2

The recently proposed multiple kernel k-means with incomplete kernels (MKKM-IK) optimally integrates a group of pre-specified incomplete kernel matrices to improve clustering performance. Though it demonstrates promising performance in various applications, we observe that it does not sufficiently consider the local structure among data and indiscriminately forces all pairwise sample similarity to equally align with their ideal similarity values. This could make the incomplete kernels less effectively imputed, and in turn adversely affect the clustering performance. In this paper, we propose a novel localized incomplete multiple kernel k-means (LI-MKKM) algorithm to address this issue. Different from existing MKKM-IK, LI-MKKM only requires the similarity of a sample to its k-nearest neighbors to align with their ideal similarity values. This helps the clustering algorithm to focus on closer sample pairs that shall stay together and avoids involving unreliable similarity evaluation for farther sample pairs. We carefully design a three-step iterative algorithm to solve the resultant optimization problem and theoretically prove its convergence. Comprehensive experiments on eight benchmark datasets demonstrate that our algorithm significantly outperforms the state-of-the-art comparable algorithms proposed in the recent literature, verifying the advantage of considering local structure.
#2333

Label Embedding Based on Multi-Scale Locality Preservation
Cheng-Lun Peng, An Tao, Xin Geng

Multi-Instance, Multi-Label, Multi-View Learning 2

Label Distribution Learning (LDL) is well suited to situations that focus on the overall distribution of the whole series of labels. The numerical labels of LDL satisfy the integrity probability constraint. Due to LDL's special label domain, existing label embedding algorithms that focus on embedding binary labels are unfit for LDL. This paper proposes a specially designed approach, MSLP, that achieves label embedding for LDL by Multi-Scale Locality Preserving (MSLP). Specifically, MSLP takes the locality information of data in both the label space and the feature space into account with different locality granularity. By assuming an explicit mapping from the features to the embedded labels, MSLP does not need an additional learning process after completing embedding. Besides, MSLP is insensitive to the existence of data points violating the smoothness assumption, which is usually caused by noise. Experimental results demonstrate the effectiveness of MSLP in preserving the locality structure of label distributions in the embedding space and show its superiority over the state-of-the-art baseline methods.

Tuesday 17 08:30 - 09:55 PS-UAI - Planning and Uncertainty in AI: Markov Decision Processes (K2)

Chair: Chris Amato

#828

Policy Optimization with Second-Order Advantage Information
Jiajin Li, Baoxiang Wang, Shengyu Zhang

Planning and Uncertainty in AI: Markov Decision Processes

Policy optimization on high-dimensional continuous control tasks is difficult because of the large variance of the policy gradient estimators. We present the action subspace dependent gradient (ASDG) estimator, which incorporates the Rao-Blackwell theorem (RB) and Control Variates (CV) into a unified framework to reduce the variance. To invoke RB, our proposed algorithm (POSA) learns the underlying factorization structure among the action space based on the second-order advantage information. POSA captures the quadratic information explicitly and efficiently by utilizing the wide & deep architecture. Empirical studies show that our proposed approach improves performance on high-dimensional synthetic settings and OpenAI Gym's MuJoCo continuous control tasks.
#3283

Computational Approaches for Stochastic Shortest Path on Succinct MDPs
Krishnendu Chatterjee, Hongfei Fu, Amir Goharshady, Nastaran Okati

Planning and Uncertainty in AI: Markov Decision Processes

We consider the stochastic shortest path (SSP) problem for succinct Markov decision processes (MDPs), where the MDP consists of a set of variables, and a set of nondeterministic rules that update the variables. First, we show that several examples from the AI literature can be modeled as succinct MDPs. Then we present computational approaches for upper and lower bounds for the SSP problem: (a) for computing upper bounds, our method is polynomial-time in the implicit description of the MDP; (b) for lower bounds, we present a polynomial-time (in the size of the implicit description) reduction to quadratic programming. Our approach is applicable even to infinite-state MDPs. Finally, we present experimental results to demonstrate the effectiveness of our approach on several classical examples from the AI literature.
#3957

Minimax-Regret Querying on Side Effects for Safe Optimality in Factored Markov Decision Processes
Shun Zhang, Edmund H. Durfee, Satinder Singh

Planning and Uncertainty in AI: Markov Decision Processes

As it achieves a goal on behalf of its human user, an autonomous agent's actions may have side effects that change features of its environment in ways that negatively surprise its user. An agent that can be trusted to operate safely should thus only change features the user has explicitly permitted. We formalize this problem, and develop a planning algorithm that avoids potentially negative side effects given what the agent knows about (un)changeable features. Further, we formulate a provably minimax-regret querying strategy for the agent to selectively ask the user about features that it hasn't explicitly been told about. We empirically show how much faster it is than a more exhaustive approach and how much better its queries are than those found by the best known heuristic.
#3587

Goal-HSVI: Heuristic Search Value Iteration for Goal POMDPs
Karel Horák, Branislav Bošanský, Krishnendu Chatterjee

Planning and Uncertainty in AI: Markov Decision Processes

Partially observable Markov decision processes (POMDPs) are the standard models for planning under uncertainty with both finite and infinite horizon. Besides the well-known discounted-sum objective, the indefinite-horizon objective (aka Goal-POMDPs) is another classical objective for POMDPs. In this case, given a set of target states and a positive cost for each transition, the optimization objective is to minimize the expected total cost until a target state is reached. In the literature, RTDP-Bel or heuristic search value iteration (HSVI) have been used for solving Goal-POMDPs. Neither of these algorithms has theoretical convergence guarantees, and HSVI may even fail to terminate its trials. We make the following contributions: (1) We discuss the challenges introduced in Goal-POMDPs and illustrate how they prevent the original HSVI from converging. (2) We present a novel algorithm inspired by HSVI, termed Goal-HSVI, and show that our algorithm has convergence guarantees. (3) We show that Goal-HSVI outperforms RTDP-Bel on a set of well-known examples.
#1724

Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-Sum Objectives
Krishnendu Chatterjee, Adrián Elgyütt, Petr Novotný, Owen Rouillé

Planning and Uncertainty in AI: Markov Decision Processes

Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework to model a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low probability events with extreme payoff can significantly affect the expectation, and thus the obtained policies are not necessarily risk averse. An alternate approach is to optimize the probability that the payoff is above a certain threshold, which yields risk-averse policies but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation while ensuring that the payoff is above a given threshold with at least a specified probability. We present several results on the EOPG problem, including the first algorithm to solve it.
#1673

Dynamic Resource Routing using Real-Time Dynamic Programming
Sebastian Schmoll, Matthias Schubert

Planning and Uncertainty in AI: Markov Decision Processes

Acquiring available resources in stochastic environments becomes more and more important to future mobility. For instance, cities like Melbourne, Canberra and San Francisco install sensors that detect in real-time whether a parking spot (resource) is available or not. In such environments, the current state of the resources may be fully observable, although the future development is stochastic. In order to reduce the traffic, such cities want to fully exploit parking spots, such that the amount of searching cars is minimized. Thus, we formulate a problem setting where the expected seek time for each driver is minimized. This problem can be modeled by a Markov Decision Process (MDP) and solved using standard algorithms. In this paper, we focus on the setting, where pre-computation is not possible and search policies have to be computed on the fly. Our approach is based on state-of-the-art Real-Time Dynamic Programming (RTDP) approaches. However, standard RTDP approaches do not perform well on this specific problem setting as shown in our experiments. We introduce adapted bounds and approximations that exploit the specific nature of the problem in order to improve the performance significantly.
#3506

Planning and Learning with Stochastic Action Sets
Craig Boutilier, Alon Cohen, Avinatan Hassidim, Yishay Mansour, Ofer Meshi, Martin Mladenov, Dale Schuurmans

Planning and Uncertainty in AI: Markov Decision Processes

In many practical uses of reinforcement learning (RL) the set of actions available at a given state is a random variable, with realizations governed by an exogenous stochastic process. Somewhat surprisingly, the foundations for such sequential decision processes have been unaddressed. In this work, we formalize and investigate MDPs with stochastic action sets (SAS-MDPs) to provide these foundations. We show that optimal policies and value functions in this model have a structure that admits a compact representation. From an RL perspective, we show that Q-learning with sampled action sets is sound. In model-based settings, we consider two important special cases: when individual actions are available with independent probabilities, and a sampling-based model for unknown distributions. We develop polynomial-time value and policy iteration methods for both cases, and provide a polynomial-time linear programming solution for the first case.

Tuesday 17 08:30 - 09:55 NLP-DIA2 - Dialogue, Conversation Models (T2)

Chair: Lei Shu

#781

Goal-Oriented Chatbot Dialog Management Bootstrapping with Transfer Learning
Vladimir Ilievski, Claudiu Musat, Andreea Hossman, Michael Baeriswyl

Dialogue, Conversation Models

Goal-Oriented (GO) Dialogue Systems, colloquially known as goal oriented chatbots, help users achieve a predefined goal (e.g. book a movie ticket) within a closed domain. A first step is to understand the user's goal by using natural language understanding techniques. Once the goal is known, the bot must manage a dialogue to achieve that goal, which is conducted with respect to a learnt policy. The success of the dialogue system depends on the quality of the policy, which is in turn reliant on the availability of high-quality training data for the policy learning method, for instance Deep Reinforcement Learning. Due to the domain specificity, the amount of available data is typically too low to allow the training of good dialogue policies. In this paper we introduce a transfer learning method to mitigate the effects of the low in-domain data availability. Our transfer learning based approach improves the bot's success rate by 20% in relative terms for distant domains, and more than doubles it for close domains, compared to the model without transfer learning. Moreover, the transfer learning chatbots learn the policy 5 to 10 times faster. Finally, as the transfer learning approach is complementary to additional processing such as warm-starting, we show that their joint application gives the best outcomes.
#378

A Weakly Supervised Method for Topic Segmentation and Labeling in Goal-oriented Dialogues via Reinforcement Learning
Ryuichi Takanobu, Minlie Huang, Zhongzhou Zhao, Fenglin Li, Haiqing Chen, Xiaoyan Zhu, Liqiang Nie

Dialogue, Conversation Models

Topic structure analysis plays a pivotal role in dialogue understanding. We propose a reinforcement learning (RL) method for topic segmentation and labeling in goal-oriented dialogues, which aims to detect topic boundaries among dialogue utterances and assign topic labels to the utterances. We address three common issues in the goal-oriented customer service dialogues: informality, local topic continuity, and global topic structure. We explore the task in a weakly supervised setting and formulate it as a sequential decision problem. The proposed method consists of a state representation network to address the informality issue, and a policy network with rewards to model local topic continuity and global topic structure. To train the two networks and offer a warm-start to the policy, we firstly use some keywords to annotate the data automatically. We then pre-train the networks on noisy data. Henceforth, the method continues to refine the data labels using the current policy to learn better state representations on the refined data for obtaining a better policy. Results demonstrate that this weakly supervised method obtains substantial improvements over state-of-the-art baselines.
#2624

Scheduled Policy Optimization for Natural Language Communication with Intelligent Agents
Wenhan Xiong, Xiaoxiao Guo, Mo Yu, Shiyu Chang, Bowen Zhou, William Yang Wang

Dialogue, Conversation Models

We investigate the task of learning to interpret natural language instructions by jointly reasoning with visual observations and language inputs. Unlike current methods which start with learning from demonstrations (LfD) and then use reinforcement learning (RL) to fine-tune the model parameters, we propose a novel policy optimization algorithm which can dynamically schedule demonstration learning and RL. The proposed training paradigm provides efficient exploration and generalization beyond existing methods. Compared to existing ensemble models, the best single model based on our proposed method decreases the execution error by 55% on a block-world environment. To further illustrate the exploration strategy of our RL algorithm, our paper includes systematic studies on the evolution of policy entropy during training.
#955

Assigning Personality/Profile to a Chatting Machine for Coherent Conversation Generation
Qiao Qian, Minlie Huang, Haizhou Zhao, Jingfang Xu, Xiaoyan Zhu

Dialogue, Conversation Models

Endowing a chatbot with personality is challenging but significant to deliver more realistic and natural conversations. In this paper, we address the issue of generating responses that are coherent to a pre-specified personality or profile. We present a method that uses generic conversation data from social media (without speaker identities) to generate profile-coherent responses. The central idea is to detect whether a profile should be used when responding to a user post (by a profile detector), and if necessary, select a key-value pair from the profile to generate a response forward and backward (by a bidirectional decoder) so that a personality-coherent response can be generated. Furthermore, in order to train the bidirectional decoder with generic dialogue data, a position detector is designed to predict a word position from which decoding should start given a profile value. Manual and automatic evaluation shows that our model can deliver more coherent, natural, and diversified responses.
#1045

Adaboost with Auto-Evaluation for Conversational Models
Juncen Li, Ping Luo, Ganbin Zhou, Fen Lin, Cheng Niu

Dialogue, Conversation Models

We propose a boosting method for conversational models to encourage them to generate more human-like dialogs. In our method, we consider existing conversational models as weak generators and apply Adaboost to update those models. However, conventional Adaboost cannot be directly applied to conversational models: it cannot adaptively adjust the instance weights for subsequent learning, because a simple comparison between the true output y (for an input x) and its corresponding predicted output y' cannot directly evaluate the learning performance on x. To address this issue, we develop Adaboost with Auto-Evaluation (called AwE). In AwE, an auto-evaluator is proposed to evaluate the predicted results, which makes it applicable to conversational models. Furthermore, we present a theoretical analysis showing that the training error drops exponentially fast only if a certain assumption over the proposed auto-evaluator holds. Finally, we empirically show that AwE visibly boosts the performance of existing single conversational models and also outperforms the other ensemble methods for conversational models.
#1324

An Ensemble of Retrieval-Based and Generation-Based Human-Computer Conversation Systems
Yiping Song, Cheng-Te Li, Jian-Yun Nie, Ming Zhang, Dongyan Zhao, Rui Yan

Dialogue, Conversation Models

Human-computer conversation systems have attracted much attention in Natural Language Processing. Conversation systems can be roughly divided into two categories: retrieval-based and generation-based systems. Retrieval systems search a user-issued utterance (namely a query) in a large conversational repository and return a reply that best matches the query. Generative approaches synthesize new replies. Both approaches have certain advantages but suffer from their own disadvantages. We propose a novel ensemble of retrieval-based and generation-based conversation systems. The retrieved candidates, in addition to the original query, are fed to a reply generator via a neural network, so that the model is aware of more information. The generated reply together with the retrieved ones then participates in a re-ranking process to find the final reply to output. Experimental results show that such an ensemble system outperforms each single module by a large margin.
#1730

Reinforcing Coherence for Sequence to Sequence Model in Dialogue Generation
Hainan Zhang, Yanyan Lan, Jiafeng Guo, Jun Xu, Xueqi Cheng

Dialogue, Conversation Models

The sequence-to-sequence (Seq2Seq) approach has gained great attention in the field of single-turn dialogue generation. However, one serious problem is that most existing Seq2Seq based models tend to generate common responses lacking specific meanings. Our analysis shows that the underlying reason is that Seq2Seq is equivalent to optimizing Kullback–Leibler (KL) divergence, and thus does not penalize the case whose generated probability is high while the true probability is low. However, the true probability is unknown, which poses challenges for tackling this problem. Inspired by the fact that the coherence (i.e. similarity) between post and response is consistent with human evaluation, we hypothesize that the true probability of a response is proportional to the coherence degree. The coherence scores are then used as the reward function in a reinforcement learning framework to penalize the case whose generated probability is high while the true probability is low. Three different types of coherence models, including an unlearned similarity function, a pretrained semantic matching function, and an end-to-end dual learning architecture, are proposed in this paper. Experimental results on both a Chinese Weibo dataset and an English Subtitle dataset show that the proposed models produce more specific and meaningful responses, yielding better performance than Seq2Seq models in terms of both metric-based and human evaluations.

Tuesday 17 08:30 - 09:55 CV-DEE - Deep Learning for Computer Vision (T1)

Chair: Pong C. Yuen

#3145

Image-level to Pixel-wise Labeling: From Theory to Practice
Tiezhu Sun, Wei Zhang, Zhijie Wang, Lin Ma, Zequn Jie

Deep Learning for Computer Vision

Conventional convolutional neural networks (CNNs) have achieved great success in image semantic segmentation. Existing methods mainly focus on learning pixel-wise labels from an image directly. In this paper, we advocate tackling the pixel-wise segmentation problem by considering the image-level classification labels. Theoretically, we analyze and discuss the effects of image-level labels on pixel-wise segmentation from the perspective of information theory. In practice, an end-to-end segmentation model is built by fusing the image-level and pixel-wise labeling networks. A generative network is included to reconstruct the input image and further boost the segmentation model training with an auxiliary loss. Extensive experimental results on benchmark data demonstrate the effectiveness of the proposed method, where good image-level labels can significantly improve the pixel-wise segmentation accuracy.
#2729

Unifying and Merging Well-trained Deep Neural Networks for Inference Stage
Yi-Min Chou, Yi-Ming Chan, Jia-Hong Lee, Chih-Yi Chiu, Chu-Song Chen

Deep Learning for Computer Vision

We propose a novel method to merge convolutional neural-nets for the inference stage. Given two well-trained networks that may have different architectures that handle different tasks, our method aligns the layers of the original networks and merges them into a unified model by sharing the representative codes of weights. The shared weights are further re-trained to fine-tune the performance of the merged model. The proposed method effectively produces a compact model that may run original tasks simultaneously on resource-limited devices. As it preserves the general architectures and leverages the co-used weights of well-trained networks, a substantial training overhead can be reduced to shorten the system development time. Experimental results demonstrate a satisfactory performance and validate the effectiveness of the method.
#1618

Refine or Represent: Residual Networks with Explicit Channel-wise Configuration
Yanyan Shen, Jinyang Gao

Deep Learning for Computer Vision

The successes of deep residual learning are mainly based on one key insight: instead of learning a completely new representation y = H(x), it is much easier to learn and optimize its residual mapping F(x) = H(x) - x, as F(x) could be generally closer to zero than the non-residual function H(x). In this paper, we further exploit this insight by explicitly configuring each feature channel with a fine-grained learning style. We define two types of channel-wise learning styles: Refine and Represent. A Refine channel is learnt via the residual function y_i = F_i(x) + x_i with a regularization term on the channel response ||F_i(x)||, aiming to refine the input feature channel x_i of the layer. A Represent channel directly learns a new representation y_i = H_i(x) without calculating the residual function with reference to x_i. We apply random channel-wise configuration to each residual learning block. Experimental results on the CIFAR10, CIFAR100 and ImageNet datasets demonstrate that our proposed method can substantially improve the performance of conventional residual networks including ResNet, ResNeXt and SENet.
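The two channel styles described in this abstract can be illustrated with a minimal sketch. It assumes each channel is a single scalar and that the learned response is supplied as a plain list; the names `block_output`, `refine_penalty` and `refine_mask` are hypothetical and are not taken from the paper.

```python
def block_output(x, response, refine_mask):
    """Combine a block's learned response with its input, channel by channel.

    x            -- input channels (one float per channel, for simplicity)
    response     -- learned per-channel response (F_i(x) or H_i(x))
    refine_mask  -- True for a Refine channel, False for a Represent channel
    """
    out = []
    for x_i, r_i, is_refine in zip(x, response, refine_mask):
        if is_refine:
            out.append(r_i + x_i)   # Refine: y_i = F_i(x) + x_i
        else:
            out.append(r_i)         # Represent: y_i = H_i(x), no skip connection
    return out


def refine_penalty(response, refine_mask):
    """Regularization on |F_i(x)|, applied to the Refine channels only."""
    return sum(abs(r_i) for r_i, is_refine in zip(response, refine_mask) if is_refine)


x = [1.0, 2.0, 3.0]
response = [0.5, -0.25, 2.0]
mask = [True, True, False]          # assumed configuration, chosen by hand here
y = block_output(x, response, mask)
penalty = refine_penalty(response, mask)
```

In this toy setting the first two channels stay close to their inputs while the third is replaced outright; the abstract states that the configuration is assigned at random per residual block, which the hand-picked mask above does not reproduce.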
#853

Human Motion Generation via Cross-Space Constrained Sampling
Zhongyue Huang, Jingwei Xu, Bingbing Ni

Deep Learning for Computer Vision

We aim to automatically generate a human motion sequence from a single input person image, with some specific action label. To this end, we propose a cross-space human motion video generation network which features two paths: a forward path that first samples/generates a sequence of low dimensional motion vectors based on Gaussian Process (GP), which is paired with the input person image to form a moving human figure sequence; and a backward path based on the predicted human images to re-extract the corresponding latent motion representations. Due to the lack of supervision, the reconstructed latent motion representations are expected to be as close as possible to the GP sampled ones, thus yielding a cyclic objective function for cross-space (i.e., motion and appearance) mutual constrained generation. We further propose an alternative sampling/generation algorithm with respect to constraints from both spaces. Extensive experimental results show that the proposed framework successfully generates novel human motion sequences with reasonable visual quality.
#241

Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Yang He, Guoliang Kang, Xuanyi Dong, Yanwei Fu, Yi Yang

Deep Learning for Computer Vision

This paper proposes a Soft Filter Pruning (SFP) method to accelerate the inference procedure of deep Convolutional Neural Networks (CNNs). Specifically, the proposed SFP enables the pruned filters to be updated when training the model after pruning. SFP has two advantages over previous works: (1) Larger model capacity. Updating previously pruned filters provides our approach with larger optimization space than fixing the filters to zero. Therefore, the network trained by our method has a larger model capacity to learn from the training data. (2) Less dependence on the pretrained model. Large capacity enables SFP to train from scratch and prune the model simultaneously. In contrast, previous filter pruning methods should be conducted on the basis of the pre-trained model to guarantee their performance. Empirically, SFP from scratch outperforms the previous filter pruning methods. Moreover, our approach has been demonstrated to be effective for many advanced CNN architectures. Notably, on ILSVRC-2012, SFP reduces FLOPs by more than 42% on ResNet-101 with even a 0.2% top-5 accuracy improvement, which has advanced the state-of-the-art. Code is publicly available on GitHub: https://github.com/he-y/softfilter-pruning
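The difference between soft pruning and fixing filters to zero can be sketched in a few lines. This is an illustration only: filters are flat lists of weights, the selection criterion (smallest L2 norm) is an assumption of this sketch, and `soft_prune` and `sgd_step` are hypothetical names, not the API of the authors' code.

```python
import math


def soft_prune(filters, prune_ratio):
    """Zero the filters with the smallest L2 norm, but keep them in the model."""
    norms = [math.sqrt(sum(w * w for w in f)) for f in filters]
    n_prune = int(len(filters) * prune_ratio)
    order = sorted(range(len(filters)), key=lambda i: norms[i])
    pruned = set(order[:n_prune])
    zeroed = [[0.0] * len(f) if i in pruned else list(f)
              for i, f in enumerate(filters)]
    return zeroed, pruned


def sgd_step(filters, grads, lr):
    """Plain gradient step applied to every filter, including zeroed ones."""
    return [[w - lr * g for w, g in zip(f, gf)] for f, gf in zip(filters, grads)]


filters = [[3.0, 4.0], [0.1, 0.0], [1.0, 0.0], [0.0, 2.0]]
zeroed, pruned = soft_prune(filters, prune_ratio=0.5)

# Because no mask freezes the pruned filters, the next update can revive them.
grads = [[-1.0, 0.0]] * len(filters)
updated = sgd_step(zeroed, grads, lr=0.5)
```

A hard-pruning variant would multiply the gradients by a fixed mask so that the zeroed filters stay at zero; omitting that mask is what leaves the larger optimization space the abstract refers to.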
#627

Deep Propagation Based Image Matting
Yu Wang, Yi Niu, Peiyong Duan, Jianwei Lin, Yuanjie Zheng

Deep Learning for Computer Vision

In this paper, we propose a deep propagation based image matting framework by introducing deep learning into learning an alpha matte propagation principle. Our deep learning architecture is a concatenation of a deep feature extraction module, an affinity learning module and a matte propagation module. These three modules are all differentiable and can be optimized jointly via an end-to-end training process. Our framework results in a semantic-level pairwise similarity of pixels for propagation by learning deep image representations adapted to matte propagation. It combines the power of deep learning and matte propagation and can therefore surpass prior state-of-the-art matting techniques in terms of both accuracy and training complexity, as validated by our experimental results from 243K images created based on two benchmark matting databases.
#1095

Progressive Blockwise Knowledge Distillation for Neural Network Acceleration
Hui Wang, Hanbin Zhao, Xi Li, Xu Tan

Deep Learning for Computer Vision

As an important and challenging problem in machine learning and computer vision, neural network acceleration essentially aims to enhance the computational efficiency without sacrificing the model accuracy too much. In this paper, we propose a progressive blockwise learning scheme for teacher-student model distillation at the subnetwork block level. The proposed scheme is able to distill the knowledge of the entire teacher network by locally extracting the knowledge of each block in terms of progressive blockwise function approximation. Furthermore, we propose a structure design criterion for the student subnetwork block, which is able to effectively preserve the original receptive field from the teacher network. Experimental results demonstrate the effectiveness of the proposed scheme against the state-of-the-art approaches.

Tuesday 17 08:30 - 09:55 SIS-ML1 - Sister Conferences Best Papers: Machine Learning (K11)

Chair: Volker Tresp

#5120

Emergent Tangled Program Graphs in Multi-Task Learning
Stephen Kelly, Malcolm Heywood

Sister Conferences Best Papers: Machine Learning

We propose a Genetic Programming (GP) framework to address high-dimensional Multi-Task Reinforcement Learning (MTRL) through emergent modularity. A bottom-up process is assumed in which multiple programs self-organize into collective decision-making entities, or teams, which then further develop into multi-team policy graphs, or Tangled Program Graphs (TPG). The framework learns to play three Atari video games simultaneously, producing a single control policy that matches or exceeds leading results from (game-specific) deep reinforcement learning in each game. More importantly, unlike the representation assumed for deep learning, TPG policies start simple and adaptively complexify through interaction with the task environment, resulting in agents that are exceedingly simple, operating in real-time without specialized hardware support such as GPUs.
#5138

Make Evasion Harder: An Intelligent Android Malware Detection System
Shifu Hou, Yanfang Ye, Yangqiu Song, Melih Abdulhayoglu

Sister Conferences Best Papers: Machine Learning

To combat evolving Android malware attacks, in this paper, instead of only using Application Programming Interface (API) calls, we further analyze the different relationships between them and create higher-level semantics which require more effort from attackers to evade detection. We represent the Android applications (apps), related APIs, and their rich relationships as a structured heterogeneous information network (HIN). Then we use a meta-path based approach to characterize the semantic relatedness of apps and APIs. We use each meta-path to formulate a similarity measure over Android apps, and aggregate different similarities using multi-kernel learning to make predictions. Promising experimental results based on real sample collections from Comodo Cloud Security Center demonstrate that our developed system HinDroid outperforms other alternative Android malware detection techniques.
#5147

Time Series Chains: A Novel Tool for Time Series Data Mining
Yan Zhu, Makoto Imamura, Daniel Nikovski, Eamonn Keogh

Sister Conferences Best Papers: Machine Learning

Since their introduction over a decade ago, time series motifs have become a fundamental tool for time series analytics, finding diverse uses in dozens of domains. In this work we introduce Time Series Chains, which are related to, but distinct from, time series motifs. Informally, time series chains are a temporally ordered set of subsequence patterns, such that each pattern is similar to the pattern that preceded it, but the first and last patterns are arbitrarily dissimilar. In the discrete space, this is similar to extracting the text chain “hit, hot, dot, dog” from a paragraph. The first and last words have nothing in common, yet they are connected by a chain of words with a small mutual difference. Time Series Chains can capture the evolution of systems, and help predict the future. As such, they potentially have implications for prognostics. In this work, we introduce a robust definition of time series chains, and a scalable algorithm that allows us to discover them in massive datasets.
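The discrete analogy in this abstract can be checked directly. The sketch below covers only the word example, using Hamming distance as an assumed measure of "small mutual difference"; it is not the time series chain definition or the discovery algorithm from the paper, and `hamming` and `is_chain` are hypothetical helper names.

```python
def hamming(a, b):
    """Number of positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(c1 != c2 for c1, c2 in zip(a, b))


def is_chain(words, max_step=1):
    """True if every word differs from its predecessor by at most max_step letters."""
    return all(hamming(prev, cur) <= max_step
               for prev, cur in zip(words, words[1:]))


chain = ["hit", "hot", "dot", "dog"]
steps = [hamming(a, b) for a, b in zip(chain, chain[1:])]
end_to_end = hamming(chain[0], chain[-1])
```

Each link changes a single letter, yet the first and last words differ in every position, which is the drift that a chain is meant to capture.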
#5150

TensorCast: Forecasting Time-Evolving Networks with Contextual Information
Miguel Araújo, Pedro Ribeiro, Christos Faloutsos

Sister Conferences Best Papers: Machine Learning

Can we forecast future connections in a social network? Can we predict who will start using a given hashtag on Twitter, leveraging contextual information such as who follows or retweets whom to improve our predictions? In this paper we present an abridged report of TensorCast, an award-winning method for forecasting time-evolving networks, that uses coupled tensors to incorporate multiple information sources. TensorCast is scalable (linearithmic on the number of connections), effective (more precise than competing methods) and general (applicable to any data source representable by a tensor). We also showcase our method when applied to forecast two large-scale heterogeneous real-world temporal networks, namely Twitter and DBLP.
#5114

A Genetic Programming Approach to Designing Convolutional Neural Network Architectures
Masanori Suganuma, Shinichi Shirakawa, Tomoharu Nagao

Sister Conferences Best Papers: Machine Learning

We propose a method for designing convolutional neural network (CNN) architectures based on Cartesian genetic programming (CGP). In the proposed method, the architectures of CNNs are represented by directed acyclic graphs, in which each node represents highly-functional modules such as convolutional blocks and tensor operations, and each edge represents the connectivity of layers. The architecture is optimized to maximize the classification accuracy for a validation dataset by an evolutionary algorithm. We show that the proposed method can find competitive CNN architectures compared with state-of-the-art methods on the image classification task using CIFAR-10 and CIFAR-100 datasets.
#5130

Distributing Frank-Wolfe via Map-Reduce
Armin Moharrer, Stratis Ioannidis

Sister Conferences Best Papers: Machine Learning

We identify structural properties under which a convex optimization over the simplex can be massively parallelized via map-reduce operations using the Frank-Wolfe (FW) algorithm. A broad class of problems, e.g., Convex Approximation, Experimental Designs, and Adaboost, can be tackled this way. We implement FW over Apache Spark, and solve problems with 20 million variables using 350 cores in 79 minutes; the same operation takes 165 hours when executed serially.
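For readers unfamiliar with Frank-Wolfe, a serial sketch of the iteration over the simplex follows. It assumes a toy quadratic objective and the common step size 2/(k+2); it does not reproduce the map-reduce distribution or the Apache Spark implementation described in the abstract, and all names are hypothetical.

```python
def frank_wolfe_simplex(grad, x0, iters):
    """Minimize a convex function over the probability simplex.

    Over the simplex, the linear minimization step reduces to picking the
    coordinate with the smallest gradient entry (a vertex of the simplex).
    """
    x = list(x0)
    for k in range(iters):
        g = grad(x)
        i = min(range(len(x)), key=lambda j: g[j])   # best vertex e_i
        gamma = 2.0 / (k + 2.0)                      # assumed step-size rule
        x = [(1.0 - gamma) * xj for xj in x]         # move towards e_i
        x[i] += gamma
    return x


# Toy problem (an assumption of this sketch): project a point onto the simplex.
target = [0.2, 0.3, 0.5]


def objective(x):
    return sum((xj - tj) ** 2 for xj, tj in zip(x, target))


def grad(x):
    return [2.0 * (xj - tj) for xj, tj in zip(x, target)]


x = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], iters=2000)
```

Every iterate is a convex combination of simplex vertices, so feasibility is maintained without a projection step; the only global operation per iteration is the argmin over gradient entries.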
#5122

An Efficient Minibatch Acceptance Test for Metropolis-Hastings
Daniel Seita, Xinlei Pan, Haoyu Chen, John Canny

Sister Conferences Best Papers: Machine Learning

We present a novel Metropolis-Hastings method for large datasets that uses small expected-size mini-batches of data. Previous work on reducing the cost of Metropolis-Hastings tests yields only constant factor reductions versus using the full dataset for each sample. Here we present a method that can be tuned to provide arbitrarily small batch sizes, by adjusting either proposal step size or temperature. Our test uses the noise-tolerant Barker acceptance test with a novel additive correction variable. The resulting test has similar cost to a normal SGD update. Our experiments demonstrate several order-of-magnitude speedups over previous work.
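The Barker acceptance test mentioned in this abstract accepts a proposal with a logistic probability of the log density ratio, rather than the min(1, ratio) rule of standard Metropolis-Hastings. The sketch below shows the full-data version with a symmetric Gaussian proposal; the minibatch estimate and the additive correction variable from the paper are not reproduced, and the function names are hypothetical.

```python
import math
import random


def barker_accept_prob(delta):
    """Barker acceptance probability 1 / (1 + exp(-delta)).

    delta is log p(proposal) - log p(current) for a symmetric proposal.
    The two branches avoid overflow for large |delta|.
    """
    if delta >= 0:
        return 1.0 / (1.0 + math.exp(-delta))
    e = math.exp(delta)
    return e / (1.0 + e)


def barker_step(theta, log_target, step_size, rng):
    """One sampler step: propose with Gaussian noise, accept via the Barker test."""
    proposal = theta + rng.gauss(0.0, step_size)
    delta = log_target(proposal) - log_target(theta)
    if rng.random() < barker_accept_prob(delta):
        return proposal
    return theta


def log_standard_normal(t):
    """Unnormalized log density of N(0, 1), used as a toy target."""
    return -0.5 * t * t


rng = random.Random(0)
theta = 0.0
samples = []
for _ in range(1000):
    theta = barker_step(theta, log_standard_normal, step_size=1.0, rng=rng)
    samples.append(theta)
```

A proposal that leaves the density unchanged is accepted only half of the time under this rule, so the Barker test accepts less often than the standard test for the same proposal.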

Tuesday 17 08:30 - 09:55 HAI-COG - Cognition (C2)

Chair: Jörg Cassens

#1751

Brain-inspired Balanced Tuning for Spiking Neural Networks
Tielin Zhang, Yi Zeng, Dongcheng Zhao, Bo Xu

Cognition

Due to their nature, Spiking Neural Networks (SNNs) are challenging to train with biologically plausible learning principles. Multi-layered SNNs have non-differentiable neurons and temporal-centric synapses, which make them nearly impossible to tune directly by backpropagation. Here we propose an alternative, biologically inspired balanced tuning approach to train SNNs. The approach draws three main inspirations from the brain: firstly, the biological network will usually be trained towards the state where the temporal updates of variables (e.g. membrane potential) are at equilibrium; secondly, specific proportions of excitatory and inhibitory neurons usually contribute to stable representations; thirdly, short-term plasticity (STP) is a general principle that keeps the input and output of synapses balanced towards better learning convergence. With these inspirations, we train SNNs in three steps: firstly, the SNN model is trained with the three brain-inspired principles; then weakly supervised learning is used to tune the membrane potential in the final layer for network classification; finally, the learned information is consolidated from membrane potential into the weights of synapses by Spike-Timing Dependent Plasticity (STDP). The proposed approach is verified on the MNIST hand-written digit recognition dataset, and the performance (an accuracy of 98.64%) indicates that the idea of balancing states could indeed improve the learning ability of SNNs, which shows the power of the proposed brain-inspired approach for tuning biologically plausible SNNs.
#2235

CSNN: An Augmented Spiking based Framework with Perceptron-Inception
Qi Xu, Yu Qi, Hang Yu, Jiangrong Shen, Huajin Tang, Gang Pan

Cognition

Spiking Neural Networks (SNNs) represent and transmit information in spikes, which is considered more biologically realistic and computationally powerful than the traditional Artificial Neural Networks. The spiking neurons encode useful temporal information and are highly robust to noise. The feature extraction ability of typical SNNs is limited by shallow structures. This paper focuses on improving the feature extraction ability of SNNs by virtue of the powerful feature extraction ability of Convolutional Neural Networks (CNNs). CNNs can extract abstract features by means of the structure of the convolutional feature maps. We propose a CNN-SNN (CSNN) model to combine the feature learning ability of CNNs with the cognition ability of SNNs. The CSNN model learns the encoded spatial temporal representations of images in an event-driven way. We evaluate the CSNN model on the handwritten digit image dataset MNIST and its variants. In the presented experimental results, the proposed CSNN model is evaluated regarding learning capabilities, encoding mechanisms, robustness to noisy stimuli and its classification performance. The results show that CSNN behaves well compared to other cognitive models with significantly fewer neurons and training samples. Our work brings more biological realism into modern image classification models, with the hope that these models can inform how the brain performs this high-level vision task.
#2310

Jointly Learning Network Connections and Link Weights in Spiking Neural Networks
Yu Qi, Jiangrong Shen, Yueming Wang, Huajin Tang, Hang Yu, Zhaohui Wu, Gang Pan

Cognition

Spiking neural networks (SNNs) are considered to be biologically plausible and power-efficient on neuromorphic hardware. However, unlike the brain mechanisms, most existing SNN algorithms have fixed network topologies and connection relationships. This paper proposes a method to jointly learn network connections and link weights. The connection structures are optimized by the spike-timing-dependent plasticity (STDP) rule with timing information, and the link weights are optimized by a supervised algorithm. The connection structures and the weights are learned alternately until a termination condition is satisfied. Experiments are carried out using four benchmark datasets. Our approach outperforms classical learning methods such as STDP, Tempotron, SpikeProp, and a state-of-the-art supervised algorithm. In addition, the learned structures effectively reduce the number of connections by about 24%, thus improving the computational efficiency of the network.
#4080

Replicating Active Appearance Model by Generator Network
Tian Han, Jiawen Wu, Ying Nian Wu

Cognition

A recent Cell paper [Chang and Tsao, 2017] reports an interesting discovery. For the face stimuli generated by a pre-trained active appearance model (AAM), the responses of neurons in the areas of the primate brain that are responsible for face recognition exhibit a strong linear relationship with the shape variables and appearance variables of the AAM that generates the face stimuli. In this paper, we show that this behavior can be replicated by a deep generative model called the generator network, which assumes that the observed signals are generated by latent random variables via a top-down convolutional neural network. Specifically, we learn the generator network from the face images generated by a pre-trained AAM model using a variational auto-encoder, and we show that the inferred latent variables of the learned generator network have a strong linear relationship with the shape and appearance variables of the AAM model that generates the face images. Unlike the AAM model that has an explicit shape model where the shape variables generate the control points or landmarks, the generator network has no such shape model and shape variables. Yet the generator network can learn the shape knowledge in the sense that some of the latent variables of the learned generator network capture the shape variations in the face images generated by AAM.
#4177

Similarity-Based Reasoning, Raven's Matrices, and General Intelligence
Can Serif Mekik, Ron Sun, David Yun Dai

Cognition

This paper presents a model tackling a variant of the Raven's Matrices family of human intelligence tests along with computational experiments. Raven's Matrices are thought to challenge human subjects' ability to generalize knowledge and deal with novel situations. We investigate how a generic ability to quickly and accurately generalize knowledge can be succinctly captured by a computational system. This work is distinct from other prominent attempts to deal with the task in terms of adopting a generalized similarity-based approach. Raven's Matrices appear to primarily require similarity-based or analogical reasoning over a set of varied visual stimuli. The similarity-based approach eliminates the need for structure mapping as emphasized in many existing analogical reasoning systems. Instead, it relies on feature-based processing with both relational and non-relational features. Preliminary experimental results suggest that our approach performs comparably to existing symbolic analogy-based models.
#534

A Simple Convolutional Neural Network for Accurate P300 Detection and Character Spelling in Brain Computer Interface
Hongchang Shan, Yu Liu, Todor Stefanov

Cognition

A Brain Computer Interface (BCI) character speller allows human beings to directly spell characters using eye gazes, thereby building communication between the human brain and a computer. Convolutional Neural Networks (CNNs) have shown better performance than traditional machine learning methods for BCI signal recognition and its application to the character speller. However, current CNN architectures limit further accuracy improvements of signal detection and character spelling and also need high complexity to achieve competitive accuracy, thereby preventing the use of CNNs in portable BCIs. To address these issues, we propose a novel and simple CNN which effectively learns feature representations from both raw temporal information and raw spatial information. The complexity of the proposed CNN is significantly reduced compared with state-of-the-art CNNs for BCI signal detection. We perform experiments on three benchmark datasets and compare our results with the best results reported in previous research. The comparison shows that our proposed CNN can increase the signal detection accuracy by up to 15.61% and the character spelling accuracy by up to 19.35%.
#27

Salient Object Detection by Lossless Feature Reflection
Pingping Zhang, Wei Liu, Huchuan Lu, Chunhua Shen

Cognition

Salient object detection, which aims to identify and locate the most salient pixels or regions in images, has been attracting more and more interest due to its various real-world applications. However, this vision task is quite challenging, especially under complex image scenes. Inspired by the intrinsic reflection of natural images, in this paper we propose a novel feature learning framework for large-scale salient object detection. Specifically, we design a symmetrical fully convolutional network (SFCN) to learn complementary saliency features under the guidance of lossless feature reflection. The location information of salient objects, together with contextual and semantic information, is jointly utilized to supervise the proposed network for more accurate saliency predictions. In addition, to overcome the blurry boundary problem, we propose a new structural loss function to learn clear object boundaries and spatially consistent saliency. The coarse prediction results are effectively refined by this structural information for performance improvements. Extensive experiments on seven saliency detection datasets demonstrate that our approach achieves consistently superior performance and outperforms the very recent state-of-the-art methods.

Tuesday 17 08:30 - 09:55 MUL-CFR - Collaborative Filtering, Recommender Systems (C3)

Chair: Paola Velardi

#470

Recurrent Collaborative Filtering for Unifying General and Sequential Recommender
Disheng Dong, Xiaolin Zheng, Ruixun Zhang, Yan Wang

Collaborative Filtering, Recommender Systems

General recommender and sequential recommender are two commonly applied modeling paradigms for recommendation tasks. General recommender focuses on modeling the general user preferences, ignoring the sequential patterns in user behaviors; whereas sequential recommender focuses on exploring the item-to-item sequential relations, failing to model the global user preferences. In addition, better recommendation performance has recently been achieved by adopting an approach to combine them. However, previous approaches are unable to solve both tasks in a unified way and cannot capture the whole historical sequential information. In this paper, we propose a recommendation model named Recurrent Collaborative Filtering (RCF), which unifies both paradigms within a single model. Specifically, we combine a recurrent neural network (the sequential recommender part) and a matrix factorization model (the general recommender part) in a multi-task learning framework, where we perform joint optimization with shared model parameters, enforcing the two parts to regularize each other. Furthermore, we empirically demonstrate on MovieLens and Netflix datasets that our model outperforms the state-of-the-art methods across the tasks of both sequential and general recommender.
#3015

Aspect-Level Deep Collaborative Filtering via Heterogeneous Information Networks
Xiaotian Han, Chuan Shi, Senzhang Wang, Philip S. Yu, Li Song

Collaborative Filtering, Recommender Systems

Latent factor models have been widely used for recommendation. Most existing latent factor models mainly utilize the rating information between users and items, although some recently extended models add some auxiliary information to learn a unified latent factor between users and items. The unified latent factor only represents the latent features of users and items from the aspect of purchase history. However, the latent features of users and items may stem from different aspects, e.g., the brand-aspect and category-aspect of items. In this paper, we propose a Neural network based Aspect-level Collaborative Filtering model (NeuACF) to exploit different aspect latent factors. Through modelling rich objects and relations in recommender system as a heterogeneous information network, NeuACF first extracts different aspect-level similarity matrices of users and items through different meta-paths and then feeds an elaborately designed deep neural network with these matrices to learn aspect-level latent factors. Finally, the aspect-level latent factors are effectively fused with an attention mechanism for the top-N recommendation. Extensive experiments on three real datasets show that NeuACF significantly outperforms both existing latent factor models and recent neural network models.
#2281

Outer Product-based Neural Collaborative Filtering
Xiangnan He, Xiaoyu Du, Xiang Wang, Feng Tian, Jinhui Tang, Tat-Seng Chua

Collaborative Filtering, Recommender Systems

In this work, we contribute a new multi-layer neural network architecture named ONCF to perform collaborative filtering. The idea is to use an outer product to explicitly model the pairwise correlations between the dimensions of the embedding space. In contrast to existing neural recommender models that combine user embedding and item embedding via a simple concatenation or element-wise product, our proposal of using an outer product above the embedding layer results in a two-dimensional interaction map that is more expressive and semantically plausible. Above the interaction map obtained by the outer product, we propose to employ a convolutional neural network to learn high-order correlations among embedding dimensions. Extensive experiments on two public implicit feedback datasets demonstrate the effectiveness of our proposed ONCF framework, in particular, the positive effect of using an outer product to model the correlations between embedding dimensions in the low level of a multi-layer neural recommender model.
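The interaction map described in this abstract is the outer product of a user embedding and an item embedding. A minimal sketch with hand-written toy embeddings follows; the convolutional layers that ONCF places above the map are not shown, and `interaction_map` is a hypothetical name.

```python
def interaction_map(user_emb, item_emb):
    """Outer product: entry (a, b) pairs user dimension a with item dimension b."""
    return [[u * v for v in item_emb] for u in user_emb]


user_emb = [1.0, 2.0]     # toy embeddings, assumed for illustration
item_emb = [3.0, 4.0]
emap = interaction_map(user_emb, item_emb)

# The diagonal holds the element-wise product used by matrix factorization,
# so its sum recovers the usual inner-product score.
diagonal = [emap[k][k] for k in range(len(user_emb))]
mf_score = sum(diagonal)
```

The off-diagonal entries are the cross-dimension correlations that concatenation or an element-wise product alone does not expose.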
#2222

Adaptive Collaborative Similarity Learning for Unsupervised Multi-view Feature Selection
Xiao Dong, Lei Zhu, Xuemeng Song, Jingjing Li, Zhiyong Cheng

Collaborative Filtering, Recommender Systems

In this paper, we investigate the research problem of unsupervised multi-view feature selection. Conventional solutions first simply combine multiple pre-constructed view-specific similarity structures into a collaborative similarity structure, and then perform the subsequent feature selection. These two processes are separate and independent. The collaborative similarity structure remains fixed during feature selection. Further, the simple undirected view combination may adversely reduce the reliability of the ultimate similarity structure for feature selection, as the view-specific similarity structures generally involve noise and outlying entries. To alleviate these problems, we propose an adaptive collaborative similarity learning (ACSL) method for multi-view feature selection. We propose to dynamically learn the collaborative similarity structure, and further integrate it with the ultimate feature selection into a unified framework. Moreover, a reasonable rank constraint is devised to adaptively learn an ideal collaborative similarity structure with proper similarity combination weights and desirable neighbor assignment, both of which could positively facilitate the feature selection. An effective solution with proved convergence is derived to iteratively tackle the formulated optimization problem. Experiments demonstrate the superiority of the proposed approach.
#3266

NPE: Neural Personalized Embedding for Collaborative Filtering
ThaiBinh Nguyen, Atsuhiro Takasu

Collaborative Filtering, Recommender Systems

Matrix factorization is one of the most efficient approaches in recommender systems. However, such algorithms, which rely on the interactions between users and items, perform poorly for "cold-users" (users with little history of such interactions) and at capturing the relationships between closely related items. To address these problems, we propose a neural personalized embedding (NPE) model, which improves the recommendation performance for cold-users and can learn effective representations of items. It models a user's click on an item in two terms: the personal preference of the user for the item, and the relationships between this item and other items clicked by the user. We show that NPE outperforms competing methods for top-N recommendations, especially for cold-user recommendations. We also performed a qualitative analysis that shows the effectiveness of the representations learned by the model.
#1888

Towards Better Representation Learning for Personalized News Recommendation: a Multi-Channel Deep Fusion Approach
Jianxun Lian, Fuzheng Zhang, Xing Xie, Guangzhong Sun

Collaborative Filtering, Recommender Systems

Millions of news articles emerge every day. How to provide personalized news recommendations has become a critical task for service providers. In the past few decades, latent factor models have been widely used for building recommender systems (RSs). With the remarkable success of deep learning techniques, especially in visual computing and natural language understanding, more and more researchers have been trying to leverage deep neural networks to learn latent representations for advanced RSs. Following mainstream deep learning-based RSs, we propose a novel deep fusion model (DFM), which aims to improve the representation learning abilities in deep RSs and can be used for both candidate retrieval and item re-ranking. There are two key components in our DFM approach, namely an inception module and an attention mechanism. The inception module improves the plain multi-layer network by leveraging various levels of interaction simultaneously, while the attention mechanism merges latent representations learnt from different channels in a customized fashion. We conduct extensive experiments on a commercial news reading dataset, and the results demonstrate that the proposed DFM is superior to several state-of-the-art models.
#3708

Content-Aware Hierarchical Point-of-Interest Embedding Model for Successive POI Recommendation
Buru Chang, Yonggyu Park, Donghyeon Park, Seongsoon Kim, Jaewoo Kang

Collaborative Filtering, Recommender Systems

Recommending a point-of-interest (POI) a user will visit next based on temporal and spatial context information is an important task in mobile-based applications. Recently, several POI recommendation models based on conventional sequential-data modeling approaches have been proposed. However, such models focus on only a user's check-in sequence information and the physical distance between POIs. Furthermore, they do not utilize the characteristics of POIs or the relationships between POIs. To address this problem, we propose CAPE, the first content-aware POI embedding model which utilizes text content that provides information about the characteristics of a POI. CAPE consists of a check-in context layer and a text content layer. The check-in context layer captures the geographical influence of POIs from the check-in sequence of a user, while the text content layer captures the characteristics of POIs from the text content. To validate the efficacy of CAPE, we constructed a large-scale POI dataset. In the experimental evaluation, we show that the performance of the existing POI recommendation models can be significantly improved by simply applying CAPE to the models.

Tuesday 17 10:25 - 11:10 Invited Talk (VICTORIA)

Chair: Sarit Kraus

The Moral Machine Experiment
Jean-Francois Bonnefon

Invited Talk

Tuesday 17 11:20 - 12:45 DEMOS1 - Demos Talks 1: Planning, Robotics, Vision (VICTORIA)

Chair: Paul Weng

#5306

Data-Driven Inventory Management and Dynamic Pricing Competition on Online Marketplaces
Rainer Schlosser, Carsten Walther, Martin Boissier, Matthias Uflacker

Demos Talks 1: Planning, Robotics, Vision

Online markets are characterized by competition and limited demand information. In E-commerce, firms compete against each other using data-driven dynamic pricing and ordering strategies. Successfully managing both inventory levels and offer prices is a highly challenging task as (i) demand is uncertain, (ii) competitors strategically interact, and (iii) optimized pricing and ordering decisions are mutually dependent. Currently, retailers lack the possibility to test and evaluate their algorithms appropriately before releasing them into the real world. To study joint dynamic ordering and pricing competition on online marketplaces, we built an interactive simulation platform. To be both flexible and scalable, the platform has a microservice-based architecture and allows handling dozens of competing merchants and streams of consumers with configurable characteristics. Further, we deployed and compared different pricing and ordering strategies, from simple rule-based ones to highly sophisticated data-driven strategies which are based on state-of-the-art demand learning techniques and efficient dynamic optimization models.
#5326

IBM Scenario Planning Advisor: Plan Recognition as AI Planning in Practice
Shirin Sohrabi, Michael Katz, Oktie Hassanzadeh, Octavian Udrea, Mark D. Feblowitz

Demos Talks 1: Planning, Robotics, Vision

We present the IBM Research Scenario Planning Advisor (SPA), a decision support system that allows users to generate diverse alternate scenarios of the future and enhance their ability to imagine the different possible outcomes, including unlikely but potentially impactful futures. The system includes tooling for experts to intuitively encode their domain knowledge, and uses AI Planning to reason about this knowledge and the current state of the world, including news and social media, when generating scenarios.
#5328

Visualizations for an Explainable Planning Agent
Tathagata Chakraborti, Kshitij P. Fadnis, Kartik Talamadupula, Mishal Dholakia, Biplav Srivastava, Jeffrey O. Kephart, Rachel K. E. Bellamy

Demos Talks 1: Planning, Robotics, Vision

In this demonstration, we report on the visualization capabilities of an Explainable AI Planning (XAIP) agent that can support human-in-the-loop decision-making. Imposing transparency and explainability requirements on such agents is crucial for establishing human trust and common ground with an end-to-end automated planning system. Visualizing the agent's internal decision making processes is a crucial step towards achieving this. This may include externalizing the "brain" of the agent: starting from its sensory inputs, to progressively higher order decisions made by it in order to drive its planning components. We demonstrate these functionalities in the context of a smart assistant in the Cognitive Environments Laboratory at IBM's T.J. Watson Research Center.
#5338

Near Real-Time Detection of Poachers from Drones in AirSim
Elizabeth Bondi, Ashish Kapoor, Debadeepta Dey, James Piavis, Shital Shah, Robert Hannaford, Arvind Iyer, Lucas Joppa, Milind Tambe

Demos Talks 1: Planning, Robotics, Vision

The unrelenting threat of poaching has led to increased development of new technologies to combat it. One such example is the use of thermal infrared cameras mounted on unmanned aerial vehicles (UAVs or drones) to spot poachers at night and report them to park rangers before they are able to harm any animals. However, monitoring the live video stream from these conservation UAVs all night is an arduous task. Therefore, we discuss SPOT (Systematic Poacher deTector), a novel application that augments conservation drones with the ability to automatically detect poachers and animals in near real time. SPOT illustrates the feasibility of building upon state-of-the-art AI techniques, such as Faster RCNN, to address the challenges of automatically detecting animals and poachers in infrared images. This paper reports (i) the design of SPOT, (ii) efficient processing techniques to ensure usability in the field, (iii) evaluation of SPOT based on historical videos and a real-world test run by the end-users, Air Shepherd, in the field, and (iv) the use of AirSim for live demonstration of SPOT. The promising results from a field test have led to a plan for larger-scale deployment in a national park in southern Africa. While SPOT is developed for conservation drones, its design and novel techniques have wider application for automated detection from UAV videos.
#5339

A Virtual Environment with Multi-Robot Navigation, Analytics, and Decision Support for Critical Incident Investigation
David L. Smyth, James Fennell, Sai Abinesh, Nazli B. Karimi, Frank G. Glavin, Ihsan Ullah, Brett Drury, Michael G. Madden

Demos Talks 1: Planning, Robotics, Vision

Accidents and attacks that involve chemical, biological, radiological/nuclear or explosive (CBRNE) substances are rare, but can be of high consequence. Since the investigation of such events is not anybody's routine work, a range of AI techniques can reduce investigators' cognitive load and support decision-making, including: planning the assessment of the scene; ongoing evaluation and updating of risks; control of autonomous vehicles for collecting images and sensor data; reviewing images/videos for items of interest; identification of anomalies; and retrieval of relevant documentation. Because of the rare and high-risk nature of these events, realistic simulations can support the development and evaluation of AI-based tools. We have developed realistic models of CBRNE scenarios and implemented an initial set of tools.
#5348

Generating Plans for Cooperative Connected UAVs
François Bodin, Tristan Charrier, Arthur Queffelec, François Schwarzentruber

Demos Talks 1: Planning, Robotics, Vision

We present a tool for graph coverage with a fleet of UAVs. The UAVs must achieve the coverage of an area under the constraint of staying connected with the base, where the mission supervisor starts the plan. With an OpenStreetMap interface, the user is able to choose a specific location on which the mission needs to be generated and observe the resulting plan being executed.
#5347

Curly: An AI-based Curling Robot Successfully Competing in the Olympic Discipline of Curling
Dong-Ok Won, Byung-Do Kim, Ho-Jung Kim, Tae-San Eom, Klaus-Robert Müller, Seong-Whan Lee

Demos Talks 1: Planning, Robotics, Vision

Most artificial intelligence (AI) based learning systems act in virtual or laboratory environments. Here we demonstrate an AI-based curling robot system named 'Curly' that competes on a real-world curling ice sheet. Curly encompasses (1) an AI-based curling strategy and simulation engine that accounts for the high 'icy' uncertainty, (2) the thrower robot enabled by autonomous driving with traction control, and (3) the skip robot that recognizes the curling field and stone configuration based on vision technology. Curly performed well both in classical game situations and when interacting with human opponents, namely, the top-ranked Korean amateur high school curling team.

Tuesday 17 11:20 - 12:45 MAS-RA - Resource Allocation (C8)

Chair: Haris Aziz

#617

Democratic Fair Allocation of Indivisible Goods
Erel Segal-Halevi, Warut Suksompong

Resource Allocation

We study the problem of fairly allocating indivisible goods to groups of agents. Agents in the same group share the same set of goods even though they may have different preferences. Previous work has focused on unanimous fairness, in which all agents in each group must agree that their group's share is fair. Under this strict requirement, fair allocations exist only for small groups. We introduce the concept of democratic fairness, which aims to satisfy a certain fraction of the agents in each group. This concept is better suited to large groups such as cities or countries. We present protocols for democratic fair allocation among two or more arbitrarily large groups of agents with monotonic, additive, or binary valuations. Our protocols approximate both envy-freeness and maximin-share fairness. As an example, for two groups of agents with additive valuations, our protocol yields an allocation that is envy-free up to one good and gives at least half of the maximin share to at least half of the agents in each group.
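To make the idea of democratic fairness concrete, the following sketch computes the fraction of agents in one group who regard their group's bundle as envy-free up to one good (EF1) relative to the other group's bundle. It assumes two groups and additive valuations given as lists indexed by good; the function names and the toy numbers are illustrative, and this is a checker for a given allocation, not the allocation protocol from the paper.

```python
def is_ef1_for_agent(valuation, own_bundle, other_bundle):
    """EF1 test for one agent with an additive valuation.

    The agent is satisfied if its group's bundle is worth at least as much
    as the other bundle after removing the other bundle's best good.
    """
    own = sum(valuation[g] for g in own_bundle)
    other = sum(valuation[g] for g in other_bundle)
    best_removed = max((valuation[g] for g in other_bundle), default=0)
    return own >= other - best_removed

def satisfied_fraction(group_valuations, own_bundle, other_bundle):
    """Fraction of the group's agents for whom the allocation is EF1."""
    ok = sum(
        is_ef1_for_agent(v, own_bundle, other_bundle)
        for v in group_valuations
    )
    return ok / len(group_valuations)

# Four goods, indexed 0..3. The group below holds goods {0, 1}.
group = [
    [5, 1, 1, 1],  # own 6 vs. other 2: satisfied
    [1, 1, 1, 5],  # own 2 vs. other 6 - 5 = 1: satisfied
    [0, 0, 4, 4],  # own 0 vs. other 8 - 4 = 4: not satisfied
]
fraction = satisfied_fraction(group, own_bundle=[0, 1], other_bundle=[2, 3])
```

Here two of three agents are satisfied, so the allocation would meet a one-half democratic threshold for this group even though it fails unanimous fairness.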
#1170

Maximin Share Allocations on Cycles
Zbigniew Lonc, Miroslaw Truszczynski

Resource Allocation

The problem of fair division of indivisible goods is a fundamental problem of social choice. Recently, the problem was extended to the setting when goods form a graph and the goal is to allocate goods to agents so that each agent's bundle forms a connected subgraph. Researchers proved that, unlike in the original problem (which corresponds to the case of the complete graph in the extended setting), in the case of the goods-graph being a tree, allocations offering each agent a bundle of or exceeding her maximin share value always exist. Moreover, they can be found in polynomial time. We consider here the problem of maximin share allocations of goods on a cycle. Despite the simplicity of the graph, the problem turns out to be significantly harder than its tree version. We present cases when maximin share allocations of goods on cycles exist and provide results on allocations guaranteeing each agent a certain portion of her maximin share. We also study algorithms for computing maximin share allocations of goods on cycles.
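For a single agent, the maximin share on a cycle can be illustrated by brute force: split the cycle into as many contiguous arcs as there are agents and keep the split whose worst arc is most valuable. The sketch below assumes non-negative additive values, at least two agents, and at least as many goods as agents; it is an exhaustive illustration under those assumptions, not one of the algorithms studied in the paper.

```python
from itertools import combinations

def cycle_maximin_share(values, n_agents):
    """Maximin share of one agent when bundles must be arcs of a cycle.

    values[i] is the agent's value for the i-th good around the cycle.
    A split is described by n_agents cut positions; cut c lies just
    before good c, so consecutive cuts delimit one non-empty arc.
    """
    m = len(values)
    if not 2 <= n_agents <= m:
        raise ValueError("need 2 <= n_agents <= number of goods")
    best = float("-inf")
    for cuts in combinations(range(m), n_agents):
        arc_values = []
        for k, start in enumerate(cuts):
            end = cuts[(k + 1) % n_agents]
            length = (end - start) % m  # wraps around for the last arc
            arc_values.append(
                sum(values[(start + j) % m] for j in range(length))
            )
        best = max(best, min(arc_values))
    return best

share_two = cycle_maximin_share([3, 1, 2, 2], 2)
share_three = cycle_maximin_share([3, 1, 2, 2], 3)
```

The enumeration visits every choice of cut positions, so it is only practical for small instances.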
#1476

Truthful Fair Division without Free Disposal
Xiaohui Bei, Guangda Huzhang, Warut Suksompong

Resource Allocation

We study the problem of fairly dividing a heterogeneous resource, commonly known as cake cutting and chore division, in the presence of strategic agents. While a number of results in this setting have been established in previous works, they rely crucially on the free disposal assumption, meaning that the mechanism is allowed to throw away part of the resource at no cost. In the present work, we remove this assumption and focus on mechanisms that always allocate the entire resource. We exhibit a truthful envy-free mechanism for cake cutting and chore division for two agents with piecewise uniform valuations, and we complement our result by showing that such a mechanism does not exist when certain additional assumptions are made. Moreover, we give truthful mechanisms for multiple agents with restricted classes of valuations.
#2340

Dynamic Fair Division Problem with General Valuations
Bo Li, Wenyang Li, Yingkai Li

Resource Allocation

In this paper, we focus on how to dynamically allocate a divisible resource fairly among n players who arrive and depart over time. The players may have general heterogeneous valuations over the resource. It is known that exact envy-free and proportional allocations may not exist in the dynamic setting [Walsh, 2011]. Thus, we study to what extent we can guarantee fairness in the dynamic setting. We first design two algorithms which are O(log n)-proportional and O(n)-envy-free for the setting with general valuations, and by constructing adversarial instances such that all dynamic algorithms must be at least Omega(1)-proportional and Omega(n/log n)-envy-free, we show that the bounds are tight up to a logarithmic factor. Moreover, we introduce the setting where the players' valuations are uniform on the resource but with different demands, which generalizes the setting of [Friedman et al., 2015]. We prove an O(log n) upper bound and a tight lower bound for this case.
#4519

Fair Division Under Cardinality Constraints
Arpita Biswas, Siddharth Barman

Resource Allocation

We consider the problem of fairly allocating indivisible goods among agents under cardinality constraints and additive valuations. In this setting, we are given a partition of the entire set of goods (i.e., the goods are categorized), and a limit is specified on the number of goods that can be allocated from each category to any agent. The objective here is to find a fair allocation in which the subset of goods assigned to any agent satisfies the given cardinality constraints. This problem naturally captures a number of resource-allocation applications, and is a generalization of the well-studied unconstrained fair division problem. The two central notions of fairness, in the context of fair division of indivisible goods, are envy freeness up to one good (EF1) and the (approximate) maximin share guarantee (MMS). We show that the existence and algorithmic guarantees established for these solution concepts in the unconstrained setting can essentially be achieved under cardinality constraints. Furthermore, focusing on the case wherein all the agents have the same additive valuation, we establish that EF1 allocations exist even under matroid constraints.
#3557

Redividing the Cake
Erel Segal-Halevi

Resource Allocation

A heterogeneous resource, such as a land-estate, is already divided among several agents in an unfair way. The challenge is to re-divide it among the agents in a way that balances fairness with ownership rights. We present re-division protocols that attain various combinations of fairness and ownership rights, in various settings differing in the geometric constraints on the allotments: (a) no geometric constraints; (b) connectivity: the cake is a one-dimensional interval and each piece must be a contiguous interval; (c) rectangularity: the cake is a two-dimensional rectangle and the pieces should be rectangles; (d) convexity: the cake is a two-dimensional convex polygon and the pieces should be convex.
#4014

Comparing Approximate Relaxations of Envy-Freeness
Georgios Amanatidis, Georgios Birmpas, Vangelis Markakis

Resource Allocation

In fair division problems with indivisible goods it is well known that one cannot have any guarantees for the classic fairness notions of envy-freeness and proportionality. As a result, several relaxations have been introduced, most of them in quite recent works. We focus on four such notions, namely envy-freeness up to one good (EF1), envy-freeness up to any good (EFX), maximin share fairness (MMS), and pairwise maximin share fairness (PMMS). Since obtaining these relaxations also turns out to be problematic in several scenarios, approximate versions of them have also been considered. In this work, we investigate further the connections between the four notions mentioned above and their approximate versions. We establish several tight or almost tight results concerning the approximation quality that any of these notions guarantees for the others, providing an almost complete picture of this landscape. Some of our findings reveal interesting and surprising consequences regarding the power of these notions, e.g., PMMS and EFX provide the same worst-case guarantee for MMS, despite PMMS being a strictly stronger notion than EFX. We believe such implications provide further insight on the quality of approximately fair solutions.

Tuesday 17 11:20 - 12:45 CV-MT - Motion and Tracking (T1)

Chair: Wei Feng

#2357

Feature Integration with Adaptive Importance Maps for Visual Tracking
Aishi Li, Ming Yang, Wanqi Yang

Motion and Tracking

Discriminative correlation filters have recently achieved excellent performance for visual object tracking. The key to success is to make full use of dense sampling and specific properties of circulant matrices in the Fourier domain. However, previous studies do not take into consideration the importance and complementary information of different features, simply concatenating them. This paper investigates an effective method of feature integration for correlation filters, which jointly learns filters, as well as importance maps in each frame. These importance maps borrow the advantages of different features, aiming to achieve complementary traits and improve robustness. Moreover, for each feature, an importance map is shared by all its channels to avoid overfitting. In addition, we introduce a regularization term for the importance maps and use the penalty factor to control the significance of features. Based on handcrafted and CNN features, we implement two trackers, which achieve a competitive performance compared with several state-of-the-art trackers.
#1277

Learning Robust Gaussian Process Regression for Visual Tracking
Linyu Zheng, Ming Tang, Jinqiao Wang

Motion and Tracking

Recent developments of Correlation Filter based trackers (CF trackers) have attracted much attention because of their top performance. However, the boundary effect imposed by the basic periodic assumption in their fast optimization seriously degrades the performance of CF trackers. Although many recent works have relaxed the boundary effect in CF trackers, the cost is that they cannot utilize the kernel trick to improve the accuracy further. In this paper, we propose a novel Gaussian Process Regression based tracker (GPRT) which is a conceptually natural tracking approach. Compared to all the existing CF trackers, the boundary effect is eliminated thoroughly and the kernel trick can be employed in our GPRT. In addition, we present two efficient and effective update methods for our GPRT. Experiments are performed on two public datasets: OTB-2013 and OTB-2015. Without bells and whistles, on these two datasets, our GPRT obtains 84.1% and 79.2% in mean overlap precision, respectively, outperforming all the existing trackers with hand-crafted features.
#1372

Long-Term Human Motion Prediction by Modeling Motion Context and Enhancing Motion Dynamics
Yongyi Tang, Lin Ma, Wei Liu, Wei-Shi Zheng

Motion and Tracking

Human motion prediction aims at generating future frames of human motion based on an observed sequence of skeletons. Recent methods employ the latest hidden states of a recurrent neural network (RNN) to encode the historical skeletons, which can only address short-term prediction. In this work, we propose to model the motion context by summarizing the historical human motion with respect to the current prediction. A modified highway unit (MHU) is proposed for efficiently eliminating motionless joints and estimating the next pose given the motion context. Furthermore, we enhance the motion dynamics by minimizing the gram matrix loss for long-term motion prediction. Experimental results show that the proposed model can promisingly forecast future human movements and yields superior performance over related state-of-the-art approaches. Moreover, specifying the motion context with the activity labels enables our model to perform human motion transfer.
#2490

Layered Optical Flow Estimation Using a Deep Neural Network with a Soft Mask
Xi Zhang, Di Ma, Xu Ouyang, Shanshan Jiang, Lin Gan, Gady Agam

Motion and Tracking

Using a layered representation for motion estimation has the advantage of being able to cope with discontinuities and occlusions. In this paper, we learn to estimate optical flow by combining a layered motion representation with deep learning. Instead of pre-segmenting the image into layers, the proposed approach automatically generates a layered representation of optical flow using the proposed soft-mask module. The essential components of the soft-mask module are maxout and fuse operations, which enable a disjoint layered representation of optical flow and more accurate flow estimation. We show that by using masks the motion estimate results in a quadratic function of input features in the output layer. The proposed soft-mask module can be added to any existing optical flow estimation network by replacing its flow output layer. In this work, we use FlowNet as the base network to which we add the soft-mask module. The resulting network is tested on three well-known benchmarks with both supervised and unsupervised flow estimation tasks. Evaluation results show that the proposed network achieves better results compared with the original FlowNet.
#3066

Do not Lose the Details: Reinforced Representation Learning for High Performance Visual Tracking
Qiang Wang, Mengdan Zhang, Junliang Xing, Jin Gao, Weiming Hu, Steve Maybank

Motion and Tracking

This work presents a novel end-to-end trainable CNN model for high performance visual object tracking. It learns both low-level fine-grained representations and a high-level semantic embedding space in a mutually reinforced way, and a multi-task learning strategy is proposed to perform the correlation analysis on representations from both levels. In particular, a fully convolutional encoder-decoder network is designed to reconstruct the original visual features from the semantic projections to preserve all the geometric information. Moreover, the correlation filter layer working on the fine-grained representations leverages a global context constraint for accurate object appearance modeling. The correlation filter in this layer is updated online efficiently without network fine-tuning. Therefore, the proposed tracker benefits from two complementary effects: the adaptability of the fine-grained correlation analysis and the generalization capability of the semantic embedding. Extensive experimental evaluations on four popular benchmarks demonstrate its state-of-the-art performance.
#474

Evaluating Brush Movements for Chinese Calligraphy: A Computer Vision Based Approach
Pengfei Xu, Lei Wang, Ziyu Guan, Xia Zheng, Xiaojiang Chen, Zhanyong Tang, Dingyi Fang, Xiaoqing Gong, Zheng Wang

Motion and Tracking

Chinese calligraphy is a popular, highly esteemed art form in the Chinese cultural sphere and worldwide. Ink brushes are the traditional writing tool for Chinese calligraphy and the subtle nuances of brush movements have a great impact on the aesthetics of the written characters. However, mastering the brush movement is a challenging task for many calligraphy learners as it requires many years' practice and expert supervision. This paper presents a novel approach to help Chinese calligraphy learners to quantify the quality of brush movements without expert involvement. Our approach extracts the brush trajectories from a video stream; it then compares them with example templates of reputed calligraphers to produce a score for the writing quality. We achieve this by first developing a novel neural network to extract the spatial and temporal movement features from the video stream. We then employ methods developed in the computer vision and signal processing domains to track the brush movement trajectory and calculate the score. We conducted extensive experiments and user studies to evaluate our approach. Experimental results show that our approach is highly accurate in identifying brush movements, yielding an average accuracy of 90%, and the generated score is within a 3% error of the one given by human experts.
#596

Unsupervised Learning based Jump-Diffusion Process for Object Tracking in Video Surveillance
Xiaobai Liu, Donovan Lo, Chau Thuan

Motion and Tracking

This paper presents a principled way of dealing with occlusions in visual tracking, which is a long-standing issue in computer vision but largely remains unsolved. As the major innovation, we develop a learning-based jump-diffusion process to jointly track object locations and estimate their visibility statuses over time. In particular, our method employs a set of jump dynamics to change objects' visibility statuses and a set of diffusion dynamics to track objects in videos. Different from the traditional jump-diffusion process that stochastically generates dynamics, we utilize deep policy functions to determine the best dynamic at the present step and learn the optimal policies from raw videos using reinforcement learning methods. Our method is capable of tracking objects with severe occlusions in crowded scenes and thus recovers the complete trajectories of objects that undergo multiple interactions with others. We evaluate the proposed method on challenging video sequences and compare it to alternative methods. Significant improvements are obtained particularly for the videos including frequent interactions or occlusions.

Tuesday 17 11:20 - 12:45 ML-LGM - Learning Generative Models (K11)

Chair: Xi Peng

#3582

MEGAN: Mixture of Experts of Generative Adversarial Networks for Multimodal Image Generation
David Keetae Park, Seungjoo Yoo, Hyojin Bahng, Jaegul Choo, Noseong Park

Learning Generative Models

Recently, generative adversarial networks (GANs) have shown promising performance in generating realistic images. However, they often struggle in learning complex underlying modalities in a given dataset, resulting in poor-quality generated images. To mitigate this problem, we present a novel approach called mixture of experts GAN (MEGAN), an ensemble approach of multiple generator networks. Each generator network in MEGAN specializes in generating images with a particular subset of modalities, e.g., an image class. Instead of incorporating a separate step of handcrafted clustering of multiple modalities, our proposed model is trained through end-to-end learning of multiple generators via gating networks, which are responsible for choosing the appropriate generator network for a given condition. We adopt the categorical reparameterization trick for a categorical decision to be made in selecting a generator while maintaining the flow of the gradients. We demonstrate that individual generators learn different and salient subparts of the data and achieve a multiscale structural similarity (MS-SSIM) score of 0.2470 for CelebA and a competitive unsupervised inception score of 8.33 in CIFAR-10.
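The categorical reparameterization trick mentioned above is commonly realised as the Gumbel-softmax relaxation; the sketch below assumes that variant and shows only the sampling step in plain Python. The gating logits, the temperature value, and the function name are illustrative assumptions, and no gradients are computed here.

```python
import math
import random

def gumbel_softmax(logits, temperature, rng):
    """Draw a relaxed one-hot sample over len(logits) categories.

    Gumbel(0, 1) noise is added to each logit and a temperature-scaled
    softmax is applied; lower temperatures give samples closer to one-hot.
    """
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    m = max(noisy)
    exps = [math.exp(x - m) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
gate_logits = [0.0, 0.0, 5.0]  # hypothetical gating scores for 3 generators
sample = gumbel_softmax(gate_logits, temperature=0.5, rng=rng)
chosen = max(range(len(sample)), key=sample.__getitem__)
```

Because the sample is a smooth function of the logits for fixed noise, an automatic differentiation framework could propagate gradients through the generator choice, which is the property the abstract relies on.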
#2115

Geometric Enclosing Networks
Trung Le, Hung Vu, Tu Dinh Nguyen, Dinh Phung

Learning Generative Models

Training models to generate data has increasingly attracted research attention and become important in modern world applications. We propose in this paper a new geometry-based optimization approach to address this problem. Orthogonal to current state-of-the-art density-based approaches, most notably VAE and GAN, we present a fresh new idea that borrows the principle of minimal enclosing ball to train a generator G(z) in such a way that both training and generated data, after being mapped to the feature space, are enclosed in the same sphere. We develop theory to guarantee that the mapping is bijective so that its inverse from feature space to data space results in expressive nonlinear contours to describe the data manifold, hence ensuring that the generated data also lie on the data manifold learned from training data. Our model enjoys a nice geometric interpretation, hence termed Geometric Enclosing Networks (GEN), and possesses some key advantages over its rivals, namely a simple and easy-to-control optimization formulation, avoidance of mode collapse, and efficient learning of the data manifold representation in a completely unsupervised manner. We conducted extensive experiments on synthetic and real-world datasets to illustrate the behaviors, strengths and weaknesses of our proposed GEN, in particular its ability to handle multi-modal data and the quality of generated data.
#2302

Generative Warfare Nets: Ensemble via Adversaries and Collaborators
Honglun Zhang, Liqiang Xiao, Wenqing Chen, Yongkun Wang, Yaohui Jin

Learning Generative Models

Generative Adversarial Nets are a powerful method for training generative models of complex data, where a Generator and a Discriminator confront each other and are optimized in a two-player minimax manner. In this paper, we propose the Generative Warfare Nets (GWN) that involve multiple generators and multiple discriminators from two sides to exploit the advantages of Ensemble Learning. We maintain the authorities for the generators and the discriminators to enhance inter-side interactions, and utilize the mechanisms of imitation and innovation to model intra-side interactions among the generators, where they can not only learn from but also compete with each other. Extensive experiments on three natural image datasets show that GWN can achieve state-of-the-art Inception scores and produce diverse high-quality synthetic results.
#1046

Generative Adversarial Positive-Unlabelled Learning
Ming Hou, Brahim Chaib-draa, Chao Li, Qibin Zhao

Learning Generative Models

In this work, we consider the task of classifying binary positive-unlabeled (PU) data. The existing discriminative learning based PU models attempt to seek an optimal reweighting strategy for U data, so that a decent decision boundary can be found. However, given limited P data, the conventional PU models tend to suffer from overfitting when adapted to very flexible deep neural networks. In contrast, we are the first to introduce a totally new paradigm to attack the binary PU task, from the perspective of generative learning by leveraging the powerful generative adversarial networks (GAN). Our generative positive-unlabeled (GenPU) framework incorporates an array of discriminators and generators that are endowed with different roles in simultaneously producing positive and negative realistic samples. We provide theoretical analysis to justify that, at equilibrium, GenPU is capable of recovering both positive and negative data distributions. Moreover, we show GenPU is generalizable and closely related to semi-supervised classification. Given rather limited P data, experiments on both synthetic and real-world datasets demonstrate the effectiveness of our proposed framework. With infinite realistic and diverse sample streams generated from GenPU, a very flexible classifier can then be trained using deep neural networks.
#3631

Joint Generative Moment-Matching Network for Learning Structural Latent Code
Hongchang Gao, Heng Huang

Learning Generative Models

Generative Moment-Matching Network (GMMN) is a deep generative model, which employs maximum mean discrepancy as the objective to learn model parameters. However, this model can only generate samples, failing to infer the latent code from samples for downstream tasks. In this paper, we propose a novel Joint Generative Moment-Matching Network (JGMMN), which learns the structural latent code for unsupervised inference. Specifically, JGMMN has a generation network for the generation task and an inference network for the inference task. We first reformulate this model as a problem of matching two joint distributions. To solve this problem, we propose to use the Joint Maximum Mean Discrepancy (JMMD) as the objective to learn these two networks simultaneously. Furthermore, we propose a novel multi-modal regularization to enforce the consistency between the sample distribution and the inferred latent code distribution. Finally, extensive experiments on both synthetic and real-world datasets have verified the effectiveness and correctness of our proposed JGMMN.
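
As a rough illustration of the maximum mean discrepancy objective named in this abstract, the sketch below computes the standard biased estimate of squared MMD between two small samples. It assumes a Gaussian kernel with a hand-picked bandwidth; the function names and the choice of kernel are illustrative assumptions, and the joint variant (JMMD) proposed in the paper is not covered.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD between samples xs and ys."""
    def mean_kernel(a, b):
        # Average kernel value over all pairs drawn from a and b.
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Identical samples give (numerically) zero; distant samples give a value near 2.
same = [(0.0,), (1.0,)]
near_zero = mmd_squared(same, same)
far_apart = mmd_squared([(0.0,)], [(10.0,)])
```

A moment-matching generator would be trained to reduce this quantity between generated and real samples; the training loop itself is outside the scope of this sketch.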
#1659

Unsupervised Disentangled Representation Learning with Analogical Relations
Zejian Li, Yongchuan Tang, Yongxing He

Learning Generative Models

Learning the disentangled representation of interpretable generative factors of data is one of the foundations to allow artificial intelligence to think like people. In this paper, we propose the analogical training strategy for the unsupervised disentangled representation learning in generative models. The analogy is one of the typical cognitive processes, and our proposed strategy is based on the observation that sample pairs in which one is different from the other in one specific generative factor show the same analogical relation. Thus, the generator is trained to generate sample pairs from which a designed classifier can identify the underlying analogical relation. In addition, we propose a disentanglement metric called the subspace score, which is inspired by subspace learning methods and does not require supervised information. Experiments show that our proposed training strategy allows the generative models to find the disentangled factors, and that our methods can give competitive performances as compared with the state-of-the-art methods.
#6

MIXGAN: Learning Concepts from Different Domains for Mixture Generation
Guang-Yuan Hao, Hong-Xing Yu, Wei-Shi Zheng

Learning Generative Models

In this work, we present an interesting attempt at mixture generation: absorbing different image concepts (e.g., content and style) from different domains and thus generating a new domain with learned concepts. In particular, we propose a mixture generative adversarial network (MIXGAN). MIXGAN learns concepts of content and style from two domains respectively, and thus can join them for mixture generation in a new domain, i.e., generating images with content from one domain and style from another. MIXGAN overcomes the limitation of current GAN-based models, which either generate new images in the same domain as observed in the training stage, or require off-the-shelf content templates for transfer or translation. Extensive experimental results demonstrate the effectiveness of MIXGAN as compared to related state-of-the-art GAN-based models.

Tuesday 17 11:20 - 13:00 KR-WEB1 - Knowledge Representation and the Web: Description Logics, Ontologies (C7)

Chair: Thomas Lukasiewicz

#1609

From Conjunctive Queries to Instance Queries in Ontology-Mediated Querying
Cristina Feier, Carsten Lutz, Frank Wolter

Knowledge Representation and the Web: Description Logics, Ontologies

We consider ontology-mediated queries (OMQs) based on expressive description logics of the ALC family and (unions of) conjunctive queries, studying the rewritability into OMQs based on instance queries (IQs). Our results include exact characterizations of when such a rewriting is possible and tight complexity bounds for deciding rewritability. We also give a tight complexity bound for the related problem of deciding whether a given MMSNP sentence (in other words: the complement of a monadic disjunctive Datalog program) is equivalent to a constraint satisfaction problem.
#2968

Reverse Engineering Queries in Ontology-Enriched Systems: The Case of Expressive Horn Description Logic Ontologies
Víctor Gutiérrez-Basulto, Jean Christoph Jung, Leif Sabellek

Knowledge Representation and the Web: Description Logics, Ontologies

We introduce the query-by-example (QBE) paradigm for query answering in the presence of ontologies. Intuitively, QBE permits non-expert users to explore the data by providing examples of the information they (do not) want, which the system then generalizes into a query. Formally, we study the following question: given a knowledge base and sets of positive and negative examples, is there a query that returns all positive but none of the negative examples? We focus on description logic knowledge bases with ontologies formulated in Horn-ALCI and (unions of) conjunctive queries. Our main contributions are characterizations, algorithms and tight complexity bounds for QBE.
#1946

Horn-Rewritability vs PTime Query Evaluation in Ontology-Mediated Querying
Andre Hernich, Carsten Lutz, Fabio Papacchini, Frank Wolter

Knowledge Representation and the Web: Description Logics, Ontologies

In ontology-mediated querying with an expressive description logic L, two desirable properties of a TBox T are (1) being able to replace T with a TBox formulated in the Horn-fragment of L without affecting the answers to conjunctive queries, and (2) that every conjunctive query can be evaluated in PTime w.r.t. T. We investigate in which cases (1) and (2) are equivalent, finding that the answer depends on whether the unique name assumption (UNA) is made, on the description logic under consideration, and on the nesting depth of quantifiers in the TBox. We also clarify the relationship between query evaluation with and without UNA and consider natural variations of property (1).
#162

Fast Compliance Checking in an OWL2 Fragment
Piero A. Bonatti

Knowledge Representation and the Web: Description Logics, Ontologies

We illustrate a formalization of data usage policies in a fragment of OWL2. It can be used to encode (i) a company's data protection policy, (ii) data subjects' consent to data processing, and (iii) part of the GDPR (the forthcoming European Data Protection Regulation). Then a company's policy can be checked for compliance with data subjects' consent and with part of the GDPR by means of subsumption queries. We provide a complete and tractable structural subsumption algorithm for compliance checking and prove the intractability of a natural generalization of the policy language.
#184

On Concept Forgetting in Description Logics with Qualified Number Restrictions
Yizheng Zhao, Renate Schmidt

Knowledge Representation and the Web: Description Logics, Ontologies

This paper presents a practical method for computing solutions of concept forgetting in the description logic ALCOQ(neg,and,or), basic ALC extended with nominals, qualified number restrictions, role negation, role conjunction and role disjunction. The method is based on a non-trivial generalisation of Ackermann's Lemma, and attempts to compute either semantic solutions of concept forgetting or uniform interpolants in ALCOQ(neg,and,or). It is so far the only approach to concept forgetting in description logics with number restrictions plus nominals, as well as in description logics with ABoxes. Results of an evaluation with a prototypical implementation have shown that the method was successful in more than 90% of the test cases from a large corpus of biomedical ontologies. In only 13.2% of these cases the solutions were semantic solutions.
#1763

Embracing Change by Abstraction Materialization Maintenance for Large ABoxes
Markus Brenner, Birte Glimm

Knowledge Representation and the Web: Description Logics, Ontologies

Abstraction Refinement is a recently introduced technique which allows for reducing materialization of an ontology with a large ABox to materialization of a smaller (compressed) 'abstraction' of this ontology. In this paper, we show how Abstraction Refinement can be adopted for incremental ABox materialization by combining it with the well-known DRed algorithm for materialization maintenance. Such a combination is non-trivial and to preserve soundness and completeness, already Horn ALCHI requires more complex abstractions. Nevertheless, we show that significant benefits can be obtained for synthetic and real-world ontologies.
#3680

Inconsistency-Tolerant Ontology-Based Data Access Revisited: Taking Mappings into Account
Meghyn Bienvenu

Knowledge Representation and the Web: Description Logics, Ontologies

Inconsistency-tolerant query answering in the presence of ontologies has received considerable attention in recent years. However, existing work assumes that the data is expressed using the vocabulary of the ontology and is therefore not directly applicable to ontology-based data access (OBDA), where relational data is connected to the ontology via mappings. This motivates us to revisit existing results in the wider context of OBDA with mappings. After formalizing the problem, we perform a detailed analysis of the data complexity of inconsistency-tolerant OBDA for ontologies formulated in DL-Lite and other data-tractable description logics, considering three different semantics (AR, IAR, and brave), two notions of repairs (subset and symmetric difference), and two classes of global-as-view (GAV) mappings. We show that adding plain GAV mappings does not affect data complexity, but there is a jump in complexity if mappings with negated atoms are considered.
#3609

Two Approaches to Ontology Aggregation Based on Axiom Weakening
Daniele Porello, Nicolas Troquard, Rafael Peñaloza, Roberto Confalonieri, Pietro Galliani, Oliver Kutz

Knowledge Representation and the Web: Description Logics, Ontologies

Axiom weakening is a novel technique that allows for fine-grained repair of inconsistent ontologies. In a multi-agent setting, integrating ontologies corresponding to multiple agents may lead to inconsistencies. Such inconsistencies can be resolved after the integrated ontology has been built, or their generation can be prevented during ontology generation. We implement and compare these two approaches. First, we study how to repair an inconsistent ontology resulting from a voting-based aggregation of views of heterogeneous agents. Second, we prevent the generation of inconsistencies by letting the agents engage in a turn-based rational protocol about the axioms to be added to the integrated ontology. We instantiate the two approaches using real-world ontologies and compare them by measuring the levels of satisfaction of the agents w.r.t. the ontology obtained by the two procedures.

Tuesday 17 11:20 - 13:00 CSAT-SAT - Satisfiability (K2)

Chair: Sebastian Ordyniak

#1063

Boosting MCSes Enumeration
Éric Grégoire, Yacine Izza, Jean-Marie Lagniez

Satisfiability

The enumeration of all Maximal Satisfiable Subsets (MSSes) or all Minimal Correction Subsets (MCSes) of an unsatisfiable CNF Boolean formula is a useful and sometimes necessary step for solving a variety of important A.I. issues. Although the number of different MCSes of a CNF Boolean formula is exponential in the worst case, it remains low in many practical situations; this makes the tentative enumeration possibly successful in these latter cases. In the paper, a technique is introduced that boosts the currently most efficient practical approaches to enumerate MCSes. It implements a model rotation paradigm that allows the set of MCSes to be computed in a heuristically efficient way.
#1868

DMC: A Distributed Model Counter
Jean-Marie Lagniez, Pierre Marquis, Nicolas Szczepanski

Satisfiability

We present and evaluate DMC, a distributed model counter for propositional CNF formulae based on the state-of-the-art sequential model counter D4. DMC can take advantage of a (possibly large) number of sequential model counters running on (possibly heterogeneous) computing units spread over a network of computers. For ensuring an efficient workload distribution, the model counting task is shared between the model counters following a policy close to work stealing. The number and the sizes of the messages which are exchanged by the jobs are kept small. The results obtained show DMC as a much more efficient counter than D4, the distribution of the computation yielding large improvements for some benchmarks. DMC appears also as a serious challenger to the parallel model counter CountAntom and to the distributed model counter dCountAntom.
#1947

On the Satisfiability Threshold of Random Community-Structured SAT
Dina Barak-Pelleg, Daniel Berend

Satisfiability

For both historical and practical reasons, the Boolean satisfiability problem (SAT) has become one of central importance in computer science. One type of instance arises when the clauses are chosen uniformly at random (random SAT). Here, a major problem, recently solved for sufficiently large clause length, is the satisfiability threshold conjecture. The value of this threshold is known exactly only for clause length 2, and there has been a lot of research concerning its value for arbitrary fixed clause length. In this paper, we endeavor to study the satisfiability threshold for random industrial SAT. There is as yet no generally accepted model of industrial SAT, and we confine ourselves to one of the more common features of industrial SAT: the set of variables consists of a number of disjoint communities, and clauses tend to consist of variables from the same community. Our main result is that the threshold of random community-structured SAT tends to be smaller than its counterpart for random SAT. Moreover, under some conditions, this threshold even vanishes.
#3157

Conflict Directed Clause Learning for Maximum Weighted Clique Problem
Emmanuel Hebrard, George Katsirelos

Satisfiability

The maximum clique and minimum vertex cover problems are among Karp's 21 NP-complete problems, and have numerous applications: in combinatorial auctions, for computing phylogenetic trees, to predict the structure of proteins, to analyse social networks, and so forth. Currently, the best complete methods are branch & bound algorithms and rely largely on graph colouring to compute a bound. We introduce a new approach based on SAT and on the "Conflict-Driven Clause Learning" (CDCL) algorithm. We propose an efficient implementation of Babel's bound and pruning rule, as well as a novel dominance rule. Moreover, we show how to compute concise explanations for this inference. Our experimental results show that this approach is competitive and often outperforms the state of the art for finding cliques of maximum weight.
#3567

Solving Exist-Random Quantified Stochastic Boolean Satisfiability via Clause Selection
Nian-Ze Lee, Yen-Shi Wang, Jie-Hong R. Jiang

Satisfiability

Stochastic Boolean satisfiability (SSAT) is an expressive language to formulate decision problems with randomness. Solving SSAT formulas has the same PSPACE-complete computational complexity as solving quantified Boolean formulas (QBFs). Despite its broad applications and profound theoretical values, SSAT has received relatively little attention compared to QBF. In this paper, we focus on exist-random quantified SSAT formulas, also known as E-MAJSAT, which is a special fragment of SSAT commonly applied in probabilistic conformant planning, posteriori hypothesis, and maximum expected utility. Based on clause selection, a recently proposed QBF technique, we propose an algorithm to solve E-MAJSAT. Moreover, our method can provide an approximate solution to E-MAJSAT with a lower bound when an exact answer is too expensive to compute. Experiments show that the proposed algorithm achieves significant performance gains and memory savings over the state-of-the-art SSAT solvers on a number of benchmark formulas, and provides useful lower bounds for cases where prior methods fail to compute exact answers.
#3839

Divide and Conquer: Towards Faster Pseudo-Boolean Solving
Jan Elffers, Jakob Nordström

Satisfiability

The last 20 years have seen dramatic improvements in the performance of algorithms for Boolean satisfiability (so-called SAT solvers), and today conflict-driven clause learning (CDCL) solvers are routinely used in a wide range of application areas. One serious shortcoming of CDCL, however, is that the underlying method of reasoning is quite weak. A tantalizing solution is to instead use stronger pseudo-Boolean (PB) reasoning, but so far the promise of exponential gains in performance has failed to materialize: the increased theoretical strength seems hard to harness algorithmically, and in many applications CDCL-based methods are still superior. We propose a modified approach to pseudo-Boolean solving based on division instead of the saturation rule used in [Chai and Kuehlmann '05] and other PB solvers. In addition to resulting in a stronger conflict analysis, this also improves performance by keeping integer coefficient sizes down, and yields a very competitive solver as shown by the results in the Pseudo-Boolean Competitions 2015 and 2016.
#3882

Seeking Practical CDCL Insights from Theoretical SAT Benchmarks
Jan Elffers, Jesús Giráldez-Cru, Stephan Gocht, Jakob Nordström, Laurent Simon

Satisfiability

Over the last decades Boolean satisfiability (SAT) solvers based on conflict-driven clause learning (CDCL) have developed to the point where they can handle formulas with millions of variables. Yet a deeper understanding of how these solvers can be so successful has remained elusive. In this work we shed light on CDCL performance by using theoretical benchmarks, which have the attractive features of being a) scalable, b) extremal with respect to different proof search parameters, and c) theoretically easy in the sense of having short proofs in the resolution proof system underlying CDCL. This allows for a systematic study of solver heuristics and how efficiently they search for proofs. We report results from extensive experiments on a wide range of benchmarks. Our findings include several examples where theory predicts and explains CDCL behaviour, but also raise a number of intriguing questions for further study.
#5470

(Journal track) Complexity of n-Queens Completion
Ian P. Gent, Christopher Jefferson, Peter Nightingale

Satisfiability

The n-Queens problem is to place n chess queens on an n by n chessboard so that no two queens are on the same row, column or diagonal. The n-Queens Completion problem is a variant, dating to 1850, in which some queens are already placed and the solver is asked to place the rest, if possible. We show that n-Queens Completion is both NP-Complete and #P-Complete. A corollary is that any non-attacking arrangement of queens can be included as a part of a solution to a larger n-Queens problem. We introduce generators of random instances for n-Queens Completion and the closely related Blocked n-Queens and Excluded Diagonals Problem. We describe three solvers for these problems, and empirically analyse the hardness of randomly generated instances. For Blocked n-Queens and the Excluded Diagonals Problem, we show the existence of a phase transition associated with hard instances as has been seen in other NP-Complete problems, but a natural generator for n-Queens Completion did not generate consistently hard instances. The significance of this work is that the n-Queens problem has been very widely used as a benchmark in Artificial Intelligence, but conclusions on it are often disputable because of the simple complexity of the decision problem. Our results give alternative benchmarks which are hard theoretically and empirically, but for which solving techniques designed for n-Queens need minimal or no change.
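
To make the n-Queens Completion problem statement concrete, here is a minimal backtracking sketch. It is not one of the three solvers described in the paper; the function name `complete_queens` and the row-to-column dictionary encoding of pre-placed queens are assumptions made for illustration.

```python
def complete_queens(n, placed):
    """Try to extend pre-placed queens to a full n-Queens solution.

    `placed` maps row -> column for queens already on the board.
    Returns a complete row -> column dict, or None if no completion exists.
    """
    cols, diag_down, diag_up = set(), set(), set()
    for r, c in placed.items():
        if c in cols or (r - c) in diag_down or (r + c) in diag_up:
            return None  # the pre-placed queens already attack each other
        cols.add(c)
        diag_down.add(r - c)
        diag_up.add(r + c)
    solution = dict(placed)

    def place(row):
        if row == n:
            return True
        if row in placed:
            return place(row + 1)  # this row is fixed by the instance
        for c in range(n):
            if c in cols or (row - c) in diag_down or (row + c) in diag_up:
                continue
            cols.add(c)
            diag_down.add(row - c)
            diag_up.add(row + c)
            solution[row] = c
            if place(row + 1):
                return True
            # Undo the tentative placement and try the next column.
            cols.discard(c)
            diag_down.discard(row - c)
            diag_up.discard(row + c)
            del solution[row]
        return False

    return solution if place(0) else None


# A queen at row 0, column 1 of a 4x4 board can be completed;
# a queen in the corner cannot.
completable = complete_queens(4, {0: 1})
impossible = complete_queens(4, {0: 0})
```

Plain backtracking like this takes exponential time in the worst case, which is consistent with the NP-completeness result stated in the abstract.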

Tuesday 17 11:20 - 13:00 NLP-SAA - Sentiment Analysis and Argument Mining (T2)

Chair: Serena Villata

#900

Beyond Polarity: Interpretable Financial Sentiment Analysis with Hierarchical Query-driven Attention
Ling Luo, Xiang Ao, Feiyang Pan, Jin Wang, Tong Zhao, Ningzi Yu, Qing He

Sentiment Analysis and Argument Mining

Sentiment analysis has played a significant role in financial applications in recent years. The informational and emotive aspects of news texts may affect the prices, volatilities, volume of trades, and even potential risks of financial subjects. Previous studies in this field mainly focused on identifying polarity (e.g. positive or negative). However, as financial decisions broadly require justifications, a plausible polarity alone cannot provide enough evidence for human decision making. Hence an explainable solution is in urgent demand. In this paper, we present an interpretable neural net framework for financial sentiment analysis. First, we design a hierarchical model to learn the representation of a document from multiple granularities. In addition, we propose a query-driven attention mechanism to satisfy the unique characteristics of financial documents. With the domain-specific questions provided by the financial analysts, we can discover different spotlights for queries from different aspects. We conduct extensive experiments on a real-world dataset. The results demonstrate that our framework can learn better representation of the document and unearth meaningful clues in response to different users' preferences. It also outperforms the state-of-the-art methods on sentiment prediction of financial documents.
#1270

Text Emotion Distribution Learning via Multi-Task Convolutional Neural Network
Yuxiang Zhang, Jiamei Fu, Dongyu She, Ying Zhang, Senzhang Wang, Jufeng Yang

Sentiment Analysis and Argument Mining

Emotion analysis of on-line user generated textual content is important for natural language processing and social media analytics tasks. Most previous emotion analysis approaches focus on identifying users’ emotional states from text by classifying emotions into one of a finite set of categories, e.g., joy, surprise, anger and fear. However, emotion analysis is inherently ambiguous, since a single sentence can evoke multiple emotions with different intensities. To address this problem, we introduce emotion distribution learning and propose a multi-task convolutional neural network for text emotion analysis. The end-to-end framework optimizes the distribution prediction and classification tasks simultaneously, which is able to learn robust representations for the distribution dataset with annotations of different voters. While most work adopts the majority voting scheme for the ground truth labeling, we also propose a lexicon-based strategy to generate distributions from a single label, which provides prior information for the emotion classification. Experiments conducted on five public text datasets (i.e., SemEval, Fairy Tales, ISEAR, TEC, CBET) demonstrate that our proposed method performs favorably against the state-of-the-art approaches.
#2377

Aspect Term Extraction with History Attention and Selective Transformation
Xin Li, Lidong Bing, Piji Li, Wai Lam, Zhimou Yang

Sentiment Analysis and Argument Mining

Aspect Term Extraction (ATE), a key sub-task in Aspect-Based Sentiment Analysis, aims to extract explicit aspect expressions from online user reviews. We present a new framework for tackling ATE. It can exploit two useful clues, namely opinion summary and aspect detection history. Opinion summary is distilled from the whole input sentence, conditioned on each current token for aspect prediction, and thus the tailor-made summary can help aspect prediction on this token. On the other hand, the aspect detection history information is distilled from the previous aspect predictions, and it can leverage the coordinate structure and tagging schema constraints to upgrade the aspect prediction. Experimental results over four benchmark datasets clearly demonstrate that our framework can outperform all state-of-the-art methods.
#2831

A Hierarchical End-to-End Model for Jointly Improving Text Summarization and Sentiment Classification
Shuming Ma, Xu Sun, Junyang Lin, Xuancheng Ren

Sentiment Analysis and Argument Mining

Text summarization and sentiment classification both aim to capture the main ideas of the text but at different levels. Text summarization is to describe the text within a few sentences, while sentiment classification can be regarded as a special type of summarization which "summarizes" the text into an even more abstract fashion, i.e., a sentiment class. Based on this idea, we propose a hierarchical end-to-end model for joint learning of text summarization and sentiment classification, where the sentiment classification label is treated as the further "summarization" of the text summarization output. Hence, the sentiment classification layer is put upon the text summarization layer, and a hierarchical structure is derived. Experimental results on Amazon online reviews datasets show that our model achieves better performance than the strong baseline systems on both abstractive summarization and sentiment classification.
#3168

Transition-based Adversarial Network for Cross-lingual Aspect Extraction
Wenya Wang, Sinno Jialin Pan

Sentiment Analysis and Argument Mining

In fine-grained opinion mining, the task of aspect extraction involves the identification of explicit product features in customer reviews. This task has been widely studied in some major languages, e.g., English, but has seldom been addressed in other minor languages due to the lack of annotated corpora. To address this, we develop a novel deep model to transfer knowledge from a source language with labeled training data to a target language without any annotations. Different from cross-lingual sentiment classification, aspect extraction across languages requires more fine-grained adaptation. To this end, we utilize a transition-based mechanism that reads one word at a time and forms a series of configurations that represent the status of the whole sentence. We represent each configuration as a continuous feature vector and align these representations from different languages into a shared space through an adversarial network. In addition, syntactic structures are also integrated into the deep model to achieve more syntactically-sensitive adaptations. The proposed method is end-to-end and achieves state-of-the-art performance on English, French and Spanish restaurant review datasets.
#3276

Aspect Sentiment Classification with both Word-level and Clause-level Attention Networks
Jingjing Wang, Jie Li, Shoushan Li, Yangyang Kang, Min Zhang, Luo Si, Guodong Zhou

Sentiment Analysis and Argument Mining

Aspect sentiment classification, a challenging task in sentiment analysis, has been attracting more and more attention in recent years. In this paper, we highlight the need for incorporating the importance degrees of both words and clauses inside a sentence and propose a hierarchical network with both word-level and clause-level attentions to aspect sentiment classification. Specifically, we first adopt sentence-level discourse segmentation to segment a sentence into several clauses. Then, we leverage multiple Bi-directional LSTM layers to encode all clauses and propose a word-level attention layer to capture the importance degrees of words in each clause. Third and finally, we leverage another Bi-directional LSTM layer to encode the outputs from the former layers and propose a clause-level attention layer to capture the importance degrees of all the clauses inside a sentence. Experimental results on the laptop and restaurant datasets from SemEval-2015 demonstrate the effectiveness of our proposed approach to aspect sentiment classification.
#4342

Learning to Give Feedback: Modeling Attributes Affecting Argument Persuasiveness in Student Essays
Zixuan Ke, Winston Carlile, Nishant Gurrapadi, Vincent Ng

Sentiment Analysis and Argument Mining

Argument persuasiveness is one of the most important dimensions of argumentative essay quality, yet it is little studied in automated essay scoring research. Using a recently released corpus of essays that are simultaneously annotated with argument components, argument persuasiveness scores, and attributes of argument components that impact an argument’s persuasiveness, we design and train the first set of neural models that predict the persuasiveness of an argument and its attributes in a student essay, enabling useful feedback to be provided to students on why their arguments are (un)persuasive in addition to how persuasive they are.
#5471

(Journal track) Distributional Correspondence Indexing for Cross-Lingual and Cross-Domain Sentiment Classification
Alejandro Moreo Fernández, Andrea Esuli, Fabrizio Sebastiani

Sentiment Analysis and Argument Mining

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a “target” domain when the only available training data belongs to a different “source” domain. In this extended abstract, we briefly describe our new DA method called Distributional Correspondence Indexing (DCI) for sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. The experiments we have conducted show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification.

Tuesday 17 11:20 - 13:00 ML-LPR - Learning Preferences or Rankings (C2)

Chair: Yukino Baba

#2980

High-dimensional Similarity Learning via Dual-sparse Random Projection
Dezhong Yao, Peilin Zhao, Tuan-Anh Nguyen Pham, Gao Cong

Learning Preferences or Rankings

We investigate how to adopt dual random projection for high-dimensional similarity learning. For a high-dimensional similarity learning problem, projection is usually adopted to map high-dimensional features into a low-dimensional space in order to reduce the computational cost. However, dimensionality reduction methods sometimes result in unstable performance because the solution they find is suboptimal in the original space. In this paper, we propose a dual random projection framework for similarity learning to recover the original optimal solution from the subspace optimal solution. Previous dual random projection methods usually make strong assumptions about the data, which needs to be low rank or have a large margin. Those assumptions limit dual random projection applications in similarity learning. Thus, we adopt a dual-sparse regularized random projection method that introduces a sparse regularizer into the reduced dual problem. As the original dual solution is a sparse one, applying a sparse regularizer in the reduced space relaxes the low-rank assumption. Experimental results show that our method enjoys higher effectiveness and efficiency than state-of-the-art solutions.
#2738

Modeling Contemporaneous Basket Sequences with Twin Networks for Next-Item Recommendation
Duc-Trong Le, Hady W. Lauw, Yuan Fang

Learning Preferences or Rankings

Our interactions with an application frequently leave a heterogeneous and contemporaneous trail of actions and adoptions (e.g., clicks, bookmarks, purchases). Given a sequence of a particular type (e.g., purchases), referred to as the target sequence, we seek to predict the next item expected to appear beyond this sequence. This task is known as next-item recommendation. We hypothesize two means for improvement. First, within each time step, a user may interact with multiple items (a basket), with potential latent associations among them. Second, predicting the next item in the target sequence may be helped by also learning from another supporting sequence (e.g., clicks). We develop three twin network structures modeling the generation of both target and support basket sequences. One based on "Siamese networks" facilitates full sharing of parameters between the two sequence types. The other two based on "fraternal networks" facilitate partial sharing of parameters. Experiments on real-world datasets show significant improvements upon baselines relying on one sequence type.
#2323

Task-Guided and Semantic-Aware Ranking for Academic Author-Paper Correlation Inference
Chuxu Zhang, Lu Yu, Xiangliang Zhang, Nitesh V. Chawla

Learning Preferences or Rankings

We study the problem of author-paper correlation inference in big scholarly data, which is to effectively infer potentially correlated works for researchers using historical records. Unlike supervised learning algorithms that predict the relevance score of an author-paper pair via time- and memory-consuming feature engineering, network embedding methods automatically learn node representations that can be further used to infer author-paper correlation. However, most current models suffer from two limitations: (1) they produce general-purpose embeddings that are independent of the specific task; (2) they are usually based on network structure but lack awareness of content semantics. To address these drawbacks, we propose a task-guided and semantic-aware ranking model. First, the historical interactions among all correlated author-paper pairs are formulated as a pairwise ranking loss. Next, the paper's semantic embedding, encoded by a gated recurrent neural network, is used together with the author's latent feature to score each author-paper pair in the ranking loss. Finally, a heterogeneous relations integrative learning module is designed to further augment the model. The evaluation results of extensive experiments on the well-known AMiner dataset demonstrate that the proposed model reaches significantly better performance compared to a number of baselines.
#443

A Brand-level Ranking System with the Customized Attention-GRU Model
Yu Zhu, Junxiong Zhu, Jie Hou, Yongliang Li, Beidou Wang, Ziyu Guan, Deng Cai

Learning Preferences or Rankings

In e-commerce websites like Taobao, brand is playing a more important role in influencing users' click and purchase decisions, partly because users are now attaching more importance to the quality of products and brand is an indicator of quality. However, existing ranking systems are not specifically designed to satisfy this kind of demand. Some design tricks may partially alleviate this problem, but still cannot provide satisfactory results or may create additional interaction cost. In this paper, we design the first brand-level ranking system to address this problem. The key challenge of this system is how to sufficiently exploit users' rich behavior in e-commerce websites to rank the brands. In our solution, we first perform feature engineering specifically tailored to the personalized brand ranking problem and then rank the brands with an adapted Attention-GRU model containing three important modifications. Note that our proposed modifications can also apply to many other machine learning models on various tasks. We conduct a series of experiments to evaluate the effectiveness of our proposed ranking model and test the response to the brand-level ranking system from real users on a large-scale e-commerce platform, i.e., Taobao.
#1662

Attentional Image Retweet Modeling via Multi-Faceted Ranking Network Learning
Zhou Zhao, Lingtao Meng, Jun Xiao, Min Yang, Fei Wu, Deng Cai, Xiaofei He, Yueting Zhuang

Learning Preferences or Rankings

Retweet prediction is a challenging problem in social media sites (SMS). In this paper, we study the problem of image retweet prediction in social media, which predicts the image sharing behavior in which a user reposts image tweets from their followees. Unlike previous studies, we learn a user preference ranking model from users' past retweeted image tweets in SMS. We first propose a heterogeneous image retweet modeling network (IRM) that exploits users' past retweeted image tweets with associated contexts, their following relations in SMS and the preferences of their followees. We then develop a novel attentional multi-faceted ranking network learning framework with multi-modal neural networks for the proposed heterogeneous IRM network to learn the joint image tweet representations and user preference representations for the prediction task. Extensive experiments on a large-scale dataset from Twitter show that our method achieves better performance than other state-of-the-art solutions to the problem.
#2338

Generalization Bounds for Regularized Pairwise Learning
Yunwen Lei, Shao-Bo Lin, Ke Tang

Learning Preferences or Rankings

Pairwise learning refers to learning tasks with the associated loss functions depending on pairs of examples. Recently, pairwise learning has received increasing attention since it covers many machine learning schemes, e.g., metric learning, ranking and AUC maximization, in a unified framework. In this paper, we establish a unified generalization error bound for regularized pairwise learning without either Bernstein conditions or capacity assumptions. We apply this general result to typical learning tasks including distance metric learning and ranking, for each of which our discussion is able to improve the state-of-the-art results.
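Illustrative sketch (not taken from the paper): the defining feature of pairwise learning, a loss that depends on pairs of examples, can be shown with two standard pairwise quantities for binary labels. The function names, the hinge surrogate and the default margin are assumptions chosen for illustration only.

```python
from itertools import product


def _positive_negative_pairs(scores, labels):
    """Return all (positive score, negative score) pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pairs = list(product(pos, neg))
    if not pairs:
        raise ValueError("need at least one positive and one negative example")
    return pairs


def pairwise_hinge_loss(scores, labels, margin=1.0):
    """Average hinge loss over pairs: penalize positives not ranked
    above negatives by at least `margin`."""
    pairs = _positive_negative_pairs(scores, labels)
    return sum(max(0.0, margin - (p - n)) for p, n in pairs) / len(pairs)


def empirical_auc(scores, labels):
    """Fraction of correctly ordered pairs (ties count as one half)."""
    pairs = _positive_negative_pairs(scores, labels)
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return correct / len(pairs)
```

Both functions average over pairs rather than over single examples, which is what distinguishes this setting from pointwise learning.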
#933

Convolutional Neural Networks based Click-Through Rate Prediction with Multiple Feature Sequences
Patrick P. K. Chan, Xian Hu, Lili Zhao, Daniel S. Yeung, Dapeng Liu, Lei Xiao

Learning Preferences or Rankings

Convolutional Neural Network (CNN) achieved satisfying performance in click-through rate (CTR) prediction in recent studies. Since features used in CTR prediction have no meaningful sequence in nature, the features can be arranged in any order. As CNN learns the local information of a sample, the feature sequence may influence its performance significantly. However, this problem has not been fully investigated. This paper firstly investigates whether and how the feature sequence affects the performance of the CNN-based CTR prediction method. As the data distribution of CTR prediction changes with time, the best current sequence may not be suitable for future data. Two multi-sequence models are proposed to learn the information provided by different sequences. The first model learns all sequences using a single feature learning module, while each sequence is learnt individually by a feature learning module in the second one. Moreover, a method of generating a set of embedding sequences which aims to consider the combined influence of all feature pairs on feature learning is also introduced. The experiments are conducted to demonstrate the effectiveness and stability of our proposed models in the offline and online environment on both the benchmark Avazu dataset and a real commercial dataset.
#2804

A Bayesian Latent Variable Model of User Preferences with Item Context
Aghiles Salah, Hady W. Lauw

Learning Preferences or Rankings

Personalized recommendation has proven to be very promising in modeling the preference of users over items. However, most existing work in this context focuses primarily on modeling user-item interactions, which tend to be very sparse. We propose to further leverage the item-item relationships that may reflect various aspects of items that guide users' choices. Intuitively, items that occur within the same "context" (e.g., browsed in the same session, purchased in the same basket) are likely related in some latent aspect. Therefore, accounting for the item's context would complement the sparse user-item interactions by extending a user's preference to other items of similar aspects. To realize this intuition, we develop Collaborative Context Poisson Factorization (C2PF), a new Bayesian latent variable model that seamlessly integrates contextual relationships among items into a personalized recommendation approach. We further derive a scalable variational inference algorithm to fit C2PF to preference data. Empirical results on real-world datasets show evident performance improvements over strong factorization models.

Tuesday 17 11:20 - 13:00 MLA-NET - Machine Learning Applications: Networks (C3)

Chair: Chuan Shi

#713

Efficient Attributed Network Embedding via Recursive Randomized Hashing
Wei Wu, Bin Li, Ling Chen, Chengqi Zhang

Machine Learning Applications: Networks

Attributed network embedding aims to learn a low-dimensional representation for each node of a network, considering both attributes and structure information of the node. However, the learning based methods usually involve substantial cost in time, which makes them impractical without the help of a powerful workhorse. In this paper, we propose a simple yet effective algorithm, named NetHash, to solve this problem only with moderate computing capacity. NetHash employs the randomized hashing technique to encode shallow trees, each of which is rooted at a node of the network. The main idea is to efficiently encode both attributes and structure information of each node by recursively sketching the corresponding rooted tree from bottom (i.e., the predefined highest-order neighboring nodes) to top (i.e., the root node), and particularly, to preserve as much information closer to the root node as possible. Our extensive experimental results show that the proposed algorithm, which does not need learning, runs significantly faster than the state-of-the-art learning-based network embedding methods while achieving competitive or even better performance in accuracy.
#833

ANOMALOUS: A Joint Modeling Approach for Anomaly Detection on Attributed Networks
Zhen Peng, Minnan Luo, Jundong Li, Huan Liu, Qinghua Zheng

Machine Learning Applications: Networks

The key point of anomaly detection on attributed networks lies in the seamless integration of network structure information and attribute information. A vast majority of existing works are mainly based on the Homophily assumption that implies the nodal attribute similarity of connected nodes. Nonetheless, this assumption is untenable in practice as the existence of noisy and structurally irrelevant attributes may adversely affect the anomaly detection performance. Despite the fact that recent attempts perform subspace selection to address this issue, these algorithms treat subspace selection and anomaly detection as two separate steps which often leads to suboptimal solutions. In this paper, we investigate how to fuse attribute and network structure information more synergistically to avoid the adverse effects brought by noisy and structurally irrelevant attributes. Methodologically, we propose a novel joint framework to conduct attribute selection and anomaly detection as a whole based on CUR decomposition and residual analysis. By filtering out noisy and irrelevant node attributes, we perform anomaly detection with the remaining representative attributes. Experimental results on both synthetic and real-world datasets corroborate the effectiveness of the proposed framework.
#1144

Galaxy Network Embedding: A Hierarchical Community Structure Preserving Approach
Lun Du, Zhicong Lu, Yun Wang, Guojie Song, Yiming Wang, Wei Chen

Machine Learning Applications: Networks

Network embedding is a method of learning a low-dimensional vector representation of network vertices under the condition of preserving different types of network properties. Previous studies mainly focus on preserving structural information of vertices at a particular scale, like neighbor information or community information, but cannot preserve the hierarchical community structure, which would enable the network to be easily analyzed at various scales. Inspired by the hierarchical structure of galaxies, we propose the Galaxy Network Embedding (GNE) model, which formulates an optimization problem with spherical constraints to describe the hierarchical community structure preserving network embedding. More specifically, we present an approach of embedding communities into a low dimensional spherical surface, the center of which represents the parent community they belong to. Our experiments reveal that the representations from GNE preserve the hierarchical community structure and show advantages in several applications such as vertex multi-class classification and network visualization. The source code of GNE is available online.
#1182

Power-law Distribution Aware Trust Prediction
Xiao Wang, Ziwei Zhang, Jing Wang, Peng Cui, Shiqiang Yang

Machine Learning Applications: Networks

Trust prediction, aiming to predict the trust relations between users in a social network, is key to helping users discover reliable information. Many trust prediction methods are proposed based on the low-rank assumption of a trust network. However, one typical property of the trust network is that the trust relations follow the power-law distribution, i.e., few users are trusted by many other users, while most tail users have few trustors. Due to these tail users, the fundamental low-rank assumption made by existing methods is seriously violated and becomes unrealistic. In this paper, we propose a simple yet effective method to address the problem of the violated low-rank assumption. Instead of discovering the low-rank component of the trust network alone, we learn a sparse component of the trust network to describe the tail users simultaneously. With both the learned low-rank and sparse components, the trust relations in the whole network can be better captured. Moreover, the transitive closure structure of the trust relations is also integrated into our model. We then derive an effective iterative algorithm to infer the parameters of our model, along with the proof of correctness. Extensive experimental results on real-world trust networks demonstrate the superior performance of our proposed method over state-of-the-art methods.
#1956

Dynamic Network Embedding: An Extended Approach for Skip-gram based Network Embedding
Lun Du, Yun Wang, Guojie Song, Zhicong Lu, Junshan Wang

Machine Learning Applications: Networks

Network embedding, as an approach to learning low-dimensional representations of vertices, has proved extremely useful in many applications. Many state-of-the-art network embedding methods based on the Skip-gram framework are efficient and effective. However, these methods mainly focus on static network embedding and cannot naturally generalize to the dynamic environment. In this paper, we propose a stable dynamic embedding framework with high efficiency. It is an extension of the Skip-gram based network embedding methods that, in theory, preserves the optimality of their objective. Our model can not only generalize to new vertex representations, but also update the most affected original vertex representations as the network evolves. Multi-class classification on three real-world networks demonstrates that our model can update the vertex representations efficiently while matching the performance of retraining. In addition, the visualization experiment illustrates that our model is capable of avoiding drift of the embedding space.
#4371

Feature Hashing for Network Representation Learning
Qixiang Wang, Shanfeng Wang, Maoguo Gong, Yue Wu

Machine Learning Applications: Networks

The goal of network representation learning is to embed nodes so as to encode the proximity structures of a graph into a continuous low-dimensional feature space. In this paper, we propose a novel algorithm called node2hash based on feature hashing for generating node embeddings. This approach follows the encoder-decoder framework. There are two main mapping functions in this framework. The first is an encoder to map each node into high-dimensional vectors. The second is a decoder to hash these vectors into a lower dimensional feature space. More specifically, we firstly derive a proximity measurement called expected distance as target which combines position distribution and co-occurrence statistics of nodes over random walks so as to build a proximity matrix, then introduce a set of T different hash functions into feature hashing to generate uniformly distributed vector representations of nodes from the proximity matrix. Compared with the existing state-of-the-art network representation learning approaches, node2hash shows a competitive performance on multi-class node classification and link prediction tasks on three real-world networks from various domains.
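Illustrative sketch (not the node2hash algorithm itself): the generic feature hashing step, which maps a high-dimensional sparse vector into a fixed number of buckets using a hash function, might look as follows. The helper names, the use of MD5 as a deterministic hash, and the signed-bucket variant are assumptions for illustration; the abstract's expected-distance proximity matrix and its set of T hash functions are not reproduced here.

```python
import hashlib


def _stable_hash(value, salt):
    """Deterministic 64-bit hash of `value`, varied by `salt`."""
    digest = hashlib.md5(f"{salt}:{value}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hash_features(sparse_vector, dim, seed=0):
    """Project {feature_id: weight} into a dense list of length `dim`.

    Each feature is sent to one bucket; a second hash picks a sign so
    that collisions tend to cancel rather than accumulate.
    """
    out = [0.0] * dim
    for feature_id, weight in sparse_vector.items():
        bucket = _stable_hash(feature_id, f"bucket-{seed}") % dim
        sign = 1.0 if _stable_hash(feature_id, f"sign-{seed}") % 2 == 0 else -1.0
        out[bucket] += sign * weight
    return out
```

Different values of the hypothetical `seed` parameter give different hash functions, so several hashed views of the same vector can be produced and concatenated.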
#3065

Discrete Network Embedding
Xiaobo Shen, Shirui Pan, Weiwei Liu, Yew-Soon Ong, Quan-Sen Sun

Machine Learning Applications: Networks

Network embedding aims to seek low-dimensional vector representations for network nodes while preserving the network structure. The network embedding is typically represented as continuous vectors, which imposes formidable challenges in storage and computation costs, particularly in large-scale applications. To address the issue, this paper proposes a novel discrete network embedding (DNE) for more compact representations. In particular, DNE learns short binary codes to represent each node. The Hamming similarity between two binary embeddings is then employed to closely approximate the ground-truth similarity. A novel discrete multi-class classifier is also developed to expedite classification. Moreover, we propose to jointly learn the discrete embedding and classifier within a unified framework to improve the compactness and discrimination of network embedding. Extensive experiments on node classification consistently demonstrate that DNE exhibits lower storage and computational complexity than state-of-the-art network embedding methods, while obtaining competitive classification results.
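Illustrative sketch (not from the paper): the Hamming similarity between two binary codes, and the reason binary codes are cheap to compare, can be shown in a few lines. The function names and the choice to normalize by code length are assumptions for illustration.

```python
def hamming_similarity(code_a, code_b):
    """Fraction of positions at which two equal-length binary codes agree."""
    if len(code_a) != len(code_b):
        raise ValueError("codes must have the same length")
    if not code_a:
        raise ValueError("codes must be non-empty")
    matches = sum(1 for a, b in zip(code_a, code_b) if a == b)
    return matches / len(code_a)


def hamming_distance_packed(x, y):
    """Hamming distance of two codes packed into non-negative integers:
    XOR marks the differing bits, then the set bits are counted."""
    return bin(x ^ y).count("1")
```

With codes packed into machine words, a comparison reduces to one XOR and a bit count, which is the kind of saving over floating-point vectors that the abstract refers to.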
#2060

Sampling for Approximate Bipartite Network Projection
Nesreen Ahmed, Nick Duffield, Liangzhen Xia

Machine Learning Applications: Networks

Bipartite graphs manifest as a stream of edges that represent transactions, e.g., purchases by retail customers. Recommender systems employ neighborhood-based measures of node similarity, such as the pairwise number of common neighbors (CN) and related metrics. While the number of node pairs that share neighbors is potentially enormous, only a relatively small proportion of them have many common neighbors. This motivates finding a weighted sampling approach to preferentially sample these node pairs. This paper presents a new sampling algorithm that provides a fixed size unbiased estimate of the similarity matrix resulting from a bipartite edge stream projection. The algorithm has two components. First, it maintains a reservoir of sampled bipartite edges with sampling weights that favor selection of high similarity nodes. Second, arriving edges generate a stream of similarity updates, based on their adjacency with the current sample. These updates are aggregated in a second reservoir sample-based stream aggregator to yield the final unbiased estimate. Experiments on real world graphs show that a 10% sample at each stage yields estimates of high similarity edges with weighted relative errors of about 1%.
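Illustrative sketch (not the paper's sampling algorithm): the exact quantity being estimated, the common-neighbor count for every pair of nodes on one side of a bipartite graph, can be computed directly on small inputs. The function name and the (customer, item) edge format are assumptions for illustration.

```python
from collections import defaultdict
from itertools import combinations


def common_neighbor_counts(edges):
    """Exact bipartite projection onto customers.

    edges: iterable of (customer, item) pairs.
    Returns {(u, v): number of items bought by both u and v}, with u < v.
    """
    buyers_of = defaultdict(set)
    for customer, item in edges:
        buyers_of[item].add(customer)

    counts = defaultdict(int)
    for buyers in buyers_of.values():
        # Every pair of customers sharing this item gains one common neighbor.
        for u, v in combinations(sorted(buyers), 2):
            counts[(u, v)] += 1
    return dict(counts)
```

The inner loop is quadratic in the number of buyers per item, which is why an exact projection becomes expensive on large streams and a sampled estimate is attractive.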

Tuesday 17 11:20 - 18:20 Competition (K14)

Angry Birds Competition

Competition

Tuesday 17 14:00 - 14:45 Invited Talk (VICTORIA)

Chair: Jerome Lang

Model-free, Model-based, and General Intelligence
Hector Geffner

Invited Talk

Tuesday 17 14:55 - 16:10 EAR5 - Early Career 5 (VICTORIA)

Chair: Qiang Yang

#5482

Mining Streaming and Temporal Data: from Representation to Knowledge
Xiangliang Zhang

Early Career 5

In this big-data era, vast amounts of continuously arriving data can be found in various fields, such as sensor networks, network management, web and financial applications. To process such data, algorithms are usually challenged by its complex structure and high volume. Representation learning facilitates data operations by providing a condensed description of the patterns underlying the data. Knowledge discovery based on the new representations will then be computationally efficient, and to a certain extent more effective, due to the removal of noise and irrelevant information in the representation learning step. In this paper, we will briefly review state-of-the-art techniques for extracting representations and discovering knowledge from streaming and temporal data, and demonstrate their performance in addressing several real application problems.
#5495

The power of convexity in deep learning
J. Zico Kolter

Early Career 5
#5494

Towards Sample Efficient Reinforcement Learning
Yang Yu

Early Career 5

Reinforcement learning is a major tool to realize intelligent agents that can autonomously adapt to the environment. With deep models, reinforcement learning has shown great potential in complex tasks such as playing games from pixels. However, current reinforcement learning techniques still suffer from requiring a huge amount of interaction data, which could result in unbearable cost in real-world applications. In this article, we share our understanding of the problem, and discuss possible ways to alleviate the sample cost of reinforcement learning, from the aspects of exploration, optimization, environment modeling, experience transfer, and abstraction. We also discuss some challenges in real-world applications, with the hope of inspiring future research.

Tuesday 17 14:55 - 16:10 KR-PS - Knowledge Representation and Planning (C7)

Chair: David Toman

#1756

Automata-Theoretic Foundations of FOND Planning for LTLf and LDLf Goals
Giuseppe De Giacomo, Sasha Rubin

Knowledge Representation and Planning

We study planning for LTLf and LDLf temporally extended goals in nondeterministic fully observable domains (FOND). We consider both strong and strong cyclic plans, and develop foundational automata-based techniques to deal with both cases. Using these techniques we provide the computational characterization of both problems, separating the complexity in the size of the domain specification from that in the size of the formula. Specifically we establish them to be EXPTIME-complete and 2EXPTIME-complete, respectively, for both problems. In doing so, we also show 2EXPTIME-hardness for strong cyclic plans, which was open.
#3368

Features, Projections, and Representation Change for Generalized Planning
Blai Bonet, Hector Geffner

Knowledge Representation and Planning

Generalized planning is concerned with the characterization and computation of plans that solve many instances at once. In the standard formulation, a generalized plan is a mapping from feature or observation histories into actions, assuming that the instances share a common pool of features and actions. This assumption, however, excludes the standard relational planning domains where actions and objects change across instances. In this work, we extend the standard formulation of generalized planning to such domains. This is achieved by projecting the actions over the features, resulting in a common set of abstract actions which can be tested for soundness and completeness, and which can be used for generating general policies such as “if the gripper is empty, pick the clear block above x and place it on the table” that achieve the goal clear(x) in any Blocksworld instance. In this policy, “pick the clear block above x” is an abstract action that may represent the action Unstack(a, b) in one situation and the action Unstack(b, c) in another. Transformations are also introduced for computing such policies by means of fully observable non-deterministic (FOND) planners. The value of generalized representations for learning general policies is also discussed.
#4265

Complexity of Scheduling Charging in the Smart Grid
Mathijs de Weerdt, Michael Albert, Vincent Conitzer, Koos van der Linden

Knowledge Representation and Planning

The problem of optimally scheduling the charging demand of electric vehicles within the constraints of the electricity infrastructure is called the charge scheduling problem. The models of the charging speed, horizon, and charging demand determine the computational complexity of the charge scheduling problem. We show that for about 20 variants the problem is either in P or weakly NP-hard and dynamic programs exist to compute optimal solutions. About 10 other variants of the problem are strongly NP-hard, presenting a potentially significant obstacle to their use in practical situations of scale. An experimental study establishes up to what parameter values the dynamic programs can determine optimal solutions in a couple of minutes.
#1927

Small Undecidable Problems in Epistemic Planning
Sébastien Lê Cong, Sophie Pinchinat, François Schwarzentruber

Knowledge Representation and Planning

Epistemic planning extends classical planning with knowledge and is based on dynamic epistemic logic (DEL). The epistemic planning problem is undecidable in general. We exhibit a small undecidable subclass of epistemic planning over 2-agent S5 models with a fixed repertoire of one action, 6 propositions and a fixed goal. We furthermore consider a variant of the epistemic planning problem where the initial knowledge state is an automatic structure, hence possibly infinite. In that case, we show the epistemic planning problem with 1 public action and 2 propositions to be undecidable, while it is known to be decidable with public actions over finite models. Our results are obtained by reducing the reachability problem over small universal cellular automata. While our reductions yield a goal formula that displays the common knowledge operator, we show, for each of our considered epistemic problems, a reduction into an epistemic planning problem for a common-knowledge-operator-free goal formula by using 2 additional actions.
#2715

Multi-agent Epistemic Planning with Common Knowledge
Qiang Liu, Yongmei Liu

Knowledge Representation and Planning

In the past decade, multi-agent epistemic planning has received much attention from both dynamic logic and planning communities. Common knowledge is an essential part of multi-agent modal logics, and plays an important role in coordination and interaction of multiple agents. However, existing implementations of multi-agent epistemic planning provide very limited support for common knowledge, basically static propositional common knowledge. Our work aims to extend an existing multi-agent epistemic planning framework based on higher-order belief change with the capability to deal with common knowledge. We propose a novel normal form for multi-agent KD45 logic with common knowledge. We propose satisfiability solving, revision and update algorithms for this normal form. Based on our algorithms, we implemented a multi-agent epistemic planner with common knowledge called MEPC. Our planner successfully generated solutions for several domains that demonstrate the typical usage of common knowledge.
#1239

PEORL: Integrating Symbolic Planning and Hierarchical Reinforcement Learning for Robust Decision-Making
Fangkai Yang, Daoming Lyu, Bo Liu, Steven Gustafson

Knowledge Representation and Planning

Reinforcement learning and symbolic planning have both been used to build intelligent autonomous agents. Reinforcement learning relies on learning from interactions with the real world, which often requires an infeasibly large amount of experience. Symbolic planning relies on manually crafted symbolic knowledge, which may not be robust to domain uncertainties and changes. In this paper we present a unified framework, PEORL, that integrates symbolic planning with hierarchical reinforcement learning (HRL) to cope with decision-making in dynamic environments with uncertainties. Symbolic plans are used to guide the agent's task execution and learning, and the learned experience is fed back to symbolic knowledge to improve planning. This method leads to rapid policy search and robust symbolic plans in complex domains. The framework is tested on benchmark domains of HRL.

Tuesday 17 14:55 - 16:10 MAS-GSC - Game Theory and Social Choice (C8)

Chair: Maria Polukarov

#1741

Strategyproof and Fair Matching Mechanism for Union of Symmetric M-convex Constraints
Yuzhe Zhang, Kentaro Yahiro, Nathanaël Barrot, Makoto Yokoo

Game Theory and Social Choice

In this paper, we identify a new class of distributional constraints defined as a union of symmetric M-convex sets, which can represent a variety of real-life constraints in two-sided matching settings. Since M-convexity is not closed under union, a union of symmetric M-convex sets does not belong to this well-behaved class of constraints in general. Thus, developing a fair and strategyproof mechanism that can handle this class is challenging. We present a novel mechanism called Quota Reduction Deferred Acceptance (QRDA), which repeatedly applies the standard DA mechanism by sequentially reducing artificially introduced maximum quotas. We show that QRDA is fair and strategyproof when handling a union of symmetric M-convex sets. Furthermore, in comparison to a baseline mechanism called Artificial Cap Deferred Acceptance (ACDA), QRDA always obtains a weakly better matching for students and, experimentally, performs better in terms of nonwastefulness.
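Illustrative sketch (not QRDA itself): the standard student-proposing Deferred Acceptance (DA) mechanism with maximum quotas, which the abstract says QRDA applies repeatedly, might be written as below. The function name and the dictionary-based input format are assumptions for illustration; the quota-reduction schedule of QRDA is not shown.

```python
def deferred_acceptance(student_prefs, school_prefs, quotas):
    """Student-proposing DA.

    student_prefs: {student: [schools, most preferred first]}
    school_prefs:  {school: [students, most preferred first]}
    quotas:        {school: maximum number of students}
    Returns {student: school or None}.
    """
    rank = {
        school: {stu: i for i, stu in enumerate(order)}
        for school, order in school_prefs.items()
    }
    next_choice = {stu: 0 for stu in student_prefs}
    held = {school: [] for school in school_prefs}
    free = list(student_prefs)

    while free:
        stu = free.pop()
        prefs = student_prefs[stu]
        if next_choice[stu] >= len(prefs):
            continue  # list exhausted: the student stays unmatched
        school = prefs[next_choice[stu]]
        next_choice[stu] += 1
        if stu not in rank[school]:
            free.append(stu)  # unacceptable to this school: try the next one
            continue
        # Tentatively hold the proposal, then reject the worst if over quota.
        held[school].append(stu)
        held[school].sort(key=lambda s: rank[school][s])
        if len(held[school]) > quotas[school]:
            free.append(held[school].pop())

    matching = {stu: None for stu in student_prefs}
    for school, students in held.items():
        for stu in students:
            matching[stu] = school
    return matching
```

Acceptances stay tentative until no student is left to propose, which is what makes the outcome independent of the order in which students are processed.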
#2693

Exact Algorithms and Complexity of Kidney Exchange
Mingyu Xiao, Xuanbei Wang

Game Theory and Social Choice

Kidney Exchange is an approach to donor kidney transplantation where patients with incompatible donors swap kidneys to receive a compatible kidney. Since it was first put forward in 1986, an increasing number of people have received a life-saving kidney as Kidney Exchange has grown in popularity, since patients have more opportunities to be saved in this way. This growth is making the problem of optimally matching patients to donors more difficult to solve. The central problem is the NP-hard problem of finding the largest vertex-disjoint packing of cycles and chains in a graph that represents the compatibility between patients and donors, where, due to limited human resources, there may be constraints on the maximum length of cycles and chains. This paper mainly contributes theoretical algorithms for this problem with and without length constraints (the restricted and free versions). We give: 1. A single-exponential exact algorithm based on subset convolution for the two versions; 2. An FPT algorithm for the free version, parameterized by the number of vertex "types" in the graph.
#3408

Facility Reallocation on the Line
Bart de Keijzer, Dominik Wojtczak

Game Theory and Social Choice

We consider a multi-stage facility reallocation problem on the real line, where a facility is moved between stages based on the locations reported by n agents. The aim of the reallocation mechanism is to minimize the social cost, i.e., the sum over the total distance between the facility and all agents at all stages, plus the cost incurred for moving the facility. We study this problem in both the offline setting and the online setting. In the offline case the mechanism has full knowledge of the agent locations in all future stages, whereas in the online setting the mechanism does not know these future locations and must decide the location of the facility on a stage-by-stage basis. For both cases, we derive the optimal mechanism, where for the online setting we show that its competitive ratio is (n+2)/(n+1). As neither of these mechanisms turns out to be strategyproof, we propose another, strategyproof mechanism which has a competitive ratio of (n+3)/(n+1) for odd n and (n+4)/n for even n, which we conjecture to be the best possible. We also consider a generalization with multiple facilities and weighted agents, for which we show that the optimum can be computed in polynomial time for a fixed number of facilities.
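The social cost described in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function name `social_cost`, the `start` parameter, and the assumption that moving the facility costs exactly the distance moved are all illustrative choices.

```python
from typing import Sequence


def social_cost(facility: Sequence[float],
                agents: Sequence[Sequence[float]],
                start: float) -> float:
    """Sum of agent-to-facility distances over all stages plus movement cost.

    Assumptions (hypothetical, for illustration only): moving the facility
    costs exactly the distance moved, and the facility sits at `start`
    before the first stage.
    """
    total = 0.0
    previous = start
    for position, reported in zip(facility, agents):
        total += sum(abs(position - a) for a in reported)  # distance to agents
        total += abs(position - previous)                  # cost of moving
        previous = position
    return total


# Two stages, three agents; the facility stays at 1.0, then moves to 2.0.
cost = social_cost([1.0, 2.0], [[0.0, 1.0, 2.0], [1.0, 2.0, 4.0]], start=1.0)
print(cost)
```

Under these assumptions, an offline mechanism would choose the whole sequence `facility` with all stages known, while an online mechanism would have to fix each position before seeing later stages.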
#3413

Negotiation Strategies for Agents with Ordinal Preferences
Sefi Erlich, Noam Hazon, Sarit Kraus

Game Theory and Social Choice

Negotiation is a very common interaction between automated agents. Many common negotiation protocols work with cardinal utilities, even though ordinal preferences, which only rank the outcomes, are easier to elicit from humans. In this work we concentrate on negotiation with ordinal preferences over a finite set of outcomes. We study an intuitive protocol for bilateral negotiation, where the two parties make offers alternately. We analyze the negotiation protocol under different settings. First, we assume that each party has full information about the other party's preference order. We provide elegant strategies that specify a sub-game perfect equilibrium for the agents. We further show how the studied negotiation protocol almost completely implements a known bargaining rule. Finally, we analyze the no information setting. We study several solution concepts that are distribution-free, and analyze both the case where neither party knows the preference order of the other party, and the case where only one party is uninformed.
#4120

Big City vs. the Great Outdoors: Voter Distribution and How It Affects Gerrymandering
Allan Borodin, Omer Lev, Nisarg Shah, Tyrone Strangway

Game Theory and Social Choice

Gerrymandering is the process by which parties manipulate boundaries of electoral districts in order to maximize the number of districts they can win. Demographic trends show an increasingly strong correlation between residence and party affiliation; some parties' supporters congregate in cities, while others stay in more rural areas. We investigate both theoretically and empirically the effect of this trend on a party's ability to gerrymander in a two-party model ("urban party" and "rural party"). Along the way, we propose a definition of the gerrymandering power of a party, and an algorithmic approach for near-optimal gerrymandering in large instances. Our results suggest that beyond a fairly small concentration of the urban party's voters, the gerrymandering power of a party depends almost entirely on the level of concentration, and not on the party's share of the population. As partisan separation grows, the gerrymandering power of both parties converges so that each party can gerrymander to get only slightly more than what its voting share warrants, bringing about, ultimately, a more representative outcome. Moreover, there seems to be an asymmetry between the gerrymandering power of the parties, with the rural party being more capable of gerrymandering.
#5137

(Sister Conferences Best Papers Track) Combinatorial Cost Sharing
Shahar Dobzinski, Shahar Ovadia

Game Theory and Social Choice

We introduce a combinatorial variant of the cost sharing problem: several services can be provided to each player and each player values every combination of services differently. A publicly known cost function specifies the cost of providing every possible combination of services. A combinatorial cost sharing mechanism is a protocol that decides which services each player gets and at what price. We look for dominant strategy mechanisms that are (economically) efficient and cover the cost, ideally without overcharging (i.e., budget balanced). Note that unlike the standard cost sharing setting, combinatorial cost sharing is a multi-parameter domain. This makes designing dominant strategy mechanisms with good guarantees a challenging task. We present the Potential Mechanism -- a combination of the VCG mechanism and a well-known tool from the theory of cooperative games: Hart and Mas-Colell's potential function. The potential mechanism is a dominant strategy mechanism that always covers the incurred cost. When the cost function is subadditive the same mechanism is also approximately efficient. Our main technical contribution shows that when the cost function is submodular the potential mechanism is approximately budget balanced in three settings: supermodular valuations, symmetric cost function and general symmetric valuations, and two players with general valuations.

Tuesday 17 14:55 - 16:10 CSAT-ML - Constraints, Satisfiability and Learning (K2)

Chair: Chen Gong

#507

Descriptive Clustering: ILP and CP Formulations with Applications
Thi-Bich-Hanh Dao, Chia-Tung Kuo, S. S. Ravi, Christel Vrain, Ian Davidson

Constraints, Satisfiability and Learning

In many settings just finding a good clustering is insufficient and an explanation of the clustering is required. If the features used to perform the clustering are interpretable then methods such as conceptual clustering can be used. However, in many applications this is not the case particularly for image, graph and other complex data. Here we explore the setting where a set of interpretable discrete tags for each instance is available. We formulate the descriptive clustering problem as a bi-objective optimization to simultaneously find compact clusters using the features and to describe them using the tags. We present our formulation in a declarative platform and show it can be integrated into a standard iterative algorithm to find all Pareto optimal solutions to the two objectives. Preliminary results demonstrate the utility of our approach on real data sets for images and electronic health care records and that it outperforms single objective and multi-view clustering baselines.
#1228

Machine Learning and Constraint Programming for Relational-To-Ontology Schema Mapping
Diego De Uña, Nataliia Rümmele, Graeme Gange, Peter Schachte, Peter J. Stuckey

Constraints, Satisfiability and Learning

The problem of integrating heterogeneous data sources into an ontology is highly relevant in the database field. Several techniques exist to approach the problem, but side constraints on the data cannot be easily implemented and thus the results may be inconsistent. In this paper we improve previous work by Taheriyan et al. [2016a] using Machine Learning (ML) to take into account inconsistencies in the data (unmatchable attributes) and encode the problem as a variation of the Steiner Tree, for which we use work by De Uña et al. [2016] in Constraint Programming (CP). Combining ML and CP achieves state-of-the-art precision, recall and speed, and provides a more flexible framework for variations of the problem.
#2569

Faster Training Algorithms for Structured Sparsity-Inducing Norm
Bin Gu, Xingwang Ju, Xiang Li, Guansheng Zheng

Constraints, Satisfiability and Learning

Structured-sparsity regularization is popular for sparse learning because of its flexibility in encoding feature structures. This paper considers a generalized version of structured-sparsity regularization (especially for the $l_1/l_{\infty}$ norm) with arbitrary group overlap. Due to the group overlap, it is time-consuming to solve the associated proximal operator. Although Mairal et al. [2010] have proposed a network-flow algorithm to solve the proximal operator, it is still time-consuming, especially in the high-dimensional setting. To address this challenge, we develop a more efficient solution for the $l_1/l_{\infty}$ group lasso with arbitrary group overlap using an Inexact Proximal-Gradient method. In each iteration, our algorithm only needs to compute an inexact solution to the proximal sub-problem, which can be done efficiently. On the theoretical side, the proposed algorithm enjoys the same global convergence rate as the exact proximal methods. Experiments demonstrate that our algorithm is much more efficient than the network-flow algorithm, while retaining similar generalization performance.
#3840

Learning SMT(LRA) Constraints using SMT Solvers
Samuel Kolb, Stefano Teso, Andrea Passerini, Luc De Raedt

Constraints, Satisfiability and Learning

We introduce the problem of learning SMT(LRA) constraints from data. SMT(LRA) extends propositional logic with (in)equalities between numerical variables. Many relevant formal verification problems can be cast as SMT(LRA) instances and SMT(LRA) has supported recent developments in optimization and counting for hybrid Boolean and numerical domains. We introduce SMT(LRA) learning, the task of learning SMT(LRA) formulas from examples of feasible and infeasible instances, and we contribute INCAL, an exact non-greedy algorithm for this setting. Our approach encodes the learning task itself as an SMT(LRA) satisfiability problem that can be solved directly by SMT solvers. INCAL is an incremental algorithm that achieves exact learning by looking only at a small subset of the data, leading to significant speed-ups. We empirically evaluate our approach on both synthetic instances and benchmark problems taken from the SMT-LIB benchmarks repository.
#3899

Learning Optimal Decision Trees with SAT
Nina Narodytska, Alexey Ignatiev, Filipe Pereira, Joao Marques-Silva

Constraints, Satisfiability and Learning

Explanations of machine learning (ML) predictions are of fundamental importance in different settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees represent an often used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Clearly, smaller trees correlate well with smaller rules, and so one challenge is to devise solutions for computing smallest size decision trees given training data. Although simple to formulate, the computation of smallest size decision trees turns out to be an extremely challenging computational problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale for publicly available datasets of practical interest.
#2772

Neural Networks for Predicting Algorithm Runtime Distributions
Katharina Eggensperger, Marius Lindauer, Frank Hutter

Constraints, Satisfiability and Learning

Many state-of-the-art algorithms for solving hard combinatorial problems in artificial intelligence (AI) include elements of stochasticity that lead to high variations in runtime, even for a fixed problem instance. Knowledge about the resulting runtime distributions (RTDs) of algorithms on given problem instances can be exploited in various meta-algorithmic procedures, such as algorithm selection, portfolios, and randomized restarts. Previous work has shown that machine learning can be used to individually predict mean, median and variance of RTDs. To establish a new state-of-the-art in predicting RTDs, we demonstrate that the parameters of an RTD should be learned jointly and that neural networks can do this well by directly optimizing the likelihood of an RTD given runtime observations. In an empirical study involving five algorithms for SAT solving and AI planning, we show that neural networks predict the true RTDs of unseen instances better than previous methods, and can even do so when only few runtime observations are available per training instance.

Tuesday 17 14:55 - 16:10 NLP-CLA - Sentence and Text Classification, Text Segmentation (T2)

Chair: Mausam

#189

Differentiated Attentive Representation Learning for Sentence Classification
Qianrong Zhou, Xiaojie Wang, Xuan Dong

Sentence and Text Classification, Text Segmentation

Attention-based models have been shown to be effective in learning representations for sentence classification. They are typically equipped with a multi-hop attention mechanism. However, existing multi-hop models still suffer from the problem of paying too much attention to the most frequently noticed words, which might not be important for classifying the current sentence. There is also no explicit, effective way to shift the attention away from a wrong part of the sentence. In this paper, we alleviate this problem by proposing a differentiated attentive learning model. It is composed of two branches of attention subnets and an example discriminator. An explicit signal with the loss information of the first attention subnet is passed on to the second one to drive them to learn different attentive preferences. The example discriminator then selects the suitable attention subnet for sentence classification. Experimental results on real and synthetic datasets demonstrate the effectiveness of our model.
#255

Jumper: Learning When to Make Classification Decision in Reading
Xianggen Liu, Lili Mou, Haotian Cui, Zhengdong Lu, Sen Song

Sentence and Text Classification, Text Segmentation

In early years, text classification was typically accomplished by feature-based classifiers; recently, neural networks, as powerful classifiers, have made it possible to work with raw input as the text stands. In this paper, we propose a novel framework, Jumper, inspired by the cognitive process of text reading, that models text classification as a sequential decision process. Basically, Jumper is a neural system that can scan a piece of text sequentially and make a classification decision at the time it chooses. Both the classification and when to make the classification are part of the decision process, which is controlled by the policy net and trained with reinforcement learning to maximize the overall classification accuracy. Experimental results show that a properly trained Jumper has the following properties: (1) It can make decisions whenever the evidence is sufficient, thereby reducing the total text reading by 30-40% and often finding the key rationale of the prediction. (2) It can achieve classification accuracy better than or comparable to state-of-the-art models on several benchmark and industrial datasets.
#555

SegBot: A Generic Neural Text Segmentation Model with Pointer Network
Jing Li, Aixin Sun, Shafiq Joty

Sentence and Text Classification, Text Segmentation

Text segmentation is a fundamental task in natural language processing that comes in two levels of granularity: (i) segmenting a document into a sequence of topical segments (topic segmentation), and (ii) segmenting a sentence into a sequence of elementary discourse units (EDU segmentation). Traditional solutions to the two tasks rely heavily on carefully designed features. The recently proposed neural models do not need manual feature engineering, but they either suffer from sparse boundary tags or cannot handle the issue of variable size output vocabulary well. We propose a generic end-to-end segmentation model called SegBot. SegBot uses a bidirectional recurrent neural network to encode the input text sequence. The model then uses another recurrent neural network together with a pointer network to select text boundaries in the input sequence. In this way, SegBot does not require hand-crafted features. More importantly, our model inherently handles the issue of variable size output vocabulary and the issue of sparse boundary tags. In our experiments, SegBot outperforms state-of-the-art models on both topic and EDU segmentation tasks.
#4344

Translations as Additional Contexts for Sentence Classification
Reinald Kim Amplayo, Kyungjae Lee, Jinyoung Yeo, Seung-won Hwang

Sentence and Text Classification, Text Segmentation

In sentence classification tasks, additional contexts, such as the neighboring sentences, may improve the accuracy of the classifier. However, such contexts are domain-dependent and thus cannot be used for another classification task with an inappropriate domain. In contrast, we propose the use of translated sentences as domain-free context that is always available regardless of the domain. We find that naive feature expansion with translations yields only marginal improvements and may decrease the performance of the classifier, because possibly inaccurate translations produce noisy sentence vectors. To address this, we present multiple context fixing attachment (MCFA), a series of modules attached to multiple sentence vectors to fix the noise in the vectors using the other sentence vectors as context. We show that our method performs competitively compared to previous models, achieving the best classification performance on multiple data sets. We are the first to use translations as domain-free contexts for sentence classification.
#769

Deep Text Classification Can be Fooled
Bin Liang, Hongcheng Li, Miaoqiang Su, Pan Bian, Xirong Li, Wenchang Shi

Sentence and Text Classification, Text Segmentation

In this paper, we present an effective method to craft text adversarial samples, revealing one important yet underestimated fact: DNN-based text classifiers are also prone to adversarial sample attacks. Specifically, confronted with different adversarial scenarios, the text items that are important for classification are identified by computing the cost gradients of the input (white-box attack) or generating a series of occluded test samples (black-box attack). Based on these items, we design three perturbation strategies, namely insertion, modification, and removal, to generate adversarial samples. The experimental results show that the adversarial samples generated by our method can successfully fool both state-of-the-art character-level and word-level DNN-based text classifiers. The adversarial samples can be perturbed to any desirable classes without compromising their utilities. At the same time, the introduced perturbation is difficult to perceive.
#696

Multiway Attention Networks for Modeling Sentence Pairs
Chuanqi Tan, Furu Wei, Wenhui Wang, Weifeng Lv, Ming Zhou

Sentence and Text Classification, Text Segmentation

Modeling sentence pairs plays a vital role in judging the relationship between two sentences, as in paraphrase identification, natural language inference, and answer sentence selection. Previous work achieves very promising results using neural networks with attention mechanisms. In this paper, we propose multiway attention networks, which employ multiple attention functions to match sentence pairs under the matching-aggregation framework. Specifically, we design four attention functions to match words in corresponding sentences. Then, we aggregate the matching information from each function, and combine the information from all functions to obtain the final representation. Experimental results demonstrate that the proposed multiway attention networks improve the results on the Quora Question Pairs, SNLI, MultiNLI, and the answer sentence selection task on the SQuAD dataset.

Tuesday 17 14:55 - 16:10 SGP-ML - Heuristic Search and Learning (T1)

Chair: Frans Oliehoek

#593

Distributed Self-Paced Learning in Alternating Direction Method of Multipliers
Xuchao Zhang, Liang Zhao, Zhiqian Chen, Chang-Tien Lu

Heuristic Search and Learning

Self-paced learning (SPL) mimics the cognitive process of humans, who generally learn from easy samples to hard ones. One key issue in SPL is that the training process required for each instance weight depends on the other samples and thus cannot easily be run in a distributed manner on a large-scale dataset. In this paper, we reformulate the self-paced learning problem into a distributed setting and propose a novel Distributed Self-Paced Learning method (DSPL) to handle large-scale datasets. Specifically, both the model and instance weights can be optimized in parallel for each batch based on a consensus alternating direction method of multipliers. We also prove the convergence of our algorithm under mild conditions. Extensive experiments on both synthetic and real datasets demonstrate that our approach is superior to existing methods.
#616

Episodic Memory Deep Q-Networks
Zichuan Lin, Tianqi Zhao, Guangwen Yang, Lintao Zhang

Heuristic Search and Learning

Reinforcement learning (RL) algorithms have made huge progress in recent years by leveraging the power of deep neural networks (DNN). Despite the success, deep RL algorithms are known to be sample inefficient, often requiring many rounds of interaction with the environment to obtain satisfactory performance. Recently, episodic memory based RL has attracted attention due to its ability to latch onto good actions quickly. In this paper, we present a simple yet effective biologically inspired RL algorithm called Episodic Memory Deep Q-Networks (EMDQN), which leverages episodic memory to supervise an agent during training. Experiments show that our proposed method leads to better sample efficiency and is more likely to find a good policy. It requires only 1/5 of the interactions of DQN to achieve many state-of-the-art performances on Atari games, significantly outperforming regular DQN and other episodic memory based RL algorithms.
#1919

Optimization based Layer-wise Magnitude-based Pruning for DNN Compression
Guiying Li, Chao Qian, Chunhui Jiang, Xiaofen Lu, Ke Tang

Heuristic Search and Learning

Layer-wise magnitude-based pruning (LMP) is a very popular method for deep neural network (DNN) compression. However, tuning the layer-specific thresholds is a difficult task, since the space of threshold candidates is exponentially large and the evaluation is very expensive. Previous methods mainly tune the thresholds by hand and require expertise. In this paper, we propose an automatic tuning approach based on optimization, named OLMP. The idea is to transform the threshold tuning problem into a constrained optimization problem (i.e., minimizing the size of the pruned model subject to a constraint on the accuracy loss), and then use powerful derivative-free optimization algorithms to solve it. To compress a trained DNN, OLMP is conducted within a new iterative pruning and adjusting pipeline. Empirical results show that OLMP can achieve the best pruning ratio on LeNet-style models (i.e., 114 times for LeNet-300-100 and 298 times for LeNet-5) compared with some state-of-the-art DNN pruning methods, and can reduce the size of an AlexNet-style network up to 82 times without accuracy loss.
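As a rough illustration of the layer-wise magnitude-based pruning step mentioned in this abstract, the following Python sketch zeroes every weight whose magnitude falls below its layer's threshold. It is not the OLMP method itself: the names `prune_layerwise` and `pruning_ratio`, the toy layers, and the hand-picked thresholds are hypothetical, and the threshold search that the paper automates is not shown.

```python
from typing import Dict, List


def prune_layerwise(layers: Dict[str, List[float]],
                    thresholds: Dict[str, float]) -> Dict[str, List[float]]:
    """Zero each weight whose absolute value is below its layer's threshold."""
    return {
        name: [w if abs(w) >= thresholds[name] else 0.0 for w in weights]
        for name, weights in layers.items()
    }


def pruning_ratio(layers: Dict[str, List[float]]) -> float:
    """Total number of weights divided by the number of nonzero weights."""
    total = sum(len(ws) for ws in layers.values())
    nonzero = sum(1 for ws in layers.values() for w in ws if w != 0.0)
    return total / nonzero


# Toy model with two layers and hand-picked (hypothetical) thresholds.
layers = {"fc1": [0.5, -0.05, 0.2, -0.9], "fc2": [0.01, -0.3]}
thresholds = {"fc1": 0.1, "fc2": 0.2}
pruned = prune_layerwise(layers, thresholds)
print(pruned, pruning_ratio(pruned))
```

A tuner in the spirit of the abstract would search over `thresholds` to maximize the pruning ratio while keeping the accuracy loss within a bound; here the thresholds are simply given.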
#2419

Three-Head Neural Network Architecture for Monte Carlo Tree Search
Chao Gao, Martin Müller, Ryan Hayward

Heuristic Search and Learning

AlphaGo Zero pioneered the concept of two-head neural networks in Monte Carlo Tree Search (MCTS), where the policy output is used for prior action probability and the state-value estimate is used for leaf node evaluation. We propose a three-head neural net architecture with policy, state- and action-value outputs, which could lead to more efficient MCTS since the neural leaf estimate can still be back-propagated in the tree with delayed node expansion and evaluation. To effectively train the newly introduced action-value head on the same game dataset as for two-head nets, we exploit the optimal relations between parent and children nodes for data augmentation and regularization. In our experiments for the game of Hex, the action-value head learning achieves an error similar to that of the state-value prediction of a two-head architecture. The resulting neural net models are then combined with the same Policy Value MCTS (PV-MCTS) implementation. We show that, due to more efficient use of neural net evaluations, PV-MCTS with three-head neural nets consistently performs better than the two-head ones, significantly outplaying the state-of-the-art player MoHex-CNN.
#3371

Master-Slave Curriculum Design for Reinforcement Learning
Yuechen Wu, Wei Zhang, Ke Song

Heuristic Search and Learning

Curriculum learning is often introduced as leverage to improve agent training for complex tasks, where the goal is to generate a sequence of easier subtasks for an agent to train on, such that final performance or learning speed is improved. However, conventional curricula are mainly designed for one agent with a fixed action space and a sequential simple-to-hard training manner. Instead, we present a novel curriculum learning strategy by introducing the concept of master-slave agents and enabling flexible action settings for agent training. Multiple agents, referred to as the master agent for the target task and slave agents for the subtasks, are trained concurrently within different action spaces by sharing a perception network with an asynchronous strategy. Extensive evaluation on the VizDoom platform demonstrates that the joint learning of the master agent and slave agents is mutually beneficial. Significant improvement is obtained over A3C in terms of learning speed and performance.
#1958

Approximation Guarantees of Stochastic Greedy Algorithms for Subset Selection
Chao Qian, Yang Yu, Ke Tang

Heuristic Search and Learning

Subset selection is a fundamental problem in many areas, which aims to select the best subset of size at most $k$ from a universe. Greedy algorithms are widely used for subset selection, and have shown good approximation performances in deterministic situations. However, their behaviors are stochastic in many realistic situations (e.g., large-scale and noisy). For general stochastic greedy algorithms, bounded approximation guarantees were obtained only for subset selection with monotone submodular objective functions, while real-world applications often involve non-monotone or non-submodular objective functions and can be subject to a more general constraint than a size constraint. This work proves their approximation guarantees in these cases, and thus largely extends the applicability of stochastic greedy algorithms.
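For orientation, the deterministic greedy baseline that this abstract refers to can be sketched in Python for a simple coverage objective under a size constraint. This is the standard greedy procedure, not the stochastic variants analysed in the paper; the function name `greedy_select`, the coverage objective, and the alphabetical tie-breaking are illustrative assumptions.

```python
from typing import Dict, List, Set


def greedy_select(candidates: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick at most k candidates, each time adding the largest marginal gain.

    The objective is the number of covered elements, which is monotone and
    submodular. Ties are broken alphabetically (an arbitrary choice).
    """
    chosen: List[str] = []
    covered: Set[int] = set()
    for _ in range(k):
        best, best_gain = None, 0
        for name in sorted(candidates):
            if name in chosen:
                continue
            gain = len(candidates[name] - covered)  # marginal coverage gain
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining candidate adds anything new
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen


candidates = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6, 7}}
selection = greedy_select(candidates, k=2)
print(selection)
```

A stochastic variant would, for example, evaluate the gain only on a random sample of candidates or with noisy objective values in each round, which is the kind of behavior whose approximation guarantees the abstract discusses.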

Tuesday 17 14:55 - 16:10 ML-ROL - Reinforcement Learning and Online Learning (K11)

Chair: Fei Fang

#3769

Exploration by Distributional Reinforcement Learning
Yunhao Tang, Shipra Agrawal

Reinforcement Learning and Online Learning

We propose a framework based on distributional reinforcement learning and recent attempts to combine Bayesian parameter updates with deep reinforcement learning. We show that our proposed framework conceptually unifies multiple previous methods in exploration. We also derive a practical algorithm that achieves efficient exploration on challenging control tasks.
#734

Finite Sample Analysis of LSTD with Random Projections and Eligibility Traces
Haifang Li, Yingce Xia, Wensheng Zhang

Reinforcement Learning and Online Learning

Policy evaluation with linear function approximation is an important problem in reinforcement learning. When facing high-dimensional feature spaces, such a problem becomes extremely hard considering the computation efficiency and quality of approximations. We propose a new algorithm, LSTD(lambda)-RP, which leverages random projection techniques and takes eligibility traces into consideration to tackle the above two challenges. We carry out theoretical analysis of LSTD(lambda)-RP, and provide meaningful upper bounds of the estimation error, approximation error and total generalization error. These results demonstrate that LSTD(lambda)-RP can benefit from random projection and eligibility traces strategies, and LSTD(lambda)-RP can achieve better performances than prior LSTD-RP and LSTD(lambda) algorithms.
#2402

Multi-modality Sensor Data Classification with Selective Attention
Xiang Zhang, Lina Yao, Chaoran Huang, Sen Wang, Mingkui Tan, Guodong Long, Can Wang

Reinforcement Learning and Online Learning

Multimodal wearable sensor data classification plays an important role in ubiquitous computing and has a wide range of applications in various scenarios from healthcare to entertainment. However, most of the existing work in this field employs domain-specific approaches and is thus ineffective in complex situations where multi-modality sensor data is collected. Moreover, the wearable sensor data is less informative than the conventional data such as texts or images. In this paper, to improve the adaptability of such classification methods across different application contexts, we turn this classification task into a game and apply a deep reinforcement learning scheme to dynamically deal with complex situations. We also introduce a selective attention mechanism into the reinforcement learning scheme to focus on the crucial dimensions of the data. This mechanism helps to capture extra information from the signal, and can thus significantly improve the discriminative power of the classifier. We carry out several experiments on three wearable sensor datasets, and demonstrate competitive performance of the proposed approach compared to several state-of-the-art baselines.
#4471

Reinforced Mnemonic Reader for Machine Reading Comprehension
Minghao Hu, Yuxing Peng, Zhen Huang, Xipeng Qiu, Furu Wei, Ming Zhou

Reinforcement Learning and Online Learning

In this paper, we introduce the Reinforced Mnemonic Reader for machine reading comprehension tasks, which enhances previous attentive readers in two aspects. First, a reattention mechanism is proposed to refine current attentions by directly accessing past attentions that are temporally memorized in a multi-round alignment architecture, so as to avoid the problems of attention redundancy and attention deficiency. Second, a new optimization approach, called dynamic-critical reinforcement learning, is introduced to extend the standard supervised method. It always encourages the prediction of a more acceptable answer so as to address the convergence suppression problem that occurs in traditional reinforcement learning algorithms. Extensive experiments on the Stanford Question Answering Dataset (SQuAD) show that our model achieves state-of-the-art results. Meanwhile, our model outperforms previous systems by over 6% in terms of both Exact Match and F1 metrics on two adversarial SQuAD datasets.
#779

Preventing Disparate Treatment in Sequential Decision Making
Hoda Heidari, Andreas Krause

Reinforcement Learning and Online Learning

We study fairness in sequential decision making environments, where at each time step a learning algorithm receives data corresponding to a new individual (e.g. a new job application) and must make an irrevocable decision about him/her (e.g. whether to hire the applicant) based on observations made so far. In order to prevent cases of disparate treatment, our time-dependent notion of fairness requires algorithmic decisions to be consistent: if two individuals are similar in the feature space and arrive during the same time epoch, the algorithm must assign them to similar outcomes. We propose a general framework for post-processing predictions made by a black-box learning model, that guarantees the resulting sequence of outcomes is consistent. We show theoretically that imposing consistency will not significantly slow down learning. Our experiments on two real-world data sets illustrate and confirm this finding in practice.
#1965

Cost-aware Cascading Bandits
Ruida Zhou, Chao Gan, Jing Yang, Cong Shen

Reinforcement Learning and Online Learning

In this paper, we propose a cost-aware cascading bandits model, a new variant of multi-armed bandits with cascading feedback, by considering the random cost of pulling arms. In each step, the learning agent chooses an ordered list of items and examines them sequentially, until a certain stopping condition is satisfied. Our objective is then to maximize the expected net reward in each step, i.e., the reward obtained in each step minus the total cost incurred in examining the items, by deciding the ordered list of items, as well as when to stop examination. We study both the offline and online settings, depending on whether the state and cost statistics of the items are known beforehand. For the offline setting, we show that the Unit Cost Ranking with Threshold 1 (UCR-T1) policy is optimal. For the online setting, we propose a Cost-aware Cascading Upper Confidence Bound (CC-UCB) algorithm, and show that the cumulative regret scales in $O(\log T)$. We also provide a lower bound for all $\alpha$-consistent policies, which scales in $\Omega(\log T)$ and matches our upper bound. The performance of the CC-UCB algorithm is evaluated with both synthetic and real-world data.
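For readers unfamiliar with upper-confidence-bound methods, the following sketch shows the textbook UCB1 rule on a plain Bernoulli bandit. It is not the CC-UCB algorithm from the abstract: there are no costs, no ordered lists, and no stopping rule here, and the arm means, horizon, and seed are illustrative assumptions.

```python
import math
import random

def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on Bernoulli arms and return the pull counts."""
    rng = random.Random(seed)
    k = len(arm_means)
    pulls = [0] * k      # number of times each arm was pulled
    totals = [0.0] * k   # sum of rewards observed for each arm
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # initialisation: pull every arm once
        else:
            # Empirical mean plus an exploration bonus that shrinks
            # as an arm is pulled more often.
            arm = max(
                range(k),
                key=lambda i: totals[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
    return pulls

# Hypothetical two-armed instance; the second arm is clearly better.
pulls = ucb1([0.2, 0.8], horizon=2000)
```

The number of pulls of the worse arm grows only logarithmically in the horizon for UCB1, which is the same order of growth as the regret bound quoted in the abstract.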

Tuesday 17 14:55 - 16:10 HAI-PUM - Personalization, User Modeling (C2)

Chair: Grzegorz J. Nalepa

#3485

Algorithms for Fair Load Shedding in Developing Countries
Olabambo I. Oluwasuji, Obaid Malik, Jie Zhang, Sarvapali D. Ramchurn

Personalization, User Modeling

Due to the limited generation capacity of power stations, many developing countries frequently resort to disconnecting large parts of the power grid from supply, a process termed load shedding. During load shedding, many homes are left without electricity, causing them inconvenience and discomfort. In this paper, we present a number of optimization heuristics that focus on pairwise and groupwise fairness, such that households (i.e. agents) are fairly allocated electricity. We evaluate the heuristics against standard fairness metrics in terms of comfort delivered to homes, as well as the number of times they are disconnected from electricity supply. Thus, we establish new benchmarks for fair load shedding schemes.
#607

Learning Sequential Correlation for User Generated Textual Content Popularity Prediction
Wen Wang, Wei Zhang, Jun Wang, Junchi Yan, Hongyuan Zha

Personalization, User Modeling

Popularity prediction of user generated textual content is critical for prioritizing information on the web, which alleviates heavy information overload for ordinary readers. Most previous studies model each content instance separately for prediction and thus overlook the sequential correlations between instances of a specific user. In this paper, we go deeper into this problem based on two observations for each user, i.e., sequential content correlation and sequential popularity correlation. We propose a novel deep sequential model called User Memory-augmented recurrent Attention Network (UMAN). This model encodes the two correlations by updating external user memories, which are further leveraged for target text representation learning and popularity prediction. The experimental results on several real-world datasets validate the benefits of considering these correlations and demonstrate that UMAN achieves the best performance among several strong competitors.
#2399

Personality-Aware Personalized Emotion Recognition from Physiological Signals
Sicheng Zhao, Guiguang Ding, Jungong Han, Yue Gao

Personalization, User Modeling

Emotion recognition methodologies from physiological signals are increasingly becoming personalized, due to the subjective responses of different subjects to physical stimuli. Existing works mainly focused on modelling the involved physiological corpus of each subject, without considering the psychological factors. The latent correlation among different subjects has also been rarely examined. We propose to investigate the influence of personality on emotional behavior in a hypergraph learning framework. Assuming that each vertex is a compound tuple (subject, stimuli), multi-modal hypergraphs can be constructed based on the personality correlation among different subjects and on the physiological correlation among corresponding stimuli. To reveal the different importance of vertices, hyperedges, and modalities, we assign weights to each of them. The emotion relevance learned on the vertex-weighted multi-modal multi-task hypergraphs is employed for emotion recognition. We carry out extensive experiments on the ASCERTAIN dataset and the results demonstrate the superiority of the proposed method.
#4075

Cross-Domain Depression Detection via Harvesting Social Media
Tiancheng Shen, Jia Jia, Guangyao Shen, Fuli Feng, Xiangnan He, Huanbo Luan, Jie Tang, Thanassis Tiropanis, Tat-Seng Chua, Wendy Hall

Personalization, User Modeling

Depression detection is a significant issue for human well-being. In previous studies, online detection has proven effective in Twitter, enabling proactive care for depressed users. Owing to cultural differences, replicating the method to other social media platforms, such as Chinese Weibo, however, might lead to poor performance because of insufficient available labeled (self-reported depression) data for model training. In this paper, we study an interesting but challenging problem of enhancing detection in a certain target domain (e.g. Weibo) with ample Twitter data as the source domain. We first systematically analyze the depression-related feature patterns across domains and summarize two major detection challenges, namely isomerism and divergency. We further propose a cross-domain Deep Neural Network model with Feature Adaptive Transformation & Combination strategy (DNN-FATC) that transfers the relevant information across heterogeneous domains. Experiments demonstrate improved performance compared to existing heterogeneous transfer methods or training directly in the target domain (over 3.4% improvement in F1), indicating the potential of our model to enable depression detection via social media for more countries with different cultural settings.
#3052

Neural Framework for Joint Evolution Modeling of User Feedback and Social Links in Dynamic Social Networks
Peizhi Wu, Yi Tu, Xiaojie Yuan, Adam Jatowt, Zhenglu Yang

Personalization, User Modeling

Modeling the evolution of user feedback and social links in dynamic social networks is of considerable significance, because it is the basis of many applications, including recommendation systems and user behavior analyses. Most of the existing methods in this area model user behaviors separately and consider only certain aspects of this problem, such as dynamic preferences of users, dynamic attributes of items, evolutions of social networks, and their partial integration. This work proposes a comprehensive general neural framework with several optimal strategies to jointly model the evolution of user feedback and social links. The framework considers the dynamic user preferences, dynamic item attributes, and time-dependent social links in time evolving social networks. Experimental results conducted on two real-world datasets demonstrate that our proposed model performs remarkably better than state-of-the-art methods.
#2280

LSTM Networks for Online Cross-Network Recommendations
Dilruk Perera, Roger Zimmermann

Personalization, User Modeling

Cross-network recommender systems use auxiliary information from multiple source networks to create holistic user profiles and improve recommendations in a target network. However, we find two major limitations in existing cross-network solutions that reduce overall recommender performance. Existing models (1) fail to capture complex non-linear relationships in user interactions, and (2) are designed for offline settings and hence are not updated online with incoming interactions to capture the dynamics in the recommender environment. We propose a novel multi-layered Long Short-Term Memory (LSTM) network based online solution to mitigate these issues. The proposed model contains three main extensions to the standard LSTM: First, an attention gated mechanism to capture long-term user preference changes. Second, a higher order interaction layer to alleviate data sparsity. Third, time aware LSTM cell gates to capture irregular time intervals between user interactions. We illustrate our solution using auxiliary information from Twitter and Google Plus to improve recommendations on YouTube. Extensive experiments show that the proposed model consistently outperforms state-of-the-art methods in terms of accuracy, diversity and novelty.

Tuesday 17 14:55 - 16:10 ML-DL - Deep Learning (C3)

Chair: Matthias Schubert

#4089

Network Approximation using Tensor Sketching
Shiva Prasad Kasiviswanathan, Nina Narodytska, Hongxia Jin

Deep Learning

Deep neural networks are powerful learning models that achieve state-of-the-art performance on many computer vision, speech, and language processing tasks. In this paper, we study a fundamental question that arises when designing deep network architectures: Given a target network architecture, can we design a 'smaller' network architecture that 'approximates' the operation of the target network? The question is, in part, motivated by the challenge of parameter reduction (compression) in modern deep neural networks, as the ever increasing storage and memory requirements of these networks pose a problem in resource constrained environments. In this work, we focus on deep convolutional neural network architectures, and propose a novel randomized tensor sketching technique that we utilize to develop a unified framework for approximating the operation of both the convolutional and fully connected layers. By applying the sketching technique along different tensor dimensions, we design changes to the convolutional and fully connected layers that substantially reduce the number of effective parameters in a network. We show that the resulting smaller network can be trained directly, and has a classification accuracy that is comparable to the original network.
#1075

Stochastic Fractional Hamiltonian Monte Carlo
Nanyang Ye, Zhanxing Zhu

Deep Learning

In this paper, we propose a novel stochastic fractional Hamiltonian Monte Carlo approach which generalizes the Hamiltonian Monte Carlo method within the framework of fractional calculus and Lévy diffusion. Due to the large "jumps" introduced by Lévy noise and momentum term, the proposed dynamics is capable of exploring the parameter space more efficiently and effectively. We have shown that the fractional Hamiltonian Monte Carlo could sample the multi-modal and high-dimensional target distribution more efficiently than the existing methods driven by Brownian diffusion. We further extend our method for optimizing deep neural networks. The experimental results show that the proposed stochastic fractional Hamiltonian Monte Carlo for training deep neural networks could converge faster than other popular optimization schemes and generalize better.
#737

HST-LSTM: A Hierarchical Spatial-Temporal Long-Short Term Memory Network for Location Prediction
Dejiang Kong, Fei Wu

Deep Learning

The wide use of positioning technology has made mining the movements of people feasible, and plenty of trajectory data have been accumulated. How to efficiently leverage these data for location prediction has become an increasingly popular research topic as it is fundamental to location-based services (LBS). The existing methods often focus either on long time (days or months) visit prediction (i.e., the recommendation of point of interest) or on real time location prediction (i.e., trajectory prediction). In this paper, we are interested in the location prediction problem in a weak real time condition and aim to predict users' movement in the next minutes or hours. We propose a Spatial-Temporal Long-Short Term Memory (ST-LSTM) model which naturally combines spatial-temporal influence into LSTM to mitigate the problem of data sparsity. Further, we employ a hierarchical extension of the proposed ST-LSTM (HST-LSTM) in an encoder-decoder manner which models the contextual historic visit information in order to boost the prediction performance. The proposed HST-LSTM is evaluated on a real world trajectory data set and the experimental results demonstrate the effectiveness of the proposed model.
#3881

Spatio-Temporal Check-in Time Prediction with Recurrent Neural Network based Survival Analysis
Guolei Yang, Ying Cai, Chandan K Reddy

Deep Learning

We introduce a novel check-in time prediction problem. The goal is to predict the time a user will check in to a given location. We formulate check-in prediction as a survival analysis problem and propose a Recurrent-Censored Regression (RCR) model. We address the key challenge of check-in data scarcity, which is due to the uneven distribution of check-ins among users/locations. Our idea is to enrich the check-in data with potential visitors, i.e., users who have not visited the location before but are likely to do so. RCR uses a recurrent neural network to learn latent representations from historical check-ins of both actual and potential visitors, which are then combined with censored regression to make predictions. Experiments show RCR outperforms state-of-the-art event time prediction techniques on real-world datasets.
#4420

Learning to Recognize Transient Sound Events using Attentional Supervision
Szu-Yu Chou, Jyh-Shing Jang, Yi-Hsuan Yang

Deep Learning

Making sense of the surrounding context and ongoing events through not only the visual inputs but also acoustic cues is critical for various AI applications. This paper presents an attempt to learn a neural network model that recognizes more than 500 different sound events from the audio part of user generated videos (UGV). Aside from the large number of categories and the diverse recording conditions found in UGV, the task is challenging because a sound event may occur only for a short period of time in a video clip. Our model specifically tackles this issue by combining a main subnet that aggregates information from the entire clip to make clip-level predictions, and a supplementary subnet that examines each short segment of the clip for segment-level predictions. As the labeled data available for model training are typically on the clip level, the latter subnet learns to pay attention to segments selectively to facilitate attentional segment-level supervision. We call our model the M&mnet, for it leverages both “M”acro (clip-level) supervision and “m”icro (segment-level) supervision derived from the macro one. Our experiments show that M&mnet works remarkably well for recognizing sound events, establishing a new state-of-the-art for DCASE17 and AudioSet data sets. Qualitative analysis suggests that our model exhibits strong gains for short events. In addition, we show that the micro subnet is computationally light and we can use multiple micro subnets to better exploit information in different temporal scales.
#1582

LC-RNN: A Deep Learning Model for Traffic Speed Prediction
Zhongjian Lv, Jiajie Xu, Kai Zheng, Hongzhi Yin, Pengpeng Zhao, Xiaofang Zhou

Deep Learning

Traffic speed prediction is known as an important but challenging problem. In this paper, we propose a novel model, called LC-RNN, to achieve more accurate traffic speed prediction than existing solutions. It takes advantage of both RNN and CNN models by a rational integration of them, so as to learn more meaningful time-series patterns that can adapt to the traffic dynamics of surrounding areas. Furthermore, since traffic evolution is restricted by the underlying road network, a network embedded convolution structure is proposed to capture topology aware features. The fusion with other information, including periodicity and context factors, is also considered to further improve accuracy. Extensive experiments on two real datasets demonstrate that our proposed LC-RNN outperforms six well-known existing methods.

Tuesday 17 16:20 - 19:00 ANAC Competition (K13)

ANAC Competition

ANAC Competition

Tuesday 17 16:40 - 18:20 SPE-EC - Special Track: Evolution of the Contours of AI (VICTORIA)

Chair: Ronen Brafman

#5202

Towards Consumer-Empowering Artificial Intelligence
Giuseppe Contissa, Francesca Lagioia, Marco Lippi, Hans-Wolfgang Micklitz, Przemyslaw Palka, Giovanni Sartor, Paolo Torroni

Special Track: Evolution of the Contours of AI

Artificial Intelligence and Law is undergoing a critical transformation. Traditionally focused on the development of expert systems and on a scholarly effort to develop theories and methods for knowledge representation and reasoning in the legal domain, this discipline is now adapting to a sudden change of scenery. No longer confined to the walls of academia, it has welcomed new actors, such as businesses and companies, who are willing to play a major role and seize new opportunities offered by the same transformational impact that recent AI breakthroughs are having on many other areas. As it happens, commercial interests create new opportunities but they also represent a potential threat to consumers, as the balance of power seems increasingly determined by the availability of data. We believe that while this transformation is still in progress, time is ripe for the next frontier of this field of study, where a new shift of balance may be enabled by tools and services that can be of service not only to businesses but also to consumers and, more generally, the civil society. We call that frontier consumer-empowering AI.
#5203

Quantifying Algorithmic Improvements over Time
Lars Kotthoff, Alexandre Fréchette, Tomasz Michalak, Talal Rahwan, Holger H. Hoos, Kevin Leyton-Brown

Special Track: Evolution of the Contours of AI

Assessing the progress made in AI and contributions to the state of the art is of major concern to the community. Recently, Frechette et al. [2016] advocated performing such analysis via the Shapley value, a concept from coalitional game theory. In this paper, we argue that while this general idea is sound, it unfairly penalizes older algorithms that advanced the state of the art when introduced, but were then outperformed by modern counterparts. Driven by this observation, we introduce the temporal Shapley value, a measure that addresses this problem while maintaining the desirable properties of the (classical) Shapley value. We use the temporal Shapley value to analyze the progress made in (i) the different versions of the Quicksort algorithm; (ii) the annual SAT competitions 2007–2014; (iii) an annual competition of Constraint Programming, namely the MiniZinc challenge 2014–2016. Our analysis reveals novel insights into the development made in these important areas of research over time.
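As background for the abstract above, the classical Shapley value can be computed directly for a small portfolio of algorithms. The sketch below implements only the classical definition (the average marginal contribution over all orderings), not the temporal variant the paper introduces; the solver names, the solved-instance sets, and the choice of "number of instances solved" as the coalition value are hypothetical.

```python
from itertools import permutations

def shapley_values(players, value):
    """Classical Shapley value: mean marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: totals[p] / len(orderings) for p in players}

# Hypothetical data: benchmark instances solved by each solver.
SOLVED = {"A": {1, 2}, "B": {2, 3}, "C": {3}}

def portfolio_value(coalition):
    """Number of instances solved by at least one solver in the coalition."""
    solved = set()
    for solver in coalition:
        solved |= SOLVED[solver]
    return len(solved)

phi = shapley_values(list(SOLVED), portfolio_value)
```

Enumerating all orderings costs factorial time, so this direct form is only practical for a handful of players; larger portfolios would need sampling or a closed form for the specific value function.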
#5204

The Facets of Artificial Intelligence: A Framework to Track the Evolution of AI
Fernando Martínez-Plumed, Bao Sheng Loe, Peter Flach, Seán Ó hÉigeartaigh, Karina Vold, José Hernández-Orallo

Special Track: Evolution of the Contours of AI

We present nine facets for the analysis of the past and future evolution of AI. Each facet also has a set of edges that can summarise different trends and contours in AI. With them, we first conduct a quantitative analysis using the information from two decades of AAAI/IJCAI conferences and around 50 years of documents from AI topics, an official database from the AAAI, illustrated by several plots. We then perform a qualitative analysis using the facets and edges, locating AI systems in the intelligence landscape and the discipline as a whole. This analytical framework provides a more structured and systematic way of looking at the shape and boundaries of AI.
#5201

On a Scientific Discipline (Once) Named AI
Wolfgang Bibel

Special Track: Evolution of the Contours of AI

The paper envisions a scientific discipline of fundamental importance comparable to Physics or Biology, recalling that a discipline of such a contour was originally intended by the founders of Artificial Intelligence (AI). AI today, however, is far from such an encompassing discipline, sharing the respective research interests with at least half a dozen other disciplines. After the analysis of this situation and its background we discuss the consequences of this splintering by means of selected challenges. We deliberate thereby what could be done to alleviate the disadvantages resulting from the current state of affairs and to leverage AI's current prominence in the public attention to re-engage in the field's broader mission.
#5206

Artificial Intelligence Conferences Closeness
Sébastien Konieczny, Emmanuel Lonca

Special Track: Evolution of the Contours of AI

We study the evolution of Artificial Intelligence conference closeness, using the coscinus tool. Coscinus computes the closeness between publication supports using the co-publication habits of authors: the more authors publish in two conferences, the closer these two conferences are. In this paper we perform an analysis of the main Artificial Intelligence conferences based on principal components analysis and clustering performed on this closeness relation.
#5205

Evolving AI from Research to Real Life – Some Challenges and Suggestions
Sandya Mannarswamy, Shourya Roy

Special Track: Evolution of the Contours of AI

Artificial Intelligence (AI) has come a long way from the stages of being just science fiction or an academic research curiosity to a point where it is poised to impact human life significantly. AI driven applications such as autonomous vehicles, medical diagnostics, conversational agents etc. are becoming a reality. In this position paper, we argue that there are certain challenges AI still needs to overcome in its evolution from Research to Real Life. We outline some of these challenges and our suggestions to address them. We provide pointers to similar issues and their resolutions in disciplines such as psychology and medicine from which the AI community can learn. More importantly, this paper is intended to focus the attention of the AI research community on translating AI research efforts into real world deployments.

Tuesday 17 16:40 - 18:20 KR-MAS2 - Knowledge Representation and Agents: Verification, Model Checking (C7)

Chair: Dengji Zhao

#2765

Model Checking Probabilistic Epistemic Logic for Probabilistic Multiagent Systems
Chen Fu, Andrea Turrini, Xiaowei Huang, Lei Song, Yuan Feng, Lijun Zhang

Knowledge Representation and Agents: Verification, Model Checking

In this work we study the model checking problem for probabilistic multiagent systems with respect to the probabilistic epistemic logic PETL, which can specify both temporal and epistemic properties. We show that under the realistic assumption of uniform schedulers, i.e., the choice of every agent depends only on its observation history, PETL model checking is undecidable. By restricting the class of schedulers to be memoryless schedulers, we show that the problem becomes decidable. More importantly, we design a novel algorithm which reduces the model checking problem into a mixed integer non-linear programming problem, which can then be solved by using an SMT solver. The algorithm has been implemented in an existing model checker and experiments are conducted on examples from the IPPC competitions.
#3182

Alternating-time Temporal Logic on Finite Traces
Francesco Belardinelli, Alessio Lomuscio, Aniello Murano, Sasha Rubin

Knowledge Representation and Agents: Verification, Model Checking

We develop a logic-based technique to analyse finite interactions in multi-agent systems. We introduce a semantics for Alternating-time Temporal Logic (for both perfect and imperfect recall) and its branching-time fragments in which paths are finite instead of infinite. We study validities of these logics and present optimal algorithms for their model-checking problems in the perfect recall case.
#3907

LTL Realizability via Safety and Reachability Games
Alberto Camacho, Christian Muise, Jorge A. Baier, Sheila A. McIlraith

Knowledge Representation and Agents: Verification, Model Checking

In this paper, we address the problem of LTL realizability and synthesis. State of the art techniques rely on so-called bounded synthesis methods, which reduce the problem to a safety game. Realizability is determined by solving synthesis in a dual game. We provide a unified view of duality, and introduce novel bounded realizability methods via reductions to reachability games. Further, we introduce algorithms, based on AI automated planning, to solve these safety and reachability games. This is the first complete approach to LTL realizability and synthesis via automated planning. Experiments illustrate that reductions to reachability games are an alternative to reductions to safety games, and show that planning can be a competitive approach to LTL realizability and synthesis.
#2881

Symbolic Synthesis of Fault-Tolerance Ratios in Parameterised Multi-Agent Systems
Panagiotis Kouvaros, Alessio Lomuscio, Edoardo Pirovano

Knowledge Representation and Agents: Verification, Model Checking

We study the problem of determining the robustness of a multi-agent system of unbounded size against specifications expressed in a temporal-epistemic logic. We introduce a procedure to synthesise automatically the maximal ratio of faulty agents that may be present at runtime for a specification to be satisfied in a multi-agent system. We show the procedure to be sound and amenable to symbolic implementation. We present an implementation and report the experimental results obtained by running this on a number of protocols from swarm robotics.
#2914

Synthesis of Controllable Nash Equilibria in Quantitative Objective Game
Shaull Almagor, Orna Kupferman, Giuseppe Perelli

Knowledge Representation and Agents: Verification, Model Checking

In Rational Synthesis, we consider a multi-agent system in which some of the agents are controllable and some are not. All agents have objectives, and the goal is to synthesize strategies for the controllable agents so that their objectives are satisfied, assuming rationality of the uncontrollable agents. Previous work on rational synthesis considers objectives in LTL, namely ones that describe on-going behaviors, and in Objective-LTL, which allows ranking of LTL formulas. In this paper, we extend rational synthesis to LTL[F] -- an extension of LTL by quality operators. The satisfaction value of an LTL[F] formula is a real value in [0,1], where the higher the value is, the higher is the quality in which the computation satisfies the specification. The extension significantly strengthens the framework of rational synthesis and enables a study of its game- and social-choice theoretic aspects. In particular, we study the price of stability and price of anarchy of the rational-synthesis game and use them to explain the cooperative and non-cooperative settings of rational synthesis. Our algorithms make use of strategy logic and decision procedures for it. Thus, we are able to handle the richer quantitative setting using existing tools. In particular, we show that the cooperative and non-cooperative versions of quantitative rational synthesis are 2EXPTIME-complete and in 3EXPTIME, respectively -- not harder than the complexity known for their Boolean analogues.
#1925

Verifying Emergence of Bounded Time Properties in Probabilistic Swarm Systems
Alessio Lomuscio, Edoardo Pirovano

Knowledge Representation and Agents: Verification, Model Checking

We introduce a parameterised semantics for reasoning about swarms as unbounded collections of agents in a probabilistic setting. We develop a method for the formal identification of emergent properties, expressed in a fragment of the probabilistic logic PCTL. We introduce algorithms for solving the related decision problems and show their correctness. We present an implementation and evaluate its performance on an ant coverage algorithm.
#3835

Reachability Analysis of Deep Neural Networks with Provable Guarantees
Wenjie Ruan, Xiaowei Huang, Marta Kwiatkowska

Knowledge Representation and Agents: Verification, Model Checking

Verifying correctness for deep neural networks (DNNs) is challenging. We study a generic reachability problem for feed-forward DNNs which, for a given set of inputs to the network and a Lipschitz-continuous function over its outputs, computes the lower and upper bounds on the function values. Because the network and the function are Lipschitz continuous, all values in the interval between the lower and upper bound are reachable. We show how to obtain the safety verification problem, the output range analysis problem and a robustness measure by instantiating the reachability problem. We present a novel algorithm based on adaptive nested optimisation to solve the reachability problem. The technique has been implemented and evaluated on a range of DNNs, demonstrating its efficiency, scalability and ability to handle a broader class of networks than state-of-the-art verification approaches.
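The role Lipschitz continuity plays in bounding output ranges can be seen in a one-dimensional sketch. This uses plain uniform grid sampling rather than the adaptive nested optimisation of the abstract, and the function, interval, and Lipschitz constant `K` below are illustrative assumptions. Every point of the interval lies within half a grid step of a sample, so the sampled minimum and maximum can be off by at most `K * h / 2`.

```python
def lipschitz_range(f, a, b, K, n=1000):
    """Guaranteed enclosure of the range of f on [a, b].

    Assumes f is K-Lipschitz on [a, b]. Samples n + 1 evenly spaced
    points and widens the sampled range by the worst-case error K * h / 2.
    """
    h = (b - a) / n
    values = [f(a + i * h) for i in range(n + 1)]
    slack = K * h / 2.0
    return min(values) - slack, max(values) + slack

# Illustrative function: |x| is 1-Lipschitz and has true range [0, 2] on [-1, 2].
lo, hi = lipschitz_range(abs, -1.0, 2.0, K=1.0)
```

Uniform grids scale poorly with input dimension, which is one reason a practical method would refine the sampling adaptively instead.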
#1183

Abstraction of Agents Executing Online and their Abilities in the Situation Calculus
Bita Banihashemi, Giuseppe De Giacomo, Yves Lespérance

Knowledge Representation and Agents: Verification, Model Checking

We develop a general framework for abstracting online behavior of an agent that may acquire new knowledge during execution (e.g., by sensing), in the situation calculus and ConGolog. We assume that we have both a high-level action theory and a low-level one that represent the agent's behavior at different levels of detail. In this setting, we define ability to perform a task/achieve a goal, and then show that under some reasonable assumptions, if the agent has a strategy by which she is able to achieve a goal at the high level, then we can refine it into a low-level strategy to do so.

Tuesday 17 16:40 - 18:20 MAS-CCC - Cooperation, Coordination, Collaboration, Coalitions (C8)

Chair: Chen Hajaj

#800

Fostering Cooperation in Structured Populations Through Local and Global Interference Strategies
The Anh Han, Simon Lynch, Long Tran-Thanh, Francisco C. Santos

Cooperation, Coordination, Collaboration, Coalitions

We study the situation of an exogenous decision-maker aiming to encourage a population of autonomous, self-regarding agents to follow a desired behaviour at a minimal cost. The primary goal is therefore to reach an efficient trade-off between pushing the agents to achieve the desired configuration while minimising the total investment. To this end, we test several interference paradigms resorting to simulations of agents facing a cooperative dilemma in a spatial arrangement. We systematically analyse and compare interference strategies rewarding local or global behavioural patterns. Our results show that taking into account the neighbourhood's local properties, such as its level of cooperativeness, can lead to a significant improvement regarding cost efficiency while guaranteeing high levels of cooperation. As such, we argue that local interference strategies are more efficient than global ones in fostering cooperation in a population of autonomous agents.
#4411

Vocabulary Alignment for Collaborative Agents: a Study with Real-World Multilingual How-to Instructions
Paula Chocron, Paolo Pareti

Cooperation, Coordination, Collaboration, Coalitions

Collaboration between heterogeneous agents typically requires the ability to communicate meaningfully. This can be challenging in open environments where participants may use different languages. Previous work proposed a technique to infer alignments between different vocabularies that uses only information about the tasks being executed, without any external resource. Until now, this approach has only been evaluated with artificially created data. We adapt this technique to protocols written by humans in natural language, which we extract from instructional webpages. In doing so, we show how to take into account challenges that arise when working with natural language labels. The quality of the alignments obtained with our technique is evaluated in terms of their effectiveness in enabling successful collaborations, using a translation dictionary as a baseline. We show how our technique outperforms the dictionary when used to interact.
#3123

Robust Norm Emergence by Revealing and Reasoning about Context: Socially Intelligent Agents for Enhancing Privacy
Nirav Ajmeri, Hui Guo, Pradeep K. Murukannaiah, Munindar P. Singh

Cooperation, Coordination, Collaboration, Coalitions

Norms describe the social architecture of a society and govern the interactions of its member agents. It may be appropriate for an agent to deviate from a norm; the deviation being indicative of a specialized norm applying under a specific context. Existing approaches for norm emergence assume simplified interactions wherein deviations are negatively sanctioned. We investigate via simulation the benefits of enriched interactions where deviating agents share selected elements of their contexts. We find that as a result (1) the norms are learned better with fewer sanctions, indicating improved social cohesion; and (2) the agents are better able to satisfy their individual goals. These results are robust under societies of varying sizes and characteristics reflecting pragmatic, considerate, and selfish agents.
#5466

(Journal track) Incentive-Compatible Mechanisms for Norm Monitoring in Open Multi-Agent Systems
Natasha Alechina, Joseph Y. Halpern, Ian A. Kash, Brian Logan

Cooperation, Coordination, Collaboration, Coalitions

We consider the problem of detecting norm violations in open multi-agent systems (MAS). In this extended abstract, we outline the approach of [Alechina et al., 2018], and show how, using ideas from scrip systems, we can design mechanisms where the agents comprising the MAS are incentivised to monitor the actions of other agents for norm violations.
#664

Explaining Multi-Criteria Decision Aiding Models with an Extended Shapley Value
Christophe Labreuche, Simon Fossier

Cooperation, Coordination, Collaboration, Coalitions

The capability to explain the result of aggregation models to decision makers is key to reinforcing user trust. In practice, Multi-Criteria Decision Aiding models are often organized in a hierarchical way, based on a tree of criteria. We present an explanation approach usable with any hierarchical multi-criteria model, based on an influence index of each attribute on the decision. A set of desirable axioms is defined. We show that there is a unique index fulfilling these axioms. This new index is an extension of the Shapley value on trees. An efficient rewriting of this index, drastically reducing the computation time, is obtained. Finally, the use of the new index is illustrated on an example.
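For orientation, the classic Shapley value that this index extends can be computed by averaging marginal contributions over all orderings of the players. The sketch below covers only the standard flat (non-hierarchical) definition, not the authors' extension to trees; the function name `shapley_values` and the toy value function are illustrative assumptions.

```python
from itertools import permutations


def shapley_values(players, value):
    """Classic Shapley value: average marginal contribution over all orderings.

    `value` maps a frozenset of players to a number. This is the standard
    flat definition, not the hierarchical extension described in the abstract.
    """
    totals = {p: 0.0 for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(with_p) - value(coalition)
            coalition = with_p
    return {p: t / len(orderings) for p, t in totals.items()}


# Hypothetical toy game: criteria "a" and "b" only count together,
# while "c" contributes 1 on its own.
def toy_value(coalition):
    score = 0.0
    if {"a", "b"} <= coalition:
        score += 2.0
    if "c" in coalition:
        score += 1.0
    return score


phi = shapley_values(["a", "b", "c"], toy_value)
```

Enumerating all orderings is exponential in the number of players, which is one reason an efficient rewriting of the kind the abstract mentions matters in practice.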
#5459

(Journal track) A COP Model for Graph-Constrained Coalition Formation
Filippo Bistaffa, Alessandro Farinelli

Cooperation, Coordination, Collaboration, Coalitions

We focus on Graph-Constrained Coalition Formation (GCCF), a widely studied subproblem of coalition formation where the set of valid coalitions is constrained by a graph. We propose COP-GCCF, a novel approach that models GCCF as a COP. We then solve this COP with a highly parallel GPU implementation of Bucket Elimination, which is able to exploit the high constraint tightness of COP-GCCF. Results on realistic graphs, i.e., a crawl of the Twitter social graph, show that our approach outperforms state-of-the-art algorithms (i.e., DyCE and IDP^G) by at least one order of magnitude, both in terms of runtime and memory.
#5478

(Journal track) Constrained Coalition Formation on Valuation Structures: Formal Framework, Applications, and Islands of Tractability
Gianluigi Greco, Antonella Guzzo

Cooperation, Coordination, Collaboration, Coalitions

Coalition structure generation is considered in a setting where feasible coalition structures must satisfy constraints of two different kinds modeled in terms of a valuation structure, which consists of a set of pivotal agents that are pairwise incompatible, plus an interaction graph prescribing that a coalition C can form only if the subgraph induced over the nodes/agents in C is connected. It is shown that valuation structures can be used to model a number of relevant problems in real-world applications. Moreover, complexity issues arising with them are studied, by focusing in particular on identifying islands of tractability based on topological properties of the underlying interaction graph. Stability issues on valuation structures are studied too.

Tuesday 17 16:40 - 18:20 SIS-PS - Sister Conferences Best Papers: Planning, Reinforcement Learning (K2)

Chair: Abdallah Saffidine

#5110

Operator Counting Heuristics for Probabilistic Planning
Felipe Trevizan, Sylvie Thiébaux, Patrik Haslum

Sister Conferences Best Papers: Planning, Reinforcement Learning

For the past 25 years, heuristic search has been used to solve domain-independent probabilistic planning problems, but with heuristics that determinise the problem and ignore precious probabilistic information. In this paper, we present a generalization of the operator-counting family of heuristics to Stochastic Shortest Path problems (SSPs) that is able to represent the probability of the action outcomes. Our experiments show that the equivalent of the net change heuristic in this generalized framework obtains significant run time and coverage improvements over other state-of-the-art heuristics in different planners.
#5111

Cost-Based Goal Recognition for the Path-Planning Domain
Peta Masters, Sebastian Sardina

Sister Conferences Best Papers: Planning, Reinforcement Learning

"Plan recognition as planning" uses an off-the-shelf planner to perform goal recognition. In this paper, we apply the technique to path-planning. We show that a simpler formula provides an identical result in all but one set of conditions and, further, that identical ranking of goals by probability can be achieved without using any observations other than the agent's start location and where she is "now".
#5126

Inductive Certificates of Unsolvability for Domain-Independent Planning
Salomé Eriksson, Gabriele Röger, Malte Helmert

Sister Conferences Best Papers: Planning, Reinforcement Learning

If a planning system outputs a solution for a given problem, it is simple to verify that the solution is valid. However, if a planner claims that a task is unsolvable, we currently have no choice but to trust the planner blindly. We propose a sound and complete class of certificates of unsolvability which can be verified efficiently by an independent program. To highlight their practical use, we show how these certificates can be generated for a wide range of state-of-the-art planning techniques with only polynomial overhead for the planner.
#5112

An Empirical Study of Branching Heuristics through the Lens of Global Learning Rate
Jia Liang, Hari Govind, Pascal Poupart, Krzysztof Czarnecki, Vijay Ganesh

Sister Conferences Best Papers: Planning, Reinforcement Learning

In this paper, we analyze a suite of 7 well-known branching heuristics proposed by the SAT community and show that the better heuristics tend to generate more learnt clauses per decision, a metric we define as the global learning rate (GLR). We propose GLR as a metric for the branching heuristic to optimize. We test our hypothesis by developing a new branching heuristic that maximizes GLR greedily. We show empirically that this heuristic achieves very high GLR and, interestingly, very low literal block distance (LBD) over the learnt clauses. In our experiments this greedy branching heuristic enables the solver to solve instances faster than VSIDS, when the branching time is taken out of the equation. This experiment is a good proof of concept that a branching heuristic maximizing GLR will lead to good solver performance modulo the computational overhead. Finally, we propose a new branching heuristic, called SGDB, that uses machine learning to cheaply approximate greedy maximization of GLR. We show experimentally that SGDB performs on par with the VSIDS branching heuristic.
#5116

Search Progress and Potentially Expanded States in Greedy Best-First Search
Manuel Heusner, Thomas Keller, Malte Helmert

Sister Conferences Best Papers: Planning, Reinforcement Learning

A classical result in optimal search shows that A* with an admissible and consistent heuristic expands every state whose f-value is below the optimal solution cost and no state whose f-value is above the optimal solution cost. For satisficing search algorithms, a similarly clear understanding is currently lacking. We examine the search behavior of greedy best-first search (GBFS) in order to make progress towards such an understanding. We introduce the concept of high-water mark benches, which separate the search space into areas that are searched by a GBFS algorithm in sequence. High-water mark benches allow us to exactly determine the set of states that are expanded by at least one GBFS tie-breaking strategy and give us a clearer understanding of search progress.
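As a point of reference for the algorithm being analysed, here is a minimal sketch of greedy best-first search on an explicit graph. It illustrates plain GBFS only, not high-water mark benches; the FIFO tie-breaking rule, the function name `gbfs`, and the toy graph and heuristic values are assumptions made for the example.

```python
import heapq


def gbfs(start, goal, successors, h):
    """Greedy best-first search: always expand the open state with the lowest h."""
    counter = 0  # FIFO tie-breaking among equal h-values (one arbitrary choice)
    open_list = [(h(start), counter, start)]
    parent = {start: None}
    expanded = []
    while open_list:
        _, _, state = heapq.heappop(open_list)
        expanded.append(state)
        if state == goal:
            # Reconstruct the path by following parent pointers back to start.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1], expanded
        for nxt in successors(state):
            if nxt not in parent:  # generate each state at most once
                parent[nxt] = state
                counter += 1
                heapq.heappush(open_list, (h(nxt), counter, nxt))
    return None, expanded


# Hypothetical toy instance: "a" looks promising (low h) but leads to a dead end.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["g"], "c": [], "g": []}
h_values = {"s": 3, "a": 1, "b": 2, "c": 1, "g": 0}

path, expanded = gbfs("s", "g", lambda s: graph[s], lambda s: h_values[s])
```

In this toy instance the search expands the dead-end branch through "a" and "c" before reaching the goal via "b", which shows how the set of expanded states depends on heuristic values rather than on path cost.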
#5142

Multi-Robot Motion Planning with Dynamics Guided by Multi-Agent Search
Duong Le, Erion Plaku

Sister Conferences Best Papers: Planning, Reinforcement Learning

This paper presents an effective multi-robot motion planner that enables each robot to reach its desired location while avoiding collisions with the other robots and the obstacles. The approach takes into account the differential constraints imposed by the underlying dynamics of each robot and generates dynamically-feasible motions that can be executed in the physical world. The crux of the approach is the sampling-based expansion of a motion tree in the continuous state space of all the robots guided by multi-agent search over a discrete abstraction. Experiments using vehicle models with nonlinear dynamics operating in complex environments show significant speedups over related work.

Tuesday 17 16:40 - 18:20 NLP-CV2 - Language and Vision: Image Captioning, Visual Question Answering (T2)

Chair: Zhou Cheng

#509

Image Captioning with Visual-Semantic LSTM
Nannan Li, Zhenzhong Chen

Language and Vision: Image Captioning, Visual Question Answering

In this paper, a novel image captioning approach is proposed to describe the content of images. Inspired by the visual processing of our cognitive system, we propose a visual-semantic LSTM model to locate the attention objects with their low-level features in the visual cell, and then successively extract high-level semantic features in the semantic cell. In addition, a state perturbation term is introduced to the word sampling strategy in the REINFORCE based method to explore proper vocabularies in the training process. Experimental results on MS COCO and Flickr30K validate the effectiveness of our approach when compared to the state-of-the-art methods.
#182

Show and Tell More: Topic-Oriented Multi-Sentence Image Captioning
Yuzhao Mao, Chang Zhou, Xiaojie Wang, Ruifan Li

Language and Vision: Image Captioning, Visual Question Answering

Image captioning aims to generate textual descriptions for images. Most previous work generates a single-sentence description for each image. However, a picture is worth a thousand words. A single sentence can hardly give a complete view of an image, even when written by humans. In this paper, we propose a novel Topic-Oriented Multi-Sentence (TOMS) captioning model, which can generate multiple topic-oriented sentences to describe an image. Different from object instances or attributes, topics mined by latent Dirichlet allocation reflect hidden thematic structures in the reference sentences of an image. In our model, each topic is integrated into a caption generator with a Fusion Gate Unit (FGU) to guide the generation of a sentence towards a certain topic perspective. With multiple sentences from different topics, our TOMS provides a complete description of an image. Experimental results on both sentence and paragraph datasets demonstrate the effectiveness of our TOMS in terms of topical consistency and descriptive completeness.
#3045

Multi-Level Policy and Reward Reinforcement Learning for Image Captioning
Anan Liu, Ning Xu, Hanwang Zhang, Weizhi Nie, Yuting Su, Yongdong Zhang

Language and Vision: Image Captioning, Visual Question Answering

Image captioning is one of the most challenging hallmarks of AI, due to its complexity in visual and natural language understanding. As it is essentially a sequential prediction task, recent advances in image captioning use Reinforcement Learning (RL) to better explore the dynamics of word-by-word generation. However, existing RL-based image captioning methods mainly rely on a single policy network and reward function that do not fit well the multi-level (word and sentence) and multi-modal (vision and language) nature of the task. To this end, we propose a novel multi-level policy and reward RL framework for image captioning. It contains two modules: 1) a Multi-Level Policy Network that can adaptively fuse the word-level policy and the sentence-level policy for word generation; and 2) a Multi-Level Reward Function that collaboratively leverages both vision-language reward and language-language reward to guide the policy. Further, we propose a guidance term to bridge the policy and the reward for RL optimization. Extensive experiments and analysis on MSCOCO and Flickr30k show that the proposed framework can achieve competitive performance with respect to different evaluation metrics.
#2374

A Multi-task Learning Approach for Image Captioning
Wei Zhao, Benyou Wang, Jianbo Ye, Min Yang, Zhou Zhao, Ruotian Luo, Yu Qiao

Language and Vision: Image Captioning, Visual Question Answering

In this paper, we propose a Multi-task Learning Approach for Image Captioning (MLAIC), motivated by the fact that humans have no difficulty performing such a task because they possess capabilities in multiple domains. Specifically, MLAIC consists of three key components: (i) a multi-object classification model that learns rich category-aware image representations using a CNN image encoder; (ii) a syntax generation model that learns a better syntax-aware LSTM-based decoder; (iii) an image captioning model that generates image descriptions in text, sharing its CNN encoder and LSTM decoder with the object classification task and the syntax generation task, respectively. In particular, the image captioning model can benefit from the additional object categorization and syntax knowledge. To verify the effectiveness of our approach, we conduct extensive experiments on the MS-COCO dataset. The experimental results demonstrate that our model achieves impressive results compared to other strong competitors.
#4561

Feature Enhancement in Attention for Visual Question Answering
Yuetan Lin, Zhangyang Pang, Donghui Wang, Yueting Zhuang

Language and Vision: Image Captioning, Visual Question Answering

The attention mechanism has been an indispensable part of Visual Question Answering (VQA) models, due to the importance of its selective ability on image regions and/or question words. However, the attention mechanism in almost all VQA models takes as input the visual features of the image and the textual features of the question, which stem from different sources and between which there exists an essential semantic gap. In order to further improve the accuracy of correlation between region and question in attention, we focus on region representation and propose the idea of feature enhancement, which includes three aspects. (1) We propose to leverage a region semantic representation that is more consistent with the question representation. (2) We enrich the region representation using features from multiple hierarchies. (3) We refine the semantic representation for richer information. With these three incremental feature enhancement mechanisms, we improve the region representation and achieve better attentive effect and VQA performance. We conduct extensive experiments on the largest VQA v2.0 benchmark dataset and achieve competitive results without additional training data, and demonstrate the effectiveness of our proposed feature-enhanced attention through visualizations.
#2651

From Pixels to Objects: Cubic Visual Attention for Visual Question Answering
Jingkuan Song, Pengpeng Zeng, Lianli Gao, Heng Tao Shen

Language and Vision: Image Captioning, Visual Question Answering

Recently, attention-based Visual Question Answering (VQA) has achieved great success by utilizing the question to selectively target different visual areas that are related to the answer. Existing visual attention models are generally planar, i.e., different channels of the last conv-layer feature map of an image share the same weight. This conflicts with the attention mechanism because CNN features are naturally spatial and channel-wise. Also, visual attention is usually computed at the pixel level, which may cause a region discontinuity problem. In this paper we propose a Cubic Visual Attention (CVA) model that applies a novel channel and spatial attention on object regions to improve the VQA task. Specifically, instead of attending to pixels, we first take advantage of object proposal networks to generate a set of object candidates and extract their associated conv features. Then, we utilize the question to guide the channel attention and spatial attention calculation based on the conv-layer feature map. Finally, the attended visual features and the question are combined to infer the answer. We assess the performance of our proposed CVA on three public image QA datasets, including COCO-QA, VQA and Visual7W. Experimental results show that our proposed method significantly outperforms the state of the art.
#1401

Show, Observe and Tell: Attribute-driven Attention Model for Image Captioning
Hui Chen, Guiguang Ding, Zijia Lin, Sicheng Zhao, Jungong Han

Language and Vision: Image Captioning, Visual Question Answering

Despite the fact that attribute-based approaches and attention-based approaches have been proven to be effective in image captioning, most attribute-based approaches simply predict attributes independently without taking the co-occurrence dependencies among attributes into account. Besides, most attention-based captioning models directly leverage the feature map extracted from a CNN, in which many features may be redundant in relation to the image content. In this paper, we focus on training a good attribute-inference model via a recurrent neural network (RNN) for image captioning, where the co-occurrence dependencies among attributes can be maintained. The uniqueness of our inference model lies in the use of an RNN with a visual attention mechanism to observe the image before generating captions. Additionally, we note that compact and attribute-driven features are more useful for the attention-based captioning model. To this end, we extract the context feature for each attribute, and guide the captioning model to adaptively attend to these context features. We verify the effectiveness and superiority of the proposed approach over other captioning approaches by conducting extensive experiments and comparisons on the MS COCO image captioning dataset.
#823

A Question Type Driven Framework to Diversify Visual Question Generation
Zhihao Fan, Zhongyu Wei, Piji Li, Yanyan Lan, Xuanjing Huang

Language and Vision: Image Captioning, Visual Question Answering

Visual question generation aims at asking questions about an image automatically. Existing research works on this topic usually generate a single question for each given image without considering the issue of diversity. In this paper, we propose a question type driven framework to produce multiple questions for a given image with different focuses. In our framework, each question is constructed following the guidance of a sampled question type in a sequence-to-sequence fashion. To diversify the generated questions, a novel conditional variational auto-encoder is introduced to generate multiple questions with a specific question type. Moreover, we design a strategy to conduct the question type distribution learning for each image to select the final questions. Experimental results on three benchmark datasets show that our framework outperforms the state-of-the-art approaches in terms of both relevance and diversity.

Tuesday 17 16:40 - 18:20 CV-BFG - Biometrics, Face and Gesture Recognition (T1)

Chair: Mayank Vatsa

#997

Deep Attribute Guided Representation for Heterogeneous Face Recognition
Decheng Liu, Nannan Wang, Chunlei Peng, Jie Li, Xinbo Gao

Biometrics, Face and Gesture Recognition

Heterogeneous face recognition (HFR) is a challenging problem in face recognition, subject to large texture and spatial structure differences of face images. Different from conventional face recognition in homogeneous environments, there exist many face images taken from different sources (including different sensors or different mechanisms) in reality. Motivated by the human cognitive mechanism, we utilize explicit invariant semantic information (face attributes) to help bridge the gap between different modalities. Existing related face recognition methods mostly regard attributes as high-level features integrated with other engineered features to enhance recognition performance, ignoring the inherent relationship between face attributes and identities. In this paper, we propose a novel deep attribute guided representation based heterogeneous face recognition method (DAG-HFR) without labeling attributes manually. Deep convolutional networks are employed to directly map face images in heterogeneous scenarios to a compact common space where distances reflect the similarities of pairs. An attribute guided triplet loss (AGTL) is designed to train an end-to-end HFR network which can effectively eliminate defects caused by incorrectly detected attributes. Extensive experiments on multiple heterogeneous scenarios (composite sketches, resident ID cards) demonstrate that the proposed method achieves superior performance compared with state-of-the-art methods.
#730

Harnessing Synthesized Abstraction Images to Improve Facial Attribute Recognition
Keke He, Yanwei Fu, Wuhao Zhang, Chengjie Wang, Yu-Gang Jiang, Feiyue Huang, Xiangyang Xue

Biometrics, Face and Gesture Recognition

Facial attribute recognition is an important and yet challenging research topic. Different from most previous approaches which predict attributes only based on the whole images, this paper leverages facial parts locations for better attribute prediction. A facial abstraction image which contains both local facial parts and facial texture information is introduced. This abstraction image is generated by a Generative Adversarial Network (GAN). Then we build a dual-path facial attribute recognition network to utilize features from the original face images and facial abstraction images. Empirically, the features of facial abstraction images are complementary to features of original face images. With the facial parts localized by the abstraction images, our method improves facial attributes recognition, especially the attributes located on small face regions. Extensive evaluations conducted on CelebA and LFWA benchmark datasets show that state-of-the-art performance is achieved.
#358

Live Face Verification with Multiple Instantialized Local Homographic Parameterization
Chen Lin, Zhouyingcheng Liao, Peng Zhou, Jianguo Hu, Bingbing Ni

Biometrics, Face and Gesture Recognition

State-of-the-art live face verification methods can easily be attacked with recorded facial expression sequences. This work directly addresses this issue by proposing a patch-wise motion parameterization based verification network infrastructure. This method directly explores the underlying subtle motion difference between facial movements re-captured from a planar screen (e.g., a pad) and those from a real face; therefore interactive facial expression is no longer required. Furthermore, inspired by the fact that "a fake facial movement sequence MUST contain many patch-wise fake sequences", we embed our network into a multiple instance learning framework, which further enhances the recall rate of the proposed technique. Extensive experimental results on several face benchmarks demonstrate the superior performance of our method.
#2339

Adversarial Attribute-Image Person Re-identification
Zhou Yin, Wei-Shi Zheng, Ancong Wu, Hong-Xing Yu, Hai Wan, Xiaowei Guo, Feiyue Huang, Jianhuang Lai

Biometrics, Face and Gesture Recognition

While attributes have been widely used for person re-identification (Re-ID), which aims at matching images of the same person across disjoint camera views, they are used either as extra features or for performing multi-task learning to assist the image-image matching task. However, how to find a set of person images according to a given attribute description, which is very practical in many surveillance applications, remains a rarely investigated cross-modality matching problem in person Re-ID. In this work, we present this challenge and leverage adversarial learning to formulate the attribute-image cross-modality person Re-ID model. By imposing a semantic consistency constraint across modalities as a regularization, the adversarial learning enables the generation of image-analogous concepts of query attributes for matching the corresponding images at both the global level and the semantic ID level. We conducted extensive experiments on three attribute datasets and demonstrated that the regularized adversarial modelling is so far the most effective method for the attribute-image cross-modality person Re-ID problem.
#1600

Dual Conditional GANs for Face Aging and Rejuvenation
Jingkuan Song, Jingqiu Zhang, Lianli Gao, Xianglong Liu, Heng Tao Shen

Biometrics, Face and Gesture Recognition

Face aging and rejuvenation aims to predict the face of a person at different ages. While tremendous progress has been made on this topic, two central problems remain largely unsolved: 1) the majority of prior works require sequential training data, which is very rare in real scenarios, and 2) how to simultaneously render the aging face and preserve personality. To tackle these issues, in this paper, we develop a novel dual conditional GAN (DCGAN) mechanism, which enables face aging and rejuvenation to be trained from multiple sets of unlabeled face images with different ages. In our architecture, the primal conditional GAN transforms a face image to other ages based on the age condition, while the dual conditional GAN learns to invert the task. Hence a loss function that accounts for the reconstruction error of images can preserve the personal identity, while the discriminators on the generated images learn the transition patterns (e.g., the shape and texture changes between age groups) and guide the generation of age-specific photo-realistic faces. Experimental results on two public datasets demonstrate the appealing performance of the proposed framework in comparison with state-of-the-art methods.
#159

DRPose3D: Depth Ranking in 3D Human Pose Estimation
Min Wang, Xipeng Chen, Wentao Liu, Chen Qian, Liang Lin, Lizhuang Ma

Biometrics, Face and Gesture Recognition

In this paper, we propose a two-stage depth ranking based method (DRPose3D) to tackle the problem of 3D human pose estimation. In contrast to accurate 3D positions, depth rankings can be identified intuitively by humans and learned more easily by a deep neural network through solving classification problems. Moreover, depth ranking contains rich 3D information. It prevents the 2D-to-3D pose regression in two-stage methods from being ill-posed. In our method, firstly, we design a Pairwise Ranking Convolutional Neural Network (PRCNN) to extract depth rankings of human joints from images. Secondly, a coarse-to-fine 3D Pose Network (DPNet) is proposed to estimate 3D poses from both depth rankings and 2D human joint locations. Additionally, to improve the generality of our model, we introduce a statistical method to augment depth rankings. Our approach outperforms the state-of-the-art methods on the Human3.6M benchmark for all three testing protocols, indicating that depth ranking is an essential geometric feature which can be learned to improve 3D pose estimation.
#3601

Anonymizing k Facial Attributes via Adversarial Perturbations
Saheb Chhabra, Richa Singh, Mayank Vatsa, Gaurav Gupta

Biometrics, Face and Gesture Recognition

A face image not only provides details about the identity of a subject but also reveals several attributes such as gender, race, sexual orientation, and age. Advancements in machine learning algorithms and the popularity of sharing images on the World Wide Web, including social media websites, have increased the scope of data analytics and information profiling from photo collections. This poses a serious privacy threat for individuals who do not want to be profiled. This research presents a novel algorithm for anonymizing selected attributes that an individual does not want to share, without affecting the visual quality of images. Using the proposed algorithm, a user can select single or multiple attributes to be suppressed while preserving identity information and visual content. The proposed adversarial perturbation based algorithm embeds imperceptible noise in an image such that the attribute prediction algorithm for the selected attribute yields an incorrect classification result, thereby preserving the information according to the user's choice. Experiments on three popular databases, i.e., MUCT, LFWcrop, and CelebA, show that the proposed algorithm not only anonymizes k attributes, but also preserves image quality and identity information.
#37

3D-Aided Deep Pose-Invariant Face Recognition
Jian Zhao, Lin Xiong, Yu Cheng, Yi Cheng, Jianshu Li, Li Zhou, Yan Xu, Jayashree Karlekar, Sugiri Pranata, Shengmei Shen, Junliang Xing, Shuicheng Yan, Jiashi Feng

Biometrics, Face and Gesture Recognition

Learning from synthetic faces, though perhaps appealing for high data efficiency, may not bring satisfactory performance due to the distribution discrepancy between synthetic and real face images. To mitigate this gap, we propose a 3D-Aided Deep Pose-Invariant Face Recognition Model (3D-PIM), which automatically recovers realistic frontal faces from arbitrary poses through a 3D face model in a novel way. Specifically, 3D-PIM incorporates a simulator with the aid of a 3D Morphable Model (3D MM) to obtain shape and appearance priors for accelerating face normalization learning, requiring less training data. It further leverages a global-local Generative Adversarial Network (GAN) with multiple critical improvements as a refiner to enhance the realism of both global structures and local details of the face simulator's output using unlabelled real data only, while preserving the identity information. Qualitative and quantitative experiments on both controlled and in-the-wild benchmarks clearly demonstrate the superiority of the proposed model over the state of the art.

Tuesday 17 16:40 - 18:20 SGP-SO - Heuristic Search and Optimization (K11)

Chair: Pavel Surynek

#24

The FastMap Algorithm for Shortest Path Computations
Liron Cohen, Tansel Uras, Shiva Jahangiri, Aliyah Arunasalam, Sven Koenig, T. K. Satish Kumar

Heuristic Search and Optimization

We present a new preprocessing algorithm for embedding the nodes of a given edge-weighted undirected graph into a Euclidean space. The Euclidean distance between any two nodes in this space approximates the length of the shortest path between them in the given graph. Later, at runtime, a shortest path between any two nodes can be computed with an A* search using the Euclidean distances as heuristic. Our preprocessing algorithm, called FastMap, is inspired by the data-mining algorithm of the same name and runs in near-linear time. Hence, FastMap is orders of magnitude faster than competing approaches that produce a Euclidean embedding using Semidefinite Programming. FastMap also produces admissible and consistent heuristics and therefore guarantees the generation of shortest paths. Moreover, FastMap applies to general undirected graphs for which many traditional heuristics, such as the Manhattan Distance heuristic, are not well defined. Empirically, we demonstrate that A* search using the FastMap heuristic is competitive with A* search using other state-of-the-art heuristics, such as the Differential heuristic.
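The embedding idea can be sketched roughly as follows: in each round, pick a far-apart pivot pair with Dijkstra, give every node one coordinate from its distances to the two pivots, then shrink the edge weights by what that coordinate already explains. This is a simplified sketch under stated assumptions, not the authors' implementation: the graph is assumed connected, the L1 distance between embeddings is used as the heuristic, and the names `dijkstra`, `fastmap`, `fastmap_heuristic`, the parameters `k_max` and `pivot_rounds`, and the toy graph are all illustrative.

```python
import heapq


def dijkstra(adj, weights, source):
    """Shortest-path distances from `source`; edges are keyed by frozenset({u, v})."""
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale queue entry
        for v in adj[u]:
            nd = d + weights[frozenset((u, v))]
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def fastmap(adj, weights, k_max=4, eps=1e-9, pivot_rounds=5):
    """Embed the nodes of a connected undirected graph into at most k_max dimensions."""
    w = dict(weights)  # residual edge weights, reduced after each dimension
    coords = {v: [] for v in adj}
    for _ in range(k_max):
        # Heuristically find a far-apart pivot pair (a, b) in the residual graph.
        a = next(iter(adj))
        b = a
        for _ in range(pivot_rounds):
            dist_a = dijkstra(adj, w, a)
            far = max(adj, key=lambda v: dist_a[v])
            if far == b:
                break
            a, b = far, a
        dist_a = dijkstra(adj, w, a)
        dist_b = dijkstra(adj, w, b)
        d_ab = dist_a[b]
        if d_ab < eps:
            break  # nothing left to explain
        for v in adj:
            coords[v].append((dist_a[v] + d_ab - dist_b[v]) / 2.0)
        # Subtract what the new coordinate already accounts for on each edge.
        for edge in w:
            u, v = tuple(edge)
            w[edge] = max(0.0, w[edge] - abs(coords[u][-1] - coords[v][-1]))
    return coords


def fastmap_heuristic(coords, u, v):
    """L1 distance between the embeddings of u and v."""
    return sum(abs(x - y) for x, y in zip(coords[u], coords[v]))


# Hypothetical toy graph.
adj = {"s": ["a", "b"], "a": ["s", "c"], "b": ["s", "c"],
       "c": ["a", "b", "t"], "t": ["c"]}
weights = {frozenset(("s", "a")): 1.0, frozenset(("s", "b")): 4.0,
           frozenset(("a", "c")): 2.0, frozenset(("b", "c")): 1.0,
           frozenset(("c", "t")): 3.0}

coords = fastmap(adj, weights)
```

Because each edge's coordinate differences never add up to more than its original weight, the L1 heuristic in this sketch cannot overestimate a shortest-path distance, which is the property A* needs.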
#111

A Fast Local Search Algorithm for Minimum Weight Dominating Set Problem on Massive Graphs
Yiyuan Wang, Shaowei Cai, Jiejiang Chen, Minghao Yin

Heuristic Search and Optimization

The minimum weight dominating set (MWDS) problem is NP-hard and also important in many applications. Recent heuristic MWDS algorithms can hardly solve massive real world graphs effectively. In this paper, we design a fast local search algorithm called FastMWDS for the MWDS problem, which aims to obtain a good solution on massive graphs within a short time. In this novel local search framework, we propose two ideas to make it effective. Firstly, we design a new fast construction procedure with four reduction rules to cut down the size of massive graphs. Secondly, we propose the three-valued two-level configuration checking strategy to improve local search, which is interestingly a variant of configuration checking (CC) with two levels and multiple values. Experiment results on a broad range of massive real world graphs show that FastMWDS finds much better solutions than state of the art MWDS algorithms.
#179

Convergence Analysis of Gradient Descent for Eigenvector Computation
Zhiqiang Xu, Xin Cao, Xin Gao

Heuristic Search and Optimization

We present a novel, simple and systematic convergence analysis of gradient descent for eigenvector computation. As a popular, practical, and provable approach to numerous machine learning problems, gradient descent has found successful applications to eigenvector computation as well. However, surprisingly, it lacks a thorough theoretical analysis for the underlying geodesically non-convex problem. In this work, the convergence of the gradient descent solver for the leading eigenvector computation is shown to be at a global rate O(min{ (lambda_1/Delta_p)^2 log(1/epsilon), 1/epsilon }), where Delta_p = lambda_p - lambda_{p+1} > 0 represents the generalized positive eigengap and always exists without loss of generality with lambda_i being the i-th largest eigenvalue of the given real symmetric matrix and p being the multiplicity of lambda_1. The rate is linear at (lambda_1/Delta_p)^2 log(1/epsilon) if (lambda_1/Delta_p)^2=O(1), otherwise sub-linear at O(1/epsilon). We also show that the convergence only logarithmically instead of quadratically depends on the initial iterate. Particularly, this is the first time the linear convergence for the case that the conventionally considered eigengap Delta_1 = lambda_1 - lambda_2 = 0 but the generalized eigengap Delta_p satisfies (lambda_1/Delta_p)^2=O(1), as well as the logarithmic dependence on the initial iterate are established for the gradient descent solver. We are also the first to leverage for analysis the log principal angle between the iterate and the space of globally optimal solutions. Theoretical properties are verified in experiments.
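For readers unfamiliar with the setting, a minimal sketch of gradient ascent on the Rayleigh quotient over the unit sphere is given below. It is a generic textbook variant and may differ from the exact solver analysed in the paper; the step size, iteration count and the 2x2 test matrix are assumptions chosen for illustration.

```python
import math

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def normalize(x):
    norm = math.sqrt(sum(xi * xi for xi in x))
    return [xi / norm for xi in x]

def rayleigh(A, x):
    # x^T A x for a unit vector x
    return sum(xi * axi for xi, axi in zip(x, matvec(A, x)))

def leading_eigenvector(A, x0, step=0.1, iters=200):
    """Projected gradient ascent on x^T A x subject to ||x|| = 1."""
    x = normalize(x0)
    for _ in range(iters):
        Ax = matvec(A, x)
        rho = sum(xi * axi for xi, axi in zip(x, Ax))
        # Gradient projected onto the tangent space of the sphere
        # (the constant factor 2 is absorbed into the step size).
        grad = [axi - rho * xi for axi, xi in zip(Ax, x)]
        # Retract back onto the sphere by renormalizing.
        x = normalize([xi + step * gi for xi, gi in zip(x, grad)])
    return rayleigh(A, x), x

# Symmetric matrix with eigenvalues 3 and 1; leading eigenvector (1, 1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
rho, x = leading_eigenvector(A, [1.0, 0.0])
```

Here the eigengap is lambda_1 - lambda_2 = 2, so the iterate converges linearly to the leading eigenvector.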
#637

A Fast Algorithm for Optimally Finding Partially Disjoint Shortest Paths
Longkun Guo, Yunyun Deng, Kewen Liao, Qiang He, Timos Sellis, Zheshan Hu

Heuristic Search and Optimization

The classical disjoint shortest path problem has recently attracted renewed interest from researchers in the network planning and optimization community. However, the requirement of the shortest paths being completely vertex or edge disjoint might be too restrictive and demands much more resources in a network. Partially disjoint shortest paths, in which a bounded number of shared vertices or edges is allowed, balance between degree of disjointness and occupied network resources. In this paper, we consider the problem of finding k shortest paths which are edge disjoint but partially vertex disjoint. For a pair of distinct vertices in a network graph, the problem aims to optimally find k edge disjoint shortest paths among which at most a bounded number of vertices are shared by at least two paths. In particular, we present novel techniques for exactly solving the problem with a runtime that significantly improves the current best result. The proposed algorithm is also validated by computer experiments on both synthetic and real networks, which show that it is up to three orders of magnitude faster than the state of the art.
#1777

A General Approach to Running Time Analysis of Multi-objective Evolutionary Algorithms
Chao Bian, Chao Qian, Ke Tang

Heuristic Search and Optimization

Evolutionary algorithms (EAs) have been widely applied to solve multi-objective optimization problems. In contrast to great practical successes, their theoretical foundations are much less developed, even for the essential theoretical aspect, i.e., running time analysis. In this paper, we propose a general approach to estimating upper bounds on the expected running time of multi-objective EAs (MOEAs), and then apply it to diverse situations, including bi-objective and many-objective optimization as well as exact and approximate analysis. For some known asymptotic bounds, our analysis not only provides their leading constants, but also improves them asymptotically. Moreover, our results provide some theoretical justification for the good empirical performance of MOEAs in solving multi-objective combinatorial problems.
#1812

An Exact Algorithm for Maximum k-Plexes in Massive Graphs
Jian Gao, Jiejiang Chen, Minghao Yin, Rong Chen, Yiyuan Wang

Heuristic Search and Optimization

The maximum k-plex, a generalization of maximum clique, is used to cope with a great number of real-world problems. The aim of this paper is to propose a novel exact k-plex algorithm that can deal with large-scale graphs with millions of vertices and edges. Specifically, we first propose several new graph reduction methods through a careful analysis of the structures of induced subgraphs. Afterwards, we present a preprocessing method to simplify initial graphs. Additionally, we present a branch-and-bound algorithm integrating the reduction methods as well as a new dynamic vertex selection mechanism. We perform intensive experiments to evaluate our algorithm, and show that the proposed strategies are effective and our algorithm outperforms state-of-the-art algorithms, especially for real-world massive graphs.
#3209

Methods for off-line/on-line optimization under uncertainty
Allegra De Filippo, Michele Lombardi, Michela Milano

Heuristic Search and Optimization

In this work we present two general techniques to deal with multi-stage optimization problems under uncertainty, featuring off-line and on-line decisions. The methods are applicable when: 1) the uncertainty is exogenous; 2) there exists a heuristic for the on-line phase that can be modeled as a parametric convex optimization problem. The first technique replaces the on-line heuristics with an anticipatory solver, obtained through a systematic procedure. The second technique consists in making the off-line solver aware of the on-line heuristic, and capable of controlling its parameters so as to steer its behavior. We instantiate our approaches on two case studies: an energy management system with uncertain renewable generation and load demand, and a vehicle routing problem with uncertain travel times. We show how both techniques achieve high solution quality w.r.t. an oracle operating under perfect information, by obtaining different trade-offs in terms of computation time.
#2812

Sequence Selection by Pareto Optimization
Chao Qian, Chao Feng, Ke Tang

Heuristic Search and Optimization

The problem of selecting a sequence of items from a universe that maximizes some given objective function arises in many real-world applications. In this paper, we propose an anytime randomized iterative approach POSeqSel, which maximizes the given objective function and minimizes the sequence length simultaneously. We prove that for any previously studied objective function, POSeqSel using a reasonable time can always reach or improve the best known approximation guarantee. Empirical results exhibit the superior performance of POSeqSel.

Tuesday 17 16:40 - 18:20 ML-TS1 - Time Series and Data Streams (C2)

Chair: Kerstin Bach

#10

Predicting Complex Activities from Ongoing Multivariate Time Series
Weihao Cheng, Sarah Erfani, Rui Zhang, Ramamohanarao Kotagiri

Time Series and Data Streams

The rapid development of sensor networks enables recognition of complex activities (CAs) using multivariate time series. However, CAs are usually performed over long periods of time, which causes slow recognition by models based on fully observed data. Therefore, predicting CAs at early stages becomes an important problem. In this paper, we propose Simultaneous Complex Activities Recognition and Action Sequence Discovering (SimRAD), an algorithm which predicts a CA over time by mining a sequence of multivariate actions from sensor data using a Deep Neural Network. SimRAD simultaneously learns two probabilistic models for inferring CAs and action sequences, where the estimations of the two models are conditionally dependent on each other. SimRAD continuously predicts the CA and the action sequence, thus the predictions are mutually updated until the end of the CA. We conduct evaluations on a real-world CA dataset consisting of a rich amount of sensor data, and the results show that SimRAD outperforms state-of-the-art methods by 7.2% on average in prediction accuracy with high confidence.
#829

Trajectory-User Linking via Variational AutoEncoder
Fan Zhou, Qiang Gao, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, Fengli Zhang

Time Series and Data Streams

Trajectory-User Linking (TUL) is an essential task in Geo-tagged social media (GTSM) applications, enabling personalized Point of Interest (POI) recommendation and activity identification. Existing works on mining mobility patterns often model trajectories using Markov Chains (MC) or recurrent neural networks (RNN) -- either assuming independence between non-adjacent locations or following a shallow generation process. However, most of them ignore the fact that human trajectories are often sparse, high-dimensional and may contain embedded hierarchical structures. We tackle the TUL problem with a semi-supervised learning framework, called TULVAE (TUL via Variational AutoEncoder), which learns the human mobility in a neural generative architecture with stochastic latent variables that span hidden states in RNN. TULVAE alleviates the data sparsity problem by leveraging large-scale unlabeled data and represents the hierarchical and structural semantics of trajectories with high-dimensional latent variables. Our experiments demonstrate that TULVAE improves efficiency and linking performance in real GTSM datasets, in comparison to existing methods.
#879

Online Continuous-Time Tensor Factorization Based on Pairwise Interactive Point Processes
Hongteng Xu, Dixin Luo, Lawrence Carin

Time Series and Data Streams

A continuous-time tensor factorization method is developed for event sequences containing multiple "modalities." Each data element is a point in a tensor, whose dimensions are associated with the discrete alphabet of the modalities. Each tensor data element has an associated time of occurrence and a feature vector. We model such data based on pairwise interactive point processes, and the proposed framework connects pairwise tensor factorization with a feature-embedded point process. The model accounts for interactions within each modality, interactions across different modalities, and continuous-time dynamics of the interactions. Model learning is formulated as a convex optimization problem, based on online alternating direction method of multipliers. Compared to existing state-of-the-art methods, our approach captures the latent structure of the tensor and its evolution over time, obtaining superior results on real-world datasets.
#1604

Periodic-CRN: A Convolutional Recurrent Model for Crowd Density Prediction with Recurring Periodic Patterns
Ali Zonoozi, Jung-jae Kim, Xiao-Li Li, Gao Cong

Time Series and Data Streams

Time-series forecasting in geo-spatial domains has important applications, including urban planning, traffic management and behavioral analysis. We observed recurring periodic patterns in some spatio-temporal data, which were not considered explicitly by previous non-linear works. To address this gap, we propose a novel Periodic-CRN (PCRN) method, which adapts convolutional recurrent network (CRN) to accurately capture spatial and temporal correlations, learns and incorporates explicit periodic representations, and can be optimized with multi-step ahead prediction. We show that PCRN consistently outperforms the state-of-the-art methods for crowd density prediction across two taxi datasets from Beijing and Singapore.
#1648

Spatio-Temporal Graph Convolutional Networks: A Deep Learning Framework for Traffic Forecasting
Bing Yu, Haoteng Yin, Zhanxing Zhu

Time Series and Data Streams

Timely accurate traffic forecast is crucial for urban traffic control and guidance. Due to the high nonlinearity and complexity of traffic flow, traditional methods cannot satisfy the requirements of mid-and-long term prediction tasks and often neglect spatial and temporal dependencies. In this paper, we propose a novel deep learning framework, Spatio-Temporal Graph Convolutional Networks (STGCN), to tackle the time series prediction problem in traffic domain. Instead of applying regular convolutional and recurrent units, we formulate the problem on graphs and build the model with complete convolutional structures, which enable much faster training speed with fewer parameters. Experiments show that our model STGCN effectively captures comprehensive spatio-temporal correlations through modeling multi-scale traffic networks and consistently outperforms state-of-the-art baselines on various real-world traffic datasets.
#3197

NeuCast: Seasonal Neural Forecast of Power Grid Time Series
Pudi Chen, Shenghua Liu, Chuan Shi, Bryan Hooi, Bai Wang, Xueqi Cheng

Time Series and Data Streams

In the smart power grid, short-term load forecasting (STLF) is a crucial step in scheduling and planning for future load, so as to improve the reliability, cost, and emissions of the power grid. Different from traditional time series forecast, STLF is a more challenging task, because of the complex demand of active and reactive power from numerous categories of electrical loads and the effects of environment. Therefore, we propose NeuCast, a seasonal neural forecasting method, which dynamically models various loads as co-evolving time series in a hidden space, as well as extra weather conditions, in a neural network structure. NeuCast captures seasonality and patterns of the time series by integrating factor modeling and hidden state recognition. NeuCast can also detect anomalies and forecast under different temperature assumptions. Extensive experiments on 134 real-world datasets show the improvements of NeuCast over the state-of-the-art methods.
#4106

Finding Frequent Entities in Continuous Data
Ferran Alet, Rohan Chitnis, Leslie P. Kaelbling, Tomas Lozano-Perez

Time Series and Data Streams

In many applications that involve processing high-dimensional data, it is important to identify a small set of entities that account for a significant fraction of detections. Rather than formalize this as a clustering problem, in which all detections must be grouped into hard or soft categories, we formalize it as an instance of the frequent items or heavy hitters problem, which finds groups of tightly clustered objects that have a high density in the feature space. We show that the heavy hitters formulation generates solutions that are more accurate and effective than the clustering formulation. In addition, we present a novel online algorithm for heavy hitters, called HAC, which addresses problems in continuous space, and demonstrate its effectiveness on real video and household domains.
#1805

Hierarchical Electricity Time Series Forecasting for Integrating Consumption Patterns Analysis and Aggregation Consistency
Yue Pang, Bo Yao, Xiangdong Zhou, Yong Zhang, Yiming Xu, Zijing Tan

Time Series and Data Streams

Electricity demand forecasting is a very important problem for energy supply and environmental protection. It can be formalized as a hierarchical time series forecasting problem with the aggregation constraints according to the geographical hierarchy, since the sum of the prediction results of the disaggregated time series should be equal to the prediction results of the aggregated ones. However, in most previous work, the aggregation consistency is ensured at the loss of forecast accuracy. In this paper, we propose a novel clustering-based hierarchical electricity time series forecasting approach. Instead of dealing with the geographical hierarchy directly, we explore electricity consumption patterns by clustering analysis and build a new consumption pattern based time series hierarchy. We then present a novel hierarchical forecasting method with consumption hierarchical aggregation constraints to improve the electricity demand predictions of the bottom level, followed by a "bottom-up" method to obtain forecasts of the geographical higher levels. In particular, we observe that in our consumption pattern based hierarchy the reconciliation error of the bottom level time series is "correlated" to its membership degree of the corresponding cluster (consumption pattern), and hence apply this correlation as the regularization term in our forecasting objective function. Extensive experiments on real-life datasets verify that our approach achieves the best prediction accuracy, compared with the state-of-the-art methods.
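The aggregation constraint and the bottom-up step mentioned above can be illustrated with a short sketch: each higher-level forecast is obtained by summing the forecasts of its children, so consistency holds by construction. The hierarchy, node names and numbers below are invented for illustration and do not come from the paper.

```python
def bottom_up(hierarchy, bottom_forecasts):
    """Aggregate bottom-level forecasts up a hierarchy.

    hierarchy: parent -> list of children (hypothetical layout)
    bottom_forecasts: leaf -> forecast value
    Returns a dict with a forecast for every node.
    """
    totals = dict(bottom_forecasts)

    def total(node):
        # A parent's forecast is the sum of its children's forecasts.
        if node not in totals:
            totals[node] = sum(total(child) for child in hierarchy[node])
        return totals[node]

    for node in hierarchy:
        total(node)
    return totals

hierarchy = {
    "country": ["north", "south"],
    "north": ["n1", "n2"],
    "south": ["s1"],
}
bottom = {"n1": 10.0, "n2": 5.0, "s1": 7.5}
forecasts = bottom_up(hierarchy, bottom)
```

With these inputs the regional forecasts sum exactly to the national one, which is the consistency property that bottom-up aggregation guarantees.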

Tuesday 17 16:40 - 18:20 MUL-WEB1 - AI and the Web, Networks 1 (C3)

Chair: David Pennock

#525

A Comparative Study of Transactional and Semantic Approaches for Predicting Cascades on Twitter
Yunwei Zhao, Can Wang, Chi-Hung Chi, Kwok-Yan Lam, Sen Wang

AI and the Web, Networks 1

The availability of massive social media data has enabled the prediction of people’s future behavioral trends at an unprecedented large scale. The study of information cascades on Twitter has been an integral part of behavior analysis. A number of methods based on the transactional features (such as keyword frequency) and the semantic features (such as sentiment) have been proposed to predict the future cascading trends. However, an in-depth understanding of the pros and cons of semantic and transactional models is lacking. This paper conducts a comparative study of both approaches in predicting information diffusion with three mechanisms: retweet cascade, url cascade, and hashtag cascade. Experiments on Twitter data show that the semantic model outperforms the transactional model if the exterior pattern is less directly observable (i.e. hashtag cascade). When it becomes more directly observable (i.e. retweet and url cascades), the semantic method delivers only comparable accuracy (i.e. url cascade) or even worse accuracy (i.e. retweet cascade). Further, we demonstrate that the transactional and semantic models are not independent, and the performance is greatly enhanced when combining both.
#650

Improving Information Centrality of a Node in Complex Networks by Adding Edges
Liren Shan, Yuhao Yi, Zhongzhi Zhang

AI and the Web, Networks 1

The problem of increasing the centrality of a network node arises in many practical applications. In this paper, we study the optimization problem of maximizing the information centrality I_v of a given node v in a network with n nodes and m edges, by creating k new edges incident to v. Since I_v is the reciprocal of the sum of resistance distance R_v between v and all nodes, we alternatively consider the problem of minimizing R_v by adding k new edges linked to v. We show that the objective function is monotone and supermodular. We provide a simple greedy algorithm with an approximation factor (1 − 1/e) and O(n^3) running time. To speed up the computation, we also present an algorithm to compute (1 − 1/e − epsilon) approximate resistance distance R_v after iteratively adding k edges, the running time of which is Otilde(mk*epsilon^−2) for any epsilon > 0, where the Otilde(·) notation suppresses the poly(log n) factors. We experimentally demonstrate the effectiveness and efficiency of our proposed algorithms.
#1734

Scalable Multiplex Network Embedding
Hongming Zhang, Liwei Qiu, Lingling Yi, Yangqiu Song

AI and the Web, Networks 1

Network embedding has been proven to be helpful for many real-world problems. In this paper, we present a scalable multiplex network embedding model to represent information of multi-type relations into a unified embedding space. To combine information of different types of relations while maintaining their distinctive properties, for each node, we propose one high-dimensional common embedding and a lower-dimensional additional embedding for each type of relation. Then multiple relations can be learned jointly based on a unified network embedding model. We conduct experiments on two tasks: link prediction and node classification using six different multiplex networks. On both tasks, our model achieved better or comparable performance compared to current state-of-the-art models with less memory use.
#1884

Adversarially Regularized Graph Autoencoder for Graph Embedding
Shirui Pan, Ruiqi Hu, Guodong Long, Jing Jiang, Lina Yao, Chengqi Zhang

AI and the Web, Networks 1

Graph embedding is an effective method to represent graph data in a low dimensional space for graph analytics. Most existing embedding algorithms typically focus on preserving the topological structure or minimizing the reconstruction errors of graph data, but they have mostly ignored the data distribution of the latent codes from the graphs, which often results in inferior embedding in real-world graph data. In this paper, we propose a novel adversarial graph embedding framework for graph data. The framework encodes the topological structure and node content in a graph to a compact representation, on which a decoder is trained to reconstruct the graph structure. Furthermore, the latent representation is enforced to match a prior distribution via an adversarial training scheme. To learn a robust embedding, two variants of adversarial approaches, adversarially regularized graph autoencoder (ARGA) and adversarially regularized variational graph autoencoder (ARVGA), are developed. Experimental studies on real-world graphs validate our design and demonstrate that our algorithms outperform baselines by a wide margin in link prediction, graph clustering, and graph visualization tasks.
#3528

Discrete Interventions in Hawkes Processes with Applications in Invasive Species Management
Amrita Gupta, Mehrdad Farajtabar, Bistra Dilkina, Hongyuan Zha

AI and the Web, Networks 1

The spread of invasive species to new areas threatens the stability of ecosystems and causes major economic losses. We propose a novel approach to minimize the spread of an invasive species given a limited intervention budget. We first model invasive species spread using Hawkes processes, and then derive closed-form expressions for characterizing the effect of an intervention action on the invasion process. We use this to obtain an optimal intervention plan based on an integer programming formulation, and compare the optimal plan against several ecologically-motivated heuristic strategies used in practice. We present an empirical study of two variants of the invasive control problem: minimizing the final rate of invasions, and minimizing the number of invasions at the end of a given time horizon. The optimized intervention achieves nearly the same level of control that would be attained by completely eradicating the species, but at only 60-80% of the cost.
#2600

Learning to Explain Ambiguous Headlines of Online News
Tianyu Liu, Wei Wei, Xiaojun Wan

AI and the Web, Networks 1

With the purpose of attracting clicks, online news publishers and editors use diverse strategies to make their headlines catchy, with a sacrifice of accuracy. Specifically, a considerable portion of news headlines is ambiguous. Such headlines are unclear relative to the content of the story, and largely degrade the reading experience of the audience. In this paper, we focus on dealing with the information gap caused by the ambiguous news headlines. We define a new task of explaining ambiguous headlines with short informative texts, and build a benchmark dataset for evaluation. We address the task by selecting a proper sentence from the news body to resolve the ambiguity in an ambiguous headline. Both feature engineering methods and neural network methods are explored. For feature engineering, we improve a standard SVM classifier with elaborately designed features. For neural networks, we propose an ambiguity-aware neural matching model based on a previous model. Utilizing automatic and manual evaluation metrics, we demonstrate the efficacy and the complementarity of the two methods, and the ambiguity-aware neural matching model achieves the state-of-the-art performance on this challenging task.
#2703

Fact Checking via Evidence Patterns
Valeria Fionda, Giuseppe Pirrò

AI and the Web, Networks 1

We tackle fact checking using Knowledge Graphs (KGs) as a source of background knowledge. Our approach leverages the KG schema to generate candidate evidence patterns, that is, schema-level paths that capture the semantics of a target fact in alternative ways. Patterns verified in the data are used to both assemble semantic evidence for a fact and provide a numerical assessment of its truthfulness. We present efficient algorithms to generate and verify evidence patterns, and assemble evidence. We also provide a translation of the core of our algorithms into the SPARQL query language. Our approach is not only faster than the state of the art with comparable accuracy, but can also use any SPARQL-enabled KG.
#1650

Globally Optimized Mutual Influence Aware Ranking in E-Commerce Search
Tao Zhuang, Wenwu Ou, Zhirong Wang

AI and the Web, Networks 1

In web search, mutual influences between documents have been studied from the perspective of search result diversification. However, the methods used in web search are not directly applicable to e-commerce search because of the differences between the two settings, and little research has been done on the mutual influences between items in e-commerce search. We propose a global optimization framework for mutual influence aware ranking in e-commerce search. Our framework directly optimizes the Gross Merchandise Volume (GMV) for ranking, and decomposes ranking into two tasks. The first task is mutual influence aware purchase probability estimation. We propose a global feature extension method to incorporate mutual influences into the features of an item. We also use a Recurrent Neural Network (RNN) to capture influences related to ranking orders in purchase probability estimation. The second task is to find the best ranking order based on the purchase probability estimations. We treat the second task as a sequence generation problem and solve it using the beam search algorithm. We performed an online A/B test on a large e-commerce search engine. The results show that our method brings a 5% increase in GMV for the search engine over a strong baseline.
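The second task, treating ranking as sequence generation solved by beam search, can be sketched as below. The gain function is a made-up stand-in for an order-dependent purchase probability model (the paper uses an RNN); the item table, the position discount and the same-category penalty are all assumptions for illustration.

```python
from itertools import permutations

# Hypothetical catalogue: item -> (price, base purchase probability, category)
ITEMS = {
    "phone": (500.0, 0.10, "electronics"),
    "case": (20.0, 0.30, "accessory"),
    "tablet": (300.0, 0.08, "electronics"),
}

def expected_gain(prefix, item):
    """Toy expected revenue of placing item after the already ranked prefix."""
    price, prob, category = ITEMS[item]
    position_discount = 1.0 / (1 + len(prefix))
    # Invented mutual influence: an item loses appeal right after a similar one.
    same_as_previous = bool(prefix) and ITEMS[prefix[-1]][2] == category
    return price * prob * position_discount * (0.5 if same_as_previous else 1.0)

def beam_search_ranking(items, gain, beam_width):
    """Build a ranking left to right, keeping the best beam_width prefixes."""
    beams = [((), 0.0)]
    for _ in range(len(items)):
        candidates = [
            (prefix + (item,), total + gain(prefix, item))
            for prefix, total in beams
            for item in items
            if item not in prefix
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0]

def sequence_value(order, gain):
    """Total expected revenue of a complete ranking."""
    total = 0.0
    for i, item in enumerate(order):
        total += gain(order[:i], item)
    return total

order, value = beam_search_ranking(list(ITEMS), expected_gain, beam_width=6)
```

A small beam width trades optimality for speed; with three items a width of 6 keeps every prefix, so the search is exhaustive in this toy case.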

Wednesday 18 08:30 - 09:45 EAR6 - Early Career 6 (VICTORIA)

Chair: Craig Boutilier

#5487

Statistical Quality Control for Human Computation and Crowdsourcing
Yukino Baba

Early Career 6

Human computation is a method for solving difficult problems by combining humans and computers. Quality control is a critical issue in human computation because it relies on a large number of participants (i.e., crowds) and there is an uncertainty about their reliability. A solution for this issue is to leverage the power of the "wisdom of crowds"; for example, we can aggregate the outputs of multiple participants or ask a participant to check the output of another participant to improve its quality. In this paper, we review several statistical approaches for controlling the quality of outputs from crowds.
#5489

Engineering Graph Features via Network Functional Blocks
Vincent W. Zheng

Early Career 6

Graphs are a prevalent data structure that enables many predictive tasks. How to engineer graph features is a fundamental question. Our concept is to go beyond nodes and edges, and explore richer structures (e.g., paths, subgraphs) for graph feature engineering. We call such richer structures network functional blocks, because each structure serves as a network building block but with some different functionality. We use semantic proximity search as an example application to share our recent work on exploiting different granularities of network functional blocks. We show that network functional blocks are effective, and they can be useful for a wide range of applications.
#5497

Symbolic Compilation, Inference, and Decision-making with Deep-learned Models
Scott Sanner

Early Career 6

Wednesday 18 08:30 - 09:45 MAS-AM2 - Auctions and Markets 2 (C8)

Chair: Aris Filos-Ratsikas

#1168

Online Pricing for Revenue Maximization with Unknown Time Discounting Valuations
Weichao Mao, Zhenzhe Zheng, Fan Wu, Guihai Chen

Auctions and Markets 2

Online pricing mechanisms have been widely applied to resource allocation in multi-agent systems. However, most of the existing online pricing mechanisms assume buyers have fixed valuations over the time horizon, which cannot capture the dynamic nature of valuation in emerging applications. In this paper, we study the problem of revenue maximization in online auctions with unknown time discounting valuations, and model it as non-stationary multi-armed bandit optimization. We design an online pricing mechanism, namely Biased-UCB, based on unique features of the discounting valuations. We use competitive analysis to theoretically evaluate the performance guarantee of our pricing mechanism, and derive the competitive ratio. Numerical results show that our design achieves good performance in terms of revenue maximization on a real-world bidding dataset.
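As background for the bandit formulation above, the sketch below runs the standard UCB1 rule over a fixed set of candidate prices. It is not the paper's Biased-UCB mechanism and it assumes stationary buyer behaviour, unlike the time-discounting valuations studied there; the price grid, the linear purchase probability and the function name `ucb1_pricing` are illustrative assumptions.

```python
import math
import random

def ucb1_pricing(prices, buy_probability, rounds, seed=0):
    """Post one price per round, choosing it with the UCB1 index.

    buy_probability: hypothetical function mapping a price to the chance
    that a buyer accepts it (unknown to the seller in practice).
    Assumes rounds >= len(prices) so that every price is tried once.
    """
    rng = random.Random(seed)
    scale = max(prices)  # rewards lie in [0, scale]
    counts = [0] * len(prices)
    revenue = [0.0] * len(prices)
    for t in range(1, rounds + 1):
        if t <= len(prices):
            arm = t - 1  # try every price once first
        else:
            # Empirical mean revenue plus an exploration bonus.
            arm = max(
                range(len(prices)),
                key=lambda i: revenue[i] / counts[i]
                + scale * math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        if rng.random() < buy_probability(prices[arm]):
            revenue[arm] += prices[arm]
        counts[arm] += 1
    return counts, revenue

prices = [1.0, 2.0, 3.0, 4.0]
counts, revenue = ucb1_pricing(prices, lambda p: max(0.0, 1.0 - p / 5.0), rounds=2000)
```

The returned `counts` show how often each price was posted; prices with higher observed revenue per round are selected more often as the exploration bonus shrinks.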
#2088

The Promise and Perils of Myopia in Dynamic Pricing With Censored Information
Meenal Chhabra, Sanmay Das, Ilya Ryzhov

Auctions and Markets 2

A seller with unlimited inventory of a digital good interacts with potential buyers with i.i.d. valuations. The seller can adaptively quote prices to each buyer to maximize long-term profits, but does not know the valuation distribution exactly. Under a linear demand model, we consider two information settings: partially censored, where agents who buy reveal their true valuations after the purchase is completed, and completely censored, where agents never reveal their valuations. In the partially censored case, we prove that myopic pricing with a Pareto prior is Bayes optimal and has finite regret. In both settings, we evaluate the myopic strategy against more sophisticated look-aheads using three valuation distributions generated from real data on auctions of physical goods, keyword auctions, and user ratings, where the linear demand assumption is clearly violated. For some datasets, complete censoring actually helps, because the restricted data acts as a "regularizer" on the posterior, preventing it from being affected too much by outliers.
#3130

Customer Sharing in Economic Networks with Costs
Bin Li, Dong Hao, Dengji Zhao, Tao Zhou

Auctions and Markets 2

In an economic market, sellers, infomediaries and customers constitute an economic network. Each seller has her own customer group and the seller's private customers are unobservable to other sellers. Therefore, a seller can only sell commodities among her own customers unless other sellers or infomediaries share her sale information to their customer groups. However, a seller is not incentivized to share others' sale information by default, which leads to inefficient resource allocation and limited revenue for the sale. To tackle this problem, we develop a novel mechanism called customer sharing mechanism (CSM) which incentivizes all sellers to share each other's sale information to their private customer groups. Furthermore, CSM also incentivizes all customers to truthfully participate in the sale. In the end, CSM not only allocates the commodities efficiently but also optimizes the seller's revenue.
#3275

Budget-feasible Procurement Mechanisms in Two-sided Markets
Weiwei Wu, Xiang Liu, Minming Li

Auctions and Markets 2

This paper considers the mechanism design problem in two-sided markets where multiple strategic buyers come with budgets to procure as much value of items as possible from the strategic sellers. Each seller holds an item with public value and is allowed to bid its private cost. Buyers could claim their budgets, not necessarily the true ones. The goal is to seek budget-feasible mechanisms that ensure sellers are rewarded enough payment and buyers' budgets are not exceeded. Our main contribution is a random mechanism that simultaneously provides several desirable theoretical guarantees: budget feasibility, truthfulness on both the sellers' side and the buyers' side, and a constant approximation to the optimal total procured value of buyers.
#3604

Integrating Demand Response and Renewable Energy In Wholesale Market
Chaojie Li, Chen Liu, Xinghuo Yu, Ke Deng, Tingwen Huang, Liangchen Liu

Auctions and Markets 2

Demand response (DR) can provide a cost-effective approach for reducing peak loads, while renewable energy sources (RES) can offer an environmentally friendly solution to the problem of power shortage. The increasing integration of DR and renewable energy brings challenging issues for energy policy makers and electricity market regulators in the main power grid. In this paper, a new two-stage stochastic game model is introduced to operate the electricity market, where the Stochastic Stackelberg-Cournot-Nash (SSCN) equilibrium is applied to characterize the optimal energy bidding strategy of the forward market and the optimal energy trading strategy of the spot market. To obtain an SSCN equilibrium, the sampling average approximation (SAA) technique is harnessed to address the stochastic game model in a distributed way. With this game model, the participation ratio of demand response can be significantly increased while the unreliability of the power system caused by renewable energy resources can be considerably reduced. The effectiveness of the proposed model is illustrated by extensive simulations.
#2070

Equilibrium Behavior in Competing Dynamic Matching Markets
Zhuoshu Li, Neal Gupta, Sanmay Das, John P. Dickerson

Auctions and Markets 2

Rival markets like rideshare services, universities, and organ exchanges compete to attract participants, seeking to maximize their own utility at potential cost to overall social welfare. Similarly, individual participants in such multi-market systems also seek to maximize their individual utility. If entry is costly, they should strategically enter only a subset of the available markets. All of this decision making---markets competitively adapting their matching strategies and participants arriving, choosing which market(s) to enter, and departing from the system---occurs dynamically over time. This paper provides the first analysis of equilibrium behavior in dynamic competing matching market systems---first from the points of view of individual participants when market policies are fixed, and then from the points of view of markets when agents are stochastic. When compared to single markets running social-welfare-maximizing matching policies, losses in overall social welfare in competitive systems manifest due to both market fragmentation and the use of non-optimal matching policies. We quantify such losses and provide policy recommendations to help alleviate them in fielded systems.

Wednesday 18 08:30 - 09:55 KR-CSAT - Knowledge Representation, Constraints and Satisfiability (C7)

Chair: Francesco Ricca

#382

Exploiting Justifications for Lazy Grounding of Answer Set Programs
Bart Bogaerts, Antonius Weinzierl

Knowledge Representation, Constraints and Satisfiability

Answer set programming (ASP) is an established knowledge representation formalism. Lazy grounding avoids the so-called grounding bottleneck of ASP by interleaving grounding and solving; this technique was recently extended to work with conflict-driven clause learning. Unfortunately, it often happens that such a lazy grounding ASP system, at the fixpoint of the evaluation, arrives at an assignment that contains literals that are true but unjustified. The system then is unable to determine the actual causes of the situation and falls back to chronological backtracking, potentially wasting an exponential amount of time. In this paper, we show how top-down query mechanisms can be used to analyze the situation, learn a new clause or nogood, and backjump further in the search tree. Contributions include a rephrasing of lazy grounding in terms of justifications and algorithms to construct relevant justifications without grounding. Initial experiments indicate that the newly developed techniques indeed allow for an exponential speed-up.
#1227

Possibilistic ASP Base Revision by Certain Input
Laurent Garcia, Claire Lefèvre, Odile Papini, Igor Stéphan, Eric Würbel

Knowledge Representation, Constraints and Satisfiability

Belief base revision has been studied within the answer set programming framework. We go a step further by introducing uncertainty and studying belief base revision when beliefs are represented by possibilistic logic programs under possibilistic answer set semantics and revised by certain input. The paper proposes two approaches of rule-based revision operators and presents their semantic characterization in terms of possibilistic distribution. This semantic characterization allows for equivalently considering the evolution of syntactic logic programs and the evolution of their semantic content. It then studies the logical properties of the proposed operators and gives complexity results.
#1719

Pseudo-Boolean Constraints from a Knowledge Representation Perspective
Daniel Le Berre, Pierre Marquis, Stefan Mengel, Romain Wallon

Knowledge Representation, Constraints and Satisfiability

We study pseudo-Boolean constraints (PBC) and their special case cardinality constraints (CARD) from the perspective of knowledge representation. To this end, the succinctness of PBC and CARD is compared to that of many standard propositional languages. Moreover, we determine which queries and transformations are feasible in polynomial time when knowledge is represented by PBC or CARD, and which are not (unconditionally or unless P = NP). In particular, the advantages and disadvantages compared to CNF are discussed.
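To make the two constraint classes concrete, here is a minimal sketch of how a single pseudo-Boolean constraint of the form w1*l1 + ... + wn*ln >= k can be evaluated under a truth assignment. The signed-integer encoding of literals (positive for a variable, negative for its negation) and the function names are assumptions chosen for illustration; a cardinality constraint is simply the special case in which every weight is 1.

```python
def literal_is_true(literal, assignment):
    """Literal v > 0 means variable v; literal -v means its negation."""
    value = assignment[abs(literal)]
    return value if literal > 0 else not value


def satisfies_pb(weights, literals, bound, assignment):
    """Check sum of weights of true literals >= bound."""
    if len(weights) != len(literals):
        raise ValueError("weights and literals must have the same length")
    total = sum(w for w, lit in zip(weights, literals)
                if literal_is_true(lit, assignment))
    return total >= bound


def satisfies_card(literals, bound, assignment):
    """Cardinality constraint: at least `bound` of the literals are true."""
    return satisfies_pb([1] * len(literals), literals, bound, assignment)
```

For example, the constraint 3*x1 + 2*(not x2) + x3 >= 3 would be written as `satisfies_pb([3, 2, 1], [1, -2, 3], 3, assignment)`.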
#3103

Novel Algorithms for Abstract Dialectical Frameworks based on Complexity Analysis of Subclasses and SAT Solving
Thomas Linsbichler, Marco Maratea, Andreas Niskanen, Johannes P. Wallner, Stefan Woltran

Knowledge Representation, Constraints and Satisfiability

Abstract dialectical frameworks (ADFs) constitute one of the most powerful formalisms in abstract argumentation. Their high computational complexity poses, however, certain challenges when designing efficient systems. In this paper, we tackle this issue by (i) analyzing the complexity of ADFs under structural restrictions, (ii) presenting novel algorithms which make use of these insights, and (iii) empirically evaluating a resulting implementation which relies on calls to SAT solvers.
#3648

Stratified Negation in Limit Datalog Programs
Mark Kaminski, Bernardo Cuenca Grau, Egor V. Kostylev, Boris Motik, Ian Horrocks

Knowledge Representation, Constraints and Satisfiability

There has recently been an increasing interest in declarative data analysis, where analytic tasks are specified using a logical language, and their implementation and optimisation are delegated to a general-purpose query engine. Existing declarative languages for data analysis can be formalised as variants of logic programming equipped with arithmetic function symbols and/or aggregation, and are typically undecidable. In prior work, the language of limit programs was proposed, which is sufficiently powerful to capture many analysis tasks and has decidable entailment problem. Rules in this language, however, do not allow for negation. In this paper, we study an extension of limit programs with stratified negation-as-failure. We show that the additional expressive power makes reasoning computationally more demanding, and provide tight data complexity bounds. We also identify a fragment with tractable data complexity and sufficient expressivity to capture many relevant tasks.
#1833

Classification Transfer for Qualitative Reasoning Problems
Manuel Bodirsky, Peter Jonsson, Barnaby Martin, Antoine Mottet

Knowledge Representation, Constraints and Satisfiability

We study formalisms for temporal and spatial reasoning in the modern context of Constraint Satisfaction Problems (CSPs). We show how questions on the complexity of their subclasses can be solved using existing results via the powerful use of primitive positive (pp) interpretations and pp-homotopy. We demonstrate the methodology by giving a full complexity classification of all constraint languages that are first-order definable in Allen's Interval Algebra and contain the basic relations (s) and (f). In the case of the Rectangle Algebra we answer in the affirmative the old open question as to whether ORD-Horn is a maximally tractable subset among the (disjunctive, binary) relations. We then generalise our results for the Rectangle Algebra to the r-dimensional Block Algebra.
#3240

Simpler and Faster Algorithm for Checking the Dynamic Consistency of Conditional Simple Temporal Networks
Luke Hunsberger, Roberto Posenato

Knowledge Representation, Constraints and Satisfiability

Recent work on Conditional Simple Temporal Networks (CSTNs) has focused on checking the dynamic consistency (DC) property assuming that execution strategies can react instantaneously to observations. Three alternative semantics---IR-DC, 0-DC, and π-DC---have been presented. The most practical DC-checking algorithm for CSTNs has only been analyzed with respect to the IR-DC semantics, while the 0-DC semantics was shown to have a serious flaw that the π-DC semantics fixed. Whether the IR-DC semantics had the same flaw and, if so, what the consequences would be for the DC-checking algorithm remained open questions. This paper (1) shows that the IR-DC semantics is also flawed; (2) shows that one of the constraint-propagation rules from the IR-DC-checking algorithm is not sound with respect to the IR-DC semantics; (3) presents a simpler algorithm, called the π-DC-checking algorithm; (4) proves that it is sound and complete with respect to the π-DC semantics; and (5) empirically evaluates the new algorithm.

Wednesday 18 08:30 - 09:55 ML-TAM2 - Transfer, Adaptation, Multi-Task Learning 2 (K2)

Chair: Yuguang Yan

#2043

Label-Sensitive Task Grouping by Bayesian Nonparametric Approach for Multi-Task Multi-Label Learning
Xiao Zhang, Wenzhong Li, Vu Nguyen, Fuzhen Zhuang, Hui Xiong, Sanglu Lu

Transfer, Adaptation, Multi-Task Learning 2

Multi-label learning is widely applied in many real-world applications, such as image and gene annotation. While most of the existing multi-label learning models focus on the single-task learning problem, there are always some tasks that share some commonalities, which can help each other to improve the learning performance if the knowledge in the similar tasks can be smartly shared. In this paper, we propose a LABel-sensitive TAsk Grouping framework, named LABTAG, based on a Bayesian nonparametric approach for multi-task multi-label classification. The proposed framework explores the label correlations to capture feature-label patterns, and clusters similar tasks into groups with shared knowledge, which are learned jointly to produce a strengthened multi-task multi-label model. We evaluate the model performance on three public multi-task multi-label data sets, and the results show that LABTAG outperforms the compared baselines by a significant margin.
#3072

Cross-Domain 3D Model Retrieval via Visual Domain Adaption
Anan Liu, Shu Xiang, Wenhui Li, Weizhi Nie, Yuting Su

Transfer, Adaptation, Multi-Task Learning 2

Recent advances in 3D capturing devices and 3D modeling software have led to extensive and diverse 3D datasets, which usually have different distributions. Cross-domain 3D model retrieval is becoming an important but challenging task. However, existing works mainly focus on 3D model retrieval in a closed dataset, which seriously constrains their implementation for real applications. To address this problem, we propose a novel cross-domain 3D model retrieval method by visual domain adaptation. This method can inherit the advantage of deep learning to learn multi-view visual features in a data-driven manner for 3D model representation. Moreover, it can reduce the domain divergence by exploiting both domain-shared and domain-specific features of different domains. Consequently, it can augment the discrimination of visual descriptors for cross-domain similarity measure. Extensive experiments on two popular datasets, under three designed cross-domain scenarios, demonstrate the superiority and effectiveness of the proposed method by comparing against the state-of-the-art methods. In particular, the proposed method can significantly outperform the most recent method for cross-domain 3D model retrieval and the champion of Shrec’16 Large-Scale 3D Shape Retrieval from ShapeNet Core55.
#4085

Same Representation, Different Attentions: Shareable Sentence Representation Learning from Multiple Tasks
Renjie Zheng, Junkun Chen, Xipeng Qiu

Transfer, Adaptation, Multi-Task Learning 2

Distributed representation plays an important role in deep learning based natural language processing. However, the representation of a sentence often varies in different tasks, which is usually learned from scratch and suffers from the limited amounts of training data. In this paper, we claim that a good sentence representation should be invariant and can benefit the various subsequent tasks. To achieve this purpose, we propose a new scheme of information sharing for multi-task learning. More specifically, all tasks share the same sentence representation and each task can select the task-specific information from the shared sentence representation with attention mechanisms. The query vector of each task's attention could be either static parameters or generated dynamically. We conduct extensive experiments on 16 different text classification tasks, which demonstrate the benefits of our architecture. The source code for this paper is available on GitHub.
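The idea of one shared sentence representation read through task-specific attention can be sketched as follows. This is a minimal illustration, assuming dot-product scoring and a static query vector per task; the function name, the scoring function and the toy vectors are assumptions, not details taken from the paper.

```python
import math


def attend(hidden_states, query):
    """Summarize shared token representations with one task's query vector.

    hidden_states: list of equal-length float lists (one per token), shared by all tasks.
    query: float list of the same length, specific to one task.
    Returns (attention_weights, weighted_sum).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    weights = [e / norm for e in exps]
    dim = len(hidden_states[0])
    summary = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, summary


# Two hypothetical tasks read the same shared representation differently.
shared = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights_a, summary_a = attend(shared, [2.0, 0.0])  # task A favors dimension 0
weights_b, summary_b = attend(shared, [0.0, 2.0])  # task B favors dimension 1
```

Because only the query differs between tasks, the encoder that produces `shared` can be trained jointly on all of them.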
#2791

Online Heterogeneous Transfer Metric Learning
Yong Luo, Tongliang Liu, Yonggang Wen, Dacheng Tao

Transfer, Adaptation, Multi-Task Learning 2

Distance metric learning (DML) has been demonstrated to be successful and essential in diverse applications. Transfer metric learning (TML) can help DML in the target domain with limited label information by utilizing information from some related source domains. The heterogeneous TML (HTML), where the feature representations vary from the source to the target domain, is general and challenging. However, current HTML approaches are usually conducted in a batch manner and cannot handle sequential data. This motivates the proposed online HTML (OHTML) method. In particular, the distance metric in the source domain is pre-trained using some existing DML algorithms. To enable knowledge transfer, we assume there are large amounts of unlabeled corresponding data that have representations in both the source and target domains. By enforcing the distances (between these unlabeled samples) in the target domain to agree with those in the source domain under the manifold regularization theme, we learn an improved target metric. We formulate the problem in the online setting so that the optimization is efficient and the model can be adapted to new coming data. Experiments in diverse applications demonstrate both effectiveness and efficiency of the proposed method.
#3379

Predicting Activity and Location with Multi-task Context Aware Recurrent Neural Network
Dongliang Liao, Weiqing Liu, Yuan Zhong, Jing Li, Guowei Wang

Transfer, Adaptation, Multi-Task Learning 2

Predicting users’ activity and location preferences is of great significance in location based services. Considering that users’ activity and location preferences interplay with each other, many scholars tried to figure out the relation between users’ activities and locations for improving prediction performance. However, most previous works enforce a rigid human-defined modeling strategy to capture these two factors, either activity purpose controlling location preference or spatial region determining activity preference. Unlike existing methods, we introduce spatial-activity topics as the latent factor capturing both users’ activity and location preferences. We propose Multi-task Context Aware Recurrent Neural Network to leverage the spatial activity topic for activity and location prediction. More specifically, a novel Context Aware Recurrent Unit is designed to integrate the sequential dependency and temporal regularity of spatial activity topics. Extensive experimental results based on real-world public datasets demonstrate that the proposed model significantly outperforms state-of-the-art approaches.
#40

Improving Entity Recommendation with Search Log and Multi-Task Learning
Jizhou Huang, Wei Zhang, Yaming Sun, Haifeng Wang, Ting Liu

Transfer, Adaptation, Multi-Task Learning 2

Entity recommendation, providing search users with an improved experience by assisting them in finding related entities for a given query, has become an indispensable feature of today's Web search engines. Existing studies typically only consider the query issued at the current time step while ignoring the in-session preceding queries. Thus, they typically fail to handle ambiguous queries such as "apple" because the model cannot tell which apple (company or fruit) is meant. In this work, we believe that the in-session contexts convey valuable evidence that could facilitate the semantic modeling of queries, and take that into consideration for entity recommendation. Furthermore, in order to better model the semantics of queries, we learn the model in a multi-task learning setting where the query representation is shared across entity recommendation and context-aware ranking. We evaluate our approach using large-scale, real-world search logs of a widely used commercial Web search engine. The experimental results show that incorporating context information significantly improves entity recommendation, and learning the model in a multi-task learning setting could bring further improvements.
#4231

Experienced Optimization with Reusable Directional Model for Hyper-Parameter Search
Yi-Qi Hu, Yang Yu, Zhi-Hua Zhou

Transfer, Adaptation, Multi-Task Learning 2

Hyper-parameter selection is a crucial yet difficult issue in machine learning. For this problem, derivative-free optimization has been playing an irreplaceable role. However, derivative-free optimization commonly requires a lot of hyper-parameter samples, while each sample could have a high cost for hyper-parameter selection due to the costly evaluation of a learning model. To tackle this issue, in this paper, we propose an experienced optimization approach, i.e., learning how to optimize better from a set of historical optimization processes. From the historical optimization processes on previous datasets, a directional model is trained to predict the direction of the next good hyper-parameter. The directional model is then reused to guide the optimization in learning new datasets. We implement this mechanism within a state-of-the-art derivative-free optimization method SRacos, and conduct experiments on learning the hyper-parameters of heterogeneous ensembles and neural network architectures. Experimental results verify that the proposed approach can significantly improve the learning accuracy within a limited hyper-parameter sample budget.

Wednesday 18 08:30 - 09:55 NLP-SEC - Sentence Embedding, Text Classification (T2)

Chair: Yangqiu Song

#1285

Reinforced Self-Attention Network: a Hybrid of Hard and Soft Attention for Sequence Modeling
Tao Shen, Tianyi Zhou, Guodong Long, Jing Jiang, Sen Wang, Chengqi Zhang

Sentence Embedding, Text Classification

Many natural language processing tasks solely rely on sparse dependencies between a few tokens in a sentence. Soft attention mechanisms show promising performance in modeling local/global dependencies by soft probabilities between every two tokens, but they are not effective and efficient when applied to long sentences. By contrast, hard attention mechanisms directly select a subset of tokens but are difficult and inefficient to train due to their combinatorial nature. In this paper, we integrate both soft and hard attention into one context fusion model, "reinforced self-attention (ReSA)", for the mutual benefit of each other. In ReSA, a hard attention trims a sequence for a soft self-attention to process, while the soft attention feeds reward signals back to facilitate the training of the hard one. For this purpose, we develop a novel hard attention called "reinforced sequence sampling (RSS)", selecting tokens in parallel and trained via policy gradient. Using two RSS modules, ReSA efficiently extracts the sparse dependencies between each pair of selected tokens. We finally propose an RNN/CNN-free sentence-encoding model, "reinforced self-attention network (ReSAN)", solely based on ReSA. It achieves state-of-the-art performance on both the Stanford Natural Language Inference (SNLI) and the Sentences Involving Compositional Knowledge (SICK) datasets.
#1685

An Adaptive Hierarchical Compositional Model for Phrase Embedding
Bing Li, Xiaochun Yang, Bin Wang, Wei Wang, Wei Cui, Xianchao Zhang

Sentence Embedding, Text Classification

Phrase embedding aims at representing phrases in a vector space and it is important for the performance of many NLP tasks. Existing models only regard a phrase as either full-compositional or non-compositional, while ignoring the hybrid-compositionality that widely exists, especially in long phrases. This drawback prevents them from having a deeper insight into the semantic structure for long phrases and as a consequence, weakens the accuracy of the embeddings. In this paper, we present a novel method for jointly learning compositionality and phrase embedding by adaptively weighting different compositions using an implicit hierarchical structure. Our model has the ability of adaptively adjusting among different compositions without entailing too much model complexity and time cost. To the best of our knowledge, our work is the first effort that considers hybrid-compositionality in phrase embedding. The experimental evaluation demonstrates that our model outperforms state-of-the-art methods in both similarity tasks and analogy tasks.
#1909

Transformable Convolutional Neural Network for Text Classification
Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang, Yaohui Jin

Sentence Embedding, Text Classification

Convolutional neural networks (CNNs) have shown promising performance for natural language processing tasks, extracting n-grams as features to represent the input. However, n-gram based CNNs are inherently limited to fixed geometric structure and cannot proactively adapt to the transformations of features. In this paper, we propose two modules to provide CNNs with the flexibility for complex features and the adaptability for transformation, namely, transformable convolution and transformable pooling. Our method fuses dynamic and static deviations to redistribute the sampling locations, which can capture both current and global transformations. Our modules can be easily integrated into other models to generate new transformable networks. We test the proposed modules on two state-of-the-art models, and the results demonstrate that our modules can effectively adapt to the feature transformation in text classification.
#2269

Instance Weighting with Applications to Cross-domain Text Classification via Trading off Sample Selection Bias and Variance
Rui Xia, Zhenchun Pan, Feng Xu

Sentence Embedding, Text Classification

Domain adaptation is an important problem in natural language processing (NLP) due to the distributional difference between the labeled source domain and the target domain. In this paper, we study the domain adaptation problem from the instance weighting perspective. By using density ratio as the instance weight, the traditional instance weighting approaches can potentially correct the sample selection bias in domain adaptation. However, researchers often failed to achieve good performance when applying instance weighting to domain adaptation in NLP and many negative results were reported in the literature. In this work, we conduct an in-depth study on the causes of the failure, and find that previous work only focused on reducing the sample selection bias, but ignored another important factor, sample selection variance, in domain adaptation. On this basis, we propose a new instance weighting framework by trading off two factors in instance weight learning. We evaluate our approach on two cross-domain text classification tasks and compare it with eight instance weighting methods. The results prove our approach's advantages in domain adaptation performance, optimization efficiency and parameter stability.
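The tension between bias and variance in density-ratio weighting can be illustrated with a small sketch. It assumes one-dimensional Gaussian source and target densities that are known in advance, and it uses a flattening exponent `gamma` as a simple knob between no correction (`gamma = 0`) and full density-ratio weights (`gamma = 1`). The exponent, the effective-sample-size diagnostic and all names here are illustrative assumptions, not the framework proposed in the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def instance_weights(xs, source, target, gamma=1.0):
    """Weights (p_target(x) / p_source(x)) ** gamma, rescaled to mean 1.

    source and target are (mean, std) pairs; gamma in [0, 1] flattens the
    weights, accepting some selection bias in exchange for lower variance.
    """
    raw = [(gaussian_pdf(x, *target) / gaussian_pdf(x, *source)) ** gamma
           for x in xs]
    mean_weight = sum(raw) / len(raw)
    return [w / mean_weight for w in raw]


def effective_sample_size(weights):
    """Kish effective sample size; smaller values signal higher weight variance."""
    return sum(weights) ** 2 / sum(w * w for w in weights)
```

Comparing `effective_sample_size(instance_weights(xs, source, target, gamma))` for several values of `gamma` shows how sharper weights leave fewer source instances doing most of the work.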
#1962

Inferring Temporal Knowledge for Near-Periodic Recurrent Events
Dinesh Raghu, Surag Nair, Mausam

Sentence Embedding, Text Classification

We define the novel problem of extracting and predicting occurrence dates for a class of recurrent events -- events that are held periodically as per a near-regular schedule (e.g., conferences, film festivals, sport championships). Knowledge-bases such as Freebase contain a large number of such recurring events, but they also miss substantial information regarding specific event instances and their occurrence dates. We develop a temporal extraction and inference engine to fill in the missing dates as well as to predict their future occurrences. Our engine performs joint inference over several knowledge sources -- (1) information about an event instance and its date extracted from text by our temporal extractor, (2) information about the typical schedule (e.g., "every second week of June") for a recurrent event extracted by our schedule extractor, and (3) known dates for other instances of the same event. The output of our system is a representation for the event schedule and an occurrence date for each event instance. We find that our system beats humans in predicting future occurrences of recurrent events by significant margins. We release our code and system output for further research.
#2458

Time-evolving Text Classification with Deep Neural Networks
Yu He, Jianxin Li, Yangqiu Song, Mutian He, Hao Peng

Sentence Embedding, Text Classification

Traditional text classification algorithms are based on the assumption that data are independent and identically distributed. However, in most non-stationary scenarios, data may change smoothly due to long-term evolution and short-term fluctuation, which raises new challenges to traditional methods. In this paper, we present the first attempt to explore evolutionary neural network models for time-evolving text classification. We first introduce a simple way to extend arbitrary neural networks to evolutionary learning by using a temporal smoothness framework, and then propose a diachronic propagation framework to incorporate the historical impact into currently learned features through diachronic connections. Experiments on real-world news data demonstrate that our approaches greatly and consistently outperform traditional neural network models in both accuracy and stability.
#2173

EZLearn: Exploiting Organic Supervision in Automated Data Annotation
Maxim Grechkin, Hoifung Poon, Bill Howe

Sentence Embedding, Text Classification

Many real-world applications require automated data annotation, such as identifying tissue origins based on gene expressions and classifying images into semantic categories. Annotation classes are often numerous and subject to changes over time, and annotating examples has become the major bottleneck for supervised learning methods. In science and other high-value domains, large repositories of data samples are often available, together with two sources of organic supervision: a lexicon for the annotation classes, and text descriptions that accompany some data samples. Distant supervision has emerged as a promising paradigm for exploiting such indirect supervision by automatically annotating examples where the text description contains a class mention in the lexicon. However, due to linguistic variations and ambiguities, such training data is inherently noisy, which limits the accuracy in this approach. In this paper, we introduce an auxiliary natural language processing system for the text modality, and incorporate co-training to reduce noise and augment signal in distant supervision. Without using any manually labeled data, our EZLearn system learned to accurately annotate data samples in functional genomics and scientific figure comprehension, substantially outperforming state-of-the-art supervised methods trained on tens of thousands of annotated examples.

Wednesday 18 08:30 - 09:55 ROB-CV - Robotics and Vision (T1)

Chair: Mohan Sridharan

#1147

CR-GAN: Learning Complete Representations for Multi-view Generation
Yu Tian, Xi Peng, Long Zhao, Shaoting Zhang, Dimitris N. Metaxas

Robotics and Vision

Generating multi-view images from a single-view input is an important yet challenging problem. It has broad applications in vision, graphics, and robotics. Our study indicates that the widely-used generative adversarial network (GAN) may learn "incomplete" representations due to the single-pathway framework: an encoder-decoder network followed by a discriminator network. We propose CR-GAN to address this problem. In addition to the single reconstruction path, we introduce a generation sideway to maintain the completeness of the learned embedding space. The two learning paths collaborate and compete in a parameter-sharing manner, yielding largely improved generality to "unseen" datasets. More importantly, the two-pathway framework makes it possible to combine both labeled and unlabeled data for self-supervised learning, which further enriches the embedding space for realistic generations. We evaluate our approach on a wide range of datasets. The results prove that CR-GAN significantly outperforms state-of-the-art methods, especially when generating from "unseen" inputs in wild conditions.
#700

GraspNet: An Efficient Convolutional Neural Network for Real-time Grasp Detection for Low-powered Devices
Umar Asif, Jianbin Tang, Stefan Harrer

Robotics and Vision

Recent research on grasp detection has focused on improving accuracy through deep CNN models, but at the cost of large memory and computational resources. In this paper, we propose an efficient CNN architecture which produces high grasp detection accuracy in real-time while maintaining a compact model design. To achieve this, we introduce a CNN architecture termed GraspNet which has two main branches: i) An encoder branch which downsamples an input image using our novel Dilated Dense Fire (DDF) modules - squeeze and dilated convolutions with dense residual connections. ii) A decoder branch which upsamples the output of the encoder branch to the original image size using deconvolutions and fuse connections. We evaluated GraspNet for grasp detection using offline datasets and a real-world robotic grasping setup. In experiments, we show that GraspNet achieves superior grasp detection accuracy compared to the state-of-the-art computation-efficient CNN models with real-time inference speed on embedded GPU hardware (Nvidia Jetson TX1), making it suitable for low-powered devices.
#3050

An Appearance-and-Structure Fusion Network for Object Viewpoint Estimation
Yueying Kao, Weiming Li, Zairan Wang, Dongqing Zou, Ran He, Qiang Wang, Minsu Ahn, Sunghoon Hong

Robotics and Vision

Automatic object viewpoint estimation from a single image is an important but challenging problem in the machine intelligence community. Although impressive performance has been achieved, current state-of-the-art methods still have difficulty dealing with the visual ambiguity and structure ambiguity in real-world images. To tackle these problems, this paper proposes a novel Appearance-and-Structure Fusion network (ASFnet), which estimates viewpoint by fusing both appearance and structure information. The structure information is encoded by precise semantic keypoints and can help address the visual ambiguity. Meanwhile, distinguishable appearance features contribute to overcoming the structure ambiguity. Our ASFnet integrates an appearance path and a structure path into an end-to-end network and allows deep features to effectively share supervision from the two complementary aspects. A convolutional layer is learned to fuse the two path results adaptively. To balance the influence of the two supervision sources, a piecewise loss weight strategy is employed during training. Experimentally, our proposed network outperforms state-of-the-art methods on the public PASCAL 3D+ dataset, which verifies the effectiveness of our method and further corroborates the above proposition.
#2836

Active Recurrence of Lighting Condition for Fine-Grained Change Detection
Qian Zhang, Wei Feng, Liang Wan, Fei-Peng Tian, Ping Tan

Robotics and Vision

This paper addresses active lighting recurrence (ALR), a new problem that actively relocalizes a light source to physically reproduce the lighting condition for the same scene from a single reference image. ALR is of great importance for fine-grained visual monitoring and change detection, because some phenomena or minute changes can only be clearly observed under particular lighting conditions. Hence, effective ALR should be able to navigate a light source online toward the target pose, which is challenging due to the complexity and diversity of real-world lighting and imaging processes. We propose to use simple parallel lighting as an analogy model and, based on the Lambertian law, to compose an instant navigation ball for this purpose. We theoretically prove the feasibility of this ALR strategy for realistic near point light sources and its invariance to the ambiguity of normal and lighting decomposition. Extensive quantitative experiments and challenging real-world tasks on fine-grained change monitoring of cultural heritage verify the effectiveness of our approach. We also validate its generality to non-Lambertian scenes.
#2284

Implicit Non-linear Similarity Scoring for Recognizing Unseen Classes
Yuchen Guo, Guiguang Ding, Jungong Han, Sicheng Zhao, Bin Wang

Robotics and Vision

Recognizing unseen classes is an important task for real-world applications, due to: 1) it is common that some classes in reality have no labeled image exemplar for training; and 2) novel classes emerge rapidly. Recently, to address this task many zero-shot learning (ZSL) approaches have been proposed where explicit linear scores, like inner product score, are employed to measure the similarity between a class and an image. We argue that explicit linear scoring (ELS) seems too weak to capture complicated image-class correspondence. We propose a simple yet effective framework, called Implicit Non-linear Similarity Scoring (ICINESS). In particular, we train a scoring network which uses image and class features as input, fuses them by hidden layers, and outputs the similarity. Based on the universal approximation theorem, it can approximate the true similarity function between images and classes if a proper structure is used in an implicit non-linear way, which is more flexible and powerful. With ICINESS framework, we implement ZSL algorithms by shallow and deep networks, which yield consistently superior results.
#2750

Virtual-to-Real: Learning to Control in Visual Semantic Segmentation
Zhang-Wei Hong, Yu-Ming Chen, Hsuan-Kung Yang, Shih-Yang Su, Tzu-Yun Shann, Yi-Hsiang Chang, Brian Hsi-Lin Ho, Chih-Chieh Tu, Tsu-Ching Hsiao, Hsin-Wei Hsiao, Sih-Pin Lai, Yueh-Chuan Chang, Chun-Yi Lee

Robotics and Vision

Collecting training data from the physical world is usually time-consuming and even dangerous for fragile robots, and thus, recent advances in robot learning advocate the use of simulators as the training platform. Unfortunately, the reality gap between synthetic and real visual data prohibits direct migration of the models trained in virtual worlds to the real world. This paper proposes a modular architecture for tackling the virtual-to-real problem. The proposed architecture separates the learning model into a perception module and a control policy module, and uses semantic image segmentation as the meta representation for relating these two modules. The perception module translates the perceived RGB image to semantic image segmentation. The control policy module is implemented as a deep reinforcement learning agent, which performs actions based on the translated image segmentation. Our architecture is evaluated in an obstacle avoidance task and a target following task. Experimental results show that our architecture significantly outperforms all of the baseline methods in both virtual and real environments, and demonstrates a faster learning curve than them. We also present a detailed analysis for a variety of variant configurations, and validate the transferability of our modular architecture.
#2196

3D-PhysNet: Learning the Intuitive Physics of Non-Rigid Object Deformations
Zhihua Wang, Stefano Rosa, Bo Yang, Sen Wang, Niki Trigoni, Andrew Markham

Robotics and Vision

The ability to interact with and understand the environment is a fundamental prerequisite for a wide range of applications from robotics to augmented reality. In particular, predicting how deformable objects will react to applied forces in real time is a significant challenge. This is further confounded by the fact that shape information about encountered objects in the real world is often impaired by occlusions, noise and missing regions, e.g., a robot manipulating an object will only be able to observe a partial view of the entire solid. In this work we present a framework, 3D-PhysNet, which is able to predict how a three-dimensional solid will deform under an applied force using intuitive physics modelling. In particular, we propose a new method to encode the physical properties of the material and the applied force, enabling generalisation over materials. The key is to combine deep variational autoencoders with adversarial training, conditioned on the applied force and the material properties. We further propose a cascaded architecture that takes a single 2.5D depth view of the object and predicts its deformation. Training data is provided by a physics simulator. The network is fast enough to be used in real-time applications from partial views. Experimental results show the viability and the generalisation properties of the proposed architecture.

Wednesday 18 08:30 - 09:55 ML-ACT - Active Learning (K11)

Chair: Sheng-Jun Huang

#1487

Cost-Effective Active Learning for Hierarchical Multi-Label Classification
Yi-Fan Yan, Sheng-Jun Huang

Active Learning

Active learning reduces the labeling cost by actively querying labels for the most valuable data. It is particularly important for multi-label learning, where the annotation cost is rather high because each instance may have multiple labels simultaneously. In many multi-label tasks, the labels are organized into hierarchies from coarse to fine. The labels at different levels of the hierarchy contribute differently to the model training, and also have diverse annotation costs. In this paper, we propose a multi-label active learning approach to exploit the label hierarchies for cost-effective queries. By incorporating the potential contribution of ancestor and descendant labels, a novel criterion is proposed to estimate the informativeness of each candidate query. Further, a subset selection method is introduced to perform active batch selection by balancing the informativeness and cost of each instance-label pair. Experimental results validate the effectiveness of both the proposed criterion and the selection method.
#2386

On Whom Should I Perform this Lab Test Next? An Active Feature Elicitation Approach
Sriraam Natarajan, Srijita Das, Nandini Ramanan, Gautam Kunapuli, Predrag Radivojac

Active Learning

We consider the problem of active feature elicitation in which, given a few examples with all the features (say the full EHR) and a few examples with some of the features (say demographics), the goal is to identify the set of examples on whom more information (say the lab tests) needs to be collected. The observation is that some sets of features may be more expensive, personal or cumbersome to collect. We propose an active learning approach which identifies examples that are dissimilar to the ones with the full set of data and acquires the complete set of features for these examples. Motivated by real clinical tasks, our extensive evaluation on three clinical tasks demonstrates the effectiveness of this approach.
#4050

Hierarchical Active Learning with Group Proportion Feedback
Zhipeng Luo, Milos Hauskrecht

Active Learning

Learning of classification models in practice often relies on nontrivial human annotation effort in which humans assign class labels to data instances. As this process can be very time consuming and costly, finding effective ways to reduce the annotation cost becomes critical for building such models. In this work we solve this problem by exploring a new approach that actively learns classification models from groups, which are subpopulations of instances, and human feedback on the groups. Each group is labeled with a number in the [0,1] interval representing a human estimate of the proportion of instances with one of the class labels in this subpopulation. To form the groups to be annotated, we develop a hierarchical active learning framework that divides the whole population into smaller subpopulations, which allows us to gradually learn more refined models from the subpopulations and their class proportion labels. Our extensive experiments on numerous datasets show that our method is competitive and outperforms existing approaches for reducing the human annotation cost.
#3901

Experimental Design under the Bradley-Terry Model
Yuan Guo, Peng Tian, Jayashree Kalpathy-Cramer, Susan Ostmo, J. Peter Campbell, Michael F. Chiang, Deniz Erdogmus, Jennifer Dy, Stratis Ioannidis

Active Learning

Labels generated by human experts via comparisons exhibit smaller variance compared to traditional sample labels. Collecting comparison labels is challenging over large datasets, as the number of comparisons grows quadratically with the dataset size. We study the following experimental design problem: given a budget of expert comparisons, and a set of existing sample labels, we determine the comparison labels to collect that lead to the highest classification improvement. We study several experimental design objectives motivated by the Bradley-Terry model. The resulting optimization problems amount to maximizing submodular functions. We experimentally evaluate the performance of these methods over synthetic and real-life datasets.
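
As a rough illustration of the two quantities the abstract relies on, the sketch below computes the number of candidate comparisons for n items and the standard Bradley-Terry preference probability. The function names and the use of log-scale scores are assumptions made for this example; the paper's actual design objectives and selection procedure are not reproduced here.

```python
import math


def num_pairwise_comparisons(n_items: int) -> int:
    """Number of unordered pairs among n items: n * (n - 1) / 2 (quadratic in n)."""
    return n_items * (n_items - 1) // 2


def bradley_terry_prob(score_i: float, score_j: float) -> float:
    """Probability that item i is preferred over item j.

    Assumes log-scale scores, so that
    P(i > j) = exp(s_i) / (exp(s_i) + exp(s_j)) = 1 / (1 + exp(s_j - s_i)).
    """
    return 1.0 / (1.0 + math.exp(score_j - score_i))
```

Under this parameterization, equal scores give a probability of 0.5, and the probabilities for the two orderings of a pair sum to 1.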
#3268

Self-Supervised Deep Low-Rank Assignment Model for Prototype Selection
Xingxing Zhang, Zhenfeng Zhu, Yao Zhao, Deqiang Kong

Active Learning

Prototype selection is a promising technique for removing redundancy and irrelevance from large-scale data. Here, we consider it as a task assignment problem, which refers to assigning each element of a source set to one representative, i.e., prototype. However, due to the outliers and uncertain distribution on source, the selected prototypes are generally less representative and interesting. To alleviate this issue, we develop in this paper a Self-supervised Deep Low-rank Assignment model (SDLA). By dynamically integrating a low-rank assignment model with deep representation learning, our model effectively ensures the goodness-of-exemplar and goodness-of-discrimination of selected prototypes. Specifically, on the basis of a denoising autoencoder, dissimilarity metrics on source are continuously self-refined in embedding space with weak supervision from selected prototypes, thus preserving categorical similarity. Conversely, working on this metric space, similar samples tend to select the same prototypes by designing a low-rank assignment model. Experimental results on applications like text clustering and image classification (using prototypes) demonstrate our method is considerably superior to the state-of-the-art methods in prototype selection.
#3317

New Balanced Active Learning Model and Optimization Algorithm
Xiaoqian Wang, Yijun Huang, Ji Liu, Heng Huang

Active Learning

It is common in machine learning applications that unlabeled data are abundant while acquiring labels is extremely difficult. In order to reduce the cost of training model while maintaining the model quality, active learning provides a feasible solution. Instead of acquiring labels for random samples, active learning methods carefully select the data to be labeled so as to alleviate the impact from the redundancy or noise in the selected data and improve the trained model performance. In early stage experimental design, previous active learning methods adopted data reconstruction framework, such that the selected data maintained high representative power. However, these models did not consider the data class structure, thus the selected samples could be predominated by the samples from major classes. Such mechanism fails to include samples from the minor classes thus tends to be less "representative". To solve this challenging problem, we propose a novel active learning model for the early stage of experimental design. We use exclusive sparsity norm to enforce the selected samples to be (roughly) evenly distributed among different groups. We provide a new efficient optimization algorithm and theoretically prove the optimal convergence rate O(1/T^2). With a simple substitution, we reduce the computational load of each iteration from O(n^3) to O(n^2), which makes our algorithm more scalable than previous frameworks.
#2131

Adversarial Active Learning for Sequence Labeling and Generation
Yue Deng, KaWai Chen, Yilin Shen, Hongxia Jin

Active Learning

We introduce an active learning framework for general sequence learning tasks including sequence labeling and generation. Most existing active learning algorithms mainly rely on an uncertainty measure derived from the probabilistic classifier for query sample selection. However, such approaches suffer from two shortcomings in the context of sequence learning: 1) the cold start problem and 2) the label sampling dilemma. To overcome these shortcomings, we propose a deep-learning-based active learning framework to directly identify query samples from the perspective of adversarial learning. Our approach intends to offer labeling priorities for sequences whose information content is least covered by existing labeled data. We verify our sequence-based active learning approach on two tasks including sequence labeling and sequence generation.

Wednesday 18 08:30 - 09:55 SIS-WEB - Sister Conferences: Web, Recommendation, Retrieval (C2)

Chair: Jia Jia

#5105

Modeling the Assimilation-Contrast Effects in Online Product Rating Systems: Debiasing and Recommendations
Xiaoying Zhang, Hong Xie, Junzhou Zhao, John C.S. Lui

Sister Conferences: Web, Recommendation, Retrieval

The unbiasedness of online product ratings, an important property to ensure that users’ ratings indeed reflect their true evaluations to products, is vital both in shaping consumer purchase decisions and providing reliable recommendations. Recent experimental studies showed that distortions from historical ratings would ruin the unbiasedness of subsequent ratings. How to “discover” the distortions from historical ratings in each single rating (or at the micro-level), and perform the “debiasing operations” in real rating systems are the main objectives of this work. Using 42 million real customer ratings, we first show that users either “assimilate” or “contrast” to historical ratings under different scenarios: users conform to historical ratings if historical ratings are not far from the product quality (assimilation), while users deviate from historical ratings if historical ratings are significantly different from the product quality (contrast). This phenomenon can be explained by the well-known psychological argument: the “Assimilate-Contrast” theory. However, none of the existing works on modeling historical ratings’ influence have taken this into account, and this motivates us to propose the Historical Influence Aware Latent Factor Model (HIALF), the first model for real rating systems to capture and mitigate historical distortions in each single rating. HIALF also allows us to study the influence patterns of historical ratings from a modeling perspective, and it perfectly matches the assimilation and contrast effects we previously observed. Also, HIALF achieves significant improvements in predicting subsequent ratings, and accurately predicts the relationships revealed in previous empirical measurements on real ratings. Finally, we show that HIALF can contribute to better recommendations by decoupling users’ real preference from distorted ratings, and reveal the intrinsic product quality for wiser consumer purchase decisions.
#5113

Translation-based Recommendation: A Scalable Method for Modeling Sequential Behavior
Ruining He, Wang-Cheng Kang, Julian McAuley

Sister Conferences: Web, Recommendation, Retrieval

Modeling the complex interactions between users and items is at the core of designing successful recommender systems. One key task consists of predicting users’ personalized sequential behavior, where the challenge mainly lies in modeling ‘third-order’ interactions between a user, her previously visited item(s), and the next item to consume. In this paper, we propose a unified method, TransRec, to model such interactions for large-scale sequential prediction. Methodologically, we embed items into a ‘transition space’ where users are modeled as translation vectors operating on item sequences. Empirically, this approach outperforms the state-of-the-art on a wide spectrum of real-world datasets.
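
The translation idea in this abstract can be sketched as follows: the previous item's embedding is shifted by the user's vector, and the candidate closest to the shifted point is taken as the next item. This is a minimal sketch only; the embeddings, the function name `predict_next_item`, and the plain Euclidean distance are illustrative assumptions, not the paper's trained model or its full scoring function.

```python
import math
from typing import Dict, List


def predict_next_item(
    prev_item: str,
    user_vec: List[float],
    item_vecs: Dict[str, List[float]],
) -> str:
    """Return the item whose embedding is nearest to prev_item + user_vec.

    All vectors are hypothetical, hand-written embeddings; a real system
    would learn them from interaction sequences.
    """
    # Translate the previous item's embedding by the user's vector.
    target = [p + t for p, t in zip(item_vecs[prev_item], user_vec)]
    # Rank every other item by Euclidean distance to the translated point.
    candidates = [name for name in item_vecs if name != prev_item]
    return min(candidates, key=lambda name: math.dist(item_vecs[name], target))
```

For example, with items `a = [0, 0]`, `b = [1, 0]`, `c = [0, 1]` and a user vector `[1, 0]`, starting from `a` the translated point coincides with `b`, so `b` is returned.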
#5148

Unbiased Learning-to-Rank with Biased Feedback
Thorsten Joachims, Adith Swaminathan, Tobias Schnabel

Sister Conferences: Web, Recommendation, Retrieval

Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user-centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a propensity-weighted ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model mis-specification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.
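
The propensity-weighting idea can be illustrated with a small inverse-propensity-scored (IPS) estimate: each clicked result contributes its rank under the ranking being evaluated, divided by the probability that it was examined when the click was logged. The sketch below is illustrative only; the function names and the position-based propensity model `(1 / rank) ** eta` are assumptions for this example, and the paper's ranking SVM itself is not reproduced.

```python
from typing import Sequence


def position_propensity(logged_rank: int, eta: float = 1.0) -> float:
    """Assumed examination probability at a 1-based logged position."""
    return (1.0 / logged_rank) ** eta


def ips_rank_loss(
    new_ranks: Sequence[int],
    clicked: Sequence[bool],
    propensities: Sequence[float],
) -> float:
    """Sum of new_rank / propensity over clicked results only.

    Non-clicked results contribute nothing; clicks at positions that were
    unlikely to be examined are weighted up to compensate for position bias.
    """
    return sum(
        rank / prop
        for rank, was_clicked, prop in zip(new_ranks, clicked, propensities)
        if was_clicked
    )
```

A click logged at position 2 with `eta = 1` has propensity 0.5, so it counts twice as much as a click logged at position 1.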
#5108

A Model of Distributed Query Computation in Client-Server Scenarios on the Semantic Web
Olaf Hartig, Ian Letter, Jorge Pérez

Sister Conferences: Web, Recommendation, Retrieval

This paper provides an overview of a model for capturing properties of client-server-based query computation setups. This model can be used to formally analyze different combinations of client and server capabilities, and compare them in terms of various fine-grain complexity measures. While the motivations and the focus of the presented work are related to querying the Semantic Web, the main concepts of the model are general enough to be applied in other contexts as well.
#5109

Reducing Controversy by Connecting Opposing Views
Kiran Garimella, Gianmarco De Francisci Morales, Aristides Gionis, Michael Mathioudakis

Sister Conferences: Web, Recommendation, Retrieval

Controversial issues often split the population into groups with opposing views. When such issues emerge on social media, we often observe the creation of "echo chambers," i.e., situations where like-minded people reinforce each other’s opinion, but do not get exposed to the views of the opposing side. In this paper we study algorithmic techniques for bridging these chambers, and thus reducing controversy. Specifically, we represent discussions as graphs, and cast our objective as an edge-recommendation problem. The goal of the recommendation is to reduce the controversy score of the graph, measured by a recently-developed metric based on random walks. At the same time, we take into account the acceptance probability of the recommended edges, which represents the probability that the recommended edges materialize in the graph.
#5127

Learning with Sparse and Biased Feedback for Personal Search
Michael Bendersky, Xuanhui Wang, Marc Najork, Donald Metzler

Sister Conferences: Web, Recommendation, Retrieval

Personal search, including email, on-device, and personal media search, has recently attracted considerable attention from the information retrieval community. In this paper, we provide an overview of challenges and opportunities of learning with implicit user feedback (e.g., click data) in personal search. Implicit user feedback provides a convenient source of supervision for ranking models in personal search. This feedback, however, has two major drawbacks: it is highly sparse and biased due to the personal nature of queries and documents. We demonstrate how these drawbacks can be overcome, and empirically demonstrate the benefits of learning with implicit feedback in the context of a large-scale email search engine.
#5134

A Conversational Approach to Process-oriented Case-based Reasoning
Christian Zeyen, Gilbert Müller, Ralph Bergmann

Sister Conferences: Web, Recommendation, Retrieval

Process-oriented case-based reasoning (POCBR) supports workflow modeling by retrieving and adapting workflows that have proved useful in the past. Current approaches typically require users to specify detailed queries, which can be a demanding task. Conversational case-based reasoning (CCBR) particularly addresses this problem by proposing methods that incrementally elicit the relevant features of the target problem in an interactive dialog. However, no CCBR approaches exist that are applicable for workflow cases that go beyond attribute-value representations such as labeled graphs. This paper closes this gap and presents a conversational POCBR approach (C-POCBR) in which questions related to structural properties of the workflow cases are generated automatically. An evaluation with cooking workflows indicates that C-POCBR can reduce the communication effort for users during retrieval.

Wednesday 18 08:30 - 09:55 ML-NN2 - Neural Networks (C3)

Chair: Nina Narodytska

#1389

AAR-CNNs: Auto Adaptive Regularized Convolutional Neural Networks
Yao Lu, Guangming Lu, Yuanrong Xu, Bob Zhang

Neural Networks

In order to address the overfitting problem caused by the small or simple training datasets and the large model’s size in Convolutional Neural Networks (CNNs), a novel Auto Adaptive Regularization (AAR) method is proposed in this paper. The relevant networks can be called AAR-CNNs. AAR is the first method using the “abstraction extent” (predicted by AE net) and a tiny learnable module (SE net) to auto adaptively predict more accurate and individualized regularization information. The AAR module can be directly inserted into every stage of any popular networks and trained end to end to improve the networks’ flexibility. This method can not only regularize the network at both the forward and the backward processes in the training phase, but also regularize the network on a more refined level (channel or pixel level) depending on the abstraction extent’s form. Comparative experiments are performed on low resolution ImageNet, CIFAR and SVHN datasets. Experimental results show that the AAR-CNNs can achieve state-of-the-art performances on these datasets.
#1638

A Unified Analysis of Stochastic Momentum Methods for Deep Learning
Yan Yan, Tianbao Yang, Zhe Li, Qihang Lin, Yi Yang

Neural Networks

Stochastic momentum methods have been widely adopted in training deep neural networks. However, their theoretical analysis of convergence of the training objective and the generalization error for prediction is still under-explored. This paper aims to bridge the gap between practice and theory by analyzing the stochastic gradient (SG) method, and the stochastic momentum methods including two famous variants, i.e., the stochastic heavy-ball (SHB) method and the stochastic variant of Nesterov's accelerated gradient (SNAG) method. We propose a framework that unifies the three variants. We then derive the convergence rates of the norm of gradient for the non-convex optimization problem, and analyze the generalization performance through the uniform stability approach. Particularly, the convergence analysis of the training objective exhibits that SHB and SNAG have no advantage over SG. However, the stability analysis shows that the momentum term can improve the stability of the learned model and hence improve the generalization performance. These theoretical insights verify the common wisdom and are also corroborated by our empirical analysis on deep learning.
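
One way to see how SG, SHB and SNAG can share a single update rule is the sketch below, which uses one extra parameter `s` to switch between them. This is an illustrative formulation on a scalar problem with a deterministic gradient standing in for a stochastic one; whether it matches the paper's exact framework is an assumption, and the function name is made up for this example.

```python
from typing import Callable, List


def unified_momentum(
    grad: Callable[[float], float],
    x0: float,
    alpha: float,
    beta: float,
    s: float,
    steps: int,
) -> List[float]:
    """Iterate x_{k+1} = y_{k+1} + beta * (ys_{k+1} - ys_k), where
    y_{k+1} = x_k - alpha * g_k and ys_{k+1} = x_k - s * alpha * g_k.

    s = 0              -> heavy-ball update
    s = 1              -> Nesterov-style update
    s = 1 / (1 - beta) -> plain gradient step with step size alpha / (1 - beta)
    """
    x = x0
    ys_prev = x0  # convention: ys_0 = x_0
    trajectory = [x]
    for _ in range(steps):
        g = grad(x)
        y = x - alpha * g
        ys = x - s * alpha * g
        x = y + beta * (ys - ys_prev)
        ys_prev = ys
        trajectory.append(x)
    return trajectory
```

On the quadratic f(x) = x^2 / 2 (gradient `lambda x: x`), setting `s = 1 / (1 - beta)` reproduces plain gradient descent with step size `alpha / (1 - beta)`, while `s = 0` reproduces the heavy-ball recursion.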
#1745

DeepTravel: a Neural Network Based Travel Time Estimation Model with Auxiliary Supervision
Hanyuan Zhang, Hao Wu, Weiwei Sun, Baihua Zheng

Neural Networks

Estimating the travel time of a path is of great importance to smart urban mobility. Existing approaches are either based on estimating the time cost of each road segment, which fails to capture many complex cross-segment factors, or designed heuristically in a non-learning-based way, which fails to leverage the naturally abundant temporal labels of the data, i.e., the time stamp of each trajectory point. In this paper, we leverage recent developments in deep neural networks and propose a novel auxiliary supervision model, namely DeepTravel, that can automatically and effectively extract different features, as well as make full use of the temporal labels of the trajectory data. We have conducted comprehensive experiments on real datasets to demonstrate that DeepTravel outperforms existing approaches.
#4425

A Novel Data Representation for Effective Learning in Class Imbalanced Scenarios
Sri Harsha Dumpala, Rupayan Chakraborty, Sunil Kumar Kopparapu

Neural Networks

Class imbalance refers to the scenario where certain classes are highly under-represented compared to other classes in terms of the availability of training data. This situation hinders the applicability of conventional machine learning algorithms to most of the classification problems where class imbalance is prominent. Most existing methods addressing class imbalance either rely on sampling techniques or cost-sensitive learning methods; thus inheriting their shortcomings. In this paper, we introduce a novel approach that is different from sampling or cost-sensitive learning based techniques, to address the class imbalance problem, where two samples are simultaneously considered to train the classifier. Further, we propose a mechanism to use a single base classifier, instead of an ensemble of classifiers, to obtain the output label of the test sample using majority voting method. Experimental results on several benchmark datasets clearly indicate the usefulness of the proposed approach over the existing state-of-the-art techniques.
#3225

CAGAN: Consistent Adversarial Training Enhanced GANs
Yao Ni, Dandan Song, Xi Zhang, Hao Wu, Lejian Liao

Neural Networks

Generative adversarial networks (GANs) have shown impressive results; however, the generator and the discriminator are optimized in a finite parameter space, which means their performance still needs to be improved. In this paper, we propose a novel approach of adversarial training between one generator and an exponential number of critics which are sampled from the original discriminative neural network via dropout. As the discrepancy between the outputs of different sub-networks for the same sample can measure the consistency of these critics, we encourage the critics to be consistent on real samples and inconsistent on generated samples during training, while the generator is trained to generate consistent samples for different critics. Experimental results demonstrate that our method can obtain state-of-the-art Inception scores of 9.17 and 10.02 on supervised CIFAR-10 and unsupervised STL-10 image generation tasks, respectively, as well as achieve competitive semi-supervised classification results on several benchmarks. Importantly, we demonstrate that our method can maintain stability in training and alleviate mode collapse.
#2361

Convolutional Memory Blocks for Depth Data Representation Learning
Keze Wang, Liang Lin, Chuangjie Ren, Wei Zhang, Wenxiu Sun

Neural Networks

Compared to natural RGB images, data captured by 3D / depth sensors (e.g., Microsoft Kinect) have different properties, e.g., less discriminable in appearance due to lacking color / texture information. Applying convolutional neural networks (CNNs) on these depth data would lead to unsatisfying learning efficiency, i.e., requiring large amounts of annotated training data for convergence. To address this issue, this paper proposes a novel memory network module, called Convolutional Memory Block (CMB), which empowers CNNs with the memory mechanism on handling depth data. Different from the existing memory networks that store long / short term dependency from sequential data, our proposed CMB focuses on modeling the representative dependency (correlation) among non-sequential samples. Specifically, our CMB consists of one internal memory (i.e., a set of feature maps) and three specific controllers, which enable a powerful yet efficient memory manipulation mechanism. In this way, the internal memory, being implicitly aggregated from all previous inputted samples, can learn to store and utilize representative features among the samples. Furthermore, we employ our CMB to develop a concise framework for predicting articulated pose from still depth images. Comprehensive evaluations on three public benchmarks demonstrate significant superiority (about 6%) of our framework over all the compared methods. More importantly, thanks to the enhanced learning efficiency, our framework can still achieve satisfying results using 50% less training data.
#4025

Counterexample-Guided Data Augmentation
Tommaso Dreossi, Shromona Ghosh, Xiangyu Yue, Kurt Keutzer, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia

Neural Networks

We present a novel framework for augmenting data sets for machine learning based on counterexamples. Counterexamples are misclassified examples that have important properties for retraining and improving the model. Key components of our framework include a counterexample generator, which produces data items that are misclassified by the model, and error tables, a novel data structure that stores information pertaining to misclassifications. Error tables can be used to explain the model's vulnerabilities and are used to efficiently generate counterexamples for augmentation. We show the efficacy of the proposed framework by comparing it to classical augmentation techniques on a case study of object detection in autonomous driving based on deep neural networks.

Wednesday 18 08:30 - 09:55 Open Session (K21)

Chair: Reinhard Lafrenz, Laure Le Bars

A future European AI ecosystem and On-Demand platform

Open Session

Wednesday 18 08:55 - 09:55 Industry Day (A4)

Industry Day - Session 1a

Industry Day

Wednesday 18 09:55 - 16:40 Competition (Registration Area)

Chair: Jochen Renz

The Angry Birds Human vs Machine Challenge

Competition

Wednesday 18 10:25 - 11:10 Invited Talk (VICTORIA)

Chair: Vincent Conitzer

Maximizing the Social Good: Markets without Money
Nicole Immorlica

Invited Talk

Wednesday 18 10:25 - 12:30 Open Session (K21)

Chair: Alibaba Group

Alimama - Smart Advertising Workshop

Open Session

Wednesday 18 10:25 - 12:45 Industry Day (A4)

Industry Day - Session 1b

Industry Day

Wednesday 18 11:20 - 12:45 SUR-ML - Survey Track: Machine Learning (VICTORIA)

Chair: Longbing Cao

#5414

Robust Multi-view Representation: A Unified Perspective from Multi-view Learning to Domain Adaption
Zhengming Ding, Ming Shao, Yun Fu

Survey Track: Machine Learning

Multi-view data are extensively accessible nowadays thanks to various types of features, different view-points and sensors, which tend to facilitate better representation in many key applications. This survey covers the topic of robust multi-view data representation, centered around several major visual applications. First of all, we formulate a unified learning framework which is able to model most existing multi-view learning and domain adaptation in this line. Following this, we conduct a comprehensive discussion across these two problems by reviewing the algorithms along these two topics, including multi-view clustering, multi-view classification, zero-shot learning, and domain adaptation. We further present more practical challenges in multi-view data analysis. Finally, we discuss future research including incomplete, unbalanced, and large-scale multi-view learning. This should benefit the AI community, from literature review to future directions.
#5433

Systems AI: A Declarative Learning Based Programming Perspective
Parisa Kordjamshidi, Dan Roth, Kristian Kersting

Survey Track: Machine Learning

Data-driven approaches are becoming dominant problem-solving techniques in many areas of research and industry. Unfortunately, current technologies do not make such techniques easy to use for application experts who are not fluent in machine learning nor for machine learning experts who aim at testing ideas on real-world data and need to evaluate those as a part of an end-to-end system. We review key efforts made by various AI communities to provide languages for high-level abstractions over learning and reasoning techniques needed for designing complex AI systems. We classify the existing frameworks based on the type of techniques as well as the data and knowledge representations they use, provide a comparative study of the way they address the challenges of programming real-world applications, and highlight some shortcomings and future directions.
#5437

Advancements in Dueling Bandits
Yanan Sui, Masrour Zoghi, Katja Hofmann, Yisong Yue

Survey Track: Machine Learning

The dueling bandits problem is an online learning framework where learning happens "on-the-fly" through preference feedback, i.e., from comparisons between a pair of actions. Unlike conventional online learning settings that require absolute feedback for each action, the dueling bandits framework assumes only the presence of (noisy) binary feedback about the relative quality of each pair of actions. The dueling bandits problem is well-suited for modeling settings that elicit subjective or implicit human feedback, which is typically more reliable in preference form. In this survey, we review recent results in the theories, algorithms, and applications of the dueling bandits problem. As an emerging domain, the theories and algorithms of dueling bandits have been intensively studied during the past few years. We provide an overview of recent advancements, including algorithmic advances and applications. We discuss extensions to standard problem formulation and novel application areas, highlighting key open research questions in our discussion.
#5438

Boosting Combinatorial Problem Modeling with Machine Learning
Michele Lombardi, Michela Milano

Survey Track: Machine Learning

In the past few years, the area of Machine Learning (ML) has witnessed tremendous advancements, becoming a pervasive technology in a wide range of applications. One area that can significantly benefit from the use of ML is Combinatorial Optimization. The three pillars of constraint satisfaction and optimization problem solving, i.e., modeling, search, and optimization, can exploit ML techniques to boost their accuracy, efficiency and effectiveness. In this survey we focus on the modeling component, whose effectiveness is crucial for solving the problem. The modeling activity has been traditionally shaped by optimization and domain experts, interacting to provide realistic results. Machine Learning techniques can tremendously ease the process, and exploit the available data to either create models or refine expert-designed ones. In this survey we cover approaches that have been recently proposed to enhance the modeling process by learning either single constraints, objective functions, or the whole model. We highlight common themes to multiple approaches and draw connections with related fields of research.

Wednesday 18 11:20 - 12:45 KR-MAS3 - Knowledge Representation and Agents: Arguing and Negotiating (C7)

Chair: Faria Nassiri-Mofakham

#1752

Two Sides of the Same Coin: Belief Revision and Enforcing Arguments
Adrian Haret, Johannes P. Wallner, Stefan Woltran

Knowledge Representation and Agents: Arguing and Negotiating

We study a type of change on knowledge bases inspired by the dynamics of formal argumentation systems, where the goal is to enforce acceptance of certain arguments. We put forward that enforcing acceptance of arguments can be viewed as a member of the wider family of belief change operations, and that an axiomatic treatment of it is therefore desirable. In our case, laying down axioms enables a precise account of the close connection between enforcing arguments and belief revision. Our analysis of enforcing arguments proceeds by (i) axiomatizing it as an operation in propositional logic and providing a representation result in terms of rankings on sets of interpretations, (ii) showing that it stands in close relationship to belief revision, and (iii) using it as a gateway towards a principled treatment of enforcement in abstract argumentation.
#504

A Study of Argumentative Characterisations of Preferred Subtheories
Marcello D'Agostino, Sanjay Modgil

Knowledge Representation and Agents: Arguing and Negotiating

Classical logic argumentation (Cl-Arg) under the stable semantics yields argumentative characterisations of non-monotonic inference in Preferred Subtheories. This paper studies these characterisations under both the standard approach to Cl-Arg, and a recent dialectical approach that is provably rational under resource bounds. Two key contributions are made. Firstly, the preferred extensions are shown to coincide with the stable extensions. This means that algorithms and proof theories for the admissible semantics can now be used to decide credulous inference in Preferred Subtheories. Secondly, we show that as compared with the standard approach, the grounded semantics applied to the dialectical approach more closely approximates sceptical inference in Preferred Subtheories.
#2712

Probabilistic bipolar abstract argumentation frameworks: complexity results
Bettina Fazzinga, Sergio Flesca, Filippo Furfaro

Knowledge Representation and Agents: Arguing and Negotiating

Probabilistic Bipolar Abstract Argumentation Frameworks (prBAFs), combining the possibility of specifying supports between arguments with a probabilistic modeling of the uncertainty, are considered, and the complexity of the fundamental problem of computing extensions' probabilities is addressed. The most popular semantics of supports and extensions are considered, as well as different paradigms for defining the probabilistic encoding of the uncertainty. Interestingly, the presence of supports, which does not alter the complexity of verifying extensions in the deterministic case, is shown to introduce a new source of complexity in some probabilistic settings, for which tractable cases are also identified.
#2977

An Empirical Study of Knowledge Tradeoffs in Case-Based Reasoning
Devi Ganesan, Sutanu Chakraborti

Knowledge Representation and Agents: Arguing and Negotiating

Case-Based Reasoning provides a framework for integrating domain knowledge with data in the form of four knowledge containers, namely Case base, Vocabulary, Similarity and Adaptation. It is well known in the Case-Based Reasoning community that knowledge can be interchanged between the containers. However, the explicit interplay between them, and how this interchange is affected by the knowledge richness of the underlying domain, is not yet fully understood. We attempt to bridge this gap by proposing footprint size reduction as a measure for quantifying knowledge tradeoffs between containers. The proposed measure is empirically evaluated on synthetic as well as real world datasets. From a practical standpoint, footprint size reduction provides a unified way of estimating the impact of a given piece of knowledge in any knowledge container, and can also suggest ways of characterizing the nature of domains ranging from ill-defined to well-defined ones. Our study also makes evident the need for maintenance approaches that go beyond case base and competence to include other containers and performance objectives.
#4004

Relevance in Structured Argumentation
AnneMarie Borg, Christian Straßer

Knowledge Representation and Agents: Arguing and Negotiating

We study properties related to relevance in non-monotonic consequence relations obtained by systems of structured argumentation. Relevance desiderata concern the robustness of a consequence relation under the addition of irrelevant information. For an account of what (ir)relevance amounts to we use syntactic and semantic considerations. Syntactic criteria have been proposed in the domain of relevance logic and were recently used in argumentation theory under the names of non-interference and crash-resistance. The basic idea is that the conclusions of a given argumentative theory should be robust under adding information that shares no propositional variables with the original database. Some semantic relevance criteria are known from non-monotonic logic. For instance, cautious monotony states that if we obtain certain conclusions from an argumentation theory, we may expect to still obtain the same conclusions if we add some of them to the given database. In this paper we investigate properties of structured argumentation systems that warrant relevance desiderata.
#1986

Argumentation-Based Recommendations: Fantastic Explanations and How to Find Them
Antonio Rago, Oana Cocarascu, Francesca Toni

Knowledge Representation and Agents: Arguing and Negotiating

A significant problem of recommender systems is their inability to explain recommendations, resulting in turn in ineffective feedback from users and the inability to adapt to users’ preferences. We propose a hybrid method for calculating predicted ratings, built upon an item/aspect-based graph with users’ partially given ratings, that can be naturally used to provide explanations for recommendations, extracted from user-tailored Tripolar Argumentation Frameworks (TFs). We show that our method can be understood as a gradual semantics for TFs, exhibiting a desirable, albeit weak, property of balance. We also show experimentally that our method is competitive in generating correct predictions, compared with state-of-the-art methods, and illustrate how users can interact with the generated explanations to improve quality of recommendations.
#5462

(Journal track) On the Equivalence between Assumption-Based Argumentation and Logic Programming
Martin Caminada, Claudia Schulz

Knowledge Representation and Agents: Arguing and Negotiating

In this work, we explain how Assumption-Based Argumentation (ABA) is subsumed by Logic Programming (LP). The translation from ABA to LP (with a few restrictions on the ABA framework) results in a normal logic program whose semantics coincide with the semantics of the underlying ABA framework. Although the precise technicalities are beyond the current extended abstract (these can be found in the associated full paper) we provide a number of examples to illustrate the general idea.

Wednesday 18 11:20 - 12:45 CSAT-CSAT - Constraints and Satisfiability (K2)

Chair: Sophie Tourret

#1908

A Fast Algorithm for Generalized Arc Consistency of the Alldifferent Constraint
Xizhe Zhang, Qian Li, Weixiong Zhang

Constraints and Satisfiability

The alldifferent constraint is an essential ingredient of most Constraint Satisfaction Problems (CSPs). It has been known that the generalized arc consistency (GAC) of alldifferent constraints can be reduced to the maximum matching problem in a value graph. The redundant edges, which do not appear in any maximum matching of the value graph, can and should be removed from the graph. The existing methods attempt to identify these redundant edges by computing the strongly connected components after finding a maximum matching for the graph. Here, we present a novel theorem for identification of the redundant edges. We show that some of the redundant edges can be immediately detected after finding a maximum matching. Based on this theoretical result, we present an efficient algorithm for processing alldifferent constraints. Experimental results on real problems show that our new algorithm significantly outperforms the state-of-the-art approaches.
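
The link between GAC and matchings mentioned in this abstract can be illustrated with a small, deliberately naive Python sketch. It is not the algorithm of the paper, nor the classical approach based on strongly connected components: it simply tests, for every variable/value edge, whether some matching covering all variables uses that edge. The function names `has_full_matching` and `gac_alldifferent` and the example domains are assumptions made for illustration.

```python
def has_full_matching(domains, forced=None):
    """Return True if every variable can take a distinct value from its domain.

    `forced` optionally pins one variable to a single value, which restricts
    the value graph to matchings that use that edge.
    """
    forced = forced or {}
    match = {}  # value -> variable currently matched to it

    def candidates(var):
        return [forced[var]] if var in forced else domains[var]

    def augment(var, seen):
        # Standard augmenting-path search for bipartite matching.
        for val in candidates(var):
            if val in seen:
                continue
            seen.add(val)
            if val not in match or augment(match[val], seen):
                match[val] = var
                return True
        return False

    return all(augment(var, set()) for var in domains)


def gac_alldifferent(domains):
    """Keep only the values that appear in some matching covering all variables."""
    return {
        var: [val for val in vals if has_full_matching(domains, {var: val})]
        for var, vals in domains.items()
    }


# Hypothetical example: x and y compete for {1, 2}, so z can only take 3.
domains = {"x": [1, 2], "y": [1, 2], "z": [1, 2, 3]}
pruned = gac_alldifferent(domains)
print(pruned)
```

In this example the edges z=1 and z=2 are the redundant ones: no matching that covers x, y and z can use them. The sketch recomputes a matching for every edge, which is exactly the cost that dedicated algorithms avoid.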
#3777

Compact-MDD: Efficiently Filtering (s)MDD Constraints with Reversible Sparse Bit-sets
Hélène Verhaeghe, Christophe Lecoutre, Pierre Schaus

Constraints and Satisfiability

Multi-Valued Decision Diagrams (MDDs) are instrumental in modeling combinatorial problems with Constraint Programming. In this paper, we propose a related data structure called sMDD (semi-MDD) where the central layer of the diagrams is non-deterministic. We show that it is easy and efficient to transform any table (set of tuples) into an sMDD. We also introduce a new filtering algorithm, called Compact-MDD, which is based on bitwise operations, and can be applied to both MDDs and sMDDs. Our experimental results show the practical interest of our approach, both in terms of compression and filtering speed.
#3873

Core-Guided Minimal Correction Set and Core Enumeration
Nina Narodytska, Nikolaj Bjørner, Maria-Cristina Marinescu, Mooly Sagiv

Constraints and Satisfiability

A set of constraints is unsatisfiable if there is no solution that satisfies these constraints. To analyse unsatisfiable problems, the user needs to understand where inconsistencies come from and how they can be repaired. Minimal unsatisfiable cores and correction sets are important subsets of constraints that enable such analysis. In this work, we propose a new algorithm for extracting minimal unsatisfiable cores and correction sets simultaneously. Building on top of the relaxation and strengthening framework, we introduce novel techniques for extracting these sets. Our new solver significantly outperforms several state of the art algorithms on common benchmarks when it comes to extracting correction sets and compares favorably on core extraction.
#3860

A Reactive Strategy for High-Level Consistency During Search
Robert J. Woodward, Berthe Y. Choueiry, Christian Bessiere

Constraints and Satisfiability

Constraint propagation during backtrack search significantly improves the performance of solving a Constraint Satisfaction Problem. While Generalized Arc Consistency (GAC) is the most popular level of propagation, higher-level consistencies (HLC) are needed to solve difficult instances. Deciding to enforce an HLC instead of GAC remains the topic of active research. We propose a simple and effective strategy that reactively triggers an HLC by monitoring search performance: When search starts thrashing, we trigger an HLC, then conservatively revert to GAC. We detect thrashing by counting the number of backtracks at each level of the search tree and geometrically adjust the frequency of triggering an HLC based on its filtering effectiveness. We validate our approach on benchmark problems using Partition-One Arc-Consistency as an HLC. However, our strategy is generic and can be used with other higher-level consistency algorithms.
#5125

(Sister Conferences Best Papers Track) Reduced Cost Fixing for Maximum Satisfiability
Fahiem Bacchus, Antti Hyttinen, Matti Järvisalo, Paul Saikko

Constraints and Satisfiability

Maximum satisfiability (MaxSAT) offers a competitive approach to solving NP-hard real-world optimization problems. While state-of-the-art MaxSAT solvers rely heavily on Boolean satisfiability (SAT) solvers, a recent trend, brought on by MaxSAT solvers implementing the so-called implicit hitting set (IHS) approach, is to integrate techniques from the realm of integer programming (IP) into the solving process. This allows for making use of additional IP solving techniques to further speed up MaxSAT solving. In this line of work, we investigate the integration of the technique of reduced cost fixing from the IP realm into IHS solvers, and empirically show that reduced cost fixing considerably speeds up a state-of-the-art MaxSAT solver implementing the IHS approach.
#5119

(Sister Conferences Best Papers Track) Multi-Objective Optimization Through Pareto Minimal Correction Subsets
Miguel Terra-Neves, Inês Lynce, Vasco Manquinho

Constraints and Satisfiability

A Minimal Correction Subset (MCS) of an unsatisfiable constraint set is a minimal subset of constraints that, if removed, makes the constraint set satisfiable. MCSs enjoy a wide range of applications, such as finding approximate solutions to constrained optimization problems. However, existing work on applying MCS enumeration to optimization problems focuses on the single-objective case. In this work, Pareto Minimal Correction Subsets (Pareto-MCSs) are proposed for approximating the Pareto-optimal solution set of multi-objective constrained optimization problems. We formalize and prove an equivalence relationship between Pareto-optimal solutions and Pareto-MCSs. Moreover, Pareto-MCSs and MCSs can be connected in such a way that existing state-of-the-art MCS enumeration algorithms can be used to enumerate Pareto-MCSs. Finally, experimental results on the multi-objective virtual machine consolidation problem show that the Pareto-MCS approach is competitive with state-of-the-art algorithms.
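
The definition of an MCS given at the start of this abstract can be made concrete with a brute-force Python sketch. It only illustrates the definition on a toy Boolean problem; it is not the Pareto-MCS method of the paper and not one of the state-of-the-art enumeration algorithms it refers to. The helper names and the four toy constraints are assumptions made for illustration.

```python
from itertools import combinations, product


def satisfiable(constraints, variables):
    """Check satisfiability by trying every Boolean assignment."""
    return any(
        all(c(dict(zip(variables, values))) for c in constraints)
        for values in product([False, True], repeat=len(variables))
    )


def minimal_correction_subsets(constraints, variables):
    """Enumerate index sets whose removal restores satisfiability, minimal ones only."""
    indices = range(len(constraints))
    found = []
    # Visiting candidates by increasing size guarantees minimality.
    for size in range(len(constraints) + 1):
        for removed in combinations(indices, size):
            if any(set(m) <= set(removed) for m in found):
                continue  # contains a smaller correction subset, so not minimal
            kept = [constraints[i] for i in indices if i not in removed]
            if satisfiable(kept, variables):
                found.append(removed)
    return found


# Hypothetical unsatisfiable set: x, not x, y, not y.
constraints = [
    lambda a: a["x"],      # c0
    lambda a: not a["x"],  # c1
    lambda a: a["y"],      # c2
    lambda a: not a["y"],  # c3
]
mcses = minimal_correction_subsets(constraints, ["x", "y"])
print(mcses)
```

Each MCS here drops one constraint from the conflicting pair on x and one from the pair on y. The enumeration is exponential in both the number of constraints and the number of variables, so it is only suitable for tiny examples.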
#5149

(Sister Conferences Best Papers Track) Dynamic Dependency Awareness for QBF
Joshua Blinkhorn, Olaf Beyersdorff

Constraints and Satisfiability

We provide the first proof complexity results for QBF dependency calculi. By showing that the reflexive resolution path dependency scheme admits exponentially shorter Q-resolution proofs on a known family of instances, we answer a question first posed by Slivovsky and Szeider (SAT 2014). Further, we introduce a new calculus in which a dependency scheme is applied dynamically. We demonstrate the further potential of this approach beyond that of the existing static system with an exponential separation.

Wednesday 18 11:20 - 12:45 NLP-QA - Question Answering (T2)

Chair: Cynthia Matuszek

#318

Quality Matters: Assessing cQA Pair Quality via Transductive Multi-View Learning
Xiaochi Wei, Heyan Huang, Liqiang Nie, Fuli Feng, Richang Hong, Tat-Seng Chua

Question Answering

Community-based question answering (cQA) sites have become important knowledge sharing platforms, as massive cQA pairs are archived, but the uneven quality of cQA pairs leaves information seekers unsatisfied. Various efforts have been dedicated to predicting the quality of cQA contents. Most of them concatenate different features into single vectors and then feed them into regression models. In fact, the quality of cQA pairs is influenced by different views, and the agreement among them is essential for quality assessment. Besides, the lack of labeled data significantly hinders the quality prediction performance. Toward this end, we present a transductive multi-view learning model. It is designed to find a latent common space by unifying and preserving information from various views, including question, answer, QA relevance, asker, and answerer. Additionally, rich information in the unlabeled test cQA pairs is utilized via transductive learning to enhance the representation ability of the common space. Extensive experiments on real-world datasets have well-validated the proposed model.
#493

Curriculum Learning for Natural Answer Generation
Cao Liu, Shizhu He, Kang Liu, Jun Zhao

Question Answering

Because they provide natural language responses, natural answers are favored in real-world Question Answering (QA) systems. Generative models learn to automatically generate natural answers from large-scale question answer pairs (QA-pairs). However, they suffer from the uncontrollable and uneven quality of QA-pairs crawled from the Internet. To address this problem, we propose a curriculum learning based framework for natural answer generation (CL-NAG), which is able to take full advantage of the valuable learning data from a noisy and uneven-quality corpus. Specifically, we employ two practical measures to automatically measure the quality (complexity) of QA-pairs. Based on the measurements, CL-NAG first utilizes simple and low-quality QA-pairs to learn a basic model, and then gradually learns to produce better answers with richer contents and more complete syntaxes based on more complex and higher-quality QA-pairs. In this way, all valuable information in the noisy and uneven-quality corpus could be fully exploited. Experiments demonstrate that CL-NAG outperforms state-of-the-art methods, increasing accuracy by 6.8% and 8.7% for simple and complex questions, respectively.
#1507

Answering Mixed Type Questions about Daily Living Episodes
Taiki Miyanishi, Jun-ichiro Hirayama, Atsunori Kanemura, Motoaki Kawanabe

Question Answering

We propose a physical-world question-answering (QA) method, where the system answers a text question about the physical world by searching a given sequence of sentences about daily-life episodes. To address various information needs in a physical world situation, the physical-world QA methods have to generate mixed-type responses (e.g. word sequence, word set, number, and time as well as a single word) according to the content of questions, after reading physical-world event stories. Most existing methods only provide words or choose answers from multiple candidates. In this paper, we use multiple decoders to generate a mixed-type answer encoding daily episodes with a memory architecture that can capture short- and long-term event dependencies. Results using house-activity stories show that the use of multiple decoders with memory components is effective for answering various physical-world QA questions.
#1646

Towards Reading Comprehension for Long Documents
Yuanxing Zhang, Yangbin Zhang, Kaigui Bian, Xiaoming Li

Question Answering

Machine reading comprehension has gained attention from both industry and academia. It is a very challenging task that involves various domains such as language comprehension, knowledge inference, summarization, etc. Previous studies mainly focus on reading comprehension on short paragraphs, and these approaches fail to perform well on long documents. In this paper, we propose a hierarchical match attention model to instruct the machine to extract answers from a specific short span of passages for the long document reading comprehension (LDRC) task. The model takes advantage of a hierarchical LSTM to learn the paragraph-level representation, and implements the match mechanism (i.e., quantifying the relationship between two contexts) to find the most appropriate paragraph that includes the hint of answers. The task can then be decoupled into a reading comprehension task for a short paragraph, such that the answer can be produced. Experiments on the modified SQuAD dataset show that our proposed model outperforms existing reading comprehension models by at least 20% regarding exact match (EM), F1 and the proportion of identified paragraphs which are exactly the short paragraphs where the original answers are located.
#4502

ElimiNet: A Model for Eliminating Options for Reading Comprehension with Multiple Choice Questions
Soham Parikh, Ananya Sai, Preksha Nema, Mitesh Khapra

Question Answering

The task of Reading Comprehension with Multiple Choice Questions requires a human (or machine) to read a given {passage, question} pair and select one of the n given options. The current state-of-the-art model for this task first computes a question-aware representation for the passage and then selects the option which has the maximum similarity with this representation. However, when humans perform this task they do not just focus on option selection but use a combination of elimination and selection. Specifically, a human would first try to eliminate the most irrelevant option and then read the passage again in the light of this new information (and perhaps ignore portions corresponding to the eliminated option). This process could be repeated multiple times until the reader is finally ready to select the correct option. We propose ElimiNet, a neural network-based model which tries to mimic this process. Specifically, it has gates which decide whether an option can be eliminated given the {passage, question} pair and if so it tries to make the passage representation orthogonal to this eliminated option (akin to ignoring portions of the passage corresponding to the eliminated option). The model makes multiple rounds of partial elimination to refine the passage representation and finally uses a selection module to pick the best option. We evaluate our model on the recently released large-scale RACE dataset and show that it outperforms the current state-of-the-art model on 7 out of the 13 question types in this dataset. Further, we show that taking an ensemble of our elimination-selection based method with a selection based method gives us an improvement of 3.1% over the best-reported performance on this dataset.
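
The idea of making a passage representation orthogonal to an eliminated option can be sketched with plain vector projection. The snippet below is only a geometric illustration under that assumption: it omits the gates, the partial (soft) elimination and every learned parameter of ElimiNet, and the vectors and the function name `remove_component` are hypothetical.

```python
def dot(u, v):
    """Real-valued dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def remove_component(passage, option):
    """Subtract from `passage` its projection onto `option`.

    The result is orthogonal to `option`. The option vector must be non-zero.
    """
    scale = dot(passage, option) / dot(option, option)
    return [p - scale * o for p, o in zip(passage, option)]


# Hypothetical 3-dimensional representations.
passage = [3.0, 4.0, 1.0]
option = [1.0, 2.0, 0.0]
refined = remove_component(passage, option)
print(refined, dot(refined, option))
```

After the subtraction, the refined vector has no component along the eliminated option, which is one way to read "ignoring portions of the passage corresponding to the eliminated option".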
#2379

Teaching Machines to Ask Questions
Kaichun Yao, Libo Zhang, Tiejian Luo, Lili Tao, Yanjun Wu

Question Answering

We propose a novel neural network model that aims to generate diverse and human-like natural language questions. Our model not only directly captures the variability in possible questions by using a latent variable, but also generates certain types of questions by introducing an additional observed variable. We deploy our model in the generative adversarial network (GAN) framework and modify the discriminator so that it not only evaluates question authenticity but also predicts the question type. Our model is trained and evaluated on the question-answering dataset SQuAD, and the experimental results show that the proposed model is able to generate diverse and readable questions with the specific attribute.
#2698

Hermitian Co-Attention Networks for Text Matching in Asymmetrical Domains
Yi Tay, Anh Tuan Luu, Siu Cheung Hui

Question Answering

Co-Attentions are highly effective attention mechanisms for text matching applications. Co-Attention enables the learning of pairwise attentions, i.e., learning to attend based on computing word-level affinity scores between two documents. However, text matching problems can exist in either symmetrical or asymmetrical domains. For example, paraphrase identification is a symmetrical task while question-answer matching and entailment classification are considered asymmetrical domains. In this paper, we argue that Co-Attention models in asymmetrical domains require different treatment as opposed to symmetrical domains, i.e., a concept of word-level directionality should be incorporated while learning word-level similarity scores. Hence, the standard inner product in real space commonly adopted in co-attention is not suitable. This paper leverages attractive properties of the complex vector space and proposes a co-attention mechanism based on the complex-valued inner product (Hermitian products). Unlike the real dot product, the dot product in complex space is asymmetric because the first item is conjugated. Aside from modeling and encoding directionality, our proposed approach also enhances the representation learning process. Extensive experiments on five text matching benchmark datasets demonstrate the effectiveness of our approach.
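
The asymmetry of the complex inner product described in this abstract can be checked with Python's built-in complex numbers. This is a minimal sketch of the mathematical property only, not the proposed co-attention network; the vectors `q` and `d` are made-up values.

```python
def real_dot(a, b):
    """Ordinary real dot product: symmetric in its arguments."""
    return sum(x * y for x, y in zip(a, b))


def hermitian_product(a, b):
    """Complex inner product with the first argument conjugated."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


# Hypothetical complex word representations.
q = [1 + 2j, 0.5 - 1j]
d = [2 - 1j, 1 + 3j]

forward = hermitian_product(q, d)
backward = hermitian_product(d, q)
print(forward, backward)

# Real vectors for comparison.
u, v = [1.0, 2.0], [3.0, -1.0]
print(real_dot(u, v), real_dot(v, u))
```

Swapping the arguments of the Hermitian product yields the complex conjugate, so the real part is unchanged while the imaginary part flips sign. That sign is what can carry a notion of direction between the two texts, whereas the real dot product returns the same score in both directions.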

Wednesday 18 11:20 - 12:45 CV-REC2 - Computer Vision: Recognition (T1)

Chair: Zhou Yu

#105

Deep View-Aware Metric Learning for Person Re-Identification
Pu Chen, Xinyi Xu, Cheng Deng

Computer Vision: Recognition

Person re-identification remains a challenging issue due to the dramatic changes in visual appearance caused by the variations in camera views, human pose, and background clutter. In this paper, we propose a deep view-aware metric learning (DVAML) model, where image pairs with similar and dissimilar views are projected into different feature subspaces, which can discover the intrinsic relevance between image pairs from different aspects. Additionally, we employ multiple metrics to jointly learn feature subspaces on which the relevance between image pairs is explicitly captured, thus greatly improving the retrieval accuracy. Extensive experimental results on the CUHK01, CUHK03, and PRID2011 datasets demonstrate the superiority of our method compared with state-of-the-art approaches.
#736

Centralized Ranking Loss with Weakly Supervised Localization for Fine-Grained Object Retrieval
Xiawu Zheng, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Feiyue Huang, Yanhua Yang

Computer Vision: Recognition

Fine-grained object retrieval has attracted extensive research focus recently. Its state-of-the-art schemes are typically based upon convolutional neural network (CNN) features. Despite the extensive progress, two issues remain open. On one hand, the deep features are coarsely extracted at image level rather than precisely at object level, and are thus disturbed by background clutter. On the other hand, training CNN features with a standard triplet loss is time consuming and incapable of learning discriminative features. In this paper, we present a novel fine-grained object retrieval scheme that conquers these issues in a unified framework. Firstly, we introduce a novel centralized ranking loss (CRL), which achieves very efficient (1,000 times training speedup compared to the triplet loss) and discriminative feature learning by a "centralized" global pooling. Secondly, a weakly supervised attractive feature extraction is proposed, which segments object contours with top-down saliency. Consequently, the contours are integrated into the CNN response map to precisely extract features "within" the target object. Interestingly, we have discovered that CRL and weakly supervised learning can reinforce each other when combined. We evaluate the performance of the proposed scheme on widely-used benchmarks including CUB200-2011 and CARS196. We report significant gains over the state-of-the-art schemes, e.g., 5.4% over SCDA [Wei et al., 2017] on CARS196, and 3.7% on CUB200-2011.
#839

High Resolution Feature Recovering for Accelerating Urban Scene Parsing
Rui Zhang, Sheng Tang, Luoqi Liu, Yongdong Zhang, Jintao Li, Shuicheng Yan

Computer Vision: Recognition

Accuracy and speed are equally important in urban scene parsing. Most of the existing methods mainly focus on improving parsing accuracy, ignoring the problem of low inference speed due to large-sized input and high resolution feature maps. To tackle this issue, we propose a High Resolution Feature Recovering (HRFR) framework to accelerate a given parsing network. A Super-Resolution Recovering module is employed to recover features of large original-sized images from features of down-sampled input. Therefore, our framework can combine the advantages of (1) fast speed of networks with down-sampled input and (2) high accuracy of networks with large original-sized input. Additionally, we employ auxiliary intermediate supervision and boundary region re-weighting to facilitate the optimization of the network. Extensive experiments on the two challenging Cityscapes and CamVid datasets demonstrate the effectiveness of the proposed HRFR framework, which can accelerate the scene parsing inference process by about 3.0x from 1/2 down-sampled input with negligible accuracy reduction.
#1086

Semantic Locality-Aware Deformable Network for Clothing Segmentation
Wei Ji, Xi Li, Yueting Zhuang, Omar El Farouk Bourahla, Yixin Ji, Shihao Li, Jiabao Cui

Computer Vision: Recognition

Clothing segmentation is a challenging vision problem typically implemented within a fine-grained semantic segmentation framework. Different from conventional segmentation, clothing segmentation has some domain-specific properties such as texture richness, diverse appearance variations, non-rigid geometry deformations, and small sample learning. To deal with these points, we propose a semantic locality-aware segmentation model, which adaptively attaches an original clothing image with a semantically similar (e.g., appearance or pose) auxiliary exemplar by search. Through considering the interactions of the clothing image and its exemplar, more intrinsic knowledge about the locality manifold structures of clothing images is discovered to make the learning process of small sample problem more stable and tractable. Furthermore, we present a CNN model based on the deformable convolutions to extract the non-rigid geometry-aware features for clothing images. Experimental results demonstrate the effectiveness of the proposed model against the state-of-the-art approaches.
#1116

MEnet: A Metric Expression Network for Salient Object Segmentation
Shulian Cai, Jiabin Huang, Delu Zeng, Xinghao Ding, John Paisley

Computer Vision: Recognition

Recent CNN-based saliency models have achieved excellent performance on public datasets, but most are sensitive to distortions from noise or compression. In this paper, we propose an end-to-end generic salient object segmentation model called Metric Expression Network (MEnet) to overcome this drawback. We construct a topological metric space where the implicit metric is determined by a deep network. In this latent space, we can group pixels within an observed image semantically into two regions, based on whether they are in a salient region or a non-salient region in the image. We carry out all feature extractions at the pixel level, which makes the output boundaries of the salient object finely-grained. Experimental results show that the proposed metric can generate robust salient maps that allow for object segmentation. By testing the method on several public benchmarks, we show that the performance of MEnet achieves excellent results. We also demonstrate that the proposed method outperforms previous CNN-based methods on distorted images.
#1934

Cross-Modality Person Re-Identification with Generative Adversarial Training
Pingyang Dai, Rongrong Ji, Haibin Wang, Qiong Wu, Yuyu Huang

Computer Vision: Recognition

Person re-identification (Re-ID) is an important task in video surveillance which automatically searches and identifies people across different cameras. Despite the extensive Re-ID progress in RGB cameras, few works have studied the Re-ID between infrared and RGB images, which is essentially a cross-modality problem and widely encountered in real-world scenarios. The key challenge is twofold, i.e., the lack of discriminative information to re-identify the same person between RGB and infrared modalities, and the difficulty of learning a robust metric for such a large-scale cross-modality retrieval. In this paper, we tackle the above two challenges by proposing a novel cross-modality generative adversarial network (termed cmGAN). To handle the issue of insufficient discriminative information, we leverage the cutting-edge generative adversarial training to design our own discriminator to learn discriminative feature representation from different modalities. To handle the issue of large-scale cross-modality metric learning, we integrate both identification loss and cross-modality triplet loss, which minimize inter-class ambiguity while maximizing cross-modality similarity among instances. The entire cmGAN can be trained in an end-to-end manner by using a standard deep neural network framework. We have quantified the performance of our work on the newly-released SYSU RGB-IR Re-ID benchmark, and have reported superior performance, i.e., Cumulative Match Characteristic curve (CMC) and Mean Average Precision (MAP), over the state-of-the-art works [Wu et al., 2017], respectively.
#1707

SafeNet: Scale-normalization and Anchor-based Feature Extraction Network for Person Re-identification
Kun Yuan, Qian Zhang, Chang Huang, Shiming Xiang, Chunhong Pan

Computer Vision: Recognition

Person Re-identification (ReID) is a challenging retrieval task that requires matching a person's image across non-overlapping camera views. The quality of fulfilling this task is largely determined by the robustness of the features that are used to describe the person. In this paper, we show the advantage of jointly utilizing multi-scale abstract information to learn powerful features over full body and parts. A scale normalization module is proposed to balance different scales through residual-based integration. To exploit the information hidden in non-rigid body parts, we propose an anchor-based method to capture the local contents by stacking convolutions of kernels with various aspect ratios, which focus on different spatial distributions. Finally, a well-defined framework is constructed for simultaneously learning the representations of both full body and parts. Extensive experiments conducted on current challenging large-scale person ReID datasets, including Market1501, CUHK03 and DukeMTMC, demonstrate that our proposed method achieves state-of-the-art results.

Wednesday 18 11:20 - 12:45 ML-FLS - Feature Selection, Learning Sparse Models (K11)

Chair: Zhangyang Wang

#934

Exact Low Tubal Rank Tensor Recovery from Gaussian Measurements
Canyi Lu, Jiashi Feng, Zhouchen Lin, Shuicheng Yan

Feature Selection, Learning Sparse Models

The recently proposed Tensor Nuclear Norm (TNN) [Lu et al., 2016; 2018a] is an interesting convex penalty induced by the tensor SVD [Kilmer and Martin, 2011]. It plays a similar role to the matrix nuclear norm, which is the convex surrogate of the matrix rank. Considering that the TNN based Tensor Robust PCA [Lu et al., 2018a] is an elegant extension of Robust PCA with a similar tight recovery bound, it is natural to solve other low rank tensor recovery problems extended from the matrix cases. However, the extensions and proofs are generally tedious. The general atomic norm provides a unified view of norms induced by low-complexity structures, e.g., the l1-norm and nuclear norm. The sharp estimates of the required number of generic measurements for exact recovery based on the atomic norm are known in the literature. In this work, with a careful choice of the atomic set, we prove that TNN is a special atomic norm. Then, by computing the Gaussian width of a certain cone, which is necessary for the sharp estimate, we achieve a simple bound for guaranteed low tubal rank tensor recovery from Gaussian measurements. Specifically, we show that by solving a TNN minimization problem, the underlying tensor of size n1×n2×n3 with tubal rank r can be exactly recovered when the given number of Gaussian measurements is O(r(n1+n2−r)n3). It is order optimal when compared with the degrees of freedom r(n1+n2−r)n3. Beyond the Gaussian mapping, we also give the recovery guarantee of tensor completion based on the uniform random mapping by TNN minimization. Numerical experiments verify our theoretical results.
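As a small worked example of the counts quoted in the abstract above, the sketch below evaluates the degrees of freedom r(n1+n2−r)n3 for a tensor of size n1×n2×n3 with tubal rank r and compares it with the total number of entries. The function name and the example sizes are illustrative assumptions and do not come from the paper.

```python
def tubal_degrees_of_freedom(n1: int, n2: int, n3: int, r: int) -> int:
    """Degrees of freedom r(n1+n2-r)n3 as quoted in the abstract."""
    if not 0 <= r <= min(n1, n2):
        raise ValueError("tubal rank must lie in [0, min(n1, n2)]")
    return r * (n1 + n2 - r) * n3


# Hypothetical sizes chosen only for illustration.
n1, n2, n3, r = 30, 30, 5, 3
dof = tubal_degrees_of_freedom(n1, n2, n3, r)
total_entries = n1 * n2 * n3
print(f"degrees of freedom: {dof}, total entries: {total_entries}")
```

For these sizes the count is 3 × 57 × 5 = 855 against 4,500 entries, so a measurement budget proportional to the degrees of freedom is far smaller than the full tensor when r is small.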
#1004

Cuckoo Feature Hashing: Dynamic Weight Sharing for Sparse Analytics
Jinyang Gao, Beng Chin Ooi, Yanyan Shen, Wang-Chien Lee

Feature Selection, Learning Sparse Models

Feature hashing is widely used to process large scale sparse features for learning of predictive models. Collisions inherently happen in the hashing process and hurt the model performance. In this paper, we develop a feature hashing scheme called Cuckoo Feature Hashing (CCFH) based on the principle behind Cuckoo hashing, a hashing scheme designed to resolve collisions. By providing multiple possible hash locations for each feature, CCFH prevents the collisions between predictive features by dynamically hashing them into alternative locations during model training. Experimental results on prediction tasks with hundreds of millions of features demonstrate that CCFH can achieve the same level of performance by using only 15%-25% of the parameters compared with conventional feature hashing.
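The idea of giving each feature several candidate hash locations can be illustrated with a minimal sketch. It is not the CCFH algorithm: CCFH relocates features dynamically during training, whereas this sketch only does a greedy one-off placement. All names here (`bucket`, `candidate_buckets`, `greedy_assign`) and the use of MD5 with a seed prefix are assumptions made for illustration.

```python
import hashlib


def bucket(feature: str, seed: int, num_buckets: int) -> int:
    """Deterministic seeded hash of a feature name into [0, num_buckets)."""
    digest = hashlib.md5(f"{seed}:{feature}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def candidate_buckets(feature: str, num_buckets: int, num_hashes: int = 2) -> list:
    """Candidate locations for a feature, one per hash function."""
    return [bucket(feature, seed, num_buckets) for seed in range(num_hashes)]


def greedy_assign(features, num_buckets: int, num_hashes: int = 2) -> dict:
    """Place each feature in its first free candidate bucket.

    If every candidate is taken, the feature falls back to its first
    candidate and shares that weight, as in conventional feature hashing.
    """
    taken = set()
    placement = {}
    for feature in features:
        candidates = candidate_buckets(feature, num_buckets, num_hashes)
        chosen = next((b for b in candidates if b not in taken), candidates[0])
        taken.add(chosen)
        placement[feature] = chosen
    return placement


placement = greedy_assign(["age", "country=FR", "device=mobile"], num_buckets=16)
print(placement)
```

With a single hash function (`num_hashes=1`) the sketch reduces to conventional feature hashing, where colliding features always share a weight.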
#3806

Stochastic Second-Order Method for Large-Scale Nonconvex Sparse Learning Models
Hongchang Gao, Heng Huang

Feature Selection, Learning Sparse Models

Sparse learning models have shown promising performance in high dimensional machine learning applications. The main challenge of sparse learning models is how to optimize them efficiently. Most existing methods solve this problem by relaxing it to a convex problem, incurring large estimation bias. Thus, the sparse learning model with a nonconvex constraint has attracted much attention due to its better performance. But it is difficult to optimize due to the non-convexity. In this paper, we propose a linearly convergent stochastic second-order method to optimize this nonconvex problem for large-scale datasets. The proposed method incorporates second-order information to improve the convergence speed. Theoretical analysis shows that our proposed method enjoys a linear convergence rate and is guaranteed to converge to the underlying true model parameter. Experimental results have verified the efficiency and correctness of our proposed method.
#3529

Accelerated Difference of Convex functions Algorithm and its Application to Sparse Binary Logistic Regression
Duy Nhat Phan, Hoai Minh Le, Hoai An Le Thi

Feature Selection, Learning Sparse Models

In this work, we present a variant of DCA (Difference of Convex functions Algorithm) with the aim of improving its convergence speed. The proposed algorithm, named Accelerated DCA (ADCA), consists in incorporating Nesterov's acceleration technique into DCA. We first investigate ADCA for solving the standard DC program and rigorously study its convergence properties and convergence rate. Secondly, we develop ADCA for a special case of the standard DC program whose objective function is the sum of a differentiable function with L-Lipschitz gradient (possibly nonconvex) and a nonsmooth DC function. We exploit the special structure of the problem to propose an efficient DC decomposition for which the corresponding ADCA scheme is inexpensive. As an application, we consider the sparse binary logistic regression problem. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithm and its superiority over well-known methods.
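For readers unfamiliar with DCA, the sketch below runs the plain (non-accelerated) iteration on a one-dimensional toy problem. It is only an illustration under stated assumptions: the objective f(x) = x^4 − x^2 with the decomposition g(x) = x^4, h(x) = x^2 is a hypothetical example, and the Nesterov-type extrapolation that distinguishes ADCA is deliberately omitted.

```python
import math


def dca_minimize(x0: float, iterations: int = 50) -> float:
    """Plain DCA on f(x) = g(x) - h(x) with g(x) = x**4 and h(x) = x**2.

    Each step linearizes h at the current iterate and minimizes the
    convex surrogate g(x) - h'(x_k) * x in closed form.
    """
    x = x0
    for _ in range(iterations):
        slope = 2.0 * x                # h'(x_k)
        target = slope / 4.0           # stationarity: 4 * x**3 = slope
        x = math.copysign(abs(target) ** (1.0 / 3.0), target)  # real cube root
    return x


x_star = dca_minimize(1.0)
print(x_star, 1.0 / math.sqrt(2.0))
```

Starting from x0 = 1 the iterates decrease monotonically towards 1/√2, which is a minimizer of the toy objective; starting from a negative point yields the symmetric minimizer.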
#1799

Unsupervised Deep Hashing via Binary Latent Factor Models for Large-scale Cross-modal Retrieval
Gengshen Wu, Zijia Lin, Jungong Han, Li Liu, Guiguang Ding, Baochang Zhang, Jialie Shen

Feature Selection, Learning Sparse Models

Despite its great success, matrix factorization based cross-modality hashing suffers from two problems: 1) there is no engagement between feature learning and binarization; and 2) most existing methods impose the relaxation strategy by discarding the discrete constraints when learning the hash function, which usually yields suboptimal solutions. In this paper, we propose a novel multimodal hashing framework, referred to as Unsupervised Deep Cross-Modal Hashing (UDCMH), for multimodal data search in a self-taught manner via integrating deep learning and matrix factorization with binary latent factor models. On one hand, our unsupervised deep learning framework enables the feature learning to be jointly optimized with the binarization. On the other hand, the hashing system based on the binary latent factor models can generate unified binary codes by solving a discrete-constrained objective function directly with no need for a relaxation step. Moreover, novel Laplacian constraints are incorporated into the objective function, which allow preserving not only the nearest neighbors that are commonly considered in the literature but also the farthest neighbors of data, even if the semantic labels are not available. Extensive experiments on multiple datasets highlight the superiority of the proposed framework over several state-of-the-art baselines.
#1136

Improving Deep Neural Network Sparsity through Decorrelation Regularization
Xiaotian Zhu, Wengang Zhou, Houqiang Li

Feature Selection, Learning Sparse Models

Modern deep learning models usually suffer from high complexity in model size and computation when transplanted to resource constrained platforms. To this end, many works are dedicated to compressing deep neural networks. Adding group LASSO regularization is one of the most effective model compression methods since it generates structured sparse networks. We investigate deep neural networks trained with the group LASSO constraint and observe that even with strong sparsity regularization imposed, there still exists substantial correlation among the convolution filters, which is undesirable for a compact neural network. We propose to suppress such correlation with a new kind of constraint called decorrelation regularization, which explicitly forces the network to learn a set of less correlated filters. The experiments on CIFAR10/100 and ILSVRC2012 datasets show that when our decorrelation regularization is combined with group LASSO, the correlation between filters can be effectively weakened, which increases the sparsity of the resulting model and leads to better compression performance.
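The two quantities discussed in the abstract above can be sketched in a few lines. The group LASSO penalty is shown in its common unweighted form (the sum of the L2 norms of the groups, here one group per flattened filter). The correlation measure, mean absolute pairwise cosine similarity, is only one plausible choice and is an assumption; the abstract does not specify the form of the decorrelation regularizer.

```python
import math


def l2_norm(vec) -> float:
    return math.sqrt(sum(v * v for v in vec))


def group_lasso_penalty(filters, weight: float = 1.0) -> float:
    """Unweighted group LASSO: weight * sum of per-filter L2 norms."""
    return weight * sum(l2_norm(f) for f in filters)


def mean_abs_cosine(filters) -> float:
    """Mean |cosine similarity| over filter pairs (illustrative measure only).

    Pairs that involve an all-zero (pruned) filter contribute 0.
    """
    total, pairs = 0.0, 0
    for i in range(len(filters)):
        for j in range(i + 1, len(filters)):
            ni, nj = l2_norm(filters[i]), l2_norm(filters[j])
            dot = sum(a * b for a, b in zip(filters[i], filters[j]))
            total += abs(dot) / (ni * nj) if ni > 0 and nj > 0 else 0.0
            pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical flattened filters: two identical ones and one pruned to zero.
filters = [[3.0, 4.0], [3.0, 4.0], [0.0, 0.0]]
print(group_lasso_penalty(filters), mean_abs_cosine(filters))
```

In this toy case the two surviving filters are perfectly correlated, which is the kind of redundancy the abstract says remains after group LASSO training.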
#2523

Efficient DNN Neuron Pruning by Minimizing Layer-wise Nonlinear Reconstruction Error
Chunhui Jiang, Guiying Li, Chao Qian, Ke Tang

Feature Selection, Learning Sparse Models

Deep neural networks (DNNs) have achieved great success, but their applications to mobile devices are limited due to their huge model size and low inference speed. Much effort has thus been devoted to pruning DNNs. Layer-wise neuron pruning methods, which minimize the reconstruction error of the linear response with a limited number of neurons in each single layer pruning, have shown their effectiveness. In this paper, we propose a new layer-wise neuron pruning approach by minimizing the reconstruction error of nonlinear units, which might be more reasonable since the error before and after activation can change significantly. An iterative optimization procedure combining greedy selection with gradient descent is proposed for single layer pruning. Experimental results on benchmark DNN models show the superiority of the proposed approach. Particularly, for VGGNet, the proposed approach can compress its disk space by 13.6× and bring a speedup of 3.7×; for AlexNet, it can achieve a compression rate of 4.1× and a speedup of 2.2×.

Wednesday 18 11:20 - 12:45 MLA-BM - Biomedical Applications (C3)

Chair: Jonathan Rubin

#704

Pairwise-Ranking based Collaborative Recurrent Neural Networks for Clinical Event Prediction
Zhi Qiao, Shiwan Zhao, Cao Xiao, Xiang Li, Yong Qin, Fei Wang

Biomedical Applications

Patient Electronic Health Records (EHR) data consist of sequences of patient visits over time. Sequential prediction of patients' future clinical events (e.g., diagnoses) from their historical EHR data is a core research task and motivates a series of predictive models including deep learning. The existing research mainly adopts a classification framework, which treats the observed and unobserved events as positive and negative classes. However, this may not be true in real clinical settings considering the high rate of missed diagnoses and human errors. In this paper, we propose to formulate the clinical event prediction problem as an events recommendation problem. An end-to-end pairwise-ranking based collaborative recurrent neural network (PacRNN) is proposed to solve it, which first embeds patient clinical contexts with an attention RNN, then uses Bayesian Personalized Ranking (BPR) regularized by disease co-occurrence to rank probabilities of patient-specific diseases, and uses a point process to provide simultaneous prediction of the occurring time of these diagnoses. Experimental results on two real world EHR datasets demonstrate the robust performance, interpretability, and efficacy of PacRNN.
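The pairwise-ranking idea behind BPR can be sketched as a loss on a single pair of scores: an observed event should be scored above an unobserved one. The snippet shows only the standard BPR term −log σ(s_pos − s_neg); the disease co-occurrence regularization, the attention RNN and the point process described in the abstract are not modelled, and the function name is an assumption.

```python
import math


def bpr_pair_loss(score_pos: float, score_neg: float) -> float:
    """Standard BPR loss for one pair: -log(sigmoid(score_pos - score_neg))."""
    diff = score_pos - score_neg
    # -log(sigmoid(d)) equals log(1 + exp(-d)); branch to avoid overflow.
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


# Hypothetical scores for an observed and an unobserved diagnosis.
print(bpr_pair_loss(2.0, 0.5))   # small loss: pair is ranked correctly
print(bpr_pair_loss(0.5, 2.0))   # larger loss: pair is ranked incorrectly
```

Because the loss depends only on score differences, an unobserved event is treated as "ranked lower" rather than as a hard negative, which matches the motivation given in the abstract.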
#1591

A Novel Neural Network Model based on Cerebral Hemispheric Asymmetry for EEG Emotion Recognition
Yang Li, Wenming Zheng, Zhen Cui, Tong Zhang, Yuan Zong

Biomedical Applications

In this paper, we propose a novel neural network model, called bi-hemispheres domain adversarial neural network (BiDANN), for EEG emotion recognition. BiDANN is motivated by neuroscience findings, i.e., the emotional brain's asymmetries between left and right hemispheres. The basic idea of BiDANN is to map the EEG feature data of both left and right hemispheres into discriminative feature spaces separately, in which the data representations can be classified easily. To predict the class labels of testing data more precisely, we narrow the distribution shift between training and testing data by using a global and two local domain discriminators, which work adversarially to the classifier to encourage domain-invariant data representations to emerge. After that, the classifier learned from labeled training data can be applied to unlabeled testing data naturally. We conduct two experiments to verify the performance of our BiDANN model on the SEED database. The experimental results show that the proposed model achieves state-of-the-art performance.
#2358

Predicting the Spatio-Temporal Evolution of Chronic Diseases in Population with Human Mobility Data
Yingzi Wang, Xiao Zhou, Anastasios Noulas, Cecilia Mascolo, Xing Xie, Enhong Chen

Biomedical Applications

Chronic diseases like cancer and diabetes are major threats to human life. Understanding the distribution and progression of chronic diseases of a population is important in assisting the allocation of medical resources as well as the design of policies in preemptive healthcare. Traditional methods to obtain large scale indicators on population health, e.g., surveys and statistical analysis, can be costly and time-consuming and often lead to a coarse spatio-temporal picture. In this paper, we leverage a dataset describing the human mobility patterns of citizens in a large metropolitan area. By viewing local human lifestyles we predict the evolution rate of several chronic diseases at the level of a city neighborhood. We apply the combination of a collaborative topic modeling (CTM) and a Gaussian mixture method (GMM) to tackle the data sparsity challenge and achieve robust predictions on health conditions simultaneously. Our method enables the analysis and prediction of disease rate evolution at fine spatio-temporal scales and demonstrates the potential of incorporating datasets from mobile web sources to improve population health monitoring. Evaluations using real-world check-in and chronic disease morbidity datasets in the city of London show that the proposed CTM+GMM model outperforms various baseline methods.
#2613

Hybrid Approach of Relation Network and Localized Graph Convolutional Filtering for Breast Cancer Subtype Classification
Sungmin Rhee, Seokjun Seo, Sun Kim

Biomedical Applications

Network biology has been successfully used to help reveal complex mechanisms of disease, especially cancer. On the other hand, network biology requires in-depth knowledge to construct disease-specific networks, but our current knowledge is very limited even with the recent advances in human cancer biology. Deep learning has shown an ability to address problems like this. However, it has conventionally used grid-like structured data, so the application of deep learning technologies to human disease subtypes is yet to be explored. To overcome this issue, we propose a hybrid model, which integrates two key components: 1) a graph convolution neural network (graph CNN) and 2) a relation network (RN). Experimental results on synthetic data and breast cancer data demonstrate that our proposed method performs better than existing methods.
#4414

Joint Learning of Phenotypes and Diagnosis-Medication Correspondence via Hidden Interaction Tensor Factorization
Kejing Yin, William K. Cheung, Yang Liu, Benjamin C. M. Fung, Jonathan Poon

Biomedical Applications

Non-negative tensor factorization has been shown effective for discovering phenotypes from the EHR data with minimal human supervision. In most cases, an interaction tensor of the elements in the EHR (e.g., diagnoses and medications) has to be first established before the factorization can be applied. Such correspondence information however is often missing. While different heuristics can be used to estimate the missing correspondence, any errors introduced will in turn cause inaccuracy for the subsequent phenotype discovery task. This is especially true for patients with multiple diseases diagnosed (e.g., under critical care). To alleviate this limitation, we propose the hidden interaction tensor factorization (HITF) where the diagnosis-medication correspondence and the underlying phenotypes are inferred simultaneously. We formulate it under a Poisson non-negative tensor factorization framework and learn the HITF model via maximum likelihood estimation. For performance evaluation, we applied HITF to the MIMIC III dataset. Our empirical results show that both the phenotypes and the correspondence inferred are clinically meaningful. In addition, the inferred HITF model outperforms a number of state-of-the-art methods for mortality prediction.
#2398

Interpretable Drug Target Prediction Using Deep Neural Representation
Kyle Yingkai Gao, Achille Fokoue, Heng Luo, Arun Iyengar, Sanjoy Dey, Ping Zhang

Biomedical Applications

The identification of drug-target interactions (DTIs) is a key task in drug discovery, where drugs are chemical compounds and targets are proteins. Traditional DTI prediction methods are either time consuming (simulation-based methods) or heavily dependent on domain expertise (similarity-based and feature-based methods). In this work, we propose an end-to-end neural network model that predicts DTIs directly from low level representations. In addition to making predictions, our model provides biological interpretation using two-way attention mechanism. Instead of using simplified settings where a dataset is evaluated as a whole, we designed an evaluation dataset from BindingDB following more realistic settings where predictions of unobserved examples (proteins and drugs) have to be made. We experimentally compared our model with matrix factorization, similarity-based methods, and a previous deep learning approach. Overall, the results show that our model outperforms other approaches without requiring domain knowledge and feature engineering. In a case study, we illustrated the ability of our approach to provide biological insights to interpret the predictions.
#843

Drug Similarity Integration Through Attentive Multi-view Graph Auto-Encoders
Tengfei Ma, Cao Xiao, Jiayu Zhou, Fei Wang

Biomedical Applications

Drug similarity has been studied to support downstream clinical tasks such as inferring novel properties of drugs (e.g. side effects, indications, interactions) from known properties. The growing availability of new types of drug features brings the opportunity of learning a more comprehensive and accurate drug similarity that represents the full spectrum of underlying drug relations. However, it is challenging to integrate these heterogeneous, noisy, nonlinear-related information to learn accurate similarity measures especially when labels are scarce. Moreover, there is a trade-off between accuracy and interpretability. In this paper, we propose to learn accurate and interpretable similarity measures from multiple types of drug features. In particular, we model the integration using multi-view graph auto-encoders, and add attentive mechanism to determine the weights for each view with respect to corresponding tasks and features for better interpretability. Our model has flexible design for both semi-supervised and unsupervised settings. Experimental results demonstrated significant predictive accuracy improvement. Case studies also showed better model capacity (e.g. embed node features) and interpretability.

Wednesday 18 11:20 - 13:00 MAS-PS - Agents and Planning (C8)

Chair: Satoshi Kurihara

#304

Managing Communication Costs under Temporal Uncertainty
Nikhil Bhargava, Christian Muise, Tiago Vaquero, Brian Williams

Agents and Planning

In multi-agent temporal planning, individual agents cannot know a priori when other agents will execute their actions and so treat those actions as uncertain. Only when others communicate the results of their actions is that uncertainty resolved. If a full communication protocol is specified ahead of time, then delay controllability can be used to assess the feasibility of the temporal plan. However, agents often have flexibility in choosing when to communicate the results of their action. In this paper, we address the question of how to choose communication protocols that guarantee the feasibility of the original temporal plan subject to some cost associated with that communication. To do so, we introduce a means of extracting delay controllability conflicts and show how we can use these conflicts to more efficiently guide our search. We then present three conflict-directed search algorithms and explore the theoretical and empirical trade-offs between the different approaches.
#1701

Traffic Light Scheduling, Value of Time, and Incentives
Argyrios Deligkas, Erez Karpas, Ron Lavi, Rann Smorodinsky

Agents and Planning

We study the intersection signalling control problem for cars with heterogeneous valuations of time (VoT). We are interested in a control algorithm that has some desirable properties: (1) it induces cars to report their VoT truthfully, (2) it minimizes the value of time lost for cars waiting at the intersection, and (3) it is computationally efficient. We obtain three main results: (1) We describe a computationally efficient heuristic forward search approach to solve the static problem. Simulation results show that this method is significantly faster than the dynamic-programming approach to solve the static problem (which is by itself polynomial time). We therefore believe that our algorithm can be commercially implemented. (2) We extend the solution of the static problem to the dynamic case. We couple our algorithm with a carefully designed payment scheme which yields an incentive compatible mechanism. In other words, it is in the best interest of each car to truthfully report its VoT. (3) We describe simulation results that compare the social welfare obtained by our scheduling algorithm, as measured by the total value of waiting time, to the social welfare obtained by other intersection signalling control methods.
#2032

Multi-Agent Path Finding with Deadlines
Hang Ma, Glenn Wagner, Ariel Felner, Jiaoyang Li, T. K. Satish Kumar, Sven Koenig

Agents and Planning

We formalize Multi-Agent Path Finding with Deadlines (MAPF-DL). The objective is to maximize the number of agents that can reach their given goal vertices from their given start vertices within the deadline, without colliding with each other. We first show that MAPF-DL is NP-hard to solve optimally. We then present two classes of optimal algorithms, one based on a reduction of MAPF-DL to a flow problem and a subsequent compact integer linear programming formulation of the resulting reduced abstracted multi-commodity flow network and the other one based on novel combinatorial search algorithms. Our empirical results demonstrate that these MAPF-DL solvers scale well and each one dominates the other ones in different scenarios.
#1905

Scalable Initial State Interdiction for Factored MDPs
Swetasudha Panda, Yevgeniy Vorobeychik

Agents and Planning

We propose a novel Stackelberg game model of MDP interdiction in which the defender modifies the initial state of the planner, who then responds by computing an optimal policy starting with that state. We first develop a novel approach for MDP interdiction in factored state space that allows the defender to modify the initial state. The resulting approach can be computationally expensive for large factored MDPs. To address this, we develop several interdiction algorithms that leverage variations of reinforcement learning using both linear and non-linear function approximation. Finally, we extend the interdiction framework to consider a Bayesian interdiction problem in which the interdictor is uncertain about some of the planner's initial state features. Extensive experiments demonstrate the effectiveness of our approaches.
#3392

Solving Patrolling Problems in the Internet Environment
Tomas Brazdil, Antonin Kucera, Vojtech Rehak

Agents and Planning

We propose an algorithm for constructing efficient patrolling strategies in the Internet environment, where the protected targets are nodes connected to the network and the patrollers are software agents capable of detecting/preventing undesirable activities on the nodes. The algorithm is based on a novel compositional principle designed for a special class of strategies, and it can quickly construct (sub)optimal solutions even if the number of targets reaches hundreds of millions.
#3627

Counterplanning using Goal Recognition and Landmarks
Alberto Pozanco, Yolanda E-Martín, Susana Fernández, Daniel Borrajo

Agents and Planning

In non-cooperative multi-agent systems, agents might want to prevent their opponents from achieving their goals. One way to solve this task is to use counterplanning to generate a plan that allows an agent to block others from reaching their goals. In this paper, we introduce a fully automated domain-independent approach for counterplanning. It combines goal recognition to infer an opponent's goal; landmark computation to identify subgoals that can be used to block the opponent's goal achievement; and classical automated planning to generate plans that prevent the opponent's goal achievement. Experimental results in several domains show the benefits of our novel approach.
#3903

A Decentralised Approach to Intersection Traffic Management
Huan Vu, Samir Aknine, Sarvapali D. Ramchurn

Agents and Planning

Traffic congestion has a significant impact on quality of life and the economy. This paper presents a decentralised traffic management mechanism for intersections using a distributed constraint optimisation approach (DCOP). Our solution outperforms the state of the art solution both for stable traffic conditions (about 60% reduced waiting time) and robustness to unpredictable events.
#1289

Extended Increasing Cost Tree Search for Non-Unit Cost Domains
Thayne T. Walker, Nathan R. Sturtevant, Ariel Felner

Agents and Planning

Multi-agent pathfinding (MAPF) has applications in navigation, robotics, games and planning. Most work on search-based optimal algorithms for MAPF has focused on simple domains with unit cost actions and unit time steps. Although these constraints keep many aspects of the algorithms simple, they also severely limit the domains that can be used. In this paper we introduce a new definition of the MAPF problem for non-unit cost and non-unit time step domains along with new multiagent state successor generation schemes for these domains. Finally, we define an extended version of the increasing cost tree search algorithm (ICTS) for non-unit costs, with two new sub-optimal variants of ICTS: epsilon-ICTS and w-ICTS. Our experiments show that higher quality sub-optimal solutions are achievable in domains with finely discretized movement models in no more time than lower-quality, optimal solutions in domains with coarsely discretized movement models.

Wednesday 18 11:20 - 13:00 ML-RLA - Reinforcement Learning and Applications (C2)

Chair: Matthew E. Taylor

#172

Impression Allocation for Combating Fraud in E-commerce Via Deep Reinforcement Learning with Action Norm Penalty
Mengchen Zhao, Zhao Li, Bo An, Haifeng Lu, Yifan Yang, Chen Chu

Reinforcement Learning and Applications

Conducting fraudulent transactions has become popular among e-commerce sellers as a way to make their products look favorable to the platform and buyers, which decreases the utilization efficiency of buyer impressions and jeopardizes the business environment. Fraud detection techniques are necessary but not sufficient for the platform, since it is impossible to recognize all fraudulent transactions. In this paper, we focus on improving the platform's impression allocation mechanism to maximize its profit and reduce the sellers' fraudulent behaviors simultaneously. First, we learn a seller behavior model to predict the sellers' fraudulent behaviors from real-world data provided by one of the largest e-commerce companies in the world. Then, we formulate the platform's impression allocation problem as a continuous Markov Decision Process (MDP) with unbounded action space. In order to make the action executable in practice and facilitate learning, we propose a novel deep reinforcement learning algorithm, DDPG-ANP, that introduces an action norm penalty to the reward function. Experimental results show that our algorithm significantly outperforms existing baselines in terms of scalability and solution quality.
#523

StackDRL: Stacked Deep Reinforcement Learning for Fine-grained Visual Categorization
Xiangteng He, Yuxin Peng, Junjie Zhao

Reinforcement Learning and Applications

Fine-grained visual categorization (FGVC) is the discrimination of similar subcategories, whose main challenge is to localize the subtle visual distinctions between them. There are two pivotal problems: discovering which region is discriminative and representative, and determining how many discriminative regions are necessary to achieve the best performance. Existing methods generally solve these two problems by relying on prior knowledge or experimental validation, which severely restricts the usability and scalability of FGVC. To address the "which" and "how many" problems adaptively and intelligently, this paper proposes a stacked deep reinforcement learning approach (StackDRL). It adopts a two-stage learning architecture, which is driven by a semantic reward function. Two-stage learning localizes the object and its parts in sequence ("which"), and determines the number of discriminative regions adaptively ("how many"), which is quite appealing in FGVC. The semantic reward function drives StackDRL to fully learn the discriminative and conceptual visual information by jointly combining the attention-based reward and the category-based reward. Furthermore, unsupervised discriminative localization avoids the heavy labor cost of labeling, and greatly strengthens the usability and scalability of our StackDRL approach. Compared with ten state-of-the-art methods on the CUB-200-2011 dataset, our StackDRL approach achieves the best categorization accuracy.
#2177

Deep Reinforcement Learning in Ice Hockey for Context-Aware Player Evaluation
Guiliang Liu, Oliver Schulte

Reinforcement Learning and Applications

A variety of machine learning models have been proposed to assess the performance of players in professional sports. However, they have only a limited ability to model how player performance depends on the game context. This paper proposes a new approach to capturing game context: we apply Deep Reinforcement Learning (DRL) to learn an action-value Q-function from 3M play-by-play events in the National Hockey League (NHL). The neural network representation integrates both continuous context signals and game history, using a possession-based LSTM. The learned Q-function is used to value players' actions under different game contexts. To assess a player's overall performance, we introduce a novel Game Impact Metric (GIM) that aggregates the values of the player's actions. Empirical evaluation shows that GIM is consistent throughout a play season, and correlates highly with standard success measures and future salary.
#4005

Hashing over Predicted Future Frames for Informed Exploration of Deep Reinforcement Learning
Haiyan Yin, Jianda Chen, Sinno Jialin Pan

Reinforcement Learning and Applications

In deep reinforcement learning (RL) tasks, an efficient exploration mechanism should be able to encourage an agent to take actions that lead to less frequent states which may yield higher accumulative future return. However, both knowing about the future and evaluating the frequentness of states are non-trivial tasks, especially for deep RL domains, where a state is represented by high-dimensional image frames. In this paper, we propose a novel informed exploration framework for deep RL, where we build the capability for an RL agent to predict over the future transitions and evaluate the frequentness for the predicted future frames in a meaningful manner. To this end, we train a deep prediction model to predict future frames given a state-action pair, and a convolutional autoencoder model to hash over the seen frames. In addition, to utilize the counts derived from the seen frames to evaluate the frequentness for the predicted frames, we tackle the challenge of matching the predicted future frames and their corresponding seen frames at the latent feature level. In this way, we derive a reliable metric for evaluating the novelty of the future direction pointed by each action, and hence inform the agent to explore the least frequent one.
#2661

A Reinforced Topic-Aware Convolutional Sequence-to-Sequence Model for Abstractive Text Summarization
Li Wang, Junlin Yao, Yunzhe Tao, Li Zhong, Wei Liu, Qiang Du

Reinforcement Learning and Applications

In this paper, we propose a deep learning approach to tackle automatic summarization tasks by incorporating topic information into the convolutional sequence-to-sequence (ConvS2S) model and using self-critical sequence training (SCST) for optimization. Through jointly attending to topics and word-level alignment, our approach can improve coherence, diversity, and informativeness of generated summaries via a biased probability generation mechanism. In addition, reinforcement training, like SCST, directly optimizes the proposed model with respect to the non-differentiable metric ROUGE, which also avoids the exposure bias during inference. We carry out the experimental evaluation against state-of-the-art methods over the Gigaword, DUC-2004, and LCSTS datasets. The empirical results demonstrate the superiority of our proposed method in abstractive summarization.
#2840

Beyond the Click-Through Rate: Web Link Selection with Multi-level Feedback
Kun Chen, Kechao Cai, Longbo Huang, John C.S. Lui

Reinforcement Learning and Applications

The web link selection problem is to select a small subset of web links from a large web link pool, and to place the selected links on a web page that can only accommodate a limited number of links, e.g., advertisements, recommendations, or news feeds. Although the click-through rate, which reflects the attractiveness of the link itself, has long been the main concern, revenue can only be obtained from user actions after clicks, e.g., purchasing after being directed to the product pages by recommendation links. Thus, web links have an intrinsic multi-level feedback structure. With this observation, we consider the context-free web link selection problem, where the objective is to maximize revenue while ensuring that the attractiveness is no less than a preset threshold. The key challenge of the problem is that each link's multi-level feedback is stochastic, and unobservable unless the link is selected. We model this problem with a constrained stochastic multi-armed bandit formulation, and design an efficient link selection algorithm, called the Constrained Upper Confidence Bound algorithm (Con-UCB). We prove O(sqrt(T ln(T))) bounds on both regret and violation of the attractiveness constraint. We also conduct extensive experiments on three real-world datasets, and show that Con-UCB outperforms state-of-the-art context-free bandit algorithms concerning the multi-level feedback structure.
#4272

Learning Environmental Calibration Actions for Policy Self-Evolution
Chao Zhang, Yang Yu, Zhi-Hua Zhou

Reinforcement Learning and Applications

Reinforcement learning in the physical world is often expensive. Simulators are commonly employed to train policies. Due to simulation error, policies trained in a simulator are hard to deploy directly in the physical world. Therefore, how to efficiently reuse these policies in the real environment is a key issue. To address this issue, this paper presents a policy self-evolution process: in the target environment, the agent first executes a few calibration actions to perceive the environment, and then reuses the previous policies according to its observation of the environment. In this way, the mission of policy learning in the target environment is reduced to the task of environment identification through executing the calibration actions, which needs far fewer samples than learning a policy from scratch. We propose the POSEC (POlicy Self-Evolution by Calibration) approach, which learns the most informative calibration actions for policy self-evolution. Taking three robotic arm controlling tasks as the test beds, we show that the proposed method can learn a fine policy for a new arm with only a few (e.g., five) samples of the target environment.
#5135

(Sister Conferences Best Papers Track) Importance Sampling for Fair Policy Selection
Shayan Doroudi, Philip S. Thomas, Emma Brunskill

Reinforcement Learning and Applications

We consider the problem of off-policy policy selection in reinforcement learning: using historical data generated from running one policy to compare two or more policies. We show that approaches based on importance sampling can be unfair: they can select the worse of two policies more often than not. We then give an example that shows importance sampling is systematically unfair in a practically relevant setting; namely, we show that it unreasonably favors shorter trajectory lengths. We then present sufficient conditions to theoretically guarantee fairness. Finally, we provide a practical importance sampling-based estimator to help mitigate the unfairness due to varying trajectory lengths.

Wednesday 18 11:20 - 13:00 CV-VID - Video: Events, Activities, Surveillance, Question Answering (T5)

Chair: Hongyuan Zhu

#953

Watching a Small Portion could be as Good as Watching All: Towards Efficient Video Classification
Hehe Fan, Zhongwen Xu, Linchao Zhu, Chenggang Yan, Jianjun Ge, Yi Yang

Video: Events, Activities, Surveillance, Question Answering

We aim to significantly reduce the computational cost of classifying temporally untrimmed videos while retaining similar accuracy. Existing video classification methods sample frames with a predefined frequency over the entire video. In contrast, we propose an end-to-end deep reinforcement approach which enables an agent to classify videos by watching a very small portion of the frames, as humans do. We make two main contributions. First, information is not equally distributed in video frames along time. An agent needs to watch more carefully when a clip is informative and skip the frames if they are redundant or irrelevant. The proposed approach enables the agent to adapt the sampling rate to the video content and skip most of the frames without loss of information. Second, in order to reach a confident decision, the number of frames that should be watched by an agent varies greatly from one video to another. We incorporate an adaptive stop network to measure the confidence score and generate a timely trigger to stop the agent watching videos, which improves efficiency without loss of accuracy. Our approach reduces the computational cost significantly for the large-scale YouTube-8M dataset, while the accuracy remains the same.
#86

Visible Thermal Person Re-Identification via Dual-Constrained Top-Ranking
Mang Ye, Zheng Wang, Xiangyuan Lan, Pong C. Yuen

Video: Events, Activities, Surveillance, Question Answering

Cross-modality person re-identification between the thermal and visible domains is extremely important for night-time surveillance applications. Existing works in this field mainly focus on learning sharable feature representations to handle the cross-modality discrepancies. However, besides the cross-modality discrepancy caused by different camera spectrums, visible thermal person re-identification also suffers from large cross-modality and intra-modality variations caused by different camera views and human poses. In this paper, we propose a dual-path network with a novel bi-directional dual-constrained top-ranking loss to learn discriminative feature representations. It is advantageous in two aspects: 1) it learns features end-to-end directly from the data without extra metric learning steps, and 2) it simultaneously handles the cross-modality and intra-modality variations to ensure the discriminability of the learnt representations. Meanwhile, identity loss is further incorporated to model the identity-specific information to handle large intra-class variations. Extensive experiments on two datasets demonstrate superior performance compared to the state of the art.
#332

PredCNN: Predictive Learning with Cascade Convolutions
Ziru Xu, Yunbo Wang, Mingsheng Long, Jianmin Wang

Video: Events, Activities, Surveillance, Question Answering

Predicting future frames in videos remains a challenging, unsolved problem. Mainstream recurrent models suffer from huge memory usage and computation cost, while convolutional models are unable to effectively capture the temporal dependencies between consecutive video frames. To tackle this problem, we introduce an entirely CNN-based architecture, PredCNN, that models the dependencies between the next frame and the sequential video inputs. Inspired by the core idea of recurrent models that previous states have more transition operations than future states, we design a cascade multiplicative unit (CMU) that provides relatively more operations for previous video frames. This newly proposed unit enables PredCNN to predict future spatiotemporal data without any recurrent chain structures, which eases gradient propagation and enables fully parallel optimization. We show that PredCNN outperforms the state-of-the-art recurrent models for video prediction on the standard Moving MNIST dataset and two challenging crowd flow prediction datasets, and achieves a faster training speed and lower memory footprint.
#3663

Crowd Counting using Deep Recurrent Spatial-Aware Network
Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin

Video: Events, Activities, Surveillance, Question Answering

Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera’s perspective, which causes huge appearance variations in people’s scales and rotations. Conventional methods address such challenges by resorting to fixed multi-scale architectures that are often unable to cover the largely varied scales while ignoring the rotation variations. In this paper, we propose a unified neural network framework, named Deep Recurrent Spatial-Aware Network, which adaptively addresses the two issues in a learnable spatial transform module with a region-wise refinement process. Specifically, our framework incorporates a Recurrent Spatial-Aware Refinement (RSAR) module iteratively conducting two components: i) a Spatial Transformer Network that dynamically locates an attentional region from the crowd density map and transforms it to the suitable scale and rotation for optimal crowd estimation; ii) a Local Refinement Network that refines the density map of the attended region with residual learning. Extensive experiments on four challenging benchmarks show the effectiveness of our approach. Specifically, compared with the existing best-performing methods, we achieve an improvement of 12% on the largest dataset, WorldExpo’10, and 22.8% on the most challenging dataset, UCF_CC_50.
#3410

Video Captioning with Tube Features
Bin Zhao, Xuelong Li, Xiaoqiang Lu

Video: Events, Activities, Surveillance, Question Answering

Visual features play an important role in the video captioning task. Video content is mainly composed of the activities of salient objects, yet current approaches focus on global frame features and pay less attention to the salient objects, which restricts their caption quality. To tackle this problem, in this paper, we design an object-aware feature for video captioning, denoted as the tube feature. First, Faster-RCNN is employed to extract object regions in frames, and a tube generation method is developed to connect the regions that come from different frames but belong to the same object. After that, an encoder-decoder architecture is constructed for video caption generation. Specifically, the encoder is a bi-directional LSTM, which is utilized to capture the dynamic information of each tube. The decoder is a single LSTM extended with an attention model, which enables our approach to adaptively attend to the most correlated tubes when generating the caption. We evaluate our approach on two benchmark datasets: MSVD and Charades. The experimental results demonstrate the effectiveness of the tube feature in the video captioning task.
#511

Visual Data Synthesis via GAN for Zero-Shot Video Classification
Chenrui Zhang, Yuxin Peng

Video: Events, Activities, Surveillance, Question Answering

Zero-Shot Learning (ZSL) in video classification is a promising research direction, which aims to tackle the challenge arising from the explosive growth of video categories. Most existing methods exploit seen-to-unseen correlation via learning a projection between visual and semantic spaces. However, such projection-based paradigms cannot fully utilize the discriminative information implied in the data distribution, and commonly suffer from the information degradation issue caused by the "heterogeneity gap". In this paper, we propose a visual data synthesis framework via GAN to address these problems. Specifically, both semantic knowledge and visual distribution are leveraged to synthesize video features of unseen categories, so that ZSL can be turned into a typical supervised problem with the synthetic features. First, we propose multi-level semantic inference to boost video feature synthesis, which captures the discriminative information implied in the joint visual-semantic distribution via feature-level and label-level semantic inference. Second, we propose Matching-aware Mutual Information Correlation to overcome the information degradation issue, which captures seen-to-unseen correlation in matched and mismatched visual-semantic pairs by mutual information, providing the zero-shot synthesis procedure with robust guidance signals. Experimental results on four video datasets demonstrate that our approach can improve zero-shot video classification performance significantly.
#550

Open-Ended Long-form Video Question Answering via Adaptive Hierarchical Reinforced Networks
Zhou Zhao, Zhu Zhang, Shuwen Xiao, Zhou Yu, Jun Yu, Deng Cai, Fei Wu, Yueting Zhuang

Video: Events, Activities, Surveillance, Question Answering

Open-ended long-form video question answering is a challenging problem in visual information retrieval, which automatically generates a natural language answer from the referenced long-form video content according to the question. However, existing video question answering works mainly focus on short-form video question answering, due to the lack of modeling of the semantic representation of long-form video contents. In this paper, we consider the problem of long-form video question answering from the viewpoint of adaptive hierarchical reinforced encoder-decoder network learning. We propose the adaptive hierarchical encoder network to learn the joint representation of the long-form video contents according to the question with adaptive video segmentation. We then develop the reinforced decoder network to generate the natural language answer for open-ended video question answering. We construct a large-scale long-form video question answering dataset. Extensive experiments show the effectiveness of our method.
#1365

Multi-Turn Video Question Answering via Multi-Stream Hierarchical Attention Context Network
Zhou Zhao, Xinghua Jiang, Deng Cai, Jun Xiao, Xiaofei He, Shiliang Pu

Video: Events, Activities, Surveillance, Question Answering

Conversational video question answering is a challenging task in visual information retrieval, which generates an accurate answer from the referenced video contents according to the visual conversation context and the given question. However, existing visual question answering methods mainly tackle the problem of single-turn video question answering, and may be ineffective when applied directly to multi-turn video question answering, because they do not sufficiently model the sequential conversation context. In this paper, we study the problem of multi-turn video question answering from the viewpoint of multi-step hierarchical attention context network learning. We first propose the hierarchical attention context network for context-aware question understanding by modeling the hierarchically sequential conversation context structure. We then develop the multi-stream spatio-temporal attention network for learning the joint representation of the dynamic video contents and context-aware question embedding. We next devise the hierarchical attention context network learning method with a multi-step reasoning process for multi-turn video question answering. We construct two large-scale multi-turn video question answering datasets. Extensive experiments show the effectiveness of our method.

Wednesday 18 14:00 - 14:45 Invited Talk (VICTORIA)

Chair: Fredrik Heintz

Intelligible Intelligence & Beneficial Intelligence
Max Tegmark

Invited Talk

Wednesday 18 14:55 - 16:10 SUR-NLCV - Survey Track: Natural Language Processing and Computer Vision (VICTORIA)

Chair: Yuhong Guo

#5431

Five Years of Argument Mining: a Data-driven Analysis
Elena Cabrio, Serena Villata

Survey Track: Natural Language Processing and Computer Vision

Argument mining is the research area aiming at extracting natural language arguments and their relations from text, with the final goal of providing machine-processable structured data for computational models of argument. This research topic started to attract the attention of a small community of researchers around 2014, and it is nowadays counted as one of the most promising research areas in Artificial Intelligence in terms of growth of the community, funded projects, and involvement of companies. In this paper, we present the argument mining tasks, and we discuss the results obtained in the area from a data-driven perspective. An open discussion highlights the main weaknesses of the existing work in the literature, and proposes open challenges to be faced in the future.
#5441

"Chitty-Chitty-Chat Bot": Deep Learning for Conversational AI
Rui Yan

Survey Track: Natural Language Processing and Computer Vision

Conversational AI is of growing importance since it enables an easy interaction interface between humans and computers. Due to its promising potential and alluring commercial value as virtual assistants and/or social chatbots, major AI, NLP, and Search & Mining conferences are explicitly calling for contributions from conversational studies. It is an active research area and of considerable interest. Building a conversational system with moderate intelligence is challenging, and requires abundant dialogue data and interdisciplinary techniques. Along with Web 2.0, the massive data available greatly facilitates data-driven methods such as deep learning for human-computer conversations. In general, conversational systems can be categorized into 1) task-oriented systems, which aim to help users accomplish goals in vertical domains, and 2) social chat bots, which can converse seamlessly and appropriately with humans, playing the role of a chat companion. In this paper, we focus on the survey of non-task-oriented chit-chat bots.
#5445

Event Coreference Resolution: A Survey of Two Decades of Research
Jing Lu, Vincent Ng

Survey Track: Natural Language Processing and Computer Vision

Recent years have seen a gradual shift of focus from entity-based tasks to event-based tasks in information extraction research. Being a core event-based task, event coreference resolution is less studied but arguably more challenging than entity coreference resolution. This paper provides an overview of the major milestones made in event coreference research since its inception two decades ago.
#5408

Affective Image Content Analysis: A Comprehensive Survey
Sicheng Zhao, Guiguang Ding, Qingming Huang, Tat-Seng Chua, Björn W. Schuller, Kurt Keutzer

Survey Track: Natural Language Processing and Computer Vision

Images can convey rich semantics and induce strong emotions in viewers. Recently, with the explosive growth of visual data, extensive research efforts have been dedicated to affective image content analysis (AICA). In this paper, we review the state-of-the-art methods comprehensively with respect to two main challenges: the affective gap and perception subjectivity. We begin with an introduction to the key emotion representation models that have been widely employed in AICA. Available datasets for performing evaluation are briefly described. We then summarize and compare the representative approaches on emotion feature extraction, personalized emotion prediction, and emotion distribution learning. Finally, we discuss some future research directions.

Wednesday 18 14:55 - 16:10 SIS-KR - Sister Conference Best Papers: Knowledge Representation and Reasoning (C7)

Chair: Carsten Lutz

#5104

The Finite Model Theory of Bayesian Networks: Descriptive Complexity
Fabio Gagliardi Cozman, Denis Deratani Mauá

Sister Conference Best Papers: Knowledge Representation and Reasoning

We adapt the theory of descriptive complexity to Bayesian networks, to quantify the expressivity of specifications based on predicates and quantifiers. We show that Bayesian network specifications that employ first-order quantification capture the complexity class PP; by allowing quantification over predicates, the resulting Bayesian network specifications capture each class in the hierarchy PP^(NP^...^NP), a result that does not seem to have an equivalent in the literature.
#5118

The Intricacies of Three-Valued Extensional Semantics for Higher-Order Logic Programs
Panos Rondogiannis, Ioanna Symeonidou

Sister Conference Best Papers: Knowledge Representation and Reasoning

In this paper we examine the problem of providing a purely extensional three-valued semantics for higher-order logic programs with negation. We demonstrate that a technique that was proposed by M. Bezem for providing extensional semantics to positive higher-order logic programs fails when applied to higher-order logic programs with negation. On the positive side, we demonstrate that for stratified higher-order logic programs, extensionality is indeed achieved by the technique. We analyze the reasons for the failure of extensionality in the general case, arguing that a three-valued setting cannot distinguish between certain predicates that appear to have a different behaviour inside a program context, but which happen to be identical as three-valued relations.
#5123

Attributed Description Logics: Reasoning on Knowledge Graphs
Markus Krötzsch, Maximilian Marx, Ana Ozaki, Veronika Thost

Sister Conference Best Papers: Knowledge Representation and Reasoning

In modelling real-world knowledge, there often arises a need to represent and reason with meta-knowledge. To equip description logics (DLs) for dealing with such ontologies, we enrich DL concepts and roles with finite sets of attribute–value pairs, called annotations, and allow concept inclusions to express constraints on annotations. We investigate a range of DLs starting from the lightweight description logic EL, covering the prototypical ALCH, and extending to the very expressive SROIQ, the DL underlying OWL 2 DL.
#5129

Finite Controllability of Conjunctive Query Answering with Existential Rules: Two Steps Forward
Giovanni Amendola, Nicola Leone, Marco Manna

Sister Conference Best Papers: Knowledge Representation and Reasoning

Reasoning with existential rules typically consists of checking whether a Boolean conjunctive query is satisfied by all models of a first-order sentence having the form of a conjunction of Datalog rules extended with existential quantifiers in rule-heads. To guarantee decidability, five basic decidable classes - linear, weakly-acyclic, guarded, sticky, and shy - have been singled out, together with several generalizations and combinations. For all basic classes, except shy, the important property of finite controllability has been proved, ensuring that a query is satisfied by all models of the sentence if, and only if, it is satisfied by all of its finite models. This paper takes two steps forward: (i) it devises a general technique to facilitate the process of (dis)proving finite controllability of an arbitrary class of existential rules; and (ii) it specializes the technique to complete the picture for the five mentioned classes, by showing that shy is also finitely controllable.
#5117

Weighted Bipolar Argumentation Graphs: Axioms and Semantics
Leila Amgoud, Jonathan Ben-Naim

Sister Conference Best Papers: Knowledge Representation and Reasoning

The paper studies how arguments can be evaluated in weighted bipolar argumentation graphs (i.e., graphs whose arguments have basic weights and may be supported and attacked). It introduces principles that an evaluation method (or semantics) would satisfy, analyzes existing semantics with respect to them, and finally proposes a new semantics for the class of non-maximal acyclic graphs.
#5151

Orchestrating a Network of Mereotopological Theories: An Abridged Report
C. Maria Keet, Oliver Kutz

Sister Conference Best Papers: Knowledge Representation and Reasoning

Parthood is used widely in ontologies across subject domains, specified in a multitude of mereological theories, and even more when combined with topology. To complicate the landscape, decidable languages put restrictions on the language features, so that only fragments of the mereo(topo)logical theories can be represented, even though those full features may be needed to check correctness during modelling. We address these issues by specifying a structured network of theories formulated in multiple logics that are glued together by the various linking constructs of the Distributed Ontology Language, DOL. For the KGEMT mereotopology and its five sub-theories, together with the DL-based OWL species and first- and second-order logic, this network in DOL orchestrates 28 ontologies.

Wednesday 18 14:55 - 16:10 MAS-ML - Agents and Learning (C8)

Chair: Ann Nowé

#3918

Combinatorial Auctions via Machine Learning-based Preference Elicitation
Gianluca Brero, Benjamin Lubin, Sven Seuken

Agents and Learning

Combinatorial auctions (CAs) are used to allocate multiple items among bidders with complex valuations. Since the value space grows exponentially in the number of items, it is impossible for bidders to report their full value function even in medium-sized settings. Prior work has shown that current designs often fail to elicit the most relevant values of the bidders, thus leading to inefficiencies. We address this problem by introducing a machine learning-based elicitation algorithm to identify which values to query from the bidders. Based on this elicitation paradigm we design a new CA mechanism we call PVM, where payments are determined so that bidders’ incentives are aligned with allocative efficiency. We validate PVM experimentally in several spectrum auction domains, and we show that it achieves high allocative efficiency even when only a few values are elicited from the bidders.
#2135

Recurrent Deep Multiagent Q-Learning for Autonomous Brokers in Smart Grid
Yaodong Yang, Jianye Hao, Mingyang Sun, Zan Wang, Changjie Fan, Goran Strbac

Agents and Learning

The broker mechanism is widely used to help interested parties derive long-term policies that reduce costs or increase profits in the smart grid. However, a broker faces a number of challenging problems, such as balancing demand and supply from customers and competing with other coexisting brokers to maximize its profit. In this paper, we develop an effective pricing strategy for brokers in the local electricity retail market based on recurrent deep multiagent reinforcement learning and sequential clustering. We use real household electricity consumption data to simulate the retail market for evaluating our strategy. The experiments demonstrate the superior performance of the proposed pricing strategy and highlight the effectiveness of our reward shaping mechanism.
#2151

What Game Are We Playing? End-to-end Learning in Normal and Extensive Form Games
Chun Kai Ling, Fei Fang, J. Zico Kolter

Agents and Learning

Although recent work in AI has made great progress in solving large, zero-sum, extensive-form games, the underlying assumption in most past work is that the parameters of the game itself are known to the agents. This paper deals with the relatively under-explored but equally important "inverse" setting, where the parameters of the underlying game are not known to all agents, but must be learned through observations. We propose a differentiable, end-to-end learning framework for addressing this task. In particular, we consider a regularized version of the game, equivalent to a particular form of quantal response equilibrium, and develop 1) a primal-dual Newton method for finding such equilibrium points in both normal and extensive form games; and 2) a backpropagation method that lets us analytically compute gradients of all relevant game parameters through the solution itself. This ultimately lets us learn the game by training in an end-to-end fashion, effectively by integrating a "differentiable game solver" into the loop of larger deep network architectures. We demonstrate the effectiveness of the learning method in several settings including poker and security game tasks.
#3591

Balancing Two-Player Stochastic Games with Soft Q-Learning
Jordi Grau-Moya, Felix Leibfried, Haitham Bou-Ammar

Agents and Learning

Within the context of video games the notion of perfectly rational agents can be undesirable as it leads to uninteresting situations, where humans face tough adversarial decision makers. Current frameworks for stochastic games and reinforcement learning prohibit tuneable strategies as they seek optimal performance. In this paper, we enable such tuneable behaviour by generalising soft Q-learning to stochastic games, where more than one agent interacts strategically. We contribute both theoretically and empirically. On the theory side, we show that games with soft Q-learning exhibit a unique value and generalise team games and zero-sum games far beyond these two extremes to cover a continuous spectrum of gaming behaviour. Experimentally, we show how tuning agents' constraints affects performance and demonstrate, through a neural network architecture, how to reliably balance games with high-dimensional representations.
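As a rough illustration of the "continuous spectrum" mentioned in this abstract, the sketch below uses a log-sum-exp soft value operator with an inverse temperature `beta`. This is an assumption about the general form of soft Q-learning, not the authors' exact formulation; the function name `soft_value` and the uniform default prior are hypothetical. Large positive `beta` approaches a maximising (cooperative) agent, large negative `beta` approaches a minimising (adversarial) one, and `beta = 0` gives the prior average.

```python
import math

def soft_value(q_values, beta, prior=None):
    """Soft value (1/beta) * log sum_a prior(a) * exp(beta * Q(a)).

    Hypothetical helper for illustration only; beta is an assumed
    inverse-temperature parameter and prior an assumed reference policy.
    """
    n = len(q_values)
    if prior is None:
        prior = [1.0 / n] * n  # assumed uniform reference policy
    if beta == 0.0:
        # Limit beta -> 0: plain expectation under the prior.
        return sum(p * q for p, q in zip(prior, q_values))
    scaled = [beta * q for q in q_values]
    m = max(scaled)  # subtract the maximum for numerical stability
    total = sum(p * math.exp(s - m) for p, s in zip(prior, scaled))
    return (m + math.log(total)) / beta

q = [1.0, 2.0, 3.0]
near_max = soft_value(q, 50.0)    # close to max(q)
near_min = soft_value(q, -50.0)   # close to min(q)
average = soft_value(q, 0.0)      # prior-weighted mean
```

Sweeping `beta` between the two extremes is one simple way to picture how an opponent's strength could be tuned rather than fixed at optimal play.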
#4151

Keeping in Touch with Collaborative UAVs: A Deep Reinforcement Learning Approach
Bo Yang, Min Liu

Agents and Learning

Effective collaborations among autonomous unmanned aerial vehicles (UAVs) rely on timely information sharing. However, the time-varying flight environment and the intermittent link connectivity pose great challenges to message delivery. In this paper, we leverage the deep reinforcement learning (DRL) technique to address the UAVs' optimal links discovery and selection problem in uncertain environments. As the multi-agent learning efficiency is constrained by the high-dimensional and continuous action spaces, we slice the whole action spaces into a number of tractable fractions to achieve efficient convergences of optimal policies in continuous domains. Moreover, for the nonstationarity issue that particularly challenges the multi-agent DRL with local perceptions, we present a multi-agent mutual sampling method that jointly interacts the intra-agent and inter-agent state-action information to stabilize and expedite the training procedure. We evaluate the proposed algorithm on the UAVs' continuous network connection task. Results show that the associated UAVs can quickly select the optimal connected links, which facilitate the UAVs' teamwork significantly.

Wednesday 18 14:55 - 16:10 ROB-PS - Robotics and Planning (K2)

Chair: Vaishak Belle

#3401

Fast Model Identification via Physics Engines for Data-Efficient Policy Search
Shaojun Zhu, Andrew Kimmel, Kostas E. Bekris, Abdeslam Boularias

Robotics and Planning

This paper presents a method for identifying mechanical parameters of robots or objects, such as their mass and friction coefficients. Key features are the use of off-the-shelf physics engines and the adaptation of a Bayesian optimization technique towards minimizing the number of real-world experiments needed for model-based reinforcement learning. The proposed framework reproduces in a physics engine experiments performed on a real robot and optimizes the model's mechanical parameters so as to match real-world trajectories. The optimized model is then used for learning a policy in simulation, before real-world deployment. It is well understood, however, that it is hard to exactly reproduce real trajectories in simulation. Moreover, a near-optimal policy can be frequently found with an imperfect model. Therefore, this work proposes a strategy for identifying a model that is just good enough to approximate the value of a locally optimal policy with a certain confidence, instead of wasting effort on identifying the most accurate model. Evaluations, performed both in simulation and on a real robotic manipulation task, indicate that the proposed strategy results in an overall time-efficient, integrated model identification and learning solution, which significantly improves the data-efficiency of existing policy search algorithms.
#3824

Behavioral Cloning from Observation
Faraz Torabi, Garrett Warnell, Peter Stone

Robotics and Planning

Humans often learn how to perform tasks via imitation: they observe others perform a task, and then very quickly infer the appropriate actions to take based on their observations. While extending this paradigm to autonomous agents is a well-studied problem in general, there are two particular aspects that have largely been overlooked: (1) that the learning is done from observation only (i.e., without explicit action information), and (2) that the learning is typically done very quickly. In this work, we propose a two-phase, autonomous imitation learning technique called behavioral cloning from observation (BCO), that aims to provide improved performance with respect to both of these aspects. First, we allow the agent to acquire experience in a self-supervised fashion. This experience is used to develop a model which is then utilized to learn a particular task by observing an expert perform that task without the knowledge of the specific actions taken. We experimentally compare BCO to imitation learning methods, including the state-of-the-art, generative adversarial imitation learning (GAIL) technique, and we show comparable task performance in several different simulation domains while exhibiting increased learning speed after expert trajectories become available.
#3837

Multi-modal Predicate Identification using Dynamically Learned Robot Controllers
Saeid Amiri, Suhua Wei, Shiqi Zhang, Jivko Sinapov, Jesse Thomason, Peter Stone

Robotics and Planning

Intelligent robots frequently need to explore the objects in their working environments. Modern sensors have enabled robots to learn object properties via perception of multiple modalities. However, object exploration in the real world poses a challenging trade-off between information gains and exploration action costs. Mixed observability Markov decision process (MOMDP) is a framework for planning under uncertainty, while accounting for both fully and partially observable components of the state. Robot perception frequently has to face such mixed observability. This work enables a robot equipped with an arm to dynamically construct query-oriented MOMDPs for multi-modal predicate identification (MPI) of objects. The robot's behavioral policy is learned from two datasets collected using real robots. Our approach enables a robot to explore object properties in a way that is significantly faster while improving accuracies in comparison to existing methods that rely on hand-coded exploration strategies.
#5472

(Journal track) Solving Multi-Agent Path Finding on Strongly Biconnected Digraphs
Adi Botea, Davide Bonusi, Pavel Surynek

Robotics and Planning

We present and evaluate diBOX, an algorithm for multi-agent path finding on strongly biconnected directed graphs. diBOX runs in polynomial time, computes suboptimal solutions and is complete for instances on strongly biconnected digraphs with at least two unoccupied positions. A detailed empirical analysis shows a good scalability for diBOX.
#5136

(Sister Conferences Best Papers Track) Greedy Stone Tower Creations with a Robotic Arm
Martin Wermelinger, Fadri Furrer, Hironori Yoshida, Fabio Gramazio, Matthias Kohler, Roland Siegwart, Marco Hutter

Robotics and Planning

Predominantly, robotic construction is applied as prefabrication in structured indoor environments with standard building materials. Our work, on the other hand, focuses on utilizing irregular materials found on-site, such as rubble and rocks, for autonomous construction. We present a pipeline to detect arbitrarily placed objects in a scene and form a structure out of the detected objects. The next best stacking pose is selected using a searching method employing gradient descent with random initial orientations, exploiting a physics engine. This approach is validated in an experimental setup using a robotic manipulator by constructing balancing vertical stacks without mortars and adhesives. We show the results of eleven consecutive trials to form such towers autonomously using four rocks placed arbitrarily in front of the robot.
#4216

Robot Task Interruption by Learning to Switch Among Multiple Models
Anahita Mohseni-Kabir, Manuela Veloso

Robotics and Planning

While mobile robots reliably perform each service task by accurately localizing and safely navigating while avoiding obstacles, they do not respond in any other way to their surroundings. We can make the robots more responsive to their environment by equipping them with models of multiple tasks and a way to interrupt a specific task and switch to another task based on observations. However, the challenges of a multiple task model approach include selecting a task model to execute based on observations and having a potentially large set of observations associated with the set of all individual task models. We present a novel two-step solution. First, our approach leverages the tasks' policies and an abstract representation of their states, and learns which task should be executed at each given world state. Secondly, the algorithm uses the learned tasks and identifies the observation stimuli that trigger the interruption of one task and the switch to another task. We show that our solution using the switching stimuli compares favorably to the naive approach of learning a combined model for all the tasks. Moreover, leveraging the stimuli significantly decreases the amount of sensory input processing during the execution of tasks.

Wednesday 18 14:55 - 16:10 Panel (T2)

The Future of AI in Europe
Fredrik Heintz

Panel

Wednesday 18 14:55 - 16:10 KR-NLP2 - Knowledge Graphs (T1)

Chair: Parisa Kordjamshidi

#258

Translating Embeddings for Knowledge Graph Completion with Relation Attention Mechanism
Wei Qian, Cong Fu, Yu Zhu, Deng Cai, Xiaofei He

Knowledge Graphs

Knowledge graph embedding is an essential problem in knowledge extraction. Recently, translation based embedding models (e.g., TransE) have received increasing attention. These methods try to interpret the relations among entities as translations from head entity to tail entity and achieve promising performance on knowledge graph completion. Previous researchers attempt to transform the entity embedding concerning the given relation for distinguishability. Also, they naturally think the relation-related transforming should reflect an attention mechanism, which means it should focus on only a part of the attributes. However, we found that previous methods fail to create an attention mechanism, and the reason is that they ignore the hierarchical routine of human cognition. When predicting whether a relation holds between two entities, people first check the category of entities, then they focus on fine-grained relation-related attributes to make the decision. In other words, the attention should take effect on entities filtered by the right category. In this paper, we propose a novel knowledge graph embedding method named TransAt to learn the translation based embedding, relation-related categories of entities and relation-related attention simultaneously. Extensive experiments show that our approach outperforms state-of-the-art methods significantly on public datasets, and our method can learn the true attention varying among relations.
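For readers unfamiliar with translation-based embeddings, the following minimal sketch shows the TransE idea that a true triple should satisfy head + relation ≈ tail, scored here with an L1 distance. The optional `mask` argument is a hypothetical stand-in for "attending to only a part of the attributes"; it is an illustration only and is not the TransAt model described in the abstract.

```python
def transe_score(head, relation, tail, mask=None):
    """L1 distance between (head + relation) and tail; lower is better.

    mask is a hypothetical per-dimension weight (1.0 = attend, 0.0 = ignore)
    added for illustration; it is not taken from the paper.
    """
    if mask is None:
        mask = [1.0] * len(head)
    return sum(m * abs(h + r - t)
               for m, h, r, t in zip(mask, head, relation, tail))

# Toy 2-dimensional embeddings (made-up numbers).
head = [1.0, 0.0]
relation = [0.0, 1.0]
good_tail = [1.0, 1.0]   # exactly head + relation
bad_tail = [3.0, 1.0]    # differs in the first dimension only

exact = transe_score(head, relation, good_tail)
mismatch = transe_score(head, relation, bad_tail)
# Ignoring the first dimension hides the mismatch entirely.
masked = transe_score(head, relation, bad_tail, mask=[0.0, 1.0])
```

The masked call shows why the choice of attended dimensions matters: a poorly chosen mask can make a wrong tail look as plausible as the right one.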
#454

Co-training Embeddings of Knowledge Graphs and Entity Descriptions for Cross-lingual Entity Alignment
Muhao Chen, Yingtao Tian, Kai-Wei Chang, Steven Skiena, Carlo Zaniolo

Knowledge Graphs

Multilingual knowledge graph (KG) embeddings provide latent semantic representations of entities and structured knowledge with cross-lingual inferences, which benefit various knowledge-driven cross-lingual NLP tasks. However, precisely learning such cross-lingual inferences is usually hindered by the low coverage of entity alignment in many KGs. Since many multilingual KGs also provide literal descriptions of entities, in this paper, we introduce an embedding-based approach which leverages a weakly aligned multilingual KG for semi-supervised cross-lingual learning using entity descriptions. Our approach performs co-training of two embedding models, i.e. a multilingual KG embedding model and a multilingual literal description embedding model. The models are trained on a large Wikipedia-based trilingual dataset where most entity alignment is unknown to training. Experimental results show that the performance of the proposed approach on the entity alignment task improves at each iteration of co-training, and eventually reaches a stage at which it significantly surpasses previous approaches. We also show that our approach has promising abilities for zero-shot entity alignment, and cross-lingual KG completion.
#698

Non-translational Alignment for Multi-relational Networks
Shengnan Li, Xin Li, Rui Ye, Mingzhong Wang, Haiping Su, Yingzi Ou

Knowledge Graphs

Most existing solutions for the alignment of multi-relational networks, such as multi-lingual knowledge bases, are "translation"-based, facilitating the network embedding via the trans-family of models, such as TransE. However, they cannot address triangular or other structural properties effectively. Thus, we propose a non-translational approach, which aims to utilize a probabilistic model to offer more robust solutions to the alignment task, by exploring the structural properties as well as leveraging anchors to project each network onto the same vector space during the process of learning the representation of individual networks. Extensive experiments on four multi-lingual knowledge graphs demonstrate the effectiveness and robustness of the proposed method over a set of state-of-the-art alignment methods.
#1409

TreeNet: Learning Sentence Representations with Unconstrained Tree Structure
Zhou Cheng, Chun Yuan, Jiancheng Li, Haiqin Yang

Knowledge Graphs

Recursive neural network (RvNN) has been proved to be an effective and promising tool to learn sentence representations by explicitly exploiting the sentence structure. However, most existing work can only exploit simple tree structure, e.g., binary trees, or ignore the order of nodes, which yields suboptimal performance. In this paper, we propose a novel neural network, namely TreeNet, to capture sentences structurally over the raw unconstrained constituency trees, where the number of child nodes can be arbitrary. In TreeNet, each node is learning from its left sibling and right child in a bottom-up left-to-right order, thus enabling the net to learn over any tree. Furthermore, multiple soft gates and a memory cell are employed in implementing the TreeNet to determine to what extent it should learn, remember and output, which proves to be a simple and efficient mechanism for semantic synthesis. Moreover, TreeNet significantly surpasses convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) with fewer parameters. It improves the classification accuracy by 2%-5% with 42% of the best CNN’s parameters or 94% of standard LSTM’s. Extensive experiments demonstrate TreeNet achieves the state-of-the-art performance on all four typical text classification tasks.
#1735

Efficient Pruning of Large Knowledge Graphs
Stefano Faralli, Irene Finocchi, Simone Paolo Ponzetto, Paola Velardi

Knowledge Graphs

In this paper we present an efficient and highly accurate algorithm to prune noisy or over-ambiguous knowledge graphs given as input an extensional definition of a domain of interest, namely as a set of instances or concepts. Our method climbs the graph in a bottom-up fashion, iteratively layering the graph and pruning nodes and edges in each layer while not compromising the connectivity of the set of input nodes. Iterative layering and protection of pre-defined nodes allow us to extract semantically coherent DAG structures from noisy or over-ambiguous cyclic graphs, without loss of information and without incurring computational bottlenecks, which are the main problem of state-of-the-art methods for cleaning large, i.e., Web-scale, knowledge graphs. We apply our algorithm to the tasks of pruning automatically acquired taxonomies using benchmarking data from a SemEval evaluation exercise, as well as the extraction of a domain-adapted taxonomy from the Wikipedia category hierarchy. The results show the superiority of our approach over state-of-the-art algorithms in terms of both output quality and computational efficiency.
#5144

(Sister Conferences Best Papers Track) Completeness-aware Rule Learning from Knowledge Graphs
Thomas Pellissier Tanon, Daria Stepanova, Simon Razniewski, Paramita Mirza, Gerhard Weikum

Knowledge Graphs

Knowledge graphs (KGs) are huge collections of primarily encyclopedic facts that are widely used in entity recognition, structured search, question answering, and similar tasks. Rule mining is commonly applied to discover patterns in KGs. However, unlike in traditional association rule mining, KGs provide a setting with a high degree of incompleteness, which may result in the wrong estimation of the quality of mined rules, leading to erroneous beliefs such as "all artists have won an award". In this paper we propose to use (in-)completeness meta-information to better assess the quality of rules learned from incomplete KGs. We introduce completeness-aware scoring functions for relational association rules. Experimental evaluation both on real and synthetic datasets shows that the proposed rule ranking approaches have remarkably higher accuracy than the state-of-the-art methods in uncovering missing facts.

Wednesday 18 14:55 - 16:10 ML-INT - Interpretability (K11)

Chair: Xintao Wu

#4208

Interpretable Adversarial Perturbation in Input Embedding Space for Text
Motoki Sato, Jun Suzuki, Hiroyuki Shindo, Yuji Matsumoto

Interpretability

Following great success in the image processing field, the idea of adversarial training has been applied to tasks in the natural language processing (NLP) field. One promising approach directly applies adversarial training developed in the image processing field to the input word embedding space instead of the discrete input space of texts. However, this approach abandons such interpretability as generating adversarial texts to significantly improve the performance of NLP tasks. This paper restores interpretability to such methods by restricting the directions of perturbations toward the existing words in the input embedding space. As a result, we can straightforwardly reconstruct each input with perturbations to an actual text by considering the perturbations to be the replacement of words in the sentence while maintaining or even improving the task performance.
#676

Contextual Outlier Interpretation
Ninghao Liu, Donghwa Shin, Xia Hu

Interpretability

While outlier detection has been intensively studied in many applications, interpretation is becoming increasingly important to help people trust and evaluate the developed detection models by providing intrinsic reasons why the given outliers are identified. Interpreting the abnormality of outliers is a nontrivial task due to the distinct characteristics of different detection models, complicated structures of data in certain applications, and the imbalanced distribution of outliers and normal instances. In addition, the contexts in which outliers are located, as well as the relation between outliers and those contexts, are usually overlooked in existing interpretation frameworks. To tackle these issues, in this paper, we propose a Contextual Outlier INterpretation (COIN) framework to explain the abnormality of outliers spotted by detectors. The interpretability of an outlier is achieved through three aspects, i.e., outlierness score, attributes that contribute to the abnormality, and contextual description of its neighborhoods. Experimental results on various types of datasets demonstrate the flexibility and effectiveness of the proposed framework.
#4026

A Symbolic Approach to Explaining Bayesian Network Classifiers
Andy Shih, Arthur Choi, Adnan Darwiche

Interpretability

We propose an approach for explaining Bayesian network classifiers, which is based on compiling such classifiers into decision functions that have a tractable and symbolic form. We introduce two types of explanations for why a classifier may have classified an instance positively or negatively and suggest algorithms for computing these explanations. The first type of explanation identifies a minimal set of the currently active features that is responsible for the current classification, while the second type of explanation identifies a minimal set of features whose current state (active or not) is sufficient for the classification. We consider in particular the compilation of Naive and Latent-Tree Bayesian network classifiers into Ordered Decision Diagrams (ODDs), providing a context for evaluating our proposal using case studies and experiments based on classifiers from the literature.
#4223

Mixed Causal Structure Discovery with Application to Prescriptive Pricing
Wei Wenjuan, Feng Lu, Liu Chunchen

Interpretability

Prescriptive pricing is one of the most advanced pricing techniques, which derives the optimal price strategy to maximize the future profit/revenue by carrying out a two-stage process, demand modeling and price optimization. Demand modeling tries to reveal price-demand laws by discovering causal relationships among demands, prices, and objective factors, which is the foundation of price optimization. Existing methods use either regression or causal learning for uncovering the price-demand relations, but suffer from pain points in either accuracy/efficiency or mixed data type processing, while all of these are actual requirements in practical pricing scenarios. This paper proposes a novel demand modeling technique for practical usage. Concretely, we propose a new locally consistent information criterion named MIC, and derive MIC-based inference algorithms for an accurate recovery of causal structure on mixed factor space. Experiments on simulated/real datasets show the superiority of our new approach in both price-demand law recovery and demand forecasting, as well as promising performance in supporting optimal pricing.
#5452

(Journal track) Learning Explanatory Rules from Noisy Data
Richard Evans, Edward Grefenstette

Interpretability

Artificial Neural Networks are powerful function approximators capable of modelling solutions to a wide variety of problems, both supervised and unsupervised. As their size and expressivity increase, so too does the variance of the model, yielding a nearly ubiquitous overfitting problem. Although mitigated by a variety of model regularisation methods, the common cure is to seek large amounts of training data (which is not necessarily easily obtained) that sufficiently approximates the data distribution of the domain we wish to test on. In contrast, logic programming methods such as Inductive Logic Programming offer an extremely data-efficient process by which models can be trained to reason on symbolic domains. However, these methods are unable to deal with the variety of domains neural networks can be applied to: they are not robust to noise in or mislabelling of inputs, and perhaps more importantly, cannot be applied to non-symbolic domains where the data is ambiguous, such as operating on raw pixels. In this paper, we propose a Differentiable Inductive Logic framework (∂ILP), which can not only solve tasks which traditional ILP systems are suited for, but shows a robustness to noise and error in the training data which ILP cannot cope with. Furthermore, as it is trained by backpropagation against a likelihood objective, it can be hybridised by connecting it with neural networks over ambiguous data in order to be applied to domains which ILP cannot address, while providing data efficiency and generalisation beyond what neural networks on their own can achieve.
#5461

(Journal track) Visualisation and 'Diagnostic Classifiers' Reveal how Recurrent and Recursive Neural Networks Process Hierarchical Structure
Dieuwke Hupkes, Willem Zuidema

Interpretability

In this paper, we investigate how recurrent neural networks can learn and process languages with hierarchical, compositional semantics. To this end, we define the artificial task of processing nested arithmetic expressions, and study whether different types of neural networks can learn to compute their meaning. We find that simple recurrent networks cannot find a generalising solution to this task, but gated recurrent neural networks perform surprisingly well: networks learn to predict the outcome of the arithmetic expressions with high accuracy, although performance deteriorates somewhat with increasing length. We test multiple hypotheses on the information that is encoded and processed by the networks using a method called diagnostic classification. In this method, simple neural classifiers are used to test sequences of predictions about features of the hidden state representations at each time step. Our results indicate that the networks follow a strategy similar to our hypothesised ‘cumulative strategy’, which explains the high accuracy of the network on novel expressions, the generalisation to longer expressions than seen in training, and the mild deterioration with increasing length. This, in turn, shows that diagnostic classifiers can be a useful technique for opening up the black box of neural networks.
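To make the nested-arithmetic task in this abstract concrete, here is a small sketch of one possible incremental, left-to-right evaluation scheme: a running result plus a stack of signs for the enclosing brackets. It is offered as one reading of what a cumulative strategy could look like, under the assumption that expressions contain only integers, `+`, `-`, and brackets; the function name and token format are hypothetical and not taken from the paper.

```python
def cumulative_eval(tokens):
    """Evaluate a bracketed +/- expression in a single left-to-right pass.

    Assumed token format: a list of strings such as
    ["(", "5", "-", "(", "2", "+", "3", ")", ")"].
    """
    result = 0
    sign = 1        # sign applied to the next number
    stack = [1]     # signs of the enclosing brackets; 1 is the top level
    for tok in tokens:
        if tok == "(":
            stack.append(sign)    # everything inside inherits this sign
        elif tok == ")":
            sign = stack.pop()    # restore the sign in force at the "("
        elif tok == "+":
            sign = stack[-1]
        elif tok == "-":
            sign = -stack[-1]
        else:
            result += sign * int(tok)
    return result

simple = cumulative_eval("( 5 - ( 2 + 3 ) )".split())
nested = cumulative_eval("( ( 1 - 2 ) - ( 3 - 4 ) )".split())
flat = cumulative_eval("1 - 2".split())
```

The point of the sketch is that only a running total and a small stack are needed at each time step, which is the kind of intermediate quantity a diagnostic classifier could be trained to read out from hidden states.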

Wednesday 18 14:55 - 16:10 UAI-KR - Bayesian Networks (C2)

Chair: Mikko Koivisto

#1935

Algorithms for the Nearest Assignment Problem
Sara Rouhani, Tahrima Rahman, Vibhav Gogate

Bayesian Networks

We consider the following nearest assignment problem (NAP): given a Bayesian network B and probability value q, find a configuration w of variables in B such that the difference between q and the probability of w is minimized. NAP is much harder than conventional inference problems such as finding the most probable explanation and is NP-hard even on independent Bayesian networks (IBNs), which are networks having no edges. Therefore, in order to solve NAP on IBNs, we show how to encode it as a two-way number partitioning problem. This encoding allows us to use greedy poly-time approximation algorithms from the number partitioning literature to yield an algorithm with guarantees for solving NAP on IBNs. We extend this basic algorithm from independent networks to arbitrary probabilistic graphical models by leveraging cutset conditioning and (Rao-Blackwellised) sampling algorithms. We derive approximation and complexity guarantees for our new algorithms and show experimentally that they are quite accurate in practice.
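The problem statement can be made concrete with a brute-force sketch for a tiny independent Bayesian network over binary variables, where the probability of an assignment is simply the product of per-variable marginals. This exhaustive search is exponential in the number of variables and is not the number-partitioning algorithm the abstract refers to; the function name and input format are hypothetical.

```python
from itertools import product

def nearest_assignment(marginals, q):
    """Exhaustively find the assignment whose probability is closest to q.

    marginals[i] is the assumed probability that binary variable i equals 1
    in a network with no edges. Exponential time: illustration only.
    """
    best, best_prob = None, None
    for assignment in product([0, 1], repeat=len(marginals)):
        prob = 1.0
        for value, p in zip(assignment, marginals):
            prob *= p if value == 1 else 1.0 - p
        if best is None or abs(prob - q) < abs(best_prob - q):
            best, best_prob = assignment, prob
    return best, best_prob

# Four assignments with probabilities 0.02, 0.08, 0.18 and 0.72.
assignment, prob = nearest_assignment([0.9, 0.8], q=0.1)
```

Taking logarithms turns the product of marginals into a sum of per-variable terms, which gives some intuition for why a partitioning-style view of the independent case is natural.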
#2439

Estimation with Incomplete Data: The Linear Case
Karthika Mohan, Felix Thoemmes, Judea Pearl

Bayesian Networks

Traditional methods for handling incomplete data, including Multiple Imputation and Maximum Likelihood, require that the data be Missing At Random (MAR). In most cases, however, missingness in a variable depends on the underlying value of that variable. In this work, we devise model-based methods to consistently estimate mean, variance and covariance given data that are Missing Not At Random (MNAR). While previous work on MNAR data requires variables to be discrete, we extend the analysis to continuous variables drawn from Gaussian distributions. We demonstrate the merits of our techniques by comparing them empirically to state-of-the-art software packages.
#2887

Stochastic Anytime Search for Bounding Marginal MAP
Radu Marinescu, Rina Dechter, Alexander Ihler

Bayesian Networks

The Marginal MAP inference task is known to be extremely hard particularly because the evaluation of each complete MAP assignment involves an exact likelihood computation (a combinatorial sum). For this reason, most recent state-of-the-art solvers that focus on computing anytime upper and lower bounds on the optimal value are limited to solving instances with tractable conditioned summation subproblems. In this paper, we develop new search-based bounding schemes for Marginal MAP that produce anytime upper and lower bounds without performing exact likelihood computations. The empirical evaluation demonstrates the effectiveness of our new methods against the current best-performing search-based bounds.
#4071

On Robust Trimming of Bayesian Network Classifiers
YooJung Choi, Guy Van den Broeck

Bayesian Networks

This paper considers the problem of removing costly features from a Bayesian network classifier. We want the classifier to be robust to these changes, and maintain its classification behavior. To this end, we propose a closeness metric between Bayesian classifiers, called the expected classification agreement (ECA). Our corresponding trimming algorithm finds an optimal subset of features and a new classification threshold that maximize the expected agreement, subject to a budgetary constraint. It utilizes new theoretical insights to perform branch-and-bound search in the space of feature sets, while computing bounds on the ECA. Our experiments investigate both the runtime cost of trimming and its effect on the robustness and accuracy of the final classifier.
#5458

(Journal track) Learning Continuous Time Bayesian Networks in Non-stationary Domains
Simone Villa, Fabio Stella

Bayesian Networks

Non-stationary continuous time Bayesian networks are introduced. They allow the parent set of each node in a continuous time Bayesian network to change over time. Structural learning of non-stationary continuous time Bayesian networks is developed under different knowledge settings. A macroeconomic dataset is used to assess the effectiveness of learning non-stationary continuous time Bayesian networks from real-world data.
#2973

Extracting Job Title Hierarchy from Career Trajectories: A Bayesian Perspective
Huang Xu, Zhiwen Yu, Bin Guo, Mingfei Teng, Hui Xiong

Bayesian Networks

A job title usually implies the responsibility and the rank of a job position. While traditional job title analysis has been focused on studying the responsibilities of job titles, this paper attempts to reveal the rank of job titles. Specifically, we propose to extract job title hierarchy from employees' career trajectories. Along this line, we first quantify the Difficulty of Promotion (DOP) from one job title to another by a monotonic transformation of the length of tenure based on the assumption that a longer tenure usually implies a greater difficulty to be promoted. Then, the difference of two job title ranks is defined as a mapping of the DOP observed from job transitions. A Gaussian Bayesian Network (GBN) is adopted to model the joint distribution of the job title ranks and the DOPs in a career trajectory. Furthermore, a stochastic algorithm is developed for inferring the posterior job title rank by a given collection of DOPs in the GBN. Finally, experiments on more than 20 million job trajectories show that the job title hierarchy can be extracted precisely by the proposed method.

Wednesday 18 14:55 - 16:10 ML-SSL - Semi-Supervised Learning (C3)

Chair: Ming Li

#1411

Tri-net for Semi-Supervised Deep Learning
Dong-Dong Chen, Wei Wang, Wei Gao, Zhi-Hua Zhou

Semi-Supervised Learning

Deep neural networks have witnessed great successes in various real applications, but they require a large amount of labeled data for training. In this paper, we propose tri-net, a deep neural network which is able to use massive unlabeled data to help learning with limited labeled data. We consider model initialization, diversity augmentation and pseudo-label editing simultaneously. In our work, we utilize output smearing to initialize modules, use fine-tuning on labeled data to augment diversity and eliminate unstable pseudo-labels to alleviate the influence of suspicious pseudo-labeled data. Experiments show that our method achieves the best performance in comparison with state-of-the-art semi-supervised deep learning methods. In particular, it achieves 8.30% error rate on CIFAR-10 by using only 4000 labeled examples.
#2164

Adversarial Constraint Learning for Structured Prediction
Hongyu Ren, Russell Stewart, Jiaming Song, Volodymyr Kuleshov, Stefano Ermon

Semi-Supervised Learning

Constraint-based learning reduces the burden of collecting labels by having users specify general properties of structured outputs, such as constraints imposed by physical laws. We propose a novel framework for simultaneously learning these constraints and using them for supervision, bypassing the difficulty of using domain expertise to manually specify constraints. Learning requires a black-box simulator of structured outputs, which generates valid labels, but need not model their corresponding inputs or the input-label relationship. At training time, we constrain the model to produce outputs that cannot be distinguished from simulated labels by adversarial training. Providing our framework with a small number of labeled inputs gives rise to a new semi-supervised structured prediction model; we evaluate this model on multiple tasks --- tracking, pose estimation and time series prediction --- and find that it achieves high accuracy with only a small number of labeled inputs. In some cases, no labels are required at all.
#851

Solving Separable Nonsmooth Problems Using Frank-Wolfe with Uniform Affine Approximations
Edward Cheung, Yuying Li

Semi-Supervised Learning

Frank-Wolfe methods (FW) have gained significant interest in the machine learning community due to their ability to efficiently solve large problems that admit a sparse structure (e.g. sparse vectors and low-rank matrices). However the performance of the existing FW method hinges on the quality of the linear approximation. This typically restricts FW to smooth functions for which the approximation quality, indicated by a global curvature measure, is reasonably good. In this paper, we propose a modified FW algorithm amenable to nonsmooth functions, subject to a separability assumption, by optimizing for approximation quality over all affine functions, given a neighborhood of interest. We analyze theoretical properties of the proposed algorithm and demonstrate that it overcomes many issues associated with existing methods in the context of nonsmooth low-rank matrix estimation.
#91

Teaching Semi-Supervised Classifier via Generalized Distillation
Chen Gong, Xiaojun Chang, Meng Fang, Jian Yang

Semi-Supervised Learning

Semi-Supervised Learning (SSL) is able to build a reliable classifier with very scarce labeled examples by properly utilizing the abundant unlabeled examples. However, existing SSL algorithms often yield unsatisfactory performance due to the lack of supervision information. To address this issue, this paper formulates SSL as a Generalized Distillation (GD) problem, which treats an existing SSL algorithm as a learner and introduces a teacher to guide the learner's training process. Specifically, the intelligent teacher holds the privileged knowledge that "explains" the training data but remains unknown to the learner, and the teacher should convey its rich knowledge to the imperfect learner through a specific teaching function. After that, the learner gains knowledge by "imitating" the output of the teaching function under an optimization framework. Therefore, the learner in our algorithm learns from both the teacher and the training data, so its output can be substantially distilled and enhanced. By deriving the Rademacher complexity and error bounds of the proposed algorithm, the usefulness of the introduced teacher is theoretically demonstrated. The superiority of our algorithm to the related state-of-the-art methods has also been empirically demonstrated by the experiments on different datasets with various sources of privileged knowledge.
#1886

Semi-Supervised Optimal Margin Distribution Machines
Teng Zhang, Zhi-Hua Zhou

Semi-Supervised Learning

Semi-supervised support vector machines are an extension of standard support vector machines with unlabeled instances, and the goal is to find a label assignment of the unlabeled instances, so that the decision boundary has the maximal minimum margin on both the original labeled instances and unlabeled instances. Recent studies, however, disclosed that maximizing the minimum margin does not necessarily lead to better performance, and instead, it is crucial to optimize the margin distribution. In this paper, we propose a novel approach SODM (Semi-supervised Optimal margin Distribution Machine), which tries to assign the label to unlabeled instances and achieve optimal margin distribution simultaneously. Specifically, we characterize the margin distribution by the first- and second-order statistics, i.e., the margin mean and variance, and extend a stochastic mirror prox method to solve the resultant minimax problem. Extensive experiments on UCI data sets show that SODM is significantly better than compared methods, which verifies the superiority of optimal margin distribution learning.
#2348

Semi-Supervised Multi-Modal Learning with Incomplete Modalities
Yang Yang, De-Chuan Zhan, Xiang-Rong Sheng, Yuan Jiang

Semi-Supervised Learning

In real world applications, data often come with multiple modalities. Researchers have proposed multi-modal learning approaches for integrating the information from different modalities. Most of the previous multi-modal methods assume that training examples have complete modalities. However, due to failures of data collection, self-deficiencies and various other reasons, multi-modal examples usually have incomplete feature representations in real applications. In this paper, the incomplete feature representation issues in multi-modal learning are named incomplete modalities, and we propose a semi-supervised multi-modal learning method aimed at this incomplete modal issue (SLIM). SLIM can utilize the extrinsic information from unlabeled data against the insufficiencies brought by the incomplete modal issues in a semi-supervised scenario. Besides, the proposed SLIM forms the problem into a unified framework which can be treated as a classifier or clustering learner, and integrates the intrinsic consistencies and extrinsic unlabeled information. As SLIM can extract the most discriminative predictors for each modality, experiments on 15 real world multi-modal datasets validate the effectiveness of our method.

Wednesday 18 14:55 - 16:10 CV-CV1 - Computer Vision 1 (T5)

Chair: Zhou Zhao

#512

Better and Faster: Knowledge Transfer from Multiple Self-supervised Learning Tasks via Graph Distillation for Video Classification
Chenrui Zhang, Yuxin Peng

Computer Vision 1

Video representation learning is a vital problem for the classification task. Recently, a promising unsupervised paradigm termed self-supervised learning has emerged, which explores inherent supervisory signals implied in massive data for feature learning via solving auxiliary tasks. However, existing methods in this regard suffer from two limitations when extended to video classification. First, they focus only on a single task, ignoring complementarity among different task-specific features and thus resulting in suboptimal video representation. Second, high computational and memory cost hinders their application in real-world scenarios. In this paper, we propose a graph-based distillation framework to address these problems: (1) We propose a logits graph and a representation graph to transfer knowledge from multiple self-supervised tasks, where the former distills classifier-level knowledge by solving a multi-distribution joint matching problem, and the latter distills internal feature knowledge from pairwise ensembled representations while tackling the challenge of heterogeneity among different features; (2) The proposal, which adopts a teacher-student framework, can reduce the redundancy of knowledge learned from teachers dramatically, leading to a lighter student model that solves the classification task more efficiently. Experimental results on 3 video datasets validate that our proposal not only helps learn better video representation but also compresses the model for faster inference.
#1907

When Image Denoising Meets High-Level Vision Tasks: A Deep Learning Approach
Ding Liu, Bihan Wen, Xianming Liu, Zhangyang Wang, Thomas Huang

Computer Vision 1

Conventionally, image denoising and high-level vision tasks are handled separately in computer vision. In this paper, we cope with the two jointly and explore the mutual influence between them. First we propose a convolutional neural network for image denoising which achieves the state-of-the-art performance. Second we propose a deep neural network solution that cascades two modules for image denoising and various high-level tasks, respectively, and uses the joint loss for updating only the denoising network via back-propagation. We demonstrate that on one hand, the proposed denoiser has the generality to overcome the performance degradation of different high-level vision tasks. On the other hand, with the guidance of high-level vision information, the denoising network can generate more visually appealing results. To the best of our knowledge, this is the first work investigating the benefit of exploiting image semantics simultaneously for image denoising and high-level vision tasks via deep learning.
#560

Robust Face Sketch Synthesis via Generative Adversarial Fusion of Priors and Parametric Sigmoid
Shengchuan Zhang, Rongrong Ji, Jie Hu, Yue Gao, Chia-Wen Lin

Computer Vision 1

Despite the extensive progress in face sketch synthesis, existing methods mostly work only under constrained conditions, such as fixed illumination, pose, background and ethnic origin, that are hard to control in real-world scenarios. The key issue lies in the difficulty of using data under fixed conditions to train a model against imaging variations. In this paper, we propose a novel generative adversarial network termed pGAN, which can generate face sketches efficiently using training data under fixed conditions and handle the aforementioned uncontrolled conditions. In pGAN, we embed key photo priors into the process of synthesis and design a parametric sigmoid activation function for compensating illumination variations. Compared to the existing methods, we quantitatively demonstrate that the proposed method can work well on face photos in the wild.
#510

Scanpath Prediction for Visual Attention using IOR-ROI LSTM
Zhenzhong Chen, Wanjie Sun

Computer Vision 1

Predicting scanpath when a certain stimulus is presented plays an important role in modeling visual attention and search. This paper presents a model that integrates convolutional neural network and long short-term memory (LSTM) to generate realistic scanpaths. The core part of the proposed model is a dual LSTM unit, i.e., an inhibition of return LSTM (IOR-LSTM) and a region of interest LSTM (ROI-LSTM), capturing IOR dynamics and gaze shift behavior simultaneously. IOR-LSTM simulates the visual working memory to adaptively integrate and forget scene information. ROI-LSTM is responsible for predicting the next ROI given the inhibited image features. Experimental results indicate that the proposed architecture can achieve superior performance in predicting scanpaths.
#1026

Learning to Write Stylized Chinese Characters by Reading a Handful of Examples
Danyang Sun, Tongzheng Ren, Chongxuan Li, Hang Su, Jun Zhu

Computer Vision 1

Automatically writing stylized characters is an attractive yet challenging task, especially for Chinese characters with complex shapes and structures. Most current methods are restricted to generating stylized characters already present in the training set, and require retraining the model when generating characters of new styles. In this paper, we develop a novel framework of Style-Aware Variational Auto-Encoder (SA-VAE), which disentangles the content-relevant and style-relevant components of a Chinese character feature with a novel intercross pair-wise optimization method. In this case, our method can generate Chinese characters flexibly by reading a few examples. Experiments demonstrate that our method has a powerful one-shot/few-shot generalization ability by inferring the style representation, which is the first attempt to learn to write new-style Chinese characters by observing only one or a few examples.
#802

DehazeGAN: When Image Dehazing Meets Differential Programming
Hongyuan Zhu, Xi Peng, Vijay Chandrasekhar, Liyuan Li, Joo-Hwee Lim

Computer Vision 1

Single image dehazing has been a classic topic in computer vision for years. Motivated by the atmospheric scattering model, the key to satisfactory single image dehazing relies on an estimation of two physical parameters, i.e., the global atmospheric light and the transmission coefficient. Most existing methods employ a two-step pipeline to estimate these two parameters with heuristics which accumulate errors and compromise dehazing quality. Inspired by differentiable programming, we re-formulate the atmospheric scattering model into a novel generative adversarial network (DehazeGAN). Such a reformulation and adversarial learning allow the two parameters to be learned simultaneously and automatically from data by optimizing the final dehazing performance so that clean images with faithful color and structures are directly produced. Moreover, our reformulation also greatly improves the GAN’s interpretability and quality for single image dehazing. To the best of our knowledge, our method is one of the first works to explore the connection among generative adversarial models, image dehazing, and differentiable programming, which advance the theories and application of these areas. Extensive experiments on synthetic and realistic data show that our method outperforms state-of-the-art methods in terms of PSNR, SSIM, and subjective visual quality.

Wednesday 18 15:00 - 16:00 Industry Day (A4)

Industry Day - Session 2a

Industry Day

Wednesday 18 16:40 - 17:40 Industry Day (A4)

Industry Day - Session 2b

Industry Day

Wednesday 18 16:40 - 18:20 DEMOS2 - Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation (VICTORIA)

Chair: Nevin L. Zhang

#5309

Hatebusters: A Web Application for Actively Reporting YouTube Hate Speech
Antonios Anagnostou, Ioannis Mollas, Grigorios Tsoumakas

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

Hatebusters is a web application for actively reporting YouTube hate speech, aiming to establish an online community of volunteer citizens. Hatebusters searches YouTube for videos with potentially hateful comments, scores their comments with a classifier trained on human-annotated data and presents users those comments with the highest probability of being hate speech. It also employs gamification elements, such as achievements and leaderboards, to drive user engagement.
#5310

A Wearable Device for Online and Long-Term ECG Monitoring
Marco Longoni, Diego Carrera, Beatrice Rossi, Pasqualina Fragneto, Marco Pessione, Giacomo Boracchi

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

We present a prototype wearable device able to perform online and long-term monitoring of ECG signals, and detect anomalous heartbeats such as arrhythmias. Our solution is based on user-specific dictionaries which characterize the morphology of normal heartbeats and are learned every time the device is positioned. Anomalies are detected via an optimized sparse coding procedure, which assesses the conformance of each heartbeat to the user-specific dictionary. The dictionaries are adapted during online monitoring, to track heart rate variations occurring during everyday activities. Perhaps surprisingly, dictionary adaptation can be successfully performed by transformations that are user-independent and learned from large datasets of ECG signals.
#5316

Semantic Representation of Data Science Programs
Evan Patterson, Ioana Baldini, Aleksandra Mojsilović, Kush R. Varshney

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

Your computer is continuously executing programs, but does it really understand them? Not in any meaningful sense. That burden falls upon human knowledge workers, who are increasingly asked to write and understand code. They would benefit greatly from intelligent tools that reveal the connections between their code and its subject matter. Towards this prospect, we present an AI system that forms semantic representations of computer programs, using techniques from knowledge representation and program analysis. These representations are created through a novel algorithm for the semantic enrichment of dataflow graphs. We illustrate its workings with examples from the field of data science. The algorithm is undergirded by a new ontology language for modeling computer programs and a new ontology about data science, written in this language.
#5319

Using a Deep Learning Dialogue Research Toolkit in a Multilingual Multidomain Practical Application
Graham Wilcock

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

The demo shows a practical application of an open-source research toolkit developed by the University of Cambridge. The toolkit (PyDial) supports research on deep reinforcement learning for multi-domain dialogues. The application (CityTalk) is a spoken dialogue system for robots that give information to tourists about local hotels and restaurants. We had a very positive experience using the toolkit, but in a few areas we decided to do things our own way.
#5329

Balanced News Using Constrained Bandit-based Personalization
Sayash Kapoor, Vijay Keswani, Nisheeth K. Vishnoi, L. Elisa Celis

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

We present a prototype for a news search engine that presents balanced viewpoints across liberal and conservative articles with the goal of depolarizing content and allowing users to escape their filter bubble. The balancing is done according to flexible user-defined constraints, and leverages recent advances in constrained bandit optimization. We showcase our balanced news feed by displaying it side-by-side with the news feed produced by a traditional (polarized) feed.
#5323

Repairing ASR output by Artificial Development and Ontology based Learning
C. Anantaram, Amit Sangroya, Mrinal Rawat, Aishwarya Chhabra

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

General purpose automatic speech recognition (gpASR) systems such as Google, Watson, etc. sometimes output inaccurate sentences when used in a domain specific scenario as they may not have had enough training samples for that particular domain and context. Further, the accent of the speaker and the environmental conditions in which the speaker speaks a sentence may influence the speech engine to recognize certain words inaccurately. Many approaches to improve the accuracy of ASR output exist. However, in the context of a domain and the environment in which a speaker speaks the sentences, gpASR output needs a lot of improvement in order to provide effective speech interfaces to domain-specific systems. In this paper, we demonstrate a method that combines bio-inspired artificial development (ArtDev) with machine learning (ML) approaches to repair the output of a gpASR. Our method factors in the environment to tailor the repair process.
#5349

Automated Reasoning for City Infrastructure Maintenance Decision Support
Lijun Wei, Derek R. Magee, Vania Dimitrova, Barry Clarke, Heshan Du, Quratul-ain Mahesar, Kareem Al Ammari, Anthony G. Cohn

Demos Talks 2: Machine Learning, Natural Language Processing, Knowledge Representation

We present an interactive decision support system for assisting city infrastructure inter-asset management. It combines real-time site specific data retrieval, a knowledge base co-created with domain experts and an inference engine capable of predicting potential consequences and risks resulting from the available data and knowledge. The system can give explanations of each consequence, cope with incomplete and uncertain data by making assumptions about what might be the worst case scenario, and make suggestions for further investigation. This demo presents multiple real-world scenarios, and demonstrates how modifying assumptions (parameter values) can lead to different consequences.

Wednesday 18 16:40 - 18:20 KR-KBD - Knowledge, Belief, Diagnosis, Abduction (C7)

Chair: Brendan Juba

#2005

Belief Update in the Horn Fragment
Nadia Creignou, Adrian Haret, Odile Papini, Stefan Woltran

Knowledge, Belief, Diagnosis, Abduction

In line with recent work on belief change in fragments of propositional logic, we study belief update in the Horn fragment. We start from the standard KM postulates used to axiomatize belief update operators; these postulates lend themselves to semantic characterizations in terms of partial (resp. total) preorders on possible worlds. Since the Horn fragment is not closed under disjunction, the standard postulates have to be adapted for the Horn fragment. Moreover, a restriction on the preorders (i.e., Horn compliance) and additional postulates are needed to obtain sensible characterizations for the Horn fragment, and this leads to our main contribution: a representation result which shows that the class of update operators captured by Horn compliant partial (resp. total) preorders over possible worlds is precisely that given by the adapted and augmented Horn update postulates. With these results at hand, we provide concrete Horn update operators and are able to shed light on Horn revision operators based on partial preorders.
#4518

The Complexity of Limited Belief Reasoning—The Quantifier-Free Case
Yijia Chen, Abdallah Saffidine, Christoph Schwering

Knowledge, Belief, Diagnosis, Abduction

The classical view of epistemic logic is that an agent knows all the logical consequences of their knowledge base. This assumption of logical omniscience is often unrealistic and makes reasoning computationally intractable. One approach to avoid logical omniscience is to limit reasoning to a certain belief level, which intuitively measures the reasoning "depth". This paper investigates the computational complexity of reasoning with belief levels. First we show that while reasoning remains tractable if the level is constant, the complexity jumps to PSPACE-complete -- that is, beyond classical reasoning -- when the belief level is part of the input. Then we further refine the picture using parameterized complexity theory to investigate how the belief level and the number of non-logical symbols affect the complexity.
#3232

Single-Shot Epistemic Logic Program Solving
Manuel Bichler, Michael Morak, Stefan Woltran

Knowledge, Belief, Diagnosis, Abduction

Epistemic Logic Programs (ELPs) are an extension of Answer Set Programming (ASP) with epistemic operators that allow for a form of meta-reasoning, that is, reasoning over multiple possible worlds. Existing ELP solving approaches generally rely on making multiple calls to an ASP solver in order to evaluate the ELP. However, in this paper, we show that there also exists a direct translation from ELPs into non-ground ASP with bounded arity. The resulting ASP program can thus be solved in a single shot. We then implement this encoding method, using recently proposed techniques to handle large, non-ground ASP rules, into a prototype ELP solving system. This solver exhibits competitive performance on a set of ELP benchmark instances.
#3019

Leveraging Qualitative Reasoning to Improve SFL
Alexandre Perez, Rui Abreu

Knowledge, Belief, Diagnosis, Abduction

Spectrum-based fault localization (SFL) correlates a system's components with observed failures. By reasoning about coverage, SFL allows for a lightweight way of pinpointing faults. This abstraction comes at the cost of missing certain faults, such as errors of omission, and failing to provide enough contextual information to explain why components are considered suspicious. We propose an approach, named Q-SFL, that leverages qualitative reasoning to augment the information made available to SFL techniques. It qualitatively partitions system components, and treats each qualitative state as a new SFL component to be used when diagnosing. Our empirical evaluation shows that augmenting SFL with qualitative components can improve diagnostic accuracy in 54% of the considered real-world subjects.
#714

Abducing Relations in Continuous Spaces
Taisuke Sato, Katsumi Inoue, Chiaki Sakama

Knowledge, Belief, Diagnosis, Abduction

We propose a new approach to abduction, i.e., non-deductive inference to find a hypothesis H for an observation O such that H,KB |- O where KB is background knowledge. We reformulate it linear algebraically in vector spaces to abduce "relations", not logical formulas, to realize approximate but scalable abduction that can deal with web-scale knowledge bases. More specifically we consider the problem of abducing relations for Datalog programs with binary predicates. We treat two cases, the non-recursive case and the recursive case. In the non-recursive case, given r1(X,Y) and r3(X,Z), we abduce r2(Y,Z) so that r3(X,Z) <= r1(X,Y)&r2(Y,Z) approximately holds, by computing a matrix R2 that approximately satisfies a matrix equation R3 = min1(R1R2) containing a nonlinear function min1(x). Here R1, R2 and R3 encode r1(X,Y), r2(Y,Z) and r3(X,Z) as adjacency matrices, respectively. We apply this matrix-based abduction to rule discovery and relation discovery in a knowledge graph. The recursive case is mathematically more involved and computationally more difficult but solvable by deriving a recursive matrix equation and solving it. We illustrate concrete recursive cases including a transitive closure relation.
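
As an illustration of the matrix equation in the abstract above, the following minimal sketch checks whether a candidate R2 satisfies R3 = min1(R1R2) on a toy example. It assumes that min1(x) means the elementwise min(x, 1), which the abstract does not spell out; the function names and the toy matrices are hypothetical, and the sketch only covers the forward check, not the authors' method for computing R2.

```python
# Toy forward check of R3 = min1(R1 R2) for 0/1 adjacency matrices.
# Assumption: min1(x) = min(x, 1), applied elementwise.

def matmul(a, b):
    """Plain matrix product of two lists-of-lists."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]

def min1(m):
    """Clip every entry at 1 so path counts become 0/1 truth values."""
    return [[min(x, 1) for x in row] for row in m]

def compose(r1, r2):
    """Adjacency matrix of the relation r1(X,Y) & r2(Y,Z), projected on (X,Z)."""
    return min1(matmul(r1, r2))

# Hypothetical toy relations over two constants.
R1 = [[1, 1],
      [0, 1]]
R2 = [[1, 0],
      [1, 0]]
R3 = compose(R1, R2)  # the raw product has an entry 2, which min1 clips to 1
```

Clipping is what turns the integer matrix product, which counts intermediate Y values, back into a Boolean relation.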
#3509

An Operational Semantics for a Fragment of PRS
Lavindra de Silva, Felipe Meneguzzi, Brian Logan

Knowledge, Belief, Diagnosis, Abduction

The Procedural Reasoning System (PRS) is arguably the first implementation of the Belief--Desire--Intention (BDI) approach to agent programming. PRS remains extremely influential, directly or indirectly inspiring the development of subsequent BDI agent programming languages. However, perhaps surprisingly given its centrality in the BDI paradigm, PRS lacks a formal operational semantics, making it difficult to determine its expressive power relative to other agent programming languages. This paper takes a first step towards closing this gap, by giving a formal semantics for a significant fragment of PRS. We prove key properties of the semantics relating to PRS-specific programming constructs, and show that even the fragment of PRS we consider is strictly more expressive than the plan constructs found in typical BDI languages.
#5469

(Journal track) Prime Implicate Generation in Equational Logic (extended abstract)
Mnacho Echenim, Nicolas Peltier, Sophie Tourret

Knowledge, Belief, Diagnosis, Abduction

A procedure is proposed to efficiently generate sets of ground implicates of first-order formulas with equality. It is based on a tuning of the superposition calculus, enriched with rules that add new hypotheses on demand during the proof search. Experimental results are presented, showing that the proposed approach is more efficient than state-of-the-art systems.
#5475

(Journal track) Preference-Based Inconsistency Management in Multi-Context Systems
Thomas Eiter, Antonius Weinzierl

Knowledge, Belief, Diagnosis, Abduction

Establishing information exchange between existing knowledge-based systems can lead to devastating inconsistency. Automatic resolution of inconsistency often is unsatisfactory, because any modification of the information flow may lead to bad or even dangerous conclusions. Methods to identify and select preferred repairs of inconsistency are thus needed. In this work, we leverage the expressive power and generality of Multi-Context Systems (MCS), a formalism for information exchange, to select most preferred repairs, by use of a meta-reasoning transformation. As for computational complexity, finding preferred repairs is not higher than the base case; finding most-preferred repairs is higher, yet worst-case optimal.

Wednesday 18 16:40 - 18:20 MAS-AGT2 - Algorithmic Game Theory: Noncooperative Games (C8)

Chair: Bo An

#460

When Does Diversity of Agent Preferences Improve Outcomes in Selfish Routing?
Richard Cole, Thanasis Lianeas, Evdokia Nikolova

Algorithmic Game Theory: Noncooperative Games

We seek to understand when heterogeneity in agent preferences yields improved outcomes in terms of overall cost. That this might be hoped for is based on the common belief that diversity is advantageous in many multi-agent settings. We investigate this in the context of routing. Our main result is a sharp characterization of the network settings in which diversity always helps, versus those in which it is sometimes harmful. Specifically, we consider routing games, where diversity arises in the way that agents trade-off two criteria (such as time and money, or, in the case of stochastic delays, expectation and variance of delay). Our main contributions are: 1) A participant-oriented measure of cost in the presence of agent diversity; 2) A full characterization of those network topologies for which diversity always helps, for all latency functions and demands.
#3383

Trembling-Hand Perfection in Extensive-Form Games with Commitment
Gabriele Farina, Alberto Marchesi, Christian Kroer, Nicola Gatti, Tuomas Sandholm

Algorithmic Game Theory: Noncooperative Games

We initiate the study of equilibrium refinements based on trembling-hand perfection in extensive-form games with commitment strategies, that is, where one player commits to a strategy first. We show that the standard strong (and weak) Stackelberg equilibria are not suitable for trembling-hand perfection, because the limit of a sequence of such strong (weak) Stackelberg commitment strategies of a perturbed game may not be a strong (weak) Stackelberg equilibrium itself. However, we show that the universal set of all Stackelberg equilibria (i.e., those that are optimal for at least some follower response function) is natural for trembling-hand perfection: it does not suffer from the problem above. We also prove that determining the existence of a Stackelberg equilibrium--refined or not--that gives the leader expected value at least v is NP-hard. This significantly extends prior complexity results that were specific to strong Stackelberg equilibrium.
#3475

Efficient Computation of Approximate Equilibria in Discrete Colonel Blotto Games
Dong Quan Vu, Patrick Loiseau, Alonso Silva

Algorithmic Game Theory: Noncooperative Games

The Colonel Blotto game is a famous game commonly used to model resource allocation problems in many domains ranging from security to advertising. Two players distribute a fixed budget of resources on multiple battlefields to maximize the aggregate value of battlefields they win, each battlefield being won by the player who allocates more resources to it. The continuous version of the game (where players can choose any fractional allocation) has been extensively studied, albeit only with partial results to date. Recently, the discrete version (where allocations can only be integers) started to gain traction and algorithms were proposed to compute the equilibrium in polynomial time; but these remain computationally impractical for large (or even moderate) numbers of battlefields. In this paper, we propose an algorithm to compute very efficiently an approximate equilibrium for the discrete Colonel Blotto game with many battlefields. We provide a theoretical bound on the approximation error as a function of the game's parameters. We also propose an efficient dynamic programming algorithm in order to compute for each game instance the actual value of the error. We perform numerical experiments that show that the proposed strategy provides a fast and good approximation to the equilibrium even for moderate numbers of battlefields.
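To make the payoff rule described in this abstract concrete, the following is a minimal Python sketch of how one play of a discrete Colonel Blotto game could be scored. The function name `blotto_payoffs` and the equal split of a battlefield's value on a tie are assumptions made for illustration; the abstract does not state a tie-breaking rule, and this is not the authors' algorithm for computing approximate equilibria.

```python
from typing import Sequence, Tuple


def blotto_payoffs(
    alloc_a: Sequence[int],
    alloc_b: Sequence[int],
    values: Sequence[float],
) -> Tuple[float, float]:
    """Score one play of a discrete Colonel Blotto game.

    alloc_a, alloc_b: integer resources each player puts on each battlefield.
    values: value of each battlefield.
    Returns (payoff_a, payoff_b).
    """
    if not (len(alloc_a) == len(alloc_b) == len(values)):
        raise ValueError("allocations and values must have the same length")

    payoff_a = 0.0
    payoff_b = 0.0
    for a, b, v in zip(alloc_a, alloc_b, values):
        if a > b:
            # Player A allocated more resources and wins this battlefield.
            payoff_a += v
        elif b > a:
            # Player B allocated more resources and wins this battlefield.
            payoff_b += v
        else:
            # Assumption: a tie splits the battlefield's value equally.
            payoff_a += v / 2
            payoff_b += v / 2
    return payoff_a, payoff_b
```

For instance, with three battlefields of value 1 each, the allocation (3, 1, 1) against (1, 2, 2) wins only the first battlefield, which illustrates why concentrating a budget on one battlefield can lose the aggregate contest.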
#3862

An FPTAS for Computing Nash Equilibrium in Resource Graph Games
Hau Chan, Albert Xin Jiang

Algorithmic Game Theory: Noncooperative Games

We consider the problem of computing a mixed-strategy Nash equilibrium (MSNE) in resource graph games (RGGs), a compact representation for games with an exponential number of strategies. In an RGG, each player's pure strategy is a subset of resources, represented by a binary vector, and her pure strategy set is represented compactly using a set of linear inequality constraints. Given the pure strategies of the players, each player's utility depends on the resource graph and the numbers of times the neighboring resources are used. RGGs are general enough to capture a wide variety of games studied in the literature, including congestion games and security games. In this paper, we provide the first Fully Polynomial Time Approximation Scheme (FPTAS) for computing an MSNE in any symmetric multilinear RGG where its constraint moralized resource graph (a graph formed between the moralized resource graph and the constraints defining the strategy polytope) has bounded treewidth. Our FPTAS can be generalized to compute optimal MSNE, and to games with a constant number of player types. As a consequence, our FPTAS provides new approximation results for security games, network congestion games, and bilinear games.
#3943

Designing the Game to Play: Optimizing Payoff Structure in Security Games
Zheyuan Ryan Shi, Ziye Tang, Long Tran-Thanh, Rohit Singh, Fei Fang

Algorithmic Game Theory: Noncooperative Games

We study Stackelberg Security Games where the defender, in addition to allocating defensive resources to protect targets from the attacker, can strategically manipulate the attacker’s payoff under budget constraints in weighted L^p-norm form regarding the amount of change. For the case of weighted L^1-norm constraint, we present (i) a mixed integer linear program-based algorithm with approximation guarantee; (ii) a branch-and-bound based algorithm with improved efficiency achieved by effective pruning; (iii) a polynomial time approximation scheme for a special but practical class of problems. In addition, we show that problems under budget constraints in L^0 and weighted L^\infty-norm form can be solved in polynomial time.
#4182

The Price of Usability: Designing Operationalizable Strategies for Security Games
Sara Marie Mc Carthy, Corine M. Laan, Kai Wang, Phebe Vayanos, Arunesh Sinha, Milind Tambe

Algorithmic Game Theory: Noncooperative Games

We consider the problem of allocating scarce security resources among heterogeneous targets to thwart a possible attack. It is well known that deterministic solutions to this problem, being highly predictable, are severely suboptimal. To mitigate this predictability, the game-theoretic security game model was proposed, which randomizes over pure (deterministic) strategies, causing confusion in the adversary. Unfortunately, such mixed strategies typically involve randomizing over a large number of strategies, requiring security personnel to be familiar with numerous protocols, making them hard to operationalize. Motivated by these practical considerations, we propose an easy-to-use approach for computing strategies that are easy to operationalize and that bridge the gap between the static solution and the optimal mixed strategy. These strategies only randomize over an optimally chosen subset of pure strategies whose cardinality is selected by the defender, enabling them to conveniently tune the trade-off between ease of operationalization and efficiency using a single design parameter. We show that the problem of computing such operationalizable strategies is NP-hard, formulate it as a mixed-integer optimization problem, provide an algorithm for computing epsilon-optimal equilibria, and an efficient heuristic. We evaluate the performance of our approach on the problem of screening for threats at airport checkpoints and show that the Price of Usability, i.e., the loss in optimality to obtain a strategy that is easier to operationalize, is typically not high.
#3057

Leadership in Singleton Congestion Games
Alberto Marchesi, Stefano Coniglio, Nicola Gatti

Algorithmic Game Theory: Noncooperative Games

We study Stackelberg games where the underlying structure is a congestion game. We recall that, while leadership in 2-player games has been widely investigated, only few results are known when the number of players is three or more. The intractability of finding a Stackelberg equilibrium (SE) in normal-form and polymatrix games is among them. In this paper, we focus on congestion games in which each player can choose a single resource (a.k.a. singleton congestion games) and a player acts as leader. We show that, without further assumptions, finding an SE when the followers break ties in favor of the leader is not in Poly-APX, unless P = NP. Instead, under the assumption that every player has access to the same resources and that the cost functions are monotonic, we show that an SE can be computed efficiently when the followers break ties either in favor or against the leader.

Wednesday 18 16:40 - 18:20 PS-PS - Planning and Scheduling (K2)

Chair: Sylvie Thiébaux

#305

Variable-Delay Controllability
Nikhil Bhargava, Christian Muise, Brian Williams

Planning and Scheduling

In temporal planning, agents must schedule a set of events satisfying a set of predetermined constraints. These scheduling problems become more difficult when the durations of certain actions are outside the agent's control. Delay controllability is the generalized notion of whether a schedule can be constructed in the face of uncertainty if the agent eventually learns when events occur. Our work introduces the substantially more complex setting of determining variable-delay controllability, where an agent learns about events after some unknown but bounded amount of time has passed. We provide an efficient O(n^3) variable-delay controllability checker and show how to create an execution strategy for variable-delay controllability problems. To our knowledge, these essential capabilities are absent from existing controllability checking algorithms. We conclude by providing empirical evaluations of the quality of variable-delay controllability results as compared to approximations that use fixed delays to model the same problems.
#313

Hierarchical Expertise Level Modeling for User Specific Contrastive Explanations
Sarath Sreedharan, Siddharth Srivastava, Subbarao Kambhampati

Planning and Scheduling

There is a growing interest within the AI research community in developing autonomous systems capable of explaining their behavior to users. However, the problem of computing explanations for users of different levels of expertise has received little research attention. We propose an approach for addressing this problem by representing the user's understanding of the task as an abstraction of the domain model that the planner uses. We present algorithms for generating minimal explanations in cases where this abstract human model is not known. We reduce the problem of generating an explanation to a search over the space of abstract models and show that while the complete problem is NP-hard, a greedy algorithm can provide good approximations of the optimal solution. We also empirically show that our approach can efficiently compute explanations for a variety of problems.
#3470

Novel Structural Parameters for Acyclic Planning Using Tree Embeddings
Christer Bäckström, Peter Jonsson, Sebastian Ordyniak

Planning and Scheduling

We introduce two novel structural parameters for acyclic planning (planning restricted to instances with acyclic causal graphs): up-depth and down-depth. We show that cost-optimal acyclic planning restricted to instances with bounded domain size and bounded up- or down-depth can be solved in polynomial time. For example, many of the tractable subclasses based on polytrees are covered by our result. We analyze the parameterized complexity of planning with bounded up- and down-depth: in a certain sense, down-depth has better computational properties than up-depth. Finally, we show that computing up- and down-depth are fixed-parameter tractable problems, just as many other structural parameters that are used in computer science. We view our results as a natural step towards understanding the complexity of acyclic planning with bounded treewidth and other parameters.
#3883

Completeness-Preserving Dominance Techniques for Satisficing Planning
Álvaro Torralba

Planning and Scheduling

Dominance pruning methods have recently been introduced for optimal planning. They compare states based on their goal distance to prune those that can be proven to be worse than others. In this paper, we introduce dominance techniques for satisficing planning. We extend the definition of dominance, showing that being closer to the goal is not a prerequisite for dominance in the satisficing setting. We develop a new method to automatically find dominance relations in which a state dominates another if it has achieved more serializable sub-goals. We take advantage of dominance relations in different ways; while in optimal planning their usage focused on dominance pruning and action selection, we also use it to guide enforced hill-climbing search, resulting in a complete algorithm.
#4412

Scheduling under Uncertainty: A Query-based Approach
Luciana Arantes, Evripidis Bampis, Alexander Kononov, Manthos Letsios, Giorgio Lucarelli, Pierre Sens

Planning and Scheduling

We consider a single machine, a set of unit-time jobs, and a set of unit-time errors. We assume that the time-slot at which each error will occur is not known in advance but, for every error, there exists an uncertainty area during which the error will take place. In order to find if the error occurs in a specific time-slot, it is necessary to issue a query to it. In this work, we study two problems: (i) the error-query scheduling problem, whose aim is to reveal enough error-free slots with the minimum number of queries, and (ii) the lexicographic error-query scheduling problem where we seek the earliest error-free slots with the minimum number of queries. We consider both the off-line and the on-line versions of the above problems. In the former, the whole instance and its characteristics are known in advance and we give a polynomial-time algorithm for the error-query scheduling problem. In the latter, the adversary has the power to decide, in an on-line way, the time-slot of appearance for each error. We propose then both lower bounds and algorithms whose competitive ratios asymptotically match these lower bounds.
#4067

Learning to Infer Final Plans in Human Team Planning
Joseph Kim, Matthew E. Woicik, Matthew C. Gombolay, Sung-Hyun Son, Julie A. Shah

Planning and Scheduling

We envision an intelligent agent that analyzes conversations during human team meetings in order to infer the team’s plan, with the purpose of providing decision support to strengthen that plan. We present a novel learning technique to infer teams' final plans directly from a processed form of their planning conversation. Our method employs reinforcement learning to train a model that maps features of the discussed plan and patterns of dialogue exchange among participants to a final, agreed-upon plan. We employ planning domain models to efficiently search the large space of possible plans, and the costs of candidate plans serve as the reinforcement signal. We demonstrate that our technique successfully infers plans within a variety of challenging domains, with higher accuracy than prior art. With our domain-independent feature set, we empirically demonstrate that our model trained on one planning domain can be applied to successfully infer team plans within a novel planning domain.
#3285

Emergency Response Optimization using Online Hybrid Planning
Durga Harish Dayapule, Aswin Raghavan, Prasad Tadepalli, Alan Fern

Planning and Scheduling

This paper poses the planning problem faced by the dispatcher responding to urban emergencies as a Hybrid (Discrete and Continuous) State and Action Markov Decision Process (HSA-MDP). We evaluate the performance of three online planning algorithms based on hindsight optimization for HSA-MDPs on real-world emergency data in the city of Corvallis, USA. The approach takes into account and respects the policy constraints imposed by the emergency department. We show that our algorithms outperform a heuristic policy commonly used by dispatchers by significantly reducing the average response time as well as lowering the fraction of unanswered calls. Our results give new insights into the problem such as withholding of resources for future emergencies in some situations.
#5456

(Journal track) Fact-Alternating Mutex Groups for Classical Planning
Daniel Fišer, Antonín Komenda

Planning and Scheduling

Mutex groups are defined in the context of STRIPS planning as sets of facts out of which, maximally, one can be true in any state reachable from the initial state. This work provides a complexity analysis showing that inference of mutex groups is as hard as planning itself (PSPACE-Complete) and it also shows a tight relationship between mutex groups and graph cliques. Furthermore, we propose a new type of mutex group called a fact-alternating mutex group (fam-group) of which inference is NP-Complete. We introduce an algorithm for the inference of fam-groups based on integer linear programming that is complete with respect to the maximal fam-groups and we demonstrate that fam-groups can be beneficial in the translation of planning tasks into finite domain representation, for the detection of dead-end states and for the pruning of spurious operators. The experimental evaluation of the pruning algorithm shows a substantial increase in the number of solved tasks in domains from the optimal deterministic track of the last two planning competitions (IPC 2011 and 2014).

Wednesday 18 16:40 - 18:20 Panel (T2)

The AI Strategy for Europe

Panel

Wednesday 18 16:40 - 18:20 NLP-IE - Information Extraction (T1)

Chair: Guilin Qi

#3682

Empirical Analysis of Foundational Distinctions in Linked Open Data
Luigi Asprino, Valerio Basile, Paolo Ciancarini, Valentina Presutti

Information Extraction

The Web and its Semantic extension (i.e. Linked Open Data) contain open global-scale knowledge and make it available to potentially intelligent machines that want to benefit from it. Nevertheless, most of Linked Open Data lack ontological distinctions and have sparse axiomatisation. For example, distinctions such as whether an entity is inherently a class or an individual, or whether it is a physical object or not, are hardly expressed in the data, although they have been largely studied and formalised by foundational ontologies (e.g. DOLCE, SUMO). These distinctions belong to common sense too, which is relevant for many artificial intelligence tasks such as natural language understanding, scene recognition, and the like. There is a gap between foundational ontologies, that often formalise or are inspired by pre-existing philosophical theories and are developed with a top-down approach, and Linked Open Data that mostly derive from existing databases or crowd-based effort (e.g. DBpedia, Wikidata). We investigate whether machines can learn foundational distinctions over Linked Open Data entities, and if they match common sense. We want to answer questions such as “does the DBpedia entity for dog refer to a class or to an instance?”. We report on a set of experiments based on machine learning and crowdsourcing that show promising results.
#2495

Improving Low Resource Named Entity Recognition using Cross-lingual Knowledge Transfer
Xiaocheng Feng, Xiachong Feng, Bing Qin, Zhangyin Feng, Ting Liu

Information Extraction

Neural networks have been widely used for high resource language (e.g. English) named entity recognition (NER) and have shown state-of-the-art results. However, for low resource languages, such as Dutch and Spanish, due to the limitation of resources and lack of annotated data, taggers tend to have lower performance. To narrow this gap, we propose three novel strategies to enrich the semantic representations of low resource languages: we first develop neural networks to improve low resource word representations by knowledge transfer from a high resource language using bilingual lexicons. Further, a lexicon extension strategy is designed to address the out-of-lexicon problem by automatically learning semantic projections. Thirdly, we regard word-level entity type distribution features as external language-independent knowledge and incorporate them into our neural architecture. Experiments on two low resource languages (Dutch and Spanish) demonstrate the effectiveness of these additional semantic representations (average 4.8% improvement). Moreover, on the Chinese OntoNotes 4.0 dataset, our approach achieved an F-score of 83.07% with 2.91% absolute gain compared to the state-of-the-art results.
#4195

Exploring Encoder-Decoder Model for Distant Supervised Relation Extraction
Sen Su, Ningning Jia, Xiang Cheng, Shuguang Zhu, Ruiping Li

Information Extraction

In this paper, we present an encoder-decoder model for distant supervised relation extraction. Given an entity pair and its sentence bag as input, in the encoder component, we employ the convolutional neural network to extract the features of the sentences in the sentence bag and merge them into a bag representation. In the decoder component, we utilize the long short-term memory network to model relation dependencies and predict the target relations in a sequential manner. In particular, to enable the sequential prediction of relations, we introduce a measure to quantify the amounts of information the relations take in their sentence bag, and use such information to determine the order of the relations of a sentence bag during model training. Moreover, we incorporate the attention mechanism into our model to dynamically adjust the bag representation to reduce the impact of sentences whose corresponding relations have been predicted. Extensive experiments on a popular dataset show that our model achieves significant improvement over state-of-the-art methods.
#666

Joint Extraction of Entities and Relations Based on a Novel Graph Scheme
Shaolei Wang, Yue Zhang, Wanxiang Che, Ting Liu

Information Extraction

Both entity and relation extraction can benefit from being performed jointly, allowing each task to correct the errors of the other. Most existing neural joint methods extract entities and relations separately and achieve joint learning through parameter sharing, leading to a drawback that information between output entities and relations cannot be fully exploited. In this paper, we convert the joint task into a directed graph by designing a novel graph scheme and propose a transition-based approach to generate the directed graph incrementally, which can achieve joint learning through joint decoding. Our method can model underlying dependencies not only between entities and relations, but also between relations. Experiments on New York Times (NYT) corpora show that our approach outperforms the state-of-the-art methods.
#2099

Ensemble Neural Relation Extraction with Adaptive Boosting
Dongdong Yang, Senzhang Wang, Zhoujun Li

Information Extraction

Relation extraction has been widely studied to extract new relational facts from open corpus. Previous relation extraction methods are faced with the problem of wrong labels and noisy data, which substantially decrease the performance of the model. In this paper, we propose an ensemble neural network model - Adaptive Boosting LSTMs with Attention, to more effectively perform relation extraction. Specifically, our model first employs the recursive neural network LSTMs to embed each sentence. Then we import attention into LSTMs by considering that the words in a sentence do not contribute equally to the semantic meaning of the sentence. Next, via adaptive boosting, we strategically build several such neural classifiers. By ensembling multiple such LSTM classifiers with adaptive boosting, we could build a more effective and robust joint ensemble neural network based relation extractor. Experimental results on a real dataset demonstrate the superior performance of the proposed model, improving F1-score by about 8% compared to the state-of-the-art models.
#1320

Event Factuality Identification via Generative Adversarial Networks with Auxiliary Classification
Zhong Qian, Peifeng Li, Yue Zhang, Guodong Zhou, Qiaoming Zhu

Information Extraction

Event factuality identification is an important semantic task in NLP. Traditional research heavily relies on annotated texts. This paper proposes a two-step framework, first extracting essential factors related with event factuality from raw texts as the input, and then identifying the factuality of events via a Generative Adversarial Network with Auxiliary Classification (AC-GAN). The use of AC-GAN allows the model to learn more syntactic information and address the imbalance among factuality values. Experimental results on FactBank show that our method significantly outperforms several state-of-the-art baselines, particularly on events with embedded sources, speculative and negative factuality values.
#2737

Domain Adaptation via Tree Kernel Based Maximum Mean Discrepancy for User Consumption Intention Identification
Xiao Ding, Bibo Cai, Ting Liu, Qiankun Shi

Information Extraction

Identifying user consumption intention from social media is of great interest to downstream applications. Since such task is domain-dependent, deep neural networks have been applied to learn transferable features for adapting models from a source domain to a target domain. A basic idea to solve this problem is reducing the distribution difference between the source domain and the target domain such that the transfer error can be bounded. However, the feature transferability drops dramatically in higher layers of deep neural networks with increasing domain discrepancy. Hence, previous work has to use a few target domain annotated data to train domain-specific layers. In this paper, we propose a deep transfer learning framework for consumption intention identification, to reduce the data bias and enhance the transferability in domain-specific layers. In our framework, the representation of the domain-specific layer is mapped to a reproducing kernel Hilbert space, where the mean embeddings of different domain distributions can be explicitly matched. By using an optimal tree kernel method for measuring the mean embedding matching, the domain discrepancy can be effectively reduced. The framework can learn transferable features in a completely unsupervised manner with statistical guarantees. Experimental results on five different domain datasets show that our approach dramatically outperforms state-of-the-art baselines, and it is general enough to be applied to more scenarios. The source code and datasets can be found at http://ir.hit.edu.cn/~xding/index_english.htm.
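The idea of matching mean embeddings in a reproducing kernel Hilbert space can be illustrated with a small sketch of the squared Maximum Mean Discrepancy (MMD). The code below is an assumption-laden illustration, not the authors' method: it uses a Gaussian kernel on numeric vectors in place of the tree kernel named in the title, it computes the simple biased estimator, and the names `gaussian_kernel` and `mmd_squared` are hypothetical.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def gaussian_kernel(x: Vector, y: Vector, bandwidth: float = 1.0) -> float:
    """Gaussian (RBF) kernel; stands in for the paper's tree kernel (assumption)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * bandwidth ** 2))


def mmd_squared(
    source: Sequence[Vector],
    target: Sequence[Vector],
    kernel: Callable[[Vector, Vector], float] = gaussian_kernel,
) -> float:
    """Biased estimate of squared MMD between two samples.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)],
    with each expectation replaced by an average over all sample pairs.
    """
    def mean_kernel(xs: Sequence[Vector], ys: Sequence[Vector]) -> float:
        return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )
```

A value near zero indicates that the two samples have similar mean embeddings under the chosen kernel; a domain adaptation objective would typically add such a term to the training loss so that source and target representations are pulled together.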

Wednesday 18 16:40 - 18:20 ML-CLA - Classification (K11)

Chair: Dragos Margineantu

#1390

Positive and Unlabeled Learning via Loss Decomposition and Centroid Estimation
Hong Shi, Shaojun Pan, Jian Yang, Chen Gong

Classification

Positive and Unlabeled learning (PU learning) aims to train a binary classifier based on only positive and unlabeled examples, where the unlabeled examples could be either positive or negative. The state-of-the-art algorithms usually cast PU learning as a cost-sensitive learning problem and impose distinct weights to different training examples via a manual or automatic way. However, such weight adjustment or estimation can be inaccurate and thus often lead to unsatisfactory performance. Therefore, this paper regards all unlabeled examples as negative, which means that some of the original positive data are mistakenly labeled as negative. By doing so, we convert PU learning into the risk minimization problem in the presence of false negative label noise, and propose a novel PU learning algorithm termed "Loss Decomposition and Centroid Estimation" (LDCE). By decomposing the hinge loss function into two parts, we show that only the second part is influenced by label noise, of which the adverse effect can be reduced by estimating the centroid of negative examples. We intensively validate our approach on a synthetic dataset, UCI benchmark datasets and real-world datasets, and the experimental results firmly demonstrate the effectiveness of our approach when compared with other state-of-the-art PU learning methodologies.
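One generic way to see how a loss can split into a label-independent part and a label-dependent part is the even/odd identity sketched below, which holds for any margin loss and any label y in {+1, -1}. This is offered only as an illustration of the idea; it is an assumption that it resembles the decomposition used in the paper, and the function names are hypothetical.

```python
from typing import Tuple


def hinge(margin: float) -> float:
    """Standard hinge loss on a margin value."""
    return max(0.0, 1.0 - margin)


def decomposed_hinge(label: int, score: float) -> Tuple[float, float]:
    """Split hinge(label * score) into two parts for label in {+1, -1}.

    First part:  0.5 * (hinge(score) + hinge(-score)), independent of the label.
    Second part: 0.5 * label * (hinge(score) - hinge(-score)), the only part
    that changes if the label is flipped by noise.
    """
    if label not in (1, -1):
        raise ValueError("label must be +1 or -1")
    label_free = 0.5 * (hinge(score) + hinge(-score))
    label_dependent = 0.5 * label * (hinge(score) - hinge(-score))
    return label_free, label_dependent
```

Summing the two parts recovers the ordinary hinge loss, so any correction for false negative labels only needs to act on the second term.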
#1551

R-SVM+: Robust Learning with Privileged Information
Xue Li, Bo Du, Chang Xu, Yipeng Zhang, Lefei Zhang, Dacheng Tao

Classification

In practice, the circumstance that training and test data are clean is not always satisfied. The performance of existing methods in the learning using privileged information (LUPI) paradigm may be seriously challenged, due to the lack of clear strategies to address potential noises in the data. This paper proposes a novel Robust SVM+ (RSVM+) algorithm based on a rigorous theoretical analysis. Under the SVM+ framework in the LUPI paradigm, we study the lower bound of perturbations of both example feature data and privileged feature data, which will mislead the model to make wrong decisions. By maximizing the lower bound, tolerance of the learned model over perturbations will be increased. Accordingly, a novel regularization function is introduced to upgrade a variant form of SVM+. The objective function of RSVM+ is transformed into a quadratic programming problem, which can be efficiently optimized using off-the-shelf solvers. Experiments on real-world datasets demonstrate the necessity of studying robust SVM+ and the effectiveness of the proposed algorithm.
#1911

Reliable Multi-class Classification based on Pairwise Epistemic and Aleatoric Uncertainty
Vu-Linh Nguyen, Sébastien Destercke, Marie-Hélène Masson, Eyke Hüllermeier

Classification

We propose a method for reliable prediction in multi-class classification, where reliability refers to the possibility of partial abstention in cases of uncertainty. More specifically, we allow for predictions in the form of preorder relations on the set of classes, thereby generalizing the idea of set-valued predictions. Our approach relies on combining learning by pairwise comparison with a recent proposal for modeling uncertainty in classification, in which a distinction is made between reducible (a.k.a. epistemic) uncertainty caused by a lack of information and irreducible (a.k.a. aleatoric) uncertainty due to intrinsic randomness. The problem of combining uncertain pairwise predictions into a most plausible preorder is then formalized as an integer programming problem. Experimentally, we show that our method is able to appropriately balance reliability and precision of predictions.
#2208

Iterative Metric Learning for Imbalance Data Classification
Nan Wang, Xibin Zhao, Yu Jiang, Yue Gao

Classification

In many classification applications, such as software defect prediction and medical diagnosis, the amount of data from different categories usually varies significantly. Under such circumstances, it is essential to propose a proper method to solve the imbalance issue among the data. However, most of the existing methods mainly focus on improving the performance of classifiers rather than searching for an appropriate way to find an effective data space for classification. In this paper, we propose a method named Iterative Metric Learning (IML) to explore the correlations among imbalance data and construct an effective data space for classification. Given the imbalance training data, it is important to select a subset of training samples for each testing data. Thus, we aim to find a more stable neighborhood for testing data using the iterative metric learning strategy. To evaluate the effectiveness of the proposed method, we have conducted experiments on two groups of datasets, i.e., the NASA Metrics Data Program (NASA) dataset and UCI Machine Learning Repository (UCI) dataset. Experimental results and comparisons with state-of-the-art methods have exhibited better performance of our proposed method.
#2689

Accelerated Asynchronous Greedy Coordinate Descent Algorithm for SVMs
Bin Gu, Yingying Shan, Xiang Geng, Guansheng Zheng

Classification

Support vector machines have played an important role in machine learning in the last two decades. Traditional SVM solvers (e.g. LIBSVM) are not scalable in the current big data era. Recently, a state-of-the-art solver was proposed based on the asynchronous greedy coordinate descent (AsyGCD) algorithm. However, AsyGCD is still not scalable enough, and is limited to binary classification. To address these issues, in this paper we propose an asynchronous accelerated greedy coordinate descent algorithm (AsyAGCD) for SVMs. Compared with AsyGCD, our AsyAGCD has the following two-fold advantages: 1) our AsyAGCD is an accelerated version of AsyGCD because an active set strategy is used. Specifically, our AsyAGCD can converge much faster than AsyGCD for the second half of iterations. 2) Our AsyAGCD can handle more SVM formulations (including binary classification and regression SVMs) than AsyGCD. We provide the comparison of computational complexity of AsyGCD and our AsyAGCD. Experimental results on a variety of datasets and learning applications confirm that our AsyAGCD is much faster than the existing SVM solvers (including AsyGCD).
#2143

Achieving Non-Discrimination in Prediction
Lu Zhang, Yongkai Wu, Xintao Wu

Classification

In discrimination-aware classification, the pre-process methods for constructing a discrimination-free classifier first remove discrimination from the training data, and then learn the classifier from the cleaned data. However, they lack a theoretical guarantee for the potential discrimination when the classifier is deployed for prediction. In this paper, we fill this gap by mathematically bounding the discrimination in prediction. We adopt the causal model for modeling the data generation mechanism, and formally define discrimination in population, in a dataset, and in prediction. We obtain two important theoretical results: (1) the discrimination in prediction can still exist even if the discrimination in the training data is completely removed; and (2) not all pre-process methods can ensure non-discrimination in prediction even though they can achieve non-discrimination in the modified training data. Based on the results, we develop a two-phase framework for constructing a discrimination-free classifier with a theoretical guarantee. The experiments demonstrate the theoretical results and show the effectiveness of our two-phase framework.
#920

Distortion-aware CNNs for Spherical Images
Qiang Zhao, Chen Zhu, Feng Dai, Yike Ma, Guoqing Jin, Yongdong Zhang

Classification

Convolutional neural networks are widely used in computer vision applications. Although they have achieved great success, these networks cannot be applied to 360° spherical images directly due to varying distortion effects. In this paper, we present a distortion-aware convolutional network for spherical images. For each pixel, our network samples a non-regular grid based on its distortion level, and convolves the sampled grid using square kernels shared by all pixels. The network successively approximates large image patches from different tangent planes of the viewing sphere with small local sampling grids, and thus improves the computational efficiency. Our method also deals with the boundary problem, which is an inherent issue for spherical images. To evaluate our method, we apply our network to spherical image classification problems based on transformed MNIST and CIFAR-10 datasets. Compared with the baseline method, our method achieves much better performance. We also analyze the variants of our network.
#2293

Automatic Opioid User Detection from Twitter: Transductive Ensemble Built on Different Meta-graph Based Similarities over Heterogeneous Information Network
Yujie Fan, Yiming Zhang, Yanfang Ye, Xin Li

Classification

Opioid (e.g., heroin and morphine) addiction has become one of the largest and deadliest epidemics in the United States. To combat such a deadly epidemic, in this paper, we propose a novel framework named HinOPU to automatically detect opioid users from Twitter, which will assist in sharpening our understanding of the behavioral process of opioid addiction and treatment. In HinOPU, to model the users and the posted tweets as well as their rich relationships, we introduce a structured heterogeneous information network (HIN) for representation. Afterwards, we use a meta-graph based approach to characterize the semantic relatedness over users; we then formulate different similarities over users based on different meta-graphs on the HIN. To reduce the cost of acquiring labeled samples for supervised learning, we propose a transductive classification method to build the base classifiers based on different similarities formulated by different meta-graphs. Then, to further improve the detection accuracy, we construct an ensemble to combine different predictions from different base classifiers for opioid user detection. Comprehensive experiments on real sample collections from Twitter are conducted to validate the effectiveness of HinOPU in opioid user detection by comparisons with other alternative methods.

Wednesday 18 16:40 - 18:20 ML-REC - Machine Learning and Recommender Systems (C2)

Chair: Xiaojun Chen

#624

Your Tweets Reveal What You Like: Introducing Cross-media Content Information into Multi-domain Recommendation
Weizhi Ma, Min Zhang, Chenyang Wang, Cheng Luo, Yiqun Liu, Shaoping Ma

Machine Learning and Recommender Systems

Cold start is a challenging problem in recommender systems. Many previous studies attempt to utilize extra information from other platforms to alleviate the problem. Most of the leveraged information is on-topic, directly related to users' preferences in the target domain. Thought to be unrelated, users' off-topic content information (such as user tweets) is usually omitted. However, the off-topic content information also helps to indicate the similarity of users on their tastes, interests, and opinions, which matches the underlying assumption of Collaborative Filtering (CF) algorithms. In this paper, we propose a framework to capture the features from user's off-topic content information in social media and introduce them into Matrix Factorization (MF) based algorithms. The framework is easy to understand and flexible in different embedding approaches and MF based algorithms. To the best of our knowledge, there is no previous study in which user's off-topic content in other platforms is taken into consideration. By capturing the cross-platform content including both on-topic and off-topic information, multiple algorithms with several embedding learning approaches have achieved significant improvements in rating prediction on three datasets. Especially in cold start scenarios, we observe greater enhancement. The results confirm our suggestion that off-topic cross-media information also contributes to the recommendation.
#1020

JUMP: a Joint Predictor for User Click and Dwell Time
Tengfei Zhou, Hui Qian, Zebang Shen, Chao Zhang, Chengwei Wang, Shichen Liu, Wenwu Ou

Machine Learning and Recommender Systems

With the recent proliferation of recommendation systems, there has been a lot of interest in session-based prediction methods, particularly those based on Recurrent Neural Networks (RNNs) and their variants. However, existing methods either ignore the dwell time prediction that plays an important role in measuring a user's engagement with the content, or fail to process very short or noisy sessions. In this paper, we propose a joint predictor, JUMP, for both user click and dwell time in session-based settings. To map its input into a feature vector, JUMP adopts a novel three-layered RNN structure which includes a fast-slow layer for very short sessions and an attention layer for noisy sessions. Experiments demonstrate that JUMP outperforms state-of-the-art methods in both user click and dwell time prediction.
#1643

Dynamic Bayesian Logistic Matrix Factorization for Recommendation with Implicit Feedback
Yong Liu, Lifan Zhao, Guimei Liu, Xinyan Lu, Peng Gao, Xiao-Li Li, Zhihui Jin

Machine Learning and Recommender Systems

Matrix factorization has been widely adopted for recommendation by learning latent embeddings of users and items from observed user-item interaction data. However, previous methods usually assume the learned embeddings are static or homogeneously evolving with the same diffusion rate. This is not valid in most scenarios, where users’ preferences and item attributes heterogeneously drift over time. To remedy this issue, we have proposed a novel dynamic matrix factorization model, named Dynamic Bayesian Logistic Matrix Factorization (DBLMF), which aims to learn heterogeneous user and item embeddings that are drifting with inconsistent diffusion rates. More specifically, DBLMF extends logistic matrix factorization to model the probability a user would like to interact with an item at a given timestamp, and a diffusion process to connect latent embeddings over time. In addition, an efficient Bayesian inference algorithm has also been proposed to make DBLMF scalable on large datasets. The effectiveness of the proposed method has been demonstrated by extensive experiments on real datasets, compared with the state-of-the-art methods.
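As a rough illustration of the logistic matrix factorization that DBLMF is said to extend, the sketch below turns the inner product of a user embedding and an item embedding into an interaction probability with a sigmoid. The function name, the embedding values, and the omission of bias terms are assumptions made for this example; the diffusion process and the Bayesian inference mentioned in the abstract are not modelled here.

```python
import math

def interaction_probability(user_vec, item_vec):
    """Sigmoid of the inner product of two latent embeddings."""
    score = sum(u * v for u, v in zip(user_vec, item_vec))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical embeddings of one user and one item at a single timestamp.
user_t = [0.5, -1.0, 0.25]
item_t = [1.0, 0.5, 2.0]
p = interaction_probability(user_t, item_t)
```

In a dynamic model, each timestamp would carry its own pair of embeddings, and the same scoring function would be applied to whichever pair is current.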
#1802

DELF: A Dual-Embedding based Deep Latent Factor Model for Recommendation
Weiyu Cheng, Yanyan Shen, Yanmin Zhu, Linpeng Huang

Machine Learning and Recommender Systems

Among various recommendation methods, latent factor models are usually considered to be state-of-the-art techniques, which aim to learn user and item embeddings for predicting user-item preferences. When applying latent factor models to recommendation with implicit feedback, the quality of embeddings always suffers from inadequate positive feedback and noisy negative feedback. Inspired by the idea of NSVD that represents users based on their interacted items, this paper proposes a dual-embedding based deep latent factor model named DELF for recommendation with implicit feedback. In addition to learning a single embedding for a user (resp. item), we represent each user (resp. item) with an additional embedding from the perspective of the interacted items (resp. users). We employ an attentive neural method to discriminate the importance of interacted users/items for dual-embedding learning. We further introduce a neural network architecture to incorporate dual embeddings for recommendation. A novel attempt of DELF is to model each user-item interaction with four deep representations that are subtly fused for preference prediction. We conducted extensive experiments on real-world datasets. The results verify the effectiveness of user/item dual embeddings and the superior performance of DELF on item recommendation.
#2617

Discrete Factorization Machines for Fast Feature-based Recommendation
Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang

Machine Learning and Recommender Systems

User and item features of side information are crucial for accurate recommendation. However, the large number of feature dimensions, e.g., usually larger than 10^7, results in expensive storage and computational cost. This prohibits fast recommendation especially on mobile applications where the computational resource is very limited. In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation. DFM binarizes the real-valued model parameters (e.g., float32) of every feature embedding into binary codes (e.g., boolean), and thus supports efficient storage and fast user-item score computation. To avoid the severe quantization loss of the binarization, we propose a convergent updating rule that resolves the challenging discrete optimization of DFM. Through extensive experiments on two real-world datasets, we show that 1) DFM consistently outperforms state-of-the-art binarized recommendation models, and 2) DFM shows very competitive performance compared to its real-valued version (FM), demonstrating the minimized quantization loss.
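To make the storage and speed argument concrete, the sketch below shows why +1/-1 codes allow fast scoring: the inner product of two such codes of length d equals d minus twice their Hamming distance, which can be computed with an XOR and a bit count. Sign thresholding is used here only to obtain example codes; it is an assumption of this sketch and not DFM's method, which learns the codes through discrete optimization. All function names are hypothetical.

```python
def binarize(vec):
    """Map each real value to +1 or -1 by its sign (zero maps to +1)."""
    return [1 if x >= 0 else -1 for x in vec]

def pack(code):
    """Pack a +1/-1 code into an integer bit mask (+1 sets the bit)."""
    bits = 0
    for i, c in enumerate(code):
        if c == 1:
            bits |= 1 << i
    return bits

def binary_dot(bits_a, bits_b, dim):
    """Inner product of two packed +1/-1 codes: dim - 2 * Hamming distance."""
    hamming = bin(bits_a ^ bits_b).count("1")
    return dim - 2 * hamming

# Hypothetical real-valued embeddings, binarized for illustration only.
user = [0.3, -1.2, 0.7, -0.1]
item = [0.9, 0.4, -0.5, -2.0]
bu, bi = binarize(user), binarize(item)
score = binary_dot(pack(bu), pack(bi), len(user))
```

Each code occupies one bit per dimension instead of 32 bits for a float32 value, which is the source of the storage saving the abstract refers to.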
#3981

A Deep Framework for Cross-Domain and Cross-System Recommendations
Feng Zhu, Yan Wang, Chaochao Chen, Guanfeng Liu, Mehmet Orgun, Jia Wu

Machine Learning and Recommender Systems

Cross-Domain Recommendation (CDR) and Cross-System Recommendation (CSR) are two of the promising solutions to address the long-standing data sparsity problem in recommender systems. They leverage the relatively richer information, e.g., ratings, from the source domain or system to improve the recommendation accuracy in the target domain or system. Therefore, finding an accurate mapping of the latent factors across domains or systems is crucial to enhancing recommendation accuracy. However, this is a very challenging task because of the complex relationships between the latent factors of the source and target domains or systems. To this end, in this paper, we propose a Deep framework for both Cross-Domain and Cross-System Recommendations, called DCDCSR, based on Matrix Factorization (MF) models and a fully connected Deep Neural Network (DNN). Specifically, DCDCSR first employs the MF models to generate user and item latent factors and then employs the DNN to map the latent factors across domains or systems. More importantly, we take into account the rating sparsity degrees of individual users and items in different domains or systems and use them to guide the DNN training process for utilizing the rating data more effectively. Extensive experiments conducted on three real-world datasets demonstrate that the DCDCSR framework outperforms the state-of-the-art CDR and CSR approaches in terms of recommendation accuracy.
#4111

Recommendation with Multi-Source Heterogeneous Information
Li Gao, Hong Yang, Jia Wu, Chuan Zhou, Weixue Lu, Yue Hu

Machine Learning and Recommender Systems

Network embedding has recently been used in social network recommendations by embedding low-dimensional representations of network items for recommendation. However, existing item recommendation models in social networks suffer from two limitations. First, these models only partially use item information and mostly ignore important contextual information in social networks such as textual content and social tag information. Second, network embedding and item recommendations are learned in two independent steps without any interaction. To this end, in this paper we consider item recommendations based on heterogeneous information sources. Specifically, we combine item structure, textual content and tag information for recommendation. To model the multi-source heterogeneous information, we use two coupled neural networks to capture the deep network representations of items, based on which a new recommendation model, Collaborative multi-source Deep Network Embedding (CDNE for short), is proposed to learn different latent representations. Experimental results on two real-world data sets demonstrate that CDNE can use network representation learning to boost the recommendation performance.
#1362

PLASTIC: Prioritize Long and Short-term Information in Top-n Recommendation using Adversarial Training
Wei Zhao, Benyou Wang, Jianbo Ye, Yongqiang Gao, Min Yang, Xiaojun Chen

Machine Learning and Recommender Systems

Recommender systems provide users with ranked lists of items based on individual's preferences and constraints. Two types of models are commonly used to generate ranking results: long-term models and session-based models. While long-term models represent the interactions between users and items that are supposed to change slowly across time, session-based models encode the information of users' interests and changing dynamics of items' attributes in short terms. In this paper, we propose a PLASTIC model, Prioritizing Long And Short-Term Information in top-n reCommendation using adversarial training. In the adversarial process, we train a generator as an agent of reinforcement learning which recommends the next item to a user sequentially. We also train a discriminator which attempts to distinguish the generated list of items from the real list recorded. Extensive experiments show that our model exhibits significantly better performances on two widely used real-world datasets.

Wednesday 18 16:40 - 18:20 MUL-WEB2 - AI and the Web, Networks 2 (C3)

Chair: Vincent W. Zheng

#558

3-in-1 Correlated Embedding via Adaptive Exploration of the Structure and Semantic Subspaces
Liang Yang, Yuanfang Guo, Di Jin, Huazhu Fu, Xiaochun Cao

AI and the Web, Networks 2

Combinational network embedding, which learns the node representation by exploring both topological and non-topological information, has become popular due to the fact that the two types of information complement each other. Most of the existing methods either consider the topological and non-topological information to be aligned or possess predetermined preferences during the embedding process. Unfortunately, previous methods fail to either explicitly describe the correlations between topological and non-topological information or adaptively weight their impacts. To address the existing issues, three new assumptions are proposed to better describe the embedding space and its properties. With the proposed assumptions, nodes, communities and topics are mapped into one embedding space. A novel generative model is proposed to formulate the generation process of the network and content from the embeddings, with respect to the Bayesian framework. The proposed model automatically leans to the information which is more discriminative. The embedding result can be obtained by maximizing the posterior distribution by adopting variational inference and the reparameterization trick. Experimental results indicate that the proposed method gives superior performance compared to the state-of-the-art methods when a variety of real-world networks is analyzed.
#3962

Deep Attributed Network Embedding
Hongchang Gao, Heng Huang

AI and the Web, Networks 2

Network embedding has attracted a surge of attention in recent years. It is to learn the low-dimensional representation for nodes in a network, which benefits downstream tasks such as node classification and link prediction. Most of the existing approaches learn node representations only based on the topological structure, yet nodes are often associated with rich attributes in many real-world applications. Thus, it is important and necessary to learn node representations based on both the topological structure and node attributes. In this paper, we propose a novel deep attributed network embedding approach, which can capture the high non-linearity and preserve various proximities in both topological structure and node attributes. At the same time, a novel strategy is proposed to guarantee the learned node representation can encode the consistent and complementary information from the topological structure and node attributes. Extensive experiments on benchmark datasets have verified the effectiveness of our proposed approach.
#3445

Hashtag2Vec: Learning Hashtag Representation with Relational Hierarchical Embedding Model
Jie Liu, Zhicheng He, Yalou Huang

AI and the Web, Networks 2

Hashtags have always been important elements in many social network platforms and micro-blog services. Semantic understanding of hashtags is a critical and fundamental task for many applications on social networks, such as event analysis, theme discovery, information retrieval, etc. However, this task is challenging due to the sparsity, polysemy, and synonymy of hashtags. In this paper, we investigate the problem of hashtag embedding by combining the short text content with the various heterogeneous relations in social networks. Specifically, we first establish a network with hashtags as its nodes. Hierarchically, each of the hashtag nodes is associated with a set of tweets and each tweet contains a set of words. Then we devise an embedding model, called Hashtag2Vec, which exploits multiple relations of hashtag-hashtag, hashtag-tweet, tweet-word, and word-word relations based on the hierarchical heterogeneous network. In addition to embedding the hashtags, our proposed framework is capable of embedding the short social texts as well. Extensive experiments are conducted on two real-world datasets, and the results demonstrate the effectiveness of the proposed method.
#1675

Integrative Network Embedding via Deep Joint Reconstruction
Di Jin, Meng Ge, Liang Yang, Dongxiao He, Longbiao Wang, Weixiong Zhang

AI and the Web, Networks 2

Network embedding is to learn a low-dimensional representation for a network in order to capture intrinsic features of the network. It has been applied to many applications, e.g., network community detection and user recommendation. One of the recent research topics for network embedding has been focusing on exploitation of diverse information, including network topology and semantic information on nodes of networks. However, such diverse information has not been fully utilized nor adequately integrated in the existing methods, so that the resulting network embedding is far from satisfactory. In this paper, we develop a weight-free multi-component network embedding approach by network reconstruction via a deep Autoencoder. Three key components make our new approach effective, i.e., a uniformed graph representation of network topology and semantic information, enhancement to the graph representation using local network structure (i.e., pairwise relationship on nodes) by sampling with latent space regularization, and integration of the diverse information in graph forms in a deep Autoencoder. Extensive experimental results on seven real-world networks demonstrate a superior performance of our method over nine state-of-the-art methods for embedding.
#1687

Finding Communities with Hierarchical Semantics by Distinguishing General and Specialized Topics
Ge Zhang, Di Jin, Jian Gao, Pengfei Jiao, Françoise Fogelman-Soulié, Xin Huang

AI and the Web, Networks 2

Using network topology and semantic contents to find topic-related communities is a new trend in the field of community detection. By analyzing texts in social networks, we find that topics in networked contents are often hierarchical. In most cases, they have a two-level semantic structure with general and specialized topics, to respectively denote common and specific interests of communities. However, the existing community detection methods ignore such a hierarchy and take all words used to describe node semantics from an identical perspective. This indiscriminate use of words leads to natural defects in depicting networked content in which the deep semantics is not fully utilized. To address this problem, we propose a novel probabilistic generative model. By distinguishing the general and specialized topics of words, our model not only can find community structures more accurately, but also provide two-level semantic interpretation for each community. We train the model by deriving an efficient inference method under the framework of variational expectation-maximization. We provide a case study to show the ability of our algorithm in deep semantic interpretability of communities. The superiority of our algorithm for community detection is further demonstrated in comparison with eight state-of-the-art algorithms on eight real-world networks.
#3498

Active Discriminative Network Representation Learning
Li Gao, Hong Yang, Chuan Zhou, Jia Wu, Shirui Pan, Yue Hu

AI and the Web, Networks 2

Most current network representation models are learned in an unsupervised fashion, and so usually lack the capability of discrimination when applied to network analysis tasks, such as node classification. It is worth noting that label information is valuable for learning discriminative network representations. However, labels of all training nodes are always difficult or expensive to obtain, and manually labeling all nodes for training is impractical. Different sets of labeled nodes for model learning lead to different network representation results. In this paper, we propose a novel method, termed ANRMAB, to learn active discriminative network representations with a multi-armed bandit mechanism in an active learning setting. Specifically, based on the networking data and the learned network representations, we design three active learning query strategies. By deriving an effective reward scheme that is closely related to the estimated performance measure of interest, ANRMAB uses a multi-armed bandit mechanism for adaptive decision making to select the most informative nodes for labeling. The updated labeled nodes are then used for further discriminative network representation learning. Experiments are conducted on three public data sets to verify the effectiveness of ANRMAB.
#4245

Neural User Response Generator: Fake News Detection with Collective User Intelligence
Feng Qian, Chengyue Gong, Karishma Sharma, Yan Liu

AI and the Web, Networks 2

Fake news on social media is a major challenge and studies have shown that fake news can propagate exponentially quickly in early stages. Therefore, we focus on early detection of fake news, and consider that only news article text is available at the time of detection, since additional information such as user responses and propagation patterns can be obtained only after the news spreads. However, we find historical user responses to previous articles are available and can be treated as soft semantic labels, that enrich the binary label of an article, by providing insights into why the article must be labeled as fake. We propose a novel Two-Level Convolutional Neural Network with User Response Generator (TCNN-URG) where TCNN captures semantic information from article text by representing it at the sentence and word level, and URG learns a generative model of user response to article text from historical user responses which it can use to generate responses to new articles in order to assist fake news detection. We conduct experiments on one available dataset and a larger dataset collected by ourselves. Experimental results show that TCNN-URG outperforms the baselines based on prior approaches that detect fake news from article text alone.
#708

Weakly Learning to Match Experts in Online Community
Yujie Qian, Jie Tang, Kan Wu

AI and the Web, Networks 2

In online question-and-answer (QA) websites like Quora, one central issue is to find (invite) users who are able to provide answers to a given question and at the same time would be unlikely to say "no" to the invitation. The challenge is how to trade off the matching degree between users’ expertise and the question topic, and the likelihood of positive response from the invited users. In this paper, we formally formulate the problem and develop a weakly supervised factor graph (WeakFG) model to address the problem. The model explicitly captures expertise matching degree between questions and users. To model the likelihood that an invited user is willing to answer a specific question, we incorporate a set of correlations based on social identity theory into the WeakFG model. We use two different genres of datasets: QA-Expert and Paper-Reviewer, to validate the proposed model. Our experimental results show that the proposed model can significantly outperform (+1.5-10.7% by MAP) the state-of-the-art algorithms for matching users (experts) with community questions. We have also developed an online system to further demonstrate the advantages of the proposed method.

Wednesday 18 16:40 - 18:20 CV-3D - 2d and 3d Computer Vision (T5)

Chair: Yue Gao

#502

DEL: Deep Embedding Learning for Efficient Image Segmentation
Yun Liu, Peng-Tao Jiang, Vahan Petrosyan, Shi-Jie Li, Jiawang Bian, Le Zhang, Ming-Ming Cheng

2d and 3d Computer Vision

Image segmentation has been explored for many years and still remains a crucial vision problem. Some efficient or accurate segmentation algorithms have been widely used in many vision applications. However, it is difficult to design an image segmenter that is both efficient and accurate. In this paper, we propose a novel method called DEL (deep embedding learning) which can efficiently transform superpixels into image segmentation. Starting with the SLIC superpixels, we train a fully convolutional network to learn the feature embedding space for each superpixel. The learned feature embedding corresponds to a similarity measure that measures the similarity between two adjacent superpixels. With the deep similarities, we can directly merge the superpixels into large segments. The evaluation results on BSDS500 and PASCAL Context demonstrate that our approach achieves a good trade-off between efficiency and effectiveness. Specifically, our DEL algorithm achieves segments comparable to those of MCG while being much faster, i.e., 11.4fps vs. 0.07fps.
#1921

Siamese CNN-BiLSTM Architecture for 3D Shape Representation Learning
Guoxian Dai, Jin Xie, Yi Fang

2d and 3d Computer Vision

Learning a 3D shape representation from a collection of its rendered 2D images has been extensively studied. However, existing view-based techniques have not yet fully exploited the information among all the views of projections. In this paper, by employing recurrent neural network to efficiently capture features across different views, we propose a siamese CNN-BiLSTM network for 3D shape representation learning. The proposed method minimizes a discriminative loss function to learn a deep nonlinear transformation, mapping 3D shapes from the original space into a nonlinear feature space. In the transformed space, the distance of 3D shapes with the same label is minimized, otherwise the distance is maximized to a large margin. Specifically, the 3D shapes are first projected into a group of 2D images from different views. Then convolutional neural network (CNN) is adopted to extract features from different view images, followed by a bidirectional long short-term memory (LSTM) to aggregate information across different views. Finally, we construct the whole CNN-BiLSTM network into a siamese structure with contrastive loss function. Our proposed method is evaluated on two benchmarks, ModelNet40 and SHREC 2014, demonstrating superiority over the state-of-the-art methods.
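The siamese training objective described above can be illustrated with the commonly used form of the contrastive loss, which pulls same-label pairs together and pushes different-label pairs apart until they are at least a margin away. This is a minimal sketch under assumptions: the abstract does not give the exact loss, so the 0.5 scaling, the squared hinge, the margin value, and the function name are choices made for this example, and the feature vectors stand in for CNN-BiLSTM outputs.

```python
import math

def contrastive_loss(feat_a, feat_b, same_label, margin=1.0):
    """Contrastive loss for one pair of feature vectors."""
    # Euclidean distance between the two feature vectors.
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(feat_a, feat_b)))
    if same_label:
        # Same label: penalize any distance.
        return 0.5 * d ** 2
    # Different labels: penalize only pairs closer than the margin.
    return 0.5 * max(0.0, margin - d) ** 2

# Hypothetical 2-D features of two shapes.
loss_similar = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_label=True)
loss_far = contrastive_loss([0.0, 0.0], [3.0, 4.0], same_label=False)
loss_close = contrastive_loss([0.0, 0.0], [0.0, 0.5], same_label=False)
```

Pairs with different labels that are already farther apart than the margin contribute nothing, so training effort concentrates on pairs that are still confusable.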
#4206

Nonrigid Points Alignment with Soft-weighted Selection
Xuelong Li, Jian Yang, Qi Wang

2d and 3d Computer Vision

Point set registration (PSR) is a crucial problem in computer vision and pattern recognition. Existing PSR methods cannot align point sets robustly due to degradations, such as deformation, noise, occlusion, outlier, and multi-view changes. In this paper, we present a self-selected regularized Gaussian fields criterion for nonrigid point matching. Unlike most existing methods, we formulate the registration problem as a sparse approximation task with low rank constraint in reproducing kernel Hilbert space (RKHS). A self-selected mechanism is used to dynamically assign real-valued label for each point in an accuracy-aware weighting manner, which makes the model focus more on the reliable points in position. Based on the label, an equivalent matching number optimization is embedded into the non-rigid criterion to enhance the reliability of the approximation. Experimental results show that the proposed method can achieve a better result in both registration accuracy and correct matches compared to state-of-the-art approaches.
#723

View-Volume Network for Semantic Scene Completion from a Single Depth Image
Yuxiao Guo, Xin Tong

2d and 3d Computer Vision

We introduce a View-Volume convolutional neural network (VVNet) for inferring the occupancy and semantic labels of a volumetric 3D scene from a single depth image. Our method extracts the detailed geometric features from the input depth image with a 2D view CNN and then projects the features into a 3D volume according to the input depth map via a projection layer. After that, we learn the 3D context information of the scene with a 3D volume CNN for computing the resulting volumetric occupancy and semantic labels. With combined 2D and 3D representations, the VVNet efficiently reduces the computational cost, enables feature extraction from multi-channel high resolution inputs, and thus significantly improves the result accuracy. We validate our method and demonstrate its efficiency and effectiveness on both the synthetic SUNCG and the real NYU datasets.
#715

Annotation-Free and One-Shot Learning for Instance Segmentation of Homogeneous Object Clusters
Zheng Wu, Ruiheng Chang, Jiaxu Ma, Cewu Lu, Chi Keung Tang

2d and 3d Computer Vision

We propose a novel approach for instance segmentation given an image of a homogeneous object cluster (HOC). Our learning approach is one-shot because only a single video of an object instance is captured, and it requires no human annotation. Our intuition is that images of homogeneous objects can be effectively synthesized based on structure and illumination priors derived from real images. A novel solver is proposed that iteratively maximizes our structured likelihood to generate realistic images of HOC. An illumination transformation scheme is applied to make the real and synthetic images share the same illumination condition. Extensive experiments and comparisons are performed to verify our method. We build a dataset consisting of pixel-level annotated images of HOC. The dataset and code will be released.
#1715

A Normalized Convolutional Neural Network for Guided Sparse Depth Upsampling
Jiashen Hua, Xiaojin Gong

2d and 3d Computer Vision

Guided sparse depth upsampling aims to upsample an irregularly sampled sparse depth map when an aligned high-resolution color image is given as guidance. Although deep convolutional neural networks (CNNs) have become the method of choice for many applications, dealing with irregular and sparse data remains a non-trivial problem. Inspired by the classical normalized convolution operation, this work proposes a normalized convolutional layer (NCL) implemented in CNNs. Sparse data are therefore explicitly considered in CNNs by the separation of both data and filters into a signal part and a certainty part. Based upon NCLs, we design a normalized convolutional neural network (NCNN) to perform guided sparse depth upsampling. Experiments on both indoor and outdoor datasets show that the proposed NCNN models achieve state-of-the-art upsampling performance. Moreover, the models using NCLs generalize well to different sparsity levels.
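The classical normalized convolution operation mentioned in this abstract can be illustrated in one dimension. The following is a minimal sketch, not the NCL layer of the paper: the function name `normalized_convolution`, the 1D setting, and the non-negative odd-length kernel are all assumptions made for illustration. Each output value is a certainty-weighted average of nearby samples, so missing samples (certainty 0) contribute nothing.

```python
def normalized_convolution(signal, certainty, kernel):
    """1D normalized convolution (illustrative sketch).

    signal    -- list of sample values (values at missing positions are ignored)
    certainty -- list of weights in [0, 1]; 0 marks a missing sample
    kernel    -- non-negative filter weights, odd length (assumed)
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        num = 0.0  # filter applied to (certainty * signal)
        den = 0.0  # filter applied to certainty alone
        for j, w in enumerate(kernel):
            t = i + j - half
            if 0 <= t < len(signal):
                num += w * certainty[t] * signal[t]
                den += w * certainty[t]
        # With no certain sample under the filter, fall back to 0.0.
        out.append(num / den if den > 0 else 0.0)
    return out


# The middle sample is missing; it is filled from its two certain neighbours.
filled = normalized_convolution([1.0, 0.0, 3.0], [1, 0, 1], [1, 1, 1])
```

Dividing by the filtered certainty is what distinguishes this from an ordinary convolution, which would treat the missing sample as a genuine zero.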
#1137

Sharing Residual Units Through Collective Tensor Factorization To Improve Deep Neural Networks
Yunpeng Chen, Xiaojie Jin, Bingyi Kang, Jiashi Feng, Shuicheng Yan

2d and 3d Computer Vision

The residual unit and its variations are widely used in building very deep neural networks for alleviating optimization difficulty. In this work, we revisit the standard residual function as well as its several successful variants and propose a unified framework based on tensor Block Term Decomposition (BTD) to explain these apparently different residual functions from the tensor decomposition view. With the BTD framework, we further propose a novel basic network architecture, named the Collective Residual Unit (CRU). CRU further enhances parameter efficiency of deep residual neural networks by sharing core factors derived from collective tensor factorization over the involved residual units. It enables efficient knowledge sharing across multiple residual units, reduces the number of model parameters, lowers the risk of over-fitting, and provides better generalization ability. Extensive experimental results show that our proposed CRU network brings outstanding parameter efficiency -- it achieves comparable classification performance with ResNet-200 while using a model size as small as ResNet-50 on the ImageNet-1k and Places365-Standard benchmark datasets.
#2351

Enhanced-alignment Measure for Binary Foreground Map Evaluation
Deng-Ping Fan, Cheng Gong, Yang Cao, Bo Ren, Ming-Ming Cheng, Ali Borji

2d and 3d Computer Vision

The existing binary foreground map (FM) measures address various types of errors in either pixel-wise or structural ways. These measures consider pixel-level match or image-level information independently, while cognitive vision studies have shown that human vision is highly sensitive to both global information and local details in scenes. In this paper, we take a detailed look at current binary FM evaluation measures and propose a novel and effective E-measure (Enhanced-alignment measure). Our measure combines local pixel values with the image-level mean value in one term, jointly capturing image-level statistics and local pixel matching information. We demonstrate the superiority of our measure over the available measures on 4 popular datasets via 5 meta-measures, including ranking models for applications, demoting generic, random Gaussian noise maps, ground-truth switch, as well as human judgments. We find large improvements in almost all the meta-measures. For instance, in terms of application ranking, we observe improvement ranging from 9.08% to 19.65% compared with other popular measures.

Wednesday 18 17:00 - 18:30 Special session (A3)

Chair: Rafiq Muhammad

Job Matching

Special session

Thursday 19 08:30 - 09:45 EAR7 - Early Career 7 (VICTORIA)

Chair: Bernhard Nebel

#5486

Advances and Challenges in Privacy Preserving Planning
Guy Shani

Early Career 7

Collaborative privacy-preserving planning (CPPP) is a multi-agent planning task in which agents need to achieve a common set of goals without revealing certain private information. CPPP has gained attention in recent years as an important sub-area of multi-agent planning, presenting new challenges to the planning community. In this paper we describe recent advancements, and outline open problems and future directions in this field. We begin by describing different models of privacy, such as weak and strong privacy, agent privacy, and cardinality preserving privacy. We then discuss different solution approaches, focusing on the two prominent methods --- joint creation of a global coordination scheme first, followed by independent planning to extend the global scheme with private actions; and collaborative local planning where agents communicate information concerning their planning process. In both cases a heuristic is needed to guide the search process. We describe several adaptations of well-known classical planning heuristics to CPPP, focusing on the difficulties in computing the heuristic without disclosing private information.
#5448

Interactive Learning and Decision Making: Foundations, Insights & Challenges
Frans A. Oliehoek

Early Career 7

Designing "teams of intelligent agents that successfully coordinate and learn about their complex environments inhabited by other agents (such as humans)" is one of the major goals of AI, and it is the challenge that I aim to address in my research. In this paper I give an overview of some of the foundations, insights and challenges in this field of Interactive Learning and Decision Making.
#5481

Mental Health Computing via Harvesting Social Media Data
Jia Jia

Early Career 7

Mental health has become a general concern of people nowadays. It is of vital importance to detect and manage mental health issues before they turn into severe problems. Traditional psychological interventions are reliable, but expensive and delayed. With the rapid development of social media, people are increasingly sharing their daily lives and interacting with friends online. By harvesting social media data, we comprehensively study the detection of mental wellness, with two typical mental problems, stress and depression, as specific examples. Starting with binary user-level detection, we expand our research towards multiple contexts, by considering the trigger and level of mental health problems, and involving different social media platforms of different cultures. We construct several benchmark real-world datasets for analysis and propose a series of multi-modal detection models, whose effectiveness is verified by extensive experiments. We also conduct an in-depth analysis to reveal the underlying online behaviors regarding these mental health issues.

Thursday 19 08:30 - 09:55 KR-KR - Knowledge Representation and Reasoning (C7)

Chair: Michael Morak

#2100

Enhancing Existential Rules by Closed-World Variables
Giovanni Amendola, Nicola Leone, Marco Manna, Pierfrancesco Veltri

Knowledge Representation and Reasoning

Existential rules generalize Datalog with existential quantification in the head. Natively, Datalog is interpreted under a closed-world semantics, while existential rules typically employ the open-world assumption. The interpretation domain in the latter case is enlarged by infinitely many "anonymous" individuals. Then, in any rule, each variable ranges over all individuals, even if not needed or required. In this paper, we enhance existential rules by closed-world variables to consciously reason on the properties of "known" (non-anonymous) and arbitrary individuals in different ways. Accordingly, we uniformly generalize the basic classes of existential rules that ensure decidability of ontology-based query answering. For them, after observing that decidability is preserved, we prove that a strict increase in expressiveness is gained, and in most cases the computational complexity is not altered.
#3466

Explainable Certain Answers
Giovanni Amendola, Leonid Libkin

Knowledge Representation and Reasoning

When a dataset is not fully specified and can represent many possible worlds, one commonly answers queries by computing certain answers to them. A natural way of defining certainty is to say that an answer is certain if it is consistent with query answers in all possible worlds, and is furthermore the most informative answer with this property. However, the existence and complexity of such answers is not yet well understood even for relational databases. Thus in applications one tends to use different notions, essentially the intersection of query answers in possible worlds. However, justification of such notions has long been questioned. This leads to two problems: are certain answers based on informativeness feasible in applications? and can a clean justification be provided for intersection-based notions? Our goal is to answer both. For the former, we show that such answers may not exist, or be very large, even in simple cases of querying incomplete data. For the latter, we add the concept of explanations to the notion of informativeness: it shows not only that one object is more informative than the other, but also says why this is so. This leads to a modified notion of certainty: explainable certain answers. We present a general framework for reasoning about them, and show that for open and closed world relational databases, they are precisely the common intersection-based notions of certainty.
#841

Inconsistency Measures for Repair Semantics in OBDA
Bruno Yun, Srdjan Vesic, Madalina Croitoru, Pierre Bisquert

Knowledge Representation and Reasoning

In this paper, we place ourselves in the Ontology Based Data Access (OBDA) setting and investigate reasoning with inconsistent existential rules knowledge bases. We use the notion of inconsistency measures on sets of facts to rank and filter repairs. We propose a generic framework to answer queries by using the best repairs and study productivity and properties of such a framework.
#1964

A Social Interaction Activity based Time-Varying User Vectorization Method for Online Social Networks
Tianyi Hao, Longbo Huang

Knowledge Representation and Reasoning

In this paper, we consider the problem of user modeling in online social networks, and propose a social interaction activity based user vectorization framework, called the time-varying user vectorization (Tuv), to infer and make use of important user features. Tuv is designed based on a novel combination of word2vec, negative sampling and a smoothing technique for model training. It jointly handles multi-format user data and computes user representing vectors, by taking into consideration user feature variation, self-similarity and pairwise interactions among users. The framework enables us to extract hidden user properties and to produce user vectors. We conduct extensive experiments based on a real-world dataset, which show that Tuv significantly outperforms several state-of-the-art user vectorization methods.
#5455

(Journal track) Three-Valued Semantics for Hybrid MKNF Knowledge Bases Revisited
Fangfang Liu, Jia-Huai You

Knowledge Representation and Reasoning

Knorr et al. (2011) formulated a three-valued formalism for the logic of Minimal Knowledge and Negation as Failure (MKNF) and proposed a well-founded semantics for hybrid MKNF knowledge bases (KBs). The main results state that if a hybrid MKNF KB has a three-valued MKNF model, its well-founded MKNF model exists, which is unique and can be computed by an alternating fixpoint construction. In this paper, we show that these claims are erroneous. We propose a classification of hybrid MKNF KBs into a hierarchy and show that its innermost subclass is what works for the well-founded semantics of Knorr et al. Furthermore, we provide a uniform characterization of well-founded, two-valued, and all three-valued MKNF models, in terms of stable partitions and the alternating fixpoint construction, which leads to updated complexity results as well as proof-theoretic tools for reasoning under these semantics.
#5468

(Journal track) Enhancing Context Knowledge Repositories with Justifiable Exceptions
Loris Bozzato, Thomas Eiter, Luciano Serafini

Knowledge Representation and Reasoning

The Contextualized Knowledge Repository (CKR) framework was conceived as a logic-based approach for representing context dependent knowledge, which is a well-known area of study in AI. The framework has a two-layer structure with a global context that contains context-independent knowledge and meta-information about the contexts, and a set of local contexts with specific knowledge bases. In many practical cases, it is desirable that inherited global knowledge can be "overridden" at the local level. In order to address this need, we present an extension of CKR with global defeasible axioms: these axioms locally apply to (tuples of) individuals unless an exception for overriding exists; such an exception, however, requires a justification that is provable from the knowledge base. We formalize this intuition and study its semantic and computational properties. Furthermore, we present a translation of extended CKRs to datalog programs under the answer set (i.e., stable) semantics and we present an implementation prototype. Our work adds to the body of results on using deductive database technology in these areas, and provides an expressive formalism for exception handling by overriding.
#5463

(Journal track) On the Logical Properties of the Description Logic DL^N (Extended abstract)
Piero A. Bonatti, Luigi Sauro

Knowledge Representation and Reasoning

DL^N is a recent nonmonotonic description logic, designed for satisfying independently proposed knowledge engineering requirements, and for removing some recurrent drawbacks of traditional nonmonotonic semantics. In this paper we study the logical properties of DL^N and their relationships with the KLM postulates. We use various versions of the KLM postulates to deepen the comparison with related work, and illustrate the different tradeoffs between opposite expressivity requirements adopted by each approach.

Thursday 19 08:30 - 09:55 MAS-VOT - Voting (C8)

Chair: Maria Polukarov

#1153

Approval-Based Multi-Winner Rules and Strategic Voting
Martin Lackner, Piotr Skowron

Voting

We investigate the possibility of strategic voting in approval-based multiwinner rules. In particular, we define three axiomatic properties that guarantee resilience to certain forms of strategic voting: independence of irrelevant alternatives (IIA), monotonicity, and SD-strategyproofness. In this paper, we systematically analyze multiwinner rules based on these axioms and provide a fine-grained picture of their resilience to strategic voting. Both our axiomatic and experimental analysis show that approval-based multiwinner rules are generally very susceptible to strategic voting---with one exception: multiwinner approval voting.
#1890

Multiwinner Voting with Fairness Constraints
L. Elisa Celis, Lingxiao Huang, Nisheeth K. Vishnoi

Voting

Multiwinner voting rules are used to select a small representative subset of candidates or items from a larger set given the preferences of voters. However, if candidates have sensitive attributes such as gender or ethnicity (when selecting a committee), or specified types such as political leaning (when selecting a subset of news items), an algorithm that chooses a subset by optimizing a multiwinner voting rule may be unbalanced in its selection -- it may under- or over-represent a particular gender or political orientation in the examples above. We introduce an algorithmic framework for multiwinner voting problems when there is an additional requirement that the selected subset should be "fair" with respect to a given set of attributes. Our framework provides the flexibility to (1) specify fairness with respect to multiple, non-disjoint attributes (e.g., ethnicity and gender) and (2) specify a score function. We study the computational complexity of this constrained multiwinner voting problem for monotone and submodular score functions and present several approximation algorithms and matching hardness of approximation results for various attribute group structures and types of score functions. We also present simulations that suggest that adding fairness constraints may not affect the scores significantly when compared to the unconstrained case.
#2690

Multiwinner Voting with Restricted Admissible Sets: Complexity and Strategyproofness
Yongjie Yang, Jianxin Wang

Voting

Multiwinner voting aims to select a subset of candidates (the winners) from admissible sets, according to the votes cast by voters. A special class of multiwinner rules, the k-committee selection rules where the number of winners is predefined, has gained considerable attention recently. In this setting, the admissible sets are all subsets of candidates of size exactly k. In this paper, we study admissible sets with combinatorial restrictions. In particular, in our setting, we are given a graph G whose vertex set is the candidate set. Admissible sets are the subsets of candidates whose induced subgraphs belong to some special class 𝒢 of graphs. We consider different graph classes 𝒢 and investigate the complexity of the multiwinner determination problem for prevalent voting rules in this setting. In addition, we investigate the strategyproofness of many rules for different classes of admissible sets.
#3678

Egalitarian Committee Scoring Rules
Haris Aziz, Piotr Faliszewski, Bernard Grofman, Arkadii Slinko, Nimrod Talmon

Voting

We introduce and study the class of egalitarian variants of committee scoring rules, where instead of summing up the scores that voters assign to committees---as is done in the utilitarian variants---the score of a committee is taken to be the lowest score assigned to it by any voter. We focus on five rules, which are egalitarian analogues of SNTV, the k-Borda rule, the Chamberlin--Courant rule, the Bloc rule, and the Pessimist rule. We establish their computational complexity, provide their initial axiomatic study, and perform experiments to represent the action of these rules graphically.
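The contrast between utilitarian and egalitarian committee scoring described in this abstract can be shown on a toy election. The sketch below is an illustration only, assuming complete rankings, Borda scores as the per-voter committee score (the k-Borda setting), and brute-force enumeration of all committees; the helper names `borda_score` and `best_committees` are hypothetical and not taken from the paper.

```python
from itertools import combinations


def borda_score(ballot, committee):
    """Score one voter gives a committee: sum of the Borda scores of its members."""
    m = len(ballot)
    return sum(m - 1 - ballot.index(c) for c in committee)


def best_committees(candidates, ballots, k, aggregate):
    """Return all size-k committees maximizing the aggregated per-voter scores.

    aggregate=sum gives the utilitarian rule, aggregate=min the egalitarian one.
    Enumerates every committee, so it is only suitable for tiny examples.
    """
    scored = {}
    for committee in combinations(sorted(candidates), k):
        scored[committee] = aggregate(borda_score(b, committee) for b in ballots)
    best = max(scored.values())
    return [c for c, s in scored.items() if s == best]


ballots = [["A", "B", "C"], ["A", "B", "C"], ["C", "B", "A"]]
utilitarian = best_committees("ABC", ballots, 1, sum)  # A: total score 4
egalitarian = best_committees("ABC", ballots, 1, min)  # B: worst-off voter gives 1
```

In this example the majority favourite A wins under summation, while the compromise candidate B wins once the committee is judged by its least satisfied voter.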
#3518

An Analytical and Experimental Comparison of Maximal Lottery Schemes
Florian Brandl, Felix Brandt, Christian Stricker

Voting

Randomized voting rules are gaining increasing attention in computational and non-computational social choice. A particularly interesting class of such rules are maximal lottery (ML) schemes, which were proposed by Peter Fishburn in 1984 and have been repeatedly recommended for practical use. However, the subtle differences between different ML schemes are often ignored. Two canonical subsets of ML schemes are C1-ML schemes (which only depend on unweighted majority comparisons) and C2-ML schemes (which only depend on weighted majority comparisons). We prove that C2-ML schemes are the only Pareto efficient---but also among the most manipulable---ML schemes. Furthermore, we evaluate the frequency of manipulable preference profiles and the degree of randomization of ML schemes via extensive computer simulations. In general, ML schemes are rarely manipulable and often do not randomize at all, especially when there are only few alternatives. For up to 21 alternatives, the average support size of ML schemes lies below 4 under reasonable assumptions. The average degree of randomization (in terms of Shannon entropy) of C2-ML schemes is significantly lower than that of C1-ML schemes.
#2646

Pairwise Liquid Democracy
Markus Brill, Nimrod Talmon

Voting

In a liquid democracy, voters can either vote directly or delegate their vote to another voter of their choice. We consider ordinal elections, and study a model of liquid democracy in which voters specify partial orders and use several delegates to refine them. This flexibility, however, comes at a price, as individual rationality (in the form of transitive preferences) can no longer be guaranteed. We discuss ways to detect and overcome such complications. Based on the framework of distance rationalization, we introduce novel variants of voting rules that are tailored to the liquid democracy context.
#3093

Computing the Schulze Method for Large-Scale Preference Data Sets
Theresa Csar, Martin Lackner, Reinhard Pichler

Voting

The Schulze method is a voting rule that is widely used in practice and enjoys many positive axiomatic properties. While it is computable in polynomial time, its straightforward implementation does not scale well for large elections. In this paper, we develop a highly optimised algorithm for computing the Schulze method with Pregel, a framework for massively parallel computation of graph problems, and demonstrate its applicability for large preference data sets. In addition, our theoretical analysis shows that the Schulze method is indeed particularly well-suited for parallel computation, in stark contrast to the related ranked pairs method. More precisely, we show that winner determination subject to the Schulze method is NL-complete, whereas this problem is P-complete for the ranked pairs method.
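For orientation, the straightforward polynomial-time computation that this abstract contrasts with its parallel algorithm can be sketched as follows. This is the textbook sequential approach (pairwise counts followed by a Floyd-Warshall style strongest-path pass), not the Pregel algorithm of the paper; the function name `schulze_winners` and the assumption that every ballot is a complete ranking are illustrative choices.

```python
def schulze_winners(candidates, ballots):
    """Schulze winners for complete rankings (most preferred candidate first)."""
    # d[a][b]: number of voters ranking a above b.
    d = {a: {b: 0 for b in candidates} for a in candidates}
    for ballot in ballots:
        pos = {c: i for i, c in enumerate(ballot)}
        for a in candidates:
            for b in candidates:
                if a != b and pos[a] < pos[b]:
                    d[a][b] += 1

    # p[a][b]: strength of the strongest path from a to b,
    # initialised with the direct pairwise victories.
    p = {a: {b: (d[a][b] if d[a][b] > d[b][a] else 0) for b in candidates}
         for a in candidates}

    # Widest-path variant of Floyd-Warshall, cubic in the number of candidates.
    for k in candidates:
        for i in candidates:
            if i == k:
                continue
            for j in candidates:
                if j == i or j == k:
                    continue
                p[i][j] = max(p[i][j], min(p[i][k], p[k][j]))

    # A candidate wins if no other candidate has a strictly stronger path to it.
    return [a for a in candidates
            if all(p[a][b] >= p[b][a] for b in candidates if b != a)]


# A majority cycle A > B > C > A, resolved by path strengths.
cycle = [list("ABC")] * 3 + [list("BCA")] * 2 + [list("CAB")] * 2
winners = schulze_winners(list("ABC"), cycle)
```

The triple loop is what makes this version hard to scale to elections with very many candidates, which is the motivation the abstract gives for a parallel formulation.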

Thursday 19 08:30 - 09:55 CSAT-SGP - Constraints, Satisfiability and Search (K2)

Chair: Kuldeep Meel

#2931

A Framework for Constraint Based Local Search using Essence
Özgür Akgün, Saad Attieh, Ian P. Gent, Christopher Jefferson, Ian Miguel, Peter Nightingale, András Z. Salamon, Patrick Spracklen, James Wetter

Constraints, Satisfiability and Search

Structured Neighbourhood Search (SNS) is a framework for constraint-based local search for problems expressed in the Essence abstract constraint specification language. The local search explores a structured neighbourhood, where each state in the neighbourhood preserves a high level structural feature of the problem. SNS derives highly structured problem-specific neighbourhoods automatically and directly from the features of the Essence specification of the problem. Hence, neighbourhoods can represent important structural features of the problem, such as partitions of sets, even if that structure is obscured in the low-level input format required by a constraint solver. SNS expresses each neighbourhood as a constrained optimisation problem, which is solved with a constraint solver. We have implemented SNS, together with automatic generation of neighbourhoods for high level structures, and report high quality results for several optimisation problems.
#3494

Stratification for Constraint-Based Multi-Objective Combinatorial Optimization
Miguel Terra-Neves, Inês Lynce, Vasco Manquinho

Constraints, Satisfiability and Search

New constraint-based algorithms have been recently proposed to solve Multi-Objective Combinatorial Optimization (MOCO) problems. These new methods are based on Minimal Correction Subsets (MCSs) or P-minimal models and have been shown to be successful at solving MOCO instances when the constraint set is hard to satisfy. However, if the constraints are easy to satisfy, constraint-based tools usually do not perform as well as stochastic methods. For solving such instances, algorithms should focus on dealing with the objective functions. This paper proposes the integration of stratification techniques in constraint-based algorithms for MOCO. Moreover, it also shows how to diversify the stratification among the several objective criteria in order to better approximate the Pareto front of MOCO problems. An extensive experimental evaluation on publicly available MOCO instances shows that the new algorithm is competitive with stochastic methods and is much more effective than existing constraint-based methods.
#3555

Unary Integer Linear Programming with Structural Restrictions
Eduard Eiben, Robert Ganian, Dušan Knop, Sebastian Ordyniak

Constraints, Satisfiability and Search

Recently a number of algorithmic results have appeared which show the tractability of Integer Linear Programming (ILP) instances under strong restrictions on variable domains and/or coefficients (AAAI 2016, AAAI 2017, IJCAI 2017). In this paper, we target ILPs where neither the variable domains nor the coefficients are restricted by a fixed constant or parameter; instead, we only require that our instances can be encoded in unary. We provide new algorithms and lower bounds for such ILPs by exploiting the structure of their variable interactions, represented as a graph. Our first set of results focuses on solving ILP instances through the use of a graph parameter called clique-width, which can be seen as an extension of treewidth which also captures well-structured dense graphs. In particular, we obtain a polynomial-time algorithm for instances of bounded clique-width whose domain and coefficients are polynomially bounded by the input size, and we complement this positive result by a number of algorithmic lower bounds. Afterwards, we turn our attention to ILPs with acyclic variable interactions. In this setting, we obtain a complexity map for the problem with respect to the graph representation used and restrictions on the encoding.
#3590

A Scalable Scheme for Counting Linear Extensions
Topi Talvitie, Kustaa Kangas, Teppo Niinimäki, Mikko Koivisto

Constraints, Satisfiability and Search

Counting the linear extensions of a given partial order not only has several applications in artificial intelligence but also represents a hard problem that challenges modern paradigms for approximate counting. Recently, Talvitie et al. (AAAI 2018) showed that an exponential time scheme beats the fastest known polynomial time schemes in practice, even if allowing hours of running time. Here, we present a novel scheme, relaxation Tootsie Pop, which in our experiments exhibits polynomial characteristics and significantly outperforms previous schemes. We also instantiate state-of-the-art model counters for CNF formulas; two natural encodings yield schemes that, however, are inferior to the more specialized schemes.
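To make the counting problem in this abstract concrete, the following sketch counts linear extensions exactly by dynamic programming over sets of already-placed elements. It is a small exponential-time baseline for illustration, not the relaxation Tootsie Pop scheme or any approximate scheme from the paper; the function name `count_linear_extensions` and the encoding of the partial order as `(a, b)` pairs over elements `0..n-1` are assumptions.

```python
from functools import lru_cache


def count_linear_extensions(n, relations):
    """Count total orders of 0..n-1 consistent with the given precedence pairs.

    relations -- iterable of pairs (a, b) meaning a must come before b.
    The relation is assumed to be acyclic; a cyclic input yields 0.
    """
    # preds[x]: bitmask of the elements that must precede x.
    preds = [0] * n
    for a, b in relations:
        preds[b] |= 1 << a
    full = (1 << n) - 1

    @lru_cache(maxsize=None)
    def count(placed):
        # placed: bitmask of elements already put into the order.
        if placed == full:
            return 1
        total = 0
        for x in range(n):
            is_free = not (placed & (1 << x))
            ready = (preds[x] & placed) == preds[x]
            if is_free and ready:
                total += count(placed | (1 << x))
        return total

    return count(0)


# Two independent chains 0 < 1 and 2 < 3 can be interleaved in 6 ways.
interleavings = count_linear_extensions(4, [(0, 1), (2, 3)])
```

The memo table can hold up to 2^n states, which is why exact counting of this kind is limited to small partial orders and approximate schemes are of interest.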
#2924

Socially Motivated Partial Cooperation in Multi-agent Local Search
Tal Ze'evi, Roie Zivan, Omer Lev

Constraints, Satisfiability and Search

Partial Cooperation is a paradigm and a corresponding model, proposed to represent multi-agent systems in which agents are willing to cooperate to achieve a global goal, as long as some minimal threshold on their personal utility is satisfied. Distributed local search algorithms were proposed in order to solve asymmetric distributed constraint optimization problems (ADCOPs) in which agents are partially cooperative. We contribute by: 1) extending the partial cooperative model to allow it to represent dynamic cooperation intentions, affected by changes in agents' wealth, in accordance with social studies literature; and 2) proposing a novel local search algorithm in which agents receive indications of others' preferences on their actions and can thus perform actions that are socially beneficial. Our empirical study reveals the advantage of the proposed algorithm in multiple benchmarks. Specifically, on realistic meeting scheduling problems it overcomes limitations of standard local search algorithms.
#112

Solving (Weighted) Partial MaxSAT by Dynamic Local Search for SAT
Zhendong Lei, Shaowei Cai

Constraints, Satisfiability and Search

Partial MaxSAT (PMS) generalizes SAT and MaxSAT by introducing hard clauses and soft clauses. PMS and Weighted PMS (WPMS) have many important real world applications. Local search is one popular method for solving (W)PMS. Recent studies on specialized local search for (W)PMS have led to significant improvements. But such specialized algorithms are complicated with the concepts tailored for hard and soft clauses. In this work, we propose a dynamic local search algorithm, which exploits the structure of (W)PMS by a carefully designed clause weighting scheme. Our solver SATLike adopts a local search framework for SAT and does not need any specialized concept for (W)PMS. Experiments on PMS and WPMS benchmarks from the MaxSAT Evaluations (MSE) 2016 and 2017 show that SATLike significantly outperforms state of the art local search solvers. Also, SATLike significantly narrows the gap between the performance of local search solvers and complete solvers on industrial benchmarks, and performs better than the complete solvers on the MSE2017 benchmarks.
#5464

(Journal track) Linear Satisfiability Preserving Assignments
Kei Kimura, Kazuhisa Makino

Constraints, Satisfiability and Search

In this paper, we study several classes of satisfiability preserving assignments to the constraint satisfaction problem. In particular, we consider fixable, autark and satisfying assignments. Since it is in general NP-hard to find a nontrivial (i.e., nonempty) satisfiability preserving assignment, we introduce linear satisfiability preserving assignments, which are defined by polyhedral cones in an associated vector space. The vector space is obtained by the identification, introduced by Kullmann, of assignments with real vectors. We consider arbitrary polyhedral cones, where only restricted classes of cones for autark assignments are considered in the literature. We reveal that cones in certain classes are maximal as a convex subset of the set of the associated vectors, which can be regarded as extensions of Kullmann's results for autark assignments of CNFs. As algorithmic results, we present a pseudo-polynomial time algorithm that computes a linear fixable assignment for a given integer linear system, which implies the well known pseudo-polynomial solvability for integer linear systems such as two-variable-per-inequality, Horn and q-Horn systems.

Thursday 19 08:30 - 09:55 NLP-MT - Machine Translation (T2)

Chair: Sinno Jialin Pan

#1398

Neural Machine Translation with Key-Value Memory-Augmented Attention
Fandong Meng, Zhaopeng Tu, Yong Cheng, Haiyang Wu, Junjie Zhai, Yuekui Yang, Di Wang

Machine Translation

Although attention-based Neural Machine Translation (NMT) has achieved remarkable progress in recent years, it still suffers from issues of repeating and dropping translations. To alleviate these issues, we propose a novel key-value memory-augmented attention model for NMT, called KVMEMATT. Specifically, we maintain a timely updated key-memory to keep track of attention history and a fixed value-memory to store the representation of the source sentence throughout the whole translation process. Via nontrivial transformations and iterative interactions between the two memories, the decoder focuses on more appropriate source word(s) for predicting the next target word at each decoding step, and can therefore improve the adequacy of translations. Experimental results on Chinese⇒English and WMT17 German⇔English translation tasks demonstrate the superiority of the proposed model.
#2641

Phrase Table as Recommendation Memory for Neural Machine Translation
Yang Zhao, Yining Wang, Jiajun Zhang, Chengqing Zong

Machine Translation

Neural Machine Translation (NMT) has recently drawn much attention due to its promising translation performance. However, several studies indicate that NMT often generates fluent but unfaithful translations. In this paper, we propose a method to alleviate this problem by using a phrase table as recommendation memory. The main idea is to add a bonus to words worthy of recommendation, so that NMT can make correct predictions. Specifically, we first derive a prefix tree to accommodate all the candidate target phrases by searching the phrase translation table according to the source sentence. Then, we construct a recommendation word set by matching between candidate target phrases and previously translated target words by NMT. After that, we determine the specific bonus value for each recommendable word by using the attention vector and phrase translation probability. Finally, we integrate this bonus value into NMT to improve the translation results. Extensive experiments demonstrate that the proposed methods obtain remarkable improvements over a strong attention-based NMT baseline.
#5454

(Journal track) From Feature to Paradigm: Deep Learning in Machine Translation
Marta R. Costa-jussà

Machine Translation

In recent years, deep learning algorithms have revolutionized several areas including speech, image and natural language processing. The specific field of Machine Translation (MT) has not remained unaffected. Integration of deep learning in MT varies from re-modeling existing features into standard statistical systems to the development of a new architecture. Among the different neural networks, research works use feed-forward neural networks, recurrent neural networks and the encoder-decoder schema. These architectures are able to tackle challenges such as low-resource settings or morphological variation. This extended abstract focuses on describing the foundational works on the neural MT approach, mentioning its strengths and weaknesses, and including an analysis of the corresponding challenges and future work. The full manuscript [Costa-jussà, 2018] describes, in addition, how these neural networks have been integrated to enhance different aspects and models from statistical MT, including language modeling, word alignment, translation, reordering, and rescoring; it also describes the new neural MT approach together with recent approaches using subwords, characters and training with multiple languages, among others.
#2700

Point Set Registration for Unsupervised Bilingual Lexicon Induction
Hailong Cao, Tiejun Zhao

Machine Translation

Inspired by the observation that word embeddings exhibit isomorphic structure across languages, we propose a novel method to induce a bilingual lexicon from only two sets of word embeddings, which are trained on monolingual source and target data respectively. This is achieved by formulating the task as point set registration, which is a more general problem. We show that a transformation from the source to the target embedding space can be learned automatically without any form of cross-lingual supervision. By properly adapting a traditional point set registration model to make it suitable for processing word embeddings, we achieve state-of-the-art performance on the unsupervised bilingual lexicon induction task. Since the point set registration problem has been well studied and can be solved by many elegant models, this work opens up a new opportunity to capture the universal lexical semantic structure across languages.
#2440

Enhancing Semantic Representations of Bilingual Word Embeddings with Syntactic Dependencies
Linli Xu, Wenjun Ouyang, Xiaoying Ren, Yang Wang, Liang Jiang

Machine Translation

Cross-lingual representation is a technique that can both represent different languages in the same latent vector space and enable knowledge transfer across languages. To learn such representations, most existing works require parallel sentences with word-level alignments and assume that aligned words have similar Bag-of-Words (BoW) contexts. However, due to differences in grammatical structure among languages, the contexts of aligned words in different languages may appear at different positions of the sentence. To address this issue of syntactic differences across languages, we propose a model of bilingual word embeddings integrating syntactic dependencies (DepBiWE) by producing dependency parse-trees which encode the accurate relative positions for the contexts of aligned words. In addition, a new method is proposed to learn bilingual word embeddings from dependency-based contexts and BoW contexts jointly. Extensive experimental results on a real world dataset clearly validate the superiority of the proposed model DepBiWE on various natural language processing (NLP) tasks.
#705

An Encoder-Decoder Framework Translating Natural Language to Database Queries
Ruichu Cai, Boyan Xu, Zhenjie Zhang, Xiaoyan Yang, Zijian Li, Zhihao Liang

Machine Translation

Machine translation is going through a radical revolution, driven by the explosive development of deep learning techniques using Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN). In this paper, we consider a special case of the machine translation problem: converting natural language into Structured Query Language (SQL) for data retrieval over relational databases. Although generic CNNs and RNNs learn the grammar structure of SQL when trained with sufficient samples, the accuracy and training efficiency of the model can be dramatically improved when the translation model is deeply integrated with the grammar rules of SQL. We present a new encoder-decoder framework, with a suite of new approaches, including new semantic features fed into the encoder, grammar-aware states injected into the memory of the decoder, as well as recursive state management for sub-queries. These techniques help the neural network focus on understanding the semantics of operations in natural language and spare it the effort of learning SQL grammar. The empirical evaluation on real world databases and queries shows that our approach outperforms the state-of-the-art solution by a significant margin.
#5460

(Journal track) Lightweight Random Indexing for Polylingual Text Classification
Alejandro Moreo Fernández, Andrea Esuli, Fabrizio Sebastiani

Machine Translation

Polylingual Text Classification (PLC) is a supervised learning task that consists of assigning class labels to documents written in different languages, assuming that a representative set of training documents is available for each language. This scenario is more and more frequent, given the large quantity of multilingual platforms and communities emerging on the Internet. In this work we analyse some important methods proposed in the literature that are machine-translation-free and dictionary-free, and we propose a particular configuration of the Random Indexing method (that we dub Lightweight Random Indexing). We show that it outperforms all compared algorithms and also displays a significantly reduced computational cost.

Thursday 19 08:30 - 09:55 CV-CV2 - Computer Vision 2 (T1)

Chair: Amy Loutfi

#235

H-Net: Neural Network for Cross-domain Image Patch Matching
Weiquan Liu, Xuelun Shen, Cheng Wang, Zhihong Zhang, Chenglu Wen, Jonathan Li

Computer Vision 2

Describing the same scene with a different imaging style, or rendering an image from its 3D model, gives us images from different domains. Such images tend to have a domain gap and different local appearances, which poses the main challenge for cross-domain image patch matching. In this paper, we propose to incorporate an AutoEncoder into the Siamese network, yielding H-Net, whose structural shape resembles the letter H. The H-Net achieves state-of-the-art performance on cross-domain image patch matching. Furthermore, we improved H-Net to H-Net++. The H-Net++ extracts invariant feature descriptors in cross-domain image patches and achieves state-of-the-art performance by feature retrieval in Euclidean space. As there is no benchmark dataset including cross-domain images, we made a cross-domain image dataset which consists of camera images, images rendered from a UAV 3D model, and images generated by the CycleGAN algorithm. Experiments show that the proposed H-Net and H-Net++ outperform the existing algorithms. Our code and cross-domain image dataset are available at https://github.com/Xylon-Sean/H-Net.
#998

Deep Reasoning with Knowledge Graph for Social Relationship Understanding
Zhouxia Wang, Tianshui Chen, Jimmy Ren, Weihao Yu, Hui Cheng, Liang Lin

Computer Vision 2

Social relationships (e.g., friends, couple etc.) form the basis of the social network in our daily life. Automatically interpreting such relationships bears great potential for intelligent systems to understand human behavior in depth and to better interact with people at a social level. Human beings interpret the social relationships within a group based not only on the people themselves; the interplay between such social relationships and the contextual information around the people also plays a significant role. However, these additional cues are largely overlooked by previous studies. We found that the interplay between these two factors can be effectively modeled by a novel structured knowledge graph with proper message propagation and attention. This structured knowledge can be efficiently integrated into the deep neural network architecture to promote social relationship understanding by an end-to-end trainable Graph Reasoning Model (GRM), in which a propagation mechanism is learned to propagate node messages through the graph to explore the interaction between persons of interest and the contextual objects. Meanwhile, a graph attention mechanism is introduced to explicitly reason about the discriminative objects to promote recognition. Extensive experiments on public benchmarks demonstrate the superiority of our method over the existing leading competitors.
#1206

Representation Learning for Scene Graph Completion via Jointly Structural and Visual Embedding
Hai Wan, Yonghao Luo, Bo Peng, Wei-Shi Zheng

Computer Vision 2

This paper focuses on scene graph completion, which aims at predicting new relations between two entities utilizing existing scene graphs and images. By comparison with the well-known knowledge graph, we first identify that each scene graph is associated with an image and that each entity of a visual triple in a scene graph is composed of its entity type with attributes and grounded with a bounding box in its corresponding image. We then propose an end-to-end model named Representation Learning via Jointly Structural and Visual Embedding (RLSV) to take advantage of structural and visual information in scene graphs. In the RLSV model, we provide a fully-convolutional module to extract the visual embeddings of a visual triple and apply hierarchical projection to combine the structural and visual embeddings of a visual triple. In experiments, we evaluate our model on two scene graph completion tasks, link prediction and visual triple classification, and analyze it further through case studies. Experimental results demonstrate that our model outperforms all baselines in both tasks, which justifies the significance of combining structural and visual information for scene graph completion.
#1505

Age Estimation Using Expectation of Label Distribution Learning
Bin-Bin Gao, Hong-Yu Zhou, Jianxin Wu, Xin Geng

Computer Vision 2

Age estimation performance has been greatly improved by using convolutional neural networks. However, existing methods have an inconsistency between the training objectives and the evaluation metric, so they may be suboptimal. In addition, these methods always adopt image classification or face recognition models with a large number of parameters, which bring expensive computation cost and storage overhead. To alleviate these issues, we design a lightweight network architecture and propose a unified framework which can jointly learn age distribution and regress age. The effectiveness of our approach has been demonstrated on apparent and real age estimation tasks. Our method achieves new state-of-the-art results using a single model with 36× fewer parameters and a 2.6× reduction in inference time. Moreover, our method can achieve results comparable to the state-of-the-art even though model parameters are further reduced to 0.9M (3.8MB disk storage). We also show that ranking methods implicitly learn label distributions.
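The idea of jointly learning an age distribution and regressing age can be sketched as follows. This is an illustrative assumption rather than the authors' exact objective: the predicted age is taken as the expectation of a softmax distribution over discrete ages, and the loss sums a KL term against a target label distribution and an L1 term on the expected age. The function names and the `weight` parameter are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over per-age logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_age(probs: List[float], ages: List[float]) -> float:
    """Expectation of the predicted label distribution."""
    return sum(p * a for p, a in zip(probs, ages))


def kl_divergence(target: List[float], predicted: List[float], eps: float = 1e-12) -> float:
    """KL(target || predicted), skipping zero-probability target bins."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0)


def joint_loss(logits: List[float], target_dist: List[float], true_age: float,
               ages: List[float], weight: float = 1.0) -> float:
    """Assumed joint objective: distribution term plus weighted regression term."""
    probs = softmax(logits)
    return (kl_divergence(target_dist, probs)
            + weight * abs(expected_age(probs, ages) - true_age))


ages = [20.0, 21.0, 22.0]
uniform = [1.0 / 3.0] * 3
prediction = expected_age(softmax([0.0, 0.0, 0.0]), ages)
loss = joint_loss([0.0, 0.0, 0.0], uniform, 21.0, ages)
```

Because the regression term is computed on the expectation, the training signal is aligned with an absolute-error evaluation metric, which is the inconsistency the abstract points to.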
#2627

Coarse-to-fine Image Co-segmentation with Intra and Inter Rank Constraints
Lianli Gao, Jingkuan Song, Dongxiang Zhang, Heng Tao Shen

Computer Vision 2

Image co-segmentation is the problem of automatically discovering the common objects co-occurring in a set of relevant images and segmenting them as foreground simultaneously. Although many approaches have been proposed to address this problem, many of them still suffer from certain limitations, e.g., supervised feature learning and complex models, which hinder their capability in real-world scenarios. To alleviate these limitations, we propose a novel coarse-to-fine co-segmentation (CFC) framework, which utilizes coarse foreground and background proposals to learn a robust similarity measure of the features in an unsupervised way, and then devises a simple objective function based on the definition of image co-segmentation. Specifically, we first generate superpixels for all the images and extract their features. Instead of using existing distance metrics, we utilize object proposal methods to generate coarse foreground and background, from which we learn a similarity measure of superpixels and construct a robust feature similarity graph. Then we design an intuitive objective function to learn a segmentation similarity graph which should be consistent with the feature similarity graph and also be able to co-segment the superpixels in the images into either foreground or background. This objective function can be further reformulated as a graph learning problem with intra and inter rank constraints. Experiments on two commonly used image datasets (iCoseg and MSRC) demonstrate that CFC outperforms other state-of-the-art methods. Notably, this performance is achieved by using only the HSV feature.
#605

From Reality to Perception: Genre-Based Neural Image Style Transfer
Zhuoqi Ma, Nannan Wang, Xinbo Gao, Jie Li

Computer Vision 2

We introduce a novel idea for integrating artists’ perceptions of the real world into the neural image style transfer process. Conventional approaches commonly migrate color or texture patterns from the style image to the content image, but the underlying design aspect of the artist always gets overlooked. We want to address the in-depth genre style, that is, how artists perceive the real world and express their perceptions in the artwork. We collect a set of Van Gogh’s paintings and cubist artworks, and their semantically corresponding real world photos. We present a novel genre style transfer framework modeled after the mechanism of actual artwork production. The target style representation is reconstructed based on the semantic correspondence between real world photo and painting, which enables perception guidance in style transfer. The experimental results demonstrate that our method can capture the overall style of a genre or an artist. We hope that this work provides new insight for including artists’ perceptions in the neural style transfer process, and helps people to understand the underlying characteristics of the artist or the genre.
#1623

Active Object Reconstruction Using a Guided View Planner
Xin Yang, Yuanbo Wang, Yaru Wang, Baocai Yin, Qiang Zhang, Xiaopeng Wei, Hongbo Fu

Computer Vision 2

Inspired by recent advances in image-based object reconstruction using deep learning, we present an active reconstruction model using a guided view planner. We aim to reconstruct a 3D model using images observed from a planned sequence of informative and discriminative views. But where are such informative and discriminative views around an object? To address this we propose a unified model for view planning and object reconstruction, which is utilized to learn a guided information acquisition model and to aggregate information from a sequence of images for reconstruction. Experiments show that our model (1) increases reconstruction accuracy with an increasing number of views and (2) generally predicts a more informative sequence of views for object reconstruction compared to alternative methods.

Thursday 19 08:30 - 09:55 SIS-ML2 - Sister Conferences Best Papers, Machine Learning 2 (K11)

Chair: Yang Yu

#5121

Generating High Resolution Climate Change Projections through Single Image Super-Resolution: An Abridged Version
Thomas Vandal, Evan Kodra, Sangram Ganguly, Andrew Michaelis, Ramakrishna Nemani, Auroop R Ganguly

Sister Conferences Best Papers, Machine Learning 2

The impacts of climate change are felt by most critical systems, such as infrastructure, ecological systems, and power-plants. However, contemporary Earth System Models (ESM) are run at spatial resolutions too coarse for assessing effects this localized. Local scale projections can be obtained using statistical downscaling, a technique which uses historical climate observations to learn a low-resolution to high-resolution mapping. The spatio-temporal nature of the climate system motivates the adaptation of super-resolution image processing techniques to statistical downscaling. In our work, we present DeepSD, a generalized stacked super resolution convolutional neural network (SRCNN) framework with multi-scale input channels for statistical downscaling of climate variables. We compare DeepSD to four state-of-the-art methods for downscaling daily precipitation from 1 degree (~100km) to 1/8 degree (~12.5km) over the Continental United States. Furthermore, a framework using the NASA Earth Exchange (NEX) platform is discussed for downscaling more than 20 ESM models with multiple emission scenarios.
#5124

Marathon Race Planning: A Case-Based Reasoning Approach
Barry Smyth, Padraig Cunningham

Sister Conferences Best Papers, Machine Learning 2

We describe and evaluate a novel application of case-based reasoning to help marathon runners to achieve a personal best by: (a) predicting a challenging, but realistic race-time; and (b) recommending a race-plan to achieve this time.
#5132

Improving Information Extraction from Images with Learned Semantic Models
Stephan Baier, Yunpu Ma, Volker Tresp

Sister Conferences Best Papers, Machine Learning 2

Many applications require an understanding of an image that goes beyond the simple detection and classification of its objects. In particular, a great deal of semantic information is carried in the relationships between objects. We have previously shown that the combination of a visual model and a statistical semantic prior model can improve on the task of mapping images to their associated scene description. In this paper, we review the model and compare it to a novel conditional multi-way model for visual relationship detection, which does not include an explicitly trained visual prior model. We also discuss potential relationships between the proposed methods and memory models of the human brain.
#5145

Recursive Spoken Instruction-Based One-Shot Object and Action Learning
Matthias Scheutz, Evan Krause, Bradley Oosterveld, Tyler Frasca, Robert Platt

Sister Conferences Best Papers, Machine Learning 2

Learning new knowledge from single instructions and being able to apply it immediately is highly desirable for artificial agents. We provide the first demonstration of spoken instruction-based one-shot object and action learning in a cognitive robotic architecture and briefly discuss the architectural modifications required to enable such fast learning, demonstrating the new capabilities on a fully autonomous robot.
#5103

Inhibition of Occluded Facial Regions for Distance-Based Face Recognition
Daniel López Sánchez, Juan M. Corchado, Angelica González Arrieta

Sister Conferences Best Papers, Machine Learning 2

This work focuses on the design and validation of a CBR system for efficient face recognition under partial occlusion conditions. The proposed CBR system is based on a classical distance-based classification method, modified to increase its robustness to partial occlusion. This is achieved by using a novel dissimilarity function which discards features coming from occluded facial regions. In addition, we explore the integration of an efficient dimensionality reduction method into the proposed framework to reduce computational cost. We present experimental results showing that the proposed CBR system outperforms classical methods of similar computational requirements in the task of face recognition under partial occlusion.
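A dissimilarity function that discards features from occluded regions can be sketched in a few lines. The abstract does not give the authors' actual function, so the version below is only an assumption: a root-mean-square distance over the features not flagged as occluded, with the occlusion mask supplied by some separate (hypothetical) detector.

```python
import math
from typing import Sequence


def occlusion_aware_distance(a: Sequence[float], b: Sequence[float],
                             occluded: Sequence[bool]) -> float:
    """Root-mean-square difference over non-occluded features only.

    Normalizing by the number of kept features (an assumption) keeps
    distances comparable when different amounts of the face are masked.
    """
    kept = [(x - y) ** 2 for x, y, occ in zip(a, b, occluded) if not occ]
    if not kept:
        return float("inf")  # nothing visible to compare
    return math.sqrt(sum(kept) / len(kept))


# The third feature is flagged as occluded, so its large gap is ignored.
query = [1.0, 2.0, 100.0]
case = [1.0, 4.0, 0.0]
distance = occlusion_aware_distance(query, case, [False, False, True])
```

A nearest-neighbour retrieval step would then rank stored cases by this distance instead of the plain Euclidean one.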
#5139

Accelerating Innovation Through Analogy Mining
Tom Hope, Joel Chan, Aniket Kittur, Dafna Shahaf

Sister Conferences Best Papers, Machine Learning 2

The availability of large idea repositories (e.g., patents) could significantly accelerate innovation and discovery by providing people inspiration from solutions to analogous problems. However, finding useful analogies in these large, messy, real-world repositories remains a persistent challenge for both humans and computers. Previous approaches include costly hand-created databases that do not scale, or machine-learning similarity metrics that struggle to account for structural similarity, which is central to analogy. In this paper we explore the viability and value of learning simple structural representations. Our approach combines crowdsourcing and recurrent neural networks to extract purpose and mechanism vector representations from product descriptions. We demonstrate that these learned vectors allow us to find analogies with higher precision and recall than traditional methods. In an ideation experiment, analogies retrieved by our models significantly increased people's likelihood of generating creative ideas.
#5106

Toeplitz Inverse Covariance-based Clustering of Multivariate Time Series Data
David Hallac, Sagar Vare, Stephen Boyd, Jure Leskovec

Sister Conferences Best Papers, Machine Learning 2

Subsequence clustering of multivariate time series is a useful tool for discovering repeated patterns in temporal data. Once these patterns have been discovered, seemingly complicated datasets can be interpreted as a temporal sequence of only a small number of states, or clusters. However, discovering these patterns is challenging because it requires simultaneous segmentation and clustering of the time series. Here we propose a new method of model-based clustering, which we call Toeplitz Inverse Covariance-based Clustering (TICC). Each cluster in the TICC method is defined by a correlation network, or Markov random field (MRF), characterizing the interdependencies between different observations in a typical subsequence of that cluster. Based on this graphical representation, TICC simultaneously segments and clusters the time series data. We solve the TICC problem through a scalable algorithm that is able to efficiently solve for tens of millions of observations. We validate our approach by comparing TICC to several state-of-the-art baselines in a series of synthetic experiments, and we then demonstrate on an automobile dataset how TICC can be used to learn interpretable clusters in real-world scenarios.

Thursday 19 08:30 - 09:55 ML-TSE2 - Time Series and Data Streams 2 (C2)

Chair: Xiangliang Zhang

#335

Improving Maximum Likelihood Estimation of Temporal Point Process via Discriminative and Adversarial Learning
Junchi Yan, Xin Liu, Liangliang Shi, Changsheng Li, Hongyuan Zha

Time Series and Data Streams 2

The point process is an expressive tool for learning temporal event sequences, which are ubiquitous in real-world applications. Traditional predictive models are based on maximum likelihood estimation (MLE). This paper aims to improve MLE by discriminative and adversarial learning. The initial model is learned by MLE, explaining the joint distribution of the observed event history. Then it is refined by devising a gradient based learning procedure with two complementary recipes: i) mean square error (MSE) that directly reflects the prediction accuracy of the model; ii) adversarial classification loss which induces the Wasserstein distance loss. The hope is that the adversarial loss can add sharpness to the smoothing effect inherently caused by the MSE loss. The method is generic and compatible with different differentiable parametric forms of the intensity function. Empirical results via a variant of the Hawkes processes demonstrate the effectiveness of our method.
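As background for the MLE stage, the log-likelihood of a univariate Hawkes process can be written down directly. The sketch below assumes an exponential kernel, which is a common textbook choice and not necessarily the variant used in the paper; the parameter names `mu`, `alpha` and `beta` are illustrative, and events are assumed to lie in the observation window `[0, horizon]`.

```python
import math
from typing import List


def hawkes_intensity(t: float, history: List[float],
                     mu: float, alpha: float, beta: float) -> float:
    """lambda(t) = mu + alpha * sum over past events of exp(-beta * (t - s))."""
    return mu + alpha * sum(math.exp(-beta * (t - s)) for s in history if s < t)


def hawkes_log_likelihood(events: List[float], horizon: float,
                          mu: float, alpha: float, beta: float) -> float:
    """Sum of log-intensities at the events minus the integrated intensity."""
    log_term = sum(math.log(hawkes_intensity(t, events, mu, alpha, beta))
                   for t in events)
    compensator = mu * horizon + (alpha / beta) * sum(
        1.0 - math.exp(-beta * (horizon - t)) for t in events)
    return log_term - compensator


events = [1.0, 2.0, 3.5]
# With alpha = 0 the model reduces to a homogeneous Poisson process.
poisson_ll = hawkes_log_likelihood(events, 5.0, mu=0.5, alpha=0.0, beta=1.0)
excited_ll = hawkes_log_likelihood(events, 5.0, mu=0.5, alpha=0.8, beta=1.0)
```

Maximizing this quantity over the parameters gives the MLE fit that the abstract proposes to refine with the MSE and adversarial losses.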
#2003

Deep into Hypersphere: Robust and Unsupervised Anomaly Discovery in Dynamic Networks
Xian Teng, Muheng Yan, Ali Mert Ertugrul, Yu-Ru Lin

Time Series and Data Streams 2

The increasing and flexible use of autonomous systems in many domains -- from intelligent transportation systems, information systems, to business transaction management -- has led to challenges in understanding the "normal" and "abnormal" behaviors of those systems. As the systems may be composed of internal states and relationships among sub-systems, it is necessary not only to warn users of anomalous situations but also to provide "transparency" about how the anomalies deviate from normalcy for more appropriate intervention. We propose a unified anomaly discovery framework "DeepSphere" that simultaneously meets the above two requirements -- identifying the anomalous cases and further exploring the cases' anomalous structure localized in spatial and temporal context. DeepSphere leverages deep autoencoders and hypersphere learning methods, and is capable of isolating anomaly pollution and reconstructing normal behaviors. DeepSphere does not rely on human annotated samples and can generalize to unseen data. Extensive experiments on both synthetic and real datasets demonstrate the consistent and robust performance of the proposed method.
#3322

Z-Transforms and its Inference on Partially Observable Point Processes
Young Lee, Thanh Vinh Vo, Kar Wai Lim, Harold Soh

Time Series and Data Streams 2

This paper proposes an inference framework based on the Z-transform for a specific class of non-homogeneous point processes. This framework gives an alternative method to maximum likelihood estimation which is omnipresent in the field of point processes. The inference strategy is to couple or match the theoretical Z-transform with its empirical counterpart from the observed samples. This procedure fully characterizes the distribution of the point process since there exists a one-to-one mapping with the Z-transform. We illustrate how to use the methodology to estimate a point process whose intensity is driven by a general neural network.
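The matching idea can be illustrated on a deliberately simplified case. The sketch below assumes a Poisson-distributed event count, whose probability generating function (the Z-transform of the count distribution evaluated at 1/z) is exp(rate * (z - 1)); it is not the class of processes treated in the paper, and the function names and the evaluation point z = 0.5 are arbitrary choices for illustration.

```python
import math
from typing import List


def empirical_pgf(counts: List[int], z: float) -> float:
    """Sample average of z ** N over observed event counts."""
    return sum(z ** n for n in counts) / len(counts)


def poisson_pgf(rate: float, z: float) -> float:
    """Theoretical E[z ** N] for a Poisson count with the given mean."""
    return math.exp(rate * (z - 1.0))


def match_rate(counts: List[int], z: float = 0.5) -> float:
    """Solve poisson_pgf(rate, z) == empirical_pgf(counts, z) for rate.

    Requires 0 < z < 1 so that the logarithm and the division are defined.
    """
    return math.log(empirical_pgf(counts, z)) / (z - 1.0)


# Event counts observed in repeated windows (toy data).
counts = [2, 3, 1, 4, 2]
rate_hat = match_rate(counts)
```

In richer models the transform has no closed-form inverse, so the match would be made numerically over several values of z rather than solved at a single point.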
#3711

Exploiting Graph Regularized Multi-dimensional Hawkes Processes for Modeling Events with Spatio-temporal Characteristics
Yanchi Liu, Tan Yan, Haifeng Chen

Time Series and Data Streams 2

Multi-dimensional Hawkes processes (MHP) have been widely used for modeling temporal events. However, when MHP are used for modeling events with spatio-temporal characteristics, the spatial information is often ignored despite its importance. In this paper, we introduce a framework to exploit MHP for modeling spatio-temporal events by considering both temporal and spatial information. Specifically, we design a graph regularization method to effectively integrate the prior spatial structure into MHP for learning the influence matrix between different locations. The prior spatial structure is first represented as a connection graph. Then, a multi-view method is utilized for the alignment of the prior connection graph and the influence matrix while preserving the sparsity and low-rank properties of the kernel matrix. Moreover, we develop an optimization scheme using the alternating direction method of multipliers to solve the resulting optimization problem. Finally, the experimental results show that we are able to learn the interaction patterns between different geographical areas more effectively when the prior connection graph is introduced for regularization.
#2626

A Non-Parametric Generative Model for Human Trajectories
Kun Ouyang, Reza Shokri, David S. Rosenblum, Wenzhuo Yang

Time Series and Data Streams 2

Modeling human mobility and synthesizing realistic trajectories play a fundamental role in urban planning and privacy-preserving location data analysis. Due to its high dimensionality and also the diversity of its applications, existing trajectory generative models do not preserve the geometric (and more importantly) semantic features of human mobility, especially for longer trajectories. In this paper, we propose and evaluate a novel non-parametric generative model for location trajectories that tries to capture the statistical features of human mobility as a whole. This is in contrast with existing models that generate trajectories in a sequential manner. We design a new representation of locations, and use generative adversarial networks to produce data points in that representation space which will be then transformed to a time-series location trajectory form. We evaluate our method on realistic location trajectories and compare our synthetic traces with multiple existing methods on how they preserve geographic and semantic features of real traces at both aggregated and individual levels. The empirical results prove the capability of our model in preserving the utility of real data.
#3712

Request-and-Reverify: Hierarchical Hypothesis Testing for Concept Drift Detection with Expensive Labels
Shujian Yu, Xiaoyang Wang, José C. Príncipe

Time Series and Data Streams 2

One important assumption underlying common classification models is the stationarity of the data. However, in real-world streaming applications, the data concept indicated by the joint distribution of feature and label is not stationary but drifting over time. Concept drift detection aims to detect such drifts and adapt the model so as to mitigate any deterioration in the model's predictive performance. Unfortunately, most existing concept drift detection methods rely on a strong and over-optimistic condition that the true labels are available immediately for all already classified instances. In this paper, a novel Hierarchical Hypothesis Testing framework with Request-and-Reverify strategy is developed to detect concept drifts by requesting labels only when necessary. Two methods, namely Hierarchical Hypothesis Testing with Classification Uncertainty (HHT-CU) and Hierarchical Hypothesis Testing with Attribute-wise "Goodness-of-fit" (HHT-AG), are proposed respectively under the novel framework. In experiments with benchmark datasets, our methods demonstrate overwhelming advantages over state-of-the-art unsupervised drift detectors. More importantly, our methods even outperform DDM (the widely used supervised drift detector) when we use significantly fewer labels.
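For context on the supervised baseline named in this abstract, the sketch below follows the Drift Detection Method (DDM) as it is commonly described: it tracks the running error rate and its standard deviation and flags a drift when their sum exceeds the best recorded level by three standard deviations. It is a simplified illustration, not the paper's HHT methods, and the `min_samples` default and the toy error stream are assumptions. Note that it consumes a true label for every instance, which is exactly the requirement the abstract calls over-optimistic.

```python
import math


class DDM:
    """Simplified Drift Detection Method over a stream of error indicators."""

    def __init__(self, min_samples: int = 30) -> None:
        self.min_samples = min_samples
        self.n = 0
        self.errors = 0
        self.p_min = float("inf")
        self.s_min = float("inf")

    def update(self, error: bool) -> str:
        """Feed one prediction outcome; return 'stable', 'warning' or 'drift'."""
        self.n += 1
        self.errors += int(error)
        p = self.errors / self.n                 # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)    # its standard deviation
        if self.n < self.min_samples:
            return "stable"
        if p + s < self.p_min + self.s_min:      # remember the best level seen
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + 3.0 * self.s_min:
            return "drift"
        if p + s > self.p_min + 2.0 * self.s_min:
            return "warning"
        return "stable"


# Toy stream: 10% errors for 100 instances, then every prediction is wrong.
detector = DDM()
stream = [i % 10 == 0 for i in range(1, 101)] + [True] * 50
statuses = [detector.update(err) for err in stream]
```

A full detector would also reset its statistics and retrain the classifier after a drift is signalled; that step is left out here.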
#3992

Temporal Belief Memory: Imputing Missing Data during RNN Training
Yeo Jin Kim, Min Chi

Time Series and Data Streams 2

We propose a bio-inspired approach named Temporal Belief Memory (TBM) for handling missing data with recurrent neural networks (RNNs). When modeling irregularly observed temporal sequences, conventional RNNs generally ignore the real-time intervals between consecutive observations. TBM is a missing value imputation method that considers time continuity and captures latent missing patterns based on the irregular real-time intervals of the inputs. We evaluate our TBM approach with real-world electronic health records (EHRs) consisting of 52,919 visits and 4,224,567 events on the task of early prediction of septic shock. We compare TBM against multiple baselines including both domain experts' rules and the state-of-the-art missing data handling approach, using both RNN and long short-term memory. The experimental results show that TBM outperforms all the competitive baseline approaches for the septic shock early prediction task.

Thursday 19 08:30 - 09:55 MUL-REC - Recommender Systems (C3)

Chair: Chaochao Chen

#261

Improving Implicit Recommender Systems with View Data
Jingtao Ding, Guanghui Yu, Xiangnan He, Yuhan Quan, Yong Li, Tat-Seng Chua, Depeng Jin, Jiajie Yu

Recommender Systems

Most existing recommender systems leverage the primary feedback data only, such as the purchase records in E-commerce. In this work, we additionally integrate view data into implicit feedback based recommender systems (dubbed as Implicit Recommender Systems). We propose to model the pairwise ranking relations among purchased, viewed, and non-viewed interactions, being more effective and flexible than typical pointwise matrix factorization (MF) methods. However, such a pairwise formulation poses efficiency challenges in learning the model. To address this problem, we design a new learning algorithm based on the element-wise Alternating Least Squares (eALS) learner. Notably, our algorithm can efficiently learn model parameters from the whole user-item matrix (including all missing data), with a rather low time complexity that is dependent on the observed data only. Extensive experiments on two real-world datasets demonstrate that our method outperforms several state-of-the-art MF methods by 10% ∼ 28.4%. Our implementation is available at: https://github.com/dingjingtao/View_enhanced_ALS.
#720

Interpretable Recommendation via Attraction Modeling: Learning Multilevel Attractiveness over Multimodal Movie Contents
Liang Hu, Songlei Jian, Longbing Cao, Qingkui Chen

Recommender Systems

New content such as blogs and online videos is produced every second in the new media age. We argue that attraction is one of the decisive factors in user selection of new content. However, collaborative filtering cannot work without user feedback, and existing content-based recommender systems are unable to capture and interpret the attractive points of new content. Accordingly, we propose attraction modeling to learn and interpret user attractiveness. Specifically, we build a multilevel attraction model (MLAM) over the content features: the story (textual data) and cast members (categorical data) of movies. In particular, we design multilevel personal filters to calculate users' attractiveness on words, sentences and cast members at different levels. The experimental results show the superiority of MLAM over the state-of-the-art methods. In addition, a case study is provided to demonstrate the interpretability of MLAM by visualizing user attractiveness on a movie.
#2275

CoupledCF: Learning Explicit and Implicit User-item Couplings in Recommendation for Deep Collaborative Filtering
Quangui Zhang, Longbing Cao, Chengzhang Zhu, Zhiqiang Li, Jinguang Sun

Recommender Systems

Non-IID recommender system discloses the nature of recommendation and has shown its potential in improving recommendation quality and addressing issues such as sparsity and cold start. It leverages existing work that usually treats users/items as independent while ignoring the rich couplings within and between users and items, leading to limited performance improvement. In reality, users/items are related with various couplings existing within and between users and items, which may better explain how and why a user has personalized preference on an item. This work builds on non-IID learning to propose a neural user-item coupling learning for collaborative filtering, called CoupledCF. CoupledCF jointly learns explicit and implicit couplings within/between users and items w.r.t. user/item attributes and deep features for deep CF recommendation. Empirical results on two real-world large datasets show that CoupledCF significantly outperforms two latest neural recommenders: neural matrix factorization and Google’s Wide&Deep network.
#711

Exploiting POI-Specific Geographical Influence for Point-of-Interest Recommendation
Hao Wang, Huawei Shen, Wentao Ouyang, Xueqi Cheng

Recommender Systems

Point-of-interest (POI) recommendation, i.e., recommending unvisited POIs for users, is a fundamental problem for location-based social networks. POI recommendation distinguishes itself from traditional item recommendation, e.g., movie recommendation, via geographical influence among POIs. Existing methods model the geographical influence between two POIs as the probability or propensity that the two POIs are co-visited by the same user given their physical distance. These methods assume that geographical influence between POIs is determined by their physical distance, failing to capture the asymmetry of geographical influence and the high variation of geographical influence across POIs. In this paper, we exploit POI-specific geographical influence to improve POI recommendation. We model the geographical influence between two POIs using three factors: the geo-influence of POI, the geo-susceptibility of POI, and their physical distance. Geo-influence captures a POI's capacity at exerting geographical influence to other POIs, and geo-susceptibility reflects a POI's propensity of being geographically influenced by other POIs. Experimental results on two real-world datasets demonstrate that POI-specific geographical influence significantly improves the performance of POI recommendation.
#1230

Matrix completion with Preference Ranking for Top-N Recommendation
Zengmao Wang, Yuhong Guo, Bo Du

Recommender Systems

Matrix completion has become a popular method for top-N recommendation due to the low rank nature of sparse rating matrices. However, many existing methods produce top-N recommendations by recovering a user-item matrix solely based on a low rank function or its relaxations, while ignoring other important intrinsic characteristics of the top-N recommendation tasks such as preference ranking over the items. In this paper, we propose a novel matrix completion method that integrates the low rank and preference ranking characteristics of recommendation matrix under a self-recovery model for top-N recommendation. The proposed method is formulated as a joint minimization problem and solved using an ADMM algorithm. We conduct experiments on E-commerce datasets. The experimental results show the proposed approach outperforms several state-of-the-art methods.
#1695

NeuRec: On Nonlinear Transformation for Personalized Ranking
Shuai Zhang, Lina Yao, Aixin Sun, Sen Wang, Guodong Long, Manqing Dong

Recommender Systems

Modeling user-item interaction patterns is an important task for personalized recommendations. Many recommender systems are based on the assumption that there exists a linear relationship between users and items while neglecting the intricacy and non-linearity of real-life historical interactions. In this paper, we propose a neural network based recommendation model (NeuRec) that untangles the complexity of user-item interactions and establishes an integrated network to combine non-linear transformation with latent factors. We further design two variants of NeuRec, user-based NeuRec and item-based NeuRec, by focusing on different aspects of the interaction matrix. Extensive experiments on four real-world datasets demonstrate their superior performance on the personalized ranking task.
#1943

Sequential Recommender System based on Hierarchical Attention Networks
Haochao Ying, Fuzhen Zhuang, Fuzheng Zhang, Yanchi Liu, Guandong Xu, Xing Xie, Hui Xiong, Jian Wu

Recommender Systems

With a large amount of user activity data accumulated, it is crucial to exploit user sequential behavior for sequential recommendations. Conventionally, user general taste and recent demand are combined to promote recommendation performance. However, existing methods often neglect that user long-term preferences keep evolving over time, and building a static representation for user general taste may not adequately reflect this dynamic character. Moreover, they integrate user-item or item-item interactions in a linear way, which limits the capability of the model. To this end, in this paper, we propose a novel two-layer hierarchical attention network, which takes the above properties into account, to recommend the next item a user might be interested in. Specifically, the first attention layer learns user long-term preferences based on the historical purchased item representation, while the second one outputs the final user representation through coupling user long-term and short-term preferences. The experimental study demonstrates the superiority of our method compared with other state-of-the-art ones.

Thursday 19 08:55 - 09:55 Industry Day (A4)

Industry Day - Session 1a

Industry Day

Thursday 19 10:25 - 11:40 EAR8 - Early Career 8 (VICTORIA)

Chair: Alessandro Saffiotti

#5488

Grounded Language Learning: Where Robotics and NLP Meet
Cynthia Matuszek

Early Career 8

Grounded language acquisition is concerned with learning the meaning of language as it applies to the physical world. As robots become more capable and ubiquitous, there is an increasing need for non-specialists to interact with and control them, and natural language is an intuitive, flexible, and customizable mechanism for such communication. At the same time, physically embodied agents offer a way to learn to understand natural language in the context of the world to which it refers. This paper gives an overview of the research area, selected recent advances, and some future directions and challenges that remain.
#5498

Optimizing Robot Action for and around People
Anca Dragan

Early Career 8
#5499

Gearing up Knowledge Representation and Reasoning for the Real World
Martin Gebser

Early Career 8

Thursday 19 10:25 - 11:40 KR-CV - Knowledge Representation and Vision, Spatial Reasoning (C7)

Chair: Diedrich Wolter

#7

GeoMAN: Multi-level Attention Networks for Geo-sensory Time Series Prediction
Yuxuan Liang, Songyu Ke, Junbo Zhang, Xiuwen Yi, Yu Zheng

Knowledge Representation and Vision, Spatial Reasoning

Numerous sensors have been deployed in different geospatial locations to continuously and cooperatively monitor the surrounding environment, such as the air quality. These sensors generate multiple geo-sensory time series, with spatial correlations between their readings. Forecasting geo-sensory time series is of great importance yet very challenging as it is affected by many complex factors, i.e., dynamic spatio-temporal correlations and external factors. In this paper, we predict the readings of a geo-sensor over several future hours by using a multi-level attention-based recurrent neural network that considers multiple sensors' readings, meteorological data, and spatial data. More specifically, our model consists of two major parts: 1) a multi-level attention mechanism to model the dynamic spatio-temporal dependencies. 2) a general fusion module to incorporate the external factors from different domains. Experiments on two types of real-world datasets, viz., air quality data and water quality data, demonstrate that our method outperforms nine baseline methods.
#110

Line separation from topographic maps using regional color and spatial information
Pengfei Xu, Qiguang Miao, Tiange Liu, Xiaojiang Chen, Dingyi Fang

Knowledge Representation and Vision, Spatial Reasoning

The lines in topographic maps are difficult to separate from each other because of their confusing colors. To solve this problem, we propose a novel line separation method using their regional color and spatial information. First, we divide the lines into many circular regions with a certain diameter, and consider these regions as the basic processing units. Then, based on a new concept of regional color confusion, we classify all the divided circular regions into two kinds of regions according to whether the color is pure or mixed. Further, for pure color regions, a fuzzy clustering algorithm with Gaussian kernel can be used to cluster them into different lines based on their color information. Meanwhile, we determine the memberships of the mixed color regions according to their spatial relations with the clustered pure color regions. The concept of regional color confusion is proposed to reduce the influence of the confusing colors on line separation, and the spatial relations are utilized to solve the problem of determining the membership of the mixed color regions. The experimental results demonstrate that our method can achieve higher accuracy compared with two other state-of-the-art methods, which provides a novel idea for line element segmentation from scanned topographic maps.
#2183

Incrementally Grounding Expressions for Spatial Relations between Objects
Tiago Mota, Mohan Sridharan

Knowledge Representation and Vision, Spatial Reasoning

Recognizing, reasoning about, and providing understandable descriptions of spatial relations between objects is an important task for robots interacting with humans. This paper describes an architecture for incrementally learning and revising the grounding of spatial relations between objects. Answer Set Prolog, a declarative language, is used to represent and reason with incomplete knowledge that includes prepositional spatial relations between scene objects. A generic grounding of prepositions for spatial relations, human input (when available), and non-monotonic logical inference, are used to infer spatial relations between 3D point clouds in given scenes, incrementally acquiring a specialized metric grounding of the prepositions and the relative confidence associated with each grounding. The architecture is evaluated on a benchmark dataset of tabletop images and on complex simulated scenes of furniture.
#2947

Reasoning about Betweenness and RCC8 Constraints in Qualitative Conceptual Spaces
Steven Schockaert, Sanjiang Li

Knowledge Representation and Vision, Spatial Reasoning

Conceptual spaces are a knowledge representation framework in which concepts are represented geometrically, using convex regions. Motivated by the fact that exact conceptual spaces are usually difficult to obtain, we study the problem of spatial reasoning about qualitative abstractions of such representations. In particular, we consider the problem of deciding whether an RCC8 network extended with constraints about betweenness can be realized using bounded and convex regions in a high-dimensional Euclidean space. After showing that this decision problem is PSPACE-hard in general, we introduce an important fragment for which deciding realizability is NP-complete.
#993

Knowledge-Embedded Representation Learning for Fine-Grained Image Recognition
Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, Xiaonan Luo

Knowledge Representation and Vision, Spatial Reasoning

Humans can naturally understand an image in depth with the aid of rich knowledge accumulated from daily lives or professions. For example, to achieve fine-grained image recognition (e.g., categorizing hundreds of subordinate categories of birds) usually requires a comprehensive visual concept organization including category labels and part-level attributes. In this work, we investigate how to unify rich professional knowledge with deep neural network architectures and propose a Knowledge-Embedded Representation Learning (KERL) framework for handling the problem of fine-grained image recognition. Specifically, we organize the rich visual concepts in the form of knowledge graph and employ a Gated Graph Neural Network to propagate node message through the graph for generating the knowledge representation. By introducing a novel gated mechanism, our KERL framework incorporates this knowledge representation into the discriminative image feature learning, i.e., implicitly associating the specific attributes with the feature maps. Compared with existing methods of fine-grained image classification, our KERL framework has several appealing properties: i) The embedded high-level knowledge enhances the feature representation, thus facilitating distinguishing the subtle differences among subordinate categories. ii) Our framework can learn feature maps with a meaningful configuration that the highlighted regions finely accord with the nodes (specific attributes) of the knowledge graph. Extensive experiments on the widely used Caltech-UCSD bird dataset demonstrate the superiority of our KERL framework over existing state-of-the-art methods.
#2890

Fine-grained Image Classification by Visual-Semantic Embedding
Huapeng Xu, Guilin Qi, Jingjing Li, Meng Wang, Kang Xu, Huan Gao

Knowledge Representation and Vision, Spatial Reasoning

This paper investigates a challenging problem, which is known as fine-grained image classification (FGIC). Different from conventional computer vision problems, FGIC suffers from the large intra-class diversities and subtle inter-class differences. Existing FGIC approaches are limited to explore only the visual information embedded in the images. In this paper, we present a novel approach which can use handy prior knowledge from either structured knowledge bases or unstructured text to facilitate FGIC. Specifically, we propose a visual-semantic embedding model which explores semantic embedding from knowledge bases and text, and further trains a novel end-to-end CNN framework to linearly map image features to a rich semantic embedding space. Experimental results on a challenging large-scale UCSD Bird-200-2011 dataset verify that our approach outperforms several state-of-the-art methods with significant advances.

Thursday 19 10:25 - 11:40 HSP-HS - Heuristic Search (K2)

Chair: Nathan Sturtevant

#3730

Distributed Pareto Optimization for Subset Selection
Chao Qian, Guiying Li, Chao Feng, Ke Tang

Heuristic Search

The subset selection problem that selects a few items from a ground set arises in many applications such as maximum coverage, influence maximization, sparse regression, etc. The recently proposed POSS algorithm is a powerful approximation solver for this problem. However, POSS requires centralized access to the full ground set, and thus is impractical for large-scale real-world applications, where the ground set is too large to be stored on one single machine. In this paper, we propose a distributed version of POSS (DPOSS) with a bounded approximation guarantee. DPOSS can be easily implemented in the MapReduce framework. Our extensive experiments using Spark, on various real-world data sets with size ranging from thousands to millions, show that DPOSS can achieve competitive performance compared with the centralized POSS, and is almost always better than the state-of-the-art distributed greedy algorithm RandGreeDi.
#1985

Best-Case and Worst-Case Behavior of Greedy Best-First Search
Manuel Heusner, Thomas Keller, Malte Helmert

Heuristic Search

We study the impact of tie-breaking on the behavior of greedy best-first search with a fixed state space and fixed heuristic. We prove that it is NP-complete to determine the number of states that need to be expanded by greedy best-first search in the best case or in the worst case. However, the best- and worst-case behavior can be computed in polynomial time for undirected state spaces. We perform computational experiments on benchmark tasks from the International Planning Competitions that compare the best and worst cases of greedy best-first search to FIFO, LIFO and random tie-breaking. The experiments demonstrate the importance of tie-breaking in greedy best-first search.
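
As a rough illustration of why tie-breaking matters here, the sketch below runs greedy best-first search on a tiny hand-made state space and counts expansions under FIFO and LIFO tie-breaking. The graph, the heuristic values, and the function name gbfs_expansions are all hypothetical and chosen only for this example; they are not taken from the paper or its benchmarks.

```python
import heapq
from itertools import count

def gbfs_expansions(graph, h, start, goal, tie_break="fifo"):
    """Count states expanded by greedy best-first search before the goal
    is removed from the open list. Ties in h are broken FIFO or LIFO."""
    sign = 1 if tie_break == "fifo" else -1   # LIFO prefers newer entries
    order = count()
    open_list = [(h[start], sign * next(order), start)]
    closed = set()
    expansions = 0
    while open_list:
        _, _, state = heapq.heappop(open_list)
        if state in closed:
            continue
        if state == goal:
            return expansions
        closed.add(state)
        expansions += 1
        for succ in graph[state]:
            if succ not in closed:
                heapq.heappush(open_list, (h[succ], sign * next(order), succ))
    return None  # goal unreachable

# Hypothetical state space: A and B tie on h, but B leads down a longer path.
graph = {"S": ["A", "B"], "A": ["G"], "B": ["C"], "C": ["D"], "D": ["G"], "G": []}
h = {"S": 2, "A": 1, "B": 1, "C": 1, "D": 1, "G": 0}

print(gbfs_expansions(graph, h, "S", "G", "fifo"))  # 2
print(gbfs_expansions(graph, h, "S", "G", "lifo"))  # 4
```

With the same state space and heuristic, the number of expansions differs only because of how ties between A and B are resolved.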
#3728

Anytime Focal Search with Applications
Liron Cohen, Matias Greco, Hang Ma, Carlos Hernandez, Ariel Felner, T. K. Satish Kumar, Sven Koenig

Heuristic Search

Focal search (FS) is a bounded-suboptimal search (BSS) variant of A*. Like A*, it uses an open list whose states are sorted in increasing order of their f-values. Unlike A*, it also uses a focal list containing all states from the open list whose f-values are no larger than a suboptimality factor times the smallest f-value in the open list. In this paper, we develop an anytime version of FS, called anytime FS (AFS), that is useful when deliberation time is limited. AFS finds a "good" solution quickly and refines it to better and better solutions if time allows. It does this refinement efficiently by reusing previous search efforts. On the theoretical side, we show that AFS is bounded suboptimal and that anytime potential search (ATPS/ANA*), a state-of-the-art anytime bounded-cost search (BCS) variant of A*, is a special case of AFS. In doing so, we bridge the gap between anytime search algorithms based on BSS and BCS. We also identify different properties of priority functions, used to sort the focal list, that may allow for efficient reuse of previous search efforts. On the experimental side, we demonstrate the usefulness of AFS for solving hard combinatorial problems, such as the generalized covering traveling salesman problem and the multi-agent pathfinding problem.
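
The focal list defined in this abstract can be sketched in a few lines. The snippet below only illustrates that definition (states whose f-value is at most a suboptimality factor w times the smallest f-value in the open list); it is not the anytime algorithm itself, and the function name focal_list and the sample values are assumptions made for the example.

```python
def focal_list(open_list, w):
    """Return the entries of open_list whose f-value is at most
    w times the smallest f-value currently in open_list.

    open_list: iterable of (f_value, state) pairs
    w: suboptimality factor, assumed to be >= 1
    """
    entries = list(open_list)
    if not entries:
        return []
    f_min = min(f for f, _ in entries)
    return [(f, s) for f, s in entries if f <= w * f_min]

# Hypothetical open list with f-values 10, 11 and 13.
open_list = [(10, "a"), (11, "b"), (13, "c")]
print(focal_list(open_list, 1.2))  # [(10, 'a'), (11, 'b')]
print(focal_list(open_list, 1.0))  # [(10, 'a')]
```

With w = 1 the focal list contains only the minimum-f states, so focal search restricted this way expands the same states A* could; larger w admits more candidates for the secondary priority function to choose from.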
#3942

A Group-based Approach to Improve Multifactorial Evolutionary Algorithm
Jing Tang, Yingke Chen, Zixuan Deng, Yanping Xiang, Colin Paul Joy

Heuristic Search

Multifactorial evolutionary algorithm (MFEA) exploits the parallelism of population-based evolutionary algorithm and provides an efficient way to evolve individuals for solving multiple tasks concurrently. Its efficiency is derived by implicitly transferring the genetic information among tasks. However, MFEA doesn't distinguish the information quality in the transfer, compromising the algorithm performance. We propose a group-based MFEA that groups tasks of similar types and selectively transfers the genetic information only within the groups. We also develop a new selection criterion and an additional mating selection mechanism in order to strengthen the effectiveness and efficiency of the improved MFEA. We conduct the experiments in both the cross-domain and intra-domain problems.
#4165

Understanding Subgoal Graphs by Augmenting Contraction Hierarchies
Tansel Uras, Sven Koenig

Heuristic Search

Contraction hierarchies and (N-level) subgoal graphs are two preprocessing-based path-planning algorithms that have so far only been compared experimentally through the grid-based path-planning competitions, where both algorithms had undominated runtime/memory trade-offs. Subgoal graphs can be considered as a framework that can be customized to different domains through the choice of a reachability relation R that identifies pairs of nodes on a graph between which it is easy to find shortest paths. Subgoal graphs can exploit R in various ways to speed-up query times and reduce memory requirements. In this paper, we break down the differences between N-level subgoal graphs and contraction hierarchies, and augment contraction hierarchies with ideas from subgoal graphs to exploit R. We propose three different modifications, analyze their runtime/memory trade-offs, and provide experimental results on grids using canonical-freespace-reachability as R, which show that both N-level subgoal graphs and contraction hierarchies are dominated in terms of the runtime/memory trade-off by some of our new variants.
#606

Improving Local Search for Minimum Weight Vertex Cover by Dynamic Strategies
Shaowei Cai, Wenying Hou, Jinkun Lin, Yuanjie Li

Heuristic Search

The minimum weight vertex cover (MWVC) problem is an important combinatorial optimization problem with various real-world applications. Due to its NP-hardness, most works on solving MWVC focus on heuristic algorithms that can return a good quality solution in reasonable time. In this work, we propose two dynamic strategies that adjust the behavior of the algorithm during search, which are used to improve a state-of-the-art local search for MWVC named FastWVC, resulting in two local search algorithms called DynWVC1 and DynWVC2. Previous MWVC algorithms are evaluated on graphs with random or hand-crafted weights. In this work, we evaluate the algorithms on vertex-weighted graphs obtained from an important real-world problem, the map labeling problem. Experiments show that our algorithm obtains better results than previous algorithms for MWVC and maximum weight independent set (MWIS) on these real-world instances. We also test our algorithms on massive graphs studied in previous works, and show significant improvements there.

Thursday 19 10:25 - 11:40 NLP-EMB2 - Embeddings 2 (T2)

Chair: Margot Yann

#1482

Medical Concept Embedding with Time-Aware Attention
Xiangrui Cai, Jinyang Gao, Kee Yuan Ngiam, Beng Chin Ooi, Ying Zhang, Xiaojie Yuan

Embeddings 2

Embeddings of medical concepts such as medication, procedure and diagnosis codes in Electronic Medical Records (EMRs) are central to healthcare analytics. Previous work on medical concept embedding takes medical concepts and EMRs as words and documents respectively. Nevertheless, such models miss the temporal nature of EMR data. On the one hand, two consecutive medical concepts do not indicate they are temporally close, but the correlations between them can be revealed by the time gap. On the other hand, the temporal scopes of medical concepts often vary greatly (e.g., common cold and diabetes). In this paper, we propose to incorporate the temporal information to embed medical codes. Based on the Continuous Bag-of-Words model, we employ the attention mechanism to learn a "soft" time-aware context window for each medical concept. Experiments on public and proprietary datasets through clustering and nearest neighbour search tasks demonstrate the effectiveness of our model, showing that it outperforms five state-of-the-art baselines.
#1691

ACV-tree: A New Method for Sentence Similarity Modeling
Yuquan Le, Zhi-Jie Wang, Zhe Quan, Jiawei He, Bin Yao

Embeddings 2

Sentence similarity modeling lies at the core of many natural language processing applications, and thus has received much attention. Owing to the success of word embeddings, recently, popular neural network methods have achieved sentence embedding, obtaining attractive performance. Nevertheless, most of them focused on learning semantic information and modeling it as a continuous vector, while the syntactic information of sentences has not been fully exploited. On the other hand, prior works have shown the benefits of structured trees that include syntactic information, while few methods in this branch utilized the advantages of word embeddings and another powerful technique, the attention weight mechanism. This paper makes the first attempt to absorb their advantages by merging these techniques in a unified structure, dubbed as ACV-tree. Meanwhile, this paper develops a new tree kernel, known as ACVT kernel, that is tailored for sentence similarity measure based on the proposed structure. The experimental results, based on 19 widely-used datasets, demonstrate that our model is effective and competitive, compared against state-of-the-art models.
#1936

Densely Connected CNN with Multi-scale Feature Attention for Text Classification
Shiyao Wang, Minlie Huang, Zhidong Deng

Embeddings 2

Text classification is a fundamental problem in natural language processing. As a popular deep learning model, convolutional neural network (CNN) has demonstrated great success in this task. However, most existing CNN models apply convolution filters of fixed window size, thereby unable to learn variable n-gram features flexibly. In this paper, we present a densely connected CNN with multi-scale feature attention for text classification. The dense connections build short-cut paths between upstream and downstream convolutional blocks, which enable the model to compose features of larger scale from those of smaller scale, and thus produce variable n-gram features. Furthermore, a multi-scale feature attention is developed to adaptively select multi-scale features for classification. Extensive experiments demonstrate that our model obtains competitive performance against state-of-the-art baselines on five benchmark datasets. Attention visualization further reveals the model's ability to select proper n-gram features for text classification.
#1425

Bootstrapping Entity Alignment with Knowledge Graph Embedding
Zequn Sun, Wei Hu, Qingheng Zhang, Yuzhong Qu

Embeddings 2

Embedding-based entity alignment represents different knowledge graphs (KGs) as low-dimensional embeddings and finds entity alignment by measuring the similarities between entity embeddings. Existing approaches have achieved promising results; however, they are still challenged by the lack of enough prior alignment as labeled training data. In this paper, we propose a bootstrapping approach to embedding-based entity alignment. It iteratively labels likely entity alignment as training data for learning alignment-oriented KG embeddings. Furthermore, it employs an alignment editing method to reduce error accumulation during iterations. Our experiments on real-world datasets showed that the proposed approach significantly outperformed the state-of-the-art embedding-based ones for entity alignment. The proposed alignment-oriented KG embedding, bootstrapping process and alignment editing method all contributed to the performance improvement.
#4069

ANRL: Attributed Network Representation Learning via Deep Neural Networks
Zhen Zhang, Hongxia Yang, Jiajun Bu, Sheng Zhou, Pinggang Yu, Jianwei Zhang, Martin Ester, Can Wang

Embeddings 2

Network representation learning (RL) aims to transform the nodes in a network into low-dimensional vector spaces while preserving the inherent properties of the network. Though network RL has been intensively studied, most existing works focus on either network structure or node attribute information. In this paper, we propose a novel framework, named ANRL, to incorporate both the network structure and node attribute information in a principled way. Specifically, we propose a neighbor enhancement autoencoder to model the node attribute information, which reconstructs its target neighbors instead of itself. To capture the network structure, attribute-aware skip-gram model is designed based on the attribute encoder to formulate the correlations between each node and its direct or indirect neighbors. We conduct extensive experiments on six real-world networks, including two social networks, two citation networks and two user behavior networks. The results empirically show that ANRL can achieve relatively significant gains in node classification and link prediction tasks.
#602

Learning Word Vectors with Linear Constraints: A Matrix Factorization Approach
Wenye Li, Jiawei Zhang, Jianjun Zhou, Laizhong Cui

Embeddings 2

Learning vector space representation of words, or word embedding, has attracted much recent research attention. With the objective of better capturing the semantic and syntactic information inherent in words, we propose two new embedding models based on the singular value decomposition of lexical co-occurrences of words. Different from previous work, our proposed models allow for injecting linear constraints when performing the decomposition, with which the desired semantic and syntactic information will be maintained in word vectors. Conceptually the models are flexible and convenient to encode prior knowledge about words. Computationally they can be easily solved by direct matrix factorization. Surprisingly simple yet effective, the proposed models have reported significantly improved performance in empirical word analogy and sentence classification evaluations, and demonstrated high potentials in practical applications.

Thursday 19 10:25 - 11:40 CV-CV3 - Computer Vision 3 (T1)

Chair: Jianxin Wu

#719

Toward Designing Convergent Deep Operator Splitting Methods for Task-specific Nonconvex Optimization
Risheng Liu, Shichao Cheng, Yi He, Xin Fan, Zhongxuan Luo

Computer Vision 3

Operator splitting methods have been successfully used in computational sciences, statistics, learning and vision areas to reduce complex problems into a series of simpler subproblems. However, prevalent splitting schemes are mostly established only based on the mathematical properties of some general optimization models. So it is a laborious process and often requires many iterations of ideation and validation to obtain practical and task-specific optimal solutions, especially for nonconvex problems in real-world scenarios. To break through the above limits, we introduce a new algorithmic framework, called Learnable Bregman Splitting (LBS), to perform deep-architecture-based operator splitting for nonconvex optimization based on specific task model. Thanks to the data-dependent (i.e., learnable) nature, our LBS can not only speed up the convergence, but also avoid unwanted trivial solutions for real-world tasks. Though with inexact deep iterations, we can still establish the global convergence and estimate the asymptotic convergence rate of LBS only by enforcing some fairly loose assumptions. Extensive experiments on different applications (e.g., image completion and deblurring) verify our theoretical results and show the superiority of LBS against existing methods.
#1264

Deep CNN Denoiser and Multi-layer Neighbor Component Embedding for Face Hallucination
Junjun Jiang, Yi Yu, Jinhui Hu, Suhua Tang, Jiayi Ma

Computer Vision 3

Most of the current face hallucination methods, whether they are shallow learning-based or deep learning-based, all try to learn a relationship model between Low-Resolution (LR) and High-Resolution (HR) spaces with the help of a training set. They mainly focus on modeling image prior through either model-based optimization or discriminative inference learning. However, when the input LR face is tiny, the learned prior knowledge is no longer effective and their performance will drop sharply. To solve this problem, in this paper we propose a general face hallucination method that can integrate model-based optimization and discriminative inference. In particular, to exploit the model based prior, the Deep Convolutional Neural Networks (CNN) denoiser prior is plugged into the super-resolution optimization model with the aid of image-adaptive Laplacian regularization. Additionally, we further develop a high-frequency details compensation method by dividing the face image to facial components and performing face hallucination in a multi-layer neighbor embedding manner. Experiments demonstrate that the proposed method can achieve promising super-resolution results for tiny input LR faces.
#1554

SSR-Net: A Compact Soft Stagewise Regression Network for Age Estimation
Tsun-Yi Yang, Yi-Hsuan Huang, Yen-Yu Lin, Pi-Cheng Hsiu, Yung-Yu Chuang

Computer Vision 3

This paper presents a novel CNN model called Soft Stagewise Regression Network (SSR-Net) for age estimation from a single image with a compact model size. Inspired by DEX, we address age estimation by performing multi-class classification and then turning classification results into regression by calculating the expected values. SSR-Net takes a coarse-to-fine strategy and performs multi-class classification with multiple stages. Each stage is only responsible for refining the decision of its previous stage for more accurate age estimation. Thus, each stage performs a task with few classes and requires few neurons, greatly reducing the model size. To address the quantization issue introduced by grouping ages into classes, SSR-Net assigns a dynamic range to each age class by allowing it to be shifted and scaled according to the input face image. Both the multi-stage strategy and the dynamic range are incorporated into the formulation of soft stagewise regression. A novel network architecture is proposed for carrying out soft stagewise regression. The resultant SSR-Net model is very compact and takes only 0.32 MB. Despite its compact size, SSR-Net’s performance approaches that of the state-of-the-art methods whose model sizes are often more than 1500× larger.
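
The classification-to-regression step described above can be illustrated with a minimal sketch. The function names, the equal-width bins, and the way stages are combined are assumptions made for illustration; the sketch omits the per-input shift and scale of the dynamic range and is not the authors' network or exact formulation.

```python
def expected_age(probs, bin_centers):
    """DEX-style regression: expected value of the age under the class probabilities."""
    return sum(p * c for p, c in zip(probs, bin_centers))


def staged_expected_age(stage_probs, age_range):
    """Coarse-to-fine estimate (hypothetical reading of stagewise refinement).

    Each stage splits the bin chosen by the previous stage into
    len(probs) equal sub-bins; its contribution is the expected
    offset (lower bin edge) at that stage's resolution.
    """
    estimate = 0.0
    width = float(age_range)
    for probs in stage_probs:
        width /= len(probs)  # finer bins at every stage
        estimate += sum(p * i * width for i, p in enumerate(probs))
    return estimate


# Single stage: two equally likely classes centred at 20 and 30 years.
print(expected_age([0.5, 0.5], [20, 30]))

# Two stages of three classes each over a 90-year range.
print(staged_expected_age([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], 90))
```

With few classes per stage, each stage needs only a small output layer, which is the intuition behind the compact model size mentioned in the abstract.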
#1873

Collaborative and Attentive Learning for Personalized Image Aesthetic Assessment
Guolong Wang, Junchi Yan, Zheng Qin

Computer Vision 3

The ever-increasing volume of visual images has stimulated the demand for organizing such data by aesthetic quality. Automatic, and especially learning-based, aesthetic assessment methods have shown potential in recent works. Existing image aesthetic prediction is often user-agnostic, which ignores the fact that the rating of an image can be inherently individual. We fill this gap by formulating the personalized image aesthetic assessment problem with a novel learning method. Specifically, we collect user-image textual reviews in addition to visual images from a public dataset to build a review-augmented benchmark. Using this enriched dataset, we devise a deep neural network with a user/image relation encoding input for collaborative filtering. Meanwhile, an attentive mechanism is designed to capture the user-specific taste for image semantic tags and regions of interest by fusing the image and the user's review. Extensive and promising experimental results on the review-augmented benchmark corroborate the efficacy of our approach.
#3270

Multi-Level Metric Learning via Smoothed Wasserstein Distance
Jie Xu, Lei Luo, Cheng Deng, Heng Huang

Computer Vision 3

Traditional metric learning methods aim to learn a single Mahalanobis distance metric M, which, however, is not discriminative enough to characterize complex and heterogeneous data. Besides, if the descriptors of the data are not strictly aligned, the Mahalanobis distance fails to exploit the relations among them. To tackle these problems, we propose a multi-level metric learning method that uses a smoothed Wasserstein distance to characterize the errors between any two samples, where the ground distance is taken to be a Mahalanobis distance. Since the smoothed Wasserstein distance provides not only a distance value but also a flow network indicating how the probability mass is optimally transported between the bins, it is very effective in comparing two samples whether they are aligned or not. In addition, to make full use of the global and local structures that exist in data features, we model the commonalities among the various classifications through a shared distance matrix and the classification-specific idiosyncrasies with additional auxiliary distance matrices. An efficient algorithm is developed to solve the proposed model. Experimental evaluations on four standard databases show that our method clearly outperforms other state-of-the-art methods.
#2923

Deterministic Binary Filters for Convolutional Neural Networks
Vincent W.-S. Tseng, Sourav Bhattacharya, Javier Fernández Marqués, Milad Alizadeh, Catherine Tong, Nicholas D. Lane

Computer Vision 3

We propose Deterministic Binary Filters, an approach to Convolutional Neural Networks that learns weighting coefficients of a predefined orthogonal binary basis instead of learning the convolutional filters directly, as is conventional. This approach results in model architectures with significantly fewer parameters (4x to 16x) and smaller model sizes (32x due to the use of binary rather than floating point precision). We show our deterministic filter design can be integrated into well-known network architectures (such as ResNet and SqueezeNet) with as little as 2% loss of accuracy (on datasets such as CIFAR-10). On ImageNet, they result in a 3x model size reduction compared to sub-megabyte binary networks while reaching comparable accuracy levels.

Thursday 19 10:25 - 11:40 ML-PRO - Probabilistic Machine Learning (K11)

Chair: Fabio Cozman

#276

Variance Reduction in Black-box Variational Inference by Adaptive Importance Sampling
Ximing Li, Changchun Li, Jinjin Chi, Jihong Ouyang

Probabilistic Machine Learning

Overdispersed black-box variational inference employs importance sampling to reduce the variance of the Monte Carlo gradient in black-box variational inference. A simple overdispersed proposal distribution is used. This paper investigates how to adaptively obtain a better proposal distribution for lower variance. To this end, we directly approximate the theoretically optimal proposal using a Monte Carlo moment matching step at each variational iteration. We call this adaptive proposal the moment matching proposal (MMP). Experimental results on two Bayesian models show that the MMP can effectively reduce variance in black-box learning and performs better than baseline inference algorithms.
#751

Estimating Latent People Flow without Tracking Individuals
Yusuke Tanaka, Tomoharu Iwata, Takeshi Kurashima, Hiroyuki Toda, Naonori Ueda

Probabilistic Machine Learning

Analyzing people flows is important for better navigation and location-based advertising. Since the location information of people is often aggregated for protecting privacy, it is not straightforward to estimate transition populations between locations from aggregated data. Here, aggregated data are incoming and outgoing people counts at each location; they do not contain tracking information of individuals. This paper proposes a probabilistic model for estimating unobserved transition populations between locations from only aggregated data. With the proposed model, temporal dynamics of people flows are assumed to be probabilistic diffusion processes over a network, where nodes are locations and edges are paths between locations. By maximizing the likelihood with flow conservation constraints that incorporate travel duration distributions between locations, our model can robustly estimate transition populations between locations. The statistically significant improvement of our model is demonstrated using real-world datasets of pedestrian data in exhibition halls, bike trip data and taxi trip data in New York City.
#1577

Student-t Variational Autoencoder for Robust Density Estimation
Hiroshi Takahashi, Tomoharu Iwata, Yuki Yamanaka, Masanori Yamada, Satoshi Yagi

Probabilistic Machine Learning

We propose a robust multivariate density estimator based on the variational autoencoder (VAE). The VAE is a powerful deep generative model and is used for multivariate density estimation. In the original VAE, the distribution of observed continuous variables is assumed to be Gaussian, with its mean and variance modeled by deep neural networks that take latent variables as their inputs. This distribution is called the decoder. However, the training of the VAE often becomes unstable. One reason is that the decoder of the VAE is sensitive to the error between the data point and its estimated mean when its estimated variance is almost zero. We solve this instability problem by making the decoder robust to the error using a Bayesian approach to variance estimation: we set a prior for the variance of the Gaussian decoder and marginalize it out analytically, which yields the Student-t VAE. Numerical experiments with various datasets show that training of the Student-t VAE is robust, and that the Student-t VAE achieves high density estimation performance.
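
The marginalization step mentioned above can be made concrete with the standard univariate identity below. It assumes a Gamma prior with shape a and rate b on the precision τ (equivalently, an inverse-Gamma prior on the variance); the abstract does not state which prior the authors use, so this choice and the symbols a, b, ν and λ are assumptions for illustration.

```latex
\begin{aligned}
p(x \mid \mu, a, b)
  &= \int_0^\infty \mathcal{N}\!\left(x \mid \mu, \tau^{-1}\right)
     \mathrm{Gam}(\tau \mid a, b)\, d\tau \\
  &= \frac{b^{a}}{\Gamma(a)} \left(\frac{1}{2\pi}\right)^{1/2}
     \Gamma\!\left(a + \tfrac{1}{2}\right)
     \left[\, b + \frac{(x-\mu)^2}{2} \right]^{-\left(a + \frac{1}{2}\right)} \\
  &= \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\Gamma\!\left(\frac{\nu}{2}\right)}
     \left(\frac{\lambda}{\pi\nu}\right)^{1/2}
     \left[\, 1 + \frac{\lambda (x-\mu)^2}{\nu} \right]^{-\frac{\nu+1}{2}},
  \qquad \nu = 2a,\ \lambda = \frac{a}{b}.
\end{aligned}
```

The last line is a Student-t density. Its heavier tails mean that a large error between a data point and the estimated mean is penalized less sharply than under a Gaussian with near-zero variance, which matches the robustness argument in the abstract.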
#2069

INITIATOR: Noise-contrastive Estimation for Marked Temporal Point Process
Ruocheng Guo, Jundong Li, Huan Liu

Probabilistic Machine Learning

The volume of sequential event data has increased steadily in various high-impact domains such as social media and the sharing economy. When events take place in a sequential fashion, an important question arises: "what type of event will happen at what time in the near future?" To answer this question, a class of mathematical models called the marked temporal point process is often used, as it can model the timing and properties of events seamlessly in a joint framework. Recently, various recurrent neural network (RNN) models have been proposed to enhance the predictive power of the marked temporal point process. However, existing marked temporal point process models are fundamentally based on the Maximum Likelihood Estimation (MLE) framework for training, and inevitably suffer from the problems caused by the intractable likelihood function. Surprisingly, little attention has been paid to this issue. In this work, we propose INITIATOR, a novel training framework based on noise-contrastive estimation, to resolve this problem. Theoretically, we show that there exists a strong connection between the proposed INITIATOR and exact MLE. Experimentally, the efficacy of INITIATOR is demonstrated against state-of-the-art approaches on several real-world datasets from various areas.
#3801

Differentiable Submodular Maximization
Sebastian Tschiatschek, Aytunc Sahin, Andreas Krause

Probabilistic Machine Learning

We consider learning of submodular functions from data. These functions are important in machine learning and have a wide range of applications, e.g. data summarization, feature selection and active learning. Despite their combinatorial nature, submodular functions can be maximized approximately with strong theoretical guarantees in polynomial time. Typically, learning the submodular function and optimization of that function are treated separately, i.e. the function is first learned using a proxy objective and subsequently maximized. In contrast, we show how to perform learning and optimization jointly. By interpreting the output of greedy maximization algorithms as distributions over sequences of items and smoothening these distributions, we obtain a differentiable objective. In this way, we can differentiate through the maximization algorithms and optimize the model to work well with the optimization algorithm. We theoretically characterize the error made by our approach, yielding insights into the tradeoff of smoothness and accuracy. We demonstrate the effectiveness of our approach for jointly learning and optimizing on synthetic maxcut data, and on real world applications such as product recommendation and image collection summarization.
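
As a point of reference for the greedy maximization mentioned above, here is a minimal sketch of the classical greedy algorithm under a cardinality constraint, followed by one possible smoothing of a single greedy step. The coverage objective, the function names, and the softmax-with-temperature smoothing are assumptions chosen for illustration; the abstract does not specify the authors' exact construction.

```python
import math


def coverage(selected, sets):
    """A simple submodular function: number of elements covered by the chosen sets."""
    covered = set()
    for i in selected:
        covered |= sets[i]
    return len(covered)


def greedy_max(sets, k):
    """Classical greedy: repeatedly add the item with the largest marginal gain."""
    selected = []
    for _ in range(min(k, len(sets))):
        base = coverage(selected, sets)
        best, best_gain = None, -1
        for i in range(len(sets)):
            if i in selected:
                continue
            gain = coverage(selected + [i], sets) - base
            if gain > best_gain:
                best, best_gain = i, gain
        selected.append(best)
    return selected


def soft_choice(gains, temperature):
    """Smoothed greedy step (assumed form): a softmax over marginal gains
    instead of a hard argmax, so the choice becomes a distribution."""
    m = max(gains)
    weights = [math.exp((g - m) / temperature) for g in gains]
    total = sum(weights)
    return [w / total for w in weights]


sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1}]
chosen = greedy_max(sets, 2)
print(chosen, coverage(chosen, sets))
print(soft_choice([3, 2, 4, 1], temperature=1.0))
```

A lower temperature concentrates the distribution on the greedy choice, while a higher one makes it smoother, which is one way to picture the tradeoff between smoothness and accuracy that the abstract refers to.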
#4446

Small-Variance Asymptotics for Nonparametric Bayesian Overlapping Stochastic Blockmodels
Gundeep Arora, Anupreet Porwal, Kanupriya Agarwal, Avani Samdariya, Piyush Rai

Probabilistic Machine Learning

The latent feature relational model (LFRM) for graphs represents each node as having binary memberships in one or more communities. The community memberships can be represented in the form of a binary vector, and LFRM defines the link probability between any pair of nodes as a bilinear function of their community membership vectors. Moreover, placing a nonparametric Bayesian prior (the Indian Buffet Process) on the community membership matrix enables the number of communities to be learned automatically from the data. However, despite its modeling flexibility, strong link prediction performance, and the interpretability of its binary embeddings, inference in LFRM remains a challenge and is typically done via MCMC or variational methods. These methods can be slow and may take a long time to converge. In this work, we apply the small-variance asymptotics idea to the nonparametric Bayesian LFRM, utilizing the connection between exponential families and Bregman divergences. This leads to an overlapping k-means-like objective function for the nonparametric Bayesian LFRM, which can be optimized using generic or specialized solvers. We also propose an iterative greedy algorithm to optimize the objective function and compare our approach with other inference methods on several benchmark datasets. Our results demonstrate that our inference algorithm is competitive with methods such as MCMC while being much faster.

Thursday 19 10:25 - 11:40 ML-BDS - Big Data, Scalability (C2)

Chair: Ao Xiang

#2390

Fast Vehicle Identification in Surveillance via Ranked Semantic Sampling Based Embedding
Feng Zheng, Xin Miao, Heng Huang

Big Data, Scalability

Identifying vehicles across cameras in traffic surveillance is fundamentally important for public safety purposes. However, despite some preliminary work, the rapid vehicle search in large-scale datasets has not been investigated. Moreover, modelling a view-invariant similarity between vehicle images from different views is still highly challenging. To address the problems, in this paper, we propose a Ranked Semantic Sampling (RSS) guided binary embedding method for fast cross-view vehicle Re-IDentification (Re-ID). The search can be conducted by efficiently computing similarities in the projected space. Unlike previous methods using random sampling, we design tree-structured attributes to guide the mini-batch sampling. The ranked pairs of hard samples in the mini-batch can improve the convergence of optimization. By minimizing a novel ranked semantic distance loss defined according to the structure, the learned Hamming distance is view-invariant, which enables cross-view Re-ID. The experimental results demonstrate that RSS outperforms the state-of-the-art approaches and the learned embedding from one dataset can be transferred to achieve the task of vehicle Re-ID on another dataset.
#2125

Binary Coding based Label Distribution Learning
Ke Wang, Xin Geng

Big Data, Scalability

Label Distribution Learning (LDL) is a novel learning paradigm in machine learning, which assumes that an instance is labeled by a distribution over all labels, rather than labeled by a logic label or some logic labels. Thus, LDL can model the description degree of all possible labels to an instance. Although many LDL methods have been put forward to deal with different application tasks, most existing methods suffer from the scalability issue. In this paper, a scalable LDL framework named Binary Coding based Label Distribution Learning (BC-LDL) is proposed for large-scale LDL. The proposed framework includes two parts, i.e., binary coding and label distribution generation. In the binary coding part, the learning objective is to generate the optimal binary codes for the instances. We integrate the label distribution information of the instances into a binary coding procedure, leading to high-quality binary codes. In the label distribution generation part, given an instance, the k nearest training instances in the Hamming space are searched and the mean of the label distributions of all the neighboring instances is calculated as the predicted label distribution. Experiments on five benchmark datasets validate the superiority of BC-LDL over several state-of-the-art LDL methods.
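
The label distribution generation step described above (nearest neighbours in Hamming space, then averaging) can be sketched as follows. Binary codes are stored as Python integers and the function names are hypothetical; how the codes are learned is outside the scope of this sketch.

```python
def hamming(a, b):
    """Hamming distance between two binary codes stored as integers."""
    return bin(a ^ b).count("1")


def predict_distribution(query_code, train_codes, train_dists, k):
    """Average the label distributions of the k training instances
    closest to the query in Hamming space."""
    order = sorted(range(len(train_codes)),
                   key=lambda i: hamming(query_code, train_codes[i]))
    neighbors = order[:k]
    n_labels = len(train_dists[0])
    return [sum(train_dists[i][j] for i in neighbors) / len(neighbors)
            for j in range(n_labels)]


train_codes = [0b0000, 0b0001, 0b1111]
train_dists = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
print(predict_distribution(0b0000, train_codes, train_dists, k=2))
```

Because each input distribution sums to one, their mean does as well, so the prediction is itself a valid label distribution.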
#416

Lightweight Label Propagation for Large-Scale Network Data
De-Ming Liang, Yu-Feng Li

Big Data, Scalability

Label propagation spreads soft labels from a few labeled data to a large amount of unlabeled data according to the intrinsic graph structure. Nonetheless, most label propagation solutions work on relatively small-scale data and fail to cope with many real applications, such as social network analysis, where graphs usually have millions of nodes. In this paper, we propose a novel algorithm to deal with large-scale data. A lightweight iterative process derived from the well-known stochastic gradient descent strategy is used to reduce memory overhead and accelerate the solving process. We also give a theoretical analysis of the necessity of the warm-start technique for label propagation. Experiments show that our algorithm can handle million-scale graphs in a few seconds while achieving performance highly competitive with existing algorithms.
#1407

Does Tail Label Help for Large-Scale Multi-Label Learning
Tong Wei, Yu-Feng Li

Big Data, Scalability

Large-scale multi-label learning annotates relevant labels for unseen data from a huge number of candidate labels. It is well known that in large-scale multi-label learning, labels exhibit a long-tail distribution in which a significant fraction of labels are tail labels. Nonetheless, how tail labels affect the performance metrics in large-scale multi-label learning has not been explicitly quantified. In this paper, we show that whether labels are randomly missing or misclassified, tail labels have much less impact than common labels in terms of commonly used performance metrics (Top-$k$ precision and nDCG@$k$). Based on this observation, we develop a low-complexity large-scale multi-label learning algorithm that facilitates fast prediction and compact models by trimming tail labels adaptively. Experiments clearly verify that both the prediction time and the model size are significantly reduced without sacrificing much predictive performance for state-of-the-art approaches.
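
For readers unfamiliar with the two metrics named above, the following sketch computes them for a single test instance, using the definitions common in the large-scale multi-label literature. The function names and the toy scores are illustrative assumptions, not taken from the paper.

```python
import math


def top_k(scores, k):
    """Indices of the k highest-scoring labels."""
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]


def precision_at_k(scores, relevant, k):
    """Fraction of the top-k predicted labels that are truly relevant."""
    return sum(1 for i in top_k(scores, k) if i in relevant) / k


def ndcg_at_k(scores, relevant, k):
    """Discounted gain of the top-k labels, normalized by the best achievable value."""
    dcg = sum(1.0 / math.log2(rank + 2)
              for rank, i in enumerate(top_k(scores, k)) if i in relevant)
    ideal = sum(1.0 / math.log2(rank + 2)
                for rank in range(min(k, len(relevant))))
    return dcg / ideal if ideal > 0 else 0.0


scores = [0.9, 0.8, 0.1, 0.3]   # predicted relevance of four labels
relevant = {0, 3}               # ground-truth label set
print(precision_at_k(scores, relevant, 2))
print(ndcg_at_k(scores, relevant, 2))
```

Both metrics only look at the k highest-ranked labels, and these tend to be frequent labels, which is consistent with the abstract's finding that tail labels have limited influence on them.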
#1238

Distributed Primal-Dual Optimization for Non-uniformly Distributed Data
Minhao Cheng, Cho-Jui Hsieh

Big Data, Scalability

Distributed primal-dual optimization has received much attention in the past few years. In this framework, training samples are stored on multiple machines. At each round, all the machines conduct a sequence of updates based on their local data, and then the local updates are synchronized and merged to obtain the update to the global model. All previous approaches merge the local updates by averaging them with a uniform weight. However, in many real-world applications data are not uniformly distributed across machines, so a uniform weight is inadequate to capture the heterogeneity of local updates. To resolve this issue, we propose a better way to merge local updates in the primal-dual optimization framework. Instead of using a single weight for all the local updates, we develop a computationally efficient algorithm to automatically choose the optimal weight for each machine. Furthermore, we propose an efficient way to estimate the duality gap of the merged update by exploiting the structure of the objective function, which leads to an efficient line search algorithm based on the reduction of the duality gap. Combining these two ideas, our algorithm is much faster and more scalable than existing methods on real-world problems.
#330

Real-time Traffic Pattern Analysis and Inference with Sparse Video Surveillance Information
Yang Wang, Yiwei Xiao, Xike Xie, Ruoyu Chen, Hengchang Liu

Big Data, Scalability

Recent advances in video surveillance systems enable a new paradigm for intelligent urban traffic management systems. Since surveillance cameras are usually sparsely located to cover key regions of the road under surveillance, it is a big challenge to perform a complete real-time traffic pattern analysis based on incomplete, sparse surveillance information. As a result, existing works mostly focus on predicting traffic volumes with historical records available at a particular location and may not provide a complete picture of real-time traffic patterns. To this end, in this paper, we go beyond existing works and tackle the challenges of traffic flow analysis from three perspectives. First, we train the transition probabilities to capture vehicles' movement patterns. The transition probabilities are trained from third-party vehicle GPS data, and thus can work in areas where there is no camera. Second, we exploit the Multivariate Normal Distribution model together with the transferred probabilities to estimate the unobserved traffic patterns. Third, we propose an algorithm for real-time traffic inference with surveillance as a complementary source of information. Finally, experiments on real-world data show the effectiveness of our approach.

Thursday 19 10:25 - 11:40 ML-DRM - Dimensionality Reduction and Manifold Learning (C3)

Chair: Wenjie Ruan

#760

Adversarial Metric Learning
Shuo Chen, Chen Gong, Jian Yang, Xiang Li, Yang Wei, Jun Li

Dimensionality Reduction and Manifold Learning

In the past decades, intensive efforts have been devoted to designing various loss functions and metric forms for the metric learning problem. These improvements have shown promising results when the test data is similar to the training data. However, the trained models often fail to produce reliable distances on ambiguous test pairs because the training set and test set are sampled differently. To address this problem, Adversarial Metric Learning (AML) is proposed in this paper, which automatically generates adversarial pairs to remedy the sampling bias and facilitate robust metric learning. Specifically, AML consists of two adversarial stages, i.e., confusion and distinguishment. In the confusion stage, ambiguous but critical adversarial data pairs are adaptively generated to mislead the learned metric. In the distinguishment stage, a metric is exhaustively learned to distinguish both adversarial pairs and original training pairs as well as possible. Thanks to the challenges posed by the confusion stage in this competing process, the AML model is able to capture difficult knowledge that is not contained in the original training pairs, so the discriminability of AML can be significantly improved. The entire model is formulated as an optimization framework, whose global convergence is theoretically proved. The experimental results on toy data and practical datasets clearly demonstrate the superiority of AML over representative state-of-the-art metric learning models.
#957

Dynamic Hypergraph Structure Learning
Zizhao Zhang, Haojie Lin, Yue Gao

Dimensionality Reduction and Manifold Learning

In recent years, hypergraph modeling has shown its superiority in formulating correlations among samples and has found wide application in classification, retrieval, and other tasks. In all these works, the performance of hypergraph learning depends heavily on the generated hypergraph structure. A good hypergraph structure can represent the data correlation better, and vice versa. Although hypergraph learning has attracted much attention recently, most existing works still rely on a static hypergraph structure, and little effort has been devoted to optimizing the hypergraph structure during the learning process. To tackle this problem, we propose a dynamic hypergraph structure learning method in this paper. In this method, given the originally generated hypergraph structure, the objective is to simultaneously optimize the label projection matrix (the common task in hypergraph learning) and the hypergraph structure itself. More specifically, in this formulation, the label projection matrix is related to the hypergraph structure, and the hypergraph structure is associated with the data correlation from both the label space and the feature space. Here, we alternately learn the optimal label projection matrix and the hypergraph structure, leading to a dynamic hypergraph structure during the learning process. We have applied the proposed method to the tasks of 3D shape recognition and gesture recognition. Experimental results on 4 public datasets show better performance compared with the state-of-the-art methods. We note that the proposed method can be further applied to other tasks.
#4109

Robust Graph Dimensionality Reduction
Xiaofeng Zhu, Cong Lei, Hao Yu, Yonggang Li, Jiangzhang Gan, Shichao Zhang

Dimensionality Reduction and Manifold Learning

In this paper, we propose conducting Robust Graph Dimensionality Reduction (RGDR) by learning a transformation matrix to map original high-dimensional data into their low-dimensional intrinsic space without the influence of outliers. To do this, we propose simultaneously 1) adaptively learning three variables, i.e., a reverse graph embedding of the original data, a transformation matrix, and a graph matrix preserving the local similarity of the original data in their low-dimensional intrinsic space; and 2) employing robust estimators to prevent outliers from being involved in the optimization of these three matrices. As a result, the original data are cleaned by two strategies, i.e., a prediction of the original data based on the three resulting variables and robust estimators, so that the transformation matrix can be learnt from an accurately estimated intrinsic space with the help of the reverse graph embedding and the graph matrix. Moreover, we propose a new optimization algorithm for the resulting objective function and theoretically prove its convergence. Experimental results indicate that our proposed method outperforms all the comparison methods on different classification tasks.
#2363

SDMCH: Supervised Discrete Manifold-Embedded Cross-Modal Hashing
Xin Luo, Xiao-Ya Yin, Liqiang Nie, Xuemeng Song, Yongxin Wang, Xin-Shun Xu

Dimensionality Reduction and Manifold Learning

Cross-modal hashing methods have attracted considerable attention. Most pioneering approaches only preserve the neighborhood relationship by constructing the correlations among heterogeneous modalities. However, they neglect the fact that high-dimensional data often lie on a low-dimensional manifold embedded in the ambient space and that the relative proximity between neighbors is also important. Although some methods leverage manifold learning to generate the hash codes, most of them fail to explicitly explore the discriminative information in the class labels and discard the binary constraints during optimization, generating large quantization errors. To address these issues, in this paper, we present a novel cross-modal hashing method, named Supervised Discrete Manifold-Embedded Cross-Modal Hashing (SDMCH). It can not only exploit the non-linear manifold structure of data and construct the correlation among heterogeneous multiple modalities, but also fully utilize the semantic information. Moreover, the hash codes can be generated discretely by an iterative optimization algorithm, which avoids large quantization errors. Extensive experimental results on three benchmark datasets demonstrate that SDMCH outperforms ten state-of-the-art cross-modal hashing methods.
#2373

Towards Generalized and Efficient Metric Learning on Riemannian Manifold
Pengfei Zhu, Hao Cheng, Qinghua Hu, Qilong Wang, Changqing Zhang

Dimensionality Reduction and Manifold Learning

Modeling data as points on a non-linear Riemannian manifold has attracted increasing attention in many computer vision tasks, especially visual recognition. Learning an appropriate metric on a Riemannian manifold plays a key role in achieving promising performance. For the widely used symmetric positive definite (SPD) manifold and Grassmann manifold, most existing metric learning methods are designed for one manifold and do not carry over straightforwardly to the other. Furthermore, optimization in previous methods usually relies on computationally expensive iterations. To address these limitations, this paper proposes a generalized and efficient Riemannian manifold metric learning (RMML) method, which can be flexibly adapted to both SPD and Grassmann manifolds. By minimizing the geodesic distance of similar pairs and the interpoint geodesic distance of dissimilar ones on nonlinear manifolds, the proposed RMML is optimized by computing the geodesic mean between the inverse of the similarity matrix and the dissimilarity matrix, which yields a global closed-form solution and high efficiency. Experiments are conducted on various visual recognition tasks, and the results demonstrate that RMML performs favorably against its counterparts in terms of both accuracy and efficiency.
#2892

Spectral Feature Scaling Method for Supervised Dimensionality Reduction
Momo Matsuda, Keiichi Morikuni, Tetsuya Sakurai

Dimensionality Reduction and Manifold Learning

Spectral dimensionality reduction methods enable linear separations of complex data with high-dimensional features in a reduced space. However, these methods do not always give the desired results due to irregularities or uncertainties of the data. Thus, we consider aggressively modifying the scales of the features to obtain the desired classification. Using prior knowledge on the labels of partial samples to specify the Fiedler vector, we formulate an eigenvalue problem of a linear matrix pencil whose eigenvector has the feature scaling factors. The resulting factors can modify the features of entire samples to form clusters in the reduced space, according to the known labels. In this study, we propose new dimensionality reduction methods supervised using the feature scaling associated with the spectral clustering. Numerical experiments show that the proposed methods outperform well-established supervised methods for toy problems with more samples than features, and are more robust regarding clustering than existing methods. Also, the proposed methods outperform existing methods regarding classification for real-world problems with more features than samples of gene expression profiles of cancer diseases. Furthermore, the feature scaling tends to improve the clustering and classification accuracies of existing unsupervised methods, as the proportion of training data increases.

Thursday 19 10:25 - 12:45 Industry Day (A4)

Industry Day - Session 1b

Industry Day

Thursday 19 11:40 - 12:45 NLP-PTS - Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech (T2)

Chair: Jie Liu

#3001

Learning Tag Dependencies for Sequence Tagging
Yuan Zhang, Hongshen Chen, Yihong Zhao, Qun Liu, Dawei Yin

Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech

Sequence tagging is the basis for multiple applications in natural language processing. Despite successes in learning long-term token sequence dependencies with neural networks, tag dependencies have rarely been considered. Sequence tagging actually involves complex dependencies and interactions among the input tokens and the output tags. We propose a novel multi-channel model, which handles different ranges of token-tag dependencies and their interactions simultaneously. A tag LSTM is added to manage the output tag dependencies and word-tag interactions, while three mechanisms are presented to efficiently incorporate token context representation and tag dependency. Extensive experiments on part-of-speech tagging and named entity recognition tasks show that the proposed model outperforms the BiLSTM-CRF baseline by effectively incorporating the tag dependency feature.
#2506

Neural Networks Incorporating Unlabeled and Partially-labeled Data for Cross-domain Chinese Word Segmentation
Lujun Zhao, Qi Zhang, Peng Wang, Xiaoyu Liu

Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech

Most existing Chinese word segmentation (CWS) methods are supervised. Hence, large-scale annotated domain-specific datasets are needed for training. In this paper, we seek to address the problem of CWS for resource-poor domains that lack annotated data. A novel neural network model is proposed to incorporate unlabeled and partially-labeled data. To make use of unlabeled data, we combine a bidirectional LSTM segmentation model with two character-level language models using a gate mechanism. These language models can capture co-occurrence information. To make use of partially-labeled data, we modify the original cross-entropy loss function of the RNN. Experimental results demonstrate that the method performs well on CWS tasks in a series of domains.
#2422

Weakly Supervised Audio Source Separation via Spectrum Energy Preserved Wasserstein Learning
Ning Zhang, Junchi Yan, Yuchen Zhou

Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech

Separating audio mixtures into individual instrument tracks has been a standing challenge. We introduce a novel weakly supervised audio source separation approach based on deep adversarial learning. Specifically, our loss function adopts the Wasserstein distance, which directly measures the distribution distance between the separated sources and the real sources for each individual source. Moreover, a global regularization term is added to fulfill the spectrum energy preservation property regardless of separation. Unlike state-of-the-art weakly supervised models, which often involve deliberately devised constraints or careful model selection, our approach needs little prior model specification on the data, and can be straightforwardly learned in an end-to-end fashion. We show that the proposed method performs competitively on a public benchmark against state-of-the-art weakly supervised methods.
#2795

Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation
Jing Shi, Jiaming Xu, Guangcan Liu, Bo Xu

Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech

Recent deep learning methods have made significant progress in multi-talker mixed speech separation. However, most existing models adopt a driftless strategy to separate all the speech channels rather than selectively attend to the target one. As a result, those frameworks may fail to offer a satisfactory solution in complex auditory scenes where the number of input sounds is usually uncertain and even dynamic. In this paper, we present a novel neural network based structure motivated by the top-down attention behavior of humans when facing complicated acoustic scenes. Different from previous works, our method constructs an inference-attention structure to predict candidates of interest and extract the speech channel of each of them. Our work removes the limitation that the number of channels must be given, and avoids the high computational complexity of the label permutation problem. We evaluated our model on the WSJ0 mixed-speech tasks. In all the experiments, our model is highly competitive, matching and even outperforming the baselines.
#4073

Attention-Fused Deep Matching Network for Natural Language Inference
Chaoqun Duan, Lei Cui, Xinchi Chen, Furu Wei, Conghui Zhu, Tiejun Zhao

Natural Language Processing: Parsing, Tagging, Word Segmentation, Speech

Natural language inference aims to predict whether a hypothesis sentence can be inferred from a premise sentence. Recent progress on this task relies only on a shallow interaction between sentence pairs, which is insufficient for modeling complex relations. In this paper, we present an attention-fused deep matching network (AF-DMN) for natural language inference. Unlike existing models, AF-DMN takes two sentences as input and iteratively learns the attention-aware representations for each side by multi-level interactions. Moreover, we add a self-attention mechanism to fully exploit local context information within each sentence. Experimental results show that AF-DMN achieves state-of-the-art performance and outperforms strong baselines on the Stanford natural language inference (SNLI), multi-genre natural language inference (MultiNLI), and Quora duplicate questions datasets.

Thursday 19 11:45 - 12:45 SUR-AE - Survey Track: Agents and Ethics (VICTORIA)

Chair: Nicolas Maudet

#5403

Building Ethics into Artificial Intelligence
Han Yu, Zhiqi Shen, Chunyan Miao, Cyril Leung, Victor R. Lesser, Qiang Yang

Survey Track: Agents and Ethics

As artificial intelligence (AI) systems become increasingly ubiquitous, the topic of AI governance for ethical decision-making by AI has captured public imagination. Within the AI research community, this topic remains less familiar to many researchers. In this paper, we complement existing surveys, which largely focused on the psychological, social and legal discussions of the topic, with an analysis of recent advances in technical solutions for AI governance. By reviewing publications in leading AI conferences including AAAI, AAMAS, ECAI and IJCAI, we propose a taxonomy which divides the field into four areas: 1) exploring ethical dilemmas; 2) individual ethical decision frameworks; 3) collective ethical decision frameworks; and 4) ethics in human-AI interactions. We highlight the intuitions and key techniques used in each approach, and discuss promising future research directions towards successful integration of ethical AI systems into human societies.
#5409

AGI Safety Literature Review
Tom Everitt, Gary Lea, Marcus Hutter

Survey Track: Agents and Ethics

The development of Artificial General Intelligence (AGI) promises to be a major event. Along with its many potential benefits, it also raises serious safety concerns. The intention of this paper is to provide an easily accessible and up-to-date collection of references for the emerging field of AGI safety. A significant number of safety problems for AGI have been identified. We list these, and survey recent research on solving them. We also cover works on how best to think of AGI from the limited knowledge we have today, predictions for when AGI will first be created, and what will happen after its creation. Finally, we review the current public policy on AGI.
#5435

Stackelberg Security Games: Looking Beyond a Decade of Success
Arunesh Sinha, Fei Fang, Bo An, Christopher Kiekintveld, Milind Tambe

Survey Track: Agents and Ethics

The Stackelberg Security Game (SSG) model has been immensely influential in security research since it was introduced roughly a decade ago. Furthermore, deployed SSG-based applications are among the most successful examples of game theory applications in the real world. We present a broad survey of recent technical advances in SSG and related literature, and then look to the future by highlighting the new potential applications and open research problems in SSG.
#5400

Autonomously Reusing Knowledge in Multiagent Reinforcement Learning
Felipe Leno Da Silva, Matthew E. Taylor, Anna Helena Reali Costa

Survey Track: Agents and Ethics

Autonomous agents are increasingly required to solve complex tasks; hard-coding behaviors has become infeasible. Hence, agents must learn how to solve tasks via interactions with the environment. In many cases, knowledge reuse will be a core technology to keep training times reasonable, and for that, agents must be able to autonomously and consistently reuse knowledge from multiple sources, including both their own previous internal knowledge and knowledge from other agents. In this paper, we provide a literature review of methods for knowledge reuse in Multiagent Reinforcement Learning. We define an important challenge problem for the AI community, survey the existing methods, and discuss how they can all contribute to this challenging problem. Moreover, we highlight gaps in the current literature, motivating "low-hanging fruit" for those interested in the area. Our ambition is that this paper will encourage the community to work on this difficult and relevant research challenge.

Thursday 19 11:45 - 12:45 KR-ML - Knowledge Representation and Learning (C7)

Chair: Steven Schockaert

#972

MASTER: across Multiple social networks, integrate Attribute and STructure Embedding for Reconciliation
Sen Su, Li Sun, Zhongbao Zhang, Gen Li, Jielun Qu

Knowledge Representation and Learning

Recently, reconciling social networks has received significant attention. Most of the existing studies have limitations in the following three aspects: multiplicity, comprehensiveness and robustness. To address these three limitations, we rethink this problem and propose the MASTER framework, i.e., across Multiple social networks, integrate Attribute and STructure Embedding for Reconciliation. In this framework, we first design a novel Constrained Dual Embedding model by simultaneously embedding and reconciling multiple social networks to formulate our problem into a unified optimization. To address this optimization, we then design an effective algorithm called NS-Alternating. We also prove that this algorithm converges to KKT points. Through extensive experiments on real-world datasets, we demonstrate that MASTER outperforms the state-of-the-art approaches.
#1506

A^3NCF: An Adaptive Aspect Attention Model for Rating Prediction
Zhiyong Cheng, Ying Ding, Xiangnan He, Lei Zhu, Xuemeng Song, Mohan Kankanhalli

Knowledge Representation and Learning

Current recommender systems consider the various aspects of items for making accurate recommendations. Different users place different importance on these aspects, which can be thought of as a preference/attention weight vector. Most existing recommender systems assume that for an individual, this vector is the same for all items. However, this assumption is often invalid, especially when considering a user's interactions with items of diverse characteristics. To tackle this problem, in this paper, we develop a novel aspect-aware recommender model named A^3NCF, which can capture the varying aspect attentions that a user pays to different items. Specifically, we design a new topic model to extract user preferences and item characteristics from review texts. They are then used to 1) guide the representation learning of users and items, and 2) capture a user's special attention on each aspect of the targeted item with an attention network. Through extensive experiments on several large-scale datasets, we demonstrate that our model outperforms the state-of-the-art review-aware recommender systems in the rating prediction task.
#3836

Behavior of Analogical Inference w.r.t. Boolean Functions
Miguel Couceiro, Nicolas Hug, Henri Prade, Gilles Richard

Knowledge Representation and Learning

It has been observed that a particular form of analogical inference, based on analogical proportions, yields competitive results in classification tasks. Using the algebraic normal form of Boolean functions, it has been shown that analogical prediction is always exact iff the labeling function is affine. We point out that affine functions are also meaningful when using another view of analogy. We address the accuracy of analogical inference for arbitrary Boolean functions and show that if a function is epsilon-close to an affine function, then the probability of making a wrong prediction is upper bounded by 4 epsilon. This result is confirmed by an empirical study showing that the upper bound is tight. It highlights the specificity of analogical inference, also characterized in terms of the Hamming distance.
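
The exactness claim for affine functions can be checked by brute force on small inputs. The sketch below is illustrative only and is not taken from the paper: it assumes the usual Boolean reading of an analogical proportion a : b :: c : d (the equation a : b :: c : x is solvable only when a = b or a = c), and the names `solve_proportion` and `analogical_error_rate` are hypothetical.

```python
from itertools import product

def solve_proportion(a, b, c):
    """Solve the Boolean analogical equation a : b :: c : x.

    Returns the unique x, or None when no x satisfies the proportion.
    """
    if a == b:
        return c
    if a == c:
        return b
    return None

def analogical_error_rate(f, n):
    """Fraction of wrong analogical predictions of f over {0,1}^n."""
    wrong = total = 0
    points = list(product((0, 1), repeat=n))
    for a, b, c in product(points, repeat=3):
        # Complete the proportion componentwise to obtain the query point d.
        d = tuple(solve_proportion(x, y, z) for x, y, z in zip(a, b, c))
        if None in d:
            continue  # no d completes the proportion on every component
        pred = solve_proportion(f(a), f(b), f(c))
        if pred is None:
            continue  # the three labels do not yield a prediction
        total += 1
        wrong += pred != f(d)
    return wrong / total

# Assumed example functions (not from the paper).
affine = lambda x: 1 ^ x[0] ^ x[2]   # x0 XOR x2 XOR 1, an affine function
conj = lambda x: x[0] & x[1]         # x0 AND x1, not affine

print(analogical_error_rate(affine, 3))  # 0.0: every prediction is exact
print(analogical_error_rate(conj, 2))    # strictly positive
```

For the affine function every prediction is exact because f(d) = f(a) XOR f(b) XOR f(c) whenever d completes the proportion; the conjunction fails, for instance, on a = 00, b = 01, c = 10, d = 11.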
#1415

Scalable Rule Learning via Learning Representation
Pouya Ghiasnezhad Omran, Kewen Wang, Zhe Wang

Knowledge Representation and Learning

We study the problem of learning first-order rules from large Knowledge Graphs (KGs). With recent advances in information extraction, vast data repositories in the KG format, such as Freebase and YAGO, have been obtained. However, traditional techniques for rule learning are not scalable for KGs. This paper presents a new approach, RLvLR, to learning rules from KGs by using the technique of embedding in representation learning together with a new sampling method. Experimental results show that our system outperforms some state-of-the-art systems. Specifically, for massive KGs with hundreds of predicates and over 10M facts, RLvLR is much faster and can learn many more quality rules than major systems for rule learning in KGs such as AMIE+. We also used the RLvLR-mined rules in an inference module to carry out the link prediction task. In this task, RLvLR outperformed Neural LP, a state-of-the-art link prediction system, in both runtime and accuracy.
#1098

Incomplete Multi-View Weak-Label Learning
Qiaoyu Tan, Guoxian Yu, Carlotta Domeniconi, Jun Wang, Zili Zhang

Knowledge Representation and Learning

Learning from multi-view multi-label data has wide applications. There are two main challenges in this learning task: incomplete views and missing (weak) labels. The former assumes that views may not include all data objects. The weak label setting implies that only a subset of relevant labels are provided for training objects while other labels are missing. Both incomplete views and weak labels can lead to significant performance degradation. In this paper, we propose a novel model (iMVWL) to jointly address the two challenges. iMVWL simultaneously learns a shared subspace from incomplete views with weak labels, the local label structure and the predictor in this subspace, which can capture not only cross-view relationships but also weak-label information of training samples. We further develop an alternative solution to optimize our model; this solution can avoid suboptimal results and reinforce their reciprocal effects, and thus further improve the performance. Extensive experimental results on several real-world datasets validate the effectiveness of our model against other competitive algorithms.

Thursday 19 11:45 - 12:45 EurAI Dissertation Award (C8)

Florian Pommerening

EurAI Dissertation Award

Thursday 19 11:45 - 12:45 MUL-MH - On Machines and Humans (K2)

Chair: Xiangyuan Lan

#3986

Dynamically Forming a Group of Human Forecasters and Machine Forecaster for Forecasting Economic Indicators
Takahiro Miyoshi, Shigeo Matsubara

On Machines and Humans

How can human forecasts and a machine forecast be combined in inflation forecast tasks? A machine-learning-based forecaster makes a forecast based on a statistical model constructed from past time-series data, while humans take varied information such as economic policies into account. Combination methods for different forecasts, such as ensemble and consensus methods, have been studied. These methods, however, always use the same manner of combination regardless of the situation (input), which makes it difficult to use the advantages of different types of forecasters. To overcome this drawback, we propose an ensemble method for estimating the expected error of a machine forecast and dynamically determining the optimal number of humans included in the ensemble. We evaluated the proposed method using seven datasets on U.S. inflation and confirmed that it attained the highest forecast accuracy for four datasets and the same accuracy as the best traditional method for two datasets.
#5473

(Journal track) Viewpoint: Artificial Intelligence and Labour
Spyridon Samothrakis

On Machines and Humans

The welfare of modern societies has been intrinsically linked to wage labour. With some exceptions, the modern human has to sell her labour-power to be able to reproduce biologically and socially. Thus, a lingering fear of technological unemployment features predominately as a theme among Artificial Intelligence researchers. In this short paper we show that, if past trends are anything to go by, this fear is irrational. On the contrary, we argue that the main problem humanity will be facing is the normalisation of extremely long working hours.
#5131

(Sister Conferences Best Papers Track) Bridging the Gap Between Theory and Practice in Influence Maximization: Raising Awareness about HIV among Homeless Youth
Amulya Yadav, Bryan Wilder, Eric Rice, Robin Petering, Jaih Craddock, Amanda Yoshioka-Maxwell, Mary Hemler, Laura Onasch-Vera, Milind Tambe, Darlene Woo

On Machines and Humans

This paper reports on results obtained by deploying HEALER and DOSIM (two AI agents for social influence maximization) in the real world, where they assist service providers in maximizing HIV awareness in real-world homeless-youth social networks. These agents recommend key "seed" nodes in social networks, i.e., homeless youth who would maximize HIV awareness in their real-world social network. While prior research on these agents published promising simulation results from the lab, the usability of these AI agents in the real world was unknown. This paper presents results from three real-world pilot studies involving 173 homeless youth across two different homeless shelters in Los Angeles. The results from these pilot studies illustrate that HEALER and DOSIM outperform the current modus operandi of service providers by ~160% in terms of information spread about HIV among homeless youth.

Thursday 19 11:45 - 12:45 CV-AR - Action Recognition (T1)

Chair: Sheng Tang

#3153

Exploiting Images for Video Recognition with Hierarchical Generative Adversarial Networks
Feiwu Yu, Xinxiao Wu, Yuchao Sun, Lixin Duan

Action Recognition

Existing deep learning methods of video recognition usually require a large number of labeled videos for training. But for a new task, videos are often unlabeled and it is also time-consuming and labor-intensive to annotate them. Instead of human annotation, we try to make use of existing fully labeled images to help recognize those videos. However, due to the problem of domain shifts and heterogeneous feature representations, the performance of classifiers trained on images may be dramatically degraded for video recognition tasks. In this paper, we propose a novel method, called Hierarchical Generative Adversarial Networks (HiGAN), to enhance recognition in videos (i.e., target domain) by transferring knowledge from images (i.e., source domain). The HiGAN model consists of a low-level conditional GAN and a high-level conditional GAN. By taking advantage of this two-level adversarial learning, our method is capable of learning a domain-invariant feature representation of source images and target videos. Comprehensive experiments on two challenging video recognition datasets (i.e., UCF101 and HMDB51) demonstrate the effectiveness of the proposed method when compared with the existing state-of-the-art domain adaptation methods.
#599

Memory Attention Networks for Skeleton-based Action Recognition
Chunyu Xie, Ce Li, Baochang Zhang, Chen Chen, Jungong Han, Jianzhuang Liu

Action Recognition

The skeleton-based action recognition task is entangled with complex spatio-temporal variations of skeleton joints, and remains challenging for Recurrent Neural Networks (RNNs). In this work, we propose a temporal-then-spatial recalibration scheme to alleviate such complex variations, resulting in end-to-end Memory Attention Networks (MANs), which consist of a Temporal Attention Recalibration Module (TARM) and a Spatio-Temporal Convolution Module (STCM). Specifically, the TARM is deployed in a residual learning module that employs a novel attention learning network to recalibrate the temporal attention of frames in a skeleton sequence. The STCM treats the attention-calibrated skeleton joint sequences as images and leverages Convolutional Neural Networks (CNNs) to further model the spatial and temporal information of skeleton data. These two modules (TARM and STCM) seamlessly form a single network architecture that can be trained in an end-to-end fashion. MANs significantly boost the performance of skeleton-based action recognition and achieve the best results on four challenging benchmark datasets: NTU RGB+D, HDM05, SYSU-3D and UT-Kinect.
#1431

Deeply-Supervised CNN Model for Action Recognition with Trainable Feature Aggregation
Yang Li, Kan Li, Xinxin Wang

Action Recognition

In this paper, we propose a deeply-supervised CNN model for action recognition that fully exploits powerful hierarchical features of CNNs. In this model, we build multi-level video representations by applying our proposed aggregation module at different convolutional layers. Moreover, we train this model in a deep supervision manner, which brings improvement in both performance and efficiency. Meanwhile, in order to capture the temporal structure as well as preserve more details about actions, we propose a trainable aggregation module. It models the temporal evolution of each spatial location and projects them into a semantic space using the Vector of Locally Aggregated Descriptors (VLAD) technique. This deeply-supervised CNN model integrating the powerful aggregation module provides a promising solution to recognize actions in videos. We conduct experiments on two action recognition datasets: HMDB51 and UCF101. Results show that our model outperforms the state-of-the-art methods.
#1767

Uncertainty Sampling for Action Recognition via Maximizing Expected Average Precision
Hanmo Wang, Xiaojun Chang, Lei Shi, Yi Yang, Yi-Dong Shen

Action Recognition

Recognizing human actions in video clips has been an important topic in computer vision. Sufficient labeled data is one of the prerequisites for the good performance of action recognition algorithms. However, while abundant videos can be collected from the Internet, categorizing each video clip is tedious and time-consuming. Active learning is one way to alleviate the labeling labor by allowing the classifier to choose the most informative unlabeled instances for manual annotation. Among various active learning algorithms, uncertainty sampling is arguably the most widely used strategy. Conventional uncertainty sampling strategies such as entropy-based methods are usually evaluated in terms of accuracy. However, in action recognition, Average Precision (AP) is an acknowledged evaluation metric, which is somewhat ignored in the active learning community. It is defined as the area under the precision-recall curve. In this paper, we propose a novel uncertainty sampling algorithm for action recognition using expected AP. We conduct experiments on three real-world action recognition datasets and show that our algorithm outperforms other uncertainty-based active learning algorithms.
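
As a reminder of the metric itself, the sketch below computes non-interpolated AP for one ranked list, a common finite-sum form of the area under the precision-recall curve. It illustrates the evaluation metric only, not the paper's expected-AP sampling algorithm; the function name and the convention of returning 0.0 when there are no positives are assumptions made for the example.

```python
def average_precision(scores, labels):
    """Non-interpolated AP: mean of precision@k over the ranks k of the positives.

    scores: classifier confidences, higher means more likely positive.
    labels: ground-truth labels, 1 for positive and 0 for negative.
    """
    # Rank instances by decreasing score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / k  # precision at the rank of this positive
    # Assumed convention: AP is 0.0 when the list contains no positives.
    return precision_sum / hits if hits else 0.0

# Positives at ranks 1 and 3: AP = (1/1 + 2/3) / 2 = 5/6
print(average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Unlike accuracy, this value depends on the whole ranking: swapping a positive with a higher-ranked negative raises AP even when no thresholded prediction changes.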
#2687

Co-occurrence Feature Learning from Skeleton Data for Action Recognition and Detection with Hierarchical Aggregation
Chao Li, Qiaoyong Zhong, Di Xie, Shiliang Pu

Action Recognition

Skeleton-based human action recognition has recently drawn increasing attention with the availability of large-scale skeleton datasets. The most crucial factors for this task lie in two aspects: the intra-frame representation for joint co-occurrences and the inter-frame representation for skeletons' temporal evolutions. In this paper we propose an end-to-end convolutional co-occurrence feature learning framework. The co-occurrence features are learned with a hierarchical methodology, in which different levels of contextual information are aggregated gradually. First, point-level information of each joint is encoded independently. Then they are assembled into semantic representation in both spatial and temporal domains. Specifically, we introduce a global spatial aggregation scheme, which is able to learn superior joint co-occurrence features over local aggregation. Besides, raw skeleton coordinates as well as their temporal difference are integrated with a two-stream paradigm. Experiments show that our approach consistently outperforms other state-of-the-art methods on action recognition and detection benchmarks such as NTU RGB+D, SBU Kinect Interaction and PKU-MMD.

Thursday 19 11:45 - 12:45 ML-ULA - Unsupervised Learning and Applications (K11)

Chair: Johan Schubert

#2603

A Fast and Accurate Method for Estimating People Flow from Spatiotemporal Population Data
Yasunori Akagi, Takuya Nishimura, Takeshi Kurashima, Hiroyuki Toda

Unsupervised Learning and Applications

Real-time spatiotemporal population data is attracting a great deal of attention for understanding crowd movements in cities. The data is the aggregation of personal location information and consists of just areas and the number of people in each area at certain time instants. Accordingly, it does not explicitly represent crowd movement. This paper proposes a probabilistic model based on collective graphical models that can estimate crowd movement from spatiotemporal population data. There are two technical challenges: (i) poor estimation accuracy as the traditional approach means the model would have too many degrees of freedom, (ii) excessive computation cost. Our key idea for overcoming these two difficulties is to model the transition probability between grid cells (cells hereafter) in a geospatial grid space by using three factors: departure probability of cells, gathering score of cells, and geographical distance between cells. These advances enable us to reduce the degrees of freedom of the model appropriately and derive an efficient estimation algorithm. To evaluate the performance of our method, we conduct experiments using real-world spatiotemporal population data. The results confirm the effectiveness of our method, both in estimation accuracy and computation cost.
#1221

A Joint Learning Approach to Intelligent Job Interview Assessment
Dazhong Shen, Hengshu Zhu, Chen Zhu, Tong Xu, Chao Ma, Hui Xiong

Unsupervised Learning and Applications

The job interview is considered one of the most essential tasks in talent recruitment, which forms a bridge between candidates and employers in fitting the right person for the right job. While substantial efforts have been made on improving the job interview process, it is inevitable to have biased or inconsistent interview assessment due to the subjective nature of the traditional interview process. To this end, in this paper, we propose a novel approach to intelligent job interview assessment by learning from large-scale real-world interview data. Specifically, we develop a latent variable model named Joint Learning Model on Interview Assessment (JLMIA) to jointly model job description, candidate resume and interview assessment. JLMIA can effectively learn the representative perspectives of different job interview processes from the successful job application records in history. Therefore, a variety of applications in job interviews can be enabled, such as person-job fit and interview question recommendation. Extensive experiments conducted on real-world data clearly validate the effectiveness of JLMIA, which can lead to substantially less bias in job interviews and provide a valuable understanding of job interview assessment.
#3184

DyNMF: Role Analytics in Dynamic Social Networks
Yulong Pei, Jianpeng Zhang, George Fletcher, Mykola Pechenizkiy

Unsupervised Learning and Applications

Roles of nodes in a social network (SN) represent their functions, responsibilities or behaviors within the SN. Roles typically evolve over time, making role analytics a challenging problem. Previous studies either neglect role transition analysis or perform role discovery and role transition learning separately, leading to inefficiencies and limited transition analysis. We propose a novel dynamic non-negative matrix factorization (DyNMF) approach to simultaneously discover roles and learn role transitions. DyNMF explicitly models temporal information by introducing a role transition matrix and clusters nodes in SNs from two views: the current view and the historical view. The current view captures structural information from the current SN snapshot and the historical view captures role transitions by looking at roles in past SN snapshots. DyNMF efficiently provides more effective analytics capabilities, regularizing roles by temporal smoothness of role transitions and reducing uncertainties and inconsistencies between snapshots. Experiments on both synthetic and real-world SNs demonstrate the advantages of DyNMF in discovering and predicting roles and role transitions.
#3185

Representing Urban Functions through Zone Embedding with Human Mobility Patterns
Zijun Yao, Yanjie Fu, Bin Liu, Wangsu Hu, Hui Xiong

Unsupervised Learning and Applications

Urban functions refer to the purposes of land use in cities where each zone plays a distinct role and cooperates with the others to serve people’s various life needs. Understanding zone functions helps to solve a variety of urban-related problems, such as increasing traffic capacity and enhancing location-based services. Therefore, it is beneficial to investigate how to learn the representations of city zones in terms of urban functions, for better supporting urban analytic applications. To this end, in this paper, we propose a framework to learn the vector representation (embedding) of city zones by exploiting large-scale taxi trajectories. Specifically, we extract human mobility patterns from taxi trajectories, and use the co-occurrence of origin-destination zones to learn zone embeddings. To utilize the spatio-temporal characteristics of human mobility patterns, we incorporate mobility direction, departure/arrival time, destination attraction, and travel distance into the modeling of zone embeddings. We conduct extensive experiments with real-world urban datasets of New York City. Experimental results demonstrate the effectiveness of the proposed embedding model to represent urban functions of zones with human mobility data.
#1845

Robust Feature Selection on Incomplete Data
Wei Zheng, Xiaofeng Zhu, Yonghua Zhu, Shichao Zhang

Unsupervised Learning and Applications

Feature selection is an indispensable preprocessing procedure for high-dimensional data analysis, but previous feature selection methods usually ignore sample diversity (i.e., every sample has an individual contribution to the model construction) and have limited ability to deal with incomplete datasets where a part of the training samples have unobserved data. To address these issues, in this paper, we first propose a robust feature selection framework to relieve the influence of outliers, and then introduce an indicator matrix to prevent unobserved data from participating in the numerical computation of feature selection, so that both our proposed feature selection framework and existing feature selection frameworks can conduct feature selection on incomplete datasets. We further propose a new optimization algorithm to optimize the resulting objective function and prove that our algorithm converges fast. Experimental results on both real and artificial incomplete datasets demonstrate that our proposed method outperforms the feature selection methods under comparison in terms of clustering performance.

Thursday 19 14:00 - 14:45 Research Excellence Award 2017 (VICTORIA)

Andy Barto

Research Excellence Award 2017

Thursday 19 14:00 - 16:00 Industry Day (A4)

Industry Day - Session 2

Industry Day

Thursday 19 14:45 - 15:30 Research Excellence Award 2018 (VICTORIA)

Jitendra Malik

Research Excellence Award 2018

Thursday 19 15:30 - 16:15 McCarthy Award 2018 (VICTORIA)

Milind Tambe

McCarthy Award 2018

Thursday 19 16:15 - 17:00 Computers and Thought Award 2018 (VICTORIA)

Stefano Ermon

Computers and Thought Award 2018

Thursday 19 17:30 - 18:30 Closing (VICTORIA)

Closing

Closing