This document aims to provide a comprehensive guide to the efforts made towards combining two prominent fields within machine learning: statistical relational learning and representation learning. Both fields advance machine learning approaches in different directions, with little, but growing, interaction between them.

Statistical Relational Learning (see also: tutorial, tutorial) deals with building machine learning models from complex data structures. Statistical relational models express both the uncertainty of the knowledge (i.e., the statistical component) and its complex relational structure (i.e., the relational component). Such expressivity is supported by the rich knowledge representation formats used in the existing approaches. Though different flavours of representation formats exist in the field, a common theme is that (a subset of) first-order logic is used to represent relational knowledge, while probabilistic graphical models represent the uncertainty of a model. Such a rich representation serves as a lingua franca capable of expressing many different data formats, including vectorized representations, hypergraphs, and sequences.
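To make this combination of logic and probability more concrete, the following is a minimal, self-contained Python sketch in the spirit of Markov Logic Networks, one representative formalism: a weighted first-order rule is grounded over a small domain, and each possible world is scored by the weighted count of satisfied groundings. The predicates, constants, and weight below are invented purely for illustration.

```python
import itertools
import math

# A toy sketch of the Markov Logic idea: a weighted first-order rule is
# grounded over a finite domain, and every possible world is scored by
# exp(sum of weights of its satisfied groundings). All names and the
# weight below are invented for illustration.

constants = ["anna", "bob"]
weight = 1.5  # weight of the rule Smokes(x) -> Cancer(x)

def rule_satisfied(world, x):
    # Material implication: not Smokes(x) or Cancer(x)
    return (not world[("Smokes", x)]) or world[("Cancer", x)]

def unnormalized_score(world):
    # exp of the weighted count of satisfied groundings
    return math.exp(sum(weight for x in constants if rule_satisfied(world, x)))

# Enumerate all possible worlds over the ground atoms
# (brute force; feasible only for tiny domains).
atoms = [(pred, c) for pred in ("Smokes", "Cancer") for c in constants]
worlds = [dict(zip(atoms, values))
          for values in itertools.product([False, True], repeat=len(atoms))]

Z = sum(unnormalized_score(w) for w in worlds)  # partition function
p = sum(unnormalized_score(w) for w in worlds if w[("Cancer", "bob")]) / Z
print("P(Cancer(bob)) =", p)
```

Actual SRL systems avoid this exponential enumeration through lifted or approximate inference; the sketch only illustrates the semantics.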

Significant contributions have been made since the 1990s, and a number of different formalisms and systems have been introduced. Some representative examples include:

and many more. A more detailed introduction is available in Chapter 7.

Representation learning, or deep learning, on the other hand, is concerned with the autonomous learning of rich feature hierarchies. In the last decade, the methods developed within this field have pushed the frontiers in many applications of machine learning, such as object detection and speech recognition. These methods tackle the biggest bottleneck in any machine learning application: creating a good set of features. Several distinct approaches exist, but they can be coarsely divided into supervised and unsupervised approaches. Supervised approaches, such as convolutional neural networks, learn features useful specifically for the task at hand. In contrast, approaches such as Restricted Boltzmann machines and autoencoders require no supervision and rely on reconstruction to find a good set of features.
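To illustrate the reconstruction principle behind such unsupervised approaches, the following is a minimal autoencoder sketch in plain numpy, not taken from any particular system discussed here: a single sigmoid hidden layer with tied weights is trained by gradient descent to reconstruct its input, and the hidden activations act as the learned features. The layer sizes and learning rate are arbitrary choices for the example.

```python
import numpy as np

# Toy autoencoder: learn features with no supervision by reconstructing
# the input. One sigmoid hidden layer, tied weights (decoder = encoder
# transposed), plain gradient descent on squared reconstruction error.

rng = np.random.default_rng(0)
X = rng.random((100, 8))             # 100 unlabeled examples, 8 input features
n_hidden = 3

W = rng.normal(scale=0.1, size=(8, n_hidden))
b_enc = np.zeros(n_hidden)
b_dec = np.zeros(8)
lr = 0.1

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(500):
    H = sigmoid(X @ W + b_enc)       # encode: hidden features
    X_hat = H @ W.T + b_dec          # decode: linear reconstruction
    err = X_hat - X                  # reconstruction error
    # Backpropagate the squared-error loss through decoder and encoder.
    grad_H = err @ W
    grad_pre = grad_H * H * (1 - H)  # sigmoid derivative
    grad_W = X.T @ grad_pre + err.T @ H  # tied weights: two contributions
    W -= lr * grad_W / len(X)
    b_enc -= lr * grad_pre.mean(axis=0)
    b_dec -= lr * err.mean(axis=0)

print("final reconstruction MSE:", float((err ** 2).mean()))
```

Once trained, the hidden activations `H` can replace the raw inputs as features for a downstream model, which is exactly the role representation learning plays in the combinations discussed later.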

Though the two fields focus on different aspects of machine learning, they are not dichotomous, and combining the strengths of their respective approaches could be beneficial. Learning the structure of SRL models, i.e., a set of formulas describing a domain, is a difficult problem due to its combinatorial nature. Integrating some aspects of deep learning might help to mitigate that problem by providing a new set of features that would make learning easier and faster. Moreover, better features induce models of lower complexity, and reasoning with such models is faster. Additionally, simpler models with fewer parameters are less prone to overfitting.

Combining these two fields has attracted growing interest lately. The following section briefly summarizes the main directions and discusses their differences.

Next chapter: 2. An overview