Relation network

A relation network (RN) is an artificial neural network component with a structure that can reason about relations among objects. An example category of such relations is spatial relations (above, below, left, right, in front of, behind).

RNs can infer relations, they are data efficient, and they operate on a set of objects without regard to the objects' order.

History
In June 2017, DeepMind announced the first relation network. It claimed that the technology had achieved "superhuman" performance on multiple question-answering problem sets.

Design
RNs constrain the functional form of a neural network to capture the common properties of relational reasoning. These properties are explicitly added to the system, rather than established by learning just as the capacity to reason about spatial, translation-invariant properties is explicitly part of convolutional neural networks (CNN). The data to be considered can be presented as a simple list or as a directed graph whose nodes are objects and whose edges are the pairs of objects whose relationships are to be considered. The RN is a composite function: $$ RN \left (O \right ) = f_\phi \left ( \sum_{i,j} g_\theta \left (o_i, o_j , q \right ) \right ), $$

where the input is a set of "objects" $$ O = \left \lbrace o_1, o_2, ..., o_n \right \rbrace, o_i \in \mathbb{R} ^m $$ is the ith object, and fφ and gθ are functions with parameters φ and θ, respectively and q is the question. fφ and gθ are multilayer perceptrons, while the 2 parameters are learnable synaptic weights. RNs are differentiable. The output of gθ is a "relation"; therefore, the role of gθ is to infer any ways in which two objects are related.

Image (128x128 pixel) processing is done with a 4-layer CNN. Outputs from the CNN are treated as the objects for relation analysis, without regard for what those "objects" explicitly represent. Questions were processed with a long short-term memory network.