Mission Impossible: Characterising Network Coding Regions
The prevailing approach for data transport in communications networks is routing. Packets travel from source to destination via intermediate routers, which switch packets from incoming links onto outgoing links. This approach is ubiquitous in packet switched networks such as the Internet. The underlying conceptual model is physical commodity flow, e.g. highway traffic or water in pipes. If multiple packets wish to simultaneously occupy the same link, some packets must wait or be dropped. Network coding stems from the realisation that information is fundamentally different to physical commodities. Intermediate nodes can perform more general data processing, rather than just switching. For example, packets can be combined via modulo 2 addition. Network coding is practically and theoretically significant. Not only can it increase throughput, it is robust to link failures, and can minimise delay or transmission cost. This tutorial will begin with an introduction of network coding and its applications in distributed data storage network.
Characterising the network coding region (i.e., the set of achievable source transmission rate and link capacity tuples) is a fundamental question in network coding. The problem is completely solved for single source multicast (one source with several receivers): The network coding region is characterised by max-flow/min-cut bound. Optimal network codes can also be found easily in this case. In contrast, finding optimal routes and maximum routing throughput is fundamentally hard. In the general non-multicast setting, the problem is still an open challenge.
The second part of the tutorial will focus on existing results in addressing this challenging problem. Specifically, we will cover the following topics:
- Characterisation of network coding region by entropy functions
- Dualities between characterisation of network coding regions and information inequalities
- Linear Programming (LP) bounds
- Constrained network coding
LP bound is probably the most tightest computable bound in existing literature. However, one of its deficiency is its high computational complexity: Both the number of variables and the number of constraints increase exponentially with the number of links in the network, making the bound computationally infeasible even for modest networks. The third part of the tutorial will examine several computationally efficient “network-based” bounds, including the cut-set bounds, the functional dependency bounds and the network sharing bounds.