Distributed systems, SDN, and 5G architecture.
Tags:
distributed bgp sdn 5G
Description:
Distributed architecture, edge-computing, SDN, and how they are revolutionizing our communication networks.
Distributed and centralized architectures are two different approaches for the organization of computer networks. These two architectures have several unique characteristics.
In a centralized architecture, a central computing unit, or central node, serves as the core of the network, and all network traffic must pass through it. This central node controls the management and resource allocation of the network.
Centralized systems are simple because all the important computation can be implemented within a single computing unit, which is often easy to do. Systems at the dawn of computer networking therefore often used this design.
Despite its popularity, time has shown that this architecture brings fundamental challenges for network performance. Because the whole network relies on a single central node, network performance is inherently limited by the capacity of that node. As more and more devices connect to the system, this node is put under strain and may be overwhelmed.
In addition, a centralized design creates a single point of failure. If the central node fails, the whole network stops operating. In a tree design, the failure of any node on a branch takes down the part of that branch below the failed node.
In a distributed architecture, by contrast, there is no central node. Each node has its own level of control and can communicate directly with other nodes in the network. No single authority governs the whole network.
By their nature, distributed networks are more scalable than their centralized counterparts. New nodes can be added to the network without putting too much strain on any single existing node. Since traffic does not have to pass through a central node, network performance is not limited by the capacity of one node, and bottlenecks can be avoided.
In addition, because there is no central node, there is no single point of failure in distributed networks.
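To make the single point of failure concrete, here is a minimal Python sketch. It models a centralized network as a star and a distributed one as a ring, takes one node down, and checks whether the remaining nodes can still reach each other. The node names and both toy topologies are assumptions made for illustration; real networks are much larger and less regular.

```python
from collections import deque

def reachable(adjacency, start, failed):
    """Return the set of nodes reachable from `start` while `failed` is down."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor != failed and neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def survives_failure(adjacency, failed):
    """True if every remaining node can still reach every other remaining node."""
    remaining = [node for node in adjacency if node != failed]
    if not remaining:
        return True
    return reachable(adjacency, remaining[0], failed) == set(remaining)

# Centralized (star): every leaf talks only to the hub.
star = {
    "hub": ["n1", "n2", "n3", "n4"],
    "n1": ["hub"], "n2": ["hub"], "n3": ["hub"], "n4": ["hub"],
}

# Distributed (ring): every node has two neighbors and no node is special.
ring = {
    "n1": ["n2", "n4"], "n2": ["n1", "n3"],
    "n3": ["n2", "n4"], "n4": ["n3", "n1"],
}

print("star survives hub failure:", survives_failure(star, "hub"))
print("ring survives any single failure:",
      all(survives_failure(ring, node) for node in ring))
```

The star only breaks when the hub fails, which is exactly the single point of failure described above. The ring tolerates any one failure, but not necessarily two, so "no single point of failure" should not be read as "cannot fail".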
It did not take long for network scientists to figure out that both centralized and distributed network architectures have their own pros and cons. Centralized networks are often simple and easy to implement and manage, while distributed systems fit scenarios where scalability, performance, and fault tolerance are essential.
Since both types of network architecture have their own benefits and drawbacks, network scientists have spent a lot of effort finding ways to combine the two and get the best of both worlds. As a result, many hybrid architectures were born.
A great example of a hybrid architecture is the 2-tier or 3-tier network architecture. In the simpler 1-tier network, all devices are connected to a single central router. This design is often found in homes and small offices, where desktops, laptops, mobile phones, and printers are all connected to a single router via either LAN cable or Wi-Fi. This design is easy to set up, but the single router quickly becomes overloaded in larger networks.
Therefore, more serious networks, such as enterprise networks, use 2-tier or 3-tier architectures. In a 3-tier architecture, the network is divided into three layers: Access, Distribution, and Core. The 2-tier architecture is simply a 3-tier architecture in which the Distribution and Core layers are merged, which is why the 2-tier design is often called the "Collapsed Core Design".
Apart from the Access layer, which connects directly to the end devices, every higher layer in these 2-tier and 3-tier architectures consists of multiple layer-3 switches that are connected to each other and to each node in the adjacent layer. This design allows traffic to be sent over multiple routes through different switches, so it is distributed and avoids any single point of failure.
From its inception, the internet was intended to be a distributed system [1]. The original designers of the internet wanted to create a network of networks in which no central authority has control over the entire network. Instead, it is made up of interconnected autonomous systems (ASes) operated by various organizations, including internet service providers (ISPs), universities, governments, and private companies.
As a fun fact, the network of all computers within Tokyo Institute of Technology is a registered Autonomous System with AS number (ASN) 9367 and AS name TITECH, whose IP addresses are allocated in the 131.112.0.0/16 range. You can easily look this up on the internet.
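If you want to check whether an address falls inside this range, Python's standard ipaddress module is enough. The sketch below only tests membership in the 131.112.0.0/16 prefix quoted above; the helper name and the sample addresses are arbitrary choices for illustration and do not refer to any particular host.

```python
import ipaddress

# The prefix quoted in the text for AS9367 (TITECH).
TITECH_PREFIX = ipaddress.ip_network("131.112.0.0/16")

def in_titech_range(address: str) -> bool:
    """Return True if `address` lies inside 131.112.0.0/16."""
    return ipaddress.ip_address(address) in TITECH_PREFIX

# A /16 leaves 16 host bits, so the block holds 2**16 addresses.
print(TITECH_PREFIX.num_addresses)

# Arbitrary sample addresses, one inside and one just outside the block.
print(in_titech_range("131.112.1.1"))
print(in_titech_range("131.113.0.1"))
```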
Nowadays, traffic between ASes is routed predominantly using the Border Gateway Protocol (BGP), a routing protocol that does not rely on any single central authority to find its routes. BGP is a path-vector protocol: each route advertisement carries a vector of attributes that is used to determine the best path. These attributes include the length of the AS path, the origin of the route, the next hop, and many others. BGP allows ASes to make routing decisions based on policies, such as preferring certain paths, avoiding congested links, or prioritizing specific types of traffic. ASes can manipulate these attributes and apply routing policies to influence path selection according to their own requirements. BGP (and path-vector protocols in general) is designed to scale and to work well in distributed networks.
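The following Python sketch shows the flavor of attribute-based path selection. It is a heavily simplified assumption, not the real BGP decision process: it only discards looping paths and then compares local preference, AS path length, and origin, while real implementations apply more tie-breakers (MED, eBGP versus iBGP, IGP cost, router ID, and vendor-specific steps). The Route class, the LOCAL_AS value, and all AS numbers and prefixes are hypothetical and taken from ranges reserved for documentation.

```python
from dataclasses import dataclass

LOCAL_AS = 64496  # hypothetical AS number of "our" network (documentation range)

# Lower rank is preferred: IGP over EGP over incomplete.
ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}

@dataclass(frozen=True)
class Route:
    prefix: str
    as_path: tuple          # AS numbers the advertisement has traversed
    next_hop: str
    local_pref: int = 100   # local policy knob: higher is preferred
    origin: str = "igp"

def best_route(routes):
    """Pick one route per the simplified rules described in the lead-in."""
    # Path-vector loop prevention: ignore any path that already contains our AS.
    loop_free = [r for r in routes if LOCAL_AS not in r.as_path]

    def preference(route):
        # Tuples compare element by element, so earlier entries dominate.
        return (-route.local_pref, len(route.as_path), ORIGIN_RANK[route.origin])

    return min(loop_free, key=preference, default=None)

short = Route("203.0.113.0/24", (64497, 64499), "192.0.2.1")
long = Route("203.0.113.0/24", (64498, 64500, 64499), "192.0.2.2")
policy = Route("203.0.113.0/24", (64498, 64500, 64499), "192.0.2.3", local_pref=200)
looped = Route("203.0.113.0/24", (64497, 64496, 64499), "192.0.2.4", local_pref=300)

print(best_route([short, long]).next_hop)             # shorter AS path wins
print(best_route([short, long, policy, looped]).next_hop)  # policy overrides length
```

Note that the route with the raised local preference wins even though its AS path is longer. That is the point of policy-based routing: each AS decides for itself, with no central authority involved.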
While the internet itself is designed to be decentralized and distributed, a few centralized elements are still required for it to function efficiently.
One example of these centralized elements is the Domain Name System (DNS). The DNS translates domain names (such as titech.ac.jp) into the IP addresses that computers understand and route traffic to. DNS is still hierarchical, and centralized at its root. For example, when a computer wants to get the IP address of titech.ac.jp, it must ask the root name servers where to find the servers for the top-level domain ".jp", then ask a ".jp" server where to look up ".ac.jp", and finally ask for the "titech.ac.jp" address. The process must start from the root, a "central" authority. If all of the root DNS servers were down (for example, if all computers in the world accessed them at the same time), DNS would stop functioning, causing a lot of chaos on the internet. In practice, however, IP addresses and domain names are cached by Internet Service Providers (ISPs) and many other organizations, as well as by your own computer, which reduces the strain on the root and top-level domain servers.
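The lookup chain and the effect of caching can be sketched as a toy resolver in Python. Everything here is an assumption for illustration: the server names are invented labels, the delegation table only models the single name discussed above, and 192.0.2.10 is a placeholder from a documentation range, not the real address of titech.ac.jp. Real DNS involves network queries, record types, and time-to-live values, none of which are modeled.

```python
# Hypothetical delegation data: which server to ask next for a given zone.
DELEGATIONS = {
    "root": {"jp": "jp-servers"},
    "jp-servers": {"ac.jp": "ac.jp-servers"},
    "ac.jp-servers": {"titech.ac.jp": "titech-servers"},
}

# Hypothetical final answers held by the authoritative server.
ANSWERS = {"titech-servers": {"titech.ac.jp": "192.0.2.10"}}  # placeholder address

class Resolver:
    def __init__(self):
        self.cache = {}
        self.queries_sent = 0

    def resolve(self, name):
        # A cached answer avoids contacting the root or any other server.
        if name in self.cache:
            return self.cache[name]

        server = "root"
        labels = name.split(".")
        # Walk from the top-level domain towards the full name:
        # "jp", then "ac.jp", then "titech.ac.jp".
        for i in range(len(labels) - 1, -1, -1):
            zone = ".".join(labels[i:])
            self.queries_sent += 1
            server = DELEGATIONS[server][zone]

        # Final query to the authoritative server for the actual address.
        self.queries_sent += 1
        address = ANSWERS[server][name]
        self.cache[name] = address
        return address

resolver = Resolver()
print(resolver.resolve("titech.ac.jp"), resolver.queries_sent)
print(resolver.resolve("titech.ac.jp"), resolver.queries_sent)  # served from cache
```

The second call sends no new queries, which is why caching takes so much load off the root and top-level domain servers.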
While the internet at large is distributed, the network design within each AS does not need to be. The fourth-generation (4G) cellular network is one example of a hybrid network, whose design is both distributed and centralized.
4G networks have a distributed architecture, with base stations dispersed throughout an area to provide coverage and handle traffic. These base stations are connected to each other and to central processing units through a backhaul network, which is responsible for transmitting data between the base stations and the core network.
However, 4G still relies heavily on centralized elements. After data is sent from user devices to the Radio Access Network (RAN) and arrives at the CORE network, it is routed by a centralized system based on Software Defined Networking (SDN). SDN enables the separation of the control plane (which decides the routing) and the data plane (which forwards the traffic) in the CORE network. While the data plane is distributed, the control plane is largely controlled by the network provider. This centralized control plane manages network resources, allocates bandwidth, and oversees network functions.
At the end of the day, 4G systems are still "hybrid-but-still-mostly-centralized": with a few exceptions, user data must go to the CORE network and be processed there by a system under a centralized control plane before being sent on to other ASes or back to the RAN. This design is poor for latency, so the next generation of cellular networks introduces several clever methods to improve it.
5G pushes the "distributed" element even further, in both the RAN and the CORE network, making it a true hybrid system.
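The split between control plane and data plane can be illustrated with a small Python sketch. It is only a model under stated assumptions: the Switch and Controller classes, the send helper, and the five-switch topology are hypothetical, and no real SDN protocol or controller API (such as OpenFlow) is used. The switches only look up a flow table, while the controller holds the global view, computes a path, and installs the forwarding rules.

```python
from collections import deque

class Switch:
    """Data plane: forwards by table lookup and makes no routing decisions."""
    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination -> next hop

    def forward(self, destination):
        return self.flow_table.get(destination)  # None means "no rule installed"

class Controller:
    """Control plane: knows the whole topology and programs the switches."""
    def __init__(self, links):
        self.links = links
        self.switches = {name: Switch(name) for name in links}

    def shortest_path(self, source, destination):
        # Breadth-first search over the controller's global view of the topology.
        parents = {source: None}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            if node == destination:
                break
            for neighbor in self.links[node]:
                if neighbor not in parents:
                    parents[neighbor] = node
                    queue.append(neighbor)
        if destination not in parents:
            return None
        path, node = [], destination
        while node is not None:
            path.append(node)
            node = parents[node]
        return path[::-1]

    def install_path(self, source, destination):
        path = self.shortest_path(source, destination)
        if path is None:
            return None
        for hop, next_hop in zip(path, path[1:]):
            self.switches[hop].flow_table[destination] = next_hop
        return path

def send(controller, source, destination):
    """Follow installed flow rules hop by hop; None if a rule is missing."""
    hops, current = [source], source
    while current != destination:
        current = controller.switches[current].forward(destination)
        if current is None:
            return None
        hops.append(current)
    return hops

# Hypothetical topology: two routes from s1 to s3, one of them shorter.
links = {"s1": ["s2", "s4"], "s2": ["s1", "s3"], "s3": ["s2", "s5"],
         "s4": ["s1", "s5"], "s5": ["s4", "s3"]}
controller = Controller(links)
print(send(controller, "s1", "s3"))   # no rules yet, so the packet cannot be forwarded
controller.install_path("s1", "s3")
print(send(controller, "s1", "s3"))
```

Changing routing behavior here means changing only the controller; the switches stay simple, which is what makes centralized management of a distributed data plane practical.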
To push channel capacity further, 5G uses signals at higher frequencies than 4G, up to the millimeter wave (mmWave) band, because wider bandwidths are available there. The trade-off for using higher frequencies is a smaller coverage area: higher-frequency electromagnetic waves penetrate materials poorly and are therefore more susceptible to blockage and fading. As a result, in big cities, where many objects obstruct line-of-sight (LoS) transmission, and transmission in general, the cellular signal can become very weak or even undetectable. To fix this problem, a network of small cells and repeaters is installed to extend the coverage area of 5G.
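Part of the coverage penalty can be estimated with the standard free-space path loss formula, FSPL = 20 log10(4 pi d f / c), which assumes isotropic antennas and an unobstructed path. The Python sketch below compares two carrier frequencies at the same distance. The 3.5 GHz and 28 GHz values and the 200 m distance are assumptions chosen for illustration, and the formula does not capture penetration loss or blockage, which make the real gap in cities larger.

```python
import math

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0

def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss in dB between isotropic antennas."""
    return 20 * math.log10(
        4 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT_M_PER_S
    )

distance_m = 200.0                # hypothetical cell radius
mid_band_hz = 3.5e9               # assumed mid-band carrier
mmwave_hz = 28e9                  # assumed mmWave carrier

loss_mid = free_space_path_loss_db(distance_m, mid_band_hz)
loss_mmwave = free_space_path_loss_db(distance_m, mmwave_hz)

print(f"mid-band loss: {loss_mid:.1f} dB")
print(f"mmWave loss:   {loss_mmwave:.1f} dB")
print(f"extra loss at mmWave: {loss_mmwave - loss_mid:.1f} dB")
```

Because the frequency ratio is 8, the extra loss is 20 log10(8), or about 18 dB, regardless of distance. Under this model, keeping the same received power with the same antennas requires a much shorter link, which is one reason small cells are needed.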
Small cells refer to a network of many small base stations that cover specific areas which signals from other cells cannot reach. In technical papers, this small-cell network design is often called a Heterogeneous Network (HetNet). Repeaters, on the other hand, are devices that extend the coverage area simply by repeating signals from another base station. These can be smart repeaters, which actively retransmit the signal, or passive repeaters, also known as Reconfigurable Intelligent Surfaces (RIS), which reflect the signal.
Additionally, instead of traditional base-station-centric networking, where network activities revolve around the base stations, 5G aims for user-centric networking. In this design, the user device is no longer just the final destination of the wireless network but is expected to actively participate in network functions, such as relaying traffic to other user devices. This concept is called User Equipment relay (UE relay), or Sidelink relay. Although the technology is not exactly the same, a similar concept can be seen in many devices today. With a smart watch, for example, only the phone needs to be connected to the internet, and the watch uses the phone as a relay.
This huge system of traditional macro cells, new small cells, smart repeaters, and user equipment requires an unprecedented level of coordination, which cannot easily be achieved by configuring individual components and running them independently as in traditional distributed systems. Instead, they have to be controlled centrally.
Thankfully, the Cloud Radio Access Network (C-RAN) fits this purpose perfectly. Instead of letting the base stations manage themselves, a system of "virtual base stations" is installed in the cloud as BaseBand Units (BBUs), which manage the operation of the remote physical base stations (Remote Radio Heads, or RRHs). This architecture allows efficient control of the PHY and MAC layers of the base stations, enabling smart network switching, channel allocation, and interference avoidance. [2][3]
This type of design is made possible by SDN, which separates the data plane and the control plane of the base stations. SDN allows general-purpose equipment to be configured as the control unit for the base stations, including small cells and repeaters, instead of requiring specialized equipment with purpose-built hardware, thus drastically reducing the cost of deploying a 5G network.
Like the RAN, the CORE network of a 5G system is designed with both distributed and centralized elements.
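As a rough illustration of why a central view helps with channel allocation, the Python sketch below assigns channels to RRHs so that no two interfering neighbors share one. This is an assumption for illustration only: it uses a simple greedy graph-coloring heuristic, not the scheduling that real C-RAN deployments perform, and the function name, RRH names, interference map, and channel numbers are all hypothetical.

```python
def allocate_channels(interference, channels):
    """Greedy assignment: each RRH gets the first channel its neighbors do not use.

    `interference` maps each RRH to the RRHs whose coverage overlaps with it.
    Only a controller that sees the whole map (the BBU pool in this sketch)
    can run this; an isolated base station cannot.
    """
    assignment = {}
    # Handle the most constrained RRHs (most neighbors) first.
    order = sorted(interference, key=lambda rrh: len(interference[rrh]), reverse=True)
    for rrh in order:
        used = {assignment[n] for n in interference[rrh] if n in assignment}
        free = [channel for channel in channels if channel not in used]
        if not free:
            raise ValueError(f"no interference-free channel left for {rrh}")
        assignment[rrh] = free[0]
    return assignment

# Hypothetical overlap map between four remote radio heads.
interference = {
    "rrh1": ["rrh2", "rrh3"],
    "rrh2": ["rrh1", "rrh3"],
    "rrh3": ["rrh1", "rrh2", "rrh4"],
    "rrh4": ["rrh3"],
}

plan = allocate_channels(interference, channels=[1, 2, 3])
for rrh, channel in sorted(plan.items()):
    print(rrh, "-> channel", channel)
```

A greedy heuristic does not always find an assignment even when one exists, so the sketch raises an error rather than silently reusing a channel.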
Instead of a single user plane (U-plane, sometimes called the data plane in the context of SDN), the network is divided into several slices that serve different purposes on the same physical infrastructure. A common approach is to slice the U-plane into three slices: broadband services, mission-critical services, and IoT services. Each slice is managed with its own QoS policy, routing policy, and so on. Traffic from different slices can be sent concurrently, and errors in one slice do not affect the others.
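A toy model of slicing in Python might look like the sketch below. It is a simplification under stated assumptions: the class names, latency targets, and priorities are hypothetical, and the latency field is only descriptive policy metadata that the sketch does not enforce. Each slice keeps its own queue and health state on one shared user plane, so a fault in one slice leaves the others untouched.

```python
from dataclasses import dataclass, field

@dataclass
class NetworkSlice:
    name: str
    max_latency_ms: float   # hypothetical QoS target, not enforced here
    priority: int           # lower number is served first
    healthy: bool = True
    queue: list = field(default_factory=list)

class SlicedUserPlane:
    """One shared infrastructure carrying several logically separate slices."""
    def __init__(self, slices):
        self.slices = {s.name: s for s in slices}

    def submit(self, slice_name, packet):
        target = self.slices[slice_name]
        if not target.healthy:
            return False        # only this slice rejects traffic
        target.queue.append(packet)
        return True

    def next_packet(self):
        """Serve the highest-priority healthy slice that has traffic waiting."""
        for s in sorted(self.slices.values(), key=lambda s: s.priority):
            if s.healthy and s.queue:
                return s.name, s.queue.pop(0)
        return None

def make_plane():
    return SlicedUserPlane([
        NetworkSlice("broadband", max_latency_ms=50.0, priority=2),
        NetworkSlice("mission_critical", max_latency_ms=5.0, priority=1),
        NetworkSlice("iot", max_latency_ms=500.0, priority=3),
    ])

plane = make_plane()
plane.slices["iot"].healthy = False              # simulate a fault in one slice
print(plane.submit("iot", "sensor reading"))     # rejected
print(plane.submit("broadband", "video chunk"))  # unaffected
print(plane.submit("mission_critical", "control message"))
print(plane.next_packet())                       # mission-critical goes first
```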
For mission-critical applications, and for applications that require low latency in general, Mobile Edge Computing (MEC) servers are installed closer to the user than in the traditional design. This requires decentralizing and distributing computing nodes from the cloud into lower parts of the cellular network. With this distributed design, traffic reaches a server much more quickly than if it had to travel to a central computing unit in the cloud.
Despite this distributed design, the network as a whole still relies on a central control unit to manage the CORE network functions.
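A back-of-the-envelope calculation in Python shows why distance matters. The sketch considers propagation delay only, assuming signals travel through optical fiber at about two thirds of the speed of light; queuing, processing, and radio access delays are ignored. The two distances are hypothetical examples of an edge server and a distant cloud data center.

```python
# Assumption: light in optical fiber travels at roughly 2/3 of c.
SPEED_IN_FIBER_M_PER_S = 2.0e8

def round_trip_propagation_ms(distance_km):
    """Round-trip propagation delay in milliseconds over a fiber of this length."""
    one_way_seconds = distance_km * 1000 / SPEED_IN_FIBER_M_PER_S
    return 2 * one_way_seconds * 1000

edge_km = 10       # hypothetical distance to a nearby MEC server
cloud_km = 1000    # hypothetical distance to a central cloud data center

print(f"edge:  {round_trip_propagation_ms(edge_km):.2f} ms")
print(f"cloud: {round_trip_propagation_ms(cloud_km):.2f} ms")
```

Under these assumptions the edge server costs about 0.1 ms of round-trip propagation and the distant data center about 10 ms, before any processing is done. For applications with latency budgets of a few milliseconds, that difference alone can decide whether the budget is achievable.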
Although centralized, more and more functions of the control plane are being distributed to other parts of the network. Researchers from Ericsson have proposed a decentralized control plane design to improve the performance of massive Machine Type Communication (mMTC) and the scalability of the network (Ericsson).
Overall, both traditional wired networks and cellular networks are converging on a similar design pattern that uses Software Defined Networking (SDN) to separate network functions into a data plane, which runs on a distributed network of physical equipment, and a control plane, which centralizes the management of the network. This design pattern combines the best of centralized and distributed designs, bringing both ease of management and scalability to modern networks.
This article is based on my understanding from several university courses, as well as many other resources. The information is not guaranteed to be correct, and detailed references will be provided later.