To support emerging applications such as autonomous driving, remote medical services, and factory automation, communication networks face increasingly stringent key performance indicators (KPIs) within an ever more complex system design. As mobile operators have observed, traditional human-machine interaction is slow, error-prone, expensive, and too cumbersome to meet this challenge. The system is instead expected to make suitable decisions from a comprehensive, forward-looking view. Artificial Intelligence (AI)/Machine Learning (ML) provides a powerful tool to exploit the implicit relations among sample data and make predictions or decisions without being explicitly programmed. How to utilize AI/ML to improve network management and user experience has sparked keen interest from both operators and vendors.
The 3rd Generation Partnership Project (3GPP), as the leading standardization organization, started exploring AI/ML schemes for NG-RAN (i.e., the base stations connected to the 5G core network) in November 2020.
Self-Organizing Network (SON) is a popular mechanism to enhance network performance, i.e., network capacity, coverage, and Quality of Service (QoS), by enabling self-configuration and self-optimization in each node based on limited collected information. However, the management cost for operators to reach this objective is heavy. Specifically, the diverse situations of high-density nodes serving massive numbers of users impose a huge analysis burden to find the root cause and set suitable configurations. Furthermore, there is no unified method to formulate policy across different vendors, so a local optimization in one node may occur at the cost of performance damage in other nodes. To solve these headaches of network management and maintenance, AI/ML is regarded as a technique for strengthening system automation. It is expected to generate adaptive policies for various situations, with the capability to avoid failure or harm, so as to guarantee network performance.
Hence, in the first AI/ML-enabled work item in the 3GPP RAN working groups, solutions for three SON-related use cases are specified: energy saving, load balancing, and mobility optimization.
Millions of 5G base stations (called gNBs) are being deployed to meet the key performance requirements of 5G networks and the demands of the unprecedented growth in mobile users. Such rapid growth brings issues of high energy consumption, CO2 emissions, and operating expenditure (OPEX). Therefore, energy saving is an important use case for improving network energy efficiency.
Energy saving is a self-optimization function that enables the network to autonomously switch off a capacity booster cell when its capacity is no longer needed, lowering energy consumption while guaranteeing a target level of quality of service or experience. When planning to switch off a cell, the node needs to formulate an offloading scheme for the remaining served UEs. In detail, when the estimated load or traffic is below a defined threshold, the node may deactivate the cell and redirect its UEs to new target cells.
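As a rough illustration of this threshold-based logic, the following Python sketch deactivates a booster cell only when its traffic can be fully absorbed by neighbour cells. All names (`Cell`, `plan_energy_saving`) and the threshold values are hypothetical, not 3GPP-specified APIs or parameters:

```python
from dataclasses import dataclass

@dataclass
class Cell:
    cell_id: str
    load: float  # current load as a fraction of capacity, in [0, 1]

def plan_energy_saving(booster: Cell, neighbours: list[Cell],
                       load_threshold: float = 0.2,
                       capacity_limit: float = 0.8) -> dict:
    """Switch off a capacity booster cell when its load is below the threshold,
    offloading its traffic to the least-loaded neighbours under a capacity cap."""
    if booster.load >= load_threshold:
        return {"switch_off": False, "offload_plan": {}}
    remaining = booster.load
    plan = {}
    # Greedily offload to the least-loaded neighbour cells first.
    for cell in sorted(neighbours, key=lambda c: c.load):
        if remaining <= 0:
            break
        headroom = max(0.0, capacity_limit - cell.load)
        share = min(headroom, remaining)
        if share > 0:
            plan[cell.cell_id] = share
            remaining -= share
    # Only deactivate if all traffic can be absorbed by the neighbours.
    return {"switch_off": remaining <= 1e-9, "offload_plan": plan}
```

For example, a booster cell at 10% load with neighbours at 50% and 30% load would be switched off, with its traffic offloaded to the lighter neighbour.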
The historical/current load status of a cell and its neighbour cells is the decisive factor in the existing scheme for setting the action; this information can be collected via the resource status reporting procedure over the E1/F1 interfaces for intra-gNB information and over the Xn interface for inter-gNB information. However, a non-proactive setting aiming at global energy consumption reduction is not a trivial task. An improper energy saving decision may seriously deteriorate network performance and energy efficiency instead of improving them, since the neighbour cells need to serve the additional traffic offloaded from the energy saving cells. The existing energy-saving schemes are vulnerable to the following issues:
- Difficulty in making a reasonable decision with the conventional non-proactive mechanism. The existing decision relies on current/historical traffic load without considering future traffic load. When the load in neighbour nodes is extremely high, or the load changes rapidly over the following time period, this may lead to switch-on/off ping-pong, overload of neighbour cells, and call drops, resulting in poor service performance. Moreover, in such cases the node needs to revise the energy saving decision after a short time, leading to high signalling overhead to inform other network nodes of the decision and/or high energy consumption caused by the switch-on/off itself.
- Conflicting targets between system performance and energy efficiency. Performance maximization is achieved at the cost of higher energy consumption. How to realize the trade-off between the two is an urgent problem to be settled.
- Local rather than overall energy efficiency. An action set by a single node may reduce energy consumption within that node while increasing it overall, from the view of the node and its neighbours, because the neighbour nodes need to take over the remaining load of the energy saving cells.
AI/ML provides an approach to solve the above issues by analyzing the collected data to yield further insights. One AI/ML functionality is predicting the load status and energy cost for the near future. With the assistance of this prediction information, the decision gains a long-term perspective, which is expected to avoid local overload, switch-on/off ping-pong, etc. The other AI/ML functionality is to set an adaptive energy saving decision that is highly resilient to the network situation, where the situation includes UE measurement results, UE location information, and the (historical/predicted) energy cost and resource status of the node as well as its neighbour cells. The decision output by the AI/ML model is expected to reduce energy consumption while simultaneously realizing the trade-off between performance and energy consumption.
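The forward-looking decision described above can be sketched as follows. The linear-trend forecaster, the thresholds, and the even split of offloaded traffic are illustrative assumptions for the sketch, not anything specified by 3GPP:

```python
def predict_load(history: list[float], horizon: int = 3) -> list[float]:
    """Extrapolate future load (in [0, 1]) from the linear trend of past samples."""
    if len(history) < 2:
        return [history[-1]] * horizon if history else [0.0] * horizon
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [min(1.0, max(0.0, history[-1] + slope * (k + 1))) for k in range(horizon)]

def proactive_switch_off(own_history: list[float],
                         neighbour_histories: list[list[float]],
                         load_threshold: float = 0.2,
                         neighbour_cap: float = 0.8) -> bool:
    """Switch off only if the cell's predicted load stays below the threshold
    over the whole horizon AND every neighbour stays under its cap after
    absorbing the offloaded traffic -- avoiding switch-on/off ping-pong."""
    if not neighbour_histories:
        return False
    own_future = predict_load(own_history)
    if any(load >= load_threshold for load in own_future):
        return False          # load is rising: switching off would ping-pong
    extra = max(own_future)   # worst-case traffic the neighbours must absorb
    share = extra / len(neighbour_histories)
    return all(max(predict_load(h)) + share < neighbour_cap
               for h in neighbour_histories)
```

Compared with the threshold-only scheme, a cell whose load is currently low but trending upward is kept active here, because the predicted load crosses the threshold within the horizon.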
Figure 1. AI/ML based energy saving
The rapid traffic growth and the multiple frequency bands utilised in a commercial network make it challenging to steer traffic into a balanced distribution. To address this problem, load balancing was designed to manage congestion and improve system capacity by distributing traffic across the system's radio resources. Automating this scheme can effectively minimize human intervention in network management and optimization tasks.
The existing mechanism determines the load balancing action according to the node's own measured load status and the load status received from neighbour nodes. When an imbalanced situation is detected, e.g. its own load is much heavier than others', the node selects lightly loaded cells as target cells and draws up a plan for the amount of load to transfer to each target cell. Based on that plan, subsequent handovers are performed to realize the offloading.
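A hypothetical sketch of this load-amount planning is shown below: the overloaded cell sheds load toward the neighbourhood average, filling the lightest cells first. The imbalance margin and the averaging rule are assumptions made for illustration:

```python
def plan_load_transfer(own_load: float, neighbour_loads: dict[str, float],
                       imbalance_margin: float = 0.15) -> dict[str, float]:
    """Return the amount of load to hand over to each lightly loaded cell."""
    avg = (own_load + sum(neighbour_loads.values())) / (1 + len(neighbour_loads))
    if own_load - avg <= imbalance_margin:
        return {}                      # no significant imbalance detected
    surplus = own_load - avg           # load to shed to reach the average
    plan = {}
    # Fill the lightest cells first, each only up to the neighbourhood average.
    for cell_id, load in sorted(neighbour_loads.items(), key=lambda kv: kv[1]):
        if surplus <= 0:
            break
        room = max(0.0, avg - load)
        share = min(room, surplus)
        if share > 0:
            plan[cell_id] = round(share, 3)
            surplus -= share
    return plan
```

For instance, a cell at 90% load with neighbours at 30% and 60% plans to transfer 30% of its load to the lighter neighbour, bringing all three toward the 60% average.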
The load balancing decision needs to take multiple factors into account, including the QoS requirements of UE services, radio quality, UE mobility information, load status of nodes, supported slicing capability, supported NPN capability, etc. It is not easy for a conventional, rigid solution to lay down an action that fits all these aspects. The main shortcomings of the existing scheme are analysed as follows:
- Inefficient decisions relying on current/historical cell load status. Similar to the energy saving case, a scheme based on current/historical information cannot maintain a balanced state over a long period. Consequently, it may lead to frequent handovers or local overload.
- Difficulty in guaranteeing both a balanced distribution and service performance. One objective of the radio network is to meet the QoS requirements of the UE. The node hands over a UE with low-data-rate services (e.g. surfing the Internet) to a neighbour node with a medium-load cell, while transferring a UE with high-data-rate services (e.g. XR gaming) to a lightly loaded cell. Since the service type is time-varying, the data volume of a UE may dramatically increase when the application changes, and the target cells may then suffer overload from the newly arrived heavy traffic. Even more seriously, the node may not have enough radio resources to support such high-traffic services, so the QoS requirements cannot be fulfilled. Hence, it is tough for a conventional scheme to determine whether the service performance after the offloading action will reach the desired targets.
To deal with the above issues, an AI/ML model is applied to take advantage of prediction and overcome the drawbacks of short-term strategies, realizing stable load balancing in dense networks. The first AI/ML functionality, the same as in the energy saving use case, is to predict the resource status (i.e. load status) so as to avoid overloading target cells with newly arriving traffic. UE traffic prediction is another AI/ML functionality, assisting target cell selection so that a balanced state is kept while QoS performance is fulfilled. Apart from that, the AI/ML model is expected to generate a comprehensive decision with a panoramic and forward-looking view, reaching the goal of a balanced distribution that improves system capacity while satisfying service requirements. To ensure UE performance, the source node can request the target nodes to feed back the UE performance (i.e., throughput, delay, and loss rate); the source node then evaluates whether the decision is effective and makes adjustments (e.g. triggering the re-training procedure) if required.
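The performance-feedback loop described above might look like the following sketch: the source node compares the fed-back UE performance with QoS targets and decides whether re-training is warranted. The metric names, target fields, and the 30% violation threshold are assumptions for illustration only:

```python
def needs_retraining(feedback: list[dict], targets: dict,
                     violation_ratio: float = 0.3) -> bool:
    """Trigger model re-training when too many offloaded UEs miss QoS targets.

    Each feedback record carries the per-UE performance fed back by the target
    node: throughput, delay, and loss rate.
    """
    if not feedback:
        return False
    violations = sum(
        1 for ue in feedback
        if ue["throughput_mbps"] < targets["min_throughput_mbps"]
        or ue["delay_ms"] > targets["max_delay_ms"]
        or ue["loss_rate"] > targets["max_loss_rate"]
    )
    return violations / len(feedback) > violation_ratio
```

In practice the evaluation window, the weighting of the three metrics, and the re-training trigger would all be implementation choices of the vendor; the sketch only shows the shape of the loop.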
Figure 2. AI/ML based load balancing
With incremental capacity requirements, network design is moving toward higher operational frequency bands. As a consequence, the coverage of a single node shrinks and the UE needs to perform handover frequently, especially high-mobility UEs. In addition, for applications characterised by rigorous QoS requirements such as reliability and latency, the Quality of Experience (QoE) is sensitive to handover performance. The robustness of the mobility decision therefore contributes greatly to service performance and user experience.
Mobility management is the scheme that guarantees service continuity during mobility by minimizing the occurrence of unintended events. Examples of unintended events are:
- Too Late Handover: A radio link failure (RLF) occurs after the UE has stayed for a long period of time in the cell; the UE attempts to re-establish the radio link connection in a different cell.
- Too Early Handover: An RLF occurs shortly after a successful handover from a source cell to a target cell or a handover failure occurs during the handover procedure; the UE attempts to re-establish the radio link connection in the source cell.
- Handover to Wrong Cell: An RLF occurs shortly after a successful handover from a source cell to a target cell or a handover failure occurs during the handover procedure; the UE attempts to re-establish the radio link connection in a cell other than the source cell and the target cell.
- Successful Handover with underlying issue: the handover completes successfully, but an underlying problem (e.g. near-failure radio conditions) occurs during the procedure.
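In a simplified form, the unintended events above can be distinguished by where the UE re-establishes its connection and when the failure occurs relative to the handover. The function and flag names below are assumptions made for this sketch, not standardized parameters:

```python
def classify_unintended_event(reestablish_cell: str, source_cell: str,
                              target_cell: str, rlf_shortly_after_ho: bool,
                              long_stay_in_source: bool) -> str:
    """Map an RLF / handover-failure record to one of the unintended events."""
    if long_stay_in_source and reestablish_cell != source_cell:
        # RLF after a long stay, re-established elsewhere: handover came too late.
        return "Too Late Handover"
    if rlf_shortly_after_ho and reestablish_cell == source_cell:
        # Failure right after handover, UE falls back to the source cell.
        return "Too Early Handover"
    if rlf_shortly_after_ho and reestablish_cell not in (source_cell, target_cell):
        # Failure right after handover, UE ends up in a third cell.
        return "Handover to Wrong Cell"
    return "Unclassified"
```

This is the kind of per-event labelling the SON mechanism performs on UE failure reports before root-cause analysis and configuration adjustment.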
The UE measurement result is the dominant information for the conventional mobility decision. When the measured signal quality drops below a defined threshold, or a neighbour cell's quality is much higher than the serving cell's, the node may trigger the handover procedure. The cell with the highest signal quality and the required capabilities is chosen as the target cell, where the required capabilities include support for the allowed slicing, the requested NPN, or other features. SON defines a mechanism that tries to avoid unsuitable mobility strategies via the steps of collecting failure data, recognizing the problem, and adjusting subsequent decisions. However, the existing mechanism still cannot guarantee handover success, due to the following challenges:
- Difficulty for the current trial-and-error-based scheme to achieve nearly zero-failure handover. The existing scheme lets the UE log the related information after a failure and report it to the network, where it is used for root-cause analysis and configuration updates. This failure-driven scheme is vulnerable for packet-drop-intolerant and low-latency applications because of the poor performance during the trial stage, as unsuccessful handovers are the main cause of packet drops or extra delay during mobility. In addition, the effectiveness of feedback-based adjustment may be weak due to the randomness and inconstancy of the transmission environment.
- Additional complexity from Dual Connectivity (DC), Conditional Handover (CHO), and Dual Active Protocol Stack (DAPS). Dual connectivity was introduced to provide adequate capacity for high-traffic services, and the dimensionality and complexity of the mobility decision for DC cases increase remarkably. For example, the handover types for DC cases include single-connectivity to dual-connectivity, dual-connectivity to single-connectivity, secondary node change with the same master node, master node change with the same secondary node, and change of both master and secondary nodes. Besides, conditional handover is designed to reduce the handover time and improve service continuity, and it can be combined with DC cases such as Conditional PSCell Addition and Conditional PSCell Change.
RAN-intelligence-enabled mobility aims at robustness improvement and unintended-event avoidance. Experience can be learnt by observing massive numbers of handover events with their associated parameters. The AI/ML model tries to identify the sets of parameters that lead to successful handover in various situations, e.g. resource status of neighbour cells, UE measurement results, UE trajectory-related information, and performance requirements. The proactive decision from AI/ML is expected to approach zero-failure mobility management.
In addition, the AI/ML model can help predict the UE trajectory for the following time period by exploring the trend in historical position data, so that the handover strategy can take the predicted trajectory information as a reference. In detail, predicted trajectory information helps select the proper target node/cell to alleviate wrong-cell handovers, and set the precise handover time to avoid too-early/too-late handovers, reducing the failure rate. For high-mobility UEs, the amount of data collected in a small-coverage cell (e.g. an mmWave cell) is low, which is not sufficient for the AI/ML model to generate accurate predicted UE trajectory information. Moreover, node capabilities to support AI/ML models are diverse, so some nodes may lack the ability to perform AI/ML model inference or the UE trajectory prediction function. To extend the benefits of UE trajectory prediction, the source node can carry the prediction results to the target node as reference information for subsequent handover decisions. The target node feeds back the actual UE trajectory to the source node to monitor the prediction accuracy.
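A minimal sketch of trajectory-assisted target selection is shown below: extrapolate the trend of historical positions and choose the candidate cell nearest the predicted point. The linear motion model and the distance-based cell choice are deliberate simplifications; a deployed model would use richer features and a learned predictor:

```python
def predict_position(history: list[tuple[float, float]],
                     steps_ahead: int = 1) -> tuple[float, float]:
    """Linearly extrapolate the UE position from its last two samples."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    return (x1 + (x1 - x0) * steps_ahead, y1 + (y1 - y0) * steps_ahead)

def select_target_cell(history: list[tuple[float, float]],
                       cell_sites: dict[str, tuple[float, float]]) -> str:
    """Pick the candidate cell closest to where the UE is predicted to be."""
    px, py = predict_position(history)
    return min(cell_sites,
               key=lambda c: (cell_sites[c][0] - px) ** 2
                           + (cell_sites[c][1] - py) ** 2)
```

For a UE moving east, the sketch prefers the eastern candidate cell even if the western one is momentarily closer, which is exactly the wrong-cell avoidance benefit described above.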
Figure 3. AI/ML based mobility optimization
As AI/ML for NG-RAN in Rel-18 is the first AI/ML-enabled normative work item in the 3GPP RAN working groups, initial solutions for embedding AI/ML functionality in the access network are specified, with the aim of generating high-precision strategies and predictions that improve the efficiency of network management and maintenance in the three prioritised use cases. To continue exploring RAN intelligence in depth, a new study item is planned after the ongoing work item to seek schemes for further enhancement and for other potential use cases, such as slicing, QoE, etc. Samsung, as one of the main contributors to the AI/ML for NG-RAN standardization, will keep driving system automation with the aid of advanced techniques to secure network performance.