Open Source

The telecommunication network evolution into cloud native architecture

By Ranny Haiby Samsung Research America

By Victor Morales Samsung Research America

What does cloud native mean for telecommunications?

The term “cloud native” describes an application architecture designed to run in cloud environments and take advantage of their capabilities. For over a decade, the telecommunications industry has been working on adopting modern software architecture for the devices in wireless and broadband networks. In the first phase, legacy monolithic telecommunications applications were simply lifted from their physical machines and shifted to run on virtual machines in the cloud, an approach known as Network Functions Virtualization, or NFV.
Cloud native applications, on the other hand, are designed from day one to run in a cloud. As such, they are deployed in lightweight containers, designed as loosely coupled microservices, and make extensive use of APIs for internal and external interactions. Telecommunications vendors and operators working on the NFV initiative have been focusing on virtualization technologies for years. Recently there has been a desire to follow the success of web-scale companies that use cloud native architecture. Several aspects of the cloud native architecture, such as scalability and modularity, make it a strong match for 5G networks.

The move to Cloud native Network Functions (CNFs) instead of Virtual Network Functions (VNFs) is now considered a mainstream topic in the telecommunications industry. The current discussion is about the “how” and no longer about the “why”. Running network functions in containers using Kubernetes is becoming the new norm. However, a Kubernetes platform still has several functionality gaps when it comes to running network functions. A few of the main gaps identified are:
a. Network connectivity for containers, including high throughput interfaces (SR-IOV, DPDK).
b. Hardware topology awareness in container scheduling (NUMA aware schedulers).
c. Harnessing hardware acceleration technologies and making them available to containers.
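
To make gap (b) more concrete, here is a toy sketch of what topology-aware placement means: a workload that needs both CPUs and SR-IOV virtual functions should get them from the same NUMA node. The `NumaNode` type, the `pick_numa_node` function, and the best-fit rule are all hypothetical and chosen for illustration; this is not how any particular Kubernetes scheduler is implemented.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class NumaNode:
    """Free resources on one NUMA node (hypothetical model, not a Kubernetes type)."""
    node_id: int
    free_cpus: int
    free_vfs: int  # free SR-IOV virtual functions on NICs attached to this node


def pick_numa_node(nodes: Sequence[NumaNode], cpus: int, vfs: int) -> Optional[int]:
    """Return the id of a NUMA node that can serve the whole request, or None.

    The request is never split across nodes. Among the candidates, the node
    that would leave the fewest CPUs unused is preferred (an assumed best-fit
    rule); ties are broken by the lower node id.
    """
    candidates = [n for n in nodes if n.free_cpus >= cpus and n.free_vfs >= vfs]
    if not candidates:
        return None
    best = min(candidates, key=lambda n: (n.free_cpus - cpus, n.node_id))
    return best.node_id
```

With a host where node 0 has eight free CPUs but no free virtual functions and node 1 has four free CPUs and two virtual functions, a request for four CPUs and one virtual function lands on node 1, while a request for six CPUs and one virtual function is rejected instead of being spread over both nodes.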

What exactly is cloud native?

After a collaborative discussion, the CNCF published the following definition of cloud native:
“Cloud native technologies empower organizations to build and run scalable applications in modern, dynamic environments such as public, private, and hybrid clouds. Containers, service meshes, microservices, immutable infrastructure, and declarative APIs exemplify this approach. These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”

Cloud native Network Functions, or CNFs, are a hot topic in the industry. According to a survey by the Cloud Native Computing Foundation (CNCF), 88% of Communication Service Providers already use Kubernetes, and 50% have some production deployment of it. According to the same survey, networking and security are two of the main challenges when using Kubernetes for Telco workloads.

The evolution to cloud native architecture – the case of ONAP

Cloud native network functions make the role of an orchestrator more important than ever. The added level of complexity created by the decomposed architecture makes orchestration a necessity, not a luxury. The Open Network Automation Platform (ONAP) is a comprehensive platform for real-time, policy-driven service orchestration and automation for networks and edge services. It is being developed as an open source software project with neutral governance under the auspices of The Linux Foundation. The project started back in 2017, when the VNF was the leading technology, and is quickly evolving to adapt to the new world dominated by Cloud native Network Functions (CNFs). A special taskforce in the ONAP community named the “CNF Taskforce” is steering the direction of this evolution and coordinating the various activities related to this transition. The activities of the taskforce fall into two main categories:
• ONAP evolving to orchestrate network services built using CNFs.

• The ONAP software becoming a true cloud native application.

CNF Orchestration
In the first category, CNF orchestration, the main development is around the onboarding and design of services, playing a role in “super blueprints” that provide a reference architecture to real-life services, and the actual orchestration of the services:

Source: https://wiki.onap.org/display/DW/TSC+Task+Force+-+Cloud+Native

Samsung software engineers and architects are deeply involved in this effort, contributing to the system design effort and the source code implementation. Several of our colleagues from Samsung R&D Institute Poland (SRPOL) successfully demonstrated how ONAP orchestrates CNFs during a recent developer forum event of the Linux Foundation Networking. Ranny Haiby from the Open Source Group (OSG) at Samsung Research America (SRA) is co-leading the CNF taskforce. This kind of active involvement in the development allows us to gain intimate knowledge of the implementation and make sure the cloud native network functions produced by Samsung are fully compatible with the ONAP orchestrator and ready to be deployed by our customers.

ONAP as a cloud native application
In this category, the main efforts are around Containerization, Configuration & Security management, and Observability & Analysis.

Source: https://wiki.onap.org/display/DW/TSC+Task+Force+-+Cloud+Native

The key aspect of Configuration & Security management is the adoption of a service mesh architecture. This architecture was first proposed by a Samsung developer from SRPOL, driven by his concern regarding the security aspects of ONAP. The security expertise of this developer and his team members allowed them to analyze the security vulnerabilities of ONAP and come up with a solution using state-of-the-art technologies. Samsung developers later collaborated with other ONAP community members in creating a proof-of-concept that paved the way for the adoption of the service mesh technology.

CNF Workgroup under the Cloud Native Computing Foundation (CNCF)

The technical challenges involved in running network workloads as cloud native applications are still being addressed by the industry, and it seems that the work done so far has just scratched the surface. To help with that, the CNCF launched a new workgroup to create best practices for designing and building cloud native network functions. The workgroup (cnf-wg) consists of experts from network function vendors, operators and the CNCF community. Members of the OSG in SRA are actively participating and contributing to the deliverables of the workgroup. Several discussions are ongoing in parallel in this workgroup, all aimed at creating best practices for building proper cloud native network functions (CNFs). While it is relatively easy to re-package a virtual machine to run inside a container, it is far more difficult to make the network function software behave like a true cloud application, with the same scalability, resilience, and portability as cloud apps.

Use Cases

The workgroup creates best practices that are driven by Telecommunications use cases. Those use cases can be categorized into two major groups: Infrastructure restrictions and customer limitations.

Infrastructure restrictions are use cases resulting from the implementation and usage of Kubernetes. One example is the onboarding process (a common term for getting the CNF deployed on the platform), where the CNF vendor expects the CSP platform to fulfill certain requirements and preconditions for the CNF. This assumption can result in intensive interaction between the CNF vendor and the CNF and platform DevOps teams, and in a lot of trial-and-error tuning and manual adaptation during the process.

Another interesting use case in this group is the lifecycle management of the infrastructure. The Kubernetes-native approach assumes that cluster nodes are ephemeral and relatively short-lived. This affects vendors that have created CNFs tightly coupled to the underlying infrastructure: as a consequence, such CNFs block Kubernetes cluster nodes from being drained for operation and maintenance. CNFs should be written for the cloud; in other words, they should be able to recover from node failures and draining with marginal loss of traffic.

Customer limitations are a group of use cases where the restrictions are faced by CNF developers and architects. One example is the BGP (Border Gateway Protocol) connection, which was designed to run directly from one BGP process to another, on the assumption that the network provides reachability and nothing else. Current BGP speakers therefore cannot make use of the standard forms of address rewriting or load balancing offered by the Kubernetes platform.

Another use case deals with network functions that maintain state (infrastructure). A 5G real-time convergent charging system has to maintain the account balances and usage quotas of active subscriptions. The state should be available across the cluster to all stateless CNFs that need to access it. Without real-time decisions, the service provider is exposed to potential financial loss. When a node must be taken out of a cluster for planned downtime in a low-latency environment, it can be challenging to move the workload from one node to another: all ongoing charging sessions must be completed before any move can take place. In case of a node failure, the short-lived state must already have been replicated to other nodes in the cluster.
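
The planned-downtime requirement can be sketched as a small piece of bookkeeping: once draining starts, no new charging sessions are accepted, and the workload is only safe to move when the in-flight sessions have finished. The `ChargingSessionTracker` class and its method names below are hypothetical and only illustrate the idea; a real CNF might call `begin_drain()` when it receives the SIGTERM that Kubernetes sends to a container before terminating its pod.

```python
class ChargingSessionTracker:
    """Tracks in-flight charging sessions on one node (illustrative only)."""

    def __init__(self) -> None:
        self._active = set()    # ids of sessions currently being charged
        self._draining = False  # True once the node is scheduled to go away

    def start(self, session_id: str) -> bool:
        """Accept a new session unless the node is draining."""
        if self._draining:
            return False        # caller should route the session elsewhere
        self._active.add(session_id)
        return True

    def finish(self, session_id: str) -> None:
        """Mark a session as completed; unknown ids are ignored."""
        self._active.discard(session_id)

    def begin_drain(self) -> None:
        """Stop accepting new sessions; existing ones keep running."""
        self._draining = True

    def safe_to_move(self) -> bool:
        """True only when draining has started and nothing is in flight."""
        return self._draining and not self._active
```

This covers only the planned case. For an unplanned node failure, the session state itself would have to be replicated to other nodes ahead of time, which this sketch does not attempt.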

Best Practices

One example of a best practice deals with container execution privileges. Cloud native 5G networks can be a lucrative target for cyber threat actors. To mitigate this security risk, the CNF working group has defined the non-root user container execution best practice. It proposes running containers with a host-independent user, to prevent compromised CNFs from causing more damage in a multitenant environment like Kubernetes.
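
As an illustration of how such a practice could be checked, the sketch below inspects a pod spec, represented as a plain Python dictionary, and lists the containers that might still run as root. It relies on the standard Kubernetes `securityContext` fields `runAsUser` and `runAsNonRoot`, where a container-level value takes precedence over the pod-level one. The function name and the simplified rule are assumptions made for this example and are not part of the workgroup's deliverables; the check also cannot see the user baked into the container image.

```python
from typing import Any, Dict, List


def containers_possibly_root(pod_spec: Dict[str, Any]) -> List[str]:
    """Return names of containers that are not clearly forced to run as non-root.

    Simplified rule (an assumption for this sketch): a container is flagged if
    its effective runAsUser is 0, or if runAsUser is unset and runAsNonRoot is
    not explicitly true.
    """
    pod_ctx = pod_spec.get("securityContext") or {}
    flagged = []
    for container in pod_spec.get("containers", []):
        ctx = container.get("securityContext") or {}
        # Container-level settings override pod-level settings.
        run_as_user = ctx.get("runAsUser", pod_ctx.get("runAsUser"))
        run_as_non_root = ctx.get("runAsNonRoot", pod_ctx.get("runAsNonRoot"))
        if run_as_user == 0:
            flagged.append(container.get("name", "<unnamed>"))
        elif run_as_user is None and run_as_non_root is not True:
            flagged.append(container.get("name", "<unnamed>"))
    return flagged


# Usage with hypothetical container names:
example_spec = {
    "securityContext": {"runAsNonRoot": True},
    "containers": [
        {"name": "cnf-a"},
        {"name": "cnf-debug", "securityContext": {"runAsUser": 0}},
        {"name": "cnf-b", "securityContext": {"runAsUser": 1000}},
    ],
}
print(containers_possibly_root(example_spec))
```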

Impact of the workgroup

To expand 5G coverage for a growing consumer base, communication service providers need to be able to easily calculate the return on investment for new capital expenditure while continuing to cover their operational expenditures. Therefore, telecommunication service providers need the adaptable, on-demand, pay-as-you-go infrastructure services offered by cloud service providers.

Open source software has been driving the business and technology transformation for communication service providers, allowing them to mix and match on-premises and public cloud infrastructure, all monitored and managed through a single control plane.

The CNF working group has provided a space to collaborate and discuss challenges faced by Cloud Service providers, Communication service providers and Infrastructure providers following the principles provided by open source projects.

Samsung’s Thought Leadership

The Open Source Group at Samsung Research is leveraging its expertise in cloud native technology and telecommunications to contribute to the work of this group. Victor Morales, a member of the Open Source Group at SRA and an active workgroup participant, is making important contributions to the deliverables and reviewing their content to continuously improve them.

#ONAP #OpenSource