The growing interest in accurate positioning has been driven by thriving applications and services, including cellular network operation. Classical positioning methods mostly rely on information extracted from the channel, e.g., time of arrival or angle of arrival (departure). While these work under line-of-sight (LOS) propagation conditions, their accuracy deteriorates heavily in environments where multipath propagation and non-line-of-sight (NLOS) conditions predominate. As shown in Fig. 1 for an indoor factory scenario targeting smart/automated operation, where high accuracy of equipment positions is important, the LOS path can be blocked by all kinds of objects/clutter, which limits the accuracy of traditional methods.
Figure 1. An illustration of complicated indoor factory environment with NLOS scenarios.
Artificial intelligence and machine learning (AI/ML) have drawn great attention in both academia and industry, owing to their strong ability to extract features efficiently and infer accurately. There have been initial studies on applying AI/ML to positioning [1], [2], which showed that AI/ML-based methods have the potential to obtain accurate position estimates. Moreover, AI/ML-based positioning has been approved in the 3rd Generation Partnership Project (3GPP) as one of the three representative use cases for AI/ML in the Rel-18 5G-Advanced network [3]. To bridge academic study and industrial application, one major obstacle is the generalization ability of AI/ML models on unforeseen practical inputs. This is expected to be especially problematic for positioning, since the quality of the channel information (CI) is essential for location estimation. Misalignments (e.g., noisy CI caused by low SNR) or incompleteness of the CI relative to that used in training will severely degrade the performance of the trained model. One possible remedy is to use generalization techniques developed in the broader ML community, such as data augmentation, loss-function regularization, and dropout. These methods can improve generalization, but at the cost of a larger training-time and training-data budget, which may be an issue in practice because of the time-varying nature of CI and limited backhaul link capacity.
In this blog, a novel hybrid machine learning (HML) approach is introduced that exploits both supervised and unsupervised learning models with denoising and inpainting abilities to enable accurate positioning in NLOS scenarios. Simulations show that the proposed approach can achieve roughly 10 times higher accuracy than conventional approaches.
The proposed HML-based positioning method is shown in Fig. 2. In the training phase, two neural networks (NNs), NN-1 and NN-2, corresponding to the unsupervised and supervised parts respectively, are trained in parallel. NN-1, the unsupervised part, is trained on channel impulse responses (CIRs) to learn their statistical characteristics. NN-2 (e.g., a DenseNet [4]) is the supervised model and is trained to learn the mapping from CIRs to locations. In the testing phase, the two NNs are applied successively: NN-1 is used recursively for denoising and inpainting, and the denoised/inpainted CIRs are fed into NN-2 for the final positioning inference. In a practical deployment, a CIR quality check can avoid unnecessary use of NN-1 when the input CIR is already of sufficient quality.
Figure 2. General description of the proposed HML based positioning method.
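The test-phase flow can be sketched as follows. This is a minimal illustration only: `nn1_reverse_step`, `nn2`, and the SNR threshold are hypothetical stand-ins for the trained models and the CIR quality check described above.

```python
import numpy as np

def hml_infer(cir, snr_est, nn1_reverse_step, nn2, snr_threshold=10.0, n_steps=8):
    """Test-phase pipeline: optional recursive denoising (NN-1), then inference (NN-2)."""
    if snr_est < snr_threshold:            # CIR quality check: skip NN-1 for clean inputs
        for t in reversed(range(n_steps)):  # recursive reverse-diffusion refinement
            cir = nn1_reverse_step(cir, t)
    return nn2(cir)                         # final 2D position estimate

# Usage with stub models (illustration only):
rng = np.random.default_rng(0)
cir = rng.standard_normal((256, 18, 2))
nn1 = lambda x, t: 0.9 * x                  # stub denoiser: shrink toward zero
nn2 = lambda x: np.array([12.3, 4.5])       # stub regressor: fixed 2D location
pos = hml_infer(cir, snr_est=0.0, nn1_reverse_step=nn1, nn2=nn2)
```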
A. Unsupervised Learning Part: Denoising and Inpainting
For the unsupervised learning part, a diffusion model (DM) is used to learn the CI without corresponding position information. A DM is a latent variable model parameterized by a Markov chain and trained using variational inference [5]. An illustration of DMs is shown in Fig. 3. In a DM, the data distribution is the starting point of a forward Markov process: it is gradually corrupted into another analytical distribution, e.g., Gaussian or binomial, by adding slowly increasing noise at each step. Given this forward process, a neural network is trained to learn the reverse process, using the theoretical insight that the reverse process has the same functional form as the forward one.
Figure 3. An illustration of diffusion models.
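As a concrete illustration of the forward process, the standard DDPM formulation [5] allows jumping directly to step t in closed form: q(x_t | x_0) = N(sqrt(ᾱ_t) x_0, (1 − ᾱ_t) I), where ᾱ_t is the cumulative product of (1 − β_t). A minimal sketch (the schedule values here are generic, not the ones used in this work):

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)       # per-step noise variances (generic values)
alphas_bar = np.cumprod(1.0 - betas)     # cumulative signal retention

def forward_diffuse(x0, t, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alphas_bar[t]) * x0 + np.sqrt(1.0 - alphas_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = rng.standard_normal((96, 96, 1))
x_late = forward_diffuse(x0, T - 1, rng)  # at the last step, nearly pure Gaussian noise
```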
The noise schedule of the forward process, i.e., the choice of noise variance at each step, is important and affects both the training time and the performance of the trained model. Commonly used schedules include the linear schedule [5] and the cosine schedule [6]. Here, we instead use an exponential schedule to learn the CIR patterns. The intuition behind this choice is that the power-delay profile of a CIR typically exhibits exponential decay. Training the DM with an exponential noise-variance schedule yields better resolution for channel paths with small power, compared with the linear and cosine schedules.
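The three schedules can be compared by how fast ᾱ_t decays. The linear and cosine forms below follow [5] and [6]; the exponential (geometric) spacing is an assumed form for illustration only, since the exact parameterization used in this work is not reproduced here.

```python
import numpy as np

T = 1000
t = np.arange(T)

# Linear schedule [Ho et al.]: betas evenly spaced
betas_lin = np.linspace(1e-4, 0.02, T)
abar_lin = np.cumprod(1 - betas_lin)

# Cosine schedule [Nichol & Dhariwal]: define alpha_bar directly
s = 0.008
f = np.cos((np.arange(T + 1) / T + s) / (1 + s) * np.pi / 2) ** 2
abar_cos = f[1:] / f[0]

# Exponential schedule (assumed form): betas grow geometrically,
# mirroring the exponential decay of a CIR power-delay profile
betas_exp = 1e-4 * (0.02 / 1e-4) ** (t / (T - 1))
abar_exp = np.cumprod(1 - betas_exp)
```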
After training of the DM is complete, an approximate prior distribution of CIRs is obtained. Given a noisy CIR, we can utilize this approximate prior to perform CIR generation, denoising, and inpainting. The proposed DM-based denoising algorithm is illustrated in Fig. 4. A weighted version of the noisy CIR is fed into the trained neural network recursively. Since the conditional probability of the reverse process is Gaussian, the weights of the noisy CIR and of the network output can be obtained in closed form under the maximum a posteriori (MAP) or minimum mean squared error (MMSE) criterion. The denoising algorithm extends easily to incomplete CIRs, a typical case when measurements from some transmission-reception points (TRPs) are unavailable or of too poor quality to be used for positioning. The inpainting algorithm has a similar structure to the denoising algorithm, except that the missing parts of the CIR are recovered by the CIR generation procedure.
Figure 4. An illustration of the proposed denoising algorithm.
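The exact MAP/MMSE weights are derived in the original work; as a generic stand-in, one standard DDPM reverse step [5] combines the current noisy sample and the network's noise estimate in closed form. A minimal sketch, with `eps_net` as a stub for NN-1's trained network:

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # generic schedule values, not those of this work
alphas = 1.0 - betas
alphas_bar = np.cumprod(alphas)

def reverse_step(x_t, t, eps_net, rng):
    """One Gaussian reverse-diffusion step: a weighted combination of x_t
    and the predicted noise, plus fresh noise while t > 0."""
    eps = eps_net(x_t, t)
    mean = (x_t - betas[t] / np.sqrt(1 - alphas_bar[t]) * eps) / np.sqrt(alphas[t])
    if t == 0:
        return mean
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)

def denoise(y, n_steps, eps_net, rng):
    """Recursive denoising: run the last n_steps of the reverse chain on y."""
    x = y
    for t in reversed(range(n_steps)):
        x = reverse_step(x, t, eps_net, rng)
    return x

rng = np.random.default_rng(0)
y = rng.standard_normal((96, 96, 1))        # noisy CIR tensor
eps_net = lambda x, t: np.zeros_like(x)     # stub noise predictor
x_hat = denoise(y, n_steps=8, eps_net=eps_net, rng=rng)
```

For inpainting, a common variant of this loop replaces the known entries of x at every step with a forward-diffused copy of the observation, so that only the missing TRPs' CIRs are generated by the model.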
B. Supervised Learning Part: Location Acquisition
For the supervised model, we use a densely connected convolutional network (DenseNet) to learn the mapping from CIRs to the UE location. DenseNets alleviate the vanishing-gradient problem and strengthen feature propagation, which enables very deep architectures. Compared with conventional convolutional networks or residual networks (ResNets), they have far fewer parameters, making them more suitable for on-device deployment. The DenseNet used in this work and its building blocks are shown in Fig. 5. The core building block, the DenseBlock, consists of multiple convolutional layers followed by one transition layer; the input of each layer is connected to every subsequent layer in the DenseBlock, and the multiple inputs of a layer are concatenated before being fed into it.
Figure 5. An illustration of DenseNet.
Figure 5-1. Building blocks of a DenseNet.
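The dense connectivity pattern can be illustrated with a minimal NumPy sketch: each layer consumes the concatenation of all preceding feature maps, so the channel count grows by the growth rate k per layer. For brevity, each layer is reduced to a random 1×1 convolution plus ReLU, not the actual conv-BN stack of a real DenseNet.

```python
import numpy as np

def dense_block(x, weights):
    """x: (H, W, C) feature map; weights[i]: (C_in_i, k) 1x1-conv kernels.
    Each layer's input is the concatenation of all previous outputs."""
    features = [x]
    for w in weights:
        inp = np.concatenate(features, axis=-1)   # dense connection
        out = np.maximum(inp @ w, 0.0)            # 1x1 conv + ReLU
        features.append(out)
    return np.concatenate(features, axis=-1)

rng = np.random.default_rng(0)
k, c0, n_layers = 12, 16, 4                       # growth rate, input channels, depth
x = rng.standard_normal((24, 24, c0))
weights = [rng.standard_normal((c0 + i * k, k)) * 0.1 for i in range(n_layers)]
y = dense_block(x, weights)
# output channels = c0 + n_layers * k
```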
The 3GPP InF-DH scenario [7] is used for evaluating the proposed method, and the experimental configuration is given in Table 1. Each training sample consists of the CIRs from 18 TRPs and the corresponding 2D UE location. Ideal synchronization among all TRPs is assumed. To evaluate the performance under various noise conditions, white noise at various SNRs is added to the CIRs of the testing set.
Table 1. Experimental configurations
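Adding white noise at a target SNR to the test CIRs can be sketched as follows (a generic implementation, assuming complex-valued CIRs; not code from this work):

```python
import numpy as np

def add_awgn(cir, snr_db, rng):
    """Add complex white Gaussian noise so the resulting SNR equals snr_db."""
    sig_pow = np.mean(np.abs(cir) ** 2)
    noise_pow = sig_pow / 10.0 ** (snr_db / 10.0)
    noise = np.sqrt(noise_pow / 2.0) * (
        rng.standard_normal(cir.shape) + 1j * rng.standard_normal(cir.shape)
    )
    return cir + noise

rng = np.random.default_rng(0)
cir = rng.standard_normal((256, 18)) + 1j * rng.standard_normal((256, 18))
noisy = add_awgn(cir, snr_db=0.0, rng=rng)   # 0 dB: noise power equals signal power
```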
As preprocessing, each input sample of CIRs, with dimension 256×18×2 (delay taps × TRPs × real/imaginary parts), is reshaped to a tensor of shape (96, 96, 1) and then fed into the NNs. Both NNs are trained with the Adam optimizer with a batch size of 128. The learning rate for the DM network is 2e-4; for the DenseNet it is 1e-3, divided by 10 after every 10 epochs. The DM network, with 1000 diffusion steps, is trained for 80K iterations. To prevent overfitting and improve the DenseNet's generalization, the training set is split into 8K training points and 2K validation points. Moreover, early stopping is used, with a patience of 15 epochs before restoring the best weights.
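The reshape works out because 256 × 18 × 2 = 9216 = 96 × 96. A sketch of the preprocessing:

```python
import numpy as np

rng = np.random.default_rng(0)
cir = rng.standard_normal((256, 18)) + 1j * rng.standard_normal((256, 18))

# stack real/imag parts -> (256, 18, 2), then reshape into a 96x96 single-channel image
sample = np.stack([cir.real, cir.imag], axis=-1)
assert sample.size == 96 * 96
tensor = sample.reshape(96, 96, 1)
```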
The cumulative distribution function (CDF) of positioning errors for all methods is given in Fig. 6. First, it can be observed that both the legacy RAT-dependent method, i.e., downlink TDOA (DL-TDOA), and the conventional AI method without data augmentation (DA) perform poorly, with over 30 m estimation error. Here, the conventional AI method uses only the supervised learning model (implemented with the same DenseNet as in HML); with DA, that model is trained on 80K data points at SNRs from -10 dB to 10 dB in 3 dB steps. The conventional AI with DA performs better, but at the cost of a much larger training set, i.e., 80K versus 10K samples for HML. The proposed HML method achieves better accuracy even with less training data; in some cases, it even approaches the results obtained with noise-free CIRs. These results demonstrate the effect of the DM-based denoising and the superiority of HML.
Figure 6. Results of positioning errors under different SNRs.
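An error CDF like the one in Fig. 6 is computed by sorting the per-sample 2D errors; a sketch, with synthetic errors standing in for real results:

```python
import numpy as np

def error_cdf(est, true):
    """Return sorted 2D positioning errors and their empirical CDF values."""
    errs = np.linalg.norm(est - true, axis=1)
    errs_sorted = np.sort(errs)
    probs = np.arange(1, len(errs) + 1) / len(errs)
    return errs_sorted, probs

rng = np.random.default_rng(0)
true = rng.uniform(0, 120, size=(1000, 2))          # synthetic ground-truth positions
est = true + rng.normal(0, 2.0, size=(1000, 2))     # synthetic position estimates
errs, probs = error_cdf(est, true)
err_at_90 = errs[np.searchsorted(probs, 0.9)]       # 90th-percentile error
```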
Fig. 7 shows results for scenarios with missing CIRs, in which 6 CIRs of each testing sample are removed at random. It can be seen that, without denoising and inpainting (D&I), the positioning accuracy degrades rapidly. With only 8 D&I steps, more than 20 m of accuracy improvement is achieved, and the gain increases further with more steps. This demonstrates the effectiveness of the proposed HML's DM-based inpainting for wireless channel information.
Figure 7. Results of positioning errors under 0 dB SNR and 6 missing CIRs.
In this blog, a novel HML-based positioning method is proposed. With its denoising and inpainting abilities, positioning accuracy in NLOS scenarios can be improved significantly under low SNR or with incomplete CIRs. Compared with a purely supervised method, the proposed method achieves better accuracy with less training time and fewer training samples. Moreover, the proposed method is flexible in balancing overhead against positioning accuracy, which makes it easier to meet practical requirements.
[1] P. Ferrand, A. Decurninge, and M. Guillaud, "DNN-based localization from channel estimates: feature design and experimental results," in Proc. IEEE Global Communications Conference (GLOBECOM), Dec. 2020.
[2] S. De Bast, A. P. Guevara, and S. Pollin, "CSI-based positioning in massive MIMO systems using convolutional neural networks," in Proc. IEEE Vehicular Technology Conference (VTC-Spring), May 2020.
[3] 3GPP RP-213599, "New SI: Study on Artificial Intelligence (AI)/Machine Learning (ML) for NR Air Interface," TSG RAN Meeting #94e, Dec. 2021.
[4] G. Huang, Z. Liu, L. van der Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proc. IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Jul. 2017.
[5] J. Ho, A. Jain, and P. Abbeel, "Denoising diffusion probabilistic models," in Proc. Conference on Neural Information Processing Systems (NeurIPS), 2020.
[6] A. Nichol and P. Dhariwal, "Improved denoising diffusion probabilistic models," in Proc. International Conference on Machine Learning (ICML), 2021.
[7] 3GPP TR 38.901 V16.1.0, "Study on channel model for frequencies from 0.5 to 100 GHz (Release 16)," 2019.