Technical breakthroughs in communication, beginning with the Internet and followed by increasing digitisation and the proliferation of personal devices such as mobile phones and smart appliances, have made it possible to connect and digitise many human activities. This has enabled industries to collect, with consent, large amounts of user data, which is used to offer personalised services and products that enhance the customer experience.
Personalization refers to the tailored experience a customer-facing organisation creates for an individual based on its knowledge of that customer. Advances in artificial intelligence (AI) techniques have made it possible to personalise based on a user's mood, location, context, and more. For any industry, personalization increases sales, improves customer retention, and makes targeted advertising more effective. Customers, in turn, enjoy personalised services because they save time and effort.
However, such detailed user data is often personal in nature and thus poses a threat to user privacy. Here, data privacy should not be confused with data security. Data privacy typically refers to a user's ability to control and regulate private, sensitive data, such as personally identifiable information or information from which a person's identity can be derived. Data security refers to the systems that protect such data from unauthorised breaches and cyber-attacks.
Figure 1. User Dilemma between Personalized Services and Privacy of their Data
Users are therefore becoming concerned that their collected personal, sensitive data may be used for purposes other than those for which it was collected, e.g., shared with third parties or used to flood them with marketing messages. In addition, as the data collected by industries is generally stored on a server, a single data breach can expose the personal, sensitive data of a large number of users. Therefore, access to and storage of such data, even with good intentions, should be fully controlled.
Thus, privacy and personalization are generally seen as mutually exclusive. Most service providers offer personalization, but to do so they collect and process their users' private, sensitive data. Other providers do not access user-sensitive data, thereby guaranteeing privacy, but cannot provide any kind of personalization. This trade-off between providing personalised services and protecting user privacy is known as the Privacy-Personalization Paradox. Even when users are aware of this trade-off, they generally sacrifice privacy for personalization.
However, we claim that users do not have to compromise on privacy to enjoy the same personalised experience. Improvements in the memory and computational capability of personal devices, together with compression techniques for large AI models, have made it possible to deploy AI models on the devices themselves, ushering in On-Device AI techniques that have the potential to provide both privacy and personalization. This also serves the interests of industries, given privacy regulations such as the GDPR and CCPA coming into effect: they want to avoid privacy violations while providing the same personalised services to their users as before.
Traditionally, industries have collected user data on a central server, where it is used to train AI models; these models are then used to infer personalised recommendations for users. On-Device AI technology takes the opposite approach: instead of user data being collected on a central server, an AI model is trained in the cloud on publicly available data and then deployed on individual user devices. The difference between the two systems is depicted in Fig. 2.
Figure 2. Difference between centralised AI and On-Device AI
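To make the contrast in Fig. 2 concrete, the sketch below, in Python with NumPy, mimics the On-Device AI flow under illustrative assumptions: the linear model, datasets, and function names are made up for the example, not an actual product API. A model is pretrained in the cloud on public data, its weights are shipped to the device, and any further personalization and inference happen locally.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Cloud side: pretrain a simple model on PUBLIC data only ---
X_public = rng.normal(size=(1000, 3))
y_public = X_public @ np.array([1.0, -2.0, 0.5])
w_global, *_ = np.linalg.lstsq(X_public, y_public, rcond=None)

# --- Device side: weights are downloaded once; private data never leaves ---
def personalise_on_device(w, X_local, y_local, lr=0.01, steps=100):
    """Fine-tune the downloaded weights on the user's private data, locally."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * 2 * X_local.T @ (X_local @ w - y_local) / len(y_local)
    return w

X_private = rng.normal(size=(50, 3))                 # stays on the device
y_private = X_private @ np.array([1.2, -1.8, 0.7])   # user-specific pattern
w_local = personalise_on_device(w_global, X_private, y_private)

# Personalised prediction computed entirely on the device.
print("local prediction:", X_private[:1] @ w_local)
```

Only the global weights cross the network; the private data and the personalised weights never leave the device.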
On-Device AI brings several advantages:
1. It can be categorised as a privacy-by-design approach, since the data never leaves the user's device in order to provide personalised services. Industries therefore do not have to perform any separate privacy analysis, as a system developed under this principle conforms to privacy regulations automatically.
2. Users will be willing to give access to more data than when they must share it with a central server. For example, a user's data can be shared securely across their personal devices, so that the on-device AI models are exposed to more data, leading to better or even hyper-personalization.
3. As the AI model resides on the device itself, the user does not need an internet connection to receive services. This also reduces latency, leading to greater customer satisfaction.
4. As the data remains on the user's devices, users retain more control over it and do not have to worry about it leaking from a server or being shared with third parties.
With the increased communication bandwidth of 5G technology and the growing computational capability of devices, On-Device AI is set to become ever more popular as users demand private, personalised, and fast services. Samsung has already demonstrated this with On-Device AI solutions such as smart widgets and image tilt correction.
One may ask: if On-Device AI brings the best of both worlds (privacy and personalization), why is it not yet widely adopted? To compete with traditional centralised-AI solutions, On-Device AI technology needs to evolve. We need to develop learning frameworks that combine cloud and on-device processing using privacy enhancing technologies (PETs), so that models can learn from an individual's data as well as other users' data without compromising any user's privacy.
PETs are a range of technologies that enable value to be extracted from data while ensuring its privacy. Federated learning is one such machine learning technique: it trains an AI model across multiple personal devices without exchanging their data. It achieves this by transmitting only the local model gradients, aggregating them in the cloud, and then sending the updated global model back to all users, as shown in Fig. 3. Differentially private techniques add statistical noise to data so that certain facts about the dataset can be publicly shared with a negligible probability of privacy leakage. The 2020 US census adopted differential privacy techniques to counter the emerging threats of the digital world.
Figure 3. Federated learning across multiple users' personal devices
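To make the federated averaging loop of Fig. 3 concrete, here is a minimal simulation in Python with NumPy. It is a sketch under simplifying assumptions: a linear model stands in for a real network, locally trained weights stand in for the transmitted gradients, and the client count, learning rate, and number of rounds are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n_clients, n_features, lr = 4, 3, 0.05
w_true = np.array([1.0, -2.0, 0.5])

# Private per-device datasets: these never leave the "device".
client_data = []
for _ in range(n_clients):
    X = rng.normal(size=(200, n_features))
    y = X @ w_true + 0.1 * rng.normal(size=200)
    client_data.append((X, y))

def local_update(w, X, y, steps=5):
    """Run a few gradient steps on the device's private data."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

w_global = np.zeros(n_features)
for _ in range(20):                           # communication rounds
    # Only model parameters, never raw data, are uploaded to the cloud.
    local_models = [local_update(w_global, X, y) for X, y in client_data]
    w_global = np.mean(local_models, axis=0)  # aggregation in the cloud

print("recovered weights:", np.round(w_global, 2))
```

The differential privacy idea mentioned above can be illustrated just as briefly. The sketch below applies the classic Laplace mechanism to a counting query; the dataset, query, and epsilon value are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
ages = rng.integers(18, 90, size=1000)        # private dataset

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release a statistic with noise calibrated for epsilon-DP."""
    return true_value + rng.laplace(scale=sensitivity / epsilon)

# A counting query changes by at most 1 when one person is added or
# removed, so its sensitivity is 1.
noisy = laplace_mechanism(int(np.sum(ages > 65)), sensitivity=1, epsilon=0.5)
print("noisy count of people over 65:", round(noisy))
```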
Homomorphic encryption is a cryptographic technique that enables mathematical operations to be performed on encrypted data, producing a result that, when decrypted, matches the result of applying the same operations to the unencrypted data, as depicted in Fig. 4. This allows a user's encrypted data to be transferred, analysed, and returned without ever compromising privacy. This methodology is apt for applications where computations are better done in the cloud than on a device.
Figure 4. Homomorphic encryption methodology
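As a toy illustration of the additive homomorphic property depicted in Fig. 4, the following from-scratch, Paillier-style sketch works in pure Python. The tiny hard-coded primes are for readability only; real deployments use vetted libraries and keys of 2048 bits or more.

```python
from math import gcd
import random

p, q = 293, 433                 # toy primes; far too small for real use
n, n2, g = p * q, (p * q) ** 2, p * q + 1
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)

def L(x):
    return (x - 1) // n

mu = pow(L(pow(g, lam, n2)), -1, n)             # modular inverse (Python 3.8+)

def encrypt(m):
    """Paillier encryption: c = g^m * r^n mod n^2, with random r."""
    r = random.randrange(2, n)
    while gcd(r, n) != 1:
        r = random.randrange(2, n)
    return (pow(g, m, n2) * pow(r, n, n2)) % n2

def decrypt(c):
    return (L(pow(c, lam, n2)) * mu) % n

a, b = 42, 99
c_sum = (encrypt(a) * encrypt(b)) % n2          # multiply ciphertexts...
print(decrypt(c_sum))                           # ...to add plaintexts: 141
```

Paillier supports only addition on ciphertexts; fully homomorphic schemes also allow multiplication, at a higher computational cost.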
Most of the above PETs have already found application in industrial use-cases and are rapidly maturing. This makes them highly relevant for Samsung, with its wide portfolio of smart appliances. Samsung Research is committed to actively contributing to research that makes privacy enhancing technologies industry-ready, in order to provide private and personalised services.