Artificial intelligence-powered voice assistance will spell the end of manual controls in cars, but it also raises serious risks for user privacy and cybersecurity.
That’s the assessment of Dani Cherkassky, CEO of Kardome, a voice-activated human-machine-interaction start-up that will see its first in-cabin system rolled out this year by a mass-market global automaker.
His warning centers on voice assistants linked to the cloud, where personal data can be mined without the user’s knowledge.
Cherkassky highlights Amazon’s recent move to send all users’ voice recordings to its cloud automatically to train its upcoming Alexa Plus software, essentially depriving users of the ability to keep their instructions and conversations private.
He says this threatens the expansion of AI-powered voice assistance because people just won’t trust it or use it.
“We are heading towards a world dominated by a speech-based operating system,” Cherkassky predicts. “And it is moving towards a voice-enabled computer operating system for both cars and other devices.”
However, he insists that for such systems to work most effectively, they must be on all the time, learning from and interacting with the user and their needs.
With Alexa, for example, the “first concern that brings is: would you use a system that is always on, constantly listening to you 24/7 and uploading your data to the cloud, when you have doubts about what is actually happening with this data?”
For Cherkassky, the answer is a resounding “No.” “I don’t want Big Brother listening to me 24/7,” he says, and nobody else does either, because “in terms of privacy, you are giving up on everything that belongs to you.”
His concern is sharpened by several high-profile cybercriminal hacks of commercial cloud storage, not least Toyota’s 2023 breach, in which a misconfigured cloud environment left about 260,000 customers’ data exposed online. Cherkassky says the solution is Kardome’s approach: a voice assistant that runs entirely on Edge compute, even when there is no internet connectivity.
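To make that architecture concrete, here is a minimal sketch of an edge-only pipeline. Every function below is a hypothetical placeholder standing in for a compact on-device model; this is not Kardome’s software, only an illustration of the dataflow Cherkassky describes.

```python
# Minimal sketch of an edge-only voice pipeline: every stage runs on
# the device, so the assistant keeps working with no internet
# connection and no audio ever leaves the cabin.
# All functions are hypothetical placeholders, not Kardome code.

def transcribe_locally(audio: bytes) -> str:
    """On-device speech-to-text (placeholder for a compact ASR model)."""
    return "make the cabin warmer"  # stubbed output for illustration

def handle_intent(text: str) -> str:
    """On-device intent handling (placeholder for a tiny local model)."""
    return f"OK: {text}"

def process_utterance(audio: bytes) -> str:
    # Note what is absent: no HTTP client, no cloud upload, no network
    # dependency anywhere in the call chain.
    return handle_intent(transcribe_locally(audio))

print(process_utterance(b"\x00" * 16000))  # behaves identically offline
```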
The system was born of frustration with modern voice assistants, which Cherkassky says have progressed very little in the past 30 years or so and frequently misunderstand instructions, often because of noisy background environments.
Cherkassky says his company’s “spatial listening model” software captures audio while understanding the three-dimensional environment. “This sort of mimics the human auditory system: it aims to understand where the different sources of sound are located, to reject environmental and other noise, and to focus on the sounds that are important to the user.”
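Kardome has not disclosed how its spatial listening model works, but the general principle of focusing on one talker and rejecting off-axis noise can be illustrated with a textbook delay-and-sum beamformer, sketched below. This is a generic, assumption-laden illustration of spatial filtering with a microphone array, not Kardome’s algorithm.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # metres per second in air

def delay_and_sum(mic_signals, mic_positions, direction, sample_rate):
    """Steer a microphone array toward `direction` by time-aligning each
    channel and summing: sound arriving from that direction adds up
    coherently, while off-axis noise partially cancels.

    mic_signals:   (n_mics, n_samples) recorded audio, one row per mic
    mic_positions: (n_mics, 3) microphone coordinates in metres
    direction:     (3,) unit vector from the array toward the talker
    """
    n_mics, n_samples = mic_signals.shape
    # Relative arrival delay of a plane wave at each microphone (seconds).
    delays = mic_positions @ direction / SPEED_OF_SOUND
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / sample_rate)
    out = np.zeros(n_samples)
    for signal, tau in zip(mic_signals, delays):
        # A fractional-sample delay is a phase ramp in the frequency domain.
        spectrum = np.fft.rfft(signal) * np.exp(-2j * np.pi * freqs * tau)
        out += np.fft.irfft(spectrum, n=n_samples)
    return out / n_mics
```

A production system would presumably go well beyond this, tracking multiple talkers and adapting to the cabin acoustics, but the underlying idea of exploiting microphone geometry to separate sound sources is the same.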
Kardome’s system won the backing of Hyundai Motor in 2020, when the automaker led a seed funding round to develop a system that can handle instructions from several users in noisy conditions.
However, Cherkassky also warns that the advance of AI within voice-activated systems could accelerate the leaking of private data unless it is ring-fenced from third-party data-gathering operations. Systems, he says, have to be tailor-made for specific uses to protect privacy: “It is unreasonable to assume that there will be one entity, made by Google, Apple or whoever, that will be used as our voice assistant for all aspects of our lives – from our cars to TVs to personal devices.”
All of this must be restricted to the Edge rather than automatically sent to the cloud, Cherkassky argues, pointing to AI coding assistants as an example: “Today they are coding using AI assistance, and so every keystroke that you hit on your keyboard is going to the cloud.”
Instead of the large language models used by mainstream AI providers, he suggests bespoke tiny language models “for our interaction with a specific device, and this should happen privately, on the Edge.” He concludes: “In terms of experience, they should be fast, private and should be able to work even if the network fails.”
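As a rough sketch of that tiny-model-per-device idea, the toy classifier below maps a transcript to one of a few hypothetical in-cabin intents entirely on the device. The intents and the keyword scoring are illustrative stand-ins for whatever compact model a vendor would actually ship; the point is the privacy property, not the method.

```python
from dataclasses import dataclass

@dataclass
class IntentResult:
    intent: str
    confidence: float

class EdgeVoiceAssistant:
    # A few hypothetical in-cabin commands a device-specific tiny
    # model might be trained to recognise.
    INTENTS = {
        "set_temperature": ("temperature", "climate", "warmer", "cooler"),
        "navigate": ("navigate", "directions", "route"),
        "play_media": ("play", "music", "radio"),
    }

    def classify(self, transcript: str) -> IntentResult:
        """Stand-in for a tiny on-device language model: scores each
        intent by keyword overlap. A real system would run a compact
        neural model, but the privacy property is the same: the
        transcript is processed and discarded locally."""
        words = set(transcript.lower().split())
        scores = {
            intent: len(words & set(keywords)) / len(keywords)
            for intent, keywords in self.INTENTS.items()
        }
        best = max(scores, key=scores.get)
        return IntentResult(best, scores[best])

assistant = EdgeVoiceAssistant()
print(assistant.classify("make the cabin warmer"))  # no cloud round-trip
```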