ChatGPT is, ostensibly, a chatbot that relies on a large language model (LLM) built around an artificial neural network to generate text. But that underlying capability has proven to be extremely versatile, and people have been able to use ChatGPT for everything from writing code to solving puzzles. It can even act as a voice assistant, like an advanced version of Siri or Alexa. Its abilities are almost limitless if you have the skill to teach it how to handle the desired tasks. Zoltan took advantage of that to turn a vintage rotary phone into a home assistant powered by ChatGPT.
This is an update to a project that Zoltan completed years ago, in which he put a basic voice assistant into an old rotary phone. When he picked up the handset, he could speak a command and the assistant would respond, provided it understood what was asked of it. But that assistant wasn’t very good at understanding or interpreting speech. If the spoken command did not match the exact format or wasn’t enunciated well, the assistant would not be able to respond. Most of us have experienced similar problems with Siri and other voice assistants. ChatGPT, however, is very good at interpreting natural language, so Zoltan upgraded his rotary phone voice assistant to use it.
The only hardware in the phone itself is a Grandstream HT801, which is a device intended to convert analog phones into VoIP (Voice over Internet Protocol) devices. In this case, it turns audio picked up by the handset into a digital audio stream and vice versa. That audio feeds to a Raspberry Pi single-board computer located elsewhere in Zoltan’s home. The Raspberry Pi then handles all of the communication with the various services this project requires.
First, it sends audio to OpenAI’s Whisper service, which provides speech recognition. The text generated by Whisper then goes to ChatGPT for interpretation. If it is something like a question, ChatGPT will return an answer in the form of text. That text is then passed to AWS Polly, which handles the text-to-speech functionality. Finally, that audio goes back through the Grandstream HT801 to the handset’s speaker.
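The three-stage flow described above can be sketched as a small pipeline. This is only an illustration, not Zoltan's code: the function name `handle_utterance` and the three stand-in stages are hypothetical, and in the real project each stage would call the corresponding service (Whisper, ChatGPT, AWS Polly) instead of the offline placeholders used here.

```python
from typing import Callable


def handle_utterance(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one handset utterance through the three stages in order."""
    text = transcribe(audio)   # speech recognition (Whisper in the article)
    reply = respond(text)      # language model (ChatGPT in the article)
    return synthesize(reply)   # text-to-speech (AWS Polly in the article)


# Hypothetical stand-ins so the sketch runs without any network access.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You asked: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = handle_utterance(
    b"what time is it", fake_transcribe, fake_respond, fake_synthesize
)
```

Passing the stages in as callables keeps the ordering logic separate from the service clients, so any one service could be swapped out without touching the rest.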
If, however, ChatGPT interprets the text as a command, such as “turn on the lights,” it will call a Python function. It’s up to Zoltan what to do with those functions. He can, for instance, play a song through the Spotify API (Application Programming Interface). In theory, Zoltan can control anything that has an accessible API. But that does require that he set up the functions himself, because ChatGPT can’t make those connections on its own.
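One way such command handling could be wired up is a registry that maps command names to Python functions. The article does not say how ChatGPT signals a command, so this sketch assumes the model replies with a JSON object of the form `{"name": ..., "arguments": {...}}`. That shape, the `COMMANDS` registry, and the `set_lights` and `play_song` functions are all hypothetical; the function bodies stand in for real calls to a smart-home or Spotify API.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: command name -> Python function to run.
COMMANDS: Dict[str, Callable[..., str]] = {}


def command(func: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so it can be dispatched."""
    COMMANDS[func.__name__] = func
    return func


@command
def set_lights(state: str) -> str:
    # A real version would call a smart-home API here.
    return f"lights turned {state}"


@command
def play_song(title: str) -> str:
    # A real version would call a music service API here.
    return f"playing {title}"


def dispatch(model_reply: str) -> str:
    """Run the requested command, or pass a plain-text answer through."""
    try:
        request = json.loads(model_reply)
    except json.JSONDecodeError:
        return model_reply  # not JSON, so treat it as an ordinary answer
    if not isinstance(request, dict) or "name" not in request:
        return model_reply
    func = COMMANDS.get(request["name"])
    if func is None:
        return f"unknown command: {request['name']}"
    return func(**request.get("arguments", {}))
```

For example, `dispatch('{"name": "set_lights", "arguments": {"state": "on"}}')` runs `set_lights`, while an ordinary sentence is returned unchanged so it can be spoken back to the caller. Only functions that have been registered by hand can ever run, which mirrors the point that the connections must be set up manually.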
And, of course, there’s the issue of accuracy when it comes to factual information. ChatGPT is well known for its “hallucinations” (read: lies). It can also be outdated if an answer comes from training data gathered before an event occurred. As Zoltan demonstrates, it was unaware of the death of Queen Elizabeth II because that happened after its last training data update.
Even so, it is a fun project and it ultimately has more potential than your typical consumer voice assistant.
