# AI Powered Radio Presenter
> *For copyright reasons, the music tracks have been cropped to show only the crossfades, with just enough left for context on what the presenter is talking about. I own legal copies of each song, and I do not condone piracy. Musicians work very hard to bring you the music you love. Well, rock musicians do. The plastic crap on modern radio stations sounds, well, like mass-produced plastic. The portions of the music tracks presented in this demo are part of a critique and are used under the exception in international copyright law known as "Fair Use."*

So in this demo, I had a bunch of my favourite rock songs set up on a virtual server. Each song has a JSON file that marks the cue points for crossfades.
Example:
```json
{
  "artist": "Evanescence",
  "cues": {
    "in": 1.6,
    "out": 229.3
  },
  "duration": 236.4,
  "filename": "evanescence-bringmetolife.mp3",
  "title": "Bring Me To Life"
}
```
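As a minimal sketch, a cue file like this could be loaded and sanity-checked before playback (the `load_cues` function name is mine, not part of the project):

```python
import json

def load_cues(path):
    """Load a track's cue file and sanity-check the crossfade points.

    The fade-in must finish before the fade-out starts, and the
    fade-out must start before the track ends.
    """
    with open(path) as f:
        meta = json.load(f)
    assert 0 <= meta["cues"]["in"] < meta["cues"]["out"] < meta["duration"]
    return meta
```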
Two AI services ran on the virtual server.
The first one curates the playlists, building a playlist of 50 songs. When 10 songs or fewer remain, it curates again so that there are 50 new songs. It chooses the songs so that they flow into each other, and it ensures that no song is repeated within a six-hour period and no artist within a two-hour period.
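The repeat rules can be sketched as a simple filter over the library. This is my own simplified illustration, not the project's code: the shuffle stands in for the model's flow-aware ordering, and the time windows match the six-hour song rule and two-hour artist rule described above.

```python
import random

SIX_HOURS = 6 * 3600
TWO_HOURS = 2 * 3600

def curate(library, song_last, artist_last, now, count=50):
    """Return up to `count` tracks from `library`, skipping songs
    heard in the last six hours and artists heard in the last two.

    `song_last` maps (artist, title) -> last play time; `artist_last`
    maps artist -> last play time (both in seconds).
    """
    playlist = []
    pool = library[:]
    random.shuffle(pool)  # stand-in for the model's flow-aware ordering
    for track in pool:
        key = (track["artist"], track["title"])
        if now - song_last.get(key, float("-inf")) < SIX_HOURS:
            continue
        if now - artist_last.get(track["artist"], float("-inf")) < TWO_HOURS:
            continue
        playlist.append(track)
        # Reserve the slot so later picks in this batch respect the rules too.
        song_last[key] = now
        artist_last[track["artist"]] = now
        if len(playlist) == count:
            break
    return playlist
```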
On the virtual server there are three virtual sound devices in ALSA/JACK. Devices 0 and 1 are used for the music: odd-numbered songs from the playlist play on device 0 and even-numbered songs play on device 1. When a song starts, it fades in from 0% at 0 seconds to 100% at `.cues.in` seconds. When the song reaches `.cues.out` seconds, it starts to fade from 100% down to 0% at `.duration` seconds. Also, at `.cues.out` the next song starts playing on the other device.
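The fade envelope described above amounts to a piecewise-linear gain curve over the track's elapsed time. A minimal sketch (the `gain_at` helper is mine, for illustration):

```python
def gain_at(t, cues_in, cues_out, duration):
    """Linear crossfade gain for a track at elapsed time t (seconds).

    0 -> cues_in        : fade in from 0.0 to 1.0
    cues_in -> cues_out : full volume
    cues_out -> duration: fade out from 1.0 to 0.0
    """
    if t <= 0 or t >= duration:
        return 0.0
    if t < cues_in:
        return t / cues_in
    if t > cues_out:
        return (duration - t) / (duration - cues_out)
    return 1.0
```

With the example cue file above, the track is at half volume 0.8 s in and again midway through its 7.1 s fade-out, so the next track's fade-in overlaps it symmetrically.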
The songs are played in groups of four. While a group is playing, a GPT large language model is given the artists and titles of the last three songs in the group, plus the artist and title of the first song in the next group, and is instructed to talk about them as a radio presenter would. The resulting text is passed to a TTS model that returns an MP3 of spoken word, which is played when song four of the group reaches `.cues.out` and its fade-out begins. The first song of the next group is started so that the MP3 from the TTS model ends exactly as that song reaches `.cues.in`.
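Two pieces of that step can be sketched in a few lines: building the presenter prompt, and scheduling the next song so the speech lands on its in-cue. Both function names and the prompt wording are my own assumptions, not the project's actual prompt:

```python
def presenter_prompt(last_three, next_song):
    """Build the LLM instruction from three played tracks and the next one."""
    played = "; ".join(f"{t['artist']} - {t['title']}" for t in last_three)
    return (
        "You are a rock radio presenter. You just played: "
        f"{played}. Coming up next: "
        f"{next_song['artist']} - {next_song['title']}. "
        "Write a short, natural link between them."
    )

def schedule_link(song4_cues_out, tts_duration, next_cues_in):
    """Timing for the spoken link, relative to song 4's start.

    The TTS MP3 starts at song 4's out-cue; the next song is started
    early enough that its in-cue lands exactly as the speech ends.
    """
    tts_start = song4_cues_out
    next_song_start = tts_start + tts_duration - next_cues_in
    return tts_start, next_song_start
```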
There is also a virtual recording device that captures the combined output of the virtual sound devices. This is fed into an Icecast stream, which I could connect to with VLC, and I was thus able to record an example.
It is possible to incorporate ad plays, notices, and other types of engagement content.
## Possible Spin Off
I would like to set up an Internet radio station, but I am still investigating the licensing side of things.
It would also be possible to use this tech for companies that stream their own radio station in their retail stores. If you are interested in this tech, start a chat with me on Reddit ([u/thisiszeev](https://www.reddit.com/user/thisiszeev/)) or send me a message via my personal website ([https://thisiszeev.web.za](https://thisiszeev.web.za/hit-me-up)).