AI News! HUGE Chatbot Research, Viral AI Songs, Text to Video & More!
Hello everybody and welcome back to your mid-week AI Roundup. I do want to do these at the end of the week, but since I missed last week’s, I figured I’d do a Roundup of this week’s best AI news – the biggest stuff that’s going on in AI.
Starting off somewhat small: the GPT-4 32k model, which essentially means it can accept more information in its input. Instead of a few pages of work being pasted into ChatGPT, it could accept more like 10 or 20 pages of information. It could do really long coding, meaning it could make actual games out of code, and potentially write big essays or books. A lot more capabilities come along when you add to the number of tokens that these models can input and output. Matt Shumer here on Twitter summarizes some of the things it can do, because he actually got access. We believe this is because he is a developer and OpenAI has given him specific access, but it’s cool to see that outsiders are now beginning to get access to these 32k GPT-4 models.
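To make the context-window idea a bit more concrete, here is a minimal sketch that checks whether a document would fit in an 8k versus a 32k window. It assumes a rough rule of thumb of about 4 characters per token for English text; that is only an estimate, not a real tokenizer, and the function names and the reply budget are made up for illustration.

```python
# Rough rule of thumb for English text. This is an assumption, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN


def fits_in_context(text: str, context_window: int, reserved_for_reply: int = 1000) -> bool:
    """True if the estimated prompt plus a reply budget fits in the window."""
    return estimate_tokens(text) + reserved_for_reply <= context_window


# A stand-in for a long research paper: 100,000 characters.
paper = "word " * 20_000

print(estimate_tokens(paper))          # estimated prompt size in tokens
print(fits_in_context(paper, 8_192))   # 8k window
print(fits_in_context(paper, 32_768))  # 32k window
```

With these assumed numbers, the same paper that overflows the 8k window fits comfortably in the 32k one, which is the whole appeal of the larger model.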
Summarize and answer questions about an entire research paper. He literally just pasted the whole paper in the prompt, no embeddings required, and yeah, it’s able to accurately access information, gives a summary, gives the main contributions. It’s also able to take an entire codebase and its supporting documentation, meaning all the different stuff that tells you about the codebase, how to use it, and whatnot, and it can actually make changes and improvements to it. Long context opens up fundamentally new opportunities to make ridiculously powerful developer tools. Absolutely agree with that statement. Devs are going to go wild with these larger AI models. GPT-4 is going to be able to take huge, big, long portions of code, reconfigure them, make them more efficient. I am not the most knowledgeable person when it comes to programming and coding, but I know enough to know that things are really going to change. A lot of our world is built off of code, it’s built off of programming, and it’s all human code and human programming, so these things can be rapidly made efficient with these larger AI models being able to understand that code, read it, and reconfigure it.
You could pass in dozens of different full articles and get a personalized summary of the entire day’s news, which is awesome. I can’t wait until I just have an AI assistant that can read me the latest news and keep me well informed based off of multiple different viewpoints of the same exact event. My personal experience with media and news nowadays is that there are huge biases on both sides to each argument. So what I would do is I would tell an AI system to go read both sides of each argument, read all of the news, and then get back to me with, you know, the straight facts that cross over from each and some of the different opinions that people have about it, so I can hear both sides of the story and have a well-informed opinion about the event.
This is kind of scary: this is hyper-personalized AI. Matt Shumer here showed the AI tons of information about himself and HyperWrite, and it essentially produced an AI version of himself. He says this is likely the only use case that embeddings are still better for at the moment, due to the cost and frequency of use. So he gives it a ton of different context, and it’s able to really write an email as if it were him, Matt Shumer, which is so crazy.
So, viewers, GPT-4 with 32,000 tokens available for it to read and interpret is, you know, pretty cool, but you know what’s really cool? A million tokens! This was a research paper that kind of went viral in the AI space. At least, I saw it on my Discord server and all around Twitter: scaling a Transformer large language model to 1 million tokens and beyond with RMT (Recurrent Memory Transformer). By augmenting a pre-trained BERT model with recurrent memory, they enabled the model to store task-specific information across seven segments of 512 tokens each. During inference, the model was able to effectively utilize that memory for up to 4,096 segments, with a total length of 2,048,000 tokens, way, way exceeding the largest input size that has been reported for Transformer models thus far. The record before that was 64,000 with CoLT5 and 32,000 with GPT-4, which we just talked about and which is still really exciting. This augmentation actually maintains the base model’s memory size at 3.6 gigabytes in their experiment, and BERT is one of the most effective Transformer-based models in natural language processing, which is really good. It’s the recurrent memory Transformer architecture that allows it to interpret 2 million freaking tokens, which is just ridiculous. This is really the main thing that holds large language models like ChatGPT back: their ability to store long-term information. We as humans can store information for years and years, hundreds of gigabytes’ worth of data, but these AI models can only hold a few pages of data right now. Again, the 32,000 one, that’s like a whole research paper’s worth. This is like books and books’ worth of data that it is able to store, reference, learn from, and create results from.
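Here is a toy sketch of the general idea of processing a long input segment by segment while carrying a small, fixed-size memory forward. To be clear, this is not the paper’s model: `toy_step` is a made-up stand-in for the Transformer, and the `KEY` marker and the “remember the token after the key” task are assumptions chosen just to show a fact from the first segment surviving to the end.

```python
from typing import List, Tuple

KEY = -1            # hypothetical marker token: the token right after it is the "fact"
SEGMENT_LEN = 512   # segment size mentioned in the video


def toy_step(memory: List[int], segment: List[int]) -> List[int]:
    """Stand-in for the model: read one segment plus the old memory, return new memory.

    memory[0] holds the remembered fact, memory[1] flags that KEY was the
    last thing seen (so the fact may be in the next segment).
    """
    fact, pending = memory
    for tok in segment:
        if pending:
            fact, pending = tok, 0
        elif tok == KEY:
            pending = 1
    return [fact, pending]


def run_recurrent(tokens: List[int], segment_len: int = SEGMENT_LEN) -> Tuple[List[int], int]:
    """Process the input one segment at a time, passing memory between segments."""
    memory = [0, 0]  # fixed size, no matter how long the input is
    segments = 0
    for start in range(0, len(tokens), segment_len):
        memory = toy_step(memory, tokens[start:start + segment_len])
        segments += 1
    return memory, segments


# The fact (42) sits at the very start; the rest is filler spanning 7 segments.
tokens = [KEY, 42] + [7] * (7 * SEGMENT_LEN - 2)
memory, segments = run_recurrent(tokens)
print(memory[0], segments)
```

The point of the sketch is that the memory stays the same size whether the loop runs over 7 segments or 4,096; only the number of steps grows.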
There are some caveats with this, such as the fact that the main issue with so many tokens is that you get a higher error ratio, so it’s a balance between how many tokens you give it and how factual and accurate you can keep the models. But we can expect these things to get better over time. Thirty-two thousand is already a really high amount of tokens.
Artificial intelligence has also taken the music world by storm lately. Lots of AI-generated music has been coming out that imitates real artists such as Drake, Juice WRLD, and Kanye. I’m sure you guys have seen these videos on YouTube. It sounds just like them singing these songs, and honestly, it’s very beautiful sounding. I do love this AI music, but I can understand how some problems can arise from this.
Let’s start out with the AI Drake scenario, which really exploded. This AI-generated Drake song, he never actually sang it, but it sounds just like his voice. Altman Sam here did a fantastic job summarizing everything that happened. An anonymous creator used relatively simple machine learning models to produce artificial Drake songs that racked up tens of millions of views in a few days on YouTube. Drake is very famous. It makes sense that these songs blew up because they sound just like him. Like I said, the tracks are indistinguishable from the real thing. If the real Drake had dropped them, they would likely top the charts. No doubt about that. The reaction of Drake’s music label, Universal Music Group, was to invoke copyright law to forcibly remove the songs from all major platforms. Truthfully, there’s no copyright law written about this yet, because this is brand new technology we’ve never seen before. We don’t know what the courts are going to decide on this stuff. It really is not hard to produce these. I did a little bit of dabbling off camera with some Kanye stuff, and it’s pretty good. It’s very, very easy. Many, many Drake clones are already flooding the subreddit. And of course, as we know, there’s lots of money in the music industry, and it’s about to be turned on its head by AI technology. We just witnessed how a little bit of machine learning magic can elevate some programmer’s side project to the likes of pop superstars. You can essentially make a hit song sitting in your basement like a little gremlin. So that’s really what’s happening here. You don’t need to be Drake anymore; you can be anybody. AI is going to flip the world on its head, one industry at a time. I completely agree with this. Every industry is going to be changed; the whole world is going to be changed. That’s why I say buckle up, guys, watch videos similar to this one, and stay in the loop about AI.
I’m not trying to scare anyone; I’m just saying even a little bit of insight into the world of AI is going to be a huge boost for you in the future. Personally, I think that even if you just go and play with ChatGPT for like an hour today, you’re probably ahead of most people.
Universal Music Group justified banning AI Drake by saying it wants to be on the right side of history: the side of artists, fans, and human creative expression. It’s a fair statement, I think, to ban the music just because it’s using Drake’s likeness without his actual consent. I understand that there are a lot of problems with this; they’re essentially making money off of Drake’s likeness. I can understand the legal issues that come with that. As much as I do love experimentation and stuff like that, I don’t think it is fair to use Drake’s likeness to make a profit like that. The creator of the tweet argues that these tools bring fans’ creative expression to a whole new level. I completely agree with that. How freaking cool is it that you can produce an entire Drake song with a few lines of code? Absolutely agree. It is fantastic and amazing technology. I do think that copyright law and all that stuff is going to rule in favor of Universal Music Group in this scenario, just because it’s Drake’s likeness.
Here is the song. I don’t want to get taken down, so I’m not going to play it, but I will link it down below for you guys to take a listen to. Yeah, it sounds exactly like Drake, and it’s a fantastic song. It’s a pop hit. It’s insane. The other problem though with banning music like this is it’s so easy to create that it’s going to be rampant no matter what, and they’re not going to be able to stamp the fire out. I think no matter what, you’re going to have AI-generated stuff floating around.
Here’s what I think should actually happen, and Grimes really stepped up to the plate and made, I think, the best take ever on this situation of AI music. Grimes goes out on Twitter and says, “I will split 50 percent royalties on any successful AI-generated song that uses my voice. Same deal as I would with any artist I collab with. Feel free to use my voice without penalty. I have no label and no legal bindings.” So she’s giving consent. Legal consent.
Grimes is letting you use her voice. Maybe she was a little jealous of Drake in some weird way, and Kanye and all of them. But yeah, this is the right way to go about this as an artist. If you’re already famous, if you want to survive in the age of AI-generated music, split your royalties. Because there are going to be thousands and thousands of these hit songs that sound just like you. Maybe it’s not exactly your music, and maybe you legally require them to say, “This is AI-generated” or something like that in the description as a disclaimer. But split your royalties with them, and you’re going to be raking in checks every single month that are going to be massive, just from the sheer amount of AI music that’s being created in your name.
Grimes also says that she thinks it’s cool to be fused with a machine and likes the idea of open-sourcing all art and killing copyright. Interesting take on open-sourcing all art and killing copyright. That’s something that I’ve never even really dived into, but there you go. I think this is the route that you should take if you’re really serious, if you want to really accept the future of this tech. This is it.
What is the number one concern in AI at the moment? It’s safety. If you guys remember, a few weeks ago, maybe a month ago now, some very powerful leaders in the technology community and the AI space (I think Elon Musk even signed it) announced that there should be a pause on AI development past GPT-4 capabilities, because AI is just getting way too powerful. And then promptly, no one stopped developing their AI, but at least it did raise the concern of AI safety. It’s a problem that we need to address. I keep getting comments from you viewers at home saying, “Oh, you don’t talk about AI safety enough,” or “You talk about AI safety too much, and there’s nothing to worry about.” Listen, whatever side of the debate you’re on, you have to admit there are some safety concerns. I don’t agree with banning AI; I don’t think that’s a solution at all. The genie’s out of the bottle, obviously. But there are some things that we need to do to stay safe with this technology. There are some things to think about.
Nvidia has done something about it, and they’ve made it open source so that anyone can use it. They want this to be widely adopted. They have created NeMo Guardrails for AI chatbots. You can use this with essentially any AI chatbot, you know, your OpenAI GPT-3.5, GPT-4, your Open Assistant. NeMo Guardrails is here to help ensure that smart applications powered by large language models are accurate, appropriate, on-topic, and secure. The software includes all the code, examples, and documentation businesses need to safely add AI to their apps. Nvidia has its concerns about the rapid adoption of AI tech. Just because you can’t think of some horrible thing that these AIs can do doesn’t mean that it’s not possible. There are some really bad people out there. So Nvidia developed three different kinds of guardrails that, in their opinion, keep large language models safe.
We’ve got topical guardrails, which prevent apps from veering off in undesired areas. For example, this will keep a customer service assistant bot from answering questions about the weather. Let’s be honest, when you have a simple chatbot that’s just supposed to answer questions about maybe inventory for your small business, you don’t want some customer wasting your generation time talking about philosophical meanings. It might make the chatbot a little bit boring, but it’s very helpful for like a huge number of very basic AI applications.
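To picture what a topical guardrail does, here is a minimal sketch in plain Python. It is not the NeMo Guardrails API; the keyword list, the refusal message, and the `fake_chatbot` function are all hypothetical, and a real system would use something much smarter than keyword matching to decide what is on-topic.

```python
from typing import Callable

# Hypothetical topics for a small-business inventory bot.
ALLOWED_KEYWORDS = {"inventory", "stock", "order", "price", "shipping"}
REFUSAL = "Sorry, I can only help with questions about our store."


def is_on_topic(message: str) -> bool:
    """Crude topic check: does the message mention any allowed keyword?"""
    words = {w.strip(".,!?").lower() for w in message.split()}
    return bool(words & ALLOWED_KEYWORDS)


def guarded_reply(message: str, chatbot: Callable[[str], str]) -> str:
    """Only forward on-topic messages to the (expensive) language model."""
    if not is_on_topic(message):
        return REFUSAL
    return chatbot(message)


def fake_chatbot(message: str) -> str:
    """Stand-in for a real LLM call."""
    return f"(model answer to: {message})"


print(guarded_reply("What's the weather like today?", fake_chatbot))
print(guarded_reply("Is the blue mug in stock?", fake_chatbot))
```

The weather question gets the canned refusal without ever reaching the model, while the stock question goes through, which is the “don’t waste my generation time” behavior described above.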
We’ve also got safety guardrails. These ensure that apps respond with accurate and appropriate information. This is the big one here. They can filter out unwanted language and enforce that references are made only to credible sources. We will have to see how well this implementation works. We can expect jailbreaks of this stuff in the future. That’s something I actually asked about during the briefing yesterday. Then there are security guardrails, which restrict apps to making connections only to external third-party applications that are known to be safe. So this keeps these AIs safe from malware, for example. They also say any software developer can really use NeMo Guardrails. It is super accessible. I saw the demo they did yesterday. It’s a few lines of code, really, to get this thing working. It is nothing too difficult at all to set up and add on to your AI machine learning interface. It runs on top of LangChain, which is an open-source toolkit for building applications powered by LLMs, so it’s built off of familiar technology. It’s also incorporated into Zapier, a very popular automation platform. So, yeah, the only other thing I was super worried about with this is that people could potentially reverse engineer this AI chatbot guardrail system to actually jailbreak existing large language models like OpenAI’s GPT-4. I do think that people are going to be jailbreaking this AI safety system, and all of these AI safety systems, really, to try to take advantage of them. So that’s something to keep in mind. I do appreciate Nvidia’s take here, though, trying to actually produce an open-source tool for AI safety. They’re actually doing something rather than just complaining about it, which is what most people, or most AI companies, I think, have been doing lately.
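The security guardrail idea, only connecting to third-party services known to be safe, boils down to an allowlist. Here is a minimal sketch of that pattern in plain Python; again, this is not NeMo Guardrails code, and the host names are made-up placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of third-party services the app is permitted to call.
ALLOWED_HOSTS = {"api.example.com", "hooks.example.org"}


def is_allowed_connection(url: str) -> bool:
    """Allow only HTTPS connections to hosts on the allowlist."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in ALLOWED_HOSTS


print(is_allowed_connection("https://api.example.com/v1/orders"))
print(is_allowed_connection("https://evil.example.net/payload"))
```

Anything the chatbot tries to reach that is not on the list simply gets blocked before a connection is made.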
Another little cool tidbit here is that Nvidia actually produced an AI video generation model that produces videos up to five minutes long. You know, it’s about as good, maybe a little bit worse than Gen 2, so I didn’t really cover it in a full video. But, um, yeah, it’s exciting to see that Nvidia is getting into AI video generation.
In other news, Hugging Face has announced their open-source alternative to ChatGPT, and it’s actually based on Open Assistant, which I covered last week. It’s based on their best model, and they’re very confident about this: “Some people said that closed APIs were winning, but we will never give up the fight for open-source AI.” Heck yeah, man! I’m a pretty big supporter of open-source AI, so I do love this competition. And yeah, we can try this thing out. I won’t spend too much time on this, I promise.
Alright, we’re tossing this thing into the deep end of the testing. We’re gonna make it “write a hit rap song that’s all about subscribing to MattVidPro AI.” Oh, it didn’t work at all. “Make up any details needed in the song.” It’s trying to do completion here. Let’s prompt it again. There we go. Alright, now it’s doing it. “Listen up, y’all, I got something to say about a YouTuber who makes your day. His name is MattVidPro, and he’s on the rise with his AI technology, he thrives. Subscribe to MattVidPro, don’t be shy. He’ll take your content to the next level, from music tracks to video games galore. MattVidPro is an algorithm for more.” Okay, it thinks I’m like an AI algorithm. “No need to fear, no need to fuss, just sit back, relax, and let him work like a boss. The quality of your videos will increase, and with new ideas, you’ll never have a shortage. Don’t believe me, just check out his page. You won’t regret it, give him a chance. Join the community, start connecting, unleash your creativity, stop procrastinating. So come on now, what are you waiting for? Click subscribe, show some love before.” Not too bad. I mean, it’s a lot better than the current large language model that just got released by Stability AI, which is StableLM. I did a whole video on that as well. But yeah, Open Assistant is quite good, and it’s nice to see Hugging Face supporting this thing, and you can try it out for free. I’ll link it down below. It is faster than ChatGPT, I’ll give it that. Obviously, this thing’s gonna get better, so we can expect it, you know, in a year or so, to be probably as good as ChatGPT is today, for all we know.
So long-term memory is coming to Bing chat; thanks to Alex Banks for bringing this to our attention. The CEO of advertising and web services at Microsoft teased in a tweet here that Bing is soon incorporating memory in a fairly restricted form. Bing will be able to retain information from previous chats, improve context awareness in ongoing conversations, and enhance the user experience with personalized responses, and he feels like this will make Bing even more intuitive and user-friendly than it already is, further pushing it past Google. I use Bing; I wouldn’t say I use it more than Google, but if I have a question or something, or I need to do some basic research, I will use Bing. It’s also going to have ChatGPT-style Pro plugins, which is really, really cool: they reduce the need for external links, connect with third-party apps or services, and provide informed responses outside of its own training. You’ll be able to order your groceries with Bing, insane!
And finally, RunwayML released their Gen 1 app. So, yeah, this is Gen 1 access on your phone, and it’s way more intuitive to use than inside of a Discord server. It’s already trending at number 15 in Photo and Video, with lots of screenshots here. I haven’t got my hands on the app yet; I’m going to try it out probably later today. But obviously, you can take videos and apply styles to them with either a photo or a text prompt. “Turn anything into everything with our generative AI magic tools, now available on iOS. Runway is the leading Next Generation Creative Suite that has everything you need to make anything you want, and now its most popular tools are available right from your phone.” So far, it’s got 5 out of 5 stars with 30 ratings, which is quite a lot. I mean, it’s going to be really, really cool. This just puts the power of AI tools further into the hands of everybody, and that’s what I love to see. And I’m really glad to see that RunwayML is sticking to their guns here, saying they want to deliver these AI tools to absolutely everybody and allow creative people who might not have the resources or opportunities that big corporations have to create awesome things from their home.
So yeah, viewers, let me know what you think about the top AI news stories so far. Really exciting stuff. I am so excited for the future of AI. I know there’s some scary safety stuff and safety concerns out there and industry concerns. I am still hopeful, though, for the future of AI by a long shot. Also, folks, we’re almost at 10,000 members on our Discord server, so please join up. You might be the 10,000th member. Great community in there. Thank you for watching, and I will see you in the next video. Goodbye.