Big AI News You Probably Missed

This week started off as a really slow week in AI, but as the week progressed, more and more news started to come out that got more and more exciting. Instead of breaking it down in chronological order as the news happened, I want to start off with the good stuff.

MidJourney V5.2: On Thursday, MidJourney announced version 5.2. It got quite a few updates, including new aesthetics and more variation in the generations. A new /shorten command was introduced to find out which words are not impacting your prompt at all. But the most interesting new feature, in my opinion, of the new MidJourney is the zoom-out feature. So if we jump over to MidJourney here, I generated a handful of images of wolves in the snow. Here’s how you’d use the zoom-out feature: I really like this image number three here. Let’s go ahead and upscale number three. And as soon as I upscale it, you can see some new options here. You have Vary (Strong), Vary (Subtle), Zoom Out 2x, Zoom Out 1.5x, Custom Zoom, or Make Square. So let’s click Zoom Out 2x and see what happens. You can see it took that original image and actually made a zoomed-out image with the wolf even further back. And you can continue to zoom out. Let’s say I like number three again, let’s upscale number three. Let’s zoom out two more times. You can see once again, it further zoomed out on the image. If I take one of these images and click Make Square, you can see it turned them into squares and filled in the top and the bottom. This is MidJourney’s current answer to Generative Fill in Photoshop and Uncrop from Stability AI. Let’s also quickly test the shorten feature. Let’s take one of these longer prompts that I’ve used in the past to get these really detailed images. Let’s copy this and let’s use the new /shorten command and paste this whole prompt in. And you can see that it tells me that I don’t need to use the word “features,” I don’t need to use the word “lens,” “accent lighting,” or “global.” If I cut all of those out, it shouldn’t really impact the result of the image.
So not only will this new update help you zoom further and further out on any image you want, which will be really helpful when you get those images with, like, cropped shoulders or a cropped head, it also helps turn you into a better prompter. Now, this MidJourney version 5.2 news drops on the same day that Stability AI announced their Stable Diffusion XL 0.9. Now, a lot of people were speculating that the full version of Stable Diffusion XL was going to be launched today. They gave us 0.9, basically telling us that they don’t feel like the full SDXL 1.0 is fully ready yet. But it does claim to produce massively improved image and composition detail over its predecessor. Here are some examples they give on the website of the beta version of SDXL versus the new 0.9 version of SDXL. You can see a little bit of improvement in realism and quality. Here’s a hand holding up a coffee cup. SDXL 0.9 has one of the largest parameter counts of any open-source image model. Now, they say that the API and the ability to run it from DreamStudio will be available to access on Monday the 26th. You can actually play with it right now inside of Clipdrop, which is one of Stability AI’s front-end user interfaces. So if we look at Clipdrop here, we can actually test the new version right now. You can find it over at clipdrop.co/stable-diffusion. Let’s do the ultimate test: a woman holding out her hands. It seems like a lot of people are probably using it right now, so it’s loading kind of slow. But the first image that it generated gave, uh, less-than-desirable results. Here are some other examples that were presumably generated with the newest version of Stable Diffusion. And for the most part, they actually look pretty good. In fact, if I scroll down here, you can see an image of silhouettes of hands that actually have the proper number of fingers and don’t look too deformed. This car is looking really good. The people are looking really good.
This is getting much closer to the types of images that we would expect from, like, a MidJourney. Now, Stable Diffusion is closing that gap and getting better and better with these types of images.

Now, onto Meta Voicebox. This was actually announced last week, but it was announced after I recorded my last video, so let’s talk about it now. Meta introduced Voicebox. They claim it’s the most versatile AI for speech generation. And to me, it sounds very similar to what ElevenLabs can do right now, where with just two seconds of audio, it can generate very realistic text-to-speech. Here are some examples from Meta’s website. This is Voicebox. Now, I’m unclear if the Mark Zuckerberg voice that you’re hearing in this video was actually generated with this tool or if this is actually Mark Zuckerberg talking. If I had to guess, they probably generated this audio with their text-to-speech system, because that would be very meta. If you give it text, it can read it in a bunch of different styles. Penelope Porcupine and Sammy Sloth. Penelope Porcupine and Sammy Sloth. They also say you can use this tool to fix background noise because it knows your voice. It can sort of edit out background noise and then replace your voice with the fixed version of the text-to-speech voice. Sammy and Penelope’s heartwarming friendship inspires joy. Sammy and Penelope’s heartwarming friendship inspires joy. So it can do in-context text-to-speech synthesis using an audio sample as short as two seconds long, speech editing and noise reduction, and cross-lingual style transfer, so you can actually write something out in a different language and it will use your voice but in that language. And it has diverse speech sampling so that you get more variation in the way the voices sound. Now, you can learn a lot more about this on Meta’s website. I’ll make sure I link up the article below this video. But as far as us getting access to it, they claim they’re sort of holding it back for ethical reasons. I don’t totally understand that because there are already tools out there that do this, like ElevenLabs, like Uberduck, like Descript. So I’m not quite understanding why Meta is holding it back.
I imagine we’ll probably see this as an open-source model at some point, but as to when, that’s really anybody’s guess.

Now, in news that I find quite exciting, this week Dropbox announced that they’re adding more AI features into their platform. Dropbox AI is a new feature that lets you instantly summarize and ask questions about your Dropbox files. With Dropbox AI, now you can pull up a file, you can ask it anything. Dropbox will read the document for you and give you an answer. And in a single click, you can easily get a summary of the entire 100-page doc, saving you all the time and extra effort that it would otherwise have taken to do that manually. We’re starting with individual files, but this is just the first step. Now, this is interesting to me because there are so many tools out there right now that that is their main selling proposition. You upload a PDF and we’ll summarize it for you or you can ask questions of it. Well, now, if you have a Dropbox account, you’re just going to be able to do that. I also believe this is something that Google is planning on adding into Project Tailwind as well. Are features like this from larger incumbent companies just sort of obsoleting a whole bunch of smaller companies right now? Because of this feature, possibly. Over time, you’ll be able to ask a question about your Dropbox folders and even your entire Dropbox. We all need a search box for our private information, just like Google gave us a search box for public information. Not only with your Dropbox files, but with everything. That’s why today we’re introducing Dropbox Dash. Dash is an AI-powered universal search engine that connects all of your tools, apps, and content in a single search bar. It lets you easily find whatever you’re looking for, no matter what app it’s in, who sent it to you, what you named it, how long it’s been since you last looked at it. You can find it super fast. Dash connects across your favorite tools like Google Workspace, HubSpot, Asana, and Notion, so that you can access all of your content no matter where it lives or what format it’s in. 
And because Dash is powered by machine learning, it learns and evolves with you, and it gets better the more you use it. This is the part that was the most exciting to me because imagine just taking all of your files, your documents, your videos, your audios, pretty much any file type you can imagine, plus you have all of your apps that you use. Maybe you’ll use Notion, maybe you use HubSpot, maybe you use Google Drive and Google Sheets and Google Docs, or Asana or any tool like that. It sounds like Dropbox is going to tie this all together into sort of a personalized search engine where you can search all of your files and all of the various tools that you use across various platforms. And like he said, Google gives us the search engine to search the web. It sounds like Dropbox is trying to give us that search engine to search all of our own personal data. Now, I’m not going to show the whole video here, but they do say at the end of the video that they promise not to share your data and that they’re going to keep it private and that you can trust them. Although they want you to connect every single tool, every app, and every piece of your life to Dropbox, they promise that they’re going to keep it safe for you. So we’ll see how that plays out.

Also, YouTube is adding dubbing. This week, it was announced that YouTube is getting AI-powered dubbing, which is absolutely amazing to me because I am a creator. I love making YouTube videos. Right now, the options to watch my videos are either to watch them in English or to turn on subtitles in whatever language is your native language. But pretty soon, YouTube is just going to automatically overdub into whatever language you want. So the words that I’m speaking right now will be coming out overdubbed in whatever language you’re listening to right now. YouTube made this announcement on Thursday at VidCon that they’re working with the team from a company called Aloud, which is an AI-powered dubbing service from Google’s incubator. So basically, the tool first transcribes your video. You can then review the transcription and edit it. It then translates and produces the dub. Dubbing with Aloud is as simple as editing text. So if you’re a content creator, this is huge news, and I’m personally really, really excited about this. It’s going to open up all new parts of the world for people to learn about the types of news and stuff that I’m talking about in AI. So really, really exciting times to be a content creator.

Now, that’s some of the cool tech that was announced or rolled out this week. Let’s talk about some of the weirder aspects of AI that are happening right now. The Grammys allowing AI music: The Recording Academy, the organization that runs the Grammys, announced their rules about AI-generated music. They said that songs that include elements generated by AI can still be nominated, but there must be proof that a real person meaningfully contributed to the song too. If there’s an AI voice singing the song or AI instrumentation, they’ll consider it. But in a songwriting-based category, it has to have been written mostly by a human. They added that AI will unequivocally shape the future of the music industry, and instead of downplaying its significance, the Grammy Awards should confront questions related to AI head-on.

Also, celebrities selling their AI likeness: Celebrities are attempting to co-opt this AI craze. They are making deals with brands to put AI-created duplicates of themselves into marketing campaigns, giving them more control over their own likeness and more latitude in the types of deals they can make. Celebrities get paid, but they don’t have to turn up. Stars just need to spend a few minutes in a studio with a 3D scanner, which could then create representations of them for countless hours of content. Golf legend Jack Nicklaus recently agreed with AI company Soul Machines to create an AI-powered version of himself at the peak of his career when he was 38 years old. The company Metaphysic has actually signed on to provide AI services for a forthcoming Robert Zemeckis film starring Tom Hanks. This is literally the news story that’s showing that the plot of that episode of Black Mirror called “Joan Is Awful” is coming true in real life.

Speaking of the entertainment industry, this week Marvel was criticized for using AI to make the Secret Invasion opening credits. Secret Invasion is a brand new Disney Plus series. I haven’t watched it myself yet, so I can’t spoil it for you. But what I can tell you is that their new intro was actually generated with AI tools. The creators of the intro told The Hollywood Reporter in an exclusive that the opening that used AI cost no artist jobs. Method Studios clarified the reports that sparked a social media backlash, stating that AI tools complemented and assisted their creative teams.

Now, onto Future Tools. This company Illumix introduced something that I talked about earlier this week, and it’s just nerd heaven for me. They turned MidJourney images into 3D worlds where you can move around and zoom in and zoom out. They were able to drop a character into this 3D scene and actually make the character wander around inside of this 3D world. To me, this is just so insanely cool that we’re able to do stuff like this. And as of right now, this Instaverse tool from Illumix is actually free to play around with.

Finally, let’s talk about the most bizarre thing that’s going on on the planet right now, where Mark Zuckerberg is ready to fight Elon Musk in a cage match. That’s right, Meta is actually planning on building a competitor to Twitter. Elon Musk has trolled him a little bit on Twitter about this, and it’s all sort of escalated into Elon Musk tweeting that he’s up for a cage match with Mark Zuckerberg. It further escalated because Mark Zuckerberg screenshotted the tweet saying, “I’m up for a cage match,” and then said, “Send me the location.” According to Dana White from the UFC, they are absolutely dead serious about making this happen. There’s actually a non-zero percent chance we might see an actual cage match between Elon Musk and Mark Zuckerberg. That’s going to be interesting.

So, that’s all I’ve got for you in the world of AI. When the week started, I thought, “Man, this is going to be a slow week. There’s really nothing to talk about for Friday’s news video.” But by the time Thursday rolled around, it really ramped up. And I don’t even know what happened on Friday yet because I’m recording this on Thursday. So there could be new news that I haven’t even reported on yet. But if there is, I’ll report it in next Friday’s video. Since I’m doing these Friday news videos on a regular basis now, I’m trying to come up with a clever name for what I should call them. I’ve been thinking about just going along with the name of “The AI News You Probably Haven’t Heard Of Yet” because that seems to do well with the YouTube algorithm. But if you have ideas, let me know in the comments. I’d love to hear them. Hopefully, you enjoyed this video. If you did, maybe give it a thumbs up. And if you’re not subscribed to this channel already, subscribe. I’ll keep you in the loop with all the latest AI news, all the coolest AI breakthroughs, a handful of tutorials from time to time, and give you the grand overview, the TL;DR. If all of this is overwhelming and you don’t want to keep up with it yourself, that’s what I’m making this YouTube channel for. Also, if you do want to go deeper down the rabbit hole, check out Future Tools. I actually keep this website updated with all the AI news on a daily basis. I also curate all the cool AI tools that I come across, and I send out a weekly newsletter every Friday with just the TL;DR of everything that’s happened in the AI world for the week. You can find it over at futuretools.io. If you join now, I’ll make sure you’re hooked up with next week’s newsletter. Thanks again for nerding out with me. I really, really appreciate you. See you guys in the next video. Bye-bye.

 
