12 New Code Interpreter Uses (Image to 3D, Book Scans, Multiple Datasets, Error Analysis… )
In the 48 hours since I released my first code interpreter video, I believe I have found another 12 use cases that showcase its power. From finding errors in data sets to reading Anna Karenina, ASCII art to image captioning, most of these I haven’t seen anyone else mention. So let’s begin.
First is creating a 3D surface plot from an image. You can see the plot on the left and the source image on the right. I know, I will get to professional uses in a second, but I was personally very impressed that all of this can be done through the ChatGPT interface. You can even see the little buildings at the bottom left reflected in this 3D surface plot.
To give you an idea of how it works, you click on the button to the left of the chat box, and then it analyzes whatever you’ve uploaded. All I said was “analyze the RGB of the pixels and output a 3D surface map of the colors of the image.” Now, I will admit it doesn’t do a perfect job immediately. At first, it wasn’t downloadable, and then it wasn’t big enough, but eventually, I got it to work. But it’s time for the next example.
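For a rough idea of what code a prompt like that might lead to, here is a minimal sketch. It is not the code Code Interpreter actually wrote. It assumes "analyze the RGB" means averaging the three channels into one brightness value per pixel and using that as the height of the surface. The function names, the output filename, and the tiny 2x2 stand-in image are all made up for illustration, and the plotting step assumes numpy and matplotlib are installed.

```python
def brightness_grid(pixels):
    """Turn rows of (r, g, b) tuples into rows of 0-255 brightness values."""
    return [[(r + g + b) / 3 for (r, g, b) in row] for row in pixels]


def plot_surface(grid, out_path="surface.png"):
    """Render the grid as a 3D surface (assumes numpy and matplotlib)."""
    import numpy as np
    import matplotlib
    matplotlib.use("Agg")  # draw straight to a file, no window needed
    import matplotlib.pyplot as plt

    z = np.array(grid)
    y, x = np.mgrid[0:z.shape[0], 0:z.shape[1]]  # pixel coordinates
    fig = plt.figure(figsize=(10, 8))
    ax = fig.add_subplot(projection="3d")
    ax.plot_surface(x, y, z, cmap="viridis")
    fig.savefig(out_path, dpi=200)  # a larger figsize or dpi gives a bigger image
    return out_path


# Hypothetical 2x2 "image", just to show the shape of the data
tiny = [[(0, 0, 0), (255, 255, 255)],
        [(30, 60, 90), (10, 20, 30)]]
heights = brightness_grid(tiny)
```

For a real photo you would load the pixel rows with an image library first and then pass the result to plot_surface. Saving to a file is what makes the result downloadable.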
Scanning 340k Words: What I wondered was, what is the biggest document I could upload and get it to analyze? The longest book that I’ve ever read is Anna Karenina. I think it’s about a thousand pages long, and I pasted it into a Word doc, and it’s about 340,000 words. I uploaded it, and then I asked, as you can see, “find all mentions of England, analyze them to discover the tone in which the country is perceived in the book.”
Now, I know what some of you may be wondering is if it’s just using its stored knowledge of the book, and I’ll get to that in a second, but look at what it did. It went through and found the relevant quotes. There are seven of them there. I checked the document, and these were legitimate quotes. But here’s where we get to something that you can’t just do with Ctrl+F in a Word document. It analyzed the tone and sentiment of each of these passages, and you can see the analysis.
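To make the two steps concrete, finding the passages and then judging their tone, here is a small sketch. The search half is ordinary text processing. The tone half is only a crude word-counting stand-in for the much richer reading GPT-4 did, and the word lists are assumptions. The sample sentences are invented for the demo and are not quotes from the book.

```python
import re

# Tiny illustrative word lists; a real tone analysis needs far more than this.
POSITIVE = {"admired", "excellent", "fine", "good", "pleasant"}
NEGATIVE = {"dull", "cold", "tedious", "bad", "absurd"}


def find_mentions(text, keyword):
    """Return every sentence that contains the keyword as a whole word."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [s.strip() for s in sentences if pattern.search(s)]


def tone_score(sentence):
    """Positive words minus negative words; above zero leans positive."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Invented sample text, NOT from Anna Karenina
sample = ("He admired the excellent horses brought from England. "
          "The weather was mild. "
          "She found the fashions of England dull and tedious.")
mentions = find_mentions(sample, "England")
scores = [tone_score(m) for m in mentions]
```

The point of the sketch is the division of labor: code can reliably pull exact sentences out of an uploaded file, which is why the quotes were real, while the interpretation on top is where the language model earns its keep.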
Then I asked, “Drawing on your own knowledge of the 19th century and the finding above, write a two-thousand-word reflection on the presentation of England in Anna Karenina.” Now, I know many of you won’t be interested in that book, but imagine your own text. This is 340,000 words. It then created a somewhat beautiful essay, and yes, it did bring up each of those quotes with analysis.
Now, here is where things get kind of wild. Just to demonstrate that it’s not using its own knowledge, I asked the same question in a different window without, of course, uploading the file. And at first, I was like, “Oh, damn, it did it!” Here are the quotes. Wow, it did the job. It didn’t even need the document. But that was until I actually checked out whether the quotes were legitimate, and lo and behold, it had made up the quotes. I searched far and wide for these quotes, and unless I’m going completely crazy, they are completely made up.
So when it found those quotes earlier, it wasn’t drawing upon its own knowledge. It was finding them from the document. And this also serves as a warning of the hallucinations that the model can do if it doesn’t have enough data. I’m going to get back to reliability and factuality in a moment, but just quickly, a bonus, I got it to write an epilogue to “The Death of Ivan Ilyich,” an incredible short story by Leo Tolstoy. And as some people had asked, it can indeed output that to a PDF, which is convenient for many people.
Next, what about multiple files? I didn’t actually investigate this in my previous video, which, if you haven’t watched, please do check out; there are 23 examples of use cases there. Anyway, I wanted to upload four data sets and get GPT-4 to find any correlations between them. I was also investigating whether there was a limit to the number of files you could upload, and honestly, there doesn’t seem to be.
I picked this global data almost at random, to be honest. It was the amount of sugar consumed per person, and then the murder rate per 100,000 people, and then the inequality index of each of those countries, and the population aged 20 to 39. But first, notice how it didn’t stop me. I could just keep uploading files, and then it would ask me things like, “Please provide guidance on the kind of analysis or specific questions you would like me to investigate with these four data sets.” So it’s still aware of the previous files.
What I asked was this: “Analyze all four data sets and find five surprising correlations. Output as many insights as you can, distinguishing between correlation and causation.” This is really pushing the limits of what Code Interpreter can do, but it did it. Many of you asked before if it could be lulled with false data into giving bad conclusions, and it’s really hard to get it to do that. GPT-4 is honestly really smart and increasingly hard to fool. You can read what it said. It found a very weak negative correlation, for example, between sugar consumption and murder, and then just admitted, “There is probably no significant relationship between these two factors.” But notice it then found a correlation that it found more plausible: “There is a moderate positive correlation (0.4) between the murder rate per 100,000 people and the Gini inequality index. This suggests that countries with higher income inequality tend to have a higher murder rate.”
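If you want to sanity-check a figure like that 0.4 yourself, the underlying calculation is a Pearson correlation over the countries that appear in both files. Here is a minimal sketch under that assumption. The dictionaries and the function names are hypothetical, and the numbers are made up purely to exercise the code, so they line up perfectly and give a correlation of 1.0, nothing like real-world data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlate(dataset_a, dataset_b):
    """Align two {country: value} dicts on shared countries, then correlate."""
    shared = sorted(set(dataset_a) & set(dataset_b))
    return pearson([dataset_a[c] for c in shared],
                   [dataset_b[c] for c in shared])


# Made-up values; country "D" and "E" each appear in only one file
inequality = {"A": 30.0, "B": 40.0, "C": 50.0, "D": 60.0}
murder_rate = {"A": 1.0, "B": 2.0, "C": 3.0, "E": 9.0}
r = correlate(inequality, murder_rate)
```

Matching on shared countries matters when you combine separate files, because each one can cover a slightly different set. And as the prompt itself stressed, a coefficient like this only measures association, not cause.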
I then followed up with this: “Drawing on your own knowledge of the world, which correlation seems the most causally related?” It then brought in research from the field of social science and gave a plausible explanation about why this correlation might exist. Obviously, this was just my example. You would have to think about all the different files and data sets that you were willing to upload to find correlations and surprising insights within.
I’m going to try to alternate between fun and serious, so the next one is going to be kind of fun. I was surprised by the number of comments asking me to get it to do ASCII art. Now, you may remember from the last video that I got it to analyze this image, and yes, I asked it to turn it into ASCII art, and here is what it came up with. Not bad, not amazing, but not bad.
A bit more seriously now, for data analytics. What I wanted to do is test if it could spot an error in a massive CSV or Excel file. This is a huge data set of population density, and notice what I did. I say “notice,” you almost certainly wouldn’t be able to notice, but basically, for the Isle of Man for 1975, I changed 105, which was the original, to 1,500, and I did something similar for Liechtenstein for a different year. Then I uploaded the file and said, “Find any anomalies in the data by looking for implausible percent changes year to year. Output any data points that look suspicious.”
And really interestingly, here, the wording does make a difference. You’ve got to give it a tiny hint. If you just say, “Find anything that looks strange,” it will find empty cells and say, “Oh, there’s a missing cell here.” But if you give it a tiny nudge and just say that you’re looking for anomalies, look out for things like implausible percent changes, data that looks suspicious, then look what it did. It did the analysis, and you can see the reasoning above, and it found the Isle of Man and Liechtenstein, and it said, “These values are indeed very unusual and may indicate errors in the data.” It said, “It’s also possible that these changes could be due to significant population migration, changes in land area, or other factors.” I guess if, in one year, one of those places was invaded and it was only a city that was left officially as part of the territory, the population density would skyrocket. So that’s a smart answer, but it spotted the two changes that I’d made among thousands of data points. In fact, actually, I’m going to work out how many data points there were in that file. I used Excel to work it out, of course, and there were 36,000 data points, and I made two changes, and it spotted both of them.
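The check that the prompt nudges it towards, implausible percent changes year to year, is simple enough to sketch. This is not the code it actually ran, and the 50% threshold is my own assumption. The 1975 value of 1,500 is the change I described making; the values for the surrounding years are made up for illustration.

```python
def suspicious_changes(series, threshold=0.5):
    """Flag years whose value moved by more than `threshold` (as a fraction)
    compared with the previous year in the series.

    series: {year: value}. Returns a list of (year, percent_change).
    """
    flagged = []
    years = sorted(series)
    for prev, cur in zip(years, years[1:]):
        old, new = series[prev], series[cur]
        if old == 0:
            continue  # percent change is undefined from a zero base
        change = (new - old) / old
        if abs(change) > threshold:
            flagged.append((cur, round(change * 100, 1)))
    return flagged


# 1975 is the edited value; the neighbouring years are invented
isle_of_man = {1973: 103.0, 1974: 104.0, 1975: 1500.0, 1976: 106.0}
flagged = suspicious_changes(isle_of_man)
```

One thing this makes obvious: a single bad value shows up twice, once as a huge jump in 1975 and again as a huge drop in 1976, so a spike followed immediately by a reversal is a good sign of a typo rather than a real change.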
But now it’s time to go back to something a bit more fun and creative. Next, I gave it that same image again and said, “Write a sonnet about a mournful AI reflecting on a dystopian landscape.” He does look kind of solemn here. “Overlay the poem in the background of this image, using edge detection to avoid any objects.” Now, there are different ways of doing it. It can detect the foreground and background, so it put the text away from the central character. I think the final result is really not bad, and the sonnet is pretty powerful. I’ll read just the ending: “Gone are the humans it once adored, leaving it in silent solitude. In binary sorrow, it has stored memories of a world it once knew. In the void, it sends a mournful cry, a ghost in the machine left to die.”
Anyway, this is a glimpse of the power of merging GPT-4’s language abilities with its nascent code interpreter abilities. Next, people asked about unclean data. So, I tried to find the most unclean data I could find. What I did is I pasted in directly from a website, Real Clear Politics, a bunch of polls. Different polls, and notice the formatting is quite confusing. You’ve got the dates on top, you have missing data, colored data, all sorts of things.
Then I asked, “Extract out the data for the 2024 Republican Presidential nomination. Analyze the trend in the data and output the results in a new downloadable file.” It sorted through and then found the averages for each of the candidates in that specific race. I’m going to get to factuality and accuracy just a bit later on; the hint is that the accuracy is really surprisingly good.
I wanted to push it a bit further and do some trend analysis. So first I got it to analyze some of the other races from that very unclean data set. Then I pasted in an article from Politico, and based on this very messy data, I got it to do some political prognostication. It analyzed the article and the polls, and then, I think, gave quite a smart and nuanced answer.
And what about accuracy? I know many people had that question. Well, I uploaded this data, and I’m also going to link to it in the description so you can do further checks. It was based on a fictional food company based in Boston and New York. And what I asked was, “Draw six actionable insights that would be relevant for the CEO of this company.” It then gave the insights below, and even though I didn’t actually ask for this, it gave six visualizations. Let me zoom in here so you can see it. And then I picked out a random five of those data points. Obviously, I’m not going to check hundreds of them, but I picked out five. Then I laboriously checked them in Excel, and they were all right.
Obviously, though, I’m not guaranteeing that every single calculation is correct. And as I say, you can download the file and see if these six visualizations are correct yourself. So far, honestly, it’s looking good. And then below, we have more detail on those insights and then some actions that we could take as a CEO.
Just like I did with Anna Karenina, I then followed up and said, “Use your own knowledge of the world and offer plausible explanations for each of these findings.” This is where GPT-4 becomes your own data analyst assistant, and it gave plausible explanations for some of the findings. For example, “The higher sales in the east region could be due to a higher population density, better distribution networks, or higher demand for the company’s products.” And at this point, you could either use the web browser plugin to do more research on your own, or you could upload more files. I actually asked, and I think this is a great question, “Suggest six other company data sets you would find helpful to access to test these suppositions.” Now, obviously, a lot is going to come down to privacy and data protection, but GPT-4 code interpreter can suggest further files that would help it with its analytics, and it gives those below.
And again, the lazy CEO could just upload those files and get GPT-4 code interpreter to do further analysis. You don’t have to think about what to upload; GPT-4 will suggest it for you. Whether that’s advisable or not, I’ll leave you to decide. The next one is slightly less serious, and it’s that code interpreter can output PowerPoint slides directly. Now, I know when Microsoft 365 Copilot rolls out, this might be a little bit redundant, but it is cool to know you can output directly into PowerPoint the visualizations and analysis from code interpreter.
Now, on to mathematics. Many people pointed out that I didn’t fully test out Wolfram to give it a fair shot. So, I tested both code interpreter and Wolfram on differential equations, and they both got it right. Interestingly, though, Wolfram gave a link for the step-by-step solutions, because those are a paid option on the Wolfram website. But I did find some other differences between them, and honestly, they favored code interpreter.
Here is a really challenging mathematics question, and Wolfram can’t get it right. It says that the answer is 40, even though that’s not one of the options. And yes, it used Wolfram, I think about five times. Here was the exact same prompt, except instead of saying “use Wolfram,” I said “use code interpreter,” and it got it right. This was not a one-off example; it fairly consistently got it right. So code interpreter does indeed have some serious oomph behind it.
Just quickly again on the silly stuff, I uploaded the entire “Death of Ivan Ilyich” short story by Tolstoy. Then I changed one phrase in one line out of about 23,000 words. I changed “his daughter” into “an astronaut.” Of course, if you just ask GPT-4 directly, it doesn’t have enough space. It will give you this message: “The message you submitted was too long. Please reload the conversation.” But with code interpreter, it did spot the mistake. Now, again, you do have to give it a little bit of help. I said, “Anything about the daughter in the story that seems strange?” And after thinking for a while, it did eventually get it. It said, “The phrase ‘despite being a showgoth astronaut’ seems to be out of place in the 19th-century context.” So this does strike me as a somewhat sneaky, albeit imperfect way of getting around the context limit. You can’t input all the words directly; you have to upload the file, and then you do have to give a slight indication of what you’re looking for in the file. But for many use cases, it is a way to get the desired result without using up too much money.
As we come to the end here, I’m going to leave in the background a beautiful fractal visualization done through code interpreter. As before, let me know if there’s anything that I’ve missed or further experiments you would want me to do. I honestly don’t know when they’re going to roll this out more widely. I know it’s going to have a lot of use cases, both professionally and personally, and that’s before you bring in advanced prompt engineering like SmartGPT and Tree of Thought prompting. Again, if you haven’t seen my other video on the code interpreter plugin, please do check it out. There are about 23 experiments there, just as good as the ones you can see here. Thank you for watching to the end, and have a wonderful day.