Almost everyone in today’s world has heard of AI. After speaking to a bunch of people about it, I realized that their only encounter with AI has been ChatGPT. That made me really sad, because ChatGPT is just the tip of the iceberg in the vast and fascinating world of artificial intelligence. In the hope of getting more people interested in AI, let’s explore the diverse and intriguing types of AI that exist beyond the basics.
Multimodal AI
Multimodal AI is possibly the most fascinating type of AI out there. Unlike the unimodal systems we’re used to, which process just one type of data, multimodal AI juggles multiple forms of data, such as text, images, audio, and video. This complexity is what makes it thrilling! Just imagine an AI that doesn’t just read your text but can understand your pictures and sounds too. OpenAI’s GPT-4V(ision) is a prime example I came across: it adds the ability to process images to the already impressive text-handling capabilities of GPT-4. And it’s not alone; there’s Runway Gen-2 for video generation and Inworld AI for creating game characters. It feels like magic.
The complexity behind this innovation is arguably more interesting than the innovation itself. Multimodal AI needs heaps of diverse data to train, and aligning these different data types is like trying to solve a multidimensional puzzle. Plus, there’s the issue of ensuring that the AI doesn’t…