AI news roundup: OpenAI video model, Nvidia chatbot and more
Explore last week's AI news highlights with analyst Mike Leone's roundup of top developments, including OpenAI's launch of video model Sora and Nvidia's locally running chatbot.
Last week, we saw developments from OpenAI and Nvidia, in addition to model releases, funding news and company launches. Let's get into the five news items from last week that most piqued my interest.
OpenAI introduces Sora
As we continue to see leapfrog improvements in generative AI, OpenAI's Sora serves as a prime example of how advanced technology can empower and enhance human creativity. Sora, OpenAI's just-announced text-to-video model, can generate videos up to a minute long while maintaining visual quality and adherence to the user's prompt. Sora's internal model of the physical world enables it to create multicharacter scenes from a user's prompt, with realistic motion and accurate detail. The result is nothing short of breathtaking and truly something you need to see to believe.
Sora's capabilities signify yet another breakthrough for OpenAI, enabling the rapid prototyping and creation of visual content through democratized video production. But Sora's capabilities have implications far beyond simplifying content creation and reducing the time it takes for skilled professionals to create video shorts.
Of course, on the business side, this means the ability to respond to a market trend faster than ever before by reducing production cycles. But I'm intrigued by Sora's potential impact in other areas. Take education as just one example: Imagine being able to cater to different learning styles and improve knowledge retention by bringing lessons to life in a personalized way with the click of a button.
Reka announces a new multimodal, multilingual model
Reka announced Reka Flash, a multimodal model with 21 billion parameters that rivals the performance of other leading models. Trained from scratch, Reka Flash serves as the turbo offering in the company's lineup of generative AI models. Reka also announced a more compact variant called Reka Edge, which has 7 billion parameters and can run locally on devices, improving deployment efficiency.
Performance is an important component of all model announcements right now. And as we see new models get released, evaluating performance through benchmarking is critical to proving their viability, reliability, accuracy and underlying capabilities. With that in mind, Reka released benchmarking results highlighting the model's performance across several generative AI benchmarks in language, multilingual reasoning, vision and video, including the following:
- MMLU for knowledge-based question answering.
- GSM8K for reasoning and math.
- HumanEval for code generation.
- GPQA for graduate-level question answering.
The model did quite well on core language tests, outperforming models such as Google's Gemini Pro, OpenAI's GPT-3.5 and Meta's Llama 2 70B, while slightly trailing GPT-4 and Gemini Ultra. But what I like about this announcement is the level of transparency Reka brings by disclosing the model's performance on these benchmarks. While I've highlighted just a few of the results, many more are available in Reka's published benchmarks, with comparisons to other models.
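For readers curious about the mechanics, here is a minimal sketch of how an exact-match score on a GSM8K-style benchmark is typically computed. The `query_model` function and the sample items below are hypothetical stand-ins, not part of Reka's evaluation harness or the benchmark itself.

```python
# Minimal sketch of an exact-match benchmark harness, GSM8K-style.
# `query_model` is a hypothetical placeholder for a call to any LLM;
# the sample items are illustrative, not drawn from the benchmark.

def query_model(prompt: str) -> str:
    """Placeholder for an actual model call (local or hosted)."""
    return "42"

def extract_final_answer(completion: str) -> str:
    """GSM8K-style graders compare only the final numeric answer."""
    tokens = [t.strip("$,.") for t in completion.split()]
    numbers = [t for t in tokens if t.replace("-", "").replace(".", "").isdigit()]
    return numbers[-1] if numbers else ""

def exact_match_accuracy(items: list[dict]) -> float:
    """Score 1 for each item whose extracted answer matches the label."""
    correct = sum(
        extract_final_answer(query_model(item["question"])) == item["answer"]
        for item in items
    )
    return correct / len(items)

sample_items = [
    {"question": "If a pen costs $3, what do 6 pens cost?", "answer": "18"},
    {"question": "What is 7 * 6?", "answer": "42"},
]
print(f"exact-match accuracy: {exact_match_accuracy(sample_items):.2f}")
```

Different benchmarks swap in different scorers -- HumanEval, for instance, executes generated code against unit tests rather than matching strings -- but the report-a-fraction-correct structure is the same.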
Lambda raises $320M to grow GPU cloud
Lambda Labs has been around since 2012, pushing the boundaries of AI and helping businesses use AI on their own terms. Starting in 2017, the company began building out a GPU cloud. What started as wearable tech and image recognition has evolved into a dedicated GPU cloud for training, inference, deep learning and more.
Most recently, Lambda has focused on training large language models (LLMs) and other types of generative AI through the company's on-demand cloud and reserved cloud, making thousands of Nvidia GPUs available to customers in an easy-to-consume way. And this is exactly where the $320 million will be going: accelerating the growth of Lambda's GPU cloud and enabling teams to use thousands of Nvidia GPUs with high-speed Nvidia Quantum-2 InfiniBand networking. Amid global GPU shortages, this infusion of capital should help ensure sustained expansion of the company's GPU resources, mitigating potential bottlenecks for AI development and deployment.
The most important part of this announcement is what it recognizes about how organizations select AI infrastructure. Our research at Enterprise Strategy Group shows that the top two capabilities organizations want are performance and ease of deployment. Lambda has consistently checked both of those boxes, and this funding will help ensure they remain checked as organizations pursue generative AI initiatives.
Guardrails AI launches with seed round, aiming to improve LLM reliability
Trust has been at the heart of many generative AI conversations lately. Organizations want stability, accuracy and compliance as they pursue generative AI. They want to uphold their reputation. They want transparency into how their models perform. And they want to retain control while empowering end users to confidently use generative AI for the better.
In other words, organizations need assurances, backed by systematic methodologies, that generative AI is safe and effective. Enter Guardrails AI, a platform focused on enabling the safe and effective use of generative AI with enhanced accuracy and reliability. At launch, the company also introduced Guardrails Hub, an open source product that lets AI developers build, contribute, share and reuse advanced validation techniques known as validators.
These validators plug into the core Guardrails platform to act as a reliability layer for AI applications, ensuring outputs adhere to specified guidelines and norms. Guardrails offers over 50 prebuilt validators, created not only by Guardrails AI but also by several partners and open source contributors. Together, they give businesses essential risk management tools for addressing compliance, security and ethical AI in a programmatic way.
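To make the validator concept concrete, here is a minimal sketch of a validator-based reliability layer. This is not the Guardrails API; the names and checks are hypothetical, chosen only to show how independent output checks compose before a response reaches the application.

```python
# Illustrative sketch of the validator pattern -- not the actual
# Guardrails API. Each validator checks one property of a model's
# output; the reliability layer runs them all before the output
# is returned to the application.

import re
from typing import Callable

# A validator takes model output and returns (passed, message).
Validator = Callable[[str], tuple[bool, str]]

def no_ssn(output: str) -> tuple[bool, str]:
    """Fail if the output appears to contain a US Social Security number."""
    found = re.search(r"\b\d{3}-\d{2}-\d{4}\b", output)
    return (found is None, "possible SSN detected" if found else "ok")

def max_length(limit: int) -> Validator:
    """Fail if the output exceeds a character budget."""
    def check(output: str) -> tuple[bool, str]:
        ok = len(output) <= limit
        return (ok, "ok" if ok else f"output exceeds {limit} characters")
    return check

def run_validators(output: str, validators: list[Validator]) -> list[str]:
    """Return the message of every failed check; an empty list means pass."""
    return [msg for v in validators for ok, msg in [v(output)] if not ok]

failures = run_validators(
    "Contact me at 123-45-6789 for details.",
    [no_ssn, max_length(500)],
)
print(failures or "all checks passed")  # -> ['possible SSN detected']
```

The design point is that each validator tests exactly one property, so teams can mix prebuilt and custom checks without touching application logic.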
Nvidia delivers new chatbot that can run on your PC
Chat with RTX, Nvidia's new custom chatbot, enables users to personalize a chatbot using their own content on their own PC. The demo is free to download and uses retrieval-augmented generation, Nvidia TensorRT-LLM software and Nvidia RTX acceleration to bring generative AI capabilities to local, GeForce-powered Windows PCs.
End users can connect local files as a data set to an open source LLM, such as Mistral or Llama 2, and run queries for quick, contextually relevant answers. And from a file-format standpoint, it's surprisingly flexible, supporting .txt, .pdf, .doc, .docx and .xml files. You can also paste URLs to YouTube videos within the chat for even more context.
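For a sense of what's happening under the hood, here is a minimal sketch of the retrieval-augmented generation pattern that tools like Chat with RTX rely on: find the most relevant local documents, then ground the model's answer in them. The toy bag-of-words retriever and the `local_llm` placeholder are illustrative assumptions, not Nvidia's implementation.

```python
# Minimal sketch of retrieval-augmented generation over local files.
# A toy bag-of-words retriever stands in for a real embedding index,
# and `local_llm` is a hypothetical call to a locally running model.

import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank local documents by similarity to the query; return the top k."""
    q = Counter(query.lower().split())
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def local_llm(prompt: str) -> str:
    """Placeholder for a locally running model; nothing leaves the device."""
    return f"(model response grounded in: {prompt[:60]}...)"

def answer(query: str, docs: list[str]) -> str:
    """Build a context-grounded prompt from retrieved files, then generate."""
    context = "\n".join(retrieve(query, docs))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return local_llm(prompt)

notes = [
    "Q3 planning doc: budget review scheduled for October.",
    "Recipe: bake the sourdough at 230 C for 40 minutes.",
]
print(answer("When is the budget review?", notes))
```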
With so much emphasis on cloud-based LLM services lately, this announcement is a breath of fresh air. With the chatbot running locally, all the data stays on the device -- there's no need to share it with a third party or even over the internet. In other words, you can process sensitive data without worrying about it going off the device.
Of course, there are certain requirements that must be met on the PC itself, such as having a GeForce RTX 30 Series GPU, 8 GB of video RAM and a Windows 10 or 11 OS. But regardless, the ability to use a personalized and contextually aware chatbot on a personal device is quite powerful in itself.
Mike Leone is a principal analyst at TechTarget's Enterprise Strategy Group, where he covers data, analytics and AI.
Enterprise Strategy Group is a division of TechTarget. Its analysts have business relationships with technology vendors.