Announcing the Artificiality Summit 2025!
Don't miss the super early bird special at the end of the email...
This week: new research and product previews from Apple, Google, and OpenAI; an interview with Richard Kerris of NVIDIA; crafting better prompts; an interview with Tyler Marghetis; and an exploration of generative AI and flow.
This Week:
Just when you might have thought the pace of new generative AI technology would slow, Google and OpenAI released product previews that represent material leaps. Both announcements showcased technologies that are available only to limited audiences, leaving us with the feeling that both companies are posturing to ward off competition and/or stoke investor interest: OpenAI is in the midst of fundraising, and Google's stock price has lagged. While OpenAI's release is clearly remarkable, we're more focused on the fundamental changes behind Google's new version of Gemini, which, we think, should quiet the narrative that Google has been left behind.
OpenAI Sora
OpenAI released a preview of Sora, a new text-to-video technology. To date, OpenAI's image generation technology has been somewhat underwhelming compared with competition from MidJourney and Stable Diffusion. Sora, however, is a major leap forward in image quality and video duration. AI video generation currently offers exciting potential, but the reality is low-quality images, frequent errors, and videos limited to only a few seconds. The Sora demo shows videos up to a minute long with high-quality images and complex compositions like reflections. For instance, take a look at the reflections in the water, sunglasses, and even earrings in this video of a woman walking on a Tokyo street.
Sora isn’t without errors, however. OpenAI shared videos with objects morphing and spontaneously appearing, distorted body parts, and implausible movements. Even in the showcase examples, you can find errors if you look closely enough. For instance, watch people spontaneously disappear in this video of people walking along a snowy street in Tokyo.
Despite the errors, Sora represents a significant improvement over current technology. And, in a somewhat unusual move, OpenAI is not releasing Sora yet, citing safety concerns; for now, it is working with experts on misinformation, hateful content, and bias. The company also says it will embed metadata so that Sora-generated videos are easy to identify.
Related to AI & creativity, make sure to check out our interview with Richard Kerris, VP of Developer Relations and GM of Media & Entertainment at NVIDIA, about the impact he anticipates AI will have on the creative industries. And take a look at our coverage of recent Apple research presenting a unique approach to vision modeling that hints at Apple's likely strategic imperative towards heavily integrating vision models in spatial computing environments.
Google Gemini 1.5
The second major product preview this week was Gemini 1.5 from Google. Yes, Gemini was announced in December and fully released only about a week ago. Perhaps determined not to appear behind again, Google is already shifting focus to the next version by releasing research about, and developer access to, Gemini 1.5.
The two most important parts of Gemini 1.5 appear to be:

- An unprecedented context window: Gemini 1.5 Pro can process up to 1 million tokens (with up to 10 million demonstrated in research), far beyond today's production models.
- A new mixture-of-experts architecture that delivers performance comparable to Gemini 1.0 Ultra while using significantly less training compute.
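For developers with preview access, working with the larger context window should look roughly like any other Gemini call, just with far more input. Below is a minimal sketch, assuming Google's google-generativeai Python SDK; the model name, API key, and input file are placeholder assumptions for illustration, not confirmed details of the preview.

```python
# A minimal sketch of a long-context request to Gemini 1.5 Pro via
# Google's google-generativeai SDK. The model name and input file
# are assumptions for illustration; preview access is limited.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # assumes an AI Studio key

model = genai.GenerativeModel("gemini-1.5-pro-latest")

# The headline feature: a context window of up to 1 million tokens,
# enough for a book-length document or a sizable codebase in one prompt.
with open("entire_codebase.txt") as f:
    big_context = f.read()

response = model.generate_content(
    [big_context, "Summarize the architecture of this codebase."]
)
print(response.text)
```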
The speed with which generative AI vendors are updating products is beyond anything we've seen in previous technology shifts. And the changes don't seem to be slowing down. What to do? Plan on a continuous learning journey. Unlike with pretty much any other technology, an individual can't learn about generative AI once and expect that knowledge to stay current for long. A continuous learning journey seems to be the only way to keep pace and anticipate the future.
We have structured Artificiality Pro as a continuous learning journey specifically for this reason. Get in touch if you’d like to learn more about our upcoming research releases on AI Trust, AI-Enhanced Learning, and Human-Centered AI Design.
Gemini 1.5 Pro's ability to handle unprecedented context lengths, its superior performance compared to its predecessors, and the sustained relevance of power laws in its design all underscore the breadth and depth of Google's long-term capabilities.
By understanding the principles behind the evolving field of prompt engineering, we can craft better queries and engage more effectively with AI. These are insights we can all use to sharpen our own interactions with AI, even if we're not writing the code ourselves.
An interview with Richard Kerris, Vice President of Developer Relations and GM of Media & Entertainment at NVIDIA, about AI, creators, and developers.
It appears that there is one effect many researchers are finding across multiple fields: generative AI has a significant impact on lower-skilled and less experienced people. However, if we automate difficult tasks, we may cut ourselves off from essential components of achieving mastery, such as flow.
An interview about the lulls and leaps of human imagination with Tyler Marghetis, Assistant Professor of Cognitive & Information Sciences at the University of California, Merced.
Apple researchers recently published a paper describing a new architecture for vision models. The paper's unique approach to vision modeling hints at Apple's likely strategic imperative towards heavily integrating vision models in spatial computing environments.
The Artificiality Weekend Briefing: About AI, Not Written by AI