Wednesday, January 15, 2025


Zuckerberg and Nvidia CEO Unveil Meta’s Video Vision AI

Meta AI had a palpable hit last year with Segment Anything, a machine learning model that could quickly and reliably identify and outline just about anything in an image. The sequel, which CEO Mark Zuckerberg debuted on stage Monday at SIGGRAPH, takes the model to the video domain, showing how fast the field is moving.

Advancing Segmentation Technology

Segmentation is the technical term for when a vision model looks at a picture and picks out the parts: “this is a dog, this is a tree behind the dog” hopefully, and not “this is a tree growing out of a dog.”
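Concretely, a segmentation model's output can be thought of as a per-pixel label map over the image. The toy sketch below (with made-up class ids, not anything from Meta's model) shows the idea:

```python
# Toy illustration of a segmentation output: a per-pixel label map,
# here with hypothetical class ids 0 = background, 1 = dog, 2 = tree.
label_map = [
    [2, 2, 0, 0],
    [2, 2, 1, 1],
    [0, 0, 1, 1],
]

def pixels_for(label, mask):
    # Collect the (row, col) coordinates assigned to one class,
    # i.e. the region the model "picked out" for that object.
    return [(r, c) for r, row in enumerate(mask)
                   for c, v in enumerate(row) if v == label]

dog_pixels = pixels_for(1, label_map)
```

A real model produces masks like this at full image resolution, one per detected object, rather than a single small class grid.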

Introducing Segment Anything 2 (SA2)

Segment Anything 2 (SA2) is a natural follow-up in that it applies natively to video and not just still images; though you could, of course, run the first model on every frame of a video individually, it’s not the most efficient workflow.
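That frame-by-frame workflow is easy to sketch. In the snippet below, `segment_frame` is a hypothetical stand-in for a real model call (such as SAM's automatic mask generator); the point is structural: each frame is processed from scratch, with no state carried between frames, which is what makes the approach inefficient for video.

```python
# Sketch of the naive per-frame workflow: run an image-only
# segmentation model independently on every video frame.

def segment_frame(frame):
    # Hypothetical placeholder for a real per-image model call.
    # Here it just returns one dummy mask record so the loop runs.
    return [{"area": sum(sum(row) for row in frame)}]

def segment_video_naive(frames):
    # No temporal context is shared between frames: each call starts
    # over, re-doing work a video-native model can reuse.
    return [segment_frame(f) for f in frames]

frames = [
    [[0, 1], [1, 0]],  # frame 0
    [[1, 1], [1, 1]],  # frame 1
]
masks_per_frame = segment_video_naive(frames)
```

A video-native model like SA2 can instead track objects across frames, avoiding this redundant per-frame computation.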

Computational Demands and Industry Advances

Processing video is, of course, much more computationally demanding, and it’s a testament to the advances made across the industry in efficiency that SA2 can run without melting the datacenter. Of course, it’s still a huge model that needs serious hardware to work, but fast, flexible segmentation was practically impossible even a year ago.

AI Availability and Accessibility

Like the first, the model will be open and free to use, though there's no word of a hosted version of the kind these AI companies sometimes offer. But there is a free demo.

AI Training Data and Database Release

Naturally, such a model takes a ton of data to train, and Meta is also releasing a large, annotated database of 50,000 videos that it created for this purpose. In the paper describing SA2, another database of over 100,000 "internally available" videos was also used for training, and this one is not being made public — I've asked Meta for more information on what this is and why it is not being released. (Our guess would be that it's sourced from public Instagram and Facebook profiles.)

Meta’s Leadership in Open AI

Meta has been a leader in the "open" AI domain for a couple of years now, though it has actually (as Zuckerberg opined in the conversation) been doing so for much longer, with tools like PyTorch. More recently, LLaMa, Segment Anything, and a few other models it has put out freely have become a relatively accessible bar for AI performance in those areas, although their "openness" is a matter of debate.

AI Motivations Behind Openness

Zuckerberg mentioned that the openness is not entirely out of the goodness of their hearts over at Meta, but that doesn’t mean their intentions are impure: “This isn’t just like a piece of software that you can build — you need an ecosystem around it. It almost wouldn’t even work that well if we didn’t open source it, right? We’re not doing this because we’re altruistic people, even though I think that this is going to be helpful for the ecosystem — we’re doing it because we think that this is going to make the thing that we’re building the best.”
