In the 51st session of #MultimodalWeekly, we have three exciting presentations from a startup founder and researchers working in Multimodal AI. ✅ Jay Chia, the co-founder of Eventual, will show how to build a DIY multimodal data lake with Daft DataFrames. -> Check out Daft: https://www.getdaft.io/ ✅ Saptarshi Sinha, a Ph.D. researcher at the University of Bristol, will present his work "Every Shot Counts: Using Exemplars for Repetition Counting in Videos." -> Read the paper: https://lnkd.in/gUdY2NCh ✅ Yunhua Zhang, a Ph.D. candidate at UvA, will present her work "Low-Resource Vision Challenges for Foundation Models." -> Read the paper: https://lnkd.in/gXSe_ed5 Register for the webinar here: https://lnkd.in/gJGtscSH 👈 Join our Discord to connect with the speakers: https://lnkd.in/gRt4GdDx 🤝
Twelve Labs
Software Development
San Francisco, California 6,029 followers
Help developers build programs that can see, listen, and understand the world as we do.
About us
Helping developers build programs that can see, hear, and understand the world as we do by giving them the world's most powerful video-understanding infrastructure.
- Website
- http://www.twelvelabs.io
- Industry
- Software Development
- Company size
- 11-50 employees
- Headquarters
- San Francisco, California
- Type
- Privately Held
- Founded
- 2021
Locations
- Primary
555 Mission St
San Francisco, California 94105, US
Updates
-
~ New Webinar ~ The recording of #MultimodalWeekly 49 with Jiwoo Hong from KAIST AI, and Associate Professor Lei Huang and Baichuan Zhou from Beihang University is up! 📺 Watch here: https://lnkd.in/gjtn4ZQX They discussed: - Motivation for ORPO: RLHF with PPO, DPO, and SFT in alignment - Experimental results of ORPO in single-turn and multi-turn instruction following - Efficiency and scalability of ORPO - The opportunity for small-scale LMMs - How to merge the vision modality into small LMMs? - TinyLLaVA: From the model, data, and training perspectives Join our Discord community: discord.gg/Sh6BRfakJa 🤝
Single-Step Language Model Alignment & Smaller-Scale Large Multimodal Models | Multimodal Weekly 49
-
~ New Webinar ~ The recording of #MultimodalWeekly 48 with Letian (Max) Fu from the University of California, Berkeley, and Bo Zhao from the Beijing Academy of Artificial Intelligence (BAAI) is up! 📺 Watch here: https://lnkd.in/gX_iSzuD They discussed: - Touch as a sensing modality is missing in multimodal models - Touch-vision-language dataset - TVL-Tactile Encoder & TVL-LLaMA - SVIT: Scaling Up Visual Instruction Tuning - Bunny: A concise open-source lightweight multimodal LLM - M3D: Advancing 3D medical image analysis with multimodal LLMs - MLVU: A comprehensive benchmark for multi-task long video understanding Join our Discord community: discord.gg/Sh6BRfakJa 🤝
Modality Alignment for Multimodal Perception & Open-Source Lightweight MLLM | Multimodal Weekly 48
-
In the 50th session of #MultimodalWeekly, we have two exciting presentations from startup founders building real-world products for Multimodal AI applications. ✅ Jesse N. Clark, the Co-Founder and CTO of Marqo AI, will discuss generalized contrastive learning for multimodal retrieval and ranking. They generalize the popular CLIP training method to accommodate any number of texts and images when representing documents and to encode relevance (or rank), providing better first-stage retrieval. 📄 ✅ Alexandre Berkovic, the Co-Founder and CEO of Adorno AI, will dive into how video and audio understanding technologies from Twelve Labs and Adorno AI are transforming video production. 📻 Register for the webinar here: https://lnkd.in/gJGtscSH 👈 Join our Discord community: https://lnkd.in/gRt4GdDx
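For context, Marqo's approach generalizes the standard CLIP objective, which scores one image against one text per example with a symmetric contrastive loss. The snippet below is a minimal PyTorch sketch of that baseline CLIP-style loss only (the function name, shapes, and temperature are illustrative assumptions), not Marqo's generalized contrastive learning itself:

```python
# Minimal sketch of the standard CLIP-style contrastive loss that GCL generalizes.
# Illustrative only; not Marqo's implementation.
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb: torch.Tensor,
                          text_emb: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    # image_emb, text_emb: (batch, dim) outputs of the image and text encoders
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature  # (batch, batch) similarities
    targets = torch.arange(logits.size(0))           # matching pairs sit on the diagonal
    # Symmetric cross-entropy over image-to-text and text-to-image directions
    return (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets)) / 2

# Example with random embeddings standing in for encoder outputs
loss = clip_contrastive_loss(torch.randn(8, 512), torch.randn(8, 512))
print(loss.item())
```

As the post describes, GCL extends this setup to documents represented by any number of text and image fields and to graded relevance (or rank) rather than strict one-to-one pairs.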
-
Twelve Labs will be attending AWS Summit NY on July 10! Connect with our team to learn how you can streamline all your video-related workflows with our multimodal AI models and discuss the latest in tech. Don’t hesitate to say hello when you spot any of our team members: Jae Lee, Soyoung Lee, Maninder Saini, and Andy Vaughan. We can’t wait to see everyone there! #AWSSummit #AWSNY
-
We have an exciting new collaboration with the Phyllo team to transform video insights on social media 😉 🌟 Why This Matters 🌟 With social media shifting to video, extracting insights is crucial. Video posts get up to 10 times more engagement, and 74% of users take action after viewing a brand's video. 🔍 The Phyllo and Twelve Labs Advantage 🔍 Phyllo: - Customizable searches across 15+ social media platforms. - Cost-effective social data access. Twelve Labs: - Foundation models that analyze videos through visual, audio, and text modalities. - Offers semantic video search, zero-shot classification, video-to-text generation, and multimodal video embeddings. 🌐 Innovative Use Cases 🌐 1 - Insights for Videos: Get detailed answers, summaries, and sentiment analysis. 2 - Product Development: Analyze product usage in social videos. 3 - Byte-Sized Segments: Break long videos into short clips for Instagram and TikTok. 4 - Influencer Insights: Identify influencers using specific products and their impact. Read more about our collaboration here: https://lnkd.in/gC9Zjmgp 👀
-
~ New Webinar ~ The video recording of #MultimodalWeekly 47 with Benjamin Muller, Tu Anh NGUYEN, and Bokai Yu from AI at Meta is up! 📺 Watch here: https://lnkd.in/guZ5C_mU 👀 They discussed: - Challenges of expressive speech generation - SpiRit-LM combines TextLM and SpeechLM - SpiRit-LM training recipe and generation samples - Evaluation: zero-shot, few-shot, and text-speech sentiment-preservation benchmark - Can we observe the speech-text alignment? Join our Discord community: discord.gg/Sh6BRfakJa 🤝
SpiRit-LM, an Interleaved Spoken and Written Language Model | Multimodal Weekly 47
-
🏇 We are excited to announce the launch of Jockey: A Conversational Video Agent powered by Twelve Labs APIs and LangGraph from LangChain! Here's why developers should dive into Jockey: 👇 1 - Advanced Video Understanding: Jockey utilizes Twelve Labs' state-of-the-art video foundation models to extract rich insights from video content, offering capabilities like video search, classification, summarization, and more. 📽 2 - Flexible and Scalable Framework: Built on LangGraph, Jockey provides unparalleled control over the flow of code, prompts, and LLM calls, facilitating robust human-agent collaboration and ensuring reliable performance. ⛓ 3 - Efficient and Precise Architecture: Jockey's architecture includes key components such as the Supervisor, the Planner, and specialized Workers that handle tasks like video search, text generation, and editing, ensuring optimal token usage and accurate node responses. 🏛 4 - Customizable and Extensible: Jockey's modular design allows for easy customization and extension. Developers can modify prompts, extend state management, or add new workers to tailor Jockey to specific needs, making it a versatile foundation for advanced video AI applications. 🤟 Full blog post here: https://lnkd.in/gbudqhKM 😎
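For a rough sense of how a Supervisor/Planner/Worker layout is wired in LangGraph, here is a minimal, hypothetical sketch with stubbed node logic. The state fields, node names, and routing below are illustrative assumptions for this post, not Jockey's actual code; the real implementation (LLM prompts, Twelve Labs API calls, editing workers) lives in the blog post and repository linked above.

```python
# Hypothetical Supervisor/Planner/Worker sketch in LangGraph; node bodies are stubs.
from typing_extensions import TypedDict
from langgraph.graph import StateGraph, END


class JockeyState(TypedDict):
    request: str   # user's natural-language request
    plan: str      # plan produced by the Planner
    results: list  # outputs collected from Workers


def supervisor(state: JockeyState) -> JockeyState:
    # The Supervisor only inspects state here; routing happens in route_next below.
    return state


def planner(state: JockeyState) -> dict:
    # Stub: a real Planner would call an LLM to break the request into steps.
    return {"plan": f"search the index for: {state['request']}"}


def video_search_worker(state: JockeyState) -> dict:
    # Stub: a real Worker would call the Twelve Labs search API here.
    return {"results": state["results"] + ["<matching video clip>"]}


def route_next(state: JockeyState) -> str:
    # Plan first, then run the worker, then finish.
    if not state["plan"]:
        return "planner"
    if not state["results"]:
        return "video-search"
    return "done"


graph = StateGraph(JockeyState)
graph.add_node("supervisor", supervisor)
graph.add_node("planner", planner)
graph.add_node("video-search", video_search_worker)
graph.set_entry_point("supervisor")
graph.add_conditional_edges(
    "supervisor", route_next,
    {"planner": "planner", "video-search": "video-search", "done": END},
)
graph.add_edge("planner", "supervisor")
graph.add_edge("video-search", "supervisor")

app = graph.compile()
print(app.invoke({"request": "find clips of the winning goal", "plan": "", "results": []}))
```

The conditional edge out of the Supervisor is what gives this pattern its control: every Worker hands results back to the Supervisor, which decides whether more work is needed or the answer is ready.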
-
~ New Webinar ~ The video recording of #MultimodalWeekly 46 with Anoop Thomas from EMAM, Inc. is up! 📺 Watch here: https://lnkd.in/gZnWiYNS 👀 He discussed: - How eMAM provides an end-to-end media workflow - eMAM's technology partners - eMAM's architecture and deployment options - A live demo of Twelve Labs model capabilities in the eMAM product Join our Discord community: discord.gg/Sh6BRfakJa 🤝
Enhancing Video Production & Media Search with eMAM and Twelve Labs | Multimodal Weekly 46
-
Exciting times at Twelve Labs! Our team just returned from #CVPR2024 in Seattle last week, and what an incredible experience it was! 🌌 CVPR did not disappoint this year. We immersed ourselves in the latest advancements in video understanding and multimodal AI - areas at the core of our mission at Twelve Labs. Some highlights: 🌟 • Engaging discussions on cutting-edge research in multimodal foundation models • Insights into the latest trends in video embedding and retrieval • Connecting with brilliant minds pushing the boundaries of video-language modeling 🔬 Calling all ML Researchers! 🔬 Are you passionate about advancing the field of video understanding and multimodal AI? We're expanding our ML Research team and looking for talented individuals to join us on this exciting journey. Open Roles: • ML Research Scientist • Research Internships If you're ready to tackle challenging problems in video foundation models and multimodal LLMs, we want to hear from you! Learn more and apply: https://lnkd.in/ggc-mYa8 ◀ Aiden L., Hyojun Go, Ryan Scott, Kate Chen, Sunny Hien Nguyen, James Le, Jenny Jayoung Ahn, Minjoon Seo