Is AGI Just a Fantasy?

Published 2024-06-15
Nick Frosst, co-founder of Cohere, on the future of LLMs and AGI. Learn how Cohere is solving real problems for businesses with its new AI models.

Nick talks about his journey at Google Brain, working with AI legends like Geoff Hinton, and the amazing things his company, Cohere, is doing. From creating the most useful language models for businesses to making tools for developers, Nick shares a lot of interesting insights. He even talks about his band, Good Kid! Nick says that RAG is one of the standout features of Cohere's new Command R models. We are about to release a deep dive on RAG with Patrick Lewis from Cohere, in which he explains why their models are specifically optimised for RAG use cases, so keep an eye out for that.
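
For a sense of what that RAG optimisation looks like in practice, here is a minimal sketch of grounded generation with Command R via Cohere's Python SDK. The query, document snippets, and API key below are made-up placeholders, and exact parameter names can vary between SDK versions, so treat this as an illustration rather than a definitive reference.

import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key; use your own

# Pass retrieved snippets alongside the user message; Command R grounds
# its answer in these documents rather than relying only on its weights.
response = co.chat(
    model="command-r",
    message="What did the Q3 report say about revenue?",
    documents=[
        {"title": "q3-report", "snippet": "Q3 revenue grew 18% year over year."},
        {"title": "q3-memo", "snippet": "Growth was driven by enterprise contracts."},
    ],
)

print(response.text)       # the grounded answer
print(response.citations)  # spans tying the answer back to the documents

The citations are the point: the model tells you which snippet each claim came from, which is what makes these models practical for the business use cases Nick describes.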

Learn more about Cohere Command R models here:
cohere.com/command
github.com/cohere-ai/cohere-toolkit

Nick's band Good Kid:
goodkidofficial.com/

Nick on Twitter:
x.com/nickfrosst

00:00:00 Intro
00:01:55 Backstory of Cohere
00:02:31 Hinton
00:02:54 Nick's band
00:03:11 How is Cohere differentiated?
00:03:44 Not an AGI company
00:06:00 Command R
00:06:41 Standout feature: RAG
00:09:07 How is RAG changing the way we build apps?
00:09:44 Build day
00:10:25 Building robust applications
00:11:59 RAG evolution
00:14:45 Unsupervised RAG
00:16:30 Agents and divergence
00:18:27 Agency
00:22:19 Are LLMs general?
00:24:48 Benchmarks
00:27:07 Would Cohere verticalize?
00:27:43 RAG vs long context
00:29:20 Tech hasn't landed yet?
00:31:36 Are LLMs saturating?
00:35:50 Cohere's data acquisition pipeline
00:36:34 SOTA chasing vs fairness
00:37:21 Fall of the data scientist
00:40:37 Final callouts

Disclaimer: This is the first video from our Cohere partnership. We were not told what to say in the interview, and nothing was edited out.

Comments
  • This is a serious guy, so refreshing to listen to someone with their head screwed on correctly
  • @NER0IDE
    I don't know how they manage to keep bringing these amazing guests episode after episode
  • It says something about the current state of things when a company saying “we aren’t building digital gods. We are trying to solve real world problems” is a green flag. Excellent video as always MLST. I’m 10 minutes in and I can see the channel improving with every vid. This deep dive, direct to the source, appropriately skeptical content is needed in this parrot-filled AI hype cacophony.
  • Wow.. some sanity, humility and thoughtfulness brought to the AI debate... I applaud!
  • @RonVolkovinsky
    The most coherent and down to earth LLM discussion I've heard in a while!
  • @AAjax
    Great interview, but it's surreal hearing the lead singer from Good Kid talking about his company's ML products. A Renaissance man. For anybody curious about the band, check out "good kid no time to explain".
  • @mrdbourke
    That opening sentence is so refreshing to hear 👏
  • @GabrielVeda
    The question I wished you had asked is this: “In what fundamental ways does your thinking about LLMs, AI and AGI differ from Hinton’s?”.
  • @smicha15
    This is such a high-quality and immensely well-timed channel. The interviews are really right on with the SOTI (state of the industry). Just made that up, by the way. Enjoy.
  • @LuigiSimoncini
    The face Nick makes every time Tim tries to bring him into the AGI, intelligence, sentience, agency... BS debate... And then him lecturing Tim on how LLMs really work. PRICELESS!
  • @toadlguy
    What a great interview. Nick Frosst seems to have a really good understanding of the current state of AI and a refreshingly open view of the landscape.
  • @codediporpal
    So much going on I keep forgetting how good this channel is and forget to watch!
  • If I'm doing something complex with code, I wouldn't use AI at all, because the engineer needs to understand their 10,000+ lines of code holistically in order to predict the corner cases and bugs, and AI systems are terrible at that even with simple code (the big problem with AI is that it doesn't reason through the consequences of a multi-component system). The human brain, by contrast, seems able to keep tracking complexity almost indefinitely by summing and subtracting effects and then scaling: given thousands of lines of code, we are better at understanding where and how a corner case will emerge, because our relational context windows are broader, more robust, and more interrelated. Also, we have to stop repeating the false claim that AI ingests more data than humans yet performs worse. Humans consume FAR more sensory data than AI (millions of distinct sensory channels, trillions of detections per second), and it's not even close. All this multi-directional data follows cause-and-effect progression patterns, which lets us learn the sequential meaning of words indirectly: non-linguistic data is correlated with linguistic data, making language easier for humans, because we ingest far more data per second than a language model does.
  • @halseylynn5161
    This is the least fake-hype AI lab head I've seen; it is so refreshing! Everyone else is yelling their heads off about tech that we might have in a decade rather than focusing on what we do have. Don't get me wrong, the quest for AGI is noble and important, but holy hell, there is work to be done in the here and now too.
  • As a counter-perspective: great company, first off. However, he keeps saying the limitation is the data the models are trained on, and thus they will never be agents. He's missing the underlying intuition. Using the "platonic representation hypothesis" as grounding, the data they're trained on isn't a limitation but a benefit. We will get to a point where LMs have hierarchical representations. That's why I'm so excited about all the grokking research: I believe the answer lies with fully grokked transformers and much better data (actually providing synthetic insights about the data, not 100% synthetic data; raw data is so oblique). Assuming you're optimizing for ultra-long sequences and afford the model the ability to exploit this behavior with extended test-time compute... why couldn't an agent be superior to human CEOs? Hierarchical representations, IMO, would result in a model with very nuanced "intuition". Give the model the special embeddings to control multi-turn output itself, and I don't see why we can't have legit agents. That's why I believe data engineering will be a true competitive edge and a TRUE moat. The reason my intuition is so strong here: the fact that we train on raw data and get current capabilities from post-training preference tuning makes the answer clear. The data is too oblique; hell, even how we sample during pretraining is subpar, IMO, and I think a lot of hallucinations are knowledge conflicts. It's why I liken LMs to adolescent autism. The first labs that add behavioral psychologists to their data teams win 😂. Idk about AI gods, I'm a Christian 😂, but I do know we are nowhere near a ceiling with LMs, and I don't mean the caveman style of pure scale increase.
  • @Jononor
    Very happy to see such a pragmatic, focused and grounded view of Large Language Models. Both that it is represented in the people building such systems today, and that it gets surfaced here on MLST. The philosophical discussions that we often have are super interesting and worthwhile, but we must not entirely forget where we actually stand with today's technology. And most importantly, not fool ourselves into thinking that we are much further along than we really are. Or be fooled by organizations pretending that they are, in order to push their own agenda (possibly to our detriment as citizens...).
  • @enthuesd
    Asking all the right questions, thank you