Mechanistic Interpretability & Memory vs. Margins

In this episode, we provide updates from our Artificiality Pro presentation, including key developments in mechanistic interpretability for understanding AI models and considerations around the costs of large language models, aka memory vs. margins. We also highlight an upcoming webinar we are hosting on discussing AI with teenagers.

Episode Notes:

Mechanistic Interpretability

  • An emerging way to open up the "black box" of AI models by understanding their internal representations and computations.
  • Allows for safer and more reliable use of AI, especially in critical applications like healthcare.
  • Involves conceptualizing models in terms of "features" and "circuits" instead of just neurons.
  • Anthropic has developed new methods for creating sparse representations to query models.

Costs of Large Language Models: Memory vs. Margins

  • From our perspective, costs break down into prompt (input) and response (output) costs, each priced per 1,000 tokens.
  • Each turn of a conversation sends the full context back and forth, so costs rise rapidly as the conversation gets longer.
  • Assistants that aim to maintain long-term conversational context could get expensive because memory requires processing many more tokens.
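
To make the memory vs. margins point concrete, here is a minimal sketch of how prompt tokens pile up when every turn resends the whole conversation. The prices, token counts, and names (`Pricing`, `conversation_cost`) are hypothetical and chosen only for illustration; they are not any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in dollars per 1,000 tokens (not real rates).
    prompt_per_1k: float
    response_per_1k: float


def conversation_cost(turns, user_tokens, reply_tokens, pricing, keep_history=True):
    """Return (prompt_tokens, response_tokens, cost) for a whole conversation.

    Assumes every user message and every reply has a fixed token count.
    With keep_history=True, each turn's prompt includes all earlier
    messages and replies, which is how "memory" is simulated here.
    """
    history = 0          # tokens accumulated from earlier turns
    prompt_total = 0
    response_total = 0
    for _ in range(turns):
        prompt = user_tokens + (history if keep_history else 0)
        prompt_total += prompt
        response_total += reply_tokens
        # The next turn must resend this turn's prompt plus its reply.
        history = prompt + reply_tokens
    cost = (prompt_total / 1000) * pricing.prompt_per_1k \
        + (response_total / 1000) * pricing.response_per_1k
    return prompt_total, response_total, cost


if __name__ == "__main__":
    pricing = Pricing(prompt_per_1k=0.01, response_per_1k=0.03)
    with_memory = conversation_cost(10, 100, 200, pricing, keep_history=True)
    without_memory = conversation_cost(10, 100, 200, pricing, keep_history=False)
    print("with memory:   ", with_memory)
    print("without memory:", without_memory)
```

With these made-up numbers, a 10-turn chat sends 14,500 prompt tokens when history is kept, versus 1,000 when it is not. Response tokens are the same in both cases, so the extra cost comes entirely from re-reading context, and it grows roughly with the square of the number of turns.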

Upcoming Webinar on Teens & AI

  • We are hosting a free webinar on talking to teenagers about AI on December 23 at 11am Pacific.
  • Register here and tell all your friends!
