Debugging misaligned completions with sparse-autoencoder latent attribution

alignment.openai.com

1 point by rd 2 hours ago