2 Comments
Alberto Romero

Yeah, that paper from Anthropic (and the scaling monosemanticity one) prompted me to write this. I'm researching that topic to publish a deeper dive sometime down the line.

dan mantena

Looking forward to it. It seems like some level of progress, although it does not seem to give full certainty about what the AI will produce for a given prompt... assuming that is the point at which we can call AI explainability solved?
