PITTI - Article - Decoding intermediate activations in llama-2-7b

Decoding intermediate activations in llama-2-7b

Artificial Intelligence, Information Processing | Computing

Date : 2023-07-21

Introduction

In line with previous research, Nina Rimsky found that the decoded block outputs were interpretable at most layers, with the exception of a few early ones. She also found that the other intermediate outputs were interpretable, and she offers some intuition about what the different layers are responsible for. The post contains some very interesting insights.
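To give a rough idea of what "decoding" an intermediate activation can mean, here is a minimal sketch in plain Python. It assumes the common approach of normalizing an activation and projecting it onto the vocabulary with the unembedding matrix, then reading off the highest-scoring tokens. The vocabulary, the matrix `UNEMBED`, the activations, and the helper names `rms_norm` and `decode_activation` are all made up for illustration; this is not the code from the original post, and a real experiment would capture activations from llama-2-7b itself.

```python
import math


def rms_norm(x, eps=1e-6):
    """Rescale a vector to unit root-mean-square (learned gain omitted for brevity)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]


def decode_activation(activation, unembed, vocab, top_k=3):
    """Project an activation onto the vocabulary and return the top_k tokens."""
    normed = rms_norm(activation)
    # One logit per vocabulary entry: dot product with that token's unembedding row.
    logits = [sum(a * w for a, w in zip(normed, row)) for row in unembed]
    ranked = sorted(range(len(vocab)), key=lambda i: logits[i], reverse=True)
    return [vocab[i] for i in ranked[:top_k]]


# Toy setup (assumption): 4-dimensional hidden states and a 5-token vocabulary.
VOCAB = ["cat", "dog", "Paris", "France", "the"]
UNEMBED = [
    [1.0, 0.0, 0.0, 0.0],  # cat
    [0.9, 0.1, 0.0, 0.0],  # dog
    [0.0, 1.0, 0.0, 0.0],  # Paris
    [0.0, 0.8, 0.2, 0.0],  # France
    [0.0, 0.0, 0.0, 1.0],  # the
]

# Hypothetical activations captured at successive layers for one token position.
activations_by_layer = {
    0: [0.3, 0.1, 0.2, 0.9],
    1: [0.2, 0.6, 0.1, 0.3],
    2: [0.0, 1.5, 0.2, 0.1],
}

decoded = {
    layer: decode_activation(act, UNEMBED, VOCAB)
    for layer, act in activations_by_layer.items()
}
for layer, tokens in decoded.items():
    print(layer, tokens)
```

In this toy example the earliest layer decodes to a generic token while the later layers settle on related content words, which loosely mirrors the kind of layer-by-layer reading described above. The same function could be applied to any intermediate output of matching dimension, not only to block outputs.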
