@wassname
Contributor

This helps find out which methods work best, which models it works with, and so on.
For example, this shows that middle layers work best, that it only works on models >4B, and that diff_pca is best.
@wassname mentioned this pull request on Sep 21, 2025
@thiswillbeyourgithub
Contributor

it only works on model >4B

Just in case: have you checked whether the error message `Error at layer 27: 'Qwen3DecoderLayer' object has no attribute 'set_control'`, repeated over and over at the bottom, is an issue here?

@wassname
Contributor Author

Oh yeah, that is just because those layers were not transformed into repeng control blocks, and I just used a try/except pattern over all layers.
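As a rough illustration of why that message repeats without breaking anything, here is a minimal sketch of the try/except-over-all-layers pattern. `PlainLayer` and `ControlBlock` are hypothetical stand-ins (not the real `Qwen3DecoderLayer` or repeng classes), and `apply_control` is an assumed helper name; only the layers that expose `set_control` get steered, and the rest just log an error.

```python
class PlainLayer:
    """Hypothetical stand-in for a decoder layer that was never wrapped."""


class ControlBlock:
    """Hypothetical stand-in for a wrapped layer that accepts a control value."""

    def __init__(self):
        self.control = None

    def set_control(self, control):
        self.control = control


def apply_control(layers, control):
    """Try to steer every layer; return the indices that were skipped."""
    skipped = []
    for idx, layer in enumerate(layers):
        try:
            layer.set_control(control)
        except AttributeError as err:
            # Unwrapped layers have no set_control, so they land here.
            print(f"Error at layer {idx}: {err}")
            skipped.append(idx)
    return skipped


# Only the two middle layers are wrapped in this toy example.
layers = [PlainLayer(), ControlBlock(), ControlBlock(), PlainLayer()]
skipped = apply_control(layers, control=0.5)
```

Returning the skipped indices (an addition for this sketch, not something the PR is known to do) makes it easy to confirm that the errors only come from layers that were intentionally left unwrapped.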

Small models are kind of incoherent to start with, and I suspect they don't have well-developed inner concepts, which makes them harder to steer.
