Research · arXiv cs.AI · 21 h ago

Recoverable but Not Stationary: Local Linear Structures in Weights and Activations

The paper studies local linear structure in the weights and activations of pretrained models, using DistilGPT-2 and GPT-2 with LoRA adapters. It finds that learned behaviors can be manipulated along linear directions, but that these structures are not stationary: the useful basis changes significantly within a short training period. The results inform work on parameter perturbations and activation steering, and suggest that random parameter search can be justified in high-dimensional spaces, which is relevant for practitioners optimizing model performance.

linear structures · weights · activation
