Curriculum learning could also be tried when training the LoRA. Training would start entirely unmasked and the masking would then be introduced gradually, either layer by layer or by reducing the context.
That should probably boost the masked-LoRA performance a little.
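
A minimal sketch of what such a schedule might look like, assuming a linear ramp after an unmasked warm-up phase. The function names, the `warmup_frac` parameter, and the linear shape are all hypothetical choices for illustration; the comment itself does not specify any of them, nor how the mask is applied inside the model.

```python
def curriculum_progress(step: int, total_steps: int, warmup_frac: float = 0.2) -> float:
    """Fraction of the masking curriculum completed, in [0, 1].

    Assumption: the first `warmup_frac` of training is fully unmasked,
    then masking ramps up linearly until the final step.
    """
    warmup_steps = int(total_steps * warmup_frac)
    if step < warmup_steps:
        return 0.0  # entirely unmasked phase
    ramp_steps = max(1, total_steps - warmup_steps)
    return min(1.0, (step - warmup_steps) / ramp_steps)


def masked_layer_count(step: int, total_steps: int, num_layers: int,
                       warmup_frac: float = 0.2) -> int:
    """Layer-by-layer variant: how many layers are masked at this step."""
    return round(curriculum_progress(step, total_steps, warmup_frac) * num_layers)


def visible_context_len(step: int, total_steps: int, full_len: int, min_len: int,
                        warmup_frac: float = 0.2) -> int:
    """Context-reduction variant: how many context tokens stay visible."""
    progress = curriculum_progress(step, total_steps, warmup_frac)
    return round(full_len - progress * (full_len - min_len))
```

In a training loop, one of these would be called once per step and the result used to build whatever mask the masked-LoRA setup expects. Whether to ramp by layers or by context length is exactly the open choice mentioned above.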