Pruner-Zero is an innovative framework designed to evolve symbolic pruning metrics for Large Language Models (LLMs) without retraining. The framework leverages Genetic Programming (GP) to search for optimal symbolic pruning metrics, which are then used to prune LLMs efficiently. Key contributions include:
1. **Symbolic Pruning Metric Discovery**: Pruner-Zero formulates pruning metric discovery as a Symbolic Regression problem, creating a comprehensive search space that subsumes existing pruning metrics (see the sketch after this list).
2. **Opposing Operation Simplification (OOS)**: This strategy reduces redundancy in the search space by identifying and eliminating opposing operations, making metric discovery more efficient (a sketch follows at the end of this section).
3. **Performance Evaluation**: Extensive experiments on LLaMA and LLaMA-2 models demonstrate that Pruner-Zero outperforms state-of-the-art post-training pruning methods, achieving superior performance in language modeling and zero-shot tasks without retraining or weight updates.
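To make the search space concrete, here is a minimal sketch of how a candidate symbolic pruning metric can be encoded as an expression tree and applied layer-wise. The tree below encodes |W| · ‖X‖₂ (the Wanda metric, one of the existing metrics the search space covers); the tensor shapes, the 50% sparsity target, and the helper names `evaluate` and `prune_layer` are illustrative assumptions, not the framework's actual implementation.

```python
# Sketch only: a candidate pruning metric as an expression tree, assuming
# a (op, children) tuple encoding. Shapes and sparsity are toy values.
import torch

# Encodes |W| * ||X||_2: elementwise weight magnitude scaled by input norms.
METRIC_TREE = ("mul", [("abs", [("leaf", "W")]), ("leaf", "X_norm")])

def evaluate(node, inputs):
    """Recursively evaluate an expression tree on per-layer tensors."""
    op, args = node
    if op == "leaf":
        return inputs[args]
    vals = [evaluate(child, inputs) for child in args]
    if op == "abs":
        return vals[0].abs()
    if op == "mul":
        return vals[0] * vals[1]
    raise ValueError(f"unknown op: {op}")

def prune_layer(W, X_norm, sparsity=0.5):
    """Zero out the lowest-scoring weights; no retraining or weight update."""
    scores = evaluate(METRIC_TREE, {"W": W, "X_norm": X_norm})
    k = int(W.numel() * sparsity)
    threshold = scores.flatten().kthvalue(k).values
    return torch.where(scores > threshold, W, torch.zeros_like(W))

# Toy usage: a 4x8 weight matrix and per-input-channel activation norms.
W = torch.randn(4, 8)
X_norm = torch.rand(8)               # broadcasts across output rows
pruned = prune_layer(W, X_norm)
print((pruned == 0).float().mean())  # ~0.5 sparsity
```

Genetic Programming then evolves such trees by mutating and recombining nodes, scoring each candidate by the perplexity of the model it produces.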
The framework's effectiveness is validated through various experiments, including language modeling on the WikiText2 dataset and zero-shot tasks on the EleutherAI LM Harness benchmark. The results show that Pruner-Zero consistently achieves lower perplexity and better performance compared to existing methods, highlighting its potential for efficient and effective LLM pruning.
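As promised above, here is a minimal sketch of how OOS might collapse opposing operations on the same tree encoding used earlier. The specific opposing pairs (exp/log, sqr/sqrt) and the `simplify` helper are assumptions for illustration; the paper's full operation set and simplification rules may differ.

```python
# Sketch only: collapse parent-child pairs of opposing unary operations,
# e.g. exp(log(x)) -> x, so GP does not waste search budget on redundant trees.
OPPOSING = {("exp", "log"), ("log", "exp"), ("sqr", "sqrt"), ("sqrt", "sqr")}

def simplify(node):
    """Recursively remove opposing unary-op pairs from an expression tree."""
    op, args = node
    if op == "leaf":
        return node
    args = [simplify(child) for child in args]
    if len(args) == 1:
        child_op, child_args = args[0]
        if (op, child_op) in OPPOSING:
            return child_args[0]  # drop both opposing operations
    return (op, args)

# exp(log(|W|)) simplifies to |W|, shrinking the space GP must explore.
tree = ("exp", [("log", [("abs", [("leaf", "W")])])])
print(simplify(tree))  # ('abs', [('leaf', 'W')])
```

By pruning such redundant expressions before evaluation, the search spends its budget only on structurally distinct candidate metrics.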