FedBiOT is a federated learning approach for locally fine-tuning large language models (LLMs) without giving clients access to the full model. The method splits a compressed version of the LLM into two components: an emulator and an adapter. The emulator, prepared by the server on a public dataset, mimics the behavior of the full model, while the adapter captures domain-specific linguistic patterns from the clients' private data. The server performs the compression and keeps the emulator aligned with the full model; the clients fine-tune only the adapter. To mitigate the negative impact of the data discrepancy between the server's public corpus and the clients' local data, training is formulated as a bi-level optimization problem (sketched below) in which the server and the clients alternately update their respective modules. Parameter-efficient fine-tuning techniques such as LoRA further reduce the number of trainable parameters and the communication cost.

Extensive experiments with LLaMA-2 show that FedBiOT significantly reduces resource consumption compared with existing benchmarks while achieving comparable performance, and it outperforms existing methods on tasks such as math problem solving, code generation, and question answering. Its low computation and communication overhead makes it well suited to federated learning scenarios in which clients have limited resources.
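For intuition, the server-client interaction can be written as a bi-level problem of roughly the following form. The notation (adapter parameters $\theta_a$, emulator parameters $\theta_e$, client losses $\mathcal{L}_k$ with weights $n_k/n$, and an alignment loss $\mathcal{L}_{\mathrm{align}}$ on public data $\mathcal{D}_{\mathrm{pub}}$) is an illustrative reconstruction, not the paper's exact formulation.

```latex
% Hedged sketch of the bi-level objective: the outer (client-side) level
% fine-tunes the adapter on federated client data, while the inner
% (server-side) level keeps the emulator aligned with the full model on
% public data. Symbols are illustrative, not the paper's exact notation.
\begin{aligned}
\min_{\theta_a}\quad & \sum_{k=1}^{K} \frac{n_k}{n}\,
  \mathcal{L}_k\bigl(\theta_a;\ \theta_e^{*}(\theta_a)\bigr) \\
\text{s.t.}\quad & \theta_e^{*}(\theta_a) \in \arg\min_{\theta_e}\
  \mathcal{L}_{\mathrm{align}}\bigl(\theta_e, \theta_a;\ \mathcal{D}_{\mathrm{pub}}\bigr).
\end{aligned}
```

The outer level corresponds to the clients' federated adapter updates, and the inner level to the server's alignment of the emulator with the full model on public data.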
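To make the decomposition concrete, the PyTorch sketch below shows one way the emulator/adapter split and the LoRA wrapping could look. The class and function names (`LoRALinear`, `add_lora`, `split_into_emulator_and_adapter`), the layer counts, and the LoRA rank are illustrative assumptions for a generic decoder-only transformer, not FedBiOT's actual implementation.

```python
# Hedged sketch: freeze the bottom layers as the emulator and make only
# LoRA parameters in the top (adapter) layers trainable and communicable.
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank (LoRA) update."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():          # freeze the pretrained weights
            p.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.scaling = alpha / rank               # B starts at zero, so output is unchanged initially

    def forward(self, x):
        return self.base(x) + self.scaling * (x @ self.lora_A.T) @ self.lora_B.T


def add_lora(module: nn.Module, rank: int = 8):
    """Recursively wrap every nn.Linear inside `module` with a LoRA adapter."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear):
            setattr(module, name, LoRALinear(child, rank))
        else:
            add_lora(child, rank)


def split_into_emulator_and_adapter(layers: nn.ModuleList, num_adapter_layers: int = 2):
    """Freeze the lower layers (emulator) and LoRA-augment the upper layers (adapter)."""
    emulator = layers[:-num_adapter_layers]
    adapter = layers[-num_adapter_layers:]

    for layer in emulator:                        # emulator: frozen, aligned by the server
        for p in layer.parameters():
            p.requires_grad_(False)

    for layer in adapter:                         # adapter: only LoRA parameters are trainable
        add_lora(layer)
    return emulator, adapter


def communicated_state(adapter: nn.ModuleList):
    """Only the LoRA parameters are exchanged between clients and the server."""
    return {name: p.detach().cpu()
            for layer in adapter
            for name, p in layer.named_parameters()
            if "lora_" in name}
```

Under such a split, a client runs the emulator as a frozen feature extractor and back-propagates only through the adapter's LoRA parameters, which is what keeps the per-client compute and the client-server communication small.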