FedBiOT is a federated learning (FL) approach for fine-tuning large language models (LLMs) without giving clients access to the full model. It targets the limited computation and communication capacities of FL clients, particularly when domain-specific data is privately distributed across multiple owners. The server generates a compressed LLM and aligns its behavior with that of the full model; clients then fine-tune a lightweight adapter, a small trainable component of the compressed model. To mitigate the negative effects of data discrepancy between server and clients, the two sides solve a bi-level optimization problem: the server distills an emulator from the original LLM, while clients fine-tune the adapter on their local datasets. Extensive experiments on LLaMA-2 show that FedBiOT substantially reduces resource consumption while matching or exceeding existing benchmarks, with clear gains in accuracy and computational efficiency on code generation, math problem solving, and question answering.
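To make the described workflow concrete, the sketch below mocks one communication round under heavily simplified assumptions: toy linear modules stand in for the original LLM, the distilled emulator, and the adapter; the bi-level coupling is approximated by simply alternating the two levels; and plain FedAvg aggregates the client adapters. Names such as `Emulator`, `Adapter`, `server_distill`, and `client_finetune` are illustrative inventions, not identifiers from the FedBiOT codebase.

```python
# Minimal sketch of a FedBiOT-style round: server distills an emulator,
# clients fine-tune only a lightweight adapter, server averages adapters.
# All module shapes, losses, and hyperparameters here are hypothetical.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

class Emulator(nn.Module):
    """Stand-in for the compressed sub-model distilled by the server."""
    def __init__(self, dim=16):
        super().__init__()
        self.body = nn.Linear(dim, dim)

    def forward(self, x):
        return self.body(x)

class Adapter(nn.Module):
    """Lightweight module that clients fine-tune on local data."""
    def __init__(self, dim=16):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, x):
        return self.proj(x)

def server_distill(full_model, emulator, public_x, steps=50):
    """Outer level: align the emulator's outputs with the full model's."""
    opt = torch.optim.Adam(emulator.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(emulator(public_x), full_model(public_x).detach())
        loss.backward()
        opt.step()

def client_finetune(emulator, adapter, local_x, local_y, steps=50):
    """Inner level: train only the adapter; the emulator stays frozen."""
    opt = torch.optim.Adam(adapter.parameters(), lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        pred = adapter(emulator(local_x).detach())  # no gradient to emulator
        loss = F.mse_loss(pred, local_y)
        loss.backward()
        opt.step()
    return adapter.state_dict()

def fedavg(states):
    """Average the clients' adapter weights (standard FedAvg)."""
    avg = copy.deepcopy(states[0])
    for key in avg:
        avg[key] = torch.stack([s[key] for s in states]).mean(dim=0)
    return avg

# --- one communication round with toy data ---
dim, n_clients = 16, 3
full_model = nn.Linear(dim, dim)          # proxy for the original LLM
emulator, adapter = Emulator(dim), Adapter(dim)

server_distill(full_model, emulator, torch.randn(64, dim))
client_states = [
    client_finetune(emulator, copy.deepcopy(adapter),
                    torch.randn(32, dim), torch.randn(32, dim))
    for _ in range(n_clients)
]
adapter.load_state_dict(fedavg(client_states))
print("round complete; adapter updated from", n_clients, "clients")
```

Note the two places where the paper's key ideas appear even in this toy form: only the adapter's parameters ever leave a client, and the emulator is trained solely on the server, so clients never touch, or even see, the full model's weights.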