[1] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding[EB/OL]. 2019 [2025-08-19]. https://arxiv.org/pdf/1810.04805
[2] Radford A, Narasimhan K. Improving language understanding by generative pre-training[EB/OL]. [2025-08-11]. https://www.cs.princeton.edu/courses/archive/spring20/cos598C/lectures/lec4-pretraining.pdf
[3] Raffel C, Shazeer N, Roberts A, et al. Exploring the limits of transfer learning with a unified text-to-text transformer[J]. Journal of Machine Learning Research, 2020, 21(140): 1-67
[4] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need[C] //Proc of the 31st Conf on Neural Information Processing Systems (NIPS 2017). Long Beach, USA: ACM, 2017: 6000-6010
[5] Lewis M, Liu Y H, Goyal N, et al. BART: Denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension[EB/OL]. 2019 [2025-08-19]. https://xueshu.baidu.com/nd/scholar/browse/detail?paperid=1f3r0rd02t2408g0nt4a089023089519&site=xueshu_se
[6] Brown T B, Mann B, Ryder N, et al. Language models are few-shot learners[C] //Proc of the 34th Int Conf on Neural Information Processing Systems. Vancouver, Canada: ACM, 2020: 1877-1901
[7] OpenAI. GPT-4 technical report[EB/OL]. (2023-03-27). https://arxiv.org/abs/2303.08774
[8] Chowdhery A, Narang S, Devlin J, et al. PaLM: Scaling language modeling with pathways[J]. Journal of Machine Learning Research, 2023, 24(1): 11324-11436
[9] Touvron H, Lavril T, Izacard G, et al. LLaMA: Open and efficient foundation language models[EB/OL]. 2023 [2025-08-19]. https://arxiv.org/pdf/2302.13971v1
[10] Chiang W L, Li Z, Lin Z, et al. Vicuna: An open-source chatbot impressing GPT-4 with 90%* ChatGPT quality[EB/OL]. (2023-10-16). https://vicuna.lmsys.org
[11] Smith S, Patwary M, Norick B, et al. Using DeepSpeed and Megatron to train Megatron-Turing NLG 530B, a large-scale generative language model[EB/OL]. (2023-10-16). https://arxiv.org/pdf/2201.11990.pdf
[12] Bi K F, Xie L X, Zhang H H, et al. Accurate medium-range global weather forecasting with 3D neural networks[EB/OL]. (2023-07-05). https://www.nature.com/articles/s41586-023-06185-3
[13] Lewis P, Perez E, Piktus A, et al. Retrieval-augmented generation for knowledge-intensive NLP tasks[C] //Proc of the 34th Int Conf on Neural Information Processing Systems. New York: ACM, 2020: 9459-9474
[14] Izacard G, Caron M, Hosseini L, et al. Unsupervised dense information retrieval with contrastive learning[J/OL]. Transactions on Machine Learning Research, 2021 [2025-08-19]. https://arxiv.org/pdf/2112.09118
[15] Balaguer A, Benara V, Cunha R L F, et al. RAG vs fine-tuning: Pipelines, tradeoffs, and a case study on agriculture[EB/OL]. 2024 [2025-08-19]. https://arxiv.org/pdf/2401.08406
[16] Qu Y Q, Ding Y C, Liu J, et al. RocketQA: An optimized training approach to dense passage retrieval for open-domain question answering[C] //Proc of the 2021 Conf of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Stroudsburg, PA: ACL, 2021: 5835-5847
[17] Johnson J, Douze M, Jégou H. Billion-scale similarity search with GPUs[J]. IEEE Trans on Big Data, 2019, 7(3): 535-547
[18] Kingma D P, Ba J. Adam: A method for stochastic optimization[EB/OL]. 2017 [2025-08-19]. https://arxiv.org/pdf/1412.6980