开源语言模型百宝袋

Open-Source Language Model Pocket

目录 (Table of Contents):

Low-Rank LLaMA Instruct-Tuning

This repository contains code for reproducing the Stanford Alpaca results using low-rank adaptation (LoRA). We provide an Instruct model of similar quality to text-davinci-003 that can run on a Raspberry Pi (for research), and the code can be easily extended to the 13b, 30b, and 65b models.

In addition to the training code, which runs within five hours on a single RTX 4090, we publish a script for downloading and inference on the foundation model and LoRA, as well as the resulting LoRA weights themselves. To fine-tune cheaply and efficiently, we use Hugging Face's PEFT as well as Tim Dettmers' bitsandbytes.

Without hyperparameter tuning or validation-based checkpointing, the LoRA model produces outputs comparable to the Stanford Alpaca model. (Please see the outputs included below.) Further tuning might be able to achieve better performance; I invite interested users to give it a try and report their results.
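For orientation, the LoRA recipe described above is typically wired up with PEFT and bitsandbytes roughly as follows; the model ID and hyperparameters here are illustrative placeholders, not the repository's exact configuration.

```python
# Minimal LoRA setup sketch using Hugging Face PEFT with 8-bit loading via bitsandbytes.
# Model ID and hyperparameters are illustrative assumptions, not the repo's settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

base_model = "decapoda-research/llama-7b-hf"  # placeholder LLaMA checkpoint
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    load_in_8bit=True,           # bitsandbytes 8-bit weights to fit a single consumer GPU
    torch_dtype=torch.float16,
    device_map="auto",
)

model = prepare_model_for_int8_training(model)  # cast norms/head, enable gradient checkpointing
lora_config = LoraConfig(
    r=8, lora_alpha=16, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],        # attach adapters to the attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```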

An Instruction Fine-Tuning Platform with Instruction Data Collection and Unified Large Language Models Interface

The Alpaca-CoT project explores how instruction tuning can best elicit ChatGPT-like interaction and instruction-following abilities in LLMs. To this end, we collected a wide range of instruction data (especially Chain-of-Thought datasets) and conducted an in-depth empirical study based on LLaMA as a reference for future work. To the best of our knowledge, this is the first work to extend CoT into Alpaca, hence the name "Alpaca-CoT".

The goal of this project is to promote the development of the open-source community around Chinese conversational large models, with the vision of becoming an LLM Engine that can help everyone. At this stage, the project builds on several open-source pre-trained large language models (such as BLOOM), optimizes them for Chinese, and tunes the models using only data produced by ChatGPT (no other data is included).

Colossal-AI: Making large AI models cheaper, faster and more accessible

Colossal-AI provides a collection of parallel components for you. We aim to support you to write your distributed deep learning models just like how you write your model on your laptop. We provide user-friendly tools to kickstart distributed training and inference in a few lines.

Seven commercially usable GPT models released as open source, with datasets and directly downloadable pre-trained weights: Cerebras has open-sourced seven GPT models, all available for commercial use, with 111M, 256M, 590M, 1.3B, 2.7B, 6.7B, and 13B parameters. The largest, at 13 billion parameters, is comparable to Meta's recently released LLaMA-13B. The project releases both the datasets and the pre-trained weights; the weight files total nearly 50 GB and can be downloaded directly for commercial and research use. Compared with earlier GPT-3 models, the Cerebras models offer greater accessibility and transparency, and researchers and developers can fine-tune them with a small amount of data to build high-quality natural language processing applications.
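As a rough illustration, a checkpoint from this family can be loaded with the standard transformers API; the 1.3B model ID below is inferred from the family's naming scheme and is used only as an example.

```python
# Minimal sketch: loading one of the Cerebras-GPT checkpoints from the Hugging Face Hub.
# The specific model ID is an assumption based on the released family's naming.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cerebras/Cerebras-GPT-1.3B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```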

ChatRWKV is like ChatGPT but powered by my RWKV (100% RNN) language model, which is (as of now) the only RNN that can match transformers in quality and scaling while being faster and saving VRAM. Training is sponsored by Stability and EleutherAI :)

ChatLLaMA 🦙 has been designed to help developers with various use cases, all related to RLHF training and optimized inference.

ChatLLaMA is a library that allows you to create hyper-personalized ChatGPT-like assistants using your own data and the least amount of compute possible. Instead of depending on one large assistant that "rules us all", we envision a future where each of us can create our own personalized version of ChatGPT-like assistants. Imagine a future where many ChatLLaMAs at the "edge" will support a variety of human needs. But creating a personalized assistant at the "edge" requires huge optimization efforts on many fronts: dataset creation, efficient training with RLHF, and inference optimization.

We show that anyone can take a dated off-the-shelf open source large language model (LLM) and give it magical ChatGPT-like instruction following ability by training it in 30 minutes on one machine, using high-quality training data. Surprisingly, instruction-following does not seem to require the latest or largest models: our model is only 6 billion parameters, compared to 175 billion for GPT-3. We open source the code for our model (Dolly) and show how it can be re-created on Databricks. We believe models like Dolly will help democratize LLMs, transforming them from something very few companies can afford into a commodity every company can own and customize to improve their products.

FlexGen is a high-throughput generation engine for running large language models with limited GPU memory. FlexGen allows high-throughput generation by IO-efficient offloading, compression, and large effective batch sizes.

Limitations: as an offloading-based system running on weak GPUs, FlexGen can be significantly slower than setups with enough powerful GPUs to hold the whole model, especially in small-batch cases. It is mostly optimized for throughput-oriented batch-processing settings on single GPUs (e.g., classifying or extracting information from many documents in batches).

FlagAI (Fast LArge-scale General AI models) is a fast, easy-to-use, and extensible toolkit for large-scale models. Our goal is to support training, fine-tuning, and deployment of large-scale models on various downstream tasks with multi-modality.

FlagData is a data-processing toolkit that is easy to use and extend. It integrates tools and algorithms for multi-step data processing, including cleaning, condensation, annotation, and analysis, providing powerful data-processing support for model training and deployment across multiple fields, including natural language processing and computer vision.

LLaMA: Open and Efficient Foundation Language Models

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Demo, data, and code to train an assistant-style large language model with ~800k GPT-3.5-Turbo generations, based on LLaMA

HuggingGPT uses ChatGPT as a controller that connects the various AI models in the Hugging Face community to complete complex multimodal tasks.

This means that, through HuggingGPT, you effectively gain multimodal capabilities: text-to-image, text-to-video, and speech tasks can all be handled in one place.

Inference of LLaMA model in pure C/C++

The main goal is to run the model using 4-bit quantization on a MacBook
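The repository itself is C/C++ with a command-line interface; purely as a sketch, 4-bit GGML inference can also be driven from Python via the separate community llama-cpp-python bindings, with the model path below as a placeholder.

```python
# Sketch of 4-bit quantized inference through the community llama-cpp-python bindings
# (a separate project wrapping llama.cpp); the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./models/7B/ggml-model-q4_0.bin", n_ctx=512)
result = llm("Building a website can be done in 10 simple steps:", max_tokens=64)
print(result["choices"][0]["text"])
```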

This is the repo for the Llama-X, which aims to:

Lit-LLaMA is:

OpenChatKit uses a 20 billion parameter chat model trained on 43 million instructions and supports reasoning, multi-turn conversation, knowledge and generative answers.

Open Assistant is a project meant to give everyone access to a great chat-based large language model.

We believe that by doing this we will create a revolution in innovation in language. In the same way that Stable Diffusion helped the world make art and images in new ways, we hope Open Assistant can help improve the world by improving language itself.

Implementation of RLHF (Reinforcement Learning with Human Feedback) on top of the PaLM architecture. Maybe I'll add retrieval functionality too, à la RETRO

A modular RL library to fine-tune language models to human preferences

We provide easily customizable building blocks for training language models, including implementations of on-policy algorithms, reward functions, metrics, datasets, and LM-based actor-critic policies.

In this project, we use the open-source trl project to build several examples of updating a language model (GPT-2) with a reinforcement learning algorithm (PPO), including:

Alpaca: A Strong, Replicable Instruction-Following Model

We introduce Alpaca 7B, a model fine-tuned from the LLaMA 7B model on 52K instruction-following demonstrations. On our preliminary evaluation of single-turn instruction following, Alpaca behaves qualitatively similarly to OpenAI’s text-davinci-003, while being surprisingly small and easy/cheap to reproduce (under $600).

With trl you can train transformer language models with Proximal Policy Optimization (PPO). The library is built on top of the transformers library by 🤗 Hugging Face, so pre-trained language models can be loaded directly via transformers. At this point, most decoder and encoder-decoder architectures are supported.
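A hedged sketch of a single PPO update with trl on GPT-2 follows; the scalar reward is a stand-in (real setups score responses with a reward model or heuristic), and argument names may shift slightly across trl versions.

```python
# One PPO step with trl on GPT-2; the reward below is a placeholder constant.
import torch
from transformers import AutoTokenizer
from trl import PPOConfig, PPOTrainer, AutoModelForCausalLMWithValueHead
from trl.core import respond_to_batch

model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")
ref_model = AutoModelForCausalLMWithValueHead.from_pretrained("gpt2")  # frozen reference for the KL penalty
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

config = PPOConfig(batch_size=1, mini_batch_size=1)
ppo_trainer = PPOTrainer(config, model, ref_model, tokenizer)

query_tensor = tokenizer.encode("This movie was really", return_tensors="pt")
response_tensor = respond_to_batch(model, query_tensor)   # sample a continuation
reward = [torch.tensor(1.0)]                               # placeholder scalar reward
stats = ppo_trainer.step([query_tensor[0]], [response_tensor[0]], reward)
```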

trlX is a distributed training framework designed from the ground up to focus on fine-tuning large language models with reinforcement learning using either a provided reward function or a reward-labeled dataset.

Training support for 🤗 Hugging Face models is provided by Accelerate-backed trainers, allowing users to fine-tune causal and T5-based language models of up to 20B parameters, such as facebook/opt-6.7b, EleutherAI/gpt-neox-20b, and google/flan-t5-xxl. For models beyond 20B parameters, trlX provides NVIDIA NeMo-backed trainers that leverage efficient parallelism techniques to scale effectively.
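For illustration, the trlX entry point can be driven with a custom reward function; the prompts and the length-based reward heuristic below are toy assumptions, not a recommended setup.

```python
# Hedged sketch of trlx.train with a toy reward function and placeholder prompts.
import trlx

def reward_fn(samples, **kwargs):
    # Toy reward: prefer longer completions (real setups use a learned reward model).
    return [float(len(sample)) for sample in samples]

trainer = trlx.train(
    "gpt2",                                   # base model to fine-tune
    reward_fn=reward_fn,
    prompts=["Write a short story about", "Explain why the sky is"],
)
```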

An open platform for training, serving, and evaluating large language model based chatbots.

Alpaca is a model fine-tuned by the Stanford team from LLaMA 7B on 52k instructions, and it adapts well to a wide range of natural language applications. Recently, SenseTime and Huazhong University of Science and Technology open-sourced the Chinese language model Luotuo, built by translating the Alpaca instruction-tuning data with the ChatGPT API and fine-tuning with LoRA. The project has released its training corpus and model weights (in two sizes), so developers can train their own language models on corpora of various sizes and apply them to their own vertical domains.

Large language models (LLMs) such as ChatGPT and GPT-4 have set off a new wave of research in natural language processing, demonstrating capabilities approaching artificial general intelligence (AGI) and drawing wide attention from industry. However, because training and deploying large language models is extremely expensive, this poses an obstacle to transparent and open academic research.

To promote open research on large models in the Chinese NLP community, this project open-sources a Chinese LLaMA model and an instruction-tuned Alpaca model. Building on the original LLaMA, these models extend the Chinese vocabulary and undergo secondary pre-training on Chinese data, further improving basic Chinese semantic understanding. On top of the Chinese LLaMA, the project then performs instruction fine-tuning with Chinese instruction data, significantly improving the model's ability to understand and follow instructions.

This project is based on Stanford Alpaca, whose goal is to build and open-source a LLaMA-based model. The Stanford Alpaca seed tasks and the collected data are all in English, so the resulting model is not optimized for Chinese.

The goal of this project is to promote the development of the open-source community around Chinese conversational large models. The project is optimized for Chinese, and the models are tuned using only data produced by ChatGPT (no other data is included).

BLOOM is an autoregressive Large Language Model (LLM), trained to continue text from a prompt on vast amounts of text data using industrial-scale computational resources. As such, it is able to output coherent text in 46 languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for, by casting them as text generation tasks.
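As a quick illustration of prompt continuation, a small BLOOM checkpoint can be run through the transformers pipeline API; the 560M variant is used here only to keep the example lightweight.

```python
# Minimal generation sketch with a small BLOOM checkpoint via transformers.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")
print(generator("Le langage est", max_new_tokens=30)[0]["generated_text"])
```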

Yuanyu (元语) is a functional conversational large model that can be used for question answering, context-aware dialogue, and a variety of generation tasks, including creative writing; it can also answer questions in domains such as law and COVID-19. It was obtained by further training PromptCLUE-large on hundreds of millions of functional multi-turn dialogue examples.

PromptCLUE-large was pre-trained on a 100-billion-token Chinese corpus, learning a cumulative 1.5 trillion Chinese tokens, and was then trained in a prompt-based format on hundreds of tasks. For understanding tasks such as classification, sentiment analysis, and extraction, the label scheme can be customized; for a variety of generation tasks, sampling-based free-form generation is supported.

ChatGLM-6B is an open-source bilingual (Chinese-English) conversational language model based on the General Language Model (GLM) architecture with 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (as little as 6 GB of VRAM at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT and is optimized for Chinese question answering and dialogue. After bilingual Chinese-English training on roughly 1T tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align well with human preferences. See our blog for more information.
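The local INT4 deployment described above typically looks like the following, following the pattern on the model card; the exact method names come from the model's custom code and may differ across releases.

```python
# Sketch of local ChatGLM-6B inference with INT4 quantization (per the model card pattern).
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True)
model = model.half().quantize(4).cuda().eval()   # INT4 quantization to fit roughly 6 GB of VRAM

response, history = model.chat(tokenizer, "你好", history=[])
print(response)
```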

This project provides the pre-training and text-generation code for Chinese-Transformer-XL, the pre-trained model behind BAAI's "WenHui" (文汇).

EVA is currently the largest open-source Chinese pre-trained dialogue model, with 2.8 billion parameters, and it specializes in open-domain chit-chat. There are two versions, 1.0 and 2.0: version 1.0 was trained on WudaoCorpus-Dialog, while version 2.0 was trained on higher-quality dialogue data cleaned from WudaoCorpus-Dialog and clearly outperforms EVA 1.0.

Simplified and reorganized GPT-2 training code (based on Grover, supporting TPUs)

Ported the BERT tokenizer and added multilingual support

1.5B-parameter Chinese GPT-2 pre-trained model (15 GB corpus, 100k training steps)

Out-of-the-box demo of the model's generation quality

1.5B-parameter Chinese GPT-2 pre-trained model (30 GB corpus, 220k training steps)

PromptCLUE: a large-scale, multi-task, prompt-based pre-trained open-source Chinese model.

Three unifications for Chinese: a unified model framework, a unified task format, and a unified way of application.

Supports dozens of different task types, with strong zero-shot and few-shot learning ability. For understanding tasks such as classification, sentiment analysis, and extraction, the label scheme can be customized; for generation tasks, sampling-based free-form generation is supported.

Large-scale pre-training on a corpus of 100 billion Chinese tokens, learning a cumulative 1.5 trillion Chinese tokens, followed by training on hundreds of millions of Chinese task examples covering more than 150 tasks. Compared with the base version, average task performance improves by more than 7 points; the model has better understanding, generation, and extraction abilities and additionally supports text rewriting, error correction, and knowledge-graph question answering.

SkyText is a Chinese GPT-3 pre-trained large model released by Singularity AI (奇点智源) that can perform tasks such as chatting, question answering, and Chinese-English translation. Beyond basic chatting, dialogue, and Q&A, the model also supports Chinese-English translation, content continuation, couplet matching, classical poetry writing, recipe generation, third-person paraphrasing, interview-question generation, and more.

持续更新中 (Continuously Updated)...
