
# HuggingFace Integration

After you deploy your inference endpoint on HuggingFace, you should see the following user interface:

<figure><img src="/files/GpIUSaHlgSRjdTGvrmyD" alt="HuggingFace deployment interface"><figcaption></figcaption></figure>

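On this page, you need the following information:

* Endpoint URL
* Model repository
* API token. To view it, check the "Add API token" box in the call examples code block.

You also need the model's context window, which you can find on the model's information page.

Once you have gathered this information, format it as JSON, as in the following example: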

```json
{
    "api_key": "your_api_key",
    "endpoint": "your_api_endpoint",
    "model_name": "meta-llama/Llama-2-7b-chat-hf",
    "context_window": 4096
}
```
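
As a minimal sketch, the credentials JSON above can also be assembled and sanity-checked in Python before you paste it. The helper name `build_credentials` and the checks it performs (string fields, an `https://` endpoint, a positive integer context window) are assumptions made for illustration; they are not the validation that the integration itself runs.

```python
import json

# Field names come from the example above; the expected types are an assumption.
REQUIRED_FIELDS = {
    "api_key": str,
    "endpoint": str,
    "model_name": str,
    "context_window": int,
}


def build_credentials(api_key, endpoint, model_name, context_window):
    """Return the credentials as a JSON string after basic sanity checks."""
    creds = {
        "api_key": api_key,
        "endpoint": endpoint,
        "model_name": model_name,
        "context_window": context_window,
    }
    for field, expected_type in REQUIRED_FIELDS.items():
        value = creds[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise TypeError(f"{field} must be of type {expected_type.__name__}")
    # Assumption: the endpoint URL is served over HTTPS.
    if not endpoint.startswith("https://"):
        raise ValueError("endpoint should be an https:// URL")
    if context_window <= 0:
        raise ValueError("context_window must be a positive integer")
    return json.dumps(creds, indent=4)


if __name__ == "__main__":
    print(build_credentials(
        "your_api_key",
        "https://your-endpoint.example",  # placeholder, not a real endpoint
        "meta-llama/Llama-2-7b-chat-hf",
        4096,
    ))
```

Catching a quoted number such as `"4096"` at this stage is cheaper than discovering it when the credentials fail to validate.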

Next, paste it into the credentials field of your integration.

<figure><img src="/files/D3hOFlCmGYChr1A9Jxof" alt="Credentials field"><figcaption></figcaption></figure>

Once the credentials are validated successfully, your HuggingFace model should appear in GenStudio's model list:

<figure><img src="/files/F054pd2UQdPEtFx3tvfF" alt="HuggingFace model listed as a GenStudio model"><figcaption></figcaption></figure>

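### Scaling HuggingFace endpoints to zero

Scale-to-zero is a dynamic feature of Inference Endpoints designed to optimize resource utilization and cost. It monitors request patterns and reduces the number of replicas to zero during idle periods, so that you use resources only when necessary.

However, this introduces a cold-start period when traffic resumes, and there are several considerations to keep in mind. For a deeper look at how this feature works, its benefits, and its potential challenges, see [HuggingFace's autoscaling guide](https://huggingface.co/docs/inference-endpoints/autoscaling).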
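
If you call a scaled-to-zero endpoint from your own client code, one common way to absorb the cold start is to retry with exponential backoff. The sketch below is illustrative only: `call_with_retry` is a hypothetical helper, and the set of status codes treated as "still starting" (`502`, `503`) is an assumption that you should check against the HuggingFace autoscaling guide.

```python
import time


def call_with_retry(send, retryable=(502, 503), attempts=5,
                    base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send is any zero-argument callable returning (status_code, body).
    The retryable status codes are an assumption, not a documented contract.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    delay = base_delay
    for attempt in range(1, attempts + 1):
        status, body = send()
        if status not in retryable:
            return status, body
        if attempt < attempts:
            sleep(delay)   # wait before the next try
            delay *= 2     # double the wait each time
    # Every attempt hit a retryable status; return the last response.
    return status, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. In practice, `send` would wrap whatever HTTP client you use to reach the endpoint.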

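### Supported models

Currently, we only support endpoints for models that carry the `text-generation` tag and are deployed as a `text-generation-inference` container. We are working to expand the list of supported models.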

<figure><img src="/files/avhr2PisCq2KAC8k762m" alt="image (48)"><figcaption><p>LLaMA 2 is a model with the text-generation tag</p></figcaption></figure>

<figure><img src="/files/kQ5dp01uhRz8i0hvIvmz" alt="image (49)"><figcaption><p>Make sure to select Text Generation Inference as the container type during deployment</p></figcaption></figure>
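
To see what an endpoint of this kind expects, the sketch below builds (but does not send) a request in the shape that a `text-generation-inference` container accepts: a JSON body with `inputs` and optional `parameters`, authorized with a bearer token. The helper name `build_generate_request` and the default `max_new_tokens` value are assumptions for illustration; how GenStudio itself calls the endpoint is not described here.

```python
import json
import urllib.request


def build_generate_request(endpoint, api_key, prompt, max_new_tokens=64):
    """Build a POST request for a text-generation endpoint without sending it."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; replace them with your endpoint URL and API token.
    req = build_generate_request(
        "https://your-endpoint.example", "your_api_key", "Hello"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To send the request, pass it to `urllib.request.urlopen(req)`. A quick manual call like this is one way to confirm that the endpoint URL and API token work before you enter them as credentials.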

