SkyDeck.ai Docs
Sign UpAdmin Sign InContact Us
English
English
  • SkyDeck.ai
  • GenStudio Workspace
    • Conversations
    • SkyDeck AI Helper App
    • Document Upload
    • Sharing and Collaboration
    • Slack Synchronization
    • Public Snapshots
    • Web Browsing
    • Tools
      • Pair Programmer
        • How to Use
        • Example – Python Script Assistance
      • SQL Assistant
        • How to Use
        • Example – Query Debugging
      • Legal Agreement Review
        • How to Use
        • Example – NDA Clause
      • Teach Me Anything
        • How to Use
        • Example – Intro to Programming
      • Strategy Consultant
        • How to Use
        • Example – Employee Retention
      • Image Generator
        • How to Use
        • Example – Winter Wonderland
    • Data Security
      • Data Loss Prevention
  • Control Center
    • Admin & Owner Tools
    • Setup Guide
      • Set Up Account
      • Set Up Integrations
        • Integration Assistance
      • Set Up Security
        • Authentication (SSO)
      • Organize Teams
        • Add New Group
        • Remove Groups
      • Curate Tools
        • System Tools
        • Assign Tags
      • Manage Members
        • Add Members
        • Import File
        • Invite Members
        • Edit Members
    • Billing
      • Free Trial
      • Buy Credit
      • Plans and Upgrades
      • Model Usage Prices
  • Integrations
    • LLMs and Databases
      • Anthropic Integration
      • Database Integration
      • Groq Integration
      • HuggingFace Integration
      • Mistral Integration
      • OpenAI Integration
      • Perplexity Integration
      • Together AI Integration
      • Vertex AI Integration
    • App Integrations
      • Rememberizer Integration
      • Slack Integration
  • Developers
    • Develop Your Own Tools
      • JSON format for Tools
      • JSON Format for LLM Tools
      • Example: Text-based UI Generator
      • JSON Format for Smart Tools
  • Use Cases
    • Creating a Privacy Policy
  • Notices
    • Terms of Use
    • Privacy Policy
    • Cookie Notice
  • Releases
    • May 9th, 2025
    • May 2nd, 2025
    • Apr 25th, 2025
    • Apr 18th, 2025
    • Apr 11th, 2025
    • Apr 4th, 2025
    • Mar 28th, 2025
    • Mar 21st, 2025
    • Mar 14th, 2025
    • Mar 7th, 2025
    • Feb 28th, 2025
    • Feb 21st, 2025
    • Feb 14th, 2025
    • Feb 7th, 2025
    • Jan 31st, 2025
    • Jan 24th, 2025
    • Jan 17th, 2025
    • Jan 10th, 2025
    • Jan 3rd, 2025
    • Dec 27th, 2024
    • Dec 20th, 2024
    • Dec 13th, 2024
    • Dec 6th, 2024
    • Nov 29th, 2024
    • Nov 22nd, 2024
    • Nov 15th, 2024
    • Nov 8th, 2024
    • Nov 1st, 2024
    • Oct 25th, 2024
    • Oct 18th, 2024
    • Oct 11th, 2024
    • Oct 4th, 2024
    • Sep 27th, 2024
    • Sep 20th, 2024
    • Sep 13th, 2024
    • Sep 6th, 2024
    • Aug 23rd, 2024
    • Aug 16th, 2024
    • Aug 9th, 2024
    • Aug 2nd, 2024
    • Jul 26th, 2024
    • Jul 12th, 2024
    • Jul 5th, 2024
    • Jun 28th, 2024
    • Jun 21st, 2024
    • Nov 12th 2023
    • Nov 6th 2023
    • Oct 30th 2023
    • Oct 23th 2023
    • Oct 16th 2023
    • Sep 18th 2023
    • Sep 8th 2023
  • Security
    • SkyDeck.ai Security Practices
    • Bug Bounty Program
  • AI Documentation
    • LLM Evaluation Report
    • SkyDeck.ai LLM Ready Documentation
Powered by GitBook
On this page
  • Scaling HuggingFace Endpoints to Zero
  • Supported models
  1. Integrations
  2. LLMs and Databases

HuggingFace Integration

Using SkyDeck.ai as the front end for your HuggingFace models.

Last updated 16 days ago

After deploying your inference endpoint on HuggingFace, you should see the following user interface:

On this page, you will need the following information:

  • Endpoint URL

  • Model Repository

  • API token. You can view this by checking the "Add API token" box in the Call Examples code block.

In addition to these, you will also need the context window of your model. This can be found on the model's information page.

After collecting this information, format it into JSON as shown in the example below:

{
    "api_key":"your_api_key",
    "endpoint": "your_api_endpoint",
    "model_name": "meta-llama/Llama-2-7b-chat-hf",
    "context_window": 4096
}

Next, paste this into the Credential field of your integration.

Once the credential is successfully validated, you should see your HuggingFace model listed in GenStudio's model list:

Scaling HuggingFace Endpoints to Zero

Scaling to 0 is a dynamic feature offered by Inference Endpoints, designed to optimize resource utilization and costs. By intelligently monitoring request patterns and reducing the number of replicas to none during idle times, it ensures that you only use resources when necessary.

Supported models

At the moment, we only support endpoints for models with a text-generation tag that are deployed as text-generation-inference containers. We are working to expand our list of supported models.

However, this does introduce a cold start period when traffic resumes, and there are a few considerations to be mindful of. For an in-depth look at how this feature functions, its benefits, and potential challenges, please refer to .

HuggingFace's guide on Autoscaling
LLaMA 2 is a model with Text Generation tag
Make sure you select Text Generation Inference as the container type during deployment
huggingface deploy interface
credential field
huggingface model as genstudio model
image (48)
image (49)