Token Utilities

Helper functions for managing tokens with the HF-Inferoxy proxy server.

Overview

This module provides essential functions for:

  • Getting managed tokens from the proxy server (requires authentication)
  • Reporting token usage status (success/error)
  • Automatic error classification and handling
  • Environment variable management

⚠️ Important: Authentication Required

All client operations now require authentication with the HF-Inferoxy server. This is part of the Role-Based Access Control (RBAC) system that provides secure access to the proxy services.

Getting Your API Key

  1. Default Admin User: The system creates a default admin user on first run. Check your server logs or the users.json file for the default admin credentials.

  2. Create a User Account: Use the admin account to create a regular user account:
    curl -X POST "http://localhost:8000/admin/users" \
      -H "Authorization: Bearer ADMIN_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{"username": "youruser", "email": "user@example.com", "full_name": "Your Name", "role": "user"}'
    
  3. Use the Generated API Key: The response will include an API key that you’ll use in all client operations.

For detailed RBAC setup and user management, see RBAC_README.md.
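The curl call in step 2 can also be issued from Python. The sketch below is only an illustration: the helper names build_create_user_request and create_user are hypothetical, and it uses the standard library so it runs without extra dependencies. The name of the response field that carries the new API key is not stated here, so the response is returned unparsed and assumed to be a JSON object.

```python
import json
import urllib.request


def build_create_user_request(
    admin_api_key: str,
    username: str,
    email: str,
    full_name: str,
    role: str = "user",
    proxy_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build (but do not send) the POST /admin/users request from the curl example."""
    body = {"username": username, "email": email, "full_name": full_name, "role": role}
    return urllib.request.Request(
        f"{proxy_url}/admin/users",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {admin_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_user(request: urllib.request.Request) -> dict:
    """Send the request and return the decoded JSON response as-is."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

Calling create_user(build_create_user_request("ADMIN_API_KEY", "youruser", "user@example.com", "Your Name")) against a running server should return the same payload as the curl command; inspect it to find the generated API key.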

Code Example

import os
from typing import Optional, Tuple

import requests

def get_proxy_token(proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None) -> Tuple[str, str]:
    """
    Get a valid token from the proxy server.
    
    Args:
        proxy_url: URL of the HF-Inferoxy server
        api_key: Your API key for authenticating with the proxy server (REQUIRED)
        
    Returns:
        Tuple of (token, token_id)
        
    Raises:
        Exception: If token provisioning fails
    """
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    
    # A timeout keeps the client from hanging forever if the proxy is unreachable
    response = requests.get(f"{proxy_url}/keys/provision", headers=headers, timeout=10)
    if response.status_code != 200:
        raise Exception(f"Failed to provision token: {response.text}")
    
    data = response.json()
    token = data["token"]
    token_id = data["token_id"]
    
    # For convenience, also set environment variable
    os.environ["HF_TOKEN"] = token
    
    return token, token_id

def report_token_status(
    token_id: str, 
    status: str = "success", 
    error: Optional[str] = None,
    proxy_url: str = "http://localhost:8000",
    api_key: Optional[str] = None,
    client_name: Optional[str] = None,
) -> bool:
    """
    Report token usage status back to the proxy server.
    
    Args:
        token_id: ID of the token to report (from get_proxy_token)
        status: Status to report ('success' or 'error')
        error: Error message if status is 'error'
        proxy_url: URL of the HF-Inferoxy server
        api_key: Your API key for authenticating with the proxy server (REQUIRED)
        client_name: Optional end-user identifier for attribution; defaults to username on server
        
    Returns:
        True if report was accepted, False otherwise
    """
    payload = {"token_id": token_id, "status": status}
    
    if error:
        payload["error"] = error
        
        # Extract error classification based on actual HF error patterns
        error_type = None
        if "401 Client Error" in error:
            error_type = "invalid_credentials"
        elif "402 Client Error" in error and "exceeded your monthly included credits" in error:
            error_type = "credits_exceeded"
            
        if error_type:
            payload["error_type"] = error_type
    
    if client_name:
        payload["client_name"] = client_name

    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
        
    try:
        response = requests.post(
            f"{proxy_url}/keys/report", json=payload, headers=headers, timeout=10
        )
        return response.status_code == 200
    except requests.RequestException:
        # Fail silently so a reporting problem never breaks the client application.
        # In production, consider logging this error.
        return False

Functions

get_proxy_token(proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None) -> Tuple[str, str]

Retrieves a valid token from the HF-Inferoxy proxy server.

Parameters:

  • proxy_url: URL of the HF-Inferoxy server (defaults to localhost:8000)
  • api_key: REQUIRED - Your API key for authenticating with the proxy server

Returns:

  • token: The actual API token to use with HuggingFace providers
  • token_id: Unique identifier for tracking this token’s usage

Features:

  • Automatically sets the HF_TOKEN environment variable for convenience
  • Raises an exception if the server responds with a non-200 status
  • Returns both token and tracking ID
  • Requires authentication with your proxy API key

report_token_status(token_id: str, status: str = "success", error: Optional[str] = None, proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None, client_name: Optional[str] = None) -> bool

Reports the status of a token back to the proxy server for health monitoring.

Parameters:

  • token_id: ID returned from get_proxy_token()
  • status: Either “success” or “error”
  • error: Error message if status is “error”
  • proxy_url: URL of the HF-Inferoxy server
  • api_key: REQUIRED - Your API key for authenticating with the proxy server
  • client_name: Optional end-user identifier for attribution; the server defaults to your username

Returns:

  • True if report was accepted, False otherwise

Features:

  • Automatic error classification (invalid credentials, credits exceeded)
  • Fails silently (returns False) when the report request errors out, to avoid breaking client applications
  • Supports custom proxy server URLs
  • Requires authentication with your proxy API key
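Because failures are silent, a rejected or undeliverable report is easy to miss. One possible way to surface it is a thin logging wrapper, sketched below. The name report_and_log and the logger name "hf_token_utils" are assumptions made for illustration; the report argument is expected to be report_token_status or any callable with the same leading parameters.

```python
import logging
from typing import Callable

# Hypothetical logger name; use whatever fits your application.
logger = logging.getLogger("hf_token_utils")


def report_and_log(
    report: Callable[..., bool],
    token_id: str,
    status: str = "success",
    **kwargs,
) -> bool:
    """Call the report function and log a warning when it returns False."""
    accepted = report(token_id, status, **kwargs)
    if not accepted:
        logger.warning(
            "Status report for token_id=%s (%s) was not accepted", token_id, status
        )
    return accepted
```

For example, report_and_log(report_token_status, token_id, "error", error=str(e), api_key=proxy_api_key) behaves like the direct call but leaves a trace in the logs when the proxy did not accept the report.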

Usage Examples

Basic Token Usage

from hf_token_utils import get_proxy_token, report_token_status

# You need to get your API key from the admin or create a user account
# See RBAC_README.md for details on user management
proxy_api_key = "your_proxy_api_key_here"  # Get this from admin

# Get a token (requires authentication)
token, token_id = get_proxy_token(api_key=proxy_api_key)

try:
    # Use the token with your HuggingFace client
    # ... your code here ...
    
    # Report success
    report_token_status(token_id, "success", api_key=proxy_api_key)
    
except Exception as e:
    # Report error
    report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
    raise
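The try/except pattern above can be wrapped in a context manager so each call site does not have to repeat it. This is a minimal sketch, not part of the module: managed_token is a hypothetical helper, and the two callables are injected so the block stays self-contained.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Tuple


@contextmanager
def managed_token(
    provision: Callable[[], Tuple[str, str]],
    report: Callable[..., bool],
) -> Iterator[str]:
    """Provision a token, yield it, then report success or error exactly once."""
    token, token_id = provision()
    try:
        yield token
    except Exception as exc:
        # Report the failure, then let the original exception propagate
        report(token_id, "error", str(exc))
        raise
    else:
        report(token_id, "success")
```

With the real functions, the callables can be bound using functools.partial:

    from functools import partial
    provision = partial(get_proxy_token, api_key=proxy_api_key)
    report = partial(report_token_status, api_key=proxy_api_key)
    with managed_token(provision, report) as token:
        ...  # use the token with your HuggingFace client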

Custom Proxy Server

# Use a different proxy server
token, token_id = get_proxy_token("https://my-proxy.example.com", api_key=proxy_api_key)

# Report to the same server
report_token_status(token_id, "success", proxy_url="https://my-proxy.example.com", api_key=proxy_api_key)
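To avoid repeating the proxy URL and API key in every call, both can be read from the environment once. The sketch below assumes two environment variable names, HF_INFEROXY_URL and HF_INFEROXY_API_KEY, which are invented for this example and are not defined by HF-Inferoxy; load_proxy_settings is likewise a hypothetical helper.

```python
import os
from typing import Dict, Mapping


def load_proxy_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect proxy_url and api_key from environment variables (names are assumptions)."""
    api_key = env.get("HF_INFEROXY_API_KEY")
    if not api_key:
        raise RuntimeError("HF_INFEROXY_API_KEY is not set")
    return {
        # Fall back to the same default the utility functions use
        "proxy_url": env.get("HF_INFEROXY_URL", "http://localhost:8000"),
        "api_key": api_key,
    }
```

The returned dictionary uses the keyword names both functions accept, so it can be unpacked directly: get_proxy_token(**settings) and report_token_status(token_id, "success", **settings).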

Error Handling

try:
    # Your HuggingFace API call
    result = client.chat.completions.create(...)
    report_token_status(token_id, "success", api_key=proxy_api_key)
    
except Exception as e:
    error_msg = str(e)
    
    # The function automatically classifies common errors
    report_token_status(token_id, "error", error_msg, api_key=proxy_api_key)
    
    # You can also check the error type manually
    if "401 Client Error" in error_msg:
        print("Authentication failed - token may be invalid")
    elif "402 Client Error" in error_msg:
        print("Credits exceeded - token has used up its monthly included credits")
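A common follow-up to a 401 or 402 is to report the failure and retry with a freshly provisioned token. The sketch below shows one way to do that, assuming the proxy hands out a different token once the failing one has been reported, which this page does not confirm. call_with_rotation and its parameters are hypothetical names, and the callables are injected to keep the block self-contained.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")

# Substrings that mark a token-level failure worth retrying with another token
TOKEN_ERRORS = ("401 Client Error", "402 Client Error")


def call_with_rotation(
    provision: Callable[[], Tuple[str, str]],
    report: Callable[..., bool],
    call: Callable[[str], T],
    max_attempts: int = 3,
) -> T:
    """Run call(token), reporting each outcome and retrying on token-level errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        token, token_id = provision()
        try:
            result = call(token)
        except Exception as exc:
            report(token_id, "error", str(exc))
            if not any(marker in str(exc) for marker in TOKEN_ERRORS):
                raise  # unrelated failure: do not burn through more tokens
            last_error = exc
            continue
        report(token_id, "success")
        return result
    assert last_error is not None
    raise last_error
```

Errors that are not token related are re-raised immediately, so only 401 and 402 failures consume additional attempts.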

Installation

  1. Save the code above as hf_token_utils.py in your project directory
  2. Install required dependencies: uv add requests
  3. Import and use the functions in your code
  4. Get your API key from the HF-Inferoxy admin or create a user account

Error Classification

The utility automatically classifies common HuggingFace errors:

  • invalid_credentials: the error message contains "401 Client Error", indicating token authentication failure
  • credits_exceeded: the error message contains both "402 Client Error" and "exceeded your monthly included credits"

This helps the proxy server make intelligent decisions about token health and rotation.
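The classification is plain substring matching on the error message. Pulled out of report_token_status into a standalone function, the same rules look like this; classify_error is a hypothetical name used only for this sketch, and the sample messages in the usage note are made up.

```python
from typing import Optional


def classify_error(error: str) -> Optional[str]:
    """Return the error_type the utility would attach, or None if unclassified."""
    if "401 Client Error" in error:
        return "invalid_credentials"
    if "402 Client Error" in error and "exceeded your monthly included credits" in error:
        return "credits_exceeded"
    return None
```

A message such as "401 Client Error: Unauthorized" maps to invalid_credentials, while a 402 message that lacks the "exceeded your monthly included credits" phrase is left unclassified and is reported without an error_type.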