Token Utilities

Helper functions for managing tokens with the HF-Inferoxy proxy server.

Overview

This module provides essential functions for:

Getting managed tokens from the proxy server (requires authentication)
Reporting token usage status (success/error)
Automatic error classification and handling
Environment variable management

⚠️ Important: Authentication Required

All client operations now require authentication with the HF-Inferoxy server. This is part of the Role-Based Access Control (RBAC) system that provides secure access to the proxy services.

Getting Your API Key

Default Admin User: The system creates a default admin user on first run. Check your server logs or the users.json file for the default admin credentials.

Create a User Account: Use the admin account to create a regular user account:

curl -X POST "http://localhost:8000/admin/users" \
  -H "Authorization: Bearer ADMIN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"username": "youruser", "email": "user@example.com", "full_name": "Your Name", "role": "user"}'

Use the Generated API Key: The response will include an API key that you’ll use in all client operations.

For detailed RBAC setup and user management, see RBAC_README.md.
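
Once you have an API key, it is usually safer to read it from the environment than to hard-code it. The following is a minimal sketch of that idea; the variable name `HF_INFEROXY_API_KEY` and the helper `load_proxy_api_key` are assumptions made for illustration and are not defined by HF-Inferoxy.

```python
import os


def load_proxy_api_key(env_var: str = "HF_INFEROXY_API_KEY") -> str:
    """Return the proxy API key stored in the given environment variable.

    The default variable name is hypothetical; use whatever suits your deployment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} to your HF-Inferoxy API key")
    return key
```

The returned value can be passed as the `api_key` argument of the helper functions in this module.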

Code Example

import os
from typing import Optional, Tuple

import requests

def get_proxy_token(proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None) -> Tuple[str, str]:
    """
    Get a valid token from the proxy server.
    
    Args:
        proxy_url: URL of the HF-Inferoxy server
        api_key: Your API key for authenticating with the proxy server (REQUIRED)
        
    Returns:
        Tuple of (token, token_id)
        
    Raises:
        Exception: If token provisioning fails
    """
    headers = {}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    
    # A timeout keeps the call from hanging indefinitely if the proxy is unreachable
    response = requests.get(f"{proxy_url}/keys/provision", headers=headers, timeout=10)
    if response.status_code != 200:
        raise Exception(f"Failed to provision token: {response.text}")
    
    data = response.json()
    token = data["token"]
    token_id = data["token_id"]
    
    # For convenience, also set environment variable
    os.environ["HF_TOKEN"] = token
    
    return token, token_id

def report_token_status(
    token_id: str, 
    status: str = "success", 
    error: Optional[str] = None,
    proxy_url: str = "http://localhost:8000",
    api_key: Optional[str] = None,
    client_name: Optional[str] = None,
) -> bool:
    """
    Report token usage status back to the proxy server.
    
    Args:
        token_id: ID of the token to report (from get_proxy_token)
        status: Status to report ('success' or 'error')
        error: Error message if status is 'error'
        proxy_url: URL of the HF-Inferoxy server
        api_key: Your API key for authenticating with the proxy server (REQUIRED)
        client_name: Optional end-user identifier for attribution; defaults to username on server
        
    Returns:
        True if report was accepted, False otherwise
    """
    payload = {"token_id": token_id, "status": status}
    
    if error:
        payload["error"] = error
        
        # Extract error classification based on actual HF error patterns
        error_type = None
        if "401 Client Error" in error:
            error_type = "invalid_credentials"
        elif "402 Client Error" in error and "exceeded your monthly included credits" in error:
            error_type = "credits_exceeded"
            
        if error_type:
            payload["error_type"] = error_type
    
    if client_name:
        payload["client_name"] = client_name

    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
        
    try:
        # A timeout keeps the report from blocking the client if the proxy is unreachable
        response = requests.post(f"{proxy_url}/keys/report", json=payload, headers=headers, timeout=10)
        return response.status_code == 200
    except Exception:
        # Fail silently to avoid breaking the client application.
        # In production, consider logging this error.
        return False

Functions

`get_proxy_token(proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None) -> Tuple[str, str]`

Retrieves a valid token from the HF-Inferoxy proxy server.

Parameters:

proxy_url: URL of the HF-Inferoxy server (defaults to localhost:8000)
api_key: REQUIRED - Your API key for authenticating with the proxy server

Returns:

token: The actual API token to use with HuggingFace providers
token_id: Unique identifier for tracking this token’s usage

Features:

Automatically sets HF_TOKEN environment variable for convenience
Raises an exception if the server responds with a non-200 status
Returns both token and tracking ID
Requires authentication with your proxy API key
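
The function reads the `token` and `token_id` fields from the JSON response. If you want a clearer failure when the server returns an unexpected body, the parsing step could be isolated roughly as follows. This is a sketch, not part of the module: `parse_provision_response` is a hypothetical helper, and only the two field names come from the code in this document.

```python
from typing import Any, Mapping, Tuple


def parse_provision_response(data: Mapping[str, Any]) -> Tuple[str, str]:
    """Extract (token, token_id) from a decoded /keys/provision response."""
    # Treat absent or empty fields as missing so the caller gets one clear error.
    missing = [field for field in ("token", "token_id") if not data.get(field)]
    if missing:
        raise ValueError(f"Provision response is missing: {', '.join(missing)}")
    return str(data["token"]), str(data["token_id"])
```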

`report_token_status(token_id: str, status: str = "success", error: Optional[str] = None, proxy_url: str = "http://localhost:8000", api_key: Optional[str] = None, client_name: Optional[str] = None) -> bool`

Reports the status of a token back to the proxy server for health monitoring.

Parameters:

token_id: ID returned from get_proxy_token()
status: Either “success” or “error”
error: Error message if status is “error”
proxy_url: URL of the HF-Inferoxy server
api_key: REQUIRED - Your API key for authenticating with the proxy server
client_name: Optional end-user identifier for attribution; defaults to the username on the server

Returns:

True if report was accepted, False otherwise

Features:

Automatic error classification (invalid credentials, credits exceeded)
Silent failure to avoid breaking client applications
Supports custom proxy server URLs
Requires authentication with your proxy API key

Usage Examples

Basic Token Usage

from hf_token_utils import get_proxy_token, report_token_status

# You need to get your API key from the admin or create a user account
# See RBAC_README.md for details on user management
proxy_api_key = "your_proxy_api_key_here"  # Get this from admin

# Get a token (requires authentication)
token, token_id = get_proxy_token(api_key=proxy_api_key)

try:
    # Use the token with your HuggingFace client
    # ... your code here ...
    
    # Report success
    report_token_status(token_id, "success", api_key=proxy_api_key)
    
except Exception as e:
    # Report error
    report_token_status(token_id, "error", str(e), api_key=proxy_api_key)
    raise
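
The try/except pattern above can be wrapped in a context manager so the report is never forgotten. The sketch below is one possible shape, under the assumption that you pass in the provisioning and reporting callables yourself; `managed_token` is a hypothetical name and not part of this module.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, Optional, Tuple

Provision = Callable[[], Tuple[str, str]]
Report = Callable[[str, str, Optional[str]], object]


@contextmanager
def managed_token(provision: Provision, report: Report) -> Iterator[str]:
    """Yield a token, then report success or error when the block exits."""
    token, token_id = provision()
    try:
        yield token
    except Exception as exc:
        # Report the failure, then let the original exception propagate.
        report(token_id, "error", str(exc))
        raise
    else:
        report(token_id, "success", None)


# Possible wiring with the functions from this module:
# with managed_token(
#     lambda: get_proxy_token(api_key=proxy_api_key),
#     lambda tid, status, err: report_token_status(tid, status, err, api_key=proxy_api_key),
# ) as token:
#     ...  # use the token with your HuggingFace client
```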

Custom Proxy Server

# Use a different proxy server
token, token_id = get_proxy_token("https://my-proxy.example.com", api_key=proxy_api_key)

# Report to the same server
report_token_status(token_id, "success", proxy_url="https://my-proxy.example.com", api_key=proxy_api_key)
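
Because both functions take the same `proxy_url` and `api_key` keyword arguments, the connection details can be kept in one place. A minimal sketch, assuming a small settings object of your own; `ProxySettings` is a hypothetical helper, not something HF-Inferoxy provides.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ProxySettings:
    """Connection details shared by every call to the proxy (hypothetical helper)."""

    api_key: str
    proxy_url: str = "http://localhost:8000"

    def as_kwargs(self) -> Dict[str, str]:
        # Keys match the keyword arguments of get_proxy_token and report_token_status.
        return {"proxy_url": self.proxy_url, "api_key": self.api_key}


settings = ProxySettings(
    api_key="your_proxy_api_key_here",
    proxy_url="https://my-proxy.example.com",
)
# token, token_id = get_proxy_token(**settings.as_kwargs())
# report_token_status(token_id, "success", **settings.as_kwargs())
```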

Error Handling

try:
    # Your HuggingFace API call
    result = client.chat.completions.create(...)
    report_token_status(token_id, "success", api_key=proxy_api_key)
    
except Exception as e:
    error_msg = str(e)
    
    # The function automatically classifies common errors
    report_token_status(token_id, "error", error_msg, api_key=proxy_api_key)
    
    # You can also check the error type manually
    if "401 Client Error" in error_msg:
        print("Authentication failed - token may be invalid")
    elif "402 Client Error" in error_msg:
        print("Payment required - monthly included credits may be exceeded")
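
If a failed call should be retried with another token, the report-then-reprovision loop might look like the sketch below. It assumes the proxy hands out a different token after an error has been reported, which this document does not confirm; `call_with_fresh_tokens` and its parameters are hypothetical names used only for illustration.

```python
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def call_with_fresh_tokens(
    provision: Callable[[], Tuple[str, str]],
    call: Callable[[str], T],
    report: Callable[[str, str, Optional[str]], object],
    max_attempts: int = 3,
) -> T:
    """Run call(token), requesting a new token after each failure."""
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        token, token_id = provision()
        try:
            result = call(token)
        except Exception as exc:
            # Tell the proxy this token failed, then try again with a new one.
            report(token_id, "error", str(exc))
            last_error = exc
            continue
        report(token_id, "success", None)
        return result
    raise RuntimeError(f"All {max_attempts} attempts failed") from last_error
```

Retrying blindly is not always appropriate: an error unrelated to the token (for example, a malformed request) will fail on every attempt, so you may want to retry only for the error types the proxy classifies.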

Installation

Copy the code above into a file named hf_token_utils.py in your project directory (the usage examples import from this module)
Install required dependencies: uv add requests
Import and use the functions in your code
Get your API key from the HF-Inferoxy admin or create a user account

Error Classification

The utility automatically classifies common HuggingFace errors:

invalid_credentials: 401 errors indicating token authentication failure
credits_exceeded: 402 errors whose message contains “exceeded your monthly included credits”

This helps the proxy server make intelligent decisions about token health and rotation.
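
The two rules above mirror the string checks inside `report_token_status`. Pulled into a standalone function, they can be sketched as follows; `classify_error` is a hypothetical name, not a function the module exports.

```python
from typing import Optional


def classify_error(error: str) -> Optional[str]:
    """Map a HuggingFace error message to an error_type, or None if unrecognized."""
    if "401 Client Error" in error:
        return "invalid_credentials"
    if "402 Client Error" in error and "exceeded your monthly included credits" in error:
        return "credits_exceeded"
    return None
```

Note that a 402 error without the credits message is left unclassified, so the report is still sent but carries no `error_type`.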

Related Documentation

Simple Chat Completion - Basic usage example
Streaming Chat Completion - Streaming usage example
Provider Examples - Provider-specific configuration guides
RBAC Setup - User management and authentication setup