Audio Reasoning

Enables advanced audio analysis, optionally guided by a text instruction.

Endpoint

POST https://api.sambanova.ai/v1/audio/reasoning

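Request parameters

The following table lists the parameters of an audio request, with their types, descriptions, and default values.

Parameter	Type	Description	Default
`model`	String	ID of the model to use. Currently only Qwen2-Audio-7B-Instruct is available.	Required
`messages`	Message	List of messages, each with a role (user/system/assistant), a content type (text/audio_content), and, for audio, an audio_content object holding the base64-encoded audio.	Required
`response_format`	String	Output format, either "json" or "text".	`json`
`temperature`	Float	Sampling temperature between 0 and 1. Higher values (e.g. 0.8) increase randomness; lower values (e.g. 0.2) make the output more focused.	`0`
`max_tokens`	Integer	Maximum number of tokens to generate.	`1000`
`file`	File	Audio file in FLAC, MP3, MP4, MPEG, MPGA, M4A, Ogg, WAV, or WebM format. Each file must not exceed 30 seconds.	Required
`stream`	Boolean	Enables streaming responses.	`false`
`stream_options`	Object	Additional streaming settings (e.g. {"include_usage": true}).	Optional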
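
As an illustration of how the parameters in the table fit together, the following sketch assembles a request body and checks the documented ranges before anything is sent. The helper name `build_audio_payload` and its keyword arguments are hypothetical, and placing `response_format` at the top level of the body is an assumption, since the request examples on this page do not include it.

```python
import base64

# Values documented for response_format in the parameter table.
ALLOWED_FORMATS = {"json", "text"}


def build_audio_payload(audio_bytes, prompt, *, mime_type="audio/mp3",
                        model="Qwen2-Audio-7B-Instruct", temperature=0.0,
                        max_tokens=1000, response_format="json", stream=False):
    """Hypothetical helper: build a request body from raw audio bytes."""
    if not 0 <= temperature <= 1:
        raise ValueError("temperature must be between 0 and 1")
    if response_format not in ALLOWED_FORMATS:
        raise ValueError("response_format must be 'json' or 'text'")

    # The audio travels as a base64 data URI inside an audio_content item.
    encoded = base64.b64encode(audio_bytes).decode("utf-8")
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": [
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": f"data:{mime_type};base64,{encoded}"
                    },
                }
            ]},
            {"role": "user", "content": prompt},
        ],
        # Assumption: response_format is sent as a top-level string field.
        "response_format": response_format,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "stream": stream,
    }
```

The defaults mirror the table (`temperature` 0, `max_tokens` 1000, `stream` false), so a call such as `build_audio_payload(data, "what is the audio about")` yields a body that relies only on documented values.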

Request format

This section shows how to send a request with different tools.

CURL

curl --location 'https://api.sambanova.ai/v1/audio/reasoning' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data '{
    "messages": [
        {"role": "assistant", "content": "you are a helpful assistant"},  
        {"role": "user", "content":[
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": "data:audio/mp3;base64,<base64_audio>"
                    }
                }
            ]
        },
        {"role": "user", "content": "what is the audio about"}
    ],
    "max_tokens": 1024,
    "model": "Qwen2-Audio-7B-Instruct",
    "temperature": 0.01,
    "stream": true
}'

Python

import requests
import base64

def analyze_audio(audio_file_path, api_key):
    with open(audio_file_path, "rb") as audio_file:
        base64_audio = base64.b64encode(audio_file.read()).decode('utf-8')
    
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}"
    }
    
    data = {
        "messages": [
            {"role": "assistant", "content": "you are a helpful assistant"},
            {"role": "user", "content": [
                {
                    "type": "audio_content",
                    "audio_content": {
                        "content": f"data:audio/mp3;base64,{base64_audio}"
                    }
                }
            ]},
            {"role": "user", "content": "what is the audio about"}
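        ],
        "model": "Qwen2-Audio-7B-Instruct",
        "max_tokens": 1024,
        "temperature": 0.01,
        # Keep streaming off here: a streamed response arrives as "data:" chunks
        # that response.json() cannot parse.
        "stream": False
    }
    
    response = requests.post(
        "https://api.sambanova.ai/v1/audio/reasoning",
        headers=headers,
        json=data
    )
    # Surface HTTP errors (e.g. an invalid API key) instead of parsing an error body.
    response.raise_for_status()
    
    return response.json()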

Response format

The API returns the response in the format selected with `response_format`.

{
    "choices": [{
        "delta": {
            "content": "The sound is that of ",
            "role": "assistant"
        },
        "finish_reason": null,
        "index": 0,
        "logprobs": null
    }],
    "created": 1732317298,
    "id": "211b9a22-58cf-4b90-94e9-1fed8d0d9d0a",
    "model": "Qwen2-Audio-7B-Instruct"
}

Streaming response

When streaming is enabled, the API returns a series of data chunks in the following format:

data: {"choices":[{"delta":{"content":"","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}

data: {"choices":[{"delta":{"content":"The sound is that of ","role":"assistant"},"finish_reason":null,"index":0,"logprobs":null}],"created":1732317298,"id":"211b9a22-58cf-4b90-94e9-1fed8d0d9d0a","model":"Qwen2-Audio-7B-Instruct","object":"chat.completion.chunk","system_fingerprint":"fastcoe"}
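
The chunks shown above can be reassembled into the full reply by concatenating each `delta.content` value in order. The following is a minimal sketch of that step, not an official client: the function name `collect_stream_text` is hypothetical, and the handling of a closing `data: [DONE]` line is an assumption borrowed from similar chat-completion streams, since this page does not show how the stream ends.

```python
import json


def collect_stream_text(lines):
    """Hypothetical helper: join the delta.content of each streamed chunk."""
    parts = []
    for raw in lines:
        # HTTP clients often yield bytes; accept both bytes and str.
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines between chunks
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumption: the stream may end with this sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta") or {}
            parts.append(delta.get("content") or "")
    return "".join(parts)
```

With the `requests` library, the lines could come from a call made with `stream=True`, passing `response.iter_lines()` to the function; any iterable of lines works, which keeps the parsing logic testable without network access.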
