Splitting audio files:

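The Whisper API only accepts files smaller than 25 MB. If your audio file is larger than that, you need to either split it into chunks of 25 MB or less, or use a compressed audio format.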
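Before splitting anything, it can help to check whether a file is actually over the limit. The following is a minimal sketch: the helper name needs_splitting is hypothetical, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption, so leave some margin if your file is close to the boundary.

```python
import os

# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
WHISPER_LIMIT_BYTES = 25 * 1024 * 1024


def needs_splitting(path, limit_bytes=WHISPER_LIMIT_BYTES):
    """Return True if the file at `path` is at or above the size limit."""
    return os.path.getsize(path) >= limit_bytes
```

For example, needs_splitting("001.m4a") returns True when the file has to be split or compressed before it is uploaded.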

1. Install PyDub and FFmpeg

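PyDub relies on FFmpeg to handle the various audio formats. First install PyDub with pip, then install FFmpeg (the brew command below is for macOS with Homebrew).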

pip install pydub
brew install ffmpeg
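To confirm that the installation worked, you can check whether the FFmpeg executables are visible on your PATH. This is only a sketch: the helper name tool_available is hypothetical, and it checks for the presence of the executables, not whether they run correctly.

```python
import shutil


def tool_available(name):
    """Return True if an executable called `name` is found on the PATH."""
    return shutil.which(name) is not None


if __name__ == "__main__":
    # PyDub uses ffmpeg for decoding/encoding and ffprobe for reading media info.
    for tool in ("ffmpeg", "ffprobe"):
        status = "found" if tool_available(tool) else "missing"
        print(f"{tool}: {status}")
```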

2. Create the Python script split_audio.py

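Create a Python script named split_audio.py with the following content. It splits the audio file into 10-minute segments; you can choose a different segment length to suit your own audio.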

from pydub import AudioSegment
import os


# Split the audio into 10-minute chunks by default; adjust chunk_length_ms to suit your own audio
def split_audio(input_file, output_dir, chunk_length_ms=10 * 60 * 1000):
    # Load the audio file
    audio = AudioSegment.from_file(input_file)
    # Slice the audio into consecutive chunks of chunk_length_ms milliseconds
    chunks = [audio[i:i + chunk_length_ms] for i in range(0, len(audio), chunk_length_ms)]
    # Create the output directory if it doesn't exist
    os.makedirs(output_dir, exist_ok=True)
    # Name each chunk after the input file, prefixed with a zero-padded index
    base_name = os.path.splitext(os.path.basename(input_file))[0]
    for i, chunk in enumerate(chunks):
        chunk.export(os.path.join(output_dir, f"{i:03d}_{base_name}.mp3"), format="mp3")


if __name__ == "__main__":
    input_file = "/Users/zhouluyao/Desktop/split_audio/001.m4a"
    output_dir = "/Users/zhouluyao/Desktop/split_audio/output/"
    split_audio(input_file, output_dir)

Replace /Users/zhouluyao/Desktop/split_audio/001.m4a with the path to your long audio file, and replace /Users/zhouluyao/Desktop/split_audio/output/ with the path to the folder where the split audio files should be saved.

3. Run the script

python split_audio.py

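This script splits the input audio file into smaller chunks of 10 minutes each (by default) and saves them in the specified output folder. To change the chunk length, adjust the chunk_length_ms parameter of the split_audio function (the value is in milliseconds). Once the audio file is split, you can process each of the smaller chunks with the Whisper API.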
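When choosing chunk_length_ms, a rough size estimate shows whether the exported chunks will stay under the 25 MB limit. The sketch below assumes a constant bitrate and ignores container overhead; the function names and the 128 kbps figure used in the example are illustrative assumptions, not values taken from PyDub or FFmpeg.

```python
# Assumption: "25 MB" is interpreted as 25 * 1024 * 1024 bytes.
LIMIT_BYTES = 25 * 1024 * 1024


def estimated_chunk_bytes(chunk_length_ms, bitrate_kbps):
    """Approximate size in bytes of a constant-bitrate chunk (no container overhead)."""
    # kbps * 1000 bits/s / 8 bits per byte / 1000 ms per s = kbps / 8 bytes per ms
    return chunk_length_ms * bitrate_kbps // 8


def max_chunk_length_ms(bitrate_kbps, limit_bytes=LIMIT_BYTES):
    """Longest chunk, in milliseconds, whose estimated size fits within limit_bytes."""
    return limit_bytes * 8 // bitrate_kbps


if __name__ == "__main__":
    # A 10-minute chunk at an assumed 128 kbps is roughly 9.6 million bytes.
    print(estimated_chunk_bytes(10 * 60 * 1000, 128))
    print(max_chunk_length_ms(128))
```

Under these assumptions, a 10-minute chunk sits well below the limit, so the default is a conservative choice; higher bitrates shrink the margin proportionally.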

Transcribing audio files to text:

To convert the local recording record.m4a to text with OpenAI's API, follow these steps:

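1. Install the OpenAI Python library (v0.27.0 or later, but below 1.0)

The script in the next step calls openai.Audio.transcribe, which was removed in the 1.0 release of the library, so pin the version accordingly.

pip3 install "openai>=0.27.0,<1.0"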

2. Use the following script, speech-to-text.py, to convert the local recording record.m4a to text

import openai

# Set your API key
openai.api_key = "<YOUR_API_KEY>"

# Open the audio file and transcribe it
with open("/Users/zhouluyao/Downloads/002.m4a", "rb") as audio_file:
    transcript = openai.Audio.transcribe("whisper-1", audio_file, language="zh")

# Get the transcribed text
transcribed_text = transcript["text"]

# Print and save the transcribed text
print(transcribed_text)
with open("transcription.txt", "w", encoding="utf-8") as file:
    file.write(transcribed_text)

Replace <YOUR_API_KEY> with your actual API key, and replace /Users/zhouluyao/Downloads/002.m4a with the path to your audio file.

3. Run the following command to generate the text file

python3 speech-to-text.py
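If you split a long recording with split_audio.py, the chunks can be transcribed one by one and joined back into a single transcript. The following is a sketch under stated assumptions: the function transcribe_chunks and its transcribe_fn parameter are hypothetical, and the transcription call is passed in as a callable so the example does not depend on any particular version of the OpenAI library. Sorting by filename gives the right order because split_audio.py prefixes each chunk with a zero-padded index.

```python
import os


def transcribe_chunks(chunk_dir, transcribe_fn, extension=".mp3"):
    """Transcribe every chunk in `chunk_dir` in filename order and join the text.

    `transcribe_fn` is any callable that takes an open binary file object
    and returns the transcribed text for that file.
    """
    names = sorted(n for n in os.listdir(chunk_dir) if n.endswith(extension))
    parts = []
    for name in names:
        with open(os.path.join(chunk_dir, name), "rb") as audio_file:
            parts.append(transcribe_fn(audio_file))
    return "\n".join(parts)
```

With the library version used in speech-to-text.py, transcribe_fn could be lambda f: openai.Audio.transcribe("whisper-1", f, language="zh")["text"]. Note that text at chunk boundaries may be cut mid-sentence, since the split points are purely time-based.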
