langchain_community.document_loaders.assemblyai.AssemblyAIAudioTranscriptLoader
- class langchain_community.document_loaders.assemblyai.AssemblyAIAudioTranscriptLoader(file_path: str, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: Optional[assemblyai.TranscriptionConfig] = None, api_key: Optional[str] = None)
Loader for AssemblyAI audio transcripts.
It uses the AssemblyAI API to transcribe audio files and loads the transcribed text into one or more Documents, depending on the specified format.
To use, you should have the assemblyai Python package installed and the environment variable ASSEMBLYAI_API_KEY set with your API key. Alternatively, the API key can be passed as an argument.
Audio files can be specified via a URL or a local file path.
Initializes the AssemblyAI AudioTranscriptLoader.
- Parameters
file_path (str) – A URL or a local file path.
transcript_format (TranscriptFormat) – Transcript format to use. See class TranscriptFormat for more info.
config (Optional[assemblyai.TranscriptionConfig]) – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.
api_key (Optional[str]) – AssemblyAI API key.
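The constructor described above can be exercised roughly as follows. This is a minimal sketch, not taken from the library's own examples: the helper name transcribe_to_documents and the audio URL in the trailing comment are hypothetical, and a real run needs the assemblyai package, network access, and a valid API key.

```python
from typing import List, Optional


def transcribe_to_documents(file_path: str, api_key: Optional[str] = None) -> List:
    """Transcribe one audio file and return the resulting Documents.

    Hypothetical helper for illustration; it is not part of LangChain.
    """
    # Imported lazily so that defining this helper does not require the
    # optional dependencies to be installed.
    from langchain_community.document_loaders.assemblyai import (
        AssemblyAIAudioTranscriptLoader,
        TranscriptFormat,
    )

    loader = AssemblyAIAudioTranscriptLoader(
        file_path,  # a URL or a local file path
        transcript_format=TranscriptFormat.TEXT,  # the documented default
        api_key=api_key,  # None: rely on the ASSEMBLYAI_API_KEY variable
    )
    # load() returns the Documents produced from the transcript.
    return loader.load()


# Example call (placeholder URL, needs network access and an API key):
# docs = transcribe_to_documents("https://example.com/audio.mp3")
```

Passing api_key explicitly is mainly useful when the key comes from a secrets manager rather than the process environment.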
Methods
__init__(file_path, *[, transcript_format, ...]) – Initializes the AssemblyAI AudioTranscriptLoader.
alazy_load() – A lazy loader for Documents.
lazy_load() – Transcribes the audio file and loads the transcript into documents.
load() – Load data into Document objects.
load_and_split([text_splitter]) – Load Documents and split into chunks.
- __init__(file_path: str, *, transcript_format: TranscriptFormat = TranscriptFormat.TEXT, config: Optional[assemblyai.TranscriptionConfig] = None, api_key: Optional[str] = None)
Initializes the AssemblyAI AudioTranscriptLoader.
- Parameters
file_path (str) – A URL or a local file path.
transcript_format (TranscriptFormat) – Transcript format to use. See class TranscriptFormat for more info.
config (Optional[assemblyai.TranscriptionConfig]) – Transcription options and features. If None is given, the Transcriber’s default configuration will be used.
api_key (Optional[str]) – AssemblyAI API key.
- async alazy_load() → AsyncIterator[Document]
A lazy loader for Documents.
- Return type
AsyncIterator[Document]
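Because alazy_load() returns an AsyncIterator, it has to be consumed with async for. The following is a minimal sketch of that pattern; the helper names collect_documents and collect_documents_sync are hypothetical, and the loader argument is assumed to be an already constructed loader instance.

```python
import asyncio
from typing import Any, List


async def collect_documents(loader: Any) -> List[Any]:
    """Drain loader.alazy_load() into a list (hypothetical helper)."""
    documents = []
    # alazy_load() is documented to return an AsyncIterator[Document],
    # so it is consumed with `async for` rather than a plain `for`.
    async for document in loader.alazy_load():
        documents.append(document)
    return documents


def collect_documents_sync(loader: Any) -> List[Any]:
    """Run the coroutine above from synchronous code."""
    # asyncio.run() starts an event loop, runs the coroutine, and closes it.
    return asyncio.run(collect_documents(loader))
```

The synchronous wrapper is only appropriate where no event loop is already running, since asyncio.run() cannot be called from inside one.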
- lazy_load() → Iterator[Document]
Transcribes the audio file and loads the transcript into documents.
It uses the AssemblyAI API to transcribe the audio file and blocks until the transcription is finished.
- Return type
Iterator[Document]
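The iterator returned by lazy_load() can be consumed one Document at a time. The sketch below previews the first few results; the helper name preview_documents and its limit and width parameters are hypothetical, and it assumes each Document exposes its text as page_content, as LangChain Document objects do. Since the call blocks until transcription finishes, iterating lazily does not shorten that wait.

```python
from itertools import islice
from typing import Any, List


def preview_documents(loader: Any, limit: int = 3, width: int = 80) -> List[str]:
    """Return the first `width` characters of up to `limit` documents."""
    previews = []
    # lazy_load() yields Documents one at a time; islice stops pulling
    # from the iterator once `limit` documents have been seen.
    for document in islice(loader.lazy_load(), limit):
        previews.append(document.page_content[:width])
    return previews
```

This is mostly useful with transcript formats that produce more than one Document, where only the first few entries need inspecting.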
- load_and_split(text_splitter: Optional[TextSplitter] = None) → List[Document]
Load Documents and split into chunks. Chunks are returned as Documents.
Do not override this method. It should be considered deprecated.
- Parameters
text_splitter (Optional[TextSplitter]) – TextSplitter instance to use for splitting documents. Defaults to RecursiveCharacterTextSplitter.
- Returns
List of Documents.
- Return type
List[Document]
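Given that load_and_split() should be considered deprecated, one alternative is to load the Documents first and split them in a separate step. The sketch below assumes the splitter exposes split_documents(), as LangChain TextSplitter classes do; the helper name load_then_split is hypothetical, and this page does not confirm that the result matches load_and_split() exactly.

```python
from typing import Any, List


def load_then_split(loader: Any, text_splitter: Any) -> List[Any]:
    """Load all Documents, then split them into chunk Documents."""
    documents = loader.load()
    # split_documents() takes a list of Documents and returns a list of
    # smaller chunk Documents.
    return text_splitter.split_documents(documents)


# Example call, given a loader and a splitter instance such as
# RecursiveCharacterTextSplitter (the documented default splitter):
# chunks = load_then_split(loader, splitter)
```

Keeping the two steps separate also makes the splitter an explicit choice rather than an implicit default.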