google#

Integration with Google Cloud APIs.

class voice_stream.integrations.google.TTSRequest(*, text: str, voice: str)#

Bases: BaseModel

model_config: ClassVar[ConfigDict] = {}#

Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

model_fields: ClassVar[dict[str, FieldInfo]] = {'text': FieldInfo(annotation=str, required=True), 'voice': FieldInfo(annotation=str, required=True)}#

Metadata about the fields defined on the model, mapping of field names to [FieldInfo][pydantic.fields.FieldInfo].

This replaces Model.__fields__ from Pydantic V1.

text: str#
voice: str#
voice_stream.integrations.google._get_transcript(result)#
voice_stream.integrations.google._initial_recognition_config(include_events: bool, project: str, location: str, recognizer: str, model: str = 'latest_long', language_codes: str | list[str] = ['en-US', 'es-US'], audio_format: AudioFormat | ExplicitDecodingConfig | None = None)#
voice_stream.integrations.google._initial_recognition_config_v1(include_events: bool, audio_format: AudioFormat, model: str = 'latest_long', language_code: str = 'en-US', use_enhanced: bool = True)#
voice_stream.integrations.google._map_speech_events(input)#
voice_stream.integrations.google._map_speech_events_v1(input)#
voice_stream.integrations.google._resolve_audio_decoding(audio_format)#
voice_stream.integrations.google._resolve_google_audio_config(audio_format: AudioFormat | AudioConfig, speaking_rate: float)#
voice_stream.integrations.google._run_google_speech(async_iter: AsyncIterator[bytes], include_events: bool, map_audio, initial_config, speech_async_client, map_events)#
async voice_stream.integrations.google.array_source(array: list[T]) AsyncIterator[T]#

Data flow source the yields items from a list.

This function takes a list and converts it into an asynchronous iterator. Each element of the list is yielded one by one in an asynchronous fashion.

Parameters:

array (list[T]) – A list of items of type T. The elements of this list will be yielded by the asynchronous iterator.

Returns:

An asynchronous iterator that yields each element of the input list.

Return type:

AsyncIterator[T]

Examples

>>> stream = array_source([1,2,3])
>>> stream = log_step(stream, "Value")
>>> await empty_sink(stream)

Notes

  • This source is useful when an existing list needs to be processed asynchronously.

voice_stream.integrations.google.async_init_step(async_iter: AsyncIterator[T], f: Callable[[AsyncIterator[T]], Coroutine], num_outputs: int = 1) AsyncIterator[Output]#

Data flow step that asyncronously initializes a step.

This step takes an AsyncIterator and a lambda that takes the iterator and returns an Awaitable. It eagerly pulls the first item from the iterator and separately calls the lambda and awaits the result in a separate task. This is useful when you ahve a step that requires an async call to initialize (which we generally avoid). It’s also useful when the initialization depends on a value produced from earlier in the data flow.

Parameters:
  • async_iter (AsyncIterator[T]) – An input asynchronous iterator.

  • f (Callable[[AsyncIterator[T]], Coroutine]) – A coroutine function that takes the input async iterator and returns another async iterator

  • num_outputs (int, optional) – The number of output async iterators to expect from f. Defaults to 1. Use this if the step you are asynchronously initializing returns more than one iterator, such as a voice_stream:fork_step. num_outputs can be 0 in the case where the asynchronous initialization is for a sink.

Returns:

if num_outputs is 1, then returns an AsyncIterator otherwise, returns a tuple of AsyncIterators the same length as num_outputs

Return type:

AsyncIterator[Output] or Tuple[AsyncIterator[Output], …]

Examples

# Copy a file, generating a name based on the first line of the file. >>> stream = text_file_source(“example.txt”) >>> stream, name = extract_value_step(stream, lambda x: x) >>> done = await async_init_step(stream, lambda x: text_file_sink(x, resolve_awaitable_or_obj(name)), num_outputs=0)

voice_stream.integrations.google.byte_buffer_step(async_iter: AsyncIterator[bytes]) AsyncIterator[bytes]#

Data flow step that buffers and aggregates byte sequences to match producer and consumer rates.

This function acts as a buffer specifically for byte streams. It pulls bytes from the incoming iterator as fast

as they can come in, and then outputs aggreged byte buffers only as fast as the output iterator can pull them.

Parameters:

async_iter (AsyncIterator[bytes]) – An asynchronous iterator yielding byte sequences.

Returns:

An asynchronous iterator yielding concatenated byte sequences.

Return type:

AsyncIterator[bytes]

Examples

>>> async def rate_limit_sink(ai):
...     out = []
...     async for item in ai:
...         await asyncio.sleep(0.5)
...         out.append(item)
>>> stream = array_source([b'a',b'b',b'c',b'd'])
>>> stream = byte_buffer_step(stream)
>>> done = await rate_limit_sink(stream)
>>> assert done == [b'abcd']
async voice_stream.integrations.google.chunk_bytes_step(async_iter: AsyncIterator[bytes], chunk_size: int | Awaitable[int]) AsyncIterator[bytes]#

Data flow step that breaks long byte sequences from an asynchronous iterator into fixed-size chunks.

This function takes an asynchronous iterator yielding byte sequences and a chunk size, then yields these byte sequences in chunks of the specified size. The chunk size can be provided either as a direct integer value or as an awaitable object that resolves to an integer. This is useful for processing large byte sequences in manageable sizes.

Parameters:
  • async_iter (AsyncIterator[bytes]) – An asynchronous iterator that yields byte sequences.

  • chunk_size (AwaitableOrObj[int]) – The size of each chunk as an integer, or an awaitable object that yields an integer.

Returns:

An asynchronous iterator yielding byte chunks of the specified size.

Return type:

AsyncIterator[bytes]

Examples

>>> stream = array_source([b'HelloWorld!', b'PythonAsyncIO']):
>>> stream = chunk_bytes_step(stream, chunk_size=5)
>>> done = await array_sink(stream)
>>> assert done == [b'Hello', b'World', b'!', b'Pytho', b'nAsyn', b'cIO']
async voice_stream.integrations.google.concat_step(*async_iters: List[AsyncIterator[T]]) AsyncIterator[T]#

Data flow step that combines multiple streams by concatenating them.

This function takes a list of asynchronous iterators and concatenates their elements into a single async iterator. It iterates over each input iterator in the order they are provided, yielding all items from each before moving on to the next.

Parameters:

async_iters (List[AsyncIterator[T]]) – A list of asynchronous iterators to be concatenated.

Returns:

A single asynchronous iterator yielding items from all provided iterators in sequence.

Return type:

AsyncIterator[T]

Examples

>>> stream1 = array_source([1,2])
>>> stream2 = array_source([3,4])
>>> stream = concat_step(stream1, stream2)
>>> done = await array_sink(stream)
>>> assert done == [1,2,3,4]
async voice_stream.integrations.google.filter_step(async_iter: AsyncIterator[T], condition: Callable[[T], bool]) AsyncIterator[T]#

Data flow step that filters items based on a specified condition.

This function wraps an async iterator and yields only those items that satisfy a given condition. It is analogous to the built-in filter function but for asynchronous iterators. Each item from the input async iterator is passed to the condition function, and only items for which this function returns True are yielded.

Parameters:
  • async_iter (AsyncIterator[T]) – The asynchronous iterator whose items are to be filtered.

  • condition (Callable[[T], bool]) – A function that evaluates each item in the iterator. If this function returns True, the item is yielded; otherwise, it is skipped.

Returns:

An asynchronous iterator yielding only the items that satisfy the condition.

Return type:

AsyncIterator[T]

Examples

>>> stream = array_source(range(4))
>>> stream = filter_step(stream, lambda x: x % 2 == 0)
>>> done = await array_sink(stream)
>>>
# Output: 0, 2
voice_stream.integrations.google.google_speech_step(async_iter: AsyncIterator[bytes], speech_async_client: SpeechAsyncClient, project, location, recognizer, model: str = 'latest_long', language_codes: str | list[str] = ['en-US', 'es-US'], audio_format: AudioFormat | ExplicitDecodingConfig | None = None, include_events: bool = False) AsyncIterator[str] | Tuple[AsyncIterator[str], AsyncIterator[BaseEvent]]#

Data flow step for converting audio into text using Google Cloud Speech-to-Text V2 API.

This function processes an asynchronous stream of audio bytes, using Google Cloud Speech-to-Text service to convert the audio into text. It supports additional configuration such as specifying the model, language codes, and audio format. If ‘include_events’ is set to True, it also returns a stream of speech recognition events alongside the text (SpeechStart and SpeechEnd).

Parameters:
  • async_iter (AsyncIterator[bytes]) – An asynchronous iterator over audio data in bytes.

  • speech_async_client (SpeechAsyncClient) – An instance of SpeechAsyncClient for interacting with the Google Cloud Speech-to-Text API.

  • project – The Google Cloud project identifier.

  • location – The location or region of the Google Cloud project.

  • recognizer – The recognizer identifier within the Google Cloud project. This must be a previously created recognizer. See create_recognizer() for details.

  • model (str, optional) – The model to be used by the recognizer. Default is “latest_long”.

  • language_codes (Union[str, list[str]], optional) – The language code(s) for the recognizer. Can be a single string or a list of strings. Default is [“en-US”, “es-US”].

  • audio_format (GoogleDecodingConfig, optional) – Optional configuration for audio decoding. Not required if the recognizer auto-detects the format.

  • include_events (bool, optional) – If True, the function also returns a stream of speech recognition events. Default is False.

Returns:

If include_events is False, returns an asynchronous iterator yielding recognized text from the audio stream. If include-events is True, returns a tuple with 2 iterators. The first yields the recognized text, and the second contain speech events.

Return type:

Union[AsyncIterator[str], Tuple[AsyncIterator[str], AsyncIterator[voice_stream.events.BaseEvent]]]

Notes

  • The function breaks the audio stream into chunks, sends them to the Speech-to-Text API, and processes the responses to extract the transcript.

  • Speech recognition events include information like word timings and confidences.

voice_stream.integrations.google.google_speech_v1_step(async_iter: AsyncIterator[bytes], speech_async_client: SpeechAsyncClient, audio_format: AudioFormat, model: str = 'latest_long', language_code: str = 'en-US', include_events: bool = False) AsyncIterator[str]#

Data flow step for converting audio into text using Google Cloud Speech-to-Text V1 API.

This function processes an asynchronous stream of audio bytes, using Google Cloud Speech-to-Text V1 service to convert the audio into text. It supports additional configuration such as specifying the model, language codes, and audio format. If ‘include_events’ is set to True, it also returns a stream of speech recognition events alongside the text (SpeechStart and SpeechEnd).

Parameters:
  • async_iter (AsyncIterator[bytes]) – An asynchronous iterator over audio data in bytes.

  • speech_async_client (SpeechAsyncClient) – An instance of SpeechAsyncClientV1 for interacting with the Google Cloud Speech-to-Text API V1.

  • audio_format (voice_stream.audio.AudioFormat) – The audio format of the input data. This is required in V1. Use the V2 API for auto-detection of formats.

  • model (str, optional) – The model to be used by the recognizer. Default is “latest_long”.

  • language_code (str, optional) – The language code(s) for the recognizer. Default is “en-US”.

  • include_events (bool, optional) – If True, the function also returns a stream of speech recognition events. Default is False.

Returns:

If include_events is False, returns an asynchronous iterator yielding recognized text from the audio stream. If include-events is True, returns a tuple with 2 iterators. The first yields the recognized text, and the second contain speech events.

Return type:

Union[AsyncIterator[str], Tuple[AsyncIterator[str], AsyncIterator[voice_stream.events.BaseEvent]]]

Notes

  • The function breaks the audio stream into chunks, sends them to the Speech-to-Text API, and processes the responses to extract the transcript.

  • Speech recognition events include information like word timings and confidences.

async voice_stream.integrations.google.google_text_to_speech_step(async_iter: AsyncIterator[str | TTSRequest], text_to_speech_async_client: TextToSpeechAsyncClient, voice_name: str = 'en-US-Standard-H', audio_format: AudioFormat | AudioConfig = AudioFormat.OGG_OPUS, speaking_rate: float = 1.0) AsyncIterator[AudioWithText]#

Data flow step that converts text to speech using Google’s Text-to-Speech service.

This function takes in strings or TTSRequest objects and converts each into audio using the specified voice and audio format. The function supports customization of the voice for each item if provided within a TTSRequest object.

Parameters:
  • async_iter (AsyncIterator[Union[str, TTSRequest]]) – An asynchronous iterator over text blocks or TTSRequest objects. Using TTSRequest objects allows the voice to be customized for each block of text.

  • text_to_speech_async_client (TextToSpeechAsyncClient) – An instance of TextToSpeechAsyncClient for interacting with the Google Text-to-Speech API.

  • voice_name (str, optional) – Default voice name to be used for text-to-speech conversion. Default is “en-US-Standard-H”.

  • audio_format (GoogleAudioConfig, optional) – The audio format for the output speech. Default is AudioFormat.OGG_OPUS.

  • speaking_rate – Speaking rate/speed, in the range [0.25, 4.0]. 1.0 is the normal native speed supported by the specific voice. 2.0 is twice as fast, and 0.5 is half as fast. If unset(0.0), defaults to the native 1.0 speed. Ignored if a GoogleAudioConfig is passed to audio_format.

Yields:

AsyncIterator[AudioWithText] – An asynchronous iterator yielding AudioWithText objects, each containing the audio output and the original text.

Notes

  • The function allows for dynamic voice selection if a TTSRequest object is provided, which specifies the voice for a particular text block.

  • Supports different audio encoding formats based on the GoogleAudioConfig.

async voice_stream.integrations.google.log_step(async_iter: ~typing.AsyncIterator[~voice_stream.types.T], name: str, formatter: ~typing.Callable[[~voice_stream.types.T], ~typing.Any] = <function <lambda>>) AsyncIterator[T]#

Data flow step that prints using the Python logger.

This function takes an async iterator and logs each item that passes through it. The logging includes a specified name and uses an optional formatter function to convert items to a log-friendly format. This is useful for debugging or monitoring the contents of an data flow.

Parameters:
  • async_iter (AsyncIterator[T]) – The asynchronous iterator whose items are to be logged.

  • name (str) – A name to prepend to each logged message for identification.

  • formatter (Callable[[T], Any], optional) – A function to format each item before logging. Defaults to a function that returns the item unchanged.

Returns:

An asynchronous iterator that logs each item and then yields it.

Return type:

AsyncIterator[T]

Examples

>>> stream = array_source([1,2,3])
>>> stream = log_step(stream, "Value squared", lambda x: x*x)
>>> await empty_sink()
Value squared: 1
Value squared: 4
Value squared: 9
async voice_stream.integrations.google.map_step(async_iter: AsyncIterator[T], func: Callable[[T], Output], ignore_none: bool | None = False) AsyncIterator[Output]#

Data flow step that transforms items using a mapping function.

This function applies either a synchronous or asynchronous mapping function to each item in the input async iterator. If ignore_none is set to True, any items that are transformed to None are not yielded. This feature allows the function to perform both transformation and filtering in a single step.

Parameters:
  • async_iter (AsyncIterator[T]) – The asynchronous iterator whose items are to be transformed.

  • func (Callable[[T], Output]) – A mapping function to apply to each item. This can be either a synchronous or asynchronous function.

  • ignore_none (Optional[bool], default False) – If True, items that are transformed to None by func are not yielded.

Returns:

An asynchronous iterator yielding transformed items, optionally skipping None values.

Return type:

AsyncIterator[Output]

Examples

>>> stream = array_source(range(4))
>>> stream = map_step(stream, lambda x: x + 2)
>>> done = await array_sink(stream)
>>> assert done == [2,3,4,5]
voice_stream.integrations.google.partition_step(async_iter: AsyncIterator[T], condition: Callable[[T], bool]) Tuple[AsyncIterator[T], AsyncIterator[T]]#

Data flow step that partitions items into two output streams based on a specified condition.

This function divides the items from the input async iterator into two separate iterators: one for items where the condition evaluates to True, and another for items where the condition is False. If an exception occurs during iteration, it is propagated to both output iterators.

Parameters:
  • async_iter (AsyncIterator[T]) – An asynchronous iterator whose items are to be partitioned.

  • condition (Callable[[T], bool]) – A function that evaluates each item in the iterator. Returns True if the item should go to the first iterator, and False for the second.

Returns:

A tuple of two asynchronous iterators: the first yields items where the condition is True, and the second yields items where the condition is False.

Return type:

Tuple[AsyncIterator[T], AsyncIterator[T]]

Examples

>>> stream = array_source(range(4))
>>> true_stream, false_stream = partition_step(stream, lambda x: x % 2 == 0)
>>> true_done = array_sink(true_stream)
>>> false_done = array_sink(false_stream)
>>> true_ret, false_ret = await asyncio.gather(true_done, false_done)
>>> assert true_ret == [0,2]
>>> assert false_ret == [1,3]
async voice_stream.integrations.google.recover_exception_step(async_iter: AsyncIterator[T], exception_type: Type[BaseException], exception_handler: Callable[[BaseException], Any]) AsyncIterator[T]#

Data flow step that recovers from an exception in a previous step.

This function takes an async iterator and a specified exception type. If an exception with the specified type occurs during iteration, it is caught and passed to the provided exception handler function. The result depends on what is returned by the exception handler: * If it returns an AsyncIterator, that will be yielded and this step will end when the iterator ends. * If it returns None, this step will immediately end. * If it returns anything else, that object will be yielded and this step will end.

Parameters:
  • async_iter (AsyncIterator[T]) – The asynchronous iterator to be wrapped.

  • exception_type (Type[BaseException]) – The type of exception to be caught and handled.

  • exception_handler (Callable[[BaseException], Any]) – A function to handle the caught exception. Returns an optional value that will be converted to an AsyncIterator.

Returns:

An asynchronous iterator that handles exceptions with the specified type.

Return type:

AsyncIterator[T]

Examples

>>> def gen():
...    yield 1
...    yield 2
...    raise ValueError("No more")
>>> stream = gen()
>>> stream = recover_exeption_step(stream, ValueError, lambda x: x.args[0])
>>> done = await array_sink(stream)
>>> assert done == [1,2,"No more"]

Notes

  • Once the source iterator throws the exception, it will not be accessed again.

voice_stream.integrations.google.remove_wav_header(wav_bytes: bytes) bytes#

Removes the wav header from a wav file, regardless of the format.

Parameters:

wav_bytes – The beginning of a WAV file, including the header.

Returns:

The audio bytes from the file, without the header.

Return type:

bytes