In this tutorial, we'll transcribe an audio file with privacy guarantees using the popular model Whisper with BlindAI API. Whisper is an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.
Installation¶
Before we begin, you will need to install the blindai Python library and download an example test audio file. This one was provided by the University of Illinois at Chicago as part of their Computer Science 101 course.
# install blindai
!pip install blindai
# Download our example audio file and save it as `taunt.wav`
!wget https://www2.cs.uic.edu/~i101/SoundFiles/taunt.wav -O taunt.wav
Transcribing the audio file¶
Speech-to-text operations are handled within the Audio.transcribe module of BlindAI.api.
To transcribe an audio file, we call the transcribe method and provide the path to our audio file as the file option. The Whisper model accepts a variety of audio file formats (m4a, mp3, mp4, mpeg, mpga, wav, webm).
import blindai
transcript = blindai.api.Audio.transcribe("taunt.wav")
If you or your service provider are running your own instance of the BlindAI server and you don't wish to connect to our managed cloud solution, you can pass your BlindAiConnection client object to the transcribe method's connection option.
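For example, a self-hosted setup might look like the following sketch. The server address is a placeholder, and this snippet needs a reachable BlindAI server instance to actually run:

```python
import blindai

# Connect to your own BlindAI server rather than the managed cloud.
# "your-server-address" is a placeholder for your deployment's address.
connection = blindai.core.connect("your-server-address")

# Pass the BlindAiConnection client object via the `connection` option
transcript = blindai.api.Audio.transcribe(
    "taunt.wav",
    connection=connection,
)
```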
We can print out the returned transcript string variable to check our results are as expected.
print(transcript)
Now go away, or I shall taunt you a second timer!
Our audio file has been correctly transcribed!
Feel free to test this out with your own audio files 🔊
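If you want to batch-test a folder of your own recordings, a small helper can filter it down to the formats Whisper accepts. This is a convenience sketch of ours, not part of the BlindAI API:

```python
from pathlib import Path

# Extensions accepted by the Whisper model, per the list above
SUPPORTED_FORMATS = {".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}

def supported_audio_files(folder: str) -> list:
    """Return the paths in `folder` whose extension Whisper accepts."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in SUPPORTED_FORMATS
    )
```

You could then loop over the returned paths and call blindai.api.Audio.transcribe on each one.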
Under the hood¶
Let's take a moment to discuss what went on under the hood when we called the blindai.api.Audio.transcribe() method. This will help you better understand how data privacy is guaranteed by the hardware.
Connection¶
First, the transcribe() method connects to our managed cloud solution by calling the BlindAI.Core connect() method, with the following code:
DEFAULT_BLINDAI_ADDR = "4.246.205.63"
connection = blindai.core.connect(
DEFAULT_BLINDAI_ADDR,
hazmat_http_on_unattested_port=True,
)
It is at this point that the attestation process is triggered. There is a technical explanation of what goes on in our confidential computing guide.
But if you haven't read it yet, all you need to know at this point is that the attestation process verifies that the application code running in the enclave has not been modified. To do so, it will check that the Mithril server instance is running a real enclave using genuine hardware.
The client will also check that the application code running in the enclave exactly matches the expected code for the latest release of BlindAI. The expected code is defined in the client's built-in manifest.toml file. This step prevents users from connecting to a version of the application which has been tampered with and may not be trustworthy.
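Conceptually, this check boils down to comparing a measurement (a hash) of the code that is actually running against an expected value shipped with the client. The real attestation flow involves signed DCAP quotes rather than a bare hash comparison, but a toy illustration of the idea:

```python
import hashlib

def measurements_match(enclave_code: bytes, expected_hash: str) -> bool:
    """Toy stand-in for the manifest check: hash the code that is
    actually running and compare it to the expected measurement."""
    return hashlib.sha256(enclave_code).hexdigest() == expected_hash

# The client ships with the expected measurement of the genuine code...
genuine = b"genuine enclave application code"
expected = hashlib.sha256(genuine).hexdigest()

# ...so unmodified code passes, and tampered code is rejected.
print(measurements_match(genuine, expected))           # True
print(measurements_match(b"tampered code", expected))  # False
```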
We can quickly prove that this check is in place by doing the following:
- Creating an empty manifest.toml file called fake_manifest.toml
- Using the connect() method's hazmat_manifest_path option to override the default path for the manifest.toml file with the path to our fake_manifest.toml file
Uncomment and run the following code to see this for yourself.
# !touch fake_manifest.toml
# connection = blindai.core.connect(
# DEFAULT_BLINDAI_ADDR,
# hazmat_http_on_unattested_port=True,
# use_cloud_manifest=True,
# hazmat_manifest_path="./fake_manifest.toml"
# )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/blindai/client.py in __init__(self, addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    604
--> 605         validate_attestation(
    606             quote,

/usr/local/lib/python3.9/dist-packages/blindai/_dcap_attestation.py in validate_attestation(quote, collateral, enclave_held_data, manifest_path, use_cloud_manifest)
    187     if manifest_path is not None and use_cloud_manifest is True:
--> 188         raise TypeError(
    189             "Input inconsistency. You cannot set a manifest path and ask to use the cloud manifest in the same time"

TypeError: Input inconsistency. You cannot set a manifest path and ask to use the cloud manifest in the same time

During handling of the above exception, another exception occurred:

AttestationError                          Traceback (most recent call last)
<ipython-input-6-9ac48413e7b9> in <cell line: 3>()
      1 get_ipython().system('touch fake_manifest.toml')
      2
----> 3 connection = blindai.core.connect(
      4     DEFAULT_BLINDAI_ADDR,
      5     hazmat_http_on_unattested_port=True,

/usr/local/lib/python3.9/dist-packages/blindai/client.py in connect(addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    838     """
    839
--> 840     return BlindAiConnection(
    841         addr,
    842         unattested_server_port,

/usr/local/lib/python3.9/dist-packages/blindai/client.py in __init__(self, addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    613             raise
    614         except Exception as e:
--> 615             raise AttestationError("Attestation verification failed")
    616
    617     # requests (http library) takes a path to a file containing the CA

AttestationError: Attestation verification failed
This causes the connection to be refused with an AttestationError because the application code cannot be verified.
Note
To test this more thoroughly, you can download our GitHub project and modify the server code stored in the src folder very slightly. You then need to generate a new manifest.toml based on this modified source code by building the project. Once this is done, replace the path to the fake_manifest.toml in the previous example with the path to your newly generated manifest.toml. This will lead to the same error, since the server's hash received in the attestation report will not match the code in the newly generated manifest.toml.
Querying the model¶
Once connected, the transcribe() method prepares the audio file for the model and calls the BlindAI.Core run_model() method, passing it the ID of the default Whisper model and the now pre-processed input data:
res = connection.run_model(model_hash=DEFAULT_WHISPER_MODEL, input_tensors=input_data)
Then, the method extracts the text transcription from the returned object and returns it to our BlindAI API user!
Conclusions¶
We have seen:
- How to confidentially transcribe audio using the Whisper model in BlindAI API.
- How privacy is guaranteed under the hood.
Please check out the rest of our BlindAI documentation to see more examples of how you can use BlindAI to query AI models without compromising the safety of user data or models.