In this tutorial, we'll transcribe an audio file with privacy guarantees using the popular model Whisper with BlindAI API. Whisper is an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web.
Installation¶
Before we begin, you will need to install the blindai Python library and download an example test audio file. This one was provided by the University of Illinois at Chicago as part of their Computer Science 101 course.
# install blindai
!pip install blindai
# Download our example audio file and save it as `taunt.wav`
!wget https://www2.cs.uic.edu/~i101/SoundFiles/taunt.wav -O taunt.wav
Transcribing the audio file¶
Speech-to-text operations are handled within the Audio.transcribe module of BlindAI.api.
To transcribe an audio file, we call the transcribe method and provide the path to our audio file as the file option. The Whisper model accepts a variety of audio file formats (m4a, mp3, mp4, mpeg, mpga, wav, webm).
import blindai
transcript = blindai.api.Audio.transcribe("taunt.wav")
If you or your service provider are running your own instance of the BlindAI server and you don't wish to connect to our managed cloud solution, you can pass your BlindAiConnection client object to the transcribe method's connection option.
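For example, a self-hosted setup might look like the following sketch. The server address is a placeholder, and this snippet needs a reachable BlindAI server instance to actually run:

```python
import blindai

# Connect to your own BlindAI server rather than the managed cloud.
# "your-server-address" is a placeholder for your deployment's address.
connection = blindai.core.connect("your-server-address")

# Pass the BlindAiConnection client object via the `connection` option
transcript = blindai.api.Audio.transcribe(
    "taunt.wav",
    connection=connection,
)
```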
We can print out the returned transcript string variable to check our results are as expected.
print(transcript)
Now go away, or I shall taunt you a second timer!
Our audio file has been correctly transcribed!
Feel free to test this out with your own audio files 🔊
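If you want to batch-test a folder of your own recordings, a small helper can filter it down to the formats Whisper accepts. This is a convenience sketch of ours, not part of the BlindAI API:

```python
from pathlib import Path

# Extensions accepted by the Whisper model, per the list above
SUPPORTED_FORMATS = {".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}

def supported_audio_files(folder: str) -> list:
    """Return the paths in `folder` whose extension Whisper accepts."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.suffix.lower() in SUPPORTED_FORMATS
    )
```

You could then loop over the returned paths and call blindai.api.Audio.transcribe on each one.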
Under the hood¶
Let's take a moment to discuss what went on under the hood when we called the blindai.api.Audio.transcribe() method. This will help you better understand how data privacy is guaranteed by the hardware.
Connection¶
First, the transcribe() method connects to our managed cloud solution by calling the BlindAI.Core connect() method, with the following code:
DEFAULT_BLINDAI_ADDR = "4.246.205.63"
connection = blindai.core.connect(
DEFAULT_BLINDAI_ADDR,
hazmat_http_on_unattested_port=True,
)
It is at this point that the attestation process is triggered. There is a technical explanation of what goes on in our confidential computing guide.
But if you haven't read it yet, all you need to know at this point is that the attestation process verifies that the application code running in the enclave has not been modified. To do so, it will check that the Mithril server instance is running a real enclave using genuine hardware.
The client will also check that the application code running in the enclave exactly matches the expected code for the latest release of BlindAI. The expected code is defined in the client's built-in manifest.toml file. This step prevents users from connecting to a version of the application which has been tampered with and may not be trustworthy.
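Conceptually, this check boils down to comparing a measurement (a hash) of the code that is actually running against an expected value shipped with the client. The real attestation flow involves signed DCAP quotes rather than a bare hash comparison, but a toy illustration of the idea:

```python
import hashlib

def measurements_match(enclave_code: bytes, expected_hash: str) -> bool:
    """Toy stand-in for the manifest check: hash the code that is
    actually running and compare it to the expected measurement."""
    return hashlib.sha256(enclave_code).hexdigest() == expected_hash

# The client ships with the expected measurement of the genuine code...
genuine = b"genuine enclave application code"
expected = hashlib.sha256(genuine).hexdigest()

# ...so unmodified code passes, and tampered code is rejected.
print(measurements_match(genuine, expected))           # True
print(measurements_match(b"tampered code", expected))  # False
```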
We can quickly prove that this check is in place by doing the following:
- Creating an empty manifest.toml file called fake_manifest.toml
- Using the connect() method's hazmat_manifest_path option to override the default path for the manifest.toml file with the path to our fake_manifest.toml file
Uncomment and run the following code to see this for yourself.
# !touch fake_manifest.toml
# connection = blindai.core.connect(
# DEFAULT_BLINDAI_ADDR,
# hazmat_http_on_unattested_port=True,
# use_cloud_manifest=True,
# hazmat_manifest_path="./fake_manifest.toml"
# )
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.9/dist-packages/blindai/client.py in __init__(self, addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    604
--> 605         validate_attestation(
    606             quote,

/usr/local/lib/python3.9/dist-packages/blindai/_dcap_attestation.py in validate_attestation(quote, collateral, enclave_held_data, manifest_path, use_cloud_manifest)
    187     if manifest_path is not None and use_cloud_manifest is True:
--> 188         raise TypeError(
    189             "Input inconsistency. You cannot set a manifest path and ask to use the cloud manifest in the same time"

TypeError: Input inconsistency. You cannot set a manifest path and ask to use the cloud manifest in the same time

During handling of the above exception, another exception occurred:

AttestationError                          Traceback (most recent call last)
<ipython-input-6-9ac48413e7b9> in <cell line: 3>()
      1 get_ipython().system('touch fake_manifest.toml')
      2
----> 3 connection = blindai.core.connect(
      4     DEFAULT_BLINDAI_ADDR,
      5     hazmat_http_on_unattested_port=True,

/usr/local/lib/python3.9/dist-packages/blindai/client.py in connect(addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    838     """
    839
--> 840     return BlindAiConnection(
    841         addr,
    842         unattested_server_port,

/usr/local/lib/python3.9/dist-packages/blindai/client.py in __init__(self, addr, unattested_server_port, attested_server_port, model_management_port, hazmat_manifest_path, hazmat_http_on_unattested_port, simulation_mode, use_cloud_manifest)
    613             raise
    614         except Exception as e:
--> 615             raise AttestationError("Attestation verification failed")
    616
    617     # requests (http library) takes a path to a file containing the CA

AttestationError: Attestation verification failed
This causes the connection to be refused with an AttestationError because the application code cannot be verified.
Note
To test this more thoroughly, you can download our GitHub project and modify the server code stored in the src folder very slightly. You then need to generate a new manifest.toml based on this modified source code by building the project. Once this is done, replace the path to the fake_manifest.toml in the previous example with the path to your newly generated manifest.toml. This will lead to the same error, since the server's hash received in the attestation report will not match the code in the newly generated manifest.toml.
Querying the model¶
Once connected, the transcribe() method prepares the audio file for the model and calls the BlindAI.Core run_model() method, passing it the ID of the default Whisper model and the now pre-processed input data:
res = connection.run_model(model_hash=DEFAULT_WHISPER_MODEL, input_tensors=input_data)
Then, the method extracts the text transcription from the returned object and returns it to our BlindAI API user!
Conclusions¶
We have seen:
- How to confidentially transcribe audio using the Whisper model in BlindAI API.
- How privacy is guaranteed under the hood.
Please check out the rest of our BlindAI documentation to see more examples of how you can use BlindAI to query AI models without compromising the safety of user data or models.