Introduction¶
BlindAI API is an open-source Python library that lets developers query AI models while keeping their data confidential. We use secure enclaves, hardware-based isolated environments, to encrypt the data end-to-end. Everything that happens in the enclave stays in the enclave - hence our name, BlindAI!
Key Features¶
- Multiple Large Language Models (LLMs) provided out of the box
- Simple and fast Python API to use the service
- Data and model protected by hardware security
Quick start guide¶
This guide will show you how to install and use BlindAI API to transcribe an audio file with privacy guarantees, using the popular Whisper model. You can check our tutorials, concept guides and how-to guides to learn more about how we keep your data private every step of the way.
Installation¶
Before we begin, you will need to install the blindai Python library and download a test audio file. The one we'll use as an example was provided by the University of Illinois at Chicago as part of their Computer Science 101 course.
# install blindai
!pip install blindai
# Download our example audio file and save it as `taunt.wav`
!wget https://www2.cs.uic.edu/~i101/SoundFiles/taunt.wav -O taunt.wav
Transcribing the audio file¶
Speech-to-text operations are handled by the Audio.transcribe method of the blindai.api module.
To transcribe an audio file, we call the transcribe method and pass the path to our audio file as the file argument. The Whisper model accepts a variety of audio file formats (m4a, mp3, mp4, mpeg, mpga, wav, webm).
import blindai
transcript = blindai.api.Audio.transcribe("taunt.wav")
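Since only the formats listed above are accepted, you may want to check a file's extension locally before uploading it. A minimal sketch (the helper name and format set below are our own, not part of the BlindAI API):

```python
from pathlib import Path

# Audio formats the Whisper model accepts, per the guide above.
SUPPORTED_FORMATS = {".m4a", ".mp3", ".mp4", ".mpeg", ".mpga", ".wav", ".webm"}

def is_supported_audio(path: str) -> bool:
    """Return True if the file extension is one Whisper accepts."""
    return Path(path).suffix.lower() in SUPPORTED_FORMATS

print(is_supported_audio("taunt.wav"))   # True
print(is_supported_audio("notes.flac"))  # False
```

This only inspects the extension, not the file contents, so it is a convenience check rather than a guarantee the upload will succeed.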
We can print out the returned transcript string variable to check that our results are as expected.
print(transcript)
Now go away, or I shall taunt you a second timer!
Our audio file has been correctly transcribed!
Feel free to test this out with your own audio files 🔊
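If you have several recordings, you can wrap the call in a small loop. A sketch, assuming the same transcribe API as above (the transcribe_folder helper is our own; it simply passes each matching file path to whatever transcription function you hand it, such as blindai.api.Audio.transcribe):

```python
from pathlib import Path

def transcribe_folder(folder: str, transcribe, pattern: str = "*.wav") -> dict:
    """Transcribe every file in `folder` matching `pattern` with the given
    transcription function, returning a {filename: transcript} mapping."""
    results = {}
    for audio in sorted(Path(folder).glob(pattern)):
        results[audio.name] = transcribe(str(audio))
    return results

# Usage with BlindAI (requires the library to be installed):
# import blindai
# transcripts = transcribe_folder("recordings", blindai.api.Audio.transcribe)
```

Passing the transcription function as a parameter also makes the helper easy to test with a stub before sending real data to the service.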
There is a lot more you can explore about BlindAI. More models are coming soon (for example OpenChatKit, the open-source version of ChatGPT). Historically, BlindAI was a solution for AI engineers to upload their own models to an enclave, so we cover that part as well in the BlindAI Core tutorials.
How to verify BlindAI's security features¶
A beta version of the BlindAI Core solution has been independently audited by Quarkslab SAS. The audit report is coming soon!
- This audit does not cover the client-side SDK, the BlindAI API or Nitro enclaves.
- Some information-level security weaknesses were identified, such as insufficient error handling or the inclusion of dependencies that are no longer officially maintained. See the audit report for an exhaustive list.
Our source code is open source: you can inspect it yourself on our GitHub page. We aim to provide clear explanations of the technologies behind BlindAI. You can get started with an introduction to confidential computing, which explains the key concepts behind BlindAI, or move on to more advanced explanations in our security section.
A guide will help you verify one of our security features yourself: the check, performed during the attestation process, that the application code has not been modified. We explain what this feature is and how it works here.