Phala has released the first-ever benchmark of Trusted Execution Environment (TEE)-enabled GPUs for large language model (LLM) processing, with some promising results.
They found that on a TEE-enabled Nvidia H100 GPU, a small LLM query incurred a 7% reduction in performance; for larger LLMs, however, the impact was negligible. This pattern makes sense: TEE overhead is largely a fixed cost of encrypted data movement, so it looms large over a short computation but is dwarfed by the compute time of a big model.
TEEs are integral to Phala's decentralized, verifiable, and confidential AI model processing, and these results indicate that TEE-enabled GPUs can handle LLM workloads without significantly compromising performance.