Friday, September 20, 2024


Inference and Fine-tuning ProtST on Intel Gaudi 2 Accelerator

Protein language models (PLMs) have emerged as powerful tools for predicting and designing protein structure and function. At the International Conference on Machine Learning 2023 (ICML), MILA and Intel Labs released ProtST, a pioneering multi-modal language model for protein design based on text prompts. In this blog post, we demonstrate how easily ProtST inference and fine-tuning can be deployed on the Intel Gaudi 2 accelerator.

Inference with ProtST

A natural inference task for ProtST is predicting a protein's subcellular location from its amino acid sequence. Common subcellular locations include the nucleus, cell membrane, cytoplasm, and mitochondria, among others, as described in greater detail in the dataset.
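As a concrete illustration, subcellular localization reduces to multi-class classification: the model scores each candidate location, and the highest-scoring one is the prediction. A minimal sketch with made-up logits (the location list here is a shortened, illustrative subset, not the dataset's full label set):

```python
# Subcellular localization as multi-class classification.
# LOCATIONS is a shortened, illustrative subset; the logits are
# made-up numbers standing in for a real model's output.
LOCATIONS = ["nucleus", "cell membrane", "cytoplasm", "mitochondria", "other"]

def predict_location(logits):
    """Return the location with the largest logit (argmax)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return LOCATIONS[best]

print(predict_location([0.1, 2.3, 0.7, -1.2, 0.0]))  # cell membrane
```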

We compare ProtST's inference performance on an NVIDIA A100 80GB PCIe GPU and an Intel Gaudi 2 accelerator using the test split of the ProtST-SubcellularLocalization dataset. This test set contains 2772 amino acid sequences, with sequence lengths ranging from 79 to 1999 amino acids.

You can reproduce our experiment using this script, where we run the model in full bfloat16 precision with batch size 1. We get an identical accuracy of 0.44 on the NVIDIA A100 and the Intel Gaudi 2, with Gaudi 2 delivering 1.76x faster inference than the A100. The wall time for a single A100 and a single Gaudi 2 is shown in the figure below.
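The shape of such a batch-size-1 evaluation loop can be sketched in plain Python. The `predict` callable below is a stand-in for the real ProtST forward pass (which would run in bfloat16 on the accelerator), and all sequences and labels are toy data:

```python
import time

def evaluate(sequences, labels, predict):
    """Batch-size-1 evaluation loop: returns (accuracy, wall time).
    `predict` stands in for the real ProtST forward pass, which would
    run in bfloat16 on the accelerator."""
    correct = 0
    start = time.perf_counter()
    for seq, label in zip(sequences, labels):
        if predict(seq) == label:
            correct += 1
    wall = time.perf_counter() - start
    return correct / len(sequences), wall

# Toy stand-in model: long sequences -> "cell membrane", else "nucleus".
seqs = ["MKT" * 30, "MKV" * 700]   # lengths 90 and 2100
labels = ["nucleus", "cell membrane"]
acc, wall = evaluate(seqs, labels,
                     lambda s: "cell membrane" if len(s) > 1000 else "nucleus")
print(f"accuracy={acc:.2f}")  # accuracy=1.00
# A cross-device speedup such as 1.76x is simply wall_a100 / wall_gaudi2.
```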

Fine-tuning ProtST

Fine-tuning the ProtST model on downstream tasks is an easy and established way to improve modeling accuracy. In this experiment, we specialize the model for binary localization, a simpler version of subcellular localization with binary labels indicating whether a protein is membrane-bound or soluble.

You can reproduce our experiment using this script. Here, we fine-tune the ProtST-ESM1b-for-sequential-classification model in bfloat16 precision on the ProtST-BinaryLocalization dataset. The table below shows model accuracy on the test split with different training hardware setups; the accuracies closely match the results published in the paper (around 92.5%).
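For readers who want to see what such a fine-tuning setup looks like in code, here is a minimal configuration sketch built on `optimum-habana`'s `GaudiTrainer`. The checkpoint and dataset names follow the ones used above; the hyperparameters, the `gaudi_config_name`, and the use of `trust_remote_code` are illustrative assumptions rather than the exact settings of the experiment.

```python
from datasets import load_dataset
from transformers import AutoModelForSequenceClassification
from optimum.habana import GaudiTrainer, GaudiTrainingArguments

# Checkpoint and dataset names follow the post; trust_remote_code is an
# assumption about how the custom ProtST classification head is packaged.
model = AutoModelForSequenceClassification.from_pretrained(
    "mila-intel/protst-esm1b-for-sequential-classification",
    trust_remote_code=True,
)
dataset = load_dataset("mila-intel/ProtST-BinaryLocalization")
# Tokenization of the amino acid sequences is omitted for brevity.

args = GaudiTrainingArguments(
    output_dir="./protst-binary-localization",
    use_habana=True,      # run on Gaudi HPUs
    use_lazy_mode=True,   # Gaudi lazy execution mode
    bf16=True,            # bfloat16 training, as in the experiment
    per_device_train_batch_size=32,  # illustrative value
    num_train_epochs=3,              # illustrative value
    gaudi_config_name="Habana/bert-base-uncased",  # placeholder Gaudi config
)

trainer = GaudiTrainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
)
trainer.train()
```

This is a configuration sketch for Gaudi hardware, not a runnable reproduction of the experiment; the provided script remains the authoritative reference.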


Conclusion

In this blog post, we have demonstrated the ease of deploying ProtST inference and fine-tuning on Gaudi 2 using Optimum for Intel Gaudi accelerators. Our results show competitive performance against the A100, with a 1.76x speedup for inference and a 2.92x speedup for fine-tuning.

Frequently Asked Questions

Q1: What is ProtST?

ProtST is a pioneering multi-modal language model for protein design based on text prompts.

Q2: What is Gaudi 2 accelerator?

Gaudi 2 is Intel's second-generation AI accelerator, designed to speed up deep learning training and inference.

Q3: Can I fine-tune ProtST on my own data?

Yes. You can fine-tune ProtST on your own data by adapting the provided fine-tuning script.

Q4: How can I deploy ProtST on Intel Gaudi 2 accelerator?

You can deploy ProtST on Intel Gaudi 2 accelerator by using Optimum for Intel Gaudi Accelerators and following the instructions provided.
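Assuming a machine with the Habana software stack already installed, setup typically starts by installing the `optimum-habana` package; the commands below are the standard PyPI and source installs, and your environment may require a specific version:

```shell
# Install Optimum for Intel Gaudi (the optimum-habana package) from PyPI.
pip install optimum-habana
# Or, for the latest development version:
pip install git+https://github.com/huggingface/optimum-habana.git
```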

Q5: What are the results of fine-tuning ProtST on Intel Gaudi 2 accelerator?

Fine-tuning on Gaudi 2 reaches accuracy closely matching the published results (around 92.5%), with a 2.92x speedup over the A100; inference on Gaudi 2 is 1.76x faster.
