
Step-by-Step Guide to Building QA Systems Using Transformers
Last updated: Jul 01, 2025
In this blog, we walk step by step through building a question answering (QA) system using transformers.
QA systems built on transformers are widely used in natural language processing. They have a wide range of applications and return answers to questions in a human-understandable form.
Such a system can be implemented with several methods and mechanisms. We will discuss each mechanism in detail. Let's get started!
Table of contents
- RNN and LSTM
- Long Live Transformers
- Five steps to understand the mechanism of transformers
- What is BERT?
- Why BERT?
- What are Longformers?
- Steps in Building QA Systems Using Transformers
- CODE FOR BUILDING A QUESTION ANSWERING SYSTEM USING TRANSFORMERS:
- CONCLUSION
RNN and LSTM
A Recurrent Neural Network (RNN) is a generalization of a feedforward neural network that has an internal memory. An RNN uses its internal state (memory) to process a sequence of inputs.
Long short-term memory (LSTM) networks are a modified version of recurrent neural networks that makes it easier to retain past data in memory. They mitigate the vanishing gradient problem. LSTM is well suited to classifying, processing, and predicting time series with time lags of unknown duration. The model is trained using backpropagation.
For the purposes of this discussion, RNNs and LSTMs are almost identical in their core properties:
- Sequential processing: sentences must be processed word by word.
- Past information retained through past hidden states: sequence models follow the Markov property, so each state is assumed to depend only on the previous state.
The first property is the reason why RNNs and LSTMs can't be trained in parallel. The second means that information from earlier words survives only through previously computed hidden states, so it fades over long sequences. One way people mitigated this problem is to work with bidirectional models, which encode the same sentence from two directions, from start to end and from end to start.
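To make the sequential bottleneck concrete, here is a minimal sketch of a one-unit toy RNN in plain Python. The function names and the weights (`w_in`, `w_hidden`) are made up for illustration and are not taken from any library; the point is only that each hidden state needs the previous one, and that a bidirectional model simply runs the same loop in both directions.

```python
import math


def rnn_forward(inputs, w_in=0.5, w_hidden=0.8, h0=0.0):
    """Run a one-unit toy RNN and return the hidden state after every step."""
    hidden = h0
    states = []
    for x in inputs:
        # Each step needs the previous hidden state, so the steps cannot run in parallel.
        hidden = math.tanh(w_in * x + w_hidden * hidden)
        states.append(hidden)
    return states


def bidirectional_states(inputs):
    """Pair a left-to-right pass with a right-to-left pass over the same inputs."""
    forward = rnn_forward(inputs)
    # Reverse the inputs, run the same loop, then put the states back in reading order.
    backward = rnn_forward(inputs[::-1])[::-1]
    return list(zip(forward, backward))
```

Calling `bidirectional_states([1.0, 0.0, -1.0])` returns one (forward, backward) pair per input position, so every position carries context from both sides of the sentence.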
Long Live Transformers
The paper "Attention Is All You Need" describes the Transformer, which is what is called a sequence-to-sequence (seq2seq) architecture. Seq2seq is a neural net that transforms a given sequence into another sequence for a specific task. The most famous application of seq2seq models is translation, where a sequence of words in one language is transformed into a sequence of words in another language. Before transformers, a popular choice for this type of model was the Long Short-Term Memory-based model.
Transformer models were born to solve these problems of LSTM. The attention mechanism replaces the recurrent mechanism.
Five steps to understand the mechanism of transformers
Self-attention is the core idea of transformers. It is an attention mechanism that relates different positions of a single sequence in order to compute a representation of that sequence.
FIRST STEP:
For each word, we create three vectors: a query Q, a key K, and a value V. These vectors are created by multiplying the word's embedding by three weight matrices (WQ, WK, WV).
SECOND STEP:
We calculate a score for each pair of words by taking the dot product of the query vector of the current word with the key vector of every word in the sentence.
THIRD STEP:
We divide each score by the square root of the dimension of the key vector. Then we apply the softmax function to determine how much each word will be expressed at this position.
FOURTH STEP:
We multiply each value vector by its softmax score to keep the important related words and drown out the others.
FINAL STEP:
We sum the weighted value vectors to get the attention vector Z for a word. Repeating these steps for every word gives the attention matrix for the whole sentence.
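The steps above can be sketched in a few lines of plain Python. This is a minimal single-head illustration that uses only the standard library; the helper names (`matmul`, `softmax`, `self_attention`) are chosen here for readability, and a real model would use learned weight matrices and a tensor library instead.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(embeddings, w_q, w_k, w_v):
    """Return the attention output Z and the attention weights for one sentence."""
    # Project every word embedding into query, key and value vectors.
    q = matmul(embeddings, w_q)
    k = matmul(embeddings, w_k)
    v = matmul(embeddings, w_v)
    # Score every query against every key with a dot product (Q times K transposed).
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    # Scale by the square root of the key dimension, then softmax each row.
    d_k = len(k[0])
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Weight the value vectors and sum them, giving one output vector per word.
    z = matmul(weights, v)
    return z, weights
```

Each row of `weights` sums to 1 and says how strongly one word attends to every word in the sentence, including itself.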
What is BERT?
BERT, which stands for Bidirectional Encoder Representations from Transformers, was developed by researchers at Google in 2018. It is based on the Transformer, a deep learning model in which every output element is connected to every input element and the weightings between them are dynamically calculated based on their connection.
Why BERT?
BERT helps a search engine understand the significance of small connecting words like 'to' and 'for' in the keywords used.
For the Question Answering System, BERT takes two inputs, the question and the passage, as a single packed sequence. We then fine-tune the model so that its output marks the start and end of the answer span within the passage.
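As a rough illustration of the packed sequence, the sketch below joins a question and a passage using BERT's `[CLS]` and `[SEP]` marker tokens and then picks an answer span from per-token start and end scores. The function names are hypothetical, the tokens are plain words rather than real WordPiece tokens, and the scores in any usage would come from a fine-tuned model, not from this code.

```python
def pack_question_and_passage(question_tokens, passage_tokens):
    """Build the single packed sequence and its segment ids (0 = question, 1 = passage)."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + passage_tokens + ["[SEP]"]
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(passage_tokens) + 1)
    return tokens, segment_ids


def best_answer_span(start_scores, end_scores, passage_offset, max_answer_len=15):
    """Pick the passage span (start <= end) with the highest start + end score.

    Assumes a non-empty passage; the final [SEP] token is never part of a span.
    """
    best = None
    last_passage_index = len(start_scores) - 1  # index of the closing [SEP]
    for start in range(passage_offset, last_passage_index):
        for end in range(start, min(start + max_answer_len, last_passage_index)):
            score = start_scores[start] + end_scores[end]
            if best is None or score > best[0]:
                best = (score, start, end)
    return best[1], best[2]
```

The passage starts at index `len(question_tokens) + 2` in the packed sequence, and the answer text is recovered as `tokens[start:end + 1]`.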
What are Longformers?
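Transformer-based language models have been leading the NLP benchmarks lately. Models like BERT and RoBERTa have been state-of-the-art for a while. However, one major drawback of these models is that they cannot attend to long sequences: full self-attention compares every token with every other token, so its cost grows quadratically with sequence length, and BERT is limited to 512 tokens.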
To overcome these long sequence issues, the Longformer essentially combines several attention patterns:
1. SLIDING WINDOW:
The name speaks for itself. In this approach, we choose a fixed window size w, and each token in the sequence attends only to w neighbouring tokens (typically w/2 to the left and w/2 to the right).
2. DILATED SLIDING WINDOW:
Here the window has gaps: with a dilation of 2, for example, we skip one word between each pair of attended words. The idea is to cover a much wider span of the sequence with the same number of attended tokens, so information spreads faster across the layers without increasing the model's computation.
3. GLOBAL ATTENTION (full self-attention):
Consider a QA task. In the case of Longformer, we can give all the question tokens a global attention pattern, i.e., they attend to all the other tokens in the sequence, and all the other tokens attend to them.
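The three patterns can be pictured as a boolean attention mask. The sketch below is only an illustration in plain Python: `longformer_mask` and its parameters are names invented for this example, and the real Longformer implementation uses efficient banded computations rather than a full seq_len x seq_len matrix.

```python
def longformer_mask(seq_len, window, dilation=1, global_positions=()):
    """Return a seq_len x seq_len matrix; mask[i][j] is True when token i may attend to token j."""
    half = window // 2
    is_global = set(global_positions)
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            offset = abs(i - j)
            # Sliding window: reach half the window to each side, stepping by the dilation.
            # With dilation=1 this is the plain sliding window.
            in_window = offset <= half * dilation and offset % dilation == 0
            # Global tokens attend to every token, and every token attends to them.
            mask[i][j] = in_window or i in is_global or j in is_global
    return mask
```

For a QA input, `global_positions` would hold the indices of the question tokens, while the passage tokens keep the cheaper windowed pattern.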
Steps in Building QA Systems Using Transformers
Step 1. Install Anaconda.
Step 2. Create an Anaconda environment with Python version 3.7.
conda create -n QAS_longformer python=3.7
Step 3. Activate the environment.
conda activate QAS_longformer
Step 4. We recommend using CUDA for fast training.
# install PyTorch with CUDA support
pip install torch==1.7.0+cu101 torchvision==0.8.1+cu101 torchaudio==0.7.0 -f https://download.pytorch.org/whl/torch_stable.html
Step 5. Install the Transformers and Simple Transformers libraries.
pip install transformers
pip install simpletransformers
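Before moving on, it can help to confirm that the environment actually contains the packages installed above. The following is a small optional sketch, not part of the original setup; the function names are made up here, and it relies only on the standard library plus `torch.cuda.is_available()` when PyTorch is present.

```python
import importlib.util


def check_environment(packages=("torch", "transformers", "simpletransformers")):
    """Report which of the given top-level packages can be imported in this environment."""
    return {name: importlib.util.find_spec(name) is not None for name in packages}


def cuda_available():
    """Return True only when PyTorch is installed and can see a CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


if __name__ == "__main__":
    for name, installed in check_environment().items():
        print(f"{name}: {'installed' if installed else 'missing'}")
    print("CUDA available:", cuda_available())
```

If CUDA is reported as unavailable, the model can still be loaded on the CPU by passing `use_cuda=False`, at the cost of much slower training and prediction.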
CODE FOR BUILDING A QUESTION ANSWERING SYSTEM USING TRANSFORMERS:
import logging
import time

from simpletransformers.question_answering import QuestionAnsweringModel, QuestionAnsweringArgs

logging.basicConfig(level=logging.INFO)
transformers_logger = logging.getLogger("transformers")
transformers_logger.setLevel(logging.WARNING)

EXIT_PHRASE = "break"


def load_model():
    model_args = QuestionAnsweringArgs(overwrite_output_dir=True, doc_stride=80)
    model_args.evaluate_during_training = True
    # "./outputs" is the directory the model was saved to during training;
    # replace it with the path to your own trained model.
    return QuestionAnsweringModel("longformer", "./outputs", use_cuda=True, args=model_args)


def predictset(model):
    # Keep answering questions until the user types the exit phrase.
    while True:
        input_question = input("question: ")
        if input_question == EXIT_PHRASE:
            print("QAS: good bye!")
            break
        to_predict = [{
            "context": "<input your context here>",
            "qas": [{"question": input_question, "id": "0"}],
        }]
        start_time = time.time()
        answers, probabilities = model.predict(to_predict)
        print("--- %s seconds ---" % (time.time() - start_time))
        # answers[0]["answer"] is a list of candidate answers; print the first one.
        print(answers[0]["answer"][0])


if __name__ == "__main__":
    predictset(load_model())
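The script above loads a model that has already been trained and saved to ./outputs, but the training data itself is not shown. As a hedged sketch, assuming the SQuAD-style format that Simple Transformers expects for question answering, a training record could be built as below. The helper name `make_training_example` is made up for this example; the record extends the prediction format used above with an `answers` list and an `is_impossible` flag.

```python
def make_training_example(context, question, answer_text, example_id):
    """Build one SQuAD-style record; answer_start is the character offset of the answer."""
    answer_start = context.find(answer_text)
    if answer_start == -1:
        raise ValueError(f"answer {answer_text!r} does not occur in the context")
    return {
        "context": context,
        "qas": [
            {
                "id": str(example_id),
                "question": question,
                "is_impossible": False,
                "answers": [{"text": answer_text, "answer_start": answer_start}],
            }
        ],
    }


# Assumed usage with a QuestionAnsweringModel instance named `model`:
# train_data = [make_training_example(context, question, answer, 0)]
# model.train_model(train_data)
```

Computing `answer_start` from the context, rather than typing it by hand, avoids the off-by-one offsets that are a common source of bad training labels.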
CONCLUSION
By understanding the basics of transformers and their architecture, you can use transformers for various applications. In this blog, we discussed the steps involved in building question and answer systems using transformers, which is one of the top applications of transformers.