Question Answering with a fine-tuned BERT
BERT has a unique way of processing the tokenized inputs. From the above screenshot, we can see two special tokens, [CLS] and [SEP]. The [CLS] token stands for "classification": its final hidden state is used to represent the entire sequence in sentence-level classification tasks. The other special token, [SEP], is used to separate the two pieces of text. You can see two [SEP] tokens in the above screenshot, one after the question and another after the text.
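The packing described above can be sketched in plain Python, without any library, just to make the input layout concrete (the helper name `pack_qa_pair` and the example tokens are illustrative, not from the original):

```python
def pack_qa_pair(question_tokens, context_tokens):
    """Arrange a question/context pair in BERT's expected input layout:
    [CLS] question [SEP] context [SEP]."""
    return ["[CLS]"] + question_tokens + ["[SEP]"] + context_tokens + ["[SEP]"]

tokens = pack_qa_pair(
    ["who", "wrote", "hamlet", "?"],
    ["hamlet", "was", "written", "by", "shakespeare"],
)
print(tokens)
# → ['[CLS]', 'who', 'wrote', 'hamlet', '?', '[SEP]',
#    'hamlet', 'was', 'written', 'by', 'shakespeare', '[SEP]']
```

In practice a tokenizer such as Hugging Face's `BertTokenizer` performs this packing for you when given two text inputs, but the resulting sequence has exactly this shape.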
Apart from the "Token Embeddings", BERT internally also uses "Segment Embeddings" and "Position Embeddings". Segment embeddings help BERT differentiate the question from the text. In practice, we use a vector of 0s for tokens belonging to the first sentence (the question) and a vector of 1s for tokens belonging to the second sentence (the text).
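A minimal sketch of how those segment ids (often called `token_type_ids`) are derived from the packed sequence, assuming the convention above that everything up to and including the first [SEP] belongs to the question:

```python
def segment_ids(tokens):
    """Return 0 for the question segment (up to and including the first
    [SEP]) and 1 for the remaining context segment."""
    first_sep = tokens.index("[SEP]")
    return [0] * (first_sep + 1) + [1] * (len(tokens) - first_sep - 1)

tokens = ["[CLS]", "who", "wrote", "hamlet", "?", "[SEP]",
          "hamlet", "was", "written", "by", "shakespeare", "[SEP]"]
print(segment_ids(tokens))
# → [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
```

These ids index into the segment embedding table, so every token in the question receives one learned vector and every token in the text receives another, which is what lets BERT tell the two pieces apart.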