
Using BERT for next sentence prediction


Google's BERT is pretrained on next sentence prediction tasks, but I'm wondering if it's possible to call the next sentence prediction function on new data.



The idea is: given sentence A and given sentence B, I want a probabilistic label for whether or not sentence B follows sentence A. BERT is pretrained on a huge set of data, so I was hoping to use this next sentence prediction on new sentence data. I can't seem to figure out if this next sentence prediction function can be called and if so, how. Thanks for your help!










Tags: tensorflow, deep-learning, nlp, reproducible-research, natural-language-processing






asked Mar 11 at 22:29 by Paul






















          1 Answer
Hugging Face has already implemented this for you: https://github.com/huggingface/pytorch-pretrained-BERT/blob/master/pytorch_pretrained_bert/modeling.py#L854



class BertForNextSentencePrediction(BertPreTrainedModel):
    """BERT model with next sentence prediction head.
    This module comprises the BERT model followed by the next sentence classification head.

    Params:
        config: a BertConfig class instance with the configuration to build a new model.

    Inputs:
        `input_ids`: a torch.LongTensor of shape [batch_size, sequence_length]
            with the word token indices in the vocabulary (see the tokens preprocessing logic in the scripts
            `extract_features.py`, `run_classifier.py` and `run_squad.py`)
        `token_type_ids`: an optional torch.LongTensor of shape [batch_size, sequence_length] with the token
            types indices selected in [0, 1]. Type 0 corresponds to a `sentence A` and type 1 corresponds to
            a `sentence B` token (see BERT paper for more details).
        `attention_mask`: an optional torch.LongTensor of shape [batch_size, sequence_length] with indices
            selected in [0, 1]. It's a mask to be used if the input sequence length is smaller than the max
            input sequence length in the current batch. It's the mask that we typically use for attention when
            a batch has varying length sentences.
        `next_sentence_label`: next sentence classification loss: torch.LongTensor of shape [batch_size]
            with indices selected in [0, 1].
            0 => next sentence is the continuation, 1 => next sentence is a random sentence.

    Outputs:
        if `next_sentence_label` is not `None`:
            Outputs the total_loss which is the sum of the masked language modeling loss and the next
            sentence classification loss.
        if `next_sentence_label` is `None`:
            Outputs the next sentence classification logits of shape [batch_size, 2].

    Example usage:
    ```python
    # Already been converted into WordPiece token ids
    input_ids = torch.LongTensor([[31, 51, 99], [15, 5, 0]])
    input_mask = torch.LongTensor([[1, 1, 1], [1, 1, 0]])
    token_type_ids = torch.LongTensor([[0, 0, 1], [0, 1, 0]])

    config = BertConfig(vocab_size_or_config_json_file=32000, hidden_size=768,
        num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072)

    model = BertForNextSentencePrediction(config)
    seq_relationship_logits = model(input_ids, token_type_ids, input_mask)
    ```
    """
    def __init__(self, config):
        super(BertForNextSentencePrediction, self).__init__(config)
        self.bert = BertModel(config)
        self.cls = BertOnlyNSPHead(config)
        self.apply(self.init_bert_weights)

    def forward(self, input_ids, token_type_ids=None, attention_mask=None, next_sentence_label=None):
        _, pooled_output = self.bert(input_ids, token_type_ids, attention_mask,
                                     output_all_encoded_layers=False)
        seq_relationship_score = self.cls(pooled_output)

        if next_sentence_label is not None:
            loss_fct = CrossEntropyLoss(ignore_index=-1)
            next_sentence_loss = loss_fct(seq_relationship_score.view(-1, 2), next_sentence_label.view(-1))
            return next_sentence_loss
        else:
            return seq_relationship_score
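To get the probabilistic label the question asks for, the class above can be loaded with pretrained weights and its two-way logits turned into probabilities with a softmax. The following is only a minimal sketch, assuming the pytorch_pretrained_bert package from the linked repository is installed and the public bert-base-uncased checkpoint is acceptable; the two example sentences are placeholders.

import torch
from pytorch_pretrained_bert import BertTokenizer, BertForNextSentencePrediction

# Load the pretrained tokenizer and the model with its NSP head
# (the 'bert-base-uncased' checkpoint is downloaded on first use).
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForNextSentencePrediction.from_pretrained('bert-base-uncased')
model.eval()

sentence_a = "The cat sat on the mat."           # hypothetical sentence A
sentence_b = "It soon fell asleep in the sun."   # hypothetical candidate continuation

# Build the [CLS] A [SEP] B [SEP] input that BERT expects.
tokens_a = tokenizer.tokenize(sentence_a)
tokens_b = tokenizer.tokenize(sentence_b)
tokens = ['[CLS]'] + tokens_a + ['[SEP]'] + tokens_b + ['[SEP]']
input_ids = torch.tensor([tokenizer.convert_tokens_to_ids(tokens)])

# Segment 0 covers [CLS], sentence A and its [SEP]; segment 1 covers sentence B and the final [SEP].
token_type_ids = torch.tensor([[0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)])

with torch.no_grad():
    # No next_sentence_label is passed, so forward() returns the [batch_size, 2] logits.
    logits = model(input_ids, token_type_ids)

# Per the docstring: index 0 = "B is the actual continuation", index 1 = "B is a random sentence".
probs = torch.nn.functional.softmax(logits, dim=-1)
print("P(sentence B follows sentence A) = %.4f" % probs[0, 0].item())

Passing a next_sentence_label tensor instead would make forward() return the classification loss, which is how you would fine-tune the head on sentence pairs of your own.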





answered Mar 23 at 6:03 by Aerin




























