Why does the BERT model fail to find an option that matches my input positional arguments?
While attempting an NLP exercise, I tried to make use of BERT architecture to get a good training model. So I defined a function that builds and compiles the model using BERT as the layer. However, upon trying to execute the function and actually build the model, I get an error that the BERT Layer could not find an option to match my input positional arguments.
The dimensions of my positional arguments are [None, 160] but the BERT Layer seemingly expects them to be [None, None]. How do I resolve this?
To reproduce my problem:
These are the libraries I imported:
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.models import Model
import tensorflow_hub as hub
Next, I defined a function for the model as follows:
# Build and compile the model
def build_model(bert_layer, max_len = 512):
    input_word_ids = Input(shape=(max_len,), dtype=tf.int32, name="input_word_ids")
    input_mask = Input(shape=(max_len,), dtype=tf.int32, name="input_mask")
    segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
    pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
    clf_output = sequence_output[:, 0, :]
    out = Dense(1, activation='sigmoid')(clf_output)
    model = Model(inputs=[input_word_ids, input_mask, segment_ids], outputs=out)
    model.compile(Adam(lr=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    return model
Next, I downloaded the BERT architecture and instantiated the bert_layer as follows:
module_url = "https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4"
bert_layer = hub.KerasLayer(module_url, trainable=True)
Finally, I tried to build the model using the build_model function and bert_layer as seen below:
model = build_model(bert_layer, max_len=160)
model.summary()
But this returns an error which I think implies that the dimensions of my input are different from the dimensions that are required. The error is as seen below:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-42-516b88804394> in <module>
----> 1 model = build_model(bert_layer, max_len=160)
2 model.summary()
<ipython-input-41-713013238e2f> in build_model(bert_layer, max_len)
6 segment_ids = Input(shape=(max_len,), dtype=tf.int32, name="segment_ids")
7
----> 8 pooled_output, sequence_output = bert_layer([input_word_ids, input_mask, segment_ids])
9 clf_output = sequence_output[:, 0, :]
10 out = Dense(1, activation='sigmoid')(clf_output)
~\Anaconda3\lib\site-packages\tensorflow_core\python\keras\engine\base_layer.py in __call__(self, inputs, *args, **kwargs)
840 not base_layer_utils.is_in_eager_or_tf_function()):
841 with auto_control_deps.AutomaticControlDependencies() as acd:
--> 842 outputs = call_fn(cast_inputs, *args, **kwargs)
843 # Wrap Tensors in `outputs` in `tf.identity` to avoid
844 # circular dependencies.
~\Anaconda3\lib\site-packages\tensorflow_core\python\autograph\impl\api.py in wrapper(*args, **kwargs)
235 except Exception as e: # pylint:disable=broad-except
236 if hasattr(e, 'ag_error_metadata'):
--> 237 raise e.ag_error_metadata.to_exception(e)
238 else:
239 raise
ValueError: in converted code:
relative to C:\Users\Wolemercy\Anaconda3\lib\site-packages:
tensorflow_hub\keras_layer.py:237 call *
result = smart_cond.smart_cond(training,
tensorflow_core\python\framework\smart_cond.py:59 smart_cond
name=name)
tensorflow_core\python\saved_model\load.py:436 _call_attribute
return instance.__call__(*args, **kwargs)
tensorflow_core\python\eager\def_function.py:457 __call__
result = self._call(*args, **kwds)
tensorflow_core\python\eager\def_function.py:494 _call
results = self._stateful_fn(*args, **kwds)
tensorflow_core\python\eager\function.py:1822 __call__
graph_function, args, kwargs = self._maybe_define_function(args, kwargs)
tensorflow_core\python\eager\function.py:2150 _maybe_define_function
graph_function = self._create_graph_function(args, kwargs)
tensorflow_core\python\eager\function.py:2041 _create_graph_function
capture_by_value=self._capture_by_value),
tensorflow_core\python\framework\func_graph.py:915 func_graph_from_py_func
func_outputs = python_func(*func_args, **func_kwargs)
tensorflow_core\python\eager\def_function.py:358 wrapped_fn
return weak_wrapped_fn().__wrapped__(*args, **kwds)
tensorflow_core\python\saved_model\function_deserialization.py:262 restored_function_body
"\n\n".join(signature_descriptions)))
ValueError: Could not find matching function to call loaded from the SavedModel. Got:
Positional arguments (3 total):
* [<tf.Tensor 'inputs:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_1:0' shape=(None, 160) dtype=int32>, <tf.Tensor 'inputs_2:0' shape=(None, 160) dtype=int32>]
* True
* None
Keyword arguments: {}
Expected these arguments to match one of the following 4 option(s):
Option 1:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* False
* None
Keyword arguments: {}
Option 2:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* False
* None
Keyword arguments: {}
Option 3:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='inputs/input_mask')}
* True
* None
Keyword arguments: {}
Option 4:
Positional arguments (3 total):
* {'input_type_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_type_ids'), 'input_word_ids': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_word_ids'), 'input_mask': TensorSpec(shape=(None, None), dtype=tf.int32, name='input_mask')}
* True
* None
Keyword arguments: {}
My expectation was that the model would be compiled successfully. Instead, I got this error.
Solution 1:[1]
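First of all, you need the BERT preprocessor. The error is not really about the sequence length, since a (None, None) spec accepts a (None, 160) tensor. As the options listed in the error show, version 4 of this encoder expects a single dict with the keys input_word_ids, input_mask and input_type_ids rather than a list of three tensors, and the preprocessor produces exactly that dict:
import tensorflow_text  # registers the custom ops the preprocessing model needs
bert_preprocessor = hub.load("https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3")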
This gives you input_word_ids, input_mask and input_type_ids (the question's segment_ids); you simply pass your text to bert_preprocessor.
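To make those three inputs concrete, here is a plain-Python sketch of what packing a single text segment amounts to. It is only an illustration, not the preprocessor's actual code: `pack_single_segment` is a hypothetical helper, the special-token IDs (0, 101, 102) are assumed to be the ones used by the uncased English BERT vocabulary, and the two token IDs in the example are placeholders.

```python
# Special-token IDs assumed for the uncased English BERT vocabulary.
PAD_ID, CLS_ID, SEP_ID = 0, 101, 102


def pack_single_segment(token_ids, seq_length):
    """Illustrate, in plain Python, how one segment becomes the three inputs.

    Assumes seq_length >= 2 so that [CLS] and [SEP] always fit.
    """
    # Leave room for [CLS] and [SEP], truncating the segment if needed.
    body = token_ids[: seq_length - 2]
    word_ids = [CLS_ID] + body + [SEP_ID]
    n_real = len(word_ids)
    n_pad = seq_length - n_real
    return {
        # Token IDs, padded on the right up to seq_length.
        "input_word_ids": word_ids + [PAD_ID] * n_pad,
        # 1 for real tokens, 0 for padding.
        "input_mask": [1] * n_real + [0] * n_pad,
        # A single segment uses type 0 everywhere.
        "input_type_ids": [0] * seq_length,
    }


packed = pack_single_segment([7592, 2088], seq_length=6)
print(packed["input_word_ids"])  # [101, 7592, 2088, 102, 0, 0]
print(packed["input_mask"])      # [1, 1, 1, 1, 0, 0]
```

Every value has length seq_length, which is why the encoder can be fed a fixed-size batch regardless of how long the original texts were.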
Then add your BERT model as a KerasLayer (with trainable=True, as in the question, so that its weights are fine-tuned):
bert_model = hub.KerasLayer("https://tfhub.dev/tensorflow/bert_en_uncased_L-24_H-1024_A-16/4", trainable=True)
As for fine-tuning your model:
def bert_funtional_API(seq_length):
    # One string input per text segment (a single segment here)
    text_input = [tf.keras.layers.Input(shape=(), dtype=tf.string)]
    # Tokenize each segment, then pack the tokens into fixed-length encoder inputs
    tokenize = hub.KerasLayer(bert_preprocessor.tokenize)
    tokenized_inputs = [tokenize(segment) for segment in text_input]
    bert_pack_inputs = hub.KerasLayer(bert_preprocessor.bert_pack_inputs,
                                      arguments=dict(seq_length=seq_length))
    encoder_inputs = bert_pack_inputs(tokenized_inputs)
    # The encoder takes the dict of inputs and returns a dict of outputs
    bert_output = bert_model(encoder_inputs)
    pooled_output = bert_output['pooled_output']
    sequence_output = bert_output['sequence_output']
    # Classify from the [CLS] token representation, as in the question
    clf_output = sequence_output[:, 0, :]
    output = Dense(1, activation='sigmoid')(clf_output)
    model = Model(inputs=text_input, outputs=output)
    model.compile(Adam(learning_rate=1e-5), loss='binary_crossentropy', metrics=['accuracy'])
    return model
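If you would rather keep the three Input layers from the question and feed already-tokenized IDs, a smaller change may be enough: pass the inputs to the layer as one dict and read the outputs by key instead of unpacking a tuple. This is a minimal sketch, assuming the encoder accepts the key names shown in the error message; `to_encoder_inputs` is a hypothetical helper, not part of TensorFlow Hub, and the values in the example are stand-ins.

```python
# Key names copied from the signatures listed in the error message.
ENCODER_KEYS = ("input_word_ids", "input_mask", "input_type_ids")


def to_encoder_inputs(input_word_ids, input_mask, segment_ids):
    """Bundle the three inputs into the dict structure the encoder expects.

    Note that the segment IDs go under the key "input_type_ids".
    """
    return {
        "input_word_ids": input_word_ids,
        "input_mask": input_mask,
        "input_type_ids": segment_ids,
    }


# Inside build_model the call would then look like this (not run here,
# because it needs TensorFlow and the downloaded encoder):
#
#   outputs = bert_layer(to_encoder_inputs(input_word_ids, input_mask, segment_ids))
#   clf_output = outputs["sequence_output"][:, 0, :]

# Stand-in values that show the resulting structure without TensorFlow.
example = to_encoder_inputs([[101, 7592, 102]], [[1, 1, 1]], [[0, 0, 0]])
print(sorted(example))
```

With this variant you are responsible for producing the token IDs, mask and type IDs yourself, whereas the preprocessing model above does that from raw text.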
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |