list indices must be integers or slices, not ListWrapper

I'm having trouble with a fairly basic model: I am unable to create a preprocessing layer that simply normalizes all features. My conceptual understanding is probably at fault. My thinking was that the input layer is a list or a dictionary of tf.keras.Input objects, which refer to the input tensors by name and indicate their shapes and dtypes. Normalization layers are built by first adapting them on the training dataset, and those layers can be collected in a list and concatenated. Once the input layer is defined, the preprocessing layer takes it as input and passes its results downstream. Each item in the input list is a symbolic representation of the tensors that will flow through the model, and each normalizer will receive the right tensors by virtue of having been adapted on that feature.

The error I get is as follows:

TypeError                                 Traceback (most recent call last)                                                                                                 
Input In [11], in <cell line: 63>()                                                                                                                                         
     59 concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)                                                                                
     61 #outputs = concatenated_preprocessing_layer(input_layer.values())                                                                                                   
---> 63 outputs = concatenated_preprocessing_layer(all_inputs)                                                                                                              
                                                                                                                                                                            
File ~/.pyenv/versions/3.8.5/lib/python3.8/site-packages/keras/utils/traceback_utils.py:67, in filter_traceback.<locals>.error_handler(*args, **kwargs)                     
     65 except Exception as e:  # pylint: disable=broad-except                                                                                                              
     66   filtered_tb = _process_traceback_frames(e.__traceback__)                                                                                                          
---> 67   raise e.with_traceback(filtered_tb) from None                                                                                                                     
     68 finally:                                                                                                                                                            
     69   del filtered_tb                                                                                                                                                   
                                                                                                                                                                            
File ~/.pyenv/versions/3.8.5/lib/python3.8/site-packages/keras/layers/merge.py:509, in Concatenate.build(self, input_shape)                                                 
    507 shape_set = set()                                                                                                                                                   
    508 for i in range(len(reduced_inputs_shapes)):                                                                                                                         
--> 509   del reduced_inputs_shapes[i][self.axis]                                                                                                                           
    510   shape_set.add(tuple(reduced_inputs_shapes[i]))                                                                                                                    
    512 if len(shape_set) != 1:                                                                                                                                             
                                                                                                                                                                            
TypeError: list indices must be integers or slices, not ListWrapper       

And the code is as follows:

import tensorflow as tf                                                                                                                                                     
filepath='./taxi_data.csv'                                                                                                                                                  
CSV_COLUMNS = [                                                                                                                                                             
    'fare_amount',                                                                                                                                                          
    'pickup_datetime',                                                                                                                                                      
    'pickup_longitude',                                                                                                                                                     
    'pickup_latitude',                                                                                                                                                      
    'dropoff_longitude',                                                                                                                                                    
    'dropoff_latitude',                                                                                                                                                     
    'passenger_count',                                                                                                                                                      
    'key',                                                                                                                                                                  
]                                                                                                                                                                           
LABEL_COLUMN = 'fare_amount'                                                                                                                                                
STRING_COLS = ['pickup_datetime']                                                                                                                                           
NUMERIC_COLS = ['pickup_longitude', 'pickup_latitude',                                                                                                                      
                'dropoff_longitude', 'dropoff_latitude',                                                                                                                    
                'passenger_count']                                                                                                                                          
DEFAULTS = [[0.0], ['na'], [0.0], [0.0], [0.0], [0.0], [0.0], ['na']]                                                                                                       
DAYS = ['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']                                                                                                                    
                                                                                                                                                                            
                                                                                                                                                                            
                                                                                                                                                                            
def map_features_and_labels(row, target_name):                                                                                                                              
    label = row.pop(target_name)                                                                                                                                            
    row.pop('key')                                                                                                                                                          
    row.pop('pickup_datetime')                                                                                                                                              
    return row, label                                                                                                                                                       
                                                                                                                                                                            
                                                                                                                                                                            
def create_dataset(filepath, target_name, batch_size=1, mode=tf.estimator.ModeKeys.EVAL, CSV_COLUMNS=None, column_defaults=None  ):                                         
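    # Use the function's own arguments instead of the module-level DEFAULTS and a hard-coded batch size
    dataset = tf.data.experimental.make_csv_dataset(file_pattern=filepath, column_names=CSV_COLUMNS, column_defaults=column_defaults, num_epochs=1, batch_size=batch_size)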
    dataset = dataset.map(lambda X: map_features_and_labels(X, target_name))                                                                                                
    if mode == tf.estimator.ModeKeys.TRAIN:                                                                                                                                 
        dataset = dataset.shuffle(buffer_size=1000).repeat()                                                                                                                
    return dataset                                                                                                                                                          
                                                                                                                                                                            
train_ds = create_dataset(filepath, target_name=LABEL_COLUMN, batch_size=1, CSV_COLUMNS=CSV_COLUMNS, column_defaults=DEFAULTS  )                                            
                                                                                                                                                                            
#The input layer is  usually a dictionary of feature_name: Input object   
input_layer = {                                                                                                                                                             
    'pickup_longitude': tf.keras.Input(shape=(0,), name='pickup_longitude', dtype=tf.dtypes.float32),                                                                       
    'pickup_latitude': tf.keras.Input(shape=(0,), name='pickup_latitude', dtype=tf.dtypes.float32),                                                                         
    'dropoff_longitude': tf.keras.Input(shape=(0,), name='dropoff_longitude', dtype=tf.dtypes.float32),                                                                     
    'dropoff_latitude': tf.keras.Input(shape=(0,), name='dropoff_latitude', dtype=tf.dtypes.float32),                                                                       
    'passenger_count': tf.keras.Input(shape=(0,), name='passenger_count', dtype=tf.dtypes.float32),                                                                         
}                                                                                                                                                                           
                                                                                                                                                                            
                                                                                                                                                                            
preprocessing_layers = []                                                                                                                                                   
all_inputs = []                                                                                                                                                             
for column in NUMERIC_COLS:                                                                                                                                                 
    feature_ds = train_ds.map(lambda X, y: X[column])                                                                                                                       
    normalizer = tf.keras.layers.Normalization(axis=None)                                                                                                                   
    normalizer.adapt(feature_ds)                                                                                                                                            
    preprocessing_layers.append(normalizer)                                                                                                                                 
    all_inputs.append(tf.keras.Input(shape=(0,), name=column, dtype=tf.dtypes.float32, ))                                                                                   
concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)                                                                                        
                                                                                                                                                                            
#outputs = concatenated_preprocessing_layer(input_layer.values())                                                                                                           
                                                                                                                                                                            
outputs = concatenated_preprocessing_layer(all_inputs)              

And here is some of the data in the taxi_data.csv file

17,2014-10-25 21:39:42 UTC,-73.978713,40.78303,-74.008102,40.73881,2,unused                                                                                                 
14.9,2012-08-22 12:01:00 UTC,-73.987667,40.728747,-74.003272,40.715202,2,unused                                                                                             
21.5,2013-12-18 23:26:12 UTC,-74.008969,40.716853,-73.97688,40.780289,2,unused                                                                                              
23.5,2014-10-04 21:58:00 UTC,-73.954153,40.806257,-74.00343,40.731867,2,unused                                                                                              
34.3,2012-12-17 15:23:00 UTC,-73.866917,40.770342,-73.968872,40.757482,2,unused                                                                                             
16.1,2009-09-24 17:37:31 UTC,-73.967549,40.762828,-73.97961,40.723133,2,unused                                                                                              
17.3,2010-04-26 20:52:36 UTC,-73.981381,40.749913,-73.966612,40.691132,2,unused                                                                                             
35,2014-08-13 20:16:00 UTC,-73.866107,40.771245,-74.013987,40.676437,2,unused                                                                                               
17.3,2010-12-30 17:55:00 UTC,-73.997803,40.725982,-73.982382,40.772225,2,unused    

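I was able to get this to work. As I suspected, my conceptual understanding was the issue. Specifically, I wasn't hooking up each Input (input_placeholder) to its normalizer: the normalizer has to be called on the Input, and it is the resulting tensors, not the layers themselves, that get concatenated. I also switched from the tf.keras.layers.Concatenate class, whose first constructor argument is the axis, to the functional tf.keras.layers.concatenate, which takes the list of tensors directly, and changed the Input shape from (0,) to (1,). The modified code is below: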

preprocessing_layers = []                                                                                                                                                   
all_inputs = []                                                                                                                                                             
for column in NUMERIC_COLS:                                                                                                                                                 
    normalizer = get_normalization_layer(column, train_ds)                                                                                                                  
    input_placeholder = tf.keras.Input(shape=(1,), name=column, dtype=tf.dtypes.float32, )                                                                                  
    encoded_feature = normalizer(input_placeholder)                                                                                                                         
    preprocessing_layers.append(encoded_feature)                                                                                                                            
    all_inputs.append(input_placeholder)                                                                                                                                    
concatenated_preprocessing_layer = tf.keras.layers.concatenate(preprocessing_layers)                                                                                        
                                                                                                                                                                            
#outputs = concatenated_preprocessing_layer(input_layer.values())                                                                                                           
                                                                                                                                                                            
preprocessing_new_model = tf.keras.Model(inputs=all_inputs, outputs=concatenated_preprocessing_layer)                                                                       
preprocessing_new_model(train_features) 
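
The snippet above calls get_normalization_layer and train_features without showing them. Below is a minimal, self-contained sketch of the same pattern, under stated assumptions: the get_normalization_layer body is a guess that wraps the adapt steps from the original loop, the data is synthetic rather than read from taxi_data.csv, and the inputs are kept in a dict keyed by column name. It is meant as an illustration, not as the exact code that was run.

```python
import numpy as np
import tensorflow as tf

NUMERIC_COLS = ['pickup_longitude', 'pickup_latitude',
                'dropoff_longitude', 'dropoff_latitude',
                'passenger_count']

# Assumption: synthetic data standing in for taxi_data.csv.
rng = np.random.default_rng(0)
features = {col: rng.normal(size=(32,)).astype('float32') for col in NUMERIC_COLS}
labels = rng.normal(size=(32,)).astype('float32')
train_ds = tf.data.Dataset.from_tensor_slices((features, labels)).batch(8)


def get_normalization_layer(name, dataset):
    """Hypothetical helper: adapt a Normalization layer on a single column."""
    normalizer = tf.keras.layers.Normalization(axis=None)
    feature_ds = dataset.map(lambda x, y: x[name])  # keep only this column
    normalizer.adapt(feature_ds)                    # learn mean and variance
    return normalizer


all_inputs = {}
encoded_features = []
for column in NUMERIC_COLS:
    # One scalar per example, hence shape=(1,)
    input_placeholder = tf.keras.Input(shape=(1,), name=column, dtype='float32')
    normalizer = get_normalization_layer(column, train_ds)
    # Calling the layer on the Input yields a symbolic tensor
    encoded_features.append(normalizer(input_placeholder))
    all_inputs[column] = input_placeholder

# The axis goes to the constructor; the tensors go to the call
outputs = tf.keras.layers.Concatenate(axis=-1)(encoded_features)
preprocessing_model = tf.keras.Model(inputs=all_inputs, outputs=outputs)

# Feed one (batch, 1) array per named input
train_features = {col: vals.reshape(-1, 1) for col, vals in features.items()}
result = preprocessing_model(train_features)
print(result.shape)
```

The result has one row per example and one normalized column per feature, in the order of NUMERIC_COLS.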


Solution 1:[1]

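The constructor of tf.keras.layers.Concatenate takes the axis, not the tensors to concatenate; the tensors are passed when the layer is called. As you have used

concatenated_preprocessing_layer = tf.keras.layers.Concatenate(preprocessing_layers)

the list of normalizers is stored as the axis, and building the layer fails with the TypeError above. Instead, call each normalizer on its Input and then call the Concatenate layer on the list of resulting tensors:

encoded_features = [normalizer(inp) for normalizer, inp in zip(preprocessing_layers, all_inputs)]
outputs = tf.keras.layers.Concatenate()(encoded_features)

Each Input should also be declared with shape=(1,) rather than shape=(0,), since every feature is a single scalar per example.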

Please refer to this working gist for your reference.
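
To see where the error message comes from, the failing line in the traceback (del reduced_inputs_shapes[i][self.axis]) can be mimicked in plain Python. This is only a sketch: the shapes and the list used as axis are made-up stand-ins, and no TensorFlow is involved.

```python
# Stand-ins for what Concatenate.build works with (values are illustrative).
input_shapes = [[None, 1], [None, 1]]     # one shape list per input tensor
axis = ["normalizer_a", "normalizer_b"]   # a list where an integer axis is expected

try:
    # Same operation as the line flagged in keras/layers/merge.py
    del input_shapes[0][axis]
    message = None
except TypeError as err:
    message = str(err)

print(message)
```

Plain Python reports the offending type as list. In the real traceback the type is ListWrapper because Keras wraps list attributes assigned to a layer in its own tracking container, which is why the list passed to the constructor shows up under that name.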

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1