Missing functionality in TensorFlow when TF-Slim is deprecated - serious issue(?)


Niclas Danielsson

unread,
Feb 25, 2019, 4:51:25 AM2/25/19
to TensorFlow Developers

Hi,


I have tried to find an answer to this question for quite some time now, but it seems like there might be a piece missing that is not addressed by the upcoming TensorFlow 2.0 release, and that piece is absolutely fundamental for many deep learning approaches.


A common use-case is fine-tuning from ImageNet. At first glance it seems like TensorFlow Hub takes over from TF-Slim in providing a model zoo for such things. However, TensorFlow Hub defines so-called "signatures" that dictate what you can do with the models available in its model zoo. This means that for a classifier you can typically only replace the last layer of the CNN. The rest of the CNN is an opaque blob whose internals you cannot access or modify.


If I am right about TensorFlow Hub, then I can list numerous use-cases where the classifier signature is completely useless:


1: You cannot train with multiple learning rates (say, 1/10th of the learning rate for the base network, and the full learning rate for the final layer).


2: You cannot extract intermediate layers from the CNN, since the accessible outputs are fixed by the signature that whoever provided the network defined for you. (This essentially means you have to do the ImageNet training yourself in order to get that level of control, which makes it pointless to use TensorFlow Hub at all.)


3: If I am interested in, say, only the first 3 layers of a CNN pretrained on ImageNet because I think they provide good initial feature extraction, and then want to design a completely different CNN for some other purpose, this is not possible with TensorFlow Hub either.


4: If I want to fine-tune more than the last layer, I cannot do that either. For a very deep CNN with 100+ layers, it might make sense to fine-tune the last 10 layers rather than only the last layer. This does not seem possible...


We use such techniques on a daily basis and REALLY need that flexibility.

It was provided by TF-Slim, but if I am right, this flexibility will no longer be provided in TensorFlow 2.0,

which puts the framework at a serious disadvantage compared to other frameworks such as PyTorch, MXNet etc.


Therefore I am hoping that I am either wrong about how TensorFlow Hub works,

or that there is an upcoming, different kind of model zoo with greater flexibility that I am not yet aware of.


I am very much looking forward to an answer about what the plan is to support such use-cases, since this seems quite urgent to solve if people are to migrate to TensorFlow 2.0 at all.

Thanks in advance,

/Niclas



Aakash Kumar Nain

unread,
Feb 25, 2019, 4:57:32 AM2/25/19
to Niclas Danielsson, TensorFlow Developers
Hi Niclas,

Except for the 1st point, everything can be done in tf.keras in a much better and cleaner way than in TF-Slim. It is also possible to achieve the 1st point in tf.keras, but I haven't really found a neat way to do it.
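One rough sketch of that 1st point in eager tf.keras (not necessarily the neat way): use two optimizers, one per learning rate, and apply each to its own variable list. The tiny two-layer "model" here is invented purely for illustration.

```python
import tensorflow as tf

# Hypothetical stand-ins for "base network" and "new head".
base = tf.keras.layers.Dense(8, name='base')
head = tf.keras.layers.Dense(2, name='head')

opt_base = tf.keras.optimizers.Adam(learning_rate=1e-4)  # 1/10th of the head LR
opt_head = tf.keras.optimizers.Adam(learning_rate=1e-3)

x = tf.random.normal([4, 16])
y = tf.zeros([4], dtype=tf.int32)

with tf.GradientTape() as tape:
    logits = head(base(x))
    loss = tf.reduce_mean(tf.keras.losses.sparse_categorical_crossentropy(
        y, logits, from_logits=True))

# One gradient computation, then split the gradients between the two
# optimizers along the boundary between the two variable lists.
grads = tape.gradient(loss, base.trainable_variables + head.trainable_variables)
n_base = len(base.trainable_variables)
opt_base.apply_gradients(zip(grads[:n_base], base.trainable_variables))
opt_head.apply_gradients(zip(grads[n_base:], head.trainable_variables))
```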

Regards,
Aakash Nain 


Niclas Danielsson

unread,
Feb 25, 2019, 5:15:01 AM2/25/19
to Aakash Kumar Nain, TensorFlow Developers

Hi Aakash,


Yes Keras seems to be a much better API than TF-Slim but that is not what the question is about.


The problem is this:

The currently available model zoo with common classifiers (Inception, ResNets, VGG16, MobileNet, etc.) is provided via the TF-Slim API in the graph formulation.


In TF 2.0 it seems obvious that defining CNNs in Keras in the eager formulation is the way to go. But there are no ImageNet-pretrained models available formulated in the "eager Keras" way, except possibly in TensorFlow Hub, and I don't see the kind of support I need in TensorFlow Hub. So I see no way to get a model zoo unless I train the models myself.


Such model zoos are available for other frameworks, since this has become a standard offering.

For instance, here is the source code for ResNet models in PyTorch defined in "eager format":

https://pytorch.org/docs/stable/_modules/torchvision/models/resnet.html#resnet18


The definitions in TensorFlow eager format can be written in a VERY similar way, with only minor naming replacements (such as inheriting from tf.keras.Model instead of from nn.Module, etc.).
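As a minimal sketch of what such a transliteration looks like (toy block with invented layer sizes, not from any model zoo):

```python
import tensorflow as tf

class TinyBlock(tf.keras.Model):  # nn.Module -> tf.keras.Model
    def __init__(self):
        super(TinyBlock, self).__init__()
        self.conv = tf.keras.layers.Conv2D(8, 3, padding='same')  # nn.Conv2d -> Conv2D
        self.bn = tf.keras.layers.BatchNormalization()            # nn.BatchNorm2d -> BatchNormalization

    def call(self, x, training=False):  # "forward" becomes "call"
        return tf.nn.relu(self.bn(self.conv(x), training=training))

# NHWC layout rather than PyTorch's NCHW.
out = TinyBlock()(tf.zeros([1, 4, 4, 3]))
```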


I was expecting a similar model zoo, so that TensorFlow is meaningful to use for developers who do not have arbitrarily huge datasets available.


So the question really is: Where is the new Model Zoo to use when TF-Slim becomes deprecated?!


BR,

/Niclas

  


From: Aakash Kumar Nain <nainaa...@gmail.com>
Sent: Monday, February 25, 2019 10:57
To: Niclas Danielsson
Cc: TensorFlow Developers
Subject: Re: Missing functionality in TensorFlow when TF-Slim is deprecated - serious issue(?)
 

Aakash Kumar Nain

unread,
Feb 25, 2019, 5:23:02 AM2/25/19
to Niclas Danielsson, TensorFlow Developers
`tf.keras` ships with a lot of pretrained models. If you go through tf.keras.applications, you will find many ImageNet-pretrained models, such as:
  • Xception
  • VGG16
  • VGG19
  • ResNet, ResNetV2, ResNeXt
  • InceptionV3
  • InceptionResNetV2
  • MobileNet
  • MobileNetV2
  • DenseNet
  • NASNet
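The usual fine-tuning pattern with these models can be sketched as follows (here with `weights=None` to skip the download; `weights='imagenet'` fetches the pretrained filters instead, and the 10-class head is an arbitrary example):

```python
import tensorflow as tf

# Load the convolutional base without the ImageNet classification head.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights=None)

# Attach a new head for a hypothetical 10-class problem.
x = tf.keras.layers.GlobalAveragePooling2D()(base.output)
outputs = tf.keras.layers.Dense(10, activation='softmax')(x)
model = tf.keras.Model(base.input, outputs)
```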
I hope this helps.

Regards,
Aakash Nain

Aakash Kumar Nain

unread,
Feb 25, 2019, 5:25:50 AM2/25/19
to Niclas Danielsson, TensorFlow Developers
If you want a more verbose list of pre-trained models, you can find it here:

Regards,
Aakash Nain

Niclas Danielsson

unread,
Feb 25, 2019, 5:49:48 AM2/25/19
to Aakash Kumar Nain, TensorFlow Developers

Hi,


This certainly looks more like what I was looking for.


However, in TF 2.0 eager mode will be the default. These models seem to still be defined in graph mode.

For instance:

https://github.com/keras-team/keras-applications/blob/master/keras_applications/vgg16.py

 

Are these models going to be available in eager formulation as well when TF 2.0 arrives, or what is the plan?


Also, obviously I did not see any reference to this when I was searching the TensorFlow/Keras documentation, but I guess this is because the framework is still at an early stage of development?

BR,

/Niclas




Aakash Kumar Nain

unread,
Feb 25, 2019, 6:02:55 AM2/25/19
to Niclas Danielsson, TensorFlow Developers
You can directly use them in TF2.0 as well. Here is a sample example that I made right away:

Yes, the documentation needs to be updated and I think the TF team is already on that. But as of now, you can look up things here:

Regards,
Aakash Nain

Niclas Danielsson

unread,
Feb 25, 2019, 6:53:59 AM2/25/19
to Aakash Kumar Nain, TensorFlow Developers

Yes, it is good that the models can be used. But they are still in a fixed graph formulation.

In eager execution you can have completely dynamic forward passes via the imperatively defined "call" function, etc.

I still do not know how to use such a graph for retraining in eager mode.


That is, following the design paradigm illustrated here:

https://www.tensorflow.org/tutorials/eager/custom_layers


Will the model zoo be available with eager models like that as well in TF 2.0? Similar to the example (from the link above):


As you can see below, this design paradigm is very similar to the PyTorch example I provided in the earlier email: if you just change nn.Module to tf.keras.Model, rename the forward function from "forward" to "call", and replace the PyTorch layer class definitions with their TensorFlow Keras equivalents, you get exactly the way of writing models for fine-tuning etc. that I am looking for. So the question is: will these also be available?


class ResnetIdentityBlock(tf.keras.Model):
  def __init__(self, kernel_size, filters):
    super(ResnetIdentityBlock, self).__init__(name='')
    filters1, filters2, filters3 = filters

    self.conv2a = tf.keras.layers.Conv2D(filters1, (1, 1))
    self.bn2a = tf.keras.layers.BatchNormalization()

    self.conv2b = tf.keras.layers.Conv2D(filters2, kernel_size, padding='same')
    self.bn2b = tf.keras.layers.BatchNormalization()

    self.conv2c = tf.keras.layers.Conv2D(filters3, (1, 1))
    self.bn2c = tf.keras.layers.BatchNormalization()

  def call(self, input_tensor, training=False):
    x = self.conv2a(input_tensor)
    x = self.bn2a(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2b(x)
    x = self.bn2b(x, training=training)
    x = tf.nn.relu(x)

    x = self.conv2c(x)
    x = self.bn2c(x, training=training)

    x += input_tensor
    return tf.nn.relu(x)


block = ResnetIdentityBlock(1, [1, 2, 3])
print(block(tf.zeros([1, 2, 3, 3])))
print([x.name for x in block.trainable_variables])




Aakash Kumar Nain

unread,
Feb 25, 2019, 7:59:32 AM2/25/19
to Niclas Danielsson, TensorFlow Developers
The example I provided runs in eager mode only. For fine-tuning, you just have to freeze the layers you don't want to train and keep the rest of the layers trainable. You can define the graph using subclassing as well and set the weights accordingly.
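A minimal sketch of that freezing approach, fine-tuning only the last 10 layers of VGG16 (with `weights=None` here just to avoid the ImageNet download):

```python
import tensorflow as tf

model = tf.keras.applications.VGG16(input_shape=(224, 224, 3), weights=None)

# Freeze everything, then unfreeze the tail. N=10 is just the
# example number from the thread.
for layer in model.layers[:-10]:
    layer.trainable = False
for layer in model.layers[-10:]:
    layer.trainable = True
```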

Regards,
Aakash Nain

Niclas Danielsson

unread,
Feb 25, 2019, 8:30:34 AM2/25/19
to Aakash Kumar Nain, TensorFlow Developers

But they are not exactly the same.


You use a "predict" function which seems to be implicitly defined by the VGG16(...) function call, whereas the TensorFlow example explicitly states I should implement the "call" function myself.


Also, even if the graph RUNS in eager mode, the CNN is still defined in terms of the VGG16(...) function, which explicitly specifies the inputs and outputs of all the layers in the CNN and thus fixes it, just like a computation-graph setup function fixes a computation graph.


I cannot see that they are the same. Do you mean the models in tf.keras.applications are exactly equivalent to the example at

https://www.tensorflow.org/tutorials/eager/custom_layers in every respect?


Can I override the "predict" function in the same way that I can override the "call" function?


Can I do conditional inference passes using the tf.keras.applications models? Say, for instance, that I execute some of the layers in the pretrained model only if a condition given by some preprocessing function is fulfilled, whereas I otherwise provide some other, new layers for that inference pass?


Am I missing something here? Because I really cannot see that the current pretrained models fulfil the requirements I am asking for, at least not in a way that is on par with what is available in PyTorch, which is the framework I am benchmarking against (and currently using, while waiting for something similar to become available in TensorFlow).


One reason why I think this is important, rather than the model just "being able to run" in eager mode, is that we would really like to see TensorFlow as the more flexible framework, but that requires that working with and modifying CNN models is as flexible as in the other frameworks out there. Otherwise there is a tendency for developers to pick other frameworks such as PyTorch just because they seem more intuitive, straightforward and user-friendly.

BR,

/Niclas

 






Aakash Kumar Nain

unread,
Feb 25, 2019, 9:23:37 AM2/25/19
to Niclas Danielsson, TensorFlow Developers, Francois Chollet, Paige Bailey
I think you need to go through the tf.keras/Keras docs to get a feel for it and how it works. Basically, there are three types of APIs within tf.keras:
* Sequential
* Functional
* Imperative/Model-subclassing

I have been using Keras for almost 2.5 years. If you ask me, the Functional API will satisfy 99% of your requirements. It is like Lego blocks: you define layers and put them together. Given that you have used PyTorch, I understand that you are used to the `call` method. The imperative API does the same thing. You can read about this in more detail with examples here: https://medium.com/tensorflow/what-are-symbolic-and-imperative-apis-in-tensorflow-2-0-dfccecb01021

Coming to your question:
``` Can I do conditional inference passes using the tf.Keras.Applications models. Say for instance that I execute some of the layers in the pre-trained model only if a condition given by some preprocessing function is fulfilled, whereas I otherwise provide some new other layers for that inference pass? 
```
Yes, you can do that. And you can do it with any of the APIs defined above. Also, if you don't want to use the out-of-the-box model, you can define it in your own class, load the weights and then modify your `call` method the way you want. I could provide an example, but currently I have some important stuff to do. Maybe someone from the TF team can provide such an example. But all those things you are expecting are completely doable, and in a very neat way too.
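As an illustration of the conditional-inference idea, a hypothetical subclassed model whose forward pass branches on a flag could be sketched like this (layer sizes and the gating flag are made up):

```python
import tensorflow as tf

class GatedModel(tf.keras.Model):
    def __init__(self):
        super(GatedModel, self).__init__()
        self.shared = tf.keras.layers.Dense(16, activation='relu')
        self.branch_a = tf.keras.layers.Dense(4)
        self.branch_b = tf.keras.layers.Dense(4)

    def call(self, x, use_branch_a=True):
        h = self.shared(x)
        # In eager mode an ordinary Python `if` works in the forward pass.
        if use_branch_a:
            return self.branch_a(h)
        return self.branch_b(h)

m = GatedModel()
out_a = m(tf.zeros([2, 8]), use_branch_a=True)
out_b = m(tf.zeros([2, 8]), use_branch_a=False)
```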

Regards,
Aakash Nain

Niclas Danielsson

unread,
May 31, 2019, 6:58:15 PM5/31/19
to Aakash Kumar Nain, TensorFlow Developers, Francois Chollet, Paige Bailey, Markus Skans

Hi again Aakash and others,


Regarding the custom loading of pretrained weights in TensorFlow 2.0 that we talked about before,

I finally got some more time to sit down and try to produce a customization example based on CNNs pretrained on ImageNet,

but I still find the process much less straightforward than it was with TF-Slim,

and I still get stuck on loading an arbitrary subset of weights...


I will try to show you where I ended up, and maybe you can tell me if I got something completely wrong, or if this maybe IS less supported than it first seemed in the new TensorFlow 2.0.


Firstly, it is obviously true as you wrote in the example that you can download and initialize the model with a 2-liner like this:

from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

vgg_model = VGG16(input_shape=(224,224,3), weights='imagenet')


However, for the purpose of extracting an arbitrary subset of layers with pretrained weights to use as a building block for a new model, this has the following drawbacks:


1: I don't have the model definition available locally. It is still in the Keras repository. I need access to it locally to customize it and only load the unchanged layers from the pretrained weights file.

2: I need the weights to be stored locally as well, since I cannot assume I will always have access to the git repository where the weights are stored (and for future-proofing of the model I don't want to risk that something happens to the remote repo, which would render my code useless without access to the pretrained weights, so storing them locally is definitely a must).

3: The loading of the weights is integrated into the model definition, which I find strange, since it makes it more complicated to use a subset of the VGG16 CNN layers together with other code. It means I would have to break this code out of the VGG16 definition in my customized CNN. TF-Slim has none of these awkwardnesses.

4: It is not immediately clear exactly what happens with these module injections, but it spontaneously seems they could later interfere with the custom layers I add later that are defined outside of this Keras model.


5: The model is defined here:

https://github.com/tensorflow/tensorflow/blob/r2.0/tensorflow/python/keras/applications/vgg16.py

But this is just a wrapper layer over the Keras repository, with some module-injection decorators that do some magic.

Apparently I need this code copied to my local code as well, since these module injections only refer to the Keras repository, so I cannot use them with my custom VGG16-based CNN.


SOLUTION:

-----------------------------------------------------------------------------------------------------------------------

So I copied the CNN model from the Keras repository and added the magical module injections (and the utilities file that goes with the CNN as well).

Then this is what I get in the main file (the other files are not listed here, but they are the same as in the Keras repository):


from tensorflow.python.keras.applications import keras_modules_injection
from tensorflow.python.util.tf_export import keras_export

@keras_export('keras.applications.vgg16.VGG16',
              'keras.applications.VGG16')
@keras_modules_injection
def VGG16(*args, **kwargs):
    return vgg16.VGG16(*args, **kwargs)

@keras_export('keras.applications.vgg16.decode_predictions')
@keras_modules_injection
def decode_predictions(*args, **kwargs):
    return vgg16.decode_predictions(*args, **kwargs)

@keras_export('keras.applications.vgg16.preprocess_input')
@keras_modules_injection
def preprocess_input(*args, **kwargs):
    return vgg16.preprocess_input(*args, **kwargs)

# This is the original vgg16 wrapper around downloading the models from the Keras git:
# from tensorflow.keras.applications.vgg16 import VGG16, preprocess_input

import vgg16  # This now imports my customized VGG16 model from a local file in the same folder.
vgg_model = VGG16(input_shape=(224, 224, 3), weights='vgg16_weights_tf_dim_ordering_tf_kernels.h5')


Now, I still get stuck here, because the saved weights are in .h5 binary format and I only have 2 options:

1: Load the whole CNN including the ImageNet final layer (vgg16_weights_tf_dim_ordering_tf_kernels.h5),

2: or load all layers except the last (vgg16_weights_tf_dim_ordering_tf_kernels_notop). If I modify the CNN by removing some layers, I get the error message: ValueError: You are trying to load a weight file containing 16 layers into a model with 13 layers.


This has the same limitations as TensorFlow Hub and is not really helpful if I want to load fewer layers than "all except the last layer".

In TF-Slim one COULD remove layers and it would only load those that still had matching names, which was extremely useful for customization purposes.

This is also the same as what one can do in PyTorch with the "non-strict" option for the weight-loading command, so PyTorch still has what I cannot reproduce in the new "improved" TensorFlow.


So the fact that this weight-loading option seems to have been removed leaves me a bit stuck.


Is there a more straightforward way to load the weights into my customized VGG16 CNN, one that does not involve copying the "module injection" code explicitly into my own CNN setup code, and that allows me to load only the weights I am interested in from the .h5 weights file?


I would appreciate some kind of example, since this seems like a serious flaw if one has to go through the mess I had to create (if it works at all).

Thanks in advance,

/Niclas

Aakash Kumar Nain

unread,
Jun 1, 2019, 4:50:24 AM6/1/19
to Niclas Danielsson, TensorFlow Developers, Francois Chollet, Paige Bailey, Markus Skans
1: I don't have the model description available. It is still in the Keras repository. I need to have access to it locally to customize it and only load the unchanged layers from the pretrained weights file.

When you load the model from `keras.applications.vgg16`, you are just importing the `VGG` model defined in a .py file. It is the same as having a separate .py file in your project and then importing the model from it. Regarding the `injections`, don't look at those. Check the `keras-applications` package in the original Keras repo.
If you want it locally, I would say load the model first as I described earlier, then save the weights using `model.save_weights(filename)`. Now you have the weights stored locally. For the definition, either copy the code from the repo I linked above or just rewrite the model definition yourself. Just make sure that the layer names match the original repo weights. This is no hard rule, but it makes things easier when loading weights.

3: The loading of the weights is integrated in the model definition which I find strange since it makes it more complicated to use the subset of VGG16 CNN layers together with other code. Which means I would have to break out this code from the VGG16 definition in my customized CNN. TF-Slim has none of these awkwardnesses 

As I said, just copy the model definition and leave out the rest of the parts that you don't need.

4: It is not immediately clear exactly what happens with these model injections, but spontaneously it seems they could later interfere with my custom layers that I add later that are defined outside of this Keras Model.


The injections have nothing to do with the model definition. They were introduced to make sure that nothing breaks after moving `keras-applications` out as a separate package. Again, this has nothing to do with the model definition, and it is not the code you should be looking at.

Regarding all other problems that you described. Let us say you have your project structure as follows:
--my_project
      |_ vgg16.py
      |_ main.py
      |_ vgg16_weights.h5

Here I am defining the VGG in a separate file, as you were expecting it to be.
1)
# Import the model definition
from vgg16 import VGG16
# Instantiate the model (weights can then be loaded separately)
vgg_model = VGG16(input_shape=(224,224,3))

2) If you are using the model instance to load weights, like `model.load_weights(filename.h5)`, weights are always loaded by checking the names of the layers, or an exception is/should be raised. If you want to load weights for some specific layers only, then this is the way:

weights_file = h5py.File('filename.h5', 'r')
for layer in model.layers:
    weight_arrays = [weights_file[layer.name][layer.name + '_W_1:0'].value,
                     weights_file[layer.name][layer.name + '_b_1:0'].value]
    layer.set_weights(weight_arrays)

or if you just want to set the weights for a particular layer:
model.get_layer(layerName).set_weights(weight_arrays)
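It is also worth noting that Keras has a name-matching load close to PyTorch's strict=False, via `load_weights(..., by_name=True)` on HDF5 files. A toy sketch (the model and layer names here are invented):

```python
import tensorflow as tf

# A "full" model whose weights we save to HDF5.
full = tf.keras.Sequential([
    tf.keras.layers.Dense(4, name='block1', input_shape=(8,)),
    tf.keras.layers.Dense(4, name='block2'),
    tf.keras.layers.Dense(2, name='head'),
])
full.save_weights('full.h5')

# A truncated model that keeps only the first layers, with the same names.
partial = tf.keras.Sequential([
    tf.keras.layers.Dense(4, name='block1', input_shape=(8,)),
    tf.keras.layers.Dense(4, name='block2'),
])

# by_name=True loads only layers whose names match; the extra 'head'
# layer in the file is skipped instead of raising a layer-count ValueError.
partial.load_weights('full.h5', by_name=True)
```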


     
I think this is all you need. If there is something else that is confusing, please let us know.
PS: Saying it again, Keras is way better than slim

Regards,
Aakash Nain
 

 

Niclas Danielsson

unread,
Jun 3, 2019, 6:11:10 PM6/3/19
to Aakash Kumar Nain, TensorFlow Developers, Francois Chollet, Paige Bailey, Markus Skans

Thanks,


I think I got it to work now. Thanks for all the good feedback, it really helped in finding the relevant things to use for my use-case.

Just wondering one last thing... :-)


The weights are named with 'tf' in the filename, whereas the default preprocessing mode is 'caffe' for VGG16.

This confused me a bit initially, since I assumed I should set 'tf' explicitly in the Keras preprocessing function. Do you have any idea why this is so? The file is named:

'vgg16_weights_tf_dim_ordering_tf_kernels.h5'


Other than that, I think I got it right, though the number of modifications needed when breaking out the code was still not that small (but it does make sense, considering that I want to break out the code definition for customization like this). I include the code below and the changes I had to make.

Let me know whether the way I implemented it makes sense to you.


I used the new Udacity course for TF 2.0 with Paige Bailey as a reference to check that the CNN computes the right thing with the right preprocessing; in particular, the fine-tuning Colab exercise that uses an image of a military uniform when doing an inference test with the MobileNet classifier downloaded from TensorFlow Hub (which curiously has 1001 classes, whereas Keras only has 1000). (See the included section below for details...)

The course: https://eu.udacity.com/course/intro-to-tensorflow-for-deep-learning--ud187

BR,

/Niclas


Code changes needed to break out the CNN definition:

-----------------------------------------------------------------------------

In the vgg16 model definition file:
- Changed weights input to None
- Removed function get_submodules_from_kwargs
  Instead specify explicitly: data_format = channels_last

- Also need to import layers explicitly (this is part of what was "injected" before)
  import tensorflow as tf
  import tensorflow.keras.layers as layers
  import tensorflow.keras.models as models

- Removed all preprocessing before model definition:
    # backend, layers, models, keras_utils = get_submodules_from_kwargs(kwargs)
    #
    # if not (weights in {'imagenet', None} or os.path.exists(weights)):
    #     raise ValueError('The `weights` argument should be either '
    #                      '`None` (random initialization), `imagenet` '
    #                      '(pre-training on ImageNet), '
    #                      'or the path to the weights file to be loaded.')
    #
    # if weights == 'imagenet' and include_top and classes != 1000:
    #     raise ValueError('If using `weights` as `"imagenet"` with `include_top`'
    #                      ' as true, `classes` should be 1000')
    # # Determine proper input shape
    # input_shape = _obtain_input_shape(input_shape,
    #                                   default_size=224,
    #                                   min_size=32,
    #                                   data_format=backend.image_data_format(),
    #                                   require_flatten=include_top,
    #                                   weights=weights)

- Kept:
  if input_tensor is None:
      img_input = layers.Input(shape=input_shape)


   but removed:
   else:
        if not backend.is_keras_tensor(input_tensor):
            img_input = layers.Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor

- Removed this thingemabob:
  # Ensure that the model takes into account
  # any potential predecessors of `input_tensor`.
  if input_tensor is not None:
     inputs = keras_utils.get_source_inputs(input_tensor)
  else:
     inputs = img_input

- Change to matching input image name here: model = models.Model(img_input, x, name='vgg16')
- Removed the weights loading since we will do that outside the model definition.

- In the imagenet_utils file:
  backend, _, _, _ = get_submodules_from_kwargs(kwargs)
  changed to
  import tensorflow.keras.backend as backend


The main file, which seems to work properly, is listed below (I used OpenCV for loading images, though I noted there is a Keras preprocessing repository they recommend that uses PIL instead, but I wanted it to fit my usual preprocessing flow):

-------------------------------------------------------------------------------

# https://www.tensorflow.org/community/style_guide
# All code needs to be compatible with Python 2 and 3
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import cv2
import h5py

from vgg16 import VGG16
from imagenet_utils import preprocess_input

# This list has the background class as the 0th class and thus 1001 classes in total
# (TensorFlow Hub pretrained CNNs are trained like that, whereas the Keras pretrained
# models have the usual 1000 classes). We correct for that below.
labels_path = tf.keras.utils.get_file('ImageNetLabels.txt', 'https://storage.googleapis.com/download.tensorflow.org/data/ImageNetLabels.txt')
print(labels_path)
imagenet_labels = open(labels_path).read().splitlines()

# Print the ImageNet labels
for idx, label in enumerate(imagenet_labels):
    # Don't print the background class, and also subtract 1 for the indexing offset
    # (start vector indices at 0).
    if idx > 0:
        print("{}: {}".format(idx - 1, label))

# Load the model without doing weight initialization
model = VGG16(input_shape=(224, 224, 3), weights=None)

# ****************************************************
# Load the weights with a higher level of control
# ****************************************************
# The weights file expects integer input in the 255 range.
weights_file = h5py.File('vgg16_weights_tf_dim_ordering_tf_kernels.h5', 'r')
for layer in model.layers:
    # Skip layers that do not have weights to be loaded.
    if not any(item in layer.name for item in ["input", "pool", "flatten"]):
        # Used the prints to identify the names of layers that did not have weights
        # in the given format.
        print(layer.name)
        print(weights_file[layer.name].keys())  # The keys the weights are stored with.
        # Useful if the weights are not just conv layers, so one needs to figure out
        # how to access them.
        # Example output: <KeysViewHDF5 ['block5_conv2_W_1:0', 'block5_conv2_b_1:0']>

        weight_arrays = [weights_file[layer.name][layer.name + '_W_1:0'].value,
                         weights_file[layer.name][layer.name + '_b_1:0'].value]
        layer.set_weights(weight_arrays)

# ****************************************************
# Load input image and preprocess
# ****************************************************
# The image used in the TensorFlow 2.0 course with Paige Bailey at Udacity in this exercise:
# https://colab.research.google.com/github/tensorflow/examples/blob/master/courses/udacity_intro_to_tensorflow_for_deep_learning/l06c01_tensorflow_hub_and_transfer_learning.ipynb
grace_hopper = tf.keras.utils.get_file('image.jpg', 'https://storage.googleapis.com/download.tensorflow.org/example_images/grace_hopper.jpg')
frame = cv2.imread(grace_hopper)

frame = cv2.resize(frame, (224, 224), interpolation=cv2.INTER_NEAREST)
frame = frame[:, :, ::-1]  # OpenCV has flipped channel order vs PIL. This line corrects for that.
frame = np.expand_dims(frame, axis=0)
frame = preprocess_input(frame, data_format='channels_last')

# Run inference (should get 652 as the class index; the TF 2.0 course gets 653,
# since it has the added background class)
print(np.argmax(model.predict(frame)))




Niclas Danielsson

unread,
Aug 2, 2019, 9:50:03 AM8/2/19
to Aakash Kumar Nain, TensorFlow Developers, Francois Chollet, Paige Bailey, Markus Skans

Hi again Aakash,


I have one more issue that appears when I try to use a more modern model than VGG16 in the way you suggested.


In MobileNetV2, the dimensions of the input tensor are needed dynamically in order to set up the CNN,

but even though it is claimed that all tensors are eager in TF 2.0, I cannot get the values out of them.


Basically I get the error:

AttributeError: 'Tensor' object has no attribute 'numpy'


I fetch the model from here:

https://github.com/keras-team/keras-applications/blob/master/keras_applications/mobilenet_v2.py


And in the function

def _inverted_res_block(...)


the line that causes problems is originally:

in_channels = backend.int_shape(inputs)[channel_axis]


Now, I obviously do not have the backend object since, as you suggested, I am not using any of the Keras "injections", so I should replace this line with something equivalent that explicitly uses TensorFlow.


It seemed most reasonable to get the shape and convert it to numpy values like this (but it obviously did not work)

in_channels = tf.shape(img_input).numpy()[channel_axis]


The tensor I want to get the channels value from looks like this when printed:

Tensor("Conv1_relu/Identity:0", shape=(None, 112, 112, 32), dtype=float32)


I created a short colab code snippet that illustrates the steps that go wrong:

https://colab.research.google.com/drive/1wPHqn-hc-itmWxy_TScTN_OUb9VdVz-Q


Is there any way to get that value (the value 32, which I can SEE in the print above but cannot access)?


Since it works with the wrapped Keras applications inside TensorFlow when used with the usual download API, it should be possible, but I cannot figure out how. The backend.int_shape() command does not seem to have any direct correspondence in TensorFlow either...


Furthermore, is there anywhere one can read about the conditions under which one CAN get a numpy value out of an (eager) tensor? It seems very arbitrary when this works. Earlier in these threads it turned out that .numpy() did not work when I wrapped a function with the @tf.function decorator in training loops, since the tensors apparently stop being eager there, which makes the attribute vanish from the tensor object. But none of this is documented (other than in my old email threads here).


Thanks in advance!

/Niclas



From: Aakash Kumar Nain <nainaa...@gmail.com>
Sent: Saturday, June 1, 2019 10:50
To: Niclas Danielsson
Cc: TensorFlow Developers; Francois Chollet; Paige Bailey; Markus Skans

Paige Bailey

unread,
Aug 2, 2019, 11:12:14 AM8/2/19
to Niclas Danielsson, Adrian Chmielewski-Anders, Aakash Kumar Nain, TensorFlow Developers, Francois Chollet, Markus Skans
Also adding +Adrian Chmielewski-Anders to the discussion, for visibility.
--

Paige Bailey   

Product Manager (TensorFlow)

@DynamicWebPaige

webp...@google.com


 

Aakash Kumar Nain

unread,
Aug 2, 2019, 11:33:47 AM8/2/19
to Paige Bailey, Niclas Danielsson, Adrian Chmielewski-Anders, TensorFlow Developers, Francois Chollet, Markus Skans
Hi Niclas,

The problem is much simpler than that; I suspect you are relying too heavily on the keras-applications code, which is designed for multiple backends. Two things:

1) Every tensor has a shape that can be accessed via the `shape` attribute. For example, in your case it would be:
print(img_input.shape)  # prints the shape of the tensor
last_dim = img_input.shape[-1]   # returns 3 or 32 depending on the number of channels  

2) If you are writing something that should work in both eager and graph mode, then `tf.shape` is your friend. The only catch is that it takes an array/Tensor and returns a Tensor holding its shape (one entry per dimension), rather than Python integers.
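As a minimal sketch of both approaches (using a dummy tensor `x` in place of your `img_input`):

```python
import tensorflow as tf

x = tf.zeros((1, 112, 112, 32))  # stand-in for img_input

# 1) static shape via the `shape` attribute (works on symbolic tensors too)
print(x.shape[-1])       # 32

# 2) dynamic shape via tf.shape -- returns a 1-D int32 Tensor with one
#    entry per dimension; works in both eager and graph mode
dyn = tf.shape(x)
print(dyn.numpy())       # [  1 112 112  32]
```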

Hope this helps.

Regards,
Aakash Nain 

Niclas Danielsson

unread,
Aug 5, 2019, 4:12:17 AM8/5/19
to Aakash Kumar Nain, Paige Bailey, Adrian Chmielewski-Anders, TensorFlow Developers, Francois Chollet, Markus Skans

Hi Aakash,


Thanks for the feedback. Using tensor.shape works much better. Although the documentation (TF 2.0 docs) confusingly says it returns a tf.TensorShape object which "represents a possibly-partial shape specification for a Tensor", what I get back DOES surprisingly behave like a tuple of integers.

https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/TensorShape
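A quick check of that tuple-like behaviour (a sketch on a dummy tensor):

```python
import tensorflow as tf

s = tf.zeros((1, 112, 112, 32)).shape
print(type(s).__name__)   # TensorShape
print(s[-1])              # 32 -- indexes like a tuple, yields plain ints in TF 2.x
print(s.as_list())        # [1, 112, 112, 32]
```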


I also finally realized something that with hindsight should have been obvious: the backend object IS present in the tf.keras API. That is, replacing:


backend, layers, models, keras_utils = get_submodules_from_kwargs(kwargs)

with    


import tensorflow.keras.backend as backend
import tensorflow.keras.utils as keras_utils


import tensorflow.keras.layers as layers
import tensorflow.keras.models as models


did the trick with much fewer code modifications. Maybe something to document in the framework? :-)
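A quick sanity check of the substitution (a sketch verifying that these tf.keras submodules expose the pieces keras-applications relies on, such as backend.int_shape):

```python
import tensorflow.keras.backend as backend
import tensorflow.keras.layers as layers

# backend.int_shape is what mobilenet_v2.py calls on its inputs:
x = layers.Input(shape=(224, 224, 3))
print(backend.int_shape(x))   # (None, 224, 224, 3)
```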


---------------------------------------------------

Now, regarding the .numpy() issue, I still believe this is an issue that needs to be addressed. Let me explain why:

-----------------------------------------------------


As I mentioned before, what I wanted to achieve is to use the code available in Keras Applications with as little modification as possible, but kept as a local copy so I can make whatever customizations I wish (similar again to what was possible with TF-Slim).


This obviously meant I needed to handle, in some way, the fact that the code is written for multiple backends.

Mainly it means dealing with the imports described above.


I did not (initially) find much information about what some of these objects corresponded to in native TensorFlow, so I tried to replace the affected code with TF-native equivalents. That is where the .numpy() command did not work on what looked very much like eager tensors to me. And that is where I think there may be a bug, or at the very least clearly lacking documentation.


In particular, the fact that .numpy() stops working when you add the @tf.function decorator to functions containing TensorFlow operations highlights a general mechanism that is obviously there, but whose consequences are not fully documented. It is not a common programming pattern for attributes of objects to vanish without warning, or for there to be different "run modes" (i.e. eager vs. graph mode) in which objects are opaquely converted to something else under the hood, sometimes by secondary code that I as a framework user did not develop myself.
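A minimal sketch of that vanishing-attribute behaviour:

```python
import tensorflow as tf

x = tf.constant([1.0, 2.0])
print(x.numpy())             # works: x is an eager tensor

@tf.function
def double(t):
    # inside tf.function, t is a symbolic graph tensor; calling
    # t.numpy() here would raise the AttributeError quoted above
    return t * 2

y = double(x)
print(y.numpy())             # works again: the returned result is eager
```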


So, in my opinion, the documentation clearly needs to state, at each place where it describes an attribute or function affected by eager vs. graph execution, how the behaviour differs, along with a conceptual explanation in a top-level tutorial of how the opaque graph mode affects tensors and the like.


In particular, if you are fresh to TensorFlow 2.0 without any prior experience of earlier TensorFlow versions, this "vanishing of attributes" would seem very confusing, since the graph mode of execution is now kept opaque to the user.


With that said, I think in general the direction of TF 2.0 is indeed the right one. I am happy to see that so many things in general are becoming much more streamlined and "Pythonic" in the framework. :-)


BR,

/Niclas






From: Aakash Kumar Nain <nainaa...@gmail.com>
Sent: Friday, August 2, 2019 17:33
To: Paige Bailey
Cc: Niclas Danielsson; Adrian Chmielewski-Anders; TensorFlow Developers; Francois Chollet; Markus Skans