Why do Keras losses consider only the last dimension of the input arrays?


In the current master branch on the Keras repository, one can find this inside losses.py:



def mean_absolute_error(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true), axis=-1)


https://github.com/keras-team/keras/blob/78e1f57c484da15466a34ed543a1cc4709617a2f/keras/losses.py#L17



Note how the mean is computed only along the last axis of the arrays y_true and y_pred due to the use of axis=-1.



Similarly, the TensorFlow implementation reads:



def mean_absolute_error(y_true, y_pred):
    return K.mean(math_ops.abs(y_pred - y_true), axis=-1)


https://github.com/tensorflow/tensorflow/blob/6612da89516247503f03ef76e974b51a434fb52e/tensorflow/python/keras/losses.py#L414



And the same is true for all other losses defined in the same files.



Why is that? (This is my question.) To my understanding, losses are scalars, so why are some array dimensions kept?
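
To make the observation concrete, here is a minimal shape check (a sketch only, assuming the standalone keras package with a TensorFlow backend; the array sizes are made up for illustration):

import numpy as np
from keras import backend as K
from keras.losses import mean_absolute_error

# a batch of 2 fictitious 4x4 "images"
y_true = K.constant(np.zeros((2, 4, 4)))
y_pred = K.constant(np.ones((2, 4, 4)))

loss = mean_absolute_error(y_true, y_pred)
print(K.int_shape(loss))  # (2, 4): only the last axis was reduced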



The following is just background and not part of my question.
The reason I stumbled across this is that I was trying to implement some normalized MAE in Keras, and without thinking more about this issue, I tried this:



def normalized_mean_absolute_error(y_true, y_pred):
    return K.mean(K.abs(y_pred - y_true), axis=-1) / K.mean(K.abs(y_true), axis=-1)


This results in NaNs, though, possibly (I am still investigating) because I am dividing a non-scalar array by another one.



(Edit: I know that my samples, which are images, have some all-zero rows and columns in the target, which would induce a division by zero for those rows or columns. Computing means over multiple dimensions at once, that is, over complete images, which are guaranteed not to be all-zero, solves this immediate issue, but the question above still stands.)
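
For reference, the workaround from the edit could look roughly like this (a sketch only, assuming the batch dimension is axis 0 and that every complete sample is non-zero):

from keras import backend as K

def normalized_mean_absolute_error(y_true, y_pred):
    # reduce over all non-batch axes at once, i.e. over complete images,
    # so the denominator cannot be zero for a non-empty target
    reduction_axes = list(range(1, K.ndim(y_true)))
    mae = K.mean(K.abs(y_pred - y_true), axis=reduction_axes)
    scale = K.mean(K.abs(y_true), axis=reduction_axes)
    return mae / scale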










python tensorflow keras

asked Mar 25 at 5:52 by bers, edited Mar 26 at 23:22
  • You are getting NaN because K.mean(K.abs(y_true), axis=-1) could be zero, so you should add some small value (e.g. epsilon) to this denominator.

    – Mitiku
    Mar 25 at 6:25
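
Concretely, that suggestion might look like this (a sketch only; K.epsilon() is the backend's configurable fuzz factor):

from keras import backend as K

def normalized_mean_absolute_error(y_true, y_pred):
    # adding epsilon keeps an all-zero row/column in y_true
    # from producing a division by zero
    return K.mean(K.abs(y_pred - y_true), axis=-1) / (
        K.mean(K.abs(y_true), axis=-1) + K.epsilon())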











  • @Mitiku You are right, but unless I have made a serious mistake, none of my samples (which are images) should be all zero. Some rows or columns could be all zero, though, which would in fact explain the observed behavior. If my interpretation is correct, this raises the question of why Keras computes losses along one dimension and not per sample.

    – bers
    Mar 25 at 6:46











  • It's because you are using the axis=-1 argument. You are specifying to take the mean over the last axis.

    – Mitiku
    Mar 25 at 7:07











  • @Mitiku I know that :) (Please do not confuse my motivation for the question with the question itself.) The question (still) is, why does Keras do that in its own implementation of the losses?

    – bers
    Mar 25 at 7:11












  • Can you clarify: why is using the last dimension unexpected? Wouldn't y be one-dimensional, as these are your predictions?

    – Kai Aeberli
    Mar 25 at 9:02
















