Why do Keras losses consider only the last dimension of the input arrays?
In the current master branch of the Keras repository, one can find this inside losses.py:

    def mean_absolute_error(y_true, y_pred):
        return K.mean(K.abs(y_pred - y_true), axis=-1)

https://github.com/keras-team/keras/blob/78e1f57c484da15466a34ed543a1cc4709617a2f/keras/losses.py#L17

Note how the mean is computed only along the last axis of the arrays y_true and y_pred, due to the use of axis=-1.
Similarly, the TensorFlow implementation reads:

    def mean_absolute_error(y_true, y_pred):
        return K.mean(math_ops.abs(y_pred - y_true), axis=-1)

https://github.com/tensorflow/tensorflow/blob/6612da89516247503f03ef76e974b51a434fb52e/tensorflow/python/keras/losses.py#L414
The same is true for all the other losses defined in those files.

Why is that? (This is my question.) To my understanding, a loss is a scalar, so why are some array dimensions kept?
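To make the question concrete, here is a small NumPy sketch of what reducing only over axis=-1 does to the shapes (NumPy stands in for the Keras backend; the shapes and values are made up for illustration):

```python
import numpy as np

# Hypothetical batch: 2 samples, each a 3x4 "image".
y_true = np.arange(24, dtype=float).reshape(2, 3, 4)
y_pred = y_true + 1.0  # constant absolute error of 1 everywhere

# Reducing only over the last axis, as the Keras losses do, leaves a
# (2, 3) tensor: one value per sample and per row, not a scalar.
per_row = np.mean(np.abs(y_pred - y_true), axis=-1)
print(per_row.shape)  # (2, 3)

# A further unweighted mean over the remaining axes gives the same
# scalar as a full mean would have:
print(np.mean(per_row))  # 1.0
```

So whatever the loss function returns is not yet a scalar; some later reduction step must still collapse the remaining dimensions, which is exactly what the question is asking about.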
The following is just background and not part of my question.
The reason I stumbled across this is that I was trying to implement a normalized MAE in Keras, and without thinking much about this issue, I tried:

    def normalized_mean_absolute_error(y_true, y_pred):
        return K.mean(K.abs(y_pred - y_true), axis=-1) / K.mean(K.abs(y_true), axis=-1)
This results in NaNs, though, possibly (I am still investigating) because I am dividing a non-scalar array by another one.

(Edit: I know that my samples, which are images, have some all-zero rows and columns in the target, which would induce a division by zero for those rows or columns. Computing means over multiple dimensions at once, that is, over complete images, which are guaranteed not to be all-zero, solves this immediate issue, but the question above still stands.)
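For reference, a NumPy sketch of the per-image variant described in the edit, with an additional epsilon guard in the denominator (this is not the Keras API; normalized_mae and EPS are my own names, and the shapes and values are made up):

```python
import numpy as np

EPS = 1e-7  # small guard constant, analogous in spirit to K.epsilon()

def normalized_mae(y_true, y_pred):
    # Reduce over all non-batch axes, so each sample (image) yields a
    # single ratio; the epsilon keeps the denominator away from zero
    # even if a whole target were all-zero.
    axes = tuple(range(1, y_true.ndim))
    num = np.mean(np.abs(y_pred - y_true), axis=axes)
    den = np.mean(np.abs(y_true), axis=axes) + EPS
    return num / den

y_true = np.full((2, 3, 4), 1.0)
y_true[1] = 2.0
y_pred = y_true + 1.0  # constant absolute error of 1
print(normalized_mae(y_true, y_pred))  # ≈ [1.0, 0.5]
```

Because the reduction covers every non-batch axis, all-zero rows or columns within an image no longer trigger a division by zero on their own.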
python tensorflow keras
You are getting NaN because K.mean(K.abs(y_true), axis=-1) could be zero. So you should add some small value to this denominator (e.g. epsilon).
– Mitiku
Mar 25 at 6:25
@Mitiku you are right, but unless I have a made a serious mistake, none of my samples (which are images) should be all zero. Some rows or columns could be all zero, though, which would in fact explain the observed behavior. Which, if my interpretation is correct, raises the question why Keras computes losses by one dimension and not by sample.
– bers
Mar 25 at 6:46
It's because you are using the axis=-1 argument. You are specifying to take the mean over the last axis.
– Mitiku
Mar 25 at 7:07
@Mitiku I know that :) (Please do not confuse my motivation for the question with the question itself.) The question (still) is, why does Keras do that in its own implementation of the losses?
– bers
Mar 25 at 7:11
Can you clarify more: why is using the last dimension unexpected? Wouldn't y be one-dimensional, as these are your predictions?
– Kai Aeberli
Mar 25 at 9:02
asked Mar 25 at 5:52 by bers
edited Mar 26 at 23:22