Question
Give an example of human preference ordering reversal that contradicts the use of exponential discounting in a reward function.
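For instance, humans may prefer a single sparse reward of +1 (e.g., $1) now over a reward of +2 (e.g., $2) received a week later, but may also prefer a reward of +2 received after 20 weeks over a reward of +1 after 19 weeks. Exponential discounting with a weekly discount factor γ cannot produce both preferences: the first requires 1 > 2γ, while the second requires 2γ^20 > γ^19, which simplifies to 2γ > 1.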
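
The inconsistency can be checked numerically. The sketch below is illustrative only: it scans a grid of weekly discount factors and confirms that none yields both preferences under exponential discounting. For contrast it also tries a hyperbolic discount 1 / (1 + k·t), a model often used to describe this kind of reversal; that model, the helper names, and the value k = 2 are assumptions for illustration and do not come from the question.

```python
def exponential(gamma):
    """Discount factor after t weeks under exponential discounting."""
    return lambda t: gamma ** t


def hyperbolic(k):
    """Discount factor after t weeks under hyperbolic discounting (assumed model)."""
    return lambda t: 1.0 / (1.0 + k * t)


def shows_reversal(discount):
    """True if the discount function reproduces both stated preferences."""
    # +1 now is preferred over +2 one week later.
    small_now = 1 * discount(0) > 2 * discount(1)
    # +2 after 20 weeks is preferred over +1 after 19 weeks.
    large_later = 2 * discount(20) > 1 * discount(19)
    return small_now and large_later


# Weekly discount factors 0.01, 0.02, ..., 1.00.
gammas = [i / 100 for i in range(1, 101)]

# No exponential discount factor on the grid gives the reversal.
print(any(shows_reversal(exponential(g)) for g in gammas))  # False

# A hyperbolic discount with the hypothetical value k = 2 does.
print(shows_reversal(hyperbolic(2.0)))  # True
```

With k = 2, the immediate +1 is worth 1 against 2/3 for the delayed +2, while at the 19 and 20 week horizons the values are 1/39 and 2/41, so the larger reward wins.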

