Gradient Descent is Hard
Big things happen when you do the little things right
Remember that time when you told yourself, “I am going to work continuously for six hours and get this thing done!” and it didn’t happen? That time when you went to bed at three in the morning, resolved to get up at seven and work, only to wake up much later?
Gradient descent works much like water flowing downhill: it follows the direction in which height decreases most rapidly. You move towards a local minimum by repeatedly taking steps proportional to the negative of the gradient (the derivative) at your current position.
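For the literal-minded, the idea of stepping against the gradient can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function f(x) = (x - 3)^2, the step size, the step count, and the names `gradient_descent` and `grad_f` are all hypothetical choices, not taken from any library.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Each step is proportional to the negative of the gradient.
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    # Derivative of the assumed example function f(x) = (x - 3)**2.
    return 2 * (x - 3)


x_min = gradient_descent(grad_f, x0=0.0)
print(x_min)  # a value very close to 3, the minimum of f
```

Each pass looks only at the slope where it currently stands and moves a little way downhill, which is why this toy function settles near its minimum at 3.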
The thing is, in life, it doesn’t always work.
Tackling problems head-on is tempting. Everybody wants to make a definitive to-do list every day and strike off items one by one. Habits don’t change this way. You don’t get up early and fresh for work by going to bed at three and setting ten alarms for seven. It might work for a day or two, but you’re likely to revert to the old routine or start bending the rules, for instance by pushing the alarm to the next round half-hour.
The same goes for concentrating for long hours. Deciding at the outset that you’re going to work continuously for six hours almost never works. It’s too ambitious a goal. There are bound to be distractions and interruptions over such a long stretch. It might even seem okay to take five-minute breaks now and then because you still have n hours left.
Set short-term, achievable goals. Getting yourself to work without distractions for an hour is a much more practical target to aim for. In my case, it almost always gets extended to two or three hours. It doesn’t work the other way round, though: an ambitious pledge does not produce a commensurately ambitious outcome.
Waking up early means you have to finish work and be in bed by twelve to manage a good seven hours of sleep, which in turn means that you have to finish socializing, dinner, and other errands by ten.
The key lies in taking care of the tangential components first, the ones that are individually inconsequential but necessary, rather than taking a direct gradient descent approach. Bigger things will happen when you get the little things right.