Fix typo & a question about 16-bit training

#121
by jaycha - opened

fix typo in floating point upper bound

Thanks for the awesome work! This is the best book I've ever seen for getting started with large-scale training.

One quick question:

> Once the weights are 0, they will remain at 0 for the rest of training as there is no gradient signal coming through anymore.
This statement is usually true in practice, but IMO it is not strictly true, as non-zero gradient updates are still possible. If that's correct, maybe adding a comment about it in the book would be nice :)
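To illustrate the point, here is a minimal sketch (a hypothetical toy model, not code from the book): for a single weight with y = w * x and squared-error loss L = (y - t)^2, the gradient dL/dw = 2 * (y - t) * x does not vanish just because w itself is exactly 0, since it depends on the input and the error rather than on w.

```python
def grad_wrt_weight(w, x, t):
    """Gradient of L = (w*x - t)^2 with respect to w: dL/dw = 2*(w*x - t)*x."""
    y = w * x
    return 2.0 * (y - t) * x

w = 0.0  # weight that has underflowed to exactly zero (e.g. in fp16)
x = 1.5  # input activation
t = 3.0  # target
print(grad_wrt_weight(w, x, t))  # -9.0, i.e. non-zero despite w == 0
```

That said, the book's statement does hold in common cases: if the incoming activation is zero (e.g. a dead ReLU unit), or if the small gradient update itself underflows in 16-bit precision, the weight can indeed stay stuck at 0.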
