go to ORKG: http://orkg.org/orkg/predicate/P59115
hidden size
This field is the size of the hidden layers within each block of a decoder-only transformer. It represents the dimensionality of the model's hidden state.
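As a rough illustration of what this field controls, the sketch below shows how the hidden size fixes the last dimension of the hidden state passed between blocks. The `DecoderConfig` class, the `hidden_state_shape` helper, and the value 768 are hypothetical and chosen only for the example; they do not come from ORKG or from any particular model.

```python
from dataclasses import dataclass


@dataclass
class DecoderConfig:
    """Hypothetical configuration holder, for illustration only."""
    hidden_size: int  # dimensionality of each token's hidden state


def hidden_state_shape(config: DecoderConfig, batch_size: int, seq_len: int) -> tuple:
    """Return the shape of the hidden state tensor that flows between blocks.

    Each token position carries one vector of length `hidden_size`.
    """
    return (batch_size, seq_len, config.hidden_size)


# Example value only; real models report their own hidden size.
config = DecoderConfig(hidden_size=768)
print(hidden_state_shape(config, batch_size=2, seq_len=16))  # (2, 16, 768)
```

Changing `hidden_size` changes only the last dimension; the batch and sequence dimensions are independent of it.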