API Reference

Flux.Losses.crossentropy - Method
crossentropy(ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask; ϵ)
crossentropy(sum, ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask; ϵ)

Flux.crossentropy with an extra sequence mask m that masks out the loss of unneeded tokens. y holds the labels. By default it takes the mean, dividing the summed loss by the number of valid tokens. To return the plain sum of the valid losses instead, pass sum as the first argument. See also safe_crossentropy.
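The difference between the masked mean and the masked sum can be sketched in plain Julia. This is only an illustration under stated assumptions, not the Flux or Transformers.jl implementation: the helper names `masked_loss_total` and `masked_crossentropy_sketch` are hypothetical, the mask is a plain `Bool` array rather than an `AbstractSeqMask`, the labels are one-hot, and the array layout is assumed to be (classes, seq_len, batch).

```julia
# Hypothetical helpers for illustration; not the library's code.
# ŷ:    predicted probabilities, size (classes, seq_len, batch)
# y:    one-hot labels, same size as ŷ
# mask: Bool array of size (seq_len, batch); true marks a valid token

# Sum of the per-token losses over valid positions only.
function masked_loss_total(ŷ, y, mask; ϵ = eps(eltype(ŷ)))
    token_loss = -sum(y .* log.(ŷ .+ ϵ); dims = 1)   # size (1, seq_len, batch)
    m = reshape(mask, 1, size(mask)...)              # align the mask with token_loss
    return sum(token_loss .* m)
end

# Default: mean over the valid tokens.
masked_crossentropy_sketch(ŷ, y, mask; kw...) =
    masked_loss_total(ŷ, y, mask; kw...) / count(mask)

# With `sum` as the first argument: plain sum of the valid losses.
masked_crossentropy_sketch(::typeof(sum), ŷ, y, mask; kw...) =
    masked_loss_total(ŷ, y, mask; kw...)

# Two classes, three tokens, one sequence; the third token is masked out.
ŷ = reshape([0.5 0.25 0.9; 0.5 0.75 0.1], 2, 3, 1)
y = reshape([1.0 0.0 1.0; 0.0 1.0 0.0], 2, 3, 1)
mask = reshape([true, true, false], 3, 1)

mean_loss = masked_crossentropy_sketch(ŷ, y, mask)
sum_loss = masked_crossentropy_sketch(sum, ŷ, y, mask)
```

Here the two valid tokens contribute -log(0.5) and -log(0.75), so the sum is log(8/3) and the mean is half of that. The masked third token does not affect either value.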

Flux.Losses.logitcrossentropy - Method
logitcrossentropy(ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask)
logitcrossentropy(sum, ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask)

Flux.logitcrossentropy with an extra sequence mask m that masks out the loss of unneeded tokens. y holds the labels. By default it takes the mean, dividing the summed loss by the number of valid tokens. To return the plain sum of the valid losses instead, pass sum as the first argument. See also safe_logitcrossentropy.
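The logit variant takes unnormalized scores instead of probabilities. A minimal plain-Julia sketch follows, assuming a `Bool` mask array, one-hot labels, and a (classes, seq_len, batch) layout. The names `logsoftmax_sketch`, `masked_logit_total`, and `masked_logitcrossentropy_sketch` are hypothetical and do not come from Flux or Transformers.jl.

```julia
# Hypothetical helpers for illustration; not the library's code.

# Numerically stable log-softmax over the class dimension (dims = 1).
function logsoftmax_sketch(x)
    shifted = x .- maximum(x; dims = 1)
    return shifted .- log.(sum(exp.(shifted); dims = 1))
end

# Sum of the per-token losses over valid positions only.
function masked_logit_total(logits, y, mask)
    token_loss = -sum(y .* logsoftmax_sketch(logits); dims = 1)  # (1, seq_len, batch)
    return sum(token_loss .* reshape(mask, 1, size(mask)...))
end

# Default: mean over the valid tokens.
masked_logitcrossentropy_sketch(logits, y, mask) =
    masked_logit_total(logits, y, mask) / count(mask)

# With `sum` as the first argument: plain sum of the valid losses.
masked_logitcrossentropy_sketch(::typeof(sum), logits, y, mask) =
    masked_logit_total(logits, y, mask)

# Three classes, four tokens, one sequence; all-zero logits give a uniform prediction.
logits = zeros(3, 4, 1)
y = zeros(3, 4, 1)
y[1, :, 1] .= 1.0                       # every token is labelled class 1
mask = reshape([true, true, true, false], 4, 1)

mean_logit_loss = masked_logitcrossentropy_sketch(logits, y, mask)
sum_logit_loss = masked_logitcrossentropy_sketch(sum, logits, y, mask)
```

With uniform predictions over three classes each valid token costs log(3), so the mean is log(3) and the sum over the three valid tokens is 3 log(3).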

Transformers.enable_gpu - Function
enable_gpu(t=true)

Enable the GPU for todevice; disable it with enable_gpu(false). The backend is selected by Flux.gpu_backend!. This should only be used in user scripts.

Transformers.firsttoken - Method
firsttoken(x, m::AbstractSeqMask)

Slice the first token from the hidden states. The "first" token is defined by the sequence mask.

Transformers.lasttoken - Method
lasttoken(x, m::AbstractSeqMask)

Slice the last token from the hidden states. The "last" token is defined by the sequence mask.
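What "defined by the sequence mask" means for firsttoken and lasttoken can be illustrated with a plain `Bool` mask. The sketch below is an assumption-laden illustration, not the library's code: `first_last_token_sketch` is a hypothetical helper, the hidden states are assumed to have size (hidden_size, seq_len, batch), every sequence is assumed to contain at least one valid token, and the output shape chosen here, (hidden_size, batch), may differ from what firsttoken and lasttoken return.

```julia
# Hypothetical helper for illustration; not the library's code.
# hidden: array of size (hidden_size, seq_len, batch)
# mask:   Bool array of size (seq_len, batch); true marks a valid token
function first_last_token_sketch(hidden, mask)
    batch = size(hidden, 3)
    # Position of the first and last valid token in each sequence.
    first_idx = [findfirst(view(mask, :, b)) for b in 1:batch]
    last_idx = [findlast(view(mask, :, b)) for b in 1:batch]
    # Gather one hidden vector per sequence and stack them as columns.
    firsts = reduce(hcat, [hidden[:, first_idx[b], b] for b in 1:batch])
    lasts = reduce(hcat, [hidden[:, last_idx[b], b] for b in 1:batch])
    return firsts, lasts
end

hidden = reshape(collect(1.0:12.0), 2, 3, 2)
# Sequence 1 has three valid tokens; sequence 2 has two, followed by padding.
mask = [true true; true true; true false]

firsts, lasts = first_last_token_sketch(hidden, mask)
```

For the second sequence the last valid position is 2, not 3, so the padded position is never selected.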

Transformers.safe_crossentropy - Method
safe_crossentropy(ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask; ϵ)
safe_crossentropy(sum, ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask; ϵ)

crossentropy with a label check. If the label y is an integer array, it also calls maximum on the labels to make sure no label is larger than the first dimension of ŷ. See also unsafe_crossentropy.
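The label check described above amounts to comparing the largest label with the number of classes. A minimal sketch follows, assuming that the class dimension is the first dimension of ŷ and that an error is raised on failure. The helper name `check_labels_sketch` and the error type are hypothetical; how the library reports a failed check is not stated here.

```julia
# Hypothetical helper for illustration; not the library's code.
# ŷ: predictions with the class dimension first, e.g. (classes, seq_len, batch)
# y: integer class labels, e.g. (seq_len, batch)
function check_labels_sketch(ŷ, y::AbstractArray{<:Integer})
    nclass = size(ŷ, 1)
    largest = maximum(y)
    largest <= nclass ||
        throw(ArgumentError("label $largest exceeds the number of classes $nclass"))
    return nothing
end

ŷ = fill(0.25, 4, 3, 1)                 # four classes, three tokens, one sequence

valid_ok = check_labels_sketch(ŷ, reshape([1, 4, 2], 3, 1)) === nothing

# A label of 5 does not fit into four classes, so the check throws.
invalid_rejected = try
    check_labels_sketch(ŷ, reshape([1, 5, 2], 3, 1))
    false
catch err
    err isa ArgumentError
end
```

The check costs one extra reduction over the labels, which is the price of the "safe" variants compared with the "unsafe" ones.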

Transformers.safe_logitcrossentropy - Method
safe_logitcrossentropy(ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask)
safe_logitcrossentropy(sum, ŷ::AbstractArray, y::AbstractArray, m::AbstractSeqMask)

logitcrossentropy with a label check. If the label y is an integer array, it also calls maximum on the labels to make sure no label is larger than the first dimension of ŷ. See also unsafe_logitcrossentropy.

Transformers.todevice - Method
todevice(x)

Move data to the device. When the GPU is enabled with enable_gpu, this is essentially Flux.gpu; otherwise it is just Flux.cpu.
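A short usage sketch of how enable_gpu and todevice fit together in a user script. It assumes Transformers.jl is installed and that both functions are exported by the package; it keeps the GPU disabled so that it also runs on a machine without a GPU backend.

```julia
using Transformers

# Keep everything on the CPU; per the description above, todevice then acts like Flux.cpu.
enable_gpu(false)

x = todevice(rand(Float32, 4, 3))   # stays an ordinary CPU array
```

In a script meant for a GPU machine, the call would be enable_gpu(true) (or simply enable_gpu()) near the top of the script, after which todevice moves data to the backend selected by Flux.gpu_backend!.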

Transformers.unsafe_crossentropy - Method
unsafe_crossentropy(ŷ::AbstractArray, y::AbstractArray{<:Integer}, m::AbstractSeqMask; ϵ)
unsafe_crossentropy(sum, ŷ::AbstractArray, y::AbstractArray{<:Integer}, m::AbstractSeqMask; ϵ)

Compute crossentropy with integer labels. The prefix "unsafe" means that if y contains any number larger than the first dimension of ŷ, the behavior is undefined. See also safe_crossentropy.
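With integer labels, the loss for each token is read directly from ŷ by indexing with the label, which is why an out-of-range label is a problem. A plain-Julia sketch follows, assuming a `Bool` mask array and a (classes, seq_len, batch) layout. The helper `int_label_crossentropy_sketch` is hypothetical and is not the library's implementation.

```julia
# Hypothetical helper for illustration; not the library's code.
# ŷ:    predicted probabilities, size (classes, seq_len, batch)
# y:    integer class labels, size (seq_len, batch)
# mask: Bool array of size (seq_len, batch); true marks a valid token
function int_label_crossentropy_sketch(ŷ, y::AbstractMatrix{<:Integer}, mask;
                                       ϵ = eps(eltype(ŷ)))
    total = zero(eltype(ŷ))
    for b in axes(y, 2), t in axes(y, 1)
        mask[t, b] || continue
        # The label is used as an index into the class dimension of ŷ.
        total -= log(ŷ[y[t, b], t, b] + ϵ)
    end
    return total / count(mask)          # mean over the valid tokens
end

ŷ = reshape([0.5 0.25 0.9; 0.5 0.75 0.1], 2, 3, 1)
y = reshape([1, 2, 1], 3, 1)
mask = reshape([true, true, false], 3, 1)

int_loss = int_label_crossentropy_sketch(ŷ, y, mask)
```

In this sketch an out-of-range label would raise a BoundsError, because ordinary Julia indexing is bounds-checked. Code that skips bounds checking, for example inside a GPU kernel, has no such guard, which is the situation the "unsafe" prefix warns about.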

Transformers.unsafe_logitcrossentropy - Method
unsafe_logitcrossentropy(ŷ::AbstractArray, y::AbstractArray{<:Integer}, m::AbstractSeqMask)
unsafe_logitcrossentropy(sum, ŷ::AbstractArray, y::AbstractArray{<:Integer}, m::AbstractSeqMask)

Compute logitcrossentropy with integer labels. The prefix "unsafe" means that if y contains any number larger than the first dimension of ŷ, the behavior is undefined. See also safe_logitcrossentropy.
