Softmax or LogSoftmax

The beautiful softmax function
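
For reference, the standard definition of softmax over a vector of K scores can be written as below. The symbols x_i and K are notation introduced here for illustration; they do not come from the original post.

```latex
\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{K} e^{x_j}}, \qquad i = 1, \dots, K
```

Every term is an exponential, so it helps to see how fast e**x grows:
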
>>> import math
>>> for x in range(20): print(x, math.e**x)
...
0 1.0
1 2.718281828459045
2 7.3890560989306495
3 20.085536923187664
4 54.59815003314423
5 148.41315910257657
6 403.428793492735
7 1096.6331584284583
8 2980.957987041727
9 8103.08392757538
10 22026.465794806703
11 59874.14171519778
12 162754.79141900383
13 442413.3920089202
14 1202604.2841647759
15 3269017.372472108
16 8886110.520507865
17 24154952.753575277
18 65659969.13733045
19 178482300.96318707

The exponential grows very quickly, and for large enough inputs e**x overflows floating point. How do we solve this problem?
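
One common answer is to subtract the largest input before exponentiating. This leaves the softmax result unchanged because the shift cancels between numerator and denominator. Below is a minimal pure-Python sketch of that idea; the function names `naive_softmax` and `stable_softmax` and the sample logits are made up for illustration, and this is not necessarily how PyTorch implements it internally.

```python
import math

def naive_softmax(xs):
    # Direct translation of the definition; exp() overflows for large inputs.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def stable_softmax(xs):
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1000.0, 1001.0, 1002.0]  # illustrative values, large on purpose

naive_error = None
try:
    naive_softmax(logits)
except OverflowError as err:
    # math.exp(1000.0) does not fit in a 64-bit float
    naive_error = err

stable = stable_softmax(logits)
print(naive_error)
print(stable)  # roughly [0.090, 0.245, 0.665]
```

The shifted version gives the same probabilities as softmax of [0, 1, 2], since only the differences between inputs matter.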

Why is LogSoftmax more stable?
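
A short way to see it: the log of softmax can be computed as x_i minus log-sum-exp of the inputs, so the probabilities themselves never need to be formed and a tiny probability never turns into log(0). The sketch below assumes the usual log-sum-exp formulation; `log_softmax` here is a hypothetical pure-Python helper, not the PyTorch function, and the logits are illustrative.

```python
import math

def log_softmax(xs):
    # log softmax(x)_i = x_i - logsumexp(x), with the max factored out
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

logits = [-1000.0, 0.0, 1000.0]  # illustrative, extreme on purpose
result = log_softmax(logits)
print(result)  # [-2000.0, -1000.0, 0.0]

# Going through probabilities first loses information:
# the two small probabilities underflow to exactly 0.0 ...
probs = [math.exp(v) for v in result]
print(probs)  # [0.0, 0.0, 1.0]

# ... and taking the log of 0.0 afterwards fails.
log_of_zero_failed = False
try:
    math.log(probs[0])
except ValueError:
    log_of_zero_failed = True
print(log_of_zero_failed)  # True
```

Computing the log directly keeps the values -2000 and -1000 finite, while softmax followed by log cannot represent them.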

An example to compare the output of Softmax and LogSoftmax

>>> import torch
>>> from torch import nn
>>> m = nn.Softmax(dim=1)
>>> input = torch.randn(2, 3)
>>> output = m(input)
>>> input
tensor([[-1.1723,  0.3103,  1.7434],
        [ 0.1054,  0.0876,  1.9890]])
>>> output
tensor([[0.0419, 0.1845, 0.7736],
        [0.1168, 0.1148, 0.7684]])
>>> mm = nn.LogSoftmax(dim=1)
>>> output2 = mm(input)
>>> output2
tensor([[-3.1724, -1.6899, -0.2568],
        [-2.1470, -2.1649, -0.2634]])
>>> # exponentiating the log-probabilities recovers the Softmax output
>>> torch.exp(output2)
tensor([[0.0419, 0.1845, 0.7736],
        [0.1168, 0.1148, 0.7684]])
>>>
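
The first row of the session above can be checked by hand. The following pure-Python sketch recomputes it from the printed inputs. Those inputs are rounded to four decimals, so agreement is only expected to about three decimal places; the helper name `log_softmax_row` is made up for this example.

```python
import math

def log_softmax_row(xs):
    # x_i - logsumexp(x), with the maximum factored out for stability
    m = max(xs)
    log_sum = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_sum for x in xs]

row = [-1.1723, 0.3103, 1.7434]            # first input row as printed above
log_probs = log_softmax_row(row)            # compare with the LogSoftmax output
probs = [math.exp(v) for v in log_probs]    # compare with the Softmax output

print([round(v, 3) for v in log_probs])
print([round(v, 3) for v in probs])
```

For well-behaved inputs like these, Softmax and exp(LogSoftmax) agree; the difference only shows up once the inputs become extreme.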

Data Scientist/MLE/SWE @takemobi

Jimmy Shen