PyTorch Notes 07: Basic Operations

Posted on 2021-12-16, edited on 2022-01-19, in Learning

Some basic operations in PyTorch:

add / minus / multiply / divide
matmul
pow
sqrt / rsqrt
round

basic

import torch

a = torch.rand(3, 4)
b = torch.rand(4)       # shape [4] is broadcast against each row of a

torch.all(torch.eq(a + b, torch.add(a, b)))     # tensor(True)

torch.all(torch.eq(a - b, torch.sub(a, b)))     # tensor(True)

torch.all(torch.eq(a * b, torch.mul(a, b)))     # tensor(True)

torch.all(torch.eq(a / b, torch.div(a, b)))     # tensor(True)

// is floor (integer) division.
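
As a minimal sketch of the equivalences above, the following uses fixed values instead of `torch.rand` so the results are reproducible. The concrete numbers are assumptions chosen for illustration only.

```python
import torch

# Fixed values (an assumption for illustration) instead of torch.rand
a = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                  [5.0, 6.0, 7.0, 8.0],
                  [9.0, 10.0, 11.0, 12.0]])   # shape [3, 4]
b = torch.tensor([1.0, 2.0, 2.0, 4.0])        # shape [4], broadcast over the 3 rows

total = a + b        # same as torch.add(a, b)
quotient = a / b     # true division, same as torch.div(a, b)
floored = a // b     # floor division

print(total.shape)   # torch.Size([3, 4])
print(quotient[0])   # tensor([1.0000, 1.0000, 1.5000, 1.0000])
print(floored[0])    # tensor([1., 1., 1., 1.])
```

The operator forms and the function forms are interchangeable here; which one to use is mostly a matter of readability.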

matmul (matrix multiplication)

torch.mm: only works on 2-D tensors, so it is not recommended
torch.matmul or @: the preferred forms

* multiplies the elements at matching positions (element-wise); .matmul performs matrix multiplication.

a = torch.full([2, 2], 3.0)
a                   # tensor([[3., 3.],
                    #         [3., 3.]])
b = torch.ones(2, 2)

torch.mm(a, b)      # tensor([[6., 6.],
                    #         [6., 6.]])

torch.matmul(a, b)
a @ b
# Both give the same result as torch.mm(a, b)
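
Because every entry of the matrices above is identical, the difference between `*` and `@` is easy to miss. A small sketch with distinct values (chosen here purely for illustration) makes it visible:

```python
import torch

a = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
b = torch.tensor([[5.0, 6.0],
                  [7.0, 8.0]])

elementwise = a * b     # multiplies matching positions
matrix = a @ b          # rows of a dotted with columns of b

print(elementwise)      # tensor([[ 5., 12.],
                        #         [21., 32.]])
print(matrix)           # tensor([[19., 22.],
                        #         [43., 50.]])
print(torch.equal(matrix, torch.mm(a, b)))   # True
```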

example

Computing a linear layer of a neural network

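x = torch.rand(4, 784)

w = torch.rand(512, 784)    # reduces the feature dimension from 784 to 512
# By convention the weight shape is [channel-out, channel-in]

(x @ w.t()).shape           # torch.Size([4, 512])
# .t() only handles tensors with at most 2 dims; if w has more, use transpose to swap the dims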
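
To connect this to PyTorch's built-in layer, here is a hedged sketch comparing the manual product with `torch.nn.functional.linear`, which uses the same [channel-out, channel-in] weight layout. The `bias` tensor is a hypothetical addition that does not appear in the notes above.

```python
import torch
import torch.nn.functional as F

x = torch.rand(4, 784)        # batch of 4 flattened inputs
w = torch.rand(512, 784)      # [channel-out, channel-in]
bias = torch.rand(512)        # hypothetical bias, not part of the original example

manual = x @ w.t() + bias     # bias is broadcast over the batch dimension
builtin = F.linear(x, w, bias)

print(manual.shape)           # torch.Size([4, 512])
print(torch.allclose(manual, builtin, rtol=1e-4, atol=1e-4))   # True

# For a weight with extra leading dims, swap only the last two dims
w_batched = torch.rand(3, 512, 784)
print(w_batched.transpose(-2, -1).shape)   # torch.Size([3, 784, 512])
```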

tensor matmul

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 3, 64, 32)

torch.mm(a, b).shape            # ERROR: RuntimeError, torch.mm only accepts 2-D tensors

torch.matmul(a, b).shape        # torch.Size([4, 3, 28, 32]); only the last two dims are multiplied, the leading dims are kept

b = torch.rand(4, 1, 64, 32)
torch.matmul(a, b).shape        # torch.Size([4, 3, 28, 32]); the size-1 dim is broadcast to 3

b = torch.rand(2, 64, 32)
torch.matmul(a, b).shape        # ERROR: RuntimeError, batch dims [4, 3] and [2] cannot be broadcast
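
The batched behaviour can be checked directly. The sketch below, using an arbitrarily chosen pair of batch indices, confirms that each batch entry is an ordinary 2-D product and that the incompatible shape raises an error:

```python
import torch

a = torch.rand(4, 3, 28, 64)
b = torch.rand(4, 1, 64, 32)      # batch dims [4, 1] broadcast against [4, 3]

out = torch.matmul(a, b)
print(out.shape)                  # torch.Size([4, 3, 28, 32])

# Each batch entry is a plain 2-D product; b[i, 0] is reused for every j
i, j = 2, 1                       # arbitrary indices for the check
same = torch.allclose(out[i, j], a[i, j] @ b[i, 0], rtol=1e-4, atol=1e-4)
print(same)                       # True

# Batch dims [4, 3] and [2] cannot be broadcast, so matmul raises
raised = False
try:
    torch.matmul(a, torch.rand(2, 64, 32))
except RuntimeError as err:
    raised = True
    print("RuntimeError:", err)
```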

power / sqrt / rsqrt

a = torch.full([2, 2], 3.0)
a.pow(2)        # tensor([[9., 9.],
                #         [9., 9.]])
a ** 2          # same result

aa = a ** 2
aa.sqrt()       # tensor([[3., 3.],
                #         [3., 3.]])
aa.rsqrt()      # tensor([[0.3333, 0.3333],
                #         [0.3333, 0.3333]])
# rsqrt is the reciprocal of sqrt

a ** 0.5        # same as a.sqrt()
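
As a quick check of how these three functions relate, here is a small sketch on distinct values (picked for illustration) rather than a constant tensor:

```python
import torch

a = torch.tensor([1.0, 4.0, 9.0, 16.0])

print(a.pow(2))      # tensor([  1.,  16.,  81., 256.])
print(a.sqrt())      # tensor([1., 2., 3., 4.])
print(a.rsqrt())     # tensor([1.0000, 0.5000, 0.3333, 0.2500])

print(torch.allclose(a.rsqrt(), 1 / a.sqrt()))   # True
print(torch.allclose(a ** 0.5, a.sqrt()))        # True
```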

# exp / log

a = torch.exp(torch.ones(2, 2))
a               # tensor([[2.7183, 2.7183],
                #         [2.7183, 2.7183]])

torch.log(a)    # natural log (base e) by default; log2() and log10() are also available
                # tensor([[1., 1.],
                #         [1., 1.]])
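
The other bases can be sketched the same way. The example below assumes small input values so that `exp` stays well inside the float32 range, and uses `allclose` rather than exact equality because the results are floating-point:

```python
import torch

x = torch.tensor([1.0, 2.0, 8.0])

# log undoes exp
roundtrip_ok = torch.allclose(torch.log(torch.exp(x)), x)

# log2 of powers of two
log2_ok = torch.allclose(torch.log2(x), torch.tensor([0.0, 1.0, 3.0]))

# Any base can be built from the natural log: log10(x) = ln(x) / ln(10)
log10_via_ln = torch.log(x) / torch.log(torch.tensor(10.0))
log10_ok = torch.allclose(torch.log10(x), log10_via_ln)

print(roundtrip_ok, log2_ok, log10_ok)   # True True True
```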


# approximation
- `.floor()` / `.ceil()`: round down and round up
- `.round()`: round to the nearest integer (ties go to the nearest even integer)
- `.trunc()` / `.frac()`: integer part and fractional part
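
The functions listed above differ mainly on negative numbers and on exact halves. A short sketch with values chosen to show those cases:

```python
import torch

a = torch.tensor([3.14, -3.14, 2.5, 3.5])

print(a.floor())   # tensor([ 3., -4.,  2.,  3.])
print(a.ceil())    # tensor([ 4., -3.,  3.,  4.])
print(a.trunc())   # tensor([ 3., -3.,  2.,  3.])
print(a.round())   # tensor([ 3., -3.,  2.,  4.])  2.5 goes to 2, 3.5 goes to 4
print(a.frac())    # fractional parts keep the sign: roughly 0.14, -0.14, 0.5, 0.5

# trunc and frac together reconstruct the original tensor
print(torch.allclose(a.trunc() + a.frac(), a))   # True
```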

# clamp
- gradient clipping (relevant when dealing with vanishing and exploding gradients)
> Print the gradient norm `w.grad.norm(2)` from time to time to keep an eye on it.

grad = torch.rand(2, 3) * 15
grad                # tensor([[14.8737, 10.1571,  4.4872],
                    #         [11.3591,  8.9101, 14.0524]])

grad.max()          # tensor(14.8737)
grad.median()       # tensor(10.1571), the median

grad.clamp(10)      # every element below 10 is raised to 10
                    # tensor([[14.8737, 10.1571, 10.0000],
                    #         [11.3591, 10.0000, 14.0524]])

grad.clamp(0, 10)   # values are limited to the range 0 to 10: above 10 becomes 10, below 0 becomes 0
                    # tensor([[10.0000, 10.0000,  4.4872],
                    #         [10.0000,  8.9101, 10.0000]])
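
Since `torch.rand` only produces non-negative values, the lower bound of `clamp(0, 10)` never triggers in the run above. The following sketch uses fixed values, including negatives (all assumed for illustration), and the keyword forms `min=` and `max=`:

```python
import torch

# Fixed values chosen for illustration, including negatives
grad = torch.tensor([[14.0, -3.0, 4.5],
                     [11.0, 9.0, -12.0]])

low = grad.clamp(min=10)       # lower bound only
both = grad.clamp(-10, 10)     # symmetric range, useful for signed gradients

print(low)    # tensor([[14., 10., 10.],
              #         [11., 10., 10.]])
print(both)   # values: [[10, -3, 4.5], [10, 9, -10]]

# clamp returns a new tensor; clamp_ modifies the tensor in place
grad.clamp_(max=10)
print(grad.max())   # tensor(10.)
```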


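Gradient clipping:

for w in []:                    # [] stands for the list of weight tensors (wlist)
    w.grad.clamp_(-10, 10)      # limit every gradient element to the range -10 to 10, in place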
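
In practice, PyTorch also ships ready-made helpers for this. The sketch below uses `torch.nn.utils.clip_grad_norm_`, which rescales all gradients so that their combined 2-norm does not exceed a limit. Note that this is norm-based clipping, a different technique from the element-wise clamp loop above (for which `torch.nn.utils.clip_grad_value_` is the closer match). The tiny model, the input, and the artificially scaled loss are hypothetical and only serve to produce large gradients.

```python
import torch

torch.manual_seed(0)
model = torch.nn.Linear(784, 10)        # hypothetical tiny model
x = torch.rand(4, 784)
loss = model(x).pow(2).sum() * 100      # scaled up so the gradients are large
loss.backward()

# clip_grad_norm_ returns the total norm measured before clipping
norm_before = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10.0)

# Recompute the combined 2-norm of all gradients after clipping
norm_after = torch.stack([p.grad.norm(2) for p in model.parameters()]).norm(2)

print(norm_before)   # well above 10
print(norm_after)    # approximately 10
```

The call belongs between `loss.backward()` and the optimizer step, so that the update uses the clipped gradients.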