You may have noticed that the implementations from scratch and the concise implementations using framework functionality were quite similar in the case of regression. The same is true for classification. Since many models in this book deal with classification, it is worth adding functionality to support this setting specifically. This section provides a base class for classification models to simplify future code.
import torch
from d2l import torch as d2l
from mxnet import autograd, gluon, np, npx
from d2l import mxnet as d2l

npx.set_np()
from functools import partial
import jax
import optax
from jax import numpy as jnp
from d2l import jax as d2l
No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)
import tensorflow as tf
from d2l import tensorflow as d2l
4.3.1. The Classifier Class
We define the Classifier class below. In validation_step we report both the loss value and the classification accuracy on a validation batch. We draw an update for every num_val_batches batches. This has the benefit of generating the averaged loss and accuracy on the whole validation data. These average numbers are not exactly correct if the last batch contains fewer examples, but we ignore this minor difference to keep the code simple.
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def validation_step(self, batch):
        Y_hat = self(*batch[:-1])
        self.plot('loss', self.loss(Y_hat, batch[-1]), train=False)
        self.plot('acc', self.accuracy(Y_hat, batch[-1]), train=False)
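The caveat about the last batch can be seen with a small numeric sketch (the numbers below are illustrative assumptions, not from the book): averaging per-batch means weights every batch equally, so a smaller final batch skews the result slightly relative to the true mean over all examples.

import torch

losses = torch.tensor([1.0, 2.0, 3.0, 4.0, 5.0])   # losses of 5 validation examples
batches = [losses[:2], losses[2:4], losses[4:]]     # batch sizes 2, 2, 1
per_batch_mean = torch.stack([b.mean() for b in batches]).mean()
true_mean = losses.mean()
print(per_batch_mean, true_mean)                    # tensor(3.3333) vs. tensor(3.)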
We also redefine the training_step method for JAX, since all models that subclass Classifier later will have a loss that returns auxiliary data. This auxiliary data can be used for models with batch normalization (to be explained in Section 8.5), while in all other cases we will make the loss also return a placeholder (an empty dictionary) to represent the auxiliary data.
class Classifier(d2l.Module):  #@save
    """The base class of classification models."""
    def training_step(self, params, batch, state):
        # Here value is a tuple since models with BatchNorm layers require
        # the loss to return auxiliary data
        value, grads = jax.value_and_grad(
            self.loss, has_aux=True)(params, batch[:-1], batch[-1], state)
        l, _ = value
        self.plot("loss", l, train=True)
        return value, grads

    def validation_step(self, params, batch, state):
        # Discard the second returned value. It is used for training models
        # with BatchNorm layers since loss also returns auxiliary data
        l, _ = self.loss(params, batch[:-1], batch[-1], state)
        self.plot('loss', l, train=False)
        self.plot('acc', self.accuracy(params, batch[:-1], batch[-1], state),
                  train=False)
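To make the has_aux pattern concrete, here is a minimal sketch (not part of the book's code; the toy loss function, weights, and data are illustrative assumptions) showing how jax.value_and_grad with has_aux=True returns both the loss value and the auxiliary data alongside the gradients, which is exactly the structure training_step above unpacks.

import jax
from jax import numpy as jnp

def toy_loss(w, X, Y):
    Y_hat = X @ w
    l = jnp.mean((Y_hat - Y) ** 2)
    aux = {}  # placeholder auxiliary data (e.g., updated BatchNorm statistics)
    return l, aux

w = jnp.ones((3,))
X, Y = jnp.ones((2, 3)), jnp.zeros((2,))
(l, aux), grads = jax.value_and_grad(toy_loss, has_aux=True)(w, X, Y)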
By default we use a stochastic gradient descent optimizer, operating on minibatches, just as we did in the context of linear regression.
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return torch.optim.SGD(self.parameters(), lr=self.lr)
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    params = self.parameters()
    if isinstance(params, list):
        return d2l.SGD(params, self.lr)
    return gluon.Trainer(params, 'sgd', {'learning_rate': self.lr})
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return optax.sgd(self.lr)
@d2l.add_to_class(d2l.Module)  #@save
def configure_optimizers(self):
    return tf.keras.optimizers.SGD(self.lr)
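As a quick illustration of what this optimizer does, the sketch below (PyTorch; the toy linear model, synthetic minibatch, and learning rate are illustrative assumptions, not part of the book's code) runs one minibatch SGD step, i.e. every parameter is nudged in the direction of its negative gradient, scaled by the learning rate.

import torch

net = torch.nn.Linear(4, 3)                            # toy 3-class model
opt = torch.optim.SGD(net.parameters(), lr=0.1)
X, y = torch.randn(8, 4), torch.randint(0, 3, (8,))    # one synthetic minibatch
loss = torch.nn.functional.cross_entropy(net(X), y)
opt.zero_grad()
loss.backward()
opt.step()   # w <- w - lr * grad for every parameter of net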
4.3.2. Accuracy
Given the predicted probability distribution y_hat, we typically choose the class with the highest predicted probability whenever we must output a hard prediction. Indeed, many applications require that we make a choice. For instance, Gmail must categorize an email into "Primary", "Social", "Updates", "Forums", or "Spam". It may estimate probabilities internally, but at the end of the day it has to choose one among the classes.
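For instance, with the hypothetical scores below (PyTorch; the numbers are made up for illustration), argmax turns each row of predicted probabilities into a single hard class label.

import torch

y_prob = torch.tensor([[0.1, 0.8, 0.1],
                       [0.6, 0.3, 0.1]])   # predicted probabilities, 2 examples x 3 classes
hard_pred = y_prob.argmax(axis=1)          # tensor([1, 0]): the most probable class per row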
Predictions are correct when they are consistent with the label class y. The classification accuracy is the fraction of all predictions that are correct. Although it can be difficult to optimize accuracy directly (it is not differentiable), it is often the performance measure that we care most about, and it is nearly always the relevant quantity in benchmarks. As such, we will almost always report it when training classifiers.
Accuracy is computed as follows. First, if y_hat is a matrix, we assume that the second dimension stores prediction scores for each class. We use argmax to obtain the predicted class by the index of the largest entry in each row. Then we compare the predicted classes with the ground truth y elementwise. Since the equality operator == is sensitive to data types, we convert y_hat's data type to match that of y. The result is a tensor containing entries of 0 (false) and 1 (true). Summing them yields the number of correct predictions.
@d2l.add_to_class(Classifier)  #@save
def accuracy(self, Y_hat, Y, averaged=True):
    """Compute the number of correct predictions."""
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    preds = Y_hat.argmax(axis=1).type(Y.dtype)
    compare = (preds == Y.reshape(-1)).type(torch.float32)
    return compare.mean() if averaged else compare
@d2l.add_to_class(Classifier)  #@save
def accuracy(self, Y_hat, Y, averaged=True):
    """Compute the number of correct predictions."""
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    preds = Y_hat.argmax(axis=1).astype(Y.dtype)
    compare = (preds == Y.reshape(-1)).astype(np.float32)
    return compare.mean() if averaged else compare

@d2l.add_to_class(d2l.Module)  #@save
def get_scratch_params(self):
    params = []
    for attr in dir(self):
        a = getattr(self, attr)
        if isinstance(a, np.ndarray):
            params.append(a)
        if isinstance(a, d2l.Module):
            params.extend(a.get_scratch_params())
    return params

@d2l.add_to_class(d2l.Module)  #@save
def parameters(self):
    params = self.collect_params()
    return params if isinstance(params, gluon.parameter.ParameterDict) and len(
        params.keys()) else self.get_scratch_params()
@d2l.add_to_class(Classifier)  #@save
@partial(jax.jit, static_argnums=(0, 5))
def accuracy(self, params, X, Y, state, averaged=True):
    """Compute the number of correct predictions."""
    Y_hat = state.apply_fn({'params': params,
                            'batch_stats': state.batch_stats},  # BatchNorm only
                           *X)
    Y_hat = Y_hat.reshape((-1, Y_hat.shape[-1]))
    preds = Y_hat.argmax(axis=1).astype(Y.dtype)
    compare = (preds == Y.reshape(-1)).astype(jnp.float32)
    return compare.mean() if averaged else compare
@d2l.add_to_class(Classifier)  #@save
def accuracy(self, Y_hat, Y, averaged=True):
    """Compute the number of correct predictions."""
    Y_hat = tf.reshape(Y_hat, (-1, Y_hat.shape[-1]))
    preds = tf.cast(tf.argmax(Y_hat, axis=1), Y.dtype)
    compare = tf.cast(preds == tf.reshape(Y, -1), tf.float32)
    return tf.reduce_mean(compare) if averaged else compare
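A minimal usage sketch, assuming the PyTorch Classifier and its accuracy method defined above have been executed; the synthetic scores and labels are illustrative assumptions.

import torch

model = Classifier()
Y_hat = torch.tensor([[0.1, 0.8, 0.1],
                      [0.9, 0.05, 0.05],
                      [0.2, 0.3, 0.5],
                      [0.3, 0.2, 0.5]])   # scores for 4 examples over 3 classes
Y = torch.tensor([1, 0, 2, 1])            # ground-truth labels
print(model.accuracy(Y_hat, Y))                    # tensor(0.7500)
print(model.accuracy(Y_hat, Y, averaged=False))    # per-example 0/1 correctness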
4.3.3. Summary
Classification is a sufficiently common problem that it warrants its own convenience functions. Of central importance in classification is the accuracy of the classifier. Note that while we often care primarily about accuracy, we train classifiers to optimize a variety of other objectives for statistical and computational reasons. However, regardless of which loss function was minimized during training, it is useful to have a convenience method for assessing the accuracy of our classifier empirically.
4.3.4. Exercises
1. Denote by $L_\mathrm{v}$ the validation loss, and let $L_\mathrm{v}^\mathrm{q}$ be its quick and dirty estimate computed by the loss function averaging in this section. Lastly, denote by $l_\mathrm{v}^\mathrm{b}$ the loss on the last minibatch. Express $L_\mathrm{v}$ in terms of $L_\mathrm{v}^\mathrm{q}$, $l_\mathrm{v}^\mathrm{b}$, and the sample and minibatch sizes.

2. Show that the quick and dirty estimate $L_\mathrm{v}^\mathrm{q}$ is unbiased, that is, show that $E[L_\mathrm{v}] = E[L_\mathrm{v}^\mathrm{q}]$. Why would you still want to use $L_\mathrm{v}$ instead?

3. Given a multiclass classification loss, denoting by $l(y, y')$ the penalty of estimating $y'$ when we see $y$, and given a probability $p(y \mid x)$, formulate the rule for an optimal selection of $y'$. Hint: express the expected loss using $l$ and $p(y \mid x)$.