<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://lug.ustc.edu.cn/feed/planet.xml" rel="self" type="application/atom+xml" /><link href="https://lug.ustc.edu.cn/" rel="alternate" type="text/html" /><updated>2026-02-10T14:05:41+08:00</updated><id>https://lug.ustc.edu.cn/feed/planet.xml</id><title type="html">LUG @ USTC | Planet</title><subtitle>Linux User Group, University of Science and Technology of China</subtitle><author><name>USTCLUG</name></author><entry><title type="html">A Better Command-Line Experience: Terminal Multiplexing with tmux</title><link href="https://lug.ustc.edu.cn/planet/2025/07/how-to-use-tmux/" rel="alternate" type="text/html" title="A Better Command-Line Experience: Terminal Multiplexing with tmux" /><published>2025-07-06T00:00:00+08:00</published><updated>2025-07-16T10:31:56+08:00</updated><id>https://lug.ustc.edu.cn/planet/2025/07/how-to-use-tmux</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2025/07/how-to-use-tmux/"><![CDATA[<p>This article covers the basics of tmux and shares some configuration I have settled on after long-term use.</p>
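      <h2 id="背景--原理">Background &amp; How It Works</h2>
      <p>When working on the command line, we often need a program to keep running after the current terminal exits. Traditionally, we might run it in the background under <code class="language-plaintext highlighter-rouge">nohup</code>, for example:</p>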
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>➜  ~ <span class="nb">nohup </span>ping 127.0.0.1 &amp;
<span class="o">[</span>1] 89083
<span class="nb">nohup</span>: ignoring input and appending output to <span class="s1">'nohup.out'</span>
➜  ~ ps <span class="nt">-elf</span> |grep ping
0 S yfy        89083   88672  0  85   5 -  3352 -      11:39 pts/0    00:00:00 ping 127.0.0.1
</code></pre>
        </div>
      </div>
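      <p>After the terminal disconnects, log back in to the server and you will find the ping process still running. By default, nohup appends the program's output to the <code class="language-plaintext highlighter-rouge">nohup.out</code> file, which you can verify with tail.</p>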
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>➜  ~ <span class="nb">tail</span> <span class="nt">-f</span> nohup.out
64 bytes from 127.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span>7 <span class="nv">ttl</span><span class="o">=</span>64 <span class="nb">time</span><span class="o">=</span>0.024 ms
64 bytes from 127.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span>8 <span class="nv">ttl</span><span class="o">=</span>64 <span class="nb">time</span><span class="o">=</span>0.027 ms
64 bytes from 127.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span>9 <span class="nv">ttl</span><span class="o">=</span>64 <span class="nb">time</span><span class="o">=</span>0.023 ms
64 bytes from 127.0.0.1: <span class="nv">icmp_seq</span><span class="o">=</span>10 <span class="nv">ttl</span><span class="o">=</span>64 <span class="nb">time</span><span class="o">=</span>0.035 ms
➜  ~ ps <span class="nt">-elf</span> |head <span class="nt">-n1</span>
F S UID          PID    PPID  C PRI  NI ADDR SZ WCHAN  STIME TTY          TIME CMD
➜  ~ ps <span class="nt">-elf</span> |grep ping
0 S yfy        89083       1  0  85   5 -  3352 -      11:39 ?        00:00:00 ping 127.0.0.1
</code></pre>
        </div>
      </div>
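      <p>Normally, when a terminal exits (for example, the terminal window is closed or the ssh connection drops), the shell sends a SIGHUP signal to all of its child processes, which terminates them. nohup makes the process ignore SIGHUP, so it keeps running in the background.</p>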
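      <p>The effect of ignoring SIGHUP can be reproduced in a few lines. The following is a minimal sketch, assuming a Unix-like system; the function name <code class="language-plaintext highlighter-rouge">survives_sighup</code> is made up for illustration, and it only mimics the signal disposition that nohup sets up rather than showing how nohup itself is implemented.</p>

```python
import os
import signal


def survives_sighup() -> bool:
    """Ignore SIGHUP (the disposition nohup sets up), then send it to ourselves."""
    previous = signal.signal(signal.SIGHUP, signal.SIG_IGN)
    try:
        # With the default disposition this signal would terminate the process.
        os.kill(os.getpid(), signal.SIGHUP)
        return True  # reached only because the signal was ignored
    finally:
        # Restore the earlier handler if Python knows what it was.
        if previous is not None:
            signal.signal(signal.SIGHUP, previous)


if __name__ == "__main__":
    print("still alive after SIGHUP:", survives_sighup())
```

      <p>Removing the <code class="language-plaintext highlighter-rouge">signal.signal(...)</code> call and running the script again should end the process at the <code class="language-plaintext highlighter-rouge">os.kill</code> line, which is what happens to ordinary child processes when the terminal goes away.</p>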
      <blockquote>
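        <p>Tip: the ps output also shows that the parent of the ping process has become 1. The original parent, the shell, has exited, and the Linux kernel re-parents orphaned processes to the init/systemd process.</p>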
      </blockquote>
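      <p>nohup is handy in simple cases but does not suit more complex ones: the user can no longer send input to the background process, and managing several processes at once is inconvenient. There is a more elegant and more powerful solution: tmux.</p>
      <p>tmux is a terminal multiplexer (<strong>T</strong>erminal <strong>MU</strong>ltiple<strong>X</strong>er). In short, tmux lets you create multiple windows inside one terminal and split each window into multiple panes, each running its own shell process. You can thus run several applications side by side without opening multiple terminal emulator windows (such as Xshell, PuTTY, or Windows Terminal). The figure below shows a tmux window split into a top and a bottom pane, each running zsh.</p>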
      <p><img src="/static/planet/20250705151056.png" alt="tmux example" /></p>
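      <p>More importantly, tmux decouples the terminal from the session. A session consists of the windows and panes you create, together with all the shell processes and child processes running in them. You can detach from the current session at any time and it keeps running in the background on the server; later you can attach to it again and pick up where you left off.</p>
      <p>Under the hood, tmux uses a client/server (C/S) architecture. What the user sees and interacts with is the tmux client, while the windows created from it (and their shell processes) belong to the tmux server process. That is why the contents of a session keep running in the background after the user exits the tmux client.
        With pstree we can see that the parent of the two zsh processes in the figure above is <code class="language-plaintext highlighter-rouge">tmux: server</code>, whereas the process the user interacts with is <code class="language-plaintext highlighter-rouge">tmux: client</code>, which has no child processes of its own.</p>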
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>➜  ~ pstree <span class="nt">-u</span> yfy|less
sshd---zsh---tmux: client
tmux: server-+-zsh-+-less
             |     <span class="sb">`</span><span class="nt">-pstree</span>
             <span class="sb">`</span><span class="nt">-zsh---ping</span>
</code></pre>
        </div>
      </div>
      <p>With this brief look at how tmux works, let's get hands-on. The basics of tmux are very simple and take about 10 minutes to learn.</p>
      <h2 id="tmux-quick-start">tmux quick start</h2>
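      <p>Most Linux distributions ship a <code class="language-plaintext highlighter-rouge">tmux</code> package that can be installed through the package manager. For other installation methods (such as building from source), see the official documentation: <a href="https://github.com/tmux/tmux/wiki/Installing">Installing · tmux/tmux Wiki</a>.</p>
      <p>The first step after installing tmux is to create a tmux session. Running the <code class="language-plaintext highlighter-rouge">tmux</code> command enters tmux and creates the first window, where you can run commands as usual.</p>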
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>tmux
</code></pre>
        </div>
      </div>
      <p>Next come the most important operations: detaching from a session and reattaching to it. Press <code class="language-plaintext highlighter-rouge">Ctrl-B</code>, then <code class="language-plaintext highlighter-rouge">d</code>, to detach from the current session. This returns you to the shell from which you ran tmux.</p>
      <p><code class="language-plaintext highlighter-rouge">tmux list-sessions</code> or <code class="language-plaintext highlighter-rouge">tmux ls</code> shows all current sessions.</p>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>➜  ~ tmux <span class="nb">ls
</span>0: 1 windows <span class="o">(</span>created Thu Jul  3 14:06:05 2025<span class="o">)</span>
</code></pre>
        </div>
      </div>
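      <p>If you want to consume this listing from a script, it can be parsed into structured records. The sketch below is illustrative only: <code class="language-plaintext highlighter-rouge">parse_tmux_ls</code> is a hypothetical helper, and it assumes the default output layout shown above (name, window count, creation time, optional <code class="language-plaintext highlighter-rouge">(attached)</code> marker).</p>

```python
import re

# One line of default `tmux ls` output, e.g.
#   0: 1 windows (created Thu Jul  3 14:06:05 2025)
LINE = re.compile(r"^([^:]+): (\d+) windows? \(created ([^)]+)\)(.*)$")


def parse_tmux_ls(output):
    """Turn the text printed by `tmux ls` into a list of dictionaries."""
    sessions = []
    for line in output.splitlines():
        match = LINE.match(line.strip())
        if match is None:
            continue  # skip anything that does not look like a session line
        name, windows, created, rest = match.groups()
        sessions.append(
            {
                "name": name,
                "windows": int(windows),
                "created": created,
                "attached": "(attached)" in rest,
            }
        )
    return sessions


if __name__ == "__main__":
    sample = "0: 1 windows (created Thu Jul  3 14:06:05 2025)"
    print(parse_tmux_ls(sample))
```

      <p>For anything beyond a quick script, asking tmux for an explicit format, for example <code class="language-plaintext highlighter-rouge">tmux list-sessions -F '#{session_name}'</code>, is likely more robust than parsing the human-readable layout.</p>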
      <p><code class="language-plaintext highlighter-rouge">tmux attach</code> reconnects to the session you just left (when no target is given, tmux picks the most recently used session).</p>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>tmux a  <span class="c"># attach</span>
</code></pre>
        </div>
      </div>
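      <p>Next is tmux's key feature, multi-window management, which greatly improves multitasking efficiency.
        tmux has three levels of granularity: session, window, and pane. The session is the largest unit; a session can hold multiple windows, and a window can be split into multiple panes. This section focuses on the window and pane shortcuts.</p>
      <h3 id="窗口操作">Window operations</h3>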
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
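          <pre class="highlight"><code><span class="c"># Create and close</span>
Ctrl-B c <span class="c"># create a new window</span>
Ctrl-B &amp; <span class="c"># close the current window</span>

<span class="c"># Switch</span>
Ctrl-B Tab        <span class="c"># switch to the previously active window (stock tmux uses Ctrl-B l; Tab needs a custom binding)</span>
Ctrl-B p          <span class="c"># switch to the previous window</span>
Ctrl-B n          <span class="c"># switch to the next window</span>
Ctrl-B number     <span class="c"># switch to the window with that number</span>

<span class="c"># Rename</span>
Ctrl-B ,          <span class="c"># rename the current window</span>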
</code></pre>
        </div>
      </div>
      <h3 id="窗格操作">Pane operations</h3>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
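          <pre class="highlight"><code><span class="c"># Create and close</span>
Ctrl-B "      <span class="c"># split into top and bottom panes</span>
Ctrl-B %      <span class="c"># split into left and right panes</span>
Ctrl-B x      <span class="c"># close the current pane</span>

<span class="c"># Switch</span>
Ctrl-B arrow key   <span class="c"># move to the pane in that direction</span>
Ctrl-B <span class="o">[</span>hjkl]     <span class="c"># vi-style keys: h left, j down, k up, l right (not bound in stock tmux; needs custom bindings)</span>

Ctrl-B z      <span class="c"># toggle full-screen (zoom) for the current pane</span>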
</code></pre>
        </div>
      </div>
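      <p>After trying these shortcuts you can probably already feel how convenient and powerful tmux is. Like vim, tmux is highly flexible, and a single shortcut you did not know about can often greatly improve the experience, so it pays to keep exploring. <code class="language-plaintext highlighter-rouge">tmux list-keys</code> lists all of tmux's key bindings.</p>
      <p>This article covers only tmux's shortcuts. tmux is designed cleanly: every shortcut above maps to a subcommand. For example, splitting a window into top and bottom panes corresponds to <code class="language-plaintext highlighter-rouge">tmux split-window -v</code>. If you cannot remember a shortcut, you can run the tmux command directly. The <code class="language-plaintext highlighter-rouge">COMMAND</code> section of the tmux man page explains every command in detail. Also not covered here is tmux's vi copy mode, which makes it easy to search and copy program output; it is a more advanced topic that interested readers can explore on their own.</p>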
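      <p>Because every shortcut has a command form, a layout can also be scripted. The sketch below builds (but does not run) the argument lists for a detached session with two stacked panes; <code class="language-plaintext highlighter-rouge">two_pane_session</code> and the session name <code class="language-plaintext highlighter-rouge">demo</code> are made up for illustration, while <code class="language-plaintext highlighter-rouge">new-session</code>, <code class="language-plaintext highlighter-rouge">split-window</code> and <code class="language-plaintext highlighter-rouge">send-keys</code> are regular tmux commands.</p>

```python
import shlex


def two_pane_session(session, top_command, bottom_command):
    """Return tmux invocations that create a session with two stacked panes."""
    return [
        # Create the session without attaching to it.
        ["tmux", "new-session", "-d", "-s", session],
        # Type a command into the first pane and press Enter.
        ["tmux", "send-keys", "-t", session, top_command, "Enter"],
        # Same effect as the top/bottom split shortcut; the new pane becomes active.
        ["tmux", "split-window", "-v", "-t", session],
        ["tmux", "send-keys", "-t", session, bottom_command, "Enter"],
    ]


if __name__ == "__main__":
    for argv in two_pane_session("demo", "ping 127.0.0.1", "tail -f nohup.out"):
        print(shlex.join(argv))
```

      <p>To actually apply the layout, each list could be passed to <code class="language-plaintext highlighter-rouge">subprocess.run(argv, check=True)</code> on a machine with tmux installed, followed by <code class="language-plaintext highlighter-rouge">tmux attach -t demo</code>.</p>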
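      <h2 id="常用配置分享">Commonly used configuration</h2>
      <p>Like vim, tmux is highly customizable: key bindings and more can be configured in <code class="language-plaintext highlighter-rouge">~/.tmux.conf</code>, and projects such as <a href="https://github.com/gpakosz/.tmux">oh-my-tmux</a> provide ready-made configuration. Below are the settings that, after long-term use, I consider essential; the complete configuration is in my <a href="https://github.com/TheRainstorm/my-vim-config/blob/master/tmux/.tmux.conf">github</a> repository.</p>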
      <blockquote>
        <p>Bonus: set <code class="language-plaintext highlighter-rouge">alias t="tmux"</code> in <code class="language-plaintext highlighter-rouge">~/.bashrc</code> or <code class="language-plaintext highlighter-rouge">~/.zshrc</code>, and typing tmux attach instantly gets 300% faster!</p>
      </blockquote>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>t      <span class="c"># create a session</span>
t a    <span class="c"># attach to the most recently used session</span>
</code></pre>
        </div>
      </div>
      <h3 id="修改快捷键前缀">Changing the prefix key</h3>
      <p><code class="language-plaintext highlighter-rouge">Ctrl-B</code> is an awkward reach on the keyboard. You can change the prefix to <code class="language-plaintext highlighter-rouge">Ctrl-X</code>.</p>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="c"># prefix</span>
<span class="nb">set</span> <span class="nt">-g</span> prefix C-x
unbind-key C-b        <span class="c"># disable default prefix</span>
<span class="nb">bind </span>C-x send-prefix
</code></pre>
        </div>
      </div>
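      <h3 id="划分窗格快捷键">Pane-splitting shortcuts</h3>
      <p>The default shortcuts for left/right and top/bottom splits are hard to remember. Rebind them so that the underscore <code class="language-plaintext highlighter-rouge">_</code> splits top/bottom and <code class="language-plaintext highlighter-rouge">-</code> splits left/right.</p>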
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="c"># @ pane</span>
<span class="c"># split into top and bottom panes (tmux calls -v a vertical split)</span>
<span class="nb">bind </span>_ split-window <span class="nt">-v</span> <span class="nt">-c</span> <span class="s2">"#{pane_current_path}"</span>

<span class="c"># split into left and right panes (tmux calls -h a horizontal split)</span>
<span class="nb">bind</span> - split-window <span class="nt">-h</span> <span class="nt">-c</span> <span class="s2">"#{pane_current_path}"</span>
</code></pre>
        </div>
      </div>
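      <h3 id="鼠标操作">Mouse support</h3>
      <p>tmux does not enable the mouse by default, so you cannot scroll through history with the mouse wheel. With the mouse enabled, you can also click to switch panes and drag to select and copy text.</p>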
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>set-option <span class="nt">-g</span> mouse on <span class="c"># enable mouse support (scrolling, pane selection, drag to copy)</span>
</code></pre>
        </div>
      </div>
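      <h3 id="打开窗口时默认路径">Default path for new windows</h3>
      <p>When tmux opens a new window, the shell starts in the directory from which the tmux client was launched. The following configuration changes this behavior:</p>
      <ul>
        <li><code class="language-plaintext highlighter-rouge">Ctrl-X Alt-C</code>: change the default path to the current path</li>
        <li>When creating a pane, start in the current path (the <code class="language-plaintext highlighter-rouge">-c</code> option)</li>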
      </ul>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="nb">bind </span>M-c attach-session <span class="nt">-c</span> <span class="s2">"#{pane_current_path}"</span> <span class="c"># Alt-C, to change current path</span>

<span class="c"># @ pane</span>
<span class="c"># split into top and bottom panes (tmux calls -v a vertical split)</span>
<span class="nb">bind </span>_ split-window <span class="nt">-v</span> <span class="nt">-c</span> <span class="s2">"#{pane_current_path}"</span>

<span class="c"># split into left and right panes (tmux calls -h a horizontal split)</span>
<span class="nb">bind</span> - split-window <span class="nt">-h</span> <span class="nt">-c</span> <span class="s2">"#{pane_current_path}"</span>
</code></pre>
        </div>
      </div>
      <h3 id="移动窗口顺序">Reordering windows</h3>
      <p><code class="language-plaintext highlighter-rouge">Ctrl-Shift</code> plus the left or right arrow key moves the current window one position to the left or right.</p>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="nb">bind</span> <span class="nt">-n</span> C-S-Left swap-window <span class="nt">-t</span> <span class="nt">-1</span><span class="se">\;</span> <span class="k">select</span><span class="nt">-window</span> <span class="nt">-t</span> <span class="nt">-1</span>
<span class="nb">bind</span> <span class="nt">-n</span> C-S-Right swap-window <span class="nt">-t</span> +1<span class="se">\;</span> <span class="k">select</span><span class="nt">-window</span> <span class="nt">-t</span> +1
</code></pre>
        </div>
      </div>
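      <h3 id="session-操作">session operations</h3>
      <p>Windows and panes are usually enough for multitasking, but tmux's support for multiple sessions is also useful, for example to manage several entirely unrelated tasks.</p>
      <p>Rebind the keys to mirror the window shortcuts: the prefix followed by <code class="language-plaintext highlighter-rouge">Ctrl-C</code> creates a new session, and uppercase <code class="language-plaintext highlighter-rouge">N</code>/<code class="language-plaintext highlighter-rouge">P</code> switch to the next and previous session.</p>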
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="c"># create session</span>
<span class="nb">bind </span>C-c new-session

<span class="c"># session navigation</span>
<span class="c"># the default keys ( and ) also move to the previous/next session</span>
<span class="nb">bind</span> <span class="nt">-r</span> BTab switch-client <span class="nt">-l</span>  <span class="c"># Shift-Tab: move to last session</span>
<span class="nb">bind</span> <span class="nt">-r</span> N switch-client <span class="nt">-n</span>     <span class="c"># move to next session</span>
<span class="nb">bind</span> <span class="nt">-r</span> P switch-client <span class="nt">-p</span>     <span class="c"># move to previous session</span>
</code></pre>
        </div>
      </div>
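      <h3 id="复制文字后不自动跳到结尾">Staying in place after copying text</h3>
      <p>Selecting text with the mouse in tmux copies it automatically, but the view then jumps back to the bottom. When reading through long output history you may not want that; the following configuration disables the jump:</p>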
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="c"># copy select text, and don't jump to end</span>
<span class="c"># https://stackoverflow.com/questions/32374907/tmux-mouse-copy-mode-jumps-to-bottom</span>
<span class="nb">bind</span> <span class="nt">-T</span> copy-mode-vi MouseDragEnd1Pane send-keys <span class="nt">-X</span> copy-selection
</code></pre>
        </div>
      </div>
      <blockquote>
        <p>After you select text with the mouse, tmux stays in copy mode and you cannot type commands; press <code class="language-plaintext highlighter-rouge">q</code> to exit.</p>
      </blockquote>
      <h3 id="tmux-in-tmux">tmux in tmux</h3>
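      <p>Sometimes you want to connect to another server from inside tmux and use tmux there as well. Normally the tmux shortcuts are captured by the outer tmux and never reach the inner one, but the following configuration lets F12 switch between the outer and the inner tmux.</p>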
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
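          <pre class="highlight"><code><span class="c"># Press F12 to hand the keyboard to the nested tmux; on macOS, first release F12 in System Settings</span>
<span class="c"># 1. set prefix to None so the outer tmux stops intercepting shortcuts</span>
<span class="c"># 2. set key-table to off; F12 is bound in the off table below so this mode can be left again</span>
<span class="c"># 3. change the status bar colour to show that nested mode is active</span>
<span class="c"># 4. if the pane is in a special mode, leave it</span>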
unbind <span class="nt">-T</span> root F12
<span class="nb">bind</span> <span class="nt">-T</span> root F12 <span class="se">\</span>
  <span class="nb">set </span>prefix None <span class="se">\;\</span>
  <span class="nb">set </span>key-table off <span class="se">\;\</span>
  <span class="nb">set </span>status-style <span class="nb">bg</span><span class="o">=</span>colour235 <span class="se">\;\</span>
  <span class="k">if</span> <span class="nt">-F</span> <span class="s1">'#{pane_in_mode}'</span> <span class="s1">'send-keys -X cancel'</span> <span class="se">\;\</span>
  refresh-client <span class="nt">-S</span>

<span class="c"># Bind F12 in the off table to restore the previous settings and leave nested mode</span>
<span class="nb">bind</span> <span class="nt">-T</span> off F12 <span class="se">\</span>
  <span class="nb">set</span> <span class="nt">-u</span> prefix <span class="se">\;\</span>
  <span class="nb">set</span> <span class="nt">-u</span> key-table <span class="se">\;\</span>
  <span class="nb">set</span> <span class="nt">-u</span> status-style <span class="se">\;\</span>
  refresh-client <span class="nt">-S</span>
</code></pre>
        </div>
      </div>
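      <p>Another trick: pressing the prefix twice in a row (<code class="language-plaintext highlighter-rouge">Ctrl-B</code> <code class="language-plaintext highlighter-rouge">Ctrl-B</code> with the default prefix) forwards the prefix to the inner tmux.</p>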
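      <p>The routing rules can be made concrete with a toy model. The class below is purely illustrative (<code class="language-plaintext highlighter-rouge">OuterTmux</code> and its <code class="language-plaintext highlighter-rouge">press</code> method are invented here and are not part of tmux); it assumes the default <code class="language-plaintext highlighter-rouge">C-b</code> prefix and reports whether a key is consumed by the outer tmux or passed on to the pane, where the inner tmux runs.</p>

```python
class OuterTmux:
    """Toy model of how the outer tmux routes keys in a nested setup."""

    def __init__(self, prefix="C-b"):
        self.default_prefix = prefix
        self.prefix = prefix
        self.key_table = "root"
        self.awaiting_command = False  # True right after the prefix was pressed

    def press(self, key):
        """Return "outer" if the outer tmux consumes the key, else "inner"."""
        if self.key_table == "off":
            if key == "F12":  # the only binding in the off table
                self.prefix, self.key_table = self.default_prefix, "root"
                return "outer"
            return "inner"  # everything else passes straight through
        if self.awaiting_command:
            self.awaiting_command = False
            # Prefix pressed twice: send-prefix forwards it to the pane.
            return "inner" if key == self.prefix else "outer"
        if key == "F12":  # root-table binding: switch to nested mode
            self.prefix, self.key_table = None, "off"
            return "outer"
        if key == self.prefix:
            self.awaiting_command = True
            return "outer"
        return "inner"  # ordinary typing always reaches the pane


if __name__ == "__main__":
    tmux = OuterTmux()
    for key in ["C-b", "c", "F12", "C-b", "c", "F12"]:
        print(key, "goes to", tmux.press(key))
```

      <p>The model leaves out details such as repeatable bindings and status-bar changes; it is only meant to show why the same keystrokes end up in different places before and after F12.</p>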
      <h2 id="参考资料">References</h2>
      <ul>
        <li><a href="https://www.ruanyifeng.com/blog/2019/10/tmux.html">Tmux 使用教程 - 阮一峰的网络日志</a></li>
        <li><a href="https://hamvocke.com/blog/a-quick-and-easy-guide-to-tmux/">Ham Vocke — A Quick and Easy Guide to tmux - Ham Vocke</a></li>
        <li><a href="https://fatfatson.github.io/2019/08/11/tmux%E5%86%85%E8%81%94%E4%BD%BF%E7%94%A8%E6%96%B9%E6%B3%95/">tmux 内联使用方法</a></li>
      </ul>
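      ]]></content><author><name>TheRainstorm</name></author><category term="Tech Tutorial" /><category term="tmux" /><category term="netcat" /><summary type="html"><![CDATA[This article covers the basics of tmux and shares some configuration I have settled on after long-term use.]]></summary></entry><entry><title type="html">Rootless Docker in Docker in Practice at Hackergame</title><link href="https://lug.ustc.edu.cn/planet/2025/02/hackergame-rootless-docker/" rel="alternate" type="text/html" title="Rootless Docker in Docker in Practice at Hackergame" /><published>2025-02-08T00:00:00+08:00</published><updated>2025-02-27T16:35:18+08:00</updated><id>https://lug.ustc.edu.cn/planet/2025/02/hackergame-rootless-docker</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2025/02/hackergame-rootless-docker/"><![CDATA[<p>This article describes how Rootless Docker in Docker was used to isolate containers for Web challenges in USTC Hackergame 2024.</p>
    <h2 id="背景">Background</h2>
    <p>USTC Hackergame has long used Docker and Docker Compose to deploy and manage its challenges. For <code class="language-plaintext highlighter-rouge">nc</code> challenges, we use a <a href="https://github.com/USTC-Hackergame/hackergame-challenge-docker">simple Python manager</a> that creates a separate challenge container for each authenticated incoming connection, which isolates contestants from one another. To create containers dynamically, we expose <code class="language-plaintext highlighter-rouge">/var/run/docker.sock</code> to this Python program so that it can call the Docker API. Because the program is simple enough, we consider this safe. For Web challenges, we instead required challenge authors to implement isolation inside the challenge itself, which adds cognitive burden and is error-prone: it can lead to unintended solutions or let one contestant interfere with others.</p>
    <p>For Hackergame 2024 we decided to implement container isolation for Web challenges. PKU GeekGame had built a <a href="https://github.com/PKU-GeekGame/web-docker-manager">simple container manager for Web challenges</a> on top of our nc container manager; it likewise manages containers by passing through the Docker socket. However, reverse-proxying Web challenges is far more complex than handling nc, which left us seriously concerned about the security of this approach: if the manager is compromised, the attacker gains direct control of the entire server. We therefore needed a safer design in which a compromised Web challenge manager does not easily yield control of the server, which means running the Docker daemon in an isolated, low-privilege environment.</p>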
    <h2 id="rootless-docker-in-docker">Rootless Docker-in-Docker</h2>
    <p>Rootless Docker is one way to run Docker in a low-privilege environment. Docker officially provides a <a href="https://docs.docker.com/engine/security/rootless/#rootless-docker-in-docker">Rootless Docker in Docker</a> setup that starts with a single command:</p>
    <div class="language-bash highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>docker run <span class="nt">-d</span> <span class="nt">--name</span> dind-rootless <span class="nt">--privileged</span> docker:25.0-dind-rootless
</code></pre>
      </div>
    </div>
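    <p>This setup creates a user namespace inside the container as a non-<code class="language-plaintext highlighter-rouge">root</code> user (UID 1000), unshares the other namespaces, and runs the Docker daemon. However, it has some problems:</p>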
    <ol>
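      <li>It requires the <code class="language-plaintext highlighter-rouge">--privileged</code> option in order to disable seccomp, AppArmor, and mount masks, but this grants the container much higher privileges and may cause security problems.</li>
      <li>It cannot use cgroups to limit container resource usage, because Rootless Docker needs systemd to delegate a cgroup path to the Docker daemon before it can enforce resource limits. Alternatives such as <code class="language-plaintext highlighter-rouge">rlimit</code> can limit resources, but they work per process rather than per container and can easily be disabled. Since Hackergame must enforce container resource limits, this problem is fatal.</li>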
    </ol>
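    <p>Whether the needed controllers have been delegated can be checked by reading the <code class="language-plaintext highlighter-rouge">cgroup.controllers</code> file of a cgroup v2 directory. The sketch below is a hedged illustration: the helper names and the default list of required controllers are assumptions made here, and the directory to inspect depends on your setup (for a systemd user instance it is typically the <code class="language-plaintext highlighter-rouge">user@UID.service</code> cgroup).</p>

```python
from pathlib import Path


def delegated_controllers(cgroup_dir):
    """Return the set of controllers listed in cgroup.controllers of a cgroup v2 directory."""
    text = Path(cgroup_dir, "cgroup.controllers").read_text()
    return set(text.split())


def missing_controllers(cgroup_dir, required=("cpu", "memory", "pids")):
    """List the required controllers that are not available in this cgroup."""
    available = delegated_controllers(cgroup_dir)
    return [name for name in required if name not in available]


if __name__ == "__main__":
    # Assumption: cgroup v2 is mounted at /sys/fs/cgroup on this machine.
    print(missing_controllers("/sys/fs/cgroup"))
```

    <p>An empty result suggests that CPU, memory and process-count limits can be enforced from that cgroup; a non-empty one names what is missing.</p>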
    <h2 id="基于-systemd-user-instance-的-rootless-docker-in-docker">Rootless Docker-in-Docker Based on a Systemd User Instance</h2>
    <p>To solve these problems, we adopted a Rootless Docker-in-Docker design based on a Systemd User Instance. It runs the Rootless Docker daemon inside a Systemd User Instance, which provides resource limits and better security.</p>
    <h3 id="systemd-in-docker">Systemd in Docker</h3>
    <p>The first step is to run <code class="language-plaintext highlighter-rouge">systemd</code> inside a container. The systemd website lists <a href="https://systemd.io/CONTAINER_INTERFACE/">the requirements for running systemd in a container</a>; for Docker, the main points are:</p>
    <ol>
      <li>Retain the <code class="language-plaintext highlighter-rouge">CAP_SYS_ADMIN</code> capability.</li>
      <li>Enable a private <code class="language-plaintext highlighter-rouge">cgroup</code> namespace and mount <code class="language-plaintext highlighter-rouge">/sys/fs/cgroup</code> writable.</li>
      <li>Mount <code class="language-plaintext highlighter-rouge">/tmp</code>, <code class="language-plaintext highlighter-rouge">/run</code>, <code class="language-plaintext highlighter-rouge">/run/lock</code>, and <code class="language-plaintext highlighter-rouge">/var/lib/journal</code> as <code class="language-plaintext highlighter-rouge">tmpfs</code>.</li>
      <li>Set <code class="language-plaintext highlighter-rouge">stop_signal</code> to <code class="language-plaintext highlighter-rouge">SIGRTMIN+3</code> so that systemd can shut down cleanly.</li>
      <li>Disable AppArmor or SELinux.</li>
    </ol>
    <p>We added the following to <code class="language-plaintext highlighter-rouge">docker-compose.yml</code> to meet these requirements:</p>
    <div class="language-yaml highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code><span class="na">cap_add</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">SYS_ADMIN</span>
  <span class="pi">-</span> <span class="s">NET_ADMIN</span>
<span class="na">cgroup</span><span class="pi">:</span> <span class="s">private</span>
<span class="na">devices</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">/dev/net/tun:/dev/net/tun</span>
<span class="na">tmpfs</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">/tmp</span>
  <span class="pi">-</span> <span class="s">/run</span>
  <span class="pi">-</span> <span class="s">/run/lock</span>
  <span class="pi">-</span> <span class="s">/var/lib/journal</span>
<span class="na">stop_signal</span><span class="pi">:</span> <span class="s">SIGRTMIN+3</span>
<span class="na">tty</span><span class="pi">:</span> <span class="kc">true</span>
<span class="na">security_opt</span><span class="pi">:</span>
  <span class="pi">-</span> <span class="s">seccomp=seccomp.json</span>
  <span class="pi">-</span> <span class="s">apparmor=unconfined</span>
  <span class="pi">-</span> <span class="s">systempaths=unconfined</span>
</code></pre>
      </div>
    </div>
    <p><code class="language-plaintext highlighter-rouge">seccomp.json</code> can be obtained <a href="https://github.com/USTC-Hackergame/web-docker-manager/raw/refs/heads/main/rootless/seccomp.json">here</a>. On top of Docker's <a href="https://github.com/moby/moby/raw/refs/heads/master/profiles/seccomp/default.json">default seccomp profile</a>, it additionally allows the <code class="language-plaintext highlighter-rouge">keyctl</code> and <code class="language-plaintext highlighter-rouge">pivot_root</code> system calls, which Docker needs. The configuration also disables mount masks (<code class="language-plaintext highlighter-rouge">systempaths=unconfined</code>), because Docker needs to remount <code class="language-plaintext highlighter-rouge">/sys</code> when starting containers. Finally, we keep the <code class="language-plaintext highlighter-rouge">NET_ADMIN</code> capability because some systemd components require it.</p>
    <p>Note that <code class="language-plaintext highlighter-rouge">cgroup: private</code> does not mount <code class="language-plaintext highlighter-rouge">/sys/fs/cgroup</code> writable, so we have to handle this manually at container startup, which can be done with a custom <code class="language-plaintext highlighter-rouge">entrypoint.sh</code>:</p>
    <div class="language-bash highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code><span class="c">#!/bin/bash</span>

<span class="nb">set</span> <span class="nt">-euo</span> pipefail

mount <span class="nt">--make-rshared</span> /

<span class="c"># Remount cgroup</span>
umount /sys/fs/cgroup
mount <span class="nt">-t</span> cgroup2 <span class="nt">-o</span> rw,relatime,nsdelegate cgroup2 /sys/fs/cgroup

<span class="nb">exec</span> /lib/systemd/systemd
</code></pre>
      </div>
    </div>
    <h3 id="rootless-docker-daemon-in-systemd-user-instance">Rootless Docker Daemon in Systemd User Instance</h3>
    <p>Running the Docker daemon as a non-<code class="language-plaintext highlighter-rouge">root</code> user inside the container is straightforward. First install a Docker build with Rootless support in the container:</p>
    <div class="language-dockerfile highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code><span class="k">RUN </span>apt-get <span class="nb">install</span> <span class="nt">-y</span> ca-certificates curl <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">install</span> <span class="nt">-m</span> 0755 <span class="nt">-d</span> /etc/apt/keyrings <span class="o">&amp;&amp;</span> <span class="se">\
</span>    curl <span class="nt">-fsSL</span> https://download.docker.com/linux/debian/gpg <span class="nt">-o</span> /etc/apt/keyrings/docker.asc <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">chmod </span>a+r /etc/apt/keyrings/docker.asc <span class="o">&amp;&amp;</span> <span class="se">\
</span>    <span class="nb">echo</span> <span class="s2">"deb [arch=</span><span class="si">$(</span>dpkg <span class="nt">--print-architecture</span><span class="si">)</span><span class="s2"> signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian bookworm stable"</span> <span class="o">&gt;</span> /etc/apt/sources.list.d/docker.list <span class="o">&amp;&amp;</span> <span class="se">\
</span>    apt-get update <span class="o">&amp;&amp;</span> <span class="se">\
</span>    apt-get <span class="nb">install</span> <span class="nt">-y</span> docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin uidmap <span class="o">&amp;&amp;</span> <span class="se">\
</span>    systemctl disable docker.service docker.socket containerd.service
</code></pre>
      </div>
    </div>
    <p>Then create a user, say <code class="language-plaintext highlighter-rouge">rootless</code>, switch to it with <code class="language-plaintext highlighter-rouge">machinectl shell rootless@</code>, and run <code class="language-plaintext highlighter-rouge">dockerd-rootless-setuptool.sh install</code> to install the Rootless Docker daemon. The installer automatically creates <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/docker.service</code> and enables it under <code class="language-plaintext highlighter-rouge">default.target</code>. Setting the environment variable <code class="language-plaintext highlighter-rouge">DOCKER_HOST=unix:///run/user/$UID/docker.sock</code> then lets the Docker CLI talk to this Rootless Docker daemon.</p>
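    <p>When a program rather than an interactive shell needs to reach the rootless daemon, the same variable can be set in the child process environment. This is a small sketch under the socket path stated above; <code class="language-plaintext highlighter-rouge">rootless_docker_env</code> is a hypothetical helper name.</p>

```python
import os


def rootless_docker_env(uid=None, base_env=None):
    """Return an environment whose DOCKER_HOST points at the per-user rootless socket."""
    uid = os.getuid() if uid is None else uid
    env = dict(os.environ if base_env is None else base_env)
    env["DOCKER_HOST"] = f"unix:///run/user/{uid}/docker.sock"
    return env


if __name__ == "__main__":
    print(rootless_docker_env()["DOCKER_HOST"])
```

    <p>A call such as <code class="language-plaintext highlighter-rouge">subprocess.run(["docker", "info"], env=rootless_docker_env())</code> would then address the rootless daemon of the current user, assuming it is running.</p>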
    <h3 id="wrapping-up">Wrapping Up</h3>
    <p>Copy out the <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/docker.service</code> file generated by the Rootless Docker installer, create the relative symlink to it in <code class="language-plaintext highlighter-rouge">~/.config/systemd/user/default.target.wants</code>, and add steps to the Dockerfile that create the user and copy these files into the user's home directory. Finally, create an empty file named after the user in <code class="language-plaintext highlighter-rouge">/var/lib/systemd/linger</code> so that the corresponding Systemd User Instance starts automatically. With that, the Rootless Docker daemon starts on its own.</p>
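    <p>The file layout described above can be sketched as follows. This is an illustration rather than the project's actual build step: <code class="language-plaintext highlighter-rouge">enable_user_docker</code> is a made-up helper, the link target <code class="language-plaintext highlighter-rouge">../docker.service</code> is an assumption about what the relative symlink looks like, and the <code class="language-plaintext highlighter-rouge">root</code> argument lets you try it in a scratch directory instead of a real image.</p>

```python
import os
from pathlib import Path


def enable_user_docker(root, user, home):
    """Create the wants-symlink and the linger marker under the given root directory."""
    root = Path(root)
    unit_dir = root / home.lstrip("/") / ".config/systemd/user"
    wants_dir = unit_dir / "default.target.wants"
    wants_dir.mkdir(parents=True, exist_ok=True)

    # Relative link, so it stays valid when the home directory is copied elsewhere.
    link = wants_dir / "docker.service"
    if not link.is_symlink():
        os.symlink("../docker.service", link)

    # An empty file named after the user enables lingering for that user.
    linger_dir = root / "var/lib/systemd/linger"
    linger_dir.mkdir(parents=True, exist_ok=True)
    marker = linger_dir / user
    marker.touch()
    return link, marker


if __name__ == "__main__":
    import tempfile

    print(enable_user_docker(tempfile.mkdtemp(), "rootless", "/home/rootless"))
```

    <p>On a running system, <code class="language-plaintext highlighter-rouge">loginctl enable-linger</code> is the usual way to create the linger marker; writing the file directly is convenient at image build time, when systemd is not running.</p>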
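    <p>The final version of the code is in the <code class="language-plaintext highlighter-rouge">rootless</code> directory of the <a href="https://github.com/USTC-Hackergame/web-docker-manager">USTC-Hackergame/web-docker-manager</a> repository, which also adds pieces for persisting data and exposing some directories to the host.</p>
    <h3 id="安全性分析">Security analysis</h3>
    <p>Rootless Docker uses <a href="https://github.com/rootless-containers/rootlesskit">RootlessKit</a> to unshare a user namespace, which lets an unprivileged user perform the privileged operations needed to create containers. The container (namespace) hierarchy of this design is therefore:</p>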
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>Host (Full Privilege)
└── systemd in Docker (UID 0, CAP_SYS_ADMIN, CAP_NET_ADMIN)
        └── RootlessKit (UID 0 in inner user namespace, with ALL capabilities; UID 1000 in outer user namespace, with no capabilities)
            └── Manager Container
            └── Web Challenge Container 1
            └── Web Challenge Container 2
</code></pre>
      </div>
    </div>
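    <p>If the manager is compromised, the attacker gains the privileges inside the user namespace created by RootlessKit. But because RootlessKit is an unprivileged user in the outer user namespace and runs inside a container, the attacker cannot directly control the whole server or directly access the rest of the host's file system.</p>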
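    <p>The relation between the inner and outer UIDs is recorded in <code class="language-plaintext highlighter-rouge">/proc/PID/uid_map</code>, where each line holds an inner start ID, an outer start ID and a range length. The sketch below parses such a map and translates an inner UID; the helper names and the sample mapping (inner 0 to outer 1000, plus a subordinate range starting at 100000) are assumptions chosen to match the hierarchy above, not values read from the actual deployment.</p>

```python
def parse_uid_map(text):
    """Parse uid_map content into (inside_start, outside_start, count) tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(field) for field in fields)
            entries.append((inside, outside, count))
    return entries


def to_outer_uid(entries, inner_uid):
    """Translate a UID inside the namespace to the UID seen outside, or None if unmapped."""
    for inside, outside, count in entries:
        if inner_uid in range(inside, inside + count):
            return outside + (inner_uid - inside)
    return None


if __name__ == "__main__":
    sample = "0 1000 1\n1 100000 65536\n"
    entries = parse_uid_map(sample)
    print("inner root is outer UID", to_outer_uid(entries, 0))
```

    <p>With a mapping like the sample, root inside the RootlessKit namespace is just UID 1000 one level up, which is why gaining root inside it does not by itself give control over the enclosing container or the host.</p>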
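    <h2 id="总结">Summary</h2>
    <p>With the Rootless Docker in Docker design based on a Systemd User Instance, we succeeded in running the Docker daemon in an isolated, low-privilege environment and thereby isolating containers for Web challenges. The design enforces container resource limits and improves security by isolating the Docker daemon, so that even if the Web challenge manager is compromised, the attacker cannot easily take control of the whole server. It was used for several Web challenges in USTC Hackergame 2024 and ran stably.</p>
    <p>That said, there is still room for improvement. The outer container still needs the <code class="language-plaintext highlighter-rouge">SYS_ADMIN</code> capability, which is in fact a very broad privilege; the <a href="https://github.com/nestybox/sysbox">Sysbox</a> container runtime can unshare a user namespace directly when creating a container, offering an alternative that does not need these privileges and is worth exploring. In addition, for convenience our design disables the AppArmor profile outright, whereas it should only be necessary to create a custom AppArmor profile that relaxes some restrictions (such as allowing <code class="language-plaintext highlighter-rouge">mount</code> and loosening the restrictions under certain paths) rather than going fully <code class="language-plaintext highlighter-rouge">unconfined</code>. This would further improve security but needs more investigation.</p>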
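    ]]></content><author><name>RTXUX</name></author><category term="Hackergame" /><category term="Hackergame" /><category term="Docker" /><category term="Container" /><summary type="html"><![CDATA[This article describes how Rootless Docker in Docker was used to isolate containers for Web challenges in USTC Hackergame 2024.]]></summary></entry><entry><title type="html">Quickly Setting Up a Local HTTP File Server</title><link href="https://lug.ustc.edu.cn/planet/2025/01/local-file-serving/" rel="alternate" type="text/html" title="Quickly Setting Up a Local HTTP File Server" /><published>2025-01-03T00:00:00+08:00</published><updated>2025-01-06T16:32:35+08:00</updated><id>https://lug.ustc.edu.cn/planet/2025/01/local-file-serving</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2025/01/local-file-serving/"><![CDATA[<p>One of USTCLUG's major services is its <a href="https://mirrors.ustc.edu.cn">open source software mirror</a>. What we discuss here, however, is a different kind of file service, one you set up locally to speed up access to remote static files. This article grew out of a small problem we helped a fellow student solve.</p>
  <h2 id="目的">Goal</h2>
  <p>Sample code shipped with some course assignments fetches its input data over HTTP. If that data happens to sit on a server with very slow access (for example, the course sites of many overseas universities under <code class="language-plaintext highlighter-rouge">.edu</code> domains), the assignment's run time grows dramatically: downloading tens of MB over a link that manages only a few KB/s is never a good idea. Worse, the assignment usually requires modifying the sample code, so the input data is downloaded again on every debug run, which is hard to bear. This article, aimed at Windows users, explains how to download the files once by hand and then quickly set up a local HTTP server to serve them, avoiding repeated access to the slow origin. It also offers a workaround for cases where the access URL cannot be changed.</p>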
  <blockquote>
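    <p>Why not write this tutorial for Linux users? With mature, easy-to-use server software such as Apache and Nginx, Linux's excellent package managers, and the tutorials available online, getting an HTTP server up and running on Linux in a few minutes should not be hard. On Windows, configuring a large, heavyweight server like IIS is a hassle, and server software such as Apache/Nginx is also inconvenient because it follows different usage conventions.</p>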
  </blockquote>
  <h2 id="使用-python-httpserver">Using Python <code class="language-plaintext highlighter-rouge">http.server</code></h2>
  <p><strong>Compared with the (dated) <code class="language-plaintext highlighter-rouge">EasyWebSvr</code> described below, Python makes it very easy to set up a simple HTTP server.</strong></p>
  <p>If you already have Python installed, you can simply run</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>python3 <span class="nt">-m</span> http.server &lt;port&gt;
</code></pre>
    </div>
  </div>
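  <p>to start an HTTP server on port <code class="language-plaintext highlighter-rouge">&lt;port&gt;</code> with the current directory as its root. On Windows, the interpreter may be installed as <code class="language-plaintext highlighter-rouge">python</code> or <code class="language-plaintext highlighter-rouge">py</code> rather than <code class="language-plaintext highlighter-rouge">python3</code>.</p>
  <p>To serve a different directory <code class="language-plaintext highlighter-rouge">&lt;path&gt;</code> as the root of the HTTP service, use</p>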
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>python3 <span class="nt">-m</span> http.server &lt;port&gt; <span class="nt">--directory</span> &lt;path&gt;
</code></pre>
    </div>
  </div>
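  <p>To bind to a specific address (for a local service, this should be <code class="language-plaintext highlighter-rouge">127.0.0.1</code>), use the command below. <code class="language-plaintext highlighter-rouge">bind</code> associates a network address (IP and port) with a socket in the server software, so that the server receives network traffic sent to that address. <strong>By default, Python's http.server binds to all interfaces (<code class="language-plaintext highlighter-rouge">0.0.0.0</code>), so any host that can reach your machine over the network can access it</strong>. If you only need local access, use the loopback address <code class="language-plaintext highlighter-rouge">127.0.0.1</code>.</p>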
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>python3 <span class="nt">-m</span> http.server &lt;port&gt; <span class="nt">--bind</span> 127.0.0.1
</code></pre>
    </div>
  </div>
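  <p>For readers who would rather start the server from a script, here is a minimal sketch of roughly what the commands above do, using only the standard library. The file name <code class="language-plaintext highlighter-rouge">data.csv</code>, its contents and the helper <code class="language-plaintext highlighter-rouge">serve_directory</code> are made up for illustration; port <code class="language-plaintext highlighter-rouge">0</code> asks the operating system for any free port.</p>

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request


def serve_directory(directory, host="127.0.0.1", port=0):
    """Serve `directory` over HTTP on the loopback address in a background thread."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer((host, port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Hypothetical data file, standing in for the manually downloaded input.
root = pathlib.Path(tempfile.mkdtemp())
(root / "data.csv").write_text("x,y\n1,2\n", encoding="utf-8")

server = serve_directory(root)
port = server.server_address[1]

# Ignore any system-wide proxy so the request really goes to the local server.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(f"http://127.0.0.1:{port}/data.csv") as resp:
    body = resp.read().decode("utf-8")
print(body)

server.shutdown()
server.server_close()
```

  <p>Binding to <code class="language-plaintext highlighter-rouge">127.0.0.1</code> here plays the same role as <code class="language-plaintext highlighter-rouge">--bind 127.0.0.1</code> on the command line: other machines cannot reach the server.</p>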
  <h2 id="使用-easywebserver">Using EasyWebSvr</h2>
  <h3 id="获取一个-http-服务器">Getting an HTTP server</h3>
  <p>The first step, of course, is to get an HTTP server. Here we pick <code class="language-plaintext highlighter-rouge">EasyWebSvr</code>: a tiny (only a few dozen KB), single-file HTTP server that needs almost no configuration and even supports CGI and PHP. <code class="language-plaintext highlighter-rouge">EasyWebSvr</code> is a long-standing <a href="https://github.com/baojianjob/EasyWebSvr">open source project</a> developed with MSVC, and prebuilt binaries are easy to find (just search for <code class="language-plaintext highlighter-rouge">EasyWebSvr</code>). The download contains <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code> and a few other files. Ignore the other files (they are in fact unnecessary) and simply copy <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code> into an empty folder.</p>
  <blockquote>
    <p>Note: according to the <a href="https://github.com/baojianjob/EasyWebSvr/blob/796448f6b312ad0676add90024940316dfa34299/EasyWebSvr/Socket.cpp#L439">source code</a>, EasyWebSvr binds to <code class="language-plaintext highlighter-rouge">0.0.0.0</code>, which may be a <strong>security problem</strong>. Binding to <code class="language-plaintext highlighter-rouge">127.0.0.1</code> instead probably requires changing the code and rebuilding.</p>
  </blockquote>
  <h3 id="准备数据">Preparing the data</h3>
  <ul>
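    <li>Manually download a copy of the data from the original URL and put it in the empty folder mentioned above.</li>
    <li>Run <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code>; a small window appears.</li>
    <li>Click the hammer button (the menu) in the lower right corner and choose the last item, <code class="language-plaintext highlighter-rouge">设置</code> (Settings).</li>
    <li>In the settings dialog, set <code class="language-plaintext highlighter-rouge">主目录</code> (home directory) to the directory that contains <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code>.</li>
    <li>On the <code class="language-plaintext highlighter-rouge">文档</code> (Documents) tab, check <code class="language-plaintext highlighter-rouge">允许目录浏览</code> (allow directory browsing) and <code class="language-plaintext highlighter-rouge">总是显示目录内容</code> (always show directory contents).</li>
    <li>Click <code class="language-plaintext highlighter-rouge">确定</code> (OK) to save the configuration and close the dialog.</li>
    <li>In the main window, click the red button in the lower right (next to the menu button). It should turn teal. If Windows Firewall prompts you at this point, choose to allow access.</li>
    <li>The server is now set up. Open <code class="language-plaintext highlighter-rouge">http://localhost</code> in a browser to see the file list.</li>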
  </ul>
  <h3 id="修改原先的代码">Modifying the original code</h3>
  <ul>
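    <li>Suppose the data URL in the provided code is <code class="language-plaintext highlighter-rouge">http://some.site.edu/path/to/file/data.csv</code> and the data file sits in the same folder as <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code>. Changing the URL in the code to <code class="language-plaintext highlighter-rouge">http://localhost/data.csv</code> is all it takes to use the local file server.</li>
    <li>When minimized, the server shows a small tray icon. Right-click it for various options, including restoring the main window.</li>
    <li>When you are done, click the teal button. It turns red, which means the server has stopped.</li>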
  </ul>
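  <p>The URL change described above can be sketched as a small helper. This is only an illustration: the function name <code class="language-plaintext highlighter-rouge">to_local_url</code> is hypothetical, and it assumes the data file sits directly in the server's root folder, as in the example.</p>

```python
import posixpath
from urllib.parse import urlsplit


def to_local_url(original_url, host="localhost", port=80):
    """Map a remote data URL to the same file name served from the local server root."""
    filename = posixpath.basename(urlsplit(original_url).path)
    # Port 80 is the HTTP default and can be left out of the URL.
    netloc = host if port == 80 else f"{host}:{port}"
    return f"http://{netloc}/{filename}"


remote = "http://some.site.edu/path/to/file/data.csv"
local_default = to_local_url(remote)
local_8000 = to_local_url(remote, port=8000)
print(local_default)
print(local_8000)
```

  <p>Keeping the URL in one variable or helper like this makes it easy to switch back to the original address before handing in the assignment.</p>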
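  <h3 id="特殊情况">Special cases</h3>
  <h4 id="代码不支持修改地址">The URL in the code cannot be changed</h4>
  <p>If the code does not allow the URL to be changed (this is rare), you can instead recreate the path of the original URL as local folders, like this:</p>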
  <div class="language-text highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>EasyWebSvr.exe
path/                -&gt; create this directory
    to/              -&gt; create this directory
        file/        -&gt; create this directory
            data.csv -&gt; the data file goes here
</code></pre>
    </div>
  </div>
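  <p>The folder layout above follows directly from the path component of the original URL. As a rough sketch (the helper name <code class="language-plaintext highlighter-rouge">local_path_for</code> is hypothetical, and a temporary directory stands in for the folder that holds <code class="language-plaintext highlighter-rouge">EasyWebSvr.exe</code>), the directories could be created like this:</p>

```python
import pathlib
import tempfile
from urllib.parse import urlsplit


def local_path_for(url, server_root):
    """Return where the file behind `url` must live under `server_root`.

    Parent directories are created if they do not exist yet.
    """
    relative = urlsplit(url).path.lstrip("/")
    target = pathlib.Path(server_root) / relative
    target.parent.mkdir(parents=True, exist_ok=True)
    return target


server_root = tempfile.mkdtemp()  # stands in for the folder holding EasyWebSvr.exe
target = local_path_for("http://some.site.edu/path/to/file/data.csv", server_root)
relative = target.relative_to(server_root).as_posix()
print(relative)
```

  <p>The downloaded data file is then copied to the returned path.</p>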
  <p>Then edit the <code class="language-plaintext highlighter-rouge">hosts</code> file (usually in <code class="language-plaintext highlighter-rouge">C:\Windows\System32\drivers\etc</code>; this requires administrator privileges, see a dedicated tutorial for details) and add the line</p>
  <div class="language-text highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>127.0.0.1 some.site.edu
</code></pre>
    </div>
  </div>
  <p>Then run the following with administrator privileges:</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>ipconfig /flushdns
</code></pre>
    </div>
  </div>
  <p>That is all. You can confirm the setup by opening <code class="language-plaintext highlighter-rouge">http://some.site.edu/path/to/file/data.csv</code> in a browser. This change redirects every request for <code class="language-plaintext highlighter-rouge">some.site.edu</code>, so when you are done, delete the line you added to <code class="language-plaintext highlighter-rouge">hosts</code> and run <code class="language-plaintext highlighter-rouge">ipconfig /flushdns</code> once more.</p>
  <h4 id="原地址有非-80-的端口号">The original URL uses a port other than 80</h4>
  <p>If the URL you were given uses another port (for example <code class="language-plaintext highlighter-rouge">some.site.edu:8080</code>) and cannot be changed, enter that port, <code class="language-plaintext highlighter-rouge">8080</code>, in the <code class="language-plaintext highlighter-rouge">端口号</code> (port number) field of the EasyWebSvr settings page. The local address then becomes <code class="language-plaintext highlighter-rouge">localhost:8080</code>.</p>
  <h4 id="提示端口已被占用">The port is reported as already in use</h4>
  <p>This is probably because another application is using port 80. The <code class="language-plaintext highlighter-rouge">netstat -ano</code> command lists the ports in use on the machine together with the owning process IDs. If you do not want to close the program that occupies port 80, move the HTTP file server to another port, such as <code class="language-plaintext highlighter-rouge">8080</code> or <code class="language-plaintext highlighter-rouge">8000</code>, just as in the non-80 case above, and click the red button again to start the server. Remember to add the port number to the URL you access.</p>
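  <p>Whether a port is available can also be checked from a script before starting the server. The sketch below is an illustration, not part of EasyWebSvr: the helper <code class="language-plaintext highlighter-rouge">port_is_free</code> is hypothetical and simply tries to bind a TCP socket to the port.</p>

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP listener can currently bind to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Covers "address already in use" as well as permission errors.
            return False
        return True


# Occupy a port chosen by the OS, then confirm the probe reports it as taken.
holder = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
holder.bind(("127.0.0.1", 0))
holder.listen()
busy_port = holder.getsockname()[1]
free_while_held = port_is_free(busy_port)
holder.close()
print(busy_port, free_while_held)
```

  <p>On systems that restrict low ports to privileged users, this check also returns <code class="language-plaintext highlighter-rouge">False</code> for port 80 when run without those privileges, so a negative result does not always mean that another program holds the port.</p>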
  ]]></content><author><name>luojh</name></author><category term="Tech Tutorial" /><category term="Windows" /><category term="HTTP" /><summary type="html"><![CDATA[USTCLUG 的一个重要服务是它的 开源软件镜像。但是我们这里要讨论的，是另一种自己在本地搭建的文件服务，目的是加速对远程静态文件的访问。这篇文章来源于帮同学解决的一个小问题。]]></summary></entry><entry><title type="html">镜像站 ZFS 实践</title><link href="https://lug.ustc.edu.cn/planet/2024/12/ustc-mirrors-zfs-rebuild/" rel="alternate" type="text/html" title="镜像站 ZFS 实践" /><published>2024-12-09T00:00:00+08:00</published><updated>2026-02-10T14:04:47+08:00</updated><id>https://lug.ustc.edu.cn/planet/2024/12/ustc-mirrors-zfs-rebuild</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2024/12/ustc-mirrors-zfs-rebuild/"><![CDATA[<p>A.K.A. 如何让 2000 元的机械硬盘跑得比 3000 元的固态硬盘还快（</p>
  <p>A separate <a href="https://ibug.io/p/74">English version</a> of this article is available, together with the <a href="https://ibug.io/p/72">slides</a> from a talk given at Nanjing University.</p>
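  <h2 id="background">Background</h2>
  <p>The <a href="https://mirrors.ustc.edu.cn/">USTC open source software mirror</a>, maintained by the USTC Linux User Group, is one of the busiest and most comprehensive open source mirrors among universities in mainland China.
    Between May and June 2024, the mirror served about 36 TiB of traffic per day, most of it in the following two categories:</p>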
  <ul>
    <li>HTTP/HTTPS: 19 TiB of traffic, 17 million requests</li>
    <li>Rsync: 10.3 TiB of traffic, 21.8 thousand requests (147.8 thousand if one misbehaving client is counted)</li>
  </ul>
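  <p>Over the years, as existing repositories grew and new ones were added, disk space on our servers became very tight. At present<sup id="fnref:as-of"><a href="#fn:as-of" class="footnote" rel="footnote" role="doc-noteref">1</a></sup>, both servers that provide the mirror service are close to their capacity limits:</p>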
  <ul>
    <li>The primary (HTTP) server uses XFS and reached 63.3 TiB on December 18, 2023 (66.0 TiB total, 96% used);</li>
    <li>The secondary (Rsync) server uses ZFS and reached 42.4 TiB on November 21, 2023 (43.2 TiB total, 98% used).</li>
  </ul>
  <p>The two servers are configured as follows:</p>
  <dl>
    <dt>HTTP server</dt>
    <dd>
      <ul>
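        <li>Built in fall 2020</li>
        <li>2nd-generation Xeon Scalable processors (Cascade Lake) and 256 GB of DDR4 memory</li>
        <li>Twelve 10 TB HDDs plus one 2 TB SSD</li>
        <li>LVM and XFS on top of hardware RAID</li>
        <li>Because XFS (as of this rebuild) cannot be shrunk, we kept free PEs at the LVM VG level in case other partitions needed to grow</li>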
      </ul>
    </dd>
    <dt>Rsync server</dt>
    <dd>
      <ul>
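        <li>Built in late 2016</li>
        <li>Xeon E5 v4 processors (Broadwell) and 256 GB of DDR4 memory</li>
        <li>Twelve 6 TB HDDs plus a few small SSDs for the operating system and for caching</li>
        <li>A ZFS RAID-Z3 array laid out as 8 data disks + 3 parity disks, with the last disk kept as a hot spare</li>
        <li>All parameters left at their defaults (only <code class="language-plaintext highlighter-rouge">zfs_arc_max</code> was changed)</li>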
      </ul>
    </dd>
  </dl>
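  <p>Disk load on both servers was very high, routinely above 90%, so that even downloads from inside the USTC campus network rarely reached 50 MB/s.
    For machines dedicated to storage, as mirror servers are, this level of performance is clearly unsatisfactory.</p>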
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors-io-utilization-may-2024.png" class="image-popup" title="2024 年 5 月期间镜像站两台服务器的 I/O 负载
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors-io-utilization-may-2024.png" alt="2024 年 5 月期间镜像站两台服务器的 I/O 负载" /></a>
    <figcaption>
      I/O load of the two mirror servers in May 2024
    </figcaption>
  </figure>
  <h2 id="zfs">ZFS</h2>
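  <p>ZFS is known as "the ultimate solution for single-node storage".
    It combines RAID, volume management and a file system in one, and offers advanced features such as snapshots, clones and send/receive.
    All data in ZFS is checksummed, which helps preserve file system integrity even in extreme cases such as bit flips on disk.
    For a dedicated storage server, ZFS looks like a solution that works out of the box, but that impression fades quickly once you see how many tunable parameters it has.</p>
  <p>As preliminary study and experimentation, I added a batch of extra disks to my own workstation and built two ZFS pools on them, then signed up for a few private tracker (PT) sites <s>and started farming upload</s> to generate some disk load to study.
    The <s>farming</s> results were substantial: this single-machine seed box produced 1.20 PiB of upload in two and a half years.</p>
  <p>From these years on PT sites, I can recommend a few important sources for learning ZFS:</p>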
  <ul>
    <li>The blog of Chris Siebenmann at UToronto: <a href="https://utcc.utoronto.ca/~cks/space/blog/">https://utcc.utoronto.ca/~cks/space/blog/</a></li>
    <li>The official OpenZFS documentation: <a href="https://openzfs.github.io/openzfs-docs/">https://openzfs.github.io/openzfs-docs/</a></li>
    <li>A blog post I put together myself: <a href="https://ibug.io/p/62">Understanding ZFS block sizes</a>
      <ul>
        <li>and the references listed at the bottom of that post</li>
      </ul>
    </li>
  </ul>
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/2024-06-05.png" class="image-popup" title="学习 ZFS 过程中的副产物：一个为 qBittorrent 定制的 Grafana 面板（xs
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/2024-06-05.png" alt="一个为 qBittorrent 定制的 Grafana 面板" /></a>
    <figcaption>
      A by-product of learning ZFS: a Grafana dashboard customized for qBittorrent (lol)
    </figcaption>
  </figure>
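  <p>After years of learning ZFS, I realized that the configuration of our mirror servers had a lot of room for improvement. The way to get there is to go all-in on ZFS and tune a handful of parameters properly.</p>
  <h2 id="镜像站">The mirror site</h2>
  <p>Before rebuilding the ZFS pool, we needed to understand and analyze the workload of a mirror site correctly. In short, a mirror site:</p>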
  <ul>
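    <li>Provides file downloads</li>
    <li>
      <s>Also (involuntarily) provides a "home broadband upload/download balancing service"</s>
      <p>(entirely the fault of PCDN operators)</p>
    </li>
    <li>Is read-heavy, and most reads are sequential reads of whole files</li>
    <li>Can tolerate a small amount of data loss, since mirrored content can easily be re-synced from upstream</li>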
  </ul>
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors-file-size-distribution-2024-08.png" class="image-popup" title="2024 年 8 月镜像站上的文件大小分布
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors-file-size-distribution-2024-08.png" alt="2024 年 8 月镜像站上的文件大小分布" /></a>
    <figcaption>
      File size distribution on the mirror site in August 2024
    </figcaption>
  </figure>
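  <p>With the above in mind, we analyzed the content stored on the mirror. As the chart above shows, the mirror holds more than 40 million files; half of them are smaller than 10 KiB and 90% are smaller than 1 MiB.
    Even so, the average file size is still 1.6 MiB.</p>
  <h2 id="mirrors2">Rebuilding the Rsync server</h2>
  <p>The Rsync server carries less traffic but its disk usage was more extreme, and we consider the Rsync service less critical, so in June 2024 we rebuilt this server first.
    We drew up the following plan:</p>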
  <ul>
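    <li>First, given that half of the files on the mirror are smaller than 10 KiB (note that our disks have 4 KiB physical sectors), the overhead of RAID-Z3 is too high, so we decided to rebuild the array as RAID-Z2 split into two vdevs. An added benefit is that we can expect up to twice the IOPS from the pool, since each "block" of a file is stored on only one vdev.</li>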
    <li>
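      <p>Next we looked closely at how to tune ZFS dataset properties for a mirror workload:</p>
      <ul>
        <li><code class="language-plaintext highlighter-rouge">recordsize=1M</code>: maximizes sequential read/write performance and reduces fragmentation.</li>
        <li>
          <p><code class="language-plaintext highlighter-rouge">compression=zstd</code>: turn on compression and see how much disk space it saves.</p>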
          <ul>
            <li>
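              <p>Starting with OpenZFS 2.2, an early-abort mechanism applies to Zstd compression (levels zstd-3 and above). It first tries LZ4 and then Zstd-1 to estimate how compressible the data is. If the data is incompressible (its entropy is too high), ZFS skips the configured Zstd level and writes the data to disk as-is, which avoids wasting CPU on incompressible data.</p>
              <p>We know that most content on the mirror is already compressed, so early-abort acts as a safety net that lets us enable Zstd without worry.</p>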
            </li>
          </ul>
        </li>
        <li><code class="language-plaintext highlighter-rouge">xattr=off</code>: files on the mirror do not need extended attributes.</li>
        <li><code class="language-plaintext highlighter-rouge">atime=off</code>: there is no need to record or update atime for mirror files, which saves a good number of writes.</li>
        <li><code class="language-plaintext highlighter-rouge">setuid=off</code>, <code class="language-plaintext highlighter-rouge">exec=off</code> and <code class="language-plaintext highlighter-rouge">devices=off</code>: mount options we do not need either (turning them off is also the safer choice).</li>
        <li><code class="language-plaintext highlighter-rouge">secondarycache=metadata</code> makes L2ARC cache only ZFS metadata. File access on the Rsync server is fairly uniform, unlike the clear hot/cold split on the user-facing HTTP server, so caching only metadata spares the SSDs some wear.</li>
      </ul>
    </li>
    <li>
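      <p>And a few options that carry potential risks (which we consider tolerable):</p>
      <ul>
        <li><code class="language-plaintext highlighter-rouge">sync=disabled</code>: disables synchronous write semantics (<code class="language-plaintext highlighter-rouge">open(O_SYNC)</code>, <code class="language-plaintext highlighter-rouge">sync()</code>, <code class="language-plaintext highlighter-rouge">fsync()</code> and the like) so that ZFS can make full use of its write buffer, for example to reduce fragmentation.</li>
        <li><code class="language-plaintext highlighter-rouge">redundant_metadata=some</code>: (OpenZFS 2.2) reduces metadata redundancy in exchange for better write performance.</li>
      </ul>
      <p>We believe these two options match our understanding of the safety and integrity requirements of mirror content. They are not necessarily "safe" in other settings.</p>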
    </li>
    <li>
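      <p>As for ZFS module parameters, their sheer number (290+) is daunting.
        Thanks to @happyaron, Debian ZFS maintainer and administrator of the Beijing Foreign Studies University mirror, we quickly identified a dozen or so commonly used parameters to tune.</p>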
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="c"># Set the ARC size range to 160-200 GiB and keep 16 GiB free for the operating system</span>
options zfs <span class="nv">zfs_arc_max</span><span class="o">=</span>214748364800
options zfs <span class="nv">zfs_arc_min</span><span class="o">=</span>171798691840
options zfs <span class="nv">zfs_arc_sys_free</span><span class="o">=</span>17179869184

<span class="c"># Weight metadata 20x relative to user data (OpenZFS 2.2+)</span>
options zfs <span class="nv">zfs_arc_meta_balance</span><span class="o">=</span>2000

<span class="c"># Allow dnodes to use up to 80% of the ARC</span>
options zfs <span class="nv">zfs_arc_dnode_limit_percent</span><span class="o">=</span>80

<span class="c"># For the next few lines, see the "ZFS I/O Scheduler" section of the man page</span>
options zfs <span class="nv">zfs_vdev_async_read_max_active</span><span class="o">=</span>8
options zfs <span class="nv">zfs_vdev_async_read_min_active</span><span class="o">=</span>2
options zfs <span class="nv">zfs_vdev_scrub_max_active</span><span class="o">=</span>5
options zfs <span class="nv">zfs_vdev_max_active</span><span class="o">=</span>20000

<span class="c"># Do not throttle ARC reads and writes under memory pressure</span>
options zfs <span class="nv">zfs_arc_lotsfree_percent</span><span class="o">=</span>0

<span class="c"># L2ARC parameters</span>
options zfs <span class="nv">l2arc_headroom</span><span class="o">=</span>8
options zfs <span class="nv">l2arc_write_max</span><span class="o">=</span>67108864
options zfs <span class="nv">l2arc_noprefetch</span><span class="o">=</span>0
</code></pre>
        </div>
    </div>
      <p>There is also <code class="language-plaintext highlighter-rouge">zfs_dmu_offset_next_sync</code>, but it has been enabled by default since OpenZFS 2.1.5, so we omit it from this list.</p>
    </li>
  </ul>
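  <p>To make the small-file overhead argument in the plan above more concrete, here is a rough sketch of how much raw disk space a single block occupies on a RAID-Z vdev. It encodes the allocation rule as we understand it from the OpenZFS RAID-Z implementation (data sectors, plus parity sectors per stripe row, padded to a multiple of parity + 1), which is an assumption rather than something stated in this article, and it ignores compression and metadata. The 11-disk RAID-Z3 layout stands for the old array and the 6-disk RAID-Z2 layout for one of the two new vdevs; the function name is hypothetical.</p>

```python
import math


def raidz_allocated_sectors(data_bytes, disks, parity, sector=4096):
    """Sectors one block occupies on a RAID-Z vdev (assumed allocation rule)."""
    data = math.ceil(data_bytes / sector)       # data sectors
    rows = math.ceil(data / (disks - parity))   # stripe rows needed
    total = data + rows * parity                # add parity sectors per row
    # Allocations are padded to a multiple of (parity + 1) sectors.
    return math.ceil(total / (parity + 1)) * (parity + 1)


for size_kib in (4, 8, 1024):
    z3 = raidz_allocated_sectors(size_kib * 1024, disks=11, parity=3)
    z2 = raidz_allocated_sectors(size_kib * 1024, disks=6, parity=2)
    print(f"{size_kib:>5} KiB block: RAID-Z3 x11 -> {z3 * 4} KiB, "
          f"RAID-Z2 x6 -> {z2 * 4} KiB")
```

  <p>Under these assumptions a 4 KiB block costs 16 KiB of raw space on RAID-Z3 but 12 KiB on RAID-Z2, while for full 1 MiB records the narrower RAID-Z2 vdev is slightly less space-efficient. With half of the files under 10 KiB, the small-block case is the one that matters for file count, though not necessarily for total capacity.</p>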
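  <p>After temporarily moving the Rsync service to the HTTP server, we destroyed the existing ZFS pool, created a new one, and re-synced the repositories from outside (upstream, or fellow university mirrors such as TUNA and BFSU).
    To our surprise, syncing back nearly 40 TiB of repositories took only 3 days, much faster than we had expected.
    Some other numbers are encouraging as well:</p>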
  <ul>
    <li>
      <p>ZFS compression: 39.5T / 37.1T (1.07x)</p>
      <p>Note that ZFS only reports the compression ratio to two decimal places, so a more precise value has to be computed from the raw numbers:</p>
      <div class="language-shell highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>zfs list <span class="nt">-po</span> name,logicalused,used
</code></pre>
        </div>
    </div>
      <p>Our more precise compression ratio is 1 + 6.57%, i.e. 2.67 TB (2.43 TiB) saved, roughly equal to <a href="/static/planet/ustc-mirrors-zfs-rebuild/lenovo-legion-wechat-data.jpg">9 copies of WeChat data</a> (just kidding)</p>
    </li>
    <li>
      <p>Most importantly, a much more reasonable I/O load:</p>
      <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-io-utilization-and-free-space-june-july-2024.png" class="image-popup" title="mirrors2 机器在重建前后的 I/O 负载
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-io-utilization-and-free-space-june-july-2024.png" alt="mirrors2 机器在重建前后的 I/O 负载" /></a>
        <figcaption>
          I/O load of the mirrors2 machine before and after the rebuild
        </figcaption>
      </figure>
    </li>
  </ul>
  <p>As the chart shows, after a few days of warm-up the I/O load settled at around 20%, whereas before the rebuild it had stayed above 90%.</p>
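  <p>The compression figures quoted for this server can be reproduced from the two byte counts that <code class="language-plaintext highlighter-rouge">zfs list -p</code> prints. The sketch below uses hypothetical byte counts derived from the rounded 39.5T and 37.1T values, so its result only approximates the 6.57% computed from the real numbers; the helper name is made up for illustration.</p>

```python
TIB = 2**40   # binary unit, as printed by zfs
TB = 10**12   # decimal unit, as printed on disk labels


def compression_stats(logicalused, used):
    """Return (ratio, bytes saved) from the byte counts printed by `zfs list -p`."""
    return logicalused / used, logicalused - used


# Hypothetical byte counts, chosen only to resemble the rounded figures above.
logicalused = int(39.5 * TIB)
used = int(37.1 * TIB)
ratio, saved = compression_stats(logicalused, used)
print(f"ratio = {ratio:.4f}x, saved = {saved / TIB:.2f} TiB = {saved / TB:.2f} TB")

# Unit conversion used in the text: 2.43 TiB expressed in TB.
print(f"{2.43 * TIB / TB:.2f} TB")
```

  <p>The last line shows why the same saving appears as both 2.43 TiB and 2.67 TB: one TiB is about 10% larger than one TB.</p>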
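  <h2 id="mirrors4">Rebuilding the HTTP server</h2>
  <p>Our HTTP server was built in fall 2020, under somewhat different circumstances.
    We had requested it precisely because the Rsync server was nearly full and performing poorly. Nobody in the group was familiar with ZFS at the time and our impression of it was poor, so we decided to avoid ZFS altogether and use hardware RAID, LVM and XFS. LVM was needed because the RAID card cannot build an array across two controllers.
    For caching in memory we simply relied on the kernel page cache, and for SSD caching we became early adopters of LVMcache.</p>
  <p>However, these overly "fresh" technologies did not give us a better experience than (today's) ZFS:</p>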
  <ul>
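    <li>XFS cannot be shrunk, so we had to keep free PEs at the LVM VG level. We also cannot fill the XFS file system completely, which left us with two layers of unusable free space.</li>
    <li>We initially allocated 1.5 TB of SSD cache, but LVMcache advised against exceeding 1 million chunks. We had neither the time nor the knowledge to look into the technical details behind that recommendation, so we ended up allocating only 1 TiB (1 MiB chunk size * 1 Mi chunks) of SSD cache.</li>
    <li>The SSD caching policy is not tunable. Years later we read the kernel source and found that it is a 64-level LRU.</li>
    <li>GRUB broke as soon as the cache was set up. We found that GRUB has its own code for parsing LVM metadata, which does not correctly handle (or rather does not handle at all) a VG that contains a cache volume, so we had to <a href="https://github.com/taoky/grub/commit/85b260baec91aa4f7db85d7592f6be92d549a0ae">patch</a> GRUB ourselves before the machine would boot normally.</li>
    <li>Because we did not understand LVMcache chunks well enough, our SSD far exceeded its rated write endurance in under 2 years and we had to request a replacement.</li>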
  </ul>
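  <p>After the SSD was replaced, we applied what we considered slightly more reasonable LVMcache tuning, ignoring the warning and using a 128 KiB chunk size with 8 million chunks. Even so, its performance (hit rate) was unimpressive:</p>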
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors4-dmcache-may-june-2024.png" class="image-popup" title="2024 年 5 月至 6 月期间 LVMcache 的命中率
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors4-dmcache-may-june-2024.png" alt="2024 年 5 月至 6 月期间 LVMcache 的命中率" /></a>
    <figcaption>
      LVMcache hit rate from May to June 2024
    </figcaption>
  </figure>
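  <p>The LVMcache sizing described in this section comes down to chunk size times chunk count. The following sketch works through the numbers; reading "8 million chunks" as 8 Mi (8 × 2<sup>20</sup>) chunks is our assumption, and the helper names are hypothetical.</p>

```python
KIB, MIB, TIB = 2**10, 2**20, 2**40


def cache_size(chunk_size, chunk_count):
    """Total cache size in bytes for a given chunk size and number of chunks."""
    return chunk_size * chunk_count


def chunks_needed(cache_bytes, chunk_size):
    """Number of chunks required to cover `cache_bytes` (ceiling division)."""
    return -(-cache_bytes // chunk_size)


original = cache_size(1 * MIB, 2**20)        # 1 MiB chunks, 1 Mi chunks
retuned = cache_size(128 * KIB, 8 * 2**20)   # 128 KiB chunks, about 8 million chunks
print(original / TIB, retuned / TIB)

# Chunks a 1.5 TB cache would need at 1 MiB per chunk.
chunks_for_full_ssd = chunks_needed(int(1.5 * 10**12), 1 * MIB)
print(chunks_for_full_ssd)
```

  <p>Both configurations cover the same 1 TiB of SSD, and the full 1.5 TB would have needed well over one million 1 MiB chunks, which is why the original allocation was cut down.</p>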
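  <p>Having hit enough LVMcache pitfalls over the years, and with the Rsync server rebuild a great success, we started to believe again that ZFS is the best storage solution there is.
    So a month later we drew up a similar plan to rebuild the HTTP server, with a few small differences:</p>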
  <ul>
    <li>Our Rsync server runs the stock Debian kernel + <code class="language-plaintext highlighter-rouge">zfs-dkms</code>, but based on our experience with PVE we decided to run the <code class="language-plaintext highlighter-rouge">6.8.8-3-pve</code> kernel directly on the HTTP server. It ships <code class="language-plaintext highlighter-rouge">zfs.ko</code>, so we do not have to spend time on DKMS.</li>
    <li>With the same number of disks (12), we again used two 6-disk RAID-Z2 vdevs.
      <ul>
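        <li>Since this server serves HTTP directly to users, its disk access pattern has a clearer hot/cold split than the Rsync server, so we kept <code class="language-plaintext highlighter-rouge">secondarycache=all</code> (the default, left unchanged).</li>
        <li>This newer server has a newer and better CPU, so we raised the compression level to <code class="language-plaintext highlighter-rouge">zstd-8</code> to see whether it yields a better compression ratio.</li>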
      </ul>
    </li>
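    <li>We already had a complete, ZFS-optimized copy of the repositories on the Rsync server, so we could transfer the data directly with <code class="language-plaintext highlighter-rouge">zfs send -Lcp</code>. In the end it took only 36 hours to bring over more than 50 TiB of data.</li>
    <li>Because the two servers host somewhat different sets of repositories, the compression ratio on the HTTP server is slightly lower, at 1 + 3.93% (2.42 TB / 2.20 TiB saved).</li>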
  </ul>
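  <p>For a sense of scale, the two bulk transfers mentioned in this article translate into the following average throughput. This is simple arithmetic on the rounded figures from the text ("nearly 40 TiB in 3 days" and "more than 50 TiB in 36 hours"), so the results are approximate; the function name is hypothetical.</p>

```python
MIB, TIB = 2**20, 2**40


def average_throughput_mib_s(total_bytes, seconds):
    """Average throughput in MiB/s for a transfer of `total_bytes` over `seconds`."""
    return total_bytes / seconds / MIB


rsync_resync = average_throughput_mib_s(40 * TIB, 3 * 24 * 3600)  # Rsync server re-sync
zfs_send = average_throughput_mib_s(50 * TIB, 36 * 3600)          # zfs send to HTTP server
print(f"{rsync_resync:.0f} MiB/s, {zfs_send:.0f} MiB/s")
```

  <p>By this estimate the <code class="language-plaintext highlighter-rouge">zfs send</code> transfer averaged roughly two and a half times the throughput of the earlier re-sync from outside.</p>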
  <p>We put the I/O load of both servers in one chart for comparison:</p>
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-io-utilization-june-july-2024.png" class="image-popup" title="镜像站两台服务器在重建前后的 I/O 负载
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-io-utilization-june-july-2024.png" alt="镜像站两台服务器在重建前后的 I/O 负载" /></a>
    <figcaption>
      I/O load of the two mirror servers before and after the rebuild
    </figcaption>
  </figure>
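  <p>The left part of the chart shows the situation before the rebuild, the middle part after only the Rsync server was rebuilt, and the right part after both servers were rebuilt.</p>
  <p>The ZFS ARC hit rate is also impressive:</p>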
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-zfs-arc-hit-rate.png" class="image-popup" title="两台服务器的 ZFS ARC 命中率
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-zfs-arc-hit-rate.png" alt="两台服务器的 ZFS ARC 命中率" /></a>
    <figcaption>
      ZFS ARC hit rate of the two servers
    </figcaption>
  </figure>
  <p>After things settled down, the I/O load on both servers dropped even further:</p>
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-disk-io-after-rebuild.png" class="image-popup" title="两台服务器重建后磁盘 I/O 的稳定情况
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-disk-io-after-rebuild.png" alt="两台服务器重建后磁盘 I/O 的稳定情况" /></a>
    <figcaption>
      Steady-state disk I/O of the two servers after the rebuild
    </figcaption>
  </figure>
  <h2 id="杂项">Miscellaneous</h2>
  <h3 id="zfs-透明压缩">ZFS transparent compression</h3>
  <p>We did not expect so many repositories to compress this well:</p>
  <table>
    <thead>
      <tr>
        <th style="text-align: left">NAME</th>
        <th style="text-align: right">LUSED</th>
        <th style="text-align: right">USED</th>
        <th style="text-align: right">RATIO</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="text-align: left">pool0/repo/crates.io-index</td>
        <td style="text-align: right">2.19G</td>
        <td style="text-align: right">1.65G</td>
        <td style="text-align: right">3.01x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/elpa</td>
        <td style="text-align: right">3.35G</td>
        <td style="text-align: right">2.32G</td>
        <td style="text-align: right">1.67x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/rfc</td>
        <td style="text-align: right">4.37G</td>
        <td style="text-align: right">3.01G</td>
        <td style="text-align: right">1.56x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/debian-cdimage</td>
        <td style="text-align: right">1.58T</td>
        <td style="text-align: right">1.04T</td>
        <td style="text-align: right">1.54x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/tldp</td>
        <td style="text-align: right">4.89G</td>
        <td style="text-align: right">3.78G</td>
        <td style="text-align: right">1.48x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/loongnix</td>
        <td style="text-align: right">438G</td>
        <td style="text-align: right">332G</td>
        <td style="text-align: right">1.34x</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/rosdistro</td>
        <td style="text-align: right">32.2M</td>
        <td style="text-align: right">26.6M</td>
        <td style="text-align: right">1.31x</td>
      </tr>
    </tbody>
  </table>
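  <p>Some of the numbers look off (the first one, for example). We believe this is caused by the following issue: <a href="https://github.com/openzfs/zfs/issues/7639"><i class="fab fa-github"></i> openzfs/zfs#7639</a></p>
  <p>Sorted by the amount of space saved, the results are:</p>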
  <table>
    <thead>
      <tr>
        <th style="text-align: left">NAME</th>
        <th style="text-align: right">LUSED</th>
        <th style="text-align: right">USED</th>
        <th style="text-align: right">DIFF</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="text-align: left">pool0/repo</td>
        <td style="text-align: right">58.3T</td>
        <td style="text-align: right">56.1T</td>
        <td style="text-align: right">2.2T</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/debian-cdimage</td>
        <td style="text-align: right">1.6T</td>
        <td style="text-align: right">1.0T</td>
        <td style="text-align: right">549.6G</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/opensuse</td>
        <td style="text-align: right">2.5T</td>
        <td style="text-align: right">2.3T</td>
        <td style="text-align: right">279.7G</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/turnkeylinux</td>
        <td style="text-align: right">1.2T</td>
        <td style="text-align: right">1.0T</td>
        <td style="text-align: right">155.2G</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/loongnix</td>
        <td style="text-align: right">438.2G</td>
        <td style="text-align: right">331.9G</td>
        <td style="text-align: right">106.3G</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/alpine</td>
        <td style="text-align: right">3.0T</td>
        <td style="text-align: right">2.9T</td>
        <td style="text-align: right">103.9G</td>
      </tr>
      <tr>
        <td style="text-align: left">pool0/repo/openwrt</td>
        <td style="text-align: right">1.8T</td>
        <td style="text-align: right">1.7T</td>
        <td style="text-align: right">70.0G</td>
      </tr>
    </tbody>
  </table>
  <p>The <code class="language-plaintext highlighter-rouge">debian-cdimage</code> repository alone accounts for 1/4 of the total savings.</p>
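  <p>Both observations about the tables can be checked with a few lines of arithmetic. The sketch below parses the sizes as printed (assuming binary units, as <code class="language-plaintext highlighter-rouge">zfs</code> uses), computes the share of <code class="language-plaintext highlighter-rouge">debian-cdimage</code> in the total savings, and recomputes the ratio implied by the LUSED and USED columns. The helper name <code class="language-plaintext highlighter-rouge">to_gib</code> is hypothetical.</p>

```python
def to_gib(text):
    """Parse sizes such as '549.6G' or '2.2T' (binary units) into GiB."""
    factor = {"M": 1 / 1024, "G": 1, "T": 1024}[text[-1]]
    return float(text[:-1]) * factor


share = to_gib("549.6G") / to_gib("2.2T")
print(f"debian-cdimage share of total savings: {share:.1%}")

# Ratio implied by the LUSED and USED columns of the first table.
implied = {}
for name, lused, used in [("crates.io-index", "2.19G", "1.65G"),
                          ("debian-cdimage", "1.58T", "1.04T")]:
    implied[name] = round(to_gib(lused) / to_gib(used), 2)
print(implied)
```

  <p>For <code class="language-plaintext highlighter-rouge">crates.io-index</code> the implied ratio is far below the reported 3.01x, which is the kind of discrepancy noted above, while the small difference for <code class="language-plaintext highlighter-rouge">debian-cdimage</code> is consistent with rounding of the displayed sizes.</p>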
  <h3 id="grafana-for-zfs-io">Grafana for ZFS I/O</h3>
  <p>After the rebuild we also put together a Grafana panel that shows ZFS I/O.
    ZFS I/O statistics are read from <code class="language-plaintext highlighter-rouge">/proc/spl/kstat/zfs/$POOL/objset-$OBJSETID_HEX</code> and are accumulated per "object set" (i.e. dataset), so we need to take the difference for each dataset first and then add them up per pool.
    In other words, an InfluxQL subquery is unavoidable.</p>
  <div class="language-sql highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="k">SELECT</span>
  <span class="n">non_negative_derivative</span><span class="p">(</span><span class="k">sum</span><span class="p">(</span><span class="nv">"reads"</span><span class="p">),</span> <span class="mi">1</span><span class="n">s</span><span class="p">)</span> <span class="k">AS</span> <span class="nv">"read"</span><span class="p">,</span>
  <span class="n">non_negative_derivative</span><span class="p">(</span><span class="k">sum</span><span class="p">(</span><span class="nv">"writes"</span><span class="p">),</span> <span class="mi">1</span><span class="n">s</span><span class="p">)</span> <span class="k">AS</span> <span class="nv">"write"</span>
<span class="k">FROM</span> <span class="p">(</span>
  <span class="k">SELECT</span>
    <span class="k">first</span><span class="p">(</span><span class="nv">"reads"</span><span class="p">)</span> <span class="k">AS</span> <span class="nv">"reads"</span><span class="p">,</span>
    <span class="k">first</span><span class="p">(</span><span class="nv">"writes"</span><span class="p">)</span> <span class="k">AS</span> <span class="nv">"writes"</span>
  <span class="k">FROM</span> <span class="nv">"zfs_pool"</span>
  <span class="k">WHERE</span> <span class="p">(</span><span class="nv">"host"</span> <span class="o">=</span> <span class="s1">'taokystrong'</span> <span class="k">AND</span> <span class="nv">"pool"</span> <span class="o">=</span> <span class="s1">'pool0'</span><span class="p">)</span> <span class="k">AND</span> <span class="err">$</span><span class="n">timeFilter</span>
  <span class="k">GROUP</span> <span class="k">BY</span> <span class="nb">time</span><span class="p">(</span><span class="err">$</span><span class="n">interval</span><span class="p">),</span> <span class="nv">"host"</span><span class="p">::</span><span class="n">tag</span><span class="p">,</span> <span class="nv">"pool"</span><span class="p">::</span><span class="n">tag</span><span class="p">,</span> <span class="nv">"dataset"</span><span class="p">::</span><span class="n">tag</span> <span class="n">fill</span><span class="p">(</span><span class="k">null</span><span class="p">)</span>
<span class="p">)</span>
<span class="k">WHERE</span> <span class="err">$</span><span class="n">timeFilter</span>
<span class="k">GROUP</span> <span class="k">BY</span> <span class="nb">time</span><span class="p">(</span><span class="err">$</span><span class="n">interval</span><span class="p">),</span> <span class="nv">"pool"</span><span class="p">::</span><span class="n">tag</span> <span class="n">fill</span><span class="p">(</span><span class="n">linear</span><span class="p">)</span>
</code></pre>
    </div>
  </div>
  <p>Because of the subquery, this query is indeed somewhat slow, and there is not much we can do to optimize it.</p>
  <p>To show read/write throughput instead, simply replace <code class="language-plaintext highlighter-rouge">reads</code> and <code class="language-plaintext highlighter-rouge">writes</code> in the inner query with <code class="language-plaintext highlighter-rouge">nread</code> and <code class="language-plaintext highlighter-rouge">nwritten</code>.</p>
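  <p>The computation behind the query can be sketched outside InfluxQL as well. The snippet below mirrors the structure of the query (sample each dataset's cumulative counter once per interval, sum across datasets, then take a non-negative per-second derivative). The sample values and dataset names are made up, and the function is an illustration rather than what InfluxDB executes internally.</p>

```python
def pool_rate(samples, interval_seconds):
    """Per-second rates for a pool from per-dataset cumulative counters.

    samples: list of {dataset: cumulative_count} dicts, one per interval.
    Negative steps (for example after a counter reset) are dropped,
    in the spirit of non_negative_derivative.
    """
    totals = [sum(per_dataset.values()) for per_dataset in samples]
    rates = []
    for previous, current in zip(totals, totals[1:]):
        delta = current - previous
        if delta >= 0:
            rates.append(delta / interval_seconds)
    return rates


# Hypothetical cumulative "reads" counters for two datasets, sampled every 10 s.
samples = [
    {"repo/debian": 1000, "repo/alpine": 500},
    {"repo/debian": 1600, "repo/alpine": 900},
    {"repo/debian": 1700, "repo/alpine": 1800},
]
rates = pool_rate(samples, interval_seconds=10)
print(rates)
```

  <p>As long as every dataset is present in every interval, summing first and differencing afterwards gives the same result as differencing each dataset and then summing.</p>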
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-zfs-io-count.png" class="image-popup" title="ZFS IOPS 和带宽
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/mirrors2-4-zfs-io-count.png" alt="ZFS IOPS 和带宽" /></a>
    <figcaption>
      ZFS IOPS and bandwidth
    </figcaption>
  </figure>
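  <p>In a result worthy of a clickbait headline, an array of spinning disks delivers 15k IOPS on average and 50k at peak.
    We then found that this figure includes ARC hits, meaning that only a small fraction of the I/O requests actually reach the disks, which explains it.</p>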
  <h3 id="apparmor">AppArmor</h3>
  <p>After switching to the shiny PVE kernel, we soon found that all sync jobs were failing.
    Investigation showed that <code class="language-plaintext highlighter-rouge">rsync</code> got <code class="language-plaintext highlighter-rouge">EPERM</code> when calling <code class="language-plaintext highlighter-rouge">socketpair(2)</code>, something we had never seen before.
    These system calls were in fact being blocked by AppArmor, and we eventually traced the cause to <code class="language-plaintext highlighter-rouge">security/apparmor/af_unix.c</code>, a downstream addition that Ubuntu carries in its kernel.
    Since the Proxmox VE kernel is forked from Ubuntu's, this addition came along onto our server.</p>
  <p>We found that PVE also packages its own AppArmor <code class="language-plaintext highlighter-rouge">features</code> configuration, so we pulled it in directly:</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>dpkg-divert <span class="nt">--package</span> lxc-pve <span class="nt">--rename</span> <span class="nt">--divert</span> /usr/share/apparmor-features/features.stock <span class="nt">--add</span> /usr/share/apparmor-features/features
wget <span class="nt">-O</span> /usr/share/apparmor-features/features https://github.com/proxmox/lxc/raw/master/debian/features
</code></pre>
    </div>
  </div>
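  <h3 id="file-deduplication">File-level deduplication</h3>
  <p>We noticed that some repositories contain many duplicate directories with identical content. We suspect that a limitation of the sync method (HTTP) turned directory symlinks into full copies of their content.</p>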
  <figure class=""><a href="/static/planet/ustc-mirrors-zfs-rebuild/ls-zerotier-redhat-el.png" class="image-popup" title="ZeroTier 仓库中的一些目录
"><img src="/static/planet/ustc-mirrors-zfs-rebuild/ls-zerotier-redhat-el.png" alt="ZeroTier 仓库中的一些目录" /></a>
    <figcaption>
      Some directories in the ZeroTier repository
    </figcaption>
  </figure>
  <p>ZFS deduplication came to mind, so we ran a preliminary test on the ZeroTier repository:</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>zfs create <span class="nt">-o</span> <span class="nv">dedup</span><span class="o">=</span>on pool0/repo/zerotier
<span class="c"># import the data</span>
</code></pre>
    </div>
  </div>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="gp">#</span><span class="w"> </span>zdb <span class="nt">-DDD</span> pool0
<span class="go">dedup = 4.93, compress = 1.23, copies = 1.00, dedup * compress / copies = 6.04
</span></code></pre>
    </div>
  </div>
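  <p>The result is impressive, but given the long-standing poor reputation of ZFS dedup, we were still reluctant to enable it on the mirror.
    So we looked for a more hacky alternative:</p>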
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c"># post-sync.sh</span>
<span class="c"># Do file-level deduplication for select repos</span>
<span class="k">case</span> <span class="s2">"</span><span class="nv">$NAME</span><span class="s2">"</span> <span class="k">in
  </span>docker-ce|influxdata|nginx|openresty|proxmox|salt|tailscale|zerotier<span class="p">)</span>
    jdupes <span class="nt">-L</span> <span class="nt">-Q</span> <span class="nt">-r</span> <span class="nt">-q</span> <span class="s2">"</span><span class="nv">$DIR</span><span class="s2">"</span> <span class="p">;;</span>
<span class="k">esac</span>
</code></pre>
    </div>
  </div>
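  <p>To illustrate the idea behind this kind of file-level deduplication, here is a minimal sketch that replaces files with identical content by hard links to a single copy. It is not how jdupes is implemented: it hashes whole files in memory, is not atomic, and omits the safety checks a real tool performs. The function name and the sample files are hypothetical.</p>

```python
import hashlib
import os
import pathlib
import tempfile


def hardlink_duplicates(root):
    """Replace files with identical content under `root` by hard links to one copy."""
    first_seen = {}
    linked = 0
    for path in sorted(pathlib.Path(root).rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        original = first_seen.setdefault(digest, path)
        if original is path or original.samefile(path):
            continue  # first copy, or already linked to it
        path.unlink()
        os.link(original, path)
        linked += 1
    return linked


# Hypothetical layout: two directories holding the same package file.
root = pathlib.Path(tempfile.mkdtemp())
for sub in ("el8", "el9"):
    (root / sub).mkdir()
    (root / sub / "pkg.rpm").write_bytes(b"same bytes")
(root / "el9" / "other.rpm").write_bytes(b"different")

linked = hardlink_duplicates(root)
print(linked)
```

  <p>After the pass, both <code class="language-plaintext highlighter-rouge">pkg.rpm</code> paths point at the same data on disk, so the duplicate no longer takes up extra space while every path stays valid for clients.</p>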
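  <p>This userspace file deduplication tool works very well: its effect is comparable to ZFS dedup, without the performance penalty.
    We ran jdupes on a few repositories with obviously duplicated content, with the following results:</p>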
  <table>
    <thead>
      <tr>
        <th>Name</th>
        <th>Orig</th>
        <th>Dedup</th>
        <th>Diff</th>
        <th>Ratio</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>proxmox</td>
        <td>395.4G</td>
        <td>162.6G</td>
        <td>232.9G</td>
        <td>2.43x</td>
      </tr>
      <tr>
        <td>docker-ce</td>
        <td>539.6G</td>
        <td>318.2G</td>
        <td>221.4G</td>
        <td>1.70x</td>
      </tr>
      <tr>
        <td>influxdata</td>
        <td>248.4G</td>
        <td>54.8G</td>
        <td>193.6G</td>
        <td>4.54x</td>
      </tr>
      <tr>
        <td>salt</td>
        <td>139.0G</td>
        <td>87.2G</td>
        <td>51.9G</td>
        <td>1.59x</td>
      </tr>
      <tr>
        <td>nginx</td>
        <td>94.9G</td>
        <td>59.7G</td>
        <td>35.2G</td>
        <td>1.59x</td>
      </tr>
      <tr>
        <td>zerotier</td>
        <td>29.8G</td>
        <td>6.1G</td>
        <td>23.7G</td>
        <td>4.88x</td>
      </tr>
      <tr>
        <td>mysql-repo</td>
        <td>647.8G</td>
        <td>632.5G</td>
        <td>15.2G</td>
        <td>1.02x</td>
      </tr>
      <tr>
        <td>openresty</td>
        <td>65.1G</td>
        <td>53.4G</td>
        <td>11.7G</td>
        <td>1.22x</td>
      </tr>
      <tr>
        <td>tailscale</td>
        <td>17.9G</td>
        <td>9.0G</td>
        <td>9.0G</td>
        <td>2.00x</td>
      </tr>
    </tbody>
  </table>
  <p>Based on the table above, we excluded <code class="language-plaintext highlighter-rouge">mysql-repo</code>: its deduplication ratio is too low to justify the I/O load of a deduplication pass.</p>
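  <p>The selection can be expressed as a simple threshold on the ratio column. The cut-off <code class="language-plaintext highlighter-rouge">MIN_RATIO</code> below is a hypothetical value chosen for illustration; the article itself states no explicit threshold.</p>

```python
# (name, size before, size after) in GiB, copied from the table above.
repos = [
    ("proxmox", 395.4, 162.6), ("docker-ce", 539.6, 318.2),
    ("influxdata", 248.4, 54.8), ("salt", 139.0, 87.2),
    ("nginx", 94.9, 59.7), ("zerotier", 29.8, 6.1),
    ("mysql-repo", 647.8, 632.5), ("openresty", 65.1, 53.4),
    ("tailscale", 17.9, 9.0),
]

MIN_RATIO = 1.1  # hypothetical cut-off, not a value stated in the article

worth_deduplicating = [
    name for name, orig, dedup in repos if orig / dedup >= MIN_RATIO
]
print(worth_deduplicating)
```

  <p>With this cut-off, the resulting list matches the repositories named in the <code class="language-plaintext highlighter-rouge">case</code> statement of <code class="language-plaintext highlighter-rouge">post-sync.sh</code>.</p>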
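  <h2 id="conclusion">Conclusion</h2>
  <p>ZFS solved a whole pile of problems on our mirror, and with this tuning experience behind us we hereby declare that <strong>ZFS is the best there is</strong> (just kidding)</p>
  <p>With ZFS:</p>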
  <ul>
    <li>我们不再担心分区问题，ZFS 可以灵活分配。</li>
    <li>我们的机械盘比别人的固态盘跑得还快，这非常 excited！
      <ul>
        <li>我们成为了第一个不再<strong>羡慕</strong> TUNA 的全闪服务器的镜像站！</li>
      </ul>
    </li>
    <li>零成本获得额外容量，由 ZFS 透明压缩和去重联合赞助！</li>
  </ul>
  <h3 id="considerations">思考</h3>
  <p>虽然我们的 ZFS 配置看起来非常高效，但我们也知道 ZFS 在长期运行中可能会因为碎片化而导致性能下降的问题。
    我们会持续关注我们的服务器，监控长期的性能变化。</p>
  <div class="footnotes" role="doc-endnotes">
    <ol>
      <li id="fn:as-of">
        <p>That is, before the rebuild described in this article, i.e. mid-June 2024. <a href="#fnref:as-of" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
      </li>
    </ol>
  </div>
  ]]></content><author><name>iBug</name></author><category term="mirrors" /><category term="linux" /><category term="服务器" /><category term="zfs" /><summary type="html"><![CDATA[A.K.A. 如何让 2000 元的机械硬盘跑得比 3000 元的固态硬盘还快（]]></summary></entry><entry><title type="html">使用 Rclone 备份 OneDrive 内容</title><link href="https://lug.ustc.edu.cn/planet/2024/05/onedrive-backup-with-rclone/" rel="alternate" type="text/html" title="使用 Rclone 备份 OneDrive 内容" /><published>2024-05-10T00:00:00+08:00</published><updated>2024-09-15T01:34:15+08:00</updated><id>https://lug.ustc.edu.cn/planet/2024/05/onedrive-backup-with-rclone</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2024/05/onedrive-backup-with-rclone/"><![CDATA[<p>本文用于介绍如何使用 Rclone 备份 OneDrive 内容。</p>
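  <h2 id="rclone-简介">Introduction to Rclone</h2>
  <p>Rclone is a command-line tool for syncing files and directories to and from cloud storage services. It supports Google Drive, Amazon S3, Dropbox, Microsoft OneDrive, Yandex Disk, Box, and a number of other cloud storage services. Rclone is written in Go and runs on Windows, macOS, Linux, and other operating systems.</p>
  <h2 id="安装-rclone">Installing Rclone</h2>
  <p>The official Rclone downloads are available <a href="https://rclone.org/downloads/">here</a>. Download the build that matches your operating system.</p>
  <p>Some common package managers also provide Rclone, for example:</p>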
  <ul>
    <li>
      <p>On Windows, you can install Rclone with Chocolatey:</p>
      <div class="language-bash highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>choco <span class="nb">install </span>rclone
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>On Windows, you can also install Rclone with Winget:</p>
      <div class="language-bash highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>winget <span class="nb">install </span>Rclone.Rclone
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>On macOS, you can install Rclone with Homebrew:</p>
      <div class="language-bash highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>brew <span class="nb">install </span>rclone
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>On Ubuntu, you can install Rclone with apt:</p>
      <div class="language-bash highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="nb">sudo </span>apt <span class="nb">install </span>rclone
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>For installation through other package managers, see the <a href="https://rclone.org/install/#package-manager">official Rclone documentation</a>.</p>
    </li>
  </ul>
  <p>Run <code class="language-plaintext highlighter-rouge">rclone --version</code> to check that Rclone was installed correctly. If it was, you should see output similar to the following:</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>rclone <span class="nt">--version</span>
<span class="go">rclone v1.66.0
- os/version: darwin 14.5 (64 bit)
- os/kernel: 23.5.0 (arm64)
- os/type: darwin
- os/arch: arm64 (ARMv8 compatible)
- go/version: go1.22.1
- go/linking: dynamic
- go/tags: none
</span></code></pre>
    </div>
  </div>
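  <p>If you want to run the same check from a script, a minimal sketch could look like the following. It assumes the first line of the output has the form <code class="language-plaintext highlighter-rouge">rclone vX.Y.Z</code>, as in the sample above; the function names are hypothetical.</p>

```python
import re
import shutil
import subprocess


def parse_rclone_version(output):
    """Extract the version string (e.g. "v1.66.0") from `rclone --version` output."""
    match = re.search(r"^rclone (v\S+)", output, flags=re.MULTILINE)
    return match.group(1) if match else None


def installed_rclone_version():
    """Return the installed rclone version, or None if rclone is not on PATH."""
    if shutil.which("rclone") is None:
        return None
    result = subprocess.run(
        ["rclone", "--version"], capture_output=True, text=True, check=True
    )
    return parse_rclone_version(result.stdout)


if __name__ == "__main__":
    version = installed_rclone_version()
    print(version or "rclone not found on PATH")
```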
  <h2 id="配置-rclone">Configuring Rclone</h2>
  <p>Here we use a <code class="language-plaintext highlighter-rouge">mail.ustc.edu.cn</code> account as the example and walk through the setup step by step:</p>
  <div class="language-bash highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>rclone config
</code></pre>
    </div>
  </div>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">No remotes found, make a new one?
n) New remote
s) Set configuration password
q) Quit config
</span><span class="gp">n/s/q&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Type <code class="language-plaintext highlighter-rouge">n</code> and press Enter.</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Enter name for new remote.
</span><span class="gp">name&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Enter a name for the remote; here we use <code class="language-plaintext highlighter-rouge">onedrive</code>. Then press Enter.</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Option Storage.
Type of storage to configure.
Choose a number from below, or type in your own value.
 1 / 1Fichier
   \ (fichier)
 2 / Akamai NetStorage
   \ (netstorage)
 3 / Alias for an existing remote
   \ (alias)
 4 / Amazon S3 Compliant Storage Providers including AWS, Alibaba, ArvanCloud, Ceph, ChinaMobile, Cloudflare, DigitalOcean, Dreamhost, GCS, HuaweiOBS, IBMCOS, IDrive, IONOS, LyveCloud, Leviia, Liara, Linode, Minio, Netease, Petabox, RackCorp, Rclone, Scaleway, SeaweedFS, StackPath, Storj, Synology, TencentCOS, Wasabi, Qiniu and others
   \ (s3)

</span><span class="c">...
</span><span class="go">
33 / Microsoft OneDrive
   \ (onedrive)

</span><span class="c">...
</span><span class="go">
</span><span class="gp">Storage&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Type <code class="language-plaintext highlighter-rouge">onedrive</code> and press Enter.</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Option client_id.
OAuth Client Id.
Leave blank normally.
Enter a value. Press Enter to leave empty.
</span><span class="gp">client_id&gt;</span><span class="w">
</span><span class="go">
Option client_secret.
OAuth Client Secret.
Leave blank normally.
Enter a value. Press Enter to leave empty.
</span><span class="gp">client_secret&gt;</span><span class="w">
</span><span class="go">
Option region.
Choose national cloud region for OneDrive.
Choose a number from below, or type in your own string value.
Press Enter for the default (global).
 1 / Microsoft Cloud Global
   \ (global)
 2 / Microsoft Cloud for US Government
   \ (us)
 3 / Microsoft Cloud Germany
   \ (de)
 4 / Azure and Office 365 operated by Vnet Group in China
   \ (cn)
</span><span class="gp">region&gt;</span><span class="w">
</span><span class="go">
Edit advanced config?
y) Yes
n) No (default)
</span><span class="gp">y/n&gt;</span><span class="w">
</span><span class="go">
Use web browser to automatically authenticate rclone with remote?
 * Say Y if the machine running rclone has a web browser you can use
 * Say N if running rclone on a (remote) machine without web browser access
If not sure try Y. If Y failed, try N.

y) Yes (default)
n) No
</span><span class="gp">y/n&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
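  <p>Press Enter at each of these prompts without typing anything (five times in total).</p>
  <p>During this step a web page opens in your browser, asking you to sign in to your OneDrive account and authorize Rclone to access it.</p>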
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Option config_type.
Type of connection
Choose a number from below, or type in an existing string value.
Press Enter for the default (onedrive).
 1 / OneDrive Personal or Business
   \ (onedrive)
 2 / Root Sharepoint site
   \ (sharepoint)
   / Sharepoint site name or URL
 3 | E.g. mysite or https://contoso.sharepoint.com/sites/mysite
   \ (url)
 4 / Search for a Sharepoint site
   \ (search)
 5 / Type in driveID (advanced)
   \ (driveid)
 6 / Type in SiteID (advanced)
   \ (siteid)
   / Sharepoint server-relative path (advanced)
 7 | E.g. /teams/hr
   \ (path)
</span><span class="gp">config_type&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Press Enter here as well, without typing anything.</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Option config_driveid.
Select drive you want to use
Choose a number from below, or type in your own string value.
Press Enter for the default (b!****************************************************************).
 1 / OneDrive (business)
   \ (b!****************************************************************)
</span><span class="gp">config_driveid&gt;</span><span class="w">
</span><span class="go">
Drive OK?

Found drive "root" of type "business"
URL: https://mailustceducn-my.sharepoint.com/personal/tiankaima_mail_ustc_edu_cn/Documents

y) Yes (default)
n) No
</span><span class="gp">y/n&gt;</span><span class="w">
</span><span class="go">
Configuration complete.
Options:
- type: onedrive
- token:
***
- drive_id: b!****************************************************************
- drive_type: business
Keep this "onedrive" remote?
y) Yes this is OK (default)
e) Edit this remote
d) Delete this remote
</span><span class="gp">y/e/d&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Again, just press Enter at each prompt without typing anything.</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">Current remotes:

Name                 Type
====                 ====
onedrive             onedrive

e) Edit existing remote
n) New remote
d) Delete remote
r) Rename remote
c) Copy remote
s) Set configuration password
q) Quit config
</span><span class="gp">e/n/d/r/c/s/q&gt;</span><span class="w">
</span></code></pre>
    </div>
  </div>
  <p>Type <code class="language-plaintext highlighter-rouge">q</code> and press Enter. Rclone is now configured.</p>
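  <p>Rclone stores the remote in an INI-style configuration file (running <code class="language-plaintext highlighter-rouge">rclone config file</code> prints its location). As a hedged sketch, the snippet below parses a config shaped like the options listed above and lists the remotes it defines. The token and drive ID are made-up placeholders, and <code class="language-plaintext highlighter-rouge">list_remotes</code> is a hypothetical helper, not part of Rclone.</p>

```python
import configparser

# Placeholder config text shaped like the options listed above; the token and
# drive_id values are made up and stand in for the real (secret) ones.
SAMPLE_CONFIG = """
[onedrive]
type = onedrive
token = PLACEHOLDER-TOKEN
drive_id = PLACEHOLDER-DRIVE-ID
drive_type = business
"""


def list_remotes(config_text):
    """Map each remote name to its storage type."""
    # Interpolation is disabled so that "%" characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return {name: parser[name].get("type") for name in parser.sections()}


if __name__ == "__main__":
    for name, kind in list_remotes(SAMPLE_CONFIG).items():
        print(f"{name}: {kind}")
```

  <p>Because the real file contains an access token, treat it like a password and avoid sharing it.</p>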
  <h2 id="将文件从-onedrive-备份到本地">Backing up files from OneDrive to a local folder</h2>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="go">rclone copy onedrive: /path/to/local/folder -P
</span></code></pre>
    </div>
  </div>
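  <p>Here <code class="language-plaintext highlighter-rouge">onedrive:</code> refers to the root of the remote configured above, and <code class="language-plaintext highlighter-rouge">-P</code> shows progress while copying. If you want to wrap the backup in a script, one possible sketch is shown below; <code class="language-plaintext highlighter-rouge">build_copy_command</code> is a hypothetical helper, and it optionally appends Rclone's <code class="language-plaintext highlighter-rouge">--dry-run</code> flag so you can preview what would be copied.</p>

```python
import shlex


def build_copy_command(remote, local_dir, remote_path="", dry_run=False):
    """Build the argument list for `rclone copy REMOTE:PATH LOCAL_DIR -P`."""
    command = ["rclone", "copy", f"{remote}:{remote_path}", local_dir, "-P"]
    if dry_run:
        # --dry-run makes rclone report what it would copy without copying.
        command.append("--dry-run")
    return command


if __name__ == "__main__":
    cmd = build_copy_command("onedrive", "/path/to/local/folder", dry_run=True)
    # Print the command; pass `cmd` to subprocess.run() to actually execute it.
    print(shlex.join(cmd))
```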
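  <h2 id="总结">Summary</h2>
  <p>Rclone is a very powerful tool that supports a large number of cloud storage services; you can set up other services with <code class="language-plaintext highlighter-rouge">rclone config</code> in the same way.</p>
  <p>Due to limited space and time, this article covers only the basics of Rclone. For more features and usage, see the <a href="https://rclone.org/docs/">official Rclone documentation</a>.</p>
  <p>If you have any questions or suggestions about this article, feel free to <a href="/wiki/lug/contact/">contact us</a>.</p>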
  ]]></content><author><name>tiankaima</name></author><category term="Tech Tutorial" /><category term="rclone" /><category term="OneDrive" /><summary type="html"><![CDATA[本文用于介绍如何使用 Rclone 备份 OneDrive 内容。]]></summary></entry><entry><title type="html">菜鸡写给菜鸡的 NetHack 入门教程</title><link href="https://lug.ustc.edu.cn/planet/2021/09/nethack-gitgud/" rel="alternate" type="text/html" title="菜鸡写给菜鸡的 NetHack 入门教程" /><published>2021-09-27T00:00:00+08:00</published><updated>2021-10-18T09:09:24+08:00</updated><id>https://lug.ustc.edu.cn/planet/2021/09/nethack-gitgud</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2021/09/nethack-gitgud/"><![CDATA[<!-- Workaround for jekyll-titles-from-headings -->
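    <h2 id="教程说明">About this tutorial</h2>
    <p>This tutorial covers only the most basic controls and the strategy for the very start of the game, to help newcomers survive the first level. Quoted blocks are supplements or fun facts and can be skipped. The game "screenshots" are copied straight from the text interface; colors and bold are lost, but this still seemed the most fitting way to show them.</p>
    <h2 id="关于">About</h2>
    <p>NetHack is a long-running Roguelike game. It is based on Dungeons &amp; Dragons rules and has the typical Roguelike traits: randomly generated maps, permanent death, and very high difficulty and complexity. The game also blends in elements from many cultures and fields, and has plenty of amusing messages and ways to die.</p>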
    <blockquote>
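      <p>There is always some veteran who took ten or twenty years to finish the game claiming that NetHack is not hard. Compared with certain other Roguelikes that may even be true, but what does that have to do with you?</p>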
    </blockquote>
    <p>By default the game only has a text interface. Here is a sample screen to give you a feel for it:</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>This kobold corpse tastes terrible!--More--

                                                       --------
                      ----------                       |......-
                      |........|`######################-......|
                      |........|     -----------       |&lt;.....|
                      |........|     |.........|  #####.....{.|
                      |........-#####|.........-###    --------
                      |........|    #..........|
                      --------|-     -------.---
                             ##
                              ##
                              #
                             #
                     --------.--
                     |$........|
                     |.........|
    ------------    #..........|
    |.....@.d..|    #|..........
    |.....^.....#####|....&gt;....|
    ------------     -----------

Petergu the Stripling          St:15 Dx:14 Co:18 In:9 Wi:7 Ch:10 Lawful
Dlvl:1 $:11 HP:7(16) Pw:1(2) AC:6 Xp:1
</code></pre>
      </div>
    </div>
    <blockquote>
      <p>A tile mode exists too, but it is not as classic as the text mode.</p>
    </blockquote>
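    <h2 id="安装和配置">Installation and configuration</h2>
    <p>If you use Linux or a similar operating system, NetHack can most likely be installed straight from the package repositories; the package is usually named <code class="language-plaintext highlighter-rouge">nethack</code> or <code class="language-plaintext highlighter-rouge">nethack-console</code>. On Windows, download it from the <a href="https://nethack.org/v366/ports/download-win.html">official website</a>. We recommend the default English version.</p>
    <p>For a better experience on Linux, we recommend adding the following to the <code class="language-plaintext highlighter-rouge">~/.nethackrc</code> configuration file (this is also the configuration shipped with Ubuntu).</p>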
    <pre><code class="language-dosini">#
# System-wide NetHack configuration file for tty-based NetHack.
#

OPTIONS=windowtype:tty,toptenwin,hilite_pet
OPTIONS=fixinv,safe_pet,sortpack,tombstone,color
OPTIONS=verbose,news,fruit:potato
OPTIONS=dogname:Slinky
OPTIONS=catname:Rex
OPTIONS=pickup_types:$
OPTIONS=nomail

# Enable this if you want to see your inventory sorted in alphabetical
# order by item instead of by index letter:
# OPTIONS=sortloot:full
# or if you just want containers sorted:
# OPTIONS=sortloot:loot

#
# Some sane menucolor defaults
#

OPTIONS=menucolors
MENUCOLOR=" blessed "=green
MENUCOLOR=" holy "=green
MENUCOLOR=" uncursed "=yellow
MENUCOLOR=" cursed "=red
MENUCOLOR=" unholy "=red
MENUCOLOR=" cursed .* (being worn)"=orange&amp;underline
</code></pre>
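    <p>The spaces around each word in the <code class="language-plaintext highlighter-rouge">MENUCOLOR</code> patterns matter: without them, the pattern for cursed items would also match the word "uncursed". The sketch below illustrates this with Python's <code class="language-plaintext highlighter-rouge">re</code> module as a stand-in for the game's own matcher, so it approximates the behavior rather than reproducing it. Only the five simple rules are included, and the function name is hypothetical.</p>

```python
import re

# (pattern, color) pairs copied from the simple MENUCOLOR lines above.
RULES = [
    (" blessed ", "green"),
    (" holy ", "green"),
    (" uncursed ", "yellow"),
    (" cursed ", "red"),
    (" unholy ", "red"),
]


def matching_colors(inventory_line):
    """Return the colors of every rule whose pattern occurs in the line."""
    return [color for pattern, color in RULES if re.search(pattern, inventory_line)]


if __name__ == "__main__":
    for line in (
        "a - a blessed +1 quarterstaff (weapon in hands)",
        "h - an uncursed potion of healing",
        "E - a cursed ruby ring (on right hand)",
    ):
        print(matching_colors(line), line)
```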
    <h2 id="开局">Starting a game</h2>
    <p>After launching NetHack you will see the following prompt:</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>NetHack, Copyright 1985-2020
         By Stichting Mathematisch Centrum and M. Stephenson.
         Version 3.6.6 Unix, built Mar 18 18:21:43 2020.
         See license for details.


Shall I pick character's race, role, gender and alignment for you? [ynaq]
</code></pre>
      </div>
    </div>
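    <p>You can press <code class="language-plaintext highlighter-rouge">n</code> and choose your character's race (human, dwarf, elf, gnome, orc), role (Archeologist, Knight, Tourist, and others, 13 in total), gender (male, female), and alignment (lawful, neutral, chaotic) yourself, or press <code class="language-plaintext highlighter-rouge">y</code> to let the game pick at random. For newcomers, Valkyrie, Barbarian, and Samurai are the roles with the easiest start, although I find the Wizard more fun to play.</p>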
    <blockquote>
      <p>Gender and alignment can change during the game through certain items.</p>
    </blockquote>
    <blockquote>
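      <p>Each alignment of each role corresponds to a different god. The Monk's gods are "figures" from Chinese mythology: Shan Lai Ching (山海经, the Classic of Mountains and Seas), Chih Sung-tzu (赤松子), and Huan Ti (黄帝, the Yellow Emperor).</p>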
    </blockquote>
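    <p>Once you have chosen, press <code class="language-plaintext highlighter-rouge">y</code> as prompted to start. When <code class="language-plaintext highlighter-rouge">--More--</code> appears, the message has not been fully displayed; press Space to move on to the next sentence.</p>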
    <blockquote>
      <p>If <code class="language-plaintext highlighter-rouge">--More--</code> appears right after a monster hits you, the next sentence is most likely <code class="language-plaintext highlighter-rouge">You die...</code></p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>Hello petergu, welcome to NetHack!  You are a neutral male human Wizard.
--More--




  -----
  |..@|
  |.f(|
  |x..|
  |...|
  |?..+
  -----









Petergu the Evoker             St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:1 $:0 HP:12(12) Pw:7(7) AC:9 Xp:1
</code></pre>
      </div>
    </div>
    <p>In the map, <code class="language-plaintext highlighter-rouge">@</code> is you, the <code class="language-plaintext highlighter-rouge">f</code> next to you (shown on a white background, which is lost in the copy) is your pet cat, <code class="language-plaintext highlighter-rouge">x</code> is a monster (a grid bug), and <code class="language-plaintext highlighter-rouge">+</code> is a closed door. You will gradually memorize what each symbol means; alternatively, press <code class="language-plaintext highlighter-rouge">/</code> and then, following the prompt, <code class="language-plaintext highlighter-rouge">/</code> again, move the cursor to inspect something on the map, and press <code class="language-plaintext highlighter-rouge">ESC</code> when done.</p>
    <blockquote>
      <p>Always read the messages carefully, no matter when. It is the same as learning the Linux command line.</p>
    </blockquote>
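    <p>The bottom lines show your status. Newcomers should mainly watch HP (hit points); AC (Armor Class), which represents defense, where lower is better; and any temporary conditions that may appear there, such as Blind. The meaning of the other attributes can be looked up on the wiki.</p>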
    <blockquote>
      <p>Having your HP drop to 0 is merely the most boring way to die.</p>
    </blockquote>
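    <p>To make the layout of the bottom status line concrete, here is a small sketch that splits it into fields. It assumes the <code class="language-plaintext highlighter-rouge">Name:value</code> and <code class="language-plaintext highlighter-rouge">Name:value(max)</code> forms seen in the screens in this tutorial; <code class="language-plaintext highlighter-rouge">parse_status</code> is a hypothetical helper, not something NetHack provides.</p>

```python
import re

STATUS_LINE = "Dlvl:1 $:0 HP:12(12) Pw:7(7) AC:9 Xp:1"


def parse_status(line):
    """Parse `Name:value` fields; `value(max)` pairs become (value, max) tuples."""
    fields = {}
    for name, value, maximum in re.findall(r"(\S+?):(-?\d+)(?:\((\d+)\))?", line):
        fields[name] = (int(value), int(maximum)) if maximum else int(value)
    return fields


if __name__ == "__main__":
    status = parse_status(STATUS_LINE)
    current_hp, max_hp = status["HP"]
    print(f"HP {current_hp}/{max_hp}, AC {status['AC']} (lower is better)")
```

    <p>Trailing condition words such as Burdened are not in <code class="language-plaintext highlighter-rouge">Name:value</code> form, so this sketch simply ignores them.</p>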
    <p>Press <code class="language-plaintext highlighter-rouge">i</code> to open your inventory and view your items. Press Space to return.</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code> Weapons
 a - a blessed +1 quarterstaff (weapon in hands)
 Armor
 b - an uncursed +0 cloak of magic resistance (being worn)
 Scrolls
 i - an uncursed scroll of magic mapping
 j - an uncursed scroll of enchant weapon
 k - an uncursed scroll of remove curse
 Spellbooks
 l - a blessed spellbook of force bolt
 m - an uncursed spellbook of create monster
 Potions
 f - an uncursed potion of gain ability
 g - an uncursed potion of monster detection
 h - an uncursed potion of healing
 Rings
 d - an uncursed ring of fire resistance
 e - an uncursed ring of invisibility
 Wands
 c - a wand of magic missile (0:4)
 Tools
 n - a magic marker (0:55)
 o - an uncursed blindfold
 (end)
</code></pre>
      </div>
    </div>
    <p>Press <code class="language-plaintext highlighter-rouge">Ctrl-X</code> to view the attributes your character is able to perceive. Press Space to page through and to return.</p>
    <blockquote>
      <p>Many more attributes cannot be sensed by the character and are only revealed under special circumstances.</p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code> Petergu the Wizard's attributes:

 Background:
  You are an Evoker, a level 1 male human Wizard.
  You are neutral, on a mission for Thoth
  who is opposed by Ptah (lawful) and Anhur (chaotic).
  You are in the Dungeons of Doom, on level 1.
  You entered the dungeon 9 turns ago.
  There is a full moon in effect.
  You have 0 experience points.

 Basics:
  You have all 12 hit points.
  You have all 7 energy points (spell power).
  Your armor class is 9.
  Your wallet is empty.
  Autopickup is on for '$' plus thrown.

 Current Characteristics:
  Your strength is 9.
  Your dexterity is 8.
  Your constitution is 16.
  Your intelligence is 17.
 (1 of 2)
</code></pre>
      </div>
    </div>
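    <p>That is all for the "static" commands: since NetHack is turn-based, you can check this information at any time and think things through before acting.</p>
    <p>The next task, then, is to stay alive. After all, the goal of the game is to find the Amulet of Yendor deep in the dungeon and offer it to your god, not to stand still.</p>
    <h2 id="游戏开始">The game begins</h2>
    <p>First comes <strong>movement</strong>. The recommended way to move is with the Vim-like keys <code class="language-plaintext highlighter-rouge">hjklyubn</code> for the eight directions, though newcomers can perfectly well use the four arrow keys.</p>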
    <blockquote>
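      <p>In fact, moving with the arrow keys is as good as dead. Holding down a movement key is also a common cause of death.</p>
      <p>Use uppercase <code class="language-plaintext highlighter-rouge">HJKLYUBN</code> to move a long distance in one direction.</p>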
    </blockquote>
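    <p>The eight movement keys can be summarized as a small lookup table. The sketch below is only an illustration of the key layout (the coordinate convention and the <code class="language-plaintext highlighter-rouge">walk</code> helper are assumptions for this example): <code class="language-plaintext highlighter-rouge">hjkl</code> are the four straight directions as in Vim, and <code class="language-plaintext highlighter-rouge">yubn</code> are the diagonals.</p>

```python
# (dx, dy) offsets with x growing to the right and y growing downward,
# matching how the map is drawn on screen.
MOVE_KEYS = {
    "h": (-1, 0),   # west
    "j": (0, 1),    # south
    "k": (0, -1),   # north
    "l": (1, 0),    # east
    "y": (-1, -1),  # northwest
    "u": (1, -1),   # northeast
    "b": (-1, 1),   # southwest
    "n": (1, 1),    # southeast
}


def walk(start, keys):
    """Apply a sequence of movement keys to an (x, y) position."""
    x, y = start
    for key in keys:
        dx, dy = MOVE_KEYS[key]
        x, y = x + dx, y + dy
    return x, y


if __name__ == "__main__":
    # Four steps south, like walking from the top of the first room to its door.
    print(walk((3, 1), "jjjj"))
```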
    <p>For example, press the Down arrow or <code class="language-plaintext highlighter-rouge">j</code> to walk to the door. On the way, the <code class="language-plaintext highlighter-rouge">x</code> is killed by the cat.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">Rex misses the grid bug. Rex bites the grid bug.</code></p>
      <p><code class="language-plaintext highlighter-rouge">Rex bites the grid bug. The grid bug is killed!</code></p>
    </blockquote>
    <blockquote>
      <p>Press <code class="language-plaintext highlighter-rouge">Ctrl-P</code> to view earlier messages.</p>
    </blockquote>
    <blockquote>
      <p>Early on, it is quite common for your pet to be a better fighter than you.</p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code> -----
 |..&lt;|
 |..(|
 |..f|
 |...|
 |..@+
 -----
</code></pre>
      </div>
    </div>
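    <p>You have to <strong>open the door</strong> to get out. Press <code class="language-plaintext highlighter-rouge">ol</code> to open the door on the right (<code class="language-plaintext highlighter-rouge">l</code> being the movement key for that direction). After that you can head off exploring to the right. After following the corridor (<code class="language-plaintext highlighter-rouge">#</code>) for a while, the path seems to end.</p>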
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>-----
|..&lt;|
|..(|
|...|
|...|
|...-#
-----#
     ###
       #
       #
       ###f
          @
</code></pre>
      </div>
    </div>
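    <p>This is also routine: you need to search a bit to find the way. Press <code class="language-plaintext highlighter-rouge">s</code> to <strong>search</strong>, or more commonly press <code class="language-plaintext highlighter-rouge">10s</code> to search 10 times. If nothing turns up, do another 10. A hidden door is found, so we open it and walk into the room.</p>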
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>You find a hidden door.





  -----
  |..&lt;|
  |..(|
  |...|
  |...|
  |...-#
  -----#
       ###
         #
         #
         ###f
            @+
</code></pre>
      </div>
    </div>
    <p>There is a <code class="language-plaintext highlighter-rouge">[</code> in the room. Step onto the item and press <code class="language-plaintext highlighter-rouge">,</code> to <strong>pick it up</strong>: <code class="language-plaintext highlighter-rouge">p - a splint mail.</code> This is body armor, so press <code class="language-plaintext highlighter-rouge">W</code> to <strong>wear</strong> it, then <code class="language-plaintext highlighter-rouge">?</code> to list the wearable items and <code class="language-plaintext highlighter-rouge">p</code> to select the newly found armor. The game replies <code class="language-plaintext highlighter-rouge">You cannot wear armor over a cloak.</code>, so the cloak has to <strong>come off</strong> first. Press <code class="language-plaintext highlighter-rouge">T</code>; once the message confirms it is off, put on the armor, then press <code class="language-plaintext highlighter-rouge">W</code> again to put the cloak back on. AC has now dropped to 3, so this is good armor.</p>
    <blockquote>
      <p>We got lucky. If <code class="language-plaintext highlighter-rouge">AC</code> goes up instead, the armor is most likely cursed, and cursed weapons and items cannot be removed.</p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>                                             ----------------
                                            #@d.............|
  -----                                     f|...............           ....
  |..&lt;|                                  #### ..............|
  |..(|                                  ####+ .............+
  |...|                                  ##    --------- ----
  |...|                                  ##
  |...-#                                 ##
  -----#     ---------                   ##
       ###   |.......|      -----+----  ###
         #   |.......|      |.........###
         #   |.......|      |.......{|####
         ####|........######-........|
            #-.......|      |........|
             |.......|      |..&gt;.....|
             ---------      |........|
                            ----------

Petergu the Evoker             St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:1 $:0 HP:10(12) Pw:7(7) AC:3 Xp:1
</code></pre>
      </div>
    </div>
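    <p>Exploring further, we open a door and see a jackal. Press the key for moving right (<code class="language-plaintext highlighter-rouge">l</code>) to <strong>attack</strong> it. A few turns later the jackal is dead, and you are down to 8 HP.</p>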
    <blockquote>
      <p>Of course, attacking a monster on sight without weighing up your own abilities is not a good strategy.</p>
    </blockquote>
    <blockquote>
      <p>For example, if you hit a floating eye in melee, you will be frozen for a number of turns and then killed by some minor monster passing by.</p>
    </blockquote>
    <blockquote>
      <p>Press <code class="language-plaintext highlighter-rouge">.</code> to rest in place, which slowly restores lost HP.</p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>You kill the jackal!  You hear water falling on coins.



                                             ----------------
                                            f@%.............|
  -----                                     #|...............           ....
  |..&lt;|                                  #### ..............|
  |..(|                                  ####+ .............+
  |...|                                  ##    --------- ----
  |...|                                  ##
  |...-#                                 ##
  -----#     ---------                   ##
       ###   |.......|      -----+----  ###
         #   |.......|      |.........###
         #   |.......|      |.......{|####
         ####|........######-........|
            #-.......|      |........|
             |.......|      |..&gt;.....|
             ---------      |........|
                            ----------

Petergu the Evoker             St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:1 $:0 HP:8(12) Pw:7(7) AC:3 Xp:1
</code></pre>
      </div>
    </div>
    <p>The message shows that the jackal was killed. You also hear a sound.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">You hear water falling on coins.</code> means there is a fountain on this level (the <code class="language-plaintext highlighter-rouge">{</code> in the middle room).</p>
    </blockquote>
    <p>The <code class="language-plaintext highlighter-rouge">%</code> on the floor is the jackal's corpse. After killing a monster you will usually want to <strong>eat</strong> the corpse; otherwise you will run out of food and starve. Step onto the <code class="language-plaintext highlighter-rouge">%</code>, press <code class="language-plaintext highlighter-rouge">e</code>, and answer <code class="language-plaintext highlighter-rouge">y</code> to eat it.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">You see here a jackal corpse.</code></p>
      <p><code class="language-plaintext highlighter-rouge">There is a jackal corpse here; eat it? [ynq] (n) y</code></p>
      <p><code class="language-plaintext highlighter-rouge">This jackal corpse tastes okay. You finish eating the jackal corpse.</code></p>
    </blockquote>
    <blockquote>
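      <p>Of course, not everything is edible. Some corpses will poison you, and if you vomit as a result you end up even hungrier. On the other hand, some corpses grant abilities such as poison resistance or cold resistance. Touching a cockatrice corpse turns you to stone.</p>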
    </blockquote>
    <p>At this point we can see that we are full: the status shows Satiated.</p>
    <blockquote>
      <p>Continuing to eat a lot in this state may make you choke to death.</p>
    </blockquote>
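    <p>We also find a locked door. We have no lock-picking tool yet, so the only option is to <strong>kick</strong> it open. Press <code class="language-plaintext highlighter-rouge">Ctrl-D</code> followed by the direction <code class="language-plaintext highlighter-rouge">h</code>, and kick the door a few times.</p>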
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>This door is locked.



                                             ----------------
                                            #-..............|            -----
  -----                                     #|.f.............           .....|
  |..&lt;|                                  ####|..............|
  |..(|                                  ####+@.............+
  |...|                                  ##  ----------- ----
  |...|                                  ##
  |...-#                                 ##
  -----#     ---------                   ##
       ###   |.......|      -----+----  ###
         #   |.......|      |.........###
         #   |.......|      |.......{|####
         ####|........######-........|
            #-.......|      |........|
             |.......|      |..&gt;.....|
             ---------      |........|
                            ----------

Petergu the Evoker             St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:1 $:0 HP:9(12) Pw:7(7) AC:3 Xp:1
</code></pre>
      </div>
    </div>
    <p>It took many kicks before the door opened. Two jackals showed up in the meantime, and the cat killed each of them with a single bite.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">This door is locked.</code></p>
      <p><code class="language-plaintext highlighter-rouge">WHAMMM!!! Rex misses the jackal.</code></p>
      <p><code class="language-plaintext highlighter-rouge">WHAMMM!!! Rex bites the jackal. The jackal is killed!</code></p>
      <p><code class="language-plaintext highlighter-rouge">WHAMMM!!!</code></p>
      <p><code class="language-plaintext highlighter-rouge">WHAMMM!!!</code></p>
      <p><code class="language-plaintext highlighter-rouge">WHAMMM!!!</code></p>
      <p><code class="language-plaintext highlighter-rouge">As you kick the door, it crashes open! The jackal bites!</code></p>
      <p><code class="language-plaintext highlighter-rouge">Rex misses the jackal.</code></p>
      <p><code class="language-plaintext highlighter-rouge">You miss the jackal. The jackal bites! Rex bites the jackal.</code></p>
      <p><code class="language-plaintext highlighter-rouge">The jackal is killed!</code></p>
    </blockquote>
    <blockquote>
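      <p>Notice the partially drawn room on the right of the map? We only "passed by" its doorway from the neighboring room, so our line of sight covered just part of the interior. A text interface does not mean there is no lighting and visibility handling.</p>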
    </blockquote>
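    <p>By the time the first level was explored, we had picked up a few more amulets, rings, and so on. Carrying too much put us in the <code class="language-plaintext highlighter-rouge">Burdened</code> state. Walk onto the <code class="language-plaintext highlighter-rouge">&gt;</code> and press <code class="language-plaintext highlighter-rouge">&gt;</code> to <strong>descend</strong> to the next dungeon level.</p>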
    <blockquote>
      <p>Being burdened is really bad, but I just do not want to throw things away.</p>
    </blockquote>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">You fall down the stairs.</code></p>
    </blockquote>
    <blockquote>
      <p>When you go downstairs, your pet comes along if it is within one square of you. Otherwise it is left behind and gradually turns feral.</p>
    </blockquote>
    <p>By now the pack contains some gear we cannot identify.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">r - a sapphire ring</code></p>
      <p><code class="language-plaintext highlighter-rouge">t - a pyramidal amulet</code></p>
    </blockquote>
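    <p><strong>Identifying items</strong> (identification) is an important and difficult part of NetHack. The most basic method is to try things out yourself: <code class="language-plaintext highlighter-rouge">P</code> - <code class="language-plaintext highlighter-rouge">?</code> - <code class="language-plaintext highlighter-rouge">t</code> puts on the amulet. Nothing seems to change, and we still do not know what it is. Press <code class="language-plaintext highlighter-rouge">R</code> to remove it. The ring is the same story: nothing gets identified, so we have to give up for now.</p>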
    <blockquote>
      <p>There are many advanced identification techniques, such as observing sell prices in shops.</p>
    </blockquote>
    <blockquote>
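      <p>In NetHack every kind of ring, amulet, and scroll has its own appearance. Within one game the mapping between appearance and function stays fixed, but it is randomized between games. This is not trivial: for instance, rings are made of different materials (metal, stone, etc.), which determines which permanent properties a player can gain in a given game by polymorphing and eating rings.</p>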
    </blockquote>
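    <p>The idea of an appearance-to-function mapping that is fixed within a game but reshuffled between games can be sketched as follows. This is an illustration of the concept only, not NetHack's actual algorithm; the two short lists are small illustrative subsets, and <code class="language-plaintext highlighter-rouge">new_game_mapping</code> is a hypothetical helper.</p>

```python
import random

# Small illustrative subsets; the real game has many more of each.
APPEARANCES = ["sapphire", "ruby", "jade", "coral"]
FUNCTIONS = ["hunger", "invisibility", "fire resistance", "teleportation"]


def new_game_mapping(seed):
    """Shuffle functions over appearances once, at the start of a game."""
    rng = random.Random(seed)
    shuffled = FUNCTIONS[:]
    rng.shuffle(shuffled)
    return dict(zip(APPEARANCES, shuffled))


if __name__ == "__main__":
    for game_seed in (1, 2):
        print(f"game {game_seed}:", new_game_mapping(game_seed))
```

    <p>Within one game (one seed) a given appearance always means the same thing, which is why knowledge gained by trying an item carries over to every other item with that appearance in the same game.</p>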
    <p>The exploration continues in the same way. We come across a shop, but we have no money and can only leave.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">"Hello, petergu! Welcome to Akhalataki's used armor dealership!"</code></p>
    </blockquote>
    <blockquote>
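      <p>Attacking a shopkeeper is extremely unwise. However, if the shop stocks a wand of death, you can use it to kill the shopkeeper and take everything. If it stocks a wand of wishing, you can wish for a wand of death. Both have a base price of 500, which makes them easy to recognize.</p>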
    </blockquote>
    <blockquote>
      <p>If you are a Tourist and are wearing a Hawaiian shirt, shopkeepers will overcharge you.</p>
    </blockquote>
    <blockquote>
      <p>Pets can pick up items in a shop and drop them outside the door: shoplifting.</p>
      <p><code class="language-plaintext highlighter-rouge">You hear someone cursing shoplifters.</code></p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>         #
        ##
        #
        #
--------|--
|.......@.|
|[[[[[[f@[|
|[[[[[)[[)|
|))[[.[[[[|
-----------
</code></pre>
      </div>
    </div>
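    <p>After picking up and putting on a ruby ring, we discover that it is cursed: it cannot be taken off. We still do not know what it does, but it is probably nothing good.</p>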
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">E - a cursed ruby ring (on right hand)</code></p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>You see here a blindfold.  You faint from lack of food.--More--

                                       ------
                                       |....|
                                      #......###
                           ------######|....|  ##     ------
      ----------------     |....-######|....|   ###   |....|
      |..............-     |....|#     |....|     ####....F|
      .......@..F....|     |....|      |....|         |....-##
      |......f.......|     |..^..      --)---         ------ ###  ------------
      ---.------------     -|-.--        #                     ###..{........|
         ###                #            #####                    |..........|
           ####             #                ##                   |........&gt;.|
             -.----      ---.-      ----------.--                 ------------
             |....|      |...|      ............|
             |....|      |...|      |...........|
             |....|      |...|      |.&lt;.........+
             |.....######-...|      |...........|
             ------      -----      -------------



Petergu the Evoker             St:8 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:3 $:28 HP:15(15) Pw:23(23) AC:1 Xp:2 Fainted Burdened Deaf
</code></pre>
      </div>
    </div>
    <p>Before long, we faint from hunger. The ring is most likely a Ring of hunger. Press <code class="language-plaintext highlighter-rouge">e</code> to eat an egg and come to. Luckily there is a scroll of remove curse in the pack, so we read one: press <code class="language-plaintext highlighter-rouge">r</code> - <code class="language-plaintext highlighter-rouge">?</code> - <code class="language-plaintext highlighter-rouge">k</code>.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">As you read the scroll, it disappears.--More--</code></p>
      <p><code class="language-plaintext highlighter-rouge">You feel like someone is helping you. Rex misses the lichen.--More--</code></p>
    </blockquote>
    <p>After that, the ring can be taken off.</p>
    <p>Hungry again, we eat an egg, but it turns out to be rotten.</p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">Blecch! Rotten food! The world spins and goes dark.</code></p>
    </blockquote>
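    <p>After walking a while longer, there is no food left (the cat ate all the corpses) and we have already fainted. What now? Only the gods can save us.</p>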
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">Wizard Needs Food, Badly!</code></p>
    </blockquote>
    <p><code class="language-plaintext highlighter-rouge">#pray</code></p>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">You begin praying to Thoth. You are surrounded by a shimmering light.--More--</code></p>
      <p><code class="language-plaintext highlighter-rouge">You finish your prayer. You feel that Thoth is well-pleased.--More--</code></p>
      <p><code class="language-plaintext highlighter-rouge">Your stomach feels content.</code></p>
    </blockquote>
    <p><strong>Prayer</strong> is very powerful: look, we are full already. But if we pray too often, the god will get angry with us.</p>
    <blockquote>
      <p>Pray three times in a row on the same spot and you will be struck dead by lightning.</p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>

                                                 --------
                            ---------------######........#
           --------         |..@&gt;.........|#     |......|#     -----
           |......|         |.............|#    #-..B...|#     |...|
           |...:..|         ---|-----------#    #|......|#     |...|
           |......|            #   ##   ####    #|.......##    |...|
           |......|         ##*##########       #.......| #####....|
           |......|        ##  #    #####     ###-------- #    |...|
           -----|--       ##   #        #     # #         #    -|---
                #         #  ###        ##    ##          #     #
                #        ##  #           #    #           #     #
                #        #---.------     #    #           #     #
                #        #|........|    -|----#           ##    ##
                #       ##|........|    |..|&gt;|#            ####--|---------
               ##     ### |........|    |....|#               #...........|
           ----.-     #   |&lt;.......|    |....-#                |..........|
           |.....######   |........|    ------                 |..........|
           |...{| ######  .........|                           ------------
           ------         ----------

Petergu the Evoker             St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:4 $:62 HP:15(15) Pw:23(23) AC:1 Xp:2 Burdened
</code></pre>
      </div>
    </div>
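    <p>We find that this level has two down staircases. That is because one of them leads to the Gnomish Mines branch. This branch is fairly hard; unless you are playing a strong early-game role such as the Valkyrie, it is best not to wander in. The branch maps look like the one below, with gnomes <code class="language-plaintext highlighter-rouge">G</code> all around. We step in briefly and are immediately surrounded. Luckily the cat is strong and kills a few monsters, and we escape with 7 HP left. Of course, there is also plenty of equipment in the Mines. We swapped shoes, and <code class="language-plaintext highlighter-rouge">AC</code> is now 0, at the cost of heavy armor and hampered movement.</p>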
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">b - an uncursed +0 cloak of magic resistance (being worn)</code></p>
      <p><code class="language-plaintext highlighter-rouge">p - a +0 splint mail (being worn)</code></p>
      <p><code class="language-plaintext highlighter-rouge">B - a +0 iron skull cap (being worn)</code></p>
      <p><code class="language-plaintext highlighter-rouge">I - a +0 pair of hard shoes (being worn)</code></p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>   ---
   |..--
 ---....- -
 |.......|.
  .........-
   -.......|
    -......|
     -..f..--
     --.....|
      -..G..|
   |....|...| ---
   | ----.......|
       -.@......|
      |.........|
      |.......---
      ..*.G.....
     ------------
</code></pre>
      </div>
    </div>
    <p>On the next level we see a box. You can <strong>loot</strong> it with <code class="language-plaintext highlighter-rouge">#loot</code>. But the box is locked, so we kick it a few times to break the lock open.</p>
    <blockquote>
      <p>The price of doing this is that potions and fragile wands inside the box can be destroyed.</p>
    </blockquote>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">THUD! You break open the lock!</code></p>
    </blockquote>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>You see here a large box.


                                                           -----
                  -----------     #########################....|
                  |.........|    ###########        #      |...|
                  |.........|    #------      ------       .....
                  |.........|    #.....| #####...&lt;.|       |...|
                #`|..f..&gt;...|    #|.....##    |...)|       --.--
                 #-...@.....+     |....|      |...)|         #
                 #-----------    #-.----      |....|         #
                 #####        ##  ##          |....|###      #
                   # ################        #---.--##############
                              #              `############ --.-+-|----
                   .                               ###     ..........|
                  |.                                       |.........|
                  |.                                       |...... ..|
                  |.                                       |.........|
                  ---                                      |.........|
                                                           -----------


Petergu the Conjurer           St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:5 $:75 HP:25(25) Pw:31(31) AC:0 Xp:3 Burdened
</code></pre>
      </div>
    </div>
    <p>The box turns out to be empty, though.</p>
    <blockquote>
      <p>You can also put things into a box.</p>
    </blockquote>
    <p>On the next level we see something resembling a palace. In the middle is the Oracle, whom you can consult (<code class="language-plaintext highlighter-rouge">#chat</code>) for 50 gold.</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>You see here a statue of a plains centaur.


                                                   -------
             ------------           ###############-....*|
             |..........|        ####              |.....|
             ...........| ############        #####-.^....
             |.....&lt;....|##     #-----|-------#    -------
             -|----------#       |C.........C.#
              ############       |.....C.....|#
                                 |...-----...|
                                 |...|.{.|...|
                                 |..@|{.{|C..|
                                 |.....{.|...|
                                 |..f-----...|
                                 |.....C.....|
                                 |C.........C|
                                 -------------




Petergu the Conjurer           St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:6 $:75 HP:25(25) Pw:31(31) AC:0 Xp:3 Burdened
</code></pre>
      </div>
    </div>
    <p>On the next level we run into a giant bat. Bitten down to low HP, we flee, but it gives chase and catches us in a narrow corridor.</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>The giant bat bites!--More--

                                          -----                       -----
                                          |...|                #######-...|
                                         #|....#################      |...|
                                         #-...|#                      |...|
                                         #|...|                       -...|
                                         #|...|                       |.&gt;.|
                                         #-----                       ---.-
                                         ##                              #
                                         #                               *
                                        ###                              #
                                         # #                             #
                                         ###                             #
                                           #                          -|-|-
                                           #                          |...|
                                           #                          |...|
                                          +######B@######`            ...&lt;|
                                         .|#                          -----
                                         .-#
                                         --#

Petergu the Conjurer           St:9 Dx:8 Co:16 In:17 Wi:14 Ch:11 Neutral
Dlvl:7 $:88 HP:5(29) Pw:43(43) AC:0 Xp:4 Burdened
</code></pre>
      </div>
    </div>
    <blockquote>
      <p><code class="language-plaintext highlighter-rouge">You die...--More--</code></p>
      <p><code class="language-plaintext highlighter-rouge">Do you want your possessions identified? [ynq] (n) </code></p>
    </blockquote>
    <p>And so we die here.</p>
    <p>After you die, the game asks whether you want to see what your items really were. You can take a look.</p>
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code> L - 3 uncursed scrolls of blank paper
 R - an uncursed scroll of fire
 Spellbooks
 l - a blessed spellbook of force bolt
 m - an uncursed spellbook of create monster
 x - an uncursed spellbook of slow monster
 Potions
 f - an uncursed potion of gain ability
 g - an uncursed potion of monster detection
 h - an uncursed potion of healing
 G - an uncursed potion of sickness
 O - an uncursed potion of water
 S - a blessed potion of sleeping
 Rings
 d - an uncursed ring of fire resistance
 e - an uncursed ring of invisibility
 r - an uncursed ring of cold resistance
 E - an uncursed ring of aggravate monster
 H - an uncursed ring of sustain ability
 Wands
 c - a wand of magic missile (0:4)
 Tools
 n - a magic marker (0:55)
 (2 of 3)
</code></pre>
      </div>
    </div>
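    <p>Usually, if you do look, you will find that the pack held at least one item that could have prevented the death (for example, drinking a potion at low HP, or this wand of magic missile, which looks promising). But you only live once, and that is what a Roguelike is.</p>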
    <div class="language-plaintext highlighter-rouge">
      <div class="highlight">
        <pre class="highlight"><code>
                       ----------
                      /          \
                     /    REST    \
                    /      IN      \
                   /     PEACE      \
                  /                  \
                  |     petergu      |
                  |      88 Au       |
                  |   killed by a    |
                  |    giant bat     |
                  |                  |
                  |                  |
                  |       2021       |
                 *|     *  *  *      | *
        _________)/\\_//(\/(/\)/\//\/|_)_______


Goodbye petergu the Wizard...

You died in The Dungeons of Doom on dungeon level 7 with 704 points,
and 88 pieces of gold, after 3310 moves.
You were level 4 with a maximum of 29 hit points when you died.
</code></pre>
      </div>
    </div>
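    <h2 id="尾声">Epilogue</h2>
    <p>Through this typical run by a novice like me, I hope you now understand the basic operations of NetHack and can stumble around the dungeon and survive for a finite amount of time. Many important mechanics and commands are not covered in this tutorial (not even BUC), so you will have to look them up yourself. For ordinary players who want to be up to this kind of game, cheating, spoilers, and even reading the source code are necessary and nothing to be ashamed of.</p>
    <h2 id="参考资料">References</h2>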
    <ul>
      <li><a href="https://nethackwiki.com">https://nethackwiki.com</a></li>
      <li><a href="https://www.zhihu.com/question/40177337">https://www.zhihu.com/question/40177337</a></li>
      <li><a href="https://www.melankolia.net/nethack/nethack.guide.html">https://www.melankolia.net/nethack/nethack.guide.html</a></li>
      <li><a href="https://nethack.org/">https://nethack.org/</a></li>
      <li><a href="https://github.com/regymm/nethackassistant">https://github.com/regymm/nethackassistant</a></li>
      <li><a href="https://alt.org/nethack/">https://alt.org/nethack/</a></li>
    </ul>
    ]]></content><author><name>petergu</name></author><category term="Tutorial" /><category term="游戏" /><summary type="html"><![CDATA[]]></summary></entry><entry><title type="html">用 Python 处理大物实验数据</title><link href="https://lug.ustc.edu.cn/planet/2021/01/physexp-using-python/" rel="alternate" type="text/html" title="用 Python 处理大物实验数据" /><published>2021-01-25T00:00:00+08:00</published><updated>2024-09-17T14:00:46+08:00</updated><id>https://lug.ustc.edu.cn/planet/2021/01/physexp-using-python</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2021/01/physexp-using-python/"><![CDATA[<p>As a student at a certain <a href="http://世界一流退学.com">world-class "university of dropping out"</a>, I naturally could not escape the college physics labs. I had the good fortune to choose the track with the most physics labs, going from Level 1 all the way to Level 6, which ended only last semester. Data processing takes up a lot of time in these labs, so how can it be done "elegantly"?</p>
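  <h2 id="开始">Getting started</h2>
  <p>The software recommended in a lecture during my first year was Origin. It is convenient for plotting and fitting, but it is entirely mouse-driven and runs only on Windows. Under Wine, individual features sometimes misbehave, which hurts the experience (my computer at the time was too weak to run a virtual machine). After some thought, Python felt like the more natural choice, and so the story began.</p>
  <p>In the Level 1 and Level 2 physics labs, the needs were basically scatter plots (sometimes on logarithmic axes), linear fits, combined plots, uncertainty calculations, and general numerical work. On the input and output side, handwritten data, sometimes several pages of it, had to be typed in by hand; the plots were printed and handed in, and the computed results were copied by hand into the lab report.</p>
  <p>None of this is hard in Python. Matplotlib makes plotting very convenient and can save figures directly. Numerical work naturally goes to NumPy. For linear fitting I initially chose SciPy. Input and output stayed basic: read from a file.</p>
  <p>So the data and processing for the first experiment, "free fall", looked like this:</p>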
  <div class="language-plaintext highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>#photogate spacing H cm
#order of magnitude
e -2
90.0	90.0	90.0	80.0	80.0	80.0	70.0	70.0	70.0	60.0	60.0	60.0	50.0	50.0	50.0	40.0	40.0	40.0	30.0	30.0	30.0	20.0	20.0	20.0
#time difference t ms
e -3
331.6	331.5	331.8	307.9	307.9	307.9	282.9	282.8	282.9	255.8	255.7	255.7	226.9	227.0	226.9	195.2	195.2	195.2	159.9	159.9	159.8	119.2	119.1	119.2
</code></pre>
    </div>
  </div>
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
# -*- coding: utf-8 -*-
</span><span class="kn">import</span> <span class="n">numpy</span> <span class="k">as</span> <span class="n">np</span>
<span class="kn">import</span> <span class="n">matplotlib.pyplot</span> <span class="k">as</span> <span class="n">plt</span>
<span class="kn">from</span> <span class="n">scipy</span> <span class="kn">import</span> <span class="n">stats</span>
<span class="kn">import</span> <span class="n">math</span>
<span class="c1">#avoid font problem
</span><span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">font.sans-serif</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">SimHei</span><span class="sh">'</span><span class="p">]</span>
<span class="n">plt</span><span class="p">.</span><span class="n">rcParams</span><span class="p">[</span><span class="sh">'</span><span class="s">axes.unicode_minus</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="bp">False</span>
<span class="c1">#read data
</span><span class="n">data</span> <span class="o">=</span> <span class="p">[]</span>
<span class="c1">#order of magnitude
</span><span class="n">oom</span> <span class="o">=</span> <span class="mi">0</span>
<span class="n">fin</span> <span class="o">=</span> <span class="nf">open</span><span class="p">(</span><span class="sh">'</span><span class="s">./data.txt</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">r</span><span class="sh">'</span><span class="p">)</span>
<span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">fin</span><span class="p">.</span><span class="nf">readlines</span><span class="p">():</span>
    <span class="k">if</span> <span class="n">i</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="sh">'</span><span class="s">#</span><span class="sh">'</span><span class="p">:</span>
        <span class="c1">#line start with # is comment
</span>        <span class="k">pass</span>
    <span class="k">elif</span> <span class="n">i</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="sh">'</span><span class="s">e</span><span class="sh">'</span><span class="p">:</span>
        <span class="n">oom</span> <span class="o">=</span> <span class="nf">int</span><span class="p">(</span><span class="n">i</span><span class="p">.</span><span class="nf">split</span><span class="p">()[</span><span class="mi">1</span><span class="p">])</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">data</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="nf">float</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">*</span> <span class="nf">pow</span><span class="p">(</span><span class="mi">10</span><span class="p">,</span> <span class="n">oom</span><span class="p">)</span> <span class="k">for</span> <span class="n">x</span> <span class="ow">in</span> <span class="n">i</span><span class="p">.</span><span class="nf">split</span><span class="p">()]))</span>
        <span class="n">oom</span> <span class="o">=</span> <span class="mi">0</span>
<span class="c1"># print(data)
### main processing ###
</span><span class="n">l</span> <span class="o">=</span> <span class="nf">len</span><span class="p">(</span><span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span>
<span class="n">x</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>
<span class="n">y0</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
<span class="n">data</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">y0</span> <span class="o">/</span> <span class="n">x</span><span class="p">)</span>
<span class="n">y1</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span>
<span class="n">slope</span><span class="p">,</span> <span class="n">intercept</span><span class="p">,</span> <span class="n">r_value</span><span class="p">,</span> <span class="n">p_value</span><span class="p">,</span> <span class="n">std_err</span> <span class="o">=</span> <span class="n">stats</span><span class="p">.</span><span class="nf">linregress</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y1</span><span class="p">)</span>
<span class="c1"># z = np.polyfit(x, y1, 1)
</span><span class="n">s_slope</span> <span class="o">=</span> <span class="n">slope</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">((</span><span class="n">r_value</span> <span class="o">**</span> <span class="o">-</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="n">l</span> <span class="o">-</span> <span class="mi">2</span><span class="p">))</span>
<span class="n">s_intercept</span> <span class="o">=</span> <span class="n">s_slope</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">(</span><span class="n">np</span><span class="p">.</span><span class="nf">mean</span><span class="p">(</span><span class="n">x</span> <span class="o">**</span> <span class="mi">2</span><span class="p">))</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">linear regression:</span><span class="sh">'</span> <span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">slope:</span><span class="sh">'</span><span class="p">,</span> <span class="n">slope</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">intercept:</span><span class="sh">'</span><span class="p">,</span> <span class="n">intercept</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">r-value:</span><span class="sh">'</span><span class="p">,</span> <span class="n">r_value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">p-value:</span><span class="sh">'</span><span class="p">,</span> <span class="n">p_value</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">std-err:</span><span class="sh">'</span><span class="p">,</span> <span class="n">std_err</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">r-squared:</span><span class="sh">'</span><span class="p">,</span> <span class="n">r_value</span> <span class="o">**</span> <span class="mi">2</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">slope standard deviation:</span><span class="sh">'</span><span class="p">,</span> <span class="n">s_slope</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">intercept standard deviation:</span><span class="sh">'</span><span class="p">,</span> <span class="n">s_intercept</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">computed gravitational acceleration:</span><span class="sh">'</span><span class="p">,</span> <span class="mi">2</span> <span class="o">*</span> <span class="n">slope</span><span class="p">)</span>
<span class="c1">#plot
</span><span class="n">plt</span><span class="p">.</span><span class="nf">scatter</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y1</span><span class="p">,</span> <span class="n">marker</span><span class="o">=</span><span class="sh">'</span><span class="s">*</span><span class="sh">'</span><span class="p">,</span> <span class="n">color</span><span class="o">=</span><span class="sh">'</span><span class="s">black</span><span class="sh">'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">'</span><span class="s">raw data</span><span class="sh">'</span><span class="p">)</span>
<span class="c1"># plt.plot(x, y1, '--', color='green', label='smooth curve')
</span><span class="n">plt</span><span class="p">.</span><span class="nf">plot</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">intercept</span> <span class="o">+</span> <span class="n">slope</span> <span class="o">*</span> <span class="n">x</span><span class="p">,</span> <span class="sh">'</span><span class="s">r</span><span class="sh">'</span><span class="p">,</span> <span class="n">label</span><span class="o">=</span><span class="sh">'</span><span class="s">fitted line</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">xlabel</span><span class="p">(</span><span class="sh">'</span><span class="s">time t/s</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">ylabel</span><span class="p">(</span><span class="sh">'</span><span class="s">mean velocity H/t / m/s</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">legend</span><span class="p">(</span><span class="n">loc</span><span class="o">=</span><span class="mi">4</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">title</span><span class="p">(</span><span class="sh">'</span><span class="s">Mean velocity of the falling ball versus time</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">savefig</span><span class="p">(</span><span class="sh">'</span><span class="s">pic.png</span><span class="sh">'</span><span class="p">)</span>
<span class="n">plt</span><span class="p">.</span><span class="nf">show</span><span class="p">()</span>
</code></pre>
    </div>
  </div>
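  <p>The script above fits the mean velocity H/t against t. For a body falling with constant acceleration, H = v0*t + g*t^2/2, so H/t = v0 + (g/2)*t and the slope of the fitted line is g/2. A minimal standard-library sketch of the same calculation follows; it uses only the first reading at each height from the data file, and the helper name <code class="language-plaintext highlighter-rouge">linear_fit</code> is made up for this illustration (it is not part of SciPy). It also checks that the r-based expression used for <code class="language-plaintext highlighter-rouge">s_slope</code> agrees with the standard error computed from the residuals.</p>

```python
import math

# First reading at each height from the data file above (cm and ms).
H_cm = [90.0, 80.0, 70.0, 60.0, 50.0, 40.0, 30.0, 20.0]
t_ms = [331.6, 307.9, 282.9, 255.8, 226.9, 195.2, 159.9, 119.2]


def linear_fit(x, y):
    """Least-squares line y = a + b*x.

    Returns (b, a, r, se_b, se_a): slope, intercept, correlation
    coefficient, and the standard errors of slope and intercept.
    Hypothetical helper written for this example only.
    """
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    r = sxy / math.sqrt(sxx * syy)
    # Residual sum of squares, with n - 2 degrees of freedom.
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se_b = math.sqrt(rss / (n - 2) / sxx)
    se_a = se_b * math.sqrt(sum(xi ** 2 for xi in x) / n)
    return b, a, r, se_b, se_a


t = [v * 1e-3 for v in t_ms]   # seconds
H = [v * 1e-2 for v in H_cm]   # metres
v_mean = [h / ti for h, ti in zip(H, t)]

slope, intercept, r, se_slope, se_intercept = linear_fit(t, v_mean)
g = 2 * slope
# Same quantity as s_slope in the script above, written in terms of r.
se_from_r = abs(slope) * math.sqrt((r ** -2 - 1) / (len(t) - 2))

print('g =', g)
print('slope standard error (residuals):', se_slope)
print('slope standard error (from r):   ', se_from_r)
```

  <p>The two standard-error values agree up to rounding, because both reduce to the residual sum of squares divided by (n - 2) times the spread of x.</p>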
  <p><img src="/static/planet/2021-01-25-physexp-using-python-1.png" alt="1" /></p>
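  <p>That was a lot of trouble just to draw one plot. Clearly, apart from being Python practice, this was a net loss in productivity compared with Origin.</p>
  <p>The next problem was how to simplify this process.</p>
  <h2 id="两年之后">Two years later</h2>
  <p>After more than a year of development, I packaged some common plotting and data-processing operations into a library, and added convenient file input and automatic generation of docx files. I also published the Python package <code class="language-plaintext highlighter-rouge">physicsexp</code> to PyPI (as with the AUR, there is almost no barrier to publishing a package on PyPI).</p>
  <p>Here I take the Level 3 lab on β-ray absorption as an example. Reading the input file now takes only this:</p>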
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="n">fin</span> <span class="o">=</span> <span class="nf">open</span><span class="p">(</span><span class="sh">'</span><span class="s">./data.txt</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">r</span><span class="sh">'</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="sh">'</span><span class="s">utf-8</span><span class="sh">'</span><span class="p">)</span>
<span class="n">pos</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">N</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">Al_num</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">Cnt</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">fin</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>
</code></pre>
    </div>
  </div>
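  <p>The article does not show how <code class="language-plaintext highlighter-rouge">readoneline</code> is implemented. As a rough sketch only, assuming the data file keeps the format of the free-fall example (lines starting with <code class="language-plaintext highlighter-rouge">#</code> are comments, a line <code class="language-plaintext highlighter-rouge">e N</code> sets the order of magnitude for the next data row), such a helper could look like the following. The name <code class="language-plaintext highlighter-rouge">read_one_line_sketch</code> is hypothetical and is not the library's function.</p>

```python
import io


def read_one_line_sketch(fin):
    """Return the next data row from fin as a list of floats.

    Comment lines (starting with '#') and blank lines are skipped.
    A line of the form 'e N' sets the power of ten applied to the
    next data row. Returns None when no data row is left.
    Hypothetical helper, not the physicsexp implementation.
    """
    oom = 0  # order of magnitude for the upcoming row
    for line in fin:
        line = line.strip()
        if not line or line.startswith('#'):
            continue
        if line.startswith('e'):
            oom = int(line.split()[1])
            continue
        return [float(v) * 10 ** oom for v in line.split()]
    return None


# Usage with an in-memory file holding two rows in the same format.
sample = io.StringIO(
    "#H cm\n"
    "e -2\n"
    "90.0\t80.0\n"
    "#t ms\n"
    "e -3\n"
    "331.6\t307.9\n"
)
heights = read_one_line_sketch(sample)
times = read_one_line_sketch(sample)
print(heights, times)
```

  <p>Each call consumes one data row, which matches the one-call-per-variable pattern in the snippet above.</p>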
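  <p>Plotting data also takes only one line of code, and putting several curves on one figure is just a matter of one parameter. For example, here is a figure with three curves:</p>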
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Emeasure</span><span class="p">,</span> <span class="n">show</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">+</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">measured kinetic energy</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Eclassic</span><span class="p">,</span> <span class="n">show</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">*</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">classical kinetic energy</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Erela</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">o</span><span class="sh">'</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="sh">'</span><span class="s">1.png</span><span class="sh">'</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">xlab</span><span class="o">=</span><span class="sh">'</span><span class="s">$pc/MeV$</span><span class="sh">'</span><span class="p">,</span> <span class="n">ylab</span><span class="o">=</span><span class="sh">'</span><span class="s">$E/MeV$</span><span class="sh">'</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="sh">'</span><span class="s">Electron kinetic energy versus momentum</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">relativistic kinetic energy</span><span class="sh">'</span><span class="p">)</span>
</code></pre>
    </div>
  </div>
  <p>Plotting together with a linear fit is also a very common operation, so it went into the library as well:</p>
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="n">slope</span><span class="p">,</span> <span class="n">intercept</span> <span class="o">=</span> <span class="nf">simple_linear_plot</span><span class="p">(</span><span class="n">Al_Real</span><span class="p">,</span> <span class="n">CntLn</span><span class="p">,</span> <span class="n">xlab</span><span class="o">=</span><span class="sh">'</span><span class="s">mass thickness $g/cm^{2}$</span><span class="sh">'</span><span class="p">,</span> <span class="n">ylab</span><span class="o">=</span><span class="sh">'</span><span class="s">log of selected-region count rate (ray intensity)</span><span class="sh">'</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="sh">'</span><span class="s">Semi-log curve</span><span class="sh">'</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="sh">'</span><span class="s">3.png</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="o">-</span><span class="n">slope</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="mf">1e4</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="o">-</span><span class="n">slope</span><span class="p">))</span>
<span class="nf">print</span><span class="p">((</span><span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="n">Cnt</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o">-</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="o">-</span> <span class="n">intercept</span><span class="p">)</span> <span class="o">/</span> <span class="n">slope</span><span class="p">)</span>
</code></pre>
    </div>
  </div>
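  <p>To make the three printed numbers concrete, here is a minimal sketch of the semi-log fit using only the standard library. It assumes the count rate follows I(x) = I0 · exp(-μx); the helper <code class="language-plaintext highlighter-rouge">linear_fit</code> and the synthetic data are illustrative stand-ins, not the <code class="language-plaintext highlighter-rouge">simple_linear_plot</code> function or the measurements used above.</p>

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic data (an assumption, not real measurements):
# I(x) = I0 * exp(-mu * x) with mu = 0.01 per mg/cm^2.
mu_true, i0 = 0.01, 200.0
thickness = [50.0 * k for k in range(8)]           # mass thickness in mg/cm^2
rate = [i0 * math.exp(-mu_true * x) for x in thickness]

slope, intercept = linear_fit(thickness, [math.log(r) for r in rate])
mu = -slope                                         # absorption coefficient
range_1e4 = math.log(1e4) / mu                      # thickness that cuts the rate by 10^4
# Same quantity, measured from the first data point instead of the fitted intercept.
range_from_first = (math.log(rate[0]) - 4 * math.log(10) - intercept) / slope
print(mu, range_1e4, range_from_first)
```

  <p>On noise-free data the two range estimates coincide; on real data they differ by however far the first point sits from the fitted line.</p>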
  <p>Finally, a single line of code turns all the data and figures into a ready-to-print docx file, complete with name and date:</p>
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="nf">gendocx</span><span class="p">(</span><span class="sh">'</span><span class="s">gen.docx</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">1.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">2.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">3.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">slope, intercept: %f %f</span><span class="sh">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">slope</span><span class="p">,</span> <span class="n">intercept</span><span class="p">))</span>
</code></pre>
    </div>
  </div>
  <p>The generated file looks like this and can go straight to the printer:</p>
  <p><img src="/static/planet/2021-01-25-physexp-using-python-2.png" alt="2" /></p>
  <p>Here is the complete code:</p>
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c1">#!/usr/bin/env python3
# -*- coding: utf-8 -*-
</span>
<span class="kn">from</span> <span class="n">physicsexp.mainfunc</span> <span class="kn">import</span> <span class="o">*</span>
<span class="kn">from</span> <span class="n">physicsexp.gendocx</span> <span class="kn">import</span> <span class="o">*</span>

<span class="c1"># read data
</span><span class="n">fin</span> <span class="o">=</span> <span class="nf">open</span><span class="p">(</span><span class="sh">'</span><span class="s">./data.txt</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">r</span><span class="sh">'</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="sh">'</span><span class="s">utf-8</span><span class="sh">'</span><span class="p">)</span>
<span class="n">pos</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">N</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">Al_num</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">Cnt</span> <span class="o">=</span> <span class="nf">readoneline</span><span class="p">(</span><span class="n">fin</span><span class="p">)</span>
<span class="n">fin</span><span class="p">.</span><span class="nf">close</span><span class="p">()</span>

<span class="c1"># data process
</span>
<span class="c1"># calculated calibration values in class
</span><span class="n">a</span> <span class="o">=</span> <span class="mf">2.373e-3</span>
<span class="n">b</span> <span class="o">=</span> <span class="o">-</span><span class="p">.</span><span class="mi">0161</span>
<span class="n">dEk</span> <span class="o">=</span> <span class="p">.</span><span class="mi">20</span>

<span class="n">c0</span> <span class="o">=</span> <span class="mf">299792458.</span>
<span class="n">MeV</span> <span class="o">=</span> <span class="mf">1e6</span> <span class="o">*</span> <span class="n">electron</span>

<span class="n">Emeasure</span> <span class="o">=</span> <span class="n">a</span> <span class="o">*</span> <span class="n">N</span> <span class="o">+</span> <span class="n">b</span> <span class="o">+</span> <span class="n">dEk</span>
<span class="n">x0</span> <span class="o">=</span> <span class="p">.</span><span class="mi">10</span>
<span class="n">R</span> <span class="o">=</span> <span class="p">(</span><span class="n">pos</span> <span class="o">-</span> <span class="n">x0</span><span class="p">)</span> <span class="o">/</span> <span class="mi">2</span>
<span class="n">B</span> <span class="o">=</span> <span class="mf">640.01e-4</span>
<span class="n">Momentum</span> <span class="o">=</span> <span class="mi">300</span> <span class="o">*</span> <span class="n">B</span> <span class="o">*</span> <span class="n">R</span>
<span class="n">Eclassic</span> <span class="o">=</span> <span class="p">((</span><span class="n">Momentum</span> <span class="o">*</span> <span class="n">MeV</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span> <span class="o">/</span> <span class="p">(</span><span class="mi">2</span> <span class="o">*</span> <span class="n">me</span> <span class="o">*</span> <span class="n">c0</span><span class="o">**</span><span class="mi">2</span><span class="p">))</span> <span class="o">/</span> <span class="n">MeV</span>
<span class="n">Erela</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">array</span><span class="p">([</span><span class="n">math</span><span class="p">.</span><span class="nf">sqrt</span><span class="p">((</span><span class="n">i</span> <span class="o">*</span> <span class="n">MeV</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span> <span class="o">+</span> <span class="p">(</span><span class="n">me</span> <span class="o">*</span> <span class="n">c0</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span> <span class="o">-</span> <span class="n">me</span> <span class="o">*</span> <span class="n">c0</span><span class="o">**</span><span class="mi">2</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="n">Momentum</span><span class="p">])</span> <span class="o">/</span> <span class="n">MeV</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">pos</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">pos</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">R</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">R</span><span class="o">*</span><span class="mi">100</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">pc</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">Momentum</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">N</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">N</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">Eclas</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">Eclassic</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">Erela</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">Erela</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="sh">'</span><span class="s">Emes</span><span class="se">\t</span><span class="sh">'</span><span class="p">,</span> <span class="n">Emeasure</span><span class="p">)</span>

<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Emeasure</span><span class="p">,</span> <span class="n">show</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">+</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">测量动能</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Eclassic</span><span class="p">,</span> <span class="n">show</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">*</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">经典动能</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Momentum</span><span class="p">,</span> <span class="n">Erela</span><span class="p">,</span> <span class="n">dot</span><span class="o">=</span><span class="sh">'</span><span class="s">o</span><span class="sh">'</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="sh">'</span><span class="s">1.png</span><span class="sh">'</span><span class="p">,</span> <span class="n">issetrange</span><span class="o">=</span><span class="mi">0</span><span class="p">,</span>
            <span class="n">xlab</span><span class="o">=</span><span class="sh">'</span><span class="s">$pc/MeV$</span><span class="sh">'</span><span class="p">,</span> <span class="n">ylab</span><span class="o">=</span><span class="sh">'</span><span class="s">$E/MeV$</span><span class="sh">'</span><span class="p">,</span> <span class="n">title</span><span class="o">=</span><span class="sh">'</span><span class="s">电子动能随动量变化曲线</span><span class="sh">'</span><span class="p">,</span> <span class="n">lab</span><span class="o">=</span><span class="sh">'</span><span class="s">相对论动能</span><span class="sh">'</span><span class="p">)</span>

<span class="n">Len</span> <span class="o">=</span> <span class="mi">150</span>
<span class="n">Cnt</span> <span class="o">=</span> <span class="n">Cnt</span> <span class="o">/</span> <span class="n">Len</span>
<span class="nf">simple_plot</span><span class="p">(</span><span class="n">Al_num</span><span class="p">,</span> <span class="n">Cnt</span><span class="p">,</span> <span class="n">xlab</span><span class="o">=</span><span class="sh">'</span><span class="s">铝片数</span><span class="sh">'</span><span class="p">,</span> <span class="n">ylab</span><span class="o">=</span><span class="sh">'</span><span class="s">选区计数率 (射线强度)</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">title</span><span class="o">=</span><span class="sh">'</span><span class="s">$</span><span class="se">\\</span><span class="s">beta$射线强度随铝片数衰减曲线</span><span class="sh">'</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="sh">'</span><span class="s">2.png</span><span class="sh">'</span><span class="p">)</span>
<span class="n">CntLn</span> <span class="o">=</span> <span class="n">np</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="n">Cnt</span><span class="p">)</span>
<span class="c1"># d = 50 mg / cm^2
</span><span class="n">d</span> <span class="o">=</span> <span class="mi">50</span>
<span class="n">Al_Real</span> <span class="o">=</span> <span class="n">Al_num</span> <span class="o">*</span> <span class="n">d</span>
<span class="n">slope</span><span class="p">,</span> <span class="n">intercept</span> <span class="o">=</span> <span class="nf">simple_linear_plot</span><span class="p">(</span><span class="n">Al_Real</span><span class="p">,</span> <span class="n">CntLn</span><span class="p">,</span> <span class="n">xlab</span><span class="o">=</span><span class="sh">'</span><span class="s">质量厚度$mg/cm^{2}$</span><span class="sh">'</span><span class="p">,</span> <span class="n">ylab</span><span class="o">=</span><span class="sh">'</span><span class="s">选区计数率对数 (射线强度)</span><span class="sh">'</span><span class="p">,</span>
                                      <span class="n">title</span><span class="o">=</span><span class="sh">'</span><span class="s">半对数曲线</span><span class="sh">'</span><span class="p">,</span> <span class="n">save</span><span class="o">=</span><span class="sh">'</span><span class="s">3.png</span><span class="sh">'</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="o">-</span><span class="n">slope</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="mf">1e4</span><span class="p">)</span> <span class="o">/</span> <span class="p">(</span><span class="o">-</span><span class="n">slope</span><span class="p">))</span>
<span class="nf">print</span><span class="p">((</span><span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="n">Cnt</span><span class="p">[</span><span class="mi">0</span><span class="p">])</span> <span class="o">-</span> <span class="mi">4</span> <span class="o">*</span> <span class="n">math</span><span class="p">.</span><span class="nf">log</span><span class="p">(</span><span class="mi">10</span><span class="p">)</span> <span class="o">-</span> <span class="n">intercept</span><span class="p">)</span> <span class="o">/</span> <span class="n">slope</span><span class="p">)</span>

<span class="nf">gendocx</span><span class="p">(</span><span class="sh">'</span><span class="s">gen.docx</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">1.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">2.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">3.png</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">slope, intercept: %f %f</span><span class="sh">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">slope</span><span class="p">,</span> <span class="n">intercept</span><span class="p">))</span>
</code></pre>
    </div>
  </div>
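  <p>The physics in the script above boils down to two formulas for the kinetic energy as a function of pc. The sketch below works directly in MeV, so it needs no unit constants from the library; the rounded rest energy 0.511 MeV and the 5 cm radius are illustrative assumptions, and the function names are made up for this example.</p>

```python
import math

ME_C2 = 0.511  # electron rest energy in MeV (rounded, assumption for this sketch)


def momentum_mev(b_tesla, radius_m):
    """pc in MeV for an electron on a circle of radius R in a field B: pc = 300 * B * R."""
    return 300.0 * b_tesla * radius_m


def classical_kinetic(pc):
    """Classical kinetic energy p^2 / (2m), written as (pc)^2 / (2 m c^2), in MeV."""
    return pc ** 2 / (2.0 * ME_C2)


def relativistic_kinetic(pc):
    """Relativistic kinetic energy sqrt((pc)^2 + (m c^2)^2) - m c^2, in MeV."""
    return math.sqrt(pc ** 2 + ME_C2 ** 2) - ME_C2


# B is the value from the script; R = 5 cm is chosen only for illustration.
pc = momentum_mev(640.01e-4, 0.05)
print(pc, classical_kinetic(pc), relativistic_kinetic(pc))
```

  <p>At momenta around 1 MeV the classical expression overshoots the relativistic one by a wide margin, which is the gap the first figure is meant to show.</p>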
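  <p>The whole script is 46 lines excluding blank lines and comments, while the free-fall script above is 45, so the wrapping holds up reasonably well. Is it faster to do the calculation and draw these three figures with the code above or with Origin? That comes down to personal habit; I am simply more used to code. At the very least, it shows that the data processing in the physics lab courses has enough in common that packaging a few functions gives a modest productivity boost.</p>
  <p>Of course, this is only one example. Some experiments take extra time however you process them (DC glow discharge plasma, I am looking at you).</p>
  <p>You may ask why there is no error analysis. That is a fair point: computing Type A and Type B uncertainties differs a lot from one experiment to another, and with my limited skills there was little I could do beyond tabulating the constants to provide the most basic functionality. Besides, by the time development was mostly done I was already halfway through Level 2, where uncertainty calculations are no longer common (so don't let Level 1 scare you), so this part is barely tested and remains a TODO.</p>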
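  <p>For orientation only, the textbook core of such a calculation can be sketched in a few lines. This is not part of the library: the function names are hypothetical, the readings are made up, and course-specific factors (t-factors, confidence coefficients) are deliberately left out.</p>

```python
import math
import statistics


def type_a(samples):
    """Type A standard uncertainty: standard deviation of the mean."""
    return statistics.stdev(samples) / math.sqrt(len(samples))


def type_b(instrument_limit, divisor=math.sqrt(3)):
    """Type B standard uncertainty from an instrument limit (uniform distribution by default)."""
    return instrument_limit / divisor


def combined(u_a, u_b):
    """Combine two independent components in quadrature."""
    return math.sqrt(u_a ** 2 + u_b ** 2)


readings = [10.02, 10.05, 9.98, 10.01, 10.04]   # made-up repeated readings
u = combined(type_a(readings), type_b(0.02))
print(statistics.mean(readings), u)
```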
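  <h2 id="迟到的-jupyter-notebook">A belated Jupyter Notebook</h2>
  <p>Everything above is plain Python, but in the Level 5 lab course a single try was enough to show me that Jupyter Notebook is clearly better suited to this kind of work, even though you cannot edit the code in Vim! Data can now be entered directly in the notebook, plots are what-you-see-is-what-you-get with no waiting for windows to pop up one by one, and ad-hoc calculations no longer disturb the main flow. As shown:</p>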
  <p><img src="/static/planet/2021-01-25-physexp-using-python-3.png" alt="3" /></p>
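  <p>Once you are on Jupyter, <a href="https://github.com/jupyterhub/jupyterhub">JupyterHub</a> is a very good choice for <strong>collaboration</strong>: several people can use Jupyter Notebook on one server. My setup gave each user an isolated Docker container with the Python packages preinstalled and ready to use, plus a mounted shared space for exchanging finished notebooks. JupyterHub does offer access control such as logging in with a GitHub account, but since we were a few people who knew each other, we did not bother with it.</p>
  <p>The code is on my <a href="https://github.com/ustcpetergu/physicsexp">GitHub</a>. If you get bored while writing a physics lab report and want somewhere to slack off and waste a little time, feel free to take a look.</p>
  <h2 id="总结">Summary</h2>
  <p>If you want to try processing physics lab data with Python, I can say fairly responsibly that it works without any problem for more than 95% of the experiments. With NumPy and SciPy for computation, Matplotlib for plotting, plus docx generation and Jupyter Notebook or JupyterHub for teamwork, you can do everything you need fairly comfortably (which does not necessarily mean saving time), and a wrapper library can make you more efficient.</p>
  <p>Earlier students have attempted similar physics lab automation projects, but since I cannot yet track them all down and compare them, I will not discuss them here. As a freshman I really did want to build something highly automated, but my skills were limited, the processing differs from experiment to experiment, and writing a dedicated program for every experiment single-handedly is not realistic. So I ended up digging a hole, jumping in, and not getting out. Sometimes I wonder whether a Casio 991 in the left hand and graph paper in the right would have been faster!</p>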
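  ]]></content><author><name>petergu</name></author><category term="USTC" /><category term="大物实验" /><category term="Python" /><summary type="html"><![CDATA[As a student at a certain world-class "dropout" university, there was no escaping the college physics lab. I was lucky enough to pick the major track with the most physics lab courses, working from Level 1 all the way to Level 6, which ended just last semester. Data processing takes up a lot of time in these labs, so how can one get it done "elegantly"?]]></summary></entry><entry><title type="html">Benchmarking program performance inside the Linux kernel</title><link href="https://lug.ustc.edu.cn/planet/2020/12/tic-toc-in-kernel/" rel="alternate" type="text/html" title="Benchmarking program performance inside the Linux kernel" /><published>2020-12-19T00:00:00+08:00</published><updated>2024-09-17T14:00:46+08:00</updated><id>https://lug.ustc.edu.cn/planet/2020/12/tic-toc-in-kernel</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2020/12/tic-toc-in-kernel/"><![CDATA[<p>This semester I was a teaching assistant for Li Cheng's compiler principles course. In the course labs we built a compiler system on top of LLVM, and one lab asked students to write back-end optimization algorithms. To evaluate the students' optimizations we had to compare the performance of the code (here, LLVM IR) before and after optimization. We compared performance by measuring run time, but user programs are subject to <a href="https://101.lug.ustc.edu.cn/Ch04/#schedule">kernel scheduling</a>. Because a program does not run continuously, scheduling delays add noise to the measured time, and that noise can make the results inaccurate <del>and even affect students' grades</del>. My first idea was to count instructions instead, using <code class="language-plaintext highlighter-rouge">perf stat</code>. However, the lab environment we provided runs in virtual machines, and neither <a href="https://www.virtualbox.org/ticket/10754?cversion=0&amp;cnum_hist=5">VirtualBox</a> nor <a href="https://github.com/microsoft/WSL/issues/4678">WSL</a> implements the relevant virtual registers, so getting students to use the command conveniently would have been hard, and I gave up on it. Then it occurred to me that a kernel thread can run without being scheduled out, so could a kernel thread measure time more precisely? So I gave it a try.</p>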
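  <p>Note: since the original goal was to handle LLVM IR, I compiled the kernel with <code class="language-plaintext highlighter-rouge">clang</code>. If you have no such need, you can ignore <code class="language-plaintext highlighter-rouge">clang</code> entirely.</p>
  <h2 id="编写内核模块">Writing a kernel module</h2>
  <h3 id="超简单的内核模块编写方法">A very simple way to write a kernel module</h3>
  <p>To create a kernel thread, we can build a kernel module and have it run the functions we care about. Building a basic kernel module is very easy; here we follow <a href="https://tldp.org/LDP/lkmpg/2.6/html/x121.html"><strong>The Linux Kernel Module Programming Guide</strong></a>:</p>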
  <div class="language-c highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="cm">/*
 *  hello_mod.c - The simplest kernel module.
 */</span>
<span class="cp">#include</span> <span class="cpf">&lt;linux/module.h&gt;</span><span class="c1">	/* Needed by all modules */</span><span class="cp">
#include</span> <span class="cpf">&lt;linux/kernel.h&gt;</span><span class="c1">	/* Needed for KERN_INFO */</span><span class="cp">
</span>
<span class="kt">int</span> <span class="nf">init_module</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
	<span class="n">printk</span><span class="p">(</span><span class="n">KERN_INFO</span> <span class="s">"Hello world.</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>

	<span class="cm">/*
	 * A non 0 return means init_module failed; module can't be loaded.
	 */</span>
	<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="p">}</span>

<span class="kt">void</span> <span class="nf">cleanup_module</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
	<span class="n">printk</span><span class="p">(</span><span class="n">KERN_INFO</span> <span class="s">"Goodbye world.</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
<span class="p">}</span>
</code></pre>
    </div>
  </div>
  <p>Then add a simple Makefile:</p>
  <div class="language-makefile highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c"># obj-m tells the build system to build a module; the build target is hello.ko
</span><span class="nv">obj-m</span> <span class="o">+=</span> hello.o
<span class="nv">hello-objs</span> <span class="o">:=</span> hello_mod.o

<span class="nl">all</span><span class="o">:</span>
	make <span class="nt">-C</span> /lib/modules/<span class="p">$(</span>shell <span class="nb">uname</span> <span class="nt">-r</span><span class="p">)</span>/build <span class="nv">M</span><span class="o">=</span><span class="p">$(</span>PWD<span class="p">)</span> modules

<span class="nl">clean</span><span class="o">:</span>
	make <span class="nt">-C</span> /lib/modules/<span class="p">$(</span>shell <span class="nb">uname</span> <span class="nt">-r</span><span class="p">)</span>/build <span class="nv">M</span><span class="o">=</span><span class="p">$(</span>PWD<span class="p">)</span> clean
</code></pre>
    </div>
  </div>
  <p>And that's it: a very simple kernel module is done!</p>
  <p>Compiling a kernel module requires the kernel headers; on Ubuntu you can install <code class="language-plaintext highlighter-rouge">linux-headers-&lt;version&gt;</code>. Or, if like me you need <code class="language-plaintext highlighter-rouge">clang</code>, download the Linux source, then rebuild and install the kernel (PS: after a full build, code completion in VSCode works very well). After that, run <code class="language-plaintext highlighter-rouge">sudo make CC=/path/to/clang</code> to build the module. Once the build finishes, a file named <code class="language-plaintext highlighter-rouge">hello.ko</code> appears in the directory; that is the compiled kernel module.</p>
  <p>In this module, <code class="language-plaintext highlighter-rouge">init_module</code> is called when the module is loaded, and <code class="language-plaintext highlighter-rouge">cleanup_module</code> is called when it is unloaded. <code class="language-plaintext highlighter-rouge">insmod hello.ko</code> loads the module and <code class="language-plaintext highlighter-rouge">rmmod hello</code> unloads it. <code class="language-plaintext highlighter-rouge">printk</code> writes to the kernel log, so running <code class="language-plaintext highlighter-rouge">dmesg</code> shows the module's hello and goodbye messages.</p>
  <h3 id="tic-toc-">Tic-Toc …</h3>
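  <p>We have a module now, but it cannot do anything useful; we still need the timing part. User-space language libraries usually offer a function for getting the current time, and so does the kernel. We can get the time with <code class="language-plaintext highlighter-rouge">ktime_get</code> from <code class="language-plaintext highlighter-rouge">ktime.h</code>; the logic is roughly:</p>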
  <div class="language-c highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="n">ktime_t</span> <span class="n">start</span><span class="p">,</span> <span class="n">end</span><span class="p">;</span>
<span class="n">u64</span> <span class="n">dur_ns</span><span class="p">;</span>
<span class="n">start</span> <span class="o">=</span> <span class="n">ktime_get</span><span class="p">();</span>
<span class="n">foo</span><span class="p">();</span> <span class="c1">// function under test</span>
<span class="n">end</span> <span class="o">=</span> <span class="n">ktime_get</span><span class="p">();</span>
<span class="n">dur_ns</span> <span class="o">=</span> <span class="n">ktime_to_ns</span><span class="p">(</span><span class="n">ktime_sub</span><span class="p">(</span><span class="n">end</span><span class="p">,</span> <span class="n">start</span><span class="p">));</span>
<span class="n">printk</span><span class="p">(</span><span class="n">KERN_INFO</span> <span class="s">"foo runtime: %lluns</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">dur_ns</span><span class="p">);</span>
</code></pre>
    </div>
  </div>
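  <p>As you can see, the kernel's function interfaces are quite approachable too; a few lines are enough to implement our core algorithm.</p>
  <h3 id="怎么使用它">How do we use it?</h3>
  <p>Now that we know how to measure time, we still need an interface that lets someone trigger it. Running the test at module initialization would work, but it would be far too inflexible. I decided to build the interface on <a href="https://en.wikipedia.org/wiki/Procfs">procfs</a>. Procfs comes from the UNIX philosophy that everything is a file: it exposes run-time system information as a directory tree, so system parameters can be read and controlled with simple file operations (such as <code class="language-plaintext highlighter-rouge">cat</code> or pipes). Its contents are available under the <code class="language-plaintext highlighter-rouge">/proc/</code> directory.</p>
  <p>During initialization, <code class="language-plaintext highlighter-rouge">proc_mkdir</code> and <code class="language-plaintext highlighter-rouge">proc_create</code> let us create directories and files under <code class="language-plaintext highlighter-rouge">/proc/</code> just as in an ordinary file system:</p>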
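  <p>For comparison, the same tic-toc pattern in user space can be sketched with Python's monotonic nanosecond clock. This is an analogue of the kernel snippet above, not part of the module; <code class="language-plaintext highlighter-rouge">time_ns</code> and the stand-in workload <code class="language-plaintext highlighter-rouge">foo</code> are names invented for this example.</p>

```python
import time


def time_ns(func, *args):
    """Return (result, elapsed nanoseconds) for a single call of func."""
    start = time.perf_counter_ns()   # monotonic clock, similar in spirit to ktime_get()
    result = func(*args)
    end = time.perf_counter_ns()
    return result, end - start


def foo(n=100_000):
    """Stand-in workload; replace with the function under test."""
    total = 0
    for i in range(n):
        total += i
    return total


result, dur_ns = time_ns(foo)
print(f"foo runtime: {dur_ns}ns")
```

  <p>Unlike the kernel version, this measurement still includes whatever scheduling delays the process experiences, which is exactly the noise the module is trying to avoid.</p>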
  <div class="language-c highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c1">// create a directory under /proc/</span>
<span class="k">if</span> <span class="p">((</span><span class="n">root</span> <span class="o">=</span> <span class="n">proc_mkdir</span><span class="p">(</span><span class="n">proc_dirname</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">))</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span>
    <span class="k">return</span> <span class="o">-</span><span class="n">EEXIST</span><span class="p">;</span>
<span class="c1">// create a file under /proc/&lt;proc_dirname&gt;/ and have it use bench_fops</span>
<span class="k">if</span> <span class="p">(</span><span class="n">proc_create</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="mo">0444</span><span class="p">,</span> <span class="n">root</span><span class="p">,</span> <span class="o">&amp;</span><span class="n">bench_fops</span><span class="p">)</span> <span class="o">==</span> <span class="nb">NULL</span><span class="p">)</span>
    <span class="k">return</span> <span class="o">-</span><span class="n">ENOMEM</span><span class="p">;</span>
</code></pre>
    </div>
  </div>
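  <p>Here <code class="language-plaintext highlighter-rouge">bench_fops</code> is a structure of type <code class="language-plaintext highlighter-rouge">struct file_operations</code> (since Linux 5.6, <code class="language-plaintext highlighter-rouge">proc_create</code> takes a <code class="language-plaintext highlighter-rouge">struct proc_ops</code> instead). It registers the operations the file supports, such as open, read and write. Designing a file system in the kernel requires implementing similar operations; fortunately, procfs does not require us to implement <a href="https://en.wikipedia.org/wiki/POSIX">POSIX</a> semantics. Here I want the benchmark to run whenever <code class="language-plaintext highlighter-rouge">write</code> is called, so that I can invoke it from a shell with just <code class="language-plaintext highlighter-rouge">echo</code> and a redirection:</p>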
  <div class="language-c highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c1">// none of the parameters are used; we only want to trigger one test run</span>
<span class="kt">ssize_t</span> <span class="nf">run_bench</span><span class="p">(</span><span class="k">struct</span> <span class="n">file</span> <span class="o">*</span><span class="n">filp</span><span class="p">,</span> <span class="k">const</span> <span class="kt">char</span> <span class="n">__user</span> <span class="o">*</span><span class="n">buf</span><span class="p">,</span>
                  <span class="kt">size_t</span> <span class="n">len</span><span class="p">,</span> <span class="n">loff_t</span> <span class="o">*</span><span class="n">ppos</span><span class="p">)</span> <span class="p">{</span>
    <span class="c1">// the function under test</span>
    <span class="n">foo</span><span class="p">();</span>
    <span class="c1">// returning the same length as the argument signals a successful write</span>
    <span class="k">return</span> <span class="n">len</span><span class="p">;</span>
<span class="p">}</span>

<span class="c1">// designated initializers were added in C99; all other members default to NULL.</span>
<span class="c1">// this means the file only supports writes</span>
<span class="k">static</span> <span class="k">const</span> <span class="k">struct</span> <span class="n">file_operations</span> <span class="n">bench_fops</span> <span class="o">=</span> <span class="p">{</span>
	<span class="p">.</span><span class="n">owner</span>		<span class="o">=</span> <span class="n">THIS_MODULE</span><span class="p">,</span>
	<span class="p">.</span><span class="n">write</span>		<span class="o">=</span> <span class="n">run_bench</span>
<span class="p">};</span>
</code></pre>
    </div>
  </div>
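  <p>Because any write to the proc entry invokes the handler, triggering a run from a script is a one-liner. A minimal sketch follows; the path <code class="language-plaintext highlighter-rouge">/proc/tiktok/bench</code> is hypothetical, since the real directory and file names are whatever the module passes to <code class="language-plaintext highlighter-rouge">proc_mkdir</code> and <code class="language-plaintext highlighter-rouge">proc_create</code>.</p>

```python
from pathlib import Path


def trigger_benchmark(proc_file):
    """Write one byte to the proc entry; any write invokes the module's write handler."""
    with open(proc_file, "w") as f:
        return f.write("1")


# Hypothetical path: substitute the names the module actually registers.
PROC_ENTRY = Path("/proc/tiktok/bench")

if PROC_ENTRY.exists():
    # The entry is created with mode 0444, so this typically needs root.
    # The timing result is printed to the kernel log (see dmesg).
    trigger_benchmark(PROC_ENTRY)
```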
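  <p>The module is now functionally complete, so all that is left is to give it a nice name. I combined tic-toc (tick-tock), the English sound of time passing, with the k of kernel, and got tiktok.</p>
  <h2 id="链接函数">Linking functions</h2>
  <p>I had assumed that, with those few lines of tiktok written, I could just compile the LLVM IR to an object file and link it into the module. In testing, however, the process never returned. Only after disassembling the same code as compiled inside the kernel module and as compiled externally did I understand that different compiler options lead to a different memory layout, so code compiled with default options cannot be used in the kernel. Kernel build options are numerous and vary by architecture, so the code has to go through the kernel build system to produce the assembly the kernel expects (current kernels disable floating point, so floating-point computations cannot be tested). For an ordinary C file, a small addition to the Makefile is enough, for example:</p>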
  <div class="language-makefile highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="nv">hello-objs</span> <span class="o">:=</span> hello_mod.o another_file.o
</code></pre>
    </div>
  </div>
  <p>LLVM IR, however, takes a small trick to fool the build system. I renamed the IR file produced by our home-made compiler to a .c file and modified the Makefile:</p>
  <div class="language-makefile highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="nv">hello-objs</span> <span class="o">:=</span> hello_mod.o another_file.o
<span class="nv">CFLAGS_another_file.o</span> <span class="o">:=</span> <span class="nt">-O0</span> <span class="nt">-x</span> ir
</code></pre>
    </div>
  </div>
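  <p>When building another_file.o, the kernel build system passes <code class="language-plaintext highlighter-rouge">-O0 -x ir</code> to the compiler. This tells the compiler that the input language is LLVM IR and turns off most optimizations (so that the effect of the optimizations in the home-made compiler actually shows). Because some options are ignored when compiling LLVM IR, I had our compiler add attributes when emitting the IR to make sure it compiles correctly.</p>
  <p>We now know how to link functions, but the kernel cannot link against library functions, and I/O functions cannot be used directly in the kernel either. We can work around this by rewriting the relevant functions and providing similar functionality in tiktok. For example, <code class="language-plaintext highlighter-rouge">printf</code> can be mapped to <code class="language-plaintext highlighter-rouge">printk</code>, and <code class="language-plaintext highlighter-rouge">exit</code> can be replaced by the kernel's <code class="language-plaintext highlighter-rouge">do_exit</code>. <code class="language-plaintext highlighter-rouge">malloc</code> and <code class="language-plaintext highlighter-rouge">free</code> are trickier: kernel memory does not enjoy the luxury of being released automatically when a process exits. In the kernel every allocation and release has to be tracked by hand to avoid a panic or a memory leak (I was lazy, of course: the course did not need them, so I implemented none of this).</p>
  <h2 id="一个小测试">A small test</h2>
  <p>With the implementation done, let us run a test to check that tiktok works:</p>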
  <div class="language-c highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="cp">#define OUT_LOOP 100
</span><span class="kt">int</span> <span class="nf">main</span><span class="p">(</span><span class="kt">void</span><span class="p">){</span>
    <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">a</span><span class="p">;</span>
    <span class="kt">int</span> <span class="n">j</span><span class="p">;</span>

    <span class="n">a</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="n">j</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
    <span class="k">while</span> <span class="p">(</span><span class="n">j</span> <span class="o">&lt;</span> <span class="n">OUT_LOOP</span><span class="p">){</span>
        <span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
        <span class="k">while</span><span class="p">(</span><span class="n">i</span> <span class="o">&lt;</span> <span class="mi">1000000</span><span class="p">)</span> <span class="p">{</span>
            <span class="n">a</span> <span class="o">=</span> <span class="n">a</span> <span class="o">*</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
            <span class="n">i</span> <span class="o">=</span> <span class="n">i</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
        <span class="p">}</span>
        <span class="n">j</span> <span class="o">=</span> <span class="n">j</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="n">a</span><span class="p">;</span>
<span class="p">}</span>
</code></pre>
    </div>
  </div>
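  <p>The code above performs some meaningless computation; we vary <code class="language-plaintext highlighter-rouge">OUT_LOOP</code> (100, 99, 98) to compare the sensitivity of the different methods. Besides plain time measurement, I added a control group using <code class="language-plaintext highlighter-rouge">taskset(1)</code>. <code class="language-plaintext highlighter-rouge">taskset</code> sets the program's CPU affinity so that it is always scheduled on the same core, which reduces the cache invalidation caused by migrating between cores. To reduce random error, I ran the function 100 times for each configuration to obtain a series of measurements. The experimental data is available <a href="https://github.com/gloit042/tiktok/tree/main/bench">here</a>. To assess sensitivity we cannot compare results across methods directly; instead, within one method, we check whether it can significantly distinguish the run-time difference (about 1% to 2%) caused by the different loop counts. After a brief attempt at reviewing probability and statistics, I googled my way to the <a href="https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test">K-S test</a> (Kolmogorov-Smirnov test), which checks whether two input samples come from the same distribution. Here we assume the actual run time is intrinsic, while scheduling and other overhead contribute a uniform, identically distributed random error. If, for a given measurement tool, the two samples from two configurations fail to reject the same-distribution hypothesis, we can conclude that the tool cannot reliably detect the performance difference (I have forgotten all my statistics and do not really know what I am talking about, so corrections are welcome). I ran the K-S test with <code class="language-plaintext highlighter-rouge">scipy.stats.kstest</code>; the results are in the table below (a p-value below 0.05 rejects the same-distribution hypothesis):</p>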
  <table>
    <thead>
      <tr>
        <th style="text-align: center">p-value</th>
        <th style="text-align: center">98-99</th>
        <th style="text-align: center">99-100</th>
        <th style="text-align: center">98-100</th>
        <th style="text-align: center">98-98</th>
        <th style="text-align: center">99-99</th>
        <th style="text-align: center">100-100</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="text-align: center">normal_time</td>
        <td style="text-align: center">$2.11*10^{-1}$</td>
        <td style="text-align: center">$5.83*10^{-1}$</td>
        <td style="text-align: center">$3.68*10^{-1}$</td>
        <td style="text-align: center">$3.6*10^{-2}$</td>
        <td style="text-align: center">$5.83*10^{-1}$</td>
        <td style="text-align: center">$1.29*10^{-3}$</td>
      </tr>
      <tr>
        <td style="text-align: center">taskset_time</td>
        <td style="text-align: center">$1.56*10^{-2}$</td>
        <td style="text-align: center">$2.11*10^{-1}$</td>
        <td style="text-align: center">$3.21*10^{-5}$</td>
        <td style="text-align: center">$8.15*10^{-1}$</td>
        <td style="text-align: center">$8.15*10^{-1}$</td>
        <td style="text-align: center">$2.4*10^{-2}$</td>
      </tr>
      <tr>
        <td style="text-align: center">tiktok</td>
        <td style="text-align: center">$3.73*10^{-36}$</td>
        <td style="text-align: center">$6.31*10^{-19}$</td>
        <td style="text-align: center">$2.6*10^{-38}$</td>
        <td style="text-align: center">$4.11*10^{-43}$</td>
        <td style="text-align: center">$1.41*10^{-28}$</td>
        <td style="text-align: center">$2.24*10^{-4}$</td>
      </tr>
    </tbody>
  </table>
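  <p>GG: completely useless. It cannot even recognize two runs of the same configuration as the same. Although tiktok occupies a core, the virtual machine itself is still being scheduled, so the error does not satisfy our assumption. Still, it was too early to give up. As a rescue attempt I tested tiktok on bare metal, with the following results:</p>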
  <table>
    <thead>
      <tr>
        <th style="text-align: center">p-value</th>
        <th style="text-align: center">98-99</th>
        <th style="text-align: center">99-100</th>
        <th style="text-align: center">98-100</th>
        <th style="text-align: center">98-98</th>
        <th style="text-align: center">99-99</th>
        <th style="text-align: center">100-100</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td style="text-align: center">normal_time</td>
        <td style="text-align: center">$4.04*10^{-35}$</td>
        <td style="text-align: center">$2.82*10^{-1}$</td>
        <td style="text-align: center">$3.05*10^{-31}$</td>
        <td style="text-align: center">$8.15*10^{-1}$</td>
        <td style="text-align: center">$5.83*10^{-1}$</td>
        <td style="text-align: center">$1.54*10^{-1}$</td>
      </tr>
      <tr>
        <td style="text-align: center">taskset_time</td>
        <td style="text-align: center">$1.42*10^{-51}$</td>
        <td style="text-align: center">$3.64*10^{-2}$</td>
        <td style="text-align: center">$4.39*10^{-55}$</td>
        <td style="text-align: center">$5.39*10^{-1}$</td>
        <td style="text-align: center">$9.08*10^{-1}$</td>
        <td style="text-align: center">$2.11*10^{-1}$</td>
      </tr>
      <tr>
        <td style="text-align: center">tiktok</td>
        <td style="text-align: center">$2.20*10^{-59}$</td>
        <td style="text-align: center">$4.39*10^{-55}$</td>
        <td style="text-align: center">$2.20*10^{-59}$</td>
        <td style="text-align: center">$2.11*10^{-1}$</td>
        <td style="text-align: center">$7.02*10^{-1}$</td>
        <td style="text-align: center">$2.11*10^{-1}$</td>
      </tr>
    </tbody>
  </table>
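  <p>On bare metal, tiktok's advantage shows: it is somewhat more sensitive than <code class="language-plaintext highlighter-rouge">taskset</code>. On bare metal, however, <code class="language-plaintext highlighter-rouge">perf stat</code> already gives a more precise instruction count…</p>
  <h2 id="小结">Summary</h2>
  <p>Although tiktok turned out to have little practical value, I learned a good deal about the Linux kernel build system along the way. I wanted to share the process because I find it fun. If you are interested in tiktok, it is on my <a href="https://github.com/gloit042/tiktok">GitHub</a>; it is slightly more complex than what this article describes. If you want to learn more about the Linux kernel, there are many good resources online. As for the compiler course, I suspect small differences simply cannot be measured accurately inside a virtual machine, so we should avoid such test cases where possible (Windows probably has tools too, but they are less convenient).</p>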
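  ]]></content><author><name>gloit</name></author><category term="Technology" /><category term="Benchmark" /><category term="Kernel" /><category term="LLVM" /><summary type="html"><![CDATA[This semester I was a teaching assistant for Professor Li Cheng's compiler principles course. In the course labs we built a compilation system on top of LLVM, and one lab asks students to write back-end optimization algorithms. To evaluate the students' optimizations, we need to compare the performance of the code (here, LLVM IR) before and after optimization. We compare performance by measuring program run time, but user programs are subject to kernel scheduling. Because a program does not run continuously, scheduling delays add noise to the measured time, and this noise can make test results inaccurate and even affect students' grades. At first I wondered whether counting instructions could serve as the metric, and planned to measure with perf stat. However, our lab environment is based on virtual machines, and neither VirtualBox nor WSL implements the relevant virtual registers, so making the command convenient for students would be difficult, and I had to drop the idea. Then it occurred to me that kernel threads can avoid being scheduled, so could a kernel thread measure time more precisely? So I began experimenting.]]></summary></entry><entry><title type="html">Bookkeeping with Beancount and automatically recording campus card spending</title><link href="https://lug.ustc.edu.cn/planet/2020/08/keeping-account-with-beancount/" rel="alternate" type="text/html" title="Bookkeeping with Beancount and automatically recording campus card spending" /><published>2020-08-06T00:00:00+08:00</published><updated>2024-09-17T14:00:46+08:00</updated><id>https://lug.ustc.edu.cn/planet/2020/08/keeping-account-with-beancount</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2020/08/keeping-account-with-beancount/"><![CDATA[<p>This article was first published at <a href="https://charlesliu7.github.io/blackboard/2019/07/24/beancount/">https://charlesliu7.github.io/blackboard/2019/07/24/beancount/</a></p>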
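  <p>I happened to come across the concept of double-entry bookkeeping, which suits me well as someone who keeps detailed accounts. The main reason for choosing an open-source tool like Beancount is that the ledger data stays entirely in my own hands rather than being held by various apps. This article explains how to use double-entry bookkeeping from the perspective of one person's practice.</p>
  <p>This article is a from-scratch record of personal practice covering <strong>file organization -&gt; basic ledger writing -&gt; crawling campus card data and recording it automatically</strong>, as a reference for others who use Beancount. This practice will not necessarily match everyone's habits, and other recording strategies are fine too. The article assumes some familiarity with double-entry bookkeeping and Beancount syntax; for the concept of double-entry bookkeeping and an introduction to the basic features, see the following articles:</p>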
  <ul>
    <li><a href="https://plaintextaccounting.org/#comparisons">Overview of plain-text accounting and comparison of open-source double-entry tools</a></li>
    <li><a href="https://www.byvoid.com/zhs/blog/beancount-bookkeeping-1">Beancount double-entry bookkeeping (1): why</a></li>
  </ul>
  <p>Let's begin!</p>
  <h2 id="安装使用">Installation and usage</h2>
  <p>Beancount is an open-source tool implemented in Python that runs locally. First install it from PyPI:</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>pip <span class="nb">install </span>beancount fava
</code></pre>
    </div>
  </div>
  <p><code class="language-plaintext highlighter-rouge">beancount</code> is the core package and provides the core command-line tools; <code class="language-plaintext highlighter-rouge">fava</code> is the web visualization tool. <del>There is a <a href="https://fava.pythonanywhere.com/huge-example-file/balance_sheet/">fava
        example ledger</a>, and the corresponding
      Beancount source can be downloaded <a href="https://bitbucket.org/blais/beancount/src/default/examples/">from
        Bitbucket</a>.</del>
    The example ledger for this article and its visualization can be found in this <a href="https://git.lug.ustc.edu.cn/Charles/ecard_beancount/-/tree/master">repository</a>.</p>
  <p>Clone the repository and run <code class="language-plaintext highlighter-rouge">fava main.beancount</code> on the command line:</p>
  <div class="language-console highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="gp">$</span><span class="w"> </span>fava main.beancount
<span class="go">Running Fava on http://localhost:5000
</span></code></pre>
    </div>
  </div>
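  <p>Open a browser at that address to see the visualized ledger.</p>
  <h2 id="文件结构">File structure</h2>
  <p>Beancount supports an <code class="language-plaintext highlighter-rouge">include</code> directive for splitting a ledger across files. I split files by time and keep special events (such as a trip) in separate files. The directory structure is:</p>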
  <div class="language-text highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>.
├── 2018
│   └── ...
├── 2019
│   ├── 01.beancount
│   ├── 02.beancount
│   ├── 03.beancount
│   ├── 04.beancount ; comments start with a semicolon
│   ├── xx.event.beancount ; separate ledger for a special event, such as a trip
│   ├── 05.beancount
│   ├── 06.beancount
│   └── 07.beancount
├── accounts.beancount ; initial account information
└── main.beancount ; main file
</code></pre>
    </div>
  </div>
  <h2 id="账本书写">Writing the ledger</h2>
  <h3 id="账户信息设置">Setting up accounts</h3>
  <p>First define the accounts in <code class="language-plaintext highlighter-rouge">accounts.beancount</code>. Beancount predefines five categories:</p>
  <ul>
    <li>
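      <p>Assets: I name accounts following the pattern <code class="language-plaintext highlighter-rouge">account type:country:financial institution:specific account</code>; the date is when the account was opened. For example:</p>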
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Bank</span>:<span class="n">BoC</span>:<span class="n">C1234</span> <span class="n">CNY</span> ; university bank card
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Card</span>:<span class="n">USTC</span> <span class="n">CNY</span> ; campus card
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Web</span>:<span class="n">AliPay</span> <span class="n">CNY</span> ; Alipay
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Web</span>:<span class="n">WeChatPay</span> <span class="n">CNY</span> ; WeChat Pay
</code></pre>
        </div>
    </div>
      <p>Accounts for split (AA) payments, or for money that other people borrow from me, need to be recorded separately.</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Assets</span>:<span class="n">Receivables</span>:<span class="n">X</span> ; receivable from <span class="n">X</span>
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>Liabilities: for me these are mainly credit cards and accounts for money borrowed from others, for example:</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Liabilities</span>:<span class="n">Payable</span>:<span class="n">X</span> ; debt owed to <span class="n">X</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Liabilities</span>:<span class="n">CreditCard</span>:<span class="n">CN</span>:<span class="n">BoC</span>:<span class="n">C1111</span> <span class="n">CNY</span> ; credit card
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Liabilities</span>:<span class="n">CreditCard</span>:<span class="n">CN</span>:<span class="n">Huabei</span> <span class="n">CNY</span> ; Huabei
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>Equity (net assets): currently there is only one equity account, used to balance the funds in accounts when they are opened.</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">1990</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Equity</span>:<span class="n">Opening</span>-<span class="n">Balances</span>
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>Expenses: spending is very diverse and can be categorized to suit your needs, for example:</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Clothing</span> ; tops, trousers, accessories, socks, scarves, hats
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Shoes</span> ; shoes
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Dinner</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Lunch</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Breakfast</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Fruits</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Nightingale</span> ; late-night snacks by the campus gate
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Drinks</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Expenses</span>:<span class="n">Food</span>:<span class="n">Snack</span> ; snacks and miscellaneous food
</code></pre>
        </div>
    </div>
      <p>And so on…</p>
    </li>
    <li>
      <p>Income: income accounts can likewise be set up according to your actual sources of income, for example:</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Income</span>:<span class="n">Salary</span>:<span class="n">XXX</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Income</span>:<span class="n">Salary</span>:<span class="n">Others</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">open</span> <span class="n">Income</span>:<span class="n">Others</span>
</code></pre>
        </div>
    </div>
    </li>
  </ul>
  <h3 id="主文件设置">Setting up the main file</h3>
  <p>Next, write the main file <code class="language-plaintext highlighter-rouge">main.beancount</code>. Its job is to set global options and then include each sub-ledger:</p>
  <div class="language-conf highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="n">option</span> <span class="s2">"title"</span> <span class="s2">"Pick an impressive name"</span> ; ledger title
<span class="n">option</span> <span class="s2">"operating_currency"</span> <span class="s2">"CNY"</span> ; main currency of the ledger
<span class="n">option</span> <span class="s2">"operating_currency"</span> <span class="s2">"USD"</span> ; more than one main currency is allowed

<span class="n">include</span> <span class="s2">"accounts.beancount"</span> ; account definitions

; monthly ledgers
<span class="n">include</span> <span class="s2">"2020/06.beancount"</span>
<span class="n">include</span> <span class="s2">"2020/07.beancount"</span>
</code></pre>
    </div>
  </div>
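  <p>As the number of monthly files grows, the <code class="language-plaintext highlighter-rouge">include</code> lines can be generated rather than typed. The following is only a sketch: the helper name and the file list are hypothetical, and it assumes the year/month layout described above.</p>

```python
from pathlib import PurePosixPath


def include_lines(paths):
    """Return one Beancount include directive per ledger file, sorted by path."""
    ledgers = sorted(PurePosixPath(p) for p in paths if str(p).endswith(".beancount"))
    return [f'include "{p.as_posix()}"' for p in ledgers]


# Hypothetical file names following the year/month layout.
files = ["2020/07.beancount", "2020/06.beancount", "2020/notes.txt"]
print("\n".join(include_lines(files)))
```

  <p>The printed lines can be pasted into <code class="language-plaintext highlighter-rouge">main.beancount</code>; files that do not end in <code class="language-plaintext highlighter-rouge">.beancount</code> are skipped.</p>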
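  <h3 id="账户初始余额设置">Setting opening balances</h3>
  <p>Before recording transactions, set the opening balance of every account. The following sets an account's balance (or the amount owed):</p>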
  <div class="language-conf highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="m">2019</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">pad</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Bank</span>:<span class="n">BoC</span>:<span class="n">C1234</span> <span class="n">Equity</span>:<span class="n">Opening</span>-<span class="n">Balances</span> ; move the difference from <span class="n">Opening</span>-<span class="n">Balances</span> into the bank card
<span class="m">2019</span>-<span class="m">01</span>-<span class="m">02</span> <span class="n">balance</span> <span class="n">Assets</span>:<span class="n">CN</span>:<span class="n">Bank</span>:<span class="n">BoC</span>:<span class="n">C1234</span>    +<span class="n">xxx</span>.<span class="n">xx</span> <span class="n">CNY</span> ; the bank card balance is <span class="n">xxx</span>.<span class="n">xx</span>
</code></pre>
    </div>
  </div>
  <p>These two directives mean that whatever balance <code class="language-plaintext highlighter-rouge">Assets:CN:Bank:BoC:C1234</code> held before, it is adjusted to xxx.xx CNY as of the start of 2019-01-02, with the difference drawn from Equity:Opening-Balances. Note the one-day gap between the two lines: a <code class="language-plaintext highlighter-rouge">balance</code>
    assertion applies at the beginning of its day. Savings accounts generally have a positive balance, and credit cards a negative one.</p>
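  <p>The arithmetic behind the padding is just a difference. As a small worked sketch (the function name and the numbers are made up for illustration, and this is not Beancount's own code):</p>

```python
from decimal import Decimal


def pad_amount(balance_before, asserted_balance):
    """Amount the pad moves from Equity:Opening-Balances into the padded account."""
    return asserted_balance - balance_before


# Hypothetical numbers: the ledger shows 120.50 CNY, the assertion says 1000.00 CNY.
print(pad_amount(Decimal("120.50"), Decimal("1000.00")))  # 879.50

# A credit card with nothing recorded yet and 300.00 CNY owed gets a negative pad.
print(pad_amount(Decimal("0.00"), Decimal("-300.00")))  # -300.00
```

  <p>Using <code class="language-plaintext highlighter-rouge">Decimal</code> rather than floats keeps the cents exact, which matters once balances are asserted to two decimal places.</p>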
  <h3 id="记账">Recording transactions</h3>
  <ul>
    <li>
      <p>Basic entries. The syntax of a transaction is:</p>
      <div class="language-text highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>YYYY-mm-dd * ["Payee"] "Narration"
  posting 1
  posting 2
  posting 3
  ...
</code></pre>
        </div>
    </div>
      <p>For example:</p>
      <div class="language-text highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>2019-01-01 * "Walmart" "Two pieces of clothing and dinner at the supermarket"
  Expenses:Clothing 20 USD
  Expenses:Clothing 10 USD
  Expenses:Food:Dinner 10 USD
  Liabilities:CreditCard:US:Discover -40 USD
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>For multi-currency transactions, use <code class="language-plaintext highlighter-rouge">@@</code> to state the total price in the other currency, and Beancount works out the exchange rate. For example:</p>
      <div class="language-text highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code>2019-01-01 * "Japan Airlines" "New York - Tokyo"
  Expenses:Transport:Airline 1000 USD @@ 110000 JPY
  Liabilities:CreditCard:JP:Rakuten -110000 JPY
</code></pre>
        </div>
    </div>
    </li>
    <li>
      <p>Interest: account interest is impractical to record every day, so I settle it periodically with a <code class="language-plaintext highlighter-rouge">pad</code>+<code class="language-plaintext highlighter-rouge">balance</code> assertion.</p>
    </li>
    <li>
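      <p>Installments: this is a common way to pay. Open a dedicated Liabilities account, book the fees as interest expense, and transfer the amount when each monthly bill arrives. Beancount provides a <a href="https://beancount.github.io/fava/api/beancount.plugins.html">plugin</a>, <code class="language-plaintext highlighter-rouge">plugin "beancount.plugins.forecast"</code>, for installments and subscriptions; it can generate the monthly charges automatically.</p>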
    </li>
  </ul>
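  <p>Every transaction above must balance: the posting weights sum to zero in each currency, and a posting written with <code class="language-plaintext highlighter-rouge">@@</code> counts with its total price instead of its own amount. The sketch below re-implements that check for the two examples. It is an illustration under these assumptions, not Beancount's implementation, and the tuple layout and function name are invented for it.</p>

```python
from collections import defaultdict
from decimal import Decimal


def residual(postings):
    """Sum posting weights per currency; a balanced transaction leaves only zeros.

    Each posting is (account, number, currency, total_price), where total_price
    is None or a (number, currency) pair mirroring the `@@` syntax.
    """
    totals = defaultdict(Decimal)
    for _account, number, currency, total_price in postings:
        if total_price is None:
            totals[currency] += number
        else:
            price_number, price_currency = total_price
            # With `@@`, the posting weighs its total price, signed like the amount.
            totals[price_currency] += price_number.copy_sign(number)
    return dict(totals)


walmart = [
    ("Expenses:Clothing", Decimal("20"), "USD", None),
    ("Expenses:Clothing", Decimal("10"), "USD", None),
    ("Expenses:Food:Dinner", Decimal("10"), "USD", None),
    ("Liabilities:CreditCard:US:Discover", Decimal("-40"), "USD", None),
]
flight = [
    ("Expenses:Transport:Airline", Decimal("1000"), "USD", (Decimal("110000"), "JPY")),
    ("Liabilities:CreditCard:JP:Rakuten", Decimal("-110000"), "JPY", None),
]

print(residual(walmart))  # {'USD': Decimal('0')}
print(residual(flight))   # {'JPY': Decimal('0')}
print(Decimal("110000") / Decimal("1000"))  # implied rate: 110 JPY per USD
```

  <p>The last line shows the rate implied by the total price in the flight example: 110000 JPY for 1000 USD is 110 JPY per USD.</p>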
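  <h3 id="核账">Reconciling</h3>
  <p>I check the ledger on each monthly repayment date. The <code class="language-plaintext highlighter-rouge">Balance Sheet</code> or <code class="language-plaintext highlighter-rouge">Holdings</code>
    page in Fava's left sidebar shows the current state of every account. If a balance differs from the real account, open that account and go through its transactions to find what was missed or recorded incorrectly.</p>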
  <h2 id="用-importer-自动记录一卡通消费">Recording campus card spending automatically with an Importer</h2>
  <h3 id="综述">Overview</h3>
  <p>As I understand it, an <code class="language-plaintext highlighter-rouge">Importer</code> turns a prepared statement into Beancount entries: structured statement (table, JSON, etc.) -&gt; Importer -&gt;
    Beancount entries. The Importer's role is to convert the format of the spending records.</p>
  <p>The author of Beancount documents Importers in detail in <a href="http://furius.ca/beancount/doc/ingest">Importing External Data in
      Beancount</a>. The Beancount organization also hosts a machine-learning-based smart
    importer,
    <a href="https://github.com/beancount/smart_importer">beancount/smart_importer</a>.</p>
  <p>My own requirements are:</p>
  <ol>
    <li>Use the <a href="https://ecard.ustc.edu.cn/login">campus card portal</a> to fetch each day's campus card usage and produce a <code class="language-plaintext highlighter-rouge">CSV</code> record.</li>
    <li>Generate a <code class="language-plaintext highlighter-rouge">beancount</code> file from the <code class="language-plaintext highlighter-rouge">CSV</code> statement.</li>
    <li>Allow custom rules for classifying different kinds of spending.</li>
  </ol>
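  <p>Requirements 2 and 3 can be sketched in a few lines. This is not the importer from the repository: the CSV column names, the hour thresholds, the payee and the account names are all assumptions chosen for illustration.</p>

```python
import csv
import io
from decimal import Decimal

# Assumed account name; adjust it to your own ledger.
CARD = "Assets:CN:Card:USTC"


def meal_account(hour):
    """Classify a canteen payment by hour of day (the thresholds are assumptions)."""
    if hour >= 16:
        return "Expenses:Food:Dinner"
    if hour >= 10:
        return "Expenses:Food:Lunch"
    return "Expenses:Food:Breakfast"


def to_beancount(row):
    """Render one CSV row as a Beancount transaction paid from the campus card."""
    amount = Decimal(row["amount"])
    hour = int(row["time"].split(":")[0])
    return (
        f'{row["date"]} * "{row["payee"]}" "{row["place"]}"\n'
        f"  {meal_account(hour)} {amount} CNY\n"
        f"  {CARD} {-amount} CNY\n"
    )


# Hypothetical CSV with made-up columns and values.
SAMPLE = """date,time,payee,place,amount
2020-08-06,07:40,USTC Dining,East Canteen,3.50
2020-08-06,12:05,USTC Dining,East Canteen,12.00
"""

for row in csv.DictReader(io.StringIO(SAMPLE)):
    print(to_beancount(row))
```

  <p>Keeping the classification in one small function means a new rule, such as a different account for a particular canteen, only touches <code class="language-plaintext highlighter-rouge">meal_account</code>.</p>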
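  <h3 id="将当日的一卡通消费生成为-csv">Generating a <code class="language-plaintext highlighter-rouge">CSV</code> from the day's campus card spending</h3>
  <p>The code for crawling campus card data is
    <a href="https://git.lug.ustc.edu.cn/Charles/ecard_beancount/-/blob/master/crawler.py">crawler.py</a>.
    It crawls the day's campus card spending records, applies custom rules to distinguish breakfast, lunch, and dinner, and generates a <code class="language-plaintext highlighter-rouge">CSV</code> in a format suited to Beancount. (The code can be run directly.)</p>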
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="kn">import</span> <span class="n">requests</span>
<span class="kn">from</span> <span class="n">datetime</span> <span class="kn">import</span> <span class="n">datetime</span>
<span class="kn">from</span> <span class="n">bs4</span> <span class="kn">import</span> <span class="n">BeautifulSoup</span>
<span class="kn">import</span> <span class="n">json</span>
<span class="kn">import</span> <span class="n">codecs</span>
<span class="kn">import</span> <span class="n">csv</span>

<span class="n">name</span> <span class="o">=</span> <span class="sh">'</span><span class="s">XXX</span><span class="sh">'</span>  <span class="c1"># full name
</span><span class="n">stu_no</span> <span class="o">=</span> <span class="sh">'</span><span class="s">PBXXXXXXXX</span><span class="sh">'</span>  <span class="c1"># student ID
</span><span class="n">pwd</span> <span class="o">=</span> <span class="sh">'</span><span class="s">user_pwd</span><span class="sh">'</span>  <span class="c1"># unified identity authentication (CAS) password
</span>
<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">'</span><span class="s">__main__</span><span class="sh">'</span><span class="p">:</span>
    <span class="c1"># Log in to the campus card portal via unified identity authentication (CAS)
</span>    <span class="n">casurl</span> <span class="o">=</span> <span class="sh">'</span><span class="s">https://passport.ustc.edu.cn/login?service=http%3A%2F%2Fecard.ustc.edu.cn%2Fcaslogin</span><span class="sh">'</span>
    <span class="n">caspost</span> <span class="o">=</span> <span class="p">{</span><span class="sh">'</span><span class="s">username</span><span class="sh">'</span><span class="p">:</span> <span class="n">stu_no</span><span class="p">,</span> <span class="sh">'</span><span class="s">password</span><span class="sh">'</span><span class="p">:</span> <span class="n">pwd</span><span class="p">}</span>  <span class="c1"># CAS credentials
</span>    <span class="n">msg</span> <span class="o">=</span> <span class="sh">''</span>
    <span class="n">s</span> <span class="o">=</span> <span class="n">requests</span><span class="p">.</span><span class="nf">session</span><span class="p">()</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">r</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="n">casurl</span><span class="p">,</span> <span class="n">caspost</span><span class="p">)</span>
    <span class="k">except</span> <span class="nb">Exception</span> <span class="k">as</span> <span class="n">e</span><span class="p">:</span>
        <span class="n">msg</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: USTC ecard CAS login failed {1}</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">),</span> <span class="n">e</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
        <span class="k">raise</span> <span class="nb">SystemExit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>  <span class="c1"># r is unset when the request fails, so stop here</span>
    <span class="n">remaining</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">if</span> <span class="n">name</span> <span class="ow">not</span> <span class="ow">in</span> <span class="n">r</span><span class="p">.</span><span class="n">text</span><span class="p">:</span>
        <span class="n">msg</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: USTC ecard CAS 登陆失败 NOOOOOOOO!!!!!!!!</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">))</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">msg</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: USTC ecard CAS 登陆成功</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">))</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">msg</span><span class="p">)</span>
        <span class="n">paylist</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="nf">get</span><span class="p">(</span><span class="sh">'</span><span class="s">https://ecard.ustc.edu.cn/paylist</span><span class="sh">'</span><span class="p">)</span>
        <span class="n">b</span> <span class="o">=</span> <span class="nc">BeautifulSoup</span><span class="p">(</span><span class="n">paylist</span><span class="p">.</span><span class="n">text</span><span class="p">,</span> <span class="n">features</span><span class="o">=</span><span class="sh">"</span><span class="s">lxml</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">token</span> <span class="o">=</span> <span class="n">b</span><span class="p">.</span><span class="nf">findAll</span><span class="p">(</span><span class="sh">'</span><span class="s">input</span><span class="sh">'</span><span class="p">)[</span><span class="o">-</span><span class="mi">1</span><span class="p">].</span><span class="nf">get_attribute_list</span><span class="p">(</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="n">url</span><span class="o">=</span><span class="sh">'</span><span class="s">https://ecard.ustc.edu.cn/paylist/ajax_get_paylist</span><span class="sh">'</span><span class="p">,</span> <span class="n">data</span><span class="o">=</span><span class="p">{</span><span class="sh">'</span><span class="s">date</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">,</span> <span class="sh">'</span><span class="s">page</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">},</span> <span class="n">headers</span><span class="o">=</span><span class="p">{</span><span class="sh">'</span><span class="s">origin</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">https://ecard.ustc.edu.cn</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">referer</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">https://ecard.ustc.edu.cn/paylist</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">sec-fetch-mode</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">cors</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">sec-fetch-site</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">same-origin</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">x-csrf-token</span><span class="sh">'</span><span class="p">:</span> <span class="n">token</span><span class="p">,</span> <span class="sh">'</span><span class="s">x-requested-with</span><span class="sh">'</span><span class="p">:</span> <span 
class="sh">'</span><span class="s">XMLHttpRequest</span><span class="sh">'</span><span class="p">})</span>
        <span class="n">b</span> <span class="o">=</span> <span class="nc">BeautifulSoup</span><span class="p">(</span><span class="n">data</span><span class="p">.</span><span class="n">text</span><span class="p">,</span> <span class="n">features</span><span class="o">=</span><span class="sh">"</span><span class="s">lxml</span><span class="sh">"</span><span class="p">)</span>
        <span class="n">table</span> <span class="o">=</span> <span class="n">b</span><span class="p">.</span><span class="nf">find</span><span class="p">(</span><span class="sh">'</span><span class="s">table</span><span class="sh">'</span><span class="p">)</span>
        <span class="n">th_index</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">for</span> <span class="n">th</span> <span class="ow">in</span> <span class="n">table</span><span class="p">.</span><span class="nf">findAll</span><span class="p">(</span><span class="sh">'</span><span class="s">th</span><span class="sh">'</span><span class="p">):</span>
            <span class="n">th_index</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">th</span><span class="p">.</span><span class="nf">getText</span><span class="p">())</span>
        <span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="n">year</span><span class="p">,</span> <span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="n">month</span><span class="p">,</span> <span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="n">day</span>
        <span class="c1"># Classify breakfast, lunch, and dinner according to custom rules
</span>        <span class="n">payinfo</span> <span class="o">=</span> <span class="p">{</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">:</span> <span class="p">{</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">,</span> <span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">科大餐饮</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="p">},</span> <span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">:</span> <span class="p">{</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">,</span> <span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">科大餐饮</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="p">},</span> <span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">:</span> <span class="p">{</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">,</span> <span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">科大餐饮</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span 
class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="p">},</span> <span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">:</span> <span class="p">{</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">:</span> <span class="sh">'</span><span class="s">一卡通充值</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">:</span> <span class="sh">''</span><span class="p">,</span> <span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">:</span> <span class="mf">0.0</span><span class="p">,</span> <span class="p">}</span> <span class="p">}</span>
        <span class="n">flag</span> <span class="o">=</span> <span class="bp">True</span>
        <span class="k">for</span> <span class="n">tr</span> <span class="ow">in</span> <span class="n">table</span><span class="p">.</span><span class="nf">findAll</span><span class="p">(</span><span class="sh">'</span><span class="s">tr</span><span class="sh">'</span><span class="p">):</span>
            <span class="n">line</span> <span class="o">=</span> <span class="p">[]</span>
            <span class="k">for</span> <span class="n">td</span> <span class="ow">in</span> <span class="n">tr</span><span class="p">.</span><span class="nf">findAll</span><span class="p">(</span><span class="sh">'</span><span class="s">td</span><span class="sh">'</span><span class="p">):</span>
                <span class="n">line</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">td</span><span class="p">.</span><span class="nf">getText</span><span class="p">())</span>
            <span class="k">if</span> <span class="n">line</span> <span class="ow">and</span> <span class="n">flag</span><span class="p">:</span>
                <span class="n">remaining</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">3</span><span class="p">])</span>
                <span class="n">flag</span> <span class="o">=</span> <span class="bp">False</span>
            <span class="k">if</span> <span class="ow">not</span> <span class="n">line</span><span class="p">:</span>
                <span class="k">pass</span>
            <span class="k">elif</span> <span class="n">line</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="sh">'</span><span class="s">圈存机充值</span><span class="sh">'</span> <span class="ow">and</span> <span class="nf">int</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span> <span class="o">==</span> <span class="mi">0</span><span class="p">:</span>
                <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">=</span> <span class="nf">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">4</span><span class="p">])</span>
            <span class="k">elif</span> <span class="n">line</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o">==</span> <span class="sh">'</span><span class="s">消费</span><span class="sh">'</span><span class="p">:</span>
                <span class="n">linetime</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="nf">strptime</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">5</span><span class="p">],</span> <span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">)</span>
                <span class="k">if</span> <span class="n">linetime</span> <span class="o">&gt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">6</span><span class="p">)</span> <span class="ow">and</span> <span class="n">linetime</span> <span class="o">&lt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">10</span><span class="p">):</span> <span class="c1"># classified as breakfast
</span>                    <span class="k">if</span> <span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="ow">in</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]:</span>
                        <span class="k">pass</span>
                    <span class="k">else</span><span class="p">:</span>
                        <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">+</span> <span class="sh">'</span><span class="s"> </span><span class="sh">'</span><span class="p">)</span>
                    <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="nf">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">4</span><span class="p">])</span>
                <span class="k">elif</span> <span class="n">linetime</span> <span class="o">&gt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">10</span><span class="p">)</span> <span class="ow">and</span> <span class="n">linetime</span> <span class="o">&lt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">14</span><span class="p">):</span> <span class="c1"># classified as lunch
</span>                    <span class="k">if</span> <span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="ow">in</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]:</span>
                        <span class="k">pass</span>
                    <span class="k">else</span><span class="p">:</span>
                        <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">+</span> <span class="sh">'</span><span class="s"> </span><span class="sh">'</span><span class="p">)</span>
                    <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="nf">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">4</span><span class="p">])</span>
                <span class="k">elif</span> <span class="n">linetime</span> <span class="o">&gt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">16</span><span class="p">)</span> <span class="ow">and</span> <span class="n">linetime</span> <span class="o">&lt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">20</span><span class="p">):</span> <span class="c1"># classified as dinner
</span>                    <span class="k">if</span> <span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="ow">in</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]:</span>
                        <span class="k">pass</span>
                    <span class="k">else</span><span class="p">:</span>
                        <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">6</span><span class="p">]</span> <span class="o">+</span> <span class="sh">'</span><span class="s"> </span><span class="sh">'</span><span class="p">)</span>
                    <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">+=</span> <span class="nf">float</span><span class="p">(</span><span class="n">line</span><span class="p">[</span><span class="mi">4</span><span class="p">])</span>
                <span class="k">elif</span> <span class="n">linetime</span> <span class="o">&lt;</span> <span class="nf">datetime</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="n">month</span><span class="p">,</span> <span class="n">day</span><span class="p">,</span> <span class="mi">0</span><span class="p">):</span>
                    <span class="k">break</span>
                <span class="k">else</span><span class="p">:</span>
                    <span class="n">mtmp</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: 未知消费 {1}</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">),</span> <span class="n">line</span><span class="p">)</span>
                    <span class="nf">print</span><span class="p">(</span><span class="n">mtmp</span><span class="p">)</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">mtmp</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: 异常消费 {1}</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span><span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">),</span> <span class="n">line</span><span class="p">)</span>
                <span class="nf">print</span><span class="p">(</span><span class="n">mtmp</span><span class="p">)</span>
        <span class="n">mtmp</span> <span class="o">=</span> <span class="sh">'</span><span class="s">{0} - INFO: 卡内余额 {1}</span><span class="sh">'</span><span class="p">.</span><span class="nf">format</span><span class="p">(</span>
            <span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d %H:%M:%S</span><span class="sh">'</span><span class="p">),</span> <span class="n">remaining</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="n">mtmp</span><span class="p">)</span>

        <span class="c1"># CSV Part
</span>        <span class="n">today</span> <span class="o">=</span> <span class="n">datetime</span><span class="p">.</span><span class="nf">now</span><span class="p">().</span><span class="nf">strftime</span><span class="p">(</span><span class="sh">'</span><span class="s">%Y-%m-%d</span><span class="sh">'</span><span class="p">)</span>
        <span class="n">headers</span> <span class="o">=</span> <span class="p">[</span><span class="sh">'</span><span class="s">记账日期</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">收款人</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">交易摘要</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">人民币金额</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">类别</span><span class="sh">'</span><span class="p">]</span>
        <span class="n">csvinfo</span> <span class="o">=</span> <span class="p">[]</span>
        <span class="k">if</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">csvinfo</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="n">headers</span><span class="p">[</span><span class="mi">0</span><span class="p">]:</span> <span class="n">today</span><span class="p">,</span> <span class="n">headers</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">]</span>
                            <span class="p">[</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">3</span><span class="p">]:</span> <span class="sh">"</span><span class="s">%.2f</span><span class="sh">"</span> <span class="o">%</span> <span class="o">-</span><span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="sh">'</span><span class="s">Transferin</span><span class="sh">'</span><span class="p">})</span>
        <span class="k">if</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">csvinfo</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="n">headers</span><span class="p">[</span><span class="mi">0</span><span class="p">]:</span> <span class="n">today</span><span class="p">,</span> <span class="n">headers</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">]</span>
                            <span class="p">[</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">3</span><span class="p">]:</span> <span class="sh">"</span><span class="s">%.2f</span><span class="sh">"</span> <span class="o">%</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="sh">'</span><span class="s">Breakfast</span><span class="sh">'</span><span class="p">})</span>
        <span class="k">if</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">csvinfo</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="n">headers</span><span class="p">[</span><span class="mi">0</span><span class="p">]:</span> <span class="n">today</span><span class="p">,</span> <span class="n">headers</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">3</span><span class="p">]:</span> <span class="sh">"</span><span class="s">%.2f</span><span class="sh">"</span> <span class="o">%</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="sh">'</span><span class="s">Lunch</span><span class="sh">'</span><span class="p">})</span>
        <span class="k">if</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">]</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">:</span>
            <span class="n">csvinfo</span><span class="p">.</span><span class="nf">append</span><span class="p">({</span><span class="n">headers</span><span class="p">[</span><span class="mi">0</span><span class="p">]:</span> <span class="n">today</span><span class="p">,</span> <span class="n">headers</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">type</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">]</span>
                            <span class="p">[</span><span class="sh">'</span><span class="s">loc</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">3</span><span class="p">]:</span> <span class="sh">"</span><span class="s">%.2f</span><span class="sh">"</span> <span class="o">%</span> <span class="n">payinfo</span><span class="p">[</span><span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span><span class="p">][</span><span class="sh">'</span><span class="s">value</span><span class="sh">'</span><span class="p">],</span> <span class="n">headers</span><span class="p">[</span><span class="mi">4</span><span class="p">]:</span> <span class="sh">'</span><span class="s">Dinner</span><span class="sh">'</span><span class="p">})</span>
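        <span class="k">with</span> <span class="nf">open</span><span class="p">(</span><span class="n">today</span><span class="o">+</span><span class="sh">'</span><span class="s">.csv</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">w</span><span class="sh">'</span><span class="p">,</span> <span class="n">newline</span><span class="o">=</span><span class="sh">''</span><span class="p">,</span> <span class="n">encoding</span><span class="o">=</span><span class="sh">'</span><span class="s">utf-8</span><span class="sh">'</span><span class="p">)</span> <span class="k">as</span> <span class="n">f</span><span class="p">:</span>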
            <span class="n">f_csv</span> <span class="o">=</span> <span class="n">csv</span><span class="p">.</span><span class="nc">DictWriter</span><span class="p">(</span><span class="n">f</span><span class="p">,</span> <span class="n">headers</span><span class="p">)</span>
            <span class="n">f_csv</span><span class="p">.</span><span class="nf">writeheader</span><span class="p">()</span>
            <span class="n">f_csv</span><span class="p">.</span><span class="nf">writerows</span><span class="p">(</span><span class="n">csvinfo</span><span class="p">)</span>
</code></pre>
    </div>
  </div>
  <p>When the script finishes, it writes a file named <code class="language-plaintext highlighter-rouge">20XX-XX-XX.csv</code>, for example <code class="language-plaintext highlighter-rouge">2020-07-02.csv</code>:</p>
  <table>
    <thead>
      <tr>
        <th>记账日期</th>
        <th>收款人</th>
        <th>交易摘要</th>
        <th>人民币金额</th>
        <th>类别</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>2020-07-02</td>
        <td>科大餐饮</td>
        <td>一卡通充值</td>
        <td>-200.00</td>
        <td>Transferin</td>
      </tr>
      <tr>
        <td>2020-07-02</td>
        <td>科大餐饮</td>
        <td>西区芳华园餐厅</td>
        <td>5.00</td>
        <td>Breakfast</td>
      </tr>
      <tr>
        <td>2020-07-02</td>
        <td>科大餐饮</td>
        <td>西区芳华园餐厅</td>
        <td>10.00</td>
        <td>Lunch</td>
      </tr>
      <tr>
        <td>2020-07-02</td>
        <td>科大餐饮</td>
        <td>西区芳华园餐厅</td>
        <td>10.00</td>
        <td>Dinner</td>
      </tr>
    </tbody>
  </table>
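  <p>The meal classification in the crawler comes down to comparing each purchase time against fixed hour windows. The sketch below isolates that step. The names <code class="language-plaintext highlighter-rouge">MEAL_WINDOWS</code> and <code class="language-plaintext highlighter-rouge">classify_meal</code> are illustrative and do not appear in the original script; the hour bounds are the ones the script uses.</p>

```python
from datetime import datetime

# Meal windows as (name, start_hour, end_hour). The bounds mirror the crawler
# and are exclusive on both ends, like its strict comparisons.
MEAL_WINDOWS = [
    ("breakfast", 6, 10),
    ("lunch", 10, 14),
    ("dinner", 16, 20),
]


def classify_meal(timestamp, year, month, day):
    """Return the meal name for a 'YYYY-mm-dd HH:MM:SS' string, or None."""
    when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    for name, start, end in MEAL_WINDOWS:
        if datetime(year, month, day, start) < when < datetime(year, month, day, end):
            return name
    return None


print(classify_meal("2020-07-02 07:35:10", 2020, 7, 2))  # breakfast
print(classify_meal("2020-07-02 15:00:00", 2020, 7, 2))  # None
```

  <p>Because both bounds are exclusive, a purchase at exactly 10:00:00 matches neither breakfast nor lunch, and anything between 14:00 and 16:00 is left unclassified. The original script reports such rows as unknown spending rather than dropping them silently.</p>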
  <h3 id="准备-importer-config">Preparing the Importer Config</h3>
  <p>The Beancount importer config file is
    <a href="https://git.lug.ustc.edu.cn/Charles/ecard_beancount/-/blob/master/importers/ustc_card_importer.py">importers/ustc_card_importer.py</a>.</p>
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="c1">#!/usr/bin/env python
</span><span class="kn">import</span> <span class="n">os</span>
<span class="kn">import</span> <span class="n">sys</span>
<span class="kn">import</span> <span class="n">beancount.ingest.extract</span>
<span class="kn">from</span> <span class="n">beancount.ingest.importers</span> <span class="kn">import</span> <span class="n">csv</span>

<span class="n">beancount</span><span class="p">.</span><span class="n">ingest</span><span class="p">.</span><span class="n">extract</span><span class="p">.</span><span class="n">HEADER</span> <span class="o">=</span> <span class="sh">''</span>

<span class="k">def</span> <span class="nf">dumb_USTCecard_categorizer</span><span class="p">(</span><span class="n">txn</span><span class="p">):</span>
    <span class="c1"># At this time the txn has only one posting
</span>    <span class="k">try</span><span class="p">:</span>
        <span class="n">posting1</span> <span class="o">=</span> <span class="n">txn</span><span class="p">.</span><span class="n">postings</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="nb">IndexError</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">txn</span>

    <span class="c1"># Guess the account(s) of the other posting(s)
</span>    <span class="k">if</span> <span class="sh">'</span><span class="s">breakfast</span><span class="sh">'</span> <span class="ow">in</span> <span class="n">txn</span><span class="p">.</span><span class="n">narration</span><span class="p">.</span><span class="nf">lower</span><span class="p">():</span>
        <span class="n">account</span> <span class="o">=</span> <span class="sh">'</span><span class="s">Expenses:Food:Breakfast</span><span class="sh">'</span>
    <span class="k">elif</span> <span class="sh">'</span><span class="s">lunch</span><span class="sh">'</span> <span class="ow">in</span> <span class="n">txn</span><span class="p">.</span><span class="n">narration</span><span class="p">.</span><span class="nf">lower</span><span class="p">():</span>
        <span class="n">account</span> <span class="o">=</span> <span class="sh">'</span><span class="s">Expenses:Food:Lunch</span><span class="sh">'</span>
    <span class="k">elif</span> <span class="sh">'</span><span class="s">dinner</span><span class="sh">'</span> <span class="ow">in</span> <span class="n">txn</span><span class="p">.</span><span class="n">narration</span><span class="p">.</span><span class="nf">lower</span><span class="p">():</span>
        <span class="n">account</span> <span class="o">=</span> <span class="sh">'</span><span class="s">Expenses:Food:Dinner</span><span class="sh">'</span>
    <span class="k">elif</span> <span class="sh">'</span><span class="s">transferin</span><span class="sh">'</span> <span class="ow">in</span> <span class="n">txn</span><span class="p">.</span><span class="n">narration</span><span class="p">.</span><span class="nf">lower</span><span class="p">():</span>
        <span class="n">account</span> <span class="o">=</span> <span class="sh">'</span><span class="s">Assets:CN:Bank:BoC:C1234</span><span class="sh">'</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="k">return</span> <span class="n">txn</span>
    <span class="c1"># Make the other posting(s)
</span>    <span class="n">posting2</span> <span class="o">=</span> <span class="n">posting1</span><span class="p">.</span><span class="nf">_replace</span><span class="p">(</span>
        <span class="n">account</span><span class="o">=</span><span class="n">account</span><span class="p">,</span>
        <span class="n">units</span><span class="o">=-</span><span class="n">posting1</span><span class="p">.</span><span class="n">units</span>
    <span class="p">)</span>
    <span class="c1"># Insert / Append the posting into the transaction
</span>    <span class="k">if</span> <span class="n">posting1</span><span class="p">.</span><span class="n">units</span> <span class="o">&lt;</span> <span class="n">posting2</span><span class="p">.</span><span class="n">units</span><span class="p">:</span>
        <span class="n">txn</span><span class="p">.</span><span class="n">postings</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">posting2</span><span class="p">)</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">txn</span><span class="p">.</span><span class="n">postings</span><span class="p">.</span><span class="nf">insert</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">posting2</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">txn</span>

<span class="n">CONFIG</span> <span class="o">=</span> <span class="p">[</span>
    <span class="c1"># USTC canteen
</span>    <span class="n">csv</span><span class="p">.</span><span class="nc">Importer</span><span class="p">(</span>
        <span class="p">{</span>
            <span class="n">csv</span><span class="p">.</span><span class="n">Col</span><span class="p">.</span><span class="n">DATE</span><span class="p">:</span> <span class="sh">'</span><span class="s">记账日期</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">csv</span><span class="p">.</span><span class="n">Col</span><span class="p">.</span><span class="n">PAYEE</span><span class="p">:</span> <span class="sh">'</span><span class="s">收款人</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">csv</span><span class="p">.</span><span class="n">Col</span><span class="p">.</span><span class="n">NARRATION1</span><span class="p">:</span> <span class="sh">'</span><span class="s">交易摘要</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">csv</span><span class="p">.</span><span class="n">Col</span><span class="p">.</span><span class="n">AMOUNT_DEBIT</span><span class="p">:</span> <span class="sh">'</span><span class="s">人民币金额</span><span class="sh">'</span><span class="p">,</span>
            <span class="n">csv</span><span class="p">.</span><span class="n">Col</span><span class="p">.</span><span class="n">NARRATION2</span><span class="p">:</span> <span class="sh">'</span><span class="s">类别</span><span class="sh">'</span>
        <span class="p">},</span>
        <span class="n">account</span><span class="o">=</span><span class="sh">'</span><span class="s">Assets:CN:Card:USTC</span><span class="sh">'</span><span class="p">,</span>
        <span class="n">currency</span><span class="o">=</span><span class="sh">'</span><span class="s">CNY</span><span class="sh">'</span><span class="p">,</span>
        <span class="n">categorizer</span><span class="o">=</span><span class="n">dumb_USTCecard_categorizer</span><span class="p">,</span>
    <span class="p">),</span>
<span class="p">]</span>
</code></pre>
    </div>
  </div>
  <p>For an explanation of the syntax, see <a href="https://charlesliu7.github.io/blackboard/2019/12/03/beancount-importer/">Beancount Series, Part 2: Importer
      Setup</a>.</p>
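  <p>The categorizer's keyword matching can also be written as a lookup table, which is easier to extend when new categories appear. This is a minimal sketch that leaves out the Beancount objects: <code class="language-plaintext highlighter-rouge">ACCOUNT_BY_KEYWORD</code> and <code class="language-plaintext highlighter-rouge">guess_account</code> are hypothetical names, while the account names are copied from the config above.</p>

```python
# Keyword -> counter-account table; account names are the ones from the config above.
# Dict order is insertion order, so matching follows the same order as the if/elif chain.
ACCOUNT_BY_KEYWORD = {
    "breakfast": "Expenses:Food:Breakfast",
    "lunch": "Expenses:Food:Lunch",
    "dinner": "Expenses:Food:Dinner",
    "transferin": "Assets:CN:Bank:BoC:C1234",
}


def guess_account(narration):
    """Return the counter-account for a narration, or None when no keyword matches."""
    lowered = narration.lower()
    for keyword, account in ACCOUNT_BY_KEYWORD.items():
        if keyword in lowered:
            return account
    return None


print(guess_account("西区芳华园餐厅; Lunch"))   # Expenses:Food:Lunch
print(guess_account("一卡通充值; Transferin"))  # Assets:CN:Bank:BoC:C1234
print(guess_account("unknown"))                # None
```

  <p>Returning <code class="language-plaintext highlighter-rouge">None</code> plays the role of the early <code class="language-plaintext highlighter-rouge">return txn</code> in the categorizer: a transaction with no matching keyword keeps its single posting and can be completed by hand.</p>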
  <p>Run the following command to generate the Beancount entries.</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>bean-extract ustc_card_importer.py 2020-07-02.csv
</code></pre>
    </div>
  </div>
  <p>This produces the following entries:</p>
  <div class="language-text highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>**** /path/to/2020-07-02.csv

2020-07-02 * "科大餐饮" "一卡通充值; Transferin"
    Assets:CN:Card:USTC        200.00 CNY
    Assets:CN:Bank:BoC:C1234  -200.00 CNY

2020-07-02 * "科大餐饮" "西区芳华园餐厅; Breakfast"
    Assets:CN:Card:USTC      -5.00 CNY
    Expenses:Food:Breakfast   5.00 CNY

2020-07-02 * "科大餐饮" "西区芳华园餐厅; Lunch"
    Assets:CN:Card:USTC  -10.00 CNY
    Expenses:Food:Lunch   10.00 CNY

2020-07-02 * "科大餐饮" "西区芳华园餐厅; Dinner"
    Assets:CN:Card:USTC   -10.00 CNY
    Expenses:Food:Dinner   10.00 CNY
</code></pre>
    </div>
  </div>
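  <p>Campus-card transactions can use this importer directly. Alipay statements, credit card statements, and similar records can also be imported by exporting them as CSV and writing your own importer.</p>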
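  <p>As a rough illustration of what a hand-written importer has to do, here is a minimal sketch that turns CSV rows into Beancount transaction text using only the Python standard library. It does not use the Beancount importer API shown earlier, and everything specific in it is an assumption made for illustration: the column names <code class="language-plaintext highlighter-rouge">date</code>, <code class="language-plaintext highlighter-rouge">payee</code>, <code class="language-plaintext highlighter-rouge">narration</code>, and <code class="language-plaintext highlighter-rouge">amount</code>, the sample rows, and the <code class="language-plaintext highlighter-rouge">Expenses:Uncategorized</code> account.</p>

```python
import csv
import io
from decimal import Decimal


def rows_to_beancount(csv_text, account, counter_account, currency="CNY"):
    """Render each CSV row as a two-posting Beancount transaction (as text)."""
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = Decimal(row["amount"])
        lines = [
            f'{row["date"]} * "{row["payee"]}" "{row["narration"]}"',
            f"    {account}  {amount:.2f} {currency}",
            # The second posting has no amount; Beancount fills it in.
            f"    {counter_account}",
        ]
        entries.append("\n".join(lines))
    return "\n\n".join(entries) + "\n"


# Hypothetical export with made-up column names and values.
SAMPLE_CSV = """date,payee,narration,amount
2020-07-02,Canteen,Lunch,-10.00
2020-07-02,Canteen,Dinner,-10
"""

if __name__ == "__main__":
    print(rows_to_beancount(SAMPLE_CSV, "Assets:CN:Card:USTC", "Expenses:Uncategorized"))
```

  <p>A real statement export will need its own column mapping and a categorizer that picks the counter account, which is the part the importer configuration above handles.</p>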
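  <h3 id="自动化">Automation</h3>
  <p>The steps above involve several commands and scripts. Running them from <code class="language-plaintext highlighter-rouge">crontab</code> every night before bed (23:30) records the day's spending automatically.</p>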
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="nv">$ </span>python crawler.py&gt;&gt;log.log
<span class="nv">$ </span><span class="nb">cd </span>importers
<span class="nv">$ </span>python ustc_card_importer_pipeline.py <span class="c"># Note: first edit this script to point at the ledger file you want to write to</span>
</code></pre>
    </div>
  </div>
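  <p>One way to give cron a single entry point is a small wrapper script that runs the same steps in order. The sketch below is only an illustration: the wrapper name <code class="language-plaintext highlighter-rouge">run_daily.py</code>, its location next to <code class="language-plaintext highlighter-rouge">crawler.py</code>, and the crontab line in the comment are assumptions. Unlike the commands above, it appends the output of both steps to <code class="language-plaintext highlighter-rouge">log.log</code>.</p>

```python
import subprocess
import sys
from pathlib import Path


def build_steps(project_dir):
    """Return (argv, working directory) pairs mirroring the commands above."""
    root = Path(project_dir)
    return [
        ([sys.executable, "crawler.py"], root),
        ([sys.executable, "ustc_card_importer_pipeline.py"], root / "importers"),
    ]


def run_steps(steps, log_path):
    """Run each step in order, appending output to the log; stop on failure."""
    with open(log_path, "a", encoding="utf-8") as log:
        for argv, cwd in steps:
            subprocess.run(argv, cwd=cwd, stdout=log, stderr=subprocess.STDOUT, check=True)


if __name__ == "__main__":
    # Hypothetical crontab entry (23:30 every day):
    #   30 23 * * * /usr/bin/python3 /path/to/run_daily.py
    here = Path(__file__).resolve().parent
    run_steps(build_steps(here), here / "log.log")
```

  <p>Because <code class="language-plaintext highlighter-rouge">check=True</code> raises on a non-zero exit code, the importer step is skipped when the crawler fails, so a broken download does not get written into the ledger.</p>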
  <p>Done!</p>
  <h2 id="fava">Fava</h2>
  <ul>
    <li>Fava's web interface includes an editor. With a multi-file ledger, the editor opens the main file by default; to change that, place the directive <code class="language-plaintext highlighter-rouge">2019-07-11 custom "fava-option" "default-file"</code> in the file you want opened by default.</li>
    <li>Fava can also add entries, but by default they are written to the main file. Judging from <a href="https://github.com/beancount/fava/issues/875">Fava insert-entry options</a> and <a href="https://github.com/beancount/fava/issues/882">default-file could also set the insertion file</a>, the author does not seem concerned about which file new entries end up in. You can still work around this with the <code class="language-plaintext highlighter-rouge">insert-entry</code> option: for example, put the directive <code class="language-plaintext highlighter-rouge">2019-01-01 custom "fava-option" "insert-entry" ".*"</code> at the end of <code class="language-plaintext highlighter-rouge">2019/01.bean</code>, and any entry dated after 2019-01-01 that you add through Fava will be written just before that directive.</li>
    <li>Fava has no built-in password protection; according to <a href="https://github.com/beancount/fava/issues/314">Make fava password-protected</a>, the author considers this outside Fava's scope. The authentication features of <a href="https://www.digitalocean.com/community/tutorials/how-to-set-up-password-authentication-with-apache-on-ubuntu-16-04?comment=76154">Apache</a> or <a href="https://docs.nginx.com/nginx/admin-guide/security-controls/configuring-http-basic-authentication/">Nginx</a> can cover this need.</li>
    <li>
      <p>Fava also supports importers, which can be enabled with these options:</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-option"</span> <span class="s2">"import-config"</span> <span class="s2">"./importers/path/to/importer.py"</span>
<span class="m">2017</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-option"</span> <span class="s2">"import-dirs"</span> <span class="s2">"./importers/path/to/csv_tmp/"</span>
</code></pre>
        </div>
    </div>
      <p>The importer then appears in Fava's sidebar, where you can import data manually. Note: metadata is stripped when an importer is used through Fava.</p>
    </li>
    <li>
      <p>Fava also supports custom sidebar links, for example:</p>
      <div class="language-conf highlighter-rouge">
        <div class="highlight">
          <pre class="highlight"><code><span class="m">2099</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-sidebar-link"</span> <span class="s2">"This Week"</span> <span class="s2">"/jump?time=day-6+-+day"</span>
<span class="m">2099</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-sidebar-link"</span> <span class="s2">"This Month"</span> <span class="s2">"/jump?time=month"</span>
<span class="m">2099</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-sidebar-link"</span> <span class="s2">"3 Month"</span> <span class="s2">"/jump?time=month-1+-+month%2B1"</span>
<span class="m">2099</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-sidebar-link"</span> <span class="s2">"Year-To-Date"</span> <span class="s2">"/jump?time=year+-+month"</span>
<span class="m">2099</span>-<span class="m">01</span>-<span class="m">01</span> <span class="n">custom</span> <span class="s2">"fava-sidebar-link"</span> <span class="s2">"All dates"</span> <span class="s2">"/jump?time="</span>
</code></pre>
        </div>
    </div>
    </li>
  </ul>
  ]]></content><author><name>Xenon</name></author><category term="USTC" /><category term="Beancount" /><category term="eCard" /><summary type="html"><![CDATA[This article was first published at https://charlesliu7.github.io/blackboard/2019/07/24/beancount/]]></summary></entry><entry><title type="html">[Translation] Raspberry Pi 4 Review and Benchmarks: Improvements over the Raspberry Pi 3B+</title><link href="https://lug.ustc.edu.cn/planet/2019/09/raspberry-4/" rel="alternate" type="text/html" title="[Translation] Raspberry Pi 4 Review and Benchmarks: Improvements over the Raspberry Pi 3B+" /><published>2019-09-17T00:00:00+08:00</published><updated>2020-11-09T14:22:35+08:00</updated><id>https://lug.ustc.edu.cn/planet/2019/09/raspberry-4</id><content type="html" xml:base="https://lug.ustc.edu.cn/planet/2019/09/raspberry-4/"><![CDATA[<p>Original article: https://ibugone.com/blog/2019/09/raspberry-pi-4-review-benchmark/, by @iBug. The translation follows.</p>
  <hr />
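  <p>A while ago I finally got my hands on the Raspberry Pi 4 I had been longing for (the 4 GB model), and I couldn't resist trying it out to see what the improvements described in the coverage actually look like.</p>
  <p>My Raspberry Pi 3B+ has a case, so when I bought the Raspberry Pi 4 I also ordered an aluminum case to ease the thermal load on the board. Unlike my previous case, this one comes with two small fans, which greatly improve cooling.</p>
  <p>Let's take a look.</p>
  <h2 id="概览">Overview</h2>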
  <figure class="third ">
    <a href="/static/planet/box.jpg" title="Package of Raspberry Pi 4">
      <img src="/static/planet/box.jpg" alt="Package of Raspberry Pi 4" />
    </a>
    <a href="/static/planet/box-bottom.jpg" title="Bottom of the package">
      <img src="/static/planet/box-bottom.jpg" alt="Bottom of the package" />
    </a>
    <a href="/static/planet/box-open-1024x768.jpg" title="Opening the package">
      <img src="/static/planet/box-open-1024x768.jpg" alt="The package is open" />
    </a>
  </figure>
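  <p>The new Raspberry Pi 4 comes in a box similar to that of the 3B+. The front of the box shows a white-on-red line drawing of the Raspberry Pi 4 at actual size. Unlike the Raspberry Pi 3B, neither the 3B+ nor the 4 is packed in an anti-static bag. That is not a problem, of course.</p>
  <p>The layout resembles previous generations, with a few obvious changes. For example, the first thing you will notice is the USB 3.0 ports, because they are blue. While looking at them you will probably also notice that the Ethernet port has moved, possibly as a result of the upgrade to Gigabit Ethernet.</p>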
  <figure class="third ">
    <a href="/static/planet/overview.jpg" title="Overview of Raspberry Pi 4">
      <img src="/static/planet/overview.jpg" alt="Overview of Raspberry Pi 4" />
    </a>
    <a href="/static/planet/overview-usb.jpg" title="The USB ports and the Ethernet port">
      <img src="/static/planet/overview-usb.jpg" alt="Raspberry Pi 4 on top of the box, showing the USB ports and the Ethernet port" />
    </a>
    <a href="/static/planet/overview-side-ports.jpg" title="The USB Type-C port and the HDMI ports">
      <img src="/static/planet/overview-side-ports.jpg" alt="Focusing on the USB Type-C port and the HDMI ports" />
    </a>
  </figure>
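  <p>Some of the smaller ports, namely power and video output, have changed as well. The Raspberry Pi 4 is now powered over a USB Type-C cable, and the requirement has gone up to 5V / 3A. It is not yet known whether the Raspberry Pi 4 supports fast-charging technologies such as Qualcomm Quick Charge or USB PD, but user reports suggest it does not. The full-size HDMI port of earlier models has been replaced by micro-HDMI, and there are now two of them: both support 4K 60fps output, and they can drive displays simultaneously! Although I plan to run my Raspberry Pi as a headless (display-less) server, people who use a Raspberry Pi with a desktop environment will probably appreciate this feature.</p>
  <p>The RAM chip has also moved from the back of the board to the front and now sits next to the SoC. The SoC looks the same as the one on the 3B+, but its internals are completely different. The Wi-Fi shield and antenna are unchanged, and there is an additional chip in front of the Gigabit Ethernet port.</p>
  <h2 id="参数">Specifications</h2>
  <p>The new Raspberry Pi 4 brings a long list of exciting upgrades, including:</p>
  <ul>
    <li>Broadcom BCM2711 SoC with a quad-core 1.5 GHz Cortex-A72 CPU</li>
    <li>Three RAM variants: 1 GB, 2 GB, and 4 GB (mine is the 4 GB one)</li>
    <li>Broadcom VideoCore VI GPU</li>
    <li>True Gigabit Ethernet</li>
    <li>Bluetooth 5.0</li>
    <li>Native USB 3.0 support, with two Type-A ports</li>
    <li>Dual HDMI ports with simultaneous 4K 60fps output</li>
    <li>Faster microSD card slot</li>
  </ul>
  <p>The benchmarks later in this article show what these upgraded specifications mean in practice. Here is a comparison table:</p>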
  <table>
    <thead>
      <tr>
        <th> </th>
        <th style="text-align: center">Raspberry Pi 3B</th>
        <th style="text-align: center">Raspberry Pi 3B+</th>
        <th style="text-align: center">Raspberry Pi 4</th>
      </tr>
    </thead>
    <tbody>
      <tr>
        <td>CPU</td>
        <td style="text-align: center">Quad-core 1.20 GHz Cortex-A53</td>
        <td style="text-align: center">Quad-core 1.40 GHz Cortex-A53</td>
        <td style="text-align: center">Quad-core 1.50 GHz Cortex-A72</td>
      </tr>
      <tr>
        <td>Memory</td>
        <td style="text-align: center">1 GB DDR2</td>
        <td style="text-align: center">1 GB DDR2</td>
        <td style="text-align: center">1 / 2 / <strong>4</strong> GB DDR4</td>
      </tr>
      <tr>
        <td>GPU</td>
        <td style="text-align: center">VideoCore IV</td>
        <td style="text-align: center">VideoCore IV</td>
        <td style="text-align: center">VideoCore VI</td>
      </tr>
      <tr>
        <td>Ethernet</td>
        <td style="text-align: center">100 Mbps</td>
        <td style="text-align: center">300 Mbps effective</td>
        <td style="text-align: center">1 Gbps</td>
      </tr>
      <tr>
        <td>Wi-Fi</td>
        <td style="text-align: center">2.4 GHz</td>
        <td style="text-align: center">2.4 GHz / 5 GHz</td>
        <td style="text-align: center">2.4 GHz / 5 GHz</td>
      </tr>
      <tr>
        <td>Bluetooth</td>
        <td style="text-align: center">4.1</td>
        <td style="text-align: center">4.2</td>
        <td style="text-align: center">5.0</td>
      </tr>
      <tr>
        <td>USB</td>
        <td style="text-align: center">4 × USB 2.0</td>
        <td style="text-align: center">4 × USB 2.0</td>
        <td style="text-align: center">2 × USB 2.0 and 2 × USB 3.0</td>
      </tr>
      <tr>
        <td>Official price</td>
        <td style="text-align: center">35 USD</td>
        <td style="text-align: center">35 USD</td>
        <td style="text-align: center">35 / 45 / <strong>55</strong> USD<br />
          (depending on the amount of RAM)</td>
      </tr>
    </tbody>
  </table>
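  <p>After I got the 3B+, I sold my 3B (to the translator), so that Raspberry Pi 3B could not take part in the benchmarks below. (Translator's note: that 3B is now gathering dust at my home... sorry...)</p>
  <h2 id="我的设置">My Setup</h2>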
  <figure class=" ">
    <a href="/static/planet/rpis-powered.jpg" title="Both Raspberry Pis, powered through their GPIO pins">
      <img src="/static/planet/rpis-powered.jpg" alt="Both Raspberry Pis, powered through their GPIO pins" />
    </a>
  </figure>
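  <p>As you can see, both Raspberry Pis run as headless servers with only power and Ethernet connected. You may be wondering why they look so odd: the lab I work in has plenty of power supply leads rated at 5V / 6A, so I took one and used it to power both boards through their GPIO pins. The two boards are rated for peak draws of 5V / 2.5A and 5V / 3A respectively, so a single lead is enough.</p>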
  <div class="notice--danger">
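    <h4 class="no_toc" id="section"><i class="fas fa-exclamation-circle"></i> Warning</h4>
    <p>Do not power a Raspberry Pi through GPIO unless you have a stable power supply. A phone charger does not qualify, and you should never use one to power a Raspberry Pi through GPIO.</p>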
  </div>
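  <p>The claim that one 5V / 6A lead can feed both boards comes down to simple arithmetic, sketched below. The figures are the rated peaks quoted in this section; the variable and function names are made up for the example, and the budget deliberately ignores any USB peripherals.</p>

```python
SUPPLY_VOLTS = 5.0      # rated voltage of the lab supply lead
SUPPLY_MAX_AMPS = 6.0   # rated current of the lab supply lead
PEAK_AMPS = {"Raspberry Pi 3B+": 2.5, "Raspberry Pi 4": 3.0}


def headroom_amps(supply_amps, loads):
    """Current left over after every load draws its rated peak."""
    return supply_amps - sum(loads.values())


if __name__ == "__main__":
    total = sum(PEAK_AMPS.values())
    print(f"combined peak draw: {total} A ({total * SUPPLY_VOLTS} W)")
    print(f"headroom: {headroom_amps(SUPPLY_MAX_AMPS, PEAK_AMPS)} A")
```

  <p>The combined peak is 5.5 A, which leaves only 0.5 A of headroom on the supply, so the margin is thin once power-hungry peripherals are attached.</p>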
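  <h2 id="基准测试">Benchmarks</h2>
  <p>Both Raspberry Pis were assigned static IP addresses, and everything was done over SSH. The operating system is the latest Raspbian Buster Lite.</p>
  <h3 id="sysbench-cpu-测试">SysBench CPU Test</h3>
  <p>SysBench is a benchmark suite that gives a quick picture of system performance. Here I use SysBench to test the CPU and memory.</p>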
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>sysbench <span class="nt">--test</span><span class="o">=</span>cpu run
sysbench <span class="nt">--test</span><span class="o">=</span>cpu <span class="nt">--num-threads</span><span class="o">=</span>4 run
sysbench <span class="nt">--test</span><span class="o">=</span>cpu <span class="nt">--num-threads</span><span class="o">=</span>8 run
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/sysbench-cpu.png" alt="" />
    <figcaption>
      SysBench CPU performance in seconds (lower is better)
    </figcaption>
  </figure>
  <p>As the chart shows, the Raspberry Pi 4 is a big step up from the 3B+ in CPU performance, taking 19.3% less time in every case.</p>
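  <p>A reduction in run time is easy to misread as a speedup of the same size, so here is a small sketch of the conversion. The function name is made up for the example; the only input taken from the article is the 19.3% figure.</p>

```python
def speedup_from_time_reduction(reduction):
    """Convert a fractional time reduction (0.193 means 19.3% less time)
    into a speedup factor. Assumes the reduction is below 1."""
    return 1.0 / (1.0 - reduction)


if __name__ == "__main__":
    factor = speedup_from_time_reduction(0.193)
    print(f"19.3% less time is a speedup of about {factor:.2f}x")
```

  <p>Taking 19.3% less time therefore corresponds to roughly 1.24 times the throughput of the 3B+ on this test.</p>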
  <h3 id="sysbench-内存测试">SysBench Memory Test</h3>
  <p>The memory test is a bit more involved, and it turned up some unexpected results.</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>sysbench <span class="nt">--test</span><span class="o">=</span>memory <span class="nt">--memory-block-size</span><span class="o">=</span>1K <span class="nt">--memory-total-size</span><span class="o">=</span>2G <span class="nt">--memory-oper</span><span class="o">=</span><span class="nb">read </span>run
sysbench <span class="nt">--test</span><span class="o">=</span>memory <span class="nt">--memory-block-size</span><span class="o">=</span>1K <span class="nt">--memory-total-size</span><span class="o">=</span>2G <span class="nt">--memory-oper</span><span class="o">=</span>write run
sysbench <span class="nt">--test</span><span class="o">=</span>memory <span class="nt">--memory-block-size</span><span class="o">=</span>1K <span class="nt">--memory-total-size</span><span class="o">=</span>2G <span class="nt">--memory-oper</span><span class="o">=</span><span class="nb">read</span> <span class="nt">--num-threads</span><span class="o">=</span>4 run
sysbench <span class="nt">--test</span><span class="o">=</span>memory <span class="nt">--memory-block-size</span><span class="o">=</span>1K <span class="nt">--memory-total-size</span><span class="o">=</span>2G <span class="nt">--memory-oper</span><span class="o">=</span>write <span class="nt">--num-threads</span><span class="o">=</span>4 run
sysbench <span class="nt">--test</span><span class="o">=</span>memory <span class="nt">--memory-block-size</span><span class="o">=</span>1M <span class="nt">--memory-total-size</span><span class="o">=</span>2G <span class="nt">--memory-oper</span><span class="o">=</span>write <span class="nt">--num-threads</span><span class="o">=</span>4 run
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/sysbench-memory.png" alt="" />
    <figcaption>
      SysBench memory performance in operations per second (higher is better)
    </figcaption>
  </figure>
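  <p>Surprisingly, the new DDR4 memory is slower than the ancient DDR2, and the gap widens further with multiple threads! The only result that makes sense is that the Raspberry Pi 4 pulls slightly ahead of the 3B+ once the block size reaches 1 MiB.</p>
  <p>One interesting detail: I left out the "1 MiB Read MT" column (1 MiB blocks, read, multi-threaded). On both boards SysBench reported more than 200 GB/s, sometimes as much as 500 GB/s. That is plainly absurd, so I simply discarded those results.</p>
  <h3 id="fio-microsd-卡速度测试">FIO microSD Card Speed Test</h3>
  <p>The results of this test depend on the microSD card itself, so I pulled out the fastest card I own: a Lexar 667x 128 GB microSD card, which looks like the one pictured below:</p>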
  <figure class=""><img src="/static/planet/microsd-card-1024x752.jpg" alt="" /></figure>
  <p>I used <code class="language-plaintext highlighter-rouge">fio</code> as the disk (microSD card) I/O benchmark tool. Since I am more familiar with CrystalDiskMark, I tuned the <code class="language-plaintext highlighter-rouge">fio</code> parameters to match its test profile.</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>fio <span class="nt">--loops</span><span class="o">=</span>5 <span class="nt">--size</span><span class="o">=</span>500m <span class="nt">--filename</span><span class="o">=</span>fiotest.tmp <span class="nt">--stonewall</span> <span class="nt">--ioengine</span><span class="o">=</span>libaio <span class="nt">--direct</span><span class="o">=</span>1 <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>SeqRead <span class="nt">--bs</span><span class="o">=</span>1m <span class="nt">--rw</span><span class="o">=</span><span class="nb">read</span> <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>SeqWrite <span class="nt">--bs</span><span class="o">=</span>1m <span class="nt">--rw</span><span class="o">=</span>write <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>512Kread <span class="nt">--bs</span><span class="o">=</span>512k <span class="nt">--rw</span><span class="o">=</span>randread <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>512Kwrite <span class="nt">--bs</span><span class="o">=</span>512k <span class="nt">--rw</span><span class="o">=</span>randwrite <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>4KQD32read <span class="nt">--bs</span><span class="o">=</span>4k <span class="nt">--iodepth</span><span class="o">=</span>32 <span class="nt">--rw</span><span class="o">=</span>randread <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>4KQD32write <span class="nt">--bs</span><span class="o">=</span>4k <span class="nt">--iodepth</span><span class="o">=</span>32 <span class="nt">--rw</span><span class="o">=</span>randwrite <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>4Kread <span class="nt">--bs</span><span class="o">=</span>4k <span class="nt">--rw</span><span class="o">=</span>randread <span class="se">\</span>
  <span class="nt">--name</span><span class="o">=</span>4Kwrite <span class="nt">--bs</span><span class="o">=</span>4k <span class="nt">--rw</span><span class="o">=</span>randwrite
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/fio-microsd.png" alt="" />
    <figcaption>
      microSD performance in MB/s (higher is better)
    </figcaption>
  </figure>
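  <p>The results show a big improvement on the Raspberry Pi 4, which is 50% faster than the 3B+ in many of the tests. This may be the most useful upgrade in the Raspberry Pi 4, since the bottleneck is almost always slow disk I/O.</p>
  <h3 id="p7zip-基准测试">p7zip Benchmark</h3>
  <p>7-Zip ships with a built-in benchmark, and so does <code class="language-plaintext highlighter-rouge">p7zip</code> (the POSIX port of 7-Zip). I used it to measure compression and decompression performance on the two boards.</p>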
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>7zr b <span class="nt">-mmt1</span>
7zr b
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/p7zip.png" alt="" />
    <figcaption>
      p7zip benchmark (higher is better)
    </figcaption>
  </figure>
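  <p>According to <a href="https://sevenzip.osdn.jp/chm/cmdline/commands/bench.htm">this help page</a>, compression depends more heavily on memory throughput and latency, which may explain the larger gap between the two boards in the compression test. Overall, the Raspberry Pi 4 performs about one third better in the p7zip benchmark.</p>
  <h3 id="openssl-速度测试">OpenSSL Speed Test</h3>
  <p>OpenSSL is the most popular cryptography library today, and it also includes a built-in speed test. The figure reported for each test is the fastest across all block sizes, which was 16,384 bytes in all four tests.</p>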
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>openssl speed <span class="nt">-evp</span> aes-256-cbc
openssl speed <span class="nt">-evp</span> aes-256-gcm
openssl speed <span class="nt">-evp</span> sha1
openssl speed <span class="nt">-evp</span> sha256
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/openssl.png" alt="" />
    <figcaption>
      OpenSSL benchmark in MB/s (higher is better)
    </figcaption>
  </figure>
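  <h3 id="网络速度测试">Network Speed Test</h3>
  <p>The Raspberry Pi 4 upgrades the 300 Mbps Ethernet to a true Gigabit port, which is great news if you plan to use it as an offline downloader or a NAS. I ran two tests to see how the network actually performs.</p>
  <h4 id="curl-文件下载测试">cURL File Download Test</h4>
  <p>This test is very simple: download a file from a machine on the LAN with cURL and look at the speed.</p>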
  <figure class=""><img src="/static/planet/cURL.png" alt="" />
    <figcaption>
      cURL download speed in MB/s (higher is better)
    </figcaption>
  </figure>
  <p>The result is not as good as expected: the Raspberry Pi 4 fell short of Gigabit speed, whereas the x86 Linux box next to me reached it.</p>
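  <p>Since the chart is in MB/s while the link speeds are quoted in Mbps, a quick conversion helps when judging what "Gigabit speed" would look like. The sketch below uses decimal units and 8 bits per byte; the function name is made up for the example, and the result is an upper bound on the line rate, not a measured value.</p>

```python
def mbps_to_mb_per_s(megabits_per_second):
    """Convert a line rate in Mbps to MB/s (8 bits per byte, decimal units)."""
    return megabits_per_second / 8


if __name__ == "__main__":
    for label, rate in [("100 Mbps", 100), ("300 Mbps", 300), ("1 Gbps", 1000)]:
        print(f"{label}: at most {mbps_to_mb_per_s(rate)} MB/s")
```

  <p>A Gigabit link therefore tops out at 125 MB/s before protocol overhead, so a real download will always land somewhat below that figure.</p>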
  <h4 id="nginx-性能测试">NGINX Performance Test</h4>
  <p>Another common scenario is serving web pages with NGINX (sorry, no room for Apache here). I installed NGINX on both boards, set <code class="language-plaintext highlighter-rouge">access_log off</code>, and benchmarked the Raspberry Pi servers from my x86 box with Siege 4.0.4.</p>
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>siege <span class="nt">-c</span> 10 <span class="nt">-r</span> 1000 <span class="o">[</span>host]
siege <span class="nt">-c</span> 25 <span class="nt">-r</span> 400 <span class="o">[</span>host]
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/nginx.png" alt="" />
    <figcaption>
      NGINX performance in requests per second (higher is better)
    </figcaption>
  </figure>
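  <p>With the combined gains in CPU performance and network speed, the new Raspberry Pi 4 is nearly twice as fast as the 3B+. That is good news if you want to host a website on a Raspberry Pi.</p>
  <h3 id="应用程序性能">Application Performance</h3>
  <p>I picked the two language environments I know best, Python and Ruby (I'm not familiar with Node), for this test.</p>
  <p>The Python test uses the rather silly script from <a href="https://stackoverflow.com/a/44677724/8460426">this Stack Overflow answer</a>, with the running time as the result.</p>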
  <div class="language-python highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code><span class="k">def</span> <span class="nf">test</span><span class="p">():</span>
    <span class="sh">"""</span><span class="s">Stupid test function</span><span class="sh">"""</span>
    <span class="n">lst</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nf">range</span><span class="p">(</span><span class="mi">100</span><span class="p">):</span>
        <span class="n">lst</span><span class="p">.</span><span class="nf">append</span><span class="p">(</span><span class="n">i</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">'</span><span class="s">__main__</span><span class="sh">'</span><span class="p">:</span>
    <span class="kn">import</span> <span class="n">timeit</span>
    <span class="nf">print</span><span class="p">(</span><span class="n">timeit</span><span class="p">.</span><span class="nf">timeit</span><span class="p">(</span><span class="sh">"</span><span class="s">test()</span><span class="sh">"</span><span class="p">,</span> <span class="n">setup</span><span class="o">=</span><span class="sh">"</span><span class="s">from __main__ import test</span><span class="sh">"</span><span class="p">))</span>
</code></pre>
    </div>
  </div>
  <p>The Ruby test is much simpler: build <a href="https://ibugone.com/">my site</a> with Jekyll and see how long it takes.</p>
  <figure class=""><img src="/static/planet/python-ruby.png" alt="" />
    <figcaption>
      Application performance in seconds (lower is better)
    </figcaption>
  </figure>
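  <p>The Ruby test is more balanced than the Python test, which mainly measures raw compute performance, so the performance gap is smaller in the Ruby test.</p>
  <p>But hold on: none of this means the Raspberry Pi 4 is a good choice for large Python or Ruby projects. The same tests ran <strong>10 times faster</strong> on my x86 box (i7-8850H, 32 GB DDR4, NVMe SSD), which took just 5 seconds to run the Python script and 4 seconds to build my Jekyll site. After all, you can't expect earth-shattering performance from a board that sells for only 55 USD, can you?</p>
  <h3 id="usb-io-性能">USB I/O Performance</h3>
  <p>I took out my USB 3.1 SSD (a LiteOn L9M 512 GB drive in an enclosure with a VL716 SATA-to-USB bridge chip). However, as soon as I plugged the SSD into a Raspberry Pi, the board lost power. I later worked out that this was due to insufficient power (the GPIO pins could not supply enough current), so a day later I powered the boards through their Micro-USB / Type-C ports instead. This time the 3B+ ran fine, but the Raspberry Pi 4 failed again during the test because of power problems. In the end I had to power the Raspberry Pi 4 through Type-C and GPIO <strong>at the same time</strong> to get through the SSD test without losing power.</p>
  <p>The power problem here is really serious, but let's set it aside for now and look at the results.</p>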
  <div class="language-shell highlighter-rouge">
    <div class="highlight">
      <pre class="highlight"><code>fio <span class="nt">--loops</span><span class="o">=</span>5 <span class="nt">--size</span><span class="o">=</span>1g <span class="nt">--filename</span><span class="o">=</span>fiotest.tmp <span class="nt">--stonewall</span> <span class="nt">--ioengine</span><span class="o">=</span>libaio <span class="se">\</span>
  <span class="nt">--direct</span><span class="o">=</span>1 <span class="nt">--name</span><span class="o">=</span>SeqRead <span class="nt">--bs</span><span class="o">=</span>1m <span class="nt">--rw</span><span class="o">=</span><span class="nb">read</span> <span class="nt">--name</span><span class="o">=</span>SeqWrite <span class="nt">--bs</span><span class="o">=</span>1m <span class="nt">--rw</span><span class="o">=</span>write
</code></pre>
    </div>
  </div>
  <figure class=""><img src="/static/planet/fio-usb.png" alt="" />
    <figcaption>
      USB speed in MB/s (higher is better)
    </figcaption>
  </figure>
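  <p>The results are great! Even though it does not reach its maximum speed, the upgraded USB 3.0 port is far faster than the previous generation. But before you enjoy high-speed USB, let me stress once more: pay close attention to your USB peripherals, especially power-hungry ones such as hard drives and SSDs. As long as power is taken care of, the two high-speed USB ports are a big win for a NAS or other storage expansion.</p>
  <h2 id="总结">Conclusion</h2>
  <p>After the modest step from the 3B to the 3B+, the new Raspberry Pi 4 is a feast for most Raspberry Pi enthusiasts. With the starting price unchanged, the Raspberry Pi 4 is arguably a must-buy even if you already own a 3B+. It has shortcomings in power delivery and cooling, but they hardly matter if you don't attach too many peripherals or run overly heavy workloads on it.</p>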
  ]]></content><author><name>taoky</name></author><category term="Technology" /><category term="Translation" /><category term="树莓派" /><summary type="html"><![CDATA[Original article: https://ibugone.com/blog/2019/09/raspberry-pi-4-review-benchmark/, by @iBug. The translation follows.]]></summary></entry></feed>