Base64 is a binary-to-text encoding scheme. It is used to carry binary data streams over channels that do not support binary transport.

Base64

// Demo from https://developer.mozilla.org/en-US/docs/Web/API/WindowOrWorkerGlobalScope/atob
var encodedData = window.btoa('Hello, world'); // encode a string
var decodedData = window.atob(encodedData); // decode the string

Unicode Problem

> window.btoa('中文')
< VM200:1 Uncaught DOMException: Failed to execute 'btoa' on 'Window': The string to be encoded contains characters outside of the Latin1 range.
    at <anonymous>:1:8

// Solutions
// https://stackoverflow.com/questions/23223718/failed-to-execute-btoa-on-window-the-string-to-be-encoded-contains-characte
// https://developer.mozilla.org/en-US/docs/Web/API/WindowBase64/Base64_encoding_and_decoding
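One common way around the Latin1 restriction is to convert the string to UTF-8 bytes first and hand `btoa` one character per byte. The following is a minimal sketch of that idea, not the exact code from the links above; the function names `utf8ToBase64` and `base64ToUtf8` are made up for illustration, and it assumes an environment where `TextEncoder`, `TextDecoder`, `btoa` and `atob` are available as globals.

```javascript
// Encode a Unicode string as Base64 by going through its UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text); // Uint8Array of UTF-8 bytes
  let binary = '';
  for (const byte of bytes) {
    // Each byte (0..255) becomes one Latin1 character, which btoa accepts.
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// Reverse direction: Base64 -> bytes -> UTF-8 decoded string.
function base64ToUtf8(base64) {
  const binary = atob(base64);
  const bytes = Uint8Array.from(binary, (ch) => ch.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}

console.log(utf8ToBase64('中文'));     // "5Lit5paH"
console.log(base64ToUtf8('5Lit5paH')); // "中文"
```

The result is the Base64 of the UTF-8 bytes, so whoever decodes it has to know that the payload is UTF-8 text.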

How it works, briefly

Defining the printable characters

var key = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';

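How many bits the printable characters can represent

1 << 2 = 4 = 100(2): 4 values cover 2 bits, namely 00, 01, 10, 11
1 << 6 = 64 = 1000000(2): 64 values cover 6 bits, 000000 ~ 111111
So each output character carries 6 bits instead of 8, and 4 Base64 characters encode one group of 3 bytes, because 4 * 6 = 3 * 8. The 65th character of key, "=", carries no data and is only used for padding.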
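The 4 * 6 = 3 * 8 relation also fixes the length of the output: every started group of 3 input bytes costs 4 output characters. A small sketch of that arithmetic, where `base64Length` is a hypothetical helper name and padding with "=" is assumed to be on, as it is for `btoa`:

```javascript
// Length of the padded Base64 output for a given number of input bytes.
// Every started 3-byte group produces exactly 4 characters.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

console.log(base64Length(1)); // 4
console.log(base64Length(3)); // 4
console.log(base64Length(4)); // 8
// 'Hello, world' is 12 bytes, so both sides are 16.
console.log(btoa('Hello, world').length === base64Length(12)); // true
```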

demo


> var key = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';

> key.indexOf('Q').toString(2)
< "10000"
< left-padded with 0 to 6 bits: "010000"

> btoa('A')
< "QQ=="
< Base64, each character left-padded with 0 to 6 bits: "010000 010000"

> 'A'.charCodeAt(0).toString(2)
< "1000001"
left-padded with 0 to 8 bits => split into 6 + 2 => the trailing 2 bits are right-padded with 0
"0100 0001" => "010000 01" => "010000 010000"


> btoa('ABC')
< "QUJD"

Base64 indices, left-padded with 0 to 6 bits
Q: "010000"
U: "010100"
J: "001001"
D: "000011"

"ABC" character codes, left-padded with 0 to 8 bits
A: "01000001"
B: "01000010"
C: "01000011"

ABC = "01000001" + "01000010" + "01000011"
    = "010000" + "01" + "0100" + "0010" + "01" + "000011"
    = "010000" + "010100" + "001001" + "000011"
    = Q + U + J + D
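The same regrouping can be written out as code. This is a deliberately literal sketch that works on bit strings exactly like the walkthrough above, so it is easy to follow but slow; `encodeBase64` is a hypothetical name, `key` is the table defined earlier, and the input is assumed to be an array of byte values (0..255).

```javascript
// Same table as above; index 64 ("=") is the padding character.
var key = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/=';

function encodeBase64(bytes) {
  // 1. Write every byte as 8 bits, left-padded with 0.
  let bits = '';
  for (const byte of bytes) {
    bits += byte.toString(2).padStart(8, '0');
  }
  // 2. Right-pad with 0 until the bits split evenly into 6-bit groups.
  while (bits.length % 6 !== 0) {
    bits += '0';
  }
  // 3. Look up each 6-bit group in the table.
  let out = '';
  for (let i = 0; i < bits.length; i += 6) {
    out += key.charAt(parseInt(bits.slice(i, i + 6), 2));
  }
  // 4. Pad with "=" so the output length is a multiple of 4.
  while (out.length % 4 !== 0) {
    out += key.charAt(64);
  }
  return out;
}

console.log(encodeBase64([65]));         // "QQ=="
console.log(encodeBase64([65, 66, 67])); // "QUJD"
```

A practical encoder would use shifts and masks on three bytes at a time instead of building strings, but the grouping is the same.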

Demo

Drag and drop a file here to display its Base64 string
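A drop zone like this could be wired up roughly as follows. This is only a sketch and not necessarily how the demo on this page is built: the element ids `drop-zone` and `output` and the helper name `bytesToBase64` are assumptions made for illustration, and the file is read with `Blob.arrayBuffer()`.

```javascript
// Convert raw bytes to Base64 by handing btoa one Latin1 character per byte.
function bytesToBase64(bytes) {
  let binary = '';
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}

// The DOM part only runs in a browser page that has the two elements.
if (typeof document !== 'undefined') {
  const dropZone = document.getElementById('drop-zone'); // hypothetical id
  const output = document.getElementById('output');      // hypothetical id
  if (dropZone && output) {
    // Without preventDefault the browser would open the file instead.
    dropZone.addEventListener('dragover', (event) => event.preventDefault());
    dropZone.addEventListener('drop', async (event) => {
      event.preventDefault();
      const file = event.dataTransfer.files[0];
      if (!file) return;
      const buffer = await file.arrayBuffer();
      output.textContent = bytesToBase64(new Uint8Array(buffer));
    });
  }
}
```

Because the file is treated as bytes rather than as a string, the Latin1 limitation of `btoa` does not get in the way here.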

References