各种数据库对比
各种数据库的简单对比,搞清楚一些简单的原理和区别。包括 MySQL、Lindorm、HBase、Hologres 等。
所谓的大数据、高并发都是通过分布式存储 + 分布式计算 + MapReduce 的机制完成的。分而治之。
HDFS
分布式文件存储。
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.
Map/Reduce DataFlow
https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html
Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.
WordCount
https://help.aliyun.com/document_detail/27875.htm?spm=a2c4g.11186623.0.0.268a7e6dhRJ5qp#t12013.html
// TODO: WordCount
MaxCompute/ODPS
架构图:
Spark
MySQL
- 常规的关系型存储
- 列存储大小(除 Text 类型)外的限制为 65535 bytes
HBase
- NoSQL
- 点查
- 列存储
Lindorm
Hologres
- Shared Disk/Storage 架构,仅共享存储,但这种架构会受限于存储网络的读取上限
- Shared Nothing 架构,计算和存储在同一个节点上,扩容有 rebalance 的过程
- 存储计算分离,但计算时有分片缓存,异步扩展计算或存储
References
- Spark: https://spark.apache.org/
- MaxCompute: https://help.aliyun.com/document_detail/27800.html