A brief comparison of several databases, meant to clarify some basic principles and differences. It covers MySQL, Lindorm, HBase, Hologres, and others.

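So-called big data and high-concurrency workloads are typically handled by combining distributed storage with distributed computation, for example through MapReduce. The underlying idea is divide and conquer: split the data and the work across many machines, then combine the partial results.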

HDFS

Distributed file storage.

The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware.

Map/Reduce DataFlow

https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html

Hadoop MapReduce is a software framework for easily writing applications which process vast amounts of data (multi-terabyte data-sets) in-parallel on large clusters (thousands of nodes) of commodity hardware in a reliable, fault-tolerant manner.

WordCount

https://help.aliyun.com/document_detail/27875.htm?spm=a2c4g.11186623.0.0.268a7e6dhRJ5qp#t12013.html

// TODO: WordCount
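
As a placeholder for the WordCount idea, here is a minimal single-process sketch of the map, shuffle, and reduce steps. It is not the Hadoop or MaxCompute API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration, and a real job would run the map and reduce steps in parallel on many nodes.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted counts by their key (the word)."""
    groups = defaultdict(list)
    for word, count in pairs:
        groups[word].append(count)
    return groups


def reduce_phase(word, counts):
    """Reduce: sum the counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in one process (hypothetical driver)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    sample = ["hello hadoop", "hello mapreduce"]
    print(word_count(sample))
```

Each line can be mapped independently and each word can be reduced independently, which is what lets a framework spread the work across machines.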

MaxCompute/ODPS

Architecture diagram:

Spark

MySQL

HBase

Introduction

  • NoSQL
  • Point lookups by row key
  • Column-family (wide-column) storage
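
To make the points above concrete, here is a toy in-memory model of a wide-column table: rows are addressed by a row key, and each row holds cells grouped by column family. This is only a sketch of the data model, not the HBase client API. The class name TinyWideColumnTable and its put/get methods are hypothetical.

```python
from collections import defaultdict


class TinyWideColumnTable:
    """Hypothetical toy model: row key -> column family -> qualifier -> value."""

    def __init__(self, families):
        # Column families are fixed up front; qualifiers inside them are free-form.
        self._families = set(families)
        self._rows = defaultdict(lambda: defaultdict(dict))

    def put(self, row_key, family, qualifier, value):
        """Write one cell; rows may hold different sets of qualifiers."""
        if family not in self._families:
            raise KeyError(f"unknown column family: {family}")
        self._rows[row_key][family][qualifier] = value

    def get(self, row_key, family=None):
        """Point lookup: fetch one row (or one family of it) by exact row key."""
        row = self._rows.get(row_key)
        if row is None:
            return {}
        if family is not None:
            return dict(row.get(family, {}))
        return {fam: dict(cols) for fam, cols in row.items()}


if __name__ == "__main__":
    table = TinyWideColumnTable(["info", "stats"])
    table.put("user#1", "info", "name", "alice")
    table.put("user#1", "stats", "logins", 3)
    print(table.get("user#1"))
    print(table.get("user#1", family="stats"))
```

The lookup touches only the requested row key, which is the access pattern a point lookup refers to.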

Lindorm

Introduction

Hologres

Introduction

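  • Shared Disk/Storage architecture: only the storage is shared, so throughput is bounded by the read bandwidth of the storage network
  • Shared Nothing architecture: compute and storage live on the same node, so scaling out involves a rebalance
  • Storage/compute separation: compute nodes keep a cache of shards, and compute and storage can be scaled independently of each other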
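
The rebalance cost in a shared-nothing design can be illustrated with a small sketch. It assumes simple hash-modulo placement of keys on nodes, which is an assumption made for this example and not a description of how Hologres or any other specific system places data. The helper names owner and moved_keys are hypothetical.

```python
import zlib


def owner(key, node_count):
    """Pick a node index by hash-modulo placement (assumed for this sketch)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def moved_keys(keys, old_nodes, new_nodes):
    """Return the keys whose owning node changes when the cluster is resized."""
    return [k for k in keys if owner(k, old_nodes) != owner(k, new_nodes)]


if __name__ == "__main__":
    keys = [f"row-{i}" for i in range(1000)]
    moved = moved_keys(keys, old_nodes=3, new_nodes=4)
    # Every moved key stands for data that must be copied between nodes
    # when storage and compute live together.
    print(f"{len(moved)} of {len(keys)} keys would change nodes")
```

When data lives on the node that owns it, every reassigned key implies copying data over the network, which is why scaling out such a cluster comes with a rebalance phase.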

References

  • Spark: https://spark.apache.org/
  • MaxCompute: https://help.aliyun.com/document_detail/27800.html