Hibari is a production-ready, distributed, key-value, big-data
store. Hibari uses chain replication for strong consistency, high
availability, and durability. Hibari delivers excellent performance,
especially for read and large-value operations.
This FAQ may help answer some of your questions.
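For readers unfamiliar with chain replication, the idea in outline is: writes enter at the head brick and are applied down the chain to the tail; reads are served by the tail, which holds only fully replicated writes, giving strong consistency. A minimal sketch in Python (illustrative only, not Hibari's Erlang implementation; the class and method names are invented for this example):

```python
# Minimal chain-replication sketch (illustrative only; this is not
# Hibari's implementation). Writes enter at the head and propagate
# down the chain; reads are served by the tail, so a read can only
# observe values that every replica has already stored.

class Brick:
    """One storage brick (replica) in a chain."""
    def __init__(self, name):
        self.name = name
        self.store = {}

class Chain:
    def __init__(self, brick_names):
        self.bricks = [Brick(n) for n in brick_names]

    def write(self, key, value):
        # The head receives the write; each brick applies it, then
        # forwards it downstream. The write is acknowledged only
        # after the tail has applied it.
        for brick in self.bricks:
            brick.store[key] = value
        return "ok"

    def read(self, key):
        # The tail serves reads: strong consistency, since the tail
        # only holds fully replicated writes.
        return self.bricks[-1].store.get(key)

chain = Chain(["brick1", "brick2", "brick3"])
chain.write(b"key1", b"hello")
print(chain.read(b"key1"))  # b'hello'
```

In a real deployment the per-brick loop above is a network hop between bricks, and failure handling (repairing a chain around a crashed brick) is where most of the protocol's complexity lives.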
1.1. Where can I download the Hibari binary code package?

Sorry, Hibari binary packages are not available right now.
1.2. How can I download the Hibari source code repositories using Git?

Please see here for pointers to download the Hibari source code
repositories using Git and Google's repo helper tool.
2. Hibari and Hadoop/HDFS
2.1. Chain Replication is one of Hibari's characteristics…

Question: Chain Replication is one of Hibari's characteristics.
Hadoop's HDFS has similar characteristics, and we are not able to
understand the difference. Could you describe the specific
differences?

Answer: Sorry, we are not familiar with the implementation of HDFS.
We need to schedule time to research and understand the HDFS
implementation. Thank you in advance for your patience.
2.2. Hibari's Admin Server plays a role like Name Node of Hadoop…

Question: We understand that Hibari's Admin Server plays a role like
Hadoop's Name Node and controls chains under a master and slave
configuration. If the Hibari Admin Server stops abnormally, a
standby needs to replace it and continue operations. To what extent
are Hibari's Admin Server and its standbys synced? Are they fully
synced, so that a shift to a standby happens without any data loss?
Also, how long would a shift to a standby take (out-of-service
time)?

Answer: Yes, a standby for Hibari's Admin Server can resume full
service without any data loss. All of the Admin Server's private
state is stored on disk in the bootstrap bricks, whose storage is
managed by quorum replication. When the Admin Server is stopped
(e.g. node shutdown) or crashes (e.g. power failure), a standby
Admin Server will take over, assume the master role, and restore the
cluster's state from the bootstrap bricks.

In theory, the 20-30 seconds required for the Admin Server to
restart could mean 20-30 seconds of negative service impact to
Hibari clients. In practice, however, Hibari clients almost never
notice when an Admin Server instance crashes and restarts. Every
time the state of the cluster's data nodes changes, the master Admin
Server updates its in-memory copy and the bootstrap bricks of both
the master and the standby Admin Server at the same time, so the
master and standby are re-synced on every such change.
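To make the quorum-replication idea concrete, here is a toy majority-quorum sketch in Python (illustrative only; this is not Hibari's bootstrap-brick code, and all names are invented for this example). A write succeeds only if a majority of bricks accept it, so a restarted or standby Admin Server can always recover the latest state by reading from a majority:

```python
# Toy majority-quorum replication sketch for Admin Server state
# (illustrative; not Hibari's actual bootstrap-brick implementation).

class BootstrapBrick:
    def __init__(self):
        self.version = 0
        self.state = None

def quorum_write(bricks, state, version):
    # In a real system each update is sent over the network; here we
    # just count how many bricks accept the newer version.
    acks = 0
    for brick in bricks:
        if version > brick.version:
            brick.version = version
            brick.state = state
            acks += 1
    return acks > len(bricks) // 2   # succeed only with a majority

def quorum_read(bricks):
    # Recover the newest state visible among a majority of bricks.
    replies = sorted(bricks, key=lambda b: b.version, reverse=True)
    majority = replies[: len(bricks) // 2 + 1]
    return majority[0].state

bricks = [BootstrapBrick() for _ in range(3)]
quorum_write(bricks, {"chains": ["chain1", "chain2"]}, version=1)
print(quorum_read(bricks))  # {'chains': ['chain1', 'chain2']}
```

Because any write majority and any read majority must overlap in at least one brick, a standby that reads a majority after failover is guaranteed to see the last acknowledged state.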
2.3. Hibari's Admin Server stores what state…
2.4. What about data locality…

Question: Hibari provides APIs that can be used like HBase and
Bigtable. You also said linkage with MapReduce is theoretically
possible but not yet implemented. What we do not understand is data
locality. MapReduce focuses on data locality and improves processing
efficiency by having a node process the data that it holds. Suppose
that you need to develop a MapReduce framework for Hibari: is it
possible to design it so that data locality is recognized? I.e.,
does (or can) Hibari provide an API to retrieve data from chains on
a certain node, or to retrieve data from a group of nodes that
constitute certain chains?

Answer: Yes. Hibari has APIs to retrieve data (keys or keys+values)
across all chains, and to retrieve data from a single chain or from
multiple chains. Using Hibari's consistent hashing implementation,
the application can control how keys are mapped to chains via a key
hashing prefix, and can control each chain's relative share of
storage via a chain weighting factor.
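The prefix-plus-weight scheme can be sketched as follows (a toy Python model, not Hibari's API; the function and parameter names here are invented for illustration). Hashing only the key's prefix makes all keys sharing that prefix land on the same chain, and a chain's weight controls how much of the hash space, and therefore how many keys, it receives:

```python
import bisect
import hashlib

# Toy prefix-based consistent hashing with chain weights
# (illustrative; not Hibari's implementation or API names).

def build_ring(chain_weights, points_per_weight=100):
    """chain_weights: {chain_name: weight}. Returns a sorted hash ring;
    heavier chains get proportionally more points on the ring."""
    ring = []
    for chain, weight in chain_weights.items():
        for i in range(weight * points_per_weight):
            h = hashlib.md5(f"{chain}:{i}".encode()).hexdigest()
            ring.append((int(h, 16), chain))
    ring.sort()
    return ring

def chain_for_key(ring, key, prefix_sep=b"/"):
    # Hash only the key's prefix, so all keys sharing a prefix map
    # to the same chain (useful for locality-aware processing).
    prefix = key.split(prefix_sep, 1)[0]
    h = int(hashlib.md5(prefix).hexdigest(), 16)
    idx = bisect.bisect(ring, (h,)) % len(ring)
    return ring[idx][1]

ring = build_ring({"chain1": 1, "chain2": 2})  # chain2 gets ~2x the keys
# Keys sharing the prefix "user42" land on the same chain:
assert chain_for_key(ring, b"user42/profile") == chain_for_key(ring, b"user42/inbox")
```

A locality-aware MapReduce layer could then schedule the work for each prefix on the nodes that host that prefix's chain.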
2.5. Hibari's Thrift API…

Question: Hibari also supports a Thrift API. How much of a gap in
execution speed is there between the Erlang API and the Thrift API?

Answer: So far, no measurable performance gain or loss has been
found between Hibari's native Erlang, UBF/EBF, and Thrift API
implementations.
2.6. Tools for multiple clusters…

Question: Does Hibari provide tools to maintain many clusters?

Answer: No, not currently.
2.7. Maximum number of clusters…

Question: What is the theoretical maximum number of clusters? What
is the maximum number of clusters actually proven in practice?

Answer: Each Hibari cluster is an independent entity. There is no
limit, since there is no sharing between Hibari clusters.
2.8. Maximum number of nodes…

Question: What is the theoretical maximum number of nodes within a
Hibari cluster? What is the maximum number of nodes actually proven
in practice?

Answer: There is no known theoretical limit. The maximum size of a
Hibari cluster has not yet been determined. A practical limit of
approximately 200-250 nodes is likely. This limit is currently
governed by the implementation of Hibari's Admin Server and by the
implementation of Erlang's distribution. The largest proven
deployment of Hibari is 50-60 nodes.