9. FAQ

9.1 Storage/Hardware

1. How does ActorDB store data?

How data is stored in the storage engine is described in this blog post. Essentially, every actor is an SQLite database, and all actors on a node are stored within a single LMDB file.

2. Where does ActorDB store data?

On Linux the default directory is /var/lib/actordb

On OS X and Windows the default directory is ./data/

The data directory contains an lmdb file. This is where everything is stored.

3. How do I back up my data?

Execute: actordb_tool backup /path/to/source/lmdb /path/to/backup/lmdb

Unless the load on the database is small, you should back up to a second disk. Writing the backup to the same disk severely degrades write performance while the backup is running.

You must do this for at least one node of every cluster.
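The backup step can be wrapped in a small script. The sketch below only assembles the actordb_tool command quoted above and warns when source and destination appear to share a device. The helper names are hypothetical, and comparing st_dev is a rough proxy: two partitions of the same physical disk report different device ids.

```python
import os
import subprocess
import sys


def build_backup_command(source_lmdb, backup_lmdb):
    """Return the argv list for the backup command quoted in this FAQ."""
    return ["actordb_tool", "backup", source_lmdb, backup_lmdb]


def on_same_device(path_a, path_b):
    """True if both paths report the same filesystem device id."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


def run_backup(source_lmdb, backup_lmdb):
    """Warn if the backup lands on the source's device, then run the backup."""
    backup_dir = os.path.dirname(os.path.abspath(backup_lmdb))
    if on_same_device(source_lmdb, backup_dir):
        print("warning: backup target is on the same device as the source",
              file=sys.stderr)
    # check=True raises CalledProcessError if actordb_tool exits non-zero.
    subprocess.run(build_backup_command(source_lmdb, backup_lmdb), check=True)
```

Calling run_backup("/var/lib/actordb/lmdb", "/path/to/backup/lmdb") on one node of each cluster would cover the requirement above.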

4. What kind of hardware should my nodes have?

SSD drives are highly recommended. The more RAM you have, the better. Nodes should be connected by at least a gigabit link.

9.2 Clustering

5. Can I run multiple ActorDBs on a single node?

Yes. Install them into separate locations. They must have different names (in vm.args) and different ports (thrift_port, mysql_protocol, rpcport settings in app.config).

6. How are actors distributed across nodes?

ActorDB has nodes and clusters. Every node must be a part of a cluster. There can be as many clusters as you wish.

ActorDB uses sharding to distribute actors. Every node holds ~4 shards. For every query, the name of the actor being queried is hashed. The hash determines which shard the actor falls into, which in turn determines which node the actor must live on.
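The lookup described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the 32-bit hash space, the use of SHA-1, the even split of the space, and the function names. ActorDB's actual hash function and shard layout are not specified in this FAQ.

```python
import hashlib

HASH_SPACE = 2 ** 32  # assumed size of the hash space (illustrative only)


def actor_hash(name):
    """Map an actor name to a point in the hash space (SHA-1 is an assumption)."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


def build_shards(nodes, shards_per_node=4):
    """Split the hash space evenly into (start, end, node) ranges, end inclusive."""
    count = len(nodes) * shards_per_node
    step = HASH_SPACE // count
    shards = []
    for i in range(count):
        start = i * step
        # The last shard absorbs any remainder so the whole space is covered.
        end = HASH_SPACE - 1 if i == count - 1 else start + step - 1
        shards.append((start, end, nodes[i % len(nodes)]))
    return shards


def locate(name, shards):
    """Return ((start, end), node) for the shard that owns this actor name."""
    point = actor_hash(name)
    for start, end, node in shards:
        if start <= point <= end:
            return (start, end), node
    raise LookupError("hash space is not fully covered by shards")


shards = build_shards(["node1", "node2", "node3"])
shard_range, node = locate("user42", shards)
```

Because the hash is deterministic, the same actor name always resolves to the same shard, and therefore to the same node, regardless of which node received the query.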

7. How do I set replication factor?

You set it indirectly with the size of your cluster(s). An actor lives within a cluster and is replicated across all nodes within that cluster.

8. How many nodes should I add to a cluster?

It should be an odd number: 1, 3 or 5. Single node clusters have no redundancy but are fastest. Three node clusters are a good compromise. Five node clusters provide lots of safety at the cost of performance. If you want to start small, a single node cluster with regular backups may be fine. If your needs start growing, you can add nodes at any time.
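The preference for odd sizes can be illustrated with simple arithmetic, assuming that a write must be acknowledged by a majority of the nodes in a cluster (as in Raft-style consensus). This FAQ does not spell out that rule, so treat the sketch as an assumption rather than a description of ActorDB internals.

```python
def majority(cluster_size):
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    """How many nodes can be down while a majority is still reachable."""
    return cluster_size - majority(cluster_size)


for size in (1, 2, 3, 4, 5):
    print(size, majority(size), tolerated_failures(size))
```

Under that assumption a four-node cluster tolerates one failure, the same as a three-node cluster, while every write has to reach more nodes. Going from an odd size to the next even size adds cost without adding safety.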

9. If I want to add nodes to an existing cluster, what is the right process?

  1. Create a backup of an existing node in that cluster.
  2. Copy the backup lmdb file to the new server. On Linux, put it in /var/lib/actordb/lmdb
  3. Configure app.config and vm.args.
  4. Start the new node.
  5. Go back to the node from step 1 and add the new node using actordb_console.

10. What happens when a new node is added to ActorDB?

  1. The new node is stored to global state.
  2. The new node receives global state.
  3. The new node initializes itself with that state.
  4. It notices it has fewer shards than other nodes.
  5. It finds the node which has the largest chunk of shard space and tells it that it wants half of one shard (the upper half).
  6. The copy process starts.
  7. Once the copy is complete, it checks again how much shard space it has. If it has less than other nodes, it goes back to step 5.

Once a shard is placed on a node, it will remain there forever. If new nodes are added, existing shards get smaller and new shards are created from them.
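Steps 4 to 7 can be simulated in a few lines. This is a sketch under stated assumptions, not ActorDB's implementation: shards are modelled as (start, end, owner) tuples with an inclusive end, the donor is the node with the most total shard space, the donor gives up the upper half of its largest shard, and the loop stops once the new node holds at least as much space as the smallest other node.

```python
def space_of(shards, node):
    """Total hash space owned by a node."""
    return sum(end - start + 1 for start, end, owner in shards if owner == node)


def add_node(shards, new_node):
    """Return a new shard list after the new node has claimed its share."""
    shards = list(shards)
    others = sorted({owner for _, _, owner in shards})
    while space_of(shards, new_node) < min(space_of(shards, n) for n in others):
        # Step 5: pick the node with the most shard space, then its largest shard.
        donor = max(others, key=lambda n: space_of(shards, n))
        idx = max(
            (i for i, shard in enumerate(shards) if shard[2] == donor),
            key=lambda i: shards[i][1] - shards[i][0],
        )
        start, end, _ = shards[idx]
        mid = start + (end - start + 1) // 2
        if mid == start:
            raise ValueError("shard too small to split")
        # Step 6: the donor keeps the lower half, the new node gets the upper half.
        shards[idx] = (start, mid - 1, donor)
        shards.append((mid, end, new_node))
    return sorted(shards)


initial = [(0, 255, "a"), (256, 511, "a"), (512, 767, "a"), (768, 1023, "a")]
rebalanced = add_node(initial, "b")
```

In this run node "a" keeps all four of its original shards, each at half its former size, and node "b" ends up with four new shards created from their upper halves. No existing shard changes owner.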

11. When a shard is placed on a new node, what happens to existing actors within that shard? Are they moved as well?

Actors get moved to that new node (and its cluster) one by one. Moving a shard means moving actors.

For KV types, the entire KV shard is copied to the new node, then a DELETE statement is executed on each side to remove the half it no longer owns. The origin shard deletes the upper half; the destination shard deletes the lower half.
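The copy-then-delete approach for KV shards can be sketched with plain SQLite. The table name, its columns, and the idea of a stored hash column are assumptions made for this illustration; the sketch uses Python's sqlite3 backup call as a stand-in for however ActorDB copies the shard between nodes.

```python
import sqlite3


def split_kv_shard(origin, destination, split_point):
    """Copy the whole shard, then delete the half each side no longer owns."""
    origin.backup(destination)  # full copy of the origin database
    with origin:
        # The origin keeps the lower half, so it drops the upper half.
        origin.execute("DELETE FROM actors WHERE hash >= ?", (split_point,))
    with destination:
        # The destination keeps the upper half, so it drops the lower half.
        destination.execute("DELETE FROM actors WHERE hash < ?", (split_point,))


# Hypothetical KV shard with four keys spread across the hash range.
origin = sqlite3.connect(":memory:")
origin.execute("CREATE TABLE actors (id TEXT PRIMARY KEY, hash INTEGER, val TEXT)")
origin.executemany(
    "INSERT INTO actors VALUES (?, ?, ?)",
    [("a", 10, "x"), ("b", 20, "y"), ("c", 30, "z"), ("d", 40, "w")],
)
origin.commit()

destination = sqlite3.connect(":memory:")
split_kv_shard(origin, destination, split_point=25)

origin_keys = [row[0] for row in origin.execute("SELECT id FROM actors ORDER BY hash")]
destination_keys = [row[0] for row in destination.execute("SELECT id FROM actors ORDER BY hash")]
```

After the split every key exists on exactly one side, which is the property the two DELETE statements are there to guarantee.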

12. While an actor is being moved, is it locked for reads/writes?

You can think of an actor as having two parts: base data and a write log. Every write hits the write log; the base data remains unchanged.

Actor move/copy is done in stages:

  1. Copy the base chunk of data.
  2. Copy everything in the write log up to the point where the copy process started.
  3. Lock the actor for writes.
  4. Copy the remainder.
  5. Unlock the source. If the operation is a move (not a copy), the source is deleted and replaced with a redirect marker. Any queries in flight will be redirected to the new node.
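The stages above can be modelled with a toy in-memory actor. This is a simplified sketch: the class, its fields, and the writes_during_copy parameter (which simulates writes arriving while the bulk copy is running) are all hypothetical and only mirror the base data / write log split described in this answer.

```python
class Actor:
    """Toy actor: immutable base data plus an append-only write log."""

    def __init__(self, base=None):
        self.base = dict(base or {})
        self.log = []
        self.write_locked = False
        self.redirect = None  # set to the new location after a move

    def write(self, key, value):
        if self.redirect is not None:
            raise RuntimeError("actor moved to " + self.redirect)
        if self.write_locked:
            raise RuntimeError("actor is locked for writes")
        self.log.append((key, value))

    def state(self):
        """Base data with the write log applied on top."""
        data = dict(self.base)
        data.update(self.log)
        return data


def transfer(source, dest_node, move=False, writes_during_copy=()):
    mark = len(source.log)             # log position when the copy starts
    for key, value in writes_during_copy:
        source.write(key, value)       # source still accepts writes here
    dest = Actor(source.base)          # stage 1: copy the base data
    dest.log.extend(source.log[:mark]) # stage 2: copy the log up to the mark
    source.write_locked = True         # stage 3: lock the source for writes
    dest.log.extend(source.log[mark:]) # stage 4: copy the remainder
    source.write_locked = False        # stage 5: unlock the source
    if move:
        source.base.clear()
        source.log.clear()
        source.redirect = dest_node    # leave a redirect marker behind
    return dest


src = Actor({"a": 1})
src.write("b", 2)
dst = transfer(src, "node2", move=True, writes_during_copy=[("c", 3)])
```

In this model the write lock is held only while the tail of the log is copied, so the bulk of the transfer happens while the source continues to accept writes.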

9.3 Querying

13. Does ActorDB have map/reduce?

Not at the moment. If you have an opinion where it would make sense, let us know: https://github.com/biokoda/actordb/issues

14. How do KV types work?

Normal actors belong to shards, and there can be any number of actors. KV actors are shards. If every node has ~4 shards on it, then for a KV type each node will hold ~4 actors of that KV schema.

An actor statement like this one:

actor kvtype(kvkey);

means ActorDB will hash kvkey to determine which shard it falls into, then send the query to that shard. That shard may hold many other keys in the same actor database.

15. How do schema updates work?

Once the schema has been updated, every actor will apply the update before its next read or write call. This means the database updates itself incrementally: currently active actors update immediately, inactive ones update later when they are next accessed.
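The incremental behaviour can be illustrated with a minimal sketch in which each actor is an in-memory SQLite database that catches up on pending schema statements before running a query. Tracking the applied version in PRAGMA user_version, the table layout, and the class itself are assumptions for this example, not a description of how ActorDB records schema versions.

```python
import sqlite3

# Ordered list of schema statements; later updates are appended to it.
SCHEMA_UPDATES = [
    "CREATE TABLE tab (id INTEGER PRIMARY KEY, txt TEXT)",
]


class Actor:
    """Toy actor backed by its own SQLite database."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")

    def schema_version(self):
        return self.db.execute("PRAGMA user_version").fetchone()[0]

    def _catch_up(self):
        """Apply any schema statements this actor has not executed yet."""
        version = self.schema_version()
        for statement in SCHEMA_UPDATES[version:]:
            self.db.execute(statement)
            version += 1
            self.db.execute(f"PRAGMA user_version = {version}")
        self.db.commit()

    def execute(self, sql, params=()):
        self._catch_up()  # the schema is brought up to date before every call
        rows = self.db.execute(sql, params).fetchall()
        self.db.commit()
        return rows


active, idle = Actor(), Actor()
active.execute("INSERT INTO tab (txt) VALUES (?)", ("hello",))

# A schema update arrives; nothing happens to any actor until its next call.
SCHEMA_UPDATES.append("ALTER TABLE tab ADD COLUMN created INTEGER")
rows = active.execute("SELECT id, txt, created FROM tab")
```

Here the active actor picks up the new column on its next query, while the idle actor stays at version 0 until something finally reads from or writes to it.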

16. How do I reset the database and start over?

Stop ActorDB, delete the lmdb file (/var/lib/actordb/lmdb on Linux), then start ActorDB again.

17. How do I use the thrift interface?

Install thrift using your package manager. Run thrift --help and find your language in the list. Download the ActorDB thrift file.

Run: thrift --gen mylang adbt.thrift

The local folder should now contain generated code that you can put directly in your project to connect and run queries.

18. Should I use thrift or a mysql driver?

We recommend thrift.

19. Which node should my clients connect to?

It does not matter. Clients can connect to any node. Using a pool of connections to multiple nodes is recommended. Every ActorDB node can easily handle thousands of client connections.
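One simple way to spread clients across nodes is a round-robin connector that skips nodes it cannot reach. The sketch below is deliberately client-agnostic: the class name is hypothetical, and connect stands for whatever function opens a connection in your thrift or MySQL client. It is assumed to raise OSError when a node is unreachable.

```python
import itertools


class RoundRobinConnector:
    """Hand out connections to nodes in rotation, skipping unreachable ones."""

    def __init__(self, nodes, connect):
        self._nodes = list(nodes)      # list of (host, port) pairs
        self._connect = connect        # caller-supplied connection function
        self._cycle = itertools.cycle(self._nodes)

    def get_connection(self):
        last_error = None
        # Try each node at most once per call before giving up.
        for _ in range(len(self._nodes)):
            host, port = next(self._cycle)
            try:
                return self._connect(host, port)
            except OSError as err:
                last_error = err
        raise ConnectionError("no ActorDB node reachable") from last_error


def fake_connect(host, port):
    """Stand-in for a real client connect call; node2 is simulated as down."""
    if host == "node2":
        raise OSError("connection refused")
    return "connected to " + host


# Hypothetical hosts and port, for illustration only.
pool = RoundRobinConnector(
    [("node1", 1234), ("node2", 1234), ("node3", 1234)], fake_connect
)
first = pool.get_connection()
second = pool.get_connection()
third = pool.get_connection()
```

Since any node can serve any query, a failed node only costs the client one skipped attempt before the next node in the rotation answers.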
