Running a standalone worker node

Creating a distributed cluster of worker nodes

It is now possible to start multiple worker nodes and have them register with etcd.

This is a work in progress. Although multiple workers can be running, each query currently runs in a single worker. This will be the case until data partitioning is implemented.

To run the worker:

It is necessary to tell the worker which IP address to bind to and also which address to register with etcd for service discovery.

The data_dir must also be specified so that the worker can locate data files (currently only unpartitioned csv files are supported).

Each worker registers with etcd every 5 seconds using a TTL of 10 seconds for now, so once a worker dies, it could take up to 10 seconds for etcd to expire its membership of the cluster.

In this example, the IP address of the worker node is 192.168.0.45.

cargo run --bin worker -- --bind 0.0.0.0:8080 --register 192.168.0.45:8080 --etcd http://192.168.0.37:2379 --data_dir ./test/data

To run the console:

The console will connect to etcd and pick the first worker in the list.

cargo run --bin console -- --etcd http://192.168.0.37:2379

Run some queries

Currently, only CSV files are supported and the schema must be defined by running a CREATE EXTERNAL TABLE query, for example:

CREATE EXTERNAL TABLE uk_cities (name VARCHAR(100) NOT NULL, lat DOUBLE NOT NULL, lng DOUBLE NOT NULL)

The execution context is hard-coded to look for csv files in the test/data directory (relative path).

Here’s an example of a console session that registers a CSV file as a table and then executes a query against the table.

DataFusion Console
$ CREATE EXTERNAL TABLE uk_cities (name VARCHAR(100) NOT NULL, lat DOUBLE NOT NULL, lng DOUBLE NOT NULL)
Executing query ...
Registered schema with execution context
Query executed in 0.000714958 seconds

$ SELECT lat, lng, name FROM uk_cities WHERE lat < 51 ORDER BY lat, lng
Executing query ...
50.376289,-4.143841,Plymouth, UK
50.614429,-2.457621,Weymouth, Dorset, UK
50.720806,-1.904755,Bournemouth, UK
50.768036,0.290472,Eastbourne, East Sussex, UK
50.825024,-0.383835,Worthing, West Sussex, UK
50.854259,0.573453,Hastings, East Sussex, UK
50.967941,0.085831,Uckfield, East Sussex, UK

Query executed in 0.008330428 seconds

Guides

Getting Started with Docker

License

Apache 2.0

Resources

Gitter Release Notes Roadmap Source Docker Repo