Apache Drill supports the ANSI standard of SQL to run queries on data sets. We can use SQL to run the query against data sources like HBase, Hive, distributed file system, local file system, RDBMS systems, non-relational databases, etc. Using the Drill special functions and operators we can drill down more on nested data.

Apache Drill supports the data definition statements to create the table, create the view, drop table, drop view, alter statement, describe table/views/database/workspace, and so on. In this section, we will go through the following Drill Data definition statements.


Drill CREATE TABLE AS (CTAS)

We can create a table in Drill using the create table as a statement. A table in Drill can be created in df.tmp workspace. It can't be created in HBase or Hive using the storage plugins and another thing is that the workspace should be set to writable using this parameter "writable": true.

Syntax:

CREATE TABLE name [ (column list) ] AS query;

We will create a table in Drill based on the JSON dataset named "emp_detail.json" that we already created in the previous section. Go to the drill home directory and start the Drill in embedded mode after that set the schema to dfs.tmp and run the CREATE TABLE statement.

Command:

cloudduggu@ubuntu:~/drill$ ./bin/drill-embedded
apache drill> use dfs.tmp;
apache drill (dfs.tmp)> CREATE TABLE emp_data as SELECT * FROM dfs.`/home/cloudduggu/drill/sample-data/emp_detail.json`;
apache drill (dfs.tmp)> SELECT * FROM emp_data;


Output:


drill create table as statement cloudduggu

drill table output cloudduggu


Drill CREATE TABLE PARTITION BY AS (CTAS)

We can create a Drill table using the PARTITION BY clause that will provide the best performance when the data is huge and the query will be fired based on the partition column.

Syntax:

CREATE TABLE table_name [ (column, . . .) ]
[ PARTITION BY (column, . . .) ]
AS SELECT column_list FROM <source_name>;

Now we will create a table named "emp_data_part" and partition this table by the "gender" column.

Command:

apache drill (dfs.tmp)>CREATE TABLE emp_data_part PARTITION BY (gender) as SELECT * FROM dfs.`/home/cloudduggu/drill/
apache drill (dfs.tmp)>SELECT * FROM emp_data_part;

Output:

We can see the output of the query based on the PARTITION column gender.

drill create table partition by cloudduggu


Drill CREATE VIEW

A View is a logical representation of data and in Drill, we can create views combining multiple data sets that represent data from a single data source.

Syntax:

CREATE [OR REPLACE] VIEW [workspace.]view_name [ (column_name [, ...]) ] AS query;

We will create a view named "emp_data_view" that will be based on the emp_detail.json data file.

Command:

apache drill> use dfs.tmp;
apache drill (dfs.tmp)> CREATE VIEW emp_data_view  as SELECT * FROM dfs.`/home/cloudduggu/drill/sample-data/emp_detail.json`;
apache drill (dfs.tmp)> SELECT * FROM emp_data_view;

Output:

drill create view cloudduggu


Drill DROP TABLE

Drill DROP table is used to drop table from dfs storage plugin.

Syntax:

DROP TABLE [IF EXISTS] [workspace.]name;

Let's drop the table "emp_data" that we have created in CREATE table section.

Command:

apache drill (dfs.tmp)> DROP TABLE emp_data;

Output:

drill drop table cloudduggu


Drill DROP VIEW

Drill DROP view is used to drop view from Drill.

Syntax:

DROP VIEW [IF EXISTS] [workspace.]view_name;

We will use the DROP VIEW command and drop the already created view name "emp_data_view".

Command:

apache drill (dfs.tmp)> DROP VIEW emp_data_view;

Output:

drill drop view cloudduggu