Skip to content

DuckDB API Server with Arrow Flight SQL Airport support and concurrent writes/reads (quackpipe)

License

Notifications You must be signed in to change notification settings

quackscience/quackpy

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

78 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

a pipe for quackheads

๐Ÿค quackpype

QuackPy is a serverless OLAP API built on top of DuckDB exposing HTTP/S and Arrow Flight SQL interfaces

Important

  • Arrow Flight API for modern data clients (DuckDB Airport)
  • Easy HTTP API with multiple formats (JSON,CSV,Parquet)
  • Unlocked Concurrent inserts and querying on DuckDB
  • Persistent storage using w/ multiuser authentication
  • Native access to any DuckDB Extension & Format
  • Embedded SQL Query Interface for instant usage

๐ŸŒฑ Get Started

Run using docker or build from source

docker pull ghcr.io/quackscience/quackpy:latest
docker run -ti --rm -p 8123:8123 -p 8815:8815 ghcr.io/quackscience/quackpy:latest

๐Ÿ‘‰ Usage

See the Examples directory for quick starters

Note

Quackpipe executes queries in :memory: unless authentication details are provided for data persistence

๐Ÿ•ธ๏ธ HTTP API

Execute DuckDB queries using the HTTP POST/GET API (compatible with the ClickHouse HTTP API)

curl -X POST "http://user:persistence@localhost:8123" \
   -H "Content-Type: application/json" \
   -d 'SELECT version()'  

โœˆ๏ธ FLIGHT API

Execute DuckDB queries using the experimental Flight GRPC API and Airport

Note

Quackpipe executes queries in :memory: unless an authorization header is provided for data persistence

๐ŸŽซ Pass Airport Security
CREATE SECRET airport_flight (
ยท       type airport,
โ€ฃ       auth_token 'user:password',
ยท       scope 'grpc://localhost:8815'
ยท );
๐ŸŽซ Take Airport Flights
D select flight_descriptor, endpoint from airport_list_flights('grpc://127.0.0.1:8815', null);
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚        flight_descriptor        โ”‚                                                    endpoint                                                    โ”‚
โ”‚ union(cmd blob, path varchar[]) โ”‚           struct(ticket blob, "location" varchar[], expiration_time timestamp, app_metadata blob)[]            โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ show_databases                  โ”‚ [{'ticket': SHOW DATABASES, 'location': [grpc://localhost:8815], 'expiration_time': NULL, 'app_metadata': }]   โ”‚
โ”‚ show_tables                     โ”‚ [{'ticket': SHOW TABLES, 'location': [grpc://localhost:8815], 'expiration_time': NULL, 'app_metadata': }]      โ”‚
โ”‚ show_version                    โ”‚ [{'ticket': SELECT version(), 'location': [grpc://localhost:8815], 'expiration_time': NULL, 'app_metadata': }] โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

D select * from airport_take_flight('grpc://localhost:8815/', ['show_version']);
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚ "version"() โ”‚
โ”‚   varchar   โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ v1.1.3      โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
๐ŸŽซ ATTACH Flights Tables
D --- Attach to Flight Server
D ATTACH 'deltalake' (TYPE AIRPORT, location 'grpc://localhost:8815/'); 

D --- Create Schema + Table
D CREATE SCHEMA deltalake.test1; 
D CREATE TABLE deltalake.test1.people (
     name VARCHAR,
     love_of_duckdb INT,
     tags VARCHAR[]
    );

D --- Insert into Flight Table
D INSERT INTO deltalake.test1.people values
  ('rusty', 5, ['airport', 'datasketches']);

D --- Select from Flight Table
D SELECT * FROM deltalake.test1.people;
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  name   โ”‚ love_of_duckdb โ”‚          tags           โ”‚
โ”‚ varchar โ”‚     int32      โ”‚        varchar[]        โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ rusty   โ”‚              5 โ”‚ [airport, datasketches] โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ดโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚ 1 row.                                   3 columns โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜

Flight Tables can be accessed via HTTP API using the schema name

USE test1; SELECT * FROM people;

image

๐ŸŽซ Take Custom Flights w/ Custom Headers + Ticket
D SELECT * FROM airport_take_flight('grpc://localhost:8815', 'SELECT 1', headers := MAP{'authorization':'user:password'} );
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚   1   โ”‚
โ”‚ int32 โ”‚
โ”œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”ค
โ”‚   1   โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
๐ŸŽซ Take Python Flights
from pyarrow.flight import FlightClient, Ticket, FlightCallOptions 
import json
import pandas
import tabulate

sql="""SELECT version()"""
  
flight_ticket = Ticket(sql)

token = (b"authorization", bytes(f"user:persistence".encode('utf-8')))
options = FlightCallOptions(headers=[token])
client = FlightClient(f"grpc://localhost:8815")

reader = client.do_get(flight_ticket, options)
arrow_table = reader.read_all()
# Use pyarrow and pandas to view and analyze data
data_frame = arrow_table.to_pandas()
print(data_frame.to_markdown())
|    | "version"()   |
|---:|:--------------|
|  0 | v1.1.3        |

๐Ÿ“บ SQL User-Interface

quackpy ships with the DuckDB SQL quack user-interface based on ch-ui

sequenceDiagram
    participant Client
    participant QuackPy
    participant DuckDB


    Client ->> QuackPy: ListFlights
    QuackPy ->> Client: Return Flights Table
    Client ->> QuackPy: GetFlightInfo
    QuackPy ->> DuckDB: DuckDB Execute
    DuckDB ->> QuackPy: Arrow Results Stream
    QuackPy ->> Client: FlightInfo(ticket)
    Client ->> QuackPy: do_get(ticket)
    QuackPy ->> Client: Stream of Results

Loading

๐Ÿƒ Disclaimers

Footnotes

  1. DuckDB ยฎ is a trademark of DuckDB Foundation. All rights reserved by their respective owners. 1 โ†ฉ

  2. ClickHouse ยฎ is a trademark of ClickHouse Inc. No direct affiliation or endorsement. 2 โ†ฉ

  3. Released under the MIT license. See LICENSE for details. All rights reserved by their respective owners. 3 โ†ฉ

  4. Flight implementation inspired by Duck Takes Flight 4 โ†ฉ