Deprecated: Function get_magic_quotes_gpc() is deprecated in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 99

Deprecated: The each() function is deprecated. This message will be suppressed on further calls in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 619

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1169

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176

Warning: Cannot modify header information - headers already sent by (output started at /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php:99) in /hermes/walnacweb04/walnacweb04ab/b2791/pow.jasaeld/htdocs/De1337/nothing/index.php on line 1176
8000 GitHub - gggion/ob-duckdb: Org Babel integration with DuckDB.
Nothing Special   »   [go: up one dir, main page]

Skip to content

gggion/ob-duckdb

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

38 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ob-duckdb

ob-duckdb brings DuckDB, an in-process analytical database, to Org mode with powerful synchronous and asynchronous execution capabilities.

This package enables you to:

  • Execute DuckDB SQL queries directly in Org mode source blocks
  • Run complex analytical queries asynchronously without blocking Emacs
  • View results as tables, CSV, JSON, and other formats
  • Maintain persistent sessions for interactive data analysis
  • Connect to database files or use in-memory databases
  • Use Org variables in your queries
  • Execute system shell commands within DuckDB sessions
  • Track execution history and navigate between related blocks

Contents

Installation

Prerequisites

  • Emacs 28.1 or later
  • Org mode + Org Babel
  • DuckDB CLI must be installed and available in your PATH

Manual Installation

Clone the repository and add to your load path:

(add-to-list 'load-path "/path/to/ob-duckdb")
(require 'ob-duckdb)
;; then add the babel language etc

With straight

(use-package ob-duckdb
  :straight (:host github :repo "gggion/ob-duckdb")
  :after org
  :config
  (org-babel-do-load-languages
   'org-babel-load-languages
   (append org-babel-load-languages '((duckdb . t)))))

Doom Emacs

;; packages.el
(package! ob-duckdb
  :recipe (:host github
           :repo "gggion/ob-duckdb"
           :files ("*.el")))

;; config.el
(use-package! ob-duckdb)
(after! org
  (org-babel-do-load-languages 'org-babel-load-languages
   (append org-babel-load-languages '((duckdb . t)))))

MELPA

Not on MELPA yet.

Features

  • Execute SQL queries directly in Org mode
  • Asynchronous execution for long-running queries
  • Persistent sessions for maintaining database state across block executions
  • Execution history for tracking analysis workflows
  • Connect to database files or use in-memory database
  • Variable substitution from Org mode elements
  • Multiple output formats (table, CSV, JSON, markdown, etc.)
  • Full support for DuckDB dot commands
  • Shell command execution within DuckDB sessions
  • ANSI color support for colorized output (mostly for exception output or duckbox mode output)
  • Customizable display options for query results
  • Block tracking system for reliable async results

USAGE

Basic Usage

Basic example:

#+begin_src duckdb

SELECT 1 AS test;

RESULTS:

┌───────┐
│ test  │
│ int32 │
├───────┤
│     1 │
└───────┘

Connect to a database file:

#+begin_src duckdb :db test_database.duckdb

CREATE TABLE IF NOT EXISTS test_table (
  id INTEGER,
  name VARCHAR,
  created_at TIMESTAMP
);

INSERT INTO test_table VALUES
  (1, 'First entry', CURRENT_TIMESTAMP),
  (2, 'Second entry', CURRENT_TIMESTAMP);

SELECT * FROM test_table;

RESULTS:

┌───────┬──────────────┬─────────────────────────┐
│  id   │     name     │       created_at        │
│ int32 │   varchar    │        timestamp        │
├───────┼──────────────┼─────────────────────────┤
│     1 │ First entry  │ 2025-04-05 16:58:59.59  │
│     2 │ Second entry │ 2025-04-05 16:58:59.59  │
└───────┴──────────────┴─────────────────────────┘

Header Arguments

ob-duckdb supports these header arguments, including dot commands from the CLI:

ArgumentDescriptionExampledot command
:dbDatabase file path (or :memory: for in-memory):db test.duckdb
:sessionSession name for persistent connections:session my-session
:asyncRun query asynchronously (requires session):async yes
:timerShow execution time:timer on.timer on/off
:headersShow column headers:headers on.headers on/off
:nullvalueString to display for NULL values:nullvalue "N/A".nullvalue
:separatorColumn separator for output:separator "\".separator COL ?ROW?
:echoEcho commands in output:echo on.echo on/off
:bailExit on error:bail on.bail on/off
:outputOutput handling:output buffer
:prologueSQL to execute before the main body:prologue "CREATE..."
:epilogueSQL to execute after the main body:epilogue "DROP..."

Format Examples

Markdown format

#+begin_src duckdb :format markdown :headers on

SELECT
  date_part('year', d) AS year,
  date_part('month', d) AS month,
  date_part('day', d) AS day
FROM (VALUES
  ('2023-01-15'::DATE),
  ('2023-06-30'::DATE),
  ('2023-12-25'::DATE)
) AS dates(d);

RESULTS:

year | month | day |
2023 | 1 | 15 |
2023 | 6 | 30 |
2023 | 12 | 25 |

JSON format

#+begin_src duckdb :format json :wrap src json

SELECT
  json_object('id', id, 'name', name) AS json_data
FROM (VALUES
  (1, 'Alice'),
  (2, 'Bob'),
  (3, 'Charlie')
) AS t(id, name);

Results:

[{"json_data":{"id":1,"name":"Alice"}},
{"json_data":{"id":2,"name":"Bob"}},
{"json_data":{"id":3,"name":"Charlie"}}]

LaTeX

DuckDB supports LaTeX as an output format, which is pretty neat. If you have LaTeX installed and org-latex-preview enabled, you can render the table in your org buffer.

HEADER: :format latex~
HEADER: :timer on~
HEADER: :wrap src latex~
begin_src duckdb~
-- Test multiple header arguments together
SELECT
  row_number() OVER () AS id,
  (random() * 100)::INTEGER AS random_number,
  CASE WHEN random() > 0.5 THEN 'Group A' ELSE 'Group B' END AS category
FROM range(1, 11);

RESULTS:

\begin{tabular}{|rrl|}
\hline
id & random_number & category \\
\hline
1  & 64            & Group B  \\
2  & 18            & Group A  \\
3  & 63            & Group A  \\
4  & 45            & Group A  \\
5  & 31            & Group B  \\
6  & 90            & Group A  \\
7  & 31            & Group A  \\
8  & 5             & Group B  \\
9  & 12            & Group A  \\
10 & 55            & Group B  \\
\hline
\end{tabular}

Custom separator

#+begin_src duckdb :format csv :separator "@@@@" :headers on

SELECT
  'Column 1' AS first,
  'Column 2' AS second,
  'Column 3' AS third
UNION ALL
SELECT 'Data 1', 'Data 2', 'Data 3';

Results:

first@@@@second@@@@third
Column 1@@@@Column 2@@@@Column 3
Data 1@@@@Data 2@@@@Data 3

Custom NULL value display

#+begin_src duckdb :nullvalue "N/A" :headers on

SELECT
  1 AS id,
  NULL AS missing_value,
  'present' AS existing_value
UNION ALL
SELECT 2, 'found', NULL;

Results:

┌───────┬───────────────┬────────────────┐
│  id   │ missing_value │ existing_value │
│ int32 │    varchar    │    varchar     │
├───────┼───────────────┼────────────────┤
│     1 │ N/A           │ present        │
│     2 │ found         │ N/A            │
└───────┴───────────────┴────────────────┘

Asynchronous Execution

For long-running queries, you can use asynchronous execution to avoid blocking Emacs:

#+begin_src duckdb :async yes :session test :var parquet_url='https://github.com/apache/parquet-mr/raw/master/parquet-hadoop/src/test/resources/test-file-with-no-column-indexes-1.parquet'

SELECT * FROM read_parquet('parquet_url') WHERE id < 10

Initially shows:

Executing asynchronously...

And when complete, displays the results:

┌───────┬─────────┬──────────────────────┬─────────────────────────────────────┐
│  id   │  name   │       location       │            phoneNumbers             │
│ int64 │ varchar │ struct(lon double,…  │ struct(phone struct(number bigint…  │
├───────┼─────────┼──────────────────────┼─────────────────────────────────────┤
│     0 │ p0      │ NULL                 │ {'phone': [{'number': 0, 'kind': …  │
│     1 │ p1      │ {'lon': 1.0, 'lat'…  │ {'phone': [{'number': 1, 'kind': …  │
│     2 │ p2      │ {'lon': 2.0, 'lat'…  │ {'phone': [{'number': 2, 'kind': …  │
│     3 │ p3      │ NULL                 │ {'phone': [{'number': 3, 'kind': …  │
│     4 │ p4      │ {'lon': 4.0, 'lat'…  │ {'phone': [{'number': 4, 'kind': …  │
│     5 │ p5      │ {'lon': 5.0, 'lat'…  │ {'phone': [{'number': 5, 'kind': …  │
│     6 │ p6      │ NULL                 │ {'phone': [{'number': 6, 'kind': …  │
│     7 │ p7      │ {'lon': 7.0, 'lat'…  │ {'phone': [{'number': 7, 'kind': …  │
│     8 │ p8      │ {'lon': 8.0, 'lat'…  │ {'phone': [{'number': 8, 'kind': …  │
│     9 │ p9      │ NULL                 │ {'phone': [{'number': 9, 'kind': …  │
├───────┴─────────┴──────────────────────┴─────────────────────────────────────┤
│ 10 rows                                                            4 columns │
└──────────────────────────────────────────────────────────────────────────────┘

Requirements for async execution:

  • Must use a session (`:session some-name`)
  • Must include `:async yes` in header arguments

Async execution is useful for:

  • Executing queries that depend on reading files through the internet (eg aws s3 or some other file from an url)
  • Very large queries that would otherwise lock up Emacs
  • Complicated analyses that take more than a few seconds to run
  • Maintaining productivity while waiting for results

Dot Commands

DuckDB’s dot commands are fully supported inside the src block, you can see all of them by doing .help -all.

#+begin_src duckdb

.print "DuckDB dot commands can be used inside the org block:"
.help

RESULTS:

DuckDB dot commands can be used inside the org block:

.bail on|off             Stop after hitting an error.  Default OFF
.binary on|off           Turn binary output on or off.  Default OFF
.cd DIRECTORY            Change the working directory to DIRECTORY
.changes on|off          Show number of rows changed by SQL
.check GLOB              Fail if output since .testcase does not match
.columns                 Column-wise rendering of query results
...

Some examples of dot command usage

.print command

#+begin_src duckdb

.print "IM SCREAMING AAAAAAAAA"

RESULTS:

IM SCREAMING AAAAAAAAA

Using .shell for system commands

The .shell dot command allows executing shell commands within DuckDB:

#+begin_src duckdb :results output :wrap example

-- moving to a dir
.cd /tmp/dumps/new
-- List files in current directory
.shell ls -la
-- Show current date and time
.shell date
-- Run a simple echo command
.print \n
.shell echo "Im screaming from the shell AAAAAAAAAAAAAAAAAA"

RESULTS:

total 8
drwx------ 2 demo demo 4096 Apr  2 19:34 .
drwxr-x--T 6 demo demo 4096 Apr  2 23:12 ..
Sat Apr  5 11:05:59 PM -04 2025


Im screaming from the shell AAAAAAAAAAAAAAAAAA

More complex shell integration example:

#+begin_src duckdb

-- First create a temp table
CREATE TEMPORARY TABLE sample_data AS
  SELECT * FROM range(1, 6) AS r(num);

-- Run a query
SELECT * FROM sample_data;

-- Use shell to create a directory for outputs if it doesn't exist
.shell mkdir -p duckdb_outputs

-- Export query results to a CSV file using shell command
.mode csv
.once duckdb_outputs/sample_data.csv
SELECT * FROM sample_data;

-- Verify the file was created
.shell ls -l duckdb_outputs/

-- Show file contents
.shell cat duckdb_outputs/sample_data.csv

Results:

┌───────┐
│  num  │
│ int64 │
├───────┤
│     1 │
│     2 │
│     3 │
│     4 │
│     5 │

103CE
└───────┘
total 4
-rw-r--r-- 1 demo demo 20 Apr  5 18:21 sample_data.csv
num
1
2
3
4
5

Output Formats

DuckDB supports various output formats through the .mode command, which can be set with the :format header argument.

Available formats:

asciiColumns/rows delimited by 0x1F and 0x1E
boxTables using unicode box-drawing characters
csvComma-separated values
columnOutput in columns. (See .width)
duckboxTables with extensive features
htmlHTML <table> code
insertSQL insert statements for TABLE
jsonResults in a JSON array
jsonlinesResults in a NDJSON
latexLaTeX tabular environment code
lineOne value per line
listValues delimited by “|”
markdownMarkdown table format
quoteEscape answers as for SQL
tableSame style as org tables
tabsTab-separated values
tclTCL list elements
trashNo output
org-tableSpecial format that produces Org tables

Displaying output in a dedicated buffer:

:output buffer header argument will display results in a separate buffer, useful for large results that would otherwise clutter your Org document. It opens a buffer named *DuckDB-output* with the query results.

#+begin_src duckdb :output buffer

  -- Output goes to a dedicated buffer
.mode box
SELECT
  'Row 1' AS description,
  1 AS value,
  CAST('2023-01-01' AS DATE) AS date
UNION ALL
SELECT
  'Row 2',
  2,
  CAST('2023-02-15' AS DATE);

RESULTS:

Output sent to buffer.

The buffer content would look like:

┌─────────────┬───────┬────────────┐
│ description │ value │    date    │
│   varchar   │ int32 │    date    │
├─────────────┼───────┼────────────┤
│ Row 1       │     1 │ 2023-01-01 │
│ Row 2       │     2 │ 2023-02-15 │
└─────────────┴───────┴────────────┘

Using org-table mode

Use the special org-table format to get results in an org-compatible table:

#+begin_src duckdb

.mode org-table
SELECT * FROM range(1, 5) as t(num);

Results:

| num |
|-----|
| 1   |
| 2   |
| 3   |
| 4   |

Other Output Examples:

CSV Format

#+begin_src duckdb :format csv

SELECT * FROM generate_series(1, 5) AS s(num) ;

Results:

num
1
2
3
4
5

JSON Format

#+begin_src duckdb :format json :wrap src json

SELECT
  json_object('id', id, 'name', name) AS json_data
FROM (VALUES
  (1, 'Alice'),
  (2, 'Bob'),
  (3, 'Charlie')
) AS t(id, name);

Results:

[{"json_data":{"id":1,"name":"Alice"}},
{"json_data":{"id":2,"name":"Bob"}},
{"json_data":{"id":3,"name":"Charlie"}}]

Markdown Format

#+begin_src duckdb :format markdown :headers on

SELECT
  date_part('year', d) AS year,
  date_part('month', d) AS month,
  date_part('day', d) AS day
FROM (VALUES
  ('2023-01-15'::DATE),
  ('2023-06-30'::DATE),
  ('2023-12-25'::DATE)
) AS dates(d);

RESULTS:

| year | month | day |
| 2023 |     1 |  15 |
| 2023 |     6 |  30 |
| 2023 |    12 |  25 |

Custom Separator

#+begin_src duckdb :format csv :separator "@@@@" :headers on :wrap example

SELECT
  'Column 1' AS first,
  'Column 2' AS second,
  'Column 3' AS third
UNION ALL
SELECT 'Data 1', 'Data 2', 'Data 3';

RESULTS:

first@@@@second@@@@third
Column 1@@@@Column 2@@@@Column 3
Data 1@@@@Data 2@@@@Data 3

Line Format with Custom NULL Values

#+begin_src duckdb :format line :nullvalue "N/A" :headers on :wrap example

SELECT
  1 AS id,
  NULL AS missing_value,
  'present' AS existing_value
UNION ALL
SELECT 2, 'found', NULL;

RESULTS:

id = 1
 missing_value = N/A
existing_value = present

            id = 2
 missing_value = found
existing_value = N/A

Database Connection

The :db header allows us to use a database file, it’s the equivalent of executing the command duckdb <db>, <db> being the path to the db file. In order to find the db file within your folder structure you can either put the whole path in the :db parameter or you can use :dir to first navigate to the folder where your db file lives. There’s also the duckdb-cli dot command .cd, which does the same thing.

NOTE: if the db file is not found, it will be created.

NOTE: duckdb can read sqlite .db files directly, meaning you can give the path to a sqlite db file and it’ll work.

#+begin_src duckdb :db test_database.duckdb

CREATE TABLE IF NOT EXISTS test_table (
  id INTEGER,
  name VARCHAR,
  created_at TIMESTAMP
);

INSERT INTO test_table VALUES
  (1, 'First entry', CURRENT_TIMESTAMP),
  (2, 'Second entry', CURRENT_TIMESTAMP);

SELECT * FROM test_table;

RESULTS:

┌───────┬──────────────┬─────────────────────────┐
│  id   │     name     │       created_at        │
│ int32 │   varchar    │        timestamp        │
├───────┼──────────────┼─────────────────────────┤
│     1 │ First entry  │ 2025-04-05 16:58:59.59  │
│     2 │ Second entry │ 2025-04-05 16:58:59.59  │
│     1 │ First entry  │ 2025-04-05 16:59:31.385 │
│     2 │ Second entry │ 2025-04-05 16:59:31.385 │
└───────┴──────────────┴─────────────────────────┘

More complex example:

#+begin_src duckdb :db test_database.duckdb

-- Create a more complex schema
CREATE TABLE IF NOT EXISTS users (
  user_id INTEGER PRIMARY KEY,
  username VARCHAR NOT NULL UNIQUE,
  email VARCHAR,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TABLE IF NOT EXISTS posts (
  post_id INTEGER PRIMARY KEY,
  user_id INTEGER,
  title VARCHAR NOT NULL,
  content TEXT,
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (user_id) REFERENCES users(user_id)
);

-- Add some test data
INSERT INTO users (user_id, username, email) VALUES
  (1, 'alice', 'alice@example.com'),
  (2, 'bob', 'bob@example.com'),
  (3, 'charlie', 'charlie@example.com');

INSERT INTO posts (post_id, user_id, title, content) VALUES
  (1, 1, 'Alice First Post', 'Hello from Alice!'),
  (2, 2, 'Bob Introduction', 'Hi, I am Bob.'),
  (3, 1, 'Alice Again', 'Second post from Alice');

-- Run a join query
SELECT
  u.username,
  p.title,
  p.created_at
FROM posts p
JOIN users u ON p.user_id = u.user_id
ORDER BY p.created_at DESC;

RESULTS:

┌──────────┬──────────────────┬─────────────────────────┐
│ username │      title       │       created_at        │
│ varchar  │     varchar      │        timestamp        │
├──────────┼──────────────────┼─────────────────────────┤
│ alice    │ Alice First Post │ 2025-04-05 16:59:10.818 │
│ bob      │ Bob Introduction │ 2025-04-05 16:59:10.818 │
│ alice    │ Alice Again      │ 2025-04-05 16:59:10.818 │
└──────────┴──────────────────┴─────────────────────────┘

Sessions

Sessions allow maintaining state between source blocks, useful for incremental analysis and multi-step workflows.

#+begin_src duckdb :session my-session-test :results output

-- First command in the session
CREATE TEMPORARY TABLE session_test (id INTEGER, value VARCHAR);
INSERT INTO session_test VALUES (1, 'First value');
select * from session_test;

RESULTS:

┌───────┬─────────────┐
│  id   │    value    │
│ int32 │   varchar   │
├───────┼─────────────┤
│   1   │ First value │
└───────┴─────────────┘

#+begin_src duckdb :session my-session-test :results output

-- Second command uses the same session and can access previous data
INSERT INTO session_test VALUES (2, 'Second value');
SELECT * FROM session_test ORDER BY id;

RESULTS:

┌───────┬──────────────┐
│  id   │    value     │
│ int32 │   varchar    │
├───────┼──────────────┤
│     1 │ First value  │
│     2 │ Second value │
└───────┴──────────────┘

Sessions can also be connected to database files:

#+begin_src duckdb :session db-session :db test_database.duckdb

-- This session connects to a specific database file
SELECT 'New session with database file' AS message;

-- Access tables from the database
SELECT COUNT(*) AS user_count FROM users;

NOTE: The :db argument is the equivalent of doing duckdb some_database, which means that DuckDB won’t allow multiple processes connected to the same db file (see Concurrency). This isn’t an issue on non-session source blocks since we start and kill the duckdb process. But in the case of sessions, it wont be possible to use the same db as usual if it’s being used in another session.

Session Management Functions

These interactive functions help manage DuckDB sessions:

FunctionDescription
org-babel-duckdb-create-sessionCreates a new DuckDB session with optional database file connection. Prompts for session name and database path interactively.
org-babel-duckdb-delete-sessionTerminates a DuckDB session, kills its buffer, and removes it from the session registry. Uses completion to select the session to delete.
org-babel-duckdb-display-sessionsShows information about all active sessions in a formatted help buffer, including session names, database connections, and status.
org-babel-duckdb-cleanup-sessionsRemoves dead sessions from the registry whose processes or buffers no longer exist. Helps reduce clutter.

DuckDB also offers some methods to avoid this deadlock, which can be read about here: Concurrency.

Variable Substitution

ob-duckdb supports using variables from the source block headers. Variable substitution applies to strings, dollar sign variables ($var) and org tables.

Basic variable substitution

#+begin_src duckdb :var min_value=5 max_value=10

-- variable substitution
SELECT *
FROM generate_series(min_value, max_value) AS s(value)
WHERE value BETWEEN min_value AND max_value;

Results:

┌───────┐
│ value │
│ int64 │
├───────┤
│     5 │
│     6 │
│     7 │
│     8 │
│     9 │
│    10 │
└───────┘

String variables

#+begin_src duckdb :var name_variable="'test_name'" table_variable="'test_table'"

-- String variable substitution
SELECT 'Hello, ' || name AS greeting;

-- Table name variable substitution
SELECT 'Table name is: ' || $table_variable AS info;

RESULTS:

┌───────────────────────────┐
│           info            │
│          varchar          │
├───────────────────────────┤
│ Table name is: test_table │
└───────────────────────────┘

This can quickly get confusing since at the moment the variable substitution is absolute, meaning if our variable name is table, it’ll replace all instances of table within the block to our value, and this applies to keywords too, meaning this:

#+begin_src duckdb :var select='test_table'

select 'select is being replaced here' from select;

Will result in this:

Parser Error: syntax error at or near "test_table"

LINE 1: test_table 'test_table is being replaced here' from test_table...
        ^
[ Babel evaluation exited with code 1 ]

I’m still unsure if to simply allow for the default behavior or try to implement some rules to the substitution.

Table lookup syntax

A more organized approach to variable substitution, allowing tracking of multiple variables by reusing a whole table in multiple blocks by simply using its name in the variable header.

#+name: lookup_table

keyvalue
a‘apple’
b‘banana’
c‘cranberry’
message‘IM SCREAMING AAAAAAA’

#+begin_src duckdb :var data=lookup_table :headers on

-- Test table lookup with the varname[key] syntax
SELECT
  'a key refers to ' || data[a] AS a_lookup,
  'b key refers to ' || data[b] AS b_lookup,
  'c key refers to ' || data[c] AS c_lookup;

Results:

┌───────────────────────┬────────────────────────┬───────────────────────────┐
│       a_lookup        │        b_lookup        │         c_lookup          │
│        varchar        │        varchar         │          varchar          │
├───────────────────────┼────────────────────────┼───────────────────────────┤
│ a key refers to apple │ b key refers to banana │ c key refers to cranberry │
└───────────────────────┴────────────────────────┴───────────────────────────┘

#+begin_src duckdb :var data=lookup_table :wrap example :format line

-- Test table lookup with the varname[key] syntax
SELECT data[message] as IMPORTANT_MESSAGE;

RESULTS:

IMPORTANT_MESSAGE = IM SCREAMING AAAAAAA

Advanced substitution examples

Queries as variables

#+name: var_table_name

keyvalue
query1select [ 1,2,3,4,5,6 ]
query2select unnest(#1) from
table_namesome_table
+HEADER: :var var_table=var_table_name
+begin_src duckdb
.echo on
var_table[query2]( var_table[query1] as var_table[table_name] );

RESULTS:

select unnest(#1) from( select [ 1,2,3,4,5,6 ] as some_table );
┌────────────┐
│ unnest(#1) │
│   int32    │
├────────────┤
│          1 │
│          2 │
│          3 │
│          4 │
│          5 │
│          6 │
└────────────┘
.exit

Substitution applying to variables based on header order

+HEADER: :var commands="output_format"
+HEADER: :var output_format="dot_command1\ndot_command2\ndot_command3"
+HEADER: :var dot_command1=".mode csv"
+HEADER: :var dot_command2=".timer on"
+HEADER: :var dot_command3=".echo on"
+HEADER: :var query="SELECT '123123123' as some_column;"
+NAME: duckdb-execute-query
+begin_src duckdb
commands
query
+end_src

RESULTS:

SELECT '123123123' as some_column;
some_column
123123123
Run Time (s): real 0.000 user 0.000172 sys 0.000115

Execution History

The block tracking system maintains a history of all DuckDB block executions, allowing you to revisit previous analyses and track changes over time.

After executing a block:

SELECT 1 AS test;

It gets assigned block and execution IDs:

#+PROPERTY: EXEC_ID 91aac778-0bc2-4686-9a5e-4ef002266d40
#+PROPERTY: ID 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
  SELECT 1 AS test;

Execution History Commands

These commands help navigate and explore your DuckDB execution history:

CommandDescription
`org-duckdb-blocks-list`Show all tracked blocks and their execution history
`org-duckdb-blocks-recent`Show recent executions chronologically
`org-duckdb-blocks-navigate-recent`Navigate to recent executions with completion
`org-duckdb-blocks-goto-block`Jump to a specific block by ID
`org-duckdb-blocks-goto-execution`Jump to a specific execution by ID
`org-duckdb-blocks-execution-info`Display detailed information about a specific execution

Example of execution history

When you execute `org-duckdb-blocks-list`, you’ll see a display like:

DuckDB Blocks and Executions
==========================

Block ID: 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
  File: /path/to/file.org
  Buffer: file.org
  Source block: 796-846
  Executions (2 total):
    35c31cfa-c974-47d0-bc02-6afcddca0a55 (2025-04-15 12:09:20.768) :headers on :results drawer :format table
    91aac778-0bc2-4686-9a5e-4ef002266d40 (2025-04-15 12:04:10.041)

For a specific execution (`org-duckdb-blocks-execution-info`):

DuckDB Execution: 35c31cfa-c974-47d0-bc02-6afcddca0a55
==========================

Block ID: 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
File: /path/to/file.org
Buffer: file.org

Execution Time: 2025-04-15 12:09:20.768

=== SOURCE BLOCK STATE ===
Parameters: :headers on :results drawer :format table
Line Count: 4

=== CONTENT ===
  SELECT 1 AS test;
  SELECT 0 AS test;
  SELECT 3 AS test;
  SELECT 2 AS test;

This history functionality is particularly useful for:

  • Tracking the evolution of your analyses
  • Reproducing previous results
  • Understanding changes in queries over time
  • Navigating complex documents with many DuckDB blocks

More Examples

Data Types

DuckDB supports a variety of data types, including complex ones:

#+begin_src duckdb :headers on :session data_types_example :format duckbox :wrap example

  -- Test various DuckDB data types
  CREATE OR REPLACE TABLE data_types_table AS
  SELECT
    42::TINYINT AS tiny_int,
    42::SMALLINT AS small_int,
    42::INTEGER AS int,
    42::BIGINT AS big_int,
    42.5::FLOAT AS float_val,
    42.5::DOUBLE AS double_val,
    'hello'::VARCHAR AS str,
    TRUE::BOOLEAN AS bool_val,
    '2023-01-15'::DATE AS date_val,
    '12:34:56'::TIME AS time_val,
    '2023-01-15 12:34:56'::TIMESTAMP AS timestamp_val,
    ARRAY[1, 2, 3] AS array_val,
    STRUCT_PACK(x := 1, y := 'hello') AS struct_val,
    MAP([1, 2], ['one', 'two']) AS map_val;

SELECT * from data_types_table;

RESULTS:

┌──────────┬───────────┬───┬──────────────────────┬──────────────────────┐
│ tiny_int │ small_int │ … │      struct_val      │       map_val        │
│   int8   │   int16   │   │ struct(x integer, …  │ map(integer, varch…  │
├──────────┼───────────┼───┼──────────────────────┼──────────────────────┤
│    42    │    42     │ … │ {'x': 1, 'y': hello} │ {1=one, 2=two}       │
├──────────┴───────────┴───┴──────────────────────┴──────────────────────┤
│ 1 rows                                            14 columns (4 shown) │
└────────────────────────────────────────────────────────────────────────┘

#+begin_src duckdb :headers on :results drawer :session data_types_example :format box :wrap example

DESCRIBE data_types_table;

RESULTS:

┌───────────────┬──────────────────────────────┬──────┬──────┬─────────┬───────┐
│  column_name  │         column_type          │ null │ key  │ default │ extra │
├───────────────┼──────────────────────────────┼──────┼──────┼─────────┼───────┤
│ tiny_int      │ TINYINT                      │ YES  │ NULL │ NULL    │ NULL  │
│ small_int     │ SMALLINT                     │ YES  │ NULL │ NULL    │ NULL  │
│ int           │ INTEGER                      │ YES  │ NULL │ NULL    │ NULL  │
│ big_int       │ BIGINT                       │ YES  │ NULL │ NULL    │ NULL  │
│ float_val     │ FLOAT                        │ YES  │ NULL │ NULL    │ NULL  │
│ double_val    │ DOUBLE                       │ YES  │ NULL │ NULL    │ NULL  │
│ str           │ VARCHAR                      │ YES  │ NULL │ NULL    │ NULL  │
│ bool_val      │ BOOLEAN                      │ YES  │ NULL │ NULL    │ NULL  │
│ date_val      │ DATE                         │ YES  │ NULL │ NULL    │ NULL  │
│ time_val      │ TIME                         │ YES  │ NULL │ NULL    │ NULL  │
│ timestamp_val │ TIMESTAMP                    │ YES  │ NULL │ NULL    │ NULL  │
│ array_val     │ INTEGER[]                    │ YES  │ NULL │ NULL    │ NULL  │
│ struct_val    │ STRUCT(x INTEGER, y VARCHAR) │ YES  │ NULL │ NULL    │ NULL  │
│ map_val       │ MAP(INTEGER, VARCHAR)        │ YES  │ NULL │ NULL    │ NULL  │
└───────────────┴──────────────────────────────┴──────┴──────┴─────────┴───────┘

#+begin_src duckdb :headers on :results drawer :session data_types_example :format box :wrap example

--table was too wide so we can transpose by doing this
UNPIVOT (SELECT CAST(COLUMNS(*) AS VARCHAR) FROM (
    SELECT * FROM data_types_table
)) AS t ON COLUMNS(*) INTO NAME col_name VALUE col_value;

RESULTS:

┌───────────────┬──────────────────────┐
│   col_name    │      col_value       │
├───────────────┼──────────────────────┤
│ tiny_int      │ 42                   │
│ small_int     │ 42                   │
│ int           │ 42                   │
│ big_int       │ 42                   │
│ float_val     │ 42.5                 │
│ double_val    │ 42.5                 │
│ str           │ hello                │
│ bool_val      │ true                 │
│ date_val      │ 2023-01-15           │
│ time_val      │ 12:34:56             │
│ timestamp_val │ 2023-01-15 12:34:56  │
│ array_val     │ [1, 2, 3]            │
│ struct_val    │ {'x': 1, 'y': hello} │
│ map_val       │ {1=one, 2=two}       │
└───────────────┴──────────────────────┘

Window Functions

#+begin_src duckdb :headers on

-- Window functions
WITH sales AS (
  SELECT * FROM (VALUES
    ('North', 100),
    ('North', 150),
    ('South', 120),
    ('South', 90),
    ('East', 110),
    ('East', 140),
    ('West', 95),
    ('West', 125)
  ) AS t(region, amount)
)

SELECT
  region,
  amount,
  SUM(amount) OVER (PARTITION BY region) AS region_total,
  AVG(amount) OVER (PARTITION BY region) AS region_avg,
  RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region,
  SUM(amount) OVER () AS grand_total
FROM sales
ORDER BY region, amount DESC;

Results:

┌─────────┬────────┬──────────────┬────────────┬────────────────┬─────────────┐
│ region  │ amount │ region_total │ region_avg │ rank_in_region │ grand_total │
│ varchar │ int32  │    int128    │   double   │     int64      │   int128    │
├─────────┼────────┼──────────────┼────────────┼────────────────┼─────────────┤
│ East    │    140 │          250 │      125.0 │              1 │         930 │
│ East    │    110 │          250 │      125.0 │              2 │         930 │
│ North   │    150 │          250 │      125.0 │              1 │         930 │
│ North   │    100 │          250 │      125.0 │              2 │         930 │
│ South   │    120 │          210 │      105.0 │              1 │         930 │
│ South   │     90 │          210 │      105.0 │              2 │         930 │
│ West    │    125 │          220 │      110.0 │              1 │         930 │
│ West    │     95 │          220 │      110.0 │              2 │         930 │
└─────────┴────────┴──────────────┴────────────┴────────────────┴─────────────┘

Recursive CTE for Hierarchical Data

#+begin_src duckdb :headers on :format table

-- Recursive CTE for hierarchical data
WITH RECURSIVE hierarchy AS (
  -- Base case: get root nodes (nodes with no parent)
  SELECT 1 AS id, 'Root A' AS name, 0 AS parent_id, 0 AS level
  UNION ALL
  SELECT 2, 'Root B', 0, 0
  UNION ALL
  SELECT 3, 'Child A.1', 1, 1
  UNION ALL
  SELECT 4, 'Child A.2', 1, 1
  UNION ALL
  SELECT 5, 'Child B.1', 2, 1
  UNION ALL
  SELECT 6, 'Grandchild A.1.1', 3, 2
)

SELECT
  id,
  CASE
    WHEN level = 0 THEN name
    ELSE repeat('  ', level) || '└─ ' || name
  END AS hierarchical_name,
  parent_id
FROM hierarchy
ORDER BY
  CASE WHEN parent_id = 0 THEN id ELSE parent_id END,
  level,
  id;

Results:

+----+-------------------------+-----------+
| id |    hierarchical_name    | parent_id |
+----+-------------------------+-----------+
| 1  | Root A                  | 0         |
| 3  |   └─ Child A.1          | 1         |
| 4  |   └─ Child A.2          | 1         |
| 2  | Root B                  | 0         |
| 5  |   └─ Child B.1          | 2         |
| 6  |     └─ Grandchild A.1.1 | 3         |
+----+-------------------------+-----------+

JSON Processing

#+begin_src duckdb :headers on :wrap example

-- Test DuckDB's JSON functionality
CREATE TABLE json_test AS
SELECT
  -- Parse JSON
  json_extract('{"name": "Alice", "age": 30}', '$.name') AS name,
  -- Create JSON
  json_object('id', 1, 'tags', json_array('red', 'blue')) AS created_json,
  -- Array access
  json_extract(json_array(1, 2, 3), '$[1]') AS second_element,
  -- Nested extraction
  json_extract(
    '{"user": {"details": {"address": {"city": "New York"}}}}',
    '$.user.details.address.city'
  ) AS nested_city;

SELECT * from json_test;
DESCRIBE SELECT * from json_test;

RESULTS:

┌─────────┬────────────────────────────────┬────────────────┬─────────────┐
│  name   │          created_json          │ second_element │ nested_city │
│  json   │              json              │      json      │    json     │
├─────────┼────────────────────────────────┼────────────────┼─────────────┤
│ "Alice" │ {"id":1,"tags":["red","blue"]} │ 2              │ "New York"  │
└─────────┴────────────────────────────────┴────────────────┴─────────────┘
┌────────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│  column_name   │ column_type │  null   │   key   │ default │  extra  │
│    varchar     │   varchar   │ varchar │ varchar │ varchar │ varchar │
├────────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ name           │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ created_json   │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ second_element │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ nested_city    │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
└────────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘

Using Prologue and Epilogue

The :prologue and :epilogue header arguments let you execute SQL before and after the main body:

+HEADER: :results drawer
+HEADER: :prologue "CREATE TEMP TABLE test_prologue(id INTEGER);\nINSERT INTO test_prologue VALUES (1), (2), (3);"
+HEADER: :epilogue "DROP TABLE test_prologue;"
+begin_src duckdb
-- The prologue creates a table and inserts data before this query runs
SELECT * FROM test_prologue;
-- The epilogue will clean up after

Results:

┌───────┐
│  id   │
│ int32 │
├───────┤
│     1 │
│     2 │
│     3 │
└───────┘

Special Characters and Quotes

+HEADER: :prologue UNPIVOT (SELECT CAST(COLUMNS(*) AS VARCHAR) FROM (
+HEADER: :epilogue )) AS t ON COLUMNS(*) INTO NAME col_name VALUE col_value;
+begin_src duckdb :wrap example
-- Test handling of quotes and special characters
SELECT
  'Single ''quoted'' text' AS single_quotes,
  'Text with "double quotes"' AS double_quotes,
  'Text with semicolons;' AS semicolons,
  'Line 1
   Line 2
   Line 3' AS multiline

RESULTS:

┌───────────────┬──────────────────────────────────┐
│   col_name    │            col_value             │
│    varchar    │             varchar              │
├───────────────┼──────────────────────────────────┤
│ single_quotes │ Single 'quoted' text             │
│ double_quotes │ Text with "double quotes"        │
│ semicolons    │ Text with semicolons;            │
│ multiline     │ Line 1\n     Line 2\n     Line 3 │
└───────────────┴──────────────────────────────────┘

Ideas

IN PROGRESS - using full org table as data source

Querying directly from an org table is not yet supported.

#+name: complex_data

firstlastagecity
JohnDoe35New York
JaneSmith28Los Angeles
BobBrown42Chicago

#+begin_src duckdb :var people=complex_data :headers on

-- Complex table manipulation
WITH people_data AS (
  SELECT first, last, age, city FROM people
)

SELECT
  first || ' ' || last AS full_name,
  CASE
    WHEN age < 30 THEN 'Young'
    WHEN age < 40 THEN 'Middle'
    ELSE 'Senior'
  END AS age_group,
  city
FROM people_data
ORDER BY age DESC;

variable types?

Looking into being able to set variable type in the header arguments, not sure if it would be useful though.

Troubleshooting

Common Issues

  1. DuckDB not found: Ensure DuckDB CLI is installed and in your PATH. You should be able to call duckdb from your shell by doing duckdb.
  2. Session not working: Make sure session names are consistent. You can use org-babel-duckdb-display-sessions to check running duckdb sessions.
  3. Asynchronous execution errors: Async execution requires a session. Ensure you’ve included both `:session name` and `:async yes` in your header arguments.
  4. Large query output issues: For very large query results, use `:output buffer` to display results in a separate buffer.
  5. Block tracking problems: If block tracking seems inconsistent, try using `org-duckdb-blocks-list` to view all tracked blocks and their current status.
  6. Installation: I’m on doom emacs so I’m not entirely sure if there will be issues installing on vanilla or other frameworks, let me know.

Contributing

Contributions, bug reports, and feature requests are more than welcome. This is the first time I’ve done a package so I’m expecting issues to arrive or there might be some obvious optimizations that I missed. I’ve tried to document the code as best as possible but feel free to:

  1. Open issues for bug reports or feature requests
  2. Submit pull requests with improvements
  3. Share examples and documentation
  4. Suggest changes to documentation in order to improve clarity

License

GPLv3


Note: This package is independently developed and not officially affiliated with DuckDB.

About

Org Babel integration with DuckDB.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published
0