ob-duckdb brings DuckDB, an in-process analytical database, to Org mode with powerful synchronous and asynchronous execution capabilities.
This package enables you to:
- Execute DuckDB SQL queries directly in Org mode source blocks
- Run complex analytical queries asynchronously without blocking Emacs
- View results as tables, CSV, JSON, and other formats
- Maintain persistent sessions for interactive data analysis
- Connect to database files or use in-memory databases
- Use Org variables in your queries
- Execute system shell commands within DuckDB sessions
- Track execution history and navigate between related blocks
- Emacs 28.1 or later
- Org mode + Org Babel
- DuckDB CLI must be installed and available in your PATH
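A quick way to confirm that Emacs can actually see the CLI, using only built-in functions (a minimal sketch; adapt as needed):
#+begin_src emacs-lisp
;; Should return the full path of the duckdb binary, or nil if it is missing.
(executable-find "duckdb")

;; Optionally warn at startup when the CLI cannot be found.
(unless (executable-find "duckdb")
  (warn "ob-duckdb: `duckdb' CLI not found in PATH"))
#+end_src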
Clone the repository and add to your load path:
(add-to-list 'load-path "/path/to/ob-duckdb")
(require 'ob-duckdb)
;; Register duckdb as an Org Babel language
(org-babel-do-load-languages
 'org-babel-load-languages
 (append org-babel-load-languages '((duckdb . t))))
(use-package ob-duckdb
:straight (:host github :repo "gggion/ob-duckdb")
:after org
:config
(org-babel-do-load-languages
'org-babel-load-languages
(append org-babel-load-languages '((duckdb . t)))))
;; packages.el
(package! ob-duckdb
:recipe (:host github
:repo "gggion/ob-duckdb"
:files ("*.el")))
;; config.el
(use-package! ob-duckdb)
(after! org
(org-babel-do-load-languages 'org-babel-load-languages
(append org-babel-load-languages '((duckdb . t)))))
Not on MELPA yet.
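Until it lands there, Emacs 29+ users can also pull it straight from the repository with the built-in package-vc machinery; a minimal sketch, assuming the default repository layout:
#+begin_src emacs-lisp
;; Emacs 29+ only: fetch and build ob-duckdb directly from GitHub.
(unless (package-installed-p 'ob-duckdb)
  (package-vc-install "https://github.com/gggion/ob-duckdb"))

;; Register the Babel language afterwards, as in the snippets above.
(org-babel-do-load-languages
 'org-babel-load-languages
 (append org-babel-load-languages '((duckdb . t))))
#+end_src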
- Execute SQL queries directly in Org mode
- Asynchronous execution for long-running queries
- Persistent sessions for maintaining database state across block executions
- Execution history for tracking analysis workflows
- Connect to database files or use in-memory database
- Variable substitution from Org mode elements
- Multiple output formats (table, CSV, JSON, markdown, etc.)
- Full support for DuckDB dot commands
- Shell command execution within DuckDB sessions
- ANSI color support for colorized output (mostly for exception output or duckbox mode output)
- Customizable display options for query results
- Block tracking system for reliable async results
Basic example:
#+begin_src duckdb
SELECT 1 AS test;
RESULTS:
┌───────┐
│ test  │
│ int32 │
├───────┤
│     1 │
└───────┘
Connect to a database file:
#+begin_src duckdb :db test_database.duckdb
CREATE TABLE IF NOT EXISTS test_table (
id INTEGER,
name VARCHAR,
created_at TIMESTAMP
);
INSERT INTO test_table VALUES
(1, 'First entry', CURRENT_TIMESTAMP),
(2, 'Second entry', CURRENT_TIMESTAMP);
SELECT * FROM test_table;
RESULTS:
┌───────┬──────────────┬─────────────────────────┐
│  id   │     name     │       created_at        │
│ int32 │   varchar    │        timestamp        │
├───────┼──────────────┼─────────────────────────┤
│     1 │ First entry  │ 2025-04-05 16:58:59.59  │
│     2 │ Second entry │ 2025-04-05 16:58:59.59  │
└───────┴──────────────┴─────────────────────────┘
ob-duckdb supports these header arguments, including dot commands from the CLI:
| Argument   | Description                                     | Example               | dot command          |
|------------+-------------------------------------------------+-----------------------+----------------------|
| :db        | Database file path (or :memory: for in-memory)  | :db test.duckdb       |                      |
| :session   | Session name for persistent connections         | :session my-session   |                      |
| :async     | Run query asynchronously (requires session)     | :async yes            |                      |
| :timer     | Show execution time                             | :timer on             | .timer on/off        |
| :headers   | Show column headers                             | :headers on           | .headers on/off      |
| :nullvalue | String to display for NULL values               | :nullvalue "N/A"      | .nullvalue           |
| :separator | Column separator for output                     | :separator "\"        | .separator COL ?ROW? |
| :echo      | Echo commands in output                         | :echo on              | .echo on/off         |
| :bail      | Exit on error                                   | :bail on              | .bail on/off         |
| :output    | Output handling                                 | :output buffer        |                      |
| :prologue  | SQL to execute before the main body             | :prologue "CREATE..." |                      |
| :epilogue  | SQL to execute after the main body              | :epilogue "DROP..."   |                      |
#+begin_src duckdb :format markdown :headers on
SELECT
date_part('year', d) AS year,
date_part('month', d) AS month,
date_part('day', d) AS day
FROM (VALUES
('2023-01-15'::DATE),
('2023-06-30'::DATE),
('2023-12-25'::DATE)
) AS dates(d);
RESULTS:
| year | month | day |
|------|-------|-----|
| 2023 |     1 |  15 |
| 2023 |     6 |  30 |
| 2023 |    12 |  25 |
#+begin_src duckdb :format json :wrap src json
SELECT
json_object('id', id, 'name', name) AS json_data
FROM (VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie')
) AS t(id, name);
Results:
[{"json_data":{"id":1,"name":"Alice"}},
{"json_data":{"id":2,"name":"Bob"}},
{"json_data":{"id":3,"name":"Charlie"}}]
DuckDB supports LaTeX as an output format, which is pretty neat. If you have LaTeX installed and org-latex-preview enabled, you can render the table in your org buffer.
#+HEADER: :format latex
#+HEADER: :timer on
#+HEADER: :wrap src latex
#+begin_src duckdb
-- Test multiple header arguments together
SELECT
row_number() OVER () AS id,
(random() * 100)::INTEGER AS random_number,
CASE WHEN random() > 0.5 THEN 'Group A' ELSE 'Group B' END AS category
FROM range(1, 11);
RESULTS:
\begin{tabular}{|rrl|}
\hline
id & random_number & category \\
\hline
1 & 64 & Group B \\
2 & 18 & Group A \\
3 & 63 & Group A \\
4 & 45 & Group A \\
5 & 31 & Group B \\
6 & 90 & Group A \\
7 & 31 & Group A \\
8 & 5 & Group B \\
9 & 12 & Group A \\
10 & 55 & Group B \\
\hline
\end{tabular}
#+begin_src duckdb :format csv :separator "@@@@" :headers on
SELECT
'Column 1' AS first,
'Column 2' AS second,
'Column 3' AS third
UNION ALL
SELECT 'Data 1', 'Data 2', 'Data 3';
Results:
first@@@@second@@@@third
Column 1@@@@Column 2@@@@Column 3
Data 1@@@@Data 2@@@@Data 3
#+begin_src duckdb :nullvalue "N/A" :headers on
SELECT
1 AS id,
NULL AS missing_value,
'present' AS existing_value
UNION ALL
SELECT 2, 'found', NULL;
Results:
┌───────┬───────────────┬────────────────┐
│  id   │ missing_value │ existing_value │
│ int32 │    varchar    │    varchar     │
├───────┼───────────────┼────────────────┤
│     1 │ N/A           │ present        │
│     2 │ found         │ N/A            │
└───────┴───────────────┴────────────────┘
For long-running queries, you can use asynchronous execution to avoid blocking Emacs:
#+begin_src duckdb :async yes :session test :var parquet_url='https://github.com/apache/parquet-mr/raw/master/parquet-hadoop/src/test/resources/test-file-with-no-column-indexes-1.parquet'
SELECT * FROM read_parquet('parquet_url') WHERE id < 10
Initially shows:
Executing asynchronously...
And when complete, displays the results:
┌───────┬─────────┬──────────────────────┬─────────────────────────────────────┐
│  id   │  name   │       location       │            phoneNumbers             │
│ int64 │ varchar │ struct(lon double,…  │ struct(phone struct(number bigint…  │
├───────┼─────────┼──────────────────────┼─────────────────────────────────────┤
│     0 │ p0      │ NULL                 │ {'phone': [{'number': 0, 'kind': …  │
│     1 │ p1      │ {'lon': 1.0, 'lat'…  │ {'phone': [{'number': 1, 'kind': …  │
│     2 │ p2      │ {'lon': 2.0, 'lat'…  │ {'phone': [{'number': 2, 'kind': …  │
│     3 │ p3      │ NULL                 │ {'phone': [{'number': 3, 'kind': …  │
│     4 │ p4      │ {'lon': 4.0, 'lat'…  │ {'phone': [{'number': 4, 'kind': …  │
│     5 │ p5      │ {'lon': 5.0, 'lat'…  │ {'phone': [{'number': 5, 'kind': …  │
│     6 │ p6      │ NULL                 │ {'phone': [{'number': 6, 'kind': …  │
│     7 │ p7      │ {'lon': 7.0, 'lat'…  │ {'phone': [{'number': 7, 'kind': …  │
│     8 │ p8      │ {'lon': 8.0, 'lat'…  │ {'phone': [{'number': 8, 'kind': …  │
│     9 │ p9      │ NULL                 │ {'phone': [{'number': 9, 'kind': …  │
├───────┴─────────┴──────────────────────┴─────────────────────────────────────┤
│ 10 rows                                                            4 columns │
└──────────────────────────────────────────────────────────────────────────────┘
Requirements for async execution:
- Must use a session (`:session some-name`)
- Must include `:async yes` in header arguments
Async execution is useful for:
- Queries that read files over the network (e.g. from AWS S3 or some other URL)
- Very large queries that would otherwise lock up Emacs
- Complicated analyses that take more than a few seconds to run
- Maintaining productivity while waiting for results
DuckDB’s dot commands are fully supported inside the source block; you can list all of them by running .help -all.
#+begin_src duckdb
.print "DuckDB dot commands can be used inside the org block:"
.help
RESULTS:
DuckDB dot commands can be used inside the org block:
.bail on|off             Stop after hitting an error.  Default OFF
.binary on|off           Turn binary output on or off.  Default OFF
.cd DIRECTORY            Change the working directory to DIRECTORY
.changes on|off          Show number of rows changed by SQL
.check GLOB              Fail if output since .testcase does not match
.columns                 Column-wise rendering of query results
...
#+begin_src duckdb
.print "IM SCREAMING AAAAAAAAA"
RESULTS:
IM SCREAMING AAAAAAAAA
The .shell dot command allows executing shell commands within DuckDB:
#+begin_src duckdb :results output :wrap example
-- moving to a dir
.cd /tmp/dumps/new
-- List files in current directory
.shell ls -la
-- Show current date and time
.shell date
-- Run a simple echo command
.print \n
.shell echo "Im screaming from the shell AAAAAAAAAAAAAAAAAA"
RESULTS:
total 8
drwx------ 2 demo demo 4096 Apr  2 19:34 .
drwxr-x--T 6 demo demo 4096 Apr  2 23:12 ..
Sat Apr  5 11:05:59 PM -04 2025
Im screaming from the shell AAAAAAAAAAAAAAAAAA
#+begin_src duckdb
-- First create a temp table
CREATE TEMPORARY TABLE sample_data AS
SELECT * FROM range(1, 6) AS r(num);
-- Run a query
SELECT * FROM sample_data;
-- Use shell to create a directory for outputs if it doesn't exist
.shell mkdir -p duckdb_outputs
-- Export query results to a CSV file using shell command
.mode csv
.once duckdb_outputs/sample_data.csv
SELECT * FROM sample_data;
-- Verify the file was created
.shell ls -l duckdb_outputs/
-- Show file contents
.shell cat duckdb_outputs/sample_data.csv
Results:
┌───────┐
│  num  │
│ int64 │
├───────┤
│     1 │
│     2 │
│     3 │
│     4 │
│     5 │
└───────┘
total 4
-rw-r--r-- 1 demo demo 20 Apr  5 18:21 sample_data.csv
num
1
2
3
4
5
DuckDB supports various output formats through the .mode command, which can be set with the :format header argument.
Available formats:
| ascii     | Columns/rows delimited by 0x1F and 0x1E     |
| box       | Tables using unicode box-drawing characters |
| csv       | Comma-separated values                      |
| column    | Output in columns (see .width)              |
| duckbox   | Tables with extensive features              |
| html      | HTML <table> code                           |
| insert    | SQL insert statements for TABLE             |
| json      | Results in a JSON array                     |
| jsonlines | Results in NDJSON (newline-delimited JSON)  |
| latex     | LaTeX tabular environment code              |
| line      | One value per line                          |
| list      | Values delimited by a vertical bar          |
| markdown  | Markdown table format                       |
| quote     | Escape answers as for SQL                   |
| table     | Same style as org tables                    |
| tabs      | Tab-separated values                        |
| tcl       | TCL list elements                           |
| trash     | No output                                   |
| org-table | Special format that produces Org tables     |
The :output buffer header argument displays results in a dedicated buffer named *DuckDB-output*, which is useful for large results that would otherwise clutter your Org document.
#+begin_src duckdb :output buffer
-- Output goes to a dedicated buffer
.mode box
SELECT
'Row 1' AS description,
1 AS value,
CAST('2023-01-01' AS DATE) AS date
UNION ALL
SELECT
'Row 2',
2,
CAST('2023-02-15' AS DATE);
RESULTS:
Output sent to buffer.
The buffer content would look like:
┌─────────────┬───────┬────────────┐
│ description │ value │    date    │
│   varchar   │ int32 │    date    │
├─────────────┼───────┼────────────┤
│ Row 1       │     1 │ 2023-01-01 │
│ Row 2       │     2 │ 2023-02-15 │
└─────────────┴───────┴────────────┘
Use the special org-table format to get results in an org-compatible table:
#+begin_src duckdb
.mode org-table
SELECT * FROM range(1, 5) as t(num);
Results:
| num |
|-----|
|   1 |
|   2 |
|   3 |
|   4 |
#+begin_src duckdb :format csv
SELECT * FROM generate_series(1, 5) AS s(num) ;
Results:
num
1
2
3
4
5
#+begin_src duckdb :format json :wrap src json
SELECT
json_object('id', id, 'name', name) AS json_data
FROM (VALUES
(1, 'Alice'),
(2, 'Bob'),
(3, 'Charlie')
) AS t(id, name);
Results:
[{"json_data":{"id":1,"name":"Alice"}},
{"json_data":{"id":2,"name":"Bob"}},
{"json_data":{"id":3,"name":"Charlie"}}]
#+begin_src duckdb :format markdown :headers on
SELECT
date_part('year', d) AS year,
date_part('month', d) AS month,
date_part('day', d) AS day
FROM (VALUES
('2023-01-15'::DATE),
('2023-06-30'::DATE),
('2023-12-25'::DATE)
) AS dates(d);
RESULTS:
| year | month | day |
|------|-------|-----|
| 2023 |     1 |  15 |
| 2023 |     6 |  30 |
| 2023 |    12 |  25 |
#+begin_src duckdb :format csv :separator "@@@@" :headers on :wrap example
SELECT
'Column 1' AS first,
'Column 2' AS second,
'Column 3' AS third
UNION ALL
SELECT 'Data 1', 'Data 2', 'Data 3';
RESULTS:
first@@@@second@@@@third
Column 1@@@@Column 2@@@@Column 3
Data 1@@@@Data 2@@@@Data 3
#+begin_src duckdb :format line :nullvalue "N/A" :headers on :wrap example
SELECT
1 AS id,
NULL AS missing_value,
'present' AS existing_value
UNION ALL
SELECT 2, 'found', NULL;
RESULTS:
id = 1
missing_value = N/A
existing_value = present

id = 2
missing_value = found
existing_value = N/A
The :db header lets you work with a database file; it is the equivalent of running duckdb <db>, where <db> is the path to the database file. To locate the file within your folder structure, you can either put the whole path in the :db parameter or use :dir to first change to the folder where the file lives. The duckdb CLI dot command .cd does the same thing.
NOTE: if the database file is not found, it will be created.
NOTE: DuckDB can read SQLite .db files directly, so you can point :db at a SQLite database file and it will just work.
#+begin_src duckdb :db test_database.duckdb
CREATE TABLE IF NOT EXISTS test_table (
id INTEGER,
name VARCHAR,
created_at TIMESTAMP
);
INSERT INTO test_table VALUES
(1, 'First entry', CURRENT_TIMESTAMP),
(2, 'Second entry', CURRENT_TIMESTAMP);
SELECT * FROM test_table;
RESULTS:
┌───────┬──────────────┬─────────────────────────┐
│  id   │     name     │       created_at        │
│ int32 │   varchar    │        timestamp        │
├───────┼──────────────┼─────────────────────────┤
│     1 │ First entry  │ 2025-04-05 16:58:59.59  │
│     2 │ Second entry │ 2025-04-05 16:58:59.59  │
│     1 │ First entry  │ 2025-04-05 16:59:31.385 │
│     2 │ Second entry │ 2025-04-05 16:59:31.385 │
└───────┴──────────────┴─────────────────────────┘
More complex example:
#+begin_src duckdb :db test_database.duckdb
-- Create a more complex schema
CREATE TABLE IF NOT EXISTS users (
user_id INTEGER PRIMARY KEY,
username VARCHAR NOT NULL UNIQUE,
email VARCHAR,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS posts (
post_id INTEGER PRIMARY KEY,
user_id INTEGER,
title VARCHAR NOT NULL,
content TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (user_id) REFERENCES users(user_id)
);
-- Add some test data
INSERT INTO users (user_id, username, email) VALUES
(1, 'alice', 'alice@example.com'),
(2, 'bob', 'bob@example.com'),
(3, 'charlie', 'charlie@example.com');
INSERT INTO posts (post_id, user_id, title, content) VALUES
(1, 1, 'Alice First Post', 'Hello from Alice!'),
(2, 2, 'Bob Introduction', 'Hi, I am Bob.'),
(3, 1, 'Alice Again', 'Second post from Alice');
-- Run a join query
SELECT
u.username,
p.title,
p.created_at
FROM posts p
JOIN users u ON p.user_id = u.user_id
ORDER BY p.created_at DESC;
RESULTS:
┌──────────┬──────────────────┬─────────────────────────┐
│ username │      title       │       created_at        │
│ varchar  │     varchar      │        timestamp        │
├──────────┼──────────────────┼─────────────────────────┤
│ alice    │ Alice First Post │ 2025-04-05 16:59:10.818 │
│ bob      │ Bob Introduction │ 2025-04-05 16:59:10.818 │
│ alice    │ Alice Again      │ 2025-04-05 16:59:10.818 │
└──────────┴──────────────────┴─────────────────────────┘
Sessions allow maintaining state between source blocks, useful for incremental analysis and multi-step workflows.
#+begin_src duckdb :session my-session-test :results output
-- First command in the session
CREATE TEMPORARY TABLE session_test (id INTEGER, value VARCHAR);
INSERT INTO session_test VALUES (1, 'First value');
select * from session_test;
RESULTS:
┌───────┬─────────────┐
│  id   │    value    │
│ int32 │   varchar   │
├───────┼─────────────┤
│     1 │ First value │
└───────┴─────────────┘
#+begin_src duckdb :session my-session-test :results output
-- Second command uses the same session and can access previous data
INSERT INTO session_test VALUES (2, 'Second value');
SELECT * FROM session_test ORDER BY id;
RESULTS:
┌───────┬──────────────┐
│  id   │    value     │
│ int32 │   varchar    │
├───────┼──────────────┤
│     1 │ First value  │
│     2 │ Second value │
└───────┴──────────────┘
Sessions can also be connected to database files:
#+begin_src duckdb :session db-session :db test_database.duckdb
-- This session connects to a specific database file
SELECT 'New session with database file' AS message;
-- Access tables from the database
SELECT COUNT(*) AS user_count FROM users;
NOTE: The :db argument is equivalent to running duckdb some_database, which means DuckDB will not allow multiple processes to connect to the same database file (see Concurrency). This is not an issue for non-session source blocks, since each one starts and kills its own duckdb process, but a session cannot open a database file that is already in use by another session.
These interactive functions help manage DuckDB sessions:
| Function                          | Description                                                                                                                         |
|-----------------------------------+-------------------------------------------------------------------------------------------------------------------------------------|
| org-babel-duckdb-create-session   | Creates a new DuckDB session with an optional database file connection. Prompts for session name and database path interactively.  |
| org-babel-duckdb-delete-session   | Terminates a DuckDB session, kills its buffer, and removes it from the session registry. Uses completion to select the session to delete. |
| org-babel-duckdb-display-sessions | Shows information about all active sessions in a formatted help buffer, including session names, database connections, and status. |
| org-babel-duckdb-cleanup-sessions | Removes dead sessions from the registry whose processes or buffers no longer exist. Helps reduce clutter.                           |
DuckDB also offers some methods to avoid this deadlock, which can be read about here: Concurrency.
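The session-management functions above are interactive commands, so they are normally run via M-x; if you reach for them often, here is a possible keybinding sketch (the C-c q prefix and key choices are arbitrary, not something the package defines):
#+begin_src emacs-lisp
;; Optional: bind the session commands listed above to keys.
;; The "C-c q" prefix is an arbitrary choice for this sketch.
(global-set-key (kbd "C-c q n") #'org-babel-duckdb-create-session)
(global-set-key (kbd "C-c q k") #'org-babel-duckdb-delete-session)
(global-set-key (kbd "C-c q l") #'org-babel-duckdb-display-sessions)
(global-set-key (kbd "C-c q c") #'org-babel-duckdb-cleanup-sessions)
#+end_src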
ob-duckdb supports using variables from the source block headers. Variable substitution applies to strings, dollar-sign variables ($var), and Org tables.
#+begin_src duckdb :var min_value=5 max_value=10
-- variable substitution
SELECT *
FROM generate_series(min_value, max_value) AS s(value)
WHERE value BETWEEN min_value AND max_value;
Results:
┌───────┐
│ value │
│ int64 │
├───────┤
│     5 │
│     6 │
│     7 │
│     8 │
│     9 │
│    10 │
└───────┘
#+begin_src duckdb :var name_variable="'test_name'" table_variable="'test_table'"
-- String variable substitution
SELECT 'Hello, ' || name AS greeting;
-- Table name variable substitution
SELECT 'Table name is: ' || $table_variable AS info;
RESULTS:
┌───────────────────────────┐
│           info            │
│          varchar          │
├───────────────────────────┤
│ Table name is: test_table │
└───────────────────────────┘
This can quickly get confusing because, at the moment, variable substitution is absolute: if your variable is named table, every occurrence of table in the block is replaced with its value, keywords included. So this:
#+begin_src duckdb :var select='test_table'
select 'select is being replaced here' from select;
Will result in this:
Parser Error: syntax error at or near "test_table"
LINE 1: test_table 'test_table is being replaced here' from test_table...
        ^
[ Babel evaluation exited with code 1 ]
I’m still undecided whether to keep this default behavior or to add some rules to the substitution.
Org tables offer a more organized approach to variable substitution: you can track multiple values and reuse a whole table across blocks simply by passing its name in the variable header.
#+name: lookup_table
| key     | value                  |
|---------+------------------------|
| a       | 'apple'                |
| b       | 'banana'               |
| c       | 'cranberry'            |
| message | 'IM SCREAMING AAAAAAA' |
#+begin_src duckdb :var data=lookup_table :headers on
-- Test table lookup with the varname[key] syntax
SELECT
'a key refers to ' || data[a] AS a_lookup,
'b key refers to ' || data[b] AS b_lookup,
'c key refers to ' || data[c] AS c_lookup;
Results:
┌───────────────────────┬────────────────────────┬───────────────────────────┐
│       a_lookup        │        b_lookup        │         c_lookup          │
│        varchar        │        varchar         │          varchar          │
├───────────────────────┼────────────────────────┼───────────────────────────┤
│ a key refers to apple │ b key refers to banana │ c key refers to cranberry │
└───────────────────────┴────────────────────────┴───────────────────────────┘
#+begin_src duckdb :var data=lookup_table :wrap example :format line
-- Test table lookup with the varname[key] syntax
SELECT data[message] as IMPORTANT_MESSAGE;
RESULTS:
IMPORTANT_MESSAGE = IM SCREAMING AAAAAAA
#+name: var_table_name
| key        | value                  |
|------------+------------------------|
| query1     | select [ 1,2,3,4,5,6 ] |
| query2     | select unnest(#1) from |
| table_name | some_table             |
#+HEADER: :var var_table=var_table_name
#+begin_src duckdb
.echo on
var_table[query2]( var_table[query1] as var_table[table_name] );
RESULTS:
select unnest(#1) from( select [ 1,2,3,4,5,6 ] as some_table );
┌────────────┐
│ unnest(#1) │
│   int32    │
├────────────┤
│          1 │
│          2 │
│          3 │
│          4 │
│          5 │
│          6 │
└────────────┘
.exit
#+HEADER: :var commands="output_format"
#+HEADER: :var output_format="dot_command1\ndot_command2\ndot_command3"
#+HEADER: :var dot_command1=".mode csv"
#+HEADER: :var dot_command2=".timer on"
#+HEADER: :var dot_command3=".echo on"
#+HEADER: :var query="SELECT '123123123' as some_column;"
#+NAME: duckdb-execute-query
#+begin_src duckdb
commands
query
#+end_src
RESULTS:
SELECT '123123123' as some_column;
some_column
123123123
Run Time (s): real 0.000 user 0.000172 sys 0.000115
The block tracking system maintains a history of all DuckDB block executions, allowing you to revisit previous analyses and track changes over time.
After executing a block:
SELECT 1 AS test;
It gets assigned block and execution IDs:
#+PROPERTY: EXEC_ID 91aac778-0bc2-4686-9a5e-4ef002266d40
#+PROPERTY: ID 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
SELECT 1 AS test;
These commands help navigate and explore your DuckDB execution history:
| Command                           | Description                                              |
|-----------------------------------+----------------------------------------------------------|
| org-duckdb-blocks-list            | Show all tracked blocks and their execution history      |
| org-duckdb-blocks-recent          | Show recent executions chronologically                   |
| org-duckdb-blocks-navigate-recent | Navigate to recent executions with completion            |
| org-duckdb-blocks-goto-block      | Jump to a specific block by ID                           |
| org-duckdb-blocks-goto-execution  | Jump to a specific execution by ID                       |
| org-duckdb-blocks-execution-info  | Display detailed information about a specific execution  |
When you execute `org-duckdb-blocks-list`, you’ll see a display like:
DuckDB Blocks and Executions
==========================
Block ID: 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
File: /path/to/file.org
Buffer: file.org
Source block: 796-846
Executions (2 total):
35c31cfa-c974-47d0-bc02-6afcddca0a55 (2025-04-15 12:09:20.768) :headers on :results drawer :format table
91aac778-0bc2-4686-9a5e-4ef002266d40 (2025-04-15 12:04:10.041)
For a specific execution (`org-duckdb-blocks-execution-info`):
DuckDB Execution: 35c31cfa-c974-47d0-bc02-6afcddca0a55
==========================

Block ID: 3a80d690-1556-42f0-aaf4-a18ad4dfe10f
File: /path/to/file.org
Buffer: file.org
Execution Time: 2025-04-15 12:09:20.768

=== SOURCE BLOCK STATE ===
Parameters: :headers on :results drawer :format table
Line Count: 4

=== CONTENT ===
SELECT 1 AS test;
SELECT 0 AS test;
SELECT 3 AS test;
SELECT 2 AS test;
This history functionality is particularly useful for:
- Tracking the evolution of your analyses
- Reproducing previous results
- Understanding changes in queries over time
- Navigating complex documents with many DuckDB blocks
DuckDB supports a variety of data types, including complex ones:
#+begin_src duckdb :headers on :session data_types_example :format duckbox :wrap example
-- Test various DuckDB data types
CREATE OR REPLACE TABLE data_types_table AS
SELECT
42::TINYINT AS tiny_int,
42::SMALLINT AS small_int,
42::INTEGER AS int,
42::BIGINT AS big_int,
42.5::FLOAT AS float_val,
42.5::DOUBLE AS double_val,
'hello'::VARCHAR AS str,
TRUE::BOOLEAN AS bool_val,
'2023-01-15'::DATE AS date_val,
'12:34:56'::TIME AS time_val,
'2023-01-15 12:34:56'::TIMESTAMP AS timestamp_val,
ARRAY[1, 2, 3] AS array_val,
STRUCT_PACK(x := 1, y := 'hello') AS struct_val,
MAP([1, 2], ['one', 'two']) AS map_val;
SELECT * from data_types_table;
RESULTS:
┌──────────┬───────────┬───┬──────────────────────┬──────────────────────┐
│ tiny_int │ small_int │ … │      struct_val      │       map_val        │
│   int8   │   int16   │   │ struct(x integer, …  │ map(integer, varch…  │
├──────────┼───────────┼───┼──────────────────────┼──────────────────────┤
│       42 │        42 │ … │ {'x': 1, 'y': hello} │ {1=one, 2=two}       │
├──────────┴───────────┴───┴──────────────────────┴──────────────────────┤
│ 1 rows                                            14 columns (4 shown) │
└────────────────────────────────────────────────────────────────────────┘
#+begin_src duckdb :headers on :results drawer :session data_types_example :format box :wrap example
DESCRIBE data_types_table;
RESULTS:
┌───────────────┬──────────────────────────────┬──────┬──────┬─────────┬───────┐
│  column_name  │         column_type          │ null │ key  │ default │ extra │
├───────────────┼──────────────────────────────┼──────┼──────┼─────────┼───────┤
│ tiny_int      │ TINYINT                      │ YES  │ NULL │ NULL    │ NULL  │
│ small_int     │ SMALLINT                     │ YES  │ NULL │ NULL    │ NULL  │
│ int           │ INTEGER                      │ YES  │ NULL │ NULL    │ NULL  │
│ big_int       │ BIGINT                       │ YES  │ NULL │ NULL    │ NULL  │
│ float_val     │ FLOAT                        │ YES  │ NULL │ NULL    │ NULL  │
│ double_val    │ DOUBLE                       │ YES  │ NULL │ NULL    │ NULL  │
│ str           │ VARCHAR                      │ YES  │ NULL │ NULL    │ NULL  │
│ bool_val      │ BOOLEAN                      │ YES  │ NULL │ NULL    │ NULL  │
│ date_val      │ DATE                         │ YES  │ NULL │ NULL    │ NULL  │
│ time_val      │ TIME                         │ YES  │ NULL │ NULL    │ NULL  │
│ timestamp_val │ TIMESTAMP                    │ YES  │ NULL │ NULL    │ NULL  │
│ array_val     │ INTEGER[]                    │ YES  │ NULL │ NULL    │ NULL  │
│ struct_val    │ STRUCT(x INTEGER, y VARCHAR) │ YES  │ NULL │ NULL    │ NULL  │
│ map_val       │ MAP(INTEGER, VARCHAR)        │ YES  │ NULL │ NULL    │ NULL  │
└───────────────┴──────────────────────────────┴──────┴──────┴─────────┴───────┘
#+begin_src duckdb :headers on :results drawer :session data_types_example :format box :wrap example
--table was too wide so we can transpose by doing this
UNPIVOT (SELECT CAST(COLUMNS(*) AS VARCHAR) FROM (
SELECT * FROM data_types_table
)) AS t ON COLUMNS(*) INTO NAME col_name VALUE col_value;
RESULTS:
┌───────────────┬──────────────────────┐
│   col_name    │      col_value       │
├───────────────┼──────────────────────┤
│ tiny_int      │ 42                   │
│ small_int     │ 42                   │
│ int           │ 42                   │
│ big_int       │ 42                   │
│ float_val     │ 42.5                 │
│ double_val    │ 42.5                 │
│ str           │ hello                │
│ bool_val      │ true                 │
│ date_val      │ 2023-01-15           │
│ time_val      │ 12:34:56             │
│ timestamp_val │ 2023-01-15 12:34:56  │
│ array_val     │ [1, 2, 3]            │
│ struct_val    │ {'x': 1, 'y': hello} │
│ map_val       │ {1=one, 2=two}       │
└───────────────┴──────────────────────┘
#+begin_src duckdb :headers on
-- Window functions
WITH sales AS (
SELECT * FROM (VALUES
('North', 100),
('North', 150),
('South', 120),
('South', 90),
('East', 110),
('East', 140),
('West', 95),
('West', 125)
) AS t(region, amount)
)
SELECT
region,
amount,
SUM(amount) OVER (PARTITION BY region) AS region_total,
AVG(amount) OVER (PARTITION BY region) AS region_avg,
RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region,
SUM(amount) OVER () AS grand_total
FROM sales
ORDER BY region, amount DESC;
Results:
┌─────────┬────────┬──────────────┬────────────┬────────────────┬─────────────┐
│ region  │ amount │ region_total │ region_avg │ rank_in_region │ grand_total │
│ varchar │ int32  │    int128    │   double   │     int64      │   int128    │
├─────────┼────────┼──────────────┼────────────┼────────────────┼─────────────┤
│ East    │    140 │          250 │      125.0 │              1 │         930 │
│ East    │    110 │          250 │      125.0 │              2 │         930 │
│ North   │    150 │          250 │      125.0 │              1 │         930 │
│ North   │    100 │          250 │      125.0 │              2 │         930 │
│ South   │    120 │          210 │      105.0 │              1 │         930 │
│ South   │     90 │          210 │      105.0 │              2 │         930 │
│ West    │    125 │          220 │      110.0 │              1 │         930 │
│ West    │     95 │          220 │      110.0 │              2 │         930 │
└─────────┴────────┴──────────────┴────────────┴────────────────┴─────────────┘
#+begin_src duckdb :headers on :format table
-- Recursive CTE for hierarchical data
WITH RECURSIVE hierarchy AS (
-- Base case: get root nodes (nodes with no parent)
SELECT 1 AS id, 'Root A' AS name, 0 AS parent_id, 0 AS level
UNION ALL
SELECT 2, 'Root B', 0, 0
UNION ALL
SELECT 3, 'Child A.1', 1, 1
UNION ALL
SELECT 4, 'Child A.2', 1, 1
UNION ALL
SELECT 5, 'Child B.1', 2, 1
UNION ALL
SELECT 6, 'Grandchild A.1.1', 3, 2
)
SELECT
id,
CASE
WHEN level = 0 THEN name
ELSE repeat(' ', level) || '└─ ' || name
END AS hierarchical_name,
parent_id
FROM hierarchy
ORDER BY
CASE WHEN parent_id = 0 THEN id ELSE parent_id END,
level,
id;
Results:
+----+-------------------------+-----------+
| id |    hierarchical_name    | parent_id |
+----+-------------------------+-----------+
| 1  | Root A                  | 0         |
| 3  |  └─ Child A.1           | 1         |
| 4  |  └─ Child A.2           | 1         |
| 2  | Root B                  | 0         |
| 5  |  └─ Child B.1           | 2         |
| 6  |   └─ Grandchild A.1.1   | 3         |
+----+-------------------------+-----------+
#+begin_src duckdb :headers on :wrap example
-- Test DuckDB's JSON functionality
CREATE TABLE json_test AS
SELECT
-- Parse JSON
json_extract('{"name": "Alice", "age": 30}', '$.name') AS name,
-- Create JSON
json_object('id', 1, 'tags', json_array('red', 'blue')) AS created_json,
-- Array access
json_extract(json_array(1, 2, 3), '$[1]') AS second_element,
-- Nested extraction
json_extract(
'{"user": {"details": {"address": {"city": "New York"}}}}',
'$.user.details.address.city'
) AS nested_city;
SELECT * from json_test;
DESCRIBE SELECT * from json_test;
RESULTS:
┌─────────┬────────────────────────────────┬────────────────┬─────────────┐
│  name   │          created_json          │ second_element │ nested_city │
│  json   │              json              │      json      │    json     │
├─────────┼────────────────────────────────┼────────────────┼─────────────┤
│ "Alice" │ {"id":1,"tags":["red","blue"]} │ 2              │ "New York"  │
└─────────┴────────────────────────────────┴────────────────┴─────────────┘
┌────────────────┬─────────────┬─────────┬─────────┬─────────┬─────────┐
│  column_name   │ column_type │  null   │   key   │ default │  extra  │
│    varchar     │   varchar   │ varchar │ varchar │ varchar │ varchar │
├────────────────┼─────────────┼─────────┼─────────┼─────────┼─────────┤
│ name           │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ created_json   │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ second_element │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
│ nested_city    │ JSON        │ YES     │ NULL    │ NULL    │ NULL    │
└────────────────┴─────────────┴─────────┴─────────┴─────────┴─────────┘
The :prologue and :epilogue header arguments let you execute SQL before and after the main body:
#+HEADER: :results drawer
#+HEADER: :prologue "CREATE TEMP TABLE test_prologue(id INTEGER);\nINSERT INTO test_prologue VALUES (1), (2), (3);"
#+HEADER: :epilogue "DROP TABLE test_prologue;"
#+begin_src duckdb
-- The prologue creates a table and inserts data before this query runs
SELECT * FROM test_prologue;
-- The epilogue will clean up after
Results:
┌───────┐
│  id   │
│ int32 │
├───────┤
│     1 │
│     2 │
│     3 │
└───────┘
#+HEADER: :prologue UNPIVOT (SELECT CAST(COLUMNS(*) AS VARCHAR) FROM (
#+HEADER: :epilogue )) AS t ON COLUMNS(*) INTO NAME col_name VALUE col_value;
#+begin_src duckdb :wrap example
-- Test handling of quotes and special characters
SELECT
'Single ''quoted'' text' AS single_quotes,
'Text with "double quotes"' AS double_quotes,
'Text with semicolons;' AS semicolons,
'Line 1
Line 2
Line 3' AS multiline
RESULTS:
┌───────────────┬──────────────────────────────────┐
│   col_name    │            col_value             │
│    varchar    │             varchar              │
├───────────────┼──────────────────────────────────┤
│ single_quotes │ Single 'quoted' text             │
│ double_quotes │ Text with "double quotes"        │
│ semicolons    │ Text with semicolons;            │
│ multiline     │ Line 1\n Line 2\n Line 3         │
└───────────────┴──────────────────────────────────┘
Querying directly from an org table is not yet supported.
#+name: complex_data
| first | last  | age | city        |
|-------+-------+-----+-------------|
| John  | Doe   | 35  | New York    |
| Jane  | Smith | 28  | Los Angeles |
| Bob   | Brown | 42  | Chicago     |
#+begin_src duckdb :var people=complex_data :headers on
-- Complex table manipulation
WITH people_data AS (
SELECT first, last, age, city FROM people
)
SELECT
first || ' ' || last AS full_name,
CASE
WHEN age < 30 THEN 'Young'
WHEN age < 40 THEN 'Middle'
ELSE 'Senior'
END AS age_group,
city
FROM people_data
ORDER BY age DESC;
I’m also looking into letting you set the variable type in the header arguments, though I’m not sure whether it would be useful.
- DuckDB not found: Ensure the DuckDB CLI is installed and in your PATH; you should be able to launch it from your shell by running duckdb.
- Session not working: Make sure session names are consistent. You can use org-babel-duckdb-display-sessions to check the running DuckDB sessions.
- Asynchronous execution errors: Async execution requires a session. Ensure you’ve included both `:session name` and `:async yes` in your header arguments.
- Large query output issues: For very large query results, use `:output buffer` to display results in a separate buffer.
- Block tracking problems: If block tracking seems inconsistent, try `org-duckdb-blocks-list` to view all tracked blocks and their current status.
- Installation: I’m on Doom Emacs, so I’m not entirely sure whether there will be issues installing on vanilla Emacs or other frameworks; let me know.
Contributions, bug reports, and feature requests are more than welcome. This is the first package I’ve written, so I expect some issues to surface, and there may be obvious optimizations I’ve missed. I’ve tried to document the code as thoroughly as possible, but feel free to:
- Open issues for bug reports or feature requests
- Submit pull requests with improvements
- Share examples and documentation
- Suggest changes to documentation in order to improve clarity
GPLv3
Note: This package is independently developed and not officially affiliated with DuckDB.