RFC: Cross-Table Shape Filtering with Subquery Support #2931
Replies: 4 comments 1 reply
-
One minor observation on the text is that it focused on "authorization":
I think authorization is a subset of the primary case here which is filtering across tables / propagating changes to parent shapes. On the DX,
I think it's important we clearly document this, because the natural "it's just familiar SQL"-ness of it carries the potential to confuse users who may assume they can use any Postgres features. It's the same limitations as for existing top level where clauses but I suspect because it looks like complex subquery SQL it's extra important to be clear about what you can and can't use. |
Beta Was this translation helpful? Give feedback.
-
Interesting 🤔 Looks very nice, would need some fancy work in Ash to make our query builder use There are many things that come to mind though: in this subquery can you:
SELECT * FROM users
WHERE id in (SELECT user_id FROM support_tickets WHERE priority > users.notification_priority_threshold)
|
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
Cross-table Shapes Now Available as Experimental Feature 🎉You can now create shapes filtered by subqueries to other tables! This unlocks common patterns for retrieving related data across tables like common authorization patterns. How to UseEnable the experimental feature with the environment variable: Specify shapes with subqueries in the {
table: 'issues',
columns: 'id,title,description,owner_id,project_id',
where: 'project_id IN (SELECT project_id FROM projects WHERE status = \'active\')'
} Current LimitationsWhile functionally complete, this initial implementation prioritizes getting the feature into your hands over performance optimization. We maintain necessary indexes in memory and handle subquery changes by rebuilding affected shapes. For many use cases with small subquery result sets, this approach may already meet your needs. Here is a picture of the memory footprint for number of shapes vs. rows in the subquery Known Issues:
Implementation Trade-offs:Memory UsageThe dependency graph lives entirely in memory. We plan to move the graph into disk using a backend store and optimize memory usage. Shape RecreationWhen subquery results change, we rebuild the entire shape rather than updating incrementally. Clients will see HTTP 409 responses and must re-establish their shapes. This particularly affects use cases with frequently changing subqueries, like fetching comments for dynamic issue selections. What's NextWe're working to address these limitations to make the feature more stable and bug-free. After resolving the critical issues above, we'll optimize the core algorithm for better performance and stability. Try it out and share your feedback! |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
Uh oh!
There was an error while loading. Please reload this page.
-
Summary
Electric currently limits shapes to single-table filtering, preventing developers from expressing common authorization patterns that require cross-table relationships. This RFC proposes extending the shape system to support subqueries in WHERE clauses by automatically creating and maintaining "index shapes" - specialized shapes that materialize subquery results and propagate changes to dependent parent shapes, enabling developers to write natural queries like
WHERE author_id IN (SELECT user_id FROM team_members WHERE team_id = 123)
without schema denormalization or client-side workarounds.Background
Electric is a Postgres sync engine that streams data from Postgres into local applications using HTTP-based shape subscriptions. Shapes define subsets of data using a table name, optional WHERE clause, and optional column selection. The Electric sync service maintains a log of operations for each shape and streams changes to subscribed clients in real-time.
Currently, shape WHERE clauses can only reference columns within the target table. This restriction exists because Electric processes the Postgres logical replication stream, which provides row-level changes without join context. When a row changes, Electric can evaluate simple predicates against that row's columns but cannot resolve relationships to other tables without additional infrastructure.
This limitation forces developers to implement authorization patterns through schema denormalization (duplicating user_id or tenant_id across all tables), client-side filtering (over-fetching data and filtering locally), or proxy-based authorization (implementing complex middleware that reconstructs and filters shapes). Each workaround introduces complexity, performance overhead, or security risks.
The previous Electric implementation attempted to solve multi-table filtering using a per-user directed graph to track foreign key relationships. This approach proved memory-intensive and was abandoned before optimization. The graph maintained row-to-row relationships to handle "move-out" scenarios where related rows must be removed when their parent row no longer matches a filter.
Problem
The inability to express cross-table relationships in shape WHERE clauses prevents developers from implementing standard authorization patterns with Electric. This creates three specific problems:
1. Authorization Requires Schema Denormalization
Applications typically implement access control through relational patterns: users belong to teams, content has owners, resources exist within organizational hierarchies. Without subqueries, developers must duplicate authorization fields (user_id, team_id, org_id) across every table they want to sync. This denormalization increases storage costs, complicates migrations, and risks data inconsistency.
2. Middleware Complexity for Relationship-Based Filtering
Developers must implement authorization logic in API proxies or middleware layers, reconstructing SQL joins in application code. This moves database logic into the application layer, creating maintenance burden and potential security vulnerabilities. Each relationship pattern requires custom proxy code that must be kept in sync with schema changes.
3. Performance Degradation from Workaround Overhead
Current workarounds introduce significant performance penalties. Schema denormalization requires updating multiple tables on every authorization change, creating write amplification. Proxy-based filtering adds network hops and computation overhead. Client-side filtering wastes bandwidth by over-fetching. These workarounds make Electric unsuitable for applications with complex authorization models or high performance requirements.
These limitations prevent Electric adoption in multi-tenant SaaS applications, collaborative tools, and any system with relationship-based access control—effectively excluding most production applications.
Proposal
This RFC proposes extending Electric shapes to support subqueries in WHERE clauses by automatically creating and maintaining "index shapes" - specialized shapes that materialize subquery results.
Core Architecture
When a shape is created with a subquery WHERE clause like:
Electric will:
team_members WHERE team_id = 123
if it doesn't existIndex Shape Design
Index shapes are regular Electric shapes with special lifecycle management:
Change Propagation
When changes arrive via logical replication:
For Index Shape Changes:
For Move-Ins (rows added to index shape):
Move-In Consistency Considerations (to investigate during prototyping):
For Move-Outs (rows removed from index shape):
Implementation Details
Metadata Storage: Add foreign key values to shape log entries as indexed metadata, either in the log file or a separate index file, to enable efficient scanning without JSON parsing.
Index Storage Evolution: Initial implementation will use in-memory structures for index shapes. Future optimization may convert index shapes into embedded SQLite databases for better query performance and memory efficiency, but this will be investigated after validating the core approach.
Query Normalization: Normalize WHERE clauses to canonical form to maximize index shape sharing across parent shapes with equivalent conditions.
Concurrent Creation: Coordinate index shape creation when multiple clients request shapes with the same subquery simultaneously to avoid duplicate work.
Complex WHERE Clauses: Support multiple subqueries by creating multiple index shapes and evaluating the composite conditions using Electric's existing WHERE clause matching logic.
Error Handling: Failed Postgres queries for move-ins follow existing Electric error handling patterns (retry with backoff, shape error states).
Limitations
Phase 1 supports only:
Phase 2 will extend to:
Definition of Success
The implementation will be considered successful when it meets the following criteria:
Functional Correctness
Performance Targets
Developer Experience
Production Readiness
Phase 1 Delivery
A working prototype that demonstrates:
This foundation will validate the architecture before optimizing for Phase 2 production requirements.
Beta Was this translation helpful? Give feedback.
All reactions