FireProx Document References Guide
This notebook demonstrates how to work with document references in Firestore, including:
- Assigning FireObject references - Store relationships between documents
- Automatic conversions - FireObject ↔ DocumentReference conversion
- Lazy loading - Automatic data fetching on access
- Nested references - References in lists and dictionaries
- Validation - Type safety and state validation
- Common patterns - Real-world reference use cases
Key Findings
✅ Automatic Conversion:
- Assigning a FireObject automatically converts to DocumentReference for storage
- Reading a DocumentReference automatically converts to FireObject
- Works recursively in nested structures (lists, dicts)
✅ Lazy Loading:
- Referenced FireObjects start in ATTACHED state
- Data is automatically fetched on first attribute access
- Works seamlessly for both sync and async
⚠️ Important Validations:
- Cannot assign DETACHED FireObjects (no path to reference)
- Cannot mix sync and async FireObjects (TypeError)
- References preserve object identity (same instance on repeated access)
Setup
Import modules and initialize FireProx.
from fire_prox import AsyncFireProx, FireProx
from fire_prox.testing import async_demo_client, demo_client
Part 1: Basic Document References
Learn how to create and use document references between FireObjects.
Initialize Client
# Create sync client and collections
client = demo_client()
db = FireProx(client)
users = db.collection('doc_ref_users')
posts = db.collection('doc_ref_posts')
print("✅ Client initialized")
✅ Client initialized
Feature 1: Assigning FireObject References
You can assign one FireObject to another's property to create a reference.
# Create a user
user = users.new()
user.name = 'Ada Lovelace'
user.occupation = 'Mathematician'
user.save(doc_id='ada')
print(f"👤 Created user: {user.name}")
print(f" Path: {user.path}")
# Create a post with a reference to the user
post = posts.new()
post.title = 'On Analytical Engines'
post.author = user # Assign FireObject reference
post.content = 'The Analytical Engine weaves algebraic patterns...'
print(f"\n📝 Created post: {post.title}")
print(f" Author (before save): {type(post.author).__name__}")
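# Inspect the internal data: per the recorded output below, the value is still a FireObject here,
# so the conversion to DocumentReference appears to happen when the document is written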
print(f" Internal storage: {type(post._data['author']).__name__}")
print(f" Reference path: {post._data['author'].path}")
# Save the post
post.save(doc_id='post1')
print("\n✅ Post saved with author reference!")
👤 Created user: Ada Lovelace
Path: doc_ref_users/ada
📝 Created post: On Analytical Engines
Author (before save): FireObject
Internal storage: FireObject
Reference path: doc_ref_users/ada
✅ Post saved with author reference!
Feature 2: Reading References Back
When you read a document with references, they're automatically converted to FireObjects.
# Fetch the post from Firestore
retrieved_post = db.doc('doc_ref_posts/post1')
retrieved_post.fetch()
print(f"📄 Retrieved post: {retrieved_post.title}")
print(f" Content: {retrieved_post.content[:50]}...")
# Access the author reference
author = retrieved_post.author
print("\n👤 Author reference:")
print(f" Type: {type(author).__name__}")
print(f" State: {author.state}")
print(f" Path: {author.path}")
print("\n✅ Reference converted to FireObject (ATTACHED state)")
📄 Retrieved post: On Analytical Engines
Content: The Analytical Engine weaves algebraic patterns......
👤 Author reference:
Type: FireObject
State: ATTACHED
Path: doc_ref_users/ada
✅ Reference converted to FireObject (ATTACHED state)
Feature 3: Lazy Loading
Referenced FireObjects automatically load data on first attribute access.
# The author is currently ATTACHED (no data loaded yet)
print("📊 Before accessing data:")
print(f" State: {author.state}")
# Access an attribute - this triggers lazy loading
print("\n👤 Accessing author.name...")
author_name = author.name
print(f" Name: {author_name}")
print(f" Occupation: {author.occupation}")
# Now the author is LOADED
print("\n📊 After accessing data:")
print(f" State: {author.state}")
# Subsequent accesses are instant (no fetch needed)
print("\n⚡ Second access (instant):")
print(f" Name again: {author.name}")
print("\n✅ Lazy loading automatically fetched data on first access!")
📊 Before accessing data:
State: ATTACHED
👤 Accessing author.name...
Name: Ada Lovelace
Occupation: Mathematician
📊 After accessing data:
State: LOADED
⚡ Second access (instant):
Name again: Ada Lovelace
✅ Lazy loading automatically fetched data on first access!
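The behaviour above can be illustrated with a small standalone sketch. It is not FireProx's implementation: `LazyRef`, `State`, and the in-memory `FAKE_DB` are hypothetical names, used only to show how a proxy might defer a fetch until the first field access and then serve later reads from memory.

```python
from enum import Enum


class State(Enum):
    ATTACHED = "ATTACHED"  # path known, no data loaded yet
    LOADED = "LOADED"      # data fetched and held in memory


class LazyRef:
    """Hypothetical stand-in for a referenced document (not FireProx code)."""

    def __init__(self, path, fetch):
        self.path = path
        self.state = State.ATTACHED
        self.fetch_count = 0
        self._fetch = fetch  # assumed callable: path -> dict of fields
        self._data = {}

    def __getattr__(self, name):
        # Python calls __getattr__ only when normal lookup fails,
        # so it runs for document fields but not for path/state.
        if name.startswith("_"):
            raise AttributeError(name)
        if self.state is State.ATTACHED:
            self._data = self._fetch(self.path)  # the single deferred fetch
            self.fetch_count += 1
            self.state = State.LOADED
        try:
            return self._data[name]
        except KeyError:
            raise AttributeError(name) from None


# In-memory substitute for the database, for illustration only.
FAKE_DB = {
    "doc_ref_users/ada": {"name": "Ada Lovelace", "occupation": "Mathematician"}
}

author = LazyRef("doc_ref_users/ada", FAKE_DB.__getitem__)
print(author.state.value)  # ATTACHED
print(author.name)         # Ada Lovelace
print(author.occupation)   # Mathematician
print(author.state.value)  # LOADED
print(author.fetch_count)  # 1
```

In this sketch, reading `path` or `state` never triggers a fetch, which mirrors how the notebook inspects `author.state` before touching any field.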
Feature 4: Validation - DETACHED Objects
You cannot assign DETACHED FireObjects as references (they have no path).
# Create two DETACHED objects
unsaved_user = users.new()
unsaved_user.name = 'Grace Hopper'
unsaved_post = posts.new()
unsaved_post.title = 'Compilers'
print(f"👤 Unsaved user state: {unsaved_user.state}")
print(f"📝 Unsaved post state: {unsaved_post.state}")
# Try to assign DETACHED object as reference
print("\n❌ Attempting to assign DETACHED object...")
try:
    unsaved_post.author = unsaved_user
    print(" This should not print!")
except ValueError as e:
    print(f" Caught ValueError: {e}")
print("\n✅ DETACHED objects cannot be assigned as references")
print(" Save the object first to create a reference!")
# The correct way:
unsaved_user.save(doc_id='grace')
print(f"\n✓ User saved, now in {unsaved_user.state} state")
unsaved_post.author = unsaved_user # Now this works!
print("✓ Reference assignment successful!")
👤 Unsaved user state: DETACHED
📝 Unsaved post state: DETACHED
❌ Attempting to assign DETACHED object...
Caught ValueError: Cannot assign a DETACHED FireObject as a reference. The object must be saved first to have a document path.
✅ DETACHED objects cannot be assigned as references
Save the object first to create a reference!
✓ User saved, now in LOADED state
✓ Reference assignment successful!
Feature 5: Validation - Sync/Async Mismatch
You cannot mix sync and async FireObjects: assigning one to the other raises a TypeError.
# Create async client
async_client = async_demo_client()
async_db = AsyncFireProx(async_client)
async_users = async_db.collection('doc_ref_users_async')
# Create and save an async user
async_user = async_users.new()
async_user.name = 'Margaret Hamilton'
await async_user.save(doc_id='margaret')
print(f"👤 Async user created: {async_user.name}")
print(f" Type: {type(async_user).__name__}")
# Create a sync post
sync_post = posts.new()
sync_post.title = 'Apollo Guidance Computer'
print(f"\n📝 Sync post created: {sync_post.title}")
print(f" Type: {type(sync_post).__name__}")
# Try to assign async user to sync post
print("\n❌ Attempting to mix sync and async...")
try:
    sync_post.author = async_user
    print(" This should not print!")
except TypeError as e:
    print(f" Caught TypeError: {e}")
print("\n✅ Sync and async FireObjects cannot be mixed")
print(" Use matching types: sync with sync, async with async")
👤 Async user created: Margaret Hamilton
Type: AsyncFireObject
📝 Sync post created: Apollo Guidance Computer
Type: FireObject
❌ Attempting to mix sync and async...
Caught TypeError: Cannot assign async FireObject to sync FireObject. Both objects must be from the same context (sync or async).
✅ Sync and async FireObjects cannot be mixed
Use matching types: sync with sync, async with async
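The two validations from Features 4 and 5 can be summarised in a standalone sketch. `Doc` and `check_reference` are hypothetical names, and the order of the two checks is an assumption; the notebook only shows that a DETACHED target raises ValueError and a sync/async mismatch raises TypeError.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Doc:
    """Hypothetical minimal document; path=None plays the role of DETACHED."""
    path: Optional[str] = None
    is_async: bool = False


def check_reference(owner: Doc, target: Doc) -> str:
    """Return the path to store, or raise the same error types shown above."""
    if target.path is None:
        # No document path exists yet, so there is nothing to point at.
        raise ValueError("Cannot reference a DETACHED object; save it first.")
    if owner.is_async != target.is_async:
        # Assumed rule: both sides must come from the same context.
        raise TypeError("Sync and async objects cannot reference each other.")
    return target.path


post = Doc()  # an unsaved sync post may still hold references
print(check_reference(post, Doc(path="doc_ref_users/grace")))  # doc_ref_users/grace

bad_targets = (
    Doc(),                                                      # DETACHED
    Doc(path="doc_ref_users_async/margaret", is_async=True),    # async
)
for target in bad_targets:
    try:
        check_reference(post, target)
    except (ValueError, TypeError) as exc:
        print(type(exc).__name__)  # ValueError, then TypeError
```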
Feature 6: References in Lists
Store multiple references in a list.
# Create multiple reviewers
reviewer1 = users.new()
reviewer1.name = 'Alan Turing'
reviewer1.expertise = 'Computation'
reviewer1.save(doc_id='alan')
reviewer2 = users.new()
reviewer2.name = 'Donald Knuth'
reviewer2.expertise = 'Algorithms'
reviewer2.save(doc_id='donald')
reviewer3 = users.new()
reviewer3.name = 'Barbara Liskov'
reviewer3.expertise = 'Programming Languages'
reviewer3.save(doc_id='barbara')
print("👥 Created reviewers:")
for r in [reviewer1, reviewer2, reviewer3]:
    print(f" • {r.name} ({r.expertise})")
# Create a paper with multiple reviewers
paper = posts.new()
paper.title = 'On Computable Numbers'
paper.reviewers = [reviewer1, reviewer2, reviewer3] # List of references
paper.save(doc_id='paper1')
print(f"\n📄 Created paper: {paper.title}")
print(f" Reviewers: {len(paper.reviewers)} assigned")
print("\n✅ Multiple references stored in list!")
👥 Created reviewers:
• Alan Turing (Computation)
• Donald Knuth (Algorithms)
• Barbara Liskov (Programming Languages)
📄 Created paper: On Computable Numbers
Reviewers: 3 assigned
✅ Multiple references stored in list!
# Read back and access reviewers
retrieved_paper = db.doc('doc_ref_posts/paper1')
retrieved_paper.fetch()
print(f"📄 Retrieved paper: {retrieved_paper.title}")
print("\n👥 Reviewers (lazy loading):")
for i, reviewer in enumerate(retrieved_paper.reviewers, 1):
    print(f" {i}. {reviewer.name} - {reviewer.expertise}")
    print(f" State: {reviewer.state}")
print("\n✅ Each reference in list supports lazy loading!")
📄 Retrieved paper: On Computable Numbers
👥 Reviewers (lazy loading):
1. Alan Turing - Computation
State: LOADED
2. Donald Knuth - Algorithms
State: LOADED
3. Barbara Liskov - Programming Languages
State: LOADED
✅ Each reference in list supports lazy loading!
Feature 7: References in Dictionaries
Store references as dictionary values with semantic keys.
# Create contributors for different roles
author_user = users.new()
author_user.name = 'Edsger Dijkstra'
author_user.role = 'Primary Author'
author_user.save(doc_id='edsger')
editor_user = users.new()
editor_user.name = 'Niklaus Wirth'
editor_user.role = 'Technical Editor'
editor_user.save(doc_id='niklaus')
reviewer_user = users.new()
reviewer_user.name = 'Tony Hoare'
reviewer_user.role = 'Peer Reviewer'
reviewer_user.save(doc_id='tony')
print("👥 Created contributors:")
for u in [author_user, editor_user, reviewer_user]:
    print(f" • {u.name} - {u.role}")
# Create article with contributor dict
article = posts.new()
article.title = 'Structured Programming'
article.contributors = {
    'author': author_user,
    'editor': editor_user,
    'reviewer': reviewer_user
}
article.save(doc_id='article1')
print(f"\n📰 Created article: {article.title}")
print(f" Contributors: {len(article.contributors)} roles defined")
print("\n✅ References stored in dictionary with semantic keys!")
👥 Created contributors:
• Edsger Dijkstra - Primary Author
• Niklaus Wirth - Technical Editor
• Tony Hoare - Peer Reviewer
📰 Created article: Structured Programming
Contributors: 3 roles defined
✅ References stored in dictionary with semantic keys!
# Read back and access contributors
retrieved_article = db.doc('doc_ref_posts/article1')
retrieved_article.fetch()
print(f"📰 Retrieved article: {retrieved_article.title}")
print("\n👥 Contributors:")
for role, person in retrieved_article.contributors.items():
    print(f" {role.title()}: {person.name}")
    print(f" Role: {person.role}")
    print(f" State: {person.state}")
print("\n✅ Dictionary references support lazy loading and semantic access!")
📰 Retrieved article: Structured Programming
👥 Contributors:
Author: Edsger Dijkstra
Role: Primary Author
State: LOADED
Editor: Niklaus Wirth
Role: Technical Editor
State: LOADED
Reviewer: Tony Hoare
Role: Peer Reviewer
State: LOADED
✅ Dictionary references support lazy loading and semantic access!
Feature 8: Mixed Nested Structures
References can be nested arbitrarily, for example in dicts that contain lists or other dicts.
# Create team members
lead = users.new()
lead.name = 'Dennis Ritchie'
lead.save(doc_id='dennis')
dev1 = users.new()
dev1.name = 'Ken Thompson'
dev1.save(doc_id='ken')
dev2 = users.new()
dev2.name = 'Brian Kernighan'
dev2.save(doc_id='brian')
# Create project with complex nested structure
project = posts.new()
project.title = 'UNIX Operating System'
project.team = {
    'lead': lead,
    'developers': [dev1, dev2],
    'structure': {
        'primary': lead,
        'secondary': [dev1, dev2]
    }
}
project.save(doc_id='project1')
print(f"🚀 Created project: {project.title}")
print(" Team structure: nested dict with lists of references")
print("\n✅ Complex nested references stored!")
🚀 Created project: UNIX Operating System
Team structure: nested dict with lists of references
✅ Complex nested references stored!
# Read back and navigate nested structure
retrieved_project = db.doc('doc_ref_posts/project1')
retrieved_project.fetch()
print(f"🚀 Retrieved project: {retrieved_project.title}")
print("\n👥 Team structure:")
print(f" Lead: {retrieved_project.team['lead'].name}")
print("\n Developers:")
for dev in retrieved_project.team['developers']:
    print(f" • {dev.name}")
print("\n Nested structure:")
print(f" Primary: {retrieved_project.team['structure']['primary'].name}")
print(" Secondary:")
for dev in retrieved_project.team['structure']['secondary']:
    print(f" • {dev.name}")
print("\n✅ All nested references support lazy loading!")
🚀 Retrieved project: UNIX Operating System
👥 Team structure:
Lead: Dennis Ritchie
Developers:
• Ken Thompson
• Brian Kernighan
Nested structure:
Primary: Dennis Ritchie
Secondary:
• Ken Thompson
• Brian Kernighan
✅ All nested references support lazy loading!
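The recursive conversion that makes nested structures work can be sketched without Firestore. `Obj` and `Ref` below are hypothetical stand-ins for FireObject and DocumentReference that carry only a path, and `convert` is an illustrative helper, not a FireProx function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Obj:
    """Hypothetical stand-in for a FireObject (path only)."""
    path: str


@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a DocumentReference (path only)."""
    path: str


def convert(value, source, target):
    """Recursively replace `source` instances with `target` inside lists and dicts."""
    if isinstance(value, source):
        return target(value.path)
    if isinstance(value, list):
        return [convert(item, source, target) for item in value]
    if isinstance(value, dict):
        return {key: convert(item, source, target) for key, item in value.items()}
    return value  # strings, numbers, None, etc. pass through unchanged


lead = Obj("doc_ref_users/dennis")
dev1 = Obj("doc_ref_users/ken")
dev2 = Obj("doc_ref_users/brian")
team = {
    "lead": lead,
    "developers": [dev1, dev2],
    "structure": {"primary": lead, "secondary": [dev1, dev2]},
}

stored = convert(team, Obj, Ref)      # shape of what would be written
restored = convert(stored, Ref, Obj)  # shape of what would be read back

print(stored["structure"]["secondary"][0])  # Ref(path='doc_ref_users/ken')
print(restored == team)                     # True
```

The same function handles both directions here because the two stand-ins differ only in type; a real implementation would also need a client or database handle to build the objects.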
Pattern 1: Author/Owner References
Track who created or owns a document.
# Create a user
owner = users.new()
owner.name = 'Linus Torvalds'
owner.email = 'linus@example.com'
owner.save(doc_id='linus')
# Create documents with owner references
for i in range(3):
    doc = posts.new()
    doc.title = f'Kernel Module {i+1}'
    doc.owner = owner
    doc.created_by = owner # Same reference, different semantic meaning
    doc.save(doc_id=f'kernel_module_{i+1}')
print(f"👤 Owner: {owner.name}")
print("📝 Created 3 documents with owner references")
# Query documents by owner
owner_docs = posts.where('owner', '==', owner._doc_ref).get()
print(f"\n🔍 Documents owned by {owner.name}: {len(owner_docs)}")
for doc in owner_docs:
    print(f" • {doc.title}")
    # Verify lazy loading
    print(f" Owner: {doc.owner.name} <{doc.owner.email}>")
print("\n✅ Pattern: Track document ownership with references")
👤 Owner: Linus Torvalds
📝 Created 3 documents with owner references
🔍 Documents owned by Linus Torvalds: 3
• Kernel Module 1
Owner: Linus Torvalds <linus@example.com>
• Kernel Module 2
Owner: Linus Torvalds <linus@example.com>
• Kernel Module 3
Owner: Linus Torvalds <linus@example.com>
✅ Pattern: Track document ownership with references
Pattern 2: Parent/Child Relationships
Model hierarchical relationships between documents.
# Create parent document (conversation thread)
thread = posts.new()
thread.title = 'How to learn programming?'
thread.type = 'thread'
thread.parent = None # Top-level thread
thread.save(doc_id='thread_001')
print(f"💬 Thread: {thread.title}")
# Create child documents (replies)
reply1 = posts.new()
reply1.title = 'Start with Python'
reply1.type = 'reply'
reply1.parent = thread # Reference to parent
reply1.save(doc_id='reply_001')
reply2 = posts.new()
reply2.title = 'Practice every day'
reply2.type = 'reply'
reply2.parent = thread
reply2.save(doc_id='reply_002')
# Create nested reply (reply to reply)
nested_reply = posts.new()
nested_reply.title = 'Which Python version?'
nested_reply.type = 'reply'
nested_reply.parent = reply1 # Reference to parent reply
nested_reply.save(doc_id='reply_003')
print(f" ├─ {reply1.title}")
print(f" │ └─ {nested_reply.title}")
print(f" └─ {reply2.title}")
# Query for direct replies to thread
replies = posts.where('parent', '==', thread._doc_ref).get()
print(f"\n📊 Direct replies to thread: {len(replies)}")
for reply in replies:
    print(f" • {reply.title}")
    print(f" Parent: {reply.parent.title}")
print("\n✅ Pattern: Model hierarchical relationships with parent references")
💬 Thread: How to learn programming?
├─ Start with Python
│ └─ Which Python version?
└─ Practice every day
📊 Direct replies to thread: 2
• Start with Python
Parent: How to learn programming?
• Practice every day
Parent: How to learn programming?
✅ Pattern: Model hierarchical relationships with parent references
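Parent references also allow walking upward from a reply to its thread. The sketch below uses plain `SimpleNamespace` objects in place of FireObjects, and `ancestors` is a hypothetical helper; it assumes the chain ends at a document whose `parent` is None and contains no cycles.

```python
from types import SimpleNamespace


def ancestors(doc):
    """Yield the parent, grandparent, and so on, stopping at a top-level document."""
    current = doc.parent
    while current is not None:
        yield current
        current = current.parent


# Stand-ins mirroring the thread built above.
thread = SimpleNamespace(title="How to learn programming?", parent=None)
reply1 = SimpleNamespace(title="Start with Python", parent=thread)
nested_reply = SimpleNamespace(title="Which Python version?", parent=reply1)

chain = [doc.title for doc in ancestors(nested_reply)]
print(chain)       # ['Start with Python', 'How to learn programming?']
print(len(chain))  # 2
```

With lazily loaded references, each step up the chain would presumably cost one fetch, so the depth of the hierarchy bounds the number of reads.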
Pattern 3: Cross-Collection References
References can point to documents in different collections.
# Create multiple collections
products = db.collection('doc_ref_products')
orders = db.collection('doc_ref_orders')
customers = db.collection('doc_ref_customers')
# Create a customer
customer = customers.new()
customer.name = 'Alice Smith'
customer.email = 'alice@example.com'
customer.save(doc_id='alice')
# Create products
product1 = products.new()
product1.name = 'Laptop'
product1.price = 999.99
product1.save(doc_id='laptop_001')
product2 = products.new()
product2.name = 'Mouse'
product2.price = 29.99
product2.save(doc_id='mouse_001')
# Create order with cross-collection references
order = orders.new()
order.order_id = 'ORD-2024-001'
order.customer = customer # Reference to customers collection
order.items = [product1, product2] # References to products collection
order.total = 1029.98
order.save(doc_id='order_001')
print(f"🛒 Order: {order.order_id}")
print(f" Customer: {customer.name} <{customer.email}>")
print(" Items:")
for product in [product1, product2]:
    print(f" • {product.name} - ${product.price}")
print(f" Total: ${order.total}")
# Read back and verify cross-collection references
retrieved_order = orders.doc('order_001')
retrieved_order.fetch()
print(f"\n📦 Retrieved order: {retrieved_order.order_id}")
print(f" Customer: {retrieved_order.customer.name}")
print(f" Customer collection: {retrieved_order.customer.path.split('/')[0]}")
print("\n Items:")
for item in retrieved_order.items:
    print(f" • {item.name} - ${item.price}")
    print(f" Collection: {item.path.split('/')[0]}")
print("\n✅ Pattern: References work across different collections!")
🛒 Order: ORD-2024-001
Customer: Alice Smith <alice@example.com>
Items:
• Laptop - $999.99
• Mouse - $29.99
Total: $1029.98
📦 Retrieved order: ORD-2024-001
Customer: Alice Smith
Customer collection: doc_ref_customers
Items:
• Laptop - $999.99
Collection: doc_ref_products
• Mouse - $29.99
Collection: doc_ref_products
✅ Pattern: References work across different collections!
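The order above stores a hard-coded total. One alternative is to derive it from the referenced items, sketched here with `SimpleNamespace` stand-ins for the product objects. `order_total` is a hypothetical helper, and using `Decimal` is a design choice of this sketch to avoid float rounding, not something the notebook does.

```python
from decimal import Decimal
from types import SimpleNamespace


def order_total(items):
    """Sum item prices exactly by converting each float via its string form."""
    return sum((Decimal(str(item.price)) for item in items), Decimal("0"))


# Stand-ins for the referenced product documents.
items = [
    SimpleNamespace(name="Laptop", price=999.99),
    SimpleNamespace(name="Mouse", price=29.99),
]

total = order_total(items)
print(total)  # 1029.98
```

A `Decimal` would need converting (for example to a float or an integer number of cents) before being stored on the order.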
Async References with Lazy Loading
# Initialize async client
async_client = async_demo_client()
async_db = AsyncFireProx(async_client)
async_users = async_db.collection('doc_ref_async_users')
async_posts = async_db.collection('doc_ref_async_posts')
# Create user
async_user = async_users.new()
async_user.name = 'Tim Berners-Lee'
async_user.invention = 'World Wide Web'
await async_user.save(doc_id='tim')
print(f"👤 Async user: {async_user.name}")
# Create post with reference
async_post = async_posts.new()
async_post.title = 'Information Management Proposal'
async_post.author = async_user # Async reference
await async_post.save(doc_id='post_async_1')
print(f"📝 Async post: {async_post.title}")
# Read back
retrieved = async_db.doc('doc_ref_async_posts/post_async_1')
await retrieved.fetch()
print(f"\n📄 Retrieved post: {retrieved.title}")
print(f" Author (before lazy load): State = {retrieved.author.state}")
# Lazy loading works with async too!
print(f" Author name: {retrieved.author.name}")
print(f" Invention: {retrieved.author.invention}")
print(f" Author (after lazy load): State = {retrieved.author.state}")
print("\n✅ Async references support lazy loading!")
👤 Async user: Tim Berners-Lee
📝 Async post: Information Management Proposal
📄 Retrieved post: Information Management Proposal
Author (before lazy load): State = ATTACHED
Author name: Tim Berners-Lee
Invention: World Wide Web
Author (after lazy load): State = LOADED
✅ Async references support lazy loading!
Async Nested References
# Create team members
member1 = async_users.new()
member1.name = 'Vint Cerf'
await member1.save(doc_id='vint')
member2 = async_users.new()
member2.name = 'Bob Kahn'
await member2.save(doc_id='bob')
# Create project with nested references
async_project = async_posts.new()
async_project.title = 'TCP/IP Protocol'
async_project.team = {
    'lead': member1,
    'members': [member1, member2]
}
await async_project.save(doc_id='project_async_1')
print(f"🚀 Async project: {async_project.title}")
# Read back
retrieved_project = async_db.doc('doc_ref_async_posts/project_async_1')
await retrieved_project.fetch()
print("\n👥 Team:")
print(f" Lead: {retrieved_project.team['lead'].name}")
print(" Members:")
for member in retrieved_project.team['members']:
    print(f" • {member.name}")
print("\n✅ Async nested references work perfectly!")
🚀 Async project: TCP/IP Protocol
👥 Team:
Lead: Vint Cerf
Members:
• Vint Cerf
• Bob Kahn
✅ Async nested references work perfectly!
Summary
✅ Key Capabilities
Automatic Conversion
- Assignment: post.author = user converts FireObject → DocumentReference
- Retrieval: Reading back converts DocumentReference → FireObject
- Nested: Works recursively in lists and dicts
- Transparent: Conversions happen automatically
Lazy Loading
- ATTACHED State: Referenced objects start without data loaded
- Auto-Fetch: First attribute access triggers data fetch
- LOADED State: After fetch, subsequent accesses are instant
- Async Support: Lazy loading works for both sync and async
Validation
- DETACHED Check: Cannot assign unsaved FireObjects
- Type Safety: Cannot mix sync and async FireObjects
- State Tracking: Objects maintain state through lifecycle
🎯 Best Practices
✅ DO:
1. Save before referencing
user = users.new()
user.name = 'Ada'
user.save(doc_id='ada') # Save first!
post.author = user # Now can reference
2. Use semantic key names in dicts
doc.contributors = {
    'author': author_user,
    'editor': editor_user,
    'reviewer': reviewer_user
}
3. Query by reference
# Find all posts by an author
user_posts = posts.where('author', '==', user._doc_ref).get()
4. Leverage lazy loading
# No need to manually fetch
author_name = post.author.name # Automatically loads
❌ DON'T:
1. Reference DETACHED objects
# Bad - will raise ValueError
unsaved_user = users.new()
post.author = unsaved_user # Error!
2. Mix sync and async
# Bad - will raise TypeError
async_user = async_users.new()
await async_user.save()
sync_post.author = async_user # Error!
3. Create circular references
# Avoid - can cause infinite recursion
doc1.ref = doc2
doc2.ref = doc1
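If a cycle cannot be avoided, code that follows references needs its own guard. A minimal sketch under stated assumptions: `links` is a hypothetical in-memory map from a document path to the paths it references, and `reachable_paths` is an illustrative helper that visits each path at most once.

```python
def reachable_paths(start_path, links):
    """Collect every path reachable from start_path, visiting each one once."""
    visited = []   # paths in the order they were first reached
    seen = set()   # fast membership test that breaks cycles
    stack = [start_path]
    while stack:
        path = stack.pop()
        if path in seen:
            continue  # already handled; following it again would loop forever
        seen.add(path)
        visited.append(path)
        stack.extend(links.get(path, []))
    return visited


# doc1 and doc2 reference each other, as in the example above.
links = {"docs/doc1": ["docs/doc2"], "docs/doc2": ["docs/doc1"]}
print(reachable_paths("docs/doc1", links))  # ['docs/doc1', 'docs/doc2']
```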
📊 Common Patterns
1. Ownership Tracking
doc.owner = user
doc.created_by = user
2. Parent/Child Relationships
reply.parent = thread
# Query children
replies = collection.where('parent', '==', thread._doc_ref).get()
3. Many-to-Many via Lists
paper.reviewers = [user1, user2, user3]
course.students = [student1, student2, ...]
4. Cross-Collection References
order.customer = customer # Different collection
order.items = [product1, product2] # Another collection
💡 Performance Tips
- Lazy Loading: Referenced data loads on-demand (efficient)
- Caching: Same reference returns same object instance
- Batch Queries: Consider using where() to find related documents
- Avoid Deep Nesting: Multiple levels of references = multiple fetches
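The caching tip, where the same reference yields the same object instance, can be pictured as a cache keyed by document path. This is a sketch of the general idea, not FireProx internals; `RefCache` and `make_wrapper` are hypothetical names.

```python
class RefCache:
    """Hypothetical cache that hands out one wrapper instance per document path."""

    def __init__(self, factory):
        self._factory = factory  # assumed callable: path -> wrapper object
        self._by_path = {}

    def get(self, path):
        # Build the wrapper only on the first request for this path.
        if path not in self._by_path:
            self._by_path[path] = self._factory(path)
        return self._by_path[path]


created = []  # records how many wrappers were actually built


def make_wrapper(path):
    created.append(path)
    return {"path": path}


cache = RefCache(make_wrapper)
first = cache.get("doc_ref_users/ada")
second = cache.get("doc_ref_users/ada")
print(first is second)  # True
print(created)          # ['doc_ref_users/ada']
```

Because both lookups return one object, data loaded through the first access is already available on the second, which is what makes repeated access cheap.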
📚 Learn More
- Firestore references: https://firebase.google.com/docs/firestore/data-model#references
- FireProx state machine: See state.py documentation
- Query documentation: See pagination and queries notebooks