Firestore Deep Dive

Firestore is a document database with offline-first semantics out of the box. For the right app shape (real-time UIs, moderate write volume, user-isolated data) it can stand in for much of a custom backend. For the wrong shape (big analytics queries, joins across millions of docs) it is the wrong tool. This chapter covers the right-shape patterns.

Data modeling — collections, documents, subcollections

users/                       (collection)
  userId-a/                  (document)
    name: "Aarav"
    email: "a@x.com"
    createdAt: Timestamp

    orders/                  (subcollection)
      order-1/               (document)
        total: 4999
        status: "shipped"

        items/               (subcollection)
          item-a/            (document)
            productId: "p1"
            quantity: 2
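
Collections and documents strictly alternate in a path, so a document path always has an even number of segments and a collection path an odd number. A minimal sketch of that rule in plain Kotlin follows; the helper names (segments, isDocumentPath, itemPath) are illustrative and not part of the Firestore SDK.

```kotlin
// Hypothetical helpers: they only manipulate strings and never touch Firestore.

/** Splits a slash-separated Firestore path into its non-empty segments. */
fun segments(path: String): List<String> = path.split('/').filter { it.isNotEmpty() }

/** A path with an even, non-zero number of segments points at a document. */
fun isDocumentPath(path: String): Boolean =
    segments(path).let { it.isNotEmpty() && it.size % 2 == 0 }

/** Builds the path of an order item in the users/orders/items hierarchy. */
fun itemPath(userId: String, orderId: String, itemId: String): String =
    "users/$userId/orders/$orderId/items/$itemId"
```

For example, itemPath("userId-a", "order-1", "item-a") yields a six-segment document path, while "users/userId-a/orders" has three segments and therefore names a collection.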

Modeling rules of thumb

Small, stable document shape. Firestore bills per document read, not per byte, but every read transfers the whole document: a 300 KB document read N times is 300 KB × N of bandwidth.
Denormalize: duplicate data when it saves reads. Firestore has no joins.
Max document size: 1 MiB. Max subcollection depth: 100 levels.
Collection group queries span every subcollection with the same name, whatever its parent (for example, every items subcollection under users/*/orders/*).

Setup

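# libs.versions.toml, [libraries] section
# No version on the Firestore entry: it is expected to come from the Firebase BoM.
# Recent BoM releases ship the Kotlin extensions in the main firebase-firestore
# artifact, so the -ktx suffix can be dropped there.
firebase-firestore = { module = "com.google.firebase:firebase-firestore-ktx" }
kotlinx-coroutines-play-services = { module = "org.jetbrains.kotlinx:kotlinx-coroutines-play-services", version = "1.9.0" }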

// Data classes with Kotlinx serialization can't be used with Firestore directly.
// Use @DocumentId + no-arg constructor, or manual mapping.

val firestore: FirebaseFirestore = Firebase.firestore.apply {
    // Enable offline persistence (default on, but be explicit)
    firestoreSettings = firestoreSettings {
        isPersistenceEnabled = true
        cacheSizeBytes = 100L * 1024 * 1024            // 100 MB local cache
        host = if (BuildConfig.DEBUG) "10.0.2.2:8080" else "firestore.googleapis.com"
        isSslEnabled = !BuildConfig.DEBUG
    }
}

Writing data

Set (create or overwrite)

data class UserDoc(
    @DocumentId val id: String = "",
    val name: String = "",
    val email: String = "",
    @ServerTimestamp val createdAt: Timestamp? = null
)

suspend fun createUser(id: String, name: String, email: String) {
    firestore.collection("users").document(id).set(
        UserDoc(id = id, name = name, email = email)
    ).await()
}

Update (partial)

suspend fun updateEmail(userId: String, newEmail: String) {
    firestore.collection("users").document(userId).update(
        mapOf(
            "email" to newEmail,
            "updatedAt" to FieldValue.serverTimestamp()
        )
    ).await()
}

Increment, array ops, delete fields

firestore.collection("posts").document(postId).update(
    mapOf(
        "viewCount" to FieldValue.increment(1),
        "likedBy" to FieldValue.arrayUnion(userId),
        "retractedAt" to FieldValue.delete()
    )
).await()

FieldValue.increment is an atomic server-side operation, so there is no read-modify-write race.

Reading — one-shot

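suspend fun loadUser(id: String): UserDoc? = try {
    val snap = firestore.collection("users").document(id).get().await()
    snap.toObject(UserDoc::class.java)   // null when the document does not exist
} catch (e: FirebaseFirestoreException) {
    null                                 // also null on failure; log or rethrow if callers must tell the two apart
}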

// Specify source
val remote = firestore.collection("users").document(id)
    .get(Source.SERVER).await()

val cached = firestore.collection("users").document(id)
    .get(Source.CACHE).await()

Real-time listeners

fun observeUser(id: String): Flow<UserDoc?> = callbackFlow {
    val registration = firestore.collection("users").document(id)
        .addSnapshotListener { snap, error ->
            if (error != null) {
                close(error)
                return@addSnapshotListener
            }
            trySend(snap?.toObject(UserDoc::class.java))
        }
    awaitClose { registration.remove() }
}

Metadata and local writes

firestore.collection("posts").document(id)
    .addSnapshotListener(MetadataChanges.INCLUDE) { snap, _ ->
        if (snap?.metadata?.hasPendingWrites() == true) {
            // Local write hasn't reached the server yet (offline)
        }
        if (snap?.metadata?.isFromCache == true) {
            // Data is from local cache, not live
        }
    }

The metadata tells you whether to show a "sending…" indicator for optimistic UI.
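
One way to keep that decision out of the listener is to fold the two metadata flags into a single UI state. The sketch below is plain Kotlin with no Firebase dependency; SyncState and syncStateOf are hypothetical names, and the priority order (pending writes first) is a design assumption rather than anything the SDK prescribes.

```kotlin
/** UI-facing sync states; the names are illustrative. */
enum class SyncState { SENDING, CACHED, SYNCED }

/**
 * Maps the two snapshot metadata flags to a single state.
 * Pending writes take priority so the "sending" indicator shows
 * even when both flags are set.
 */
fun syncStateOf(hasPendingWrites: Boolean, isFromCache: Boolean): SyncState = when {
    hasPendingWrites -> SyncState.SENDING
    isFromCache -> SyncState.CACHED
    else -> SyncState.SYNCED
}
```

Inside the listener this would be called as syncStateOf(snap.metadata.hasPendingWrites(), snap.metadata.isFromCache) and the result pushed into UI state.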

Queries

// Simple
val recent = firestore.collection("posts")
    .whereEqualTo("authorId", userId)
    .orderBy("createdAt", Query.Direction.DESCENDING)
    .limit(20)
    .get().await()

// Compound
val trending = firestore.collection("posts")
    .whereGreaterThan("likes", 100)
    .orderBy("likes", Query.Direction.DESCENDING)
    .orderBy("createdAt", Query.Direction.DESCENDING)
    .limit(50)
    .get().await()

// in / array-contains (whereNotIn takes the same arguments as whereIn)
val batch = firestore.collection("users")
    .whereIn("role", listOf("admin", "moderator"))
    .get().await()

val tagged = firestore.collection("posts")
    .whereArrayContains("tags", "kotlin")
    .get().await()

// Collection group
val allOrderItems = firestore.collectionGroup("items")
    .whereEqualTo("productId", productId)
    .get().await()

Pagination with `startAfter`

suspend fun loadPage(lastDoc: DocumentSnapshot? = null, pageSize: Int = 20): QuerySnapshot {
    var query = firestore.collection("posts")
        .orderBy("createdAt", Query.Direction.DESCENDING)
        .limit(pageSize.toLong())
    if (lastDoc != null) query = query.startAfter(lastDoc)
    return query.get().await()
}
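
Callers usually need to remember the last document between pages and to know when the list is exhausted. The following is a dependency-free sketch of that bookkeeping; CursorPager is a hypothetical class, and it is synchronous only so that it can run without Firebase or coroutines.

```kotlin
/**
 * Generic cursor pager. `fetch` loads one page that starts after `cursor`
 * (null means "from the beginning"); `cursorOf` extracts the cursor from an item.
 */
class CursorPager<T, C>(
    private val pageSize: Int,
    private val fetch: (cursor: C?, limit: Int) -> List<T>,
    private val cursorOf: (T) -> C,
) {
    private var cursor: C? = null

    /** True once a page came back shorter than pageSize. */
    var endReached: Boolean = false
        private set

    /** Returns the next page, or an empty list once the end was reached. */
    fun next(): List<T> {
        if (endReached) return emptyList()
        val page = fetch(cursor, pageSize)
        if (page.size < pageSize) endReached = true
        // Remember where this page ended so the next call starts after it.
        page.lastOrNull()?.let { cursor = cursorOf(it) }
        return page
    }
}
```

With Firestore, fetch would wrap a suspending call such as loadPage and the cursor type would be DocumentSnapshot. Note that when the total is an exact multiple of pageSize, the end is only detected by one extra, empty fetch.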

Query limitations

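whereIn / whereArrayContainsAny: max 30 values per clause (whereNotIn: max 10)
Inequality filters (<, <=, >, >=, !=): historically limited to one field per query; newer Firestore releases accept range filters on several fields, so check the limits for your SDK version
Compound queries that combine equality filters with a range filter or orderBy require a composite index. The error message includes the URL that creates it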
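
When a lookup needs more than 30 values, a common workaround is to split the list and run one query per chunk. This is a minimal sketch in plain Kotlin; fetchInChunks and MAX_IN_VALUES are hypothetical names, and fetch stands in for whatever runs the actual whereIn query.

```kotlin
/** Maximum number of values assumed for a single whereIn clause. */
const val MAX_IN_VALUES = 30

/**
 * Splits `values` into groups of at most MAX_IN_VALUES, runs `fetch` once
 * per group, and concatenates the results. Duplicates are removed first so
 * the same value is never queried twice.
 */
fun <K, R> fetchInChunks(values: Collection<K>, fetch: (List<K>) -> List<R>): List<R> =
    values.distinct().chunked(MAX_IN_VALUES).flatMap(fetch)
```

Each chunk is a separate query, so the combined result is not globally ordered or limited; sort and trim on the client if that matters.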

Transactions and batched writes

Transaction — read then write atomically

suspend fun transferPoints(fromId: String, toId: String, amount: Int) {
    firestore.runTransaction { tx ->
        val fromRef = firestore.collection("users").document(fromId)
        val toRef = firestore.collection("users").document(toId)

        val fromSnap = tx.get(fromRef)
        val toSnap = tx.get(toRef)

        val fromBalance = fromSnap.getLong("points") ?: 0
        if (fromBalance < amount) throw InsufficientFunds()   // app-defined exception; fails the transaction

        tx.update(fromRef, "points", fromBalance - amount)
        tx.update(toRef, "points", (toSnap.getLong("points") ?: 0) + amount)
    }.await()
}

Transactions auto-retry on concurrent modification. Don't do side effects (network calls, push notifications) inside the transaction lambda — it may execute multiple times.
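
To see why side effects inside the lambda are risky, it helps to model the retry loop. The sketch below is a simplified stand-in written in plain Kotlin, not the SDK's implementation; runWithRetry and ConflictException are hypothetical names.

```kotlin
/** Thrown by the simulated commit when another writer changed the data first. */
class ConflictException : RuntimeException("concurrent modification")

/**
 * Minimal model of an optimistic transaction loop: run `block`, try to
 * commit its result, and start over on conflict.
 */
fun <T> runWithRetry(maxAttempts: Int, commit: (T) -> Unit, block: () -> T): T {
    var lastConflict: ConflictException? = null
    repeat(maxAttempts) {
        val result = block()          // may run several times
        try {
            commit(result)
            return result             // committed: side effects are safe after this point
        } catch (e: ConflictException) {
            lastConflict = e          // another writer won the race; retry
        }
    }
    throw lastConflict ?: IllegalArgumentException("maxAttempts must be positive")
}
```

If the commit conflicts twice before succeeding, block runs three times, so a notification sent from inside it would go out three times. Sending it after runWithRetry returns avoids that.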

Batched write — multiple writes, no reads

val batch = firestore.batch()
batch.set(firestore.collection("orders").document(orderId), order)
batch.update(firestore.collection("carts").document(userId), "items", emptyList<Any>())
batch.update(firestore.collection("users").document(userId),
    "orderCount", FieldValue.increment(1))
batch.commit().await()

Up to 500 writes per batch. Atomic: all or nothing.

Offline persistence

Firestore ships with aggressive offline caching:

One-shot get() calls try the server first and fall back to the local cache when offline (pass Source.CACHE to read the cache directly)
Writes are queued and replayed when online
Listeners emit from cache immediately, then from server

// Explicit offline toggling (for testing)
suspend fun goOffline() { firestore.disableNetwork().await() }
suspend fun goOnline() { firestore.enableNetwork().await() }

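// Clear local cache (on sign-out). terminate() shuts this instance down for good,
// so obtain a fresh one from Firebase.firestore before the next read or write.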
suspend fun clearCache() {
    firestore.terminate().await()
    firestore.clearPersistence().await()
}

Snapshots in memory — listener aggregation

Every active listener consumes memory. Keep the number reasonable:

class InboxViewModel @Inject constructor(
    private val firestore: FirebaseFirestore
) : ViewModel() {
    val conversations: StateFlow<List<Conversation>> = observeConversations()
        .stateIn(viewModelScope, SharingStarted.WhileSubscribed(5_000), emptyList())

    private fun observeConversations(): Flow<List<Conversation>> = callbackFlow {
        val reg = firestore.collection("conversations")
            .whereArrayContains("participants", currentUserId)   // uid of the signed-in user, supplied elsewhere
            .addSnapshotListener { snap, _ ->
                trySend(snap?.toObjects(Conversation::class.java) ?: emptyList())
            }
        awaitClose { reg.remove() }
    }
}

SharingStarted.WhileSubscribed(5_000) tears down the listener 5s after the last collector disconnects (screen goes background). Re-subscribes when the UI returns.

Security Rules — the authorization layer

Every Firestore operation runs through Security Rules — server-side code that decides allow/deny based on the incoming request:

rules_version = '2';
service cloud.firestore {
  match /databases/{db}/documents {

    // Any signed-in user can read a profile; only the owner can write it
    match /users/{userId} {
      allow read: if request.auth != null;
      allow write: if request.auth != null && request.auth.uid == userId;

      // Subcollection inherits nothing — define its own rules
      match /orders/{orderId} {
        allow read: if request.auth.uid == userId;
        allow create: if request.auth.uid == userId &&
                         request.resource.data.total is number &&
                         request.resource.data.total > 0;
        allow update: if request.auth.uid == userId &&
                         resource.data.status != "shipped";   // lock after shipped
      }
    }

    // Admin-only collection
    match /admin/{doc=**} {
      allow read, write: if request.auth.token.role == 'admin';
    }

    // Public read, admin write
    match /products/{productId} {
      allow read: if true;
      allow write: if request.auth.token.role == 'admin';
    }

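    // Rate limiting via timestamps. On create, `resource` is null, so the
    // previous post time is read from the author's user document instead.
    // Assumes each user document carries a lastPostAt timestamp that is
    // updated together with every new post.
    match /posts/{postId} {
      allow create: if request.auth != null &&
                       request.time > get(/databases/$(db)/documents/users/$(request.auth.uid)).data.lastPostAt + duration.value(10, 's');
    }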
  }
}

Testing rules with the emulator

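// Node test (Jest); the project id and rules path are placeholders
const fs = require('fs');
const {initializeTestEnvironment, assertFails} = require('@firebase/rules-unit-testing');
let testEnv;
beforeAll(async () => {
  testEnv = await initializeTestEnvironment({
    projectId: 'demo-project', firestore: {rules: fs.readFileSync('firestore.rules', 'utf8')},
  });
});
afterAll(() => testEnv.cleanup());

test('user cannot read other users orders', async () => {
  const alice = testEnv.authenticatedContext('alice').firestore();
  await assertFails(alice.doc('users/bob/orders/order1').get());
});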

Run firebase emulators:exec --only firestore 'npm test' in CI.

Data classes and Firestore mapping

Firestore's auto-mapping needs a no-arg constructor and properties it can write back. A Kotlin data class whose properties all have default values satisfies both:

data class UserDoc(
    @DocumentId var id: String = "",
    var name: String = "",
    var email: String = "",
    @ServerTimestamp var createdAt: Timestamp? = null,
    @get:PropertyName("is_premium") @set:PropertyName("is_premium")
    var isPremium: Boolean = false,
    @get:Exclude var localOnly: String = ""      // not serialized; @get: puts the annotation on the getter the mapper inspects
)

// Nested objects work too
data class ConversationDoc(
    @DocumentId var id: String = "",
    var participants: List<String> = emptyList(),
    var lastMessage: LastMessage? = null
)

data class LastMessage(
    var body: String = "",
    var senderId: String = "",
    @ServerTimestamp var sentAt: Timestamp? = null
)

Mapping to domain types

fun UserDoc.toDomain(): User = User(
    id = UserId(id),
    name = name,
    email = Email.parse(email).valueOr { error(it) },
    createdAt = createdAt?.toDate()?.toInstant() ?: Instant.EPOCH,
    isPremium = isPremium
)

Keep Firestore DTOs in :data; convert to domain types at the repository boundary.

Aggregation queries (counts, sums)

Firestore now supports server-side aggregations — no need to fetch all docs:

val query = firestore.collection("posts").whereEqualTo("authorId", userId)

val count = query.count().get(AggregateSource.SERVER).await().count
val sum = query.aggregate(
    AggregateField.sum("likes"),
    AggregateField.average("commentsCount")
).get(AggregateSource.SERVER).await()

Aggregations are billed by index entries scanned, not documents returned: one read per batch of up to 1,000 matching index entries, instead of N reads for N documents.
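
The saving can be estimated with a little arithmetic. This sketch assumes the billing rule in Firestore's pricing documentation at the time of writing (one read per batch of up to 1,000 index entries matched, with a minimum of one read per query); the function names are illustrative.

```kotlin
/** Index entries covered by one billed read, per the assumed pricing rule. */
const val ENTRIES_PER_READ = 1_000L

/** Reads billed for an aggregation that matches `matchedEntries` index entries. */
fun aggregationReads(matchedEntries: Long): Long {
    require(matchedEntries >= 0) { "matchedEntries must not be negative" }
    // Ceiling division, with a minimum charge of one read even for zero matches.
    return maxOf(1L, (matchedEntries + ENTRIES_PER_READ - 1) / ENTRIES_PER_READ)
}

/** Reads billed when the same documents are fetched with a plain get(). */
fun fullFetchReads(matchedDocs: Long): Long = maxOf(1L, matchedDocs)
```

Under that assumption, counting 25,000 posts costs 25 reads as an aggregation versus 25,000 reads when every document is fetched.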

Performance patterns

1. Index every `whereEqualTo` + `orderBy` combination

Firestore auto-indexes single-field queries. Compound queries need a composite index. The error message includes a URL that creates it with one click.

2. Denormalize hot paths

For a social feed where "posts with author name + avatar" is queried 100× more than the author is edited, embed a snapshot of the author in each post:

data class PostDoc(
    @DocumentId var id: String = "",
    var body: String = "",
    var authorId: String = "",
    var authorName: String = "",       // denormalized
    var authorAvatar: String = ""      // denormalized
)

Update via a Cloud Function trigger when the user changes their profile.

3. Use Bundles for cold-start reads

firestore.loadBundle(bundleData) pre-hydrates the cache with docs generated server-side — no client-side reads for the first render.

4. Cache sizing

Default cache is 100 MB. For read-heavy apps, increase:

firestoreSettings { cacheSizeBytes = 250L * 1024 * 1024 }

Cost optimization

Firestore charges per read, write, delete, and network egress. Tips:

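Listener re-subscribes are cheap rather than free: a listener bills its initial result set once and then only changed documents, but re-attaching after more than 30 minutes disconnected bills the full result set again. Keep listeners attached while the UI is alive.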
Paginate queries; don't fetch entire collections.
Don't count on projections: the Android SDK always returns the whole document, so keep documents small.
Use aggregation queries for counts instead of get() then .size().
Batched writes are still 1 write per doc — doesn't save cost.
Offline reads from cache don't cost anything.

Common anti-patterns

Anti-patterns (Firestore mistakes)

Storing everything in one giant document
Deep nesting (5+ subcollections)
Client-side filtering after whereEqualTo
No security rules (production left open)
Transactions that do HTTP calls
Forgetting to remove snapshot listeners

Best practices (production patterns)

Small documents, denormalized for read patterns
Max 2-3 levels of subcollections
Server-side filtering with composite indexes
Rules enforced + emulator-tested in CI
Side-effect-free transaction blocks
awaitClose { reg.remove() } in every callbackFlow

Practice exercises

01. Model a chat schema. Design conversations/ and messages/ collections with appropriate denormalization. Write security rules that enforce participant-only access.
02. Real-time Flow. Write observeMessages(convId) returning Flow<List<Message>> backed by addSnapshotListener with proper awaitClose cleanup.
03. Transaction. Implement a transferPoints transaction. Verify it retries on concurrent modification by racing two coroutines.
04. Emulator tests. Set up the Firebase emulators and @firebase/rules-unit-testing. Write a test asserting a non-admin cannot write to /admin/.
05. Aggregation. Replace a query that counts all posts by a user (fetches all docs) with an aggregation query. Compare read cost.

Continue to Cloud Storage & FCM for file uploads and push notifications, or Remote Config & App Check for feature flags and anti-abuse.
