Designing a Real Time Chat Architecture on Android

Jaewoong Eum (skydoves) | 12 min read

Building a real time chat screen requires coordinating persistent network connections, local persistence, optimistic UI updates, and efficient list rendering within a modern Android architecture. The chat domain is deceptively complex: messages must appear instantly when sent, arrive in real time from other participants, survive offline periods, and maintain strict ordering even under network partitions. Getting the architecture right at the foundation determines whether the system remains maintainable as features like read receipts, typing indicators, and media messages are added.

By the end of this lesson, you will be able to:

  • Describe how WebSocket connections provide full-duplex real time communication and how to manage their lifecycle on Android.
  • Explain the offline first pattern where a local Room database serves as the single source of truth for the UI.
  • Trace the data flow of a sent message from user input through optimistic insertion, server acknowledgment, and status reconciliation.
  • Apply message ordering strategies that handle clock skew, network delays, and concurrent writes.
  • Design a performant Compose UI layer using LazyColumn with stable keys and proper state hoisting.

Real Time Transport: WebSocket Lifecycle Management

A chat screen requires a persistent, bidirectional communication channel. HTTP polling introduces latency and wastes bandwidth because the client must keep asking for updates that usually are not there. WebSockets avoid this by upgrading an HTTP connection, through an opening handshake, into a persistent full-duplex channel over the same TCP connection, so either side can send at any time. On Android, OkHttp provides a robust WebSocket implementation.

The manager class wraps OkHttp's WebSocket and exposes incoming messages as a SharedFlow:

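import kotlinx.coroutines.channels.BufferOverflow
import kotlinx.coroutines.flow.MutableSharedFlow
import kotlinx.coroutines.flow.SharedFlow
import kotlinx.coroutines.flow.asSharedFlow
import kotlinx.serialization.decodeFromString
import kotlinx.serialization.encodeToString
import kotlinx.serialization.json.Json
import okhttp3.OkHttpClient
import okhttp3.Request
import okhttp3.Response
import okhttp3.WebSocket
import okhttp3.WebSocketListener

// Message is assumed to be a @Serializable class defined elsewhere.
class ChatWebSocketManager(
    private val okHttpClient: OkHttpClient,
    private val baseUrl: String
) {
    private var webSocket: WebSocket? = null

    // replay = 0: values emitted while nothing collects are lost. With slow
    // collectors, up to 64 values are buffered and then the oldest is dropped.
    private val _incomingMessages = MutableSharedFlow<Message>(
        extraBufferCapacity = 64,
        onBufferOverflow = BufferOverflow.DROP_OLDEST
    )
    val incomingMessages: SharedFlow<Message> = _incomingMessages.asSharedFlow()

    private val listener = object : WebSocketListener() {
        override fun onMessage(webSocket: WebSocket, text: String) {
            // Skip malformed payloads instead of letting a decode exception fail the connection.
            runCatching { Json.decodeFromString<Message>(text) }
                .onSuccess { _incomingMessages.tryEmit(it) }
        }
        override fun onFailure(webSocket: WebSocket, t: Throwable, response: Response?) {
            scheduleReconnect() // Exponential backoff; implementation not shown
        }
    }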

The connection and messaging methods manage the socket lifecycle:

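    fun connect(conversationId: String) {
        val request = Request.Builder().url("$baseUrl/chat/$conversationId").build()
        // newWebSocket returns immediately; the handshake completes asynchronously.
        webSocket = okHttpClient.newWebSocket(request, listener)
    }

    // Returns false when no socket is open or OkHttp rejects the message
    // (for example, the socket is closing or the outgoing queue is full).
    fun send(message: Message): Boolean {
        return webSocket?.send(Json.encodeToString(message)) ?: false
    }

    fun disconnect() {
        // 1000 is the WebSocket "normal closure" status code.
        webSocket?.close(1000, "User left chat")
        webSocket = null
    }
}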

The WebSocket connection should be scoped to the chat screen's lifecycle. Reconnection logic must handle transient failures with exponential backoff, starting at one second and doubling up to 30 seconds. For background delivery when the socket is inactive, Firebase Cloud Messaging can supplement the WebSocket with push notifications.
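
The backoff schedule described above can be sketched as a small pure function. This is a minimal illustration, not the article's implementation: the name `reconnectDelayMillis` and its parameters are hypothetical, and the article's `scheduleReconnect()` is not shown in the source.

```kotlin
// Hypothetical helper: not part of the original ChatWebSocketManager.
// Returns how long to wait before reconnect attempt number `attempt` (0-based).
fun reconnectDelayMillis(
    attempt: Int,
    baseMillis: Long = 1_000L,  // first retry after one second
    maxMillis: Long = 30_000L   // never wait longer than 30 seconds
): Long {
    require(attempt >= 0) { "attempt must be non-negative" }
    // Cap the exponent so the shift cannot overflow for the default base.
    val exponent = attempt.coerceAtMost(20)
    val uncapped = baseMillis shl exponent  // baseMillis * 2^exponent
    return uncapped.coerceAtMost(maxMillis)
}
```

With the defaults, attempts 0 through 4 wait 1, 2, 4, 8, and 16 seconds, and every later attempt waits 30 seconds. The attempt counter would typically be reset once a connection opens successfully. Production code often adds random jitter so that many clients do not reconnect in lockstep; that is omitted here.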

Offline First Persistence with Room

The single most important architectural decision is making the local database the single source of truth. The Compose UI never reads directly from the network. All incoming messages are written to Room, and the UI observes a Flow emitted by a DAO query. This guarantees consistent state and offline access.
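
Before looking at the Room schema, the single-source-of-truth idea can be sketched without any Android dependencies. The snippet below is an illustration under stated assumptions: `StoredMessage`, `InMemoryMessageStore`, and `onNetworkMessage` are hypothetical names, and a plain callback stands in for the Flow that Room would provide. It is not thread-safe.

```kotlin
// Hypothetical in-memory stand-in for the Room table; names are illustrative only.
data class StoredMessage(val id: String, val text: String)

class InMemoryMessageStore {
    // Keyed by message id, so writing the same id twice replaces the row.
    private val rows = LinkedHashMap<String, StoredMessage>()
    private val observers = mutableListOf<(List<StoredMessage>) -> Unit>()

    // Register an observer and immediately deliver the current snapshot,
    // similar in spirit to collecting a Flow returned by a DAO query.
    fun observe(observer: (List<StoredMessage>) -> Unit) {
        observers += observer
        observer(rows.values.toList())
    }

    // Insert or replace by primary key, then notify every observer.
    fun upsert(message: StoredMessage) {
        rows[message.id] = message
        val snapshot = rows.values.toList()
        observers.forEach { it(snapshot) }
    }
}

// The network layer only writes to the store; it never talks to the UI.
fun onNetworkMessage(store: InMemoryMessageStore, message: StoredMessage) {
    store.upsert(message)
}
```

The point of the design is that the UI has exactly one input, the store's snapshots. A message that arrives over the socket, one restored after a restart, and one typed by the user all reach the screen through the same path.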

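import androidx.room.Dao
import androidx.room.Entity
import androidx.room.PrimaryKey
import androidx.room.Query
import androidx.room.Upsert
import kotlinx.coroutines.flow.Flow

@Entity(tableName = "messages")
data class MessageEntity(
    @PrimaryKey val id: String,
    val conversationId: String,
    val senderId: String,
    val text: String,
    val timestamp: Long,
    val status: String, // SENDING, SENT, DELIVERED, FAILED
    val localSequence: Long // Monotonic local ordering tiebreaker
)

@Dao
interface MessageDao {
    // Newest first; localSequence breaks ties between equal timestamps.
    @Query("SELECT * FROM messages WHERE conversationId = :id ORDER BY timestamp DESC, localSequence DESC")
    fun observeMessages(id: String): Flow<List<MessageEntity>>

    // @Upsert requires Room 2.5.0 or newer.
    @Upsert
    suspend fun upsert(message: MessageEntity)
}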

Room's Flow-returning queries are reactive: whenever the messages table changes through an insert, update, or delete, Room re-runs the query and re-emits the result to all active collectors. The UI therefore updates automatically when a new message arrives from the WebSocket, when the user sends a message optimistically, or when a status changes from SENDING to SENT. The localSequence column is a monotonically increasing counter that serves as a tiebreaker when messages share identical timestamps.
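
The tiebreaker can be illustrated in plain Kotlin with the same ordering as the DAO query. This is a sketch under stated assumptions: `OrderedRow`, `LocalSequence`, and `orderForDisplay` are hypothetical names that mirror only the two ordering columns, and the counter lives in memory, whereas a real app would need to persist or re-seed it across process restarts.

```kotlin
// Hypothetical row type mirroring only the ordering columns of MessageEntity.
data class OrderedRow(val id: String, val timestamp: Long, val localSequence: Long)

// Hypothetical in-memory counter that hands out strictly increasing values.
class LocalSequence(start: Long = 0L) {
    private var last = start

    @Synchronized
    fun next(): Long = ++last
}

// Same ordering as the DAO query: timestamp DESC, then localSequence DESC.
val newestFirst: Comparator<OrderedRow> =
    compareByDescending<OrderedRow> { it.timestamp }
        .thenByDescending { it.localSequence }

fun orderForDisplay(rows: List<OrderedRow>): List<OrderedRow> =
    rows.sortedWith(newestFirst)
```

If two rows share a timestamp, the one that received its sequence number later sorts first in the newest-first list, so the relative order of same-millisecond messages stays deterministic across re-emissions.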
