Introduction

Smooth scrolling is critical for chat apps - lag or stutter can severely impact user satisfaction and retention. Chat interfaces face unique challenges due to dynamic, high-density content like text bubbles, images, emojis, and timestamps.

Our team recently encountered a subtle but challenging requirement while working on our chat implementation: dynamically positioning timestamps inline with the last line of text when space permits, or dropping them to a new line when the text is too wide. This seemingly minor design decision uncovered significant performance bottlenecks.

In this article, I'll walk you through two approaches that we used - SubcomposeLayout and the optimized Layout alternative - to demonstrate how seemingly small implementation choices can dramatically impact your app's performance. Whether you're building a chat UI or any complex custom layout in Compose, these techniques will help you identify and resolve critical performance bottlenecks.

Understanding the Technical Challenge

Why Dynamic Positioning Based on Text Content is Complex

Dynamic positioning of elements relative to text presents several unique challenges in UI development. In our case, positioning timestamps based on the available space in the last line of text is particularly complex for several reasons:

1. Variable Text Properties: Message text varies in length, content, and formatting. Each message could have different font sizes, weights, or even mixed formatting within a single message.

2. Line Break Uncertainty: Text wrapping is unpredictable at design time. The same message may wrap differently based on:
- Screen size and orientation
- Font scaling settings
- Dynamic container sizing
- Text accessibility settings

3. Measurement Dependencies: To determine if a timestamp fits inline, we need to:
- Measure the complete text layout first
- Calculate the width of the last line specifically
- Measure the timestamp element
- Compare these measurements against the container width
- Make positioning decisions based on these calculations
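
To make that dependency chain concrete, here is a minimal, framework-free sketch in Kotlin. It is not how Compose lays out text: it assumes a monospaced font where every character is `charWidthPx` wide and a greedy word-wrapping rule, and the names `wrapGreedy`, `lastLineWidthPx`, and `timestampFitsInline` are hypothetical. It only illustrates that the inline decision cannot be made until the text has been wrapped for a specific container width.

```kotlin
// Simplified model: every character has the same width and words are never split.
fun wrapGreedy(text: String, maxWidthPx: Int, charWidthPx: Int): List<String> {
    val maxChars = maxWidthPx / charWidthPx
    val lines = mutableListOf<String>()
    var current = ""
    for (word in text.split(" ").filter { it.isNotEmpty() }) {
        val candidate = if (current.isEmpty()) word else "$current $word"
        if (candidate.length <= maxChars || current.isEmpty()) {
            // The word still fits (or it is the first word of a line): keep filling this line.
            current = candidate
        } else {
            // The word does not fit: close the current line and start a new one.
            lines += current
            current = word
        }
    }
    if (current.isNotEmpty()) lines += current
    return lines
}

// Width of the last wrapped line in pixels (0 for empty text).
fun lastLineWidthPx(lines: List<String>, charWidthPx: Int): Int =
    (lines.lastOrNull()?.length ?: 0) * charWidthPx

// The timestamp fits inline only if last line + timestamp stay within the container width.
fun timestampFitsInline(lastLineWidthPx: Int, timestampWidthPx: Int, maxWidthPx: Int): Boolean =
    lastLineWidthPx + timestampWidthPx <= maxWidthPx
```

With 10px characters and a 50px timestamp, the message "See you at the station tomorrow" ends in a 160px last line inside a 200px container (timestamp drops to a new line), but in an 80px last line inside a 240px container (timestamp fits inline). The same message produces a different decision purely because the wrapping changed.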

Initial implementation using SubcomposeLayout

SubcomposeLayout is one of Jetpack Compose's most powerful but resource-intensive layout APIs. It defers the composition of its children to the measure pass, which makes it a fit for layouts where what gets composed depends on the measured size of something else.

In essence, SubcomposeLayout works through two critical phases:

Subcomposition: Compose the components individually as required, rather than all at once.
Measurement: Measure these individually composed components before determining the final arrangement.

For our timestamp positioning challenge, SubcomposeLayout seemed like the perfect solution. We needed to:

First measure the text content to determine line metrics
Then decide whether to place the timestamp inline or on a new line
Finally compose and position the timestamp based on that decision

Here's a simplified version of how we initially implemented the dynamic timestamp positioning using SubcomposeLayout:

@Composable
fun TextMessage_subcompose(
    modifier: Modifier = Modifier,
    message: Message,
    textColor: Color,
    bubbleMaxWidth: Dp = 280.dp
) {
    val maxWidthPx = with(LocalDensity.current) { bubbleMaxWidth.roundToPx() }

    SubcomposeLayout(modifier) { constraints ->
        
        // ━━━ Phase 1: Subcompose and measure text ━━━
        var textLayoutResult: TextLayoutResult? = null
        val textPlaceable = subcompose("text") {
            Text(
                text = message.text,
                color = textColor,
                onTextLayout = { textLayoutResult = it }
            )
        }[0].measure(constraints.copy(maxWidth = maxWidthPx))
        
        // Extract text metrics after measurement
        val textLayout = requireNotNull(textLayoutResult) {
            "Text layout should be available after subcomposition"
        }
        val lineCount = textLayout.lineCount
        val lastLineWidth = ceil(
            textLayout.getLineRight(lineCount - 1) - 
            textLayout.getLineLeft(lineCount - 1)
        ).toInt()
        val widestLineWidth = (0 until lineCount).maxOf { lineIndex ->
            ceil(
                textLayout.getLineRight(lineIndex) - 
                textLayout.getLineLeft(lineIndex)
            ).toInt()
        }

        // ━━━ Phase 2: Subcompose and measure footer ━━━
        val footerPlaceable = subcompose("footer") {
            MessageFooter(message = message)
        }[0].measure(constraints)

        // ━━━ Calculate container dimensions ━━━
        val canFitInline = lastLineWidth + footerPlaceable.width <= maxWidthPx
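        // Only reserve room for the footer next to the last line when it actually fits inline
        val containerWidth = if (canFitInline) {
            max(widestLineWidth, lastLineWidth + footerPlaceable.width)
        } else {
            max(widestLineWidth, footerPlaceable.width)
        }.coerceAtMost(maxWidthPx)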
        val containerHeight = if (canFitInline) {
            max(textPlaceable.height, footerPlaceable.height)
        } else {
            textPlaceable.height + footerPlaceable.height
        }

        // ━━━ Layout and placement ━━━
        layout(containerWidth, containerHeight) {
            textPlaceable.place(x = 0, y = 0)
            
            if (canFitInline) {
                footerPlaceable.place(
                    x = containerWidth - footerPlaceable.width,
                    y = textPlaceable.height - footerPlaceable.height
                )
            } else {
                footerPlaceable.place(
                    x = containerWidth - footerPlaceable.width,
                    y = textPlaceable.height
                )
            }
        }
    }
}

The logic seemed straightforward:

Measure the text first to get line metrics and determine the last line width
Measure the footer (timestamp and status icons) to know its dimensions
Calculate container dimensions based on whether the footer fits inline
Place both elements according to the inline/separate line decision
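
Once the measurements are known, these steps reduce to a small amount of integer arithmetic. The sketch below restates that arithmetic as a pure Kotlin function so it can be unit-tested without Compose. `BubbleGeometry` and `computeBubbleGeometry` are hypothetical names, all values are assumed to be in pixels, and the width rule for the separate-line case follows the Layout version shown later in this article.

```kotlin
import kotlin.math.max

// Result of the positioning decision, all values in pixels.
data class BubbleGeometry(
    val width: Int,
    val height: Int,
    val footerX: Int,
    val footerY: Int,
    val inline: Boolean
)

fun computeBubbleGeometry(
    widestLineWidth: Int,
    lastLineWidth: Int,
    textHeight: Int,
    footerWidth: Int,
    footerHeight: Int,
    maxWidthPx: Int
): BubbleGeometry {
    // Step 3: does the footer fit next to the last line of text?
    val inline = lastLineWidth + footerWidth <= maxWidthPx

    // Inline: the bubble must cover the widest line and the last line plus footer.
    // Separate line: the bubble must cover the widest line and the footer on its own.
    val unclampedWidth = if (inline) {
        max(widestLineWidth, lastLineWidth + footerWidth)
    } else {
        max(widestLineWidth, footerWidth)
    }
    val width = minOf(unclampedWidth, maxWidthPx)

    // Inline: the footer shares the last line. Separate line: it adds its own height.
    val height = if (inline) max(textHeight, footerHeight) else textHeight + footerHeight

    // Step 4: the footer is right-aligned, either on the last line or below the text.
    val footerY = if (inline) textHeight - footerHeight else textHeight
    return BubbleGeometry(width, height, width - footerWidth, footerY, inline)
}
```

Keeping the arithmetic separate from the measuring code is a design choice rather than a requirement: it lets edge cases such as a footer wider than the text be covered by plain JVM tests.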

This approach worked functionally: the timestamps were positioned correctly based on available space. However, as we scaled our chat implementation with additional features, new UI elements, and more complexity, our performance testing uncovered significant issues. These issues weren't caused by SubcomposeLayout alone; they emerged from the cumulative interaction of multiple components at scale. Even so, we decided to revisit our approach comprehensively.

Upon careful analysis of our TextMessage implementation, several performance bottlenecks were discovered:

Elevated Composition Overhead

Each measure pass invokes subcompose("text") and subcompose("footer"), triggering two separate subcompositions per message inside the layout phase. This is extra composition work compared to a standard Layout, where children are composed once, up front, in the regular composition phase.

Increased GC Pressure

Each subcompose invocation allocates intermediary lists and lambda instances. Under heavy scrolling scenarios (hundreds of messages), these temporary objects accumulate, leading to more frequent garbage collections and frame drops.

Layout Pass Complexity

SubcomposeLayout inherently requires more complex layout logic because composition and measurement are interleaved.

This complexity multiplies across all visible items during scrolling, creating a cumulative performance impact that becomes pronounced in production chat environments with hundreds of messages. These findings led us to explore a more efficient approach using Compose's standard Layout API, which could maintain the same dynamic positioning behavior while significantly reducing the computational overhead.

Optimized Implementation with Layout

After identifying the performance bottlenecks in our SubcomposeLayout approach, we turned to Compose's standard Layout API. Unlike SubcomposeLayout, the standard Layout follows Compose's conventional composition → measurement → placement pipeline, which offers several key advantages:

Single Composition Phase: All child composables are created during the normal composition phase, not during layout. This allows Compose's recomposition optimizations to work effectively - stable composables can be skipped when their inputs haven't changed.
Predetermined Children: Layout works with a fixed set of measurables that are known at composition time. This eliminates the dynamic allocation overhead of subcomposition and reduces garbage collection pressure.
Simplified Control Flow: The layout logic becomes more straightforward since we don't need to interleave composition and measurement operations.

Implementation Strategy

Our optimized approach maintains the same visual behavior while restructuring the implementation to work within Layout's constraints. Here's a simplified snippet of our optimized approach:

@Composable
fun TextMessage_layout(
    modifier: Modifier = Modifier,
    message: Message,
    textColor: Color,
    bubbleMaxWidth: Dp = 260.dp
) {
    // Shared reference for accessing text layout metrics during measurement
    val textLayoutRef = remember { Ref<TextLayoutResult>() }
    val density = LocalDensity.current

    Layout(
        modifier = modifier,
        content = {
            // Primary text content
            Text(
                text = message.text,
                color = textColor,
                onTextLayout = { result -> textLayoutRef.value = result }
            )
            // Footer containing timestamp and status indicators
            MessageFooter(message = message)
        }
    ) { measurables, constraints ->
        
        val maxWidthPx = with(density) { bubbleMaxWidth.roundToPx() }

        // ━━━ Single-pass measurement of all children ━━━
        val textPlaceable = measurables[0].measure(
            constraints.copy(maxWidth = maxWidthPx)
        )
        val footerPlaceable = measurables[1].measure(constraints)

        // ━━━ Extract text metrics for positioning logic ━━━
        val textLayout = requireNotNull(textLayoutRef.value) {
            "TextLayoutResult must be available after text measurement"
        }
        
        val lineCount = textLayout.lineCount
        val lastLineWidth = ceil(
            textLayout.getLineRight(lineCount - 1) - 
            textLayout.getLineLeft(lineCount - 1)
        ).toInt()
        
        val widestLineWidth = (0 until lineCount).maxOf { lineIndex ->
            ceil(
                textLayout.getLineRight(lineIndex) - 
                textLayout.getLineLeft(lineIndex)
            ).toInt()
        }

        // ━━━ Determine layout strategy ━━━
        val canFitInline = lastLineWidth + footerPlaceable.width <= maxWidthPx

        val containerWidth = if (canFitInline) {
            max(widestLineWidth, lastLineWidth + footerPlaceable.width)
        } else {
            max(widestLineWidth, footerPlaceable.width)
        }.coerceAtMost(maxWidthPx)

        val containerHeight = if (canFitInline) {
            max(textPlaceable.height, footerPlaceable.height)
        } else {
            textPlaceable.height + footerPlaceable.height
        }

        // ━━━ Element placement ━━━
        layout(containerWidth, containerHeight) {
            // Position text at top-left
            textPlaceable.place(x = 0, y = 0)
            
            // Position footer based on available space
            if (canFitInline) {
                // Inline: bottom-right of the text area
                footerPlaceable.place(
                    x = containerWidth - footerPlaceable.width,
                    y = textPlaceable.height - footerPlaceable.height
                )
            } else {
                // Separate line: below text, right-aligned
                footerPlaceable.place(
                    x = containerWidth - footerPlaceable.width,
                    y = textPlaceable.height
                )
            }
        }
    }
}

But why is it better?

1. Composition Separation

content = {
    Text(
        text = message.text,
        color = textColor,
        onTextLayout = { result -> textLayoutRef.value = result }
    )
    MessageFooter(message = message)
}

Both child composables are created during the normal composition phase. This allows Compose to apply its standard optimizations - if message.text and textColor haven't changed, the Text composable can be skipped entirely during recomposition.

2. Single Measurement Pass

val textPlaceable = measurables[0].measure(constraints.copy(maxWidth = maxWidthPx))
val footerPlaceable = measurables[1].measure(constraints)

Each child is measured exactly once per layout pass. The measurables list is predetermined and stable, eliminating the allocation overhead of dynamic subcomposition.

3. Shared Layout Result

val textLayoutRef = remember { Ref<TextLayoutResult>() }
//… later in Text composable:
onTextLayout = { result -> textLayoutRef.value = result }

We use a Ref to share the TextLayoutResult between the Text composable's measurement and our subsequent line calculations. This avoids redundant text layout operations while keeping the data accessible for our positioning logic.
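
The reason the later `requireNotNull` is safe comes down to ordering: the callback fires while the text is being measured, so the holder is already populated by the time `measure` returns. The toy model below shows that ordering in plain Kotlin. `Holder`, `FakeTextLayout`, and `FakeText` are hypothetical stand-ins invented for this sketch, not Compose classes; `Holder` mimics a plain mutable reference, which is also why writing to it does not schedule any recomposition.

```kotlin
// Hypothetical stand-in for a plain mutable reference (no snapshot-state tracking).
class Holder<T> { var value: T? = null }

// Hypothetical stand-in for TextLayoutResult, reduced to the one number we need.
data class FakeTextLayout(val lastLineWidth: Int)

// Hypothetical stand-in for a text measurable: measuring it reports its layout via a callback.
class FakeText(private val onTextLayout: (FakeTextLayout) -> Unit) {
    var measureCount = 0
        private set

    fun measure(maxWidth: Int): Int {
        measureCount++
        // Made-up metric for the sketch: pretend the last line fills half of the width.
        val layout = FakeTextLayout(lastLineWidth = maxWidth / 2)
        onTextLayout(layout) // fires during measurement, before measure() returns
        return maxWidth
    }
}

// Measures once, then reads the metrics from the holder instead of laying the text out again.
fun measureAndReadLastLine(text: FakeText, holder: Holder<FakeTextLayout>, maxWidth: Int): Int {
    text.measure(maxWidth)
    val layout = requireNotNull(holder.value) { "layout should have been set by measure()" }
    return layout.lastLineWidth
}
```

In this model the holder is empty before the first measurement and filled immediately after it, with the text measured exactly once.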

4. Streamlined Logic Flow

The layout logic follows a clear, predictable sequence:

Measure children → Extract text metrics → Calculate container size → Place elements

This eliminates the complexity of interleaved composition and measurement that characterized our SubcomposeLayout approach.

The resulting implementation achieves identical visual behavior while working within Compose's optimized composition pipeline, setting the stage for significant performance improvements that we'll examine in our benchmark results.

Comparative Performance Analysis

Understanding Macrobenchmarking in Android

Before diving into our results, it's essential to understand why macrobenchmarking provides the most accurate performance insights for real-world app scenarios. Unlike microbenchmarks that measure isolated code snippets, macrobenchmarks evaluate your app's performance under realistic conditions - including the Android framework overhead, system interactions, and actual user behavior patterns.

Macrobenchmarking is particularly critical for UI performance analysis because it captures the complete rendering pipeline: from composition through layout to drawing and display. This comprehensive approach reveals performance bottlenecks that might be invisible in isolated testing environments.

Benchmarking and Results

We conducted macrobenchmark tests comparing both implementations (SubcomposeLayout vs. Layout). The benchmarks showed consistent performance improvements, and the rewrite brought a maintainability benefit as well:

Reduced Frame Drops: The optimized approach significantly reduced stutter during rapid scrolling.
Lower Garbage Collection Pressure: Fewer intermediate object creations dramatically improved GC metrics.
Simpler and More Maintainable Code: By reducing complexity, the layout logic became clearer and easier for ongoing maintenance.

The benchmarks were structured using a macrobenchmark test similar to the following snippet:

@Test
fun scrollTestLayoutImplementation() = benchmarkRule.measureRepeated(
    packageName = "ai.aiphoria.pros",
    metrics = listOf(FrameTimingMetric()),
    iterations = 10,
    setupBlock = {
        pressHome()
        device.waitForIdle(1000)
        startActivityAndWait(setupIntent(useSubcompose = false))
    },
    startupMode = StartupMode.WARM
) {
    performEnhancedScrollingActions(device)
}

private fun performEnhancedScrollingActions(device: UiDevice, scrollCycles: Int = 40) {
    val width = device.displayWidth
    val height = device.displayHeight
    val centerX = width / 2
    val swipeContentDownStartY = (height * 0.70).toInt()
    val swipeContentDownEndY = (height * 0.3).toInt()
    val swipeSteps = 3             // few steps = fast, fling-like swipe
    val pauseBetweenScrolls = 15L  // milliseconds

    // Drag from 30% to 70% of the screen height: scrolls content up
    repeat(scrollCycles) {
        device.swipe(centerX, swipeContentDownEndY, centerX, swipeContentDownStartY, swipeSteps)
        SystemClock.sleep(pauseBetweenScrolls)
    }

    // Drag from 70% to 30% of the screen height: scrolls content down
    repeat(scrollCycles) {
        device.swipe(centerX, swipeContentDownStartY, centerX, swipeContentDownEndY, swipeSteps)
        SystemClock.sleep(pauseBetweenScrolls)
    }
}

Our macrobenchmark tests revealed substantial performance improvements when switching from SubcomposeLayout to the optimized Layout approach. The results demonstrate consistent gains across all performance percentiles:

Frame Duration Improvements

The most critical metric for user experience - frame rendering time - showed significant improvements:

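P50 (Median): 5.9ms with Layout vs 6.3ms with SubcomposeLayout (6.7% improvement)
P90: 10.5ms with Layout vs 11.0ms with SubcomposeLayout (4.7% improvement)
P95: 12.3ms with Layout vs 12.9ms with SubcomposeLayout (4.8% improvement)
P99: 15.0ms with Layout vs 16.2ms with SubcomposeLayout (8% improvement)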

While these improvements might seem modest in absolute terms, they represent meaningful gains in a chat interface where smooth 60fps scrolling is critical. The P99 improvement is particularly significant - those worst-case frame times that cause noticeable stuttering are reduced by nearly 8%.
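
A note on how to read these percentages: a relative change depends on which figure is used as the baseline. The sketch below shows both conventions in Kotlin, assuming the first figure in each pair is the Layout result and the second the SubcomposeLayout result. The function names are hypothetical, and the inputs are the rounded values listed above, so the outputs are approximate.

```kotlin
// Reduction relative to the slower (SubcomposeLayout) figure, in percent.
fun reductionPercent(beforeMs: Double, afterMs: Double): Double =
    (beforeMs - afterMs) / beforeMs * 100.0

// Difference relative to the faster (Layout) figure, in percent.
fun differenceOverNewPercent(beforeMs: Double, afterMs: Double): Double =
    (beforeMs - afterMs) / afterMs * 100.0
```

For P99, `differenceOverNewPercent(16.2, 15.0)` gives about 8.0, which matches the listed figure, while `reductionPercent(16.2, 15.0)` gives about 7.4. The other rows behave the same way (for example 0.4 / 5.9 is roughly 6.8% and 0.4 / 6.3 is roughly 6.3% for P50), so the listed percentages appear to be computed against the Layout figure.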

Frame Overrun Analysis

Frame overruns occur when rendering takes longer than the 16.67ms budget for 60fps. Our optimized Layout implementation shows better performance characteristics:

Fewer severe overruns: The P99 frame overrun improved from 1.4ms to 0.2ms
Better consistency: More predictable frame timing across all percentiles
Reduced stuttering: Fewer instances of frames missing their vsync deadline

The frame overrun improvements are especially important for maintaining smooth scrolling during intensive user interactions like rapid scroll gestures or when the system is under memory pressure.
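
For readers who want to reason about these numbers on their own frame data, the following Kotlin sketch shows the arithmetic behind a frame budget, a simplified overrun, and a percentile. It is an illustration under stated assumptions, not the macrobenchmark library's implementation: the overrun here is a plain subtraction from a fixed budget, whereas the benchmark reports overruns against the frame deadline observed on the device (which is why the overrun figures above do not equal frame duration minus 16.67ms), and the percentile uses the nearest-rank method. All function names are hypothetical.

```kotlin
// Frame budget in milliseconds for a given refresh rate, e.g. 60 Hz -> about 16.67 ms.
fun frameBudgetMs(refreshRateHz: Int): Double = 1000.0 / refreshRateHz

// Simplified overrun: positive when the frame took longer than the fixed budget.
fun overrunMs(frameDurationMs: Double, refreshRateHz: Int = 60): Double =
    frameDurationMs - frameBudgetMs(refreshRateHz)

// Nearest-rank percentile over a list of samples, with p in 1..100.
fun percentile(samples: List<Double>, p: Int): Double {
    require(samples.isNotEmpty()) { "samples must not be empty" }
    require(p in 1..100) { "p must be in 1..100" }
    val sorted = samples.sorted()
    // Integer ceiling of p * n / 100, which avoids floating-point rounding surprises.
    val rank = (p * sorted.size + 99) / 100
    return sorted[rank - 1]
}

// Number of frames that exceeded the fixed budget.
fun overBudgetFrameCount(frameDurationsMs: List<Double>, refreshRateHz: Int = 60): Int =
    frameDurationsMs.count { overrunMs(it, refreshRateHz) > 0.0 }
```

Tail percentiles such as P99 matter for scrolling because a handful of slow frames is what users perceive as stutter, even when the median looks healthy.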

Key Lessons Learned

When to Avoid SubcomposeLayout

Our experience reveals specific scenarios where SubcomposeLayout's flexibility comes at too high a performance cost:

High-Frequency Layouts: In scrolling lists where layout operations occur dozens of times per second, the overhead of subcomposition becomes prohibitive.
Simple Dynamic Positioning: When your layout requirements can be achieved through measurement and calculation rather than conditional composition, standard Layout is more efficient.
Performance-Critical UI: Chat interfaces, gaming UIs, or any context where frame drops directly impact user satisfaction warrant the optimization effort.

When SubcomposeLayout Still Makes Sense

SubcomposeLayout remains the right choice for:

Complex Conditional Composition: When you need to compose entirely different component trees based on measured content.
Infrequent Layout Operations: For dialogs, configuration screens, or other UI that doesn't require high-frequency layout passes.
Prototyping: SubcomposeLayout's flexibility makes it excellent for rapid iteration during development.

Performance Optimization Checklist

Based on our optimization journey, here's a practical checklist for identifying and resolving similar performance bottlenecks:

Detection

Profile scroll performance using macrobenchmarks, not just microbenchmarks
Monitor frame timing metrics during realistic user interactions
Test on lower-end devices where performance differences are amplified
Measure at scale - test with hundreds of items, not just a few
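
One way to get hundreds of realistic items into a benchmark is to generate them deterministically, so every run scrolls through the same content. The Kotlin sketch below is a minimal example of that idea. `SampleMessage` and `generateSampleMessages` are hypothetical names (the article's own `Message` type is not shown in full), and the word list and length range are arbitrary assumptions.

```kotlin
import kotlin.random.Random

// Hypothetical message model for benchmark data; not the app's real Message class.
data class SampleMessage(val id: Int, val text: String, val timestamp: String)

// Generates `count` messages of varying length. A fixed seed keeps runs comparable.
fun generateSampleMessages(count: Int, seed: Int = 42): List<SampleMessage> {
    val random = Random(seed)
    val words = listOf(
        "ok", "sure", "tomorrow", "meeting", "see",
        "you", "at", "the", "station", "thanks"
    )
    return List(count) { index ->
        // Between 1 and 39 words, so some messages wrap over several lines and some do not.
        val wordCount = random.nextInt(1, 40)
        val text = List(wordCount) { words[random.nextInt(words.size)] }.joinToString(" ")
        val minutes = (index % 60).toString().padStart(2, '0')
        SampleMessage(id = index, text = text, timestamp = "12:$minutes")
    }
}
```

Mixing short and long messages matters here, because the inline and separate-line footer paths should both be exercised while scrolling.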

Analysis

Identify composition and measurement frequency - how often are your custom layouts being recomposed and remeasured?
Count measurement passes - are children being measured multiple times unnecessarily?
Check allocation patterns - are you creating temporary objects during layout?

Optimization

Prefer single-pass measurement when possible
Share measurement results between composition and layout phases efficiently
Minimize object allocation during layout operations
Validate with benchmarks - ensure optimizations provide measurable benefits

Conclusion

The key takeaway for Android developers building high-performance UIs: always measure your assumptions. What appears to be a minor implementation detail can have a substantial impact on user experience at scale. Invest in proper benchmarking infrastructure early, and don't hesitate to revisit implementation choices as your app's performance requirements evolve.

Happy coding!

How We Cut Chat UI Frame Time by 8% with One Jetpack Compose Optimization