Device QA

Building a Device QA Matrix for Connected TV Apps

How to design and maintain a device testing matrix for connected TV streaming apps. Covers device selection, test tiers, automation options, and maintenance strategy.

January 12, 2026

A device QA matrix is the document that tells your team which devices to test, what to test on each one, and how deep to go. Without it, testing is ad hoc: someone grabs whatever TV is nearby, runs through a few screens, and declares it working. With a well-designed matrix, you systematically cover the devices that matter most and catch the problems that matter most, without testing everything on everything.

Choosing devices for the matrix

Start with data if you have it. Your analytics should tell you which device families generate the most viewing sessions. Sort by session volume and you have your priority list.
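
If your analytics tool can export sessions by device, the ranking takes only a few lines of scripting. The sketch below assumes a hypothetical CSV export with device_family and sessions columns; adapt the column names to whatever your analytics tool actually produces.

    # Minimal sketch: rank device families by viewing sessions from an
    # analytics export. The CSV columns (device_family, sessions) are
    # hypothetical -- adjust to your tool's export format.
    import csv
    from collections import Counter

    def rank_device_families(path: str) -> list[tuple[str, int]]:
        totals: Counter[str] = Counter()
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["device_family"]] += int(row["sessions"])
        return totals.most_common()

    if __name__ == "__main__":
        for family, sessions in rank_device_families("sessions_by_device.csv"):
            print(f"{family}: {sessions}")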

If you are launching a new app and do not have analytics yet, start with market share data by platform and model year. Roku, Samsung, LG, and Google TV cover the majority of connected TV sessions in North America. The specific models within each platform matter because hardware capabilities and web engine versions change with each model year.

What to buy for a test lab:

For Samsung: one current-year model and at least one model from 2-3 years ago. The older model tests your compatibility with the older Chromium engine.

For LG: same pattern. Current year plus one older model.

For Roku: one current Streaming Stick and one older Express or Premiere model. The Express has the most constrained hardware and will surface performance issues that do not appear on higher-end models.

For Google TV: a Chromecast with Google TV (the most common and most constrained device) and ideally one Android TV from a TV manufacturer (Sony, TCL, or Hisense with Google TV).

Total minimum: 6-8 devices for a four-platform matrix. This is a starting point, not a ceiling.

Defining test tiers

Not every device gets the same depth of testing. Define tiers based on importance:

Tier 1: full regression testing. Every feature, every user flow, every known edge case. Run the full test suite. This is typically 3-4 devices: your highest-traffic models.

Test coverage for Tier 1 includes:

  • Complete navigation flow (every screen reachable from every other screen)
  • All playback scenarios (VOD, live if applicable, various content types)
  • DRM validation (license acquisition, renewal, error handling)
  • Subtitle rendering for all supported languages
  • Audio track switching
  • 4-hour sustained playback for memory stability
  • Network disconnection and recovery
  • App lifecycle (background, foreground, suspend, resume)
  • Deep linking
  • Accessibility (screen reader, focus indicator visibility)

Tier 2: critical path testing. Test the things that, if broken, would affect a large number of users. Skip edge cases and cosmetic details.

Tier 2 coverage:

  • App launch to first content playback
  • DRM-encrypted playback
  • D-pad navigation through primary user flows
  • Subtitle display (one language)
  • Background/foreground transition during playback
  • Error handling for common failures (network drop, API error)

Tier 3: smoke testing. Verify the app works at all. If smoke tests pass, the risk of major issues on these devices is low enough to accept.

Tier 3 coverage:

  • App launches without crashing
  • Content loads and displays
  • Video plays
  • Basic navigation works
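
One way to keep the tier assignments actionable is to encode the matrix as data that a test runner or release checklist can read. The sketch below is illustrative only: the device labels, tier assignments, and suite names are placeholders, not recommendations.

    # Sketch of the matrix as data: each entry maps a lab device to a tier
    # and the test suites run on it. Labels and assignments are illustrative.
    DEVICE_MATRIX = [
        {"device": "Samsung Tizen, current model year",  "tier": 1, "suites": ["full_regression"]},
        {"device": "Roku Streaming Stick, current",      "tier": 1, "suites": ["full_regression"]},
        {"device": "Chromecast with Google TV",          "tier": 1, "suites": ["full_regression"]},
        {"device": "LG webOS, current model year",       "tier": 2, "suites": ["critical_path"]},
        {"device": "Samsung Tizen, 3-year-old model",    "tier": 2, "suites": ["critical_path"]},
        {"device": "Roku Express, older model",          "tier": 2, "suites": ["critical_path"]},
        {"device": "LG webOS, older model",              "tier": 3, "suites": ["smoke"]},
        {"device": "Android TV from a TV manufacturer",  "tier": 3, "suites": ["smoke"]},
    ]

    def devices_for_tier(tier: int) -> list[str]:
        """Return the lab devices assigned to a given tier."""
        return [d["device"] for d in DEVICE_MATRIX if d["tier"] == tier]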

Building the test plan

A test plan is a list of specific, repeatable test cases organized by feature area. Each test case should include:

  • A test ID (for tracking results)
  • A description of what to do
  • Expected behavior
  • Pass/fail criteria
  • Any prerequisites (specific content, network conditions, device state)

Example test cases:

TC-PLAY-001: VOD playback start

  1. Navigate to a VOD content item
  2. Select play
  3. Verify playback starts within 5 seconds
  4. Verify video and audio are in sync
  5. Verify the correct ABR rung is selected for the network speed

TC-DRM-003: License renewal during long playback

  1. Start playback of encrypted content
  2. Wait for at least one license renewal period (configure a short license duration in the test environment)
  3. Verify playback continues without interruption
  4. Verify no visible glitch at the renewal point

TC-NAV-005: Back button from playback

  1. During playback, press the back button
  2. Verify the player closes or shows a confirmation dialog (depending on platform)
  3. Verify the previous screen is restored correctly
  4. Verify no audio continues after exiting the player
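
If you track test cases in code rather than a spreadsheet, the fields listed earlier translate directly into a small record type. The sketch below encodes TC-PLAY-001 that way; the TestCase class and its field names are one possible layout, not a standard.

    # Sketch of a machine-readable test case carrying the fields listed above.
    # The TestCase structure and this encoding of TC-PLAY-001 are illustrative.
    from dataclasses import dataclass, field

    @dataclass
    class TestCase:
        test_id: str
        description: str
        steps: list[str]
        expected: str
        pass_criteria: str
        prerequisites: list[str] = field(default_factory=list)

    TC_PLAY_001 = TestCase(
        test_id="TC-PLAY-001",
        description="VOD playback start",
        steps=["Navigate to a VOD content item", "Select play"],
        expected="Playback starts with video and audio in sync",
        pass_criteria="First frame within 5 seconds; correct ABR rung for the network speed",
        prerequisites=["VOD title available in the test catalog", "Stable network profile"],
    )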

Automation considerations

Full UI automation on real TV devices is hard. The tooling is not as mature as it is for mobile or web. But some things can be automated:

What can be automated reasonably:

  • App launch and cold start time measurement
  • API response validation (using a proxy or network stub)
  • Screenshot capture and comparison for visual regression
  • Basic navigation flow execution (on platforms with automation support)
  • Memory profiling over time (connect to debug interface, sample heap periodically)
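
As one example from this list, memory profiling over time is straightforward to script on Google TV and Android TV, where adb and dumpsys meminfo are available. The sketch below samples the app's memory footprint once a minute for four hours; the package name is a placeholder, and the parsing is deliberately loose because dumpsys output varies across Android versions.

    # Minimal sketch: sample app memory on a Google TV / Android TV device
    # over time via adb. Assumes adb is on PATH and the device is connected.
    # APP_PACKAGE is a placeholder for your app's package name.
    import re
    import subprocess
    import time

    APP_PACKAGE = "com.example.tvapp"  # placeholder package name

    def sample_total_pss_kb(package: str) -> int | None:
        out = subprocess.run(
            ["adb", "shell", "dumpsys", "meminfo", package],
            capture_output=True, text=True, check=True,
        ).stdout
        # Match either the "TOTAL PSS:" summary line or the older "TOTAL" row.
        match = re.search(r"TOTAL\s+PSS:\s*(\d+)|^\s*TOTAL\s+(\d+)", out, re.MULTILINE)
        return int(match.group(1) or match.group(2)) if match else None

    if __name__ == "__main__":
        # Sample every 60 seconds for 4 hours and log to stdout for later review.
        for _ in range(4 * 60):
            print(time.strftime("%H:%M:%S"), sample_total_pss_kb(APP_PACKAGE))
            time.sleep(60)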

What is hard to automate:

  • Playback visual quality assessment
  • Subtitle rendering quality
  • Audio/video sync perception
  • User experience of buffering and bitrate switches
  • HDMI-CEC interactions

Automation tools by platform:

  • Roku: Roku’s automated channel testing framework, ECP (External Control Protocol) for remote command simulation
  • Samsung Tizen: Remote Web Inspector automation, Selenium-based approaches via the web engine
  • Google TV: UIAutomator, Espresso, Appium
  • LG webOS: ares CLI with scripted Web Inspector interaction

Start with automating smoke tests and startup time measurement. These give the highest return for the lowest effort. Add more automated tests incrementally.
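
As a concrete starting point, the sketch below uses Roku's External Control Protocol (ECP) on port 8060 to launch a channel and time how long it takes to become the active app. The device address and channel ID are placeholders, and this measures launch-to-active-app, not launch-to-first-frame of content.

    # Minimal smoke-test sketch using Roku ECP (port 8060): launch the channel,
    # poll until it reports as the active app, and record how long that took.
    # ROKU_IP and CHANNEL_ID are placeholders for your device and channel.
    import time
    import urllib.request
    import xml.etree.ElementTree as ET

    ROKU_IP = "192.168.1.50"   # placeholder: your test device's LAN address
    CHANNEL_ID = "12345"       # placeholder: your channel ID from /query/apps

    def ecp(path: str, method: str = "GET") -> bytes:
        req = urllib.request.Request(f"http://{ROKU_IP}:8060/{path}", method=method)
        with urllib.request.urlopen(req, timeout=5) as resp:
            return resp.read()

    def active_app_id() -> str | None:
        root = ET.fromstring(ecp("query/active-app"))
        app = root.find("app")
        return app.get("id") if app is not None else None

    def measure_launch_seconds(timeout: float = 30.0) -> float:
        ecp("keypress/Home", method="POST")   # start from the home screen
        time.sleep(3)
        start = time.monotonic()
        ecp(f"launch/{CHANNEL_ID}", method="POST")
        while time.monotonic() - start < timeout:
            if active_app_id() == CHANNEL_ID:
                return time.monotonic() - start
            time.sleep(0.5)
        raise RuntimeError("channel did not become active within timeout")

    if __name__ == "__main__":
        print(f"launch time: {measure_launch_seconds():.1f}s")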

Maintaining the matrix over time

A device matrix is not a one-time document. It needs to be updated:

Annually: when new model-year hardware ships, add it to Tier 1. Demote older hardware to lower tiers. Remove hardware that has fallen below your support threshold.

After each release: record test results. Track which devices had issues and what those issues were. Over time, this data tells you which devices are the most problematic and deserve more testing attention.

When support thresholds change: if your analytics show that a device family has dropped below a meaningful percentage of sessions, consider dropping it from the matrix entirely. Testing resources are finite.
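
A small script against the same hypothetical sessions export used earlier can flag device families that have fallen below the threshold; the 1% cutoff here is an example, not a recommendation.

    # Sketch: flag device families whose share of sessions is below a
    # support threshold, from the hypothetical sessions_by_device.csv export.
    import csv

    THRESHOLD = 0.01  # example cutoff: 1% of sessions

    def families_below_threshold(path: str) -> list[str]:
        totals: dict[str, int] = {}
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                totals[row["device_family"]] = totals.get(row["device_family"], 0) + int(row["sessions"])
        all_sessions = sum(totals.values())
        return [fam for fam, n in totals.items() if n / all_sessions < THRESHOLD]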

When new features are added: each new feature may need new test cases. Review the matrix after significant feature additions and add targeted tests for the new functionality.

The QA matrix is a living tool. Keep it in a shared location where the whole team can see it and update it. It is as much a planning document as a testing document.
