Monkey business in children’s apps

It’s not uncommon for small children, toddlers, and even babies to be mesmerized by smartphones and tablets. Whether it’s angering birds, ninja-ing fruit, or crafting mines, intuitive touchscreen-based interfaces allow the youngest of users to play in high-definition virtual environments with ease.

Given such a compelling audience, there’s now a vibrant ecosystem of app developers specializing in products designed for young children. A quick glance at the “Kids” category in the iOS App Store and the “Family” category in the Google Play Store reveals hundreds of offerings from dozens of companies.

Many apps listed in children’s categories are offered free of charge. Like developers of free apps for general audiences, developers of no-cost children’s apps still need to generate revenue. They might partner with advertising networks to display ads alongside app content, or use analytics services to better understand their users’ interests.

Advertising and analytics are ubiquitous online. However, developers of children’s apps must be especially careful with any data they share with affiliates, inadvertently or otherwise. In the United States, the Children’s Online Privacy Protection Act (COPPA) limits the collection and sharing of data from young children. Multimedia captures, street addresses, and unique identifiers — among many others — fall within the scope of this law. Children’s apps need to properly inform parents of the nature of any data collection, and subsequently obtain parental consent before proceeding. The Federal Trade Commission (FTC) has levied stiff civil penalties against software developers that violate these standards.

The ICSI Usable Security and Privacy group and the Haystack team are in a unique position to investigate the data collection behavior of children’s apps — behavior often invisible to end-users. We aim to empower parents and regulators alike with a bird’s eye view of the state of data collection in children’s apps. Our goal is to evaluate large numbers of children’s apps at scale.

We have combined two of our in-house research tools to characterize app data collection behavior: the Lumen Privacy Tool (available on Google Play), which detects privacy leaks in mobile apps and monitors data transmitted to remote servers, and a customized Android platform that logs accesses to permission-protected resources and tracks test coverage.

Previous investigations of children’s apps relied on human-directed exploration of the software. Manual efforts are time-consuming and financially costly. We automate investigations using the Android Exerciser Monkey, which generates a pseudorandom stream of input events, such as taps, swipes, hardware button presses, and application activity triggers. For our research, we also hired a tester to manually explore apps, establishing a baseline against which we can compare the monkey’s thoroughness.
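The Exerciser Monkey ships with the Android SDK and is driven through `adb`. As a minimal sketch of how such a run can be launched (the package name and parameter values below are illustrative, not taken from our actual test harness):

```python
import subprocess

def monkey_command(package, event_count=5000, seed=42, throttle_ms=300):
    """Build an `adb shell monkey` invocation that sends a seeded,
    pseudorandom stream of taps, swipes, and button presses to one app."""
    return [
        "adb", "shell", "monkey",
        "-p", package,                   # confine events to the app under test
        "-s", str(seed),                 # fixed seed, so runs are reproducible
        "--throttle", str(throttle_ms),  # pause between events, in milliseconds
        str(event_count),                # total number of input events
    ]

def run_monkey(package, **kwargs):
    # Requires a connected device or emulator with USB debugging enabled.
    return subprocess.run(monkey_command(package, **kwargs), check=True)
```

Fixing the seed matters for this kind of measurement: it makes a monkey run repeatable, so a privacy leak observed once can be reproduced on demand.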

We have assembled a corpus of 446 free apps gathered from the “Ages 5 & under,” “Ages 6-8,” and “For kids 9 & up” subcategories in the Google Play Store. Developers self-declare their apps into these subcategories, an explicit acknowledgement that their target audience is indeed children under 13 years of age.

We presented a report and a poster of our ongoing work at FTC PrivacyCon 2017. Although this effort is still underway, we’re excited to share some of our early findings. The following results were collected using automated monkey-driven analysis.

“Do you know where your children are?”

COPPA prohibits the collection of street-resolvable geolocation from users under 13 years of age. Online services can (and often do) identify users’ cities and ISPs via IP address lookups. However, this doesn’t necessarily run afoul of COPPA restrictions, as city-level information is too coarse to determine a user’s street address.

Apps that use GPS, cell network, and Wi-Fi location would likely be in violation, though. These geolocation technologies let apps resolve the user’s location to within tens of meters, precise enough to identify a street address with high confidence. Fortunately, Android requires apps to declare the ACCESS_FINE_LOCATION permission and request user approval before they can access this functionality.

Using our automated test platform, we identified a likely leakage of fine geolocation data to third parties without informing the user. Out of the 111 children’s apps we’ve auto-tested so far, we observed 15 sharing Wi-Fi router MAC addresses with third parties. All 15 were developed by BabyBus, and all shared it with the same analytics firm. Wi-Fi router MAC addresses are trivially street-resolvable using public APIs offered by WiGLE and Google Maps Geolocation.
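To gauge how easily a leaked router MAC address resolves to a street address, consider Google’s Geolocation API, which accepts observed Wi-Fi access points and returns a latitude/longitude fix. The sketch below only builds the request body; the MAC addresses are placeholders, and a real lookup would POST this JSON to https://www.googleapis.com/geolocation/v1/geolocate with an API key:

```python
import json

def geolocate_payload(router_macs):
    """Build the JSON body for Google's Geolocation API. Two or more
    observed BSSIDs are typically enough for a street-level fix."""
    return json.dumps({
        "considerIp": False,  # force a Wi-Fi-based fix rather than an IP lookup
        "wifiAccessPoints": [{"macAddress": mac} for mac in router_macs],
    })

body = geolocate_payload(["00:11:22:33:44:55", "66:77:88:99:aa:bb"])
```

In other words, a single field in an analytics payload is often all that separates coarse, COPPA-tolerated location from a street-resolvable one.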

Excerpt of a POST request to a third party:


We were able to locate our offices with very high accuracy using this data. We also asked our manual tester where she evaluated the affected apps and, with her permission, verified those locations too (not displayed to protect her privacy).

Additionally, this third party collects the names of all Wi-Fi networks saved on the host device. It’s unclear what purpose this information serves, but it could easily be used to fingerprint users and to infer whether the user is at home or at school.
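To illustrate why a list of saved network names is identifying, here is a hypothetical sketch (not code from any SDK we observed) of how a tracker could reduce that list to a stable device fingerprint:

```python
import hashlib

def ssid_fingerprint(saved_networks):
    """Hash the set of saved Wi-Fi network names into a stable identifier.
    Order-insensitive, so it survives list reordering, yet it distinguishes
    devices with different saved-network histories."""
    canonical = "\n".join(sorted(set(saved_networks)))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Because most households have a distinctive combination of saved networks, such a hash can re-identify a device even after resettable identifiers are cleared, and matching the currently connected network against the list reveals whether the device is at home or elsewhere.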

We notified BabyBus of this finding, and they responded rapidly, saying that they have stopped using this analytics package. All of their products on the Google Play Store were indeed updated within days of our notice, but we still need to re-run our tests to verify the change.

“Just say no”

Online advertisers seek to increase conversion rates by serving users with timely ads relevant to their interests. This requires tracking users over time and across different services, often by uniquely identifying their devices and building personal profiles tied to those identifiers. With enough observations and data tied to a particular device, advertisers can build a profile of that user’s interests and background. For young children though, this kind of highly targeted behavioral advertising is restricted under COPPA guidelines.

A number of unique identifiers are available on Android devices, including the Wi-Fi radio’s MAC address, the phone’s IMEI, and SIM card identifiers. One such identifier is the Android Advertising ID (AAID): a persistent, OS-generated identifier that the user can opt not to share with apps. The user can also manually regenerate it, effectively disconnecting themselves from marketing profiles built on the previous AAID. Google Play policy recommends that the AAID be used exclusively for advertising and analytics purposes, and discourages developers from associating it with other (more difficult to reset) device identifiers.
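The sketch below (field names and values are illustrative, not drawn from any observed traffic) shows why this bridging matters: once an analytics payload pairs the resettable AAID with a permanent hardware identifier such as the IMEI, a backend can trivially join profiles across an AAID reset, defeating the opt-out:

```python
def analytics_payload(aaid, imei=None):
    """Hypothetical analytics payload. Including the IMEI alongside the
    AAID is the identifier bridging Google Play policy discourages."""
    payload = {"aaid": aaid}
    if imei is not None:
        payload["imei"] = imei  # permanent identifier; survives AAID resets
    return payload

def can_relink(before_reset, after_reset):
    """A backend holding both records can join the old and new AAID
    profiles whenever a shared permanent identifier is present."""
    return (before_reset.get("imei") is not None
            and before_reset.get("imei") == after_reset.get("imei")
            and before_reset["aaid"] != after_reset["aaid"])
```

With AAID-only payloads, a reset genuinely severs the link to old profiles; adding the IMEI quietly restores it.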

From our automated tests to date, we identified a children’s game that not only collected the AAID but also transmitted other device identifiers alongside it, contrary to Google’s best practices. This collection persists even if a parent uses the system settings to explicitly opt out of tracking.

Excerpt of a POST request to a third party:


We reached out to the app developer for verification and comments. They have yet to respond to us. We are in the process of checking for this behavior in their other apps, as well as other companies’ products that use the same third-party service.

Baby steps

These results just scratch the surface of the data collection behavior in the children’s apps we’re testing. Our automated testing process revealed the collection and sharing of COPPA-relevant data — behavior otherwise invisible even to tech-savvy parents, much less to the young children actually using these apps. We still have the rest of our corpus to analyze, both with our automated system, and with our manual tester for comparison. We look forward to sharing additional findings on this blog and discussing our methods rigorously in refereed academic venues.
