Is Google harvesting our data to feed its AIs?

Google may be harvesting huge amounts of data from Android phones in Australia, according to a report last week from The Australian. This isn’t a new claim; indeed, we’ve known for a while now that Google and Apple receive a lot of personal data from Android and iOS phones respectively. What’s more interesting now is what Google and Apple are doing with all of our data. There’s growing concern that Google, in particular, is using it to train AIs.

The latest claim about Google receiving personal data from Android phones was made by Google’s software rival, Oracle. It estimated the data transfer could be about a gigabyte per user per month, which would cause some consumers to exceed their monthly data allowance.

So what kind of data are we talking about here? The information being transferred to Google is labeled an “activity log,” and it includes detailed location data. Google argues this is done with the permission of Android users, but the company is now being challenged on whether that user consent is informed. Many of us do not read lengthy terms and conditions, especially on phones, and we just press the “agree” option to make the long screed go away.

So yes, Google’s data collection terms and conditions need to be reviewed – especially with new privacy laws such as the GDPR and our own Privacy Bill about to take effect. However, it’s worth noting that the transfer of personal data is opt-in for users of both Google’s Android and Apple’s iOS. That means you have to specifically turn on location history and similar services in order for your data to be uploaded to Google and Apple’s servers. So it’s not like this data is being collected by default.

It’s far more interesting to ponder what exactly Google is doing with the activity logs it collects.

We know Google uses location data to help target its adverts. It also uses the data to improve its other online products. A Google spokesperson told the news website Quartz in January that by opting in to Location History on Android, “you can receive traffic predictions for your daily commute, view photos grouped by locations, see recommendations based on places you’ve visited, and even locate a missing phone.”

But could Google also be using your personal data to help train its AI services? In recent years, Google has been positioning itself as an “AI-first” company. The way machine learning (a type of AI) works is that you feed it loads and loads of data, so that it can “learn” how to make decisions and serve people. So it makes sense to feed it personal data from the smartphones we humans carry around all day, every day.

Certainly the chairman of the Australian Privacy Foundation, David Vaile, believes our personal data is being fed to Google’s AI systems. He told The Guardian that “Google has self-evolving machine-learning algorithms that use this data being sent from Android devices.” He added, ominously, “they let them loose on the data and see what they come up with.”

If that’s the case, we shouldn’t be worrying so much about the wording of Google’s T&Cs. We should instead be worrying about what Google’s AIs are learning about us, and how it could potentially be used to manipulate us.

That may sound overly paranoid, but a new Google AI service announced just this month inadvertently showcased these very risks.

At its annual developer conference, Google announced an AI voice system called Duplex that both wowed and frightened people. Duplex will soon be integrated into the company’s popular virtual assistant, Google Assistant. When this was demoed on stage, the app phoned up a hair salon and a restaurant and convincingly carried out a conversation with the people who answered. Essentially, it fooled those people into thinking they were talking to a human. One of the tricks used by Duplex was inserting ums and ahs into its conversation.

That was an ethically dubious design choice, to say the least. There’s no question that Google means well with its AI services, but it just designed an app that casually manipulates people! Do we want our personal data being used to help train such services? Where’s the accountability if one of Google’s AI services uses our personal data in a manner we’re uncomfortable with?

We’re already seeing government departments being held accountable for the decisions their AI-powered software systems make. There’s even been a call for a “data-mining watchdog” to help regulate how government-run AIs use our personal data.

The key difference is that many people have no choice about using a government service, whereas with Google, we as consumers can choose whether or not to use its AI services.

Or do we?

In order to use a Duplex-enhanced version of Google Assistant, it’s highly likely you will have to opt in to sharing your personal data with Google. Android users already must opt in to Location History in order to set up Google Assistant. “The Assistant depends on these settings in order to work,” you are informed when you go to activate the app. (You can turn off Location History later, though, and it doesn’t stop Google Assistant from running.)

The point is, if you’re using an Android phone then you probably also want to use advanced services like Google Maps and Google Assistant. Both require Location History to be turned on.

We’re entering a minefield of privacy concerns in this emerging AI era. More and more of Google’s services will be powered by AI, and we will feel impelled to opt in to data tracking in order to make the best use of those services.

But can we trust AIs to not manipulate people? I don’t think we can. After all, if we can’t trust Google itself to not cross ethical boundaries when designing software products, then how can we trust the software itself?