Data Collection: Less is More

I have developed a bit of a reputation for being a strong privacy advocate, so it should not come as a surprise that I favor minimum data collection. Here’s why you should too.

Data for monetization

Many Silicon Valley startups have started with the approach of “collect user data now, figure out what to do with it later.” This outdated idea assumes two things: 1) that you can do whatever you want with data after you have collected it, and 2) that you have no obligation to keep track of that data or any information about that data. Both of these assumptions are false.

What you can do with data

What you can do with a user’s data depends on where that user resides. More and more states are enacting controls on what data is collected and how it is used, and on a user’s rights to information about that data or to correction of it. So you may have to know where the user you collected data from was located and what laws were in effect at the time the data was collected. Depending on the jurisdiction you collected it from, the user may need to explicitly consent to any future use, including uses you have yet to envision, before you can use the data that way.

Keeping track of data

As discussed above, you need to keep track of where and when you collected user data. You may need to keep track of exactly what that user agreed to let you use the data for. And you need to keep track of where the data resides. You will need to keep track of who can access that data, too. If you get big (and who doesn’t want that?) you will become liable for any data that is leaked, particularly if it includes financial, health, or other sensitive data. Unless you have strong controls, you can’t even use that data for development or troubleshooting.
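To make that bookkeeping concrete, here’s a minimal sketch of the kind of record you would need to keep for every piece of user data. The class, the field names, and the example values are all hypothetical; a real system would need far more than this, but even the sketch shows how much you take on per item collected.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class CollectionRecord:
    """Provenance for one piece of user data (all field names are hypothetical)."""
    user_id: str
    data_category: str             # e.g. "email", "payment", "health"
    collected_at: datetime         # when the data was collected
    jurisdiction: str              # where the user was located at collection time
    consented_purposes: frozenset  # exactly what the user agreed to
    storage_locations: tuple       # every system that holds a copy
    authorized_roles: frozenset    # who is allowed to access it


def may_use(record: CollectionRecord, purpose: str, role: str) -> bool:
    """Allow a use only if the user consented to it and the role is authorized."""
    return purpose in record.consented_purposes and role in record.authorized_roles


record = CollectionRecord(
    user_id="u-123",
    data_category="email",
    collected_at=datetime(2023, 5, 1, tzinfo=timezone.utc),
    jurisdiction="US-CA",
    consented_purposes=frozenset({"account_login"}),
    storage_locations=("primary_db", "nightly_backup"),
    authorized_roles=frozenset({"support"}),
)

print(may_use(record, "account_login", "support"))  # True
print(may_use(record, "marketing", "support"))      # False
```

Notice that the “figure it out later” use (marketing, in this made-up example) is refused, because nobody asked the user about it at collection time.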

Data is a liability

You may or may not have been dissuaded from collecting user data yet. But there’s more! Over time, I have had many discussions of the impact of various data collection schemes with entrepreneurs and they are always surprised when I tell them what could happen to that data. Just by collecting data you could make yourself a target for a whole host of actors.

State and national regulators

Everyone has started to collectively groan as state and federal lawmakers come up with new data breach laws. As discussed above, you need to know where user data is at all times (and how to contact those users!) in case the worst happens. Thankfully, this generally only applies to financial data, which is a relief because any operation to notify users and answer their questions gets expensive fast.

Increasingly, regulators are concerning themselves with issues beyond the data breach. In California, for example, the AG has started taking action against website operators for not obtaining appropriate consent for data they sell. Many other states are following suit, which could foreseeably result in multiple large fines for the same data.

If you’re looking beyond the United States, there are even more stringent restrictions on what you can do with data collected from users. Anyone involved in collecting, storing, or processing user data in Europe should consult an attorney on the proper handling of that data, because the law is so deep and nuanced. Brazil, Japan, China, and most of the rest of the world have some kind of data regulation you need to carefully observe.

Users themselves

In most places where privacy laws have been enacted, they almost always come with a right of access and/or deletion for users. You not only need to keep track of where every bit of each user’s data is stored, you also need to have a plan in place for deleting every bit of it if so requested.
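As a rough illustration of what a deletion plan implies, here’s a sketch that removes one user from every place their data lives. The in-memory dictionaries are hypothetical stand-ins for real databases, analytics systems, and mailing lists, and the store names are made up.

```python
def delete_user_everywhere(stores, user_id):
    """Remove a user's data from every store and report which stores held it.

    `stores` maps a store name to a dict of {user_id: data}. These dicts are
    hypothetical stand-ins for real storage systems.
    """
    removed_from = []
    for name, records in stores.items():
        if records.pop(user_id, None) is not None:
            removed_from.append(name)
    return removed_from


stores = {
    "primary_db": {"u-123": {"email": "a@example.com"}, "u-456": {"email": "b@example.com"}},
    "analytics": {"u-123": {"page_views": 42}},
    "email_list": {"u-456": {"subscribed": True}},
}

removed = delete_user_everywhere(stores, "u-123")
print(removed)  # ['primary_db', 'analytics']
```

The function only works because every store is listed in one place. Any copy of the data you have lost track of is a copy you cannot delete.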

The law is still unsettled on whether you need to go through all your offline backups and delete information from there if a user requests deletion. You almost certainly would be liable if you had to restore from backup and, in doing so, restored data for a user who had requested deletion.
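One way to guard against that restore scenario is to keep a log of deletion requests and re-apply it every time you restore. The sketch below assumes a hypothetical set of deleted user IDs that is stored separately from the backups themselves; whether this approach satisfies any particular law is a question for your attorney, not for this code.

```python
def restore_from_backup(backup, deleted_user_ids):
    """Restore only users who have not asked to be deleted.

    `backup` is a dict of {user_id: data}; `deleted_user_ids` is a set of
    users with deletion requests on file. Both are hypothetical stand-ins.
    """
    return {
        user_id: data
        for user_id, data in backup.items()
        if user_id not in deleted_user_ids
    }


deleted_user_ids = {"u-123"}
backup = {
    "u-123": {"email": "a@example.com"},
    "u-456": {"email": "b@example.com"},
}

restored = restore_from_backup(backup, deleted_user_ids)
print(sorted(restored))  # ['u-456']
```

Of course, the deletion log is itself one more thing you have to store, protect, and keep track of.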

Anyone who files a lawsuit or divorce

This is where most of the surprise comes from for entrepreneurs. In the US, under the Third Party Doctrine, if someone gets beyond the very low bar for discovery in any court case by or against one of your users and any party thinks the data you have collected could relate in any way to that case, you must turn it over when they ask for it. You can theoretically resist, but are you really ready for the very high attorney bills that come from defending a properly formulated and served subpoena? That is extremely hard to win even in the most specious of circumstances. It is really best to just not collect that data in the first place.