Online Form Information Harvested Even Before Submission

Did you ever begin to sign up for a newsletter or some website service and then change your mind? Maybe you were worried that you were giving out too much personal information on the form. Unfortunately, new research has found that the information you put on your form may have been harvested by third parties, even before you clicked the ‘submit’ button. But how would you know?

Cookies have long been used to track users as they move around the internet. When advertising brokers have enough of them, they have a pretty good idea of your interests and can use those interests in their marketing by presenting you with what they refer to as relevant ads: ads that their clients paid them to present to those who would be most likely to purchase what they are selling. But such marketing has run into a number of roadblocks lately. Newer browsers and browser extensions can effectively block cookies, leaving advertising brokers looking for other ways to gather personal information.

A recent research paper found that a number of data brokers, advertisers and unknown others were using information from yet-to-be-submitted forms to track potential targets. To gather data on this shady practice, the researchers built a machine learning algorithm in the form of a web crawler that could find and fill in online forms. Once it found a form, the bot filled in an email address and a password. The researchers ran it against the top 100,000 sites listed on Tranco, a website ranking service, to see which sites might be stealing form information before it was submitted.
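To make the idea of "checking for a leak" more concrete, here is a minimal Python sketch. It is not the researchers' code, and this post does not describe their exact detection method. It assumes that a crawler has already recorded the web requests a page sent after an email address was typed but before the form was submitted, and it simply looks for that address in requests going to other hosts. The function names (`email_variants`, `find_leaks`), the list of encodings, and the example hostnames are all hypothetical choices made for illustration.

```python
from __future__ import annotations

import base64
import hashlib
from urllib.parse import quote, urlparse


def email_variants(email: str) -> dict[str, str]:
    """Return a few forms in which an email might appear in a request.

    The choice of encodings is an assumption for this sketch; a real
    tracker could use others that this check would miss.
    """
    raw = email.strip().lower()
    data = raw.encode("utf-8")
    return {
        "plain": raw,
        "url-encoded": quote(raw, safe=""),
        "base64": base64.b64encode(data).decode("ascii"),
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def find_leaks(
    email: str,
    first_party_host: str,
    requests: list[tuple[str, str]],
) -> list[tuple[str, str]]:
    """Find third-party requests that contain the typed email.

    `requests` is a list of (url, body) pairs captured before the form
    was submitted. Returns (third_party_host, encoding_name) pairs.
    Matching is case-sensitive.
    """
    variants = email_variants(email)
    leaks = []
    for url, body in requests:
        host = urlparse(url).hostname or ""
        # Ignore traffic to the visited site itself and its subdomains.
        if host == first_party_host or host.endswith("." + first_party_host):
            continue
        haystack = url + "\n" + body
        for name, value in variants.items():
            if value in haystack:
                leaks.append((host, name))
    return leaks


if __name__ == "__main__":
    typed = "test@example.com"
    hashed = email_variants(typed)["sha256"]
    captured = [
        # Made-up traffic; the hostnames are placeholders.
        ("https://shop.example/autosave", "email=test@example.com"),
        ("https://tracker.invalid/collect?e=" + hashed, ""),
        ("https://ads.invalid/pixel.gif", "page=signup"),
    ]
    for host, encoding in find_leaks(typed, "shop.example", captured):
        print(f"{host} received the email ({encoding})")
```

A check like this can only find what it knows to look for, which fits the researchers' own description of their results as "likely lower bounds".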

Keep in mind that tracking is common, especially among the largest online services. According to the tracker monitoring site whotracks.me, here are the top trackers.

In other words, these are the companies that probably have the most information on you. On the positive side, they probably don’t need to use trickery, like stealing information from unsubmitted forms, to get your personal information. They get that when you decide to use their free service. (See my post on the details of the Yahoo privacy policy.) After that, they simply follow your activities. It seems like the one big exception to this is Facebook, which, as we will see, later sells the information they collect.

The following table from the research shows the companies that received the leaked form data. It's important to note, however, that they may not have directly initiated the leak. They simply received it.

The table shows information for European Union (EU) sites as well as sites in the U.S. These are, for the most part, the data brokers. The data they collect is sold to better-known sites such as WebMD, Yahoo, and Fox News. Here is a list of the top 10 websites that receive this exfiltrated data. Note that Facebook is listed as a data marketer.

Online forms have been a source of concern for some time. Last year, a researcher found that the autofill function available in some browsers could leak information. The visual below shows how this occurs.

A couple of years ago, hackers found they could create simple forms in Google Forms featuring a company logo. In all, 265 companies were targeted with these fake forms, but the hackers' main targets seemed to be Yahoo, AT&T, and Office 365. These targets are shown in the chart below.

The Google Forms vector works because people trust the connection to Google. Below is an obviously fake form I made on Google Forms. I used the Google connection as much as possible, but the footer is what really exists on the form. I can send this to any email address and it will activate a subject line that says, “Google Job Application”, and, since it’s from Google, it should have no problem evading spam filters. Of course, my form (shown below) is pretty crude, as were the ones in the hacking campaign, but they could be honed to target job seekers. I just need the form returned to my hacking group so that I can harvest the personal information. The return address on the form is hidden and would be advantageous to the hackers if it were a valid Gmail address.

The leaky forms found by the researchers have yet to be repaired. Some of the data broker companies claim that they have, at least implicitly, received the user's consent to acquire such data, although that seems like a tenuous defense. Such permissions may be hidden in the terms users agreed to when they signed up for a free service. The researchers conclude that “our results—likely lower bounds—show that on thousands of sites email addresses are collected from login, registration and newsletter subscription forms; and sent to trackers before users submit any form or give their consent.” Given that those collecting data in this manner see nothing wrong with it, the researchers add that, “based on our findings, users should assume that the personal information they enter into web forms may be collected by trackers—even if the form is never submitted.” So the next time you fill out an online form, simply assume that some of the information on it is being sent somewhere, even if you change your mind and never submit it.
