Online Dating, Sex Offenders and Background Checks: The Hype and The Problem

Via PogoWasRight, I hear of this NJ online dating bill:

The bill as introduced requires online dating services to disclose to any user from New Jersey whether it has performed background checks on members of the site.

The flawed part of the bill comes in the fact that to satisfy the bill’s “Criminal Background screening” all a site has to do is a simple name search via a regularly updated government public records database or a database maintained by a private vendor.

The actual text of the bill is available in PDF.

The article calls the bill “flawed,” and I agree. These sorts of simple name matching background checks are unreliable. They’re likely to have errors, they’re easy to fool, they’re likely to have mismatches, and they promote a false sense of security. They may not be complete: an arrest may show up without the follow-up showing that no charges were filed, or whatever else shows that the person is innocent. Expungements and pardons may fail to clear the record, or may not be returned at all. On top of all that, criminal records are generally hard to read: it’s difficult for a lay person to tell from a court printout what someone’s exact criminal history is.

I want to add a note on the sorts of things that help promote not only the hype, but also the background check as a solution.

As the article notes, Wired editor Kevin Poulsen wrote a Perl script to compare sex offender lists with names on MySpace. One individual ended up being arrested as a result, and Wired wrote it up under the headline MySpace Predator Caught By Code.

At the time, I blogged about this on my previous blog:

Wired wrote some code to match the information in the national sex offender database, namely first and last name and zip code (within 5 miles), with profiles on MySpace. This gave them “vast numbers of false or unverifiable matches.” It took months of part-time work, looking at each profile, to figure out which were actual predators still using the site for their predation. Some profiles were dormant. Some were innocent. One led to an arrest.

But here was the problem. Not just with the process, but with the entire pitch:

The predator was not caught by code. “Vast numbers of false or unverifiable” matches were caught by code.

It was the human work of tracking down all the false matches and doing investigations that actually caught the bad guy. I predicted at the time that this exercise would incorrectly portray the “magic” of data matching. Not only does it promote the hype of sex predators on social network sites, but it also promotes the idea that there is an easy “search” that one can make to check this threat.
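To make the false-match problem concrete, here is a minimal sketch of name-and-zip matching. It is not Wired’s script and not any real screening product: the `Person` record, the `name_matches` function and all the data are made up for illustration, and exact zip equality stands in for the “within 5 miles” rule.

```python
from collections import namedtuple

# Hypothetical record shape; real registries and profiles carry more fields.
Person = namedtuple("Person", ["first", "last", "zip_code"])


def normalize(name):
    """Lowercase and trim a name so trivial differences do not block a match."""
    return name.strip().lower()


def name_matches(registry, profiles):
    """Return every (registry entry, profile) pair sharing first name, last name and zip."""
    index = {}
    for entry in registry:
        key = (normalize(entry.first), normalize(entry.last), entry.zip_code)
        index.setdefault(key, []).append(entry)

    hits = []
    for profile in profiles:
        key = (normalize(profile.first), normalize(profile.last), profile.zip_code)
        for entry in index.get(key, []):
            hits.append((entry, profile))
    return hits


# Made-up data: one registry entry, three profiles in the same zip code.
registry = [Person("John", "Smith", "07030")]
profiles = [
    Person("john", "smith", "07030"),    # a namesake: flagged
    Person("John", "Smith", "07030"),    # maybe the registrant, maybe another namesake: flagged
    Person("Johnny", "Smyth", "07030"),  # a variant spelling: not flagged at all
]

hits = name_matches(registry, profiles)
print(len(hits), "candidate matches; none is confirmed without human follow-up")
```

The code can only produce candidates. It flags both people named John Smith without being able to tell them apart, and it misses the variant spelling entirely. Sorting that out is the human investigative work.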

Posted: November 27, 2007

Connecticut Takes Some Protection Order Info off the Web

Via PI Buzz I hear that the Connecticut court is removing some protection / restraining order information from its website.

Effective Monday, Dec. 3, 2007, and in accordance with federal law, information identifying a party protected by a restraining order will no longer be available through the civil/family look-up section of the Judicial Branch’s website. This federal prohibition does not extend to disclosable information in a file at a court clerk’s office.

Under a 2006 amendment to the Violence Against Women Reauthorization Act of 2005, no state, Indian tribe or territory “shall make available publicly on the Internet any information regarding the registration or filing of a protection order, restraining order, or injunction in either the issuing or enforcing State, tribal, or territorial jurisdiction, if such publication would be likely to publicly reveal the identity or location of the party protected under such order.”

EPIC and several domestic violence advocates in DC filed comments against a proposal by the DC courts to place information like this online. We highlighted this VAWA prohibition. In those comments we pointed out that DC law requires an intrafamily relationship before a protection order is issued. That means that information like the restrained party’s name might identify the protected party. We also pointed out that information about addresses that the restrained party cannot approach is likely to disclose the protected party’s location. We also pointed to other privacy problems with other court records, including divorces and civil cases.

It looks like the Connecticut court will still keep a lot of things online. For more on the risks of that, see our comments (pdf).

Posted: November 23, 2007

Pro-Privacy Anti-Cyberbullying Public Service Announcement

People my age remember the anti-drug public service announcements from the 1980s. Classic lines like “I learned it by watching you” and “this is your brain on drugs.” Besides, of course, the vast socio-economic effects of the war on drugs, these are probably some of the most memorable icons of the ridiculousness of that era. At least for white middle class me.

But it seems that nowadays, kids are being warned against cyberbullying. The slogan is “Delete cyberbullying. Don’t write it. Don’t forward it.”

I like how they do more than just discourage the creation of offending content. They point out that it also propagates via user action. Forwarding, linking, and adding to newsfeeds help to move the information around, and help to propagate the privacy violation. Forwarding not only increases the audience, it may also move the content from a non-indexed medium, like text messages, to one that search engines pick up, like blogs or social networking sites.

Like a good PSA, they’re tough to watch and pretty intense:

I like that the overbearing medium of the public service announcement is being used to promote a privacy aware generation. Better than stoking simplistic prohibitionist hysteria.

Posted: November 18, 2007

“Do Not Track” lists and registries

Several consumer groups have proposed a do not track list in response to the problem of behavioral profiling online. The idea is that domains using technologies that track users over the internet would register with the Federal Trade Commission. Internet users who do not want to be tracked could then download this list to block the tracking technologies. This would be accomplished with a browser extension, plugin or some other technical method on the user end. The groups have provided a PDF image that describes how the system works.
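As a rough sketch of the user-end piece, here is how a browser extension might check requests against a downloaded list. The proposal does not specify a file format or any code, so everything here is an assumption: the one-domain-per-line format, the domain names, and the `load_tracker_list` and `is_blocked` functions are all hypothetical.

```python
from urllib.parse import urlsplit


def load_tracker_list(lines):
    """Parse a hypothetical one-domain-per-line registry, skipping blanks and comments."""
    domains = set()
    for line in lines:
        line = line.strip().lower()
        if line and not line.startswith("#"):
            domains.add(line)
    return domains


def is_blocked(url, tracker_domains):
    """True if the URL's host is a registered tracking domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    parts = host.split(".")
    # Check the full host and then each parent domain in turn.
    return any(".".join(parts[i:]) in tracker_domains for i in range(len(parts)))


# Made-up registry contents; a real list would be downloaded, not embedded.
registry_text = """
# hypothetical registry contents
tracker.example
ads.example.net
"""
trackers = load_tracker_list(registry_text.splitlines())

print(is_blocked("http://pixel.tracker.example/beacon.gif", trackers))
print(is_blocked("http://news.example.org/story", trackers))
```

Note what the user never does here: sign up anywhere. The list holds the trackers’ domains, and the check happens entirely on the user’s own machine.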

The idea is not without its critics. Declan McCullagh writes:

The pro-regulation lobbyists and activists are most upset about behavioral advertising, meaning computer-generated ads that are based on pages a visitor previously viewed. Someone who spends a lot of time reading a newspaper’s Asia travel articles may see ads for trips to China even when perusing sports scores. Quelle horreur!


I think some of the messaging on this is a bit off. At least, it gives people the wrong idea as to how this works. Note this Washington Post article:

Privacy, consumer and technology groups yesterday proposed the creation of a Do Not Track list similar to the Do Not Call phone list, allowing people to prevent companies from tracking which Web sites they visit.

Under Do Not Call, you sign up your number for a list. Telemarketers are then prohibited from calling the numbers on this list. Likening this recent proposal to Do Not Call gives people the idea that they have to sign up for a do-not-track list. That someone will be keeping track of all the people that don’t want to be tracked. That’s not quite how this works. This works more like a sex offender registry: the people we are on the lookout for (the trackers/sex offenders) are the ones that are tracked. Not the consumers.

Of course it is problematic messaging to compare servers that track online consumers to sex offenders. But it does describe the interaction better: users are not signing up with the government, they’re using the government list to know who to avoid.


The system does have some limitations. It doesn’t address all the data collection and use practices out there. One major item left off the list is data collection by search engines. That’s data that can be used for behavioral profiling. Especially since search engines like Google keep individually identified information.

It’s also basically an opt-out system. I’ve talked about the problems with opt-out before. And also how opt-in is better.

But it does mitigate some problems with opt-out. Under opt-out, the data collector has no incentive to explain its data collection practices to the users. In fact, the incentive is to not explain them. Also under opt-out, the consumer has to go to each place and opt out of that one place. Burdensome. This proposal fixes those problems by legislating an incentive for the collector to disclose. It also allows one easy opt-out, rather than many.

It basically complements and facilitates many self-defense measures that are out there. I’m quite tech savvy. I use adblock, I manage my cookies. I block most third party scripts on the sites I visit. It seems like it would be more efficient to let all users simply make one choice — be tracked or not — than to have to make each choice like I do. And it would make it easier if the law facilitated this, by mandating that trackers disclose this information to the FTC. Investors make decisions based on mandated disclosures. Consumers should be able to as well.

Posted: November 4, 2007

Domestic Violence Court Records and Privacy

I previously blogged about a proposal to place District of Columbia domestic violence and domestic relations (divorces, child neglect, child custody) court records online. These dockets contain case name, case type, scheduling, address where service occurred, and even some dispositions.

I worked with the domestic violence community here in DC to create a set of comments (pdf) to the court. They touched on the topics I had previously mentioned: data brokers, identity theft, and stigma. Plus some more.

VAWA Prohibition

Importantly, a lot of this proposal is prohibited by federal law. The Violence Against Women Act (VAWA) prohibits the internet publication of protection order information. For more, I created a web page on VAWA and Privacy at EPIC. The prohibition language in VAWA Section 106(c) states:

Limits on Internet Publication of Protection Order Information.–Section 2265(d) of title 18, United States Code, is amended by adding at the end the following:
“(3) Limits on internet publication of registration information.–A State, Indian tribe, or territory shall not make available publicly on the Internet any information regarding the registration or filing of a protection order, restraining order, or injunction in either the issuing or enforcing State, tribal or territorial jurisdiction, if such publication would be likely to publicly reveal the identity or location of the party protected under such order. A State, Indian tribe, or territory may share court-generated and law enforcement-generated information contained in secure, governmental registries for protection order enforcement purposes.”

I italicized some important language in there. The goal of this prohibition is to protect the privacy of the protected person. Lots of things besides the name and address of the protected person are likely to reveal their identity and location. The name of the restrained party can reveal the identity of the protected person. Here in DC, you can only get a protection order if you have an “intrafamily relationship.” This means that the universe of possible protected people for a given restrained party is small. These orders usually include stay away provisions which reveal the locations that the protected person frequents, not just their home address. DC itself is a small jurisdiction, so if you are from out of state and register your protection order here, then you will basically be providing notice that you moved to or have a connection to DC.

This prohibition reaches more than just the domestic violence docket. As you can read in the comments, the definition of “protection order” may include criminal orders as well as orders in the domestic relations docket. So this rule impacts those dockets as well.

Other Records Affect Domestic Violence Survivors

The comments also directly discuss the threat of data brokers. We presented examples of data broker products, such as marketing lists of people who are “Single Again” or who have “recently filed divorces.” The source of this marketing information is family court records. Making it easier for data brokers to collect data makes it cheaper for them to spread it, which leads to more information flows.

We recommended that technical and legal measures guard against data broker access to online court records. Technically, CAPTCHAs should be implemented:

A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. For example, humans can read distorted text . . . but current computer programs can’t

Legally, the court should simply refuse to allow commercial resellers to access online court records.


We recommended a password-protected database as the best way to balance convenience, privacy and transparency. This way proper security, such as the data broker restrictions, could be implemented. We also recommended a basic set of Fair Information Practices, including that individuals have the ability to control whether their information is placed online. The details are in the comments.

Posted: November 1, 2007