ideaproxy
Mar 20, 2014, 3:19:26 PM
This disclosure describes a method for estimating demographic information for a person based on location and online attributes. A web service lets customers query the data, and the inputs logged by the demographics web service, along with those logged by an online fraud screening web service, are used to improve the data for both the demographics and fraud screening web services.
Definitions:
* A demographic attribute is an average, estimate, or distribution of one of the following across people:
  * Income
  * Gender
  * Age
  * Education level
* A geographical attribute is a postal code/country pair, city/country pair, or a similar identifier for a geographical area.
* Inputted country and postal code are the location an online user enters into an e-commerce or other online form.
* Client ID is a unique identifier assigned to clients that query the demographics or fraud screening web services.
* User agent is the HTTP header identifying the browser and operating system an online visitor uses.
* Accept language is the HTTP header listing the languages the browser is configured to accept.
Step 1. Load demographic attributes for geographical attribute(s) into a database. This data is only available for selected countries.
Step 2. Read inputted country, postal code, city, client ID, email domain, user_agent, and first and last name from the fraud screening or demographics web service log database. Look up demographic attributes by billing postal code and country, or by another geographical attribute. For each client ID, email domain, user_agent, and first and last name, combine the demographic attributes of the associated locations using a weighted average or another mathematical algorithm, and write these "derived demographics" to a database under the same keys.
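A minimal sketch of Step 2 for one attribute (income) and one key (email domain). The table `AREA_INCOME`, the row field names, and the function `derive_income_by_key` are hypothetical, and the income figures are made-up illustration values. Each log row counts once, so locations seen more often weigh more heavily; this is only one possible reading of "weighted average".

```python
from collections import defaultdict

# Hypothetical Step 1 table: (country, postal) -> average income for that area.
# The figures are illustrative only.
AREA_INCOME = {
    ("US", "02139"): 80000.0,
    ("US", "10001"): 100000.0,
}


def derive_income_by_key(log_rows, key_field):
    """Average the area income over all log rows that share a key value.

    Rows whose location has no Step 1 data are skipped.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in log_rows:
        income = AREA_INCOME.get((row["country"], row["postal"]))
        if income is None:
            continue  # no demographic data for this location
        totals[row[key_field]] += income
        counts[row[key_field]] += 1
    return {key: totals[key] / counts[key] for key in totals}


log_rows = [
    {"country": "US", "postal": "02139", "email_domain": "example.com"},
    {"country": "US", "postal": "10001", "email_domain": "example.com"},
    {"country": "US", "postal": "10001", "email_domain": "example.org"},
    {"country": "FR", "postal": "75001", "email_domain": "example.org"},
]
derived = derive_income_by_key(log_rows, "email_domain")
print(derived)
```

The same function could be called with `"client_id"` or `"user_agent"` as `key_field`, provided the rows carry those fields.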
Step 3. For countries where demographic attributes are not available by location, we compute an estimate of demographics by location as follows.
Read country, postal code, city, client ID, and user_agent from the fraud screening or demographics service log database. Look up the derived demographics attributes computed in Step 2 using client ID and user_agent. Compute the average of the derived demographics attributes for each geographical attribute and save it in a database, adjusting for relative country demographic differences. For instance, if the average income in a country is lower than the worldwide average, we reduce the derived income.
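Step 3 could be sketched roughly as follows. The text does not say how the country adjustment is computed, so scaling by the ratio of country average to worldwide average is an assumption here, as are the function name `estimate_area_income`, the field names, and the illustrative numbers.

```python
from collections import defaultdict


def estimate_area_income(log_rows, derived_by_key, country_avg, world_avg):
    """Estimate income per (country, postal) from Step 2 derived demographics.

    derived_by_key maps (field, value) -> derived income, where field is
    "client_id" or "user_agent". The country adjustment is an assumed
    multiplicative ratio; countries without an average are left unscaled.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in log_rows:
        area = (row["country"], row["postal"])
        for field in ("client_id", "user_agent"):
            income = derived_by_key.get((field, row[field]))
            if income is not None:
                sums[area] += income
                counts[area] += 1
    estimates = {}
    for area, total in sums.items():
        ratio = country_avg.get(area[0], world_avg) / world_avg
        estimates[area] = (total / counts[area]) * ratio
    return estimates


derived_by_key = {
    ("client_id", "c1"): 60000.0,
    ("user_agent", "UA-1"): 40000.0,
}
rows = [{"country": "XX", "postal": "1000", "client_id": "c1", "user_agent": "UA-1"}]
estimates = estimate_area_income(rows, derived_by_key, {"XX": 5000.0}, 10000.0)
print(estimates)
```

In this example the country average is half the worldwide average, so the unadjusted mean of 50000 is halved.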
Step 4. The demographics web service takes the following optional inputs: IP address, first name, last name, address, city, region, postal code, email domain, user agent, accept language, MD5 hash of email address, and phone number.
The web service returns:
* Estimated income - Using weighted average or other combination of:
  * demographics for the geographical attribute from Step 1, when available
  * demographics associated with email domain, user agent, and first and last name from Step 2
  * purchase history from the fraud screening and demographics web services associated with email MD5 hash, phone, and name
* Affluence score - Same as above but using average of derived demographics data from Step 3
* Age distribution - Probability distribution representing the proportion of persons in successive age categories in a given population. Computed in a similar manner to estimated income and affluence score.
* Gender - Database lookup using first name
* User type - returned from looking up type of end user (residential, business, cellular, traveler) by IP address in IP intelligence database.
* Device type - type and version of browser, operating system, derived from user agent string.
* Spoken languages - Database lookup using accept-language header
* Population density - Database lookup of geographical attribute to determine if someone is in rural or urban area.
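The estimated income field above combines several sources, some of which may be missing for a given query. A minimal sketch of one way to do that, assuming a weighted average renormalized over the sources that returned a value; the function name `combine_estimates`, the source labels, and the weights are hypothetical.

```python
def combine_estimates(estimates, weights):
    """Weighted average over the sources that actually returned a value.

    estimates: {source_name: value or None}
    weights:   {source_name: non-negative weight}
    Returns None when no source with a positive weight is available.
    """
    total = 0.0
    weight_sum = 0.0
    for source, value in estimates.items():
        if value is None:
            continue  # e.g. no Step 1 data for this country
        w = weights.get(source, 0.0)
        total += w * value
        weight_sum += w
    return total / weight_sum if weight_sum > 0 else None


income = combine_estimates(
    {"area": None, "email_domain": 60000.0, "user_agent": 40000.0},
    {"area": 0.5, "email_domain": 0.25, "user_agent": 0.25},
)
print(income)
```

Renormalizing over available sources keeps the result on the same scale whether or not location data exists for the country.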
The web service logs the inputs into a database.
Step 5. Web service logs from Step 4, combined with logs from the fraud screening service, are used to:
* Build a reputation and profile around name, email, and phone, including associated billing and IP locations. The fraud screening service uses this reputation to influence the risk score. If the billing postal code, name, email, and phone on an e-commerce order are consistent with the profile built around those attributes, a low risk score is returned. For instance, seeing the same postal code associated with an email again suggests the risk is lower. If the billing postal code varies, or if no activity has been seen for a name, email, or phone, the risk score can be increased.
* Build an IP geolocation database by aggregating IP address/location pairs from the logs to compute an estimated, average, or median location for IP netblocks as well as individual IP addresses.
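Both uses of the logs can be sketched in a few lines. This is only an illustration under stated assumptions: logged locations are taken to be already resolved to latitude/longitude, netblocks are fixed-size IPv4 /24 blocks, the median is taken per coordinate, and the risk adjustment values of -1 and +1 are placeholders. The names `netblock_locations` and `risk_adjustment` are hypothetical.

```python
import ipaddress
from collections import defaultdict
from statistics import median


def netblock_locations(pairs, prefix_len=24):
    """Median (lat, lon) per netblock from (ip, lat, lon) log entries."""
    by_block = defaultdict(list)
    for ip, lat, lon in pairs:
        # strict=False masks the host bits, giving the enclosing netblock.
        block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_block[block].append((lat, lon))
    return {
        str(block): (
            median(lat for lat, _ in points),
            median(lon for _, lon in points),
        )
        for block, points in by_block.items()
    }


def risk_adjustment(email, postal, postals_by_email):
    """Assumed rule: -1 if this postal code was seen with the email, else +1.

    An email with no recorded activity also yields +1.
    """
    seen = postals_by_email.get(email, set())
    return -1 if postal in seen else 1


pairs = [
    ("203.0.113.7", 40.0, -74.0),
    ("203.0.113.9", 42.0, -71.0),
    ("203.0.113.200", 41.0, -73.0),
    ("198.51.100.1", 51.5, -0.1),
]
locations = netblock_locations(pairs)
profile = {"a@example.com": {"02139"}}
print(locations)
print(risk_adjustment("a@example.com", "02139", profile))
```

A per-coordinate median is less sensitive than a mean to a few orders shipped far from where the IP address is actually used, which is one reason the text's "median" option may be preferable.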