
IP Demographics


ideaproxy

Oct 11, 2012, 5:38:24 PM
The goal of this project is to develop a product that will return demographic information given a user's IP address and other information about the user, like their browser's user-agent.

The first iteration of this product will focus on providing an income score that tells our clients whether the person is likely affluent, average, or lower income. Our clients can then use this information to display pages or products that may be of particular interest to such a user, for example to decide whether or not to offer a promotion. In addition to the income score, we could return education, gender, and tech-savvy scores.

There would be a RESTful web service that takes an IP address and user-agent string and returns an affluence rating.

For the first iteration, we will be using census data for block groups. We will map the addresses/ZIP codes on online e-commerce orders in our fraud detection service log file to these block groups, then use the block groups' demographic data to correlate the user-agent strings on those orders, or features of those strings, with median household income. We will then combine this data with the user's location to provide an affluence score.

We will develop a mapping between the addresses in the log data and the census data, and break the user agent into features.
We will find the correlation between these features and median household income. In particular, we will develop a sense of how strong the correlations are and whether they will be useful in constructing a score.
Develop a good model for normalizing and combining the features. This might involve a machine learning algorithm.
Implement the chosen model in a programming language.
Develop a web service to provide this data to customers given an IP address, user-agent and accept-language, returning demographic information.
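
As a rough illustration of the feature and correlation steps above, here is a minimal sketch. The token lists and the function names (user_agent_features, mean_income_by_feature) are assumptions made up for this example, not part of the project, and a per-feature mean is only a first look at how strong the relationship might be.

```python
from collections import defaultdict
from statistics import mean


def user_agent_features(user_agent):
    """Break a user-agent string into coarse features (hypothetical token lists)."""
    ua = user_agent.lower()
    features = set()
    # Operating-system tokens; illustrative, not exhaustive.
    for token in ("windows", "mac os x", "iphone", "ipad", "android", "linux"):
        if token in ua:
            features.add("os:" + token)
    # Browser tokens; real user agents overlap (Chrome also says "Safari"),
    # so a production parser would need ordering rules.
    for token in ("firefox", "chrome", "safari", "msie"):
        if token in ua:
            features.add("browser:" + token)
    return features


def mean_income_by_feature(orders):
    """orders: iterable of (user_agent, median income of the order's block group)."""
    incomes = defaultdict(list)
    for user_agent, income in orders:
        for feature in user_agent_features(user_agent):
            incomes[feature].append(income)
    return {feature: mean(values) for feature, values in incomes.items()}
```

Comparing the per-feature means with the overall mean gives an early sense of which features carry signal before any model is chosen.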

ideaproxy

Mar 20, 2014, 3:19:26 PM
This disclosure describes a method for estimating demographic information for a person based on location as well as online attributes. A web service is provided for customers to query the data, and the inputs logged by the demographics web service, along with those logged by an online fraud screening web service, are used to improve the data for both services.

Definitions:
* A demographic attribute is an average, estimate, or distribution of one of the following across people:
  * Income
  * Gender
  * Age
  * Education level
* A geographic attribute is a postal/country pair, city/country pair, or a similar identifier for a geographical area.
* Inputted country and postal code are the location an online user enters into an e-commerce or other online form.
* Client ID is a unique identifier assigned to clients that query the demographics or fraud screening web services.
* User agent is the HTTP header representing the browser and operating system an online visitor uses.
* Accept language is the HTTP header representing the languages the browser is configured to accept.

Step 1. Load demographic attributes for geographical attribute(s) into a database. This data is only available for selected countries.

Step 2. Read inputted country, postal code, city, client ID, email domain, user_agent, and first and last name from the fraud screening or demographics web service log database. Look up demographic attributes by billing postal code and country, or by another geographical attribute. For each client ID, email domain, user_agent, and first and last name, combine the demographic attributes of the associated locations using a weighted average or another mathematical algorithm, and write these "derived demographics" to a database under that key.
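
One way to picture Step 2 is the sketch below. It assumes log rows are dicts with "country" and "postal" fields plus the key field, and it uses the simplest weighting (each log row counts once, so a location seen more often weighs more). The function name derive_demographics and the row layout are hypothetical.

```python
from collections import defaultdict


def derive_demographics(log_rows, income_by_location, key_field):
    """Average the Step 1 income of every location seen with each key.

    log_rows: iterable of dicts with "country", "postal" and key_field.
    income_by_location: {(country, postal): income} loaded in Step 1.
    key_field: e.g. "user_agent", "email_domain" or "client_id".
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in log_rows:
        income = income_by_location.get((row["country"], row["postal"]))
        if income is None:
            continue  # no Step 1 data for this location
        totals[row[key_field]] += income
        counts[row[key_field]] += 1
    # Keys that never matched a known location are simply absent.
    return {key: totals[key] / counts[key] for key in totals}
```

Running it once per key field (client ID, email domain, user_agent, name) would give one derived table per field.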

Step 3. For countries where demographic attributes are not available by location, we compute an estimate of demographics by location as follows.

Read country, postal code, city, client ID, and user_agent from the fraud screening or demographics service log database. Look up the derived demographics attributes computed in Step 2 using client ID and user_agent. Compute the average of the derived demographics attributes for each geographical attribute and save it in a database, adjusting for relative country demographic differences. For instance, if average income in a country is lower than the worldwide average, we reduce the derived income.
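
A minimal sketch of Step 3 follows. It assumes the Step 2 output is available as two dicts keyed by client ID and by user_agent, and that the country adjustment is a simple multiplier (country average income divided by worldwide average). Both the multiplier form and the names used here are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def estimate_location_income(log_rows, derived_by_client,
                             derived_by_user_agent, country_ratio):
    """Estimate income per (country, postal) where no Step 1 data exists.

    country_ratio: hypothetical {country: national / worldwide average income}.
    """
    per_location = defaultdict(list)
    for row in log_rows:
        candidates = [
            derived_by_client.get(row["client_id"]),
            derived_by_user_agent.get(row["user_agent"]),
        ]
        found = [value for value in candidates if value is not None]
        if found:
            location = (row["country"], row["postal"])
            per_location[location].append(mean(found))
    # Scale by the country ratio; unknown countries are left unadjusted.
    return {
        location: mean(values) * country_ratio.get(location[0], 1.0)
        for location, values in per_location.items()
    }
```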

Step 4. The demographics web service takes IP address, first name, last name, address, city, region, postal code, email domain, user agent, accept language, MD5 hash of email address, and phone number as optional inputs.
The web service returns:
* Estimated income - Using weighted average or other combination of:
  * demographics for geographical attribute from Step 1, when available
  * demographics associated with email domain, user agent, first and last name from Step 2
  * purchase history from fraud screening and demographics web services associated with email md5, phone, and name
* Affluence score - Same as above but using average of derived demographics data from Step 3
* Age distribution - probability distribution representing proportionate numbers of persons in successive age categories in a given population. Computed in similar manner to estimated income and affluence score.
* Gender - Database lookup using first name
* User type - returned from looking up type of end user (residential, business, cellular, traveler) by IP address in IP intelligence database.
* Device type - type and version of browser, operating system, derived from user agent string.
* Spoken languages - Database lookup using accept-language header
* Population density - Database lookup of geographical attribute to determine if someone is in a rural or urban area.

The web service logs the inputs into a database.
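
Because every input in Step 4 is optional, the estimated income has to be computed from whichever sources are present. The sketch below shows one way to do that with a weighted average that skips missing sources. The source names and weights are hypothetical, and the disclosure allows other combinations.

```python
def estimated_income(sources, weights):
    """Weighted average over the sources that actually returned a value.

    sources: {source_name: income or None}, e.g. from Steps 1 and 2.
    weights: {source_name: weight}; values here are made up for illustration.
    """
    total = 0.0
    weight_sum = 0.0
    for name, value in sources.items():
        if value is None:
            continue  # input was not supplied or had no match
        weight = weights.get(name, 0.0)
        total += weight * value
        weight_sum += weight
    # None signals that no usable source was available.
    return total / weight_sum if weight_sum else None
```

For example, a query with a known postal code and user agent but no email domain would be averaged over the first two sources only.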

Step 5. Web service logs from Step 4, combined with logs from the fraud screening service, are used to:
* Build reputation and profile around name, email, phone, including associated billing and IP locations. Fraud screening service uses reputation to influence risk score. If billing postal, name, email and phone on an e-commerce order are consistent with the profile built around the order attributes, a low risk score is returned. For instance, if we see the same postal code associated with an email, that suggests the risk is lower. If billing postal varies, or if there is no activity seen for a name, email or phone, risk score can be increased.
* Build IP Geolocation database aggregating IP address / location pairs from logs to compute an estimated/average/median location for IP netblocks as well as IP addresses.
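
The IP geolocation aggregation in Step 5 could look roughly like the sketch below. It assumes IPv4 addresses, a fixed /24 netblock size, and locations already converted to latitude/longitude; none of these choices come from the disclosure. It takes the median of each coordinate independently, which is a simplification that breaks down near the 180° meridian.

```python
import ipaddress
from collections import defaultdict
from statistics import median


def netblock_locations(pairs, prefix_len=24):
    """Aggregate (ip, latitude, longitude) log pairs into one point per netblock.

    prefix_len=24 is an assumed block size for IPv4 addresses.
    """
    by_block = defaultdict(list)
    for ip, lat, lon in pairs:
        # strict=False masks the host bits, e.g. 203.0.113.7 -> 203.0.113.0/24
        block = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        by_block[block].append((lat, lon))
    return {
        str(block): (
            median(lat for lat, _ in points),
            median(lon for _, lon in points),
        )
        for block, points in by_block.items()
    }
```

Using the median rather than the mean keeps a single far-away billing address from dragging the estimate for the whole block.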

ideaproxy

Feb 11, 2015, 4:27:34 PM
This disclosure describes a system for determining whether to offer a discount, promo code, customized pricing, or free shipping to an online visitor on an e-commerce website using demographics data.

A merchant would include Javascript code or other client side code on their website.
This Javascript code would post a query to an analysis server, possibly including the customer's location, email, customer ID and purchase history.

The Javascript code could be included with Javascript code that captures information about the visitor's device, and event, click and mouse movement data for fraud detection purposes.

Alternatively the merchant would collect the customer data, including IP, on their webserver and post a query to the analysis server.

The analysis server would receive that data and use it together with the IP address and the user agent and accept language HTTP headers to calculate a demographics profile using the methods described in the "System for providing demographic information for online visitors" disclosure published on alt.free.proxyservers on 2014-03-24.

The server would return an indicator to the Javascript or client code of whether the merchant should offer discounted pricing or free shipping, based on the online visitor's estimated income, affluence, age, gender, and whether the visitor is coming from a residential, business, mobile or traveler network.

The merchant would then report back to the server whether a purchase was completed, and what the price was, or if a promo code or free shipping was applied. This report could be part of a fraud detection web service query.

The server would use the reported data to build models to estimate the conversion rate based on demographics profile. To do this it would randomly return a discount indicator for a subset of queries, then calculate the percentage of customers converted as a function of demographic attributes and whether the customer was offered a discount. This data would be used to build a mathematical model of when it would be optimal to offer a discount, weighing the increased conversion rate against the loss in revenue.
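
The trade-off described above can be sketched as an expected-revenue comparison. The example assumes visitors are grouped into named demographic segments and that the discount is a fixed amount off a single price. The segment labels and function names are hypothetical, and a real model would also need enough samples per segment to be reliable.

```python
from collections import defaultdict


def conversion_rates(reports):
    """reports: iterable of (segment, offered_discount, converted) tuples."""
    counts = defaultdict(lambda: [0, 0])  # [conversions, visits]
    for segment, offered, converted in reports:
        entry = counts[(segment, offered)]
        entry[0] += int(converted)
        entry[1] += 1
    return {key: conv / visits for key, (conv, visits) in counts.items()}


def should_offer_discount(rates, segment, price, discount):
    """Compare expected revenue per visitor with and without the discount."""
    with_discount = rates.get((segment, True), 0.0) * (price - discount)
    without_discount = rates.get((segment, False), 0.0) * price
    return with_discount > without_discount
```

Offering the discount only pays off for a segment when the lift in conversion rate outweighs the revenue given up on each sale.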

The server could also build a model to predict the effect on conversion rate using user behavior on the website and purchase history. For example, if the customer has purchased low-priced items in the past, they might be offered a discount. Or if the customer visits certain pages, they might be offered a discount.