What combination of search parameters is needed to reproduce the front page of HN?


Nick Nome

Jun 4, 2011, 1:51:45 PM
to HNSearch
Is it possible to reproduce the content and order of articles on the
front page of HN through the API?
Thanks,
--Patrick

andres

Jun 4, 2011, 2:02:33 PM
to HNSearch
Great question. To a first approximation, yes: you could sort results
by the Hacker News hotness algorithm. However, you'll probably want
a caching layer on top of that to avoid sorting 3M items on every
user request. Also, Paul might be using other signals in his ranking
algorithm that we don't have access to.

Andres

Zack Maril

Jun 4, 2011, 3:24:34 PM
to HNSearch
That was actually my question too. I've been looking for the past
hour and just found these:
http://news.ycombinator.com/item?id=1781013
http://amix.dk/blog/post/19574

So you need to use the votes, gravity weight and age of the article to
get the basic ranking.
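
As a rough sketch of that basic ranking: the simplified form usually quoted from those two links is (votes - 1) / (age_hours + 2) ^ gravity, with gravity set to 1.8. The function name and the sample stories below are made up for illustration, and the penalty factors discussed further down are left out.

```python
def hn_rank(votes: int, age_hours: float, gravity: float = 1.8) -> float:
    """Simplified score: (votes - 1) / (age_hours + 2) ** gravity.

    Assumption: this is only the basic formula. It ignores the
    controversy, lightweight, and no-URL adjustments.
    """
    return (votes - 1) / (age_hours + 2) ** gravity


# Hypothetical sample data, not real HN stories.
stories = [
    {"id": "a", "votes": 200, "age_hours": 24},
    {"id": "b", "votes": 50, "age_hours": 2},
    {"id": "c", "votes": 10, "age_hours": 1},
]

# Highest score first, as on a front page.
ranked = sorted(
    stories,
    key=lambda s: hn_rank(s["votes"], s["age_hours"]),
    reverse=True,
)
```

With these numbers the recent stories outrank the older one even though it has far more votes, which is the effect of the gravity exponent.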

Looking further:
There are a bunch of variables involved that pg didn't provide defs
for in his post. But, thank god for open source, you can get the
algorithms by downloading arc. Here's what I've figured out so far:

Numeric variables you need to get the front-page algo for a post:
Votes: given by search
Age: given by search
Gravity weight: given constant
Contro-factor: calculated
Lightweight factor: calculated
Front threshold: given

True/false variables:
Whether it has a URL: given by search
Whether it is controversial: not sure yet
Whether it is lightweight: not sure yet

You won't be able to clone hn exactly, since pg has more info than you
do. Plus, news.yc shows he uses some random functions to make sure
that the front page keeps flowing. But you can get close enough that
most people wouldn't be able to tell the difference if they didn't
know what was going on.

Alternative: you could just scrape news.ycombinator.com and extract the
pieces you want. I thought about doing an "HN Sans" that filters
things out of the current front page, but that's been done
and it doesn't use the API much at all.

Good luck!
-Zack

Nick Nome

Jun 4, 2011, 3:42:18 PM
to HNSearch
What would I submit as the value of q= if I just want everything that
was in the top 30 according to the hotness algorithm?
Thanks,
--Patrick

Zack Maril

Jun 4, 2011, 3:46:39 PM
to hnse...@googlegroups.com
Try adding "&limit=30" to the end of your request string. Or, if you use jQuery, add another item such as "limit": 30 to the request data.

From the API docs: http://www.hnsearch.com/api

andres

Jun 4, 2011, 4:01:53 PM
to HNSearch
You can sort by the hotness algorithm function and set a limit of 30:

http://api.thriftdb.com/api.hnsearch.com/items/_search?limit=30&sortby=product(points,pow(2,div(div(ms(create_ts,NOW),3600000),72)))%20desc&pretty_print=true

Note: tweak function parameters to your liking
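
To make the sort function easier to tweak, here is a small sketch of what it appears to compute and how the URL could be rebuilt. It assumes ms(create_ts,NOW) returns create_ts minus NOW in milliseconds (as in Solr-style function queries), so the expression works out to points * 2^(-age_hours / 72), a 72-hour half-life. The helper names decay_score and build_search_url are hypothetical, and nothing here calls the API.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://api.thriftdb.com/api.hnsearch.com/items/_search"


def decay_score(points: float, age_hours: float,
                half_life_hours: float = 72.0) -> float:
    """Local equivalent of the sort expression.

    Assumption: ms(create_ts, NOW) is negative for past items, so the
    score halves every `half_life_hours`.
    """
    return points * 2 ** (-age_hours / half_life_hours)


def build_search_url(limit: int = 30, half_life_hours: int = 72) -> str:
    """Rebuild the query URL with a configurable limit and half-life."""
    sort_fn = (
        "product(points,pow(2,div(div(ms(create_ts,NOW),3600000),"
        f"{half_life_hours})))"
    )
    params = {
        "limit": limit,
        "sortby": f"{sort_fn} desc",
        "pretty_print": "true",
    }
    # Keep parentheses and commas literal; encode the space as %20.
    return BASE_URL + "?" + urlencode(params, safe="(),", quote_via=quote)


url = build_search_url()
```

A smaller half-life (say 12 instead of 72) should favor newer items more strongly.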

Andres

Nick Nome

Jun 4, 2011, 4:21:05 PM
to HNSearch
Thanks!

On Jun 4, 1:01 pm, andres <and...@octopart.com> wrote:
> You can sort by the hotness algorithm function and set a limit of 30:
>
> http://api.thriftdb.com/api.hnsearch.com/items/_search?limit=30&sortb...

Alex Figueroa

Oct 16, 2013, 11:19:39 PM
to hnse...@googlegroups.com
Hey,

I know it's been a while, but I've found that this actually gives you the "Best" page of HN and not necessarily the "Front Page" view you see when loading an HN app or going to news.ycombinator.com itself.

Is this expected? Is there a query available to get the front page data?

Thanks,
Alex Figueroa

Andres Morey

Oct 17, 2013, 9:22:43 AM
to hnse...@googlegroups.com, hnse...@googlegroups.com
HNSearch doesn't have access to all of the ranking signals that HN does, so you can't reproduce the front page exactly for an arbitrary point in time. However, you can access the HN RSS feed through HNSearch, marked up with HNSearch item IDs:


If you like, you can poll that to build up a history of HN front pages over time.
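
One way the polling could look, as a minimal sketch: the feed URL and its format are not assumed here, so fetching and parsing are left to a caller-supplied function (fetch_front_page is a hypothetical name) that returns item IDs in front-page order. Snapshots are only stored when the ordering changes.

```python
import time
from typing import Callable, List, Tuple

# (unix timestamp, item IDs in front-page order)
Snapshot = Tuple[float, List[str]]


def record_snapshot(history: List[Snapshot], item_ids: List[str],
                    now: float) -> bool:
    """Append a snapshot unless the ordering matches the previous one."""
    if history and history[-1][1] == item_ids:
        return False
    history.append((now, list(item_ids)))
    return True


def poll(fetch_front_page: Callable[[], List[str]],
         history: List[Snapshot],
         interval_s: float = 300.0,
         rounds: int = 3,
         sleep: Callable[[float], None] = time.sleep) -> None:
    """Call the (hypothetical) fetcher `rounds` times, pausing in between."""
    for i in range(rounds):
        record_snapshot(history, fetch_front_page(), time.time())
        if i < rounds - 1:
            sleep(interval_s)
```

The interval and the in-memory list are placeholders; a real history would presumably be written to disk or a database.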