Concepts and names related to sports or politics


Vavliakis Konstantinos

Nov 27, 2009, 8:44:55 AM
to get.theinfo
Dear all,

I am looking for two datasets, one comprising tags related to
sports and one comprising tags related to politics.
Preferably, the tags should include both relevant concepts and names.
Any help would be appreciated!


Kind regards,

Kostas

thermo XD

Nov 30, 2009, 2:27:00 AM
to get-t...@googlegroups.com, kost...@gmail.com
I recommend grabbing the video tags from YouTube users' favorited
videos (via gdata) and extracting the tags that co-occur multiple times
with "sports" and "politics" or similar. For example:

http://gdata.youtube.com/feeds/api/users/NufffRespect/favorites?v=2

Then, for a favorite such as http://www.youtube.com/watch?v=fTahZE4q90U,
extract the category and tags from the page source, for example:

<span class="watch-channel-stat">Category:&nbsp;</span>
<a id="watch-video-category" href="/news" class="hLink category"
onmousedown="yt.analytics.urchinTracker('/Events/VideoWatch/VideoCategoryLink');">News
&amp; Politics</a><br>
</div>


<div id="watch-video-tags-div">

<div class="floatL">
<span class="watch-channel-stat">Tags:&nbsp;</span>
</div>
<div id="watch-video-tags" class="floatL">
<a href="/results?search_query=illuminati&amp;search=tag"
class="hLink">illuminati</a>&nbsp;
<a href="/results?search_query=republican&amp;search=tag"
class="hLink">republican</a>&nbsp;
<a href="/results?search_query=democrat&amp;search=tag"
class="hLink">democrat</a>&nbsp;

<a href="/results?search_query=obama&amp;search=tag"
class="hLink">obama</a>&nbsp;
<a href="/results?search_query=mccain&amp;search=tag"
class="hLink">mccain</a>&nbsp;
<a href="/results?search_query=bush&amp;search=tag"
class="hLink">bush</a>&nbsp;
<a href="/results?search_query=clinton&amp;search=tag"
class="hLink">clinton</a>&nbsp;
<a href="/results?search_query=new&amp;search=tag"
class="hLink">new</a>&nbsp;
<a href="/results?search_query=world&amp;search=tag"
class="hLink">world</a>&nbsp;

<a href="/results?search_query=order&amp;search=tag"
class="hLink">order</a>&nbsp;
<a href="/results?search_query=symbolism&amp;search=tag"
class="hLink">symbolism</a>&nbsp;
<a href="/results?search_query=left&amp;search=tag"
class="hLink">left</a>&nbsp;
<a href="/results?search_query=right&amp;search=tag"
class="hLink">right</a>&nbsp;
<a href="/results?search_query=politics&amp;search=tag"
class="hLink">politics</a>&nbsp;
<a href="/results?search_query=2008&amp;search=tag"
class="hLink">2008</a>&nbsp;

<a href="/results?search_query=election&amp;search=tag"
class="hLink">election</a>&nbsp;
<a href="/results?search_query=nwo&amp;search=tag"
class="hLink">nwo</a>&nbsp;
<a href="/results?search_query=Documentary&amp;search=tag"
class="hLink">Documentary</a>&nbsp;
<a href="/results?search_query=Commentary&amp;search=tag"
class="hLink">Commentary</a>&nbsp;
<a href="/results?search_query=Analysis&amp;search=tag"
class="hLink">Analysis</a>&nbsp;
<a href="/results?search_query=Political&amp;search=tag"
class="hLink">Political</a>&nbsp;

<a href="/results?search_query=Commercial&amp;search=tag"
class="hLink">Commercial</a>&nbsp;

</div>
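
As a rough illustration of the extraction step, here is a minimal sketch using only Python's standard library. It assumes the page markup looks like the fragment above (tag links inside the element with id="watch-video-tags"); the names TagLinkParser and extract_tags are made up for this example, and the markup itself may well change.

```python
from html.parser import HTMLParser


class TagLinkParser(HTMLParser):
    """Collect the text of <a> links inside the div with id="watch-video-tags".

    Assumption: the markup matches the fragment quoted in this thread.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0        # <div> nesting depth inside the tags container (0 = outside)
        self.in_link = False  # True while inside an <a> within the container
        self.tags = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "div":
            if self.depth > 0:
                self.depth += 1
            elif attrs.get("id") == "watch-video-tags":
                self.depth = 1
        elif tag == "a" and self.depth > 0:
            self.in_link = True
            self.tags.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
        elif tag == "a":
            self.in_link = False

    def handle_data(self, data):
        if self.in_link:
            self.tags[-1] += data


def extract_tags(page_html):
    """Return the lower-cased tag names found in the tags container."""
    parser = TagLinkParser()
    parser.feed(page_html)
    parser.close()
    return [t.strip().lower() for t in parser.tags if t.strip()]
```

Calling extract_tags() on the saved page source of each favorited video gives one tag list per video. Lower-casing is a choice made here so that "Political" and "political" count as the same tag.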

Fine-tune the minimum number of co-occurrences to filter out spurious
tags that appear only a few times.
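
The co-occurrence filter could be sketched roughly as follows. This assumes you already have one list of tags per video; the function name cooccurring_tags and the min_count parameter are hypothetical, and the threshold of 2 is only a starting point to tune.

```python
from collections import Counter


def cooccurring_tags(videos, seed, min_count=2):
    """Return {tag: count} for tags seen together with `seed`.

    videos    -- iterable of tag lists, one list per video
    seed      -- lower-case seed tag, e.g. "politics" or "sports"
    min_count -- minimum number of videos a tag must share with the seed
    """
    counts = Counter()
    for tags in videos:
        unique = {t.lower() for t in tags}  # count each tag once per video
        if seed in unique:
            counts.update(unique - {seed})
    return {tag: n for tag, n in counts.items() if n >= min_count}
```

Raising min_count trims one-off tags at the cost of losing rarer names, so it is worth trying a few values on a sample before settling on one.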

Regards,
Jose