Term and markup extraction

Meguebli Youssef

Nov 19, 2013, 5:24:03 AM
to zemanta-d...@googlegroups.com
Dear all,

I recently found the Zemanta API, which offers an interesting tool for extracting terms and markup from raw text. As I understand it, the term extraction service is based on the ODP taxonomy, but I don't know exactly how it works. Do you have any information about the technique or approach used, and is there a reference I could cite? I am going to publish a paper based on this service and need to include references for the term extraction approach.

As for markup extraction, it is a service that extracts the most important terms using well-known sites such as Wikipedia and YouTube, but as with term extraction, I have no idea how it works or which approaches it uses. Could you please give me some references on the approaches used, mainly for the Wikipedia case? In my work I only use markup that was derived from Wikipedia.

Thanks a lot!

Tomaž Šolc

Nov 19, 2013, 9:12:05 AM
to zemanta-d...@googlegroups.com
Hi

On 19. 11. 2013 11:24, Meguebli Youssef wrote:
> Please can you give some references about approaches used mainly
> in the case we use Wikipédia. Indeed, in my work I take only markup that
> had been deduced based on Wikipédia.

Regarding links to Wikipedia, I wrote a paper in 2008 that might help
you. It's probably outdated though:

http://www.tablix.org/~avian/blog/papers/generation_of_intext_hyperlinks.pdf

Also, see this talk given at Wikimania 2008:

https://www.youtube.com/watch?v=SoIfie9srxY

You might also find other, smaller bits of information in old blog posts
either on my blog or on the official Zemanta blog.

Best regards
Tomaž

Meguebli Youssef

Nov 19, 2013, 11:23:31 AM
to zemanta-d...@googlegroups.com
Thank you for your quick answer, but I forgot to also ask about the method or technique used to extract keywords using the ODP taxonomy.