How to search only paintings? API EUROPEANA


Michele Mallia

Dec 7, 2017, 1:20:20 PM
to Europeana API forum
Hi! How can I make a query to retrieve ONLY PAINTINGS or DRAWINGS (about Picasso, for example) from Europeana Art Collections?

Hugo Manguinhas

Feb 13, 2018, 7:55:36 AM
to Europeana API forum
Dear Michele,

Thank you for your email!

At the moment we do not yet have a way to restrict a search to a specific Europeana Thematic Collection such as the one for Art... for now you can only do that on the Collections portal (https://www.europeana.eu/portal/en/collections/art).

This functionality will be added in the near future, but in the meantime you can still query using dc:type and dc:subject with the values painting and drawing, and dc:creator with the value Picasso. I would still advise using the Collections portal to try out and tailor your query, and then reproducing it in the API.
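
A minimal sketch of such a fielded query in Python. The endpoint is the v2 Search API URL quoted later in this thread; the Solr field names `proxy_dc_creator` and `proxy_dc_type` and the `YOUR_API_KEY` placeholder are assumptions and should be checked against the Search API documentation.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.europeana.eu/api/v2/search.json"

def build_fielded_query(creator, types, api_key):
    """Build a search URL restricted by creator and by one or more types."""
    # Assumed field names: proxy_dc_creator and proxy_dc_type.
    type_clause = " OR ".join(f"proxy_dc_type:{t}" for t in types)
    query = f"proxy_dc_creator:{creator} AND ({type_clause})"
    return SEARCH_URL + "?" + urlencode({"wskey": api_key, "query": query})

url = build_fielded_query("Picasso", ["painting", "drawing"], "YOUR_API_KEY")
print(url)
```

The same pattern would apply to dc:subject by adding another clause to the query string.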

I hope this information helps. If you have further questions, don't hesitate to contact us.

Sincerely,
Hugo Manguinhas

mar...@orgonemedia.nl

Aug 9, 2018, 12:13:44 PM
to Europeana API forum
Hi Hugo / Europeana,

I'm the developer of art-tab.eu and the browser extensions. Currently the project is human-curated, which is not sustainable in the long run. I'm looking into ways to make a live connection to the API, getting a random object each time.
Instead of building various queries by hand, it would be VERY useful if I could directly query the already curated Collections, like Art and Fashion! I can query the dataset, but it doesn't return any items, just info.

Any development on querying Collections directly? You mention the 'near future' :)

Cheers,

Martijn

Fred Leeflang

Aug 10, 2018, 6:22:45 AM
to Europeana API forum
Hi Martijn,

That's a really nice add-on. I've been thinking about making something similar myself, but combined with a WordPress plugin for one of the sites I'm working on, https://knipoogje-kunst.nl.

The plan is that this site will be a portal for starting artists where they can begin to display their artwork 'with a wink' -- that is, works of art based on existing art but with a 'wink and a nudge'.

Would you perhaps be interested in working together on that?

-Fred

Hugo Manguinhas

Aug 13, 2018, 6:15:31 AM
to europe...@googlegroups.com
Dear Martijn,

Thank you for your email and reminder!

We recently made that feature available in the Search API... we were only planning to "formally" announce it in September together with a blog post, but the feature has already been in production for a month and is fully ready for use. You can find it in our documentation at https://pro.europeana.eu/resources/apis/search#get-started (look for the "theme" parameter).


Please try it out and give us your feedback.

Best,
The API Team
 


--
Visit Europeana Labs for API Documentation, Open Datasets, and our Apps Showcase - http://labs.europeana.eu
---
You received this message because you are subscribed to the Google Groups "Europeana API forum" group.
To unsubscribe from this group and stop receiving emails from it, send an email to europeanaAPI...@googlegroups.com.
To post to this group, send email to europe...@googlegroups.com.
Visit this group at https://groups.google.com/group/europeanaAPI.
For more options, visit https://groups.google.com/d/optout.

James Morley

Aug 13, 2018, 7:17:55 AM
to europe...@googlegroups.com
Hugo, that is *soooooo* great! I haven't had a chance to test it, but are there any performance hits? I guess this depends on whether the method in effect applies the same underlying query filters in realtime or if the theme values are pushed into the records as facets.

If the latter, more of a curious question than a feature request (I'm sure at some point I'll find a specific use case I can cite!) - is it possible to see themes represented in object records (either search or object api) and is it also possible to facet by them? So if I get a record in a search response can I see what themes it is in? And if I run a search can I see how many results fall into how many themes?

Cheers, James

---
James Morley

Hugo Manguinhas

Aug 13, 2018, 8:04:30 AM
to europe...@googlegroups.com
Hi James,

There is no performance difference between using the theme filter and using the underlying query... we implemented a custom plugin for Solr that basically does query rewriting, and it is also connected to the Solr warmup, so the results should come back almost instantly (unless our Solr cluster is busy handling other queries).

About relating objects/items to Thematic Collections, so far we have only considered it for the upcoming redesign of the Item page... I will then consider how we could bring it in as a feature of the API (I still need to see how that would play out in terms of API design). As for faceting, it is not possible to facet on the theme for now, since it is not handled as an actual Solr field but as a query rewrite... we could still mirror the same functionality in the API layer, which we need to look into in terms of feasibility, but it would indeed be quite interesting to have.

Best regards,
The API Team

mar...@orgonemedia.nl

Aug 27, 2018, 4:27:00 PM
to Europeana API forum
Hi API Team :)

That's indeed great news! I was unable to get a satisfying result using the normal search query; the results were too unreliable in quality. I'm looking into changing the Art Up Your Tab browser extension to use the API directly, instead of loading from curated results that have to be maintained by a human and need a hosting provider for the CMS.

TL;DR I need your help :)

So the idea is to load a random image from a theme, probably the 'art' theme or a theme you can choose. However, it turns out to be difficult to get a truly random one out of the entire collection. I use "&rows=10&start=[random page out of total pages]", but since the images are grouped by provider, a set of 10 can be very similar and thus boring to see one after another. The biggest issue, and basically a showstopper: the "start" pagination can't go past the first 1000 records, so it seems I'm stuck forever with the first 1000 items in the theme.
I could use the "cursor" option, but then I can't jump to a random page in the collection and can only go through it linearly, which is too boring, even when I shuffle the 100 or 200 images I could fetch and cache.
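
For reference, the random-offset approach described above could be sketched in Python as follows. The 1000-record limit is the one reported in this message; treating `start` as 1-based and passing the total result count in by hand are assumptions of this sketch.

```python
import random

MAX_WINDOW = 1000  # basic pagination limit reported in this thread

def random_start(total_results, rows=10, rng=random):
    """Pick a random 1-based `start` so that a full page fits in the reachable window."""
    reachable = min(total_results, MAX_WINDOW)
    last_start = max(1, reachable - rows + 1)
    return rng.randint(1, last_start)

start = random_start(total_results=250_000, rows=10)
params = {"query": "*:*", "theme": "art", "rows": 10, "start": start}
print(params)
```

This only randomizes within the reachable window, so it does not remove the limitation itself.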

Since you know the API best, is there any other way you can see to get a 'random' ordering in the query, or a random query result?
If not, the only option I see is to build a hosted harvester that fetches many results and randomizes them, which would defeat the purpose of a completely stand-alone browser extension.

A few other questions / remarks:
1. There's no fashion theme. Is it coming, or was it not selected for the API?
2. I use https://www.europeana.eu/api/v2/search.json?wskey=xxx&query=*:*&theme=art&profile=rich&media=true&qf=IMAGE_SIZE:large&qf=IMAGE_SIZE:medium, but some results still don't have an edmPreview image. It might just be inconsistency in the data, but can you confirm that the "&profile=rich&media=true&qf=IMAGE_SIZE:large&qf=IMAGE_SIZE:medium" params I added work together with a "theme" query? They are supposed to return only items that have an image of a specific minimum size. It doesn't happen that often and I can make a workaround; I'm just wondering which params still work together with the "theme" option.
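
A minimal Python sketch of the request above plus the kind of client-side workaround mentioned: the URL is rebuilt from the parameters in the message, and items without a preview are dropped afterwards. The `edmPreview` key and the sample items are assumptions made for illustration.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.europeana.eu/api/v2/search.json"

def build_theme_url(api_key, theme="art"):
    # A list of pairs lets `qf` appear more than once, as in the URL above.
    params = [
        ("wskey", api_key),
        ("query", "*:*"),
        ("theme", theme),
        ("profile", "rich"),
        ("media", "true"),
        ("qf", "IMAGE_SIZE:large"),
        ("qf", "IMAGE_SIZE:medium"),
    ]
    return SEARCH_URL + "?" + urlencode(params)

def with_preview(items):
    """Keep only items that carry a non-empty preview value."""
    return [item for item in items if item.get("edmPreview")]

# Hypothetical, hand-written items for illustration only.
sample_items = [
    {"id": "/1/a", "edmPreview": ["https://example.org/a.jpg"]},
    {"id": "/1/b"},
    {"id": "/1/c", "edmPreview": []},
]
print(build_theme_url("xxx"))
print([item["id"] for item in with_preview(sample_items)])
```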

Amazing work guys.

Thanks,

Martijn

mar...@orgonemedia.nl

Aug 27, 2018, 4:33:02 PM
to Europeana API forum
Hi Fred,

An excellent idea would be a WordPress plugin with which you could search the Europeana db and insert images and caption info from it into pages, with a deep link to the source, but I'm unsure what exactly you mean to do.
Sadly I have no time for such a co-op (holidays are over ;), but feel free to ask me questions about ways to integrate the API with WordPress. Just PM me.

Greetz,

Martijn

Maarten Brinkerink

Aug 28, 2018, 1:26:52 AM
to europe...@googlegroups.com
Hi Martijn,

I believe this already existed at some point. Can EF people pitch in here?

Best,

Maarten

Sent from my tablet

mar...@orgonemedia.nl

Aug 28, 2018, 4:54:27 AM
to Europeana API forum
Hi Maarten,

you're right, there's an old plugin that does something like that.
Who are EF people and who is pitching?

Gr,
Martijn

Maarten Brinkerink

Aug 28, 2018, 4:55:43 AM
to europe...@googlegroups.com
Europeana Foundation. If they know the details?

Hugo Manguinhas

Aug 28, 2018, 11:57:25 AM
to europe...@googlegroups.com
Dear Martijn,

Thank you for your email!

I am afraid there isn't an out-of-the-box solution for what you are looking for... however, you can play with the sort and choose the timestamp_updated field, although that will not be as dynamic as you might need. Another way is to keep the approach you have tried, but first perform a facet search on the DATA_PROVIDER field and then randomly select one provider to filter on... at least you would get results from different providers.
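
A rough Python sketch of the provider-based suggestion: take the DATA_PROVIDER facet from a search response, pick one value at random, and turn it into a `qf` filter for the next request. The response shape (a `facets` list with `name` and `fields` entries carrying `label` and `count`) and the sample data are assumptions of this sketch, not a confirmed description of the API.

```python
import random

def pick_provider_filter(search_response, rng=random):
    """Choose one DATA_PROVIDER facet value at random and return it as a `qf` filter."""
    for facet in search_response.get("facets", []):
        if facet.get("name") == "DATA_PROVIDER":
            labels = [field["label"] for field in facet.get("fields", [])]
            if labels:
                return f'DATA_PROVIDER:"{rng.choice(labels)}"'
    return None  # no provider facet in this response

# Hypothetical, hand-written response fragment for illustration only.
sample = {"facets": [{"name": "DATA_PROVIDER", "fields": [
    {"label": "Museum A", "count": 120},
    {"label": "Museum B", "count": 45},
]}]}
print(pick_provider_filter(sample))
```

Picking uniformly over providers favours small providers; weighting by `count` would be a possible variation.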

Hope that helps!

(In the meantime, I will consider adding that feature to our sorting options since Solr has support for it https://lucene.apache.org/solr/4_6_0/solr-core/org/apache/solr/schema/RandomSortField.html )

Kind regards,
The API team


Hugo Manguinhas

Aug 28, 2018, 12:00:10 PM
to europe...@googlegroups.com
Dear Maarten and Martijn, 

I must say that I am a bit lost with the Wordpress discussion... could you post a separate thread?

Thanks in advance.

Best,
The API Team

mar...@orgonemedia.nl

Aug 28, 2018, 12:13:36 PM
to Europeana API forum
Hi Hugo,

I've played around with data providers before, but it still results in too many images of the same type one after another, while I think it's nicest if you get a totally different kind of object each time.
Anyway, I have a few tricks up my sleeve to try with caching and shuffling, and I have already found the sort option to be helpful.

I was thinking a random query option might be available in Solr. That would be best for me and would save the immense amount of time needed to come up with some scraper functionality.

About the WordPress discussion: yes, I'm confused too. I just wanted to reply that he can PM me, to keep it out of this thread.

Thanks so much for your quick response.

Gr,

Martijn

Maarten Brinkerink

Aug 28, 2018, 12:17:22 PM
to europe...@googlegroups.com
My point was that a WP plug-in like the one proposed, as an application on top of the Europeana API, already exists. I asked the EF staff to point Martijn to it, because I don't remember the details.



Gordea Sergiu

Sep 5, 2018, 8:03:46 AM
to europe...@googlegroups.com

Hi Martijn, Hugo,

 

I think there are use cases where we would like to use a random sort for items.

Solr has a built-in mechanism to support random sort, but a schema update and reindexing are required to have everything in place:

https://lucene.apache.org/solr/4_6_0/solr-core/org/apache/solr/schema/RandomSortField.html

 

This can still be combined with filtering results by data quality measures such as the completeness level (e.g. one can ask for items with completeness > 8 and sort by the random field).
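
As a sketch of what that could look like at the Solr level (not in the current Search API), the Python snippet below builds the request parameters. It assumes the schema defines a dynamic field `random_*` of type RandomSortField, as in the Solr documentation linked above, and an integer field named `completeness`; both names are assumptions.

```python
import random

def solr_random_sort_params(min_completeness=8, seed=None):
    """Build Solr parameters: completeness above a threshold, in random order."""
    # A new seed gives a new ordering; reusing a seed repeats the ordering
    # for as long as the index is unchanged.
    seed = random.randrange(1_000_000) if seed is None else seed
    return {
        "q": "*:*",
        # Assumes integer values, so "> 8" becomes "[9 TO *]".
        "fq": f"completeness:[{min_completeness + 1} TO *]",
        "sort": f"random_{seed} asc",
    }

print(solr_random_sort_params(seed=42))
```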

 

I think random sort is worth considering for the next versions of the Search API.

 

 

BR,

 

SERGIU GORDEA

mar...@orgonemedia.nl

Sep 5, 2018, 8:14:02 AM
to Europeana API forum
Hi Sergiu,

that's great news and I think the best solution for the next version of the Art Up Your Tab extension. A fall-back plan is to cache more query results locally and randomize them there, but the biggest issue remains that I can only query the 'first' 1000 items, since pagination doesn't go beyond that.

I will await your news of when and if it can be released.

Thank you for your input!

Gr,

martijn


James Morley

Sep 5, 2018, 10:04:34 AM
to europe...@googlegroups.com
Hi, if you're caching potentially large datasets, have you looked at the cursor based pagination? 


It's what I have used for harvesting the content for https://culturepics.org/on-this-day as the date metadata isn't comprehensive enough to make live calls (and I do a lot of data processing anyway to enable some of the features of the site)
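
A minimal Python sketch of cursor-based harvesting, assuming the first request uses `cursor=*` and each response carries a `nextCursor` value until the results run out. The HTTP call is replaced by a stand-in function and hand-written pages, so the names `fetch_page` and `FAKE_PAGES` are hypothetical.

```python
def harvest(fetch_page, max_items=200):
    """Collect items by following `nextCursor` until it is absent or max_items is reached."""
    items, cursor = [], "*"
    while cursor and len(items) < max_items:
        page = fetch_page(cursor)
        items.extend(page.get("items", []))
        cursor = page.get("nextCursor")
    return items[:max_items]

# Stand-in for an HTTP call: maps a cursor value to a hand-written response.
FAKE_PAGES = {
    "*": {"items": [{"id": "a"}, {"id": "b"}], "nextCursor": "c1"},
    "c1": {"items": [{"id": "c"}], "nextCursor": "c2"},
    "c2": {"items": [{"id": "d"}]},
}
harvested = harvest(FAKE_PAGES.__getitem__)
print([item["id"] for item in harvested])
```

In a real harvester, `fetch_page` would issue the search request with the given cursor and return the parsed JSON.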

Best, James

---
James Morley


mar...@orgonemedia.nl

Sep 5, 2018, 3:11:32 PM
to Europeana API forum
Hi James,

Thanks for the input. This culture pic of the day is a great initiative, I love it!

I did consider and try the cursor-based approach, but the goal is to have a standalone browser extension without the need for hosting. I can do limited harvesting / caching in localStorage, but the items are very much grouped together by provider or type, which results in too much of the same. I'd like to achieve the effect of total surprise each time you open a new tab, so it needs to be very random. Currently the existing browser extension is human-curated, but we ran out of funds. So I'm looking into a way to make it a no-cost, stand-alone thing so it can run forever :)
Still, you are right, I could achieve it with cursor-based harvesting. It's just a lot more work and doesn't make sense if there's an option to randomize the query results at the source.
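
One way to reduce the "too much of the same" effect in a locally cached batch is to shuffle within each provider and then take one item per provider in turn. The Python sketch below illustrates the idea; the flat `provider` key and the sample items are hypothetical simplifications of real API records.

```python
import random
from collections import defaultdict

def interleave_by_provider(items, rng=random):
    """Shuffle within each provider, then take one item per provider in turn."""
    groups = defaultdict(list)
    for item in items:
        groups[item.get("provider", "unknown")].append(item)
    queues = list(groups.values())
    for queue in queues:
        rng.shuffle(queue)
    rng.shuffle(queues)
    mixed = []
    while queues:
        for queue in queues:
            mixed.append(queue.pop())
        queues = [queue for queue in queues if queue]  # drop exhausted providers
    return mixed

# Hypothetical cached items for illustration only.
cached = [{"id": i, "provider": "A"} for i in range(3)]
cached += [{"id": i, "provider": "B"} for i in range(3, 6)]
mixed = interleave_by_provider(cached)
print([(item["id"], item["provider"]) for item in mixed])
```

When one provider dominates the batch, its items still cluster at the end, so this only softens the grouping.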

I'll wait a bit for an api update and we keep running human style for now :)

Grtz,

Martijn