Booleans in BaseX


Owen Ambur

Jul 17, 2023, 6:07:17 PM
to Naval Sarda, Jorge Sanchez, William O. Glascoe III, aboutthe...@googlegroups.com, Chris Fox
Naval & Jorge, this reference leads me to believe that Boolean operators can be used in BaseX.

While I haven't given much thought to how they might be useful in my StratML-enabled query service, I have no doubt that they would be.

For example, based upon my exchange with William about EDITH, I was prompted to convert In Silico World's about statement to StratML format as well as the about statement for the Digital Europe Programme. Subsequently, I conducted a full-text query to identify other organizations pursuing "in silico" testing. The query retrieves 42 documents, but many of them reference "in Silicon" Valley. Thus, it would be nice to be able to limit the search to the exact phrase "in silico", to exclude "in Silicon", or both.
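
One plausible cause of the false hits, assuming the service does a case-insensitive substring match, is that "in silico" is a prefix of "in Silicon". The Python sketch below contrasts that behavior with matching on whole word tokens. It only illustrates the logic and is not BaseX code; the function names and sample sentences are hypothetical.

```python
import re

def tokens(text):
    # Lowercase the text and split it into word tokens.
    return re.findall(r"\w+", text.lower())

def substring_match(text, phrase):
    # Case-insensitive substring test: "in silico" also hits "in Silicon".
    return phrase.lower() in text.lower()

def phrase_match(text, phrase):
    # True only if the phrase's tokens occur consecutively as whole words.
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    if n == 0:
        return False
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

docs = [
    "We promote in silico testing of medical devices.",
    "The company is based in Silicon Valley.",
]
for d in docs:
    print(substring_match(d, "in silico"), phrase_match(d, "in silico"))
# prints "True True" for the first sentence and "True False" for the second
```

With token-based matching, the "Silicon Valley" sentence drops out without needing an explicit exclusion.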

In other cases, it would be good to be able to query for multiple values in a single query field and/or to query for terms that are synonyms, alternate spellings, or other expressions of the same concepts.
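
A query of that kind boils down to combining phrases with OR and NOT. As a rough Python sketch of the logic (not BaseX code; `matches`, its parameters, and the synonym list are hypothetical):

```python
import re

def tokens(text):
    # Lowercase the text and split it into word tokens.
    return re.findall(r"\w+", text.lower())

def has_phrase(text, phrase):
    # True if the phrase's tokens occur consecutively as whole words.
    words, target = tokens(text), tokens(phrase)
    n = len(target)
    if n == 0:
        return False
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

def matches(text, any_of=(), none_of=()):
    # OR over the wanted phrases, then NOT over the excluded ones.
    # An empty any_of list places no positive requirement on the text.
    wanted = not any_of or any(has_phrase(text, p) for p in any_of)
    excluded = any(has_phrase(text, p) for p in none_of)
    return wanted and not excluded

# Hypothetical synonyms and alternate spellings for one concept.
synonyms = ["in silico", "computational modelling", "computational modeling"]

print(matches("Trials run in silico reduce animal testing.", synonyms, ["in silicon"]))
print(matches("Our offices are in Silicon Valley.", synonyms, ["in silicon"]))
```

The first call matches and the second does not. Keeping the synonym list as data rather than hard-coding it in the query would let it grow without changing the search logic.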

I wouldn't expect to do anything about this in the near term, but this is FYI for future reference, and I welcome any comments or suggestions you may wish to offer.




On Saturday, July 15, 2023 at 11:55:07 AM EDT, Owen Ambur <owen....@verizon.net> wrote:


William, in follow up to your phone call, the goals and objectives set forth in the draft VHT "roadmap" are now available in StratML format at https://stratml.us/drybridge/index.htm#EDITH

Needless to say, I share your belief that their usage of the StratML standard would strongly support the realization of their values and objectives.

As time permits, I will also convert to StratML format the about statements for https://digital-strategy.ec.europa.eu/en/activities/digital-programme & https://insilico.world/community/


On Friday, July 14, 2023 at 12:13:34 PM EDT, William O. Glascoe III <life...@icloud.com> wrote:


Jorge Sanchez

Jul 18, 2023, 8:24:33 AM
to Owen Ambur, Naval Sarda, William O. Glascoe III, aboutthe...@googlegroups.com, Chris Fox
Hi Owen, 

I don't know exactly what you mean by Booleans in BaseX, but in short, yes.
- We can use documents that use xs:boolean as the data type for elements and attributes.
- We can check for boolean elements in documents, such as whether a certain element is true, false, present, etc. For example, we can write a query that selects the documents meeting a certain condition or combination of conditions, such as a certain term (like "silico") occurring three times at a certain point or in the whole document.
- We can configure the query to look for silico, silicon or both (https://docs.basex.org/wiki/Full-Text#Fuzzy_Querying).
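
To make the last two points concrete, here is a small Python sketch of counting a term's occurrences and of matching "silico", "silicon", or both by allowing a number of character errors. It is only an illustration of the idea, not BaseX code: edit distance is used here as a stand-in for fuzzy matching, the linked page describes how BaseX actually does it, and all names below are hypothetical.

```python
import re

def tokens(text):
    # Lowercase the text and split it into word tokens.
    return re.findall(r"\w+", text.lower())

def count_term(text, term):
    # Number of whole-word occurrences of the term.
    return tokens(text).count(term.lower())

def edit_distance(a, b):
    # Levenshtein distance computed row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_hits(text, term, max_errors=1):
    # Tokens within max_errors edits of the term.
    return [t for t in tokens(text)
            if edit_distance(t, term.lower()) <= max_errors]

text = "In silico trials complement work in Silicon Valley: silico, silico."
print(count_term(text, "silico") >= 3)          # condition: term occurs three times
print(fuzzy_hits(text, "silico", max_errors=0)) # exact tokens only
print(fuzzy_hits(text, "silico", max_errors=1)) # also picks up "silicon"
```

Setting `max_errors` to 0 gives the strict behavior and 1 gives the tolerant one, so the same query can serve both needs.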

This is probably the page that you are looking for: https://docs.basex.org/wiki/Full-Text

The main problem here is that building a search UI can be as complex and time-consuming as you want: exact match, fuzzy, presence, distance, scoring, ... It is probably better to write a 5-to-10-line MoSCoW prioritization or something like that, and to understand which actors are going to use the search and how.

Owen, I will probably build an XProc ODT report for StratML in the coming weeks. I will keep you posted and share it on GitHub. I need something like that, at least the main parts (goals, objectives, ...).

Kind Regards.







---- On Tue, 18 Jul 2023 00:04:38 +0200 Owen Ambur <owen....@verizon.net> wrote ---

Naval Sarda

Jul 18, 2023, 9:32:12 AM
to jo...@vionta.net, Owen Ambur, William O. Glascoe III, aboutthe...@googlegroups.com, Chris Fox

Hi Owen,

Our initial implementation used exact matching for searches within BaseX. You then requested that it match data present anywhere in the field. You can decide which behavior you wish to have.

Boolean is for Yes/No types of fields.

Naval Sarda

EpiComm Technologies

Owen Ambur

Jul 18, 2023, 11:24:03 AM
to jo...@vionta.net, Naval Sarda, aboutthe...@googlegroups.com
Naval, I don't understand your response.  Yes, I'm looking for exact matches in this particular instance, regardless of where they occur.  In this full-text query example, I'm looking for "in silico" and not "in Silicon".  However, I can understand if making such a distinction may not be feasible due to the complexities involved and/or the processing capacity required.  Per Jorge's response, we should be thinking about which Boolean search capabilities might be most useful and feasible.

Jorge, I'll look forward to seeing what you can do and learning what we might try to do next to add the most value. I'll be especially interested to have any suggestions you may wish to offer for enhancements at https://search.aboutthem.info/ -- particularly with respect to how it may or may not meet your requirements. At some point, I'd like to share our code on GitHub as well. Based upon my interchanges with Naval, I understand the primary difficulty lies in properly documenting the code for reuse by others.


