What are Natural and Controlled Languages?

Natural language is everyday language: when you search using natural language, you search with the terms you would use when talking to someone. A common example of natural language searching is a Google search.

Natural language searching brings back high recall but low precision (a large number of results, many of which do not match the topic you searched).
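To make "recall" and "precision" concrete, here is a minimal sketch of how the two measures are usually calculated. Precision is the share of retrieved results that are relevant; recall is the share of all relevant items that were retrieved. The function name and the document IDs are made up for illustration and do not come from any real database.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single search."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # results that are actually on topic
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical collection: documents 1-4 are the only ones on topic.
relevant = {1, 2, 3, 4}

# A broad natural language search returns 20 documents, 4 of them relevant.
natural = precision_recall(set(range(1, 21)), relevant)

# A controlled vocabulary search returns 3 documents, all relevant.
controlled = precision_recall({1, 2, 3}, relevant)

print(natural)     # (0.2, 1.0)  -> low precision, high recall
print(controlled)  # (1.0, 0.75) -> high precision, lower recall
```

In this invented example the broad search finds everything on the topic but buries it in off-topic results, while the narrow search returns only relevant items but misses one.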

Positives: Anyone can do it, it works in databases that do not use controlled language, and it returns a high number of results.

Negatives: Low precision

Controlled language is language that is predetermined by an organization. Organizations create a list (thesaurus) of acceptable terms to use for searching. These terms, when used, bring back low recall and high precision (a small number of results, most or all matching the search topic).

When searching a term, it will only bring back results that match your term or a hierarchical term. Some words have multiple meanings: for example, does "desert" mean to leave, or a hot, sandy region? You can control for this and receive only results based on the meaning you want by checking the thesaurus and scope notes.

Controlled vocabularies allow for hierarchies. Most terms have a broader term (BT), narrower term (NT), related term (RT), scope note (SN), used for (UF), and "see" (USE) entry.

Ex.
Toes                              
   BT Foot                          

Shoe                              
   NT Running shoe

Athletes
   RT Sports                

Advertising-Bus lines
   UF Bus lines-Advertising              
(Example retrieved from ERIC Thesaurus https://eric.ed.gov/?)

SN- A note under the subject heading that describes the meaning of the term.

USE- When you search a term that is not a controlled vocabulary term, its entry lists USE followed by the controlled vocabulary term that will bring back the desired results.
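The BT, NT, RT, UF, and USE relationships described above can be sketched as a small lookup table. This is only an illustration, not how ERIC or any real database implements its thesaurus: the dictionary layout and the function names preferred_term and expand are hypothetical, and the reciprocal entries (for example, Foot NT Toes) are assumptions added so the hierarchy can be walked in both directions.

```python
# A tiny, hypothetical thesaurus built from the example entries above.
# Reciprocal entries (Foot, Running shoe, Bus lines-Advertising) are
# assumptions added for illustration.
THESAURUS = {
    "Toes": {"BT": ["Foot"]},
    "Foot": {"NT": ["Toes"]},
    "Shoe": {"NT": ["Running shoe"]},
    "Running shoe": {"BT": ["Shoe"]},
    "Athletes": {"RT": ["Sports"]},
    "Advertising-Bus lines": {"UF": ["Bus lines-Advertising"]},
    "Bus lines-Advertising": {"USE": "Advertising-Bus lines"},
}


def preferred_term(term):
    """Follow a USE reference to the controlled term, if there is one."""
    return THESAURUS.get(term, {}).get("USE", term)


def expand(term):
    """Return the preferred term plus all of its narrower terms."""
    term = preferred_term(term)
    terms = [term]
    for narrower in THESAURUS.get(term, {}).get("NT", []):
        terms.extend(expand(narrower))
    return terms


print(preferred_term("Bus lines-Advertising"))  # Advertising-Bus lines
print(expand("Shoe"))                           # ['Shoe', 'Running shoe']
```

Here a search on the non-preferred term "Bus lines-Advertising" is redirected to the controlled term, and a search on "Shoe" also picks up its narrower term, which is one way a search could match "your term or a hierarchical term."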

Positives: Consistent representation of the term, hierarchies, high precision

Negatives: Not all organizations use it.

