Understanding The Semantic In Semantic Web

In a previous article I described how, in order to have a true Semantic Web, we would need to have a user action to direct the program. I would like to examine some of the motivations for this point of view.

Practical Example

When a person, company, program, or any other entity needs to extract data or information from the internet, it will normally follow a set order of actions to do so. As an example, let’s assume a researcher is attempting to gather all the information they can on ‘What happens to lava if it is super cooled while it is still flowing’. The researcher would enter this query into a data retrieval program, such as a search engine. The program would use the input to search its data resources and then return a result list based on that input. Depending on the input, the result list will be an array of data, only a portion of which will contain what the researcher is looking for.

If the researcher entered the query just as it is posed above, the data returned would range from ‘What to do in Hawaii’ to ‘Super cooling water’. The reason is that most data retrieval systems do not have a contextual filter that determines the context of the data while it is being retrieved. Some data retrieval systems present filters after the data is returned, but even then the filters are fairly broad in their context. For the above search, an ideal filter matrix would include cryogenics (with a related subset), volcanic (with a related subset), and science (with a related subset that includes physics). The difficulty in obtaining the desired result is that the program being used to retrieve the information has to determine what is being searched for. What if, instead of the above scenario, the search were for ‘super cool places where lava flows’? Nearly the same words now call for an entirely different context.
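To make the idea of a filter matrix a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `FILTER_MATRIX` contents, the added ‘travel’ context, and the `suggest_filters` function are hypothetical, and simple word overlap stands in for whatever contextual analysis a real retrieval system would perform.

```python
import re

# Hypothetical filter matrix: each context maps to a hand-picked subset of
# related terms. A real system would need far richer subsets than these.
FILTER_MATRIX = {
    "cryogenics": {"super", "cooled", "cooling", "freeze", "cryogenic"},
    "volcanic": {"lava", "volcano", "magma", "eruption", "flowing", "flows"},
    "science": {"physics", "cooled", "temperature", "lava"},
    "travel": {"places", "hawaii", "visit", "cool"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def suggest_filters(query):
    """Return the contexts that share terms with the query, best match first."""
    tokens = tokenize(query)
    scores = {ctx: len(tokens & terms) for ctx, terms in FILTER_MATRIX.items()}
    matched = [ctx for ctx, score in scores.items() if score > 0]
    # Sort by descending score, then alphabetically so ties are stable.
    return sorted(matched, key=lambda ctx: (-scores[ctx], ctx))


if __name__ == "__main__":
    print(suggest_filters(
        "What happens to lava if it is super cooled while it is still flowing"))
    print(suggest_filters("super cool places where lava flows"))
```

With these made-up term sets, the first query matches the cryogenics, science, and volcanic contexts, while the second ranks the travel context alongside volcanic. Word overlap alone cannot tell which one the person meant, which is exactly the difficulty described above.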

Mapping And Directions

Not only is this an issue for the retrieval of data, it is also a prime concern for data mapping and data storage. Even if the retrieving program had the properly conveyed concept of the search when it started using the data resources at its disposal, what if the resources it connected to couldn’t return the correct data because the data mapping was not able to connect the concept to its structure? It would be like trying to return data on a request for "compatibly opposite person". Unless there is an exact phrase to return, the system would have to understand that ‘opposites attract’ and have data mapped in such a way that it could connect the relationship.
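As a rough illustration of what mapping data by relationship rather than by exact phrase could look like, consider the sketch below. The trait names, the people, and the `opposite_matches` function are all invented for this example. It only shows that the ‘opposites attract’ concept has to exist in the data mapping itself before a request like "compatibly opposite person" can return anything.

```python
# Hypothetical concept map: each trait points to the trait considered its opposite.
OPPOSITE_OF = {
    "introvert": "extrovert",
    "extrovert": "introvert",
    "planner": "spontaneous",
    "spontaneous": "planner",
}

# Hypothetical stored data: each person is mapped to a set of traits.
PEOPLE = {
    "Ana": {"introvert", "planner"},
    "Ben": {"extrovert", "spontaneous"},
    "Cy": {"introvert", "spontaneous"},
}


def opposite_matches(name):
    """Rank other people by how many of their traits oppose `name`'s traits."""
    wanted = {OPPOSITE_OF[t] for t in PEOPLE[name] if t in OPPOSITE_OF}
    scored = [
        (len(wanted & traits), other)
        for other, traits in PEOPLE.items()
        if other != name
    ]
    # Highest score first; ties are broken alphabetically by name.
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in ranked if score > 0]


if __name__ == "__main__":
    print(opposite_matches("Ana"))
```

No record here contains the words ‘compatibly opposite’. The match comes entirely from the relationship stored in `OPPOSITE_OF`, so if that mapping were missing, the request would return nothing no matter how well the program understood the query.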

Data relationships have not yet evolved into conceptual relationships in a way that makes them useful in data retrieval. There has been a tremendous amount of study and research, including Basilisk[i], Using Subjective Adjectives in Opinion Retrieval from Blogs[ii], Acquisition of Subjective Adjectives with Limited Resources[iii], Semantic Extraction from Texts[vi], and a number of others. We have yet to implement this research in the developing Web. There have been some advances, including the Microsoft OData[iv] initiative, RESTful[v] programming, and a long list of other standards-based implementations. But, for now, it seems we are mired in the sheer size of the task that lies before us. The research has been done, but the implementation is the wall we have yet to climb.

The Future

I foresee a time when data is accessible across all platforms, programs, servers, and borders. In order to get there, we have to improve the UI presented to the person initiating the search so that they can direct the program’s next step. Just as in our everyday life we have to ask for clarification and further explanation, the program has to be written in such a way that it asks those same questions. If a program accepts input it doesn’t fully understand, it has to return widgets that help it understand the concept. We have to map data flexibly enough that relationships can be understood in a homogeneously semantic context. Red is not just a color and a mirror is not just glass.

It is possible to code a program to understand these differences and to do some pre-filtering before it returns a result. The major component in this process is the user, the one doing the search or asking for information. We need to present pertinent, exposed filters that the user can use to direct the program, much like a conductor directing an orchestra, in order to achieve the desired outcome. The only way to be sure the program’s conveyed concept is accurate is to allow the user to promote the correct result by using the tools we present to them.
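The user-directed step could be sketched along these lines. This is an assumption-laden outline and not a real search API: the `Result` type, the context tags, the sample data, and the `exposed_filters` and `refine` functions are all hypothetical, and the first result title is made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    contexts: frozenset  # context tags the program assigned during pre-filtering


# Hypothetical result list for the lava query used earlier in the article.
SAMPLE_RESULTS = [
    Result("Lava quenched with liquid nitrogen",
           frozenset({"volcanic", "cryogenics", "science"})),
    Result("What to do in Hawaii", frozenset({"travel", "volcanic"})),
    Result("Super cooling water", frozenset({"cryogenics", "science"})),
]


def exposed_filters(results):
    """Collect every context found in the results to show the user as filters."""
    found = set()
    for result in results:
        found |= result.contexts
    return sorted(found)


def refine(results, chosen):
    """Keep only the results tagged with every filter the user chose."""
    chosen = set(chosen)
    return [result for result in results if chosen <= result.contexts]


if __name__ == "__main__":
    print(exposed_filters(SAMPLE_RESULTS))
    # The user, not the program, decides which contexts were intended.
    for result in refine(SAMPLE_RESULTS, ["volcanic", "cryogenics"]):
        print(result.title)
```

The program’s job here is limited to tagging the results and exposing the tags. The choice of ‘volcanic’ plus ‘cryogenics’ comes from the user, which is the point of the argument above.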

Programming can do a lot, but only the user can determine whether the results are what they are looking for.