T. 0843 289 1054

Overcoming the Challenge of Unstructured Information

According to AIIM research, 75% of the organizations we surveyed view digital transformation as “important” or “very important.” Survey respondents point to techniques like advanced data capture, machine learning, and process automation as having powerful potential to reengineer and improve core business processes.

The trouble, however, is that the majority of information capture and content management solutions on the market have been built to work with highly structured, predetermined information and workflows. Feedback from our AIIM community of practitioners tells us that working with unstructured information is one of the biggest barriers to digital transformation.

Structured and Unstructured Information

So how can you begin to overcome the challenge? One place to start is to differentiate clearly between structured and unstructured information. The distinction matters because how we manage information is significantly influenced by whether it is structured or unstructured.

Structured information has a fixed structure, hence the name. It consists fundamentally of columns and rows of data in a table, or spread across several linked tables. A spreadsheet is a simple example. A form can also be considered structured, insofar as the purpose of most forms is to gather information that is then put into this sort of structure. Most structured data is stored and managed in a database. In fact, most information repositories combine this sort of structured data with a place to store the binary files associated with it.
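
To make "linked tables" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical and chosen only for illustration; they do not come from any particular system.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two linked tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE department (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE employee (
            id            INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            department_id INTEGER NOT NULL REFERENCES department(id)
        );
        INSERT INTO department VALUES (1, 'Finance'), (2, 'HR');
        INSERT INTO employee VALUES (1, 'Ana', 1), (2, 'Ben', 2), (3, 'Cy', 1);
        """
    )
    return conn


def employees_by_department(conn, department_name):
    """Follow the link between the two tables to list employee names."""
    rows = conn.execute(
        "SELECT e.name FROM employee e "
        "JOIN department d ON e.department_id = d.id "
        "WHERE d.name = ? ORDER BY e.name",
        (department_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Because every row follows the same fixed columns, a question such as "who works in Finance?" becomes a single query. No equivalent query exists for a folder of contracts or personnel reviews until their content has been captured and described.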

In contrast, unstructured information is much more variable in both format and content. Consider a contract, a project initiation document, or a personnel review. Each of these simple examples might be created or captured in a variety of formats. While each might have some rules that guide its content, all of these documents will vary greatly in form, format, content, and context to the business.

Capturing Structured and Unstructured Information

Capturing structured information is accomplished in several ways. Data can be input manually or extracted from structured forms. It can also be extracted through some sort of structured output from another system, such as an HR or accounting application. There may be some requirement to transform the data from one syntax to another, but structured applications in the form of databases are designed to ingest structured content and apply appropriate access controls, business rules and logic, and lifecycle management.
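
As a rough illustration of transforming data from one syntax to another, the sketch below converts a CSV export into JSON records using only the Python standard library. The function name and the column names in the usage note are hypothetical; a real HR or accounting export would define its own fields.

```python
import csv
import io
import json


def csv_to_json_records(csv_text, required_fields):
    """Convert CSV text to a JSON array of records.

    Raises ValueError if an expected column is absent, so that a malformed
    export is rejected before it reaches the target system.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [field for field in required_fields if field not in header]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    records = [dict(row) for row in reader]
    return json.dumps(records, indent=2)
```

Calling `csv_to_json_records(export_text, ["employee_id", "name"])` would return the same rows as JSON objects keyed by column name. The check on required columns is one simple form of the business rules that structured applications apply on ingest.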

But capturing unstructured information is more challenging. A common example is email. Email may appear to have structure and context: it is addressed to people and sits in an inbox, or perhaps in a filing category or private folder. But the emails in a user’s email system are not controlled. There are no rules governing the retention and disposition of the information over time. Current practice is usually to send emails to those who need them and, more often than not, also to those who may only be interested in the content. This creates many copies and reduces the likelihood of control. Effective information management provides a clear policy and structure, and the ability to capture and save all types of unstructured information so it can be protected, retained, and searched.
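
One way to picture controlled email capture is sketched below with Python's standard email package. The function name, the dictionary used as a store, and the seven-year retention default are all assumptions made for illustration, not a recommended policy. Treating two messages as copies when their plain-text bodies match is also a deliberate simplification.

```python
import hashlib
from email import message_from_string, policy


def capture_email(raw_message, store, retention_years=7):
    """Capture one email into `store`, skipping copies of the same body.

    `store` is a plain dict standing in for a managed repository
    (hypothetical). Returns (digest, was_new).
    """
    msg = message_from_string(raw_message, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    text = body.get_content() if body is not None else ""

    # Identify copies by hashing the body text (simplified duplicate check).
    digest = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()
    if digest in store:
        return digest, False

    # Record metadata alongside the text so the item can be searched,
    # retained, and eventually disposed of under a policy.
    store[digest] = {
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "subject": str(msg.get("Subject", "")),
        "retention_years": retention_years,  # assumed value, not a rule
        "text": text,
    }
    return digest, True
```

Capturing the same message body sent to two different recipients stores it once, which addresses the "many copies" problem in a small way. The retention field gives a later disposition process something to act on.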

Facing File Formats

How we capture and manage unstructured digital information is closely tied to the file format used to store it. Most organizations don’t give much thought to the file formats used to store their information, and this can cause problems in the short and long term. Many file formats are highly proprietary and can only be manipulated using a specific software application, or even a specific version of that application. When formats are less proprietary, so that more applications can interact with them, the resulting files may still not be 100% compatible with each other and with every application.

The better approach is to determine the appropriate file formats for creating and/or capturing
information based on a number of factors. Who is the intended audience? Are there any specific
regulatory requirements to maintain information in a certain format, or in a non-proprietary,
open, or standard format? And perhaps most importantly, what’s the value of the information over time?

___________
Source: AIIM