Semi-structured Data and Effective Data Management

Whatever information access challenges exist within an organization, every organization either understands or should understand the value of effective data management in ensuring accurate and valid analytics. How organizations get there, however, and the types of data they leverage to do so, will differ based on business requirements, corporate culture, and technical environments. As data management becomes more complex, with businesses managing disparate internal and external data sources, companies require a way to take advantage of their current and previous investments in information management.

What this means for organizations is that ripping and replacing analytical output at every upgrade may not be a realistic approach to a business intelligence strategy. In some cases, reports or other documents exist that are not being considered as potential data sources. Instead, organizations look for ways to redesign solutions to address broader data needs, and in doing so may overlook ways to access the information they already have.

With an increasing focus on automated data preparation and broader data management, organizations are beginning to look at their information management investments more broadly. Companies can now store semi-structured data, take advantage of big data platforms, leverage cloud applications, and pull together on-premises and cloud-based data for better insight into operations, strategy, customers, partners, and suppliers. With all of these complexities, it is easy to forget that some outputs can also serve as inputs for other applications. Because organizations can now store and leverage more diverse data sets, some of their existing outputs can themselves become data sources.

Keeping these factors in mind, there may be data sources and documents that organizations could put to better use. With mature data prep tools and diverse data storage options, it becomes possible to look at information access more broadly and identify key assets that are not being leveraged but that are needed to gain better business insights. Diverse market options let organizations pick and choose what works best within their environments. The same can be said for the way in which data sources are defined and used. With the ability to access semi-structured content, businesses can leverage PDFs and other curated content, whether general documents or pre-built reports. As long as the information can be validated and quality controls are in place, it makes sense for companies to evaluate their data assets more broadly and identify all of the valid data needed to support better information visibility and decision making.

There are many cases where an organization uses market data or reports sourced from data providers, and getting access to the underlying source data is either technically challenging or available only at extra cost. In cases such as these, it may make sense to leverage the report components themselves rather than create a new data source.
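As a rough illustration of treating a report as a data source, the following Python sketch pulls rows out of the text of a provider report and checks them against the report's own total line as a basic quality control. The report layout, column names, and function names are all assumptions made for this example; a real report would need its own pattern and validation rules, and extracting the text from a PDF would be a separate step that is not shown here.

```python
import re
from decimal import Decimal

# Hypothetical report text, standing in for the text layer of a provider report.
SAMPLE_REPORT = """\
Quarterly Market Summary
Region        Units    Revenue
North         1,200    $45,300.50
South           980    $38,110.00
Total         2,180    $83,410.50
"""

# Assumed row layout: a label, then a unit count, then a dollar amount.
ROW_PATTERN = re.compile(
    r"^(?P<region>[A-Za-z ]+?)\s{2,}(?P<units>[\d,]+)\s+\$(?P<revenue>[\d,]+\.\d{2})$"
)


def parse_report(text):
    """Return (data_rows, total_row) parsed from the report text."""
    rows, total = [], None
    for line in text.splitlines():
        match = ROW_PATTERN.match(line.strip())
        if match is None:
            continue  # title, header, or anything that is not a data row
        record = {
            "region": match["region"].strip(),
            "units": int(match["units"].replace(",", "")),
            "revenue": Decimal(match["revenue"].replace(",", "")),
        }
        if record["region"].lower() == "total":
            total = record
        else:
            rows.append(record)
    return rows, total


def totals_match(rows, total):
    """Quality control: the parsed rows should add up to the reported total."""
    if total is None:
        return False
    return (
        sum(r["units"] for r in rows) == total["units"]
        and sum(r["revenue"] for r in rows) == total["revenue"]
    )


if __name__ == "__main__":
    rows, total = parse_report(SAMPLE_REPORT)
    print(rows)
    print("validated:", totals_match(rows, total))
```

Checking the extracted rows against the total printed in the report is one simple way to confirm that nothing was lost or misread before the data is passed on to other applications.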

In general, data prep is expanding into its own market, giving organizations access to more tools that help automate processes and manage their expanding needs. Coupling this with big data storage lets organizations look at more diverse data and even take advantage of previously created content and reports. Natural Language Processing (NLP) and text analytics can decipher textual outputs on the Web or within documents. Many organizations overlook the fact that some of these outputs are reports that can also be leveraged. As organizations explore more diverse data sources, whether internal or external, some of those sources might be reports delivered by data providers, or other report outputs that are needed either because the underlying source data is not available in the short term or because the report is valuable as a source in its own right.


About the author

Lyndsay Wise is the research director, business intelligence at Enterprise Management Associates (EMA). She joined EMA in 2015 as research director for business intelligence (BI) and data warehousing. She is focused on data integration, data governance, cloud technologies, data visualization, analytics, and collaboration. Lyndsay has more than 10 years of experience in software research, BI consulting, and strategy development, specializing in software evaluation and best-fit solution selection.