What is Structured Data?

Structured data refers to any data that resides in a fixed field within a record or file. This includes data contained in relational databases and spreadsheets.

Characteristics of Structured Data

Structured data first depends on creating a data model – a model of the types of business data that will be recorded and how they will be stored, processed and accessed. This includes defining what fields of data will be stored and how that data will be stored: data type (numeric, currency, alphabetic, name, date, address) and any restrictions on the data input (number of characters; restricted to certain terms such as Mr., Ms. or Dr.; M or F).
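As a rough illustration of such a data model, the sketch below declares field types and input restrictions with Python's built-in sqlite3 module. The table name, column names and length limit are hypothetical and chosen only for this example; a real model would follow the organization's own business rules.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical "customer" model: every field has a declared type,
# and CHECK constraints restrict what may be entered.
conn.execute("""
    CREATE TABLE customer (
        id          INTEGER PRIMARY KEY,
        title       TEXT NOT NULL CHECK (title IN ('Mr.', 'Ms.', 'Dr.')),
        name        TEXT NOT NULL CHECK (length(name) <= 40),
        gender      TEXT CHECK (gender IN ('M', 'F')),
        signup_date TEXT NOT NULL,
        balance     NUMERIC NOT NULL DEFAULT 0
    )
""")

def try_insert(row):
    """Return True if the row satisfies the model, False if it is rejected."""
    try:
        conn.execute(
            "INSERT INTO customer (title, name, gender, signup_date, balance) "
            "VALUES (?, ?, ?, ?, ?)",
            row,
        )
        return True
    except sqlite3.IntegrityError:
        # Raised when a NOT NULL or CHECK constraint is violated.
        return False

accepted = try_insert(("Dr.", "Ada Lovelace", "F", "2024-01-15", 120.50))
rejected = try_insert(("Sir", "Bob Smith", "M", "2024-01-16", 0))  # 'Sir' is not an allowed title
print(accepted, rejected)
```

The second row is refused because its title is not one of the permitted terms, which is the kind of input restriction described above.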

Structured data has the advantage of being easily entered, stored, queried and analyzed. At one time, because of the high cost and performance limitations of storage, memory and processing, relational databases and spreadsheets using structured data were the only way to effectively manage data. Anything that couldn’t fit into a tightly organized structure would have to be stored on paper in a filing cabinet.

 

Managing Structured Data

Structured data is often managed using Structured Query Language (SQL), a programming language created for managing and querying data in relational database management systems. SQL was originally developed by IBM in the early 1970s and later developed commercially by Relational Software, Inc. (now Oracle Corporation).
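To give a feel for what querying structured data with SQL looks like, here is a minimal sketch that runs a few SQL statements through Python's built-in sqlite3 module. The orders table and its rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: each order is a record with fixed, typed fields.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer TEXT, amount REAL, order_date TEXT)"
)
conn.executemany(
    "INSERT INTO orders (customer, amount, order_date) VALUES (?, ?, ?)",
    [
        ("Acme", 250.0, "2024-03-01"),
        ("Globex", 75.5, "2024-03-02"),
        ("Acme", 100.0, "2024-03-05"),
    ],
)

# Because every record has the same fields, one declarative query
# can filter, group and sort the whole data set.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders "
    "GROUP BY customer ORDER BY customer"
).fetchall()
print(totals)
```

The same query text would work largely unchanged on other relational databases, which is part of what made SQL attractive as a common language.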

Structured data was a huge improvement over strictly paper-based unstructured systems, but life doesn’t always fit into neat little boxes. As a result, structured data always had to be supplemented by paper or microfilm storage. As technology performance continued to improve and prices dropped, it became possible to bring unstructured and semi-structured data into computing systems.

Unstructured and Semi-Structured Data

Unstructured data is all those things that can’t be so readily classified and fit into a neat box: photos and graphic images, videos, streaming instrument data, webpages, PDF files, PowerPoint presentations, emails, blog entries, wikis and word processing documents.

Semi-structured data is a cross between the two. It is a type of structured data, but lacks the strict data model structure. With semi-structured data, tags or other types of markers are used to identify certain elements within the data, but the data doesn’t have a rigid structure. For example, word processing software now can include metadata showing the author’s name and the date created, with the bulk of the document just being unstructured text. Emails have the sender, recipient, date, time and other fixed fields added to the unstructured data of the email message content and any attachments. Photos or other graphics can be tagged with metadata such as the creator, date, location and keywords, making it possible to organize and locate graphics. XML and other markup languages are often used to manage semi-structured data.
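A small sketch of the idea, using Python's standard xml.etree.ElementTree module: tags mark the elements that can be identified, while the body stays free-form text. The document layout and tag names here are invented for the example and do not correspond to any particular word processor's format.

```python
import xml.etree.ElementTree as ET

# Hypothetical semi-structured document: tagged metadata plus free text.
doc = """<document>
  <author>Jane Doe</author>
  <created>2024-02-10</created>
  <body>Meeting notes: we discussed the budget, then lunch options.</body>
</document>"""

root = ET.fromstring(doc)

# The tagged elements can be pulled out by name, like fields in a record.
metadata = {child.tag: child.text for child in root if child.tag != "body"}

# The body is still unstructured text; nothing tells us what it "means".
body = root.findtext("body")

# No rigid schema: an element may simply be absent, so supply a default.
keywords = root.findtext("keywords", default="")

print(metadata)
print(body)
```

Unlike a database table, nothing forces every document to carry the same tags, which is why a default value is supplied when looking up an element that may be missing.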

Structured Data Technology Standards

SQL has been a standard of the American National Standards Institute (ANSI) since 1986. It is managed by the InterNational Committee for Information Technology Standards (INCITS) Technical Committee DM 32 (Data Management and Interchange). The committee has two task groups, one for databases and the other for metadata. HP, CA, IBM, Microsoft, Oracle, Sybase (SAP) and Teradata all participate, as well as several federal government agencies. Both of the committee project documents have links to further information on each project. SQL became an International Organization for Standardization (ISO) standard in 1987. The published standards are available for purchase from the ANSI eStandards Store, under the INCITS/ISO/IEC 9075 classification.

This section is based on an article by Vangie Beal on webopedia.com, which also links to related terms such as data model.

You can read the original post at the link below:

http://www.webopedia.com/TERM/S/structured_data.html

What’s the Difference Between Structured & Unstructured Data?

If left unmanaged, your data can become overwhelming, making it difficult to find the information you need when you need it. While software is designed to address archiving, e-discovery, compliance, etc., the overarching goal is almost always the same: to make managing and maintaining data a feasible task. In this post, you’ll look at two types of data you’re accustomed to working with, paying close attention to the differences between structured and unstructured data.

What is Structured Data?

Before getting into unstructured data, you need to have an understanding of its structured counterpart. Structured data (as explained succinctly in Big Data Republic’s video) is information, usually text files, displayed in titled columns and rows which can easily be ordered and processed by data mining tools. This could be visualized as a perfectly organized filing cabinet where everything is identified, labeled and easy to access. Most organizations are likely to be familiar with this form of data and already using it effectively, so let’s move on to the hotter question.

What is Unstructured Data?

Believe it or not, your database of structured information doesn’t even contain half of the information available for your use! Seth Grimes, a leading industry analyst on the confluence of structured and unstructured data sources, published an article that stated, “80% of business-relevant information originates in unstructured form, primarily text.”  This may seem like an outlandish percentage, but don’t jump to conclusions too fast. We’re just getting started.

Now that you have a grasp on structured data, it will be much easier to understand what unstructured data is. Unstructured data, usually binary data that is proprietary, is that which has no identifiable internal structure. It can be visualized as a level 5 hoarder’s living room; it’s a massive unorganized conglomerate of various objects that are worthless until identified and stored in an organized fashion. Once this organization process has taken place (through the use of specialized software), the items can then be searched through and categorized (to an extent) for obtaining insights. While data mining tools might not be equipped to parse information in email messages (however organized it may be), you may have very good reason to collect and categorize data from this source. This illustrates the importance and plausible breadth of unstructured data.

Email Has Structure, Right?

The term “unstructured” has faced major scrutiny for several reasons. One argument is that although some form of structure is not formally identified, it can still be implied and therefore should not be labeled as “unstructured.” The counter-point states that if data has some form of structure but is not helpful to the processing task at hand, it may still be characterized as “unstructured.” So, while email messages may contain information that does have some implied structure, we can label the information as “unstructured” because normal data mining tools aren’t equipped to parse it. Alas, both sides of the argument persist.

Unstructured Data Types

Unstructured data is raw and unorganized, and organizations store it all. Ideally, all of this information would be converted into structured data; however, this would be costly and time-consuming. Also, not all types of unstructured data can easily be converted into a structured model. For example, an email holds information such as the time sent, subject, and sender (all uniform fields), but the content of the message is not so easily broken down and categorized. This can introduce some compatibility issues with the structure of a relational database system.
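The email case can be sketched with Python's standard email package. The message below is invented for illustration: the headers map cleanly onto uniform fields, while the body comes back as a single block of free text that would need further processing before it fits a structured model.

```python
from email import message_from_string
from email.policy import default

# Hypothetical raw message: headers, a blank line, then free-form content.
raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Q3 numbers\n"
    "Date: Mon, 04 Mar 2024 09:30:00 +0000\n"
    "\n"
    "Hi Bob,\n"
    "the numbers look good, but call me about the Denver account.\n"
)

msg = message_from_string(raw, policy=default)

# Uniform fields: these would map directly onto database columns.
fields = {
    "sender": str(msg["From"]),
    "recipient": str(msg["To"]),
    "subject": str(msg["Subject"]),
}

# The content is one undifferentiated string; at best it fits a single
# text column, and its meaning is not exposed to ordinary queries.
body = msg.get_content()

print(fields)
print(body)
```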

In case you’re still not quite sure what we mean, here is a limited list of types of unstructured data:

  • Emails
  • Word Processing Files
  • PDF files
  • Spreadsheets
  • Digital Images
  • Video
  • Audio
  • Social Media Posts

Looking at the list, you may be wondering what these files have in common. The files listed above can be stored and managed without the format of the file being understood by the system. This allows them to be stored in an unstructured fashion because the contents of the files are unorganized.
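One way to picture this is a store that treats every file as an opaque run of bytes. The OpaqueStore class below is a hypothetical sketch, not a real product's API: it records only external facts (file name, size, a content hash) and never inspects the format of what it holds.

```python
import hashlib

class OpaqueStore:
    """Hypothetical store that keeps files as raw bytes without parsing them."""

    def __init__(self):
        self._blobs = {}

    def put(self, filename: str, payload: bytes) -> str:
        # The key is a hash of the bytes; the store never looks inside them.
        key = hashlib.sha256(payload).hexdigest()
        self._blobs[key] = {
            "filename": filename,
            "size": len(payload),
            "payload": payload,
        }
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]["payload"]

store = OpaqueStore()

# A PDF and an image are handled identically: both are just bytes.
pdf_key = store.put("report.pdf", b"%PDF-1.7 placeholder bytes")
img_key = store.put("photo.jpg", b"\xff\xd8\xff\xe0 placeholder bytes")

print(len(store.get(pdf_key)), len(store.get(img_key)))
```

The store can save, retrieve and count such files, but it cannot answer a question about their contents, which is what separates this kind of storage from a structured record.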

The big data industry is growing, and organizations have recognized the problem of unstructured data going unused. Better yet, technologies and services are being developed in response. Darin Stewart of InformationWeek said in a recent article about big data, “The age of information overload is slowly drawing to a close. Enterprises are finally getting comfortable with managing massive amounts of data, content and information. The pace of information creation continues to accelerate, but the ability of infrastructure and information management to keep pace is coming within sight. Big Data is now considered a blessing rather than a curse.”

This section is based on a post from Sherpa’s website.

You can read the original post on their website:

What’s the Difference Between Structured & Unstructured Data?