What is meant by unstructured data?

Unstructured data is the term used to denote all data that does not sit in the rows and columns of a database, that is, data that is not part of a relational (SQL) database such as Oracle. Examples include all types of office documents, as well as emails, scanned images, PDFs, video, CCTV and digital camera footage, and even data from internet-connected devices and log files (although the latter two are more correctly referred to as “dark data”, in that they are often hidden or difficult to find).

Unstructured data usually has very limited metadata, or none at all. Paper documents have no metadata; a spreadsheet has date, title and author information; and emails carry more, with fields such as “to”, “from”, “date” and “subject” alongside the body of the message (for this reason emails are sometimes considered to be semi-structured data).
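
To illustrate why emails are sometimes described as semi-structured, the minimal sketch below uses Python's standard `email` module to separate the structured header fields from the free-text body. The sample message, the addresses and the helper name `split_email` are invented for illustration only; real mail is often multipart and would need more handling than this.

```python
from email import message_from_string

# Hypothetical sample message, invented purely for illustration.
RAW_EMAIL = """\
From: alice@example.com
To: bob@example.com
Date: Mon, 06 Jan 2025 09:30:00 +0000
Subject: Quarterly report

Hi Bob,
The report is attached.
"""


def split_email(raw):
    """Return (metadata, body) for a simple, non-multipart email."""
    msg = message_from_string(raw)
    # The header fields are the structured part: named, predictable, queryable.
    metadata = {name: msg[name] for name in ("From", "To", "Date", "Subject")}
    # The body is the unstructured part: free text with no fixed schema.
    body = msg.get_payload()
    return metadata, body


metadata, body = split_email(RAW_EMAIL)
for field, value in metadata.items():
    print(f"{field}: {value}")
print("Body length in characters:", len(body))
```

The header fields could be loaded straight into database columns, whereas the body would still need text search or manual review before anything could be extracted from it.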

