When you find the data you are looking for, you need to understand it quickly. There might be millions of records or more, and you will need to make sense of them in minutes.
To help you, OvalEdge brings you a special tool called Smart-Catalog. The feature not only catalogs your data automatically, but it also maps its relationships with other datasets.
As soon as you see a table or file, you can look at the summary view of its data. This summary view answers the most frequent questions about the data, such as its row count, density, and expert profile.
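To make the idea of a summary view concrete, here is a minimal sketch in plain Python. It is not OvalEdge's implementation: the function name `summarize` is hypothetical, and "density" is assumed here to mean the share of non-empty values in a column.

```python
from collections import Counter

def summarize(rows):
    """Return the row count plus per-column density and most common value.

    Assumption: density = fraction of rows where the column is not None or "".
    """
    columns = sorted({key for row in rows for key in row})
    summary = {"row_count": len(rows), "columns": {}}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v not in (None, "")]
        density = len(present) / len(rows) if rows else 0.0
        top = Counter(present).most_common(1)
        summary["columns"][col] = {
            "density": density,
            "most_common": top[0][0] if top else None,
        }
    return summary

# Hypothetical sample records, for illustration only.
sample = [
    {"customer": "Ann", "state": "TX"},
    {"customer": "Bob", "state": ""},
    {"customer": "Cy", "state": "TX"},
    {"customer": "Di", "state": None},
]
profile = summarize(sample)
```

With this sample, the `state` column is half empty, so a low density value flags it at a glance before you start any analysis.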
Data is always changing, and in order to understand any data-driven phenomenon, the first thing you need to do is to look at its past. The timeline view does exactly that, allowing users to see past data summaries with just one click. By using the timeline view, you can see when and how the data has changed, and how it’s evolving over time.
When it comes to dealing with Big Data, it is important to partition tables by a logical category. For instance, you may have billions of sales transactions from all over the country, but most questions are asked in terms of specific states. So, you may want to partition the table by state, so that a question about one state does not have to scan the transactions of every other state.
OvalEdge provides a single-click view, allowing you to see the summary of your data for every partition. You can analyze individual partitions as well.
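The partition idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `partition_by` and `partition_summary` and the `amount` field are hypothetical, and a real Big Data table would be partitioned by the storage engine rather than in memory.

```python
from collections import defaultdict

def partition_by(rows, key):
    """Group rows into partitions keyed by the value of `key`."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[key]].append(row)
    return dict(partitions)

def partition_summary(partitions, amount_field):
    """Summarize each partition independently: row count and total amount."""
    return {
        name: {
            "row_count": len(rows),
            "total": sum(r[amount_field] for r in rows),
        }
        for name, rows in partitions.items()
    }

# Hypothetical sales transactions, for illustration only.
sales = [
    {"state": "TX", "amount": 120.0},
    {"state": "CA", "amount": 80.0},
    {"state": "TX", "amount": 30.0},
]
by_state = partition_by(sales, "state")
summary = partition_summary(by_state, "amount")
```

A question about Texas now only touches `by_state["TX"]`, which is the point of partitioning by the category most questions are asked in.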
Relationship Marker is a mighty tool. It can direct your analysis to dimensions you had not considered. To understand how, consider this example: Say you have customer records in the CRM database, with your clients’ names and home addresses. Relationship Marker tells you that this table is related to another file, which is located on a marketing analyst’s shared portal. You find out it contains house prices. You click once to join those two datasets, and you instantly have an idea of your customers’ economic standing. Relationship Marker can add a whole new dimension to your analysis.
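The customer and house price example amounts to a join on a shared field. The sketch below shows that join in plain Python; the `join_on` helper, the field names, and the sample records are all hypothetical, and it assumes addresses match exactly in both datasets.

```python
def join_on(left, right, key):
    """Inner-join two lists of records on a shared key field."""
    index = {row[key]: row for row in right}
    joined = []
    for row in left:
        match = index.get(row[key])
        if match is not None:
            # Merge the matching record's fields into the left record.
            joined.append({**row, **match})
    return joined

# Hypothetical CRM records and house price file, for illustration only.
customers = [
    {"name": "Ann", "address": "1 Oak St"},
    {"name": "Bob", "address": "9 Elm Rd"},
]
house_prices = [
    {"address": "1 Oak St", "price": 350000},
]
enriched = join_on(customers, house_prices, "address")
```

Only customers whose address appears in the house price file survive the join, so in practice you would also check how many records were matched.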
Data lineage, which is also known as the data lifecycle, explains where data came from and where it moves over time. In order to understand the data, it’s important to see its lineage.
OvalEdge not only keeps track of data movement on the Hadoop cluster, but it also has algorithms that determine this lineage automatically. You can now visualize the whole workflow of the data to better understand how data is used across the organization.
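One way to picture lineage is as a graph in which each dataset points to the datasets it was built from. The sketch below walks such a graph to list everything upstream of a dataset. It does not show how OvalEdge detects lineage; the `upstream` function and the dataset names are hypothetical, and the lineage map is assumed to be already recorded.

```python
def upstream(lineage, dataset):
    """Return every dataset that `dataset` was derived from, directly or indirectly."""
    seen = []
    stack = [dataset]
    while stack:
        current = stack.pop()
        for source in lineage.get(current, []):
            if source not in seen:
                seen.append(source)
                stack.append(source)
    return seen

# Hypothetical lineage map: each dataset lists its direct sources.
lineage = {
    "sales_report": ["sales_clean"],
    "sales_clean": ["sales_raw", "store_lookup"],
}
sources = upstream(lineage, "sales_report")
```

Walking the graph this way answers the question "where did this data come from?" for any dataset in the workflow.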
When you have millions of files and tables, it is important to narrow the scope of your analysis and focus on a single business area. By using OvalEdge, you can define the functional category of any dataset. OvalEdge can also analyze data automatically and define the functional groups for datasets that have not been assigned one.
A substantial amount of dark data is not stored in a standard format; rather, it’s stored in the form of spreadsheets, PDF files, etc. OvalEdge extracts the data from these files automatically and recommends corrected data types for analysis. With just one click, you can correct dates and other numeric fields. Note that the analysis can only be performed once the data types are correct.
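A data type recommendation can be pictured as trying to read every value in a column as a date, then as a number, and falling back to text. The sketch below is an assumption about how such a check might look, not OvalEdge's algorithm; `recommend_type`, `parse_date`, and the two date formats are hypothetical choices.

```python
from datetime import datetime

# Assumed date formats; a real tool would support many more.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def parse_date(text):
    """Return a date if `text` matches a known format, otherwise None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def recommend_type(values):
    """Recommend 'date', 'number', or 'text' for a column of raw strings."""
    cleaned = [v.strip() for v in values if v and v.strip()]
    if not cleaned:
        return "text"
    if all(parse_date(v) is not None for v in cleaned):
        return "date"
    try:
        for v in cleaned:
            float(v.replace(",", ""))  # tolerate thousands separators
        return "number"
    except ValueError:
        return "text"
```

For example, a column holding `"2021-03-01"` and `"03/05/2021"` would be recommended as a date column, while a column mixing `"abc"` and `"12"` would stay as text.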