Pig is a platform for creating query execution routines on large, distributed data sets. The scripting language used is called Pig Latin (No, I didn’t make it up, believe me). Pig is supposedly easy to understand and learn. But my question is how many of these can one learn?
Connection analytics helps to discover the interrelated connections and influences between people, products, and systems within a network, or even across data combined from multiple networks.
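To make the idea a bit more concrete, here is a minimal sketch in plain Python. It assumes a tiny, invented list of "who interacts with whom" pairs and uses degree centrality (the share of other nodes a node is directly linked to) as one simple stand-in for "influence". The names `EDGES`, `build_graph`, `degree_centrality` and `most_connected` are made up for this illustration; real connection analytics tools use far richer measures.

```python
from collections import defaultdict

# Hypothetical interaction data, invented purely for illustration.
EDGES = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("carol", "dave"),
    ("dave", "erin"),
]


def build_graph(edges):
    """Turn (a, b) pairs into an undirected adjacency map."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def degree_centrality(graph):
    """Share of the other nodes each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


def most_connected(graph):
    """Node with the highest degree centrality."""
    scores = degree_centrality(graph)
    return max(scores, key=scores.get)


graph = build_graph(EDGES)
print(most_connected(graph))  # carol
```

In this toy network carol touches three of the four other people, so she comes out as the best-connected node.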
A tool for moving data from Hadoop to non-Hadoop data stores like data warehouses and relational databases.
A free and open source real-time distributed computing system. It makes it easier to process unstructured data continuously and instantaneously, doing for real-time processing what Hadoop does for batch processing.
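The difference between batch and continuous processing can be sketched in a few lines of plain Python. This is not the API of any real stream-processing system; `EVENTS`, `batch_total` and `streaming_totals` are hypothetical names used only to contrast the two styles.

```python
# Hypothetical events, invented for illustration.
EVENTS = [{"amount": 5}, {"amount": 7}, {"amount": 3}]


def batch_total(events):
    """Batch style: wait for the complete data set, then process it once."""
    return sum(event["amount"] for event in events)


def streaming_totals(events):
    """Streaming style: update and emit a result as each event arrives."""
    total = 0
    for event in events:
        total += event["amount"]
        yield total


print(batch_total(EVENTS))             # 15
print(list(streaming_totals(EVENTS)))  # [5, 12, 15]
```

The batch version gives one answer at the end, while the streaming version has an up-to-date answer after every event.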
You have SaaS, PaaS and now DaaS, which stands for Data-as-a-Service. DaaS providers can help customers get high-quality data quickly by giving them on-demand access to cloud-hosted data.
It is an approach to data management that allows an application to retrieve and manipulate data without needing technical details such as where it is stored or how it is formatted. For example, this is the approach used by social networks to store our photos on their networks.
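Here is a rough sketch of that idea in Python, assuming two made-up backends (one CSV, one JSON) hidden behind a single access point. The classes `CsvSource`, `JsonSource` and `VirtualTable` are hypothetical and exist only for this illustration; real data virtualization products are much more involved.

```python
import csv
import io
import json


class CsvSource:
    """Rows stored as CSV text (stands in for one storage system)."""

    def __init__(self, text):
        self._text = text

    def rows(self):
        return list(csv.DictReader(io.StringIO(self._text)))


class JsonSource:
    """Rows stored as a JSON array (stands in for a different system)."""

    def __init__(self, text):
        self._text = text

    def rows(self):
        return json.loads(self._text)


class VirtualTable:
    """Single access point; callers never see where or how rows are stored."""

    def __init__(self, *sources):
        self._sources = sources

    def select(self, **filters):
        matches = []
        for source in self._sources:
            for row in source.rows():
                # Compare as strings because CSV values are always text.
                if all(str(row.get(k)) == str(v) for k, v in filters.items()):
                    matches.append(dict(row))
        return matches


photos = VirtualTable(
    CsvSource("id,owner\n1,alice\n2,bob\n"),
    JsonSource('[{"id": 3, "owner": "alice"}]'),
)
alice_photos = photos.select(owner="alice")
print([str(row["id"]) for row in alice_photos])  # ['1', '3']
```

The caller asks for "alice's photos" and gets them back without ever knowing that one lives in a CSV and the other in JSON.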
1 followed by 27 zeroes, and this will be the size of the digital universe tomorrow.
Now that Big Data has become sexy, people just start adding adjectives to Data to come up with new terms like dark data, dirty data, small data, and now smart data. Come on guys, give me a break! Dirty data is data that is not clean, in other words inaccurate, duplicated, and inconsistent data. […]
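To show what cleaning dirty data might look like, here is a small Python sketch. It assumes invented contact records and three simple rules: normalize names and emails (inconsistent), drop records whose email has no "@" (inaccurate), and keep only the first record per email (duplicated). The names `DIRTY` and `clean` are hypothetical, and real cleaning rules depend entirely on the data at hand.

```python
# Hypothetical records, invented for illustration.
DIRTY = [
    {"name": "  ada   lovelace ", "email": "Ada@Example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com "},
    {"name": "Grace Hopper", "email": "not-an-email"},
    {"name": "alan turing", "email": "alan@example.com"},
]


def clean(records):
    """Return normalized records with unusable and duplicate entries dropped."""
    seen_emails = set()
    cleaned = []
    for record in records:
        # Inconsistent: collapse whitespace and unify capitalization.
        name = " ".join((record.get("name") or "").split()).title()
        email = (record.get("email") or "").strip().lower()
        # Inaccurate: skip records without a name or a plausible email.
        if not name or "@" not in email:
            continue
        # Duplicated: keep only the first record for each email.
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


for record in clean(DIRTY):
    print(record)
```

Of the four records, only one Ada Lovelace entry and the Alan Turing entry survive.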
Business intelligence (BI) is an umbrella term that includes the applications, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance.