23 Jun Big-Data is the real-thing…
Data is exploding at an astounding rate. Big data refers to huge data sets characterized by larger volumes (by orders of magnitude) and by greater variety and complexity (a mix of structured and unstructured data), generated at a higher velocity than your organization has faced before. According to Cisco, by 2015, nearly 15 billion connected devices, including 3 billion Internet users plus machine-to-machine connections, will contribute to the flood of big data.
Both structured data (such as transactions) and unstructured data (text, images, video, and more) are growing. However, unstructured data, which is heterogeneous and variable in nature, is growing faster. As a new, relatively untapped source of insight, unstructured data analytics can reveal important interrelationships that were previously difficult or impossible to determine.
Big data can help you beat the competition
Big data analytics is a technology-enabled strategy for gaining richer, deeper, and more accurate insights into customers, partners, and the business—and ultimately gaining competitive advantage. By processing a steady stream of real-time data, organizations can turn insights into actions to make time-sensitive decisions faster than ever before, monitor emerging trends, course-correct rapidly, and jump on new business opportunities.
Big data use cases are already being applied across industries. For example, with big data analytics, healthcare and life sciences research organizations can gain insight into patient treatment options, telecom companies can better manage costs, retailers can connect better with customers by understanding user behavior and buying patterns, marketers can mine sentiment data to evaluate brand reputation, and energy companies can better understand and reduce consumption.
New technologies make big data possible
Big data is a disruptive force presenting opportunities as well as challenges to IT organizations. Traditional data processing and analytical solutions can’t handle the massive scale, speed, or heterogeneity of big data. To successfully derive value from big data, organizations need new ways to harness and mine it for insight. Increased horsepower and storage capabilities built into today’s standard servers, plus emerging technologies such as the Apache Hadoop* framework, redefine the way data is managed and analyzed by leveraging the power of a distributed grid of computing resources.
These technologies utilize:
• Infrastructure architecture and distributed processing frameworks that scale for large, data-intensive jobs
• Cost-effective, efficient storage that can handle terabytes and petabytes of data and support intelligent capabilities that reduce your data footprint, such as data compression, automatic data tiering, and data deduplication
• Network infrastructure that can quickly import large data sets and replicate that data for processing
• Security capabilities that protect highly distributed infrastructure and data
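To make the idea of data deduplication mentioned in the list above more concrete, here is a minimal Python sketch. It assumes a very simple scheme (fixed-size chunks identified by a SHA-256 hash); the function names and the tiny chunk size are hypothetical and chosen only for illustration, and real storage systems typically use more sophisticated approaches.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4):
    """Split data into fixed-size chunks and keep each distinct chunk once."""
    store = {}   # chunk hash -> chunk bytes (each unique chunk stored once)
    recipe = []  # ordered list of hashes needed to rebuild the original data
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # only the first copy is stored
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the unique chunks and the recipe."""
    return b"".join(store[digest] for digest in recipe)


# Illustrative input with a repeated pattern (assumed data, not from a real system).
data = b"ABCDABCDABCDWXYZ"
store, recipe = deduplicate(data)
stored_bytes = sum(len(chunk) for chunk in store.values())

print(f"original: {len(data)} bytes, stored after dedup: {stored_bytes} bytes")
```

The more repetition the data contains, the smaller the stored footprint becomes relative to the original, which is the effect the storage bullet describes.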
The Apache Hadoop* framework is an emerging standard for big data
The Apache Hadoop* framework is emerging as a standard for gaining insight from unstructured big data. Hadoop* is an open-source framework that uses a simple programming model to enable distributed processing of large data sets on clusters of standard computers. The complete technology stack includes common utilities; a distributed file system; analytics and data storage platforms; and an application layer that manages distributed processing, parallel computation, workflow, and configuration management. In addition to offering high availability, Hadoop is more cost-effective for handling large unstructured data sets than conventional approaches, and it offers massive scalability and speed.
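The "simple programming model" usually associated with Hadoop is MapReduce. The sketch below imitates its three stages (map, shuffle, reduce) in plain Python to count words. It runs in a single process and does not use any Hadoop API; the function names are hypothetical, and in a real cluster the framework would distribute the map and reduce work across many servers.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # On a cluster, each document (or block of one) could be mapped on a different node.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data is big", "data is growing"])
print(counts)
```

Because each map call looks at only one document and each reduce call at only one key, the work can be split across machines without the pieces needing to coordinate, which is what lets this model scale out.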
Hadoop runs on standard servers, changing the economics of big data
Hadoop is a distributed processing framework that runs on clusters of standard servers rather than expensive high-end machines. Using standard servers helps make big data analytics more cost-effective than you might think. For example, the cost of storage in traditional data warehouse systems is typically tens of thousands of dollars per terabyte. With Hadoop running on x86-based servers, that cost can be reduced to hundreds of dollars per terabyte. Plus, depending on the scope of your project, your cluster can start small with the option to add more servers as your data needs grow, making it possible to space out your capital investments.
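The cost comparison above can be worked through with some back-of-the-envelope arithmetic. The figures in this Python sketch are assumptions picked from within the ranges mentioned ("tens of thousands" and "hundreds" of dollars per terabyte), and the per-server capacity is purely hypothetical; substitute your own numbers.

```python
import math

# Assumed illustrative figures, not vendor pricing.
WAREHOUSE_COST_PER_TB = 20_000  # "tens of thousands of dollars per terabyte"
HADOOP_COST_PER_TB = 500        # "hundreds of dollars per terabyte"
USABLE_TB_PER_SERVER = 12       # hypothetical usable capacity of one standard server


def storage_cost(terabytes, cost_per_tb):
    """Total storage cost in dollars for a given data volume."""
    return terabytes * cost_per_tb


def servers_needed(terabytes, tb_per_server=USABLE_TB_PER_SERVER):
    """Smallest whole number of servers that can hold the data."""
    return math.ceil(terabytes / tb_per_server)


data_tb = 100
warehouse = storage_cost(data_tb, WAREHOUSE_COST_PER_TB)
hadoop = storage_cost(data_tb, HADOOP_COST_PER_TB)
ratio = warehouse / hadoop

print(f"Warehouse: ${warehouse:,}  Hadoop cluster: ${hadoop:,}  ({ratio:.0f}x difference)")

# Starting small and growing: servers required as the data volume increases.
for volume in (20, 50, 100):
    print(f"{volume} TB -> {servers_needed(volume)} servers")
```

The second loop illustrates the point about spacing out capital investments: servers are added only as the data volume actually grows.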