Jim Scott

Top Stories by Jim Scott

If you’re in business, you have data. If you’re like a lot of businesses, you have a lot of data, and it isn’t coming only from your customers: it comes from other business units, partners, in-house applications, the cloud, hardware logs, and more. That data could help you run your business better, if only you had the right solution for accessing it in ways that deliver quantifiable value. One solution is to build an enterprise data hub (EDH) through which all your data flows for processing. Many IT professionals turn to Apache Hadoop as the core component of an EDH, but other technologies can complement it. For example, a NoSQL database can play an important role in an EDH by helping manage the complexities of processing and storing structured data for your organization. Why NoSQL? While RDBMSs still have plenty of useful functions, consider a NoSQL database ... (more)
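To make the NoSQL role concrete, here is a minimal sketch, assuming Apache HBase as the NoSQL store (the article doesn’t name one) and a hypothetical “customers” table with a “profile” column family, of how a structured record might be written to and read back from the hub in Scala:

    import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
    import org.apache.hadoop.hbase.client.{ConnectionFactory, Get, Put}
    import org.apache.hadoop.hbase.util.Bytes

    object EdhNoSqlSketch {
      def main(args: Array[String]): Unit = {
        // Connect to HBase using the cluster settings on the classpath.
        val conf = HBaseConfiguration.create()
        val connection = ConnectionFactory.createConnection(conf)
        // The "customers" table and "profile" column family are hypothetical.
        val table = connection.getTable(TableName.valueOf("customers"))
        try {
          // Write one structured record, keyed by customer id.
          val put = new Put(Bytes.toBytes("customer-42"))
          put.addColumn(Bytes.toBytes("profile"), Bytes.toBytes("region"),
            Bytes.toBytes("EMEA"))
          table.put(put)

          // Read it back, as a downstream processing step in the hub would.
          val result = table.get(new Get(Bytes.toBytes("customer-42")))
          val region = Bytes.toString(
            result.getValue(Bytes.toBytes("profile"), Bytes.toBytes("region")))
          println(s"region = $region")
        } finally {
          table.close()
          connection.close()
        }
      }
    }

Column-family stores like this scale out alongside Hadoop, which is what makes them a natural complement inside an EDH.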

Analytics in Decision-Making Workflow | @CloudExpo #BigData #Microservices

Putting Analytics into the Decision-Making Workflow with Apache Spark

Data-driven businesses use analytics to inform and support their decisions. In many companies, marketing, sales, finance, and operations departments tend to be the earliest adopters of data analytics, with the rest of the business lagging behind. The goal for many organizations now is to make analytics a natural part of most, if not every, employee’s daily workflow. Achieving that objective typically requires a shift in the corporate culture, and ready access to user-friendly data analytics tools. Big Data Should... (more)
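As a hedged illustration of what such an analytics step might look like (this example is not from the article; the file and column names are hypothetical), Spark’s DataFrame API lets an analyst express a summary in a few declarative lines of Scala:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions._

    object DecisionAnalyticsSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("decision-analytics-sketch")
          .master("local[*]")  // run locally; point at a cluster in production
          .getOrCreate()

        // Hypothetical sales extract with "region" and "amount" columns.
        val sales = spark.read
          .option("header", "true")
          .option("inferSchema", "true")
          .csv("sales.csv")

        // Revenue by region: the kind of summary a decision-maker consumes.
        sales.groupBy("region")
          .agg(sum("amount").as("revenue"))
          .orderBy(desc("revenue"))
          .show()

        spark.stop()
      }
    }

Output like this can feed a dashboard or report, which is where the analytics meets the daily decision-making workflow.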

Apache Spark: A Key to Big Data Initiatives | @CloudExpo #Microservices

Apache Spark continues to gain a lot of traction as companies launch or expand their big data initiatives. There is no doubt that it’s finding a place in corporate IT strategies. The open-source cluster computing framework was developed in the AMPLab at the University of California at Berkeley in 2009 and became an incubated project of the Apache Software Foundation in 2013. By early 2014, Spark had become one of the foundation’s top-level projects, and today it is one of the most active projects managed by Apache. Because Spark was optimized to run in-memory, it is capable of p... (more)
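To illustrate the in-memory point (a sketch, not from the article; the log file name is hypothetical): once a dataset is cached, successive computations reuse it from executor memory rather than re-reading it from disk, which is where much of Spark’s speed advantage comes from.

    import org.apache.spark.sql.SparkSession

    object InMemorySketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("in-memory-sketch")
          .master("local[*]")
          .getOrCreate()

        // Hypothetical log file; cache() pins the data in memory.
        val logs = spark.read.textFile("events.log").cache()

        // Both actions below reuse the cached copy instead of hitting disk.
        val total = logs.count()
        val errors = logs.filter(_.contains("ERROR")).count()
        println(s"$errors errors out of $total lines")

        spark.stop()
      }
    }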

Apache Spark vs. Hadoop | @CloudExpo #BigData #DevOps #Microservices

If you’re running Big Data applications, you’re going to want to look at some kind of distributed processing system. Hadoop is one of the best-known clustering systems, but how are you going to process all your data in a reasonable time frame? Apache Spark offers services that go beyond a standard MapReduce cluster.

A choice of job styles

MapReduce has become a standard, perhaps the standard, for distributed data processing. While it’s a great system already, it’s really geared toward batch use, with jobs needing to queue for later output. This can severely hamper your flexibility.... (more)
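To get a feel for the difference (a sketch, not from the article; the input file is hypothetical): the classic word count, a full batch job in Hadoop MapReduce, runs interactively in a few lines in the Scala spark-shell, where the spark session is predefined.

    // Paste into spark-shell; "input.txt" is a hypothetical local file.
    val counts = spark.read.textFile("input.txt")
      .flatMap(_.split("\\s+"))   // split each line into words
      .filter(_.nonEmpty)
      .groupBy("value")           // a Dataset[String] column is named "value"
      .count()                    // one row per distinct word

    counts.orderBy($"count".desc).show(10)  // top ten words, no batch queue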

Taking Apache Spark for a Spin | @BigDataExpo #BigData

You might have looked at some of the articles on Apache Spark on the Web and wondered if you could try it out for yourself. Because Spark and Hadoop are designed for clusters, you might think you need lots of nodes. If you wanted to see what you could do with Spark, you could set up a home lab with a few servers from eBay. But there’s no rule saying that you need more than one machine just to learn Spark. Today’s multi-core processors are like having a cluster already on your desk. Even better, with a laptop, you can pick up your cluster and take it with you. Try doing that ... (more)
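As a minimal sketch of that single-machine setup (not from the article): Spark’s local mode turns your laptop’s cores into the worker threads of a tiny cluster.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.sum

    object LaptopClusterSketch {
      def main(args: Array[String]): Unit = {
        // local[*] uses every core on this machine as a worker thread,
        // so a multi-core laptop really does behave like a small cluster.
        val spark = SparkSession.builder()
          .appName("laptop-cluster-sketch")
          .master("local[*]")
          .getOrCreate()

        // A quick smoke test: sum the numbers 1 through 1,000,000 in parallel.
        val total = spark.range(1, 1000001).agg(sum("id")).first().getLong(0)
        println(s"sum = $total")  // expected: 500000500000

        spark.stop()
      }
    }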