With just over a week left on the 2019 calendar, it's now time for predictions. We'll run several stories featuring the 2020 predictions of industry experts and observers in the field. It all starts today with what is arguably the most critical aspect of the big data question: the data itself.
There's no denying that Hadoop had a rough year in 2019. But is it completely dead? Haoyuan "HY" Li, the founder and CTO of Alluxio, says that Hadoop storage, in the form of the Hadoop Distributed File System (HDFS), is dead, but Hadoop compute, in the form of Apache Spark, lives strong.
"There is a lot of talk about Hadoop being dead," Li says. "But the Hadoop ecosystem has rising stars. Compute frameworks like Spark and Presto extract more value from data and have been adopted into the broader compute ecosystem. Hadoop storage (HDFS) is dead because of its complexity and cost and because compute fundamentally cannot scale elastically if it stays tied to HDFS. For real-time insights, users need immediate and elastic compute capacity that's available in the cloud. Data in HDFS will move to the most optimal and cost-efficient system, be it cloud storage or on-prem object storage. HDFS will die but Hadoop compute will live on and live strong."
As HDFS data lake deployments slow, Cloudian is ready to swoop in and capture the data into its object store, says Jon Toor, CMO of Cloudian.
"In 2020, we will see a growing number of organizations capitalizing on object storage to create structured/tagged data from unstructured data, allowing metadata to be used to make sense of the tsunami of data generated by AI and ML workloads," Toor writes.
The end of one thing, like Hadoop, will give rise to the beginning of another, according to ThoughtSpot CEO Sudheesh Nair.
"Over the last 10 years or so, we've seen the rise, plateau, and the beginning of the end for Hadoop," Nair says. "This isn't because Big Data is dead. It's exactly the opposite. Every organization in the world is becoming a Big Data company. It's a requirement to operate in today's business landscape. Data has become so voluminous, and the need for agility with this data so great, however, that organizations are either building their own data lakes or warehouses, or going directly to the cloud. As that trend accelerates in 2020, we'll see Hadoop continue to decline."
When data gets big enough, it exerts a gravitational-like force, which makes it difficult to move, while also serving to attract even more data. Understanding data gravity will help organizations overcome barriers to digital transformation, says Chris Sharp, CTO of Digital Realty.
"Data is being generated at a rate that many enterprises can't keep up with," Sharp says. "Adding to this complexity, enterprises are dealing with data, both useful and not useful, from multiple locations that is hard to move and utilize effectively. This presents enterprises with a data gravity problem that will prevent digital transformation initiatives from moving forward. In 2020, we'll see enterprises tackle data gravity by bringing their applications closer to data sources rather than transporting resources to a central location. By localizing data traffic, analytics and management, enterprises will more effectively control their data and scale digital business."
All things being equal, it's better to have more data than less of it. But companies can move the needle just by using available technology to make better use of the data they already have, argues Beaumont Vance, the director of AI, data science, and emerging technology at TD Ameritrade.
"As companies are creating new data pools and are discovering better techniques to understand findings, we will see the true value of AI delivered like never before," Vance says. "At this point, companies are using less than 20% of all internal data, but through new AI capabilities, the remaining 80% of untapped data will be usable and easier to understand. Previous questions which were unanswerable will have obvious findings to help drive massive change across industries and societies."
Big data is tough to manage. What if you could do AI with small data? You can, according to Arka Dhar, the CEO of Zinier.
"Going forward, we'll no longer require massive big data sets to train AI algorithms," Dhar says. "In the past, data scientists have always needed large amounts of data to perform accurate inferences with AI models. Advances in AI are allowing us to achieve similar results with far less data."
How you store your data dictates what you can do with it. You can do more with data stored in memory than on disk, and in 2020, we'll see organizations storing more data on memory-based systems, says Abe Kleinfeld, the CEO of GridGain.
"In 2020, the adoption of in-memory technologies will continue to soar as digital transformation drives companies toward real-time data analysis and decision-making at massive scale," Kleinfeld says. "Let's say you're collecting real-time data from sensors on a fleet of airplanes to monitor performance and you want to develop a predictive maintenance capability for individual engines. Now you must compare anomalous readings in the real-time data stream with the historical data for a particular engine stored in the data lake. Currently, the only cost-effective way to do this is with an in-memory data integration hub, based on an in-memory computing platform like Apache Ignite that integrates Apache Spark, Apache Kafka, and data lake stores like Hadoop. 2020 promises to be a pivotal year in the adoption of in-memory computing as data integration hubs continue to expand in enterprises."
Big data can make your wildest business dreams come true. Or it can turn into a total nightmare. The choice is yours, say Eric Raab and Kabir Choudry, vice presidents at Information Builders.
"Those that have invested in the solutions to manage, analyze, and properly action their data will have a clearer view of their business and the path to success than has ever been available to them," Raab and Choudry write. "Those that have not will be left with a mountain of information that they cannot truly understand or responsibly act upon, leaving them to make ill-informed decisions or deal with data paralysis."
Let's face it: Managing big data is hard. That doesn't change in 2020, which will bring a renewed focus on data orchestration, data discovery, data preparation, and model management, says Todd Wright, head of data management and data privacy solutions at SAS.
"According to the World Economic Forum, it is predicted by 2020 that the amount of data we produce will reach a staggering 44 zettabytes," Wright says. "The promise of big data never came from simply having more data and from more sources but by being able to develop analytical models to gain better insights on this data. With all the work being done to advance the work of analytics, AI and ML, it is all for naught if organizations do not have a data management program in place that can access, integrate, cleanse and govern all this data."
Organizations are filling up NVMe drives as fast as they can to help accelerate the storage and analysis of data, particularly involving IoT. But doing this alone is not enough to ensure success, says Nader Salessi, the CEO and founder of NGD Systems.
"NVMe has provided a measure of relief and proven to remove existing storage protocol bottlenecks for platforms churning out terabytes and petabytes of data on a regular basis," Salessi writes. "Even though NVMe is substantially faster, it is not fast enough by itself when petabytes of data are required to be analyzed and processed in real time. This is where computational storage comes in and solves the problem of data management and movement."
Data integration has never been easy. With the ongoing data explosion and expansion of AI and ML use cases, it gets even harder. One architectural concept showing promise is the data fabric, according to the folks at Denodo.
"Through real-time access to fresh data from structured, semi-structured and unstructured data sets, data fabric will enable organizations to focus more on ML and AI in the coming year," the Denodo company says. "With the advancement in smart technologies and IoT devices, a dynamic data fabric provides quick, secure and reliable access to vast data through logical data warehouse architecture, thus facilitating AI-driven technologies and revolutionizing businesses."
Seeing how disparate data sets are connected, using semantic AI and enterprise knowledge graphs (EKG), provides another approach for tackling the data silo problem, says Saurav Chakravorty, the principal data scientist at Brillio.
"An organization's valuable information and knowledge is often spread across multiple documents and data silos, creating big headaches for a business," Chakravorty says. "EKG will allow organizations to do away with semantic incoherency in a fragmented knowledge landscape. Semantic AI with EKG complement each other and can bring great value overall to enterprise investments in data lake and big data."
2020 holds the potential to be a breakout year for storage-class memory, argues Charles Fan, the CEO and co-founder of MemVerge.
"With an increasing demand from data center applications, paired with the increased speed of processing, there will be a huge push towards a memory-centric data center," Fan says. "Computing innovations are happening at a rapid pace, with more and more computation tech, from x86 to GPUs to ARM. This will continue to open up new topology between CPU and memory units. While architecture currently tends to be more disaggregated between the computing layer and the storage layer, I believe we are headed towards a memory-centric data center very soon."
We are rapidly moving toward a converged storage and processing architecture for edge deployments, says Bob Moul, CEO of machine data intelligence platform Circonus.
"Gartner predicts there will be approximately 20 billion IoT-connected devices by 2020," Moul says. "As IoT networks swell and become more advanced, the resources and tools that manage them must do the same. Companies will need to adopt scalable storage solutions to accommodate the explosion of data that promises to outpace current technology's ability to contain, process and provide valuable insights."
Dark data will finally see the light of day in 2020, according to Rob Perry, the vice president of product marketing at ASG Technologies.
"Every organization has islands of data, collected but no longer (or perhaps never) used for business purposes," Perry says. "While the cost of storing data has decreased dramatically, the risk premium of storing it has increased dramatically. This dark data could contain personal information that must be disclosed and protected. It could include information subject to Data Subject Access Requests and possible required deletion, but if you don't know it's there, you can't meet the requirements of the law. Though, this data could also hold the insight that opens up new opportunities that drive business growth. Keeping it in the dark increases risk and possibly masks opportunity. Organizations will put a new focus on shining the light on their dark data."
Open source databases will have a good year in 2020, predicts Karthik Ranganathan, founder and CTO at Yugabyte.
"Open source databases that claimed zero percent of the market ten years ago now make up more than 7%," Ranganathan says. "It's clear that the market is shifting and in 2020, there will be an increase in commitment to true open source. This goes against the recent trend of database and data infrastructure companies abandoning open source licenses for some or all of their core projects. However, as technology rapidly advances it will be in the best interest of database providers to switch to a 100% open source model, since freemium models take a significantly longer period of time for the software to mature to the same level as a true open source offering."
However, 2019 saw a pullback from pure open source business models by companies like Confluent, Redis, and MongoDB. Instead of open source software, the market will be responsive to open services, says Dhruba Borthakur, the co-founder and CTO of Rockset.
"Since the public cloud has completely changed the way software is delivered and monetized, I predict that the time for open sourcing new, disruptive data technologies will be over as of 2020," Borthakur says. "Existing open-source software will continue to run its course, but there is no incentive for builders or users to choose open source over open services for new data offerings. Ironically, it was ease of adoption that drove the open-source wave, and it is ease of adoption of open services that will precipitate the demise of open source, particularly in areas like data management. Just as the last decade was the era of open-source infrastructure, the next decade belongs to open services in the cloud."
Read more:
Big Data Predictions: What 2020 Will Bring - Datanami