Teradata releases data lake platform to open source

Teradata today released its data lake management software platform to the open source community. The project aims to help organizations address common challenges in data lake implementation, including skill shortages for engineers and administrators, learning and implementing governance best practices and driving data lake adoption beyond engineers.

Teradata is offering the new open source Kylo project under the Apache 2.0 license, and plans to offer services and support for the platform.

Kylo evolved from code developed by Teradata company Think Big Analytics over eight years of engagements with Fortune 1000 customers on more than 150 data lake projects. It was built using open source capabilities including Apache Hadoop, Apache Spark and Apache NiFi.

"Open source software has an appeal to users seeking independence, cooperative learning, experimentation and flexibility for customized deployments," Rick Farnell, president of Think Big, said in a statement today.

Teradata says data lakes take too long to build, and in the average six- to 12-month build cycle, users find that use cases often become out of date. In addition, while the software costs associated with data lakes may be lower, Teradata says engineering costs can mount quickly. Even when data lakes are successfully created, users often find them difficult to explore.

Teradata says Kylo will help organizations address these challenges because it integrates and simplifies pipeline development and common data management tasks. According to the company, organizations that use Kylo achieve faster time-to-value, greater user adoption and higher developer productivity. Teradata says Kylo doesn't require coding, and it offers an intuitive user interface that enables self-service data ingest. Meanwhile, reusable templates help increase productivity.

One major telecommunications company recently implemented Kylo after a large team of 30 data engineers spent months hand-coding data ingestion pipelines. With Kylo, a single individual was able to ingest, cleanse, profile and validate the same data in less than a week, Teradata says.

The Kylo software, documentation and tutorials are now available via the Kylo project website and on GitHub. Think Big is also offering optional services around Kylo.

Thor Olavsrud covers IT security, big data, open source technology, Microsoft tools and servers for CIO.com.