Open source and proprietary software solutions: the key for an analytic project – Information Age

With the entire approved analytic process in a repeatable workflow, organisations spend less time repeating mundane tasks and processes, and more time on the valuable aspects of the analysis.

In the world of data analysis it may be no coincidence that open source tools like the R statistical computing language, Hadoop and Python have blossomed as analytics and big data have matured together.

There seems to be a special kind of magic between the curious minds of data analysts (with a small 'a', as they may be line-of-business users who don't have a degree in statistics or a qualification in coding) and new ways of exploring the world.

Open source software has proven itself a very useful way of rapidly finding quality insights when put to the challenging task of mining the enormous volumes of data out there. Big data analytics provides an opportunity for open source data quality tools to deliver new insights.

From a bottom-line perspective, using open source solutions as part of the enterprise mix can provide a cost-effective way to get analytics projects off the ground.

Certainly, any business still using coding-intensive legacy architectures, or SAS solutions, will find itself easily seduced by the speed and versatility of modern products in the analytical toolkit.

Bringing these products and tools together can be complicated, but linking them in one platform gives analysts the fun and thrill of using their favourite tools while still maintaining the governance, repeatability and reliability the business needs to create a long-lived culture of analytics.

It's a plain fact that much of an analyst's role, be they a specialist quant or a general business user, is more likely than not filled with the tedium of finding, cleaning and prepping data. By that stage they've lost the enjoyment of what made the relationship with data special in the first place.

The trouble is that many legacy solutions can't adapt to the changing data landscape. Some were not designed to deal with the variety of data (structured, unstructured and semi-structured) or with the various formats in which it arrives from numerous applications and sources. This is why it's sensible to provide a flexible environment in which analysts can take advantage of data across any system and in any format.

If this, the foundational element of the data journey, can be made as seamless and easy as possible, then the analytical detectives can do what they were trained and are paid to do. That's better for them, and it's better for the business, as that passion and brain power is not atrophying amid the tedium of data preparation.

Additionally, most data scientists today build predictive and machine learning models in open source programming languages and then need to deploy that code into different technology frameworks.

That process is time-consuming, error-prone and requires additional development resources, often stalling data science projects altogether. It's important to remove any roadblocks between data scientists and development teams by accelerating both model building and model deployment.

Harnessing complex sets of open source tools can require considerable coding expertise, which adds difficulty, not least because those skills are in high demand and command a premium on the market.

As a consequence, code-free environments for analytics that simplify data access, preparation, analysis and consumption are becoming a must in the modern enterprise.

A project manager should be able to quickly prepare, clean and combine data from any range of data sources. It should be a breeze to implement fuzzy matching techniques to improve the accuracy of results and, however the project is designed, it should as a matter of course reduce the reliance on data scientists and IT wherever possible. It's simply not sustainable to do this in any other way.
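For readers who do write code, the fuzzy matching mentioned above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library's `difflib`; the company names, the `fuzzy_match` helper and the 0.8 threshold are assumptions made for the example, not something taken from any particular product.

```python
from difflib import SequenceMatcher
from typing import List, Optional


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio after lower-casing and collapsing whitespace."""
    norm_a = " ".join(a.lower().split())
    norm_b = " ".join(b.lower().split())
    return SequenceMatcher(None, norm_a, norm_b).ratio()


def fuzzy_match(name: str, candidates: List[str], threshold: float = 0.8) -> Optional[str]:
    """Return the closest candidate, or None if nothing reaches the threshold.

    The threshold of 0.8 is an illustrative assumption; tune it on real data.
    """
    best = max(candidates, key=lambda c: similarity(name, c), default=None)
    if best is not None and similarity(name, best) >= threshold:
        return best
    return None


# Hypothetical records from two systems that spell the same customers differently.
crm_names = ["Acme Corporation", "Globex Ltd", "Initech"]
billing_names = ["ACME Corporaton", "Globex Ltd.", "Umbrella Group"]

matches = {name: fuzzy_match(name, crm_names) for name in billing_names}
for billing_name, crm_name in matches.items():
    print(f"{billing_name!r} -> {crm_name!r}")
```

The misspelt and punctuated names are paired with their clean counterparts, while a name with no close counterpart is left unmatched for a person to review.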

Following data preparation and quality improvement, the next step involves taking that data and incorporating predictive or advanced analytics to make, or further improve, business decisions. In the modern, agile enterprise, users should be able to do this without writing code if they don't wish to.

Once those elements are accounted for, it should be a simple matter to build repeatable workflow processes that provide the business with greater data consistency and accuracy, and that result in tangible business benefits once the insights are acted upon.
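As a rough illustration of what "repeatable" means in practice, the sketch below expresses a workflow as an ordered list of small steps that can be re-run on any new batch of data. The step names, the `run_workflow` helper and the sample records are all hypothetical; a code-free platform would present the same idea as connected visual blocks.

```python
from functools import reduce
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[List[Record]], List[Record]]


def drop_incomplete(rows: List[Record]) -> List[Record]:
    """Remove records that have any empty or whitespace-only field."""
    return [r for r in rows if all(v.strip() for v in r.values())]


def normalise_names(rows: List[Record]) -> List[Record]:
    """Trim and title-case the (hypothetical) 'customer' field."""
    return [{**r, "customer": r["customer"].strip().title()} for r in rows]


def deduplicate(rows: List[Record]) -> List[Record]:
    """Keep the first occurrence of each identical record."""
    seen = set()
    unique = []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique


def run_workflow(rows: List[Record], steps: List[Step]) -> List[Record]:
    """Apply each step in order; the same step list gives the same result every run."""
    return reduce(lambda data, step: step(data), steps, rows)


# The approved process, captured once and reused for every new batch.
WORKFLOW: List[Step] = [drop_incomplete, normalise_names, deduplicate]

raw = [
    {"customer": " acme corp ", "region": "uk"},
    {"customer": "Acme Corp", "region": "uk"},
    {"customer": "globex", "region": ""},
]
clean = run_workflow(raw, WORKFLOW)
print(clean)
```

Because the steps live in one list, the approved process can be reviewed, versioned and re-run without anyone repeating the preparation by hand.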

With the entire approved analytic process in a repeatable workflow, organisations spend less time repeating mundane tasks and processes, and more time on the valuable aspects of the analysis. Analysts will enjoy themselves once more, following their curiosity and solving problems rather than administering.

This is important. Today's data scientists are spending too much time building advanced models that never reach deployment. Gartner stated that many projects remain stuck at the pilot stage.

Only 15% of businesses reported deploying their big data project to production in the Business Intelligence & Analytics Summit 2016 research. Yhat states that only 10% of predictive models actually get deployed. And according to TDWI, models can take an average of six to nine months to get deployed. That's not a sustainable way of working.

Modelling tools need to be more accessible to accelerate deployment and to save time and frustration. In part, it's about bringing joy back to data scientists and business users alike. With a wealth of data out there, it's a good time to encourage and empower the people who love to solve complex business problems.

Sourced by Matthew Madden, director, product marketing at Alteryx
