Gourav Singh Bais
Gourav is an applied machine learning engineer at ValueMomentum Inc. He is skilled in developing machine learning/deep learning pipelines, retraining systems and transforming data science prototypes to production-grade solutions. He has been working in the same field for the last three years and has served a lot of clients including Fortune 500 companies, which provided him the exposure to build experience and skills that can contribute to the machine learning community.
As a data professional, you may come across datasets with only a few variables. One variable is time, and the other can be any sort of time-dependent column, such as the number of bookings in a hotel or the number of passengers on a flight.
Data of this kind, where each value is recorded against a point in time and often shows a trend, is referred to as time-series data. There are various ways of storing it, such as relational databases or files like CSV or Excel. However, these options are not designed to store time-series data efficiently. Enter time-series databases, which are specifically designed to store time-series data quickly and efficiently.
There are various use cases where time-series databases (TSDB) perform significantly better than other storage mechanisms, such as IoT applications, automated cars and real-time application analysis.
Furthermore, there are advantages to using a time-series database over other storage mechanisms for that data type, chiefly the speed and scale at which it can ingest time-stamped data.
One widely used time-series database is InfluxDB, an open source time-series database created by InfluxData. It's written in Go and designed for storing and retrieving time-series data for any use case, including operations monitoring, application metrics, Internet of Things (IoT) sensor data and real-time analytics.
To learn more about the benefits of InfluxDB, you can refer to the InfluxData website.
In this article, you will learn what is needed to get started with InfluxDB and the R language: installing, setting up, querying, writing and, finally, building a simple time-series application using R.
Clients interacting with InfluxDB using any programming language must be able to connect to the database so that different database operations can be carried out. The influxdb-client-r library can be used to connect to InfluxDB using R. It's a package that supports operations like writing data, reading data and getting the database status. This client library works with InfluxDB version 2.0.
Let's start by setting up InfluxDB version 2.0. InfluxDB is available on different platforms, like Windows, Linux and macOS. The examples in this article were tested on macOS Big Sur, although installing it on any platform is simple.
Alternatively, you can use InfluxDB Cloud to quickly get a free instance of InfluxDB running in minutes without having to install anything locally on your machine.
InfluxDB can be installed on macOS using Homebrew:
```
$ brew update
$ brew install influxdb influxdb-cli
```
Alternatively, InfluxDB can be manually downloaded here.
Once InfluxDB is installed, you can start it by running the `influxd` command in a terminal.
The first time you start InfluxDB, it will ask you to set up the account, which can be carried out using the UI at [localhost:8086](http://localhost:8086) or the command line interface (CLI). For a UI setup, you will have to open the localhost URL and provide the information required for the initial setup. If you're using the CLI, you'll need to do it with the InfluxDB client, which can be started in the terminal with the `influx setup` command.
For the initial setup, note the following details:
Username: You can choose any username for the initial user.
Password: You need to create and confirm a password for database access.
Organization name: You need to choose the initial organization name.
Bucket name: An initial bucket name is required, and you can create as many buckets as you want to work with.
Retention period: The time period your bucket will store the data before deleting it. You can choose **never** or leave it empty for an infinite retention period.
To install InfluxDB on other platforms, refer to the following link.
Once you have installed InfluxDB and completed the setup, you can log in to [localhost:8086](http://localhost:8086). You should see a screen like this:
You can take a look through the various modules included in the dashboard, though this article will primarily focus on those through which you can connect to the InfluxDB client. Start with the data module:
Here, you can observe different sections, like Sources, Buckets, Telegraf, Scrapers and Tokens. To interact with InfluxDB using R, you'll need to check the Buckets and Tokens sections. To connect with the database, you'll need to generate a private token (key) that is only accessible to you and that allows you to connect to different buckets.
To generate this token, navigate to the Tokens tab. On the right side, you will see a Generate Tokens button. This button offers two options:
Read/Write Token: This token provides read and write access to buckets, either scoped to specific buckets or granted for all available buckets. With this token, you can only read and write data in an organization.
All-Access Token: This token provides full access to actions like reading, writing, updating or deleting in every bucket. It is the recommended token here because it lets you connect to any available bucket and perform all of these actions without any explicit configuration.
For the purposes of this article, you'll want to generate an All-Access Token. Once the token is generated, you can access it anytime by simply logging into the localhost console.
Now that you have InfluxDB all set up, you can download R and RStudio for writing and testing the code. Installing R is pretty simple. You can download the package here, then open and install it. After the R installation, you can download RStudio, which will be the IDE that you use to write the R code. You can download RStudio here.
At this stage, you have almost all the tools and technologies needed to connect to InfluxDB. As the last step, you need to install the InfluxDB client library for R, which can be downloaded using the following line of code:
```
install.packages("influxdbclient")
```
If you install it on RStudio, other dependencies will be downloaded along with the base library. However, if dependencies are not automatically downloaded, you can separately download them using the following line of code:
```
install.packages(c("httr", "bit64", "nanotime", "plyr"))
```
The next step will be to import the InfluxDB client library in R and create an instance of InfluxDBClient that can be used to interact with the database and perform all of the operations. The parameters required to make a database connection are the URL of the InfluxDB instance, the access token and the organization name.
Since this connection will be made locally, the connection script should look like this:
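A minimal sketch of a local connection script, assuming the server from the setup above is listening on `http://localhost:8086`. The token and organization values here are placeholders rather than values from this article, so substitute the All-Access Token and organization name you created.

```r
library(influxdbclient)

# Placeholder values (assumptions): replace with your own token and organization
client <- InfluxDBClient$new(
  url   = "http://localhost:8086",
  token = "my-token",
  org   = "my-org"
)
```

Calling `client$health()` afterward is one way to check that the server is reachable before writing or querying any data.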
If you are using a cloud account, make sure the URL parameter matches the region your cloud account is located in, rather than using localhost. You can find the URL endpoints in the docs.
Now that you have established a connection to InfluxDB, it's time to use the data to perform different database operations. To understand these operations, let's take a look at some sample data of worldwide COVID-19 cases from January 2020 to April 2020:
This sample data contains the following fields:
To read the data frame in R, you will need to write the following line of code:
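As a rough illustration of that step, the sketch below reads a CSV file with base R's `read.csv()`. The file contents and the `Country` and `Confirmed` column names are made up for the example (only the `Date` column is named in this article), and a temporary file stands in for the real dataset so the snippet runs on its own.

```r
# Hypothetical stand-in for the article's CSV file: a few made-up rows
# written to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "Date,Country,Confirmed",
  "2020-01-22,Exampleland,1",
  "2020-01-23,Exampleland,3",
  "2020-01-24,Exampleland,7"
), csv_path)

# Read the file into a data frame; stringsAsFactors = FALSE keeps text
# columns as character vectors on older R versions as well.
data <- read.csv(csv_path, stringsAsFactors = FALSE)
str(data)
```

With a real dataset, only the `read.csv()` call is needed, pointed at the path of your own file.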
Let's start by inserting this data into InfluxDB. To do so, use the write() method, which accepts parameters like this:
```
client$write(data, bucket, precision, measurementCol,
tagCols, fieldCols, timeCol, object)
```
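Note: The above is simply the method signature, not code to run as-is.
This method takes the following parameters:
data: The data frame to write.
bucket: The name of the destination bucket.
precision: The time-stamp precision (ns, us, ms or s).
measurementCol: The name of the column that holds the measurement name.
tagCols: The names of the columns to write as tags.
fieldCols: The names of the columns to write as fields.
timeCol: The name of the column that holds the time stamp.
object: An output object name, used for dry runs and debugging.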
To store the COVID-19 data in InfluxDB using the write() method, you will need to make sure that your time-stamp column (Date) is in POSIXct format.
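One way to do that conversion with base R is sketched below. The data frame is a small made-up stand-in, and the `%Y-%m-%d` date format is an assumption about how the dates are stored, so adjust the format string to match your file.

```r
# Hypothetical sample rows; only the Date column name comes from the article
data <- data.frame(
  Date      = c("2020-01-22", "2020-01-23", "2020-01-24"),
  Confirmed = c(1L, 3L, 7L),
  stringsAsFactors = FALSE
)

# Parse the text dates into POSIXct, pinning the time zone to UTC so the
# stored time stamps do not depend on the local machine's settings
data$Date <- as.POSIXct(data$Date, format = "%Y-%m-%d", tz = "UTC")

class(data$Date)
```

If a date does not match the format string, `as.POSIXct()` returns `NA` for it, so checking `anyNA(data$Date)` before writing is a cheap safeguard.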
The response from the write() function can be NULL, TRUE or an error. To debug the write() function and check how the data is being written to the database, you can set the object parameter, for example to "lp".
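Putting the pieces together, a write call might look like the sketch below. It assumes a running local instance, and the token, organization, bucket name, sample rows and the `Measurement`, `Country` and `Confirmed` column names are all placeholders chosen for illustration, not values from this article.

```r
library(influxdbclient)

# Placeholder connection details (assumptions)
client <- InfluxDBClient$new(
  url   = "http://localhost:8086",
  token = "my-token",
  org   = "my-org"
)

# Made-up rows; the time column must already be POSIXct
data <- data.frame(
  Date        = as.POSIXct(c("2020-01-22", "2020-01-23"), tz = "UTC"),
  Country     = c("Exampleland", "Exampleland"),
  Confirmed   = c(1, 3),
  Measurement = "covid_cases",  # measurement name, repeated on every row
  stringsAsFactors = FALSE
)

# Map data frame columns to the measurement, tags, fields and time stamp
response <- client$write(
  data,
  bucket         = "my-bucket",
  precision      = "s",
  measurementCol = "Measurement",
  tagCols        = c("Country"),
  fieldCols      = c("Confirmed"),
  timeCol        = "Date"
)
```

Tags suit values you filter or group by, such as a country, while fields hold the measured numbers.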
Now that you have your time-stamped data stored in the database, let's try reading the data. For querying the data using the R client, the query() function is used, which expects a Flux query. For querying, you can use the same client that you created for writing the data, or you can create a new InfluxDB client and do the same.
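A minimal sketch of such a query, using the client's `query()` method and assuming a running local instance. The token, organization and bucket name are placeholders, and the time range is an assumption chosen to cover the January to April 2020 sample period.

```r
library(influxdbclient)

# Placeholder connection details (assumptions)
client <- InfluxDBClient$new(
  url   = "http://localhost:8086",
  token = "my-token",
  org   = "my-org"
)

# Flux: pick the bucket, restrict the time range, then drop the
# _start and _stop columns from the output
flux <- 'from(bucket: "my-bucket")
  |> range(start: 2020-01-01T00:00:00Z, stop: 2020-05-01T00:00:00Z)
  |> drop(columns: ["_start", "_stop"])'

result <- client$query(flux)
```

Since the result is a list of data frames, `result[[1]]` is a quick way to inspect the first one.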
Let's break down the above query. Starting with the keyword from, you'll need to first specify the bucket name, followed by the range of time from which you want to select the data, and finally, a set of conditions. In the above query, the condition specifies not to include the start and stop columns from the database.
The result contains a list of data frames for each entry made in the database for the specified period. To check an instance of it, you can use the following code:
Now that you have queried the data, let's use it for forecasting. Here, you will train a time-series model on the retrieved data and try to predict the next five days' cases. Let's create a dataframe from the results that you have after querying:
Once the dataframe is created, there are some changes that will be required to apply the time-series model on it. Typically, this stage is data preprocessing.
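The two steps above, building one dataframe and preprocessing it, can be sketched in base R as follows. The `result` list here is a made-up stand-in for the query output, and the `time` and `value` column names are assumptions for illustration, so adapt them to the columns your query returns.

```r
# Hypothetical stand-in for the query result: a list of small data frames
result <- list(
  data.frame(time = as.POSIXct("2020-01-24", tz = "UTC"), value = 7),
  data.frame(time = as.POSIXct("2020-01-22", tz = "UTC"), value = 1),
  data.frame(time = as.POSIXct("2020-01-23", tz = "UTC"), value = 3)
)

# Stack the list into a single data frame
df <- do.call(rbind, result)

# Preprocessing: sort chronologically and add a plain Date column
df <- df[order(df$time), ]
df$date <- as.Date(df$time, tz = "UTC")
rownames(df) <- NULL

print(df)
```

Sorting matters because a time-series object assumes its values are already in chronological order.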
After preprocessing, it's time to create a time-series representation of the data. This would be done using the following code:
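As one possible version of that step, the sketch below wraps a column of daily counts in a base R `ts` object. The dates and case counts are invented sample values, not data from this article.

```r
# Made-up daily counts standing in for the preprocessed query results
df <- data.frame(
  date  = seq(as.Date("2020-01-22"), by = "day", length.out = 14),
  cases = c(1, 3, 7, 12, 20, 33, 50, 71, 98, 130, 170, 215, 268, 330)
)

# One observation per time step; frequency = 1 means no seasonal cycle
cases_ts <- ts(df$cases, start = 1, frequency = 1)

print(cases_ts)
```

For daily data with a weekly pattern, `frequency = 7` would be the alternative choice. It is left at 1 here because the sample is too short to show any seasonality.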
Finally, let's fit the data into the forecasting model and make the predictions for the next five days:
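The model used in the original implementation is not shown here, so the sketch below substitutes a deliberately simple ARIMA(0, 2, 0) model from base R's stats package. It extrapolates the most recent day-over-day change and is meant only to show the fit-then-predict flow. The case counts are invented sample values.

```r
# Made-up daily counts standing in for the series built from InfluxDB
cases <- c(1, 3, 7, 12, 20, 33, 50, 71, 98, 130, 170, 215, 268, 330)
cases_ts <- ts(cases, start = 1, frequency = 1)

# Fit a twice-differenced model with no AR or MA terms (a stand-in choice)
fit <- arima(cases_ts, order = c(0, 2, 0))

# Predict the next five days; $pred holds the point forecasts,
# $se the corresponding standard errors
forecast_5 <- predict(fit, n.ahead = 5)

print(forecast_5$pred)
```

For real data, a model with estimated AR and MA terms, or automated order selection from a dedicated forecasting package, would usually be a better fit than this fixed order.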
That is how data can be accessed and used for time-series forecasting, which is just one practical use case for the time-stamp data. The whole implementation can be found here.
For more information and best practices for optimizing the performance of InfluxDB, refer to the docs.
After reading this article, you now know how to set up InfluxDB on your system, as well as how to create a client and write and read data for your time-series use case using the R language. One major advantage of InfluxDB is that it comes with support for almost all major programming languages.
There are several options for storing time-series data, but time-series databases like InfluxDB can do so more quickly and at a larger scale. Several use cases, such as IoT applications, automated cars or real-time application analysis, need to insert anywhere from tens of thousands to hundreds of thousands of entries at a time. Time-series databases perform this task at very high speed and in real time, which makes them easy for any developer working on a real-time time-series application to adopt. Be sure to consider deploying InfluxDB to use these features in your own applications.
The New Stack is a wholly owned subsidiary of Insight Partners, an investor in the following companies mentioned in this article: Real.
Visit link:
Getting Started with R and InfluxDB The New Stack - thenewstack.io