Ten Ways to Apply Machine Learning in Earth and Space Sciences – Eos

Machine learning (ML), loosely defined as the ability of computers to learn from data without being explicitly programmed, has become tremendously popular in technical disciplines over the past decade or so, with applications including complex game playing and image recognition carried out with superhuman capabilities. The Earth and space sciences (ESS) community has also increasingly adopted ML approaches to help tackle pressing questions and unwieldy data sets. From 2009 to 2019, for example, the number of studies involving ML published in AGU journals approximately doubled.

In many ways, ESS present ideal use cases for ML applications because the problems being addressed – like climate change, weather forecasting, and natural hazards assessment – are globally important; the data are often freely available, voluminous, and of high quality; and computational resources required to develop ML models are steadily becoming more affordable. Free computational languages and ML code libraries are also now available (e.g., scikit-learn, PyTorch, and TensorFlow), contributing to making entry barriers lower than ever. Nevertheless, our experience has been that many young scientists and students interested in applying ML techniques to ESS data do not have a clear sense of how to do so.

An ML algorithm can be thought of broadly as a mathematical function containing many free parameters (thousands or even millions) that takes inputs (features) and maps those features into one or more outputs (targets). The process of training an ML algorithm involves optimizing the free parameters to map the features to the targets accurately.
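The training process described above can be made concrete with a minimal sketch. This toy example (synthetic data and a simple linear model, chosen purely for illustration) shows a "model" as a function with free parameters w, optimized by gradient descent so that features map accurately onto targets:

```python
import numpy as np

# A "model" is just a parameterized function mapping features X to targets y.
# Training means adjusting the free parameters w to reduce prediction error.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))           # 200 samples, 3 input features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                          # targets produced by a known mapping

w = np.zeros(3)                         # free parameters, initialized at zero
lr = 0.1
for _ in range(500):                    # gradient descent on squared error
    grad = X.T @ (X @ w - y) / len(y)
    w -= lr * grad

print(np.round(w, 2))                   # recovers approximately [ 2. -1.  0.5]
```

Real ESS models have thousands or millions of parameters rather than three, but the optimization loop is conceptually the same.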

There are two broad categories of ML algorithms relevant in most ESS applications: supervised and unsupervised learning (a third category, reinforcement learning, is used infrequently in ESS). Supervised learning, which involves presenting an ML algorithm with many examples of input-output pairs (called the training set), can be further divided, according to the type of target that is being learned, as either categorical (classification; e.g., does a given image show a star cluster or not?) or continuous (regression; e.g., what is the temperature at a given location on Earth?). In unsupervised learning, algorithms are not given a particular target to predict; rather, an algorithm's task is to learn the natural structure in a data set without being told what that structure is.
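The distinction between the two categories can be sketched in a few lines of scikit-learn on synthetic data (no real ESS data set is assumed here): a supervised classifier is given labels, while a clustering algorithm must discover the groups on its own:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
# Two synthetic "populations", e.g. two classes of observed signals.
a = rng.normal(loc=-2.0, size=(100, 2))
b = rng.normal(loc=+2.0, size=(100, 2))
X = np.vstack([a, b])

# Supervised classification: output labels (targets) are provided.
labels = np.array([0] * 100 + [1] * 100)
clf = LogisticRegression().fit(X, labels)
print(clf.score(X, labels))             # near-perfect on well-separated data

# Unsupervised clustering: no labels; the algorithm finds the two groups.
groups = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

In the supervised case the labels come from a curated training set; in the unsupervised case the recovered groups still have to be interpreted by the scientist.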

Supervised learning is more commonly used in ESS, although it has the disadvantage that it requires labeled data sets (in which each training input sample must be tagged, or labeled, with a corresponding output target), which are not always available. Unsupervised learning, on the other hand, may find multiple structures in a data set, which can reveal unanticipated patterns and relationships, but it may not always be clear which structures or patterns are correct (i.e., which represent genuine physical phenomena).

Books and classes about ML often present a range of algorithms that fall into one of the above categories but leave people to imagine specific applications of these algorithms on their own. However, in practice, it is usually not obvious how such approaches (some seemingly simple) may be applied in a rich variety of ways, which can create an imposing obstacle for scientists new to ML. Below we briefly describe various themes and ways in which ML is currently applied to ESS data sets (Figure 1), with the hope that this list – necessarily incomplete and biased by our personal experience – inspires readers to apply ML in their research and catalyzes new and creative use cases.

One of the simplest and most powerful applications of ML algorithms is pattern identification, which works particularly well with very large data sets that cannot be traversed manually and in which signals of interest are faint or highly dimensional. Researchers, for example, applied ML in this way to detect signatures of Earth-sized exoplanets in noisy data making up millions of light curves observed by the Kepler space telescope. Detected signals can be further split into groups through clustering, an unsupervised form of ML, to identify natural structure in a data set.

Conversely, atypical signals may be teased out of data by first identifying and excluding typical signals, a process called anomaly or outlier detection. This technique is useful, for example, in searching for signatures of new physics in particle collider experiments.
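A toy sketch of this anomaly detection idea, using scikit-learn's IsolationForest on synthetic points (the data and the two injected outliers are assumptions for illustration, not measurements from any real experiment):

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(2)
typical = rng.normal(size=(300, 2))              # bulk of ordinary signals
outliers = np.array([[8.0, 8.0], [-9.0, 7.0]])   # two atypical events
X = np.vstack([typical, outliers])

# Model the "typical" signals, then flag points that do not fit.
detector = IsolationForest(random_state=0).fit(X)
flags = detector.predict(X)                      # -1 marks anomalies
print(np.where(flags == -1)[0])                  # indices 300, 301 should appear
```

Other estimators (one-class SVMs, autoencoders, simple robust statistics) implement the same excluding-the-typical logic with different trade-offs.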

An important and widespread application of supervised ML is the prediction of time series data from instruments or from an index (or average value) that is intended to encapsulate the behavior of a large-scale system. Approaches to this application often involve using past data in the time series itself to predict future values; they also commonly involve additional inputs that act as drivers of the quantities measured in the time series. A typical example of ML applied to time series in ESS is its use in local weather prediction, with which trends in observed air temperature and pressure data, along with other quantities, can be predicted.
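One standard way to frame such a forecasting problem is to turn lagged past values of the series into supervised features, as in this generic sketch on a synthetic periodic signal (a stand-in, not a real weather record):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

t = np.arange(400)
series = np.sin(2 * np.pi * t / 50)     # a smooth periodic "observable"

# Build features from the last 3 values; the target is the next value.
lags = 3
X = np.column_stack([series[i:len(series) - lags + i] for i in range(lags)])
y = series[lags:]

model = LinearRegression().fit(X[:300], y[:300])
pred = model.predict(X[300:])
print(np.max(np.abs(pred - y[300:])))   # small error on held-out data
```

In practice, the feature matrix would also include the external drivers mentioned above (e.g., pressure when forecasting temperature), appended as extra columns.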

In many instances, however, predicting a single time series of data is insufficient, and knowledge of the temporal evolution of a physical system over regional (or global) spatial scales is required. This spatiotemporal approach is used, for example, in attempts to predict weather across the entire globe as a function of time and 3D space in high-capacity models such as deep neural networks.

Traditional, physics-based simulations (e.g., global climate models) are often used to model complex systems, but such models can take days or weeks to run on even the most powerful computers, limiting their utility in practice. An alternate solution is to train ML models to act as emulators for physics-based models or to replicate computationally intensive portions within such models. For example, global climate models that run on a coarse grid (e.g., 50- to 100-kilometer resolution) can include subgrid processes, like convection, modeled using ML-based parameterizations. Results with these approaches are often indistinguishable from those produced by the original model alone but can run millions or billions of times faster.
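The emulator workflow can be sketched in miniature: run the expensive model a limited number of times, then train a fast surrogate on the resulting input-output pairs. The "physics model" below is a cheap stand-in function assumed purely for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def slow_physics_model(x):              # stand-in; pretend each run takes hours
    return np.sin(x[:, 0]) * np.exp(-x[:, 1] ** 2)

rng = np.random.default_rng(3)
X_train = rng.uniform(-2, 2, size=(2000, 2))   # precomputed simulation runs
y_train = slow_physics_model(X_train)

# The emulator learns the input-output mapping of the simulation.
emulator = RandomForestRegressor(n_estimators=100, random_state=0)
emulator.fit(X_train, y_train)

X_new = rng.uniform(-2, 2, size=(200, 2))
error = np.mean(np.abs(emulator.predict(X_new) - slow_physics_model(X_new)))
print(error)                            # small relative to the output range
```

Once trained, the emulator answers new queries in microseconds, which is where the enormous speedups quoted above come from.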

Many physics-based simulations proceed by integrating a set of partial differential equations (PDEs) that rely on time-varying boundary conditions and other conditions that drive interior parts of the simulation. The physics-based model then propagates information from these boundary and driver conditions into the simulation space – imagine, for example, a 3D cube being heated at its boundary faces with time-varying heating rates or with thermal conductivity that varies spatiotemporally within the cube. ML models can be trained to reflect the time-varying parameterizations both within and along the simulation boundaries of a physical model, which again may be computationally cheaper and faster.

If a spatiotemporal ML model of a physical system can be trained to produce accurate results under a variety of input conditions, then the implication is that the model implicitly accounts for all the physical processes that drive that system, and thus, it can be probed to gain insights into how the system works. Certain algorithms (e.g., random forests) can automatically provide a ranking of feature importance, giving the user a sense of which input parameters affect the output most and hence an intuition about how the system works.
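A short sketch of the feature-ranking idea: fit a random forest on synthetic data in which only the first input actually drives the target, then read off the learned importances (the data-generating rule is an assumption for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
X = rng.normal(size=(500, 4))                    # 4 candidate drivers
y = 3.0 * X[:, 0] + 0.1 * rng.normal(size=500)   # only feature 0 matters

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
print(np.round(forest.feature_importances_, 2))  # feature 0 ranks highest
```

In an ESS setting the columns of X would be physical drivers (e.g., solar wind speed, sea surface temperature), and the ranking offers a first, coarse intuition about which ones dominate.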

More sophisticated techniques, such as layerwise relevance propagation, can provide deeper insights into how different features interact to produce a given output at a particular location and time. For example, a neural network trained to predict the evolution of the El Niño–Southern Oscillation (ENSO), which is predominantly associated with changes in sea surface temperature in the equatorial Pacific Ocean, revealed that precursor conditions for ENSO events occur in the South Pacific and Indian Oceans.

A ubiquitous challenge in ESS is to invert observations of a physical entity or process into fundamental information about the entity or the causes of the process (e.g., interpreting seismic data to determine rock properties). Historically, inverse problems are solved in a Bayesian framework requiring multiple runs of a forward model, which can be computationally expensive and often inaccurate. ML offers alternative methods to approach inverse problems, either by using emulators to speed up forward models or by using physics-informed machine learning to discover hidden physical quantities directly. ML models trained on prerun physics-based model outputs can be used for rapid inversion.

Satellite observations often provide global, albeit low-resolution and sometimes indirect (i.e., proxy-based), measurements of quantities of interest, whereas local measurements provide more accurate and direct observations of those quantities at smaller scales. A popular and powerful use for ML models is to estimate the relationship between global proxy satellite observations and local accurate observations, which enables the creation of estimated global observations on the basis of localized measurements. This approach often includes the use of ML to create superresolution images and other data products.
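A hedged sketch of this calibration idea: learn the relationship between a biased, noisy "satellite proxy" and accurate local measurements where the two are co-located, then apply it everywhere the proxy exists. All quantities below are synthetic stand-ins, not real observations:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(6)
truth = rng.uniform(10, 30, size=1000)              # true local quantity
proxy = 0.8 * truth + 2.0 + rng.normal(size=1000)   # biased, noisy proxy

# Train where co-located accurate measurements exist (first 200 "sites").
model = GradientBoostingRegressor(random_state=0)
model.fit(proxy[:200].reshape(-1, 1), truth[:200])

# Estimate the local quantity everywhere else from the proxy alone.
estimate = model.predict(proxy[200:].reshape(-1, 1))
print(np.mean(np.abs(estimate - truth[200:])))      # roughly noise-limited
```

Superresolution approaches extend the same idea from scalar calibration to whole images, learning a mapping from coarse pixels to fine ones.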

Typically, uncertainty in model outputs is quantified using a single metric such as the root-mean-square of the residual (the difference between model predictions and observations). ML models can be trained to explicitly predict the confidence interval, or inherent uncertainty, of this residual value, which not only serves to indicate conditions under which model predictions are trustworthy (or dubious) but can also be used to generate insights about model performance. For instance, if there is a large error at a certain location in a model output under specific conditions, it could suggest that a particular physical process is not being properly represented in the simulation.
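One simple realization of this idea (a sketch under assumed synthetic data, not the only approach): fit a primary model, measure its errors on held-out data, then train a second model to predict the size of those errors as a function of the input conditions:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(2000, 1))
noise = rng.normal(size=2000) * (0.05 + 0.5 * X[:, 0])  # noise grows with x
y = np.sin(3 * X[:, 0]) + noise

# Fit the primary model on one half; measure residuals on the other half.
primary = RandomForestRegressor(n_estimators=50, random_state=0)
primary.fit(X[:1000], y[:1000])
residual = np.abs(y[1000:] - primary.predict(X[1000:]))

# Train a second model to predict how large the primary model's error is.
error_model = RandomForestRegressor(n_estimators=50, random_state=0)
error_model.fit(X[1000:], residual)

lo = error_model.predict(np.array([[0.1]]))[0]  # low-noise conditions
hi = error_model.predict(np.array([[0.9]]))[0]  # high-noise conditions
print(lo < hi)                                  # expect True
```

The error model flags exactly the kind of condition-dependent breakdown described above: regions where its predicted residual is large are regions where the primary model (or the underlying physics representation) deserves scrutiny.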

Domain experts analyzing data from a given system, even in relatively small quantities, are often able to extrapolate the behavior of the system – at least conceptually – because of their understanding of and trained intuition about the system based on physical principles. In a similar way, laws and relationships that govern physical processes and conserved quantities can be explicitly encoded into neural network algorithms, resulting in more accurate and physically meaningful models that require less training data.

In certain applications, the values of terms or coefficients in PDEs that drive a systemand thus that should be represented in a modelare not known. Various ML algorithms were developed recently that automatically determine PDEs that are consistent with the available physical observations, affording a new and powerful discovery tool.

In still newer work, ML methods are being developed to directly solve PDEs. These methods offer accuracy comparable to traditional numerical integrators but can be dramatically faster, potentially allowing large-scale simulations of complex sets of PDEs that have otherwise been unattainable.

The Earth and space sciences are poised for a revolution centered around the application of existing and rapidly emerging ML techniques to large and complex ESS data sets being collected. These techniques have great potential to help scientists address some of the most urgent challenges and questions about the natural world facing us today. We hope the above list sparks creative and valuable new applications of ML, particularly among students and young scientists, and that it becomes a community resource to which the ESS community can add more ideas.

We thank the AGU Nonlinear Geophysics section for promoting interdisciplinary, data-driven research, for supporting the idea of writing this article, and for suggesting Eos as the ideal venue for dissemination. The authors gratefully acknowledge the following sources of support: J.B. from subgrant 1559841 to the University of California, Los Angeles, from the University of Colorado Boulder under NASA Prime Grant agreement 80NSSC20K1580, the Defense Advanced Research Projects Agency under U.S. Department of the Interior award D19AC00009, and NASA/SWO2R grant 80NSSC19K0239 and E.C. from NASA grants 80NSSC20K1580 and 80NSSC20K1275. Some of the ideas discussed in this paper originated during the 2019 Machine Learning in Heliophysics conference.

Jacob Bortnik ([emailprotected]), University of California, Los Angeles; and Enrico Camporeale, Space Weather Prediction Center, NOAA, Boulder, Colo.; also at Cooperative Institute for Research in Environmental Sciences, University of Colorado Boulder


The home of The Spot 518 – Real-Local-News – Spotlight News

TROY – Artificial intelligence and machine learning are revolutionizing the ways in which we live, work, and spend our free time, from the smart devices in our homes to the tasks our phones can carry out. This transformation is being made possible by a surge in data and computing power that can help machine learning algorithms not only perform device-specific tasks, but also help them gain intelligence or knowledge over time.

In the not-so-distant future, artificial intelligence and machine learning tasks will be carried out among connected devices through wireless networks, dramatically enhancing the capabilities of future smartphones, tablets, and sensors, and achieving what's known as distributed intelligence. As technology stands right now, however, machine learning algorithms are not efficient enough to be run over wireless networks, and wireless networks are not yet ready to transmit this type of intelligence.

With the support of a National Science Foundation Faculty Early Career Development Program grant, Tianyi Chen, an assistant professor of electrical, computer, and systems engineering at Rensselaer Polytechnic Institute and member of the Rensselaer-IBM Artificial Intelligence Research Collaboration (AIRC), is exploring how to make such knowledge-sharing tools a reality.

"I think in the future, the main terminal of intelligence will be our phones. Our phones will be able to control our computers, our cars, our meeting rooms, our apartments," Chen said. "This will be powered by resource-efficient machine learning algorithms and also the support of future wireless networks."

Through his collaboration with the Lighting Enabled Systems and Applications Center at Rensselaer, Chen will validate the algorithms he develops using the center's smart conference room.

The conference room is equipped with devices that are capable of sensing the environment, processing that information, and efficiently sharing it with other devices on the network – the same framework the algorithms are being designed to function within.

"We need to redesign our wireless networks to support not only traditional traffic, like video and voice, but to support new traffic such as transmittable intelligence," Chen said. "We need to design more efficient learning algorithms that are suitable for running on the wireless network."

Chen also stressed the importance of ensuring that knowledge-sharing algorithms only extract anonymized information in order to maintain data privacy as our devices and daily lives become increasingly networked. While the goals of this research are foundational in nature, Chen said the potential for future applications is wide-ranging, from power grids to urban transportation systems.

Founded in 1824, Rensselaer Polytechnic Institute is America's first technological research university. Rensselaer encompasses five schools, 32 research centers, more than 145 academic programs, and a dynamic community made up of more than 7,600 students and more than 100,000 living alumni. Rensselaer faculty and alumni include more than 145 National Academy members, six members of the National Inventors Hall of Fame, six National Medal of Technology winners, five National Medal of Science winners, and a Nobel Prize winner in Physics. With nearly 200 years of experience advancing scientific and technological knowledge, Rensselaer remains focused on addressing global challenges with a spirit of ingenuity and collaboration.

This feature was originally published on the Rensselaer website.


Which Industries are Hiring AI and Machine Learning Roles? – Dice Insights

Companies everywhere are pouring resources into artificial intelligence (A.I.) and machine learning (ML) initiatives. Many technologists believe that apps smartened with A.I. and ML tools will eventually offer better customer personalization; managers hope that A.I. will lead to better data analysis, which in turn will power better business strategies.

But which industries are actually hiring A.I. specialists? If you answer that question, it might give you a better idea of where those resources are being deployed. Fortunately, CompTIA's latest Tech Jobs Report offers a breakdown of A.I. hiring, using data from Burning Glass, which collects and analyzes millions of job postings from across the country. Check it out:

Perhaps it's no surprise that manufacturing tops this list; after all, manufacturers have been steadily automating their production processes for years, and it stands to reason that they would turn to A.I. and ML to streamline things even more. In theory, A.I. will also help manufacturers do everything – from reducing downtime to improving supply chains – although it may take some time to get the models right.

The presence of healthcare, banking, and public administration likewise seems logical. "These three industries have the money to invest in A.I. and ML right now and have the greatest opportunity to see the investment pay off, fast," Gus Walker, director of product at Veritone, an A.I. tech company based in Costa Mesa, California, told Dice late last year. "That being said, the pandemic has caused industries hit the hardest to take a step back and look at how they can leverage AI and ML to rebuild or adjust in the new normal."

Compared to overall tech hiring, the number of A.I.-related job postings is still relatively small. Right now, mastering and deploying A.I. and machine learning is something of a specialist industry; but as these technologies become more commodified, and companies develop tools that allow more employees to integrate A.I. and ML into their projects, the number of job postings for A.I. and ML positions could increase over the next several years. Indeed, one IDC report from 2020 found three-quarters of commercial enterprise applications could lean on A.I. in some way by 2021.

It's also worth examining where all that A.I. hiring is taking place; it's interesting that Washington DC tops this particular list, with New York City a close second; Silicon Valley and Seattle, the nation's other big tech hubs, are somewhat further behind, at least for the moment. Washington DC is notable not only for federal government hiring, but the growing presence of companies such as Amazon that hunger for talent skilled in artificial intelligence:

Jobs that leverage artificial intelligence are potentially lucrative, with a current median salary (according to Burning Glass) of $105,000. It's also a skill set that more technologists may need to become familiar with, especially managers and executives. "A.I. is not going to replace managers, but managers that use A.I. will replace those that do not," Rob Thomas, senior vice president of IBM's cloud and data platform, recently told CNBC. If you mention A.I. or ML on your resume and applications, make sure you know your stuff before the job interview; chances are good you'll be tested on it.



Battle of the buzzwords: AIOps vs. MLOps square up – TechTarget

AIOps and MLOps are terms that might appear to have a similar meaning, given that the acronyms on which they are based -- AI and ML -- are often used in similar contexts. However, AIOps and MLOps mean radically different things.

A team or company might use both AIOps and MLOps at the same time but not for the same purposes. Let's dig into what each is individually and then whether they can be used together.

AIOps, which stands for artificial intelligence for IT operations, is the use of AI to help perform IT operations work.

For example, a team that uses AIOps might use AI to analyze the alerts generated by its monitoring tools and then prioritize the alerts so that the team knows which ones to focus on. Or an AIOps tool could automatically find and fix an application that has crashed, using AI to determine the cause of the problem and the proper remediation.

Short for machine learning operations, MLOps is a technique that helps organizations optimize their use of machine learning and AI tools.

The core idea behind MLOps is that the stakeholders involved in making decisions about machine learning and AI are typically siloed from each other. Data scientists know how AI and machine learning algorithms work. But they don't usually collaborate closely with IT engineers, who are responsible for deploying AI and machine learning tools, or with compliance officers, who manage security and regulatory aspects of machine learning and AI use.

Put another way, MLOps is like DevOps in that it seeks to break down the silos that separate different types of teams. But, whereas DevOps is all about encouraging collaboration between developers and IT operations teams, MLOps focuses on collaboration between everyone who plays a role in choosing or managing machine learning and AI resources.

It's tempting to assume that AIOps and MLOps basically mean the same thing, given that AI and machine learning mean similar -- albeit not identical -- things.

But, in fact, the terms are not closely related at all. You could argue that a healthy MLOps practice would help organizations choose and deploy AIOps tools, but that's only one possible goal of MLOps. Beyond that, AIOps and MLOps don't intersect.

This is a sign that the tech community has overused the -Ops construction. When you can take any noun, add the -Ops suffix and invent a new buzzword -- without logical consistency to unite it with similarly formed buzzwords -- it might be time to move on to new techniques for labeling buzzwords.


Machine Learning Algorithm Trained on Images of Everyday Items Detects COVID-19 in Chest X-Rays with 99% Accuracy – HospiMedica

New research using machine learning on images of everyday items is improving the accuracy and speed of detecting respiratory diseases, reducing the need for specialist medical expertise.

In a study by researchers at Edith Cowan University (Perth, Australia), the results of this technique, known as transfer learning, achieved a 99.24% success rate when detecting COVID-19 in chest X-rays. The study tackles one of the biggest challenges in machine learning for image recognition: algorithms need huge quantities of data, in this case images, to be able to recognize certain attributes accurately.

According to the researchers, this was incredibly useful for identifying and diagnosing emerging or uncommon medical conditions. The key to significantly decreasing the time needed to adapt the approach to other medical issues was pre-training the algorithm with the large ImageNet database. The researchers hope that the technique can be further refined in future research to increase accuracy and further reduce training time.

"Our technique has the capacity to not only detect COVID-19 in chest X-rays, but also other chest diseases such as pneumonia. We have tested it on 10 different chest diseases, achieving highly accurate results," said ECU School of Science researcher Dr. Shams Islam. "Normally, it is difficult for AI-based methods to perform detection of chest diseases accurately because the AI models need a very large amount of training data to understand the characteristic signatures of the diseases. The data needs to be carefully annotated by medical experts; this is not only a cumbersome process, it also entails a significant cost. Our method bypasses this requirement and learns accurate models with a very limited amount of annotated data. While this technique is unlikely to replace the rapid COVID-19 tests we use now, there are important implications for the use of image recognition in other medical diagnoses."
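The study fine-tuned deep networks pretrained on the ImageNet database. As a loose, small-scale analogy of the same transfer learning principle (an assumption for illustration, not the authors' pipeline), one can learn a representation from plentiful unlabeled images and then train a classifier on only a small labeled set:

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)           # 1,797 small images of digits
X_unlabeled = X[:1500]                        # abundant "pretraining" images
X_small, y_small = X[1500:1600], y[1500:1600] # only 100 labeled examples
X_test, y_test = X[1600:], y[1600:]

# "Pretraining": learn a compact representation from unlabeled data.
extractor = PCA(n_components=30).fit(X_unlabeled)

# "Transfer": train a simple classifier on the small labeled set.
clf = LogisticRegression(max_iter=2000)
clf.fit(extractor.transform(X_small), y_small)
acc = clf.score(extractor.transform(X_test), y_test)
print(acc)                                    # decent accuracy despite few labels
```

The deep-learning version replaces PCA with convolutional features learned on millions of ImageNet photos, which is what lets the medical classifier succeed with a very limited amount of annotated data.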

Related Links: Edith Cowan University


Ron DeSantis' Deplatforming Bill Is Deplatformed and Everyone Wins – But Also Loses – Mother Jones


Last month, Florida Republican Gov. Ron DeSantis signed into law a measure to radically overhaul how social media companies operate in his state. Under its provisions, sites like Twitter and Facebook would be prohibited from banning from their platforms elected officials who violated the sites' terms of service. The pretext of the legislation, which included a hilarious but sort of incongruous exemption for Disney+, was obvious – it was a response to former President Donald Trump's banishment from Twitter and Facebook for cheering on an insurrection. Equally obvious was the measure's unconstitutionality. A governor cannot dictate a private company's speech; there is no constitutional right to post.

And sure enough, after a challenge from tech companies, on Wednesday a federal judge in Florida issued a preliminary injunction against the law, finding that it would likely violate the First Amendment.

DeSantis vowed to appeal, of course, and so this cycle will likely just repeat itself again a few months down the line. But the whole episode is clarifying. Earlier this year I wrote about the outsized place that content creation has taken in conservative politics. Much like a child repeating a curse word because they heard it from their parents, when a new generation of conservatives treats shitposting as the end-point of politics, you know they learned it from Trump. The ex-president often substituted the performance of governing for the real thing – look no further than the daily coronavirus briefings last year – and held elaborate signing ceremonies for what were essentially press releases. Everything was a product, packaged for consumption via an increasingly online conservative media:

Big Tech and cancel culture have emerged as key villains for the new right, not just because of how neatly they fit into long-standing tropes about cosmopolitan elites, but because so much of modern conservatism lives online. Offline, there are issues that warrant serious attention from one of the nation's two governing parties – cities without water, cities soon to be underwater, whole states without power, and a world still suffering from a deadly virus. But with a nudge from Trump, the right has become ever more dissociated from reality, channeling its energy into an endless series of fights over deplatforming and who's triggering whom. During the Obama years, a Breitbart provocateur interrupted a White House press conference to complain about losing his Twitter verification badge. Then, it was a sideshow; now, it's the whole point.

The Florida law was a natural product of this ecosystem. Which is why, while a major piece of legislation getting gutted by the courts would be damaging for a policy-centered Democratic administration, Wednesday's injunction is sort of the best of both worlds for DeSantis. It frees him to continue railing against the evils of Big Tech (now clearly in league with unaccountable liberal judges!) without ever having to implement the law itself. Just throw in some critical race theory (which Florida, at DeSantis' urging, also banned) and this whole fight would contain the entirety of Biden-era conservative thought. The whole party is a television show now.


Deplatforming Bills are Dying, Proving to be Political Stunts and not Solutions – Reform Austin

The legislative session showcased GOP priorities, such as siding with Trump after he was censored for incitement of violence when a mob of his supporters assaulted the U.S. Capitol.

SB 12, authored by Republican state Sen. Bryan Hughes, died during the regular session but is expected to be brought back by Abbott to his special session. This bill would ultimately prohibit online platforms from censoring expressed views by social media users.

Republicans have been claiming for years that big tech companies and social media platforms have been intolerant of their conservative views, pushing for bills to avoid any type of censorship whatsoever.

But the courts do not seem to consider this a solution, as a federal judge on Wednesday granted a preliminary injunction against Florida's new social media law in this regard.

"Balancing the exchange of ideas among private speakers is not a legitimate governmental interest," said District Court Judge Robert Hinkle.

NetChoice's president, Steve DelBianco, supported the judge's ruling, saying that the order allowed the platforms to keep their users safe from the worst content posted by irresponsible users.

On deplatforming bills, Antigone Davis, global head of safety at California-headquartered Facebook Inc., said they will continue advocating for internet rules that protect free expression while allowing platforms like Facebook to remove harmful content.

"This type of legislation would make children and other vulnerable communities less safe by making it harder for us to remove content like pornography, hate speech, bullying, self-harm images and sexualized photos of minors," said Davis on SB 12.

Governor Abbott, however, believes it is a move to target Republicans and censor their personal views.

"There is a dangerous movement that is spreading across the country to try to silence conservative ideas, religious beliefs," Abbott said.

And as Republicans insist they are being silenced and social media companies keep insisting they only apply regulatory measures to speech that incites violence or illegal acts, experts like Kate Huddleston, from the ACLU of Texas, say SB 12 is a waste of time and resources.

"Federal law already prohibits holding social media platforms liable as the publisher of content provided by others," Huddleston said.


Wizards of OSS: Industry perspectives on open source software – VentureBeat


Open source software (OSS) is so prevalent that it's difficult to imagine life without it. For businesses, open source brings scalability, transparency, cost savings, and the power of the crowd.

To get an idea of the pervasiveness of open source software, commercial or otherwise, just consider WordPress. The brand synonymous with content management systems (CMS) spans two broad incarnations: the self-hosted open source version available through WordPress.org, and a hosted version called WordPress.com that's operated by Automattic. Collectively, they now power more than 40% of all websites.

Similarly, just about everyone is familiar with Android, the open source mobile operating system (OS) that claims a global market share of 84%. The lion's share of this belongs to Google's flavor of Android, which includes an ecosystem of services and proprietary applications that make Google a lot of money. The core Android Open Source Project (AOSP), however, has been forked several times, perhaps most notably (in the West, at least) by Amazon to create Fire OS, which powers most of its tablets and TV streaming devices. Android is also the most prominent mobile operating system in China, though local handset makers have created their own forks sans Google.

Android is actually based on a modified version of the Linux kernel, arguably one of the biggest success stories to emerge from the open source world. Linux is now used in everything from automobiles to air traffic control and medical devices and is also widely employed in web servers, the most common being Apache.

In fact, the growth of the web over the past 30 years has been fueled in large part by open source software. So what would a world without open source look like?

"Everything from operating systems, databases, web servers, programming languages, and developer tools wouldn't be possible without open source," said Martin Traverso, a former Facebook engineer and cocreator of the distributed SQL query engine Presto. "There would likely be fewer developers in the world because not all developers have the luxury of being part of a certain company; there's a lot of innovation that happens outside of companies like Google, Microsoft, and Facebook."

In other words, self-taught or indie developers would have less incentive and opportunity to gain a foothold in software development if everything were locked behind a proprietary door.

Traverso joined Facebook in 2012 and alongside two colleagues developed Presto to help analysts and data scientists run faster queries on large amounts of data. Facebook open-sourced Presto a year later, and in 2019 Traverso and his cofounders left Facebook to launch a fork of the original Presto project, called PrestoSQL, as part of the newly formed Presto Software Foundation. In December, PrestoSQL was rebranded as Trino, and the Presto Software Foundation was renamed the Trino Software Foundation.


In 2019, Traverso also cofounded Starburst Data, a company that targets enterprises with a commercial version of Trino and that raised $100 million at a $1.2 billion valuation in January.

For perspective on the impact Presto (the original project) and Trino have had, Amazon's AWS uses them as part of the company's Athena interactive query service, and they are also used by Uber, Airbnb, Intel, Twitter, Netflix, Atlassian, and Alibaba. Starburst, meanwhile, claims notable commercial clients like Comcast and VMware. None of this would have been possible without open source.

"Open source has cultivated a community of innovation that wouldn't otherwise exist," Traverso said. "Anything that contains software today depends on open source: your TV, phone, car, and so on. There's huge leverage across the industry, and without all those open source components, everyone would have to either build them themselves or buy them."

This helps illustrate what open source software means to businesses of all sizes. It really isn't just free software aimed at cash-strapped startups. Instead, it serves as the fundamental building blocks of most of the technologies we use on a daily basis, something even the major technology companies rely on, and its main benefit can be measured in eyeballs and people power.

"Open source software is constantly improving because it is updated regularly to meet the needs of a diverse group of users, resulting in technology offerings that are more powerful and broadly applicable than just a single company and a single use case," Traverso said. "While a big company might have the resources to develop these technologies from scratch, it wouldn't have the same diverse and growing body of contributors continuously iterating and making the technology better."

Indeed, even a trillion-dollar company wouldn't be prepared to develop everything from scratch internally, as that would mean going back to square one on programming languages, operating systems, databases, web servers, and more.

"Using open source software allows these companies to dedicate those resources to more business-critical projects," Traverso added.

But despite all the benefits of open source software, it comes with some notable hurdles. These include a lack of proper project documentation for establishing whether it's safe to use a specific piece of software.

"The biggest challenge is determining whether your use of open source is compatible with the security, legal, privacy, and integrity requirements of your business," Facebook open source product manager Michael Cheng said. "It's sometimes challenging to determine where open source packages originate. Without knowing who created the software, it may be difficult to determine whether you can or should use it in your business."

It's also worth looking at how well supported a project is; after all, many open source developers work entirely on their own dime in their spare time. A recent Synopsys report showed that 91% of codebases contained open source dependencies with zero development activity in the past two years, a three-percentage-point increase on the previous year. This should be a red flag for any company, as it could mean major vulnerabilities.

However, when that technology becomes critical to everyday products, industries and companies often collaborate to support a project that might otherwise have fallen by the wayside. This is why the Linux Foundation set up the Core Infrastructure Initiative (CII) with backing from tech titans like Google, Amazon, Cisco, Microsoft, Intel, and Facebook. Just a few months ago, Google announced it would start funding developers to work on the Linux kernel, on which Android is based.

If nothing else, the situation highlights some of the challenges businesses face when choosing their open source technology stack. "Companies should be asking themselves if they have the expertise and the resources to build the technology in-house," Traverso said. "If not, they should look for projects with thriving communities or vendor support."

Oskari Saarenmaa is cofounder and CEO of Aiven, a Finnish company that manages businesses' open source data infrastructure on all the major clouds, freeing developers up to focus on building applications.

Aiven provides commercial support, such as security and maintenance, for nine core open source projects: MySQL, Elasticsearch, Apache Kafka, M3, Redis, InfluxDB, Apache Cassandra, PostgreSQL, and Grafana. The Helsinki-based startup, which raised $100 million at an $800 million valuation back in March, works with such big-name companies as Comcast, Atlassian, and Toyota.

[Image: Aiven console]

According to Saarenmaa, if a company picks its open source technologies carefully, there are no obvious downsides, but he warned against relying too much on contributions from a narrow community of users. "With open source, there's no obvious vendor you can demand or push to implement such functionality," he said. "On the other hand, as the code is open, you always have the opportunity to contribute the required changes for everyone's benefit."

It's worth noting that Aiven is one of the companies that joined the Amazon-led OpenSearch project, a fork that came to be after Elastic switched Elasticsearch to the more restrictive Server Side Public License (SSPL), which prevented cloud service providers (such as Amazon's AWS) from offering Elasticsearch as a service.

Put simply, licensing is a perennial concern for open source developers across the spectrum.

"Most open source projects nowadays use a pretty narrow set of licenses, but there are some commercial open source companies that muddy the waters between open and proprietary licenses, so it's important to make sure you don't start building on top of something that limits your future business opportunities," Saarenmaa explained.

"When it comes to starting to build something new directly on top of open source technologies, it's important to understand what exactly the role of this technology is, how it's licensed, and how it's supported," Saarenmaa continued. "If it's a critical piece of technology, you should look to use popular open source technologies that are developed by a wider community of contributors; in case one contributor or company steps away, there are others who can step in."

There are numerous recent examples of bait-and-switch activity, in which a company that built itself on an open source ethos changes the terms of engagement further down the road. MongoDB, for example, created the SSPL back in 2018 to enforce the exact same types of restrictions Elastic pursued: essentially, stopping large cloud providers from profiting off open source without giving back. MongoDB tried to pass the SSPL off as open source but withdrew its application to the Open Source Initiative (OSI) the following year. The OSI has also called SSPL "fauxpen source": proprietary software that masquerades as open source.

Justin Dorfman, open source program manager at cybersecurity company Reblaze, said there is ultimately nothing illegal about this kind of license switching and that the risk is minimal for companies engaging in the practice. In fact, it might actually be good for business: MongoDB's market capitalization has gradually risen from around $4 billion at the time of its license switch to an all-time high of $25 billion this past February.

So is there anything that can or should be done to counter this trend? It won't be easy, but Dorfman says education could help.

"The community should be educating computer science students early on, encouraging them to become members or volunteers of the OSI, and providing more clarity as to what open source truly is and what it isn't," he said. "Just because you can see the code on GitHub or GitLab doesn't mean it's truly open source. This still doesn't protect a project from switching when it's convenient for them, but the more that they are aware of open source versus source available, the better."

At the top of the technology food chain, numerous companies have created billion- and trillion-dollar businesses off the back of open source software. Facebook, for example, was built on open source technologies from the get-go, with the likes of Linux, Apache, MySQL, and PHP serving as the building blocks for what is now one of the 10 most valuable companies in the world.

"Much of the technology we build to power our datacenters, AI and machine learning architecture, or developer tools would not be anywhere near as robust, reliable, scalable, or feature-rich as it is without the feedback, contributions, and collaborative energy of countless companies, communities, and individuals we work with in open source," Facebook open source head Kathy Kam said.

On the flip side, the social networking giant has also open-sourced dozens of its own internal projects, including React, a JavaScript library for building user interfaces that is now one of the most popular open source projects in the world. "Using open source and making open source available enables all of us to build better software together," Kam continued.


But why would a company open-source some of its technologies and not others? What factors are at play here?

"Many companies open-source non-differentiating parts of their technology to drive adoption for the differentiating, closed-source parts of their technology," Kam explained.

This means any technology a company has developed to support a core function of its business, but which isn't a direct competitive advantage in itself, might be better off as an open source project. In the community, it can benefit from the input of thousands of developers who might also contribute to an ecosystem of products that support the original company's core product.

However, a company of Facebooks size may have any number of reasons for pushing a piece of software into the open source sphere.

"When it comes to open source, Facebook's focus is a bit different," Kam added. "Our mission is to give people the power to build community and bring the world closer together. Realizing this vision at the scale and complexity of billions of users worldwide requires that we collaborate openly with diverse external stakeholders to meet these challenges head-on."

While there is often a degree of altruism involved when big tech companies elect to open-source one of their technologies, these players usually stand to benefit somewhere along the way, by spurring activity in a particular space, for instance. By way of example, Facebook open-sourced Magma in 2019 to help telecom companies more easily deploy wireless networks in remote areas, a project that was eventually taken over by the Linux Foundation. How might this benefit Facebook? Well, getting people online means they can access Facebook services. This strategy is further evidenced by Facebook's significant internet infrastructure investments spanning subsea cables and satellites.

Embracing open source can also help businesses attract top technical talent; developers generally like all things open source. Martin Traverso worked on Presto for nearly seven years while he was at Facebook. "The open source community has a very ardent following of really talented developers and engineers," he said. "During my time at Facebook, many engineers cited the company's involvement in, and contribution to, open source as a reason for joining the team. There's also a lower ramp-up cost for developers joining the company if they're already familiar with the technology."

There have been several billion-dollar exits in the commercial open source software (COSS) space in recent years, including enterprise-focused Red Hat, which IBM snapped up for a cool $34 billion, and MuleSoft, which Salesforce took over for $6.5 billion. Throw in the countless other businesses drawing in sizable investments for their affiliations with the open source world, and it's clear investors are crazy for open source, though that wasn't always the case.

So what has changed? According to Two Sigma Ventures VC Vinay Iyengar, the cloud has played a major role in this transformation.

"Historically, successful COSS companies, most notably Red Hat, made money from selling technical support to their customers," he said. "This was never a super compelling or scalable way to build a large software business. Over the years, however, the rise of the cloud has allowed COSS vendors to sell their software as a managed service. Companies like MongoDB, GitHub, and Cloudera were early pioneers in leveraging this open core model successfully, paving the way for a new, and far more compelling, model of COSS monetization."

Two Sigma Ventures has backed a number of notable players in the open source and open core spheres, including DevOps powerhouse GitLab and Timescale, a time-series database operator that recently announced a $40 million tranche of funding. The VC firm also launched the Open Source Index, a useful tool that showcases the most popular and fastest-growing open source projects on GitHub, allowing users to sort and filter by various criteria.

[Image: Open Source Index: Top 10 by TSV ranking]

Such data can prove useful for companies looking at which communities are most active, metrics that can help determine which open source technologies are worth building a commercial business on top of. For Iyengar, that is one of open source software's core selling points.

"Generally speaking, COSS companies have large preexisting communities and lots of developer love before they even begin to sell their commercial offerings," he said. "This leads to remarkably efficient customer acquisition and bottoms-up growth compared to closed-source equivalents. Additionally, many of these projects constitute a core part of an enterprise's infrastructure, making them very difficult to replace once implemented."

This, according to Iyengar, leads to great net retention dynamics and lower churn. "We have seen this time and time again, especially with some of the new COSS pioneers like HashiCorp, Confluent, and Databricks," he said.

Many of the major VC and private equity firms have already gone all in on companies that monetize open source tools in some way. And there is at least one investor dedicated entirely to COSS startups: Joseph Jacks is the founder and sole general partner at OSS Capital.

"We invest exclusively in COSS companies, defined as any given company that would not exist without the coexistence of a given open source core technology," Jacks explained. "We are technology-agnostic and vertical-agnostic investors; as long as the company meets this abstract definition, it fits our strict investment thesis."

OSS Capital's most recent investment was Rome, a new open source developer tool platform that launched with $4.5 million in seed funding.

While OSS Capital is mostly focused on pre-series A investments, the COSS space has generated numerous billion-dollar companies in recent years. Investing in an early-stage company may incur higher risks, but the rewards could be significant. For now, Jacks said he's happy to have OSS Capital fly under the radar.

"Since our founding, we have made around a dozen investments," he said. "We have intentionally kept a low profile on announcing investments, since our focus today is at the pre-A stage."


Realizing this is getting out of hand, Coq mulls new name for programming language – The Register

After three decades, Coq, a theorem-proving programming language developed by researchers in France, is being fitted for a new name because it has become impossible to ignore that it sounds like bawdy English slang.

Once referred to as CoC, short for Calculus of Constructions, the programming language became Coq when work on version 5 began in 1989.

The name, according to software engineer Théo Zimmermann's initial entry in the Coq GitHub wiki on April 6, is a reference to the French word for "rooster," to the Calculus of Constructions, and to the contributions of Thierry Coquand, one of the creators of the language.

Coq also happens to sound like "cock," which, while it means both "a male rooster" and "to tilt," can be used informally to refer to the male anatomy. And for some people, that deters community participation.

"This similarity has already led to some women turning away from Coq and others getting harassed when they said they were working on Coq," the project wiki, last updated on Friday, explains. "It also makes some English conversations about Coq with lay persons simply more difficult."

Tech terminology changes have roiled online communities for the past few years as efforts to make computer science and other fields more welcoming to a more diverse set of people have led to the deprecation and removal of terms that carry cultural baggage like "master," "slave," "blacklist," and "whitelist."

This has been particularly evident in volunteer-based open source communities, where the need to formalize governance through codes of conduct has met with frequent resistance among people who resent the imposition of rules on a sphere where they previously acted without constraint.

The Coq community went through this itself in 2017 and 2018 when there was some debate about the need for a code of conduct, confusingly abbreviated as CoC in some discussion threads.

Ribald usage of Coq isn't exactly new. Its community has been aware of the pun potential for years. But with so many projects trying to make themselves more welcoming to new contributors, the programming language has finally decided to take a serious look at removing the barrier-to-entry that its name presents.

Members of the Coq community have undertaken the thankless job of evaluating the dozens of suggested new names and, after more than two months of discussion and wiki updates, they've already rejected many for obvious failings.

For example, "Gallus," the Latin word for "rooster," has been discarded because, again, it sounds like a word for a part of the male anatomy.

Then there's "coqi," where the added "i" stands for induction, a mathematical proof technique. Unfortunately, "coqi" evokes Russian slang for another male anatomical feature.

Why not "Cocon," the French word for "cocoon"? Well, "con" isn't quite polite in French as it's slang for a part of the female anatomy. The project wiki notes that this is likely to lead to more jokes, which is the problem that prompted the whole renaming effort.

How about "Bando," Portuguese for a group of roosters? Er, no. Another male anatomy reference in French slang.

But there are some more promising proposals. One possible solution involves extending "Coq" to "Coquand," since the language's name is already derived at least in part from one of its main creators. There's precedent for homage-based branding with languages like Ada, Pascal, and Haskell. It is unclear how Coq's other contributors might feel about this.

Naming is hard. No pun intended.


Different Types of Robot Programming Languages – Analytics Insight

Robots are among the most impressive products of modern science. They not only reduce human labor but also execute tasks with fewer errors, and many businesses are taking an interest in robotics as automated machines gain popularity. With that in mind, let's discuss robot programming languages.

In order for robots to do tasks, they must be programmed. Robot programming is the process through which robots acquire instructions from computers, and a robot programmer must be fluent in several programming languages. So let's get started.

There are about 1,500 robot programming languages in use worldwide, all of them involved in robot training. In this section, we will go through the top programming languages available today.

The easiest way to get started with robotics is to learn C and C++. Both are general-purpose programming languages with largely overlapping features; C++ is essentially C with a number of additions. This helps explain why C++ is the most popular robot programming language: it enables a low-level hardware interface and delivers real-time performance.

C++ is the most mature programming language for getting the greatest results from a robot. In the WPILib framework, robot code is organized into three parts: the Constructor, the Autonomous method, and the OperatorControl method. The constructor's initialization code runs once at the start of the program to build the robot class; this is where sensors are initialized and other WPILib objects are created.

The Autonomous method then runs pre-programmed code for a set amount of time, after which the robot moves on to teleoperation, which is handled by the OperatorControl method.
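The three-phase flow described above can be sketched in plain Python (WPILib itself is a C++/Java framework, so the class and method names here are hypothetical, chosen only to mirror the Constructor, Autonomous, and OperatorControl lifecycle):

```python
import time

class Robot:
    def __init__(self):
        # Constructor phase: runs once at program start to set up
        # state (analogous to creating sensors and WPILib objects).
        self.log = []
        self.drive_speed = 0.0

    def autonomous(self, duration_s=0.05):
        # Autonomous phase: pre-programmed code that runs for a set
        # amount of time with no operator input.
        self.log.append("auto")
        deadline = time.monotonic() + duration_s
        while time.monotonic() < deadline:
            self.drive_speed = 0.5   # drive forward at half speed
        self.drive_speed = 0.0       # stop when the period ends

    def operator_control(self, joystick_inputs):
        # Teleoperation phase: map operator (joystick) input to motors.
        for y in joystick_inputs:
            self.drive_speed = y
            self.log.append("teleop")

robot = Robot()                           # constructor runs first
robot.autonomous()                        # then the timed autonomous period
robot.operator_control([0.3, 0.8, 0.0])   # then operator control
```

The point is the ordering: initialization happens exactly once, the autonomous code is bounded by a timer, and only afterward does operator input take over.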

Python is a powerful programming language that may be used to create and test robots, and in terms of automation and post-process robot programming it outperforms many other platforms. You can use it to build a script that computes, records, and activates robot code.

There is no need to teach every motion by hand. This enables rapid testing and visualization of simulations, programs, and logic solutions. Python uses fewer lines of code than most other programming languages and includes a large number of libraries for fundamental functions. Python's primary goal is to make programming easier and faster.

Any item can be created, modified, or deleted. In addition, we can code the robot's motions in the same script. All of this is accomplished with very little code, which is why Python is among the finest robot programming languages.
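As a sketch of that scripted style (the function names below are hypothetical, not taken from any particular robot framework), a few lines of Python can compute a path, then record it as move commands a controller could consume:

```python
import math

def compute_circle_path(radius, steps):
    """Compute (x, y) waypoints on a circle of the given radius (mm)."""
    return [(radius * math.cos(2 * math.pi * i / steps),
             radius * math.sin(2 * math.pi * i / steps))
            for i in range(steps)]

def record_program(waypoints):
    """Turn waypoints into textual move commands for a controller."""
    return [f"MOVE X={x:.1f} Y={y:.1f}" for x, y in waypoints]

# Compute and record a four-point circular path in one script.
path = compute_circle_path(radius=100.0, steps=4)
program = record_program(path)
```

Nothing here has to be taught by hand: change `radius` or `steps` and rerun the script to regenerate the whole motion program.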

Java is a programming language that enables robots to perform activities similar to those performed by humans. It also provides a variety of APIs to meet the demands of robotics, and it has many of the characteristics associated with artificial intelligence languages.

It enables you to construct high-level algorithms for search and neural networks, and it allows you to run the same code on many platforms.

Java is not compiled directly to machine code; rather, the Java virtual machine interprets its instructions at execution time. This portability has made Java quite popular in the field of robotics and, for some, preferable to alternative robot programming languages. Modern AI systems such as IBM Watson make heavy use of Java.

Microsoft's .NET platform is used to create apps with Visual Studio and provides a good basis for anyone interested in pursuing a career in robotics. Programmers primarily use .NET for port and socket development.

It supports various languages while allowing for horizontal scaling. It also offers a uniform environment and makes programming in C++ or Java easier. All of the tools and IDEs have been thoroughly tested and are accessible on the Microsoft Developer Network.

In addition, interoperability between its languages is smooth. As a result, we can confidently rank .NET among the best robot programming languages.

MATLAB and its open source cousins, such as Octave, are extremely popular in robotics engineering. In terms of data analysis, MATLAB is considerably ahead of many other robot programming languages. It is not really a programming language in the traditional sense; rather, it is an environment for engineering solutions based on complex mathematics.

Robotics developers use MATLAB to analyze data and create sophisticated graphs, and it is quite helpful in developing a complete robotic system and in building deeply established robotic foundations in the industry. It is a tool that lets you simulate the outcome of your methods, and engineers can use this simulation to fine-tune the system design and eliminate mistakes.
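The simulate-then-tune loop described above can be illustrated outside MATLAB as well. This small sketch (written in Python, with all numbers invented for illustration) computes the forward kinematics of a two-link planar arm, the kind of model an engineer would iterate on before touching hardware:

```python
import math

def forward_kinematics(l1, l2, theta1, theta2):
    """End-effector (x, y) of a two-link planar arm.

    l1, l2: link lengths; theta1, theta2: joint angles in radians.
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y

# Fully extended along the x-axis: the reach is l1 + l2.
x, y = forward_kinematics(1.0, 0.5, 0.0, 0.0)
```

An engineer can sweep joint angles through such a model, plot the reachable workspace, and adjust link lengths before any mistake reaches a physical robot.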

There have been cases where MATLAB has been used to build a complete robot, so it deserves a place among the top languages. The KUKA KR6 is one of the best-known examples: its developers used MATLAB to design and simulate the robot.

One of the first robot programming languages was Lisp. It was introduced to allow computer applications to use mathematical notation. Lisp has deep roots in AI and is used, among other things, in parts of the Robot Operating System (ROS).

Tree data structures, automatic storage management, syntax highlighting, and higher-order functions are among the features available. As a result, it is simple to use and helps eliminate implementation mistakes once an issue has been identified.

This problem-solving procedure takes place at the prototype stage, not the manufacturing stage. Lisp also includes capabilities like the read-eval-print loop and self-hosting compilers.
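To illustrate the tree-structured code that Lisp pioneered, here is a toy evaluator (written in Python purely for illustration, not production Lisp) that walks nested lists the way Lisp walks s-expressions:

```python
def evaluate(expr):
    """Evaluate a Lisp-style expression: either a number or a
    nested list of the form [op, arg1, arg2, ...]."""
    if isinstance(expr, (int, float)):
        return expr
    op, *args = expr
    values = [evaluate(a) for a in args]  # recurse down the tree
    if op == "+":
        return sum(values)
    if op == "*":
        result = 1
        for v in values:
            result *= v
        return result
    raise ValueError(f"unknown operator: {op}")

# (+ 1 (* 2 3)) written in list form:
result = evaluate(["+", 1, ["*", 2, 3]])
```

Because programs are just nested data, a read-eval-print loop only needs to parse input into such trees and feed them to an evaluator like this one, which is a large part of why Lisp made interactive prototyping so natural.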

Pascal was one of the earliest programming languages to hit the market, and it is still quite useful, especially for newcomers. Derived from ALGOL, it teaches good programming practice, and manufacturers have used it as the basis for their own robot programming languages.

ABB's RAPID and KUKA's KRL are two examples. Nevertheless, most developers consider Pascal obsolete for everyday use, even while highlighting its value for newcomers.

It will assist you in learning other robot programming languages more quickly, but it is only recommended for complete novices. Once you've gained some expertise in robotics programming, you can transition to another language.

And that's a wrap. We hope you found this article on robot programming languages helpful. We've covered the pros and cons of the top robot programming languages so you can choose the one most appropriate for your needs. Robotics has a promising future, so now is an ideal moment to get started.
